The previous commit commented out AddOtherRatio("n") in the Ali
adaptor to fix double-counting but this could cause billing evasion
when n is specified via extra["parameters"] instead of request.N.
Root cause: ImagePriceRatio in GetTokenCountMeta() already included
n, AND channel adaptors added OtherRatio("n"), resulting in n²
billing.
Proper fix:
- Remove n from ImagePriceRatio (keep sizeRatio * qualityRatio only)
- In ImageHelper, add default OtherRatio("n") when adaptor hasn't
set one; set fallback tokens to 1 (base unit)
- Restore Ali adaptor's AddOtherRatio("n") — it uses actual upstream
parameters/response count, preventing billing evasion
This commit introduces a major architectural refactoring to improve quota management, centralize logging, and streamline the relay handling logic.
Key changes:
- **Pre-consume Quota:** Implements a new mechanism to check and reserve user quota *before* making the request to the upstream provider. This ensures more accurate quota deduction and prevents users from exceeding their limits due to concurrent requests.
- **Unified Relay Handlers:** Refactors the relay logic to use generic handlers (e.g., `ChatHandler`, `ImageHandler`) instead of provider-specific implementations. This significantly reduces code duplication and simplifies adding new channels.
- **Centralized Logger:** A new dedicated `logger` package is introduced, and all system logging calls are migrated to use it, moving this responsibility out of the `common` package.
- **Code Reorganization:** DTOs are generalized (e.g., `dalle.go` -> `openai_image.go`) and utility code is moved to more appropriate packages (e.g., `common/http.go` -> `service/http.go`) for better code structure.