A proxy server that provides OpenAI/Gemini/Claude compatible API interfaces for CLI tools.
It also supports OpenAI Codex (GPT models) and Claude Code via OAuth,
so you can use local or multi-account CLI access with OpenAI-compatible clients and SDKs.
It also includes the first supported Chinese provider: Qwen Code.
CLI Proxy API is a proxy server designed to provide OpenAI/Gemini/Claude compatible API interfaces for command-line tools (CLI). It supports access to OpenAI Codex (GPT models), Claude Code, and Qwen Code via OAuth authentication. This software enables seamless local or multi-account CLI access using OpenAI-compatible clients and SDKs.
Key Features:
OpenAI/Gemini/Claude compatible API endpoints for CLI tools.
Support for OpenAI Codex (GPT models) and Claude Code via OAuth login.
Integration with Qwen Code, the first Chinese provider supported by the software.
Function calling and multimodal input support (text and images).
Multi-account load balancing across Gemini, OpenAI, Claude, and Qwen providers.
Audience & Benefit:
Ideal for developers, researchers, and enterprises seeking to integrate AI models into CLI tools or applications. Users benefit from unified access to multiple AI platforms through a single API interface, enabling efficient model switching, cost optimization via free tiers, and enhanced productivity through load balancing.
Set remote-management.disable-control-panel to true if you prefer to host the management UI elsewhere; the server will skip downloading management.html and /management.html will return 404.
Authentication
You can authenticate for Gemini, OpenAI, Claude, Qwen, and/or iFlow. All can coexist in the same auth-dir and will be load balanced.
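Conceptually, the proxy rotates requests across whatever credential files it finds in the auth directory. A minimal round-robin sketch of that idea (the file names below are made up for illustration; this is not the proxy's actual code):

```python
from itertools import cycle

# Hypothetical token files discovered in auth-dir, one per authenticated account.
credentials = [
    "gemini-alice.json",
    "codex-bob.json",
    "claude-carol.json",
]

rotation = cycle(credentials)

def next_credential() -> str:
    """Pick the next credential in round-robin order."""
    return next(rotation)

picks = [next_credential() for _ in range(4)]
print(picks)  # wraps back to the first credential after the last one
```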
Gemini (Google):
./cli-proxy-api --login
If you are an existing Gemini Code user, you may need to specify a project ID:
./cli-proxy-api --login --project_id <your-project-id>
Options: add --no-browser to print the login URL instead of opening a browser. The local OAuth callback uses port 8085.
Gemini Web (via Cookies):
This method authenticates by simulating a browser, using cookies obtained from the Gemini website.
./cli-proxy-api --gemini-web-auth
You will be prompted to enter your __Secure-1PSID and __Secure-1PSIDTS values. Please retrieve these cookies from your browser's developer tools.
OpenAI (Codex/GPT via OAuth):
./cli-proxy-api --codex-login
Options: add --no-browser to print the login URL instead of opening a browser. The local OAuth callback uses port 1455.
Claude (Anthropic via OAuth):
./cli-proxy-api --claude-login
Options: add --no-browser to print the login URL instead of opening a browser. The local OAuth callback uses port 54545.
Qwen (Qwen Chat via OAuth):
./cli-proxy-api --qwen-login
Options: add --no-browser to print the login URL instead of opening a browser. Qwen authentication uses Qwen Chat's OAuth device flow.
iFlow (iFlow via OAuth):
./cli-proxy-api --iflow-login
Options: add --no-browser to print the login URL instead of opening a browser. The local OAuth callback uses port 11451.
Starting the Server
Once authenticated, start the server:
./cli-proxy-api
By default, the server runs on port 8317.
API Endpoints
List Models
GET http://localhost:8317/v1/models
Chat Completions
POST http://localhost:8317/v1/chat/completions
Request body example:
{
  "model": "gemini-2.5-pro",
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "stream": true
}
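Assuming the server is running locally on the default port, the same request can be sent with curl (add an Authorization header if you configured api-keys):

```shell
curl http://localhost:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "stream": true
  }'
```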
Notes:
Use a gemini-* model for Gemini (e.g., "gemini-2.5-pro"), a gpt-* model for OpenAI (e.g., "gpt-5"), a claude-* model for Claude (e.g., "claude-3-5-sonnet-20241022"), a qwen-* model for Qwen (e.g., "qwen3-coder-plus"), or an iFlow-supported model (e.g., "tstars2.0", "deepseek-v3.1", "kimi-k2", etc.). The proxy will route to the correct provider automatically.
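The prefix convention above can be pictured with a small sketch. This is only an illustration of the routing rule, not the proxy's actual implementation:

```python
def route_provider(model: str) -> str:
    """Illustrative sketch of prefix-based routing; not the proxy's real code."""
    prefixes = {
        "gemini-": "gemini",
        "gpt-": "openai",
        "claude-": "claude",
        "qwen": "qwen",
    }
    for prefix, provider in prefixes.items():
        if model.startswith(prefix):
            return provider
    return "iflow"  # everything else falls through to iFlow-supported models

print(route_provider("gemini-2.5-pro"))  # gemini
print(route_provider("kimi-k2"))         # iflow
```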
Claude Messages (SSE-compatible)
POST http://localhost:8317/v1/messages
Using with OpenAI Libraries
You can use this proxy with any OpenAI-compatible library by setting the base URL to your local server:
Python (with OpenAI library)
from openai import OpenAI

client = OpenAI(
    api_key="dummy",  # Not used, but required by the client
    base_url="http://localhost:8317/v1"
)

# Gemini example
gemini = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

# Codex/GPT example
gpt = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Summarize this project in one sentence."}]
)

# Claude example (using the messages endpoint)
import requests

claude_response = requests.post(
    "http://localhost:8317/v1/messages",
    json={
        "model": "claude-3-5-sonnet-20241022",
        "messages": [{"role": "user", "content": "Summarize this project in one sentence."}],
        "max_tokens": 1000
    }
)

print(gemini.choices[0].message.content)
print(gpt.choices[0].message.content)
print(claude_response.json())
JavaScript/TypeScript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'dummy', // Not used, but required by the client
  baseURL: 'http://localhost:8317/v1',
});

// Gemini
const gemini = await openai.chat.completions.create({
  model: 'gemini-2.5-pro',
  messages: [{ role: 'user', content: 'Hello, how are you?' }],
});

// Codex/GPT
const gpt = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Summarize this project in one sentence.' }],
});

// Claude example (using the messages endpoint)
const claudeResponse = await fetch('http://localhost:8317/v1/messages', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: 'Summarize this project in one sentence.' }],
    max_tokens: 1000,
  }),
});

console.log(gemini.choices[0].message.content);
console.log(gpt.choices[0].message.content);
console.log(await claudeResponse.json());
Supported Models
gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite
gemini-2.5-flash-image
gemini-2.5-flash-image-preview
gpt-5
gpt-5-codex
claude-opus-4-1-20250805
claude-opus-4-20250514
claude-sonnet-4-20250514
claude-sonnet-4-5-20250929
claude-3-7-sonnet-20250219
claude-3-5-haiku-20241022
qwen3-coder-plus
qwen3-coder-flash
qwen3-max
qwen3-vl-plus
deepseek-v3.2
deepseek-v3.1
deepseek-r1
deepseek-v3
kimi-k2
glm-4.5
glm-4.6
tstars2.0
And other iFlow-supported models
Gemini models auto-switch to preview variants when needed
Configuration
The server uses a YAML configuration file (config.yaml) located in the project root directory by default. You can specify a different configuration file path using the --config flag:
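For example:

```shell
# Start the server with a configuration file at a custom path
./cli-proxy-api --config /path/to/config.yaml
```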
auth-dir
string
"~/.cli-proxy-api"
Directory where authentication tokens are stored. Supports using ~ for the home directory. On Windows, use forward slashes, e.g. C:/cli-proxy-api/.
request-retry
integer
3
Number of times to retry a request. Retries occur when the HTTP response code is 403, 408, 500, 502, 503, or 504.
remote-management.allow-remote
boolean
false
Whether to allow remote (non-localhost) access to the management API. If false, only localhost can access. A management key is still required for localhost.
remote-management.secret-key
string
""
Management key. If a plaintext value is provided, it will be hashed on startup using bcrypt and persisted back to the config file. If empty, the entire management API is disabled (404).
remote-management.disable-control-panel
boolean
false
When true, skip downloading management.html and return 404 for /management.html, effectively disabling the bundled management UI.
quota-exceeded
object
{}
Configuration for handling quota-exceeded errors.
quota-exceeded.switch-project
boolean
true
Whether to automatically switch to another project when a quota is exceeded.
quota-exceeded.switch-preview-model
boolean
true
Whether to automatically switch to a preview model when a quota is exceeded.
debug
boolean
false
Enable debug mode for verbose logging.
logging-to-file
boolean
true
Write application logs to rotating files instead of stdout. Set to false to log to stdout/stderr.
usage-statistics-enabled
boolean
true
Enable in-memory usage aggregation for management APIs. Disable to drop all collected usage metrics.
api-keys
string[]
[]
Legacy shorthand for inline API keys. Values are mirrored into the config-api-key provider for backwards compatibility.
generative-language-api-key
string[]
[]
List of Generative Language API keys.
codex-api-key
object[]
[]
List of Codex API keys.
codex-api-key.api-key
string
""
Codex API key.
codex-api-key.base-url
string
""
Custom Codex API endpoint, if you use a third-party API endpoint.
codex-api-key.proxy-url
string
""
Proxy URL for this specific API key. Overrides the global proxy-url setting. Supports socks5/http/https protocols.
claude-api-key
object[]
[]
List of Claude API keys.
claude-api-key.api-key
string
""
Claude API key.
claude-api-key.base-url
string
""
Custom Claude API endpoint, if you use a third-party API endpoint.
claude-api-key.proxy-url
string
""
Proxy URL for this specific API key. Overrides the global proxy-url setting. Supports socks5/http/https protocols.
openai-compatibility.*.models
object[]
[]
The models supported by the provider.
openai-compatibility.*.models.*.name
string
""
The actual model name.
openai-compatibility.*.models.*.alias
string
""
The alias used in the API.
gemini-web
object
{}
Configuration specific to the Gemini Web client.
gemini-web.context
boolean
true
Enables conversation context reuse for continuous dialogue.
gemini-web.gem-mode
string
""
Selects a predefined Gem to attach for Gemini Web requests; allowed values: coding-partner, writing-editor. When empty, no Gem is attached.
gemini-web.max-chars-per-request
integer
1,000,000
The maximum number of characters to send to Gemini Web in a single request.
gemini-web.disable-continuation-hint
boolean
false
Disables the continuation hint for split prompts.
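To see what the per-request character limit implies, here is a purely illustrative sketch of splitting an oversized prompt into request-sized chunks (the proxy's real splitting and continuation logic may differ):

```python
def split_prompt(text: str, max_chars: int = 1_000_000) -> list[str]:
    """Illustrative only: split text into chunks no longer than max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]

chunks = split_prompt("a" * 2_500_000)
print([len(c) for c in chunks])  # [1000000, 1000000, 500000]
```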
Example Configuration File
# Server port
port: 8317

# Management API settings
remote-management:
  # Whether to allow remote (non-localhost) management access.
  # When false, only localhost can access management endpoints (a key is still required).
  allow-remote: false
  # Management key. If a plaintext value is provided here, it will be hashed on startup.
  # All management requests (even from localhost) require this key.
  # Leave empty to disable the Management API entirely (404 for all /v0/management routes).
  secret-key: ""
  # Disable the bundled management control panel asset download and HTTP route when true.
  disable-control-panel: false

# Authentication directory (supports ~ for home directory).
# On Windows, use forward slashes, e.g. `C:/cli-proxy-api/`
auth-dir: "~/.cli-proxy-api"

# API keys for authentication
api-keys:
  - "your-api-key-1"
  - "your-api-key-2"

# Enable debug logging
debug: false

# When true, write application logs to rotating files instead of stdout
logging-to-file: true

# When false, disable in-memory usage statistics aggregation
usage-statistics-enabled: true

# Proxy URL. Supports socks5/http/https protocols. Example: socks5://user:pass@192.168.1.1:1080/
proxy-url: ""

# Number of times to retry a request. Retries occur when the HTTP response code is 403, 408, 500, 502, 503, or 504.
request-retry: 3

# Quota-exceeded behavior
quota-exceeded:
  switch-project: true        # Automatically switch to another project when a quota is exceeded
  switch-preview-model: true  # Automatically switch to a preview model when a quota is exceeded

# Gemini Web client configuration
gemini-web:
  context: true                   # Enable conversation context reuse
  gem-mode: ""                    # Select Gem: "coding-partner" or "writing-editor"; empty means no Gem
  max-chars-per-request: 1000000  # Max characters per request

# API keys for the official Generative Language API
generative-language-api-key:
  - "AIzaSy...01"
  - "AIzaSy...02"
  - "AIzaSy...03"
  - "AIzaSy...04"

# Codex API keys
codex-api-key:
  - api-key: "sk-atSM..."
    base-url: "https://www.example.com"          # custom Codex API endpoint
    proxy-url: "socks5://proxy.example.com:1080" # optional: per-key proxy override

# Claude API keys
claude-api-key:
  - api-key: "sk-atSM..."                        # official Claude API key; no base-url needed
  - api-key: "sk-atSM..."
    base-url: "https://www.example.com"          # custom Claude API endpoint
    proxy-url: "socks5://proxy.example.com:1080" # optional: per-key proxy override

# OpenAI-compatible providers
openai-compatibility:
  - name: "openrouter"                       # Provider name; used in the user agent and elsewhere
    base-url: "https://openrouter.ai/api/v1" # The base URL of the provider
    # New format with per-key proxy support (recommended):
    api-key-entries:
      - api-key: "sk-or-v1-...b780"
        proxy-url: "socks5://proxy.example.com:1080" # optional: per-key proxy override
      - api-key: "sk-or-v1-...b781"                  # without proxy-url
    # Legacy format (still supported, but cannot specify a proxy per key):
    # api-keys:
    #   - "sk-or-v1-...b780"
    #   - "sk-or-v1-...b781"
    models:                                 # The models supported by the provider
      - name: "moonshotai/kimi-k2:free"     # The actual model name
        alias: "kimi-k2"                    # The alias used in the API
OpenAI Compatibility Providers
Configure upstream OpenAI-compatible providers (e.g., OpenRouter) via openai-compatibility.
name: provider identifier used internally
base-url: provider base URL
api-key-entries: list of API key entries with optional per-key proxy configuration (recommended)
api-keys: (deprecated) simple list of API keys without proxy support
models: list of mappings from upstream model name to local alias
Call OpenAI's endpoint /v1/chat/completions with model set to the alias (e.g., kimi-k2). The proxy routes to the configured provider/model automatically.
You may also call Claude's endpoint /v1/messages, or Gemini's /v1beta/models/model-name:streamGenerateContent and /v1beta/models/model-name:generateContent.
You can also use the Gemini CLI with CODE_ASSIST_ENDPOINT set to http://127.0.0.1:8317 for these OpenAI-compatible providers' models.
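The alias routing can be pictured as a simple lookup built from the models entries. This is an illustration only; the provider and model names are taken from the example configuration above:

```python
# Illustrative alias table derived from the openai-compatibility example config.
alias_table = {
    "kimi-k2": ("openrouter", "moonshotai/kimi-k2:free"),
}

def resolve(model: str):
    """Map a requested alias to (provider, upstream model name).

    Unknown names pass through unchanged with no provider match.
    """
    return alias_table.get(model, (None, model))

print(resolve("kimi-k2"))  # ('openrouter', 'moonshotai/kimi-k2:free')
```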
Authentication Directory
The auth-dir parameter specifies where authentication tokens are stored. When you run the login command, the application will create JSON files in this directory containing the authentication tokens for your Google accounts. Multiple accounts can be used for load balancing.
Request Authentication Providers
Configure inbound authentication through the auth.providers section. The built-in config-api-key provider works with inline keys.
Clients should send requests with an Authorization: Bearer your-api-key-1 header (or X-Goog-Api-Key, X-Api-Key, or ?key= as before). The legacy top-level api-keys array is still accepted and automatically synced to the default provider for backwards compatibility.
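For example, any of the accepted key placements can be expressed as request headers or a query parameter (a sketch; the key value is a placeholder for one of your configured api-keys):

```python
# Placeholder key; substitute a value from your api-keys config.
KEY = "your-api-key-1"

# The accepted ways to present the key, per the note above.
auth_styles = {
    "bearer": {"Authorization": f"Bearer {KEY}"},
    "goog": {"X-Goog-Api-Key": KEY},
    "generic": {"X-Api-Key": KEY},
    "query": {"key": KEY},  # sent as ?key=... in the URL instead of a header
}

print(auth_styles["bearer"])
```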
Official Generative Language API
The generative-language-api-key parameter allows you to define a list of API keys that can be used to authenticate requests to the official Generative Language API.
Hot Reloading
The server watches the config file and the auth-dir for changes and reloads clients and settings automatically. You can add or remove Gemini/OpenAI token JSON files while the server is running; no restart is required.
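Hot reloading is typically built on file-change detection. As a rough illustration of the idea (not the server's actual mechanism), a modification-time check looks like this:

```python
import os
import tempfile

def has_changed(path: str, last_mtime: float):
    """Return (changed, new_mtime) by comparing modification times."""
    mtime = os.path.getmtime(path)
    return mtime != last_mtime, mtime

# Demonstrate with a throwaway file standing in for config.yaml.
with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
    f.write("port: 8317\n")
    path = f.name

_, mtime = has_changed(path, 0.0)
# Simulate an edit by bumping the modification time explicitly.
os.utime(path, (mtime + 1, mtime + 1))
changed, _ = has_changed(path, mtime)
print(changed)  # True
os.remove(path)
```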
Gemini CLI with multiple account load balancing
Start CLI Proxy API server, and then set the CODE_ASSIST_ENDPOINT environment variable to the URL of the CLI Proxy API server.
The server relays the loadCodeAssist, onboardUser, and countTokens requests, and automatically load balances text generation requests across the configured accounts.
> [!NOTE]
> This feature only allows local access because there is currently no way to authenticate the requests.
> 127.0.0.1 is hardcoded for load balancing.
Claude Code with multiple account load balancing
Start CLI Proxy API server, and then set the ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_MODEL, ANTHROPIC_SMALL_FAST_MODEL environment variables.
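For example (values are illustrative: the token just needs to match a key your proxy accepts, and the models can be any supported Claude models):

```shell
export ANTHROPIC_BASE_URL="http://127.0.0.1:8317"
export ANTHROPIC_AUTH_TOKEN="your-api-key-1"
export ANTHROPIC_MODEL="claude-sonnet-4-5-20250929"
export ANTHROPIC_SMALL_FAST_MODEL="claude-3-5-haiku-20241022"
```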
Codex with multiple account load balancing
Start CLI Proxy API server, and then edit the ~/.codex/config.toml and ~/.codex/auth.json files.
config.toml:
model_provider = "cliproxyapi"
model = "gpt-5-codex" # Or gpt-5, you can also use any of the models that we support.
model_reasoning_effort = "high"
[model_providers.cliproxyapi]
name = "cliproxyapi"
base_url = "http://127.0.0.1:8317/v1"
wire_api = "responses"
auth.json:
{
"OPENAI_API_KEY": "sk-dummy"
}
Run with Docker
Run the following command to login (the Gemini OAuth callback uses port 8085):
docker run --rm -p 8085:8085 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest --login
Run with Docker Compose
Clone the repository and navigate into the directory:
git clone https://github.com/luispater/CLIProxyAPI.git
cd CLIProxyAPI
Prepare the configuration file:
Create a config.yaml file by copying the example and customize it to your needs.
cp config.example.yaml config.yaml
(Note for Windows users: You can use copy config.example.yaml config.yaml in CMD or PowerShell.)
Start the service:
For most users (recommended):
Run the following command to start the service using the pre-built image from Docker Hub. The service will run in the background.
docker compose up -d
For advanced users:
If you have modified the source code and need to build a new image, use the interactive helper scripts:
For Windows (PowerShell):
.\docker-build.ps1
For Linux/macOS:
bash docker-build.sh
The script will prompt you to choose how to run the application:
Option 1: Run using Pre-built Image (Recommended): Pulls the latest official image from the registry and starts the container. This is the easiest way to get started.
Option 2: Build from Source and Run (For Developers): Builds the image from the local source code, tags it as cli-proxy-api:local, and then starts the container. This is useful if you are making changes to the source code.
To authenticate with providers, run the login command inside the container: