AIsa-team

cn-llm

'China LLM Gateway - Unified interface for Chinese LLMs including Qwen, DeepSeek, GLM, Baichuan. OpenAI compatible, one API Key for all models. Use when: the user needs model routing, provider setup, or Chinese LLM access guidance.'

AIsa-team 11 2 Updated 2w ago

Resources

2
GitHub

Install

npx skillscat add aisa-team/agent-skills/cn-llm

Install via the SkillsCat registry.

SKILL.md

AIsa CN-LLM 🐉

China LLM Unified Gateway. Powered by AIsa.

One API Key to access all Chinese LLMs. OpenAI compatible interface.

Qwen, DeepSeek, GLM, Baichuan, Moonshot, and more - unified API access.

🔥 What You Can Do

Intelligent Chat

"Use Qwen to answer Chinese questions, use DeepSeek for coding"

Deep Reasoning

"Use DeepSeek-R1 for complex reasoning tasks"

Code Generation

"Use DeepSeek-Coder to generate Python code with explanations"

Long Text Processing

"Use Qwen-Long for ultra-long document summarization"

Model Comparison

"Compare response quality between Qwen-Max and DeepSeek-V3"

Supported Models

Qwen (Alibaba)

Model Input Price Output Price Features
qwen3-max $1.37/M $5.48/M Most powerful general model
qwen3-max-2026-01-23 $1.37/M $5.48/M Latest version
qwen3-coder-plus $2.86/M $28.60/M Enhanced code generation
qwen3-coder-flash $0.72/M $3.60/M Fast code generation
qwen3-coder-480b-a35b-instruct $2.15/M $8.60/M 480B large model
qwen3-vl-plus $0.43/M $4.30/M Vision-language model
qwen3-vl-flash $0.86/M $0.86/M Fast vision model
qwen3-omni-flash $4.00/M $16.00/M Multimodal model
qwen-vl-max $0.23/M $0.57/M Vision-language
qwen-plus-2025-12-01 $1.26/M $12.60/M Plus version
qwen-mt-flash $0.168/M $0.514/M Fast machine translation
qwen-mt-lite $0.13/M $0.39/M Lite machine translation

DeepSeek

Model Input Price Output Price Features
deepseek-r1 $2.00/M $8.00/M Reasoning model, supports Tools
deepseek-v3 $1.00/M $4.00/M General chat, 671B parameters
deepseek-v3-0324 $1.20/M $4.80/M V3 stable version
deepseek-v3.1 $4.00/M $12.00/M Latest Terminus version

Note: Prices are in M (million tokens). Model availability may change, see marketplace.aisa.one/pricing for the latest list.

Quick Start

export AISA_API_KEY="your-key"

API Endpoints

OpenAI Compatible Interface

POST https://api.aisa.one/v1/chat/completions

Qwen Example

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-max",
    "messages": [
      {"role": "system", "content": "You are a professional Chinese assistant."},
      {"role": "user", "content": "Please explain what a large language model is?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'

DeepSeek Example

# DeepSeek-V3 general chat (671B parameters)
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Write a quicksort algorithm in Python"}],
    "temperature": 0.3
  }'

# DeepSeek-R1 deep reasoning (supports Tools)
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a sheep, and a cabbage. The boat can only carry the farmer and one item at a time. If the farmer is not present, the wolf will eat the sheep, and the sheep will eat the cabbage. How can the farmer safely cross?"}]
  }'

# DeepSeek-V3.1 Terminus latest version
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3.1",
    "messages": [{"role": "user", "content": "Implement an LRU cache with get and put operations"}]
  }'

Qwen3 Code Generation Example

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder-plus",
    "messages": [{"role": "user", "content": "Implement a thread-safe Map in Go"}]
  }'

Parameter Reference

Parameter Type Required Description
model string Yes Model identifier
messages array Yes Message list
temperature number No Randomness (0-2, default 1)
max_tokens integer No Maximum tokens to generate
stream boolean No Stream output (default false)
top_p number No Nucleus sampling parameter (0-1)

Response Format

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "qwen-max",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A large language model (LLM) is a deep learning-based..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 30,
    "completion_tokens": 150,
    "total_tokens": 180,
    "cost": 0.001
  }
}

Streaming Output

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-plus",
    "messages": [{"role": "user", "content": "Tell a Chinese folk story"}],
    "stream": true
  }'

Returns Server-Sent Events (SSE) format:

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Once"}}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" upon"}}]}
...
data: [DONE]

Python Client

CLI Usage

# Qwen chat
python3 scripts/cn_llm_client.py chat --model qwen3-max --message "Hello, please introduce yourself"

# Qwen3 code generation
python3 scripts/cn_llm_client.py chat --model qwen3-coder-plus --message "Write a binary search algorithm"

# DeepSeek-R1 reasoning
python3 scripts/cn_llm_client.py chat --model deepseek-r1 --message "Which is larger, 9.9 or 9.11? Please reason in detail"

# DeepSeek-V3 chat
python3 scripts/cn_llm_client.py chat --model deepseek-v3 --message "Tell a story" --stream

# With system prompt
python3 scripts/cn_llm_client.py chat --model qwen3-max --system "You are a classical poetry expert" --message "Write a poem about plum blossoms"

# Model comparison
python3 scripts/cn_llm_client.py compare --models "qwen3-max,deepseek-v3" --message "What is quantum computing?"

# List supported models
python3 scripts/cn_llm_client.py models

Python SDK Usage

from cn_llm_client import CNLLMClient

client = CNLLMClient()  # Uses AISA_API_KEY environment variable

# Qwen chat
response = client.chat(
    model="qwen3-max",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response["choices"][0]["message"]["content"])

# Qwen3 code generation
response = client.chat(
    model="qwen3-coder-plus",
    messages=[
        {"role": "system", "content": "You are a professional programmer."},
        {"role": "user", "content": "Implement a singleton pattern in Python"}
    ],
    temperature=0.3
)

# Streaming output
for chunk in client.chat_stream(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Tell a story about an idiom"}]
):
    print(chunk, end="", flush=True)

# Model comparison
results = client.compare_models(
    models=["qwen3-max", "deepseek-v3", "deepseek-r1"],
    message="Explain what machine learning is"
)
for model, result in results.items():
    print(f"{model}: {result['response'][:100]}...")

Use Cases

1. Chinese Content Generation

# Copywriting
response = client.chat(
    model="qwen3-max",
    messages=[
        {"role": "system", "content": "You are a professional copywriter."},
        {"role": "user", "content": "Write a product introduction for a smart watch"}
    ]
)

2. Code Development

# Code generation and explanation
response = client.chat(
    model="qwen3-coder-plus",
    messages=[{"role": "user", "content": "Implement a thread-safe Map in Go"}]
)

3. Complex Reasoning

# Mathematical reasoning
response = client.chat(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Prove: For any positive integer n, n³-n is divisible by 6"}]
)

4. Visual Understanding

# Image understanding
response = client.chat(
    model="qwen3-vl-plus",
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "Describe the content of this image"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]}
    ]
)

5. Model Routing Strategy

MODEL_MAP = {
    "chat": "qwen3-max",           # General chat
    "code": "qwen3-coder-plus",    # Code generation
    "reasoning": "deepseek-r1",    # Complex reasoning
    "vision": "qwen3-vl-plus",     # Visual understanding
    "fast": "qwen3-coder-flash",   # Fast response
    "translate": "qwen-mt-flash"   # Machine translation
}

def route_by_task(task_type: str, message: str) -> str:
    model = MODEL_MAP.get(task_type, "qwen3-max")
    return client.chat(model=model, messages=[{"role": "user", "content": message}])

Error Handling

Errors return JSON with error field:

{
  "error": {
    "code": "model_not_found",
    "message": "Model 'xxx' is not available"
  }
}

Common error codes:

  • 401 - Invalid or missing API Key
  • 402 - Insufficient balance
  • 404 - Model not found
  • 429 - Rate limit exceeded
  • 500 - Server error

Pricing

Model Input ($/M) Output ($/M)
qwen3-max $1.37 $5.48
qwen3-coder-plus $2.86 $28.60
qwen3-coder-flash $0.72 $3.60
qwen3-vl-plus $0.43 $4.30
deepseek-v3 $1.00 $4.00
deepseek-r1 $2.00 $8.00
deepseek-v3.1 $4.00 $12.00

Price unit: $ per Million tokens. Each response includes usage.cost and usage.credits_remaining.

Get Started

  1. Register at aisa.one
  2. Get API Key
  3. Top up (pay-as-you-go)
  4. Set environment variable: export AISA_API_KEY="your-key"

Full API Reference

See API Reference for complete endpoint documentation.