Install
npx skillscat add jneless/ark-embedding-skill Install via the SkillsCat registry.
Ark-Embedding
- Volcengine Ark large model service
- Two SDK access options are available
  - OpenAI SDK (compatible)
  - Volcengine Ark SDK
- Authentication uses an OpenAI-style token (API key)
- The key is sent with each request and is usually read from the ARK_API_KEY environment variable
SDK installation
Volcengine Ark provides official SDKs for Python, Go, and Java, and is natively compatible with the **OpenAI protocol**.
1. Official SDK overview
| Language | Minimum version | Install / management command | Notes |
|---|---|---|---|
| Python | Python 3.7+ | pip install 'volcengine-python-sdk[ark]' | Supports upgrading with the -U flag |
| Go | Go 1.18+ | go get -u github.com/volcengine/volcengine-go-sdk | Dependencies are managed with go mod |
| Java | JDK 1.8+ | Maven (pom.xml) or Gradle (build.gradle) | Server-side only; Android is not supported |
OpenAI compatibility
pip install --upgrade "openai>=1.0"
- Quick start
from openai import OpenAI
import os

client = OpenAI(
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    # Your API key, read from the environment variable
    api_key=os.environ.get("ARK_API_KEY"),
)
completion = client.chat.completions.create(
    # Replace with your Model ID.
    model="doubao-seed-1-6-251015",
    messages=[
        {"role": "user", "content": "Hello"},
    ],
)
print(completion.choices[0].message.content)
- Set extra fields
- Fields that the OpenAI SDK does not support natively can be passed through the extra_body dictionary, for example the thinking field that turns the model's deep-thinking mode on or off.
from openai import OpenAI
import os

client = OpenAI(
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    # Your API key, read from the environment variable
    api_key=os.environ.get("ARK_API_KEY"),
)
completion = client.chat.completions.create(
    # Replace with your Model ID.
    model="doubao-seed-1-6-251015",
    messages=[
        {"role": "user", "content": "Hello"},
    ],
    extra_body={
        "thinking": {
            "type": "disabled",  # do not use deep thinking
            # "type": "enabled",  # use deep thinking
        }
    },
)
print(completion.choices[0].message.content)
- Set custom headers
- Custom headers can carry extra information, such as an ID for correlating logs, or enable the data encryption capability.
from openai import OpenAI
import os

client = OpenAI(
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    # Your API key, read from the environment variable
    api_key=os.environ.get("ARK_API_KEY"),
)
completion = client.chat.completions.create(
    # Replace with your Model ID.
    model="doubao-seed-1-6-251015",
    messages=[
        {"role": "user", "content": "Hello"},
    ],
    # Custom request ID
    extra_headers={"X-Client-Request-Id": "202406251728190000B7EA7A9648AC08D9"},
)
print(completion.choices[0].message.content)
- Text embedding
- Note: multimodal embedding models are not supported through the OpenAI API. To use them, use the Ark SDK; for details, see the Multimodal embedding section.
from openai import OpenAI
import os

client = OpenAI(
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    # Your API key, read from the environment variable
    api_key=os.environ.get("ARK_API_KEY"),
)
resp = client.embeddings.create(
    # Replace with your Model ID.
    model="doubao-embedding-large-text-240915",
    input=["Nice day."],
)
print(resp)
- LangChain OpenAI SDK
- Install the LangChain OpenAI SDK: pip install langchain-openai
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
import os

llm = ChatOpenAI(
    # Your API key, read from the environment variable
    openai_api_key=os.environ.get("ARK_API_KEY"),
    # The base URL for model invocation
    openai_api_base="https://ark.cn-beijing.volces.com/api/v3",
    # Replace with your Model ID
    model="doubao-seed-1-6-251015",
)
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain = prompt | llm
print(llm_chain.invoke(question))
SDK common usage examples
- When you use the official SDK, refer to this section for typical tasks such as setting custom headers.
- Set the timeout and retry count. Timeout (timeout): intended for scenarios with long response times (for example, deep-thinking models that take a long time to answer), so that a request timeout does not interrupt the service. 30 minutes (1800 seconds) is recommended here.
  - Python SDK: defaults to a 60-second connection timeout (maximum wait for establishing the network connection) and a 600-second socket timeout (data transfer timeout after the connection is established).
  - Java SDK: defaults to a 60-second connection timeout and a 600-second socket timeout.
  - Go SDK: defaults to a 600-second end-to-end timeout (from sending the request to receiving the response).
- Retry count (max_retries): intended for unstable networks. Requests that fail because of transient faults (such as network jitter) are retried automatically. The default is 2, meaning a failed request is retried up to 2 more times; adjust as needed.
import os
# Install the Ark SDK with: pip install 'volcengine-python-sdk[ark]'
from volcenginesdkarkruntime import Ark

client = Ark(
    api_key=os.environ.get("ARK_API_KEY"),
    # Response timeout in seconds; 1800 seconds or more is recommended
    timeout=1800,
    # Number of retries
    max_retries=2,
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
)
- Authenticate with an Access Key
- Use this when you need to authenticate through Volcengine's standard cloud credential system (Access Key / Secret Key).
- Access Key authentication works by calling the GetApiKey interface to obtain a temporary API key. This interface has a low rate limit, so always issue requests through a single shared client (singleton) to avoid authentication failures caused by rate limiting; see the singleton request guidance.
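Because the temporary API key comes from the rate-limited GetApiKey interface, it can help to build the client once and reuse it everywhere. The following is a minimal sketch of a thread-safe shared-client helper. `get_shared_client` and `fake_factory` are hypothetical names used only for illustration and are not part of the Ark SDK; in real code the factory would be something like `lambda: Ark(ak=..., sk=...)`.

```python
import threading

_client = None
_client_lock = threading.Lock()


def get_shared_client(factory):
    """Return one process-wide client, creating it on first use only.

    `factory` is any zero-argument callable that builds the client.
    """
    global _client
    if _client is None:
        with _client_lock:
            # Re-check inside the lock so two threads cannot both create a client.
            if _client is None:
                _client = factory()
    return _client


# Stand-in factory so the sketch runs without the SDK installed (assumption).
calls = []


def fake_factory():
    calls.append(1)
    return object()


first = get_shared_client(fake_factory)
second = get_shared_client(fake_factory)
print(first is second, len(calls))
```

The factory runs only on the first call, so every later caller reuses the same client and the same temporary key.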
import os
# Install the Ark SDK with: pip install 'volcengine-python-sdk[ark]'
from volcenginesdkarkruntime import Ark

client = Ark(
    ak=os.environ.get("VOLC_ACCESSKEY"),
    sk=os.environ.get("VOLC_SECRETKEY"),
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
)
- Set custom headers
- Custom header fields use a key-value structure and are passed as an object, in the format shown in the example that follows.
- Custom headers can carry extra information, such as an ID for correlating logs, or enable the data encryption capability. The interfaces that support this field are the Chat API and the Batch (Chat) API. A typical example follows.
{
    "key1": "value1",
    "key2": "value2",
    ...,
    "keyN": "valueN"
}
Troubleshooting
When sending requests, you can set a header with the key X-Client-Request-Id and a custom ID as its value. Provide that ID to the Ark support team so they can correlate client-side and server-side logs and locate the problem more easily.
import os
# Install the Ark SDK with: pip install 'volcengine-python-sdk[ark]'
from volcenginesdkarkruntime import Ark

client = Ark(
    api_key=os.environ.get("ARK_API_KEY"),
    # The base URL for model invocation.
    base_url="https://ark.cn-beijing.volces.com/api/v3",
)
completion = client.chat.completions.create(
    # Replace with your Model ID.
    model="doubao-seed-1-6-251015",
    messages=[
        {"role": "user", "content": "Hello"},
    ],
    # Custom request ID
    extra_headers={"X-Client-Request-Id": "My-Request-Id"},
)
print(completion.choices[0].message.content)
Text embedding
- Convert a text string to a vector
- Call the doubao-embedding-text-240715 model to convert the input text string into a vector, then print the vector dimension and its first 10 values.
- Note: for better performance, keep the total number of tokens across all texts at or below 4096, or the number of texts at or below 4.
import os
from volcenginesdkarkruntime import Ark

# Initialize the client
client = Ark(
    # Read your Ark API key from the environment variable
    api_key=os.environ.get("ARK_API_KEY"),
    base_url="https://ark.cn-beijing.volces.com/api/v3"
)
response = client.embeddings.create(
    model="doubao-embedding-text-240715",
    input="Function Calling is a key capability that connects large models with external tools and APIs",
    encoding_format="float"
)
# Print the result
print(f"Vector dimension: {len(response.data[0].embedding)}")
print(f"First 10 values: {response.data[0].embedding[:10]}")
- Convert a text file line by line
- The embedding model can generate embedding vectors from documents you provide. Using embedding_text.txt as the sample file, the code below converts the text file into vectors one line at a time.
import os
from volcenginesdkarkruntime import Ark

client = Ark(
    # Read your Ark API key from the environment variable
    api_key=os.environ.get("ARK_API_KEY"),
    base_url="https://ark.cn-beijing.volces.com/api/v3"
)
# Read text from the file and generate vectors
with open("embedding_text.txt", "r", encoding="utf-8") as f:
    # Split by line; each non-empty line is an independent input
    texts = [line.strip() for line in f if line.strip()]
response = client.embeddings.create(
    model="doubao-embedding-text-240715",
    input=texts,
    encoding_format="float"
)
# Print the result
print(f"Number of texts processed: {len(response.data)}")
print(f"Dimension of the first text vector: {len(response.data[0].embedding)}")
Similarity calculation
- Similarity scores between Doubao-embedding vectors can be computed with cosine similarity. The calculation splits into two steps:
  - Step 1: call the doubao-embedding API to obtain the embeddings, then apply L2 normalization to each embedding vector.
  - Step 2: take the dot product of the normalized vectors to obtain the cosine similarity.
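The two steps above can be sketched in plain Python. This is a minimal illustration that uses toy 3-dimensional vectors in place of real embeddings; the helper names `l2_normalize` and `cosine_similarity` are chosen here for illustration only.

```python
import math


def l2_normalize(vec):
    """Step 1: scale a vector so that its L2 norm is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Step 2: dot product of the two L2-normalized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Toy vectors standing in for real embeddings (assumption).
query_vec = [1.0, 2.0, 2.0]
doc_vec = [2.0, 4.0, 4.0]     # same direction as query_vec
other_vec = [2.0, -1.0, 0.0]  # orthogonal to query_vec

score_same = cosine_similarity(query_vec, doc_vec)
score_orth = cosine_similarity(query_vec, other_vec)
print(round(score_same, 6), round(score_orth, 6))
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score close to 0, regardless of their original lengths.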
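Vector dimension reduction
- Embedding is the process of representing unstructured data such as text and images as vectors so that computers can work with the meaning of language and images. The dimension is the number of values in the resulting vector. In text embedding, each dimension corresponds to one feature of the text.
- More dimensions: more independent features capture finer semantic detail and improve representation accuracy, but the data volume grows, which raises storage cost (memory and disk usage) and compute cost (similarity calculation and model inference time).
- Fewer dimensions: fewer features describe the text, so some detail may be lost, but the data volume shrinks, storage efficiency and compute speed improve, and resource usage and cost are lower.
- Different models support several dimensions. With dimension reduction you can choose a suitable dimension for embedding text and balance semantic accuracy, compute speed, and resource cost.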
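To make the trade-off concrete, the raw storage for a set of dense vectors can be estimated as number of vectors × dimension × bytes per value. The sketch below assumes 4-byte float32 values and one million vectors; both figures are illustrative assumptions rather than numbers from the Ark documentation, and `embedding_storage_bytes` is a hypothetical helper.

```python
def embedding_storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw size of dense vectors, ignoring index and metadata overhead."""
    return num_vectors * dim * bytes_per_value


NUM_VECTORS = 1_000_000  # illustrative corpus size (assumption)

for dim in (2560, 2048, 1024, 512):
    size_gib = embedding_storage_bytes(NUM_VECTORS, dim) / 2**30
    print(f"{dim:>5} dims: {size_gib:.2f} GiB")
```

Storage scales linearly with the dimension, so truncating 2560-dimensional vectors to 512 dimensions cuts the raw footprint to one fifth.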
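Doubao embedding model (Doubao Embedding) specifications
| Model ID | Supported dimensions | Notes |
|---|---|---|
| doubao-embedding-large-text-250515 | 2048 (can be reduced to 2048, 1024, 512, or 256) | Must be L2-normalized before use. L2 normalization divides each element of a vector by the vector's L2 norm (the square root of the sum of squared elements), giving the vector a length of 1. This removes scale differences while keeping the relative magnitudes. |
| doubao-embedding-large-text-240915 | 4096 (can be reduced to 2048, 1024, or 512) | None |
| doubao-embedding-text-240715 | 2560 (can be reduced to 2048, 1024, or 512) | None |
| doubao-embedding-text-240515 | 2048 (can be reduced to 1024 or 512) | None |
Recommended usage: taking doubao-embedding-text-240715 as an example of the common reduction approach, the full 2560 dimensions can be compressed to 512, 1024, or 2048 dimensions for storage and retrieval. The higher the dimension, the closer the results are to those of the full dimension.
How to reduce dimensions and compute similarity:
Reduce dimensions: take the first dim values of the vector returned by the embedding API.
Compute similarity: compute cosine similarity on the truncated embeddings.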
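from typing import List
import numpy as np

# Dimension reduction + L2 normalization
def sliced_norm_l2(vec: List[float], dim=2560) -> List[float]:
    # dim can be 512, 1024, or 2048 (2560 keeps the full vector)
    norm = float(np.linalg.norm(vec[:dim]))
    return [v / norm for v in vec[:dim]]

# Cosine similarity; `embeddings` holds the vectors returned by the embedding API
query_doc_relevance_score_2560d = np.matmul(
    sliced_norm_l2(embeddings[0], 2560),  # query vector
    sliced_norm_l2(embeddings[1], 2560),  # document vector
)
- For the doubao-embedding-large-text-250515 model, vectors must be L2-normalized after dimension reduction. The model supports several embedding dimensions (2048, 1024, 512, 256), and quality drops only slightly even at the lower dimensions.
- How to reduce dimensions and compute similarity:
  - Reduce dimensions: take the first dim values of the vector returned by the embedding API.
  - Normalize: apply L2 normalization so that all vectors have unit length, which keeps the cosine similarity accurate.
  - Compute similarity: compute cosine similarity on the truncated embeddings.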
from typing import List, Optional
import torch

def encode(
    client, inputs: List[str], is_query: bool = False, mrl_dim: Optional[int] = None
):
    # For query text, add an instruction template to improve retrieval quality
    if is_query:
        inputs = [f"Instruct: Given a web search query...\nQuery: {i}" for i in inputs]
    # Call the API to get the raw (not yet normalized) vectors
    resp = client.embeddings.create(
        model="doubao-embedding-large-text-250515",
        input=inputs,
        encoding_format="float",
    )
    # Convert to a tensor and reduce dimensions (keep the first mrl_dim values)
    embedding = torch.tensor([d.embedding for d in resp.data], dtype=torch.bfloat16)
    if mrl_dim is not None:
        assert mrl_dim in [256, 512, 1024, 2048], "only 256/512/1024/2048 dimensions are supported"
        embedding = embedding[:, :mrl_dim]
    # Normalization is required: cosine similarity is computed on L2-normalized vectors
    embedding = torch.nn.functional.normalize(embedding, dim=1, p=2).float().numpy()
    return embedding
Best practices
- The following program matches a query text against the vectors of a text corpus. The lines of embedding_text.txt serve as the corpus, and the Doubao-embedding model generates a vector for each line. When the program receives a query text, it embeds the query, matches it against the corpus vectors by cosine similarity, and returns the 3 most relevant texts with their similarity scores.
Step 1: Initialize the client
Import the required libraries and set the API key in preparation for the later data processing and analysis.
import json
import os
import numpy as np
# Import the Volcengine large model SDK
from volcenginesdkarkruntime import Ark

# Initialize the client
client = Ark(
    api_key=os.environ.get("ARK_API_KEY"),
    base_url="https://ark.cn-beijing.volces.com/api/v3"
)
Step 2: Read text from a file and generate vectors
Read the embedding_text.txt file, call the text embedding model API to generate a vector for each line, and save the texts and vectors to a JSON file.
def generate_and_save_embeddings(file_path="embedding_text.txt", output_path="embeddings.json"):
    with open(file_path, "r", encoding="utf-8") as f:
        texts = [line.strip() for line in f if line.strip()]
    # Call the embedding API
    response = client.embeddings.create(
        model="doubao-embedding-text-240715",
        input=texts,
        encoding_format="float"
    )
    # Build the results and save them
    results = [{"text": text, "embedding": data.embedding}
               for text, data in zip(texts, response.data)]
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(results, f)
    print(f"Generated and saved {len(results)} vectors to {output_path}")
    return results
Step 3: Load the precomputed vectors
Load the precomputed text vectors from the JSON file in preparation for the similarity calculation.
def load_embeddings(file_path="embeddings.json"):
    with open(file_path, "r", encoding="utf-8") as f:
        return json.load(f)
Step 4: Define the cosine similarity and similar-text search functions
Cosine similarity measures how similar two texts are, which makes a content-based text search possible. Given a query text, the search returns the texts most relevant to it.
# Cosine similarity between two vectors
def cosine_similarity(a, b):
    a = np.array(a)
    b = np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# search_similar_text: find the top N texts most similar to the query text
def search_similar_text(query_text, embeddings, top_n=3):
    # Generate the vector for the query text
    query_response = client.embeddings.create(
        model="doubao-embedding-text-240715",
        input=[query_text],
        encoding_format="float"
    )
    query_embedding = query_response.data[0].embedding
    # Compute similarity
    for item in embeddings:
        item["similarity"] = cosine_similarity(item["embedding"], query_embedding)
    # Sort and return the results
    sorted_results = sorted(embeddings, key=lambda x: x["similarity"], reverse=True)
    return sorted_results[:top_n]
Step 5: Test the search
Call search_similar_text with a query to retrieve the 3 texts most relevant to it, together with their similarity scores.
# Example: generate vectors and search
if __name__ == "__main__":
    # Generate or load the vectors
    try:
        embeddings = load_embeddings()
        print(f"Loaded {len(embeddings)} precomputed vectors")
    except FileNotFoundError:
        print("No precomputed vectors found; generating them from the file...")
        embeddings = generate_and_save_embeddings()
    # Run the search (sample query about the context caching mechanism)
    query = "Context caching (Context API) is an efficient caching mechanism provided by Ark, designed to optimize the performance and cost of generative AI across different interaction scenarios."
    results = search_similar_text(query, embeddings, top_n=3)
    # Print the results
    print(f"\nSearch query: '{query}'")
    for i, result in enumerate(results, 1):
        print(f"\nTop {i} (similarity: {result['similarity']:.4f}):")
        print(f"{result['text'][:200]}...")  # show the first 200 characters
Request parameters
model string required
ID of the model to call (Model ID). Activate the model service and look up the Model ID first.
You can also call the model through an Endpoint ID, which provides advanced capabilities such as rate limiting, billing type (prepaid/postpaid), run-status queries, monitoring, and security; see Get Endpoint ID.
input string / string[] required
List of content to embed. Chinese and English are supported. The input must meet the following conditions:
- It must not exceed the model's maximum number of input tokens. For doubao-embedding models, the limit is 4096 tokens per list element (not per request).
- It must not be an empty list, and no list member may be an empty string.
- Each text is UTF-8 encoded and at most 100,000 bytes long.
- For better performance, keep the total number of tokens across all texts at or below 4096, or the number of texts at or below 4.
encoding_format string / null default: float
Allowed values: float, base64, null.
Format of the returned embedding.
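The `input` constraints above can be checked on the client before a request is sent. The following is a minimal sketch: `validate_inputs` and `split_into_batches` are hypothetical helpers, and the per-element 4096-token limit is not checked because counting tokens requires the model's tokenizer.

```python
MAX_TEXT_BYTES = 100_000  # per-text UTF-8 limit stated above
RECOMMENDED_BATCH = 4     # recommended number of texts per request stated above


def validate_inputs(texts):
    """Return the inputs as a list, or raise ValueError if a rule is violated."""
    if isinstance(texts, str):
        texts = [texts]
    texts = list(texts)
    if not texts:
        raise ValueError("input must not be an empty list")
    for i, text in enumerate(texts):
        if not isinstance(text, str) or text == "":
            raise ValueError(f"input[{i}] must be a non-empty string")
        if len(text.encode("utf-8")) > MAX_TEXT_BYTES:
            raise ValueError(f"input[{i}] exceeds {MAX_TEXT_BYTES} UTF-8 bytes")
    return texts


def split_into_batches(texts, batch_size=RECOMMENDED_BATCH):
    """Validate the texts, then group them into lists of at most batch_size."""
    texts = validate_inputs(texts)
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


batches = split_into_batches([f"text {n}" for n in range(9)])
print([len(batch) for batch in batches])
```

Each batch can then be passed as the `input` of one embeddings request.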
Response parameters
id string
Unique identifier of this request.
model string
Name and version of the model actually used for this request.
created integer
Unix timestamp (in seconds) of when this request was created.
object string
Always list.
data object
Algorithm output of this request.
data.index integer
Index of the vector, matching the order of the items in the request's input list.
data.embedding float[]
Embedding vector for the corresponding item.
data.object string
Always embedding.
usage object
Token usage of this request.
usage.prompt_tokens integer
Number of tokens in the input.
usage.total_tokens integer
Total number of tokens consumed by this request (input + output).
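As an illustration of how these fields fit together, the sketch below reads a response that has already been converted to a plain dictionary. The sample values (ID, timestamp, vectors, token counts) are invented for the example, and `embeddings_in_input_order` is a hypothetical helper; with the SDK, the same fields are available as attributes, as in `response.data[0].embedding`.

```python
# Hypothetical response shaped after the field list above; all values are made up.
sample_response = {
    "id": "example-request-id",
    "model": "doubao-embedding-text-240715",
    "created": 1700000000,
    "object": "list",
    "data": [
        {"index": 1, "object": "embedding", "embedding": [0.3, 0.4]},
        {"index": 0, "object": "embedding", "embedding": [0.1, 0.2]},
    ],
    "usage": {"prompt_tokens": 12, "total_tokens": 12},
}


def embeddings_in_input_order(response):
    """Return the vectors sorted by data.index, i.e. in the order of the input list."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


vectors = embeddings_in_input_order(sample_response)
print(len(vectors), "vectors,", sample_response["usage"]["total_tokens"], "tokens used")
```

Sorting by `index` keeps each vector aligned with its input text even if the items arrive in a different order.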
Multimodal embedding
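Doubao-embedding-vision is a multimodal embedding model developed by ByteDance. It converts mixed input of text, images, and video into a unified vector representation, which helps you process cross-modal data more efficiently and build accurate text-to-image, image-to-image, and mixed text-and-image search.
The model currently supports the following vector output types:
Dense vectors (Dense Embedding): supported by default in all versions.
Sparse vectors (Sparse Embedding): supported only from doubao-embedding-vision-250615 onward, and only for text input.
Multimodal embedding models (Vision Embedding)
| Model version | Input capability | Sparse vectors (Sparse) | instructions field |
|---|---|---|---|
| vision-250328 | Up to 1 text + 1 image | Not supported | Not supported |
| vision-250615 | Unlimited number of texts/images/videos | Supported (text input only) | Not supported |
| vision-251215 and later | Unlimited number of texts/images/videos | Supported (text input only) | Supported |
Note (important): the configuration of the instructions field directly determines the model's inference quality. To noticeably improve the accuracy of the vector representation, tailor the instruction to your specific business scenario. Do not use the system default as-is. For details, see Set the instructions field (recommended).
This document does not cover how to use instructions.
Enable sparse vectors (Sparse Embedding)
Supported only by doubao-embedding-vision-250615 and later versions. Sparse vectors support plain text input only.
- TOS Vectors does not support sparse vectors. Do not use sparse vectors in scenarios that combine this model with TOS Vectors.
Sparse vectors are controlled through the separate sparse_embedding field. Example: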
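{
    "model": "doubao-embedding-vision-251215",
    "input": [
        {
            "type":"text",
            "text":"The sky is very blue"
        }
    ],
    "sparse_embedding": {
        "type":"enabled" # enable sparse vectors; the default is "disabled"
    },
    "encoding_format":"float"
}
Set the vector dimension with dimensions
Embedding is the process of representing unstructured data such as text and images as vectors so that computers can work with the meaning of language and images. The vector dimension is the number of elements in the resulting vector, that is, the number of semantic or image features it encodes. In multimodal embedding, each dimension corresponds to a feature of the text or to a visual feature of the image such as pixels or color.
doubao-embedding-vision-250615 and later versions support the dimensions parameter, which sets the output dimension of the dense vector and applies to the data.embedding field of the output.
Note
Only dense vectors support a configurable dimension. Sparse vectors have a fixed dimension and do not support it.
The doubao-embedding-vision-250328 model does not support this field; see the section on implementing dimension reduction in code.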
{
    "model": "doubao-embedding-vision-250615",
    "input": [
        {
            "type":"text",
            "text":"The sky is very blue"
        }
    ],
    "dimensions": 1024, # set the vector dimension
    "encoding_format":"float"
}
Image format notes
How to pass images: as an image URL or as a Base64-encoded image. When using an image URL, make sure the URL is accessible.
Image file size: each image must be smaller than 10 MB. When using Base64 encoding, the request body must not exceed 64 MB.
Image pixels: the model accepts images of flexible sizes, provided that the width and height (in px) are each greater than 14 and the total pixel count (width × height, in px) is less than 36 million.
Number of images: doubao-embedding-vision-250615 and later versions support an unlimited number of mixed video, text, and image inputs. The doubao-embedding-vision-250328 model supports at most 1 text and 1 image.
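The size and pixel limits above can be checked before an image is uploaded. The sketch below takes the width, height, and file size as plain numbers so that it does not depend on any image library. `check_image_constraints` is a hypothetical helper, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # "smaller than 10 MB"; 1 MB = 1024 * 1024 bytes (assumption)
MIN_SIDE_PX = 14                    # width and height must each be greater than this
MAX_PIXELS = 36_000_000             # width * height must be less than this


def check_image_constraints(width, height, size_bytes):
    """Return a list of violated rules; an empty list means the image is acceptable."""
    problems = []
    if size_bytes >= MAX_IMAGE_BYTES:
        problems.append("file must be smaller than 10 MB")
    if width <= MIN_SIDE_PX or height <= MIN_SIDE_PX:
        problems.append("width and height must each be greater than 14 px")
    if width * height >= MAX_PIXELS:
        problems.append("width x height must be less than 36 million px")
    return problems


print(check_image_constraints(1024, 768, 500_000))
print(check_image_constraints(8000, 5000, 500_000))
```

The first call returns an empty list; the second reports the pixel-count rule because 8000 × 5000 is 40 million pixels.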
Best practice: multimodal similarity matching example
Step 1: Import the libraries
Import the required libraries and set the API key in preparation for the later data processing and analysis.
import os
import numpy as np
from volcenginesdkarkruntime import Ark
from sklearn.metrics.pairwise import cosine_similarity
Step 2: Get a vector
Define a function that converts a single text or image into a vector. It supports two input types: text and image URL. It calls the doubao-embedding-vision-241215 model to get a vector in float format, then converts it to a numpy array and flattens it to one dimension.
def get_embedding(input_data, input_type="text"):
    """Call the Volcengine Ark API to get the vector for a single text or image."""
    client = Ark(api_key=os.environ.get("ARK_API_KEY"))
    if input_type == "text":
        input_item = {"type": "text", "text": input_data}
    elif input_type == "image_url":
        input_item = {"type": "image_url", "image_url": {"url": input_data}}
    else:
        raise ValueError("input_type must be 'text' or 'image_url'")
    try:
        resp = client.multimodal_embeddings.create(
            model="doubao-embedding-vision-241215",
            encoding_format="float",
            input=[input_item]
        )
        # data may be exposed as a dict or as an object, so handle both
        data = getattr(resp, "data", None)
        embedding = data.get("embedding") if isinstance(data, dict) else getattr(data, "embedding", None)
        if embedding is None:
            raise ValueError("Unexpected API response format: no embedding found")
        # Make sure the vector is a numpy array flattened to one dimension
        return np.array(embedding).flatten()
    except Exception as e:
        print(f"  Failed to get embedding, input type: {input_type}, error: {str(e)}")
        raise
Step 3: Build the vector store
Process a list of image URLs in bulk, generate the vector for each image, and build the vector store.
def generate_image_embeddings(image_urls):
    """Generate image vectors in bulk and build the vector store."""
    print(f"[1/3] Generating vectors for {len(image_urls)} images...")
    embeddings = []
    for i, url in enumerate(image_urls):
        try:
            embedding = get_embedding(url, "image_url")
            embeddings.append({
                "image_url": url,
                "embedding": embedding
            })
            print(f"  [{i+1}/{len(image_urls)}] Succeeded: {url}")
        except Exception as e:
            print(f"  [{i+1}/{len(image_urls)}] Failed: {url} - {str(e)}")
            continue
    if not embeddings:
        raise ValueError("Vector generation failed for all images")
    print(f"[2/3] Done: {len(embeddings)} valid vectors")
    return embeddings
Step 4: Compute image similarity
Cosine similarity measures how similar a text and an image are, which makes a content-based image search possible. Given a text description, the search returns the images most relevant to it.
def search_similar_images(query_embedding, embeddings, top_n=1, query_type="text"):
    """Find the images most similar to the query vector."""
    print(f"\n[3/3] Searching for the images most similar to the {query_type}...")
    results = []
    # Make sure the query vector is a numpy array with the right shape
    query_vec = np.array(query_embedding).reshape(1, -1)
    for item in embeddings:
        # Reshape to a 2-D array, as required by the similarity function
        item_vec = np.array(item["embedding"]).reshape(1, -1)
        similarity = cosine_similarity(query_vec, item_vec)[0][0]
        results.append({
            "image_url": item["image_url"],
            "similarity": similarity
        })
    results.sort(key=lambda x: x["similarity"], reverse=True)
    print(f"  - Similarity calculation finished, {len(results)} results")
    return results[:top_n]
Step 5: Example usage
To test the search, call generate_image_embeddings to build the image vector store, search for similar images with a text query, and return the most similar result. Sample code:
if __name__ == "__main__":
    image_urls = [
        "https://ark-project.tos-cn-beijing.volces.com/doc_image/Fruit1.jpg",
        "https://ark-project.tos-cn-beijing.volces.com/doc_image/Fruit2.jpg",
        "https://ark-project.tos-cn-beijing.volces.com/doc_image/Fruit3.jpg",
        "https://ark-project.tos-cn-beijing.volces.com/doc_image/Fruit4.jpg",
        "https://ark-project.tos-cn-beijing.volces.com/doc_image/Fruit5.jpg"
    ]
    query_text = "banana"
    try:
        # Generate the image vectors
        image_embs = generate_image_embeddings(image_urls)
        # Generate the text vector
        text_emb = get_embedding(query_text, "text")
        print(f"Text vector dimension: {len(text_emb)}")
        # Search for similar images
        similar_images = search_similar_images(text_emb, image_embs)
        print(f"Most similar image: {similar_images[0]['image_url']}")
        print(f"Similarity score: {similar_images[0]['similarity']:.4f}")
    except Exception as e:
        print(f"Program failed: {e}")
Related documents
Implementing vector dimension reduction in code
Note: doubao-embedding-vision-250615 and later versions support setting the vector dimension with dimensions.
Sparse vectors do not support dimension reduction.
doubao-embedding-vision-241215 produces 3072-dimensional vectors and does not support dimension reduction.
doubao-embedding-vision-250328 has a maximum dimension of 2048, which can be compressed to 1024 dimensions for storage and retrieval. The higher the dimension, the closer the results are to those of the full dimension.
How to reduce dimensions and compute similarity:
Reduce dimensions: take the first dim values of the vector returned by the embedding API.
Compute similarity: compute cosine similarity on the truncated embeddings.
from typing import List
import numpy as np

# Dimension reduction + L2 normalization
def sliced_norm_l2(vec: List[float], dim=2048) -> List[float]:
    # dim can be 1024 (2048 keeps the full vector)
    norm = float(np.linalg.norm(vec[:dim]))
    return [v / norm for v in vec[:dim]]

# Cosine similarity; `embeddings` holds the vectors returned by the embedding API
query_doc_relevance_score_2048d = np.matmul(
    sliced_norm_l2(embeddings[0], 2048),  # query vector
    sliced_norm_l2(embeddings[1], 2048),  # document vector
)
Base64-encoded input
If the video or image you want to pass is stored locally, you can convert it to Base64 and then submit it to the model. A simple conversion example follows the rules below.
Note
When passing Base64-encoded content, follow these rules:
For an image:
The format is data:image/<image format>;base64,<Base64 encoding>, where:
Image format: jpeg, png, gif, and so on. For the supported image formats, see the image format notes.
Base64 encoding: the Base64 encoding of the image.
For a video:
The format is data:video/<video format>;base64,<Base64 encoding>, where:
Video format: MP4, AVI, and so on. For the supported video formats, see the video format notes.
Base64 encoding: the Base64 encoding of the video.
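The format rules above can be sketched as a small helper that builds the full data URL from raw bytes. `to_data_url` and `image_input_item` are hypothetical names used for illustration; the item structure follows the `image_url` input type used elsewhere in this document.

```python
import base64


def to_data_url(raw_bytes, media_type, media_format):
    """Build data:<media_type>/<media_format>;base64,<Base64 encoding>."""
    if media_type not in ("image", "video"):
        raise ValueError("media_type must be 'image' or 'video'")
    encoded = base64.b64encode(raw_bytes).decode("utf-8")
    return f"data:{media_type}/{media_format};base64,{encoded}"


def image_input_item(raw_bytes, image_format):
    """Wrap Base64-encoded image bytes as an image_url input item."""
    return {
        "type": "image_url",
        "image_url": {"url": to_data_url(raw_bytes, "image", image_format)},
    }


# Placeholder bytes stand in for the contents of a real image file.
item = image_input_item(b"hello", "png")
print(item["image_url"]["url"])
```

In practice the bytes would come from reading the local file in binary mode, for example `open(path, "rb").read()`.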
import base64

# Convert the image at the given path to a Base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# The image to pass to the model
image_path = "path_to_your_image.jpg"
# Convert the image to Base64
base64_image = encode_image(image_path)
After conversion, the image URL takes the following format:
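{
    "type": "image_url",
    "image_url": {
        "url": f"data:image/<IMAGE_FORMAT>;base64,{base64_image}"
    }
},
A solution for batch embedding
To work around the API limit of a single image per request, this code improves batch throughput for image embedding with asynchronous concurrency plus batch grouping: it uses asyncio to create asynchronous tasks, groups the images into batches of a configured size, and calls the API concurrently. It uses an all-or-nothing mode, in which the whole group fails if any task fails, to keep the results of each group consistent. It supports retries on failure and saves successful and failed results separately (including vector details, error messages, and the input image URLs).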
import asyncio
import os
from pathlib import Path
from volcenginesdkarkruntime import AsyncArk

class MultimodalEmbedder:
    """Batch tool for multimodal embedding (all-or-nothing failure mode)."""

    def __init__(self, api_key: str, model: str = "doubao-embedding-vision-241215",
                 batch_size: int = 10, retries: int = 2):
        self.api_key = api_key
        self.model = model
        self.batch_size = batch_size  # batch size setting (can be used to add sharding logic)
        self.retries = retries        # number of retries per request

    async def batch_process(self, items_list):
        """Core: run the embedding tasks concurrently (the whole group fails together)."""
        # 1. Initialize the async client with the configured API key
        async with AsyncArk(api_key=self.api_key, max_retries=self.retries) as client:
            # 2. Create one async task per entry of the input list
            batch_tasks = [
                asyncio.create_task(client.multimodal_embeddings.create(model=self.model, input=items))
                for items in items_list
            ]
            try:
                # 3. Wait for all tasks; any failure aborts the whole group
                batch_results = await asyncio.gather(*batch_tasks)
                return batch_results
            except Exception:
                # 4. Cancel the tasks that have not finished
                for task in batch_tasks:
                    if not task.done():
                        task.cancel()
                raise  # re-raise to keep the all-or-nothing semantics

    def batch_save(self, batch_results, output_dir: str = "embedding_results"):
        """Core: save the embedding results of the batch."""
        # 1. Create the output directory
        Path(output_dir).mkdir(exist_ok=True)
        # 2. Write the result file
        with open(f"{output_dir}/batch_embeddings.txt", "w", encoding="utf-8") as f:
            for idx, result in enumerate(batch_results, 1):
                # data may be exposed as a dict or as an object, so handle both
                data = result.data
                embedding = data.get("embedding", []) if isinstance(data, dict) else getattr(data, "embedding", [])
                f.write(f"Batch item #{idx} | vector length: {len(embedding)} | first 20 values: {embedding[:20]}...\n")
        print(f"Batch results saved: {output_dir}/batch_embeddings.txt")

if __name__ == "__main__":
    # 1. Configuration
    api_key = os.environ.get("ARK_API_KEY") or input("Enter API key: ").strip()
    if not api_key:
        raise ValueError("API key must not be empty")
    # 2. Initialize the batch processor
    embedder = MultimodalEmbedder(api_key=api_key, batch_size=10, retries=2)
    # 3. Build the batch input data
    batch_inputs = [
        [{"type": "text", "text": "The sky is blue and the sea is deep"}, {"type": "image_url", "image_url": {"url": "https://example.com/image1.jpg"}}],
        [{"type": "text", "text": "A sunny beach"}, {"type": "image_url", "image_url": {"url": "https://example.com/image2.jpg"}}]
    ]
    # 4. Run the batch and save the results
    try:
        batch_results = asyncio.run(embedder.batch_process(batch_inputs))
        embedder.batch_save(batch_results)
        print(f"Batch processing finished, {len(batch_results)} vectors generated")
    except Exception as e:
        print(f"Batch processing failed: {str(e)}")