CaptchaAI's Image/OCR solver supports over 27,500 CAPTCHA types across multiple writing systems. The language parameter tells the solver which character set to expect, directly affecting recognition accuracy. Using the wrong setting means the solver looks for the wrong characters.
Language Parameter Reference
| Value | Character sets | Best for |
|---|---|---|
| 0 | Not specified (default) | Latin-only CAPTCHAs: English, Spanish, French, German |
| 1 | Cyrillic only | Russian, Ukrainian, Bulgarian CAPTCHAs |
| 2 | Non-Latin characters | Chinese, Japanese, Korean, Arabic, Cyrillic, mixed scripts |
When to Use Each Setting
| CAPTCHA content | Setting | Why |
|---|---|---|
| English letters + numbers | 0 or omit | Default Latin recognition |
| Russian text | 1 or 2 | Cyrillic-specific model |
| Chinese characters | 2 | CJK character set required |
| Japanese hiragana/katakana | 2 | Non-Latin recognition needed |
| Arabic script | 2 | Non-Latin recognition needed |
| Korean hangul | 2 | Non-Latin recognition needed |
| Mixed Latin + Cyrillic | 2 | Handles multiple scripts |
| Numbers only | 0 or omit | Digits are universal |
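The table above can be folded into a small lookup so the choice is made in one place. The following is a minimal sketch only: `pick_language` and its script labels ("latin", "digits", "cyrillic", and so on) are hypothetical names chosen for illustration and are not part of the CaptchaAI API.

```python
def pick_language(scripts) -> int:
    """Map the scripts expected in a CAPTCHA to a language value (0, 1, or 2).

    `scripts` is any iterable of illustrative labels such as "latin",
    "digits", "cyrillic", "cjk", "arabic", or "hangul".
    """
    wanted = {s.lower() for s in scripts}
    if wanted and wanted <= {"latin", "digits"}:
        return 0  # Latin letters and/or digits: default recognition
    if wanted == {"cyrillic"}:
        return 1  # Cyrillic only
    return 2      # any other script, any mix, or unknown content


print(pick_language({"latin", "digits"}))    # English letters + numbers
print(pick_language({"cyrillic"}))           # Russian text
print(pick_language({"latin", "cyrillic"}))  # mixed scripts
```

An empty or unrecognised set falls through to 2, which matches the article's advice to use language=2 when the script is unknown.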
Python: Language-Aware CAPTCHA Solving
```python
import base64
import time

import requests

API_KEY = "YOUR_API_KEY"
SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"


def solve_image_captcha(image_path: str, language: int = 0) -> str:
    """Solve an image CAPTCHA with the specified language setting.

    Args:
        image_path: Path to the CAPTCHA image file.
        language: 0=Latin, 1=Cyrillic, 2=non-Latin/mixed.
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    params = {
        "key": API_KEY,
        "method": "base64",
        "body": image_b64,
        "json": 1,
    }
    # Only set language if non-default
    if language > 0:
        params["language"] = language

    resp = requests.post(SUBMIT_URL, data=params, timeout=30).json()
    if resp.get("status") != 1:
        raise RuntimeError(f"Submit: {resp.get('request')}")
    task_id = resp["request"]

    # Poll every 5 seconds, up to 24 times (about 2 minutes in total)
    for _ in range(24):
        time.sleep(5)
        poll = requests.get(RESULT_URL, params={
            "key": API_KEY, "action": "get", "id": task_id, "json": 1,
        }, timeout=15).json()
        if poll.get("request") == "CAPCHA_NOT_READY":
            continue
        if poll.get("status") == 1:
            return poll["request"]
        raise RuntimeError(f"Solve: {poll.get('request')}")
    raise RuntimeError("Timeout")


def detect_script(text: str) -> str:
    """Return the script of the first non-Latin character, or "latin" if none is found."""
    for ch in text:
        cp = ord(ch)
        if 0x0400 <= cp <= 0x04FF:
            return "cyrillic"
        if 0x4E00 <= cp <= 0x9FFF:
            return "cjk"
        if 0x3040 <= cp <= 0x30FF:  # hiragana and katakana
            return "japanese"
        if 0xAC00 <= cp <= 0xD7AF:  # hangul syllables
            return "korean"
        if 0x0600 <= cp <= 0x06FF:
            return "arabic"
        if 0x0590 <= cp <= 0x05FF:
            return "hebrew"
    return "latin"


# --- Usage examples ---
if __name__ == "__main__":
    # Latin CAPTCHA (default)
    latin_text = solve_image_captcha("english_captcha.png", language=0)
    print(f"Latin: {latin_text} (script: {detect_script(latin_text)})")

    # Russian CAPTCHA
    cyrillic_text = solve_image_captcha("russian_captcha.png", language=1)
    print(f"Cyrillic: {cyrillic_text} (script: {detect_script(cyrillic_text)})")

    # Chinese CAPTCHA
    chinese_text = solve_image_captcha("chinese_captcha.png", language=2)
    print(f"CJK: {chinese_text} (script: {detect_script(chinese_text)})")

    # Mixed script CAPTCHA
    mixed_text = solve_image_captcha("mixed_captcha.png", language=2)
    print(f"Mixed: {mixed_text} (script: {detect_script(mixed_text)})")
```
JavaScript: Multi-Language Solver
```javascript
// Requires Node.js 18+ for the built-in fetch API.
const fs = require("fs");

const API_KEY = "YOUR_API_KEY";
const SUBMIT_URL = "https://ocr.captchaai.com/in.php";
const RESULT_URL = "https://ocr.captchaai.com/res.php";

async function solveImageCaptcha(imagePath, language = 0) {
  const imageB64 = fs.readFileSync(imagePath, "base64");
  const params = {
    key: API_KEY,
    method: "base64",
    body: imageB64,
    json: "1",
  };
  // Only set language if non-default
  if (language > 0) params.language = String(language);

  const body = new URLSearchParams(params);
  const resp = await (await fetch(SUBMIT_URL, { method: "POST", body })).json();
  if (resp.status !== 1) throw new Error(`Submit: ${resp.request}`);
  const taskId = resp.request;

  // Poll every 5 seconds, up to 24 times (about 2 minutes in total)
  for (let i = 0; i < 24; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    // URLSearchParams takes care of escaping the query values
    const query = new URLSearchParams({ key: API_KEY, action: "get", id: taskId, json: "1" });
    const poll = await (await fetch(`${RESULT_URL}?${query}`)).json();
    if (poll.request === "CAPCHA_NOT_READY") continue;
    if (poll.status === 1) return poll.request;
    throw new Error(`Solve: ${poll.request}`);
  }
  throw new Error("Timeout");
}

// Returns the script of the first non-Latin character, or "latin" if none is found
function detectScript(text) {
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0x0400 && cp <= 0x04ff) return "cyrillic";
    if (cp >= 0x4e00 && cp <= 0x9fff) return "cjk";
    if (cp >= 0x3040 && cp <= 0x30ff) return "japanese";
    if (cp >= 0xac00 && cp <= 0xd7af) return "korean";
    if (cp >= 0x0600 && cp <= 0x06ff) return "arabic";
    if (cp >= 0x0590 && cp <= 0x05ff) return "hebrew";
  }
  return "latin";
}

// Hint-based helper: maps a script hint to a language value.
// It does not inspect the image; "auto" simply selects language=2.
async function solveWithAutoLanguage(imagePath, hint = "auto") {
  const languageMap = {
    latin: 0,
    cyrillic: 1,
    russian: 1,
    chinese: 2,
    japanese: 2,
    korean: 2,
    arabic: 2,
    auto: 2, // language=2 handles all scripts
  };
  const language = languageMap[hint] ?? 2;
  return solveImageCaptcha(imagePath, language);
}

// Usage (wrapped in an async function because CommonJS has no top-level await)
async function main() {
  const text = await solveWithAutoLanguage("captcha.png", "auto");
  console.log(`Text: ${text}, Script: ${detectScript(text)}`);
}

main().catch((err) => console.error(err));
```
Common Mistakes
| Mistake | Effect | Fix |
|---|---|---|
| Using language=0 for Cyrillic | Returns Latin lookalikes (Latin B, U+0042, instead of Cyrillic В, U+0412) | Use language=1 or language=2 |
| Using language=1 for Chinese | Solver expects Cyrillic, gets CJK | Use language=2 for non-Latin scripts |
| Omitting language for mixed scripts | May misidentify ambiguous characters | Always use language=2 for mixed content |
| Assuming default handles everything | Latin-only model misses non-Latin chars | Set language explicitly for non-English sites |
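One way to catch the first mistake in the table after the fact is to check whether a solved string mixes Latin and Cyrillic letters. The sketch below uses only Python's standard `unicodedata` module; the function names `scripts_in` and `looks_like_homoglyph_mixup` are hypothetical, and the check is a heuristic rather than anything the solver itself provides.

```python
import unicodedata


def scripts_in(text: str) -> set:
    """Return the script prefixes (e.g. "LATIN", "CYRILLIC") of the letters in text."""
    found = set()
    for ch in text:
        if ch.isalpha():
            # Unicode names start with the script, e.g. "CYRILLIC CAPITAL LETTER VE"
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def looks_like_homoglyph_mixup(text: str) -> bool:
    """True when the text contains both Latin and Cyrillic letters."""
    return {"LATIN", "CYRILLIC"} <= scripts_in(text)


# "\u041f\u0420\u0418\u0412\u0415\u0422" is all Cyrillic; the second string swaps in a Latin "P".
print(looks_like_homoglyph_mixup("\u041f\u0420\u0418\u0412\u0415\u0422"))
print(looks_like_homoglyph_mixup("\u041fP\u0418\u0412\u0415\u0422"))
```

A CAPTCHA that genuinely mixes Latin and Cyrillic will also trip this check, so it is best treated as a prompt to re-examine the language setting, not as proof of an error.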
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Correct-looking text but wrong Unicode | Cyrillic/Latin homoglyph confusion | Check codepoints: hex(ord(char)) |
| Empty result for CJK CAPTCHA | Language not set to 2 | Set language=2 for Chinese/Japanese/Korean |
| Digits mixed with other characters come back wrong | Digits are universal, but the surrounding script still determines which setting fits | Use language=2 for any non-Latin context |
| Solve rate drops when switching sites | Different language requirements | Match language param to each site's character set |
| Result encoding garbled | HTTP response not decoded as UTF-8 | Force UTF-8: response.encoding = 'utf-8' |
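The first and last rows of the table can be checked locally without calling the API. The following sketch, using only the standard library, lists the codepoint and Unicode name of each character and shows how the same bytes read differently under the wrong decoding. `describe_codepoints` is a hypothetical helper name.

```python
import unicodedata


def describe_codepoints(text: str) -> list:
    """Return one "U+XXXX NAME" string per character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# Latin B and Cyrillic В look identical but are different characters
for line in describe_codepoints("B\u0412"):
    print(line)
# U+0042 LATIN CAPITAL LETTER B
# U+0412 CYRILLIC CAPITAL LETTER VE

# The same bytes decoded two ways: only UTF-8 recovers the original text
raw = "Привет".encode("utf-8")
garbled = raw.decode("latin-1")  # wrong codec: produces mojibake
fixed = raw.decode("utf-8")      # correct codec
print(garbled != fixed, fixed == "Привет")
```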
FAQ
What happens if I use the wrong language setting?
The solver attempts to match the image against the wrong character models. Latin characters may be returned for Cyrillic text (Latin B, U+0042, instead of Cyrillic В, U+0412), or the solve may fail entirely for CJK characters. Setting language=2 is the safest fallback for unknown scripts.
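The fallback idea can be sketched as a thin wrapper that retries once with language=2 when the first attempt fails or returns nothing. This is an illustration under stated assumptions: `solve_with_fallback` is a hypothetical name, and `solve` stands for any callable with the shape `solve(image_path, language) -> str` that raises `RuntimeError` on failure, as the Python example in this article does.

```python
from typing import Callable


def solve_with_fallback(
    solve: Callable[[str, int], str],
    image_path: str,
    first_choice: int = 0,
) -> str:
    """Try first_choice; if it fails or returns an empty string, retry with language=2."""
    if first_choice != 2:
        try:
            text = solve(image_path, first_choice)
            if text:
                return text
        except RuntimeError:
            pass  # fall through to the language=2 attempt
    # Either language=2 was requested directly or the first attempt failed
    return solve(image_path, 2)
```

Passing the solver in as an argument keeps the wrapper testable without network access. Note that each attempt is a separate submission, so the retry roughly doubles the worst-case wait.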
Can I use language=2 for everything?
Yes, but it's slightly less optimised for pure Latin CAPTCHAs. For English-only sites, omitting the parameter (default Latin) gives the best accuracy. For any site where you're unsure about the script, language=2 handles all character sets.
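When solving for several sites, this advice can be captured in a per-site table that omits the parameter for Latin-only sites and defaults to language=2 for anything unlisted. The hostnames and the names `SITE_LANGUAGE` and `language_params` below are placeholders invented for this sketch.

```python
from urllib.parse import urlparse

# Hypothetical per-site settings; replace the placeholder hostnames with real ones
SITE_LANGUAGE = {
    "shop.example.com": 0,   # English-only: rely on the default
    "forum.example.ru": 1,   # Cyrillic only
    "portal.example.cn": 2,  # CJK
}


def language_params(page_url: str) -> dict:
    """Return the extra request parameters for a site; unknown sites get language=2."""
    host = urlparse(page_url).hostname or ""
    language = SITE_LANGUAGE.get(host, 2)
    # For Latin-only sites the parameter is omitted entirely
    return {} if language == 0 else {"language": language}


print(language_params("https://shop.example.com/login"))
print(language_params("https://unknown.example.org/"))
```

The returned dictionary can be merged into the submit parameters before the request is sent.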
Does the language parameter affect solve speed?
Minimally. The solver selects the appropriate OCR model based on the parameter. More complex character sets (CJK, with thousands of characters) may take marginally longer than Latin (26 letters plus digits), but the difference is typically under a second.
Next Steps
Solve CAPTCHAs in any language — get your CaptchaAI API key and set the right language parameter.