Processing 10,000 CAPTCHAs per hour means ~2.8 solves per second sustained. That's achievable with the right architecture. This guide walks through the math, the code, and the tuning required to reach this throughput using CaptchaAI.
## The Math
If a single reCAPTCHA v2 solve takes 15 seconds (median):
- Sequential: 3,600s / 15s = 240 solves/hour
- To reach 10,000/hour: you need ~42 concurrent solves in flight at all times
The key insight: you're not waiting for CaptchaAI to be faster — you're overlapping enough requests that 42 solves complete during the same 15-second window.
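This arithmetic is just Little's law: in-flight work equals arrival rate times service time. A quick sanity-check helper (the function name is ours, not part of any API):

```python
# Hypothetical helper: compute the concurrency needed for a target rate
# using Little's law (in-flight = target rate x median solve time).
import math

def required_concurrency(target_per_hour: int, median_solve_s: float) -> int:
    """Return how many solves must be in flight at all times."""
    per_second = target_per_hour / 3600.0
    return math.ceil(per_second * median_solve_s)
```

With a 15-second median, `required_concurrency(10_000, 15)` gives the ~42 figure used throughout this guide.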
## Architecture
```
┌──────────┐     ┌────────────┐     ┌─────────────┐     ┌──────────┐
│   Task   │────▶│   Submit   │────▶│  CaptchaAI  │────▶│  Result  │
│  Queue   │     │  Workers   │     │     API     │     │  Store   │
│ (Redis)  │     │  (async)   │     │             │     │   (DB)   │
└──────────┘     └─────┬──────┘     └─────────────┘     └────┬─────┘
                       │            ┌──────────┐             ▲
                       └───────────▶│   Poll   │─────────────┘
                                    │ Workers  │
                                    └──────────┘
```
Components:
- Task queue — Holds pending CAPTCHA tasks with sitekeys and URLs
- Submit workers — Send tasks to CaptchaAI API concurrently
- Poll workers — Check for results at optimized intervals
- Result store — Saves tokens as they arrive
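As a sketch, the task-queue component can be as small as a Redis list. The queue name `captcha:tasks` and the helper names below are illustrative, and `enqueue`/`dequeue` assume a `redis.Redis` client from the `redis-py` package:

```python
# Minimal task-queue sketch. The "client" argument is assumed to be a
# redis.Redis instance (from the redis-py package); the key name is ours.
import json

QUEUE_KEY = "captcha:tasks"

def encode_task(sitekey: str, pageurl: str) -> str:
    """Serialize one pending CAPTCHA task for the queue."""
    return json.dumps({"sitekey": sitekey, "pageurl": pageurl})

def decode_task(raw: str) -> dict:
    return json.loads(raw)

def enqueue(client, sitekey: str, pageurl: str) -> None:
    """Push a task onto the shared list (RPUSH = FIFO with BLPOP)."""
    client.rpush(QUEUE_KEY, encode_task(sitekey, pageurl))

def dequeue(client, timeout: int = 5):
    """Blocking pop; returns None if the queue stays empty for `timeout` s."""
    item = client.blpop(QUEUE_KEY, timeout=timeout)
    return decode_task(item[1]) if item else None
```

Because each worker does a blocking pop, any number of submit workers, on any number of machines, can pull from the same list without coordination.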
## Python: Async Pipeline
```python
# high_throughput_solver.py
import os
import asyncio
import time

import aiohttp

API_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")
BASE_URL = "https://ocr.captchaai.com"
MAX_CONCURRENT = 50      # Max simultaneous solves
POLL_INTERVAL = 5        # Seconds between polls
INITIAL_WAIT = 12        # Seconds before first poll
MAX_POLL_ATTEMPTS = 25   # Give up after ~2 minutes of polling

semaphore = asyncio.Semaphore(MAX_CONCURRENT)
stats = {"submitted": 0, "solved": 0, "failed": 0, "start": 0}


async def solve_one(session, sitekey, pageurl):
    """Submit and poll a single CAPTCHA."""
    async with semaphore:
        try:
            # Submit
            async with session.get(f"{BASE_URL}/in.php", params={
                "key": API_KEY, "method": "userrecaptcha",
                "googlekey": sitekey, "pageurl": pageurl, "json": "1",
            }) as resp:
                result = await resp.json(content_type=None)
            if result.get("status") != 1:
                stats["failed"] += 1
                return None
            stats["submitted"] += 1
            task_id = result["request"]

            # Wait before first poll
            await asyncio.sleep(INITIAL_WAIT)

            # Poll
            for _ in range(MAX_POLL_ATTEMPTS):
                async with session.get(f"{BASE_URL}/res.php", params={
                    "key": API_KEY, "action": "get",
                    "id": task_id, "json": "1",
                }) as resp:
                    poll_result = await resp.json(content_type=None)
                if poll_result.get("status") == 1:
                    stats["solved"] += 1
                    return poll_result["request"]
                if poll_result.get("request") != "CAPCHA_NOT_READY":
                    stats["failed"] += 1
                    return None
                await asyncio.sleep(POLL_INTERVAL)

            stats["failed"] += 1  # Timed out
            return None
        except Exception:
            stats["failed"] += 1
            return None


async def run_batch(tasks):
    """Process a batch of CAPTCHA tasks concurrently."""
    connector = aiohttp.TCPConnector(
        limit=MAX_CONCURRENT,
        keepalive_timeout=60,
    )
    async with aiohttp.ClientSession(connector=connector) as session:
        coros = [
            solve_one(session, task["sitekey"], task["pageurl"])
            for task in tasks
        ]
        return await asyncio.gather(*coros)


async def main():
    # Generate test tasks (replace with your task source)
    tasks = [
        {
            "sitekey": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
            "pageurl": "https://www.google.com/recaptcha/api2/demo",
        }
        for _ in range(100)  # Start with 100 tasks
    ]
    stats["start"] = time.time()
    print(f"Processing {len(tasks)} tasks with {MAX_CONCURRENT} concurrent workers")
    await run_batch(tasks)
    elapsed = time.time() - stats["start"]
    print(f"\nCompleted in {elapsed:.0f}s")
    print(f"Submitted: {stats['submitted']}")
    print(f"Solved: {stats['solved']}")
    print(f"Failed: {stats['failed']}")
    print(f"Throughput: {stats['solved'] / (elapsed / 3600):.0f} solves/hour")


asyncio.run(main())
```
## JavaScript: Concurrent Pipeline
```javascript
// high_throughput_solver.js
const axios = require('axios');
const https = require('https');

const API_KEY = process.env.CAPTCHAAI_KEY || 'YOUR_API_KEY';
const BASE = 'https://ocr.captchaai.com';
const MAX_CONCURRENT = 50;

const agent = new https.Agent({ keepAlive: true, maxSockets: MAX_CONCURRENT });
const api = axios.create({ baseURL: BASE, httpsAgent: agent, timeout: 30000 });
const stats = { submitted: 0, solved: 0, failed: 0 };

async function solveOne(sitekey, pageurl) {
  try {
    const submit = await api.get('/in.php', {
      params: { key: API_KEY, method: 'userrecaptcha', googlekey: sitekey, pageurl, json: '1' },
    });
    if (submit.data.status !== 1) { stats.failed++; return null; }
    stats.submitted++;
    await new Promise(r => setTimeout(r, 12000)); // Wait before first poll
    for (let i = 0; i < 25; i++) {
      const poll = await api.get('/res.php', {
        params: { key: API_KEY, action: 'get', id: submit.data.request, json: '1' },
      });
      if (poll.data.status === 1) { stats.solved++; return poll.data.request; }
      if (poll.data.request !== 'CAPCHA_NOT_READY') { stats.failed++; return null; }
      await new Promise(r => setTimeout(r, 5000));
    }
    stats.failed++; // Timed out
    return null;
  } catch { stats.failed++; return null; }
}

async function runWithConcurrency(tasks, limit) {
  const results = [];
  const executing = new Set();
  for (const task of tasks) {
    const p = solveOne(task.sitekey, task.pageurl).then(r => {
      executing.delete(p);
      return r;
    });
    executing.add(p);
    results.push(p);
    if (executing.size >= limit) {
      await Promise.race(executing); // Wait for a slot to free up
    }
  }
  return Promise.all(results);
}

(async () => {
  const tasks = Array.from({ length: 100 }, () => ({
    sitekey: '6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-',
    pageurl: 'https://www.google.com/recaptcha/api2/demo',
  }));
  const start = Date.now();
  console.log(`Processing ${tasks.length} tasks, ${MAX_CONCURRENT} concurrent`);
  await runWithConcurrency(tasks, MAX_CONCURRENT);
  const elapsed = (Date.now() - start) / 1000;
  console.log(`\nDone in ${elapsed.toFixed(0)}s`);
  console.log(`Solved: ${stats.solved}, Failed: ${stats.failed}`);
  console.log(`Throughput: ${(stats.solved / (elapsed / 3600)).toFixed(0)} solves/hour`);
  agent.destroy();
})();
```
## Tuning Parameters
| Parameter | Conservative | Balanced | Aggressive |
|---|---|---|---|
| MAX_CONCURRENT | 20 | 50 | 100 |
| INITIAL_WAIT | 15s | 12s | 10s |
| POLL_INTERVAL | 7s | 5s | 3s |
| MAX_POLL_ATTEMPTS | 30 | 25 | 20 |
| Expected throughput | ~4,800/hr | ~10,000/hr | ~18,000/hr |
Start conservative and increase MAX_CONCURRENT until you see diminishing returns or increased error rates.
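One way to follow that advice programmatically is a simple ramp controller. The 5% and 10% error-rate thresholds mirror the monitoring guidance in this guide, but the step size and ceiling are assumptions:

```python
# Hypothetical ramp-up helper: raise concurrency while errors stay low,
# back off once they spike. Step size and ceiling are illustrative.
def next_concurrency(current: int, error_rate: float,
                     step: int = 10, ceiling: int = 100) -> int:
    """Return the concurrency to use for the next batch."""
    if error_rate > 0.10:                      # Overloaded: back off
        return max(current - step, step)
    if error_rate < 0.05:                      # Healthy: ramp up
        return min(current + step, ceiling)
    return current                             # In the gray zone: hold
```

Call it between batches with the error rate observed in the previous batch, starting from the Conservative column's value of 20.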
## Monitoring Throughput
Track these metrics in real-time:
- Solves per minute — Should stay at ~167 for 10K/hour target
- Error rate — Keep below 5%. If it spikes, reduce concurrency
- Queue depth — If growing, increase workers. If empty, you're over-provisioned
- P90 solve time — If increasing, CaptchaAI may be rate-limiting
## Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Throughput plateaus at ~5K/hr | Insufficient concurrency | Increase MAX_CONCURRENT to 80–100 |
| Error rate > 10% | Overloading API or bad proxies | Reduce concurrency, check proxy health |
| Memory usage growing | Unbounded task accumulation | Process results as they arrive, don't buffer |
| ERROR_NO_SLOT_AVAILABLE | CaptchaAI queue full | Back off and retry after 5 seconds |
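For ERROR_NO_SLOT_AVAILABLE specifically, a capped exponential back-off is a reasonable retry policy. This schedule (5 s base, doubling, 60 s cap) is an assumption on our part, not a CaptchaAI requirement:

```python
# Generate a capped exponential back-off schedule for retrying a submit
# that failed with ERROR_NO_SLOT_AVAILABLE. All numbers are illustrative.
def backoff_delays(base: float = 5.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 5):
    """Yield the wait in seconds before each retry attempt."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor
```

In the async pipeline, you would `await asyncio.sleep(d)` for each yielded delay before re-submitting, then give up once the generator is exhausted.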
## FAQ
### What's the CaptchaAI concurrency limit?
There's no hard limit on concurrent requests, but extremely high concurrency (500+) may trigger rate limiting. Start at 50 and scale up.
### Can I run this across multiple machines?
Yes. Use a shared queue (Redis, RabbitMQ) and run the worker script on multiple servers. Each worker pulls tasks independently.
### What about balance consumption at this rate?
At 10,000 solves/hour, monitor your balance closely. Use the balance check endpoint (res.php?action=getbalance) and set up alerts.
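A minimal balance check against the documented `res.php?action=getbalance` endpoint might look like this; the alert threshold is illustrative, and the endpoint is assumed to return the balance as plain text:

```python
# Poll the account balance and flag when it drops below a floor.
# ALERT_THRESHOLD is an illustrative value, not a CaptchaAI default.
import urllib.parse
import urllib.request

ALERT_THRESHOLD = 5.0  # Currency units; pick a floor covering ~1h of solving

def check_balance(api_key: str,
                  base_url: str = "https://ocr.captchaai.com") -> float:
    """Fetch the current balance (assumes a plain-text numeric response)."""
    params = urllib.parse.urlencode({"key": api_key, "action": "getbalance"})
    with urllib.request.urlopen(f"{base_url}/res.php?{params}",
                                timeout=10) as resp:
        return float(resp.read().decode().strip())

def balance_low(balance: float, threshold: float = ALERT_THRESHOLD) -> bool:
    return balance < threshold
```

Run it on a timer (every few minutes is plenty) and wire `balance_low` into whatever alerting you already use.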
## Next Steps
Build your high-throughput CAPTCHA pipeline — get your CaptchaAI API key.