Processing 10,000 CAPTCHAs per hour means ~2.8 solves per second sustained. That's achievable with the right architecture. This guide walks through the math, the code, and the tuning required to reach this throughput using CaptchaAI.
## The Math
If a single reCAPTCHA v2 solve takes 15 seconds (median):
- Sequential: 3,600s / 15s = 240 solves/hour
- To reach 10,000/hour: you need ~42 concurrent solves in flight at all times
The key insight: you're not waiting for CaptchaAI to be faster — you're overlapping enough requests that 42 solves complete during the same 15-second window.
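This arithmetic is just Little's law: in-flight work equals arrival rate times service time. A quick sanity-check helper (the function name is ours, not part of any API):

```python
# Hypothetical helper: compute the concurrency needed for a target rate
# using Little's law (in-flight = target rate x median solve time).
import math

def required_concurrency(target_per_hour: int, median_solve_s: float) -> int:
    """Return how many solves must be in flight at all times."""
    per_second = target_per_hour / 3600.0
    return math.ceil(per_second * median_solve_s)
```

With a 15-second median, `required_concurrency(10_000, 15)` gives the ~42 figure used throughout this guide.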
## Architecture
```
┌──────────┐     ┌────────────┐     ┌─────────────┐     ┌──────────┐
│   Task   │────▶│   Submit   │────▶│  CaptchaAI  │────▶│  Result  │
│  Queue   │     │  Workers   │     │     API     │     │  Store   │
│ (Redis)  │     │  (async)   │     │             │     │   (DB)   │
└──────────┘     └─────┬──────┘     └─────────────┘     └────┬─────┘
                       │            ┌──────────┐             ▲
                       └───────────▶│   Poll   │─────────────┘
                                    │ Workers  │
                                    └──────────┘
```
Components:
- Task queue — Holds pending CAPTCHA tasks with sitekeys and URLs
- Submit workers — Send tasks to CaptchaAI API concurrently
- Poll workers — Check for results at optimized intervals
- Result store — Saves tokens as they arrive
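As a sketch, the task-queue component can be as small as a Redis list. The queue name `captcha:tasks` and the helper names below are illustrative, and `enqueue`/`dequeue` assume a `redis.Redis` client from the `redis-py` package:

```python
# Minimal task-queue sketch. The "client" argument is assumed to be a
# redis.Redis instance (from the redis-py package); the key name is ours.
import json

QUEUE_KEY = "captcha:tasks"

def encode_task(sitekey: str, pageurl: str) -> str:
    """Serialize one pending CAPTCHA task for the queue."""
    return json.dumps({"sitekey": sitekey, "pageurl": pageurl})

def decode_task(raw: str) -> dict:
    return json.loads(raw)

def enqueue(client, sitekey: str, pageurl: str) -> None:
    """Push a task onto the shared list (RPUSH = FIFO with BLPOP)."""
    client.rpush(QUEUE_KEY, encode_task(sitekey, pageurl))

def dequeue(client, timeout: int = 5):
    """Blocking pop; returns None if the queue stays empty for `timeout` s."""
    item = client.blpop(QUEUE_KEY, timeout=timeout)
    return decode_task(item[1]) if item else None
```

Because each worker does a blocking pop, any number of submit workers, on any number of machines, can pull from the same list without coordination.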
## Python: Async Pipeline
```python
# high_throughput_solver.py
import os
import asyncio
import time

import aiohttp

API_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")
BASE_URL = "https://ocr.captchaai.com"
MAX_CONCURRENT = 50      # Max simultaneous solves
POLL_INTERVAL = 5        # Seconds between polls
INITIAL_WAIT = 12        # Seconds before first poll
MAX_POLL_ATTEMPTS = 25   # Give up after ~2 minutes of polling

semaphore = asyncio.Semaphore(MAX_CONCURRENT)
stats = {"submitted": 0, "solved": 0, "failed": 0, "start": 0}


async def solve_one(session, sitekey, pageurl):
    """Submit and poll a single CAPTCHA."""
    async with semaphore:
        try:
            # Submit
            async with session.get(f"{BASE_URL}/in.php", params={
                "key": API_KEY, "method": "userrecaptcha",
                "googlekey": sitekey, "pageurl": pageurl, "json": "1",
            }) as resp:
                result = await resp.json(content_type=None)
            if result.get("status") != 1:
                stats["failed"] += 1
                return None
            stats["submitted"] += 1
            task_id = result["request"]

            # Wait before first poll
            await asyncio.sleep(INITIAL_WAIT)

            # Poll
            for _ in range(MAX_POLL_ATTEMPTS):
                async with session.get(f"{BASE_URL}/res.php", params={
                    "key": API_KEY, "action": "get",
                    "id": task_id, "json": "1",
                }) as resp:
                    poll_result = await resp.json(content_type=None)
                if poll_result.get("status") == 1:
                    stats["solved"] += 1
                    return poll_result["request"]
                if poll_result.get("request") != "CAPCHA_NOT_READY":
                    stats["failed"] += 1
                    return None
                await asyncio.sleep(POLL_INTERVAL)

            stats["failed"] += 1  # Timed out
            return None
        except Exception:
            stats["failed"] += 1
            return None


async def run_batch(tasks):
    """Process a batch of CAPTCHA tasks concurrently."""
    connector = aiohttp.TCPConnector(
        limit=MAX_CONCURRENT,
        keepalive_timeout=60,
    )
    async with aiohttp.ClientSession(connector=connector) as session:
        coros = [
            solve_one(session, task["sitekey"], task["pageurl"])
            for task in tasks
        ]
        return await asyncio.gather(*coros)


async def main():
    # Generate test tasks (replace with your task source)
    tasks = [
        {
            "sitekey": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
            "pageurl": "https://www.google.com/recaptcha/api2/demo",
        }
        for _ in range(100)  # Start with 100 tasks
    ]
    stats["start"] = time.time()
    print(f"Processing {len(tasks)} tasks with {MAX_CONCURRENT} concurrent workers")
    await run_batch(tasks)
    elapsed = time.time() - stats["start"]
    print(f"\nCompleted in {elapsed:.0f}s")
    print(f"Submitted: {stats['submitted']}")
    print(f"Solved: {stats['solved']}")
    print(f"Failed: {stats['failed']}")
    print(f"Throughput: {stats['solved'] / (elapsed / 3600):.0f} solves/hour")


asyncio.run(main())
```
## JavaScript: Concurrent Pipeline
```javascript
// high_throughput_solver.js
const axios = require('axios');
const https = require('https');

const API_KEY = process.env.CAPTCHAAI_KEY || 'YOUR_API_KEY';
const BASE = 'https://ocr.captchaai.com';
const MAX_CONCURRENT = 50;

const agent = new https.Agent({ keepAlive: true, maxSockets: MAX_CONCURRENT });
const api = axios.create({ baseURL: BASE, httpsAgent: agent, timeout: 30000 });
const stats = { submitted: 0, solved: 0, failed: 0 };

async function solveOne(sitekey, pageurl) {
  try {
    const submit = await api.get('/in.php', {
      params: { key: API_KEY, method: 'userrecaptcha', googlekey: sitekey, pageurl, json: '1' },
    });
    if (submit.data.status !== 1) { stats.failed++; return null; }
    stats.submitted++;
    await new Promise(r => setTimeout(r, 12000)); // Wait before first poll
    for (let i = 0; i < 25; i++) {
      const poll = await api.get('/res.php', {
        params: { key: API_KEY, action: 'get', id: submit.data.request, json: '1' },
      });
      if (poll.data.status === 1) { stats.solved++; return poll.data.request; }
      if (poll.data.request !== 'CAPCHA_NOT_READY') { stats.failed++; return null; }
      await new Promise(r => setTimeout(r, 5000));
    }
    stats.failed++; // Timed out
    return null;
  } catch { stats.failed++; return null; }
}

async function runWithConcurrency(tasks, limit) {
  const results = [];
  const executing = new Set();
  for (const task of tasks) {
    const p = solveOne(task.sitekey, task.pageurl).then(r => {
      executing.delete(p);
      return r;
    });
    executing.add(p);
    results.push(p);
    if (executing.size >= limit) {
      await Promise.race(executing); // Wait for a slot to free up
    }
  }
  return Promise.all(results);
}

(async () => {
  const tasks = Array.from({ length: 100 }, () => ({
    sitekey: '6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-',
    pageurl: 'https://www.google.com/recaptcha/api2/demo',
  }));
  const start = Date.now();
  console.log(`Processing ${tasks.length} tasks, ${MAX_CONCURRENT} concurrent`);
  await runWithConcurrency(tasks, MAX_CONCURRENT);
  const elapsed = (Date.now() - start) / 1000;
  console.log(`\nDone in ${elapsed.toFixed(0)}s`);
  console.log(`Solved: ${stats.solved}, Failed: ${stats.failed}`);
  console.log(`Throughput: ${(stats.solved / (elapsed / 3600)).toFixed(0)} solves/hour`);
  agent.destroy();
})();
```
## Tuning Parameters
| Parameter | Conservative | Balanced | Aggressive |
|---|---|---|---|
| MAX_CONCURRENT | 20 | 50 | 100 |
| INITIAL_WAIT | 15s | 12s | 10s |
| POLL_INTERVAL | 7s | 5s | 3s |
| MAX_POLL_ATTEMPTS | 30 | 25 | 20 |
| Expected throughput | ~4,800/hr | ~10,000/hr | ~18,000/hr |
Start conservative and increase MAX_CONCURRENT until you see diminishing returns or increased error rates.
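One way to follow that advice programmatically is a simple ramp controller. The 5% and 10% error-rate thresholds mirror the monitoring guidance in this guide, but the step size and ceiling are assumptions:

```python
# Hypothetical ramp-up helper: raise concurrency while errors stay low,
# back off once they spike. Step size and ceiling are illustrative.
def next_concurrency(current: int, error_rate: float,
                     step: int = 10, ceiling: int = 100) -> int:
    """Return the concurrency to use for the next batch."""
    if error_rate > 0.10:                      # Overloaded: back off
        return max(current - step, step)
    if error_rate < 0.05:                      # Healthy: ramp up
        return min(current + step, ceiling)
    return current                             # In the gray zone: hold
```

Call it between batches with the error rate observed in the previous batch, starting from the Conservative column's value of 20.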
## Monitoring Throughput
Track these metrics in real-time:
- Solves per minute — Should stay at ~167 for 10K/hour target
- Error rate — Keep below 5%. If it spikes, reduce concurrency
- Queue depth — If growing, increase workers. If empty, you're over-provisioned
- P90 solve time — If increasing, CaptchaAI may be rate-limiting
## Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Throughput plateaus at ~5K/hr | Insufficient concurrency | Increase MAX_CONCURRENT to 80–100 |
| Error rate > 10% | Overloading API or bad proxies | Reduce concurrency, check proxy health |
| Memory usage growing | Unbounded task accumulation | Process results as they arrive, don't buffer |
| ERROR_NO_SLOT_AVAILABLE | CaptchaAI queue full | Back off and retry after 5 seconds |
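For ERROR_NO_SLOT_AVAILABLE specifically, a capped exponential back-off is a reasonable retry policy. This schedule (5 s base, doubling, 60 s cap) is an assumption on our part, not a CaptchaAI requirement:

```python
# Generate a capped exponential back-off schedule for retrying a submit
# that failed with ERROR_NO_SLOT_AVAILABLE. All numbers are illustrative.
def backoff_delays(base: float = 5.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 5):
    """Yield the wait in seconds before each retry attempt."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor
```

In the async pipeline, you would `await asyncio.sleep(d)` for each yielded delay before re-submitting, then give up once the generator is exhausted.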
## FAQ
### What's the CaptchaAI concurrency limit?
There's no hard limit on concurrent requests, but extremely high concurrency (500+) may trigger rate limiting. Start at 50 and scale up.
### Can I run this across multiple machines?
Yes. Use a shared queue (Redis, RabbitMQ) and run the worker script on multiple servers. Each worker pulls tasks independently.
### What about balance consumption at this rate?
At 10,000 solves/hour, monitor your balance closely. Use the balance check endpoint (res.php?action=getbalance) and set up alerts.
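A minimal balance check against the documented `res.php?action=getbalance` endpoint might look like this; the alert threshold is illustrative, and the endpoint is assumed to return the balance as plain text:

```python
# Poll the account balance and flag when it drops below a floor.
# ALERT_THRESHOLD is an illustrative value, not a CaptchaAI default.
import urllib.parse
import urllib.request

ALERT_THRESHOLD = 5.0  # Currency units; pick a floor covering ~1h of solving

def check_balance(api_key: str,
                  base_url: str = "https://ocr.captchaai.com") -> float:
    """Fetch the current balance (assumes a plain-text numeric response)."""
    params = urllib.parse.urlencode({"key": api_key, "action": "getbalance"})
    with urllib.request.urlopen(f"{base_url}/res.php?{params}",
                                timeout=10) as resp:
        return float(resp.read().decode().strip())

def balance_low(balance: float, threshold: float = ALERT_THRESHOLD) -> bool:
    return balance < threshold
```

Run it on a timer (every few minutes is plenty) and wire `balance_low` into whatever alerting you already use.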
## Next Steps
Build your high-throughput CAPTCHA pipeline — get your CaptchaAI API key.