Profiling CAPTCHA Solving Bottlenecks in Python Applications

When your CAPTCHA solving script is slower than expected, you need to know where the time goes. Is it network latency? JSON parsing? Image encoding? This guide shows how to profile CaptchaAI integrations in Python to find and fix the actual bottleneck.

Time Budget for a Single Solve

A typical reCAPTCHA v2 solve breaks down like this:

| Phase | Expected time | What's happening |
|---|---|---|
| Submit request | 50–200ms | HTTP call to in.php |
| CaptchaAI processing | 10–25s | Solving on CaptchaAI servers |
| Poll requests (3–5 calls) | 150–500ms | HTTP calls to res.php |
| JSON parsing | < 1ms | Deserializing responses |
| Your code (between calls) | Variable | Business logic, DB writes |
| Total | ~12–30s | |

If your total exceeds 45 seconds consistently, something in your pipeline is adding overhead.
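
One way to apply this budget is to compare measured phase timings against the upper bounds in the table. The sketch below is illustrative only: `BUDGET_SECONDS` and `over_budget` are hypothetical names, the phase keys are assumed to match the ones produced by the manual timing in Method 1, and the per-poll limit of 0.1 s is an assumption derived from 500 ms spread over 5 polls.

```python
# Upper bounds taken from the time budget table; key names are assumptions.
BUDGET_SECONDS = {
    "submit_request": 0.2,     # 200 ms for the in.php call
    "submit_parse": 0.001,     # JSON parsing should stay under 1 ms
    "poll_avg_request": 0.1,   # assumed: 500 ms spread over 5 res.php calls
    "total": 45.0,             # threshold for "something is adding overhead"
}

def over_budget(timings, budget=BUDGET_SECONDS):
    """Return {phase: (measured, limit)} for every phase above its limit."""
    return {
        phase: (timings[phase], limit)
        for phase, limit in budget.items()
        if phase in timings and timings[phase] > limit
    }

# Illustrative measurements in seconds, not real results.
sample = {"submit_request": 0.62, "submit_parse": 0.0002,
          "poll_avg_request": 0.07, "total": 27.4}
for phase, (measured, limit) in over_budget(sample).items():
    print(f"{phase}: {measured:.3f}s exceeds {limit:.3f}s")
```

Phases missing from the measured dictionary are skipped, so the same check works on partial timings from a failed solve.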

Method 1: Manual Timing Instrumentation

Add timing to each phase of the solve:

# profiled_solver.py
import os
import time
import requests

API_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")

def solve_with_timing(sitekey, pageurl):
    """Solve with detailed timing for each phase."""
    timings = {}
    session = requests.Session()

    # Phase 1: Submit
    t0 = time.perf_counter()
    resp = session.get("https://ocr.captchaai.com/in.php", params={
        "key": API_KEY,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": pageurl,
        "json": "1",
    })
    timings["submit_request"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    result = resp.json()
    timings["submit_parse"] = time.perf_counter() - t0

    if result.get("status") != 1:
        return None, timings

    task_id = result["request"]

    # Phase 2: Wait
    t0 = time.perf_counter()
    time.sleep(15)
    timings["initial_wait"] = time.perf_counter() - t0

    # Phase 3: Poll
    poll_times = []
    poll_count = 0
    t_poll_start = time.perf_counter()

    for _ in range(25):
        t0 = time.perf_counter()
        poll = session.get("https://ocr.captchaai.com/res.php", params={
            "key": API_KEY, "action": "get",
            "id": task_id, "json": "1",
        })
        poll_result = poll.json()
        poll_time = time.perf_counter() - t0
        poll_times.append(poll_time)
        poll_count += 1

        if poll_result.get("status") == 1:
            break
        if poll_result.get("request") != "CAPCHA_NOT_READY":
            break
        time.sleep(5)

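    timings["poll_total"] = time.perf_counter() - t_poll_start
    timings["poll_count"] = poll_count
    timings["poll_avg_request"] = sum(poll_times) / len(poll_times) if poll_times else 0.0
    # Sum only the sequential phases. poll_avg_request is already contained in
    # poll_total, so including it would double-count.
    timings["total"] = sum(
        timings[k]
        for k in ("submit_request", "submit_parse", "initial_wait", "poll_total")
    )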

    token = poll_result.get("request") if poll_result.get("status") == 1 else None
    return token, timings

# Run and display results
token, timings = solve_with_timing(
    "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
    "https://www.google.com/recaptcha/api2/demo"
)

print("\n=== Timing Breakdown ===")
for key, value in timings.items():
    if isinstance(value, float):
        print(f"  {key}: {value*1000:.1f}ms")
    else:
        print(f"  {key}: {value}")

Example output (exact values will vary from run to run):

=== Timing Breakdown ===
  submit_request: 145.3ms
  submit_parse: 0.2ms
  initial_wait: 15001.2ms
  poll_total: 10234.5ms
  poll_count: 3
  poll_avg_request: 67.8ms
  total: 25381.2ms
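
The repeated `t0 = time.perf_counter()` pairs can be folded into a small context manager. This is a sketch, not part of any CaptchaAI library: `timed_phase` is a hypothetical helper name, and `time.sleep` stands in for the real HTTP calls so the example runs without an API key.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed_phase(timings, name):
    """Store the wall-clock duration of the with-block in timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed phases stay visible.
        timings[name] = time.perf_counter() - start

timings = {}
with timed_phase(timings, "submit_request"):
    time.sleep(0.05)   # placeholder for the in.php request
with timed_phase(timings, "initial_wait"):
    time.sleep(0.10)   # placeholder for the initial wait

for name, seconds in timings.items():
    print(f"  {name}: {seconds * 1000:.1f}ms")
```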

Method 2: cProfile for Call Stack Analysis

# Assumes solve_with_timing from Method 1 is defined or imported in this module.
import cProfile
import pstats

def run_solver():
    """Wrapper for profiling."""
    solve_with_timing(
        "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
        "https://www.google.com/recaptcha/api2/demo"
    )

# Profile the entire solve
profiler = cProfile.Profile()
profiler.enable()
run_solver()
profiler.disable()

# Show top 20 time-consuming functions
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative")
stats.print_stats(20)

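cProfile measures wall-clock time by default, so the report sorted by cumulative time shows where the solve actually waits:

  • time.sleep (the initial wait and the pauses between polls; expected to dominate)
  • Socket receive calls (network I/O; expected)
  • SSL read calls (TLS; expected for HTTPS)
  • json.loads and related decoding (JSON parsing; should be < 1ms)
  • Your own functions (business logic; optimize here)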
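
To narrow the report to specific functions, `print_stats` in `pstats` accepts a regular expression as a restriction. The sketch below profiles a stand-in workload rather than a real solve; `fake_solve` and `parse_payload` are hypothetical functions used only to keep the example self-contained.

```python
import cProfile
import io
import json
import pstats
import time

def parse_payload():
    """Stand-in for response parsing."""
    return json.loads('{"status": 1, "request": "TOKEN"}')

def fake_solve():
    """Stand-in for a solve: a short wait followed by parsing."""
    time.sleep(0.05)
    return parse_payload()

profiler = cProfile.Profile()
profiler.enable()
fake_solve()
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# strip_dirs() shortens file paths. The string passed to print_stats is a
# regex, so only rows matching "sleep" or "parse_payload" are printed.
stats.strip_dirs().sort_stats("cumulative").print_stats("sleep|parse_payload")
report = buffer.getvalue()
print(report)
```

Writing the report to a `StringIO` buffer instead of stdout makes it easy to log or store alongside other diagnostics.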

Method 3: Async Profiling for Concurrent Solvers

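For asyncio-based solvers, cProfile output is harder to interpret: the time a coroutine spends suspended at an await is not attributed to it, and concurrent tasks interleave in the report. Timing decorators that measure the wall-clock time of each awaited call are simpler: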

import asyncio
import functools
import os
import time
from collections import defaultdict

import aiohttp  # the session arguments below are aiohttp.ClientSession objects

API_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")

# Collected durations in seconds, keyed by function name
timing_data = defaultdict(list)

def timed_async(func):
    """Record the wall-clock duration of each awaited call to func."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = await func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        timing_data[func.__name__].append(elapsed)
        return result
    return wrapper

@timed_async
async def submit_captcha(session, sitekey, pageurl):
    """Submit with timing."""
    async with session.get("https://ocr.captchaai.com/in.php", params={
        "key": API_KEY, "method": "userrecaptcha",
        "googlekey": sitekey, "pageurl": pageurl, "json": "1",
    }) as resp:
        # content_type=None skips aiohttp's Content-Type check on the response
        return await resp.json(content_type=None)

@timed_async
async def poll_result(session, task_id):
    """Poll with timing."""
    async with session.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY, "action": "get",
        "id": task_id, "json": "1",
    }) as resp:
        return await resp.json(content_type=None)

# After running, print statistics
def print_timing_stats():
    import statistics
    for func_name, times in timing_data.items():
        print(f"\n{func_name}:")
        print(f"  Calls: {len(times)}")
        print(f"  Median: {statistics.median(times)*1000:.1f}ms")
        print(f"  Max: {max(times)*1000:.1f}ms")
        print(f"  Total: {sum(times)*1000:.1f}ms")
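
A sketch of how the decorator behaves under concurrency, using `asyncio.sleep` as a stand-in for the HTTP calls so it runs without network access or an API key. `fake_submit`, `fake_poll`, and `solve_one` are hypothetical names; a real run would call `submit_captcha` and `poll_result` with an `aiohttp.ClientSession` instead.

```python
import asyncio
import functools
import statistics
import time
from collections import defaultdict

timing_data = defaultdict(list)

def timed_async(func):
    """Same decorator as above, repeated so this sketch runs on its own."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = await func(*args, **kwargs)
        timing_data[func.__name__].append(time.perf_counter() - start)
        return result
    return wrapper

@timed_async
async def fake_submit(i):
    await asyncio.sleep(0.02)   # stands in for the in.php request
    return f"task-{i}"

@timed_async
async def fake_poll(task_id):
    await asyncio.sleep(0.01)   # stands in for one res.php request
    return {"status": 1, "request": f"token-for-{task_id}"}

async def solve_one(i):
    task_id = await fake_submit(i)
    return await fake_poll(task_id)

async def main(n=5):
    # Run n simulated solves concurrently; each call is timed independently.
    return await asyncio.gather(*(solve_one(i) for i in range(n)))

results = asyncio.run(main())
for name, times in timing_data.items():
    print(f"{name}: calls={len(times)} median={statistics.median(times) * 1000:.1f}ms")
```

Because each task is timed separately, the sum of per-call times can exceed the wall-clock time of the whole batch. When tasks overlap, compare medians and maxima rather than totals.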

Common Bottlenecks and Fixes

| Bottleneck | How to detect | Fix |
|---|---|---|
| High submit_request time (> 500ms) | Manual timing shows slow submit | Check DNS, use keep-alive |
| High poll count (> 8 polls) | poll_count consistently high | Increase initial wait time |
| Slow JSON parsing | submit_parse > 10ms | Shouldn't happen; check response size |
| Time between polls > 5s | Gap between poll end and next poll start | Verify no blocking code between polls |
| Image encoding bottleneck | Large base64.b64encode time | Pre-encode or stream images |
| Database writes blocking solver | cProfile shows DB function time | Make DB writes async or batch |
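
For the image-encoding case, a quick measurement shows whether `base64.b64encode` is worth optimizing at all. This sketch uses random bytes as a stand-in for an image file; the 2 MB size and the `time_b64encode` helper name are assumptions for illustration.

```python
import base64
import os
import time

def time_b64encode(data, repeats=20):
    """Return (encoded_bytes, best_seconds) over several encoding runs."""
    best = float("inf")
    encoded = b""
    for _ in range(repeats):
        start = time.perf_counter()
        encoded = base64.b64encode(data)
        best = min(best, time.perf_counter() - start)
    return encoded, best

image_bytes = os.urandom(2 * 1024 * 1024)   # stand-in for a 2 MB image
encoded, seconds = time_b64encode(image_bytes)
print(f"{len(image_bytes)} bytes -> {len(encoded)} base64 bytes "
      f"in {seconds * 1000:.2f}ms (best of 20)")
```

If the result is small next to the 10–25s processing time, encoding is not the bottleneck. If the same image is submitted repeatedly, encoding it once and reusing the string avoids the cost entirely.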

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| Total time 2x expected | Business logic between API calls | Profile to find the slow function |
| First solve slow, rest fast | Connection setup (DNS + TLS) | Use Session with keep-alive |
| Memory growing during profiling | Profiler accumulating data | Use sampling profiler for long runs |
| Profiling changes timing | Profiler overhead | Use time.perf_counter() for production |

FAQ

Does profiling affect solve accuracy?

No. Profiling only measures execution timing. It doesn't change the API calls or CAPTCHA solving behavior.

Should I profile in production?

Use lightweight timing (Method 1) in production. Avoid running cProfile on every solve, since it adds CPU overhead; if you need call-stack detail, profile a small sample of solves periodically.
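
One lightweight way to sample in production is to time only every Nth call. The decorator below is a sketch under that assumption: `sample_every`, `samples`, and `solve_stub` are hypothetical names, and the call counter is not synchronized across threads.

```python
import functools
import itertools
import time

samples = []  # (function name, seconds) for the calls that were sampled

def sample_every(n):
    """Time every n-th call to the decorated function; pass the rest through."""
    def decorator(func):
        counter = itertools.count(1)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if next(counter) % n != 0:
                return func(*args, **kwargs)   # unsampled: skip the timing calls
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                samples.append((func.__name__, time.perf_counter() - start))
        return wrapper
    return decorator

@sample_every(5)
def solve_stub(i):
    """Stand-in for a real solve call."""
    time.sleep(0.001)
    return i * 2

outputs = [solve_stub(i) for i in range(10)]
print(f"{len(samples)} of {len(outputs)} calls were timed")
```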

What's the minimum useful sample size for profiling?

Profile at least 10 solves to get meaningful statistics. Single-solve profiling is too noisy due to network variability.
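
With several runs collected, per-phase medians and maxima are more informative than any single run. The sketch below aggregates a list of timing dictionaries shaped like the ones returned by `solve_with_timing`; `summarize_runs` is a hypothetical helper, and the three sample runs are made-up numbers used only to show the output shape.

```python
import statistics
from collections import defaultdict

def summarize_runs(runs):
    """Return {phase: {"n", "median", "max"}} across a list of timing dicts."""
    by_phase = defaultdict(list)
    for timings in runs:
        for phase, value in timings.items():
            by_phase[phase].append(value)
    return {
        phase: {
            "n": len(values),
            "median": statistics.median(values),
            "max": max(values),
        }
        for phase, values in by_phase.items()
    }

# Made-up timings in seconds, for illustration only.
runs = [
    {"submit_request": 0.14, "poll_total": 10.2, "total": 25.4},
    {"submit_request": 0.90, "poll_total": 5.1, "total": 21.0},
    {"submit_request": 0.16, "poll_total": 15.3, "total": 30.6},
]
summary = summarize_runs(runs)
for phase, s in summary.items():
    print(f"{phase}: n={s['n']} median={s['median']:.2f}s max={s['max']:.2f}s")
```

A large gap between the median and the maximum for one phase, as with `submit_request` in the sample data, points to an intermittent cause such as connection setup rather than a steady overhead.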

Next Steps

Profile your CAPTCHA pipeline and eliminate bottlenecks — get your CaptchaAI API key.
