Complete Guide: CAPTCHA Solving from Basics to Production

Everything you need to go from your first CAPTCHA solve to a production-grade pipeline.


Part 1: Understanding CAPTCHAs

What Is a CAPTCHA?

A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a challenge designed to block automated access while allowing human users through.

Types You'll Encounter

| Type | Examples | Challenge |
|---|---|---|
| Text/Image | Distorted letters, math expressions | Type what you see |
| Checkbox | reCAPTCHA v2 | Click checkbox, possibly solve image grid |
| Invisible | reCAPTCHA v3, Turnstile | No user interaction; behavioral scoring |
| Interactive | GeeTest slide, BLS grid | Drag, click, or order elements |
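
As a rough illustration of telling these types apart, the sketch below guesses the type from raw page HTML. The function name `guess_captcha_type` and the marker patterns are assumptions based on common widget markup, not part of any official API. Pages that load the widget dynamically or rename classes will not match.

```python
import re

# Heuristic markers only (assumed, not exhaustive). Order matters:
# the first pattern that matches wins.
_MARKERS = [
    ("turnstile", re.compile(r"cf-turnstile|challenges\.cloudflare\.com/turnstile", re.I)),
    ("geetest", re.compile(r"geetest|data-gt=", re.I)),
    # v3 pages usually load api.js with render=<sitekey>; render=explicit is a v2 mode
    ("recaptcha_v3", re.compile(r"recaptcha/api\.js\?[^\s>]*render=(?!explicit)", re.I)),
    ("recaptcha_v2", re.compile(r"g-recaptcha|recaptcha/api\.js", re.I)),
]


def guess_captcha_type(html: str) -> str:
    """Return a best-guess CAPTCHA type name, or "unknown" if no marker matches."""
    for name, pattern in _MARKERS:
        if pattern.search(html):
            return name
    return "unknown"
```

Treat the result as a hint for choosing the solving method, and confirm it by inspecting the page.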

Why Sites Use CAPTCHAs

  • Prevent automated account creation
  • Block scraping and data harvesting
  • Stop spam in forms and comments
  • Rate-limit API access

Part 2: How CAPTCHA Solving Services Work

The Flow

Your Code  →  Submit CAPTCHA to API  →  Solving Service  →  Return Token/Text  →  Your Code Injects Result

Step-by-Step

  1. Extract CAPTCHA parameters from the target page (sitekey, challenge, image)
  2. Submit parameters to the solving API
  3. Poll for the result (token or text)
  4. Inject the result back into the page
  5. Submit the form
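
The five steps can be sketched as one function that takes each stage as a callable. The name `run_captcha_flow` and its four parameters are hypothetical and exist only to show the order of operations. In practice the callables would wrap your browser or HTTP code and the solver.

```python
from typing import Callable, Dict


def run_captcha_flow(
    extract: Callable[[], Dict[str, str]],
    solve: Callable[[Dict[str, str]], str],
    inject: Callable[[str], None],
    submit_form: Callable[[], None],
) -> str:
    """Run the extract -> solve -> inject -> submit sequence and return the answer."""
    params = extract()      # step 1: sitekey, challenge, or image data
    answer = solve(params)  # steps 2-3: submit to the API and poll for the result
    inject(answer)          # step 4: write the token or text into the page
    submit_form()           # step 5: send the form
    return answer
```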

Part 3: Setting Up CaptchaAI

Install Dependencies

pip install requests

Core Solver Class

import time
import requests


class CaptchaAI:
    BASE = "https://ocr.captchaai.com"

    def __init__(self, api_key):
        self.api_key = api_key

    def submit(self, params):
        # Copy the params so the caller's dict is not mutated (it may be reused on retry)
        payload = {**params, "key": self.api_key, "json": 1}
        resp = requests.post(f"{self.BASE}/in.php", data=payload, timeout=30)
        data = resp.json()
        if data["status"] != 1:
            raise Exception(f"Submit failed: {data['request']}")
        return data["request"]

    def get_result(self, task_id, timeout=300, interval=5, initial_wait=10):
        # Give the service a head start before the first poll
        time.sleep(initial_wait)
        deadline = time.time() + timeout
        while time.time() < deadline:
            resp = requests.get(
                f"{self.BASE}/res.php",
                params={
                    "key": self.api_key,
                    "action": "get",
                    "id": task_id,
                    "json": 1,
                },
                timeout=30,  # per-request timeout so a hung connection cannot stall the loop
            ).json()
            # "CAPCHA_NOT_READY" (spelled without the first T) means the task is still in progress
            if resp["request"] == "CAPCHA_NOT_READY":
                time.sleep(interval)
                continue
            if resp["status"] == 1:
                return resp["request"]
            raise Exception(f"Solve failed: {resp['request']}")
        raise TimeoutError("Solve timed out")

    def solve(self, params, **kwargs):
        task_id = self.submit(params)
        return self.get_result(task_id, **kwargs)

    def balance(self):
        resp = requests.get(
            f"{self.BASE}/res.php",
            params={"key": self.api_key, "action": "getbalance"},
        )
        return float(resp.text)

Part 4: Solving Each CAPTCHA Type

reCAPTCHA v2

solver = CaptchaAI("YOUR_API_KEY")
token = solver.solve({
    "method": "userrecaptcha",
    "googlekey": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
    "pageurl": "https://example.com/login",
})

reCAPTCHA v3

token = solver.solve({
    "method": "userrecaptcha",
    "googlekey": "SITE_KEY",
    "pageurl": "https://example.com",
    "version": "v3",
    "action": "submit",
    "min_score": "0.9",
}, initial_wait=20)

Cloudflare Turnstile

token = solver.solve({
    "method": "turnstile",
    "sitekey": "0x4AAAAAAAC3a...",
    "pageurl": "https://example.com",
})

GeeTest v3

result = solver.solve({
    "method": "geetest",
    "gt": "GT_VALUE",
    "challenge": "CHALLENGE_VALUE",
    "pageurl": "https://example.com",
})

Image/OCR

import base64

with open("captcha.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

text = solver.solve({
    "method": "base64",
    "body": img_b64,
    "numeric": "1",
    "minLen": "4",
    "maxLen": "6",
})
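
Because the request above constrains the answer (digits only, 4 to 6 characters), a cheap local check can catch an obviously wrong result before you submit it to the target site. This is a minimal sketch. `matches_constraints` is a hypothetical helper, and it assumes `numeric=1` means digits only.

```python
def matches_constraints(text, numeric=True, min_len=4, max_len=6):
    """Return True if the OCR answer fits the constraints sent with the task."""
    text = text.strip()
    if not (min_len <= len(text) <= max_len):
        return False
    if numeric:
        # isascii() guards against non-ASCII digit characters such as superscripts
        return text.isascii() and text.isdigit()
    return True
```

If the check fails, requesting a fresh solve is usually cheaper than submitting a wrong answer.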

Part 5: Extracting CAPTCHA Parameters

reCAPTCHA Sitekey

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/login")

# Method 1: From div attribute
sitekey = driver.find_element(
    By.CSS_SELECTOR, "[data-sitekey]"
).get_attribute("data-sitekey")

# Method 2: From iframe URL
import re
iframe = driver.find_element(By.CSS_SELECTOR, "iframe[src*='recaptcha']")
src = iframe.get_attribute("src")
sitekey = re.search(r"k=([^&]+)", src).group(1)
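
The regex in Method 2 matches the first `k=` it finds, which can be the tail of another parameter name (for example `ck=1`). A slightly more defensive sketch parses the query string with the standard library instead. `sitekey_from_iframe_src` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def sitekey_from_iframe_src(src: str) -> Optional[str]:
    """Return the value of the `k` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(src).query)
    values = query.get("k")
    return values[0] if values else None
```

It returns `None` instead of raising when the parameter is missing, so the caller can fall back to another extraction method.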

Turnstile Sitekey

sitekey = driver.find_element(
    By.CSS_SELECTOR, "[data-sitekey], .cf-turnstile"
).get_attribute("data-sitekey")

GeeTest Parameters

# Returns a dict; a value is None when the page has no matching attribute
gt_data = driver.execute_script("""
    return {
        gt: document.querySelector('[data-gt]')?.getAttribute('data-gt'),
        challenge: document.querySelector('[data-challenge]')?.getAttribute('data-challenge')
    };
""")

Part 6: Injecting Solutions

Token-Based (reCAPTCHA, Turnstile)

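# Pass the token as an argument and fill only the response fields present on the page
driver.execute_script("""
    var token = arguments[0];
    ['g-recaptcha-response', 'cf-turnstile-response'].forEach(function (name) {
        var field = document.querySelector('[name="' + name + '"]');
        if (field) field.value = token;
    });
""", token)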

For callbacks

driver.execute_script(f"""
    if (typeof ___grecaptcha_cfg !== 'undefined') {{
        Object.keys(___grecaptcha_cfg.clients).forEach(function(key) {{
            var client = ___grecaptcha_cfg.clients[key];
            // Find and call the callback
        }});
    }}
""")

Part 7: Error Handling

Retry Logic

def solve_with_retry(solver, params, max_retries=3):
    for attempt in range(max_retries):
        try:
            return solver.solve(params)
        except Exception as e:
            error = str(e)
            if "ZERO_BALANCE" in error:
                raise  # Don't retry — need funds
            if "UNSOLVABLE" in error:
                print(f"Attempt {attempt + 1} failed, retrying...")
                continue
            raise
    raise Exception(f"Failed after {max_retries} attempts")
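
The retry loop above retries immediately. One common refinement is to wait longer between attempts. The sketch below is one possible variant. The name `solve_with_backoff`, the default delays, and the injectable `sleep` parameter are assumptions for illustration.

```python
import time


def solve_with_backoff(solve, params, max_retries=3, base=2.0, cap=30.0, sleep=time.sleep):
    """Retry UNSOLVABLE failures with exponentially growing pauses: base, 2*base, 4*base..."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return solve(params)
        except Exception as exc:
            if "UNSOLVABLE" not in str(exc):
                raise  # e.g. zero balance: retrying will not help
            last_error = exc
            if attempt < max_retries - 1:
                # No pause after the final attempt
                sleep(min(cap, base * 2 ** attempt))
    raise RuntimeError(f"Failed after {max_retries} attempts") from last_error
```

Call it as `solve_with_backoff(solver.solve, params)`. Passing `sleep` explicitly makes the function easy to test without real delays.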

Balance Monitoring

def check_balance_before_solve(solver, min_balance=0.10):
    balance = solver.balance()
    if balance < min_balance:
        raise Exception(f"Low balance: ${balance:.2f}")
    return balance

Part 8: Production Patterns

Connection Pooling

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def create_session():
    session = requests.Session()
    retry = Retry(total=3, backoff_factor=1, status_forcelist=[500, 502, 503])
    adapter = HTTPAdapter(max_retries=retry, pool_connections=10, pool_maxsize=20)
    session.mount("https://", adapter)
    return session
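
Note that the solver class in Part 3 calls `requests.post` and `requests.get` directly, so a pooled session only helps if requests are routed through it. A minimal sketch of a submit call that takes the session as a parameter follows. `submit_task` is a hypothetical function, and it accepts any object with a `requests.Session`-style `post` method.

```python
def submit_task(session, base_url, api_key, params, timeout=30):
    """Submit a task through the given session and return the task ID."""
    # Copy the params so the caller's dict is left untouched
    payload = {**params, "key": api_key, "json": 1}
    resp = session.post(f"{base_url}/in.php", data=payload, timeout=timeout)
    data = resp.json()
    if data.get("status") != 1:
        raise RuntimeError(f"Submit failed: {data.get('request')}")
    return data["request"]
```

With the helper above, usage would look like `submit_task(create_session(), "https://ocr.captchaai.com", "YOUR_API_KEY", params)`.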

Concurrent Solving

from concurrent.futures import ThreadPoolExecutor, as_completed


def solve_batch(solver, captcha_list, max_workers=5):
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(solver.solve, params): url
            for url, params in captcha_list
        }
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as e:
                results[url] = f"ERROR: {e}"
    return results
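
`solve_batch` stores failures as `"ERROR: ..."` strings in the same dict as tokens, so callers have to inspect each value to tell them apart. One alternative, sketched here under the hypothetical name `solve_batch_split`, returns successes and exceptions separately.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def solve_batch_split(solve, captcha_list, max_workers=5):
    """Solve (url, params) pairs concurrently; return (results, errors) keyed by url."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(solve, params): url
            for url, params in captcha_list
        }
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Keep the exception object so callers can decide whether to retry
                errors[url] = exc
    return results, errors
```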

Rate Limiting

import threading
import time

class RateLimiter:
    def __init__(self, max_per_second=10):
        self.interval = 1.0 / max_per_second
        self.lock = threading.Lock()
        self.last_call = 0

    def wait(self):
        with self.lock:
            now = time.time()
            wait_time = self.last_call + self.interval - now
            if wait_time > 0:
                time.sleep(wait_time)
            self.last_call = time.time()
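
To apply a limiter, call `wait()` before each outgoing request. The sketch below wraps any callable so the wait happens automatically. `rate_limited` is a hypothetical helper that works with any object exposing a `wait()` method, such as the `RateLimiter` above.

```python
import functools


def rate_limited(limiter, func):
    """Return a wrapper that calls limiter.wait() before every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        limiter.wait()  # blocks until the next call is allowed
        return func(*args, **kwargs)
    return wrapper
```

For example, `submit = rate_limited(RateLimiter(max_per_second=5), solver.submit)` throttles task submissions. Wrapping `solver.solve` this way limits only how often solves start, not the polling requests made inside `get_result`.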

Part 9: Monitoring

Logging

import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("captcha")

def solve_logged(solver, params):
    start = time.time()
    logger.info(f"Submitting {params.get('method')} CAPTCHA")
    try:
        result = solver.solve(params)
        elapsed = time.time() - start
        logger.info(f"Solved in {elapsed:.1f}s")
        return result
    except Exception as e:
        elapsed = time.time() - start
        logger.error(f"Failed after {elapsed:.1f}s: {e}")
        raise

Metrics Tracking

class SolveMetrics:
    def __init__(self):
        self.total = 0
        self.success = 0
        self.failures = 0
        self.total_time = 0.0

    def record(self, success, elapsed):
        self.total += 1
        self.total_time += elapsed
        if success:
            self.success += 1
        else:
            self.failures += 1

    def summary(self):
        rate = (self.success / self.total * 100) if self.total else 0
        avg = (self.total_time / self.total) if self.total else 0
        return {
            "total": self.total,
            "success_rate": f"{rate:.1f}%",
            "avg_time": f"{avg:.1f}s",
        }
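
A small wrapper can feed the metrics object on every solve. This is a sketch only. `solve_with_metrics` is a hypothetical name, and the `clock` parameter is an assumption added so the timing can be controlled in tests.

```python
import time


def solve_with_metrics(solve, params, metrics, clock=time.monotonic):
    """Call solve(params), record success or failure with elapsed time, and re-raise errors."""
    start = clock()
    try:
        result = solve(params)
    except Exception:
        metrics.record(False, clock() - start)
        raise
    metrics.record(True, clock() - start)
    return result
```

Usage would be along the lines of `solve_with_metrics(solver.solve, params, SolveMetrics())`. `SolveMetrics` is not thread-safe as written, so guard `record` with a lock if you combine it with concurrent solving.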

Part 10: Checklist

| Step | Task |
|---|---|
| 1 | Install requests and get API key |
| 2 | Identify CAPTCHA type on target page |
| 3 | Extract sitekey/parameters |
| 4 | Submit to CaptchaAI with correct method |
| 5 | Poll with proper timing |
| 6 | Inject token and submit form |
| 7 | Add retry logic for production |
| 8 | Monitor success rate and costs |
| 9 | Scale with connection pooling and concurrency |

FAQ

How much does CAPTCHA solving cost?

Pricing varies by type. Check your dashboard for current rates. Image CAPTCHAs are cheapest; token-based types cost more.

Which CAPTCHA type is fastest to solve?

Image/OCR CAPTCHAs typically solve in 5-15 seconds. Turnstile also tends to solve quickly. reCAPTCHA v3 may take 20-30 seconds.

Can I solve CAPTCHAs without a browser?

Yes, for token-based CAPTCHAs you only need the sitekey and page URL — no browser required. Image CAPTCHAs need just the image data.

How do I handle token expiration?

Solve CAPTCHAs just before you need them. reCAPTCHA tokens expire in ~120 seconds, Turnstile in ~300 seconds. Don't pre-solve in bulk.
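
One way to avoid submitting a stale token is to record when it was obtained and check its age before use. The sketch below uses the approximate lifetimes quoted above. `TimedToken`, `TOKEN_TTL`, and the 15-second safety margin are assumptions for illustration.

```python
import time
from dataclasses import dataclass, field

# Approximate lifetimes in seconds, taken from the figures in this FAQ
TOKEN_TTL = {"recaptcha": 120, "turnstile": 300}


@dataclass
class TimedToken:
    value: str
    kind: str  # "recaptcha" or "turnstile"
    issued_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, now=None, safety_margin=15.0):
        """Return True while the token is younger than its lifetime minus the margin."""
        now = time.monotonic() if now is None else now
        return (now - self.issued_at) < TOKEN_TTL[self.kind] - safety_margin
```

If `is_fresh()` returns `False`, request a new solve and discard the old token.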



From basics to production in one guide — start with CaptchaAI.
