Every second matters in CAPTCHA solving. This guide covers proven strategies to reduce solve times, lower costs, and maximize throughput with CaptchaAI.
Speed Optimization
1. Start Solving Before You Need the Token
Submit the CAPTCHA as soon as you detect it, then continue scraping other pages while it solves:
import asyncio
import aiohttp

async def prefetch_solve(solver, session, site_key, page_url):
    """Start solving in advance and return the running task."""
    return asyncio.create_task(
        solver.solve(session, {
            "method": "userrecaptcha",
            "googlekey": site_key,
            "pageurl": page_url,
        })
    )

# The calls below belong inside an async function.
# Start solve immediately
solve_task = await prefetch_solve(solver, session, site_key, url)
# Do other work while CAPTCHA is being solved
other_data = await fetch_other_pages(session)
# Now retrieve the token (already solved or almost done)
token = await solve_task
2. Use the Right Method
Different methods have different solve times:
| Method | Avg. Solve Time | Cost |
|---|---|---|
| Image/OCR | ~3-5s | Lowest |
| reCAPTCHA v3 | ~5-8s | Low |
| Cloudflare Turnstile | ~8-12s | Medium |
| reCAPTCHA v2 | ~10-15s | Medium |
| reCAPTCHA v2 Enterprise | ~12-18s | Higher |
| Cloudflare Challenge | ~12-20s | Highest |
If the target page accepts a reCAPTCHA v3 token, solve v3 rather than v2: per the table above, it is both faster and cheaper.
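The solve times above also put a ceiling on throughput. The following is a rough sketch, assuming each worker handles one solve at a time and using the midpoint of each range in the table; the method labels and function names are illustrative, not part of the CaptchaAI API.

```python
import math

# Midpoints of the average solve times from the table above (seconds).
AVG_SOLVE_SECONDS = {
    "image_ocr": 4.0,
    "recaptcha_v3": 6.5,
    "turnstile": 10.0,
    "recaptcha_v2": 12.5,
    "recaptcha_v2_enterprise": 15.0,
    "cloudflare_challenge": 16.0,
}

def solves_per_hour(method: str, workers: int) -> float:
    """Rough upper bound: each worker finishes one solve every avg seconds."""
    return workers * 3600 / AVG_SOLVE_SECONDS[method]

def workers_needed(method: str, target_per_hour: int) -> int:
    """Smallest worker count whose estimated throughput meets the target."""
    per_worker = 3600 / AVG_SOLVE_SECONDS[method]
    return math.ceil(target_per_hour / per_worker)
```

Under these assumptions, five concurrent reCAPTCHA v2 solves top out at roughly 1,440 solves per hour, so a target of 1,000 per hour needs at least four workers. Real numbers will vary with server load and the target site.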
3. Reduce Poll Frequency Smartly
Don't poll every 1 second. Start at 5 seconds, then adjust:
async def smart_poll(session, solver, task_id):
    """Poll with increasing intervals based on expected solve time."""
    intervals = [5, 5, 5, 10, 10, 15, 15, 30]  # seconds
    for wait in intervals:
        await asyncio.sleep(wait)
        result = await solver.check(session, task_id)
        if result:
            return result
    raise TimeoutError("Solve timed out")
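To see what the schedule saves, it helps to count status checks. This sketch compares the interval list above with fixed 1-second polling, under the simplifying assumption that the result is visible on the first check made at or after the moment the solve finishes. The function names are illustrative.

```python
import math
from itertools import accumulate

SCHEDULE = [5, 5, 5, 10, 10, 15, 15, 30]  # same intervals as smart_poll

def polls_with_schedule(solve_seconds: float, schedule=SCHEDULE) -> int:
    """Number of status checks made before the result is seen."""
    for count, elapsed in enumerate(accumulate(schedule), start=1):
        if elapsed >= solve_seconds:
            return count
    raise TimeoutError("Solve would not finish within the schedule")

def polls_fixed(solve_seconds: float, interval: float = 1.0) -> int:
    """Number of status checks when polling at a fixed interval."""
    return math.ceil(solve_seconds / interval)
```

For a 12-second solve the schedule makes 3 checks where 1-second polling makes 12. The trade-off is that the token may sit ready for a few seconds before the next check picks it up.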
4. Use Callbacks for High Volume
For 100+ CAPTCHAs/hour, use the callback (pingback) mechanism instead of polling:
import requests

resp = requests.get("https://ocr.captchaai.com/in.php", params={
    "key": API_KEY,
    "method": "userrecaptcha",
    "googlekey": site_key,
    "pageurl": page_url,
    "pingback": "https://your-server.com/captcha-done",
})
This eliminates all polling requests, reducing API calls by 60-80%.
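The pingback URL needs something listening on it. Below is a minimal receiver sketch using only the standard library. It assumes the callback arrives as a form-encoded POST with `id` and `code` fields, which is an assumption about the payload rather than something confirmed here; check the CaptchaAI documentation for the actual format. `solved_tokens`, `parse_pingback`, `PingbackHandler`, and `serve` are illustrative names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

# task_id -> token, filled in as callbacks arrive
solved_tokens: dict = {}

def parse_pingback(body: bytes) -> tuple:
    """Extract (task_id, token) from a form-encoded callback body.

    Assumes fields named "id" and "code"; adjust to the documented payload.
    """
    fields = parse_qs(body.decode("utf-8"))
    return fields["id"][0], fields["code"][0]

class PingbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            task_id, token = parse_pingback(self.rfile.read(length))
        except (KeyError, UnicodeDecodeError):
            # Missing field or undecodable body: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        solved_tokens[task_id] = token
        self.send_response(200)
        self.end_headers()

def serve(port: int = 8080) -> None:
    """Run the callback receiver until interrupted."""
    HTTPServer(("", port), PingbackHandler).serve_forever()
```

Calling `serve()` blocks, so in practice it would run in its own process or thread, with the scraper reading from `solved_tokens` or a shared store.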
Cost Optimization
1. Avoid Unnecessary Solves
Check if a CAPTCHA is actually required before solving:
async def scrape_smart(url, session, solver):
    resp = await session.get(url)
    html = await resp.text()
    # Only solve if CAPTCHA is present
    if "g-recaptcha" not in html and "cf-turnstile" not in html:
        return html  # No CAPTCHA needed
    # Solve only when necessary
    token = await solver.solve(...)
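The same substring check can also tell you which kind of CAPTCHA is on the page, so the caller can pick the matching solve method. This is a small sketch built on the two markers used above; `MARKERS` and `detect_captcha` are illustrative names, and a substring match is only a heuristic.

```python
from typing import Optional

# HTML markers from the check above, mapped to a CAPTCHA family.
MARKERS = {
    "g-recaptcha": "recaptcha",
    "cf-turnstile": "turnstile",
}

def detect_captcha(html: str) -> Optional[str]:
    """Return the CAPTCHA family found in the page, or None if none is present."""
    for marker, kind in MARKERS.items():
        if marker in html:
            return kind
    return None
```

A `None` result means the page can be used as-is without spending a solve.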
2. Cache Tokens When Possible
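A solved token stays valid for a short window, typically a minute or two. If you already hold an unused token for a page, use it instead of paying for a second solve: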
import time

token_cache = {}

def get_or_solve(site_key, page_url, solver, cache_ttl=60):
    cache_key = f"{site_key}:{page_url}"
    if cache_key in token_cache:
        token, timestamp = token_cache[cache_key]
        if time.time() - timestamp < cache_ttl:
            return token
    token = solver.solve({
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
    })
    token_cache[cache_key] = (token, time.time())
    return token
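Note: reCAPTCHA tokens expire after about 120 seconds and can be verified only once, so a cached token is good for a single submission. Cloudflare Turnstile tokens are valid for up to 300 seconds and are likewise single-use. Test the effective lifetime for your target site.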
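If the target site accepts each token only once, a cache should hand a token out a single time and drop anything past its lifetime. One way to sketch that is a small pool keyed by site and page; `TokenPool` is an illustrative name, and the 60-second default mirrors `cache_ttl` above rather than any documented limit.

```python
import time
from collections import deque

class TokenPool:
    """Holds pre-solved tokens per key and hands each one out once."""

    def __init__(self, ttl_seconds: float = 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._tokens = {}           # key -> deque of (token, stored_at)

    def put(self, key: str, token: str) -> None:
        """Store a freshly solved token under key."""
        self._tokens.setdefault(key, deque()).append((token, self.clock()))

    def take(self, key: str):
        """Return the oldest unexpired token for key, or None if there is none."""
        queue = self._tokens.get(key, deque())
        while queue:
            token, stored_at = queue.popleft()
            if self.clock() - stored_at < self.ttl:
                return token
        return None
```

A `None` from `take` is the signal to request a fresh solve. Taking the oldest token first keeps tokens from expiring unused while newer ones are consumed.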
3. Report Bad Solves
Report incorrect results to get credits back and improve quality:
def report_bad(task_id):
    requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY,
        "action": "reportbad",
        "id": task_id,
    })
4. Monitor Your Spending
Check your balance regularly and set alerts:
def check_balance():
    resp = requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY,
        "action": "getbalance",
    })
    balance = float(resp.text)
    if balance < 1.0:
        send_alert(f"Low CaptchaAI balance: ${balance:.2f}")
    return balance
5. Use Image OCR for Simple CAPTCHAs
Image CAPTCHAs cost less than token-based ones. If the site uses simple text CAPTCHAs, solve them as images instead of using reCAPTCHA methods.
Architecture Patterns
Producer-Consumer Queue
import asyncio
import os
from asyncio import Queue

import aiohttp

async def captcha_worker(queue, solver, session, results):
    """Worker that processes CAPTCHA tasks from a queue."""
    while True:
        task = await queue.get()
        try:
            token = await solver.solve(session, task["params"])
            results[task["id"]] = token
        except Exception:
            # Record failed solves as None so the caller can retry them
            results[task["id"]] = None
        finally:
            queue.task_done()

async def run_pipeline(tasks, num_workers=5):
    # AsyncCaptchaAI is your async solver client; it is not defined in this snippet
    solver = AsyncCaptchaAI(os.environ["CAPTCHAAI_API_KEY"])
    queue = Queue()
    results = {}
    async with aiohttp.ClientSession() as session:
        # Start workers
        workers = [
            asyncio.create_task(captcha_worker(queue, solver, session, results))
            for _ in range(num_workers)
        ]
        # Add tasks
        for task in tasks:
            await queue.put(task)
        # Wait for completion
        await queue.join()
        # Cancel workers
        for w in workers:
            w.cancel()
    return results
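To try the queue pattern without network access or an API key, the solver can be swapped for a stub. The sketch below is a self-contained variant of the pipeline above: `StubSolver` is a hypothetical stand-in that sleeps briefly and returns a fake token, and the worker takes no HTTP session because the stub does not need one.

```python
import asyncio

class StubSolver:
    """Stand-in for a real solver: sleeps briefly and returns a fake token."""

    async def solve(self, params: dict) -> str:
        await asyncio.sleep(0.01)
        if params.get("fail"):
            raise RuntimeError("simulated solve failure")
        return f"token-for-{params['pageurl']}"

async def worker(queue: asyncio.Queue, solver, results: dict) -> None:
    """Pull tasks off the queue until cancelled."""
    while True:
        task = await queue.get()
        try:
            results[task["id"]] = await solver.solve(task["params"])
        except Exception:
            results[task["id"]] = None  # failed solves are recorded as None
        finally:
            queue.task_done()

async def run_demo(tasks: list, num_workers: int = 3) -> dict:
    queue = asyncio.Queue()
    results = {}
    solver = StubSolver()
    workers = [
        asyncio.create_task(worker(queue, solver, results))
        for _ in range(num_workers)
    ]
    for task in tasks:
        queue.put_nowait(task)
    await queue.join()   # returns once every task has been marked done
    for w in workers:
        w.cancel()
    # Let the cancellations finish before returning
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Running `asyncio.run(run_demo(tasks))` with a mix of normal and `"fail": True` tasks shows that one failed solve does not stall the queue, because `task_done()` is called in the `finally` branch.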
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Solve times over 30s | Server load or complex CAPTCHA | Retry; check CAPTCHA type |
| High cost per page | Solving CAPTCHAs unnecessarily | Cache tokens; check before solving |
| Tokens rejected | Token expired | Submit within 60s of receiving |
| Balance draining fast | Duplicate solves | Deduplicate requests; cache tokens |
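For the duplicate-solve row, one possible approach is to share a single in-flight solve among callers that ask for the same site key and page at the same time. This is a sketch only: `DedupSolver` is an illustrative name and `solve_fn` stands for whatever async function actually submits the CAPTCHA. Since a token is usually accepted once, it fits cases where the duplicate calls represent the same logical request, such as a retried page load.

```python
import asyncio

class DedupSolver:
    """Shares one in-flight solve among callers that ask for the same key."""

    def __init__(self, solve_fn):
        self._solve_fn = solve_fn   # async callable: (site_key, page_url) -> token
        self._in_flight = {}        # (site_key, page_url) -> running task

    async def solve(self, site_key: str, page_url: str) -> str:
        key = (site_key, page_url)
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key: start the solve and remember the task
            task = asyncio.create_task(self._solve_fn(site_key, page_url))
            self._in_flight[key] = task
            # Forget the task once it finishes so later calls solve afresh
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        return await task
```

Two concurrent calls with the same arguments trigger one paid solve and receive the same token; a call made after that solve completes starts a new one.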
FAQ
What's the fastest CAPTCHA type to solve?
Image/OCR CAPTCHAs solve in 3-5 seconds. reCAPTCHA v3 solves in 5-8 seconds. Use the simplest method your target site accepts.
How much does CaptchaAI cost per solve?
Pricing varies by CAPTCHA type. Image CAPTCHAs are the cheapest; Enterprise reCAPTCHA costs more. Check current pricing at captchaai.com.
Can I preload CAPTCHA tokens?
Yes. Submit solves ahead of time and cache the tokens. This reduces perceived latency to near zero, provided each token is used before it expires.