Proxy Rotation for CAPTCHA Scraping

Proxy rotation reduces CAPTCHA frequency by distributing requests across multiple IPs. Combined with CaptchaAI to solve the CAPTCHAs that still appear, it gives you a scraping pipeline that keeps running when a site's anti-bot checks do trigger.

Why Proxy Rotation Reduces CAPTCHAs

Sites trigger CAPTCHAs based on per-IP request patterns:

| Factor | Single IP | Rotating Proxies |
|---|---|---|
| Requests per minute | 10+ triggers CAPTCHA | Distributed across IPs |
| IP reputation | Degrades over time | Fresh IPs from pool |
| Session patterns | Suspicious patterns visible | Patterns spread across IPs |
| Geographic consistency | Single location | Natural geographic diversity |
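
To make the "distributed across IPs" row concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the per-IP limit is an illustrative input rather than a measured threshold for any particular site.

```python
import math

def per_ip_rate(total_requests_per_minute, pool_size):
    """Average requests per minute each IP sends when load is spread evenly."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return total_requests_per_minute / pool_size

def min_pool_size(total_requests_per_minute, max_per_ip):
    """Smallest pool that keeps every IP at or below max_per_ip."""
    if max_per_ip <= 0:
        raise ValueError("max_per_ip must be positive")
    return max(1, math.ceil(total_requests_per_minute / max_per_ip))

# 60 requests/minute over 12 proxies is 5 requests/minute per IP
print(per_ip_rate(60, 12))   # 5.0
# Keeping each IP at or below 8 requests/minute needs at least 8 proxies
print(min_pool_size(60, 8))  # 8
```

Random selection only approximates an even spread, so a real pool usually needs some headroom above the computed minimum.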

Proxy Types for Scraping

| Type | Best For | CAPTCHA Rate | Cost |
|---|---|---|---|
| Residential | High-value targets (Google, Amazon) | Lowest | $$$ |
| Mobile | Ultra-low detection | Lowest | $$$$ |
| ISP/Static | Sustained sessions | Low | $$ |
| Datacenter | High-volume, lenient sites | Higher | $ |

Recommendation: Use residential proxies for sites with aggressive CAPTCHA triggers. Datacenter proxies work for less protected sites.

Basic Proxy Rotation (Python)

import random
import time

import requests
from bs4 import BeautifulSoup

PROXIES = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
    "http://user:pass@proxy3.example.com:8080",
]

API_KEY = "YOUR_API_KEY"

def get_random_proxy():
    # Route both HTTP and HTTPS traffic through the same randomly chosen proxy
    proxy = random.choice(PROXIES)
    return {"http": proxy, "https": proxy}

def scrape_with_rotation(url):
    # One session per proxy, so cookies are never shared across IPs
    session = requests.Session()
    session.proxies = get_random_proxy()
    session.headers.update({
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
    })

    resp = session.get(url, timeout=30)

    # If a reCAPTCHA widget appears, solve it and resubmit from the same session
    if "g-recaptcha" in resp.text or "captcha" in resp.text.lower():
        soup = BeautifulSoup(resp.text, "html.parser")
        rc = soup.find("div", class_="g-recaptcha")
        if rc:
            site_key = rc["data-sitekey"]
            token = solve_captcha(site_key, url)
            resp = session.post(url, data={"g-recaptcha-response": token}, timeout=30)

    return resp.text

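def solve_captcha(site_key, page_url):
    # Submit the task; a successful response has the form "OK|<task_id>"
    resp = requests.get("https://ocr.captchaai.com/in.php", params={
        "key": API_KEY, "method": "userrecaptcha",
        "googlekey": site_key, "pageurl": page_url
    }, timeout=30)
    if not resp.text.startswith("OK|"):
        raise RuntimeError(f"Task submission failed: {resp.text}")
    task_id = resp.text.split("|")[1]

    # Poll every 5 seconds, for up to 5 minutes
    for _ in range(60):
        time.sleep(5)
        result = requests.get("https://ocr.captchaai.com/res.php", params={
            "key": API_KEY, "action": "get", "id": task_id
        }, timeout=30)
        if result.text == "CAPCHA_NOT_READY":
            continue
        if result.text.startswith("OK|"):
            return result.text.split("|", 1)[1]
        raise RuntimeError(f"Solve failed: {result.text}")
    raise TimeoutError("CAPTCHA was not solved within 5 minutes")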

Smart Proxy Rotation

Track which proxies trigger CAPTCHAs and avoid them:

from collections import defaultdict
import random

class SmartProxyRotator:
    def __init__(self, proxies):
        self.proxies = proxies
        self.captcha_count = defaultdict(int)
        self.success_count = defaultdict(int)

    def get_proxy(self):
        # Prefer proxies with lower CAPTCHA rates
        scored = []
        for proxy in self.proxies:
            total = self.captcha_count[proxy] + self.success_count[proxy]
            if total == 0:
                score = 0.5  # Unknown proxy, neutral score
            else:
                score = self.success_count[proxy] / total
            scored.append((proxy, score))

        # Pick uniformly at random from the better-scoring half of the pool
        scored.sort(key=lambda x: x[1], reverse=True)
        top_proxies = scored[:max(len(scored) // 2, 1)]
        proxy = random.choice(top_proxies)[0]
        return proxy

    def report_success(self, proxy):
        self.success_count[proxy] += 1

    def report_captcha(self, proxy):
        self.captcha_count[proxy] += 1

# Usage
rotator = SmartProxyRotator(PROXIES)

def scrape(url):
    proxy = rotator.get_proxy()
    resp = requests.get(url, proxies={"http": proxy, "https": proxy})

    if "captcha" in resp.text.lower():
        rotator.report_captcha(proxy)
        # Solve CAPTCHA...
    else:
        rotator.report_success(proxy)

    return resp.text
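
SmartProxyRotator chooses uniformly among the better-scoring half of the pool. If you would rather have every proxy stay eligible, with the chance of selection proportional to its success rate, one option is the sketch below. WeightedProxyRotator and its floor parameter are hypothetical names introduced for illustration; the floor keeps a proxy with a poor record from being starved forever, so it can recover if its reputation improves.

```python
import random
from collections import defaultdict

class WeightedProxyRotator:
    def __init__(self, proxies, floor=0.05):
        self.proxies = list(proxies)
        self.floor = floor  # Minimum weight, so no proxy is excluded outright
        self.captcha_count = defaultdict(int)
        self.success_count = defaultdict(int)

    def score(self, proxy):
        """Fraction of requests through this proxy that avoided a CAPTCHA."""
        total = self.captcha_count[proxy] + self.success_count[proxy]
        if total == 0:
            return 0.5  # Unknown proxy, neutral score
        return self.success_count[proxy] / total

    def get_proxy(self):
        # random.choices draws with probability proportional to each weight
        weights = [max(self.score(p), self.floor) for p in self.proxies]
        return random.choices(self.proxies, weights=weights, k=1)[0]

    def report_success(self, proxy):
        self.success_count[proxy] += 1

    def report_captcha(self, proxy):
        self.captcha_count[proxy] += 1
```

It exposes the same get_proxy, report_success, and report_captcha methods, so it can stand in for the rotator in the scrape function above.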

Proxy Rotation with Selenium

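import random

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def create_driver_with_proxy(proxy_url):
    # Chrome ignores credentials embedded in --proxy-server (user:pass@host),
    # so pass a host:port endpoint that authenticates by IP allowlist.
    options = Options()
    options.add_argument(f"--proxy-server={proxy_url}")
    options.add_argument("--disable-blink-features=AutomationControlled")
    return webdriver.Chrome(options=options)

# Rotate proxy per session (PROXIES is the list defined earlier)
proxy = random.choice(PROXIES)
driver = create_driver_with_proxy(proxy)
try:
    driver.get("https://example.com")
finally:
    driver.quit()  # Close the browser so the next session starts clean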

Proxy + CAPTCHA Solving for Cloudflare

Cloudflare Challenge solving requires passing a proxy to CaptchaAI:

proxy = "http://user:pass@proxy.example.com:8080"

resp = requests.get("https://ocr.captchaai.com/in.php", params={
    "key": API_KEY,
    "method": "cloudflare_challenge",
    "pageurl": "https://example.com",
    "proxy": proxy,
    "proxytype": "HTTP"
})
if not resp.text.startswith("OK|"):
    raise RuntimeError(f"Task submission failed: {resp.text}")
task_id = resp.text.split("|")[1]

# Poll for cf_clearance cookie
# Use the same proxy for subsequent requests
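
The polling step left as a comment above can be sketched as follows. This assumes the Cloudflare result comes back in the same "CAPCHA_NOT_READY" / "OK|..." convention that solve_captcha handles for reCAPTCHA; the exact layout of the payload after "OK|" is not described here, so the sketch returns it unparsed. poll_result and fetch_result are hypothetical names, and the HTTP call is passed in as a function so the loop can be exercised without network access.

```python
import time

def poll_result(fetch_result, task_id, interval=5.0, max_attempts=60,
                sleep=time.sleep):
    """Poll until the solver returns a result, an error, or attempts run out.

    fetch_result(task_id) must return the raw response body as a string.
    """
    for _ in range(max_attempts):
        sleep(interval)
        body = fetch_result(task_id).strip()
        if body == "CAPCHA_NOT_READY":
            continue  # Still solving, try again after the next interval
        if body.startswith("OK|"):
            # Return everything after the first "|" without interpreting it
            return body.split("|", 1)[1]
        raise RuntimeError(f"Solver returned an error: {body}")
    raise TimeoutError(f"No result after {max_attempts} attempts")

# Assumed wiring, reusing the res.php parameters from solve_captcha:
# fetch = lambda tid: requests.get("https://ocr.captchaai.com/res.php",
#     params={"key": API_KEY, "action": "get", "id": tid}, timeout=30).text
# payload = poll_result(fetch, task_id)
```

Whatever the payload contains, send the follow-up requests through the same proxy that was submitted with the task.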

Best Practices

  1. Match proxy geo to target — Use US proxies for US sites
  2. One session per proxy — Don't reuse sessions across different proxies
  3. Rate limit per proxy — Max 5-10 requests/minute per IP
  4. Monitor CAPTCHA rates — Track which proxies trigger more CAPTCHAs
  5. Use sticky sessions — Keep the same proxy for multi-step workflows
  6. Handle proxy failures — Retry with a different proxy on connection errors
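
Practice 3 can be enforced with a sliding-window counter per proxy. The sketch below is one possible approach, not a prescribed implementation; PerProxyRateLimiter is a hypothetical name and the default of 5 requests per 60 seconds simply mirrors the lower bound in the list above.

```python
import time
from collections import defaultdict, deque

class PerProxyRateLimiter:
    def __init__(self, max_requests=5, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # Injectable so the limiter can be tested
        self.history = defaultdict(deque)  # proxy -> recent request times

    def wait_time(self, proxy):
        """Seconds to wait before this proxy may send another request."""
        now = self.clock()
        stamps = self.history[proxy]
        # Drop timestamps that have aged out of the window
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before a slot frees up
        return self.window - (now - stamps[0])

    def record(self, proxy):
        """Register that a request was just sent through this proxy."""
        self.history[proxy].append(self.clock())
```

Before each request, call wait_time(proxy); if it returns a positive value, either sleep for that long or pick a different proxy. Call record(proxy) right after sending.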

Troubleshooting

| Issue | Fix |
|---|---|
| All proxies trigger CAPTCHAs | Switch to residential proxies; reduce rate |
| Proxy timeout errors | Remove slow proxies from pool; increase timeout |
| Different content per proxy | Some sites serve geo-specific content; normalize |
| CAPTCHA tokens don't work with proxy | Ensure token is used from the same session/IP |
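
For the timeout row, a minimal failover sketch: try a request through one proxy and move to another on a connection error, recording which proxies failed so they can be pruned from the pool. fetch_with_failover and its fetch argument are hypothetical names. The sketch catches OSError, which covers Python's built-in connection and timeout errors; the exceptions raised by requests derive from it as well.

```python
import random

def fetch_with_failover(url, proxies, fetch, max_attempts=3, rng=random):
    """Try up to max_attempts distinct proxies, in random order.

    fetch(url, proxy) must return the response text or raise OSError.
    Returns (text, proxy_used, failed_proxies).
    """
    candidates = list(proxies)
    rng.shuffle(candidates)
    failed = []
    for proxy in candidates[:max_attempts]:
        try:
            return fetch(url, proxy), proxy, failed
        except OSError:
            failed.append(proxy)  # Remember it so the caller can prune the pool
    raise ConnectionError(f"All {len(failed)} attempted proxies failed for {url}")

# Assumed wiring with requests:
# def fetch(url, proxy):
#     return requests.get(url, proxies={"http": proxy, "https": proxy},
#                         timeout=15).text
```

The returned failed list is a natural input for removing slow proxies or for a per-proxy failure counter.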

FAQ

Do I need proxies if I use CaptchaAI?

Not strictly — CaptchaAI can solve CAPTCHAs regardless. But proxies reduce how often CAPTCHAs appear, saving time and API costs.

Should I use the same proxy for CAPTCHA solving and scraping?

For most CAPTCHA types, the token is valid regardless of IP. For Cloudflare Challenge, you must use the same proxy since the cf_clearance cookie is IP-bound.
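
One way to keep a multi-step workflow on a single IP is to derive the proxy from a stable workflow identifier instead of choosing at random on every request. The sketch below is an assumption-laden illustration: sticky_proxy and the workflow IDs are hypothetical, and the mapping only stays stable while the proxy list itself is unchanged.

```python
import hashlib

def sticky_proxy(workflow_id, proxies):
    """Map a workflow ID to the same proxy on every call."""
    if not proxies:
        raise ValueError("proxies must not be empty")
    # Hash the ID so assignments spread across the pool deterministically
    digest = hashlib.sha256(workflow_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(proxies)
    return proxies[index]

pool = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
    "http://user:pass@proxy3.example.com:8080",
]
# Both calls return the same proxy, so a cookie bound to that IP stays valid
first = sticky_proxy("checkout-session-42", pool)
second = sticky_proxy("checkout-session-42", pool)
print(first == second)  # True
```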

How many proxies do I need?

For moderate scraping (1,000 pages/day), 10-20 rotating residential proxies suffice. For high volume, use a proxy provider with automatic rotation.
