Selenium Grid + CaptchaAI: Distributed CAPTCHA Solving

Selenium Grid distributes browser sessions across multiple machines. Combined with CaptchaAI, you can solve CAPTCHAs at scale — running dozens or hundreds of browser instances in parallel, each solving CAPTCHAs independently through the same API.

This guide covers Grid setup (Selenium 4), node configuration, CaptchaAI integration, and scaling patterns for high-throughput CAPTCHA automation.


Architecture Overview

┌─────────────┐     ┌──────────────┐     ┌──────────────┐
│  Test Script │────▶│  Grid Hub    │────▶│  Node 1      │
│  (Client)    │     │  (Router)    │     │  Chrome x 5  │
└─────────────┘     └──────────────┘     └──────────────┘
                           │              ┌──────────────┐
                           ├─────────────▶│  Node 2      │
                           │              │  Chrome x 5  │
                           │              └──────────────┘
                           │              ┌──────────────┐
                           └─────────────▶│  Node 3      │
                                          │  Chrome x 5  │
                                          └──────────────┘

All nodes share ──▶ CaptchaAI API (single API key)

Selenium Grid 4 Setup

Docker Compose

version: "3"
services:
  selenium-hub:
    image: selenium/hub:4.21.0
    container_name: selenium-hub
    ports:
      - "4442:4442"
      - "4443:4443"
      - "4444:4444"

  chrome-node-1:
    image: selenium/node-chrome:4.21.0
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=5
      - SE_NODE_OVERRIDE_MAX_SESSIONS=true

  chrome-node-2:
    image: selenium/node-chrome:4.21.0
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=5
      - SE_NODE_OVERRIDE_MAX_SESSIONS=true

  chrome-node-3:
    image: selenium/node-chrome:4.21.0
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=5
      - SE_NODE_OVERRIDE_MAX_SESSIONS=true

docker-compose up -d
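
After the containers start, the hub can take a few seconds to register its nodes, so a script that connects immediately may fail. The following is a minimal sketch of a readiness wait, assuming the Grid `/status` endpoint reports `value.ready` as it does in the monitoring section of this guide. The names `fetch_grid_status` and `wait_for_grid`, and the `fetch_status` and `sleep` parameters, are hypothetical and exist only to make the helper easy to test.

```python
import json
import time
import urllib.request


def fetch_grid_status(grid_url="http://localhost:4444"):
    """Return the parsed JSON body of the Grid /status endpoint."""
    with urllib.request.urlopen(f"{grid_url}/status", timeout=5) as resp:
        return json.load(resp)


def wait_for_grid(fetch_status=fetch_grid_status, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll until the Grid reports ready; return True on success, False on timeout."""
    for _ in range(attempts):
        try:
            if fetch_status().get("value", {}).get("ready"):
                return True
        except (OSError, ValueError):
            # Hub not reachable yet, or the response was not valid JSON.
            pass
        sleep(delay)
    return False
```

Calling `wait_for_grid()` before submitting tasks gives the hub up to about a minute (30 attempts, 2 seconds apart) to come up.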

CaptchaAI Client for Grid

import requests
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from concurrent.futures import ThreadPoolExecutor, as_completed


class GridCaptchaSolver:
    CAPTCHAAI_URL = "https://ocr.captchaai.com"

    def __init__(self, api_key, grid_url="http://localhost:4444"):
        self.api_key = api_key
        self.grid_url = grid_url

    def create_session(self):
        """Create a new browser session on the Grid."""
        options = webdriver.ChromeOptions()
        options.add_argument("--no-sandbox")
        options.add_argument("--disable-blink-features=AutomationControlled")
        options.add_argument("--window-size=1920,1080")

        driver = webdriver.Remote(
            command_executor=self.grid_url,
            options=options,
        )
        return driver

    def solve_recaptcha_v2(self, site_url, sitekey):
        """Solve reCAPTCHA v2 via CaptchaAI API."""
        # Submit
        resp = requests.post(f"{self.CAPTCHAAI_URL}/in.php", data={
            "key": self.api_key,
            "method": "userrecaptcha",
            "googlekey": sitekey,
            "pageurl": site_url,
            "json": 1,
        })
        data = resp.json()
        if data["status"] != 1:
            raise Exception(f"Submit: {data['request']}")

        task_id = data["request"]

        # Poll
        for _ in range(60):
            time.sleep(5)
            resp = requests.get(f"{self.CAPTCHAAI_URL}/res.php", params={
                "key": self.api_key, "action": "get",
                "id": task_id, "json": 1,
            })
            data = resp.json()
            if data["request"] == "CAPCHA_NOT_READY":
                continue
            if data["status"] != 1:
                raise Exception(f"Solve: {data['request']}")
            return data["request"]

        raise Exception("Timeout")

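    def solve_turnstile(self, site_url, sitekey):
        """Solve Cloudflare Turnstile via CaptchaAI API."""
        resp = requests.post(f"{self.CAPTCHAAI_URL}/in.php", data={
            # The API key stays under "key"; the widget key goes under "sitekey".
            # (A dict with two "key" entries silently keeps only the last one.)
            "key": self.api_key, "method": "turnstile",
            "sitekey": sitekey, "pageurl": site_url, "json": 1,
        })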
        data = resp.json()
        if data["status"] != 1:
            raise Exception(f"Submit: {data['request']}")

        task_id = data["request"]
        for _ in range(60):
            time.sleep(5)
            resp = requests.get(f"{self.CAPTCHAAI_URL}/res.php", params={
                "key": self.api_key, "action": "get",
                "id": task_id, "json": 1,
            })
            data = resp.json()
            if data["request"] == "CAPCHA_NOT_READY":
                continue
            if data["status"] != 1:
                raise Exception(f"Solve: {data['request']}")
            return data["request"]

        raise Exception("Timeout")

    def process_task(self, task):
        """Process a single CAPTCHA-protected task on a Grid node."""
        driver = self.create_session()

        try:
            driver.get(task["url"])
            time.sleep(2)

            # Detect sitekey
            sitekey = task.get("sitekey")
            if not sitekey:
                sitekey = driver.execute_script(
                    "return document.querySelector('[data-sitekey]')?.getAttribute('data-sitekey')"
                )

            if not sitekey:
                return {"url": task["url"], "status": "no_captcha", "data": driver.page_source[:500]}

            # Solve
            token = self.solve_recaptcha_v2(task["url"], sitekey)

            # Inject: pass the token as a script argument instead of
            # interpolating it into the JavaScript source.
            driver.execute_script("""
                const token = arguments[0];
                document.querySelectorAll(
                    '#g-recaptcha-response, [name="g-recaptcha-response"]'
                ).forEach(el => { el.value = token; });
            """, token)

            # Fill form and submit
            if task.get("form_data"):
                for field, value in task["form_data"].items():
                    driver.find_element(By.NAME, field).send_keys(value)

            if task.get("submit_selector"):
                driver.find_element(By.CSS_SELECTOR, task["submit_selector"]).click()
                time.sleep(3)

            return {
                "url": task["url"],
                "status": "success",
                "result_url": driver.current_url,
                "data": driver.page_source[:1000],
            }

        except Exception as e:
            return {"url": task["url"], "status": "error", "error": str(e)}

        finally:
            driver.quit()
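
Both solve methods above repeat the same polling loop. One way to factor it out is sketched below, assuming the response shape already used in this guide (`status`, `request`, and the `CAPCHA_NOT_READY` marker). The function name `poll_for_token` and the `fetch_result` callable are hypothetical; `fetch_result` stands in for the `requests.get(...).json()` call to `res.php`.

```python
import time


def poll_for_token(fetch_result, task_id, attempts=60, interval=5.0, sleep=time.sleep):
    """Poll until a token is ready.

    fetch_result(task_id) must return the decoded JSON dict from res.php.
    """
    for _ in range(attempts):
        sleep(interval)
        data = fetch_result(task_id)
        if data.get("request") == "CAPCHA_NOT_READY":
            continue  # still being solved, try again
        if data.get("status") != 1:
            raise RuntimeError(f"Solve failed: {data.get('request')}")
        return data["request"]
    raise TimeoutError(f"No result for task {task_id} after {attempts} polls")
```

With this helper, each solve method only needs to build its submit payload and then hand the returned task ID to `poll_for_token`.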

Parallel Execution

def run_parallel_tasks(api_key, tasks, max_workers=10):
    """Run CAPTCHA tasks in parallel across Grid nodes."""
    solver = GridCaptchaSolver(api_key)
    results = []

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(solver.process_task, task): task
            for task in tasks
        }

        for future in as_completed(futures):
            task = futures[future]
            try:
                result = future.result(timeout=600)
                results.append(result)
                print(f"[{result['status']}] {result['url']}")
            except Exception as e:
                results.append({
                    "url": task["url"],
                    "status": "exception",
                    "error": str(e),
                })

    return results


# Usage
tasks = [
    {
        "url": "https://site-a.com/form",
        "sitekey": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
        "form_data": {"name": "Test User", "email": "test@example.com"},
        "submit_selector": "#submit",
    },
    {
        "url": "https://site-b.com/register",
        "sitekey": "6LdKlZEpAAAAAAOQjzC2v_mJ-",
        "form_data": {"username": "testuser"},
        "submit_selector": "button[type='submit']",
    },
    # Add more tasks...
]

results = run_parallel_tasks("YOUR_API_KEY", tasks, max_workers=15)

# Summary
success = sum(1 for r in results if r["status"] == "success")
print(f"\nCompleted: {success}/{len(results)} successful")

Grid Status Monitoring

import requests

def check_grid_status(grid_url="http://localhost:4444"):
    """Check Selenium Grid status and available nodes."""
    try:
        # A timeout keeps the check from hanging if the hub is unreachable.
        resp = requests.get(f"{grid_url}/status", timeout=10)
        data = resp.json()

        nodes = data.get("value", {}).get("nodes", [])
        total_slots = 0
        available_slots = 0

        print(f"Grid Status: {data['value']['ready']}")
        print(f"Nodes: {len(nodes)}")

        for i, node in enumerate(nodes):
            slots = node.get("slots", [])
            free = sum(1 for s in slots if not s.get("session"))
            total_slots += len(slots)
            available_slots += free
            print(f"  Node {i+1}: {free}/{len(slots)} slots available")

        print(f"Total capacity: {available_slots}/{total_slots} available")
        return available_slots

    except Exception as e:
        print(f"Grid check failed: {e}")
        return 0


# Adjust workers based on grid capacity
available = check_grid_status()
optimal_workers = min(available, 20)
print(f"Optimal workers: {optimal_workers}")
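
One caveat when feeding this number into `ThreadPoolExecutor`: `check_grid_status` returns 0 when the Grid is unreachable or fully busy, and `ThreadPoolExecutor(max_workers=0)` raises `ValueError`. A small sketch of a safer clamp follows; the name `pick_worker_count` is hypothetical, and the cap of 20 is carried over from the snippet above.

```python
def pick_worker_count(available_slots, task_count, cap=20):
    """Clamp the thread count to free Grid slots, the task count, and a hard cap.

    Always returns at least 1 so ThreadPoolExecutor accepts the value.
    """
    return max(1, min(available_slots, task_count, cap))
```

With a single worker and no free slots, new session requests simply wait in the Grid queue until a slot frees up.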

Auto-Scaling with Kubernetes

# selenium-grid-k8s.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: selenium-chrome-node
spec:
  replicas: 5
  selector:
    matchLabels:
      app: selenium-chrome
  template:
    metadata:
      labels:
        app: selenium-chrome
    spec:
      containers:
        - name: chrome
          image: selenium/node-chrome:4.21.0
          env:
            - name: SE_EVENT_BUS_HOST
              value: selenium-hub
            - name: SE_EVENT_BUS_PUBLISH_PORT
              value: "4442"
            - name: SE_EVENT_BUS_SUBSCRIBE_PORT
              value: "4443"
            - name: SE_NODE_MAX_SESSIONS
              value: "3"
          resources:
            limits:
              memory: "2Gi"
              cpu: "1"
            requests:
              memory: "1Gi"
              cpu: "500m"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: chrome-node-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: selenium-chrome-node
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Java Integration

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import java.net.URL;
import java.net.http.*;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.*;

public class GridCaptchaSolver {
    private final String apiKey;
    private final String gridUrl;
    private final HttpClient httpClient;

    public GridCaptchaSolver(String apiKey, String gridUrl) {
        this.apiKey = apiKey;
        this.gridUrl = gridUrl;
        this.httpClient = HttpClient.newHttpClient();
    }

    public WebDriver createSession() throws Exception {
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--no-sandbox", "--window-size=1920,1080");
        return new RemoteWebDriver(new URL(gridUrl), options);
    }

    public List<Map<String, String>> runParallel(
        List<Map<String, String>> tasks, int workers
    ) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(workers);
        List<Future<Map<String, String>>> futures = new ArrayList<>();

        for (Map<String, String> task : tasks) {
            futures.add(executor.submit(() -> processTask(task)));
        }

        List<Map<String, String>> results = new ArrayList<>();
        for (Future<Map<String, String>> future : futures) {
            results.add(future.get(600, TimeUnit.SECONDS));
        }

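        executor.shutdown();
        return results;
    }

    // Placeholder so the class compiles: it only loads the page. The CAPTCHA
    // solve and token injection steps from the Python process_task are omitted.
    private Map<String, String> processTask(Map<String, String> task) throws Exception {
        WebDriver driver = createSession();
        try {
            driver.get(task.get("url"));
            return Map.of("url", task.get("url"), "status", "loaded");
        } finally {
            driver.quit();
        }
    }
}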

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| SessionNotCreated | No available slots | Increase node count or SE_NODE_MAX_SESSIONS |
| Timeout on Grid | Node overloaded | Reduce concurrent sessions per node |
| WebDriverException | Node disconnected | Add retry logic for session creation |
| Memory exhaustion | Too many browser instances | Set resource limits and max sessions |
| CAPTCHA solve timeout | API under load | Increase poll timeout, add retries |
| Stale sessions | Grid cleanup delay | Set SE_NODE_SESSION_TIMEOUT on the nodes |
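
The "add retry logic for session creation" fix can be sketched as follows. This is one possible approach, not the only one: it wraps any zero-argument session factory (for example `solver.create_session`) and backs off exponentially between attempts. The name `create_session_with_retry` and its parameters are hypothetical.

```python
import time


def create_session_with_retry(create_session, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call create_session() until it succeeds, backing off exponentially between tries."""
    last_error = None
    for attempt in range(attempts):
        try:
            return create_session()
        except Exception as exc:  # in practice, narrow this to WebDriverException
            last_error = exc
            if attempt < attempts - 1:
                # Waits base_delay, then 2x, then 4x, and so on.
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Session creation failed after {attempts} attempts") from last_error
```

In `process_task`, this would replace the direct call: `driver = create_session_with_retry(self.create_session)`.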

FAQ

How many concurrent sessions can I run?

Each Grid node typically handles 3-5 Chrome sessions. With 10 nodes, you get 30-50 parallel sessions. CaptchaAI handles the API-side concurrency.
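
The arithmetic behind those figures, extended to a rough hourly throughput, can be sketched like this. The 40 seconds per task is purely an illustrative assumption (page load plus solve plus submit); measure your own workload before relying on it. The function name `estimate_capacity` is hypothetical.

```python
def estimate_capacity(nodes, sessions_per_node, seconds_per_task):
    """Return (parallel_sessions, tasks_per_hour) for a Grid of identical nodes."""
    parallel_sessions = nodes * sessions_per_node
    tasks_per_hour = parallel_sessions * 3600 / seconds_per_task
    return parallel_sessions, tasks_per_hour


# 10 nodes at 3 and at 5 sessions each, assuming 40 s per task
low = estimate_capacity(nodes=10, sessions_per_node=3, seconds_per_task=40)
high = estimate_capacity(nodes=10, sessions_per_node=5, seconds_per_task=40)
print(low, high)
```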

Should I run one solve per session or reuse sessions?

Create a new session per task for isolation. Reuse sessions only if tasks share the same domain and cookies.
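
If you do want to reuse sessions, a first step is grouping tasks by host so that each group can run in one browser session. A minimal sketch, assuming tasks are dicts with a `url` key as in the examples above; `group_tasks_by_host` is a hypothetical helper name.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_tasks_by_host(tasks):
    """Bucket tasks by URL hostname so each bucket can share one browser session."""
    groups = defaultdict(list)
    for task in tasks:
        groups[urlsplit(task["url"]).hostname].append(task)
    return dict(groups)
```

Each group could then be handed to a worker that opens one session, processes the group's tasks in order, and quits the driver at the end.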

Does Selenium Grid 4 support dynamic scaling?

Yes. With Kubernetes and HPA (Horizontal Pod Autoscaler), nodes scale automatically based on CPU utilization.
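
To see how the HPA arrives at a replica count, the core rule from the Kubernetes documentation is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to `minReplicas` and `maxReplicas`. The sketch below applies that rule with the values from the manifest in this guide (70% target, 2 to 20 replicas). It is a simplification: the real controller also applies a tolerance band and stabilization windows, and CPU utilization is measured relative to the container's CPU request. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct=70,
                     min_replicas=2, max_replicas=20):
    """ceil(current * currentMetric / targetMetric), clamped to the HPA bounds."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 5 replicas averaging 90% CPU would scale to 7, while 5 replicas averaging 10% would drop to the floor of 2. Because this signal is CPU rather than queued session requests, idle browsers waiting on a CAPTCHA solve may not trigger scale-up.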

Can I mix browser types on the Grid?

Yes. Add Firefox or Edge nodes alongside Chrome. CaptchaAI's API is browser-agnostic — it only needs sitekey and URL.



Scale CAPTCHA solving across distributed browser instances — get your CaptchaAI key and deploy with Selenium Grid.
