Your scraper solves reCAPTCHA v2, Turnstile, and image CAPTCHAs concurrently through one shared pool of 50 slots. When the reCAPTCHA service slows down, every slot fills with waiting reCAPTCHA tasks, and Turnstile and image solves queue behind them. The bulkhead pattern partitions those resources into isolated compartments so one failing type can't starve the others.
How Bulkheads Work
Named after ship compartments that contain flooding, the pattern assigns each CAPTCHA type its own resource pool:
| Pool | Max Concurrent | Queued | Effect of Failure |
|---|---|---|---|
| reCAPTCHA | 20 | 10 | Only reCAPTCHA tasks slow down |
| Turnstile | 15 | 10 | Turnstile keeps solving normally |
| Image | 10 | 20 | Image queue stays independent |
| Default | 5 | 5 | Unknown types get minimal resources |
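To make the starvation effect concrete, here is a small simulation sketch. It makes no network calls: `fake_solve`, the pool sizes, and the delays are invented for illustration, with `asyncio.sleep` standing in for a solve. It compares how long Turnstile jobs take when they share one semaphore with a degraded reCAPTCHA workload versus when each type has its own.

```python
import asyncio
import time


async def fake_solve(kind: str, sem: asyncio.Semaphore, delays: dict) -> tuple:
    """Simulated solve: wait for a slot, then sleep for the type's solve time."""
    start = time.monotonic()
    async with sem:
        await asyncio.sleep(delays[kind])
    return kind, time.monotonic() - start


async def run(shared: bool) -> float:
    """Return the slowest Turnstile completion time, in seconds."""
    # Hypothetical numbers: reCAPTCHA is degraded, Turnstile is healthy.
    delays = {"recaptcha": 0.2, "turnstile": 0.01}
    if shared:
        pool = asyncio.Semaphore(4)  # one pool for everything
        sems = {"recaptcha": pool, "turnstile": pool}
    else:
        # Same total capacity, split into two compartments.
        sems = {"recaptcha": asyncio.Semaphore(2), "turnstile": asyncio.Semaphore(2)}

    jobs = [fake_solve("recaptcha", sems["recaptcha"], delays) for _ in range(8)]
    jobs += [fake_solve("turnstile", sems["turnstile"], delays) for _ in range(4)]
    results = await asyncio.gather(*jobs)
    return max(elapsed for kind, elapsed in results if kind == "turnstile")


async def compare() -> tuple:
    return await run(shared=True), await run(shared=False)


if __name__ == "__main__":
    shared_wait, isolated_wait = asyncio.run(compare())
    print(f"shared pool: slowest Turnstile took {shared_wait:.2f}s")
    print(f"bulkheads:   slowest Turnstile took {isolated_wait:.2f}s")
```

In the shared run, Turnstile jobs cannot start until reCAPTCHA jobs release slots, so their latency tracks the slow service. In the isolated run they finish in a few hundredths of a second regardless.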
Python: Semaphore Bulkheads
import asyncio
import aiohttp
import time
from dataclasses import dataclass
API_KEY = "YOUR_API_KEY"
SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"
@dataclass
class BulkheadConfig:
    max_concurrent: int
    max_queued: int
    timeout: int = 180
class BulkheadFullError(Exception):
    """Raised when a bulkhead's wait queue is already full."""


class Bulkhead:
    """Resource-limited compartment for a CAPTCHA type."""

    def __init__(self, name: str, config: BulkheadConfig):
        self.name = name
        self._semaphore = asyncio.Semaphore(config.max_concurrent)
        self._max_queued = config.max_queued
        self._queued = 0
        self._active = 0
        self._rejected = 0
        self.timeout = config.timeout

    @property
    def stats(self) -> dict:
        return {
            "name": self.name,
            "active": self._active,
            "queued": self._queued,
            "rejected": self._rejected,
        }

    async def execute(self, coro):
        """Run a coroutine within the bulkhead's resource limits."""
        if self._queued >= self._max_queued:
            self._rejected += 1
            coro.close()  # avoids a "coroutine was never awaited" warning
            raise BulkheadFullError(
                f"Bulkhead '{self.name}' full: {self._active} active, "
                f"{self._queued} queued (max {self._max_queued})"
            )

        self._queued += 1
        try:
            await self._semaphore.acquire()
        except BaseException:
            coro.close()  # cancelled while waiting for a slot
            raise
        finally:
            # Leave the queue exactly once, whether or not a slot was acquired.
            self._queued -= 1

        self._active += 1
        try:
            # A timeout here must not touch _queued; the task already left the queue.
            return await asyncio.wait_for(coro, timeout=self.timeout)
        finally:
            self._active -= 1
            self._semaphore.release()
class IsolatedCaptchaSolver:
    """CAPTCHA solver with bulkhead isolation per type."""

    def __init__(self, api_key: str, bulkheads: dict[str, BulkheadConfig] | None = None):
        self.api_key = api_key
        defaults = {
            "recaptcha": BulkheadConfig(max_concurrent=20, max_queued=10),
            "turnstile": BulkheadConfig(max_concurrent=15, max_queued=10),
            "image": BulkheadConfig(max_concurrent=10, max_queued=20),
            "default": BulkheadConfig(max_concurrent=5, max_queued=5),
        }
        # Caller-supplied configs override the defaults by name.
        configs = {**defaults, **(bulkheads or {})}
        self._bulkheads = {name: Bulkhead(name, cfg) for name, cfg in configs.items()}

    def _get_bulkhead(self, method: str) -> Bulkhead:
        if "recaptcha" in method:
            return self._bulkheads["recaptcha"]
        if method == "turnstile":
            return self._bulkheads["turnstile"]
        if method in ("base64", "post"):
            return self._bulkheads["image"]
        return self._bulkheads["default"]

    async def _submit_and_poll(self, session: aiohttp.ClientSession, params: dict) -> str:
        # Build a new dict so the caller's params are not mutated.
        payload = {**params, "key": self.api_key, "json": 1}
        async with session.post(SUBMIT_URL, data=payload) as resp:
            data = await resp.json(content_type=None)
        if data.get("status") != 1:
            raise RuntimeError(f"Submit failed: {data.get('request')}")

        task_id = data["request"]
        poll_params = {"key": self.api_key, "action": "get", "id": task_id, "json": 1}
        # Poll every 5 seconds; the bulkhead's timeout may end the wait earlier.
        for _ in range(60):
            await asyncio.sleep(5)
            async with session.get(RESULT_URL, params=poll_params) as resp:
                poll = await resp.json(content_type=None)
            if poll.get("request") == "CAPCHA_NOT_READY":
                continue
            if poll.get("status") == 1:
                return poll["request"]
            raise RuntimeError(f"Solve failed: {poll.get('request')}")
        raise RuntimeError("Timeout")

    async def solve(self, params: dict) -> str:
        """Solve a CAPTCHA within its type-specific bulkhead."""
        method = params.get("method", "default")
        bulkhead = self._get_bulkhead(method)
        async with aiohttp.ClientSession() as session:
            return await bulkhead.execute(
                self._submit_and_poll(session, params)
            )

    def get_stats(self) -> list[dict]:
        return [bh.stats for bh in self._bulkheads.values()]
# --- Usage ---
async def main():
    solver = IsolatedCaptchaSolver(API_KEY)
    tasks = []

    # 30 reCAPTCHA: 20 run and 10 queue, which fills the recaptcha bulkhead
    for _ in range(30):
        tasks.append(solver.solve({
            "method": "userrecaptcha",
            "googlekey": "SITEKEY_A",
            "pageurl": "https://site-a.com",
        }))

    # 10 Turnstile: runs in its own pool, unaffected by reCAPTCHA
    for _ in range(10):
        tasks.append(solver.solve({
            "method": "turnstile",
            "sitekey": "SITEKEY_B",
            "pageurl": "https://site-b.com",
        }))

    results = await asyncio.gather(*tasks, return_exceptions=True)
    solved = sum(1 for r in results if isinstance(r, str))
    rejected = sum(1 for r in results if isinstance(r, BulkheadFullError))
    errors = sum(
        1 for r in results
        if isinstance(r, Exception) and not isinstance(r, BulkheadFullError)
    )
    print(f"Solved: {solved}, Rejected: {rejected}, Errors: {errors}")
    for stat in solver.get_stats():
        print(f"  {stat['name']}: rejected={stat['rejected']}")


if __name__ == "__main__":
    asyncio.run(main())
JavaScript: Bulkhead with Concurrency Limiter
const API_KEY = "YOUR_API_KEY";
const SUBMIT_URL = "https://ocr.captchaai.com/in.php";
const RESULT_URL = "https://ocr.captchaai.com/res.php";
class Bulkhead {
  constructor(name, maxConcurrent, maxQueued) {
    this.name = name;
    this.maxConcurrent = maxConcurrent;
    this.maxQueued = maxQueued;
    this.active = 0;
    this.queue = [];
    this.rejected = 0;
  }

  async execute(fn) {
    if (this.active >= this.maxConcurrent) {
      if (this.queue.length >= this.maxQueued) {
        this.rejected++;
        throw new Error(`Bulkhead '${this.name}' full`);
      }
      // Wait until a finishing task hands its slot over to this waiter.
      await new Promise((resolve) => this.queue.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await fn();
    } finally {
      const next = this.queue.shift();
      if (next) {
        // Pass the slot directly to the next waiter; `active` stays unchanged,
        // so a new caller cannot slip in and push the pool over its limit.
        next();
      } else {
        this.active--;
      }
    }
  }
}
const bulkheads = {
  recaptcha: new Bulkhead("recaptcha", 20, 10),
  turnstile: new Bulkhead("turnstile", 15, 10),
  image: new Bulkhead("image", 10, 20),
  default: new Bulkhead("default", 5, 5),
};

// Same routing as the Python version; a missing method falls back to "default".
function getBulkhead(method = "default") {
  if (method.includes("recaptcha")) return bulkheads.recaptcha;
  if (method === "turnstile") return bulkheads.turnstile;
  if (method === "base64" || method === "post") return bulkheads.image;
  return bulkheads.default;
}
async function submitAndPoll(params) {
  const body = new URLSearchParams({ key: API_KEY, json: "1", ...params });
  const resp = await (await fetch(SUBMIT_URL, { method: "POST", body })).json();
  if (resp.status !== 1) throw new Error(`Submit: ${resp.request}`);

  const taskId = resp.request;
  // Poll every 5 seconds, up to 60 times.
  for (let i = 0; i < 60; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    const url = `${RESULT_URL}?key=${API_KEY}&action=get&id=${taskId}&json=1`;
    const poll = await (await fetch(url)).json();
    if (poll.request === "CAPCHA_NOT_READY") continue;
    if (poll.status === 1) return poll.request;
    throw new Error(`Solve: ${poll.request}`);
  }
  throw new Error("Timeout");
}

async function solve(params) {
  const bulkhead = getBulkhead(params.method);
  return bulkhead.execute(() => submitAndPoll(params));
}
// Usage: Turnstile solves continue even if reCAPTCHA is overloaded.
// Top-level await requires running this file as an ES module.
const results = await Promise.allSettled([
  ...Array(30).fill(null).map(() =>
    solve({ method: "userrecaptcha", googlekey: "SITEKEY_A", pageurl: "https://site-a.com" })
  ),
  ...Array(10).fill(null).map(() =>
    solve({ method: "turnstile", sitekey: "SITEKEY_B", pageurl: "https://site-b.com" })
  ),
]);

const fulfilled = results.filter((r) => r.status === "fulfilled").length;
const rejected = results.filter((r) => r.status === "rejected").length;
console.log(`Solved: ${fulfilled}, Rejected: ${rejected}`);
Sizing Bulkheads
| Factor | Guidance |
|---|---|
| Average solve time | Longer solve times need more slots for same throughput |
| Request volume per type | Allocate more slots to higher-volume types |
| Failure tolerance | Smaller pools = less resource waste during outages |
| API rate limits | Combined max_concurrent across all pools shouldn't exceed what your API plan allows |
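The first two rows of the table can be turned into a rough starting estimate with Little's Law (average in-flight requests = arrival rate × time per request). The sketch below is only that, an estimate: `size_pools` is a hypothetical helper, and the traffic figures are invented so that the result lines up with the pool sizes used earlier.

```python
import math


def size_pools(workload: dict, total_slots: int) -> dict:
    """Estimate slots per type.

    workload maps a type name to (requests per second, average solve seconds).
    """
    # Little's Law: slots needed = arrival rate x average solve time.
    needed = {name: rate * secs for name, (rate, secs) in workload.items()}
    total_needed = sum(needed.values())

    if total_needed <= total_slots:
        # Everything fits: give each type what it needs, rounded down.
        return {name: max(1, math.floor(n)) for name, n in needed.items()}

    # Over budget: shrink every pool in proportion to its share of demand.
    # The max(1, ...) floor can push the total slightly over for tiny pools.
    return {
        name: max(1, math.floor(n * total_slots / total_needed))
        for name, n in needed.items()
    }


# Hypothetical traffic: (requests/second, average solve time in seconds)
traffic = {"recaptcha": (1.0, 20.0), "turnstile": (1.5, 10.0), "image": (2.0, 5.0)}
print(size_pools(traffic, total_slots=50))
# {'recaptcha': 20, 'turnstile': 15, 'image': 10}
```

Treat the output as a first guess and adjust it against the rejection counts each bulkhead reports under real traffic.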
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| All requests rejected | Bulkhead too small for traffic | Increase max_concurrent or max_queued |
| One type still affects others | Wrong bulkhead mapping | Verify _get_bulkhead routes the method correctly |
| Queue grows unbounded | No queue limit set | Always set max_queued to prevent memory issues |
| Deadlock under load | Semaphore not released on error | Use try/finally to always release the semaphore |
| Stats never show rejections, even at peak | Pool may be larger than it needs to be | Size pools from observed traffic and shift spare slots to busier types |
FAQ
How do I choose bulkhead sizes?
Start by splitting your total capacity in proportion to each type's expected peak concurrency. Then monitor rejection rates: if a pool rejects frequently, increase its size. If it rarely fills, reduce it and give the capacity to busier pools.
Should I combine bulkheads with circuit breakers?
Yes. The bulkhead limits concurrency and the circuit breaker stops sending requests when failure rates are too high. Together, they prevent resource exhaustion and avoid hammering a failing service.
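As a rough illustration of how the two patterns can be layered, the sketch below wraps any awaitable call (for example a `bulkhead.execute(...)` call) in a simple consecutive-failure circuit breaker. `CircuitBreaker`, `CircuitOpenError`, `guarded`, and the thresholds are all hypothetical names and values chosen for this example; they are not part of the code above or of any library.

```python
import asyncio
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    """Hypothetical breaker that opens after N consecutive failures."""

    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open before probing
        self._failures = 0
        self._opened_at = None

    def allow(self) -> bool:
        """True if closed, or if open long enough to let a probe through."""
        if self._opened_at is None:
            return True
        return time.monotonic() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = time.monotonic()


async def guarded(breaker: CircuitBreaker, make_call):
    """Check the breaker, then await make_call() and record the outcome."""
    if not breaker.allow():
        raise CircuitOpenError("circuit open; skipping call")
    try:
        result = await make_call()
    except Exception:
        breaker.record_failure()
        raise
    breaker.record_success()
    return result


async def demo() -> list:
    breaker = CircuitBreaker(max_failures=2, reset_after=60.0)

    async def failing_solve():
        raise RuntimeError("Submit failed")

    outcomes = []
    for _ in range(4):
        try:
            await guarded(breaker, failing_solve)
        except CircuitOpenError:
            outcomes.append("skipped")
        except RuntimeError:
            outcomes.append("failed")
    return outcomes


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ['failed', 'failed', 'skipped', 'skipped']
```

One breaker per bulkhead keeps the isolation intact. In a real setup you would probably not count local rejections (a full bulkhead) as service failures, since they say nothing about the health of the remote API.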
What happens to rejected requests?
Rejected tasks get a BulkheadFullError. Your caller decides what to do: retry after a delay, route to a different pool, or return a cached result. Don't silently drop rejected tasks.
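One of those options, retrying after a delay, might look like the sketch below. `solve_with_retry` is a hypothetical wrapper; the attempt count and delays are placeholders. `BulkheadFullError` is redefined here only so the snippet runs on its own, and in the article's code you would import the existing class instead.

```python
import asyncio
import random


class BulkheadFullError(Exception):
    """Stand-in for the article's exception so this sketch is self-contained."""


async def solve_with_retry(solve, params: dict, attempts: int = 3, base_delay: float = 1.0):
    """Retry only on BulkheadFullError, using exponential backoff with jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await solve(params)
        except BulkheadFullError:
            if attempt == attempts - 1:
                raise  # surface the rejection instead of silently dropping it
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries so rejected tasks don't return in lockstep.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
```

Other exceptions (submit failures, timeouts) pass straight through, so the wrapper only reacts to a full pool and leaves real solve errors to the caller.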
Next Steps
Isolate CAPTCHA failures properly — get your CaptchaAI API key and implement bulkheads.