CAPTCHA Solve Rate SLI/SLO: How to Define and Monitor

"Our CAPTCHA solving works most of the time" isn't a reliability target. SLIs (Service Level Indicators) and SLOs (Service Level Objectives) give you measurable thresholds, error budgets, and actionable alerts for your CAPTCHA pipeline.

Definitions

| Term | Meaning | CAPTCHA Example |
| --- | --- | --- |
| SLI | A metric that measures service quality | Solve success rate: 94.2% |
| SLO | A target value for an SLI | Solve success rate ≥ 92% over 30 days |
| Error Budget | Allowed failures before SLO breach | 8% failure budget = 800 failures per 10,000 tasks |
| Burn Rate | How fast you're consuming error budget | 2x burn rate = budget exhausted in 15 days |
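
The error-budget and burn-rate rows are simple arithmetic. A minimal worked sketch, using the same numbers as the table (92% SLO, 30-day window, a hypothetical 1,600 observed failures):

```python
# Error budget and burn rate for a 92% success-rate SLO over 30 days.
slo = 0.92
window_days = 30
total_tasks = 10_000

# round() avoids float truncation: int(10_000 * (1 - 0.92)) evaluates to 799.
budget = round(total_tasks * (1 - slo))       # allowed failures: 800
actual_failures = 1_600                       # hypothetical observed count
burn_rate = actual_failures / budget          # 2.0 -> burning twice as fast
days_to_exhaustion = window_days / burn_rate  # 15 days, as in the table

print(budget, burn_rate, days_to_exhaustion)  # -> 800 2.0 15.0
```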

SLI 1: Solve Success Rate

Success Rate = Successful Solves / Total Solve Attempts

| CAPTCHA Type | Typical Rate | SLO Target |
| --- | --- | --- |
| reCAPTCHA v2 | 95–99% | ≥ 92% |
| reCAPTCHA v3 | 90–97% | ≥ 88% |
| Cloudflare Turnstile | 95–99% | ≥ 92% |
| hCaptcha | 90–97% | ≥ 88% |
| Image/OCR | 85–95% | ≥ 82% |

SLI 2: Solve Latency

Latency = Time from task submission to solution received

| Percentile | Target | Alert Threshold |
| --- | --- | --- |
| p50 | < 25s | (none) |
| p95 | < 90s | > 120s |
| p99 | < 180s | > 300s |

SLI 3: Pipeline Availability

Availability = Time pipeline is accepting and solving tasks / Total time

Target: ≥ 99.5% (allows 3.6 hours downtime per month)
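
Availability is easiest to measure from periodic health probes. A minimal sketch, assuming you poll the pipeline on a fixed interval; `check_pipeline_health()` in the usage comment is a hypothetical probe you would implement:

```python
class AvailabilityTracker:
    """Track availability as up-time / total-time from periodic health probes."""

    def __init__(self):
        self.up_seconds = 0.0
        self.total_seconds = 0.0

    def record_probe(self, healthy, interval_seconds):
        # Each probe accounts for one polling interval of wall-clock time.
        self.total_seconds += interval_seconds
        if healthy:
            self.up_seconds += interval_seconds

    @property
    def availability(self):
        if self.total_seconds == 0:
            return 1.0
        return self.up_seconds / self.total_seconds


avail = AvailabilityTracker()
# In a real loop you would probe on a schedule, e.g. every 60 seconds:
# avail.record_probe(check_pipeline_health(), 60)
```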

Python — SLI/SLO Tracker

import os
import time
from collections import deque
from dataclasses import dataclass, field

API_KEY = os.environ.get("CAPTCHAAI_API_KEY", "")  # used by your solve calls, not by the tracker itself


@dataclass
class SLITracker:
    """Track CAPTCHA solving SLIs over a sliding window."""

    window_seconds: int = 86400 * 30  # 30 days default
    events: deque = field(default_factory=deque)

    def record_success(self, latency_seconds):
        self.events.append({
            "time": time.time(),
            "success": True,
            "latency": latency_seconds
        })
        self._prune()

    def record_failure(self, error_code):
        self.events.append({
            "time": time.time(),
            "success": False,
            "error": error_code
        })
        self._prune()

    def _prune(self):
        cutoff = time.time() - self.window_seconds
        while self.events and self.events[0]["time"] < cutoff:
            self.events.popleft()

    @property
    def success_rate(self):
        if not self.events:
            return 1.0
        successes = sum(1 for e in self.events if e["success"])
        return successes / len(self.events)

    @property
    def latency_percentiles(self):
        latencies = sorted(
            e["latency"] for e in self.events if e.get("latency") is not None
        )
        if not latencies:
            return {"p50": 0, "p95": 0, "p99": 0}

        def percentile(data, p):
            idx = int(len(data) * p / 100)
            return data[min(idx, len(data) - 1)]

        return {
            "p50": round(percentile(latencies, 50), 2),
            "p95": round(percentile(latencies, 95), 2),
            "p99": round(percentile(latencies, 99), 2),
        }

    @property
    def error_breakdown(self):
        errors = {}
        for e in self.events:
            if not e["success"]:
                code = e.get("error", "unknown")
                errors[code] = errors.get(code, 0) + 1
        return errors


class SLOChecker:
    """Check SLIs against SLO targets."""

    def __init__(self, tracker):
        self.tracker = tracker
        self.slos = {
            "success_rate": 0.92,    # ≥ 92%
            "latency_p95": 90.0,     # < 90 seconds
            "latency_p99": 180.0,    # < 180 seconds
        }

    @property
    def error_budget_total(self):
        """Total allowed failures in the window."""
        total = len(self.tracker.events)
        # round() avoids float truncation: int(10_000 * (1 - 0.92)) == 799.
        return round(total * (1 - self.slos["success_rate"]))

    @property
    def error_budget_remaining(self):
        """How many more failures before SLO breach."""
        failures = sum(1 for e in self.tracker.events if not e["success"])
        return max(0, self.error_budget_total - failures)

    @property
    def error_budget_pct(self):
        """Percentage of error budget remaining."""
        total = self.error_budget_total
        if total == 0:
            return 100.0
        return round(self.error_budget_remaining / total * 100, 1)

    @property
    def burn_rate(self):
        """How fast error budget is being consumed.
        1.0 = on track, 2.0 = will exhaust in half the window.
        """
        total = len(self.tracker.events)
        if total == 0:
            return 0.0
        failures = sum(1 for e in self.tracker.events if not e["success"])
        expected_failures = total * (1 - self.slos["success_rate"])
        if expected_failures == 0:
            return 0.0
        return round(failures / expected_failures, 2)

    def check_all(self):
        """Check all SLOs and return status."""
        rate = self.tracker.success_rate
        latencies = self.tracker.latency_percentiles

        return {
            "success_rate": {
                "current": round(rate, 4),
                "target": self.slos["success_rate"],
                "met": rate >= self.slos["success_rate"]
            },
            "latency_p95": {
                "current": latencies["p95"],
                "target": self.slos["latency_p95"],
                "met": latencies["p95"] <= self.slos["latency_p95"]
            },
            "latency_p99": {
                "current": latencies["p99"],
                "target": self.slos["latency_p99"],
                "met": latencies["p99"] <= self.slos["latency_p99"]
            },
            "error_budget": {
                "remaining_pct": self.error_budget_pct,
                "remaining_count": self.error_budget_remaining,
                "burn_rate": self.burn_rate,
            },
            "overall": rate >= self.slos["success_rate"]
                       and latencies["p95"] <= self.slos["latency_p95"]
                       and latencies["p99"] <= self.slos["latency_p99"]
        }


# Usage
tracker = SLITracker(window_seconds=86400 * 30)
slo = SLOChecker(tracker)

# After each solve:
# tracker.record_success(latency_seconds=24.5)
# tracker.record_failure("ERROR_CAPTCHA_UNSOLVABLE")

# Check SLOs:
# print(slo.check_all())

JavaScript — SLO Dashboard

class SLODashboard {
  constructor(windowMs = 30 * 24 * 60 * 60 * 1000) {
    this.windowMs = windowMs;
    this.events = [];
    this.slos = {
      successRate: 0.92,
      latencyP95: 90,
      latencyP99: 180,
    };
  }

  recordSuccess(latencySeconds) {
    this.events.push({ time: Date.now(), success: true, latency: latencySeconds });
    this._prune();
  }

  recordFailure(errorCode) {
    this.events.push({ time: Date.now(), success: false, error: errorCode });
    this._prune();
  }

  _prune() {
    const cutoff = Date.now() - this.windowMs;
    this.events = this.events.filter((e) => e.time > cutoff);
  }

  get successRate() {
    if (this.events.length === 0) return 1;
    const successes = this.events.filter((e) => e.success).length;
    return successes / this.events.length;
  }

  get errorBudget() {
    const total = this.events.length;
    const allowedFailures = Math.round(total * (1 - this.slos.successRate));
    const actualFailures = this.events.filter((e) => !e.success).length;
    const remaining = Math.max(0, allowedFailures - actualFailures);

    return {
      total: allowedFailures,
      consumed: actualFailures,
      remaining,
      remainingPct: allowedFailures > 0
        ? ((remaining / allowedFailures) * 100).toFixed(1)
        : "100.0",
      burnRate: allowedFailures > 0
        ? (actualFailures / allowedFailures).toFixed(2)
        : "0.00",
    };
  }

  get report() {
    const latencies = this.events
      .filter((e) => e.success && e.latency != null)
      .map((e) => e.latency)
      .sort((a, b) => a - b);

    const pct = (p) => latencies.length > 0
      ? latencies[Math.min(Math.floor(latencies.length * p), latencies.length - 1)]
      : 0;

    return {
      sliSuccessRate: (this.successRate * 100).toFixed(2) + "%",
      sloSuccessRate: (this.slos.successRate * 100).toFixed(0) + "%",
      sloMet: this.successRate >= this.slos.successRate,
      latencyP95: pct(0.95).toFixed(1) + "s",
      latencyP99: pct(0.99).toFixed(1) + "s",
      errorBudget: this.errorBudget,
      totalEvents: this.events.length,
    };
  }
}

const dashboard = new SLODashboard();
// dashboard.recordSuccess(24.5);
// console.log(dashboard.report);

Burn Rate Alert Thresholds

| Burn Rate | Meaning | Alert |
| --- | --- | --- |
| 1.0 | On track — budget lasts the full window | None |
| 2.0 | Budget exhausted in half the window | Warning |
| 6.0 | Budget exhausted in 5 days | Page on-call |
| 14.0 | Budget exhausted in ~2 days | Critical — immediate action |
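
These tiers can be encoded as a small lookup for your alerting layer. A sketch using the thresholds from the table; the severity names are illustrative, not from any specific paging system:

```python
def burn_rate_alert(burn_rate):
    """Map a burn rate to an alert severity, per the threshold table."""
    if burn_rate >= 14.0:
        return "critical"   # budget gone in ~2 days
    if burn_rate >= 6.0:
        return "page"       # budget gone in 5 days
    if burn_rate >= 2.0:
        return "warning"    # budget gone in half the window
    return "none"           # on track

print(burn_rate_alert(1.0))   # -> none
print(burn_rate_alert(6.5))   # -> page
```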

Troubleshooting

| Issue | Cause | Fix |
| --- | --- | --- |
| SLO always breached | Target too aggressive | Start with current performance − 3 percentage points as the SLO |
| Error budget always full | SLO too loose | Tighten the SLO to drive improvements |
| Burn rate spikes | Burst of failures | Check whether it's transient (retry storm) or systemic |
| Budget consumed by one error type | Single root cause | Fix that error type; see the error breakdown |

FAQ

What SLO should I start with?

Measure your current success rate over 7 days. Subtract 3 percentage points — that's your starting SLO. Tighten it as you improve reliability.
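
The rule of thumb is one line of arithmetic. A sketch with a hypothetical measured rate:

```python
# Starting SLO = measured 7-day success rate minus 3 percentage points.
measured_rate = 0.957                         # hypothetical: 95.7% over 7 days
starting_slo = round(measured_rate - 0.03, 3)
print(starting_slo)                           # -> 0.927
```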

Who owns the CAPTCHA SLO?

The team that operates the CAPTCHA solving pipeline. If scraping and CAPTCHA solving are separate teams, the CAPTCHA team owns solve rate SLOs while the scraping team owns end-to-end SLOs.

Should I set different SLOs per CAPTCHA type?

Yes. Image/OCR CAPTCHAs have fundamentally different success rates than reCAPTCHA v2. Setting per-type SLOs prevents one type from masking another's issues.
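
Per-type tracking amounts to keeping one counter per CAPTCHA type and checking each against its own target. A minimal self-contained sketch; the targets mirror the SLI 1 table, and the type keys are illustrative names:

```python
from collections import defaultdict

# Per-type SLO targets from the solve-rate table earlier in the article.
SLO_BY_TYPE = {
    "recaptcha_v2": 0.92,
    "recaptcha_v3": 0.88,
    "turnstile": 0.92,
    "hcaptcha": 0.88,
    "image_ocr": 0.82,
}

# counts[type] = [successes, attempts]
counts = defaultdict(lambda: [0, 0])

def record(captcha_type, success):
    counts[captcha_type][1] += 1
    if success:
        counts[captcha_type][0] += 1

def slo_status():
    """Return {type: (rate, met)} for every type seen so far."""
    status = {}
    for ctype, (ok, total) in counts.items():
        rate = ok / total
        status[ctype] = (round(rate, 4), rate >= SLO_BY_TYPE.get(ctype, 0.92))
    return status

record("image_ocr", True)
record("image_ocr", False)
record("recaptcha_v2", True)
print(slo_status())
```

This keeps a low-accuracy type like Image/OCR from dragging down (or hiding behind) a high-accuracy type like reCAPTCHA v2 in a single blended number.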

Next Steps

Set measurable reliability targets — get your CaptchaAI API key and define SLOs for your pipeline.
