Solve Image CAPTCHA with Python OCR and CaptchaAI

Image CAPTCHAs (also called normal or text CAPTCHAs) show distorted text that users must type. They are still common on government portals, legacy systems, and registration forms. CaptchaAI accepts the image and returns the recognized text using its OCR engine.

This guide covers both file upload and base64 submission, accuracy tuning parameters, and math CAPTCHA handling.

Prerequisites

Item	Value
CaptchaAI API key	From captchaai.com
Python	3.7+
Libraries	`requests`
Image format	JPG, PNG, or GIF (100 bytes – 100 KB)
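
The format and size limits in the table can be checked locally before a request is spent. The following is a minimal pre-flight sketch: the helper name `check_captcha_image` and the magic-byte signatures are illustrative additions rather than part of the CaptchaAI API, and it assumes "100 KB" means 100 × 1024 bytes.

```python
# Hypothetical pre-flight check; not part of the CaptchaAI API.
MIN_BYTES = 100
MAX_BYTES = 100 * 1024  # assumption: "100 KB" is 100 * 1024 bytes

# Leading "magic" bytes that identify each supported format.
SIGNATURES = {
    "jpg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def check_captcha_image(data: bytes) -> str:
    """Return the detected format, or raise ValueError if the image looks unusable."""
    if len(data) < MIN_BYTES:
        raise ValueError(f"image too small: {len(data)} bytes")
    if len(data) > MAX_BYTES:
        raise ValueError(f"image too large: {len(data)} bytes")
    for fmt, prefixes in SIGNATURES.items():
        # bytes.startswith accepts a tuple of candidate prefixes
        if data.startswith(prefixes):
            return fmt
    raise ValueError("unsupported format: expected JPG, PNG, or GIF")
```

Calling it on the bytes read from `captcha.png` before submitting catches the size and format problems listed under Common errors without a round trip.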

Method A: File upload

import requests
import time

API_KEY = "YOUR_API_KEY"

# Submit the image file
with open("captcha.png", "rb") as f:
    response = requests.post("https://ocr.captchaai.com/in.php",
        data={"key": API_KEY, "method": "post", "json": 1},
        files={"file": ("captcha.png", f, "image/png")},
    )

result = response.json()
if result["status"] != 1:
    raise Exception(f"Submit failed: {result['request']}")

task_id = result["request"]
print(f"Task submitted: {task_id}")

Method B: Base64 submission

import base64

# Read and encode the image
with open("captcha.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

response = requests.post("https://ocr.captchaai.com/in.php", data={
    "key": API_KEY,
    "method": "base64",
    "body": img_b64,
    "json": 1,
})

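# Check the status before reading the task ID, as in Method A
result = response.json()
if result["status"] != 1:
    raise Exception(f"Submit failed: {result['request']}")

task_id = result["request"]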
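
Both submission methods return the same JSON shape, so the status check can live in one place. A minimal sketch, assuming the response dictionary carries `status` and `request` as in the snippets above; `extract_task_id` and `CaptchaSubmitError` are hypothetical names introduced here.

```python
class CaptchaSubmitError(Exception):
    """Raised when the submit response reports a failure (hypothetical name)."""


def extract_task_id(result: dict) -> str:
    """Return the task ID from a parsed in.php response, or raise on error."""
    if result.get("status") != 1:
        # On failure, "request" holds the error code instead of a task ID
        raise CaptchaSubmitError(f"Submit failed: {result.get('request')}")
    return str(result["request"])
```

With this in place, either method can end with `task_id = extract_task_id(response.json())`.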

Poll for the text result

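# Give the solver a moment before the first poll
time.sleep(5)

for _ in range(30):
    result = requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY,
        "action": "get",
        "id": task_id,
        "json": 1,
    }).json()

    if result["status"] == 1:
        text = result["request"]
        print(f"CAPTCHA text: {text}")
        break

    # Anything other than "not ready" is a real error
    if result["request"] != "CAPCHA_NOT_READY":
        raise Exception(f"Error: {result['request']}")

    time.sleep(5)
else:
    # All 30 attempts were used without a result
    raise TimeoutError("CAPTCHA was not solved in time")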
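
The polling loop can also be wrapped in a function that takes the HTTP call as a parameter, which makes it reusable and testable without network access. This is a sketch under that assumption; `poll_for_text` and its `fetch` and `sleep` parameters are hypothetical names, not part of any CaptchaAI client.

```python
import time
from typing import Callable


def poll_for_text(
    fetch: Callable[[], dict],
    attempts: int = 30,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it reports a result, an error, or attempts run out."""
    for _ in range(attempts):
        result = fetch()
        if result.get("status") == 1:
            return result["request"]
        # Any response other than "not ready" is treated as a failure
        if result.get("request") != "CAPCHA_NOT_READY":
            raise RuntimeError(f"Error: {result.get('request')}")
        sleep(delay)
    raise TimeoutError(f"no result after {attempts} attempts")
```

In real use, `fetch` would be a small function or lambda that performs the `requests.get(...)` call to `res.php` shown above and returns its `.json()`.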

Accuracy tuning parameters

Add these to the submit request to improve recognition:

Parameter	Value	Purpose
`numeric`	`1` = digits only, `2` = letters only	Limits character set
`min_len`	Integer	Minimum text length
`max_len`	Integer	Maximum text length
`language`	`0` = any, `1` = Cyrillic, `2` = Latin	Character language
`calc`	`1`	CAPTCHA is a math expression
`phrase`	`1`	CAPTCHA contains spaces
`regsense`	`1`	Case-sensitive answer

# Example: digits only, 4-6 characters
response = requests.post("https://ocr.captchaai.com/in.php", data={
    "key": API_KEY,
    "method": "base64",
    "body": img_b64,
    "numeric": 1,
    "min_len": 4,
    "max_len": 6,
    "json": 1,
})
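
When several hints are combined, it can help to build the payload in one place and catch obvious mistakes before submitting. A minimal sketch: `build_submit_payload` is a hypothetical helper, and the only checks it performs are the ones implied by the table above (known parameter names, `min_len` not greater than `max_len`).

```python
ALLOWED_HINTS = {"numeric", "min_len", "max_len", "language", "calc", "phrase", "regsense"}


def build_submit_payload(api_key: str, img_b64: str, **hints) -> dict:
    """Return form data for a base64 submit, including any recognition hints."""
    unknown = set(hints) - ALLOWED_HINTS
    if unknown:
        raise ValueError(f"unknown hint(s): {sorted(unknown)}")

    min_len, max_len = hints.get("min_len"), hints.get("max_len")
    if min_len is not None and max_len is not None and min_len > max_len:
        raise ValueError("min_len must not exceed max_len")

    payload = {"key": api_key, "method": "base64", "body": img_b64, "json": 1}
    # Skip hints explicitly set to None so they are not sent at all
    payload.update({name: value for name, value in hints.items() if value is not None})
    return payload
```

The digits-only example above then becomes `build_submit_payload(API_KEY, img_b64, numeric=1, min_len=4, max_len=6)`, passed as `data=` to `requests.post`.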

Handling math CAPTCHAs

For images showing math expressions like "3 + 7", set `calc` to `1` so the solver returns the computed result instead of the raw text:

response = requests.post("https://ocr.captchaai.com/in.php", data={
    "key": API_KEY,
    "method": "base64",
    "body": img_b64,
    "calc": 1,  # Tells solver to compute the answer
    "json": 1,
})
# Returns "10" instead of "3 + 7"
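
Before typing a math answer into a form, it may be worth confirming that the returned text is actually a number rather than the unevaluated expression. The sketch below assumes answers are whole numbers returned as strings, as in the "10" example; `parse_math_answer` is a hypothetical helper, and decimal answers would need a looser pattern.

```python
import re

# Assumption: math CAPTCHA answers are integers, optionally negative.
_INTEGER = re.compile(r"-?\d+")


def parse_math_answer(text: str) -> int:
    """Return the answer as an int, or raise ValueError if it is not a plain integer."""
    cleaned = text.strip()
    if not _INTEGER.fullmatch(cleaned):
        raise ValueError(f"not a numeric answer: {text!r}")
    return int(cleaned)
```

A `ValueError` here suggests the solver returned the expression text, which is a reasonable trigger for resubmitting or reporting the task.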

Reporting incorrect solutions

If the solution is wrong, report it to improve accuracy and potentially get a refund:

requests.get("https://ocr.captchaai.com/res.php", params={
    "key": API_KEY,
    "action": "reportbad",
    "id": task_id,
})
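
One way to decide when a report is warranted, without waiting for the target site to reject the answer, is to compare the returned text with the hints that were sent. This is a sketch of that idea only; `violates_hints` is a hypothetical helper and it covers just `numeric`, `min_len`, and `max_len` with the meanings given in the parameter table.

```python
from typing import Optional


def violates_hints(
    text: str,
    numeric: Optional[int] = None,
    min_len: Optional[int] = None,
    max_len: Optional[int] = None,
) -> bool:
    """Return True if the solved text contradicts the hints sent with the task."""
    if numeric == 1 and not text.isdigit():  # 1 = digits only
        return True
    if numeric == 2 and not text.isalpha():  # 2 = letters only
        return True
    if min_len is not None and len(text) < min_len:
        return True
    if max_len is not None and len(text) > max_len:
        return True
    return False
```

If `violates_hints(text, numeric=1, min_len=4, max_len=6)` returns `True`, the answer cannot be correct for that CAPTCHA, so sending the `reportbad` request above is justified.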

Complete working example

import requests
import time
import base64
from selenium import webdriver
from selenium.webdriver.common.by import By

API_KEY = "YOUR_API_KEY"

# 1. Load page and capture CAPTCHA image
driver = webdriver.Chrome()
driver.get("https://example.com/register")
captcha_el = driver.find_element(By.CSS_SELECTOR, "#captcha-image")
captcha_el.screenshot("captcha.png")

# 2. Encode and submit
with open("captcha.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

submit = requests.post("https://ocr.captchaai.com/in.php", data={
    "key": API_KEY, "method": "base64", "body": img_b64, "json": 1
}).json()
if submit["status"] != 1:
    raise Exception(f"Submit failed: {submit['request']}")
task_id = submit["request"]

# 3. Poll for result
time.sleep(5)
for _ in range(30):
    poll = requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY, "action": "get", "id": task_id, "json": 1
    }).json()
    if poll["status"] == 1:
        text = poll["request"]
        break
    if poll["request"] != "CAPCHA_NOT_READY":
        raise Exception(poll["request"])
    time.sleep(5)
else:
    # No result after 30 attempts, so `text` was never set
    raise TimeoutError("CAPTCHA was not solved in time")

# 4. Type and submit
driver.find_element(By.CSS_SELECTOR, "#captcha-input").send_keys(text)
driver.find_element(By.CSS_SELECTOR, "form").submit()
print(f"Solved: {text}")
driver.quit()

Expected output:

Solved: ABC123
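
The example above writes the screenshot to disk and reads it back. Selenium elements also expose the screenshot as bytes through `screenshot_as_png`, so the file can be skipped. A minimal sketch: `element_to_base64` is a hypothetical helper name, and it only assumes the object passed in has a `screenshot_as_png` attribute.

```python
import base64


def element_to_base64(element) -> str:
    """Return a base64 string of the element's screenshot without touching disk.

    `element` is expected to provide `screenshot_as_png` (PNG bytes),
    as Selenium's WebElement does.
    """
    return base64.b64encode(element.screenshot_as_png).decode()
```

In the complete example, `img_b64 = element_to_base64(captcha_el)` would replace steps that save and reopen `captcha.png`.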

Common errors

Error	Cause	Fix
`ERROR_WRONG_FILE_EXTENSION`	Unsupported format	Use JPG, PNG, or GIF
`ERROR_TOO_BIG_CAPTCHA_FILESIZE`	Image > 100 KB	Compress the image
`ERROR_ZERO_CAPTCHA_FILESIZE`	Image < 100 bytes	Check the image is valid
`CAPCHA_NOT_READY`	Still processing	Poll every 5 seconds
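
The table can be carried into code as a lookup so that failures log a suggested fix alongside the raw code. A small sketch using only the codes listed above; `describe_error` is a hypothetical helper, and codes outside this table are passed through without a suggestion.

```python
# Suggested fixes, copied from the Common errors table.
FIXES = {
    "ERROR_WRONG_FILE_EXTENSION": "Use JPG, PNG, or GIF",
    "ERROR_TOO_BIG_CAPTCHA_FILESIZE": "Compress the image",
    "ERROR_ZERO_CAPTCHA_FILESIZE": "Check the image is valid",
    "CAPCHA_NOT_READY": "Poll every 5 seconds",
}


def describe_error(code: str) -> str:
    """Return the error code with its suggested fix, if this guide lists one."""
    fix = FIXES.get(code)
    if fix is None:
        return f"{code}: not covered by this guide"
    return f"{code}: {fix}"
```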

FAQ

How accurate is OCR solving?

Accuracy depends on the distortion level. Hint parameters (`numeric`, `min_len`, `max_len`) narrow the character set and length the solver has to consider, which improves results.

How fast is image CAPTCHA solving?

Typically 5–15 seconds, which is faster than token-based CAPTCHAs.

Can I solve colored or noisy CAPTCHAs?

Yes. CaptchaAI's OCR handles noise, color distortion, and overlapping characters.

