captchav2_prototype / README.md
OpenCaptchaWorld's picture
debug dependencies
441dd2f

A newer version of the Gradio SDK is available: 6.2.0

Upgrade
metadata
title: CAPTCHAv2 Leaderboard
emoji: πŸ†
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit

CAPTCHAv2 Leaderboard

A comprehensive leaderboard for comparing model performance across different CAPTCHA puzzle types. This interactive dashboard allows you to:

  • πŸ“Š View performance rankings across different CAPTCHA categories
  • πŸ“ˆ Compare models using interactive visualizations
  • πŸ’° Analyze cost-effectiveness of different models
  • πŸ”„ Upload and update results easily

Features

  • Interactive Leaderboard Table: Sortable rankings with color-coded performance indicators
  • Performance Comparison Charts: Visual bar charts showing pass rates across models
  • Performance by Type: Detailed breakdown of performance across different CAPTCHA puzzle types
  • Cost-Effectiveness Analysis: Scatter plot comparing performance vs. cost
  • Easy Upload: Support for CSV and JSON result files

How to Use

  1. View the Leaderboard: Browse the current rankings and filter by category
  2. Sort Results: Sort by Pass Rate, Duration, or Cost
  3. Upload Results: Use the upload section to add new evaluation results
  4. Compare Models: Use the visualizations to compare different models

Uploading Results

The leaderboard supports multiple file formats:

  • CSV files: Aggregated results with columns for Model, Provider, Agent Framework, Type, metrics, and per-type pass rates
  • JSON files: Single object or array of aggregated results
  • benchmark_results.json: Per-puzzle results in JSONL format (auto-converted)

See the upload section in the app for detailed instructions and file format requirements.

Categories

The leaderboard tracks performance across various CAPTCHA types including:

  • Dice Count
  • Color Cipher
  • Color Counting
  • Dynamic Jigsaw
  • Mirror
  • Set Game
  • Shadow Plausible
  • Spooky variants (Circle, Jigsaw, Shape Grid, Size, Text)
  • Trajectory Recovery
  • Transform Pipeline
  • And more...

Model Types

Models are automatically categorized as:

  • Proprietary: Commercial models (OpenAI, Anthropic, Google, etc.)
  • Open source: Open source models (Llama, Mistral, Qwen, etc.)