Tools
Value ranking
Benchmark-Adjusted Value Calculator
Rank capable models by cost per benchmark point for chat, coding, RAG, batch, and vision workloads.
Quality-adjusted value
Large repo context, tool-call results, long generations and reasoning. Cursor / Claude Code style — modern agent loops routinely send 30k+ input and emit 5k+ output per call.
8,000
Best value model
Step 3.5 Flash
$38.40 / month at 31.6 benchmark points
Value winner
.22
cost / point
Monthly volume
8.0K
32K in / call
Quality leaders
GPT-5.5
highest benchmark score
| Model | Score | Monthly | Cost / point |
|---|---|---|---|
Step 3.5 Flash StepFun (China) | 31.6 | .22 | |
MiMo-V2-Flash Xiaomi | 33.5 | .19 | |
DeepSeek V4 Flash DeepSeek | 39.8 | .24 | |
DeepSeek V3.2 Exp Alibaba (China) | 36.7 |