Batch vs Realtime Calculator

Split traffic between immediate API calls and discounted batch processing to quantify the latency-for-savings trade.

Traffic split

Model

[…]5) for Haiku ([…]/$5) can cut a token bill ~5×.

Requests / day: 20,000

Input tokens / request: 2,500

Output tokens / request: 350

Batchable traffic: 65%

Batch discount: 50%

Realtime retry overhead: 4%

Batch savings: $700 per month (33%) when 65% of traffic can wait

All realtime: $2,153 per month, with retry overhead

Mixed mode: $1,453 per month, with 390K batch requests

Annual savings: $8,396 at a 50% discount

Line item	Monthly	Share
Realtime share	$753	52%
Batch share after discount	$700	48%
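
The arithmetic behind these figures can be sketched in a few lines of Python. This is a reconstruction under assumptions, not the calculator's actual source: the page does not state per-token prices, so `input_price` and `output_price` below are hypothetical placeholders chosen only because they reproduce the displayed totals. The 30-day month is inferred from the 390K batch requests (20,000 × 65% × 30), and applying the retry overhead to the whole bill before splitting it is inferred from the table, where the batch share equals the savings.

```python
from dataclasses import dataclass

# Assumption: a 30-day month, inferred from 20,000/day * 65% * 30 = 390K.
DAYS_PER_MONTH = 30


@dataclass(frozen=True)
class Scenario:
    requests_per_day: int
    input_tokens: int      # input tokens per request
    output_tokens: int     # output tokens per request
    input_price: float     # USD per million input tokens (hypothetical)
    output_price: float    # USD per million output tokens (hypothetical)
    batchable: float       # fraction of traffic that can wait
    batch_discount: float  # fraction taken off batch pricing
    retry_overhead: float  # extra cost fraction for retries


def monthly_costs(s: Scenario) -> dict[str, float]:
    """Return monthly cost figures in USD for an all-realtime and a mixed setup."""
    per_request = (
        s.input_tokens * s.input_price + s.output_tokens * s.output_price
    ) / 1_000_000
    monthly_requests = s.requests_per_day * DAYS_PER_MONTH
    # Assumption: retry overhead is applied to the full bill, which is what
    # the table's numbers imply.
    all_realtime = monthly_requests * per_request * (1 + s.retry_overhead)
    realtime_share = all_realtime * (1 - s.batchable)
    batch_share = all_realtime * s.batchable * (1 - s.batch_discount)
    mixed = realtime_share + batch_share
    return {
        "all_realtime": all_realtime,
        "realtime_share": realtime_share,
        "batch_share": batch_share,
        "mixed": mixed,
        "monthly_savings": all_realtime - mixed,
        "annual_savings": (all_realtime - mixed) * 12,
        "batch_requests": monthly_requests * s.batchable,
    }


# Inputs from the page; the two prices are placeholders, not published rates.
scenario = Scenario(
    requests_per_day=20_000,
    input_tokens=2_500,
    output_tokens=350,
    input_price=0.75,
    output_price=4.50,
    batchable=0.65,
    batch_discount=0.50,
    retry_overhead=0.04,
)
costs = monthly_costs(scenario)

for name, value in costs.items():
    print(f"{name:>16}: {value:,.0f}")
```

Because the discount applies only to the batchable fraction, the savings rate is simply `batchable * batch_discount` (here 0.65 × 0.50 = 32.5%), independent of the prices chosen.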

Use this for latency-tolerant work like extraction, moderation, summarization, and offline evals.