Token Speed Emulator
Explore what local model speed actually feels like. Select hardware, choose an open model, then compare the same tok/s in a chatbot, an agentic editor, and a spreadsheet workflow.
Approximation. All numbers here are estimates, not benchmarks. They are meant to help you visualize how token speed feels in different contexts, not to represent exact hardware or model performance.
How this works
- Pick your GPU or Apple chip.
- Choose a model from the main list or browse variants.
- Watch the same speed play out in chat, coding, and spreadsheet workflows.
Hardware and quantization
Pick your chip, choose a quant, then select a model to preload the speed slider.
Platform
Hardware
Quantization
Visible models: 48
Runnable: 38
Current quant: Q4_K_M
Available models
Main models below; expand “Other variants” for the full catalog.
Main models
Estimates are derived from memory bandwidth relative to model size (see the canirun.ai methodology). Actual speed varies with drivers, thermals, and system load.
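The bandwidth-versus-size idea can be sketched in a few lines of Python. This is a rough rule of thumb, not the canirun.ai formula: it assumes that generating one token reads every model weight from memory once, so the speed ceiling is bandwidth divided by model size. The function names, the efficiency factor, and the example numbers (8 billion parameters, 4.5 bits per weight, 400 GB/s) are illustrative assumptions, not values taken from this page or from any benchmark.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: parameters x bits per weight / 8.

    Ignores the KV cache and runtime overhead (a simplifying assumption).
    """
    return params_billion * bits_per_weight / 8.0


def estimate_tok_per_s(bandwidth_gb_s: float, size_gb: float,
                       efficiency: float = 1.0) -> float:
    """Upper-bound decode speed: memory bandwidth divided by model size.

    `efficiency` is a hypothetical fudge factor (0 to 1) for the share of
    peak bandwidth that is actually reached in practice.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return efficiency * bandwidth_gb_s / size_gb


if __name__ == "__main__":
    # Illustrative inputs only: an 8B-parameter model at an assumed
    # 4.5 bits per weight, on hardware with an assumed 400 GB/s bandwidth.
    size = model_size_gb(8, 4.5)
    print(f"size: {size:.1f} GB")
    print(f"estimate: {estimate_tok_per_s(400, size):.0f} tok/s")
```

Lowering `efficiency` below 1.0 is one simple way to reflect the drivers, thermals, and load mentioned above, though the right value for any given machine is unknown without measuring.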
Generation speed
60 tok/s
Same speed in all three scenarios.
Chat simulator
A user question followed by a longer assistant reply, streamed at your chosen speed.
Assistant
Tokens streamed: 0
Speed setting: 60 tok/s
Mode: Ready
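A chat simulator like the one described above can be approximated by pausing 1 / (tok/s) seconds between tokens. The Python below is a minimal sketch of that idea, not this page's implementation. The names `stream_tokens` and `seconds_to_stream` are hypothetical, and the demo splits text on whitespace, which only loosely approximates real model tokens.

```python
import time
from typing import Callable, Iterable, Iterator


def stream_tokens(tokens: Iterable[str], tok_per_s: float,
                  sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield tokens one at a time, pausing 1 / tok_per_s before each.

    `sleep` is injectable so the pacing can be checked without waiting.
    """
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    delay = 1.0 / tok_per_s
    for token in tokens:
        sleep(delay)
        yield token


def seconds_to_stream(n_tokens: int, tok_per_s: float) -> float:
    """Total time to stream n_tokens at a constant speed."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return n_tokens / tok_per_s


if __name__ == "__main__":
    # Whitespace-separated words stand in for tokens (an approximation).
    reply = "Local models stream text one token at a time".split()
    for word in stream_tokens(reply, tok_per_s=60):
        print(word, end=" ", flush=True)
    print()
```

With this model, a 300-token reply takes `seconds_to_stream(300, 60)` = 5 seconds at 60 tok/s. The same arithmetic applies to the coding and spreadsheet scenarios, where only the amount of generated text differs.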