Token Speed Emulator

Explore what local model speed actually feels like. Select hardware, choose an open model, then compare the same generation speed, in tokens per second (tok/s), across a chatbot, an agentic editor, and a spreadsheet workflow.

Approximation. All numbers here are estimates, not benchmarks. They are meant to help you visualize how token speed feels in different contexts, not to represent exact hardware or model performance.

How this works

1. Pick your GPU or Apple chip.
2. Choose a model from the main list or browse variants.
3. Watch the same speed play out in chat, coding, and spreadsheet workflows.

Hardware and quantization

Pick your chip, choose a quant, then select a model to preload the speed slider.

Selectors: Platform, Hardware, Quantization

Counters: Visible models, Runnable

Current quant: Q4_K_M

Available models

Main models below; expand “Other variants” for the full catalog.

Filters: All, Chat, Code, Reasoning, Vision

Main models

Model | Tags | Provider | Fit | Speed or memory required
Qwen 3 8B | Main | Alibaba | Runs great | 141 tok/s
Qwen 2.5 Coder 14B | Main | Alibaba | Runs great | 82 tok/s
Qwen 3 32B | Main | Alibaba | Decent | 39 tok/s
Qwen 2.5 Coder 32B | Main | Alibaba | Decent | 38 tok/s
Qwen 3.6 35B A3B | Main, MoE | Alibaba | Decent | 35 tok/s
Command R 35B | Main | Cohere | Decent | 36 tok/s
DeepSeek R1 32B | Main | DeepSeek | Decent | 38 tok/s
DeepSeek R1 70B | Main | DeepSeek | Too heavy | 39 GB required
Gemma 3 12B | Main | Google | Runs great | 99 tok/s
Gemma 3 27B | Main | Google | Runs well | 45 tok/s
Llama 3.1 8B | Main | Meta | Runs great | 144 tok/s
Llama 3.3 70B | Main | Meta | Too heavy | 39.3 GB required
Phi-4 14B | Main | Microsoft | Runs great | 86 tok/s
Codestral 22B | Main | Mistral | Runs well | 56 tok/s
Devstral 24B | Main | Mistral | Runs well | 52 tok/s
Mistral Small 3.1 24B | Main | Mistral | Runs well | 52 tok/s
GPT-OSS 20B | Main, MoE | OpenAI | Runs well | 57 tok/s

Estimates are based on the ratio of memory bandwidth to model size (see canirun.ai methodology). Actual speed varies with drivers, thermals, and system load.
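As a rough illustration of a bandwidth-versus-size estimate, here is a minimal Python sketch. It is not the canirun.ai formula. The `Hardware` class, the `BITS_PER_WEIGHT` values, the `efficiency` factor of 0.7, and the example GPU figures are all assumptions chosen for illustration. The idea is that generating each token requires reading roughly all of the model's weights once, so speed scales with bandwidth divided by the weight footprint.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed average bits per weight for a few quants. These are rough
# illustrative values, not official figures for any runtime.
BITS_PER_WEIGHT = {"Q4_K_M": 4.5, "Q8_0": 8.5, "F16": 16.0}


@dataclass
class Hardware:
    """Hypothetical hardware description used only for this sketch."""
    name: str
    memory_gb: float        # memory available for model weights
    bandwidth_gb_s: float   # peak memory bandwidth


def model_size_gb(params_billion: float, quant: str) -> float:
    """Approximate weight footprint: parameters x bits per weight / 8."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def estimate_tok_s(hw: Hardware, params_billion: float, quant: str,
                   efficiency: float = 0.7) -> Optional[float]:
    """Return an estimated tok/s, or None if the weights do not fit.

    `efficiency` is an assumed fraction of peak bandwidth that is
    actually achieved during generation.
    """
    size_gb = model_size_gb(params_billion, quant)
    if size_gb > hw.memory_gb:
        return None  # corresponds to a "Too heavy" entry
    return hw.bandwidth_gb_s * efficiency / size_gb


if __name__ == "__main__":
    # Hypothetical GPU: 24 GB of memory, 1000 GB/s of bandwidth.
    gpu = Hardware("hypothetical 24 GB GPU", memory_gb=24.0, bandwidth_gb_s=1000.0)
    for label, params in [("8B", 8.0), ("32B", 32.0), ("70B", 70.0)]:
        speed = estimate_tok_s(gpu, params, "Q4_K_M")
        if speed is None:
            print(f"{label}: too heavy ({model_size_gb(params, 'Q4_K_M'):.1f} GB required)")
        else:
            print(f"{label}: about {speed:.0f} tok/s")
```

Under these assumptions a 70B model at Q4_K_M needs about 39 GB for weights alone, which is on the same scale as the memory figures listed for the 70B models. The sketch ignores KV cache, context length, and mixture-of-experts routing, all of which shift real numbers.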

Generation speed

Slider, in tok/s, from “Slow and deliberate” to “Fast and fluid”.

Same speed in all three scenarios.
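To put a slider value in concrete terms, the wait for a full reply is simply the token count divided by the speed. The short Python sketch below works through that arithmetic. The 600-token reply length and the three sample speeds are assumed values for illustration, and `seconds_to_stream` is a hypothetical helper name.

```python
def seconds_to_stream(num_tokens: int, tok_per_s: float) -> float:
    """Time in seconds to stream num_tokens at a steady tok_per_s."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return num_tokens / tok_per_s


if __name__ == "__main__":
    reply_tokens = 600  # assumed length of a longer assistant reply
    for speed in (10, 60, 140):
        wait = seconds_to_stream(reply_tokens, speed)
        print(f"{speed:>4} tok/s: {wait:.1f} s for {reply_tokens} tokens")
```

This counts generation time only. It leaves out prompt processing, which adds a delay before the first token appears.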

Chat simulator

User question and a longer assistant reply streamed at your chosen speed.

I'm going to Lisbon for four days in April with a friend. We like food markets, neighborhood walks, and one museum per day—but we really dislike huge crowds and packed 'must-see' lists. Could you suggest a relaxed day-by-day plan, how to get around without stress, and a few common tourist mistakes to avoid?

Assistant


Tokens streamed

Speed setting: 60 tok/s

Mode: Ready
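The pacing behind a simulator like this can be sketched in a few lines of Python. This is an assumption about how such a tool might work, not the page's actual implementation. `stream_at_speed` is a hypothetical function, and splitting the reply on whitespace is a crude stand-in for real tokenization, since real tokens are often shorter than words.

```python
import time
from typing import Callable, Iterable, Iterator


def stream_at_speed(tokens: Iterable[str], tok_per_s: float,
                    sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield tokens one at a time, pausing 1 / tok_per_s seconds before each.

    `sleep` is injectable so the pacing can be checked without waiting.
    """
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    delay = 1.0 / tok_per_s
    for token in tokens:
        sleep(delay)
        yield token


if __name__ == "__main__":
    # Whitespace-separated words stand in for tokens (an approximation).
    reply = "Day one: start at a small market before the crowds arrive."
    for word in stream_at_speed(reply.split(), tok_per_s=60):
        print(word, end=" ", flush=True)
    print()
```

A fixed delay per token gives an even cadence. Real runtimes tend to emit tokens in uneven bursts, so this is a smoothed picture of the experience.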