Token Counter

Estimate token counts for GPT-4, Claude, and other LLMs. Plan your prompts and calculate costs. All processing happens in your browser.

Input Text

Characters: 0
Words: 0
Lines: 0
Sentences: 0

Token Estimates

Char-based (~4 chars/token): 0
Word-based (~0.75 words/token): 0
Average Estimate: ~0

Token counts are estimates. Actual counts depend on the model's tokenizer (for example, BPE via tiktoken for GPT models; SentencePiece or other subword schemes for many others). Estimates are most accurate for plain English text.

Cost Calculator
GPT-4o (OpenAI): $0.00 ($2.50 per 1M input tokens)
Claude Sonnet 4 (Anthropic): $0.00 ($3.00 per 1M input tokens)
Claude Opus 4 (Anthropic): $0.00 ($15.00 per 1M input tokens)
Gemini 2.5 Pro (Google): $0.00 ($1.25 per 1M input tokens)
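
The arithmetic behind the cost figures can be sketched as follows. This is a minimal illustration, not the calculator's actual code: the function name `input_cost_usd` and the dictionary are hypothetical, and the prices are simply the approximate per-1M input rates listed above.

```python
# Input prices in USD per 1M tokens, copied from the list above.
# They are approximate and will drift; treat them as placeholders.
INPUT_PRICE_PER_MILLION = {
    "GPT-4o": 2.50,
    "Claude Sonnet 4": 3.00,
    "Claude Opus 4": 15.00,
    "Gemini 2.5 Pro": 1.25,
}


def input_cost_usd(token_count: int, model: str) -> float:
    """Return the estimated input cost in USD for `token_count` tokens."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    return token_count / 1_000_000 * INPUT_PRICE_PER_MILLION[model]


if __name__ == "__main__":
    # Compare the cost of a 10,000-token prompt across the listed models.
    for model in INPUT_PRICE_PER_MILLION:
        print(f"{model}: ${input_cost_usd(10_000, model):.4f}")
```

Only input pricing is modeled here; output tokens are billed separately and usually at a higher rate.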

About Token Estimation

Tokens are the basic units that language models process. A token can be as short as one character or as long as a word. For English text, a rough rule of thumb is 1 token per ~4 characters, or ~0.75 words per token (about 1.3 tokens per word). Code, special characters, and non-English text often use more tokens. Prices shown are approximate and subject to change.
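
The two rules of thumb can be written out as a small sketch. It assumes the 0.75 figure means 0.75 words per token (about 1.33 tokens per word), which matches the 1,300 to 1,400 tokens per 1,000 words quoted elsewhere on this page. The function name, the rounding, and the simple mean are illustrative choices, not necessarily what the tool does internally.

```python
import math

CHARS_PER_TOKEN = 4.0   # rule of thumb for English prose
WORDS_PER_TOKEN = 0.75  # assumption: 0.75 words per token (~1.33 tokens/word)


def estimate_tokens(text: str) -> dict:
    """Return char-based, word-based, and averaged token estimates."""
    # Character heuristic: one token per ~4 characters, rounded up.
    char_based = math.ceil(len(text) / CHARS_PER_TOKEN)
    # Word heuristic: whitespace-separated words divided by words per token.
    word_based = math.ceil(len(text.split()) / WORDS_PER_TOKEN)
    return {
        "char_based": char_based,
        "word_based": word_based,
        "average": (char_based + word_based) / 2,
    }


if __name__ == "__main__":
    print(estimate_tokens("The quick brown fox jumps over the lazy dog"))
```

For the 43-character, 9-word pangram this gives 11 (char-based) and 12 (word-based), an average of 11.5.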

About This Tool

Tokens are the units LLMs use to process text. A token is roughly 0.75 English words on average, but varies — common words may be a single token while rare words split into multiple subword pieces. Pricing, context windows, and rate limits are denominated in tokens, not characters or words.

The counter approximates the token count that a tiktoken-family tokenizer would produce for a given text (cl100k_base for GPT-4-era models, newer encodings such as o200k_base for later ones). For plain English prose the estimate is usually close to the exact count, but it can drift further for code, structured data, and non-English text; truly precise counting requires running the actual model's tokenizer.

Most tokenizers use byte-pair encoding (BPE) or a related subword algorithm (WordPiece, unigram), usually through a library such as SentencePiece or tiktoken. BPE training scans a large text corpus and merges the most frequent adjacent pair of characters (or bytes) into a new token, then the most frequent pair of existing tokens, repeating until a target vocabulary size (typically 50K to 200K tokens) is reached. Common English words like 'the' or 'because' end up as single tokens; rare words split into pieces ('antidisestablishmentarianism' might tokenize as 'anti', 'dis', 'establishment', 'arian', 'ism'). The vocabulary is fixed once the tokenizer is trained, and different model families use different tokenizers: GPT-4o, Claude, Gemini, Llama, and Mistral all use distinct schemes. As a result, token counts for the same text can vary by 10 to 30 percent from one model to another. The counter here uses a tiktoken-class approximation; for precise counts, run the actual model's tokenizer.
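
The merge loop described above can be sketched as a toy BPE trainer. This is a teaching example under simplifying assumptions (word-level corpus, characters rather than bytes, no special tokens); `train_bpe`, `merge_pair`, and `encode` are hypothetical names, and production tokenizers are far more optimized.

```python
from collections import Counter


def merge_pair(symbols: tuple, pair: tuple) -> tuple:
    """Replace every adjacent occurrence of `pair` in `symbols` with one symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def train_bpe(corpus: list, num_merges: int) -> list:
    """Learn up to `num_merges` merge rules from a list of words."""
    # Each distinct word starts as a tuple of characters, weighted by frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pair_counts = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite the corpus with the winning pair fused into one symbol.
        rewritten = Counter()
        for symbols, freq in words.items():
            rewritten[merge_pair(symbols, best)] += freq
        words = rewritten
    return merges


def encode(word: str, merges: list) -> tuple:
    """Tokenize one word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


if __name__ == "__main__":
    rules = train_bpe(["the"] * 5 + ["then"] * 2 + ["he"], num_merges=2)
    print(rules)
    print(encode("the", rules), encode("then", rules))
```

On this tiny corpus the first merge is ('h', 'e') and the second is ('t', 'he'), so the frequent word 'the' becomes a single token while 'then' splits into 'the' and 'n'.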

A worked example. The sentence 'The quick brown fox jumps over the lazy dog' is 9 words and 43 characters. In the cl100k_base tokenizer (GPT-4) it is roughly 10 tokens; in Claude's tokenizer, about 11; in Llama 3's tokenizer, about 12. The differences come from how each tokenizer handles common words, the leading-space convention (most modern tokenizers attach the preceding space to each word after the first), and vocabulary size. A more illustrative example: the JSON string '{"key":"value","count":42}' is roughly 12 tokens despite being only 26 characters. Code, structured data, and non-English text typically tokenize less efficiently than natural English prose. A 1,000-word English document is around 1,300 to 1,400 tokens; the same content translated to Chinese is often 2,500 to 4,000 tokens because CJK characters tokenize less efficiently in models trained primarily on English.
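
The leading-space convention and the cost of punctuation can be made visible with a simplified pre-tokenization split. The pattern below is an ASCII-only stand-in loosely modeled on the kind of regex GPT-style BPE tokenizers apply before merging; it is not any model's actual pattern. In tokenizers of that style, merges do not cross these piece boundaries, so the piece count is only a lower bound on the token count.

```python
import re

# Simplified, ASCII-only stand-in for a GPT-style pre-tokenization pattern:
# optional leading space + letters, or digits, or a run of punctuation,
# or a run of whitespace.
PRETOKEN_PATTERN = re.compile(r" ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+")


def pretokenize(text: str) -> list:
    """Split text into the chunks that subword merging would operate within."""
    return PRETOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog"
    json_text = '{"key":"value","count":42}'
    for sample in (sentence, json_text):
        pieces = pretokenize(sample)
        print(len(sample), "chars,", len(pieces), "pieces:", pieces)
```

The pangram splits into 9 pieces, with the space attached to the front of every word after the first (' quick', ' brown', and so on). The 26-character JSON string also splits into 9 pieces, but several are punctuation runs such as '":"' that a real vocabulary may need more than one token to cover.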

Limitations and practical pricing implications. Most LLM pricing is quoted per million tokens (older price sheets use per 1,000), usually at different rates for input and output, and a single chat exchange easily includes 1K to 5K tokens of context, history, and system prompts. Long documents, code samples, and multi-turn conversations balloon quickly, so token volume is usually the biggest single lever on production LLM operating cost. Context window limits (e.g., 200K for Claude 3, 128K for GPT-4 Turbo, up to 1M for some Gemini models) cap input and output combined per request; exceeding the limit either truncates the input or rejects the request entirely. Counting before sending is essential for production reliability, because silent truncation produces subtly wrong outputs.

Spanish, Chinese, and other non-English languages cost more per equivalent message because most major tokenizers were trained on English-heavy corpora and assign efficient single-token representations to common English words. Tools that need exact token counts for billing or routing should use the model's actual tokenizer rather than approximations; the counter is for capacity planning and rough cost estimation.
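
A pre-send check against the context window might look like the sketch below. The names `Budget` and `headroom` and the 10 percent safety margin are assumptions for illustration; the margin exists because the input count is an estimate rather than an exact tokenizer count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    context_window: int      # total tokens the model accepts (input + output)
    max_output_tokens: int   # tokens reserved for the model's reply


def headroom(estimated_input_tokens: int, budget: Budget,
             safety_margin_pct: int = 10) -> int:
    """Return the tokens left after the padded input and reserved output.

    A negative result means the request is likely to be truncated or rejected.
    """
    # Pad the estimate (rounding up) to absorb estimation error.
    padded = (estimated_input_tokens * (100 + safety_margin_pct) + 99) // 100
    return budget.context_window - budget.max_output_tokens - padded


if __name__ == "__main__":
    gpt4_turbo = Budget(context_window=128_000, max_output_tokens=4_096)
    for tokens in (100_000, 120_000):
        left = headroom(tokens, gpt4_turbo)
        print(tokens, "input tokens ->", "fits" if left >= 0 else "too large", left)
```

With a 128K window and 4,096 tokens reserved for output, an estimated 100,000-token input leaves 13,904 tokens of headroom, while a 120,000-token estimate overshoots by 8,096 once the margin is applied.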

The about text and FAQ on this page were drafted with AI assistance and reviewed by a member of the Coherence Daddy team before publishing. See our Content Policy for editorial standards.

Frequently Asked Questions