Token
The unit of text a language model processes: roughly 4 characters, or about 0.75 of an English word.
Tokens are the chunks of text that LLMs read and write. They are usually subwords: common words become single tokens, while rare words split into multiple tokens. Numbers, code, and non-English text often use more tokens per character.
A rule of thumb: 1,000 tokens ≈ 750 English words. For scale, a tweet is about 200 tokens, an essay about 4,000, and a novel about 100,000.
Tokens matter for two practical reasons: pricing (you pay per input and output token) and limits (each model has a maximum context window). A poorly designed prompt can burn 10x the tokens of a good one. Tools like our token counter and OpenAI's tiktoken help you estimate before you ship.
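For a quick pre-ship estimate, the 4-characters-per-token rule of thumb can be sketched in a few lines. This is a rough heuristic only (the function name and threshold are illustrative, not from any library); for exact counts on OpenAI models, use tiktoken's `get_encoding` and `encode`.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb.

    Exact counts depend on the model's tokenizer; use tiktoken for those.
    """
    return max(1, round(len(text) / 4))


def fits_in_context(text: str, context_window: int = 8_192) -> bool:
    """Heuristic check that a prompt fits a model's context window."""
    return estimate_tokens(text) <= context_window


prompt = "Tokens are the chunks of text that LLMs read and write."
print(estimate_tokens(prompt))  # rough count, not a tokenizer's exact answer
```

The same estimate in tiktoken would be `len(tiktoken.get_encoding("cl100k_base").encode(text))`, which returns the exact token IDs for that encoding rather than an approximation.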