Glossary Term

Tokens.

Learn what Tokens means in modern search and SEO.

Part of speech: noun

Origin: Old English tacen (sign, symbol); in NLP, the smallest discrete unit a model processes

The basic units of text that AI language models process — roughly corresponding to word fragments — used to measure input and output length and compute API costs.

In the context of large language models, tokens are the discrete units into which text is broken during processing. Tokenisation algorithms (such as OpenAI's tiktoken or Anthropic's tokeniser) split input text into tokens, each averaging roughly ¾ of a word in English, although the ratio varies with the text. For example, 'unhelpful' might become two tokens: 'un' and 'helpful'. Spaces, punctuation, and subword fragments all count toward the token total.
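To make subword splitting concrete, here is a minimal sketch of a greedy longest-match tokeniser. It is not how tiktoken or any production tokeniser works (those learn their vocabularies from data, for example with byte-pair encoding); the vocabulary TOY_VOCAB and the function toy_tokenise are hypothetical names invented for this illustration.

```python
# Hypothetical toy vocabulary, chosen only to illustrate subword splitting.
TOY_VOCAB = {"un", "helpful", "help", "ful", "this", "is", " ", "."}


def toy_tokenise(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Greedy longest-match split; unknown characters become one-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocabulary entry matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary starts here: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenise("unhelpful"))  # ['un', 'helpful']
```

With this toy vocabulary, the space and the full stop in a sentence each come out as their own token, which is why token counts run higher than word counts.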

Why Tokens Matter

LLM APIs are priced per token, with input tokens and output tokens billed separately. Context windows, the maximum amount of information a model can consider at once, are also measured in tokens. Understanding tokenisation helps you estimate costs, optimise prompt length, and stay within context limits in large-scale content operations.
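As a rough illustration of per-token pricing, the sketch below computes the cost of a single call from its input and output token counts. The function name estimate_cost and the prices used in the example are placeholders, not any provider's actual rates; check the current price list before relying on the numbers.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Return the cost of one call, with input and output tokens billed separately."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


# Hypothetical prices: 3.00 per million input tokens, 15.00 per million output tokens.
cost = estimate_cost(2_000, 500, 3.00, 15.00)
print(f"${cost:.4f}")  # $0.0135
```

Multiplying the per-call figure by the number of pages or prompts in a batch gives a first estimate for a content operation.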

Context Window and Long Documents

A 100K-token context window can hold approximately 75,000 words, roughly the length of a novel. Models with larger context windows can process entire codebases, long reports, or extended conversation histories in a single call. However, models tend to 'lose' information from the middle of very long contexts (the 'lost in the middle' problem) and perform better on content placed at the beginning or end.
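The arithmetic behind that figure can be sketched with the ¾-word-per-token rule of thumb. The helpers words_to_tokens and fits_in_context are hypothetical names, and the estimate is only as good as the rule itself; an actual tokeniser count is the reliable check.

```python
def words_to_tokens(words: int) -> int:
    """Estimate tokens at ~3/4 word per token, i.e. words * 4 / 3 rounded up."""
    return (words * 4 + 2) // 3


def fits_in_context(
    words: int, context_tokens: int, reserved_output_tokens: int = 0
) -> bool:
    """Check whether a document fits while leaving room for the model's reply."""
    return words_to_tokens(words) + reserved_output_tokens <= context_tokens


print(words_to_tokens(75_000))  # 100000
print(fits_in_context(75_000, 100_000))  # True
print(fits_in_context(75_000, 100_000, reserved_output_tokens=4_000))  # False
```

The third call shows why a document that nominally fills the window does not fit in practice: the reply also has to fit within the same limit, which this sketch assumes is shared between input and output.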

Tokens Across Languages

Tokenisation efficiency varies by language. English text typically averages about 0.75 words per token. Languages that are poorly represented in the tokeniser's training data (many Asian languages, non-Latin scripts) often need more tokens to express the same meaning, which raises API costs and reduces the effective context window capacity for multilingual applications.
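The effect on capacity and cost can be sketched by treating tokens per word as a parameter. The English baseline of 4/3 tokens per word follows from the 0.75 figure above; the value of 2.0 for another language is purely hypothetical and should be replaced with a ratio measured on your own text. The function names are likewise invented for this example.

```python
ENGLISH_TOKENS_PER_WORD = 4 / 3  # derived from ~0.75 words per token


def effective_word_capacity(context_tokens: int, tokens_per_word: float) -> int:
    """Approximate number of words that fit in a context window."""
    return round(context_tokens / tokens_per_word)


def relative_cost(
    tokens_per_word: float, baseline: float = ENGLISH_TOKENS_PER_WORD
) -> float:
    """Cost multiplier versus the baseline language for the same word count."""
    return tokens_per_word / baseline


# Hypothetical ratio for a less efficiently tokenised language.
other_language_ratio = 2.0

print(effective_word_capacity(100_000, ENGLISH_TOKENS_PER_WORD))
print(effective_word_capacity(100_000, other_language_ratio))
print(round(relative_cost(other_language_ratio), 2))
```

Under these assumed ratios, the same 100K-token window holds fewer words of the second language, and each word costs proportionally more to process.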

Articles about Tokens

Kimi K3 and Qwen3.8 Aren't a Surprise — They're a Pattern

Two Chinese AI models dropped last week and the world acted shocked — again. Kimi K3 from Moonshot AI and Qwen3.8 from Alibaba are making credible claims against OpenAI and Anthropic, and the real story isn't the models. It's why we keep being surprised.

News · June 29, 2026

OpenAI Launches GPT-5.6 — Three Models, Government Oversight

OpenAI dropped a limited preview of GPT-5.6 less than 24 hours after reports surfaced that the Trump administration had asked the company to stagger the release. The suite includes three models — Sol, Terra, and Luna — with a heavy focus on safety, coding, and agentic tasks. Here's what happened and what it means for anyone using AI tools at work.

AI Search · May 29, 2026

How LLMs Actually "Read" Your Website (And Why It's Different From Google)

ChatGPT might be serving visitors a six-month-old version of your homepage. Perplexity is live but cuts you off mid-sentence. Google's AI Overviews run on a completely different pipeline than either one. Here's the plain-English breakdown of how LLMs actually process your site — and why it matters for how you write and structure your content.
