What Are AI Tokens? Explained Simply

Tokens are the chunks of text AI reads and writes, and they decide your cost, speed, and limits. Here is what a token is, in plain English.

By Aditya Kumar Jha · June 27, 2026 · 9 min read · AI Explained

A token is a small chunk of text that an AI model reads and writes, usually a word, part of a word, or a punctuation mark. As a rule of thumb in English, one token is about four characters, and 100 tokens is roughly 75 words. Tokens matter to you for three practical reasons: AI pricing is charged per token, every model can hold only so many tokens at once (its context window), and higher token counts mean slower, costlier responses. Once you understand tokens, the rest of AI billing and limits makes sense.

Models do not read letters or whole sentences the way we do. They break text into tokens first, then work with those. This one idea quietly explains why a long document costs more to process, why a model 'forgets' the start of a very long chat, and why your API bill is shaped the way it is. Here it is in plain English.

What a Token Actually Is

When you send text to a model, a tokenizer splits it into tokens. Common short words are often a single token, while longer or rarer words get split into pieces. 'Cat' is one token. 'Unbelievable' might be two or three. A space or a comma can be its own token. Numbers and code split in their own ways. The exact split varies by model, but the rough English conversion holds: about four characters per token, about 75 words per 100 tokens.

Text	Approx. tokens	Note
'Hello'	1 token	Common short word
'unbelievable'	2 to 3 tokens	Longer words split into pieces
A 75-word paragraph	~100 tokens	The everyday rule of thumb
A 4-page document	~2,000 tokens	Why long inputs cost more
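
The rule of thumb above can be turned into a quick estimator. This is a minimal sketch that assumes English text and uses only the article's rough ratios (about four characters per token, about 75 words per 100 tokens); the function names are hypothetical, and a real tokenizer will give different counts for individual words.

```python
# Rough English-only estimates based on the article's rule of thumb.
# These are averages, not what any specific tokenizer would return.
CHARS_PER_TOKEN = 4
WORDS_PER_100_TOKENS = 75


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer division that rounds up."""
    return -(-numerator // denominator)


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate tokens from character count (about 4 characters per token)."""
    return ceil_div(len(text), CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Estimate tokens from a word count (about 75 words per 100 tokens)."""
    return ceil_div(word_count * 100, WORDS_PER_100_TOKENS)


def estimate_words_from_tokens(token_count: int) -> int:
    """Estimate how many English words fit in a token budget."""
    return token_count * WORDS_PER_100_TOKENS // 100


if __name__ == "__main__":
    print(estimate_tokens_from_words(75))    # the 75-word paragraph row
    print(estimate_tokens_from_words(1500))  # a document of about 1,500 words
    print(estimate_words_from_tokens(1000))
```

Because these are averages, short single words can come out wrong: the character rule rounds 'Hello' up to two tokens, while the table lists it as one. The estimate is most useful for paragraphs and longer text.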

Why Tokens Decide Your Cost

AI providers price by the token, and they usually charge separately for input tokens (what you send) and output tokens (what the model writes back), with output often more expensive. So a request is billed on the full conversation it has to read plus everything it generates. This is why a short question with a huge pasted document can cost more than a long question with no attachment, and why trimming what you send is the simplest way to lower your bill on paid APIs.
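
To make the billing arithmetic concrete, here is a small sketch of how a single request's cost could be computed. The prices are invented for illustration (they are not any provider's real rates), and the `Pricing` class and `request_cost` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-token prices, in US dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost of one request: input and output tokens are billed separately."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Made-up prices, with output costing more than input as the article describes.
EXAMPLE_PRICING = Pricing(input_per_million=2.0, output_per_million=8.0)

# A 50-token question plus a pasted 20,000-token document, 300-token answer.
short_question_big_document = request_cost(20_050, 300, EXAMPLE_PRICING)

# A 400-token question with no attachment, same 300-token answer.
long_question_no_document = request_cost(400, 300, EXAMPLE_PRICING)

if __name__ == "__main__":
    print(f"{short_question_big_document:.4f}")
    print(f"{long_question_no_document:.4f}")
```

Under these assumed prices, the short question with the pasted document costs roughly thirteen times as much as the long question alone, and almost all of the difference is input tokens.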

Why Tokens Decide the Limits (Context Windows)

Every model has a context window, the maximum number of tokens it can consider at once, counting both your input and its output. When a conversation or document exceeds that window, chat apps typically let the oldest content fall out of view, which is why a very long chat can seem to 'forget' what you said at the start. Bigger context windows let a model hold entire books or codebases at once, but more tokens in the window also mean slower and more expensive responses, so bigger is not automatically better for every task.
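
The "oldest content falls out of view" behavior can be sketched in a few lines. This is a simplified illustration, not how any particular app is implemented: it assumes each message's token count is already known, and the function names and the window size in the example are hypothetical.

```python
def fits_in_window(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """The window has to hold the input and the reply together."""
    return input_tokens + max_output_tokens <= context_window


def drop_oldest(message_tokens: list[int], max_output_tokens: int, context_window: int) -> list[int]:
    """Drop messages from the start of the chat until the rest fits.

    message_tokens holds one token count per message, oldest first.
    """
    kept = list(message_tokens)
    while kept and not fits_in_window(sum(kept), max_output_tokens, context_window):
        kept.pop(0)  # the oldest message is the first to go
    return kept


if __name__ == "__main__":
    # Four messages, oldest first, in a tiny made-up 1,000-token window
    # that must also leave room for a 300-token reply.
    print(drop_oldest([500, 300, 400, 200], max_output_tokens=300, context_window=1000))
```

In this example the two oldest messages are dropped and only the last two remain visible to the model, which is the "forgetting" effect described above.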

Three things you care about all trace back to one unit: cost is per token, the memory limit is a token count, and speed drops as token counts rise. Master tokens and AI pricing stops being mysterious.

Input vs Output Tokens

It helps to separate the two. Input tokens are everything the model reads: your prompt, any pasted text or files, and the earlier turns of the conversation it is shown. Output tokens are what it writes in reply. On most paid APIs, output costs more per token than input, which means asking for a shorter answer can save money even when your input is large. If you only use a consumer chat app on a flat monthly plan, you do not pay per token directly, but the same limits still shape how much the model can handle at once.
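
Because earlier turns count as input, the input side of a chat grows with every message. The sketch below assumes the whole history is shown to the model again on each turn (a common setup, but an assumption here); the function name and the token counts are hypothetical.

```python
def input_tokens_per_turn(turns: list[tuple[int, int]]) -> list[int]:
    """Input tokens the model reads on each turn of a chat.

    turns holds (user_message_tokens, reply_tokens) pairs in order.
    Assumes the full earlier conversation is resent with every new message.
    """
    history = 0
    per_turn = []
    for user_tokens, reply_tokens in turns:
        # The model reads all earlier turns plus the new message.
        per_turn.append(history + user_tokens)
        # Both the message and the reply become history for the next turn.
        history += user_tokens + reply_tokens
    return per_turn


if __name__ == "__main__":
    chat = [(50, 200), (40, 300), (60, 150)]
    per_turn = input_tokens_per_turn(chat)
    print(per_turn)       # input tokens read on each turn
    print(sum(per_turn))  # total input tokens across the chat
```

Here the user typed only 150 tokens in total, yet the model read 990 input tokens across three turns, because each turn re-reads what came before.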

How to Use Fewer Tokens (and Pay Less)

Send only what is needed: paste the relevant section of a document, not the entire file, when the rest is irrelevant.
Ask for the length you want: 'answer in 3 bullet points' produces fewer output tokens than an open-ended request.
Start fresh for new topics: a brand-new chat does not carry the token weight of a long, unrelated history.
Summarize long threads: ask the model to summarize a long conversation, then continue from the summary to shed old tokens.
Check before you scale: if you build on an API, run sample text through a tokenizer so you can estimate cost before going live.
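
The last tip, estimating cost before going live, might look like the sketch below. It stands in for a real tokenizer with the article's four-characters-per-token rule, so treat the result as a ballpark; the prices, sample texts, and function names are all hypothetical, and for real planning you would swap `estimate_tokens` for your provider's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Placeholder for a real tokenizer: about 4 characters per token, rounded up."""
    return -(-len(text) // 4)


def projected_cost(
    samples: list[str],
    expected_output_tokens: int,
    requests: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Project total cost for a number of requests from a few sample inputs."""
    if not samples:
        raise ValueError("need at least one sample text")
    average_input = sum(estimate_tokens(s) for s in samples) / len(samples)
    per_request = (
        average_input * input_price_per_million
        + expected_output_tokens * output_price_per_million
    ) / 1_000_000
    return per_request * requests


if __name__ == "__main__":
    # Two stand-in sample inputs of 400 and 800 characters.
    samples = ["a" * 400, "b" * 800]
    total = projected_cost(
        samples,
        expected_output_tokens=100,
        requests=10_000,
        input_price_per_million=2.0,   # made-up price
        output_price_per_million=8.0,  # made-up price
    )
    print(f"{total:.2f}")
```

Running the same projection with a shorter expected output, or with trimmed sample inputs, shows how much the earlier tips would save before any real money is spent.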

Do You Need to Count Tokens Yourself?

For everyday chatting, no. The rough conversion, about 75 words per 100 tokens, is all most people ever need, and consumer apps handle the rest invisibly. Token counting becomes worth your attention only when you are building on an API at scale, where input and output token costs add up across thousands of requests, or when you are deciding whether a long document will fit a model's context window. A tool like LumiChats, which lets you use 40-plus models in one place, also makes it easy to see how different models handle the same long input without managing each provider's billing yourself.

Frequently Asked Questions

What is a token in AI?

A token is a small chunk of text, typically a word, part of a word, or a punctuation mark, that an AI model reads and writes. Models break all text into tokens before processing it. In English, a token averages about four characters, and roughly 100 tokens equals about 75 words.

How many words is 1,000 tokens?

Roughly 750 words in English, using the common rule of about 75 words per 100 tokens. The exact number varies because longer and rarer words split into more tokens, but 750 words is a reliable estimate for planning cost and length.

Why do AI tools charge by the token?

Tokens are the unit of work a model processes, so providers price by them. Most charge separately for input tokens (what you send) and output tokens (what the model writes), with output often more expensive. Sending less and requesting shorter answers lowers cost on paid APIs.

What is the difference between tokens and a context window?

Tokens are the unit; the context window is how many tokens a model can consider at once, counting both your input and its output. When content exceeds the window, the oldest tokens drop out of view, which is why long chats can seem to forget earlier messages.

Do I need to count tokens as a regular user?

No. For everyday chatting the rough conversion of about 75 words per 100 tokens is enough, and apps handle it invisibly. Token counting matters mainly when building on an API at scale or checking whether a long document fits a model's context window.

In short, tokens are the hidden unit behind everything you notice about AI: the price, the memory limit, and the speed. Remember that 100 tokens is about 75 words, that input and output are billed separately, and that bigger context windows cost more, and you will understand AI pricing and limits better than most people who use these tools every day.
