Reference · 7 min read

llms.txt vs llms-full.txt

When to publish each, and what changes if you do.

Two files at the root of your site. One says “here’s what we are and which pages matter.” The other says “here are those pages, in full.” Most teams only need the first. Documentation-heavy sites may benefit from publishing both.

This is the short reference: what each file is, when it pays off to publish llms-full.txt on top of llms.txt, and what the large adopters (Anthropic, Stripe, Cloudflare) actually ship.

The 30-second answer

llms.txt is a curated table of contents in Markdown — a one-page summary plus 5–25 link entries. It tells AI assistants what your site is and which pages are worth fetching.

llms-full.txt is the full-text bundle — the bodies of your most important pages, concatenated as Markdown in a single file. It removes the per-page round trip when an AI wants to quote you.

llms.txt was proposed by Jeremy Howard at Answer.AI in September 2024; llms-full.txt spread alongside it as a companion convention. llms.txt has the broader adoption; llms-full.txt shows up most often on documentation sites where the content itself is the product.

| | llms.txt | llms-full.txt |
|---|---|---|
| Job | Curated index of important pages | Full-text dump of those pages |
| Format | Markdown: H1, blockquote, link list | Markdown: concatenated page bodies |
| Typical size | 1–20 KB | 50 KB to several MB |
| Audience | AI assistants picking what to fetch | AI assistants quoting in answers |
| Required? | No (becoming expected) | No (mostly docs sites) |
| Updated when | Site structure or top pages change | Page bodies change |
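To make the llms.txt shape concrete (H1, blockquote, link list), here is a minimal Python sketch that renders one from a title, a summary, and a list of links. The site name, the example.com URLs, and the "## Pages" section heading are illustrative assumptions, not values from any real site.

```python
def render_llms_txt(title, summary, links):
    """Render a minimal llms.txt: H1 title, blockquote summary, link list.

    `links` is a list of (name, url, note) tuples. The "## Pages" heading
    is an assumed section name chosen for this sketch.
    """
    lines = [f"# {title}", "", f"> {summary}", "", "## Pages", ""]
    for name, url, note in links:
        lines.append(f"- [{name}]({url}): {note}")
    return "\n".join(lines) + "\n"


# Hypothetical site used only for illustration.
example = render_llms_txt(
    "Acme Search",
    "Hosted search API for product catalogs.",
    [
        ("Quickstart", "https://example.com/docs/quickstart.md",
         "Install the CLI and run a first query"),
        ("API reference", "https://example.com/docs/api.md",
         "Authentication and endpoints"),
    ],
)
print(example)
```

In practice the list would hold the 5–25 entries you have curated by hand; the code only handles the formatting.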

What llms-full.txt looks like

Mechanically it’s just the Markdown body of each important page, with H1 headings to separate them. A trimmed example:

# Quickstart

Get a cluster running in 5 minutes. Install the CLI, run `acme init`, and...

# API reference

## Authentication
All requests use bearer tokens in the Authorization header...

## Endpoints
### POST /v1/search
Run a query against the index...

# Architecture

The index is sharded across nodes by document ID. Each shard runs an LSM tree...

That’s the whole convention. There’s no XML, no metadata, no required front matter. The H1s mark page boundaries; everything else is the page body in Markdown.
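Because the H1s are the only page boundary, a consumer can recover individual pages with a few lines of code. The sketch below is one possible reader, not a reference parser: it splits a bundle on H1 lines and, as an extra precaution that the convention itself does not spell out, ignores `# ` lines inside fenced code blocks.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def split_pages(bundle):
    """Split an llms-full.txt string into {page title: body} on H1 lines."""
    pages = {}
    title = None
    body = []
    in_fence = False
    for line in bundle.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # track fenced code so "# comment" is not a title
        if line.startswith("# ") and not in_fence:
            if title is not None:
                pages[title] = "\n".join(body).strip()
            title = line[2:].strip()
            body = []
        elif title is not None:
            body.append(line)
    if title is not None:
        pages[title] = "\n".join(body).strip()
    return pages


sample = """# Quickstart

Get a cluster running in 5 minutes.

# API reference

## Authentication
All requests use bearer tokens.
"""
pages = split_pages(sample)
print(list(pages))
```

Only single-`#` headings count as boundaries, so `## Authentication` stays inside the "API reference" page.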

When publishing llms-full.txt is worth it

For most small-business sites — a services list, an about page, a pricing page, a contact form — llms-full.txt is overkill. The pages are short, the AI can fetch them directly when cited, and the curation in llms.txt is what does the work.

It pays off in three specific cases.

  • Documentation sites where the product is the docs (developer tools, APIs, libraries). Anthropic, Stripe, and Mintlify-hosted sites publish llms-full.txt for exactly this reason.
  • Sites with heavy JavaScript or paywalls that AI crawlers struggle to read in raw HTML. A pre-rendered Markdown bundle works around the rendering problem entirely.
  • Large knowledge bases (handbooks, policies, long-form blogs) where you want answers grounded in your wording, not paraphrased from a page summary.

If your site is none of those, ship llms.txt alone. Adding llms-full.txt later is straightforward.

How the two files work together

When an AI assistant wants to cite you, the workflow looks something like this:

  • It fetches llms.txt to learn what your site is and which pages matter.
  • For each page it’s about to cite, it tries to retrieve the body. If llms-full.txt is present, the body is already in hand — no second fetch, no HTML parsing.
  • If llms-full.txt is not present, it fetches each relevant URL individually and parses the HTML.
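The workflow above can be sketched in Python to show where the round trips go. This is an illustration of the described behavior, not how any particular assistant is implemented. The `fetch` callable, the `bodies_for_citation` helper, and the example.com URLs are hypothetical, and pages are matched to the bundle by H1 title, which is an assumption of this sketch.

```python
from urllib.parse import urljoin


def split_on_h1(bundle):
    """Map each H1 title in a bundle to the Markdown that follows it."""
    pages, title = {}, None
    for line in bundle.splitlines():
        if line.startswith("# "):
            title = line[2:].strip()
            pages[title] = []
        elif title is not None:
            pages[title].append(line)
    return {t: "\n".join(b).strip() for t, b in pages.items()}


def bodies_for_citation(site_root, wanted, fetch):
    """Collect page bodies for `wanted` ({title: url}).

    `fetch(url)` returns the response text, or None if the URL is missing.
    Returns ({title: text}, number of fetches made).
    """
    calls = 0

    def counted_fetch(url):
        nonlocal calls
        calls += 1
        return fetch(url)

    found = {}
    bundle = counted_fetch(urljoin(site_root, "/llms-full.txt"))
    if bundle is not None:
        pages = split_on_h1(bundle)
        found = {t: pages[t] for t in wanted if t in pages}
    # Anything the bundle did not cover falls back to one fetch per page.
    for title, url in wanted.items():
        if title not in found:
            found[title] = counted_fetch(url)
    return found, calls


# Demo with an in-memory "site" instead of real HTTP requests.
wanted = {
    "Quickstart": "https://example.com/docs/quickstart",
    "Architecture": "https://example.com/docs/architecture",
}
site_with_bundle = {
    "https://example.com/llms-full.txt":
        "# Quickstart\n\nInstall the CLI.\n\n# Architecture\n\nSharded by document ID.\n",
}
site_without_bundle = {
    "https://example.com/docs/quickstart": "<html>quickstart</html>",
    "https://example.com/docs/architecture": "<html>architecture</html>",
}
with_bundle = bodies_for_citation("https://example.com", wanted, site_with_bundle.get)
without_bundle = bodies_for_citation("https://example.com", wanted, site_without_bundle.get)
print(with_bundle[1], without_bundle[1])
```

With the bundle present, both bodies arrive in a single request as Markdown. Without it, the sketch makes one request per page and gets HTML that still needs parsing.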

The commonly cited figure is roughly 10× lower token cost than crawling full HTML, which can make citation more likely on token-budgeted assistants. That's the same figure cited for llms.txt alone; llms-full.txt extends the reduction to the body content.

What the large adopters actually publish

  • Anthropic publishes both at docs.anthropic.com/llms.txt and docs.anthropic.com/llms-full.txt. The full bundle covers the API reference and the Claude documentation.
  • Stripe publishes both at stripe.com/llms.txt and stripe.com/llms-full.txt. The full bundle contains the public API documentation in Markdown.
  • Cloudflare and Vercel publish llms.txt for their docs sites. As of 2026, llms-full.txt presence on each varies with the docs platform they use.
  • Mintlify-hosted sites get both files generated automatically — that’s why so many developer-tool docs in 2025–2026 ship both.

Generating llms-full.txt

You don’t write llms-full.txt by hand. The standard pattern is:

  • Take the URLs in your llms.txt link list.
  • Convert each page’s body to Markdown (most static-site generators and docs platforms can export Markdown directly).
  • Concatenate them, separated by H1 page-title headings, and publish at https://yoursite.com/llms-full.txt.

For Mintlify, GitBook, and Docusaurus-style platforms, llms-full.txt is a checkbox or a built-in file. For self-hosted sites, a small build script that walks your llms.txt URL list and serializes Markdown is enough.
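For the self-hosted case, a build script along these lines may be enough. It is a sketch under stated assumptions: links in llms.txt are written as Markdown list items, and `load_markdown` is a hypothetical hook that returns one page's body as Markdown (from your static-site generator's source files, for example). The demo feeds it an in-memory dictionary instead of real pages.

```python
import re

# Matches list items such as "- [Title](https://example.com/page.md): note".
LINK = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)")


def links_from_llms_txt(text):
    """Return (title, url) pairs from the Markdown link list in llms.txt."""
    return [m.groups() for m in map(LINK.match, text.splitlines()) if m]


def build_bundle(llms_txt, load_markdown):
    """Concatenate page bodies under H1 page-title headings.

    `load_markdown(url)` is a caller-supplied (hypothetical) function that
    returns the Markdown body for one URL.
    """
    sections = []
    for title, url in links_from_llms_txt(llms_txt):
        body = load_markdown(url).strip()
        sections.append(f"# {title}\n\n{body}\n")
    return "\n".join(sections)


llms_txt = """# Acme Search

> Hosted search API.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): First query in 5 minutes
- [API reference](https://example.com/docs/api.md): Auth and endpoints
"""
fake_pages = {
    "https://example.com/docs/quickstart.md": "Install the CLI, then run `acme init`.",
    "https://example.com/docs/api.md": "## Authentication\nAll requests use bearer tokens.",
}
bundle = build_bundle(llms_txt, fake_pages.__getitem__)
print(bundle)
```

The sketch assumes page bodies contain no H1 of their own. If yours do, demote their headings by one level before concatenating so the page boundaries stay unambiguous.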

Common mistakes

  • Publishing llms-full.txt without llms.txt. You skip the curation layer that helps AI pick the right page to cite. If you publish the bundle, publish llms.txt too, and treat llms.txt as the primary file.
  • Letting the bundle drift. llms-full.txt needs to be regenerated when page bodies change. A six-month-old bundle that contradicts your live docs is worse than no bundle.
  • Including pages that need a login. llms-full.txt is public. Don’t put gated content in it just because you can — only include pages that are already accessible without authentication.
  • Mismatched URLs. Page bodies in llms-full.txt should correspond to URLs that appear in llms.txt. Otherwise the AI can’t link the quoted text back to a page on your site.
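The mismatch problem can be caught in a build step. The bundle format carries no URLs, so the sketch below compares link titles in llms.txt against H1 titles in the bundle. That matching rule is an assumption of this example, and the function names are hypothetical.

```python
import re


def link_titles(llms_txt):
    """Titles of the Markdown list links in llms.txt."""
    pattern = r"^[ \t]*[-*][ \t]*\[([^\]]+)\]\("
    return {m.group(1).strip() for m in re.finditer(pattern, llms_txt, re.MULTILINE)}


def bundle_titles(bundle):
    """H1 titles in llms-full.txt (fenced code blocks are not special-cased)."""
    return {m.group(1).strip() for m in re.finditer(r"^# (.+)$", bundle, re.MULTILINE)}


def compare(llms_txt, bundle):
    """Report titles that appear in one file but not the other."""
    index, full = link_titles(llms_txt), bundle_titles(bundle)
    return {
        "missing_from_bundle": sorted(index - full),
        "not_in_index": sorted(full - index),
    }


llms_txt = (
    "# Acme Search\n\n> Hosted search API.\n\n"
    "- [Quickstart](https://example.com/docs/quickstart): Setup\n"
    "- [API reference](https://example.com/docs/api): Endpoints\n"
)
bundle = "# Quickstart\n\nInstall the CLI.\n\n# Architecture\n\nSharded by document ID.\n"
result = compare(llms_txt, bundle)
print(result)
```

Failing the build when either list is non-empty also guards against drift, since a renamed or removed page shows up as a mismatch.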

FAQ

Is llms-full.txt part of the llms.txt spec?

It's a companion convention that spread alongside llms.txt, which Jeremy Howard proposed in September 2024. The format is simple: a single Markdown file with the full bodies of your most important pages concatenated. Publishing it is optional. llms.txt is the baseline; llms-full.txt is the expanded form.

Do AI assistants actually read llms-full.txt?

Some do, mostly when they need full-text grounding rather than navigation. ChatGPT, Claude, and Perplexity have all been observed fetching llms-full.txt on documentation sites that publish it. As of 2026 it is consumed far less widely than llms.txt, so treat llms.txt as the priority and llms-full.txt as a useful extra for content-heavy sites.

If I publish llms-full.txt, do I still need llms.txt?

Yes. The two files have different jobs. llms.txt tells AI which pages matter and how to find them; llms-full.txt gives the bodies of those pages so the AI can quote them without crawling each page individually. Publishing only llms-full.txt skips the curation layer that helps AI choose what to cite.

How big can llms-full.txt get?

There is no hard limit in the spec, but practical limits matter. AI assistants have context windows; a multi-megabyte llms-full.txt may be truncated or skipped. Real-world examples range from ~50 KB for a small docs site to several megabytes for Anthropic and Stripe. If your bundle exceeds a few megabytes, prefer publishing only the most cited pages.
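A quick size report can show when a bundle is getting too large and which pages to trim first. In this sketch the 2 MB budget stands in for "a few megabytes" and the 4-characters-per-token ratio is a rough rule of thumb. Both are assumptions, not values from the spec or from any assistant.

```python
def bundle_report(bundle, max_bytes=2_000_000, chars_per_token=4):
    """Summarize bundle size and list pages from largest to smallest.

    `max_bytes` and `chars_per_token` are assumed defaults for illustration.
    """
    size = len(bundle.encode("utf-8"))
    page_bytes = {}
    title = None
    for line in bundle.splitlines(keepends=True):
        if line.startswith("# "):
            title = line[2:].strip()
            page_bytes[title] = 0
        if title is not None:
            page_bytes[title] += len(line.encode("utf-8"))
    largest = sorted(page_bytes.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "bytes": size,
        "approx_tokens": len(bundle) // chars_per_token,
        "over_budget": size > max_bytes,
        "largest_pages": largest,
    }


# Synthetic bundle: a short page and a much longer one.
sample = "# Quickstart\n\n" + "word " * 1000 + "\n# Changelog\n\n" + "entry " * 5000
report = bundle_report(sample)
print(report["bytes"], report["over_budget"], report["largest_pages"][0][0])
```

Page size is only a proxy. Which pages are "most cited" has to come from your own analytics or judgment.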

Does llms-full.txt help my Google ranking?

Not directly. Google does not use llms-full.txt for ranking (as of May 2026). Like llms.txt, it's an AI-search artifact. The indirect upside is the same: writing a clean Markdown bundle forces you to clarify your most important pages, which is good content hygiene.

Will publishing llms-full.txt let AI train on my content?

No more than publishing the same pages on your site already does. llms-full.txt is a retrieval artifact — meant to be read at answer time, not added to a training set. If you want to opt out of training, block GPTBot, ClaudeBot, and similar in robots.txt; that affects both your HTML pages and your llms-full.txt the same way.
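To check that a robots.txt opt-out covers the bundle as well as the HTML pages, you can test the rules with Python's standard-library `urllib.robotparser`. The robots.txt content and example.com paths below are illustrative. The stdlib parser only approximates how a given crawler reads the file, and robots.txt is honored voluntarily.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block two training crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
    for path in ("/docs/quickstart", "/llms-full.txt"):
        allowed = can_fetch(ROBOTS_TXT, agent, "https://example.com" + path)
        print(agent, path, allowed)
```

A site-wide `Disallow: /` for a crawler applies to `/llms-full.txt` exactly as it does to any HTML page, which is the point made above.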