Serving Markdown version of pages to AI: Is it worth it?

{ "author": { "bio": null, "image": { "_type": "image", "alt": "Caroline Scholles, Content & SEO Specialist in Lisbon, Portugal", "asset": { "_ref": "image-30252a4308f1def976faa953ad20b174c29761a0-180x320-jpg", "_type": "reference" } }, "name": "Caroline Scholles" }, "body": [ { "_key": "069f30b543cd", "_type": "block", "children": [ { "_key": "92ec73411cf6", "_type": "span", "marks": [], "text": "Recently, " }, { "_key": "33591f9b08e8", "_type": "span", "marks": [ "9efded140e80" ], "text": "Cloudflare released a new feature" }, { "_key": "a007ebbff841", "_type": "span", "marks": [], "text": " that enables websites to serve Markdown versions of pages to AI agents via content negotiation. But is this actually a good practice?" } ], "markDefs": [ { "_key": "9efded140e80", "_type": "link", "href": "https://x.com/Cloudflare/status/2021955521213800489" } ], "style": "normal" }, { "_key": "9dd0daed62fe", "_type": "block", "children": [ { "_key": "1f09c17da218", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "5e4c5a69d2c7", "_type": "block", "children": [ { "_key": "1942cd0732f0", "_type": "span", "marks": [], "text": "I applied a similar strategy on my personal website and did some research to better understand the advantages and disadvantages of making Markdown available through content negotiation." } ], "markDefs": [], "style": "normal" }, { "_key": "c926a66f618e", "_type": "block", "children": [ { "_key": "a6a1458f9eb7", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "f70d6eada2c5", "_type": "block", "children": [ { "_key": "8e9024fabac5", "_type": "span", "marks": [], "text": "Why consider Markdown for agents?" 
} ], "markDefs": [], "style": "h2" }, { "_key": "3b31c711ff0e", "_type": "block", "children": [ { "_key": "3d40dd4835a4", "_type": "span", "marks": [], "text": "Before going through the implementation process, there are a few reasons why this approach should, or should not, be considered." } ], "markDefs": [], "style": "normal" }, { "_key": "3b37ec461970", "_type": "block", "children": [ { "_key": "b40c0af6fb61", "_type": "span", "marks": [], "text": "Cloudflare states that some agents, such as Claude Code and OpenCode, send requests with headers like:" } ], "markDefs": [], "style": "normal" }, { "_key": "efc7a58a1de7", "_type": "code", "code": "{ headers: { Accept: \"text/markdown, text/html\" } }", "language": "javascript" }, { "_key": "80db909f21df", "_type": "block", "children": [ { "_key": "cd8672a973fe", "_type": "span", "marks": [], "text": "This indicates that these agents can accept Markdown and may prioritize it if available." } ], "markDefs": [], "style": "normal" }, { "_key": "d54b4c6a0db6", "_type": "block", "children": [ { "_key": "dac7ce18517b", "_type": "span", "marks": [], "text": "However, this is not the case for many conventional crawlers such as Googlebot, and even OpenAI’s standard crawlers. I haven’t audited the exact Accept headers of every crawler or agent, but it’s generally understood that most agents still prioritize the browser version of the page, which is the full HTML." } ], "markDefs": [], "style": "normal" }, { "_key": "f372abd71645", "_type": "block", "children": [ { "_key": "18c99c918ad8", "_type": "span", "marks": [], "text": "If you explore " }, { "_key": "f47a11b0ad28", "_type": "span", "marks": [ "3a30e75850ea" ], "text": "Cloudflare Radar’s AI bot data, you can see how early and fragmented this ecosystem still is." 
} ], "markDefs": [ { "_key": "3a30e75850ea", "_type": "link", "href": "https://radar.cloudflare.com/explorer?dataSet=ai.bots&groupBy=content_type&filters=userAgent%253DGPTBot&timeCompare=1" } ], "style": "normal" }, { "_key": "41b65e4ee3d5", "_type": "block", "children": [ { "_key": "6e9abe475f0f", "_type": "span", "marks": [], "text": "At this stage, most user agents do not appear to actively negotiate Markdown over HTML. In practice, the majority still consume full HTML documents." } ], "markDefs": [], "style": "normal" }, { "_key": "997d0f30077d", "_type": "block", "children": [ { "_key": "4617d05477bd", "_type": "span", "marks": [], "text": "Markdown for agents isn't an SEO practice" } ], "markDefs": [], "style": "h2" }, { "_key": "73338539a6b3", "_type": "block", "children": [ { "_key": "afee9f376815", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "fbd6d602ba8d", "_type": "block", "children": [ { "_key": "43ea0113474e", "_type": "span", "marks": [], "text": "Where this becomes interesting is not SEO, but efficiency. With recent discussions around WebMCP, we’re starting to see a broader shift in how agents interact with web content. WebMCP itself is a separate concept, but it touches on a similar principle: reducing unnecessary overhead in machine-to-web interactions." } ], "markDefs": [], "style": "normal" }, { "_key": "48c7ed9cb9af", "_type": "block", "children": [ { "_key": "244777a0fcc8", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "5174f490256e", "_type": "block", "children": [ { "_key": "d4ed929a1892", "_type": "span", "marks": [], "text": "Processing full HTML documents is not particularly efficient for LLMs. In fact, it consumes significantly more tokens than serving a clean Markdown file." 
} ], "markDefs": [], "style": "normal" }, { "_key": "e0e6c965952c", "_type": "block", "children": [ { "_key": "313b45fcf4db", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "c21578e0f1dd", "_type": "block", "children": [ { "_key": "6346a4e7269a", "_type": "span", "marks": [], "text": "The main argument in favor of Markdown here is computational efficiency:" } ], "markDefs": [], "style": "normal" }, { "_key": "84bd9dbdc695", "_type": "block", "children": [ { "_key": "5deb3db4ccee", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "1db5e19da432", "_type": "block", "children": [ { "_key": "a8159de218e3", "_type": "span", "marks": [ "strong" ], "text": "1. Input Tokens: " }, { "_key": "054cf8d5be00", "_type": "span", "marks": [], "text": "on average, 1 token corresponds to approximately 4 characters. An HTML page contains far more than just readable content. It includes markup, layout structure, scripts, styles, navigation, and metadata. All of that contributes to token usage." } ], "markDefs": [], "style": "normal" }, { "_key": "f6645f5c2419", "_type": "block", "children": [ { "_key": "296cabf08238", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "3940508a3c09", "_type": "block", "children": [ { "_key": "2150b3547bf9", "_type": "span", "marks": [ "strong" ], "text": "2. Processing Complexity: " }, { "_key": "090b060b8556", "_type": "span", "marks": [], "text": "a web page contains many more structural elements than the informational content itself. Even if an agent ultimately extracts only the meaningful text, it still has to process the entire document first. That introduces computational overhead." 
} ], "markDefs": [], "style": "normal" }, { "_key": "adc9fdcc7c00", "_type": "block", "children": [ { "_key": "c7d15fd354d5", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "0c35f1e82eee", "_type": "block", "children": [ { "_key": "06d32ae41383", "_type": "span", "marks": [ "strong" ], "text": "3. Context Window Utilization: " }, { "_key": "f4ac88e6def7", "_type": "span", "marks": [], "text": "if you serve Markdown instead, you provide a cleaner and more direct representation of the content. More efficient use of the context window. By delivering only the content that matters, you potentially maximize the model’s reasoning capacity." } ], "markDefs": [], "style": "normal" }, { "_key": "221649b74806", "_type": "block", "children": [ { "_key": "0cae95cd151b", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "2aaada2d9211", "_type": "block", "children": [ { "_key": "98a4c9f09f31", "_type": "span", "marks": [], "text": "This discussion is primarily about content consumption efficiency, not about page interaction or search engine optimization. We’re not diving deeply into WebMCP here. Instead, it’s about offering a more efficient representation of the same content to agents that explicitly request Markdown instead of HTML." } ], "markDefs": [], "style": "normal" }, { "_key": "aa77cbe12a42", "_type": "block", "children": [ { "_key": "67e41392be89", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "cd8aeb7c20c6", "_type": "block", "children": [ { "_key": "9e12d44496f3", "_type": "span", "marks": [], "text": "At the moment, this remains experimental. " }, { "_key": "54f2ea59edac", "_type": "span", "marks": [ "strong" ], "text": "Most agents still default to HTML. 
But as LLM web interactions evolve, exploring cleaner content delivery formats may become increasingly relevant; especially if the goal is to improve response quality through better context efficiency." } ], "markDefs": [], "style": "normal" }, { "_key": "3d798899e1a3", "_type": "block", "children": [ { "_key": "82ad1d8172ef", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "a3b133a0ea9f", "_type": "block", "children": [ { "_key": "2c8352fe8ae9", "_type": "span", "marks": [], "text": "The Trade-offs of Content Negotiation" } ], "markDefs": [], "style": "h2" }, { "_key": "cc2dd27bda47", "_type": "block", "children": [ { "_key": "61a71edba528", "_type": "span", "marks": [], "text": "By implementing Markdown via content negotiation, I’m effectively providing a lower-token version of my website. But this doesn’t come without trade-offs." } ], "markDefs": [], "style": "normal" }, { "_key": "c3ac44b05ab9", "_type": "block", "children": [ { "_key": "357a66d72dae", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "19c4365246c4", "_type": "block", "children": [ { "_key": "604adf8e4115", "_type": "span", "marks": [], "text": "The first change is architectural. With content negotiation in place, my blog posts are now rendered on demand by the server. The site is no longer fully pre-rendered as static HTML. That introduces additional complexity. For example, I can’t rely on the standard @astrojs/sitemap package without adjustments. What used to be straightforward static generation now requires more intentional configuration." } ], "markDefs": [], "style": "normal" }, { "_key": "39d4f25da5fe", "_type": "block", "children": [ { "_key": "63bb32413d96", "_type": "span", "marks": [], "text": "Caching also becomes more nuanced. I’m serving two representations from the same URL: HTML for browsers and traditional crawlers, and Markdown for agents that explicitly request it. 
Serving tailored content from a single URL isn’t unusual, it’s actually how content negotiation is supposed to work, but it does require careful handling to avoid cache inconsistencies." } ], "markDefs": [], "style": "normal" }, { "_key": "3d3c65984464", "_type": "block", "children": [ { "_key": "21608d9c4fa6", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "ded9353f57ca", "_type": "block", "children": [ { "_key": "433096b01191", "_type": "span", "marks": [], "text": "It’s also important to clarify who this benefits." } ], "markDefs": [], "style": "normal" }, { "_key": "892277af434c", "_type": "block", "children": [ { "_key": "853a5ad5a85e", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "0db1f2fff39a", "_type": "block", "children": [ { "_key": "2b58b754cf1b", "_type": "span", "marks": [], "text": "Traditional crawlers are largely unaffected. Search engines still crawl links from page to page, download HTML, render the page much like a browser would, and evaluate structure, layout, performance, and discoverability. Their concern is whether the content can be indexed correctly and whether it reflects what a user sees." } ], "markDefs": [], "style": "normal" }, { "_key": "66bd24a0e2b5", "_type": "block", "children": [ { "_key": "d0fb12686dc6", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "59a8e2fb9583", "_type": "block", "children": [ { "_key": "fe772870052b", "_type": "span", "marks": [], "text": "AI agents operate differently. They are not trying to render a page visually. They ingest its content and process it through a language model. In that context, what matters most is the semantic content itself (the meaning of the text) and how efficiently it can be processed. Fewer tokens mean lower computational cost. Cleaner structure means less ambiguity. 
Markdown naturally aligns with those priorities because it removes structural noise and presents content in a more direct form." } ], "markDefs": [], "style": "normal" }, { "_key": "736a86d2c7f9", "_type": "block", "children": [ { "_key": "d3255f39522b", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "7dd66849ec3f", "_type": "block", "children": [ { "_key": "55c6f6adad81", "_type": "span", "marks": [ "strong" ], "text": "The key difference, then, is this: search engines care about rendering and indexing. AI agents care about semantic extraction and efficiency. Understanding that distinction is essential before deciding whether introducing content negotiation is worth the added complexity." } ], "markDefs": [], "style": "normal" }, { "_key": "89dfc8f3f036", "_type": "block", "children": [ { "_key": "5167b2f9f04c", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "72455e7e051d", "_type": "block", "children": [ { "_key": "5ea21a657ee6", "_type": "span", "marks": [], "text": "Step-by-Step: How I Added a Markdown Version to My Astro + Sanity Website" } ], "markDefs": [], "style": "h2" }, { "_key": "66773d6806b3", "_type": "block", "children": [ { "_key": "931a0db23798", "_type": "span", "marks": [], "text": "I recently added a Markdown representation of my blog posts to my personal website (Astro + Sanity). I wanted to see what it looks like to offer a low-token version of the same content for clients that explicitly ask for Markdown via content negotiation." 
} ], "markDefs": [], "style": "normal" }, { "_key": "8f3e658426d7", "_type": "block", "children": [ { "_key": "66734d492789", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "a8eed2a43b79", "_type": "block", "children": [ { "_key": "c0dd0def2608", "_type": "span", "marks": [], "text": "This post is a walkthrough of exactly what I implemented, and the few trade-offs I ran into along the way." } ], "markDefs": [], "style": "normal" }, { "_key": "19f839cffa4d", "_type": "block", "children": [ { "_key": "9c225aa7dd0e", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "b4074dc045ca", "_type": "block", "children": [ { "_key": "d1c746f769b7", "_type": "span", "marks": [ "strong" ], "text": "Prerequisites (What My Setup Looked Like)" } ], "markDefs": [], "style": "normal" }, { "_key": "585cc85e3982", "_type": "block", "children": [ { "_key": "4bbc4be414b5", "_type": "span", "marks": [], "text": "Before I changed anything, my site already had:" } ], "markDefs": [], "style": "normal" }, { "_key": "13f6ab829399", "_type": "block", "children": [ { "_key": "985b6d4fb493", "_type": "span", "marks": [], "text": "A working Astro project connected to a Sanity.io backend" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "6d6933948418", "_type": "block", "children": [ { "_key": "2a31f692b4e5", "_type": "span", "marks": [], "text": "Blog posts routed with Astro’s file-based routing (ex: /insights/[slug])" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "eb0ff4de5e3a", "_type": "block", "children": [ { "_key": "faed611ab6ec", "_type": "span", "marks": [], "text": "A standard HTML blog page rendering Sanity content" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "cb0f1a1f48a4", "_type": "block", "children": [ { "_key": "34a5b474a830", "_type": "span", "marks": [], "text": "" } ], 
"markDefs": [], "style": "normal" }, { "_key": "b5502681d8e7", "_type": "block", "children": [ { "_key": "8858c768d1ab", "_type": "span", "marks": [], "text": "If your setup is similar, you’ll be able to follow this pretty closely." } ], "markDefs": [], "style": "normal" }, { "_key": "f50b5c6af335", "_type": "block", "children": [ { "_key": "952b031ed36f", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "3c95da8bc4d4", "_type": "block", "children": [ { "_key": "e9dff3d45916", "_type": "span", "marks": [ "strong" ], "text": "Step 1: Create the Markdown API Endpoint" } ], "markDefs": [], "style": "h2" }, { "_key": "050da832442b", "_type": "block", "children": [ { "_key": "2ea6ecf6faff", "_type": "span", "marks": [], "text": "The first thing I needed was a dedicated route that:" } ], "markDefs": [], "style": "normal" }, { "_key": "4c608f924bd2", "_type": "block", "children": [ { "_key": "55c772be8ac2", "_type": "span", "marks": [], "text": "Fetches content from Sanity" } ], "level": 1, "markDefs": [], "style": "normal" }, { "_key": "2bd184fe2eda", "_type": "block", "children": [ { "_key": "511aa1e52517", "_type": "span", "marks": [], "text": "Converts Portable Text to Markdown" } ], "level": 1, "markDefs": [], "style": "normal" }, { "_key": "ee50d3eee438", "_type": "block", "children": [ { "_key": "cb20adda955c", "_type": "span", "marks": [], "text": "Returns it with the correct Content-Type" } ], "level": 1, "markDefs": [], "style": "normal" }, { "_key": "5fc2b4f29857", "_type": "block", "children": [ { "_key": "dd74e2f44031", "_type": "span", "marks": [], "text": "For organization, I placed this inside a markdown directory." 
} ], "markDefs": [], "style": "normal" }, { "_key": "c5928e9a48ac", "_type": "block", "children": [ { "_key": "038838ab593e", "_type": "span", "marks": [ "strong" ], "text": "File:" }, { "_key": "0546e9ac79bc", "_type": "span", "marks": [], "text": " src/pages/markdown/insights/[slug].md.ts" } ], "markDefs": [], "style": "normal" }, { "_key": "ad8094bc6be5", "_type": "block", "children": [ { "_key": "8556fc23390c", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "82c36b291560", "_type": "code", "code": "import type { APIRoute } from \"astro\";\n\nimport { loadQuery } from \"../../../sanity/lib/load-query\";\n\nimport { portableTextToMarkdown } from \"@portabletext/markdown\";\n\nimport type { SanityDocument } from \"@sanity/client\";\n\n// This route will be rendered on-demand at request time.\n\nexport const prerender = false;\n\ninterface Post extends SanityDocument {\n\nbody: any;\n\n}\n\nexport const GET: APIRoute = async ({ params }) => {\n\nconst { slug } = params;\n\ntry {\n\nconst { data: post } = await loadQuery<Post>({\n\nquery: `*[_type == \"post\" && slug.current == $slug][0]{ body }`,\n\nparams: { slug },\n\n});\n\nif (!post || !post.body) {\n\nreturn new Response(\"Not found\", { status: 404 });\n\n}\n\nconst markdown = portableTextToMarkdown(post.body);\n\nreturn new Response(markdown, {\n\nstatus: 200,\n\nheaders: {\n\n\"Content-Type\": \"text/markdown; charset=utf-8\",\n\n},\n\n});\n\n} catch (error) {\n\nreturn new Response(\"Internal Server Error\", { status: 500 });\n\n}\n\n};", "language": "typescript" }, { "_key": "3c9839d85b4d", "_type": "block", "children": [ { "_key": "3e288c0f5757", "_type": "span", "marks": [], "text": "This route always returns Markdown. It’s clean, explicit, and easy to test." 
} ], "markDefs": [], "style": "normal" }, { "_key": "105b0f984e3f", "_type": "block", "children": [ { "_key": "7eaee76f93af", "_type": "span", "marks": [], "text": "The important detail here is:" } ], "markDefs": [], "style": "normal" }, { "_key": "dc3f0daa7a4c", "_type": "block", "children": [ { "_key": "3674f61abc08", "_type": "span", "marks": [], "text": "export const prerender = false;" } ], "markDefs": [], "style": "normal" }, { "_key": "d23f9a1c645b", "_type": "block", "children": [ { "_key": "a7effbd18782", "_type": "span", "marks": [], "text": "That tells Astro this route must run dynamically at request time." } ], "markDefs": [], "style": "normal" }, { "_key": "2ef8577a3055", "_type": "block", "children": [ { "_key": "50171b5fb115", "_type": "span", "marks": [ "strong" ], "text": "Step 2: Forcing Dynamic Rendering" } ], "markDefs": [], "style": "h2" }, { "_key": "c55472494e30", "_type": "block", "children": [ { "_key": "9c22da1ff33f", "_type": "span", "marks": [], "text": "At first, I ran into a ForbiddenRewrite error." } ], "markDefs": [], "style": "normal" }, { "_key": "f698f50b4586", "_type": "block", "children": [ { "_key": "f6aee6cb216e", "_type": "span", "marks": [], "text": "The reason? Astro middleware can only rewrite between routes that are both dynamically rendered. By default, Astro tries to pre-render pages as static HTML. So I was trying to rewrite from a dynamic request to a static file." } ], "markDefs": [], "style": "normal" }, { "_key": "953914a08113", "_type": "block", "children": [ { "_key": "5af8e226c033", "_type": "span", "marks": [], "text": "That doesn’t work." } ], "markDefs": [], "style": "normal" }, { "_key": "8a114bd24c25", "_type": "block", "children": [ { "_key": "124fa4489fec", "_type": "span", "marks": [], "text": "The fix was simple — but not obvious at first." 
} ], "markDefs": [], "style": "normal" }, { "_key": "5a043ee06325", "_type": "block", "children": [ { "_key": "53c57f5b01f8", "_type": "span", "marks": [], "text": "I had to make my blog post page dynamic as well." } ], "markDefs": [], "style": "normal" }, { "_key": "f12ca2199368", "_type": "block", "children": [ { "_key": "7e29a3f26158", "_type": "span", "marks": [ "strong" ], "text": "File:" }, { "_key": "0e9e9d9ec02a", "_type": "span", "marks": [], "text": " src/pages/insights/[slug].astro" } ], "markDefs": [], "style": "normal" }, { "_key": "264750f6f9d7", "_type": "block", "children": [ { "_key": "dbcf891d06c8", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "4fa051b5177f", "_type": "code", "code": "---\n\nimport PortableText from \"../../components/PortableText.astro\";\n\nimport ReadMore from \"../../components/ReadMore.astro\";\n\nimport FAQSection from \"../../components/FAQSection.astro\";\n\nimport KeyTakeaways from \"../../components/KeyTakeaways.astro\";\n\nexport const prerender = false;\n\nexport async function getStaticPaths() {\n\nconst { data: posts } = await loadQuery({\n\nquery: `*[_type == \"post\"]`,\n\n});\n\n...\n\n}\n\n---", "language": "typescript" }, { "_key": "334763d3d065", "_type": "block", "children": [ { "_key": "adebec479520", "_type": "span", "marks": [], "text": "Once prerender = false is added, Astro ignores getStaticPaths() for static generation and switches the page to on-demand rendering." 
} ], "markDefs": [], "style": "normal" }, { "_key": "328e46cc29be", "_type": "block", "children": [ { "_key": "9d9cb544a245", "_type": "span", "marks": [], "text": "Now both:" } ], "markDefs": [], "style": "normal" }, { "_key": "0dab0e1e75a8", "_type": "block", "children": [ { "_key": "bbbec48c69fd", "_type": "span", "marks": [], "text": "/insights/[slug]" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "a973c9dcafd4", "_type": "block", "children": [ { "_key": "d55b3d142f8f", "_type": "span", "marks": [], "text": "/markdown/insights/[slug].md" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "8203591af5a1", "_type": "block", "children": [ { "_key": "262663613db4", "_type": "span", "marks": [], "text": "are dynamic." } ], "markDefs": [], "style": "normal" }, { "_key": "b9d425dc8282", "_type": "block", "children": [ { "_key": "eee360cdcc56", "_type": "span", "marks": [], "text": "That unlocks middleware rewriting." } ], "markDefs": [], "style": "normal" }, { "_key": "dda39eb29f27", "_type": "block", "children": [ { "_key": "ec0b717799db", "_type": "span", "marks": [ "strong" ], "text": "Step 3: The Middleware" } ], "markDefs": [], "style": "h2" }, { "_key": "b0cfaa416f00", "_type": "block", "children": [ { "_key": "2fcb935de626", "_type": "span", "marks": [], "text": "This is where everything comes together. 
I created a middleware file that intercepts requests and decides what to do based on:" } ], "markDefs": [], "style": "normal" }, { "_key": "f6f2201bf4be", "_type": "block", "children": [ { "_key": "d4870fefc088", "_type": "span", "marks": [], "text": "The URL" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "ba9774778f2b", "_type": "block", "children": [ { "_key": "4270d736df7c", "_type": "span", "marks": [], "text": "The Accept header" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "87f0994edb86", "_type": "block", "children": [ { "_key": "80565963c37b", "_type": "span", "marks": [ "strong" ], "text": "File:" }, { "_key": "3b24ff2ce820", "_type": "span", "marks": [], "text": " src/middleware.ts" } ], "markDefs": [], "style": "normal" }, { "_key": "129af56a0b8d", "_type": "code", "code": "import { defineMiddleware } from 'astro:middleware';\n\nexport const onRequest = defineMiddleware((context, next) => {\n\nconst { url, request } = context;\n\nconst { pathname } = url;\n\n// Rule 1: If URL ends with .md\n\nif (pathname.startsWith('/insights/') && pathname.endsWith('.md')) {\n\nconst slug = pathname.replace('/insights/', '').replace('.md', '');\n\nconst newPath = `/markdown/insights/${slug}.md`;\n\nreturn context.rewrite(newPath);\n\n}\n\n// Rule 2: If Accept header requests markdown\n\nif (pathname.startsWith('/insights/') && !pathname.endsWith('.md')) {\n\nconst acceptHeader = request.headers.get('accept');\n\nif (acceptHeader && acceptHeader.includes('text/markdown')) {\n\nconst slug = pathname.replace('/insights/', '');\n\nconst newPath = `/markdown/insights/${slug}.md`;\n\nreturn context.rewrite(newPath);\n\n}\n\n}\n\nreturn next();\n\n});", "language": "typescript" }, { "_key": "838cae41220c", "_type": "block", "children": [ { "_key": "a3f7cdfb42e1", "_type": "span", "marks": [], "text": "What this does:" } ], "markDefs": [], "style": "normal" }, { "_key": "7f1f132f3601", "_type": 
"block", "children": [ { "_key": "12175c5ba03b", "_type": "span", "marks": [], "text": "/insights/post.md → always rewritten to Markdown handler" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "ff9d1096fdf4", "_type": "block", "children": [ { "_key": "90cdceb9b715", "_type": "span", "marks": [], "text": "/insights/post + Accept: text/markdown → rewritten" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "182f95fa124d", "_type": "block", "children": [ { "_key": "389e13227767", "_type": "span", "marks": [], "text": "Everything else → continues as normal HTML" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "bd6b3115763e", "_type": "block", "children": [ { "_key": "3f4911ae1a16", "_type": "span", "marks": [], "text": "One canonical URL. Two formats. Clean separation." } ], "markDefs": [], "style": "normal" }, { "_key": "6a97c1efc584", "_type": "block", "children": [ { "_key": "8c4b6422d7b1", "_type": "span", "marks": [ "strong" ], "text": "Step 4: Testing Everything" } ], "markDefs": [], "style": "h2" }, { "_key": "f1fa4a31e485", "_type": "block", "children": [ { "_key": "20f58b69c811", "_type": "span", "marks": [], "text": "I verified both behaviors just by adjusting the accept header (you can also add the specific UA):" } ], "markDefs": [], "style": "normal" }, { "_key": "022654b410bb", "_type": "block", "children": [ { "_key": "e44a96f84021", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "76aa21af561c", "_type": "block", "children": [ { "_key": "02bd3a9b5035", "_type": "span", "marks": [], "text": "# Ask for markdown" } ], "markDefs": [], "style": "normal" }, { "_key": "4d08d7a41804", "_type": "code", "code": "curl -L -H \"Accept: text/markdown\" http://localhost:4321/insights/your-post", "language": "sh" }, { "_key": "e2c13e538744", "_type": "block", "children": [ { "_key": "3b9c717aef28", "_type": "span", 
"marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "fee39eadf5e8", "_type": "block", "children": [ { "_key": "9bfc66b5ddcf", "_type": "span", "marks": [], "text": "# Ask for HTML" } ], "markDefs": [], "style": "normal" }, { "_key": "9e0afeb0e9e0", "_type": "code", "code": "curl -L -H \"Accept: text/html\" http://localhost:4321/insights/your-post", "language": "sh" }, { "_key": "12035e88ae51", "_type": "block", "children": [ { "_key": "715560c1c309", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "ff6b82713557", "_type": "block", "children": [ { "_key": "2c8352fe8ae9", "_type": "span", "marks": [], "text": "Final Considerations" } ], "markDefs": [], "style": "h2" }, { "_key": "7f5267783fb9", "_type": "block", "children": [ { "_key": "0587847742f9", "_type": "span", "marks": [], "text": "Serving the raw Markdown version of the blog post instead of the full HTML results in a token reduction of approximately 96%. This represents a significant saving in both the computational cost and time required for an AI agent to process the page's core content." 
} ], "markDefs": [], "style": "normal" }, { "_key": "b5e2428da965", "_type": "block", "children": [ { "_key": "1481b8bb0eba", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "47be5214df83", "_type": "image", "alt": "HTML vs Markdown savings", "asset": { "_ref": "image-7ef80f5551ddd3f0071539b41e397d14ea22109e-652x126-png", "_type": "reference" } }, { "_key": "710c491ff079", "_type": "block", "children": [ { "_key": "90701697e210", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "b9a1b33d1247", "_type": "block", "children": [ { "_key": "01080cca3b07", "_type": "span", "marks": [ "strong" ], "text": "Approximate Tokens Saved:" }, { "_key": "52d46fe6a35b", "_type": "span", "marks": [], "text": " 32,021" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "e22b410fcd80", "_type": "block", "children": [ { "_key": "68b28582e4fb", "_type": "span", "marks": [ "strong" ], "text": "Reduction Percentage:" }, { "_key": "f92d04a4a0f6", "_type": "span", "marks": [], "text": " " }, { "_key": "1e7dcd21ddba", "_type": "span", "marks": [ "strong" ], "text": "~96%" } ], "level": 1, "listItem": "bullet", "markDefs": [], "style": "normal" }, { "_key": "f3975981b33b", "_type": "block", "children": [ { "_key": "deeceaeb976e", "_type": "span", "marks": [], "text": "" } ], "markDefs": [], "style": "normal" }, { "_key": "71641e23198c", "_type": "block", "children": [ { "_key": "890871c0e3c7", "_type": "span", "marks": [], "text": "While there are significant computational gains from token savings, it's clear that Googlebot currently prefers to fetch and render a page just as a human user would. Although this standard may evolve, for now, I have instructed Googlebot via robots.txt not to access the raw Markdown versions." 
} ], "markDefs": [], "style": "normal" }, { "_key": "3009dc417dcc", "_type": "code", "code": "# Default for all bots (except Googlebot)\nUser-agent: *\nDisallow: /studio/\nSitemap: https://carolinescholles.com/sitemap-index.xml\nSitemap: https://carolinescholles.com/sitemap.md\n\n# Specific rules for Googlebot\nUser-agent: Googlebot\nDisallow: /studio/\nDisallow: /insights/*.md$\nDisallow: /markdown/\nDisallow: /sitemap.md\nSitemap: https://carolinescholles.com/sitemap-index.xml", "filename": "robots.txt" } ], "categories": [ { "title": "Agentic Commerce & AI" } ], "faqs": [ { "_key": "d4edbab7ccdd83e1e88abc858e28cb59", "answer": "Markdown significantly reduces token consumption—often by up to 80%—by stripping away HTML boilerplate like scripts, styles, and nested tags. This allows AI models to process the semantic content more accurately and fit more information into their context windows.", "question": "How does serving Markdown improve AI efficiency?" }, { "_key": "a2f0a00c0131c720df85928446c7b808", "answer": "Currently, no. Search engines like Google prioritize the full HTML version to understand page layout and user experience. While it helps AI agents, Google representatives have expressed skepticism, noting that bots may struggle to parse links or structure correctly in Markdown compared to standard HTML.", "question": "Is serving Markdown to bots considered a good SEO practice?" }, { "_key": "63ae237a3ff504551b5d28fc97ed3a92", "answer": "Modern developer-focused AI agents, such as Claude Code and OpenCode, are among the first to explicitly include 'text/markdown' in their headers. Most general-purpose crawlers, like GPTBot, still default to traditional HTML for the time being.", "question": "Which AI tools actually use the Markdown 'Accept' header?" }, { "_key": "297ebb46819b0eca2b1ddccaa9a05c48", "answer": "The primary trade-off is architectural complexity. Sites often have to switch from static pre-rendering to dynamic, on-demand rendering. 
Additionally, developers must carefully manage caching using the 'Vary: accept' header to ensure browsers don't accidentally receive Markdown intended for bots.", "question": "What are the main technical challenges of implementing this?" } ], "mainImage": null, "publishedAt": "2026-02-24T16:18:00.000Z", "references": [ { "_key": "67ffedd4694a", "_type": "referenceModule", "anchorText": "webMCP: Efficient AI-Native Client-Side Interaction for Agent-Ready Web Design", "url": "https://arxiv.org/abs/2508.09171" }, { "_key": "b989940691ba", "_type": "referenceModule", "anchorText": "Markdown Routes with Next.js", "url": "https://www.sanity.io/learn/course/markdown-routes-with-nextjs" }, { "_key": "de93805ca823", "_type": "referenceModule", "anchorText": "Cloudflare just shipped HTML-to-markdown for AI agents. I built it from the content layer instead.", "url": "https://www.reddit.com/r/sanity_io/comments/1r46g0s/cloudflare_just_shipped_htmltomarkdown_for_ai/" }, { "_key": "b577fdfbb4e9", "_type": "referenceModule", "anchorText": "eal-time content conversion to Markdown at the source using content negotiation headers", "url": "https://x.com/Cloudflare/status/2021955521213800489" }, { "_key": "75cec5bf6210", "_type": "referenceModule", "anchorText": "What are tokens and how to count them?", "url": "https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them" } ], "seo": { "title": "Serving Markdown version of pages to AI: Is it worth it?" 
}, "slug": { "_type": "slug", "current": "serving-markdown-version-of-pages-to-ai-is-it-worth-it" }, "takeaways": [ "Serving Markdown to AI agents can reduce token usage by up to 80%, maximizing LLM reasoning capacity.", "Content negotiation via the 'Accept' header allows a single URL to provide format-specific content for both humans and machines.", "Leading coding assistants like Claude Code are already adopting this standard to improve documentation ingestion.", "Transitioning to Markdown serving often requires moving away from static site generation toward dynamic, server-side rendering.", "While excellent for computational efficiency, this approach is currently an experimental optimization rather than an SEO strategy." ], "title": "Serving Markdown version of pages to AI: Is it worth it?" }

Recently, Cloudflare released a new feature that enables websites to serve Markdown versions of pages to AI agents via content negotiation. But is this actually a good practice?

I applied a similar strategy on my personal website and did some research to better understand the advantages and disadvantages of making Markdown available through content negotiation.

Why consider Markdown for agents?

Before going through the implementation process, there are a few reasons why this approach should, or should not, be considered.

Cloudflare states that some agents, such as Claude Code and OpenCode, send requests with headers like:

{ headers: { Accept: "text/markdown, text/html" } }

This indicates that these agents can accept Markdown and may prioritize it if available.
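
As a rough illustration of what "may prioritize it" means on the server side, here is a minimal sketch that parses an Accept header into media types with their q-values (the quality weights HTTP defines for content negotiation). `parseAccept` and `prefersMarkdown` are hypothetical helper names, not part of any library. The sketch ignores wildcards such as `*/*`, and when weights tie it falls back to header order, which is a common convention rather than something HTTP requires.

```typescript
// Hypothetical helpers for illustration; not taken from the site or any library.
interface AcceptEntry {
  type: string; // media type, e.g. "text/markdown"
  q: number; // quality weight between 0 and 1 (defaults to 1)
}

// Parse an Accept header into entries sorted by descending q-value.
// Array.prototype.sort is stable, so entries with equal q keep header order.
function parseAccept(header: string): AcceptEntry[] {
  return header
    .split(",")
    .map((part) => {
      const [type, ...params] = part.trim().split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.toLowerCase().startsWith("q="));
      const parsed = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { type: type.toLowerCase(), q: Number.isNaN(parsed) ? 1 : parsed };
    })
    .filter((entry) => entry.type !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q);
}

// True when text/markdown is listed and sorts ahead of text/html.
function prefersMarkdown(header: string | null): boolean {
  if (!header) return false;
  const entries = parseAccept(header);
  const md = entries.findIndex((e) => e.type === "text/markdown");
  if (md === -1) return false;
  const html = entries.findIndex((e) => e.type === "text/html");
  return html === -1 || md < html;
}

console.log(prefersMarkdown("text/markdown, text/html"));
console.log(prefersMarkdown("text/html, text/markdown;q=0.5"));
```

A check like this is stricter than a plain substring match: a client that sends `text/markdown;q=0` is explicitly refusing Markdown, and the sketch treats it that way.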

However, this is not the case for many conventional crawlers such as Googlebot, and even OpenAI’s standard crawlers. I haven’t audited the exact Accept headers of every crawler or agent, but it’s generally understood that most agents still prioritize the browser version of the page, which is the full HTML.

If you explore Cloudflare Radar’s AI bot data, you can see how early and fragmented this ecosystem still is.

At this stage, most user agents do not appear to actively negotiate Markdown over HTML. In practice, the majority still consume full HTML documents.

Markdown for agents isn't an SEO practice

Where this becomes interesting is not SEO, but efficiency. With recent discussions around WebMCP, we’re starting to see a broader shift in how agents interact with web content. WebMCP itself is a separate concept, but it touches on a similar principle: reducing unnecessary overhead in machine-to-web interactions.

Processing full HTML documents is not particularly efficient for LLMs. In fact, it consumes significantly more tokens than serving a clean Markdown file.

The main argument in favor of Markdown here is computational efficiency:

1. Input Tokens: as a rule of thumb, 1 token corresponds to approximately 4 characters of common English text. An HTML page contains far more than just readable content. It includes markup, layout structure, scripts, styles, navigation, and metadata. All of that contributes to token usage.

2. Processing Complexity: a web page contains many more structural elements than the informational content itself. Even if an agent ultimately extracts only the meaningful text, it still has to process the entire document first. That introduces computational overhead.

3. Context Window Utilization: serving Markdown instead provides a cleaner, more direct representation of the content and makes more efficient use of the context window. By delivering only the content that matters, you potentially maximize the model’s reasoning capacity.
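
The 4-characters-per-token rule of thumb above can be turned into a quick back-of-the-envelope comparison. This is a sketch only: `estimateTokens` and `estimateReduction` are hypothetical helpers, the sample strings are made up, and a real tokenizer will give different counts.

```typescript
// Rough heuristic: about 4 characters per token (an approximation, not a tokenizer).
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Fraction of estimated tokens saved by sending `markdown` instead of `html`.
function estimateReduction(html: string, markdown: string): number {
  const htmlTokens = estimateTokens(html);
  if (htmlTokens === 0) return 0;
  return 1 - estimateTokens(markdown) / htmlTokens;
}

// Made-up sample: the same content wrapped in HTML markup, and as Markdown.
const sampleHtml =
  '<div class="post"><nav><a href="/">Home</a></nav><h2 class="title">Hello</h2><p>Some <strong>bold</strong> text.</p></div>';
const sampleMarkdown = "## Hello\n\nSome **bold** text.";

console.log(estimateTokens(sampleHtml), estimateTokens(sampleMarkdown));
console.log(estimateReduction(sampleHtml, sampleMarkdown).toFixed(2));
```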

This discussion is primarily about content consumption efficiency, not about page interaction or search engine optimization. We’re not diving deeply into WebMCP here. Instead, it’s about offering a more efficient representation of the same content to agents that explicitly request Markdown instead of HTML.

At the moment, this remains experimental. Most agents still default to HTML. But as LLM web interactions evolve, exploring cleaner content delivery formats may become increasingly relevant, especially if the goal is to improve response quality through better context efficiency.

The Trade-offs of Content Negotiation

By implementing Markdown via content negotiation, I’m effectively providing a lower-token version of my website. But this doesn’t come without trade-offs.

The first change is architectural. With content negotiation in place, my blog posts are now rendered on demand by the server. The site is no longer fully pre-rendered as static HTML. That introduces additional complexity. For example, I can’t rely on the standard @astrojs/sitemap package without adjustments. What used to be straightforward static generation now requires more intentional configuration.

Caching also becomes more nuanced. I’m serving two representations from the same URL: HTML for browsers and traditional crawlers, and Markdown for agents that explicitly request it. Serving tailored content from a single URL isn’t unusual (it’s how content negotiation is supposed to work), but it does require careful handling to avoid cache inconsistencies, such as a cache returning the Markdown variant to a browser.
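
One common way to keep caches from mixing up the two representations is to add `Vary: Accept` to every response for a negotiated URL, so a shared cache stores the HTML and Markdown variants separately. The sketch below uses the standard Fetch API `Response` and `Headers` objects. `withVaryAccept` is a hypothetical helper, and whether a given CDN honors `Vary` for this header is something to verify for your own setup.

```typescript
// Hypothetical helper: return a copy of `response` whose Vary header includes "Accept".
function withVaryAccept(response: Response): Response {
  const headers = new Headers(response.headers);
  const existing = headers.get("Vary");
  const values = existing
    ? existing.split(",").map((v) => v.trim().toLowerCase())
    : [];

  // Leave the header alone if it already covers Accept (or varies on everything).
  if (!values.includes("accept") && !values.includes("*")) {
    headers.append("Vary", "Accept");
  }

  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}

const markdownResponse = withVaryAccept(
  new Response("# Hello", {
    headers: { "Content-Type": "text/markdown; charset=utf-8" },
  }),
);
console.log(markdownResponse.headers.get("Vary"));
```

In an Astro middleware, a helper like this could wrap the response returned by `next()` so that both the HTML and the Markdown variants carry the header.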

It’s also important to clarify who this benefits.

Traditional crawlers are largely unaffected. Search engines still crawl links from page to page, download HTML, render the page much like a browser would, and evaluate structure, layout, performance, and discoverability. Their concern is whether the content can be indexed correctly and whether it reflects what a user sees.

AI agents operate differently. They are not trying to render a page visually. They ingest its content and process it through a language model. In that context, what matters most is the semantic content itself (the meaning of the text) and how efficiently it can be processed. Fewer tokens mean lower computational cost. Cleaner structure means less ambiguity. Markdown naturally aligns with those priorities because it removes structural noise and presents content in a more direct form.

The key difference, then, is this: search engines care about rendering and indexing. AI agents care about semantic extraction and efficiency. Understanding that distinction is essential before deciding whether introducing content negotiation is worth the added complexity.

Step-by-Step: How I Added a Markdown Version to My Astro + Sanity Website

I recently added a Markdown representation of my blog posts to my personal website (Astro + Sanity). I wanted to see what it looks like to offer a low-token version of the same content for clients that explicitly ask for Markdown via content negotiation.

This post is a walkthrough of exactly what I implemented, and the few trade-offs I ran into along the way.

Prerequisites (What My Setup Looked Like)

Before I changed anything, my site already had:

A working Astro project connected to a Sanity.io backend
Blog posts routed with Astro’s file-based routing (e.g., /insights/[slug])
A standard HTML blog page rendering Sanity content

If your setup is similar, you’ll be able to follow this pretty closely.

Step 1: Create the Markdown API Endpoint

The first thing I needed was a dedicated route that:

Fetches content from Sanity

Converts Portable Text to Markdown

Returns it with the correct Content-Type

For organization, I placed this inside a markdown directory.

File: src/pages/markdown/insights/[slug].md.ts

import type { APIRoute } from "astro";
import type { SanityDocument } from "@sanity/client";
import { portableTextToMarkdown } from "@portabletext/markdown";
import { loadQuery } from "../../../sanity/lib/load-query";

// This route is rendered on demand at request time.
export const prerender = false;

interface Post extends SanityDocument {
  body: any; // Portable Text blocks
}

export const GET: APIRoute = async ({ params }) => {
  const { slug } = params;

  try {
    // Fetch only the body of the post that matches the slug.
    const { data: post } = await loadQuery<Post>({
      query: `*[_type == "post" && slug.current == $slug][0]{ body }`,
      params: { slug },
    });

    if (!post || !post.body) {
      return new Response("Not found", { status: 404 });
    }

    // Convert the Portable Text body to a Markdown string.
    const markdown = portableTextToMarkdown(post.body);

    return new Response(markdown, {
      status: 200,
      headers: {
        "Content-Type": "text/markdown; charset=utf-8",
      },
    });
  } catch (error) {
    // Log the failure so it shows up in the server logs.
    console.error(error);
    return new Response("Internal Server Error", { status: 500 });
  }
};

This route always returns Markdown. It’s clean, explicit, and easy to test.

The important detail here is:

export const prerender = false;

That tells Astro this route must run dynamically at request time.
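
To make the Portable Text to Markdown step in the route above less of a black box, here is a deliberately simplified sketch of the idea. It is not how `@portabletext/markdown` is implemented. `renderSpan`, `toMarkdown`, and the types are hypothetical, and the sketch handles only heading and paragraph blocks, `strong` marks, and link annotations.

```typescript
// Simplified Portable Text shapes (only the fields this sketch uses).
interface Span { _type: "span"; text: string; marks: string[] }
interface MarkDef { _key: string; _type: string; href?: string }
interface Block { _type: "block"; style: string; children: Span[]; markDefs: MarkDef[] }

// Render one span, applying "strong" and any link annotation found in markDefs.
function renderSpan(span: Span, markDefs: MarkDef[]): string {
  let out = span.text;
  if (out === "") return out;
  if (span.marks.includes("strong")) out = `**${out}**`;
  for (const mark of span.marks) {
    const def = markDefs.find((d) => d._key === mark && d._type === "link");
    if (def?.href) out = `[${out}](${def.href})`;
  }
  return out;
}

// Convert blocks to Markdown: h1-h6 styles get a "#" prefix, anything else is a paragraph.
function toMarkdown(blocks: Block[]): string {
  return blocks
    .map((block) => {
      const text = block.children.map((s) => renderSpan(s, block.markDefs)).join("");
      const heading = /^h([1-6])$/.exec(block.style);
      return heading ? `${"#".repeat(Number(heading[1]))} ${text}` : text;
    })
    .filter((line) => line.trim() !== "") // drop empty spacer blocks
    .join("\n\n");
}

const sampleBlocks: Block[] = [
  {
    _type: "block",
    style: "h2",
    children: [{ _type: "span", text: "Why consider Markdown?", marks: [] }],
    markDefs: [],
  },
  {
    _type: "block",
    style: "normal",
    children: [
      { _type: "span", text: "Read ", marks: [] },
      { _type: "span", text: "the announcement", marks: ["link1"] },
      { _type: "span", text: ".", marks: [] },
    ],
    markDefs: [{ _key: "link1", _type: "link", href: "https://example.com" }],
  },
];

console.log(toMarkdown(sampleBlocks));
```

The point is only that the structured blocks already carry the semantics (heading level, emphasis, link targets), so the Markdown can be produced from the content layer without parsing rendered HTML.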

Step 2: Forcing Dynamic Rendering

At first, I ran into a ForbiddenRewrite error.

The reason? Astro middleware can only rewrite between routes that are both dynamically rendered. By default, Astro tries to pre-render pages as static HTML. So I was trying to rewrite from a dynamic request to a static file.

That doesn’t work.

The fix was simple, but not obvious at first.

I had to make my blog post page dynamic as well.

File: src/pages/insights/[slug].astro

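---
import PortableText from "../../components/PortableText.astro";
import ReadMore from "../../components/ReadMore.astro";
import FAQSection from "../../components/FAQSection.astro";
import KeyTakeaways from "../../components/KeyTakeaways.astro";
// Import path inferred from the Step 1 route; adjust it to your project.
import { loadQuery } from "../../sanity/lib/load-query";

// Render this page on demand instead of at build time.
export const prerender = false;

// Left over from the static version; not used for static generation once prerender is false.
export async function getStaticPaths() {
  const { data: posts } = await loadQuery({
    query: `*[_type == "post"]`,
  });
  ...
}
---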

Once prerender = false is added, Astro ignores getStaticPaths() for static generation and switches the page to on-demand rendering.

Now both:

/insights/[slug]
/markdown/insights/[slug].md

are dynamic.

That unlocks middleware rewriting.

Step 3: The Middleware

This is where everything comes together. I created a middleware file that intercepts requests and decides what to do based on:

The URL
The Accept header

File: src/middleware.ts

import { defineMiddleware } from 'astro:middleware';

export const onRequest = defineMiddleware((context, next) => {
  const { url, request } = context;
  const { pathname } = url;

  // Rule 1: If URL ends with .md
  if (pathname.startsWith('/insights/') && pathname.endsWith('.md')) {
    // Strip the prefix and the trailing ".md" only (slice leaves any ".md" inside the slug alone).
    const slug = pathname.slice('/insights/'.length, -'.md'.length);
    const newPath = `/markdown/insights/${slug}.md`;
    return context.rewrite(newPath);
  }

  // Rule 2: If Accept header requests markdown
  if (pathname.startsWith('/insights/') && !pathname.endsWith('.md')) {
    const acceptHeader = request.headers.get('accept');
    if (acceptHeader && acceptHeader.includes('text/markdown')) {
      const slug = pathname.slice('/insights/'.length);
      const newPath = `/markdown/insights/${slug}.md`;
      return context.rewrite(newPath);
    }
  }

  // Everything else continues to the normal HTML page.
  return next();
});

What this does:

/insights/post.md → always rewritten to Markdown handler
/insights/post + Accept: text/markdown → rewritten
Everything else → continues as normal HTML

One canonical URL. Two formats. Clean separation.
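
The three cases above can be checked without starting a server by pulling the routing decision into a pure function. This is a sketch, not the site's actual code: `resolveMarkdownPath` is a hypothetical name, it deliberately ignores q-values (mirroring the simple `includes` check in the middleware), and skipping the bare `/insights/` index is an assumption of the sketch.

```typescript
const PREFIX = "/insights/";

// Returns the internal Markdown route to rewrite to, or null to serve HTML as usual.
function resolveMarkdownPath(pathname: string, accept: string | null): string | null {
  if (!pathname.startsWith(PREFIX)) return null;

  const rest = pathname.slice(PREFIX.length);
  if (rest === "") return null; // assumption: the /insights/ index has no Markdown version

  // Case 1: explicit .md URL.
  if (rest.endsWith(".md")) return `/markdown/insights/${rest}`;

  // Case 2: the Accept header mentions text/markdown.
  if (accept !== null && accept.toLowerCase().includes("text/markdown")) {
    return `/markdown/insights/${rest}.md`;
  }

  // Case 3: everything else stays on the HTML page.
  return null;
}

console.log(resolveMarkdownPath("/insights/post.md", null));
console.log(resolveMarkdownPath("/insights/post", "text/markdown, text/html"));
console.log(resolveMarkdownPath("/insights/post", "text/html"));
```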

Step 4: Testing Everything

I verified both behaviors by changing only the Accept header (you can also set a specific User-Agent):

# Ask for markdown
curl -L -H "Accept: text/markdown" http://localhost:4321/insights/your-post

# Ask for HTML
curl -L -H "Accept: text/html" http://localhost:4321/insights/your-post

Final Considerations

Serving the raw Markdown version of the blog post instead of the full HTML results in a token reduction of approximately 96%. This represents a significant saving in both the computational cost and time required for an AI agent to process the page's core content.

Approximate Tokens Saved: 32,021
Reduction Percentage: ~96%
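
For a sense of scale, the two figures above are enough to back out approximate totals. The sketch below is plain arithmetic under the assumption that the reduction is exactly 96%. Since the measurement is only given as "~96%", the results are rough. `backOutTotals` is a hypothetical helper.

```typescript
// Given tokens saved and the fractional reduction, estimate the original and reduced totals.
// Uses: tokensSaved = htmlTokens * reduction.
function backOutTotals(tokensSaved: number, reduction: number) {
  const htmlTokens = Math.round(tokensSaved / reduction);
  const markdownTokens = htmlTokens - tokensSaved;
  return { htmlTokens, markdownTokens };
}

const totals = backOutTotals(32021, 0.96);
console.log(totals.htmlTokens, totals.markdownTokens);
```

Under that assumption, the HTML version comes out at roughly 33,400 tokens and the Markdown version at roughly 1,300.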

While there are significant computational gains from token savings, it's clear that Googlebot currently prefers to fetch and render a page just as a human user would. Although this standard may evolve, for now, I have instructed Googlebot via robots.txt not to access the raw Markdown versions.

robots.txt

# Default for all bots (except Googlebot)
User-agent: *
Disallow: /studio/
Sitemap: https://carolinescholles.com/sitemap-index.xml
Sitemap: https://carolinescholles.com/sitemap.md

# Specific rules for Googlebot
User-agent: Googlebot
Disallow: /studio/
Disallow: /insights/*.md$
Disallow: /markdown/
Disallow: /sitemap.md
Sitemap: https://carolinescholles.com/sitemap-index.xml
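
To sanity-check what the Googlebot group actually blocks, the patterns can be emulated with a small matcher. This sketch assumes the wildcard semantics Google documents for robots.txt: `*` matches any run of characters, a trailing `$` anchors the end of the path, and rules otherwise match as prefixes. `patternToRegExp` and `isDisallowed` are hypothetical helpers, and the matcher ignores `Allow` rules and rule precedence.

```typescript
// Convert a robots.txt path pattern into a RegExp (prefix match unless it ends with "$").
function patternToRegExp(pattern: string): RegExp {
  const anchored = pattern.endsWith("$");
  const body = anchored ? pattern.slice(0, -1) : pattern;
  const escaped = body
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*"); // "*" matches any run of characters
  return new RegExp("^" + escaped + (anchored ? "$" : ""));
}

// The Disallow rules from the Googlebot group above.
const googlebotDisallow = ["/studio/", "/insights/*.md$", "/markdown/", "/sitemap.md"];

function isDisallowed(path: string): boolean {
  return googlebotDisallow.some((p) => patternToRegExp(p).test(path));
}

console.log(isDisallowed("/insights/your-post.md"));
console.log(isDisallowed("/insights/your-post"));
```

Note that these rules only cover the `.md` URLs and the internal `/markdown/` route. The canonical `/insights/your-post` URL stays crawlable, which is the version Googlebot fetches and renders.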
