AI SEO and Generative Search Optimization

Engineering that makes a site legible and quotable to AI crawlers and answer engines — llms.txt, named AI-bot policy in robots.txt, citation-friendly schema, auditable retrieval.

AI SEO is engineering, not content marketing with a new label. The work is making a site legible to AI crawlers, allowed under a measured per-bot policy rather than a blanket block, quotable with correct attribution, and auditable so a client can see which bots arrived and what they took. Most teams have never measured that.

Two pipelines, not one. Google AI Overviews are a search-results-page feature: Googlebot indexes the page and the Overview is generated at query time from that index. ChatGPT, Claude, and Perplexity citations are a different pipeline: pages must be reachable by the runtime crawler (ChatGPT-User, ClaudeBot, PerplexityBot) and structured for how that retrieval layer extracts and attributes content. Blocking one set does not affect the other. Ranking in one does not earn a citation in the other.

What I ship

  • Generated llms.txt. A plain-text Markdown overview at the site root, produced from the database or content management system per the llmstxt.org convention so it stays in sync with the live corpus. The convention is a community proposal authored by Jeremy Howard at Answer.AI; no major AI vendor has publicly committed to honoring it, and I say so before shipping it.
  • Named AI-crawler policy in robots.txt. Explicit per-bot directives for GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, Applebot-Extended, Bytespider, and CCBot. Wildcard rules are honored inconsistently across AI bots, so the policy names each user-agent explicitly, with the treatment chosen per bot: allow, throttle, or deny.
  • Citation-friendly schema. Article with a real Author node, Author with sameAs links to LinkedIn, GitHub, and ORCID where they exist, Organization, and FAQ. Schema that gives a retrieval layer something to attach a name and a source URL to.
  • Content structured for extraction. Semantic HTML, a clear H1 and H2 hierarchy, short attributable paragraphs, definitions before flourishes, and fact-dense opening sentences — written for a model with a finite context window.
  • AI Overviews and chatbot positioning. Audit which queries trigger AI Overviews and which trigger citations inside ChatGPT, Claude, or Perplexity. Identify the entity and attribute gaps. Ship the schema and template changes that close them, with the two pipelines tracked separately.
  • Auditability. Server-log dashboards showing which AI crawlers arrived, on what cadence, against which URLs. For most clients this is the first measured view of the AI-bot footprint on their infrastructure.
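
As a sketch of what a per-bot policy can look like, here is a robots.txt that allows the runtime crawlers that tend to cite back and denies the training-only crawlers. The allow/deny split shown is illustrative, not a fixed recommendation; it is chosen per client from the measured footprint:

```
# Runtime crawlers that can return citations: allow
User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /

# Training-only crawlers: deny
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /
```

One caveat worth stating to clients: Bytespider is widely reported to ignore robots.txt, so a deny line there records policy; enforcement, if needed, happens at the server or CDN layer.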

Where it fits

Invisible to ChatGPT and Claude

A prospect tells the founder they asked ChatGPT for vendors in the category and the company was not mentioned. I audit which runtime crawlers can actually reach the site, diagnose the schema and content gaps that prevent the retrieval layer from naming the brand, and ship the fixes that put the company in the candidate set. Whether a model cites the page is the model's call.

Bandwidth bleed from AI crawlers

AI crawlers are hammering the site and the operations team wants to block everything. I measure the actual footprint per bot, identify which crawlers cite back (PerplexityBot and ChatGPT-User often do) and which mostly consume bandwidth (Bytespider, CCBot), and ship a per-bot policy that allows the citation-friendly crawlers and denies the ones that pay nothing back.
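
Measuring the footprint per bot can start as a single pass over the access log. A minimal sketch, assuming combined-format log lines; the user-agent tokens are the ones named on this page, and the log path in the usage comment is illustrative:

```python
from collections import Counter

# AI crawler tokens to look for in the user-agent field of each log line.
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
    "Google-Extended", "PerplexityBot", "Applebot-Extended",
    "Bytespider", "CCBot",
]

def ai_bot_counts(log_lines):
    """Count requests per AI crawler across raw access-log lines."""
    counts = Counter()
    for line in log_lines:
        for bot in AI_BOTS:
            if bot in line:
                counts[bot] += 1
                break  # attribute each request to one bot token
    return counts

# Usage sketch (path is illustrative):
# with open("/var/log/nginx/access.log") as f:
#     print(ai_bot_counts(f).most_common())
```

A grouped-by-URL variant of the same loop is what feeds the dashboards mentioned above; the counting logic does not change.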

Displaced by an AI Overview

A category keyword now returns a Google AI Overview that summarizes competitor pages. The client wants to become one of the cited sources. I diagnose the entity and attribute gaps in the content, ship schema upgrades and template restructuring aligned with what the Overview is summarizing, and verify across the next crawl cycle.
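
The schema upgrades in a case like this usually center on giving the retrieval layer a resolvable author and publisher. A minimal Article graph of the shape described above; every name and URL here is a placeholder, not a real client's markup:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article headline",
  "author": {
    "@type": "Person",
    "name": "Jane Doe",
    "sameAs": [
      "https://www.linkedin.com/in/janedoe",
      "https://github.com/janedoe"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com"
  }
}
```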

How I work

Every engagement opens with a written audit across five surfaces: which AI-bot user-agents reach the site and at what cadence, what llms.txt and robots.txt look like today, what JSON-LD the top page types emit, how the content is structured for extraction, and where the brand currently appears across ChatGPT, Claude, Perplexity, and Google AI Overviews. The audit and a prioritized fix list ship before any code changes. The principal carrying the work is described on the about page.

Fixes land as reviewable pull requests. llms.txt is generated, not hand-edited. Schema is validated against the Rich Results Test and the Schema Markup Validator. The AI-bot policy is verified against the actual user-agents in the server logs after the next crawl cycle. The patterns on this page are running on the page you are reading, and the longer write-ups live in the research notes.
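
A generated llms.txt can be as small as a templated walk over the content records. A sketch of the generator, assuming page data already loaded from the CMS; per the llmstxt.org convention the output is an H1 title, a blockquote summary, then H2 sections of Markdown links:

```python
def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary,
    then H2 sections of Markdown link lines."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections:
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in pages:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Wiring this to the database query that lists live pages is what keeps the file in sync with the corpus; hand edits are what let it drift.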

What I will not promise

I will not promise an appearance in ChatGPT, an entry in Google AI Overviews, or a position in Perplexity's source list. Those outcomes are owned by retrieval layers I do not control, on a cadence that shifts every sixty to ninety days. The contract is the artifact — a generated llms.txt, a named AI-bot policy in robots.txt, a validated schema graph with author identity, a content template restructured for extraction, and an AI-crawler audit dashboard. Whether a given chatbot quotes the page on a given prompt is the chatbot's call.

Engagement model

AI-readiness audits run one to two weeks and deliver a written review of llms.txt, robots.txt, schema, content structure, and the current AI-surface footprint, with a prioritized remediation plan. Build-out engagements run three to six weeks and implement the llms.txt generator, the AI-bot policy, the schema upgrades, and the template restructuring. Quarterly reviews exist for landscapes that shift every sixty to ninety days. To scope an audit, get in touch.

AI SEO is the discovery-layer sibling to the primary AI consulting practice on this site and a peer to the Technical SEO Engineering service. Most engagements that need one end up needing both — different pipelines, scoped separately.

FREQUENTLY ASKED

Is llms.txt actually a standard?

No, it is a community proposal authored by Jeremy Howard at Answer.AI, published at llmstxt.org. No major AI vendor has publicly committed to honoring it. It is low-cost, forward-compatible, and demonstrates technical literacy — sold as that, not as a guaranteed citation channel.

Will blocking GPTBot hurt my Google ranking?

No. GPTBot is OpenAI's training crawler; Googlebot is independent of it. Likewise Google-Extended controls Gemini training use only and does not affect classic search. Blocking one and ranking in the other are decoupled, however often they get sold as a single lever by people who have not measured either.

How do I get cited in Google AI Overviews and ChatGPT?

Those are two different pipelines. AI Overviews are generated from the Google index — you need to rank well classically and be structured for extractive summarization. ChatGPT and Claude citations require a different runtime crawler to reach the page and a different content shape. Engineering for one does not earn the other.
