A test most business owners run once and never forget: open ChatGPT, type "what's a good [your category] in [your area]?" or "best [your industry] software for small business" — and watch yourself not show up. Your competitors get named. You don't. It's a quietly horrible feeling, because you can see the lost pipeline.

There are usually one or two specific reasons behind it. This guide walks through the seven most common, in roughly the order they cause damage. Work through them in sequence and most businesses see citation share improve inside 30–60 days.

Reason 1: AI crawlers can't read your site

Many sites still block AI-specific crawlers in robots.txt, sometimes deliberately, sometimes because a developer blocked them as a privacy gesture three years ago and never revisited the decision. If GPTBot, ClaudeBot, and PerplexityBot can't crawl you, they can't cite you. Period.

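How to check: visit yourdomain.com/robots.txt and look for a User-agent: GPTBot line followed by Disallow: /. If you see that pair for a crawler, that engine is blocked. A User-agent: * group with Disallow: / blocks every crawler that doesn't have its own group.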
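If you'd rather not read the file by eye, the same check can be scripted. The sketch below uses Python's standard-library robots.txt parser on a pasted copy of the file. The crawler list is copied from this guide, and SAMPLE is a made-up robots.txt for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Crawler names as listed in this guide; adjust to the engines you care about.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-Web", "PerplexityBot", "Google-Extended",
]

def blocked_crawlers(robots_txt: str, path: str = "/") -> list[str]:
    """Return the AI crawlers that this robots.txt forbids from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, path)]

# Hypothetical robots.txt: GPTBot is blocked, everyone else is allowed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(blocked_crawlers(SAMPLE))  # prints ['GPTBot']
```

Paste your own robots.txt in place of SAMPLE. An empty list means none of the listed crawlers is blocked at the site root.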

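How to fix: update robots.txt to explicitly allow the crawlers you care about. At minimum: GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-Web, PerplexityBot, Google-Extended. Note that Google-Extended is a robots.txt control token for Google's AI products rather than a separate crawler, but you set it the same way. (Reffed's own robots.txt is a working example.)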
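As a rough illustration of the fix, this sketch generates one explicit allow group per crawler, ready to paste into robots.txt. The function name allow_stanzas is ours, and the crawler list is the one from this guide. Allowing the whole site ("/") is an assumption; narrow the path if you only want some sections crawled.

```python
# Crawler names as listed in this guide.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-Web", "PerplexityBot", "Google-Extended",
]

def allow_stanzas(agents: list[str], path: str = "/") -> str:
    """Build one 'User-agent / Allow' group per crawler, separated by blank lines."""
    groups = [f"User-agent: {agent}\nAllow: {path}" for agent in agents]
    return "\n\n".join(groups) + "\n"

if __name__ == "__main__":
    print(allow_stanzas(AI_CRAWLERS))
```

Under the Robots Exclusion Protocol, a crawler follows the group that names it specifically in preference to the User-agent: * group, so these groups apply even if your catch-all rules are stricter. Remove any older Disallow group for the same crawler so the file doesn't contradict itself.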

Reason 2: Your content is hidden behind JavaScript

If your site is built with a JS-heavy framework (older React or Vue setups, single-page applications without SSR) and the actual content only appears after JavaScript runs, AI crawlers may see a nearly empty page. Some crawlers execute JS; many don't. The safe assumption: if it's not in the raw HTML response, it doesn't count.

How to check: right-click your homepage and choose "View Page Source." If the body is mostly empty or contains little more than <div id="root"></div>, your content is JS-rendered.
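To check many pages at once, you can approximate the "View Page Source" test in code: count the words that are visible in the raw HTML without running any JavaScript. This is a minimal sketch using only Python's standard library. The 50-word threshold and both sample pages are assumptions for illustration, and you would feed in HTML you have saved or fetched yourself.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect words from raw HTML, ignoring tags whose content is not page copy."""
    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())

def visible_word_count(raw_html: str) -> int:
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return len(parser.words)

def looks_js_rendered(raw_html: str, min_words: int = 50) -> bool:
    """Heuristic: very little text in the raw HTML suggests client-side rendering."""
    return visible_word_count(raw_html) < min_words

# Hypothetical samples: an empty SPA shell and a server-rendered page.
SPA_SHELL = (
    '<!doctype html><html><head><title>Acme</title>'
    '<script src="/app.js"></script></head><body><div id="root"></div>'
    '<noscript>Enable JavaScript to run this app.</noscript></body></html>'
)
SSR_PAGE = (
    "<html><body><h1>Acme Plumbing</h1><p>"
    + "Emergency boiler repair in Leeds. " * 12
    + "</p></body></html>"
)

if __name__ == "__main__":
    print(looks_js_rendered(SPA_SHELL), looks_js_rendered(SSR_PAGE))
```

A low word count is a hint, not proof: a page that is short by design will also trip the threshold, so treat flagged pages as candidates for a manual look.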

How to fix: migrate to server-side rendering (SSR) or static generation. Next.js, Astro, Nuxt, SvelteKit, and Hugo all do this natively. For React SPAs, adding SSR is non-trivial — sometimes it's worth rebuilding on a static framework instead. If migration isn't feasible immediately, pre-render at least your high-value commercial pages.

Reason 3: No structured data (schema)

JSON-LD schema is how you explicitly tell crawlers "this is a business, it does X, it's located in Y, it offers these services." Without schema, AI engines have to infer everything from prose, and they often get it wrong or skip you entirely. Sites with proper schema get cited noticeably more than sites without — the gap is one of the most reliable signals in the entire GEO space.

How to check: paste your URL into Google's Rich Results Test or Schema.org's Validator. You should see at minimum an Organization or LocalBusiness entry, plus a WebSite entry. Service businesses should also have Service and Offer schemas; e-commerce should have Product and Offer.
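The validators above are the authoritative check. If you want a quick pass over many pages first, the sketch below pulls the JSON-LD blocks out of raw HTML and reports which of the minimum types are missing. It matches type names exactly, so a LocalBusiness subtype such as Plumber would be reported as missing; that simplification is ours. SAMPLE_HTML is invented for illustration.

```python
import json
from html.parser import HTMLParser

class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False

def schema_types(raw_html: str) -> set[str]:
    """Return every @type found in the page's JSON-LD, including inside @graph."""
    collector = JsonLdCollector()
    collector.feed(raw_html)
    collector.close()
    found = set()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed block: skip it rather than crash
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                continue
            for node in item.get("@graph", [item]):
                kind = node.get("@type") if isinstance(node, dict) else None
                if isinstance(kind, str):
                    found.add(kind)
                elif isinstance(kind, list):
                    found.update(k for k in kind if isinstance(k, str))
    return found

def missing_minimum(types: set[str]) -> list[str]:
    """List which of the minimum entries named in this guide are absent."""
    missing = []
    if not types & {"Organization", "LocalBusiness"}:
        missing.append("Organization or LocalBusiness")
    if "WebSite" not in types:
        missing.append("WebSite")
    return missing

# Hypothetical page with an Organization block but no WebSite block.
SAMPLE_HTML = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}'
    "</script></head><body></body></html>"
)

if __name__ == "__main__":
    print(missing_minimum(schema_types(SAMPLE_HTML)))  # prints ['WebSite']
```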

How to fix: generate the right JSON-LD blocks and include them in your HTML head. The required minimum:

  • Organization or LocalBusiness — name, URL, logo, description, address (for local), contact info
  • WebSite — name, URL, potential search action
  • BreadcrumbList on internal pages
  • Article on blog posts (with author, publish date, modified date)
  • FAQPage wherever you have Q&A content
  • Product or Service on offering pages
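For a sense of what the first two entries in that list look like, here is a minimal sketch that builds Organization and WebSite blocks and wraps each in the script tag that goes in your HTML head. The business name, URLs, and description are placeholders, and the field set is deliberately small: address, contact info, and the search action are left out, so treat it as a starting point rather than a complete schema.

```python
import json

def organization_jsonld(name: str, url: str, logo: str, description: str) -> dict:
    """Minimal Organization entry; add address and contact info for a local business."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "description": description,
    }

def website_jsonld(name: str, url: str) -> dict:
    """Minimal WebSite entry (no search action)."""
    return {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "name": name,
        "url": url,
    }

def as_script_tag(data: dict) -> str:
    """Serialize to a JSON-LD script tag, escaping '</' so the tag can't close early."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

if __name__ == "__main__":
    # Placeholder values for illustration only.
    org = organization_jsonld(
        name="Example Plumbing Co.",
        url="https://www.example.com",
        logo="https://www.example.com/logo.png",
        description="Emergency plumbing and boiler repair for homeowners.",
    )
    site = website_jsonld("Example Plumbing Co.", "https://www.example.com")
    print(as_script_tag(org))
    print(as_script_tag(site))
```

Run the output through one of the validators mentioned above before deploying it.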

Reffed deploys all of these automatically when you connect a site. Most modern CMSes have plugins (RankMath, Yoast, Schema Pro for WordPress; Schema App for Shopify) that produce passable versions.

Reason 4: Weak or missing "about" framing

AI engines build a mental model of your business by reading your about page, your homepage, and the way other sites describe you. If those sources are inconsistent ("AI consultancy" on one page, "marketing agency" on another, "creative studio" on the third), the model gets confused and falls back to clearer alternatives. Same for sites that describe what they do in jargon ("a unique synthesis of strategic insight and creative execution") instead of plain category terms.

How to check: ask ChatGPT or Claude to summarize what your company does, in one sentence, using only your website. If the summary is vague, generic, or wrong, the model doesn't have a clean read on you.

How to fix: write a single sentence about your business that names (1) what you are (category), (2) who you serve (target customer), and (3) the outcome you produce. Use that sentence in your homepage hero, your about page, your meta description, your Organization schema, and your social profiles. Consistency compounds; inconsistency cancels out.
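One simple way to audit that consistency is to paste the copy from each surface into a script and check that every one uses the same category term. The sketch below does only that: a case-insensitive substring check. The surface names and text are invented examples.

```python
def surfaces_missing_category(surfaces: dict[str, str], category: str) -> list[str]:
    """Return the names of surfaces whose copy never mentions the category term."""
    needle = category.casefold()
    return [name for name, text in surfaces.items() if needle not in text.casefold()]

if __name__ == "__main__":
    # Hypothetical copy pasted from each place the business describes itself.
    surfaces = {
        "homepage hero": "Acme is an AI consultancy that helps retailers forecast demand.",
        "about page": "We are a creative studio blending strategy and execution.",
        "meta description": "AI consultancy for retailers who want accurate demand forecasts.",
    }
    print(surfaces_missing_category(surfaces, "AI consultancy"))  # prints ['about page']
```

It won't catch near-synonyms or vague phrasing, so the one-sentence summary test with ChatGPT or Claude is still worth running.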

Reason 5: You're not cited by sources AI engines trust

AI engines don't only read your site; they read everyone else's too. When third-party sources mention your business in the context of your category, that gets folded into the model's representation of you. Conversely, if your competitors are mentioned everywhere and you're mentioned nowhere, the model concludes your competitors are the answer to category queries.

This is partly traditional digital PR. The sources that move the needle most: industry-specific roundups and "best of" lists, niche directories, reputable trade publications, podcast appearances (transcripts get crawled), guest posts on category-relevant blogs, and Wikipedia where editorially appropriate.

How to check: search "[your category] tools 2026" or "best [your category] in [your city]" and see which third-party sites appear. Are you mentioned in those articles? If not, those are your highest-leverage outreach targets.

How to fix: identify 10–15 of the highest-authority third-party sources that AI engines clearly trust in your category. Pitch each one — guest post, expert quote, product inclusion, listicle submission. Aim for original contribution; don't beg for a mention. This is slow work that compounds. Reffed Watch helps you spot which third-party sources AI is citing in your category, so you know which ones to pitch.

Reason 6: Your content doesn't answer real questions

AI engines excerpt content. They lift sentences and short paragraphs that directly answer the prompt the user typed. Content written as marketing prose ("Our innovative platform empowers organizations to unlock unprecedented value...") gets ranked nowhere and excerpted by no one.

Content written as direct answers to specific questions gets surfaced repeatedly, because it's literally the shape the model is looking for.

How to check: open three or four of your most important pages. Are the H2 headings questions a customer would actually type? Is the first paragraph below each H2 a crisp, complete answer in 40–80 words? If your H2s are things like "Our Approach" and "Solutions," they're not earning citations.
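The same audit can be roughed out in code. This sketch reads a page's HTML, pairs each H2 with the first paragraph below it, and reports whether the heading looks like a question and whether the answer falls in the 40–80 word range. "Ends with a question mark" is a crude stand-in for "a question a customer would actually type", and SAMPLE_PAGE is filler written for this example.

```python
from html.parser import HTMLParser

class SectionCollector(HTMLParser):
    """Pair each <h2> with the first non-empty <p> that follows it."""

    def __init__(self):
        super().__init__()
        self.sections = []   # each item: [heading_text, first_paragraph_or_None]
        self._target = None  # tag currently being captured: "h2", "p", or None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._target, self._buffer = "h2", []
        elif (tag == "p" and self._target is None
              and self.sections and self.sections[-1][1] is None):
            self._target, self._buffer = "p", []

    def handle_data(self, data):
        if self._target:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._target:
            return
        text = " ".join("".join(self._buffer).split())
        if tag == "h2":
            self.sections.append([text, None])
        elif text:
            self.sections[-1][1] = text
        self._target = None

def audit_sections(raw_html: str, min_words: int = 40, max_words: int = 80) -> list[dict]:
    collector = SectionCollector()
    collector.feed(raw_html)
    collector.close()
    report = []
    for heading, paragraph in collector.sections:
        words = len((paragraph or "").split())
        report.append({
            "heading": heading,
            "is_question": heading.endswith("?"),  # crude heuristic
            "answer_words": words,
            "answer_in_range": min_words <= words <= max_words,
        })
    return report

# Hypothetical page: one vague heading, one question with a 45-word answer.
FILLER = " ".join(["This sentence is placeholder filler for the word count."] * 5)
SAMPLE_PAGE = (
    "<h2>Our Approach</h2><p>We innovate.</p>"
    "<h2>How much does a boiler service cost?</h2><p>" + FILLER + "</p>"
)

if __name__ == "__main__":
    for row in audit_sections(SAMPLE_PAGE):
        print(row)
```

Sections where is_question or answer_in_range is False are the first candidates for a rewrite.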

How to fix: rewrite headings as questions, and lead each section with a self-contained answer paragraph. Save the detail, qualifications, and examples for the paragraphs below. This pattern is sometimes called the "inverted pyramid" or "summary-first" structure, and it's what AI engines reward.

Reason 7: You don't exist at the prompt level

Sometimes the issue isn't your site — it's that nobody is asking AI engines questions where you'd be a natural answer. If you sell highly niche B2B software and AI users aren't yet researching that category through AI, no amount of optimization changes the volume.

How to check: brainstorm 30 queries a real customer might type. Run each through ChatGPT, Claude, and Perplexity. How many produce useful answers at all? If most produce vague, generic, or unhelpful responses across all engines, the category itself isn't AI-mature yet.

How to fix: two paths. First, optimize anyway — AI search adoption is growing fast in every category and the early-mover advantage compounds. Second, focus your content on the adjacent broader questions where AI is being used. If your specific product isn't queried much but the problem it solves is, write content about the problem. Be the source when AI explains the problem; you'll be cited when AI is asked about solutions.

The order to fix these in

If you only have a few hours:

  1. Fix robots.txt to allow AI crawlers (10 min).
  2. Add minimum-viable schema: Organization, WebSite, BreadcrumbList, plus FAQPage on any Q&A content (2–4 hours).
  3. Audit and tighten your "what we do" framing across homepage, about page, and meta descriptions (2 hours).

If you have a month:

  1. Rewrite your top 5 commercial pages around customer questions (8–12 hours, ideally with research).
  2. Identify 10 high-leverage third-party citation targets and start outreach.
  3. Publish 2–3 original-research or original-framework articles that other sites will want to cite.

How to measure progress

Pick 20 queries that a real customer would type. Run them against ChatGPT, Claude, and Perplexity once a month. Track three numbers: how many name you, how many name a direct competitor instead, how many produce no useful answer. A healthy program moves the first number up, the second down, and the third down (because as you build content, more queries become well-answered).
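A spreadsheet works fine for this, but if you prefer a script, the tally could look like the sketch below. You still run the queries and judge the answers by hand; the code only counts. The QueryResult fields and the sample rows are our own illustration, not real measurements.

```python
from dataclasses import dataclass

@dataclass
class QueryResult:
    query: str
    engine: str             # e.g. "ChatGPT", "Claude", "Perplexity"
    names_you: bool         # the answer mentions your business
    names_competitor: bool  # the answer mentions a direct competitor
    useful_answer: bool     # the answer was usable at all

def tally(results: list[QueryResult]) -> dict[str, int]:
    """Count the three numbers tracked in this guide for one monthly run."""
    return {
        "names_you": sum(r.names_you for r in results),
        "names_competitor_instead": sum(
            r.names_competitor and not r.names_you for r in results
        ),
        "no_useful_answer": sum(not r.useful_answer for r in results),
    }

if __name__ == "__main__":
    # Hypothetical results recorded by hand for one month.
    month = [
        QueryResult("best boiler repair in Leeds", "ChatGPT", True, True, True),
        QueryResult("best boiler repair in Leeds", "Claude", False, True, True),
        QueryResult("emergency plumber near me", "Perplexity", False, False, False),
    ]
    print(tally(month))
    # prints {'names_you': 1, 'names_competitor_instead': 1, 'no_useful_answer': 1}
```

Keep each month's tally and compare the three numbers over time rather than reading too much into any single run.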

Don't expect linear progress. Citation gains often come in clusters — a third-party mention lands, the model gets re-trained, and you suddenly appear in five new prompts. Then nothing for two weeks. The trend matters more than any single check.

See what's broken on your site

Reffed's free audit checks all seven of these areas plus 30 more. Full report in 60 seconds.
