Before you spend money on AI search optimization, find out whether AI assistants can even read your site. Most owners assume their site is "fine" because it loads in a browser and ranks somewhere in Google. Neither of those guarantees it's readable to ChatGPT, Claude, or Perplexity. We routinely run Reffed on sites the owner thought were optimized, only to find one of these five signals completely broken.
Here are the five free tests, in order of how often they catch real problems.
Test 1: Check robots.txt for AI crawler blocks
This is the test that catches the most "I had no idea" moments. Visit yourdomain.com/robots.txt in any browser. Scan the file for entries that look like this:
User-agent: GPTBot
Disallow: /
Or worse:
User-agent: *
Disallow: /
Both of these shut out AI crawlers. The first explicitly blocks GPTBot, the crawler OpenAI uses to gather content from the web. The second blocks every well-behaved bot, AI crawlers included; with that in place, your site is invisible to ChatGPT during browsing mode, period.
The bots that matter most: GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-Web, PerplexityBot, and Google-Extended. All of these should be allowed if you want AI citation. We cover the exact configuration in our robots.txt for AI crawlers guide.
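If you need a starting point, the permissive configuration is short. Here is one common way to write it (grouped User-agent lines share the single rule that follows); treat it as a sketch and adapt it if there are paths you genuinely want hidden:

User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /

Also check whether your CMS or CDN regenerates robots.txt on deploy; a hand edit that gets silently overwritten a week later is a common way this problem comes back.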
Test 2: View page source for actual content
Right-click anywhere on your homepage and choose "View Page Source" (or press Ctrl+U on Windows, Cmd+Option+U on Mac). A new window opens showing the raw HTML — the same HTML AI crawlers see.
Now scan the source for your actual content. Look for the main headline of your site, your value proposition, your service descriptions. They should be visible as plain text in the HTML.
If instead you see something like <div id="root"></div> and almost nothing else, your content is being rendered by JavaScript after the page loads. AI crawlers vary in their JS handling. The safe assumption is: anything not in the raw HTML doesn't count.
This is most common on sites built with older React, Vue, or Angular setups that didn't enable server-side rendering. The fix is either migrating to a framework that pre-renders (Next.js, Astro, Nuxt) or adding SSR to your existing stack.
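If you'd rather test from a terminal, you can approximate the crawler's view with curl. A minimal sketch, assuming curl is installed; the user-agent value here is an illustration (real crawler UA strings are longer), and "your headline" stands in for any phrase that should appear on the page:

curl -s -A "GPTBot" https://yourdomain.com/ | grep -i "your headline"

If the grep comes back empty while the phrase is plainly visible in your browser, JavaScript is injecting the content after load, and crawlers that don't execute JS will never see it.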
Test 3: Run Google's Rich Results Test
Go to Google's Rich Results Test and paste your homepage URL. Click "Test URL" and wait 10–15 seconds.
If it returns "Page is not eligible for rich results," Google found no structured data it recognizes on your homepage. AI engines rely heavily on schema markup to understand what your site is. Without it, they have to guess from prose, and they often get it wrong or skip your site entirely.
At minimum, your homepage should carry valid Organization (or LocalBusiness) and WebSite schema. Note that the Rich Results Test only reports rich-result types, so verify these two with the Schema Markup Validator at validator.schema.org. If you sell products, add Product schema. If you have an FAQ section, add FAQPage schema; this one is especially powerful for AI extraction.
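For reference, a minimal Organization-plus-WebSite block in JSON-LD looks like the sketch below. The names and URLs are placeholders, and your CMS or SEO plugin may already emit something equivalent, so check the page source before adding a duplicate:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://yourdomain.com/#org",
      "name": "Your Company",
      "url": "https://yourdomain.com/",
      "logo": "https://yourdomain.com/logo.png"
    },
    {
      "@type": "WebSite",
      "name": "Your Company",
      "url": "https://yourdomain.com/",
      "publisher": { "@id": "https://yourdomain.com/#org" }
    }
  ]
}
</script>

Paste it into the <head> of your homepage, then re-run the validator to confirm it parses.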
The Rich Results Test also flags errors in your existing schema, which is just as important. Broken schema is sometimes worse than no schema: invalid markup tends to be discarded wholesale, so you lose the signal entirely while believing you have it.
Test 4: Check Bing's index
Open Bing and search for site:yourdomain.com (no spaces). Bing returns every page from your domain that it has indexed. Count the results.
If Bing returns fewer than 10 pages for a site that has dozens or hundreds of real pages, you have a Bing indexation problem. And here's the kicker: when ChatGPT browses the web, it queries Bing — not Google. If you're not in Bing's index, ChatGPT can't find you during live retrieval, no matter how well-optimized your site is.
The fix: register at Bing Webmaster Tools, submit your sitemap, and request indexing on your most important pages. Most sites see their Bing-indexed page count grow within 1–2 weeks once they actively engage with the tool.
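If you want to automate resubmission, Bing also supports the IndexNow protocol: you host a small key file at your domain root, then ping an endpoint whenever a URL changes. A sketch, assuming a hypothetical key abc123 stored at yourdomain.com/abc123.txt:

curl "https://www.bing.com/indexnow?url=https://yourdomain.com/important-page&key=abc123"

This is optional; sitemap submission through Bing Webmaster Tools covers the basics on its own.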
Test 5: Ask ChatGPT directly
Open ChatGPT, enable web search, and ask: "What does [yourdomain.com] do, and what services do they offer?"
If ChatGPT returns a coherent answer with specific details about your business, your site is technically readable. If it returns a generic answer, the wrong description, or "I couldn't find specific information about that site," you have a real problem and at least one of Tests 1–4 will explain it.
Try the same query in Claude (with web search) and Perplexity. Different engines weight signals differently. Perplexity tends to be the most explicit about sources, so it's the best engine for diagnosing which pages it actually found.
What to do with these results
If Tests 1–4 pass cleanly, you're technically ready for AI citation. From there, the bottleneck is usually content depth and authority signals (backlinks, brand mentions, press), which is a longer game.
If any test fails, fix that issue first. Each test corresponds to a specific blocker:
- Test 1 fails: update robots.txt to allow AI crawlers
- Test 2 fails: migrate to SSR or pre-render key pages
- Test 3 fails: add JSON-LD schema to your pages
- Test 4 fails: submit your sitemap to Bing Webmaster Tools
- Test 5 fails: work through Tests 1–4 first; usually one of them is the culprit
The 60-second alternative
If you'd rather not run five tests manually, our free 60-second Reffed audit automates all of them — plus seven more — and gives you a ranked list of fixes by impact. No signup, no credit card. Run it on your homepage and you'll know in under a minute whether AI assistants can actually read your site.
Either way: run the tests. The two minutes it takes to check robots.txt could be the most valuable two minutes you spend on AI search optimization this year.