A technical SEO audit identifies the issues preventing your site from being properly crawled, indexed, and ranked. It’s the foundation everything else in SEO and GEO builds on. Without solid technical health, great content and strong backlinks can’t perform to their potential.
Key takeaway: This checklist covers 50+ audit points organized by priority. Start with crawlability and indexing — they determine whether search engines and AI systems can even see your content. Then work through performance, architecture, and advanced items.
What Should You Check First in Any Technical SEO Audit?
Start with the fundamentals: can search engines and AI systems access your content? The most common technical SEO failures are access issues that prevent crawling and indexing entirely.
Robots.txt audit:
Your robots.txt file controls which crawlers can access which parts of your site. Misconfigurations here can silently block entire sections.
| Check Item | What to Look For | How to Fix |
|---|---|---|
| File accessibility | robots.txt returns 200 status | Ensure file exists at domain root |
| No blanket Disallow | Disallow: / blocking Googlebot | Remove or scope the directive |
| AI crawler access | GPTBot, PerplexityBot, ChatGPT-User allowed | Add explicit Allow rules |
| Sitemap reference | Sitemap: directive present | Add full sitemap URL |
| No accidental blocks | Important directories not disallowed | Review each Disallow rule |
A common mistake: staging site robots.txt (Disallow: /) getting pushed to production during deployment. Always verify robots.txt after any deployment.
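The robots.txt checks above can be scripted so they run after every deployment. A minimal sketch using Python's stdlib `urllib.robotparser`; the rules and `example.com` URLs are hypothetical stand-ins, so point it at your own fetched file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content -- in practice, fetch it from your domain root
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

def check_access(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Googlebot falls under the wildcard group: blog allowed, /admin/ blocked
print(check_access(ROBOTS_TXT, "Googlebot", "https://www.example.com/blog/post"))  # True
print(check_access(ROBOTS_TXT, "Googlebot", "https://www.example.com/admin/page"))  # False
# GPTBot has its own explicit Allow-all group
print(check_access(ROBOTS_TXT, "GPTBot", "https://www.example.com/blog/post"))  # True
```

Running a handful of `check_access` calls for your critical URLs and crawler list (Googlebot, GPTBot, PerplexityBot, and so on) in CI is a cheap guard against the staging-file mistake described above.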
XML Sitemap audit:
Your sitemap tells search engines which pages exist and which matter most. Check these items:
- Sitemap is accessible — Fetch your sitemap URL. It should return a 200 status with valid XML.
- All important pages are included — Compare sitemap URLs against your actual pages. Missing pages won’t be discovered as quickly.
- No non-indexable pages — Every URL in your sitemap should return a 200 status and be indexable (no noindex tag, no canonical pointing elsewhere).
- Sitemap isn’t too large — Maximum 50,000 URLs or 50MB uncompressed per sitemap file. Use sitemap index files for larger sites.
- Last modified dates are accurate — Don’t set all `<lastmod>` dates to today. Google will ignore inaccurate dates.
- Sitemap is registered in Search Console and referenced in robots.txt.
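Several of these checks can be scripted directly against the sitemap XML. A minimal sketch using Python's stdlib `xml.etree.ElementTree`; the inline sitemap is a hypothetical sample (in practice, fetch your live sitemap URL):

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap snippet -- in practice, fetch your live sitemap URL
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc><lastmod>2025-11-02</lastmod></url>
  <url><loc>https://www.example.com/pricing</loc><lastmod>2025-10-15</lastmod></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def audit_sitemap(xml_text: str, max_urls: int = 50_000) -> dict:
    """Parse a sitemap and surface the basic audit numbers."""
    root = ET.fromstring(xml_text)
    locs = [el.text for el in root.findall("sm:url/sm:loc", NS)]
    lastmods = [el.text for el in root.findall("sm:url/sm:lastmod", NS)]
    return {
        "url_count": len(locs),
        "within_limit": len(locs) <= max_urls,  # split into a sitemap index if False
        "urls": locs,
        "all_dated": len(lastmods) == len(locs),
    }

report = audit_sitemap(SITEMAP_XML)
print(report["url_count"], report["within_limit"])  # 2 True
```

Diffing `report["urls"]` against your crawl export then surfaces pages that are live but missing from the sitemap, or listed but no longer indexable.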
Indexing status:
Pull your Page Indexing report from Google Search Console. Key things to check:
- How many pages are indexed vs. not indexed?
- What are the reasons for non-indexing? (“Crawled - currently not indexed” and “Discovered - currently not indexed” indicate quality or crawl budget issues)
- Are any important pages excluded by “noindex” tags you didn’t intend?
- Check for “Page with redirect,” “Not found (404),” and “Soft 404” issues
Use the site:yourdomain.com search operator to spot-check indexing. Compare the number of results against your expected page count.
How Do You Audit Crawlability and Internal Linking?
Crawlability determines how efficiently search engine spiders navigate your site. Poor crawlability means important pages don’t get crawled — or get crawled too infrequently to stay fresh in the index.
Crawl your site with Screaming Frog or Sitebulb:
Run a full crawl of your site. For large sites, start with a sample of 10,000-50,000 URLs. Review these metrics:
Response codes:
- 200s — These are fine. Verify the count matches your expectations.
- 301/302 redirects — Map out redirect chains. Any chain longer than 2 hops should be shortened. Redirect chains slow crawling and dilute link equity.
- 404s — Identify broken internal links pointing to 404 pages. Fix the links or set up redirects.
- 5xx errors — Server errors indicate infrastructure problems. Log when they occur — intermittent 5xx errors during peak traffic suggest capacity issues.
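Redirect-chain length is easy to compute from a crawl export. A minimal sketch; the `REDIRECTS` map is a hypothetical source-to-target export from a crawler such as Screaming Frog:

```python
# Hypothetical redirect map exported from a crawler (source URL -> Location target)
REDIRECTS = {
    "/old-page": "/newer-page",
    "/newer-page": "/current-page",  # chain: /old-page -> /newer-page -> /current-page
    "/legacy": "/legacy",            # self-redirect: an infinite loop
}

def chain_length(url: str, redirects: dict, max_hops: int = 10) -> int:
    """Count redirect hops from a URL; stop at max_hops and on loops."""
    hops = 0
    seen = {url}
    while url in redirects and hops < max_hops:
        url = redirects[url]
        hops += 1
        if url in seen:  # loop detected
            break
        seen.add(url)
    return hops

print(chain_length("/old-page", REDIRECTS))      # 2 -- at the limit, consider flattening
print(chain_length("/current-page", REDIRECTS))  # 0 -- final destination
```

Any URL scoring above 2 is a candidate for flattening: point the original source directly at the final destination.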
Crawl depth analysis:
Every important page should be reachable within 3 clicks from the homepage. Pages buried 5+ clicks deep get crawled less frequently and pass less PageRank.
| Crawl Depth | Recommended Max Pages | Action if Exceeded |
|---|---|---|
| 0 (homepage) | 1 | N/A |
| 1 click | Core category/section pages | Link from homepage navigation |
| 2 clicks | Important subcategories, key content | Link from level-1 pages |
| 3 clicks | Individual pages, blog posts | Ensure breadcrumbs and internal links |
| 4+ clicks | Minimize pages at this depth | Restructure navigation or add links |
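Crawl depth can be computed from your internal-link graph with a breadth-first search from the homepage. A sketch over a small hypothetical graph:

```python
from collections import deque

# Hypothetical internal-link graph: page -> pages it links to
LINKS = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/post-a"],
    "/products": ["/products/widget"],
    "/blog/post-a": ["/blog/post-b"],
    "/blog/post-b": [],
    "/products/widget": [],
}

def crawl_depths(graph: dict, start: str = "/") -> dict:
    """Breadth-first search: depth = minimum clicks to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

depths = crawl_depths(LINKS)
deep = [p for p, d in depths.items() if d >= 4]  # pages to surface with better linking
print(depths["/blog/post-b"], deep)  # 3 []
```

Pages missing from `depths` entirely are unreachable from the homepage, which is an even stronger signal than high depth.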
Internal linking audit:
Internal links distribute authority and help crawlers discover content. Check:
- Orphan pages — Pages with zero internal links pointing to them. These are nearly invisible to crawlers. Every page should have at least 2-3 internal links.
- Link equity distribution — Are your most important commercial pages getting enough internal links? Use Screaming Frog’s “Inlinks” count to identify pages with too few internal links.
- Anchor text variety — Internal link anchor text should be descriptive and varied. Don’t use “click here” — use keyword-rich, natural anchor text.
- Broken internal links — Links pointing to 404s, redirects, or non-canonical URLs. Fix these directly — update the href to the correct destination.
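Orphan detection follows directly from the same crawl data: count inlinks per known page and flag pages with zero. The page set and outlinks below are hypothetical:

```python
# Hypothetical crawl export: every known page, and the internal links found on each
ALL_PAGES = {"/", "/about", "/blog", "/blog/post-a", "/landing-x"}
OUTLINKS = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-a"],
    "/blog/post-a": ["/blog"],
    # "/landing-x" exists (e.g., in the sitemap) but nothing links to it
}

def inlink_counts(pages: set, outlinks: dict) -> dict:
    """Count internal links pointing at each known page."""
    counts = {page: 0 for page in pages}
    for targets in outlinks.values():
        for target in targets:
            if target in counts:
                counts[target] += 1
    return counts

counts = inlink_counts(ALL_PAGES, OUTLINKS)
orphans = sorted(p for p, n in counts.items() if n == 0)
print(orphans)  # ['/landing-x']
```

Sorting `counts` ascending also gives you the weak-inlink report: commercial pages near the bottom of that list need more internal links.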
JavaScript rendering check:
If your site uses client-side rendering (React, Vue, Angular), verify that Googlebot can see your content. Use Search Console’s URL Inspection tool to compare the raw HTML source with the rendered HTML. Content only visible after JavaScript execution may be delayed in indexing or missed entirely by AI crawlers.
How Do You Audit On-Page Technical Elements?
On-page technical elements — title tags, meta descriptions, heading hierarchy, canonicals — are the metadata layer that tells search engines what your pages are about and how they relate to each other.
Title tag audit:
| Issue | How to Detect | Impact |
|---|---|---|
| Missing titles | Screaming Frog → Page Titles filter | High — no ranking signal |
| Duplicate titles | Screaming Frog → Duplicate filter | Medium — confuses search engines |
| Too long (>60 chars) | Screaming Frog → Over 60 Characters | Low — truncated in SERPs |
| Too short (<30 chars) | Screaming Frog → Under 30 Characters | Low — missed keyword opportunity |
| Keyword stuffing | Manual review | Medium — potential penalty signal |
Every indexable page needs a unique, descriptive title tag between 30-60 characters that includes the primary keyword naturally.
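The title checks in the table are straightforward to script from a crawl export. A sketch; the URLs and titles are made up:

```python
from collections import Counter

# Hypothetical crawl export: URL -> title tag (None = missing)
TITLES = {
    "/": "Acme Widgets | Industrial Widgets & Fasteners",
    "/pricing": "Pricing Plans for Acme Industrial Widgets",
    "/blog/a": "Pricing Plans for Acme Industrial Widgets",  # duplicate of /pricing
    "/blog/b": None,                                         # missing title
    "/blog/c": "FAQ",                                        # too short
}

def audit_titles(titles: dict, min_len: int = 30, max_len: int = 60) -> dict:
    """Flag missing, duplicate, and badly sized title tags."""
    dupes = {t for t, n in Counter(t for t in titles.values() if t).items() if n > 1}
    return {
        "missing":   [u for u, t in titles.items() if not t],
        "duplicate": [u for u, t in titles.items() if t in dupes],
        "too_short": [u for u, t in titles.items() if t and len(t) < min_len],
        "too_long":  [u for u, t in titles.items() if t and len(t) > max_len],
    }

report = audit_titles(TITLES)
print(report["missing"], report["duplicate"], report["too_short"])
# ['/blog/b'] ['/pricing', '/blog/a'] ['/blog/c']
```

The same pattern works for meta descriptions: swap the length bounds (roughly 70 to 155 characters) and feed in the description column instead.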
Meta description audit:
Meta descriptions don’t directly affect rankings, but they impact click-through rate — which indirectly affects rankings. Check for:
- Missing meta descriptions (search engines will auto-generate, often poorly)
- Duplicate descriptions across pages
- Descriptions over 155 characters (truncated)
- Descriptions that don’t accurately reflect page content
Heading hierarchy:
Every page should have exactly one `<h1>` tag that matches the page’s primary topic. Subsequent headings should follow a logical hierarchy: `<h2>` for main sections, `<h3>` for subsections, etc.
Common issues:
- Multiple `<h1>` tags (logo and title both wrapped in `<h1>`)
- Skipped heading levels (`<h1>` → `<h3>`, missing `<h2>`)
- Empty heading tags
- Headings used for styling rather than structure (use CSS instead)
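A heading-sequence check catches the first two issues automatically. A sketch over a hypothetical list of heading tags extracted in document order:

```python
# Hypothetical heading sequence extracted from a page, in document order
HEADINGS = ["h1", "h2", "h3", "h3", "h2", "h4"]  # h2 -> h4 skips a level

def heading_issues(headings: list) -> list:
    """Flag multiple (or missing) h1s and skipped heading levels."""
    issues = []
    if headings.count("h1") != 1:
        issues.append(f"expected exactly one h1, found {headings.count('h1')}")
    prev = 0
    for tag in headings:
        level = int(tag[1])
        if prev and level > prev + 1:  # jumping down by 2+ levels skips one
            issues.append(f"skipped level: h{prev} -> {tag}")
        prev = level
    return issues

print(heading_issues(HEADINGS))  # ['skipped level: h2 -> h4']
```

Moving back up the hierarchy (h3 back to h2) is fine; only downward jumps of more than one level are flagged.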
Canonical tags:
Canonical tags tell search engines which version of a page is the “original.” Audit these carefully:
- Every indexable page should have a self-referencing canonical tag.
- Canonical URLs should be absolute (include the full domain), not relative.
- Canonical tags should point to 200-status pages, not redirects or 404s.
- Paginated pages should have self-referencing canonicals (not pointing to page 1).
- HTTP/HTTPS and www/non-www variations should all canonicalize to one version.
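A few of these canonical rules can be checked mechanically. A sketch; note that a non-self-referencing canonical is not always an error (deduplication is a legitimate use), so treat that bucket as a review list rather than a bug list:

```python
from urllib.parse import urlparse

# Hypothetical crawl export: page URL -> canonical tag value (None = missing)
CANONICALS = {
    "https://www.example.com/a": "https://www.example.com/a",  # self-referencing: good
    "https://www.example.com/b": "/b",                         # relative: should be absolute
    "https://www.example.com/c": None,                         # missing
}

def audit_canonicals(canonicals: dict) -> dict:
    """Bucket pages by canonical problem: missing, relative, or non-self-referencing."""
    report = {"missing": [], "relative": [], "not_self": []}
    for page, canonical in canonicals.items():
        if canonical is None:
            report["missing"].append(page)
        elif not urlparse(canonical).netloc:  # no domain component -> relative URL
            report["relative"].append(page)
        elif canonical != page:
            report["not_self"].append(page)  # review: intentional dedupe or mistake?
    return report

print(audit_canonicals(CANONICALS))
```

Checking that each canonical target itself returns a 200 status still requires live HTTP requests, which this offline sketch deliberately skips.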
Hreflang tags (for multilingual sites):
If your site serves content in multiple languages or regions, check:
- Every page with hreflang has return tags on each referenced page
- Language/region codes are valid (e.g., `en-us`, not `en-USA`)
- Self-referencing hreflang is included
- `x-default` is specified for the fallback version
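Return-tag reciprocity and code validity can be checked in a few lines. A sketch with a simplified code pattern (real hreflang also permits script subtags beyond this regex) and a hypothetical two-page cluster where the German page is missing its return tag:

```python
import re

# Language (ISO 639-1) optionally followed by a region code, or x-default (simplified)
HREFLANG_RE = re.compile(r"^([a-z]{2}(-[a-zA-Z]{2})?|x-default)$")

# Hypothetical hreflang cluster: URL -> {hreflang code: target URL}
PAGES = {
    "https://example.com/en/": {"en-us": "https://example.com/en/",
                                "de-de": "https://example.com/de/"},
    "https://example.com/de/": {"de-de": "https://example.com/de/"},
    # the /de/ page is missing the en-us return tag back to /en/
}

def hreflang_issues(pages: dict) -> list:
    issues = []
    for page, tags in pages.items():
        for code, target in tags.items():
            if not HREFLANG_RE.match(code):
                issues.append(f"invalid code {code!r} on {page}")
            # return-tag check: the referenced page must link back to this one
            if target != page and page not in pages.get(target, {}).values():
                issues.append(f"missing return tag: {target} -> {page}")
    return issues

print(hreflang_issues(PAGES))  # one issue: the missing return tag on the /de/ page
```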
What Performance Issues Should a Technical SEO Audit Cover?
Performance is both a ranking factor and a user experience issue. Your audit should cover Core Web Vitals, server performance, and resource optimization.
Core Web Vitals assessment:
Pull site-wide CWV data from Search Console’s Core Web Vitals report. Identify which page groups are failing and which metric is the problem. For detailed diagnosis, see our Core Web Vitals guide.
Key CWV audit items:
- LCP under 2.5s for 75% of page loads
- INP under 200ms for 75% of interactions
- CLS under 0.1 for 75% of page views
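The three thresholds are simple to encode when triaging field data in bulk. A sketch; the metric values are hypothetical 75th-percentile numbers of the kind you would pull from CrUX or Search Console:

```python
# "Good" thresholds at the 75th percentile (ms for LCP and INP; CLS is unitless)
THRESHOLDS = {"LCP": 2500, "INP": 200, "CLS": 0.1}

def assess_cwv(p75: dict) -> dict:
    """Compare 75th-percentile field data against the 'good' thresholds."""
    return {metric: value <= THRESHOLDS[metric] for metric, value in p75.items()}

# Hypothetical field data for one page group
print(assess_cwv({"LCP": 2100, "INP": 310, "CLS": 0.05}))
# LCP and CLS pass; INP at 310 ms needs work
```

Running this over every page group in the CWV report turns the audit into a short fix list per template.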
Server performance:
- TTFB (Time to First Byte) — Should be under 800ms. Test from multiple geographic locations using WebPageTest.
- Uptime — Check your server monitoring for downtime incidents. Even brief outages during Googlebot crawls can cause temporary deindexing.
- SSL/TLS — HTTPS is a ranking signal. Verify your SSL certificate is valid, not expired, and covers all subdomains. Check for mixed content (HTTP resources loaded on HTTPS pages).
Resource optimization:
| Resource | Target | Tool |
|---|---|---|
| Total page weight | < 3MB (mobile) | WebPageTest |
| Image optimization | WebP/AVIF, proper sizing | PageSpeed Insights |
| CSS delivery | Critical CSS inlined, rest deferred | Lighthouse |
| JavaScript | Deferred, code-split, tree-shaken | Chrome DevTools Coverage |
| Font loading | font-display: swap, subset | Lighthouse |
| Compression | Brotli or gzip enabled | Check response headers |
Mobile performance:
Google uses mobile-first indexing. Test your site on real mobile devices, not just responsive design previews:
- Tap targets should be at least 48x48 CSS pixels with 8px spacing
- Text should be readable without zooming (minimum 16px font)
- Content shouldn’t be wider than the viewport (no horizontal scrolling)
- Interstitials and popups shouldn’t block content (Google penalizes intrusive interstitials)
How Do You Audit Structured Data and Schema Markup?
Structured data helps search engines understand your content and enables rich results. For AI search, schema markup provides clear signals about what your content covers.
Schema markup audit checklist:
- Validate existing markup — Use Google’s Rich Results Test on representative pages. Fix any errors.
- Check for appropriate schema types — Article pages should have `Article` or `BlogPosting` schema. Product pages need `Product` schema. FAQ pages need `FAQPage` schema.
- Verify required properties — Each schema type has required and recommended properties. Missing required properties prevent rich results.
- Test JSON-LD implementation — JSON-LD is Google’s preferred format. Verify it’s in the `<head>` or `<body>` of the page, not dynamically injected after render.
- Check for markup/content parity — Schema data must match visible page content. If your schema says the price is $29.99, the page must visibly show $29.99.
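For reference, a minimal `Article` JSON-LD block of the kind the checklist describes; every name, date, and URL here is a placeholder. It belongs inside a `<script type="application/ld+json">` tag in the page source:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Technical SEO Audit Checklist",
  "author": { "@type": "Person", "name": "Jane Doe" },
  "datePublished": "2026-01-15",
  "dateModified": "2026-02-01",
  "mainEntityOfPage": "https://www.example.com/technical-seo-audit"
}
```

Paste your real markup into the Rich Results Test to confirm the required properties for your target rich result are present.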
Key schema types for SEO and GEO:
| Schema Type | Use Case | GEO Impact |
|---|---|---|
| Article / BlogPosting | Blog content | Helps AI identify author, date, topic |
| FAQPage | FAQ sections | Direct FAQ extraction by AI |
| HowTo | Step-by-step guides | Process extraction by AI |
| Product | Product pages | Product data for AI shopping |
| Organization | About/homepage | Entity recognition by AI |
| BreadcrumbList | Navigation breadcrumbs | Site structure understanding |
| LocalBusiness | Local businesses | Local AI search results |
For AI search specifically:
AI engines use structured data to understand entities and relationships. `Organization` and `Person` schema, together with `sameAs` properties, help AI systems connect your brand with its online presence across platforms. Implementing comprehensive entity markup increases the likelihood of AI citation. For more on this, see our guide to Content for Position Zero: Win Snippets & AI.
What Site Architecture Issues Impact SEO?
Site architecture determines how authority flows through your site and how easily users and crawlers find content.
URL structure:
- URLs should be descriptive, lowercase, hyphen-separated: `/category/product-name`, not `/p?id=3847`
- Avoid unnecessary URL parameters that create duplicate content
- Keep URLs under 100 characters when possible
- Use a consistent trailing slash convention (with or without — pick one)
- Avoid session IDs, tracking parameters, or dynamic parameters in URLs that get indexed
Faceted navigation (eCommerce):
Faceted navigation creates exponential URL combinations (color × size × brand × price = thousands of URLs). This wastes crawl budget and creates duplicate/thin content.
Solutions:
- Use `noindex` or canonical tags on filtered pages
- Block faceted URLs in robots.txt (aggressive but effective)
- Use AJAX-based filtering that doesn’t create new URLs
- Implement `rel="canonical"` pointing filtered pages to the parent category
Pagination:
For paginated content (category pages, blog archives):
- Each paginated page should have a self-referencing canonical
- Note that Google retired `rel="next"` and `rel="prev"` as indexing signals in 2019; they can still help accessibility and other search engines, but don’t rely on them for crawling
- Ensure all paginated pages are crawlable and not blocked by robots.txt
- Include paginated URLs in your sitemap
Breadcrumb implementation:
Breadcrumbs improve user navigation and provide structural signals to search engines:
- Implement `BreadcrumbList` schema markup
- Ensure breadcrumb links use anchor tags (not just visual separators)
- Breadcrumb hierarchy should match your URL structure
- Every page should have breadcrumbs (except the homepage)
How Do You Audit Security and HTTPS Implementation?
Security is a baseline requirement. Google has used HTTPS as a ranking signal since 2014, and browsers now flag HTTP sites as “Not Secure.”
HTTPS audit checklist:
- All pages serve over HTTPS — Check for any HTTP pages still accessible. All HTTP URLs should 301 redirect to HTTPS equivalents.
- No mixed content — All resources (images, scripts, stylesheets, fonts) must be loaded over HTTPS. Mixed content triggers browser warnings and can block resources.
- Valid SSL certificate — Certificate covers your domain and all subdomains you use. Not expired or about to expire. Issued by a trusted CA.
- HSTS header — The `Strict-Transport-Security` header prevents protocol downgrade attacks and signals permanent HTTPS commitment to browsers.
- Security headers — Implement `Content-Security-Policy`, `X-Frame-Options`, `X-Content-Type-Options`, and `Referrer-Policy` headers. These aren’t direct ranking factors but protect your site and build trust.
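One way to set the headers above, assuming an nginx front end (adapt for Apache or your CDN). The values are typical starting points, not drop-in production settings; the CSP in particular must be tailored to your actual asset origins:

```nginx
# Hedged example values -- tune max-age and the CSP to your site before deploying
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Content-Security-Policy "default-src 'self'" always;
```

Verify the result with `curl -I https://yourdomain.com` and check each header appears on both page and asset responses.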
Common HTTPS issues:
- Internal links using `http://` instead of `https://` — bulk update using search-and-replace
- Sitemap containing `http://` URLs — regenerate with correct protocol
- Canonical tags pointing to `http://` versions — update to `https://`
- Third-party resources loaded over HTTP — update or find HTTPS alternatives
How Do You Check AI Search Readiness in a Technical Audit?
Traditional technical SEO audits don’t cover AI search readiness. In 2026, you need additional checks to ensure AI engines can access, understand, and cite your content.
AI crawler access:
Check your robots.txt and server logs for these user agents:
| AI Crawler | User Agent | Parent Company |
|---|---|---|
| GPTBot | GPTBot | OpenAI |
| ChatGPT-User | ChatGPT-User | OpenAI |
| PerplexityBot | PerplexityBot | Perplexity |
| Google-Extended | Google-Extended | Google (AI training) |
| ClaudeBot | ClaudeBot | Anthropic |
| Bytespider | Bytespider | ByteDance |
Verify each is allowed in robots.txt. Check server logs to confirm they’re actually crawling your site. If you see zero visits from major AI crawlers, investigate — you may be blocking them unintentionally through your CDN, firewall, or hosting configuration.
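A quick way to confirm AI crawler activity is to scan access logs for the user agents above. A sketch over hypothetical combined-log lines; in practice you would stream your real log file:

```python
from collections import Counter

AI_AGENTS = ["GPTBot", "ChatGPT-User", "PerplexityBot",
             "Google-Extended", "ClaudeBot", "Bytespider"]

# Hypothetical access-log lines (combined log format, trimmed)
LOG_LINES = [
    '203.0.113.7 - - [10/Feb/2026:10:00:00 +0000] "GET /blog/post HTTP/1.1" 200 "Mozilla/5.0 ... GPTBot/1.1"',
    '203.0.113.8 - - [10/Feb/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 "Mozilla/5.0 ..."',
    '203.0.113.9 - - [10/Feb/2026:10:02:00 +0000] "GET /pricing HTTP/1.1" 200 "PerplexityBot/1.0"',
]

def ai_crawler_hits(lines: list) -> Counter:
    """Count access-log lines per AI crawler user agent (substring match)."""
    hits = Counter()
    for line in lines:
        for agent in AI_AGENTS:
            if agent in line:
                hits[agent] += 1
    return hits

print(ai_crawler_hits(LOG_LINES))  # GPTBot and PerplexityBot each seen once
```

Note that user-agent strings can be spoofed; for high-stakes decisions, verify the requesting IP against the crawler operator's published ranges.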
Content accessibility without JavaScript:
AI crawlers may not execute JavaScript. Test your pages by disabling JavaScript in your browser — if your main content disappears, AI crawlers can’t see it either. Ensure primary content is present in the initial HTML response. As we discuss in ChatGPT vs Perplexity vs Google AI Compared, this is a critical factor.
Structured data for AI:
Implement schema markup that helps AI systems understand your content:
- `Article` schema with `author`, `datePublished`, `dateModified`
- `FAQPage` schema for FAQ sections
- `Organization` schema on your about page
- `sameAs` properties linking to your social profiles
- `speakable` schema property identifying the most important content sections
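The entity markup described above might look like this as JSON-LD; the organization name, URLs, and profile links are placeholders to swap for your own:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Widgets",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/logo.png",
  "sameAs": [
    "https://www.linkedin.com/company/acme-widgets",
    "https://x.com/acmewidgets"
  ]
}
```

The `sameAs` array is what lets AI systems tie your domain to the same entity across social and reference platforms.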
Content structure for AI extraction:
- Use clear heading hierarchy (`<h1>` → `<h2>` → `<h3>`)
- Write atomic paragraphs (one concept per paragraph)
- Include definition-style sentences that AI can extract as citations
- Use tables for comparative data
- Include author information and expertise signals (E-E-A-T)
Monitoring AI crawl activity:
Set up log analysis to track AI crawler behavior:
- Which pages do they crawl most?
- How often do they return?
- What’s their crawl depth?
- Do they encounter errors?
This data helps you prioritize content for AI visibility and identify access issues before they impact your AI search presence.
What’s the Best Way to Prioritize Technical SEO Fixes?
Not all technical SEO issues have equal impact. Use this priority framework:
Priority 1 — Blocking issues (fix immediately):
- Important pages returning 404 or 5xx
- Robots.txt blocking important pages
- Noindex on pages that should be indexed
- Canonical tags pointing to wrong URLs
- HTTPS implementation broken
Priority 2 — High impact (fix within 1 week):
- Core Web Vitals failures
- Redirect chains longer than 2 hops
- Orphan pages with no internal links
- Missing or invalid structured data
- Mobile usability issues
Priority 3 — Medium impact (fix within 1 month):
- Duplicate title tags
- Missing meta descriptions
- Heading hierarchy issues
- Image optimization
- Sitemap inaccuracies
Priority 4 — Low impact (fix when possible):
- URL structure inconsistencies
- Missing hreflang return tags
- Minor schema warnings
- Pagination markup
- Security header improvements
Create a tracking spreadsheet with columns for issue, URL(s) affected, priority, status, and date fixed. Review weekly until all Priority 1 and 2 items are resolved. Then tackle 3 and 4 in sprints.
The most effective approach: schedule quarterly full audits and monthly spot-checks on Priority 1 items. Technical SEO isn’t a one-time project — it’s ongoing maintenance that protects your search visibility.