Rules
4XX Pages in Sitemap
Checks for sitemap URLs returning 4XX status codes
All Non-Indexed Pages
Lists all pages blocked from indexing for manual review
Canonical Chain
Checks for redirect chains on canonical URLs
HTML Size
Checks HTML document size against Googlebot crawl limits
Indexability Check
Identifies pages blocked from search engine indexing
Indexability Conflicts
Detects conflicting signals between robots.txt and robots meta tags or X-Robots-Tag headers
Noindex in Sitemap
Checks for noindexed pages listed in sitemap
Pagination
Checks that paginated pages have proper canonicals
PDF Size
Checks linked PDF sizes against Googlebot's 64 MB truncation limit
Redirect Chains
Detects multi-hop redirect chains that waste crawl budget
Robots Meta Conflict
Detects conflicts between robots meta tags and robots.txt
Robots.txt
Checks if robots.txt exists and is properly configured
Schema + Noindex Conflict
Detects pages with rich result schema that are blocked from indexing
Sitemap Coverage
Checks for indexable pages that are not in the sitemap
Sitemap Domain
Checks that all sitemap URLs belong to the expected domain
Sitemap Exists
Checks if XML sitemap exists and is referenced in robots.txt
Sitemap Valid
Validates sitemap structure and URL limits
Disable All Crawlability Rules
All of the rules above can be disabled in squirrel.toml
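A minimal sketch of what such a configuration might look like. The table and key names used here ([rules], [rules.crawlability], and the per-rule keys) are assumptions for illustration, not squirrel's documented schema; consult the tool's configuration reference for the actual option names.

```toml
# Hypothetical squirrel.toml fragment -- table and key names are
# assumptions, not squirrel's documented configuration schema.

[rules]
# Disable the entire crawlability rule group at once:
crawlability = false
```

Or, if per-rule switches are supported, individual checks could plausibly be turned off by name while leaving the rest of the group active:

```toml
# Hypothetical per-rule overrides (rule identifiers assumed):
[rules.crawlability]
redirect-chains = false
sitemap-coverage = false
```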