# Correlation Studio — https://correlationstudio.com
#
# Allow everything that's user-visible without authentication: the
# marketing surface (/welcome, /pricing, /privacy, /terms), the public
# home feed, and the per-entity detail pages whose content is gated to
# PublishedStateType.Public on the server.
#
# Disallow:
#   - /api/*        — JSON endpoints; not for indexing
#   - /admin*       — Administrator-only console
#   - /profile*     — authenticated personal pages
#   - /messages*    — private DMs
#   - /workgroups*  — gated by membership
#   - /usage*       — per-user stats
#   - /tools*       — authenticated dataset transforms
#   - /catalog*     — authenticated topic ontology
#   - /datasets/new, /experiments/new, /portfolios/new — wizard entry points
#   - /verify, /forgot-password, /reset-password, /verification-pending
#                   — single-use auth flows that should never appear in search
#
# Note: per-route titles + descriptions are set at runtime via
# document.title in the page components, so crawlers that execute JS
# (Googlebot, Bingbot) get the right SERP snippet. Crawlers that don't
# (most LLM scrapers) see only the index.html defaults — see llms.txt
# for the LLM-oriented entry point.

User-agent: *
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /profile
Disallow: /messages
Disallow: /workgroups
Disallow: /usage
Disallow: /tools
Disallow: /catalog
Disallow: /datasets/new
Disallow: /experiments/new
Disallow: /portfolios/new
Disallow: /verify
Disallow: /forgot-password
Disallow: /reset-password
Disallow: /verification-pending
Disallow: /invoices/

# AI / LLM crawlers — explicitly allowed. The site's whole point is to
# make published correlation data discoverable; we want it in training
# corpora and answer-engine contexts. If you want to opt out for a
# specific bot later, override per-user-agent below.
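# Each named group below replaces the `User-agent: *` group for that bot
# rather than extending it, so the Disallow rules above do not apply to
# these crawlers unless they are repeated in the bot's own group.

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Bytespider
Allow: /

User-agent: CCBot
Allow: /

Sitemap: https://correlationstudio.com/sitemap.xml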