Noindex is a directive, delivered via a robots meta tag or an X-Robots-Tag HTTP header, that instructs search engines not to index a specific page. When a search engine crawls a page and sees a noindex directive, it drops the page from (or never adds it to) its index, so the page stops appearing in search results. This is different from blocking crawling with robots.txt, which prevents the page from being fetched at all; a crawler must still be able to fetch a noindexed page in order to see the directive.
How to Implement Noindex
- Meta tag: <meta name="robots" content="noindex"> in the HTML head
- X-Robots-Tag: X-Robots-Tag: noindex in the HTTP response header (useful for PDFs and images)
- Combine with nofollow: <meta name="robots" content="noindex, nofollow"> to also tell crawlers not to follow the page's links
- Specific bots: <meta name="googlebot" content="noindex"> targets only Google
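For non-HTML resources such as PDFs, where a meta tag is impossible, the directive has to be attached at the web-server level. A minimal sketch assuming an nginx server (the location pattern is illustrative; adapt it to your own setup):

```
# Serve all PDFs with a noindex header, since a <meta> tag
# cannot be placed inside a non-HTML file.
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex";
}
```

On Apache, the equivalent can be done with `Header set X-Robots-Tag "noindex"` inside a matching `<FilesMatch>` block.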
Key point: Do not block a noindexed page with robots.txt. If robots.txt prevents crawling, the search engine can never see the noindex tag, and it may still index the URL (typically without content) based on external signals like backlinks and anchor text.
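The point above follows from how the directive is discovered: a search engine only learns about noindex by fetching and parsing the page's HTML. A minimal sketch of that parsing step in Python, using only the standard library (`RobotsMetaParser` is a hypothetical helper name, not a real crawler API):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> (or googlebot) tags,
    roughly the way a crawler would after fetching the page."""
    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)  # HTMLParser lowercases attribute names
        if attrs.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attrs.get("content", "").split(","):
                self.directives.add(directive.strip().lower())

page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
parser = RobotsMetaParser()
parser.feed(page)
print("noindex" in parser.directives)  # True
```

If robots.txt had blocked this fetch, the parser would never run, which is exactly why the noindex would go unseen.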
When to Use Noindex
- Thank you pages: Post-form submission confirmation pages
- Internal search results: Site search pages with dynamic, thin content
- Staging/test pages: Development pages that should not appear in search
- Duplicate filtered views: Filtered or sorted versions of content that largely duplicate a main listing page
- Admin/login pages: Private pages with no search value
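The page types above can often be recognized by URL pattern, which makes it straightforward to attach the directive programmatically before the response is served. A hedged Python sketch (the paths and the `should_noindex` helper are illustrative assumptions about a site's structure, not a standard):

```python
import re

# Hypothetical URL patterns for the low-value page types listed above;
# adjust these to match your own site's structure.
NOINDEX_PATTERNS = [
    re.compile(r"^/thank-you"),       # post-form confirmation pages
    re.compile(r"^/search"),          # internal site search results
    re.compile(r"^/admin|^/login"),   # private pages with no search value
]

def should_noindex(path: str) -> bool:
    """Return True if this path should be served with a robots meta tag
    or X-Robots-Tag header containing noindex."""
    return any(pattern.search(path) for pattern in NOINDEX_PATTERNS)

print(should_noindex("/search?q=widgets"))  # True
print(should_noindex("/products/widget"))   # False
```

In a real application this check would run in middleware or a template, emitting the meta tag or header on matching responses.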
Why It Matters for SEO
Noindex is a critical tool for managing your site's index. By keeping low-value pages out of Google's index, you improve your site's overall quality signals and avoid index bloat and the site-wide quality problems that large volumes of thin, indexable content can cause. One caveat: noindex does not directly save crawl budget, because the page must still be crawled for the directive to be seen, although Google tends to crawl persistently noindexed pages less often over time, and a long-standing noindex eventually causes links on the page to stop being followed. Strategic use of noindex remains a hallmark of good technical SEO.