Crawled but not indexed pages: why it happens and how to fix it
You open Google Search Console, go to Indexing → Pages and see a bunch of URLs with the status "Crawled but not indexed". Googlebot has been there, read the content and decided not to show it in search results. Without indexation there's no ranking, and the frustrating part is that the status doesn't tell you why. I'll explain that here.
What exactly "crawled but not indexed" means
Google works in two separate phases. First it crawls: Googlebot visits the URL, downloads the code and renders the page. Then it indexes: it decides whether that page deserves to appear in search results. When you see "crawled but not indexed", the first phase has worked perfectly. The problem is in the second.
It's worth distinguishing this status from others that are often confused with it (the quick check after this list helps tell them apart):
- "Excluded by noindex tag": you have explicitly blocked indexation with a meta tag. Google obeys.
- "Blocked by robots.txt": Googlebot can't even access it. It knows nothing about the content.
- "Crawled but not indexed": Google can index it, has seen the content, and has chosen not to. It's a quality judgment, not a configuration error.
The most common causes, with real examples
This is where most articles fail: they list causes in the abstract without helping you recognize your situation. I'll go through concrete cases.
Thin content with nothing to differentiate it
A physiotherapy clinic in Tarragona with 12 service pages, each with 90 words extracted from the corporate brochure. Google visits them, sees nothing it can't find in a hundred other similar websites, and doesn't index them. The problem isn't the length: it's that they don't answer any real question. If those pages explained how they treat a specific injury, how long recovery takes or why their approach is different, the situation would change.
Internal duplicates generated by website structure
In e-commerce this is the most frequent case. A clothing store in Sabadell that allows filtering by color, size and season automatically generates URLs like /t-shirts?color=blue&size=M&season=winter. Google sees hundreds of pages with virtually identical content and doesn't index any. The solution isn't to remove the filters: it's to configure the canonical tag so all variants point to the main category URL.
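To make the consolidation concrete, here's a minimal sketch of the logic, assuming Python 3's standard library; the store URLs are invented for illustration. It computes, for each filtered variant, the URL its canonical tag should point to.

```python
# Minimal sketch: every filter variant of a category should declare the bare
# category URL as its canonical. The URLs below are invented for illustration.
from urllib.parse import urlsplit, urlunsplit

def canonical_for(url: str) -> str:
    """Strip query parameters so every filter variant maps to the main category URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Variants of the same category, as a faceted navigation would generate them.
variants = [
    "https://shop.example.com/t-shirts?color=blue&size=M&season=winter",
    "https://shop.example.com/t-shirts?color=red&size=L",
    "https://shop.example.com/t-shirts?season=summer",
]

for v in variants:
    # This is the URL each variant's <link rel="canonical"> should point to.
    print(v, "->", canonical_for(v))
# All three resolve to https://shop.example.com/t-shirts
```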
Pages that no one on your website recommends
A restaurant in Gràcia that publishes articles about wine pairings or seasonal recipes but doesn't link to them from any other page. Google finds them through the sitemap, visits them, but concludes that if the website itself doesn't consider them important enough to mention, they probably aren't. Internal links aren't decoration: they're internal trust votes that Google reads.
Domain too new or with little authority
A business in Girona six months old with four external backlinks will have a very limited indexation budget. Google will index the homepage and main categories, but will postpone secondary pages until the domain demonstrates more relevance. There's no technical shortcut here: you need to build authority progressively.
Pages that Google considers unnecessary
Blog tags, pagination pages (/page/2, /page/3), internal search results, author pages on single-author websites. Many websites of freelance professionals in Barcelona or Lleida generate them automatically without their owners realizing it. Google crawls them, doesn't index them, and in the process consumes crawl budget that could be dedicated to pages you actually care about.
How to diagnose it in Search Console step by step
The workflow I use in audits, without unnecessary steps:
- Export the complete list. Search Console → Indexing → Pages → filter by "Crawled but not indexed" → export as CSV. The on-screen table only shows a small sample of the affected URLs at a time; you need the full list to work properly.
- Group by URL pattern. Open the CSV and sort by URL. If you see 200 URLs like /product?variant=, the problem is systemic and the solution is technical (canonicals or parameter exclusion). If the URLs don't follow any pattern, the problem is individual quality for each page (a scripted version of this grouping appears in the sketch after this list).
- URL inspection for pages that matter to you. For each key page (main categories, services, landing pages), use Search Console's URL inspection. Pay special attention to the screenshot Google sees: if the page renders poorly or main content doesn't appear, you have a JavaScript or page speed problem.
- Detect orphan pages with Screaming Frog. Crawl the website, export all URLs and cross-reference them with the Search Console list (the same sketch after this list automates the cross-check). Those that appear as "orphan pages" (zero incoming internal links) need to be integrated into the site structure.
- Check the last crawl date. In the URL inspection you'll see when Googlebot last visited the page. If it's been more than two months, crawl budget is limited and you need to prioritize which pages you want Google to visit first (well-structured sitemap, fewer low-quality pages).
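If you'd rather script steps 2 and 4 than eyeball the CSV, this is a minimal sketch assuming Python 3's standard library. The file names and column names (URL, Address, Inlinks) are assumptions based on typical Search Console and Screaming Frog exports; adjust them to whatever your own exports actually contain.

```python
# Minimal sketch: group the "Crawled but not indexed" export by URL pattern and
# flag candidate orphan pages against a crawler export. File and column names
# are assumptions; adapt them to your own exports.
import csv
from collections import Counter
from urllib.parse import urlsplit

# 1. Group the not-indexed URLs by first path segment + presence of parameters.
patterns = Counter()
not_indexed = []
with open("crawled_not_indexed.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        url = row["URL"]  # assumed column name in the Search Console export
        not_indexed.append(url)
        parts = urlsplit(url)
        segment = "/" + parts.path.strip("/").split("/")[0]
        patterns[segment + ("?params" if parts.query else "")] += 1

print("Most common URL patterns:")
for pattern, count in patterns.most_common(10):
    print(f"  {count:>5}  {pattern}")

# 2. Cross-reference with a crawl export (e.g. Screaming Frog's internal HTML
#    report) to flag pages with zero incoming internal links.
inlinks = {}
with open("internal_all.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        inlinks[row["Address"]] = int(row.get("Inlinks", "0") or 0)  # assumed columns

orphans = [u for u in not_indexed if inlinks.get(u, 0) == 0]
print(f"\n{len(orphans)} not-indexed URLs with no internal links pointing to them:")
for u in orphans[:20]:
    print("  " + u)
```

The pattern counts tell you whether you're dealing with one systemic problem or with scattered quality issues; the orphan list tells you which pages to link internally first.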
| Need | Recommendation |
|---|---|
| Main tool | Google Search Console (free) |
| For internal links | Screaming Frog SEO Spider (free up to 500 URLs) |
| For large websites | Server logs + Ahrefs or Semrush |
| Analysis time | 1-2 hours for small websites; 1 day for a large e-commerce site |
Errors that worsen the problem
When I arrive at a project that has already tried to solve this on its own, I often find one of these situations:
- Requesting reindexation without changing anything. Google will see the same thing it has already rejected. Plus, manual requests have a daily limit; spending them on pages that haven't improved is a waste.
- Adding filler text to "reach 500 words". Google doesn't count words. Three paragraphs that answer a real question beat ten inflated paragraphs. Length is a consequence of depth, not a goal in itself.
- Applying noindex to everything that doesn't index. Some of these pages should actually be indexed; they simply need improvements. Blindly applying noindex reduces the indexable surface of the domain without solving the underlying problem.
- Ignoring it because "the important pages are already indexed". A high volume of low-quality pages consumes crawl budget and can affect the overall perception of the domain. It's not an urgent problem, but it's worth cleaning up progressively.
Action plan by priority
Order matters. Not everything needs to be done at once, and not everything has the same urgency:
| Priority | Cause | Concrete action | Response time |
|---|---|---|---|
| 🔴 High | Duplicates from URL parameters | Configure canonical tag to the main URL of each group | 2-4 weeks |
| 🔴 High | Unnecessary pages (tags, pagination, internal search) | Apply noindex consciously and remove them from the sitemap (see the sketch below the table) | Immediate |
| 🟡 Medium | Thin content on service or category pages | Rewrite answering real questions: use cases, comparisons, practical details | 4-8 weeks |
| 🟡 Medium | Orphan pages with no internal links | Identify them with Screaming Frog and add 2-3 internal links from related pages | 2-4 weeks |
| 🟢 Low | Domain with little external authority | Work on external link building and thematic authority progressively | 3-6 months |
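For the sitemap part of that high-priority row, the cleanup can be scripted. This is a minimal sketch assuming Python 3's standard library and a single sitemap.xml; the file names and the low-value URL patterns are placeholders you'd adapt to your own site, and sitemap index files would need an extra loop.

```python
# Minimal sketch: strip low-value URL patterns (tags, pagination, internal search)
# out of an existing sitemap before resubmitting it in Search Console.
# File names and patterns are placeholders; adapt them to your own site.
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)  # keep the output free of ns0: prefixes
NS = {"sm": SITEMAP_NS}

# Example patterns for URLs you plan to noindex anyway.
LOW_VALUE = re.compile(r"(/tag/|/page/\d+|[?&]s=|/search)")

tree = ET.parse("sitemap.xml")
root = tree.getroot()

removed = 0
for url_el in list(root.findall("sm:url", NS)):
    loc = url_el.find("sm:loc", NS).text.strip()
    if LOW_VALUE.search(loc):
        root.remove(url_el)
        removed += 1

tree.write("sitemap_clean.xml", encoding="utf-8", xml_declaration=True)
print(f"Removed {removed} low-value URLs; upload sitemap_clean.xml and resubmit it.")
```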
Once each page is improved, request reindexation for it individually via URL inspection in Search Console. Don't wait for Google to pick up the changes on its own: on domains with limited crawl budget, that can take weeks or months.
If you run an e-commerce store in Sabadell or Terrassa with hundreds of affected pages, the practical advice is this: don't try to save them all. Identify which categories and products have real search volume, consolidate variants with canonicals and apply noindex to the rest. An index of 200 quality pages works better than one of 2,000 mediocre pages.
If you want us to review your website's indexation status and give you a concrete action plan, contact us for a free SEO review. In one hour of analysis we already know which pages need improvement, which need to be removed and in what order to act.
Frequently asked questions
How long does it take Google to index a page once I've improved it?
If you request manual reindexation in Search Console, on active domains it usually takes between 1 and 4 weeks. On new domains or those with little authority, it can extend to 2-3 months. The manual request speeds up the process, but if the content hasn't actually improved, Google will reject it again.
Do I need to fix all crawled but not indexed pages?
No. Internal search pages, deep pagination or blog tags are often best left on noindex deliberately. The question you need to ask for each URL is: "If a user reaches this page from Google, will it be useful to them?" If the answer is no, noindex is the right decision.
Can it affect the ranking of pages that are indexed?
Yes, indirectly. A high volume of low-quality pages consumes crawl budget and can reduce Google's trust in the domain. It's not an immediate or dramatic effect, but on large websites like e-commerce or content portals, cleaning up the index improves overall performance noticeably.