Fix Page Indexing Issues – Google Search Console

Did you receive an email saying “Page indexing issues detected in submitted URLs for example.com”?

Screenshot: a typical email from Google about page indexing issues.

Don’t worry. Let’s find out what these issues are and how to fix them.

What are these page indexing issues?

Google says: “These URLs are not indexed by Google. In some cases, this may be your intent; in other cases, it might be an error. Examine the issues to decide whether you need to fix these URLs or not.”

Read This Before You Start Fixing Page Indexing Issues:

What to look for:

  • Check that all your important URLs are indexed (green).
  • If any URLs are not indexed (gray), make sure it’s for a good reason (e.g. robots.txt rule, noindex tag, duplicate URL, or removed page).
  • If the total URL count is much smaller than your site’s page count, Google might not be finding all your pages. Reasons for this include having a new site or page, or pages not being findable by Google. Refer to the documentation for specific issues and fixes.

What not to look for:

  • Don’t expect every URL to be indexed, as some might be duplicates or not contain meaningful information. Just make sure your key pages are indexed.
  • Non-indexed URLs can be fine if there’s a good reason.
  • Don’t expect the totals to match your estimated number of URLs, as small discrepancies can occur.
  • Even if a page is indexed, it might not show up in every search or in the same ranking.

To learn more about this, check the official guide.
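
Two of the "good reasons" mentioned above, a robots.txt rule and a noindex tag, can be checked with a short script. The following is a minimal sketch using only the Python standard library, not an official Google tool. The function names, the sample robots.txt, and the example.com URLs are assumptions made for illustration, and the page HTML and robots.txt text are passed in as strings so that nothing is fetched over the network.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attrs.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """True if the page HTML carries a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def blocked_by_robots(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text disallows the URL for the user agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(user_agent, url)


def reason_not_indexed(url, html, robots_txt):
    """Return 'robots.txt rule', 'noindex tag', or None (hypothetical helper).

    robots.txt is checked first: a crawler that is not allowed to fetch
    the page never gets to read its meta tags.
    """
    if blocked_by_robots(robots_txt, url):
        return "robots.txt rule"
    if has_noindex(html):
        return "noindex tag"
    return None


# Sample data, invented for this example.
ROBOTS_TXT = "User-agent: *\nDisallow: /search/\n"
print(reason_not_indexed("https://example.com/search/?q=shoes", "<html></html>", ROBOTS_TXT))
```

If the function returns None for a URL that matters to you, the page is probably not excluded on purpose and is worth a closer look in the report.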

Overview of Indexing Issues

  1. Excluded by noindex tag: These URLs are normally blocked from Google intentionally. Many content management systems (CMS) apply noindex to certain URLs as a good practice, so review the list carefully. Your main pages will very rarely appear under this heading. For a step-by-step guide on how to fix URLs excluded by the noindex tag, check out our detailed article on resolving noindex issues.
  2. Page with redirect: The URLs reported under this heading may or may not need fixing. Where your own pages link to a redirecting URL, replace that link with the correct (redirected-to) URL. For a comprehensive guide on how to resolve pages with redirect links, including best practices for updating or replacing them, refer to our detailed article on fixing redirect issues.
  3. Soft 404: The URLs reported under soft 404 are normally empty or deleted pages that still return a success status instead of a 404. For a complete guide on how to identify and fix soft 404 errors, including handling empty or deleted pages, check out our detailed article on resolving soft 404 issues.
  4. Not found (404): The URLs reported under this issue may have existed before but were deleted later on. Google may also find incorrect links that return a 404 error, either from internal links or from external third-party websites. For a comprehensive guide on fixing 404 errors, including managing deleted URLs and correcting incorrect internal or external links, refer to our detailed article on resolving 404 issues.
  5. Duplicate, Google chose different canonical than user: This issue occurs when Google decides to index a different URL as the canonical version instead of the one you specified. Google may choose a different canonical URL if it finds the alternative more useful or relevant. This can happen temporarily or due to technical issues like inconsistent canonical tags, duplicate content, or poor internal linking. For a detailed guide on why Google might choose a different canonical and how to fix it, check out our complete article on fixing the “Duplicate, Google chose different canonical than user” issue.
  6. Alternate page with proper canonical tag: The URLs reported under this section are normally alternates of a page that is already indexed. Indexing duplicate content is not useful, so Google normally recognizes these URLs as duplicates and doesn’t index them. For more information on how to manage alternate pages and ensure proper use of canonical tags, refer to our detailed article on fixing alternate pages with proper canonical tag.
  7. Duplicate without user-selected canonical: Google has identified these pages as duplicate content, but they don’t declare a preferred (canonical) URL. In this case, we can define the canonical URL in the page code to tell Google which URL we prefer for indexing. For more on fixing duplicate content and setting canonical URLs, check out our full guide on fixing duplicate without user-selected canonical issues.
  8. Crawled – currently not indexed: Check whether the reported URLs are important and whether they were crawled within the last week or two. Important URLs that were crawled recently will usually be indexed soon. If a URL is important but was crawled a month ago, improve its quality by improving the content. You can also add internal links to such pages to signal their importance to Google. In my experience, though, most pages in this section are not very useful: Google crawls them but doesn’t find them useful enough to index for its users.
    Check out our full guide on fixing the “Crawled – currently not indexed” issue.
  9. Server error (5xx): These are URLs that returned a server error when Googlebot tried to access them. For guidance on diagnosing and fixing server errors, refer to our detailed article on fixing server error (5xx).
  10. Redirect error: These URLs typically lead to a series of redirects that eventually cause an error. This can occur when a URL redirects to another page, which then redirects to yet another page, creating a redirect loop or excessive chaining that results in a failure to load the final destination. It’s crucial to identify and fix these redirect errors to ensure a smooth user experience. For more information on troubleshooting and resolving redirect errors, check out our comprehensive guide on fixing redirect error.
  11. Blocked due to unauthorized request (401): This issue arises when Googlebot tries to access a URL that requires authentication, resulting in an unauthorized error. To fix this, ensure that important pages are accessible without authentication. If a page must remain restricted, it can stay that way, because Google cannot index content behind a login. Note that ‘noindex’ controls indexing rather than crawling, so it does not prevent crawl attempts; a robots.txt disallow rule is the usual way to reduce those. For a quick guide on resolving 401 errors, check out our article on fixing the “Blocked due to unauthorized request (401)” issue.
  12. Blocked due to access forbidden (403): This issue occurs when Googlebot attempts to access a URL but is denied permission, resulting in a 403 Forbidden error. This can happen due to server settings or access controls. To resolve this, review the permissions for the affected pages and ensure that important content is accessible to search engines. For a quick guide, check out our article on resolving the “Blocked due to access forbidden (403)” issue.
  13. URL blocked by robots.txt: This issue arises when a URL is restricted from being crawled by Googlebot due to directives in your robots.txt file. While this can be useful for keeping Googlebot away from certain pages, it’s important to ensure that critical pages are not inadvertently blocked. Review your robots.txt file to confirm that essential URLs are accessible. For a quick guide on how to manage robots.txt and resolve blocking issues, check out our article on fixing URL blocked by robots.txt issues.
  14. URL blocked due to other 4xx issue: This issue refers to URLs that are blocked because they return a 4xx error not covered by the headings above, such as 405 (Method Not Allowed) or 409 (Conflict). These errors indicate that the requested pages are unavailable. To address this, review the affected URLs and either restore the content, redirect to relevant pages, or update internal links accordingly. For a quick guide on resolving 4xx errors, check out our article on fixing the “URL blocked due to other 4xx issue”.
  15. Discovered – currently not indexed: This issue indicates that Google has found the URL but has not yet crawled or indexed it. This can happen for various reasons, such as low content quality or insufficient internal links. To improve the chances of indexing, enhance the page content and increase internal linking to signal its importance to Google. For a quick guide on how to improve indexing for discovered URLs, check out our article on fixing the “Discovered – currently not indexed” issue.
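
Several of the headings above correspond directly to the HTTP response a URL returns, so a first triage can be done from the status code alone. The sketch below is one possible mapping, written for illustration. The function name `likely_heading` and the limit of 10 redirect hops are assumptions, and Search Console may classify a given URL differently. Soft 404s, duplicates, and the "currently not indexed" headings cannot be detected from a status code, so the function returns None for them.

```python
def likely_heading(status, redirects=0, max_redirects=10):
    """Guess the report heading from the final HTTP status (hypothetical helper).

    status: final status code after following redirects.
    redirects: number of redirect hops followed on the way.
    max_redirects: hop count treated as a redirect error (assumed value).
    """
    if redirects > max_redirects:
        return "Redirect error"
    if 500 <= status <= 599:
        return "Server error (5xx)"
    if status == 401:
        return "Blocked due to unauthorized request (401)"
    if status == 403:
        return "Blocked due to access forbidden (403)"
    if status == 404:
        return "Not found (404)"
    if 400 <= status <= 499:
        return "URL blocked due to other 4xx issue"
    if 200 <= status <= 299 and redirects > 0:
        return "Page with redirect"
    # A plain 2xx response: nothing can be concluded from the status alone.
    return None


# Sample (URL, final status, redirect hops) rows, invented for this example.
checks = [
    ("https://example.com/old-page", 404, 0),
    ("https://example.com/moved", 200, 1),
    ("https://example.com/account", 401, 0),
]
for url, status, hops in checks:
    print(url, "->", likely_heading(status, hops))
```

The status code and hop count can come from any HTTP client or crawler export; the function only does the grouping.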

When to Fix Page Indexing Issues

Start by checking the URLs reported under each heading and list the important ones that need the recommended action.

Fixing matters most when the number of indexed pages differs from the number of pages available for indexing, that is, the pages in the sitemap.

How To Check How Many Available Pages We Have

The quickest way is to open the Sitemaps section in Search Console and note the number of pages available for indexing, as illustrated below.

Screenshot: the Sitemaps section in Search Console showing the number of pages.

In this example, the sitemap has 72 pages available for indexing, while the earlier screenshot of indexed pages shows 64 valid pages. That means we only have to find out which 8 pages (72 − 64) are not indexed.
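
The same comparison can be scripted once both lists are available. The sketch below assumes you have the sitemap XML as text and a list of indexed URLs (for example, one exported from Search Console). The function names and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_index(sitemap_xml, indexed_urls):
    """Return sitemap URLs that are absent from the indexed list, sorted."""
    return sorted(sitemap_urls(sitemap_xml) - set(indexed_urls))


# Tiny sample sitemap and indexed list, invented for this example.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/contact</loc></url>
</urlset>"""
INDEXED = ["https://example.com/", "https://example.com/about"]

for url in missing_from_index(SAMPLE_SITEMAP, INDEXED):
    print("Not indexed:", url)
```

With the numbers from the example above, the result would contain the 8 URLs that are in the sitemap but not in the index.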

We can usually find these URLs in the “Discovered – currently not indexed” or “Crawled – currently not indexed” sections under the “Not indexed” area. In rare cases, the missing URLs appear under one of the other not-indexed issues that need to be fixed.

Why Google has Reported These Issues

Google may find these non-indexed URLs both within the website and outside it. That means filter pages, search pages, and non-canonical pages may be cluttering the Search Console indexing report and making it hard to read.

The golden rule is to keep the website clean. Search Console is the closest view we have of how Google sees our website. It also helps site owners take action before a problem causes real damage. Check each section to see whether there is a real or potential problem that can be fixed on the website.

CMS platforms such as WordPress, Shopify, and many others usually have apps or plugins that create functional links which don’t need to be indexed. Filter pages and search pages are among the main causes of such issues.

These CMS platforms (WordPress, Joomla, BigCommerce) also generate plenty of archive pages (category, tag, and date archives) that site owners are sometimes not aware of. These pages are rarely useful to readers, so Google often treats them as duplicate or low-quality content, which is why they are frequently listed in these reports.
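
To see how much of a not-indexed list is made up of such pages, the URLs can be grouped by pattern. The sketch below is only an illustration. The query parameters and path patterns are assumptions loosely based on common WordPress-style URLs, so adjust them to match how your own CMS builds search, filter, and archive URLs.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed patterns; every site and CMS differs.
SEARCH_PARAMS = {"s", "q", "search"}
FILTER_PARAMS = {"filter", "sort", "orderby"}
ARCHIVE_PATH = re.compile(r"^/(category|tag)/|^/\d{4}/(\d{2}/)?$")


def page_type(url):
    """Label a URL as 'search', 'filter', 'archive', or 'content'."""
    parts = urlparse(url)
    params = set(parse_qs(parts.query, keep_blank_values=True))
    if params & SEARCH_PARAMS:
        return "search"
    if params & FILTER_PARAMS:
        return "filter"
    if ARCHIVE_PATH.search(parts.path):
        return "archive"
    return "content"


# Sample URLs, invented for this example.
for url in [
    "https://example.com/?s=indexing",
    "https://example.com/shop/?orderby=price",
    "https://example.com/category/news/",
    "https://example.com/blog/fix-page-indexing/",
]:
    print(page_type(url), url)
```

URLs labelled "content" are the ones to review first, since search, filter, and archive pages are often left out of the index for a good reason.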

You can ignore this if the number of indexed pages is only somewhat higher than the number in your sitemaps, for example 100 pages in the sitemap and 200 pages indexed. Still, we should find out why that is.

You should fix it when there is a huge difference between the number of available pages and the number of indexed pages. We once dealt with a hacked (malware-infected) website whose actual size was around 100 pages while Search Console was reporting 50,000 pages.

 
