I have a quick story about a case where fixing page indexing issues caused trouble instead of improving the website. A client came to me after a third party had worked on their page indexing issues. The work had been done by a novice technical SEO who set everything to index. He tried to fix the Google Search Console issues “Blocked by robots.txt” and “Indexed, though blocked by robots.txt” by allowing every URL to be indexed.
As you can see in the screenshot below, the site has a massive number of indexed pages and millions of not-indexed pages. Guess what? All of these pages are spam pages. This happened because the so-called technical SEO expert allowed every URL to be indexed. Spammers took that opportunity and generated a huge number of these unwanted pages on the site.
It is not always necessary to index everything. In fact, Google does not index everything, as stated in its official documentation on page indexing.
This website used to block its internal search pages (URLs containing ?q= followed by the search terms) from crawlers through robots.txt. However, someone changed that setting to unblock the search pages so they could be indexed. This decision was wrong: Google doesn’t index everything anyway, and now the client has a huge number of spam pages indexed, with many more listed in the not-indexed part of the page indexing report.
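To make this concrete, here is a minimal Python sketch of the kind of rule that was in place before the change, checked with the standard library's urllib.robotparser. The domain example.com, the /search path, and the sample URLs are assumptions for illustration only, not the client's real setup. Python's parser also does not behave exactly like Google's (for example, it does not support wildcards), so treat this as a rough sanity check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of internal search result pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Both URLs are made-up examples.
    for url in (
        "https://example.com/search?q=buy+cheap+pills",
        "https://example.com/blog/fix-page-indexing-issues",
    ):
        status = "crawlable" if is_crawlable(url) else "blocked"
        print(url, "->", status)
```

With a rule like this, the search URL stays blocked while the normal content page remains crawlable. Removing the Disallow line is roughly what opened the door to the spam URLs in this story.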
So what is the right approach to fixing page indexing issues?
I always suggest hiring an SEO expert who can evaluate your website and make decisions based on the pages reported in the page indexing report.
If you have pages kept out of the index, either through robots.txt or a meta robots tag, check whether each page really needs to be indexed before you change anything.
Ideally, we should not index search pages or any other pages that accept user-generated search terms, because they can produce spammy URLs like the many I shared above.
That is exactly what happened with this client, and it left a huge number of unwanted pages indexed and visible to users.
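As a rough aid for that kind of review, the sketch below splits a list of reported URLs into those that carry a user-typed search term and those that need a manual look. It assumes the search term is passed in a q parameter, as on this client's site. The function names and the sample URLs are hypothetical, and the list of URLs would come from your own export of the page indexing report.

```python
from urllib.parse import urlparse, parse_qs

# Assumed query parameter that carries the search term; extend for your site.
SEARCH_PARAMS = {"q"}


def is_search_url(url: str) -> bool:
    """Return True if the URL has one of the assumed search parameters."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    return any(name.lower() in SEARCH_PARAMS for name in query)


def split_reported_urls(urls):
    """Split URLs into (search pages to keep out of the index, pages to review)."""
    keep_out, review = [], []
    for url in urls:
        (keep_out if is_search_url(url) else review).append(url)
    return keep_out, review


if __name__ == "__main__":
    # Made-up sample of URLs as they might appear in a report export.
    reported = [
        "https://example.com/?q=casino+bonus",
        "https://example.com/services/technical-seo",
    ]
    keep_out, review = split_reported_urls(reported)
    print("Keep out of the index:", keep_out)
    print("Review manually:", review)
```

The split only tells you which URLs look like search pages. Whether the remaining pages deserve to be indexed is still a judgment call for whoever reviews the site.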
Please share if you have any questions.