• Resolved Copiaurbietorbi

    (@copiaurbietorbi)


    Good day,

    We are using your plugin with some of the settings suggested for its proper functioning.


    But we are still seeing non-relevant pages indexed by Google, such as a page that we removed some time ago and made private in our WordPress dashboard, and our privacy policy (which is open to anyone who wants to review it, but only by clicking through to it).

    From the information available here, we understand that we can customize the robots.txt file using the plugin.


    Could you tell us where this file is and how to adjust it accordingly please? This could fix our current problem.

    We tried to find the robots.txt file or sitemap.xml on our server but we couldn’t find anything. Perhaps we missed them.

    Thank you for your help and interest!

Viewing 4 replies - 1 through 4 (of 4 total)
  • We also use this module on our news portal, and we’ve noticed that it’s accessing pages that have long been deleted, resulting in 404 errors. Additionally, it’s accessing media images that were deleted over 12 months ago.

    Plugin Author Auctollo

    (@auctollo)

    @copiaurbietorbi, do you have some examples of published pages/posts that appear in the sitemap, but shouldn’t?

    Thread Starter Copiaurbietorbi

    (@copiaurbietorbi)

    We made some changes in the plugin settings that altered the original sitemap. We also renewed and updated our sitemap with Google.

    We requested a temporary removal of these pages and, after a week or so, they were removed.

    We are not sure whether these pages will reappear after the 6-month period during which Google hides them.

    We hope this is not the case with the changes in the plugin.

    On another note and as we mentioned in the original post, we tried to find the robots.txt file or sitemap.xml on our server but we couldn’t find anything. Perhaps we missed them.

    Could you provide some guidance as to where we can find these?

    Thank you for your help and interest!

    Plugin Author Auctollo

    (@auctollo)

    @copiaurbietorbi, we would have to take a closer look at your exact site to answer your questions about /sitemap.xml and /robots.txt, but for now I can say that by default those files are dynamic (generated upon request) and won’t appear in the filesystem when you browse your server.
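    Because the files are generated on request, the practical way to inspect them is over HTTP rather than on disk. A minimal Python sketch, assuming the site lives at the domain root; `https://example.com` and the helper names `virtual_file_urls` and `fetch_text` are placeholders for illustration, not part of the plugin:

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def virtual_file_urls(site_url):
    """Build the URLs where the dynamic robots.txt and sitemap.xml are served.

    Assumption: the site is installed at the domain root.
    """
    base = site_url if site_url.endswith("/") else site_url + "/"
    return [urljoin(base, "robots.txt"), urljoin(base, "sitemap.xml")]


def fetch_text(url, timeout=10):
    """Request a URL over HTTP and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

    Calling `fetch_text(url)` for each URL returned by `virtual_file_urls("https://example.com")` shows the content that crawlers actually receive, even though no such file exists on the server.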

    As for the indexation of pages you wanted out of search results, your sitemap doesn’t directly control that. Sitemaps are for discovery, not for control over your indexation. Robots.txt and meta tags serve that purpose.
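    To make the two mechanisms concrete: a robots.txt `Disallow` rule asks crawlers not to fetch a path, while a `<meta name="robots" content="noindex">` tag asks search engines to keep a page out of their index. A crawler has to be able to fetch the page to see that tag. The Python sketch below checks both using only the standard library; the `/privacy-policy/` path and the helper names are hypothetical and not taken from this site or the plugin:

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical inputs for illustration only.
sample_robots = "User-agent: *\nDisallow: /privacy-policy/\n"
sample_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
```

    With these samples, `is_crawl_allowed(sample_robots, "https://example.com/privacy-policy/")` is `False` and `has_noindex(sample_page)` is `True`.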

  • The topic ‘Stop Irrelevant Pages from Crawling’ is closed to new replies.