• I’m kind of amazed I’ve been unable to find anything to do what I want here. I am often brought in to work on existing websites, and one thing I frequently want to do is find any “orphan” pages which are not linked to from anywhere on the site. Frequently these contain outdated information that we don’t want people to find when doing a search of the site.

    I’m aware that you often want to have such pages, and I’m also aware it’s difficult to be certain a page is not linked to in some way other than a URL link in the content of a page or a menu item. But even an imperfect tool that lists the most likely suspects is better than nothing at all.

    I’ve found articles advising how to find such pages by comparing web crawler logs with database logs or some such thing, but that sounds like way too much work. Is there not something better? If not, I’d like to understand why not. Is it somehow harder than it appears to do this?
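
For what it's worth, the comparison those articles describe boils down to a set difference: every page the site knows about, minus every page you can reach by following links from the home page. Here is a minimal Python sketch of that idea. It assumes you already have a list of published URLs (for example from a sitemap or a database export) and a mapping from each page to the links found on it. The names `find_orphans`, `all_pages`, and `links`, and the sample URLs, are made up for illustration and do not come from any plugin.

```python
from collections import deque


def find_orphans(all_pages, links, start):
    """Return the pages in all_pages that cannot be reached from start.

    all_pages: iterable of every URL the site is known to publish (assumed input)
    links:     dict mapping a URL to the list of URLs it links to (assumed input)
    start:     the URL the walk begins at, usually the home page
    """
    reachable = {start}
    queue = deque([start])
    # Breadth-first walk over the link graph.
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    # Anything published but never reached is an orphan suspect.
    return sorted(set(all_pages) - reachable)


# Hypothetical sample data, not taken from a real site.
all_pages = ["/", "/about/", "/contact/", "/old-promo/"]
links = {"/": ["/about/"], "/about/": ["/contact/", "/"]}
print(find_orphans(all_pages, links, "/"))
```

This only sees plain links, so pages reached through scripts, search results, or external sites would still show up as suspects and need a manual check.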

Viewing 5 replies - 1 through 5 (of 5 total)
  • Is there not something better? If not, I’d like to understand why not.

    Here’s how the WordPress ecosystem (and open-source software generally) works:

    1. Someone needs a feature that’s not available in WordPress, and can’t find a suitable plugin/theme that does the work at all or well enough to their satisfaction.
    2. They build a new (or better) solution for themselves (or hire someone to build one if they don’t have the skill or time to do it themselves).
    3. They decide to share their new solution with the rest of the WordPress community… and boom: another plugin or theme is born.

    So if you can’t find a solution to your particular problem, it just means that no one else has found this particular problem big enough to be willing to commit resources (time and/or money) to solve it.

    Will you be the one to make it happen?

    • This reply was modified 4 years, 2 months ago by George Appiah. Reason: Fixed a typo
    Thread Starter Tyler Tork

    (@tylertork)

    George, thank you for the non-answer. I’m very familiar with how the “ecosystem” works and have submitted a plugin myself (not for this!). I’m just amazed that nobody HAS been motivated enough. So before I try my hand at it, I want to know whether I’d be wasting my time.

    Anyone with actual experience on this particular issue?

    HTTrack might help you, but I’d think that, as an active participant in your own site, you’d develop an ‘eye’ for content that might need attention. You might miss something here or there, but I think you’ll be doing fine.

    https://www.httrack.com/

    You’ll want to pay attention to the error logs in HTTrack… do scrutinize the output itself carefully. HTTrack will create a local copy of the pages and posts from your website.
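
If you go the local-copy route, one way to use the mirror is to collect every link that appears in the saved HTML and compare that set against your own list of pages. Below is a rough Python sketch using only the standard library. It assumes the mirror is a folder of `.html` files. How HTTrack names files and rewrites links depends on its settings, so matching the collected hrefs back to real URLs is left as an assumption you would need to adapt. `LinkCollector`, `extract_links`, and `links_in_mirror` are illustrative names, not part of HTTrack.

```python
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html):
    """Return the hrefs found in one HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.hrefs


def links_in_mirror(mirror_dir):
    """Return the set of hrefs found in all .html files under mirror_dir.

    mirror_dir is a hypothetical path to the local copy of the site.
    """
    found = set()
    for path in Path(mirror_dir).rglob("*.html"):
        text = path.read_text(encoding="utf-8", errors="replace")
        found.update(extract_links(text))
    return found


sample = '<p><a href="about.html">About</a> <a name="top">Top</a></p>'
print(extract_links(sample))
```

Any page on your own list that never appears in the collected links is worth a closer look.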

    When it comes to posts, I’ve found looking at the last page of posts in my blog helps me.

    I’d also recommend running the broken link checker plugin to find content that no longer links outwards to its originally intended destination.

    https://www.ads-software.com/plugins/broken-link-checker/

    Setting BLC to recheck each link every 480 hours instead of its default 72 hours will take a good bit of load off a heavily populated site.

    The Auto Post Scheduler plugin can be set to take your oldest post and republish it as a brand new post. If there’s anything wrong with it and you check the site regularly, you’ll probably catch most problems, and you can just delete the post if it is no longer timely.

    How to find orphan pages for this best paintball markers review website?

    johnkavin

    (@johnkavin)

    also for this amazon fba courses website?

  • The topic ‘Finding orphan pages’ is closed to new replies.