• jbauguss

    (@joshbaugussnet-1)


    So WordPress's default robots.txt file looks like this:

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/

    Seems logical enough. However, we have been investigating a site, and Google is complaining that it can't see the JavaScript files under the wp-includes directory.

    In June 2014, Google did the Panda update. Since then, they apparently care very much about being able to see all the JavaScript and CSS files included on a site.

    Why would Google care about JavaScript/CSS?
    Because JavaScript/CSS can be used to alter what a search engine sees vs. what the user really sees (for example, hiding heavily optimized content that is meant only for search engines). And this of course has long been a no-no in Google's eyes.

    Any theme/plugin that makes use of wp_register_script with jquery as a dependency (for example wp_register_script('myjs', $srcfile, array('jquery'));) relies on pulling the WP-provided jquery.js file from the wp-includes folder.

    So if I'm reading the chatter right from Google searches relating to this, WordPress's default Disallow of /wp-includes/ is terrible!!!

    I'm hoping for some discussion here from folks who may know a bit more than I've researched so far.

Viewing 2 replies - 1 through 2 (of 2 total)
  • Thread Starter jbauguss

    (@joshbaugussnet-1)

    So I'm seeing no feedback from WordPress folks. I'm beginning to think this is a HUGE HUGE deal.

    We have been investigating the crawls on several of our sites. The number of indexed pages has dropped like a rock, and the number of "blocked by robots" entries has gone from zero to several hundred on many, many sites.

    We fixed the robots.txt file on one of them, the problem went away with the next crawl, and we are seeing pages get indexed again.

    Someone on the WordPress dev team seriously needs to see this post.

    Thread Starter jbauguss

    (@joshbaugussnet-1)

    The default WordPress robots.txt file should only be:

    User-agent: *
    Disallow: /wp-admin/

    This will keep the JS files provided by WordPress (like jQuery) from being blocked from Google's indexer on any site whose theme or plugin uses them.
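
For anyone who wants to check the difference between the two files without waiting for a crawl, here is a rough sketch using Python's standard-library robots.txt parser. It is not Google's own parser, so treat it only as an approximation of how a crawler reads these rules. The helper name `can_fetch` and the test paths are my own picks for illustration; `/wp-includes/js/jquery/jquery.js` is where WordPress normally keeps its bundled jQuery.

```python
from urllib.robotparser import RobotFileParser

# The default rules quoted at the top of this thread.
DEFAULT_RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

# The trimmed-down rules proposed above.
PROPOSED_RULES = """\
User-agent: *
Disallow: /wp-admin/
"""

# Assumed path of the jQuery file that WordPress bundles.
JQUERY_PATH = "/wp-includes/js/jquery/jquery.js"


def can_fetch(rules: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from memory, no network request
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for label, rules in (("default", DEFAULT_RULES), ("proposed", PROPOSED_RULES)):
        print(label, "jquery allowed:", can_fetch(rules, JQUERY_PATH))
        print(label, "wp-admin allowed:", can_fetch(rules, "/wp-admin/options.php"))
```

With the default rules the jQuery path comes back as not allowed, and with the proposed rules it is allowed while /wp-admin/ stays blocked in both cases.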

  • The topic ‘WP's Default robots.txt bad for google?’ is closed to new replies.