• Resolved Deon

    (@deon-b)


    Hello,
    the ONLY thing I want indexed is: POSTS and PAGES.

    One month ago I was using Yoast 8.0, then I updated to 12.4, and since then a lot of stuff has been indexed.

    Please see these links:
    https://prnt.sc/q6xj7s
    https://prnt.sc/q6xk3w

    At the same time that 100 of these URLs were indexed, my ranking dropped.

    Could someone please help me out?

    • This topic was modified 4 years, 12 months ago by Deon.
Viewing 15 replies - 1 through 15 (of 15 total)
  • I think this isn’t related to WordPress (or the Yoast plugin). It seems that your server has “Directory Indexes” enabled. You should add something like Options -Indexes to .htaccess (or find a similar option in your control panel).

    Similar issue: https://www.ads-software.com/support/topic/noindex-on-wp-content-urls/. Once you have applied a fix and your directories are no longer public, I recommend manually removing them from the Google index to speed up the whole process.

    Thread Starter Deon

    (@deon-b)

    But how has my server suddenly started enabling “directory indexes”?

    I have had this website for years, same site, same theme, same yoast plugin, never had any issue. Auto-updates turned off.

    Now all of a sudden this showed up without me doing anything.
    Actually, the only thing I did recently was install this plugin:
    https://tablepress.org/extensions/responsive-tables/
    An extension to make tables responsive. Could this be the culprit?

    Honestly, I don’t understand exactly what I should do based on your reply.

    You can tell me to put some code in my robots.txt or .htaccess, but the question remains: how? How did this start all of a sudden? I didn’t need this code until yesterday, and now I do? This just doesn’t make sense to me :/

    Lastly, how can I verify that “my server” has magically started to enable “directory indexes” on its own? Are you saying some guy who works at the web host decided to do this? Or does a server have a life of its own?

    • This reply was modified 4 years, 12 months ago by Deon.

    It’s possible that installing some plugin exposed a link to a wp-content listing, and listings were already enabled by default. Another possibility is a side effect of an Apache update on the server (which you can’t control on a shared server). You should check the .htaccess file in the WP root directory, and also whether .htaccess files exist in the wp-content and wp-content/uploads directories: it’s possible that a .htaccess file was somehow created there that overrides the main .htaccess. Check whether the word Indexes appears anywhere in the file (or files).

    If there is no explicit option that enables “Directory Indexes”, you can add something like this somewhere in .htaccess (at the beginning, for example):

    Options -Indexes
    

    Please make a backup of the .htaccess file before changing it. After the change, try to access the wp-content/uploads/2019/… directories. You should see 403 errors for all these listings (but not for the doc file).
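
    A minimal sketch of how the top of a root .htaccess could look with the directive in place. The WordPress block shown is the stock one for a site installed at the domain root, which is an assumption here; the real file may contain other rules.

    ```apache
    # Disable automatic directory listings (generated by mod_autoindex).
    Options -Indexes

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress
    ```

    A line such as Options +Indexes in a .htaccess deeper in the tree (for example in wp-content/uploads) would turn listings back on for that subtree. If the host does not allow Options in .htaccess, Apache answers with a 500 error, which is one more reason to keep the backup at hand.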

    Thread Starter Deon

    (@deon-b)

    Hi @stodorovic
    thanks a lot for your reply.

    I went into .htaccess and added that code at the bottom, and now I get a 403 page when I browse the directories. But I have 2 questions:

    1) Here Google says to use 404 or 410
    https://support.google.com/webmasters/answer/1663419?hl=en

    2) Here
    https://www.ads-software.com/support/topic/noindex-on-wp-content-urls/
    you suggest using the “remove outdated URLs” tool
    https://support.google.com/webmasters/answer/7041154
    but Google writes:
    What is this tool used for?
    If you see a search result that you do NOT own.

    But I own these pages.

    Based on my experience, I think that any 4xx error instead of a 200 status code will have a similar effect. It isn’t easy to change the 403 for these listings to something else (you could upload a custom index.php to each directory, but that’s too tedious). Anyway, disabling directory indexes is the default setting on many hosts.

    I haven’t used this tool recently (it requires you to sign in to GSC, and it was part of the old GSC), but the point is that it removes the URLs (listings) from the Google index faster (within the next few hours). Google will then re-visit these URLs in the following days to confirm that they really no longer exist; they will be moved from indexed to excluded, and after some time they should disappear from GSC. This speeds up the whole process. Alternatively, you can wait for Google to re-visit them over the next couple of crawls, and it will remove them from the index on its own.

    I saw a doc file in your screenshot. Maybe you need to add a custom rule for this file, or maybe you can remove or rename it.

    You can use Google search operators to check the actual Google index more easily. Examples: site:mywebsite.com, inurl:wp-content, ext:doc, alone or in combination.

    I did my best to explain all the details, but maybe I missed something.

    Thread Starter Deon

    (@deon-b)

    Hi
    thank you so much.
    I have no idea how that .doc file got there.

    I contacted my host, and they said they always have mod_autoindex on by default. So I have had this on my website for 6 years, and it only showed up in search engines now, after I installed the TablePress plugin, and the same day my ranking dropped.

    Yes I will use the URL removal tool and remove those URLs.

    But I want to be 100% sure 403 is the right code because here they say to use 410 :/

    https://support.google.com/webmasters/answer/1663419?hl=en
    Remove or update the actual content from your site (images, pages, directories) and make sure that your web server returns either a 404 (Not Found) or 410 (Gone) HTTP status code.

    https://www.seroundtable.com/google-403-status-codes-25674.html

    There are about 100 of these URLs indexed.
    https://prnt.sc/q77bzw
    And my rankings dropped suddenly.

    I just want to fix it and make sure I am doing it correctly:

    Maybe I should just set 410 and listen to what they say?
    Have you tried 403 and 410 and both worked?
    But if I want to do 410, how do I do it?

    Thanks so much for helping me.

    • This reply was modified 4 years, 11 months ago by Deon.

    Many things related to Google indexing aren’t documented, and they change often. So I can’t say what the exact difference between the various 4xx codes is, but in the end all 4xx URLs will be excluded from indexing (according to my tests).

    I have a nightmare on a couple of websites where Googlebot finds wrong wp-content URLs in JavaScript (I haven’t researched all the details and I don’t have a proper solution yet). But if the listings are only cross-linked among themselves, they should disappear automatically in the next few months.

    Sending a 410 error instead of a 403 is more complex. I think you should try to use a RewriteRule for that, but I haven’t tried it. I’m writing this rule “on the fly”:

    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^wp-content/uploads/ - [R=410,L]
    

    Please test it carefully. I think it can’t block anything other than listings (because of -d in the first line and wp-content/uploads in the second line). This rule should be placed before the WP rules.
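
    For comparison, mod_rewrite also has a dedicated [G] (Gone) flag that returns 410 without spelling out the status code. This is a sketch under the same assumptions as the rule above (WordPress at the domain root, rules placed before the # BEGIN WordPress block), and it is equally untested on this site:

    ```apache
    RewriteEngine On

    # Match only requests that resolve to an existing directory...
    RewriteCond %{REQUEST_FILENAME} -d
    # ...under wp-content/uploads/, and answer 410 Gone. Files are not affected.
    RewriteRule ^wp-content/uploads/ - [G]
    ```

    With [G], an [L] is implied, so no further rewrite rules run for a matching request.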

    Thread Starter Deon

    (@deon-b)

    Hi Sasa, thanks a lot for your help man!
    I did 403 as you suggested, and I submitted the URLs for removal.

    Now I just have to pray and hope all will be fixed soon. I hope before xmas.

    I have another question:
    My website is still on http:// for now.
    I was planning to move it to https:// later so as not to lose traffic now (I read that you get a drop after you switch from http to https).

    Since I had a drop anyway, would you suggest going ahead and switching to https now, or could this make the URL removal and de-indexing more complex, so that it’s better to wait until the URLs have been deindexed?

    You are welcome! I’ve re-read this issue.

    I think that’s a good choice. Exposing a listing of files could be a kind of “security issue”. In this case, 403 is acceptable and the easiest way (and you don’t need Googlebot to check these URLs in the future).

    This tweet confirms it:

    403 is a bit weird but should drop the pages too. I’d clean it up normally with 404 though, don’t add a hack when the real solution is just as easy.

    It’s possible that Google will check more details after it detects changes and will find mistakes which weren’t visible before. So, my personal opinion is that you should continue with the new changes (without long delays between actions).

    When you apply SSL, you should create a new property (https) in Google Search Console and keep monitoring both versions of the website (both properties in GSC). You should add proper redirects only after you have checked that all URLs are https (to avoid redirect loops, a common mistake that can lead to penalties), …

    Maybe you can wait a few days before you set up SSL. A new drop is possible (I have seen it when SSL is applied). Also, all traffic (which is only http now) will be split between http and https for some time. Anyway, you have already read about this and I’m not sure I can add anything new.
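
    Redirects from http to https are often written along the following lines. This is only a sketch, and it assumes that SSL terminates at Apache itself; behind a proxy or CDN that talks plain http to the server, %{HTTPS} stays off and a rule like this would cause a redirect loop.

    ```apache
    RewriteEngine On

    # Only redirect requests that did not arrive over https.
    RewriteCond %{HTTPS} off
    # Send a permanent redirect to the same host and path on https.
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    ```

    Like the other custom rules in this thread, it would go before the WP rules in the root .htaccess.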

    I hope that helps.

    Thread Starter Deon

    (@deon-b)

    Hi, thanks.
    But if I wanted to set a 404 as they say in the Tweet, how would I do that in WordPress for such directory pages?

    It’s possible that Google will check more details after it detects changes and will find mistakes which weren’t visible before. So, my personal opinion is that you should continue with the new changes (without long delays between actions).

    What do you mean by this?

    Thank you!

    One option is to add my optimized rules from Noindex on wp-content URL’s (the code under # Serves only static files). These rules filter requests and force a 404 for all “requests” which aren’t in the list of allowed extensions. You should keep Options -Indexes at the beginning of .htaccess and add my rules after this line (before other rules). I use these rules on a lot of sites, and they are a good improvement for SEO and security at the same time.

    After you set Apache to send a proper 404 code, you need to set a custom 404 error page, because R=404 will render a generic 404 error. It can be done with ErrorDocument in .htaccess, but you should contact your host for more details. I’ve created a custom HTML page (based on the WP 404 error, so it looks similar to other 404 errors), which is the better way because Apache doesn’t need to run PHP (a performance improvement).
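
    A sketch of how the two pieces could fit together. These are not the rules from the linked topic, which are not reproduced here; the rule simply reuses the directory check from the 410 rule earlier in this thread. The file name /404.html is a hypothetical placeholder for the static page, and it has to exist in the site root.

    ```apache
    # Serve a static page for 404 responses generated by Apache.
    # /404.html is a hypothetical path, relative to the document root.
    ErrorDocument 404 /404.html

    RewriteEngine On

    # Answer 404 instead of 403 for existing directories under wp-content/uploads/.
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^wp-content/uploads/ - [R=404,L]
    ```

    Because the error page is a plain HTML file, Apache can serve it without starting PHP.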

    Regarding your last question, I mean that Googlebot will use its entire cache (including all URLs) to re-validate all URLs and will find new “mistakes”. I’ve seen your other post, and it seems to be something like this. You will notice new weird URLs that you didn’t know existed.

    I think I’ve replied to all your questions related to the initial topic. I’m not sure I can say anything more about the “custom 404”. I’ll try to reply in the other topics when I find spare time.

    Thread Starter Deon

    (@deon-b)

    https://prnt.sc/qahmz5
    Hi @stodorovic

    ??
    Traffic still affected: 100 URLs indexed in between
    Directories + /pages/

    What can I do to get this fixed faster?
    Should I remove the 403 and make a 404?

    Google needs some time to remove them. If you requested removal, the result is the same with 403 or 404. If you want a 404, you could try the solution I already sent. I think it’s better to leave it as is, because this isn’t the only issue responsible for the drop. You can “polish it” later.

    Check the Google index with the search query “inurl:wp-content site:mydomain.com”. If you don’t see “these listings”, that’s a good sign.

    Thread Starter Deon

    (@deon-b)

    2+ months later.
    Directories still indexed.
    403 not working.

    Thread Starter Deon

    (@deon-b)

    I also found an .htaccess in
    /uploads/wpseo-redirects/

    Which says:

    Options -Indexes
    deny from all

    Does this come from Yoast?

  • The topic ‘Attachments Indexed Fu*ked Up’ is closed to new replies.