Page caching results in 403 in crawlers (Google, Screaming Frog)
-
I just found out that Page Caching results in a 403 error being returned to the Screaming Frog SEO crawler. I tested in Google Search Console with the same result: if Page Caching is turned on, crawlers get a 403 error; if I turn it off, they load the page correctly. Strangely enough, regular visitors always see the page correctly.
I have played with various Page Caching options, but none of them made any difference. In the end I found this rule in .htaccess which, when processed, apparently breaks the crawlers:
RewriteRule .* "/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html" [L]
When Page Caching is turned off, either globally or for a certain page, this rule is of course not processed, so the crawler gets status 200 and the page loads correctly. But when the rule is processed, it results in a 403.
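For context, that rule normally sits inside a larger W3TC-generated block which decides when the static cache file may be served at all. The sketch below is only a simplified illustration of what such a block typically looks like (the exact conditions depend on the W3TC version and settings, so do not treat it as the plugin's verbatim output), but it shows the key point: when the conditions pass, Apache serves a plain .html file from wp-content/cache directly, bypassing WordPress entirely, so whatever Apache-level access rules apply to that file now decide the response.

# Simplified sketch of a W3TC "Page Cache core" block (illustrative, not the exact generated rules)
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    # Only cache plain GET requests from visitors without logged-in/comment cookies
    RewriteCond %{REQUEST_METHOD} !=POST
    RewriteCond %{QUERY_STRING} =""
    RewriteCond %{HTTP_COOKIE} !(comment_author|wp\-postpass|wordpress_logged_in) [NC]
    # Serve the pre-built static file instead of running WordPress
    RewriteRule .* "/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html" [L]
</IfModule>

If that is what happens here, the 403 would be produced by Apache while serving the static file, not by WordPress or the plugin's PHP code.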
I have checked the path on the server, and if the cached file exists, it causes the 403. When I manually deleted the cached file for a certain page, I got status 200 in the crawler for that page, apparently because the page was generated fresh and then stored in the cache; but immediately afterwards, on the second try, the cached page was already there and the 403 was returned again.
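That behaviour (200 while the cached file is missing, 403 as soon as it exists) would fit the idea that something at the Apache level denies direct access to files under wp-content/cache. This is only a guess, but if a parent .htaccess, security-plugin rule, or vhost directive blocks that path (possibly only for certain user agents, which would explain why normal visitors are unaffected), a temporary per-directory override like the sketch below, placed in wp-content/cache/page_enhanced/.htaccess, could help confirm it (assuming AllowOverride permits these directives there):

# Hypothetical test override for wp-content/cache/page_enhanced/.htaccess --
# only useful if a parent rule is denying direct access to the cached files.
<IfModule mod_authz_core.c>
    # Apache 2.4+
    Require all granted
</IfModule>
<IfModule !mod_authz_core.c>
    # Apache 2.2
    Order allow,deny
    Allow from all
</IfModule>

If the crawler then gets a 200, the culprit is an access rule higher up the chain; if it still gets a 403, the block is more likely coming from mod_security or something else in front of Apache.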
Has anyone experienced the same problem? Thanks