• Resolved filz51

    (@filz51)


    Hi

    Very confusing: On certain pages I’ve only gotten cf-cache-status: BYPASS with Safari or Opera in incognito mode, no matter how many times I reloaded those pages.

    When I loaded these pages with Chrome in incognito mode, I got cf-cache-status: MISS on the 2nd visit and cf-cache-status: HIT on the 3rd visit.

    After that, reloading these pages in Safari and Opera also returned HIT.

    Can you explain this behavior? I made hundreds of tests before I randomly tried with Chrome…

Viewing 12 replies - 1 through 12 (of 12 total)
  • James W.

    (@urbancowboy1994)

    @filz51

    Would you mind sharing these pages with us so we can run developer tools? Also, check your Cache Behavior Settings under wp-admin/options-general.php?page=wp-cloudflare-super-page-cache-index&swcfpc=1 to see if the pages are check-marked or listed in the URI list.

    Thread Starter filz51

    (@filz51)

    Hi
    Scroll down to the calendar event list on this page https://magiclift.ch/fr/vols-d-altitude/ and pick some URLs to reproduce. I recently flushed everything. Some pages might be cached again but most of them should still be uncached.

    James W.

    (@urbancowboy1994)

    @filz51

    This issue appears to be cookie-related. Head over to wp-admin/options-general.php?page=wp-cloudflare-super-page-cache-index&swcfpc=1, scroll down to Strip response cookies on pages that should be cached, and select Yes. Click Update Settings, head over to Force purge everything, and let me know!

    Plugin Contributor iSaumya

    (@isaumya)

    Hi @filz51,
    First of all, as I can see on your website, you are running this plugin along with the official Cloudflare plugin and APO. Please note that you cannot use both plugins at the same time. If you would like to use APO, disable this plugin and use the official Cloudflare plugin with APO enabled.

    OR

    If you use this plugin, you need to delete the official Cloudflare plugin and disable APO in the Cloudflare Dashboard.

    Screenshot: https://i.imgur.com/Wj2BO7e.png

    Now, coming to the pages being bypassed: I can see that your pages set cookies such as PHPSESSID. Screenshot: https://i.imgur.com/F7VP2xY.png
    You have to stop this cookie from being added. When Cloudflare sees that a page sets any custom cookie, it won’t cache that page, on the assumption that the cookie value might be used inside the page to show data dynamically.

    I personally wouldn’t recommend the Strip response cookies on pages that should be cached option, because it strips the cookies from all pages and might cause issues on pages where you have legitimate cookies that you want to keep. So, if you enable this option, I highly recommend testing every single page of your website to ensure that each one loads and performs exactly as you want.

    The better option is to find the unneeded cookies that are being added everywhere, e.g. PHPSESSID, and make sure they are no longer set.
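
    To see which pages still set cookies, a small script can print the cookie names next to the cf-cache-status header. This is only a minimal sketch and not part of the plugin: the function names cookie_names and inspect_page are made up for illustration, and it uses only the Python standard library.

```python
from urllib.request import Request, urlopen


def cookie_names(set_cookie_values):
    """Return the cookie names found in raw Set-Cookie header values."""
    names = []
    for raw in set_cookie_values:
        # "PHPSESSID=abc123; path=/" -> "PHPSESSID"
        name = raw.split(";", 1)[0].split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def inspect_page(url, timeout=10):
    """Fetch url and return (cf-cache-status, cookie names set by the response)."""
    # Assumption: a generic browser-like user agent is enough for this check.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        status = response.headers.get("cf-cache-status", "(header missing)")
        cookies = cookie_names(response.headers.get_all("Set-Cookie") or [])
    return status, cookies
```

    Calling inspect_page on one of the affected URLs would show whether PHPSESSID or another cookie is still being set whenever the status is BYPASS.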

    I also saw that you are using Varnish on your site. Honestly, if you are using the full Cloudflare cache along with Smart Tiered Caching enabled in your CF dashboard, you won’t need a Varnish cache. It’s just another layer of cache that might cause issues down the line. When you use Cloudflare for caching, it is best to let Cloudflare cache everything and not run multiple page caching systems, as they may cause issues or unexpected results. That’s just my personal take.

    Thread Starter filz51

    (@filz51)

    Hi Saumya
    I appreciate your explanations.
    I disabled APO and Varnish is bypassed now.
    Now I get cf-cache-status: MISS/HIT for visits with Safari and Opera as well.

    But I think there are still other issues:
    When I run the preloader and check some of the listed cached pages, I see that not all of them respond with cf-cache-status: HIT on the 1st visit. The idea of preloading is to get the HIT, right?

    Plugin Contributor iSaumya

    (@isaumya)

    The idea of preloading is to get the HIT, right?

    – Yes, but it doesn’t always work that way, because during preloading the plugin basically accesses the page via cURL. Sometimes Cloudflare doesn’t cache the content if the request comes via cURL.

    So, basically, if your website has a good amount of traffic, you don’t even need to run the preloader. But if your site has low traffic, then you can preload.

    That being said, you should enable “Smart Tiered Caching”.

    Check: https://www.ads-software.com/support/topic/preloading-the-cache-per-edge-server/

    Thread Starter filz51

    (@filz51)

    Hi Saumya

    What results can I expect if I set up my own crawler, written in Python, that requests all HTML documents from the sitemap as a cron job at 1am and 2am, so that by 3am, after the job has run twice, all requested pages return HIT? Can you recommend an alternative to requests via cURL for that purpose?

    Plugin Contributor iSaumya

    (@isaumya)

    Well, you can try, if your server has the resources to handle those bot requests. But then again, if Cloudflare thinks that a request is not coming from an actual browser used by a human, it might decide not to cache the items. Also, make sure you have turned on the Smart Tiered Cache in the CF Dashboard.
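
    One way to check whether such a crawl actually warms the cache is to request every URL twice and look at the cf-cache-status of the second response. The sketch below is only an illustration: cache_status, second_pass_report and the BROWSER_UA string are hypothetical names and values, and it uses the Python standard library instead of cURL. A HIT here only describes the Cloudflare data center that served the request.

```python
from collections import Counter
from urllib.request import Request, urlopen

# Assumption: any realistic browser user agent string can be used here.
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0 Safari/537.36")


def cache_status(url, timeout=10):
    """Return the cf-cache-status header of url, or "NONE" if it is absent."""
    request = Request(url, headers={"User-Agent": BROWSER_UA})
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("cf-cache-status", "NONE").upper()


def second_pass_report(urls, fetch=cache_status):
    """Request every URL twice and tally the status seen on the second pass."""
    for url in urls:
        fetch(url)  # first pass: expected to be MISS and to fill the cache
    statuses = {url: fetch(url) for url in urls}  # second pass: ideally HIT
    not_hit = sorted(u for u, s in statuses.items() if s != "HIT")
    return Counter(statuses.values()), not_hit
```

    The second return value lists the URLs that did not return HIT on the second pass, which are the pages worth checking for cookies or other cache-bypassing headers.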

    Thread Starter filz51

    (@filz51)

    Confused. Which one do you mean in the CF Dashboard? Is it about Smart or Cache?
    > Traffic > Argo > Argo Smart Routing
    or
    > Caching > Tiered Cache > Argo Tiered Cache

    Plugin Contributor iSaumya

    (@isaumya)

    Caching > Tiered Cache > Argo Tiered Cache
    – This one. Argo Smart Routing is a paid product.

    Check this thread for more info: https://www.ads-software.com/support/topic/preloading-the-cache-per-edge-server/

    Thread Starter filz51

    (@filz51)

    Thanks Saumya

    Now I’m running my own crawler twice a day and it’s working as intended. These four Python libraries are needed to run it with a Chrome user agent:
    https://pypi.org/project/requests/
    https://pypi.org/project/beautifulsoup4/
    https://pypi.org/project/fake-useragent/
    https://pypi.org/project/lxml/

    The script works with Python 3.6 for standard formatted sitemaps; I didn’t test it with other versions. Adjust lines 10, 30 and 37.

    import requests
    from bs4 import BeautifulSoup
    from fake_useragent import UserAgent

    # Send a Chrome user agent so the requests look like browser traffic.
    ua = UserAgent()
    headers = {'User-Agent': ua.chrome}

    # Adjust: URL of your sitemap index.
    site_map_index_url = 'https://anywebsite.org/sitemap_index.xml'

    site_urls = []

    def call_url(target_url):
        # The timeout keeps one slow page from blocking the whole crawl.
        return requests.get(target_url, headers=headers, timeout=30)

    def get_xml_tags(target_sitemap_url, target):
        # Fetch a sitemap and return all tags with the given name.
        response = call_url(target_sitemap_url)
        soup = BeautifulSoup(response.text, 'lxml')
        return soup.find_all(target)

    # Collect the sub-sitemap URLs listed in the sitemap index.
    sub_site_map_urls = []
    for sitemap in get_xml_tags(site_map_index_url, 'sitemap'):
        sub_site_map_urls.append(sitemap.find_next('loc').text)

    # Collect the page URLs from every sub-sitemap.
    for sitemap_url in sub_site_map_urls:
        for url in get_xml_tags(sitemap_url, 'url'):
            site_urls.append(url.find_next('loc').text)

    # Request every page once to warm the cache.
    for url in site_urls:
        # Optional: uncomment to skip URLs containing a given expression.
        # if 'expression' in url:
        #     continue
        call_url(url)

    Best

    Plugin Contributor iSaumya

    (@isaumya)

    Thank you @filz51 for sharing your script.

  • The topic ‘Incognito mode with Safari or Opera: BYPASS, Incognito mode with Chrome: HIT’ is closed to new replies.