• Resolved EdKrasnov

    (@alshoker)


    Hello, I’d be grateful for any help!
    1. How to change the order in which “LiteSpeed Cache Crawler” crawls sitemap files
    The problem is that after launch, the crawler started crawling my sitemap files from the very first one. The site has more than 100,000 pages, and there is no urgent need for me to cache pages dating back to 2007. It would be very nice to have the crawler start from the latest sitemap file. How can I do that?

    2. Sitemap files are generated by the Yoast SEO plugin. Each file contains 1000 links. Sitemap files sometimes take about 20-30 seconds to open.
    Sample files: https://giport.ru/sitemap_index.xml (common file) https://giport.ru/post-sitemap97.xml (group of links)

    Search robots surely do not wait for these files to open and simply leave. Is there a way to speed up the loading of these files? Would caching them help? Or can the files be split into groups of 300-500 links?

    • This topic was modified 2 years, 10 months ago by EdKrasnov.


Viewing 4 replies - 1 through 4 (of 4 total)
  • Thread Starter EdKrasnov

    (@alshoker)

    It looks like nobody has run into a similar problem… (

    Plugin Support qtwrk

    (@qtwrk)

    Hi,

    1) Sadly, it can’t. The crawler follows the sitemap exactly as it finds it.

    2) That’s a question for Yoast. Yes, caching would help, but it could also lead to stale sitemap data.

    Best regards,

    Thread Starter EdKrasnov

    (@alshoker)

    qtwrk, Thank you for your reply!
    1. In LSCache, the crawler settings let you choose how often the crawler re-crawls the sitemap pages. Why not add the ability to skip re-crawling old sitemap pages that don’t change and focus on new pages? Or at least start processing the files not from the beginning but from the end (the newer files)?
    2. The cache lifetime and the frequency of crawling sitemap files can be configured. For example, news and articles from 3, 5, or 7 years ago rarely change; it is more important to keep the sitemap up to date for articles no older than 6 years.

    Plugin Support qtwrk

    (@qtwrk)

    I will pass this to our devs as a feature request.

    As a quick workaround, you can run this script separately and independently to crawl the sub-sitemap files directly.

    For example, if your sitemap is split by year, you can have it crawl only the latest year’s sitemap without going through all the sitemaps.
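
The workaround above could be approximated with a small standalone script. The sketch below is not the script referred to in the reply (its link did not survive in this thread), and the function names `extract_locs`, `newest_first`, `fetch`, and `warm` are hypothetical. It assumes the sitemap index lists older sub-sitemaps first, as the numbered Yoast files in this thread suggest, and that a plain GET request is enough to populate the page cache, which may not hold if the cache varies on cookies or user agent.

```python
import time
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_data):
    """Return every <loc> value from a sitemap or sitemap index (str or bytes)."""
    root = ET.fromstring(xml_data)
    return [el.text.strip() for el in root.iterfind(".//sm:loc", SITEMAP_NS) if el.text]


def newest_first(sub_sitemaps, limit=1):
    """Assumption: the index lists older files first, so take entries from the end."""
    return list(reversed(sub_sitemaps))[:limit]


def fetch(url, timeout=60):
    """GET a URL and return the raw response body."""
    req = urllib.request.Request(url, headers={"User-Agent": "sitemap-warmer-sketch"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def warm(index_url, limit=1, delay=0.5):
    """Request every page listed in the newest `limit` sub-sitemaps."""
    for sub in newest_first(extract_locs(fetch(index_url)), limit):
        for page in extract_locs(fetch(sub)):
            try:
                fetch(page)
            except (urllib.error.URLError, TimeoutError) as exc:
                print(f"skipped {page}: {exc}")
            time.sleep(delay)  # pause between requests to avoid overloading the server
```

Calling `warm("https://giport.ru/sitemap_index.xml", limit=1)` would request only the pages in the last sub-sitemap of the index. The `delay` and `timeout` values are placeholders to tune for the server.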

  • The topic ‘LiteSpeed Cache Crawler sitemap bypass order and long sitemap loading’ is closed to new replies.