• Hello guys. Over the last few weeks I’ve noticed visits from a particularly annoying spider named “spbot”, which scanned hundreds of pages without a single delay between requests.

    It has its own official webpage: https://openlinkprofiler.org/bot

    And it looks like it obeys robots.txt instructions:

    User-agent: spbot
    Disallow: /

    May you consider adding it please?

    Thanks in advance

Viewing 2 replies - 1 through 2 (of 2 total)
  • @kent-brockman

    Assuming this is about adding a blocking rule to the HackRepair.com blacklist: that list is meant as a starting point … Bots come and go; the list cannot include every bot all the time …
    Try contacting the author of the HackRepair.com blacklist (Jim Walker).
    Once you succeed in getting it into his master list, iThemes will probably follow.
    spbot doesn’t look like a bad bot. It respects the robots.txt file … so it may not qualify for inclusion.

    Anyway, you can block any additional bot(s) in the Ban User Agents field.
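    If you’d rather block it at the web-server level yourself, a rule along these lines in .htaccess should work — just a sketch, assuming an Apache server with mod_rewrite enabled (this is not the exact rule the plugin generates):

    ```apache
    # Hypothetical .htaccess sketch: return 403 Forbidden to any request
    # whose User-Agent header contains "spbot" (case-insensitive).
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} spbot [NC]
    RewriteRule .* - [F,L]
    </IfModule>
    ```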

    Something else: Are you sure it’s the official spbot crawling your site? Got this from their site:

    How can I verify spbot is really spbot?

    Other web crawlers can spoof the spbot user agent to make themselves seem legitimate. They may appear to come from us, but they don’t. You can verify the IP addresses to make sure that the spbot visiting your site is actually from OpenLinkProfiler.org. We’re currently using the hosting company Digital Ocean for our crawlers.

    A list of IP addresses is included on their site.
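    If you want to check visiting addresses against that list programmatically, a minimal Python sketch could look like this — note the networks below are RFC 5737 placeholder ranges, not the real addresses published on their site:

    ```python
    import ipaddress

    # Placeholder ranges — replace with the IP list actually published
    # at openlinkprofiler.org/bot; these are illustrative values only.
    SPBOT_NETWORKS = [
        ipaddress.ip_network("198.51.100.0/24"),  # RFC 5737 example range
        ipaddress.ip_network("203.0.113.0/24"),   # RFC 5737 example range
    ]

    def is_listed_spbot(ip: str) -> bool:
        """Return True if the visiting IP falls inside a published spbot range."""
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in SPBOT_NETWORKS)

    print(is_listed_spbot("203.0.113.7"))  # True: inside a listed range
    print(is_listed_spbot("192.0.2.1"))    # False: not in the list
    ```

    You could run a check like this against the IPs in your access log before deciding whether a visitor claiming to be spbot is genuine.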

    Thread Starter Marcelo Pedra

    (@kent-brockman)

    Hello @pronl, yes, I’m pretty sure it’s the right bot, as the origin IPs match the ones they list at that URL under the “How can I verify spbot is really spbot?” section.

    I have seen this bot scanning lots of sites since April, and although it does so during night hours, it’s still an annoyance because it is very aggressive, leaving no delay at all between scanned URLs. It has scanned hundreds of URLs across several sites on the same server for hours, skyrocketing CPU and RAM usage. That’s not fair. That’s behaviour worth blocking. And I did block it using your feature, but I certainly can’t implement the same for all our customers.

    Ok: I will contact Jim Walker regarding this.

    Thanks for the info.
    All the best,

    Marcelo

  • The topic ‘suggestion: another bot to be added’ is closed to new replies.