• Resolved greenrhyno

    (@greenrhyno)


    When the blog feed of my site is checked with the W3C RSS Feed Validator at:

    https://validator.w3.org/feed/check.cgi

    and the Anti-Crawler option is turned on, the result shows:

    It looks like this is a web page, not a feed. I looked for a feed associated with this page, but couldn’t find one. Please enter the address of your feed to validate.

    I would like to somehow whitelist this service but I do not know how to do so.

    In the Anti-Crawler Description it says:

    To enable/disable, open the Advanced settings, and turn on/off “Block by User-Agent”.

    Yet when I open the Advanced settings there is no option to turn “Block by User-Agent” on or off; the words “User-Agent” do not appear anywhere in them. I love this plugin and it is very helpful, but much of the troubleshooting and whitelisting tooling is confusing, and the documentation is just not good.

    How can I fix this so that whatever bot is being blocked from the W3C validator no longer gets blocked?

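    One way to reproduce the symptom outside the validator is to request the feed with a non-browser User-Agent and see whether XML or an HTML challenge page comes back. A minimal sketch, assuming PHP with the curl extension, run from the command line; the feed URL and the User-Agent string below are placeholders, not the validator’s real values:

    <?php
    // Fetch the feed the way a bot would and report what came back.
    // Placeholders: swap in your own feed URL; the UA string is made up
    // and does not claim to match the W3C validator's real User-Agent.
    $url = 'https://example.com/feed/';
    $ua  = 'SomeFeedFetcher/1.0 (+https://example.com/bot)';

    $ch = curl_init( $url );
    curl_setopt_array( $ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => $ua,
        CURLOPT_FOLLOWLOCATION => true,
    ) );
    $body = (string) curl_exec( $ch );
    curl_close( $ch );

    // A feed starts with an XML prolog or an <rss>/<feed> element;
    // the SpamFireWall challenge page is ordinary HTML instead.
    if ( preg_match( '/<\?xml|<rss|<feed/i', substr( $body, 0, 200 ) ) ) {
        echo "Looks like a feed.\n";
    } else {
        echo "Got an HTML page instead of a feed (likely blocked).\n";
    }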

  • Plugin Support katereji

    (@katereji)

    Hello.
    We have removed the ‘Block by User-Agent’ checkbox in the latest version, 5.152.5, of the Anti-Spam plugin; the option is now turned on by default.
    To whitelist a specific bot we need to know its IP. You can find it in your logs here: https://cleantalk.org/my/show_sfw
    But if you are sure that the issue is caused by the Anti-Crawler, you can turn it off: WordPress Admin Page —> Settings —> Anti-Spam by CleanTalk —> Anti-Crawler.
    Did it help?
    Be well.
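    For anyone who prefers to flip that switch without the admin UI, a rough sketch follows. Both the option name cleantalk_settings and the key sfw__anti_crawler are assumptions inferred from inspecting a plugin install, not a documented API; verify both against your own wp_options table before relying on this:

    <?php
    // Rough sketch, NOT an official CleanTalk API. Run inside WordPress
    // (e.g. via `wp eval-file`). The option name 'cleantalk_settings' and
    // the key 'sfw__anti_crawler' are assumptions; check your wp_options
    // row first.
    $settings = get_option( 'cleantalk_settings', array() );

    if ( is_array( $settings ) && array_key_exists( 'sfw__anti_crawler', $settings ) ) {
        $settings['sfw__anti_crawler'] = 0; // assumed: 0 = off, 1 = on
        update_option( 'cleantalk_settings', $settings );
    }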

  • adweb2021

    (@adweb2021)

    Hello,
    I have the same trouble…
    You’ll find the debug log below:

    ArrayObject Object
    (
        [storage:ArrayObject:private] => Array
        (
            [12:50:31_ACTION__FUNCTION_] => Array
            (
                [0] => Failed.
                Query: ALTER TABLE kra2019_cleantalk_sfw_logs
                CHANGE status status ENUM('PASS_SFW','DENY_SFW','PASS_SFW_BY_WHITELIST','PASS_SFW_BY_COOKIE','DENY_ANTICRAWLER','DENY_ANTIFLOOD') NOT NULL AFTER ip;
                Error: Unknown column 'status' in 'kra2019_cleantalk_sfw_logs'
            )

            [07:27:38_ACTION__FUNCTION_] => Array
            (
                [0] => Failed.
                Query: ALTER TABLE kra2019_cleantalk_sfw_logs
                CHANGE status status ENUM('PASS_SFW','DENY_SFW','PASS_SFW__BY_WHITELIST','PASS_SFW__BY_COOKIE','DENY_ANTICRAWLER','PASS_ANTICRAWLER','DENY_ANTIFLOOD','PASS_ANTIFLOOD') NOT NULL AFTER ip;
                Error: Unknown column 'status' in 'kra2019_cleantalk_sfw_logs'
            )
        )
    )

    When I deactivate the plugin, the problem goes away.
    If I turn off the Firewall option it is better, but not completely…
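    The queries in that log fail because ALTER TABLE … CHANGE can only modify a column that already exists, and kra2019_cleantalk_sfw_logs has no status column yet. A minimal manual-repair sketch, assuming database access through $wpdb and reusing the ENUM definition copied from the failing query; back up the table first, as this is not an official CleanTalk fix:

    <?php
    // Manual repair sketch (not an official CleanTalk fix). The log shows
    // ALTER TABLE ... CHANGE failing because `status` does not exist, so
    // ADD the column instead. Back up the table before running this.
    global $wpdb;
    $table = $wpdb->prefix . 'cleantalk_sfw_logs'; // 'kra2019_' prefix in the log

    $column = $wpdb->get_results(
        $wpdb->prepare( "SHOW COLUMNS FROM {$table} LIKE %s", 'status' )
    );

    if ( empty( $column ) ) {
        // ENUM values copied verbatim from the failing query in the log.
        $wpdb->query(
            "ALTER TABLE {$table} ADD COLUMN status
             ENUM('PASS_SFW','DENY_SFW','PASS_SFW__BY_WHITELIST','PASS_SFW__BY_COOKIE',
                  'DENY_ANTICRAWLER','PASS_ANTICRAWLER','DENY_ANTIFLOOD','PASS_ANTIFLOOD')
             NOT NULL AFTER ip"
        );
    }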

  • Plugin Support amagsumov

    (@amagsumov)

    Hello, @adweb2021

    According to your data, the issue is caused by the “Anti-Flood” and “Anti-Crawler” options. You can disable them in the plugin settings and check the results.

    If you know exactly which IPs are being blocked, please add them to your SpamFireWall whitelist here:
    https://cleantalk.org/my/support/open

    It should solve the issue.
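    If you are not sure which IPs to whitelist, one low-tech way to find out is a tiny must-use plugin that logs the IP and User-Agent of every feed request. A minimal sketch, assuming you can drop a file into wp-content/mu-plugins/; template_redirect, is_feed(), and error_log are standard WordPress/PHP. Note that requests the firewall blocks before WordPress loads will not show up here:

    <?php
    /**
     * Plugin Name: Feed request logger (diagnostic sketch)
     * Drop this file into wp-content/mu-plugins/ and watch the PHP error
     * log to see which IPs and User-Agents fetch your feeds. Remove it
     * when you are done; it logs on every feed request.
     */
    add_action( 'template_redirect', function () {
        if ( is_feed() ) {
            error_log( sprintf(
                'Feed request: ip=%s ua=%s uri=%s',
                $_SERVER['REMOTE_ADDR'] ?? '-',
                $_SERVER['HTTP_USER_AGENT'] ?? '-',
                $_SERVER['REQUEST_URI'] ?? '-'
            ) );
        }
    } );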

  • markcanada

    (@markcanada)

    Actually, we have the same issue.
    This could be fixed if there were a way to exclude RSS feeds from the Anti-Crawler.
    I tried entering the URL into “URL exclusions”, but that didn’t help.

  • Plugin Support amagsumov

    (@amagsumov)

    @greenrhyno, @adweb2021, @markcanada

    According to the W3C documentation, the validator uses IPs from the 128.30.52.0/24 subnet:
    https://validator.w3.org/services

    Please add this subnet to your SpamFireWall whitelist here:
    https://cleantalk.org/my/show_private?service_type=spamfirewall

    After that, press the “Synchronize with cloud” button in the plugin settings to synchronize the plugin with the cloud.

    Did it help?
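    For anyone unsure what the /24 means: it covers the 256 addresses from 128.30.52.0 through 128.30.52.255. A minimal sketch of the membership test, using only standard PHP (ip2long and bit masking); the sample IPs are illustrative:

    <?php
    // Checks whether an IPv4 address falls inside a CIDR block such as
    // 128.30.52.0/24 (i.e. 128.30.52.0 through 128.30.52.255).
    function ip_in_cidr( string $ip, string $cidr ): bool {
        list( $subnet, $bits ) = explode( '/', $cidr );
        $mask = -1 << ( 32 - (int) $bits ); // /24 -> 0xFFFFFF00
        return ( ip2long( $ip ) & $mask ) === ( ip2long( $subnet ) & $mask );
    }

    var_dump( ip_in_cidr( '128.30.52.73', '128.30.52.0/24' ) ); // bool(true)
    var_dump( ip_in_cidr( '128.30.53.1',  '128.30.52.0/24' ) ); // bool(false)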

  • Plugin Support amagsumov

    (@amagsumov)

    We have also added the subnet 128.30.52.0/24 to our global whitelists, so there should be no more blocks.

    Just press the “Synchronize with cloud” button in the plugin settings.

    Thank you.

  • markcanada

    (@markcanada)

    Thank you, but that only works for this one tester. There are many validators out there, and many automated RSS retrieval systems, such as the e-mail software that pulls feeds for newsletters (MailChimp, ActiveCampaign, …), and their IP addresses can change.
    I assume even some desktop RSS readers may have issues, because they get a web page instead of a feed.
    We have had to disable the Anti-Crawler for now. Even though we don’t want bots crawling the main site, the RSS feed is precisely the thing that should be “crawled”, and until it can be excluded for every automated reader we can’t use the Anti-Crawler.
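    For the record, here is what such an exclusion could look like if the plugin exposed a hook for it. The filter name apbct_skip_firewall is entirely hypothetical; nothing in this thread confirms CleanTalk offers it. Only the feed detection itself (the /feed path and the ?feed query var are WordPress’s standard feed endpoints) is solid:

    <?php
    // Conceptual sketch only. The filter name 'apbct_skip_firewall' is
    // HYPOTHETICAL -- this thread does not confirm CleanTalk provides such
    // a hook. It illustrates what "exclude the feed from the Anti-Crawler"
    // would mean: detect feed requests early (before the main WP query
    // runs, so is_feed() is not yet usable) and skip the check for them.
    add_filter( 'apbct_skip_firewall', function ( $skip ) {
        $uri = $_SERVER['REQUEST_URI'] ?? '';
        if ( false !== strpos( $uri, '/feed' ) || isset( $_GET['feed'] ) ) {
            return true; // skip blocking for feed requests
        }
        return $skip;
    } );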

  • Plugin Support amagsumov

    (@amagsumov)

    @markcanada

    Most well-known services are already whitelisted, and we are constantly adding new ones.

    Anyway, you can keep the Anti-Crawler option disabled. It doesn’t affect the anti-spam functionality.

  • The topic ‘Anti-Crawler turned on causes W3C validator to fail.’ is closed to new replies.