Amazon bot
-
Hi! Is there a way to block the Amazon bot? I’ve seen a lot of stories saying that it may actually be a human, but my statcounter says otherwise. I get hits from them every day. Is it possible to block them on the frontend?
Thank you!
-
Hi, verityr. How are things?
Unfortunately, I can’t tell which type of bot you are referring to, so I can only offer some general suggestions here.
1. Put “robots.txt” in your site:
User-Agent: Amazon-bot
Disallow: /
For example, google.com has its own robots.txt: https://www.google.com/robots.txt
2. Put “.htaccess” to deny the bot if you know its IP address:
order allow,deny
allow from all
deny from aaa.bbb.ccc.ddd/ee
“aaa.bbb.ccc.ddd/ee” is something like “172.16.0.0/12”.
If you mean the bots from Amazon servers (e.g. AWS), then you can find the IP addresses in https://ip-ranges.amazonaws.com/ip-ranges.json
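As a rough illustration of how the AWS list could be turned into “deny from” lines, here is a minimal Python sketch. It assumes the file has already been downloaded and parsed, and that it keeps its published layout of a “prefixes” list whose entries carry “ip_prefix” and “service” keys. The function name aws_deny_lines and the sample data are made up for this example; they are not part of any plugin or of the real file.

```python
import ipaddress
from typing import List, Optional


def aws_deny_lines(ip_ranges: dict, service: Optional[str] = None) -> List[str]:
    """Build "deny from" lines from parsed ip-ranges.json data (IPv4 only)."""
    networks = set()
    for entry in ip_ranges.get("prefixes", []):
        # Optionally keep only one service, e.g. "EC2".
        if service is not None and entry.get("service") != service:
            continue
        networks.add(ipaddress.ip_network(entry["ip_prefix"]))
    # collapse_addresses merges adjacent and overlapping ranges,
    # which keeps the resulting .htaccess shorter.
    collapsed = ipaddress.collapse_addresses(sorted(networks))
    return [f"deny from {net}" for net in collapsed]


# Hand-made sample in the same shape as the published file (NOT real AWS data).
sample = {
    "prefixes": [
        {"ip_prefix": "198.51.100.0/25", "region": "example-1", "service": "EC2"},
        {"ip_prefix": "198.51.100.128/25", "region": "example-1", "service": "EC2"},
        {"ip_prefix": "203.0.113.0/24", "region": "example-2", "service": "S3"},
    ]
}

htaccess_lines = ["order allow,deny", "allow from all"]
htaccess_lines += aws_deny_lines(sample, service="EC2")
print("\n".join(htaccess_lines))
```

With the real file you would load it with json.load first and pass the result in. The full list is long, so filtering by service or region is probably worth considering before pasting anything into .htaccess.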
Otherwise, you can try the beta version of this plugin. I’m willing to support you if you want to try it.
Thanks.
I uploaded your new plugin to my site. I don’t see any differences between the version I had and the beta version. How can I tell I’m using the new one?
I also added the info to my robots file. Yesterday I tried to add to the htaccess file and my site went down. ??
Hi verityr,
Yesterday I tried to add to the htaccess file and my site went down. ??
Hmm… I assume your server is Apache. You can search for how to set up .htaccess to deny a specific IP address range.
How can I tell I’m using the new one?
Sorry, I forgot to change the version number. The plugin dashboard shows the version of this plugin as 2.2.7, but internally it’s 3.0.0b4 ^^;
You can find the “Front-end target settings (beta)” section on the Settings tab of this plugin. Please enable “Block by country” at “Public facing pages”.
And please refer to some documents at https://www.ipgeoblock.com/codex/#blocking-on-front-end
I see the beta plugin is installed now. I’m having a roadblock. When I enabled “block by country”, I’m getting error messages that the htaccess file cannot be written to. I changed the permissions and it’s still not working. I assume that’s a problem on my end? I’ve emailed my hosting company.
I called my host and a techie went into my site and made sure permissions were set, etc. The error message I’m getting when I try to save is that there is no htaccess file in five different folders it’s trying to write to.
You don’t need to care about .htaccess when you enable “Block by country” at “Front-end target settings (beta)”. I think that at this time you enabled “Force to load WP core” at “Plugins area” and “Themes area”, which need to set .htaccess.
In order to clarify your issue, please uncheck “Force to load WP core” if you still have a permission issue.
Then we should consider how to configure “Matching rule” and “UA string and qualification” properly to block your target, e.g. “Amazon bot”. At least, I need the IP address and the user agent string of “Amazon bot”. So could you tell me more details about “Amazon bot”? The access log is fine if you can get it. Then I can tell you the right configuration.
Hi,
“Force to load WP Core” in the Themes area is NOT checked.
“Blocked by country” under Front-End settings IS checked so perhaps it did save but it’s still attempting to write to the htaccess files in those 5 folders. Since “Blocked by country” is checked, does not that mean that certain countries cannot view my site at this time?
Thank you.
Hi verityr,
“Force to load WP Core” in the Themes area is NOT checked.
And also in the Admin area, am I right?
but it’s still attempting to write to the htaccess files in those 5 folders.
This is curious. This plugin does not write .htaccess into 5 folders! Could you post the exact message that this plugin shows?
Thanks.
That’s correct.
Unable to write /homepages/30/d560012174/htdocs/clickandbuilds/wp-content/plugins/.htaccess, /homepages/30/d560012174/htdocs/clickandbuilds/wp-content/themes/.htaccess, /homepages/30/d560012174/htdocs/clickandbuilds/wp-includes/.htaccess, /homepages/30/d560012174/htdocs/clickandbuilds/wp-content/uploads/.htaccess, /homepages/30/d560012174/htdocs/clickandbuilds/wp-content/languages/.htaccess. Please check permission.
OK, thanks. I’ll check the code. Please give me some time.
Since “Blocked by country” is checked, does not that mean that certain countries cannot view my site at this time?
It depends on the “Matching rule”:
1. If you select Follow “Validation rule settings”, then access from specified countries in “Validation rule settings” will be blocked.
2. If you select “Whitelist” or “Blacklist”, then you can specify distinct countries in “Whitelist (Blacklist) of country code”. But if you leave it empty, “Block by country” has no effect.
Well, I should improve the wording of these terms.
By the way, could you tell me what type of bot you want to block?
Take your time, I know the plugin is in beta.
I get a lot of hits from amazon.com and amazonaws.com. I assume they’re the amazon bot I’ve heard about. I also get a lot of hits from Digital Ocean that also seem to be a bot.
Thank you kindly!
Next, we need to plan a strategy for blocking specific bots. For example, block by country, block by IP address, block by user agent string, or a combination of these.
You can put some rules into the text box “UA string and qualification” as your strategy.
So could you provide more detailed information about the bots, such as IP addresses and user agent strings? I can make a strategy and a rule set from this information.
Thanks.
Unfortunately that could take a while, unless there is an easier way to do it. I would have to go down every hit, one by one. I forgot to add to that list Microsoft Corp. I can tell you that all of these bots are coming from the USA. Here’s some IPs:
23.96.208.78 Microsoft
54.173.35.122 amazonaws.com
50.18.94.121 amazonaws.com
52.38.247.117 amazonaws.com
45.55.177.156 Digital Ocean
45.55.199.243 Digital Ocean
Those IPs are just a few that happened yesterday.
Hi verityr,
Thank you for the info.
First, sorry to bother you, but I should ask you to update this plugin from here, which can reduce the “5 directories” issue to “2 directories”.
Then you can configure this plugin at “Front-end target settings (beta)” as follows:
– Enable “Block by country” at “Public facing pages”.
– Select “Whitelist” at “Matching rule”.
– Make sure “Whitelist of country code” is empty.
And then set the following IP addresses at “Validation rule settings”:
1. 23.96.208.78 Microsoft: I think access from this IP is using Microsoft Azure Cloud Services. If you want to deny this access, you can put 23.96.0.0/16 into “Blacklist of extra IP addresses prior to country code”.
2. 54.173.35.122 / 50.18.94.121 / 52.38.247.117 amazonaws.com: I think accesses from AWS are bots in most cases. For example, “linkdexbot” from 54.173.161.229 comes to my site. It’s useless for me but almost harmless to my site because the frequency is very low. But if you want to block these IPs, put 54.172.0.0/15, 50.18.0.0/16 and 52.36.0.0/14 into “Blacklist of extra IP addresses prior to country code”.
3. Other amazonaws.com: If we want to block all access from AWS, the easiest way is to check whether the result of a reverse DNS lookup includes the string “amazonaws.com”. IGB 3.0.0b can do reverse DNS lookups, but it does not currently offer this check, so I need some time to implement it.
4. Digital Ocean: Accesses from this IP might use the cloud service of Digital Ocean. If you want to block all the accesses from this service, put 45.55.128.0/18 into “Blacklist of extra IP addresses prior to country code”.
I think that the above examples are only the tip of the iceberg. In general, it’s difficult to distinguish between bad bots and good (or harmless) bots, or even human accesses to the front-end.
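Before pasting ranges into the blacklist, it may be worth confirming that each reported address really falls inside one of them. The sketch below uses only Python’s standard ipaddress module with the IPs and ranges quoted in this thread; the names is_covered and uncovered are made up for the example. One thing it turns up: 45.55.128.0/18 spans 45.55.128.0 to 45.55.191.255, so it contains 45.55.177.156 but not 45.55.199.243, and that second address would need an additional or wider range.

```python
import ipaddress
from typing import Dict, List

# CIDR ranges suggested in the reply above, keyed by the label used in the thread.
SUGGESTED_RANGES: Dict[str, List[str]] = {
    "Microsoft": ["23.96.0.0/16"],
    "amazonaws.com": ["54.172.0.0/15", "50.18.0.0/16", "52.36.0.0/14"],
    "Digital Ocean": ["45.55.128.0/18"],
}

# IP addresses reported earlier in the thread.
REPORTED_IPS: Dict[str, str] = {
    "23.96.208.78": "Microsoft",
    "54.173.35.122": "amazonaws.com",
    "50.18.94.121": "amazonaws.com",
    "52.38.247.117": "amazonaws.com",
    "45.55.177.156": "Digital Ocean",
    "45.55.199.243": "Digital Ocean",
}


def is_covered(ip: str, cidrs: List[str]) -> bool:
    """Return True if ip lies inside at least one of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


# Collect the reported addresses that none of the suggested ranges would block.
uncovered = [
    ip for ip, owner in REPORTED_IPS.items()
    if not is_covered(ip, SUGGESTED_RANGES[owner])
]

for ip, owner in REPORTED_IPS.items():
    status = "covered" if ip not in uncovered else "NOT covered"
    print(f"{ip:<16} {owner:<14} {status}")
```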
So I’d appreciate it if you could provide any other examples, and tell me why and what type of bots you want to block.
Thanks.
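For readers curious about the reverse DNS idea mentioned in the reply above, here is a minimal standalone Python sketch. It is not how the plugin does it (the reply says the plugin does not have this check yet), and the function names are made up for the example. It matches on the hostname suffix rather than on a plain substring, so that a name like “amazonaws.com.example.org” is not mistaken for AWS.

```python
import socket
from typing import Optional


def hostname_matches(hostname: str, suffix: str = "amazonaws.com") -> bool:
    """Return True if hostname is the suffix itself or a subdomain of it."""
    host = hostname.rstrip(".").lower()
    return host == suffix or host.endswith("." + suffix)


def reverse_dns(ip: str) -> Optional[str]:
    """Look up the PTR name for ip, or return None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def looks_like_aws(ip: str) -> bool:
    """Reverse-resolve ip and test the result against amazonaws.com."""
    hostname = reverse_dns(ip)
    return hostname is not None and hostname_matches(hostname)


# Illustrative hostnames only; no DNS query is made for these checks.
print(hostname_matches("ec2-54-173-35-122.compute-1.amazonaws.com"))
print(hostname_matches("amazonaws.com.example.org"))
```

A reverse lookup costs a DNS round trip per visitor and the PTR record is controlled by whoever owns the address, so in practice the result would probably need caching and should be treated as a hint rather than proof.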
I uploaded the new plugin. Do you think that Digital Ocean and Microsoft cloud hits are coming from human beings? I’m always suspicious of the generic browser symbols coming with these hits.