• So this isn’t so much a support request as me trying to get the word out about what’s happening to me. Based on a lot of research and log reading, the only conclusion I can come to is that the site I manage is being attacked by some kind of botnet. The attack consists of flooding WP with requests for every monthly archive we have until the server resources are totally overloaded and the server needs to be restarted.

    I have two 8-core dedicated servers running the site, one for apache and one for mysql. We get ~11k hits per day normally, and the load during normal traffic is around 1.5-3 (the db server has really low load most of the time but spikes to around 4 during updates and the like; that’s across 8 cores though, so per core it’s a load of about 0.5).

    We have 49,878 posts in the db, which means that any non-cached pageload is incredibly slow and intense, and that these bots showing up and asking for every /2008/05 type url all at once bring the site to its knees. I have monit ( https://mmonit.com/monit/ ) installed on the server, so when the load spikes up to 15 it restarts apache, which stops the whole thing from going down for an hour but also causes tons of insane behavior (lost posts, murdered pageloads etc).

    This has been going on for months. I’ve been working around it with Monit and other optimizations, but at the core my problem is that every 5 minutes or so a DIFFERENT computer comes by and loads all the month pages at once. I’ve been tracking the IPs, and they change each time it happens, as well as being in totally different parts of the world and in different kinds of organizations. My suspicion is that they are just infected Windows machines.

    Here are a couple of examples from my apache logs. Note that in all cases I get 20-30 similar requests for different months all at once from the same IP/user-agent:

    160.43.250.36 - - [23/Mar/2009:12:59:09 -0400] "GET /2008/03/ HTTP/1.1" 200 128651 "-" "Mozilla/4.0 (compatible;)"
    198.24.6.168 - - [23/Mar/2009:15:06:29 -0400] "GET /2009/02/ HTTP/1.1" 200 126646 "-" "Mozilla/4.0 (compatible;)"
    198.151.13.8 - - [23/Mar/2009:13:38:14 -0400] "GET /2004/12/ HTTP/1.1" 200 107751 "https://globalvoicesonline.org/2008/02/25/barbados-carbon-footprint/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; InfoPath.1)"
    164.114.248.33 - - [23/Mar/2009:17:14:22 -0400] "GET /2008/06/ HTTP/1.1" 200 125667 "https://globalvoicesonline.org/2009/02/14/barbados-trinidad-tobago-clico-questions/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.04506.30)"
    204.184.29.2 - - [24/Mar/2009:12:13:39 -0400] "GET /2007/02/ HTTP/1.1" 200 120982 "https://globalvoicesonline.org/2009/01/21/japan-coming-of-age-in-2009/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
    66.210.188.149 - - [23/Mar/2009:17:14:30 -0400] "GET /2008/12/ HTTP/1.1" 200 122549 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
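
For anyone wanting to check their own logs for the same pattern, here is a minimal Python sketch. It assumes the Apache "combined" log format shown in the samples above; the function names, the 10-hit threshold and the 60-second window are my own illustrative choices, not anything from the original setup.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Matches the Apache "combined" log format used in the samples above.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)
# A month archive URL such as /2008/05/ (post URLs have more path after it).
MONTH_RE = re.compile(r'^/\d{4}/\d{2}/?$')


def month_hits(lines):
    """Yield (ip, timestamp) for every month-archive request in the log."""
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m and MONTH_RE.match(m.group("path")):
            ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
            yield m.group("ip"), ts


def find_bursts(lines, min_hits=10, window=timedelta(seconds=60)):
    """Return {ip: largest hit count inside any window} for bursty IPs."""
    by_ip = defaultdict(list)
    for ip, ts in month_hits(lines):
        by_ip[ip].append(ts)

    bursts = {}
    for ip, stamps in by_ip.items():
        stamps.sort()
        start = 0
        best = 0
        # Sliding window over the sorted timestamps for this IP.
        for end in range(len(stamps)):
            while stamps[end] - stamps[start] > window:
                start += 1
            best = max(best, end - start + 1)
        if best >= min_hits:
            bursts[ip] = best
    return bursts
```

Usage would be something like `find_bursts(open("access.log"))` (the file name is a placeholder). Because each burst comes from a different IP, this only helps confirm the pattern; it is not a basis for an IP blocklist.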

    One of the weirdest things is how they mostly have the “Mozilla/4.0 (compatible;)” user-agent, but in a lot of cases they have all kinds of weird mixes of MSIE6, InfoPath etc. It seems like maybe this is just because whatever spyware is coordinating the attack is using the default browser on the owned box, thus registering whatever user-agent the box would normally give when someone was browsing.

    Given the patterns, though, I don’t see how it could possibly be real traffic, and I don’t see any reason to think that it could be a real search engine crawler or anything, especially given how many different places it’s coming from.

    SO: Has anyone seen anything like this before? Any ideas?

    I’m going to work on apache/.htaccess methods for curtailing this, but it’s really hard because it looks so much like real traffic. I’ll probably be spending some time with mod_evasive to see if that can help.

Viewing 5 replies - 1 through 5 (of 5 total)
  • Thread Starter Jer Clarke

    (@jeremyclarke)

    Okay. In case anyone ever reads this and wants to know what I did.

    It was really hard to work out a solution because the urls were so similar to our post urls (/2008/05/…), so I basically just gave up on month archives altogether.

    What I did was create an action function that fires if you are on a month page (before the query is done) and shows you a warning page that says basically “click this button to see the month archive”. This way the bots loading months don’t ever actually access the database, and so they don’t crash my server. When people press the button it adds a $_POST value that I then check for in this function, so the page loads normally. I also hacked the pulldown menu I use to list all months so it sends a similar value. This way if people come to the site and use the <select> to choose a month it works, but if they come directly to a month url (by changing the url, which I know some people do) they need to do a tiny bit of captcha-style testing before they can see the content.
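
The decision logic described above can be sketched roughly as follows. This is a language-neutral illustration written in Python, not the actual WordPress action function from the post; the field name `show_month_archive` and all function names are hypothetical.

```python
import html
import re

# A month archive URL such as /2008/05/ (post URLs continue past the month).
MONTH_ARCHIVE_RE = re.compile(r'^/\d{4}/\d{2}/?$')
CONFIRM_FIELD = "show_month_archive"  # hypothetical form field name


def is_month_archive(path):
    """True for month archive paths, False for posts and everything else."""
    return bool(MONTH_ARCHIVE_RE.match(path))


def should_run_query(method, path, form):
    """Decide whether the expensive database query should run.

    Non-archive pages always load. Month archives load only when the
    visitor arrived by submitting the confirmation form.
    """
    if not is_month_archive(path):
        return True
    return method == "POST" and form.get(CONFIRM_FIELD) == "1"


def interstitial_html(path):
    """Cheap static page shown instead of the archive; no database access."""
    safe_path = html.escape(path, quote=True)
    return (
        f'<form method="post" action="{safe_path}">'
        f'<input type="hidden" name="{CONFIRM_FIELD}" value="1">'
        '<button type="submit">Show this month archive</button>'
        '</form>'
    )
```

The point of the design is that a plain GET to a month url costs almost nothing to answer, while a person who clicks the button (or uses a month pulldown that posts the same field) still gets the real page.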

    Not ideal, and not something I’d want to do if we had ever advertised our month urls, but luckily our site has never had a UI that listed months directly, so anyone coming to those urls is ‘guessing’ them on some level.

    Good luck if you’re in a similar boat. I’m glad as hell the bot isn’t attacking our category pages or I’d be totally screwed.

    Way over my head, but good luck to you. Your solution sounds pretty innovative, even if it is sort of a workaround rather than a direct solution. Nice work.

    I am having this same problem! I have such a modest, unused blog.. I don’t get it! They aren’t accessing my archives though, just my normal posts and pages.

    Any other solutions? A plugin?

    Jeremy, do you have this kind of code in your HTML header?:


    <link rel='archives' title='February 2009' href='https://yourblog.com/2009/02/' />
    <link rel='archives' title='January 2009' href='https://yourblog.com/2009/01/' />
    <link rel='archives' title='December 2008' href='https://yourblog.com/2008/12/' />
    <link rel='archives' title='November 2008' href='https://yourblog.com/2008/11/' />
    <link rel='archives' title='October 2008' href='https://yourblog.com/2008/10/' />

    If this is your case, maybe the problem is that badly behaved browsers are fetching all these links at once after finding them inside the HTML header.

    I have these links in my header and I am seeing the pattern that you describe in my logs.

    Try removing the call to wp_get_archives() in your theme (file header.php). I have removed it. Let’s see if I stop seeing the ‘attack’ in my logs.
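
To check whether a page still emits these header links after editing the theme, a small Python sketch like this could be run against the page source. It uses only the standard library `html.parser` module; the class and function names are my own, and fetching the page is left out.

```python
from html.parser import HTMLParser


class ArchiveLinkFinder(HTMLParser):
    """Collects the href of every <link rel="archives"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here by default.
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "archives" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_archive_links(html_text):
    """Return the archive URLs advertised in the given HTML source."""
    finder = ArchiveLinkFinder()
    finder.feed(html_text)
    return finder.hrefs
```

If `find_archive_links(page_source)` returns an empty list for your front page, a client that follows header links has nothing left to fetch in bulk.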

    Jeremy, please let us know if this fixes your problem.

    Thread Starter Jer Clarke

    (@jeremyclarke)

    Hey Ofrias, in the interim a sysadmin friend pointed out that it was probably that, and I think he’s right. It makes total sense because it’s the only place we link to the month archives.

    Apparently there’s actually almost no reason for those links to be there. Everyone should just remove that function call entirely, especially if they are having problems with it.

  • The topic ‘Month Archive based DDOS Attack (I think)’ is closed to new replies.