Forum Replies Created

Viewing 1 replies (of 1 total)
  • Thread Starter apthorpe

    (@apthorpe)

    I have no idea, though it’s important not to penalize comments for coming from dynamically allocated netblocks (e.g. dialup, broadband). We expect legitimate web traffic from those networks but no legitimate mail traffic, since dynamic users should be submitting mail through their ISP’s mailserver.
    Open proxies are a big problem regardless of the services they proxy. The IRC networks were blocking open proxies for years before anyone thought to refuse mail from them. I’d assert that any system running an open proxy is not under its owner’s control (or the owner is not responsible), so rejecting all traffic from those systems is a defensible security measure. But that’s me.
    I don’t see enough traffic to get any really decent data, but the info’s there if you want to do something with it (IP address, score, rules hit).
    The big win out of all this is SURBL checks. Initially I just wanted to extract domains from links and check those against SURBL. This is not trivial due to 2nd- and 3rd-level TLDs (example: bbc.co.uk is a domain, co.uk is not) and redirector services (Google, Yahoo).
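
To make the 2nd-level TLD problem concrete, here is a rough Python sketch (not the plugin's actual code). The suffix set, `registrable_domain`, and `surbl_query_name` are hypothetical names made up for illustration; a real list of multi-label suffixes would be far longer, and redirector URLs are not unwrapped here at all. The `multi.surbl.org` zone is SURBL's combined list; actually resolving the name is left out.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny suffix set for illustration only.
# A real check needs a maintained list of multi-label suffixes.
MULTI_LABEL_SUFFIXES = {"co.uk", "org.uk", "com.au"}


def registrable_domain(url):
    """Return the domain to look up for a link, or None if there is none."""
    host = (urlsplit(url).hostname or "").rstrip(".").lower()
    try:
        ipaddress.ip_address(host)
        return None  # IP literals are skipped in this sketch
    except ValueError:
        pass
    labels = host.split(".")
    if len(labels) < 2:
        return None
    # If the last two labels are a known suffix (co.uk), the registered
    # domain is the last three labels (bbc.co.uk), not the last two.
    if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        if len(labels) < 3:
            return None  # the bare suffix is not a domain
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def surbl_query_name(domain, zone="multi.surbl.org"):
    """Build the DNS name whose A record indicates a SURBL listing."""
    return f"{domain}.{zone}"
```

Without the suffix set, `news.bbc.co.uk` would be cut down to `co.uk`, which is exactly the mistake described above.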
    So rather than reinvent the wheel badly, I treated the comment as a message body, wrapped it in fake-but-believable mail headers and fed it to SpamAssassin. And since people were spamming via metadata (subject, sender’s URL, etc.) it made sense to analyze the metadata as well as the content. I’m surprised at how well it worked out, given the code is about 72 hours old; there’s a definite need for testing, tuning, and improvement.
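
For anyone who wants to try the same idea, a minimal Python sketch of the wrapping step might look like the following. This is an illustration, not the code described above: the function names, the `blog.example` host, the header layout, and the choice to append the author's URL to the body are all assumptions. The scoring helper assumes a running `spamd` with `spamc` on the PATH, and uses `spamc -c`, which prints score/threshold and exits with status 1 when the message is judged spam.

```python
import subprocess
from email.message import EmailMessage
from email.utils import formataddr, formatdate, make_msgid


def comment_to_message(author, author_email, author_url, title, text, remote_ip):
    """Wrap a blog comment in plausible mail headers (hypothetical layout)."""
    # Collapse whitespace so user-supplied newlines cannot break a header.
    def clean(value):
        return " ".join(value.split())

    date = formatdate(localtime=False)
    msg = EmailMessage()
    msg["From"] = formataddr((clean(author), clean(author_email)))
    msg["To"] = "comments@blog.example"  # placeholder recipient
    msg["Subject"] = clean(title)
    msg["Date"] = date
    msg["Message-ID"] = make_msgid(domain="blog.example")
    # Assumption: a Received line exposes the commenter's IP to the filter.
    msg["Received"] = (
        f"from unknown ([{clean(remote_ip)}]) by blog.example with HTTP; {date}"
    )
    # Assumption: putting the author URL in the body lets URI rules see it.
    msg.set_content(f"{text}\n\n{author_url}\n")
    return msg


def score_with_spamc(msg):
    """Pipe the message to spamc in check-only mode; needs spamd running."""
    result = subprocess.run(
        ["spamc", "-c"], input=msg.as_bytes(), capture_output=True, check=False
    )
    return result.returncode == 1, result.stdout.decode().strip()
```

Putting the metadata into real header fields rather than concatenating everything into the body means the filter's header rules and body rules each see the part they were written for.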
