referrer spam
Referrer spam – or, by the HTTP specification’s original misspelling, referer spam – has been a problem at snarfed.org for years. I use Summary for web analytics, and I made its statistics pages publicly available for a while, so spammers still hit this site with fake referrers to this day, hoping to get linked from the Summary pages.
There are a number of approaches to fighting referrer spam, but so far, no silver bullet. Here’s what I did while my Summary output was online.
I maintained a hand-edited blacklist of known spammers’ sites. It’s far from ideal, but it worked. You can find my blacklist in my summary.conf file. If you’re fighting referrer spam, feel free to borrow it. (I used to use webalizer; my webalizer.conf is also available, but it’s not maintained.)
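The blacklist idea boils down to matching the host of each referrer URL against a list of known spam domains. Here is a minimal sketch in Python, not the matching that Summary itself does. `BLACKLIST` and `is_blacklisted` are hypothetical names, the domains are placeholders rather than entries from summary.conf, and I'm assuming that a subdomain of a listed domain should count as spam too.

```python
from urllib.parse import urlsplit

# Hypothetical entries for illustration only; not taken from summary.conf.
BLACKLIST = {"spam-example.com", "pills.example.net"}


def is_blacklisted(referrer, blacklist=BLACKLIST):
    """Return True if the referrer's host is a listed domain or a subdomain of one."""
    # hostname is None for values like "-" (no referrer), so fall back to "".
    host = (urlsplit(referrer).hostname or "").lower()
    parts = host.split(".")
    # Check the host itself and every parent domain: a.b.c, then b.c, then c.
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))
```

Matching on whole domain labels rather than substrings means a legitimate site whose name merely contains a blacklisted string is not caught by accident.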
I’ve written a webalizer nofollow patch that adds support for the popular new rel="nofollow" link attribute.
I used to use iptables to block the IP addresses and subnets of known spammers’ networks, such as marketscore.com, bezeqint.net, and ac.at.
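For anyone who wants to try the firewall route, here is a rough sketch of generating the rules from a list of sources. It is illustrative only, not the rule set I ran. `BLOCKED` and `drop_rules` are hypothetical names, and the addresses come from the reserved documentation ranges, not from real spammers.

```python
import ipaddress

# Hypothetical sources from the reserved documentation ranges (RFC 5737).
BLOCKED = ["192.0.2.0/24", "198.51.100.17"]


def drop_rules(sources):
    """Yield one iptables command per valid IPv4 address or subnet."""
    for src in sources:
        # IPv4Network raises ValueError on malformed input, and on subnets
        # with host bits set (e.g. 192.0.2.1/24), before any rule is emitted.
        net = ipaddress.IPv4Network(src)
        yield f"iptables -A INPUT -s {net} -j DROP"


for rule in drop_rules(BLOCKED):
    print(rule)
```

Validating each entry first means a typo in the list raises an error instead of producing a rule that iptables rejects halfway through a script.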
I’d eventually like to use existing blacklist and referrer spam tools, such as Jay Allen’s MT-Blacklist. However, most of those tools are specific to Movable Type and Apache, and this site uses SnipSnap instead.
A better approach would be to write tools that operate directly on referrer logs, so they can be used with any web server or CMS and any web analytics package. Tony Buser’s derefspam.pl script is a good first step. I’d like to extend it to use DNSBLs and RBLs like Spamhaus, BSB, Blitzed, and SURBL.
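To make that concrete, here is a sketch of what a log-based check could look like. It is not derefspam.pl's code, and all of the function names are hypothetical. It assumes logs in the common "combined" format and a domain blocklist queried the usual DNSBL way: prepend the domain to the list's zone and see whether the name resolves. I've used SURBL's `multi.surbl.org` zone as the default. A real tool would also need to reduce hosts to their registered domain and respect each list's usage policy.

```python
import re
import socket
from urllib.parse import urlsplit

# Tail of a combined-format line: "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"\s*$')


def referrer_host(log_line):
    """Extract the referrer's hostname from one log line, or None."""
    match = LOG_RE.search(log_line)
    if not match:
        return None
    # hostname is None when the referrer is "-" or otherwise not a URL.
    return urlsplit(match.group("referrer")).hostname


def dnsbl_name(host, zone="multi.surbl.org"):
    """Build the DNS name to query: the host prepended to the list's zone."""
    return f"{host}.{zone}"


def is_listed(host, zone="multi.surbl.org"):
    """Treat any successful lookup as 'listed'; needs network access."""
    try:
        socket.gethostbyname(dnsbl_name(host, zone))
        return True
    except socket.gaierror:
        # Name did not resolve: not listed, or the lookup itself failed.
        return False
```

Because it only reads log lines, a filter like this works the same whichever server wrote them, as long as the log format matches.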
See also: