<?xml version="1.0"?>
<!DOCTYPE content [ <!ENTITY nbsp " "> ]>
<rdf:RDF xml:base="http://snarfed.org/rdf"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">

<rdf:Description rdf:about="http://snarfed.org">
  <dc:title> snarfed.org  </dc:title>
  <dc:description> draw group stream of consciousness </dc:description>
  <dc:creator> Ryan Barrett &lt;snarfed at ryanb dot org&gt; </dc:creator>
  <dc:language> en </dc:language>
  <dc:format> text/html </dc:format>
  <dc:rights> Copyright 2002-2007 Ryan Barrett </dc:rights>
</rdf:Description>

<rdf:Description rdf:about="http://snarfed.org/space/referrer%20spam">
  <dc:title> referrer spam </dc:title>
  <dc:creator> Ryan Barrett &lt;snarfed at ryanb dot org&gt; </dc:creator>
  <dc:date> 2003-01-01T05:00:00Z </dc:date>
  <dc:language> en </dc:language>
  <dc:format> text/html </dc:format>
  <dc:rights> Copyright 2002-2007 Ryan Barrett </dc:rights>

  <content>
    <p><a href="/space/referrer+spam"><img src="/space/spam.jpg" alt="/space/spam.jpg" title="" /></a></p>

<p><a href="http://www.spywareinfo.com/articles/referer_spam/">Referrer spam</a> - or, following
the HTTP specification's original misspelling, <em>referer</em> spam - has been a problem at
snarfed.org for years. I use <a href="http://summary.net/">Summary</a> for web analytics,
and I made its statistics pages publicly available for a while, so spammers
still hit this site with fake referrers, to this day, hoping that they'll be linked
from the Summary pages.</p>

<p>There are a number of approaches to fighting referrer spam, but so far, no
silver bullet. Here's what I did while my Summary output was online.</p>

<ul>
<li><p>I maintained a hand-edited blacklist of known spammers' sites. It's far
from ideal, but it worked. You can find my blacklist in my
<a href="http://ryan.barrett.name/summary.conf">summary.conf</a> file. If you're fighting
referrer spam, feel free to borrow it. (I used to use
<a href="http://ryan.barrett.name/webalizer/">webalizer</a>; my
<a href="http://ryan.barrett.name/webalizer.conf">webalizer.conf</a> is also available,
but it's not maintained.)</p></li>
<li><p>I've written a <a href="/space/webalizer+nofollow+patch">webalizer nofollow patch</a>
that adds support for the popular new
<a href="http://www.google.com/googleblog/2005/01/preventing-comment-spam.html">nofollow</a>
link attribute.</p></li>
<li><p>I <em>used to</em> use
<a href="http://www.netfilter.org/">iptables</a>
to blacklist the IP addresses and subnets of known spammers, such as those belonging to marketscore.com,
bezeqint.net, and ac.at. See my <a href="http://ryan.barrett.name/referrer-spam-iptables.txt">referrer spam iptables
rules</a>.</p></li>
<li><p>I'd eventually like to use existing blacklist and referrer spam tools, such
as <a href="http://www.jayallen.org/">Jay Allen's</a>
<a href="http://www.jayallen.org/comment_spam/">MT-Blacklist</a>. However, most of those
tools are specific to <a href="http://www.sixapart.com/movabletype/">Movable Type</a> and
<a href="http://apache.org/">Apache</a>, and this site uses <a href="/space/SnipSnap">SnipSnap</a>
instead.</p></li>
<li><p>A better approach would be to write tools that operate directly on referrer
logs, so they can be used with any web server or CMS and any web analytics
package. <a href="http://juju.org/">Tony Buser</a>'s
<a href="http://www.juju.org/archives/2005/01/21/derefspam">derefspam.pl</a> script is a
good first step. I'd like to extend it to use DNSBLs and RBLs like
<a href="http://www.spamhaus.org/xbl/index.lasso">Spamhaus</a>,
<a href="http://bsb.empty.us/">BSB</a>, <a href="http://opm.blitzed.org/info">Blitzed</a>, and
<a href="http://www.surbl.org/faq.html">SURBL</a>.</p></li>
</ul>
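
<p>To make the log-level idea in the last item concrete, here is a minimal Python sketch
that drops access log lines whose referrer host is on a blacklist. It is not how Summary,
webalizer, or derefspam.pl work. It assumes the log is in Apache's Combined Log Format,
the blacklist entries are placeholder domains, and the function names are made up for
illustration.</p>

<pre>
```python
import re
from urllib.parse import urlsplit

# Hypothetical blacklist entries, for illustration only.
BLACKLIST = {"spam.example", "pills.example"}

# Assumed Combined Log Format tail: "request" status bytes "referrer" "user agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)" "[^"]*"\s*$')


def referrer_host(log_line):
    """Return the referrer host of a log line, or None if there is none."""
    match = LOG_TAIL.search(log_line)
    if match is None:
        return None
    host = urlsplit(match.group(1)).hostname
    return host.lower() if host else None


def is_blacklisted(host, blacklist=BLACKLIST):
    """True if host equals a blacklisted domain or is a subdomain of one."""
    labels = host.split(".")
    # Check "a.b.c", then "b.c", then "c" against the blacklist.
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))


def drop_referrer_spam(lines, blacklist=BLACKLIST):
    """Yield only the log lines whose referrer host is not blacklisted."""
    for line in lines:
        host = referrer_host(line)
        if host is None or not is_blacklisted(host, blacklist):
            yield line
```
</pre>

<p>Something like <code>drop_referrer_spam(open("access.log"))</code> would then yield the
cleaned lines, which could be written to a new file and fed to any analytics package. The
set lookup in <code>is_blacklisted</code> is the piece that a DNSBL query could
eventually replace.</p>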

<p>See also:</p>

<ul>
<li><a href="/space/geocoding">geocoding</a></li>
<li><a href="/space/webalizer+nofollow+patch">webalizer nofollow patch</a></li>
</ul>

  </content>

  <rdf:Seq>

  </rdf:Seq>
</rdf:Description>
</rdf:RDF>
