(This is a continuation from Part 1: Understanding Referral Spam)
First off, there’s a referral spam blacklist that you can check to see if any of the referrals for your site match up. Download this into a spreadsheet, bearing in mind that it’s subject to change. Also bear in mind that some of the known bots (such as Semalt) will be filtered out with the new bots and spiders filter, which you can enable by going to admin and then view settings.
Once you're in the view menu, underneath ‘currency’ you’ll see a checkbox to enable bot filtering.
Check this box and hits from known bots and spiders will automatically be excluded from your Google Analytics reports.
The most effective way to block spam bots is to edit the .htaccess file in the root directory of your domain (on an Apache server), as this turns the bots away at the server before any page is served. If you’re not confident with editing this kind of file, then you should ask your webmaster to carry it out for you.
Important: .htaccess rules only block the individual spam bots that you list in them, and you should make a full site backup before you edit the file. Bear in mind that just one wrong character can take down the entire site, so it’s essential that whoever makes the change has the skills to carry it out.
For WordPress, you can also use a plugin called WP-Ban, which allows you to block sites from the admin panel in WordPress. You should use this if you’re not confident editing .htaccess; it allows you to block by IP and IP range, host name, user agent and referrer URL.
To edit the file, connect to the server through FTP, download the file and open it in a text editor such as Notepad.
The following is taken from Jared Gardener’s Moz post How to Stop Spam Bots from Ruining Your Analytics Referral Data.
# Block Russian Referrer Spam
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly\.ru/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly\.info/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*iloveitaly\.ru/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*econom\.co/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*savetubevideo\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*kambasoft\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*buttons\-for\-website\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*semalt\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*darodar\.com/ [NC]
RewriteRule ^(.*)$ - [F,L]
Just paste the above into your file, save it and re-upload it to the server, overwriting the older version. Replace the spammer domains in the example with the referral spam URLs that you’ve identified in Analytics.
Firstly, you should go through referral traffic and make a note of those referrers with a 100% bounce rate and 10 or more sessions, as these are likely to be spam. You should then check the URLs against the known blacklist that you’ve already downloaded and saved as a spreadsheet.
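If your own list of spam referrers grows, typing the conditions by hand invites exactly the kind of one-character mistake that can take a site down. The following is a minimal Python sketch, not part of Jared's post, that generates rules in the same shape as the snippet above. The function name `build_referrer_block` is hypothetical, and the pattern deliberately keeps the original `^http://` prefix, so referrers sent over https would need that prefix adjusted.

```python
import re


def build_referrer_block(domains, comment="Block referrer spam"):
    """Return .htaccess lines that refuse requests whose referrer matches a domain.

    Hypothetical helper for illustration; review the output before uploading it.
    """
    if not domains:
        return ""
    lines = [f"# {comment}", "RewriteEngine on"]
    for index, domain in enumerate(domains):
        # Every condition except the last one is chained with OR.
        flags = "[NC,OR]" if index < len(domains) - 1 else "[NC]"
        # re.escape turns "semalt.com" into "semalt\.com" so the dot is literal.
        pattern = re.escape(domain)
        lines.append(f"RewriteCond %{{HTTP_REFERER}} ^http://.*{pattern}/ {flags}")
    # A plain ASCII hyphen means "no substitution"; F sends a 403 Forbidden.
    lines.append("RewriteRule ^(.*)$ - [F,L]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_referrer_block(["semalt.com", "darodar.com"]))
```

Generating the block this way guarantees that the last condition is the only one without the OR flag, which is easy to get wrong when adding or removing lines manually.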
Mark known spam on your list.
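If you have exported your referral report and the blacklist, the triage described above could be sketched in Python along these lines. This is only an illustration: the `Referrer` fields, the `triage` function and the sample domains are assumptions made for the example, not anything Analytics provides in this form.

```python
from dataclasses import dataclass


@dataclass
class Referrer:
    source: str         # referring domain as shown in the report
    sessions: int       # number of sessions from that referrer
    bounce_rate: float  # percentage, 0 to 100


def triage(referrers, blacklist, min_sessions=10):
    """Split suspicious referrers into known spam and ones to check by hand."""
    blocked = {domain.lower() for domain in blacklist}
    # Suspicious means a 100% bounce rate and at least min_sessions sessions.
    suspects = [
        r for r in referrers
        if r.bounce_rate >= 100.0 and r.sessions >= min_sessions
    ]
    known_spam = [r.source for r in suspects if r.source.lower() in blocked]
    needs_review = [r.source for r in suspects if r.source.lower() not in blocked]
    return known_spam, needs_review


if __name__ == "__main__":
    report = [
        Referrer("semalt.com", 40, 100.0),
        Referrer("partner.example", 120, 35.0),
        Referrer("odd-site.example", 12, 100.0),
        Referrer("tiny.example", 3, 100.0),
    ]
    known, review = triage(report, blacklist=["semalt.com", "darodar.com"])
    print("Known spam:", known)
    print("Check manually:", review)
```

Anything in the second list is only a candidate and still needs the manual check before you filter it out.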
The next part requires you to visit the sites behind the other suspicious URLs to check whether they’re legitimate or not. You should do this at your own risk, ensuring first that you have antivirus and antimalware software installed and up-to-date. You can also enable the link scanner in your AV software so that any infected sites are picked up before you get to them.
Once you have a complete list, then it’s time to create a filter.
Note that excluding the referral spam site from referral traffic via the ‘Referral exclusion list’ will not work. This technique simply ‘hides’ the traffic in your referral report, and it will still show up as direct traffic.
To create the filter, go to admin, then view settings, and choose filters from the menu on the left-hand side.
Next you have to decide what you want to filter out. In Jared’s article he recommends that you filter by country, choosing the ones which spam commonly comes in from such as Russia, Brazil or Indonesia. Clearly though this is not a good idea if you have legitimate traffic coming from those regions.
You can call your filter anything you like, as it doesn’t make any difference to the report. Choose the filter type custom and then choose ‘country’ from the dropdown menu. In the ‘Filter Pattern Field’ you will then define which countries you want to apply the filter to. Once you’ve done this, hit the ‘verify filter’ button to check that it’s viable – a graph will appear which will show what has been left out in the last week.
Filtering out all ghost referral traffic is a little more complex, as you have to create a filter based on valid domains. If you choose this method, you should approach it with extreme care, as you can easily exclude valid domains.
As Analytics Edge points out,
“Since the spam referrers do not know whose website the tracking ID belongs to (they are picking numbers at random), they send the “referral” using a hostname that is not one of yours. You can create an INCLUDE filter that keeps ONLY what was recorded from one of your valid web hosts.”
So ghost referral traffic is easier to identify, as the spammers don’t actually visit the site and their hits carry a fake hostname value. For more information on this and step-by-step instructions on how to apply such filters, check out Analytics Edge’s article Definitive Guide to Removing Referral Spam.
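To get a feel for how an include filter on valid hostnames behaves before you apply one, you could test the pattern offline. The sketch below is in Python and assumes the filter pattern is an ordinary regular expression; the function names and the `example.com` hostnames are placeholders for illustration, and the pattern you enter in Analytics should follow the Analytics Edge instructions.

```python
import re


def hostname_include_pattern(valid_hosts):
    """Build a regex that matches only the listed hostnames, in full."""
    escaped = sorted(re.escape(host.lower()) for host in valid_hosts)
    # Anchoring with ^ and $ stops "example.com.evil.test" from slipping through.
    return "^(" + "|".join(escaped) + ")$"


def keep_hit(hostname, pattern):
    """Return True if a hit recorded with this hostname would be kept."""
    return re.search(pattern, hostname.lower()) is not None


if __name__ == "__main__":
    pattern = hostname_include_pattern(["www.example.com", "example.com"])
    print(pattern)
    for host in ["www.example.com", "spam-host.invalid", "(not set)"]:
        print(host, "kept" if keep_hit(host, pattern) else "dropped")
```

Running your real list of recorded hostnames through a check like this shows which ones would be dropped, which is a cheap way to catch a valid domain you forgot to include.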
There are some other tactics that can help you to pick up referral spam and deal with it quickly.
These include:
Referral spam is a pain when it comes to reporting and potentially dangerous in that it often comes from malicious sources. There are steps you can take to filter your GA data in order to strip out a good percentage of the spam. It’s likely that we’ll see Google bring out a better solution than the current one at some point, but in the meantime, you should check out the advice above and ensure that your site is locked down as far as exploits and other vulnerabilities are concerned.