How to Remove Referral Spam from Google Analytics: Part One

Understanding Referral Spam

Everything You Need to Know About Referral Spam

Have you ever noticed a spike in your traffic only to find, on investigation, that it’s all referral spam? It’s a frustrating problem for SEOs and webmasters and something that many believe Google should address. There are rumours that Google are aware of the issue and are currently working on a solution. However, the search giant is never forthcoming about problems it’s working on until it’s clear that a solution has been found, so we will have to wait and see.

In the meantime, there are a couple of tools that can help you to filter out analytics spam, including:

  • SpamScape – a crowd-sourced offering that isn’t finished yet but will auto-generate filtered results.
  • Analytics Spam Blocker – a WordPress plugin that redirects spam bots to stop them reaching your site and thus affecting your analytics data.

We don’t all have WordPress sites though and the plugin might not be compatible with all installations. With that in mind, what else can you do to remove referral spam from Analytics? In order to better understand, let’s first have a look at what referral spam is and what it does.

What is Referral Spam?

Image: referrer spam (source: Incapsula)

A referrer is the URL of the previous page, which the browser passes along in an HTTP header when it goes from one page to another. It’s generally used to indicate where traffic is coming from. However, the header can be set to anything, and spammers often change it to promote a certain page. Once they have changed the header, they can make repeated requests so that their URL becomes visible in analytics reports.

This can be done manually, but usually an automated script – or bot – is used, and this is known as referral spam.
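To make the idea concrete, here is a minimal Python sketch of how a site might read the referrer from a request and compare it with a list of unwanted hosts. It is an illustration only: the function names, the dict-of-headers input and the blocklist entry (spam-example.invalid) are assumptions made for this example and are not tools mentioned in this article. Note that the header is spelled "Referer" in HTTP.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist, for illustration only (not a real spam domain).
BLOCKED_REFERRER_HOSTS = {"spam-example.invalid"}


def referrer_host(headers):
    """Return the lowercase hostname found in the Referer header, or None."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "referer":
            try:
                host = urlsplit(value.strip()).hostname
            except ValueError:
                return None  # malformed URL in the header
            return host.lower() if host else None
    return None


def is_blocked(headers, blocked=BLOCKED_REFERRER_HOSTS):
    """True if the referrer host is a blocked host or a subdomain of one."""
    host = referrer_host(headers)
    if host is None:
        return False
    return any(host == b or host.endswith("." + b) for b in blocked)
```

Because the value is supplied by the client, a check like this only tells you what the request claims, not where it really came from.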

Doesn’t Google Use Bots?

Yes, Google uses bots (automated scripts) to crawl websites in order to index them. However, bots can also be used for other purposes, including malicious ones, such as:

  • Click fraud
  • Scraping email addresses
  • Scraping website content
  • Spreading malware
  • Artificially increasing site traffic

Some bots can also execute JavaScript, and those that do can show up as visits in your Google Analytics data, because Google Analytics relies on JavaScript to record a visit. Referral spam doesn’t just inflate traffic and skew the results in your reports; it also increases the bounce rate and can affect conversion data. Bots that don’t execute JavaScript won’t appear in Analytics, but their visits will still show up in server logs.

Spam bots can also create fake accounts, send spam emails and bypass CAPTCHAs, and they tend to hide themselves by pretending to come from a legitimate site or to be a common web browser.

Spam bots crawl hundreds of thousands of sites each day and send HTTP requests with fake referrer headers. The fake header contains the URL that the spammer wishes to promote or build backlinks to. If your site receives an HTTP request from a spam bot, it’s recorded in the server log, and if the bot also runs your Analytics tracking code, the visit shows up in your Analytics reports.
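If you have access to your server logs, you can see which referrers are turning up there. The sketch below counts referrer hostnames in a log, on the assumption that the server writes the common "combined" log format used by Apache and Nginx; the pattern will need adjusting for other formats, and the function name is hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumes the "combined" access log format; other formats need another pattern.
COMBINED_LOG = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_referrer_hosts(lines):
    """Count how often each referrer hostname appears in the log lines."""
    counts = Counter()
    for line in lines:
        match = COMBINED_LOG.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # no referrer was sent with this request
        try:
            host = urlsplit(referrer).hostname
        except ValueError:
            continue  # malformed referrer URL
        if host:
            counts[host] += 1
    return counts
```

Calling count_referrer_hosts(open("access.log")) on a log file (the file name is a placeholder) and looking at the most common hosts gives a quick view of who claims to be sending you traffic.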

As the spammers don’t actually browse your site, each hit registers as a session with a 100% bounce rate and a 0 second duration. Last year, Google introduced a new filter in Analytics to exclude known bots and spiders, but it isn’t proving to be an effective measure, as many people still report referral spam as a big problem.
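That 100% bounce rate and 0 second duration is a useful fingerprint. As a rough sketch, assuming you have exported session data as a list of records with referrer, bounced and duration fields (a made-up layout for this example, not a Google Analytics export format), you could flag referrers where every session looks like this:

```python
from collections import defaultdict


def suspicious_referrers(sessions, min_sessions=5):
    """Return referrers whose sessions all bounced with zero duration.

    sessions: iterable of dicts with the hypothetical keys
    'referrer' (str), 'bounced' (bool) and 'duration' (seconds).
    """
    stats = defaultdict(lambda: {"sessions": 0, "bounces": 0, "duration": 0.0})
    for s in sessions:
        entry = stats[s["referrer"]]
        entry["sessions"] += 1
        entry["bounces"] += 1 if s["bounced"] else 0
        entry["duration"] += s["duration"]

    flagged = []
    for referrer, entry in stats.items():
        all_bounced = entry["bounces"] == entry["sessions"]
        no_time_spent = entry["duration"] == 0
        # Ignore referrers with too few sessions to judge.
        if entry["sessions"] >= min_sessions and all_bounced and no_time_spent:
            flagged.append(referrer)
    return sorted(flagged)
```

The min_sessions threshold is an arbitrary choice here; a genuine referrer can also send a handful of visitors who bounce straight away, so treat the output as a list to review rather than a verdict.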

Ghost Referral Traffic

Image: ghost referral spam (source: Ohow)

You should also be aware of what’s known as ‘ghost referral traffic’, which is also referral spam, only it never actually visits a site. The way that Google Analytics handles HTTP requests means that it’s simple for a spammer to ‘spoof’ a session. The spammer sends fake HTTP requests straight to Google Analytics, and these can be aimed at many different Analytics properties. This means that the traffic doesn’t hit the site, but it does show up in Analytics, where it can spoof organic search results and send false events.

Ghost referrals are therefore different from bots: whilst the latter can be stopped by editing the .htaccess file, the former cannot.
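One way to tell the two apart follows from the description above: a bot that really requests your pages leaves a trace in the server log, while a ghost referral does not. The sketch below assumes you already have a set of suspected spam referrer hosts from Analytics and a set of referrer hosts seen in your server logs; both inputs and the function name are hypothetical.

```python
def split_spam_referrers(suspected_hosts, server_log_hosts):
    """Split suspected spam referrers into ghost and crawler candidates.

    Ghost candidates never appear in the server log, so server-side rules
    cannot stop them. Crawler candidates did reach the server, so they can
    be blocked there.
    """
    suspected = {h.lower() for h in suspected_hosts}
    logged = {h.lower() for h in server_log_hosts}
    ghost_candidates = suspected - logged
    crawler_candidates = suspected & logged
    return ghost_candidates, crawler_candidates
```

A host that lands in the ghost set has to be dealt with inside Analytics itself, whereas one in the crawler set is a candidate for a server-level block.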

The Problem with Referral Spam

Clearly the biggest problem with this type of spam is that it makes the job of the SEO that much more difficult. Google Analytics data is skewed which in turn gives false results for engagement and traffic volume.

Further to this, since referral spam aims to get links from sites that publish their access logs, it can improve search rankings for the URL that the spammer wants to promote. It’s a black hat SEO tactic employed by those who practise nefarious techniques.

Referral spam can also be used to harm competitor sites, as it can’t be authenticated and tracked back to the source. This means that a spammer could send lots of unwanted traffic to a target site in order to either harm the site or to position it as a spam referrer. It’s also not a good idea to visit the referring sites in an attempt to track the spam back to its source, as they can contain malware.

Referral Spam Skews Reporting

So referral spam is bad news all round. For marketers it’s a frustrating problem that makes it difficult to fully understand the traffic arriving at a site. As a tactic, it isn’t maliciously attacking your site so much as trying to game the search engine.

Last year saw a large-scale referral spam campaign undertaken by a service called Semalt (which you may have seen show up in your Analytics). This widespread and aggressive campaign used bots and the company in question received a lot of negative online attention due to its habit of ignoring robots.txt directives.

A quick visit to the Semalt website confirms that the service positions itself as a legitimate and professional one, describing itself as:

“A professional webmaster analytics tool that opens the door to new opportunities for the market monitoring, yours and your competitors’ positions tracking and comprehensible analytics business information.”

If this is a service that you’ve signed up for, Incapsula say that attempting to remove your site from the crawling list is likely to leave you flooded with unwanted requests. With this in mind, if you signed up thinking it was a legitimate service, it’s better to leave your details there for the time being.

Semalt and Botnets

There’s now evidence to suggest that Semalt uses a botnet to generate its bot traffic. It’s thought that thousands of computers have already been infected to build this botnet, which is in turn used for referral spam and other malicious activities.

Since this information is available and widely substantiated, Semalt surely can’t last too much longer. In the meantime, however, you should take care when signing up to SEO services that sound too good to be true, as they doubtless are.

Referral spam can get out of hand if your site is not properly configured and contains vulnerabilities. Malicious programs and attackers usually take the path of least resistance, so they attack the weakest sites first. With this in mind, you should ensure that all server and site software – including plugins – is updated regularly and that you keep an eye out for known exploits. You’re also likely to be “assaulted by spam bots” if your site is running large-scale affiliate marketing campaigns and, for this reason, you should be careful about who you affiliate with.

Check out this article on finding and fixing malware on WordPress for further information.

In part two, we take a look at the nuts and bolts of removing referral spam from Google Analytics.

Kerry Butters

A prolific technology writer, Kerry was an authority in her field and produced content for a variety of high profile sites in her niche. Also a published author, she adored the written word and all things tech and internet related. Sadly she passed away in February 2016 after a valiant battle with cancer.