The last time I wrote about a hacked site, it was using a redirect that sent some users to a different site. This kind of hack is pretty common (even though it’s usually not as complex as mentioned in that post). It leverages the sad fact (archive.org) that users are often easy to trick and often browse without protection (or without a current browser (archive.org)).
A different angle of attack is to redirect only search engine crawlers to a different site. By doing this, the attackers can make it look like the pages of a website moved to a new domain name. In general, when search engines find redirects like that, they will more or less pass the “value” that a page had on to the new URL, and that generally also applies to PageRank. So in a sense, the attackers are trying to steal the value that a webmaster has built up over time.
In this particular case, a “massive amount” of sites were hacked and likely redirected through suomi.co.in.
The webmaster generally doesn’t notice this kind of hack because there’s nothing that would alert them to a problem. Only search engine crawlers get redirected; normal users (including the webmaster) see the page as usual.
The first symptom that you would see is hard to interpret: URLs from the website are just not indexed anymore (archive.org). URLs can drop out of the index for any number of reasons, so how do we find out more?
One of the first things I like to do in a case like this is to access the site with a search engine crawler’s user agent. This gives you a rough look at how the website reacts to a search engine crawler (although it’s not complete, it’s often pretty close). There are two relatively easy ways to do this:
- Use an online tool such as Web-Sniffer (archive.org). It’s pretty easy to use and is somewhat close to an actual crawler.
- Use Firefox (archive.org) with the User Agent Switcher (archive.org) plugin. If you use this plugin, you’ll have to add the user agent yourself. I usually use the current Googlebot user agent string:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Note: if you use Firefox for this, make sure that your Firefox installation is up to date and locked down properly in case you run into a site serving malware like this. Sometimes it even makes sense to use a virtual machine for this.
- (I wish there were a half-“li” :) ) There’s also “wget”, which is easy for those of you who prefer to use console tools. I usually use the above user agent string with wget.
If you access the site using one of these tools, you’ll often be able to spot these redirects (or other issues that a site might be having with regard to being accessed by search engine crawlers). It’s rare that someone uses cloaking by IP address for things like this, so changing the user agent is usually enough. In a recent thread in the Webmaster Help forums (archive.org), “webado” spotted the redirect using Web-Sniffer.
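For those who prefer a script, here is a minimal Python sketch of the same check: request a URL once with the Googlebot user agent string quoted above and once with a browser-like one, without following redirects, and compare the results. The browser user agent string and the function names are my own assumptions for illustration. Like the tools above, this will not catch cloaking by IP address.

```python
import urllib.error
import urllib.request

# The Googlebot user agent string quoted in the post.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Assumption: an arbitrary browser-like string, used only for comparison.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0"


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx response itself can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def fetch_status(url, user_agent, timeout=10):
    """Return (status_code, Location header or None) for a single request."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # 3xx (and 4xx/5xx) responses end up here; the headers are still available.
        return err.code, err.headers.get("Location")


def is_crawler_only_redirect(crawler_result, browser_result):
    """True if the crawler is redirected and the browser is not (or goes elsewhere)."""
    crawler_status, crawler_location = crawler_result
    browser_status, browser_location = browser_result
    if not 300 <= crawler_status < 400:
        return False
    if not 300 <= browser_status < 400:
        return True
    return crawler_location != browser_location
```

To use it, call `fetch_status(url, GOOGLEBOT_UA)` and `fetch_status(url, BROWSER_UA)` for a page on your own site and pass both results to `is_crawler_only_redirect`. A `True` result only means the two responses differ; it is a hint to look closer, not proof of a hack.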
In this particular case, the URL was redirected to http://suomi.co.in/ , from where it was redirected to a page that they wanted to promote with the original site’s “value”. I’ve seen the same kind of redirect going through http://ahtung.co.in/.
The webmaster responded with a note from his hoster in the thread:
Note from my host server (support @ hostgator.com) I have removed the file ".htaccess" from the directory /home/aceuropa which was causing the redirect. The logs show a massive amount of .htaccess files being edited over the last couple of days. I would highly suggest changing your password to something more secure. Please let us know if you have any further questions or concerns.
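If you have file access to your hosting account, you can also look for tampered .htaccess files yourself. The sketch below walks a directory tree and flags .htaccess files that contain a mod_rewrite condition on the user agent mentioning a crawler name. This is an assumption about how such a redirect might be written: the hoster's note does not say what the edited files contained, and the list of crawler names and the function name are hypothetical.

```python
import os
import re

# Assumption: the redirect is keyed on the user agent via mod_rewrite.
# The crawler names are examples, not taken from the actual hacked files.
SUSPICIOUS = re.compile(
    r"^\s*RewriteCond\s+%\{HTTP_USER_AGENT\}\s+.*(googlebot|slurp|msnbot|bingbot)",
    re.IGNORECASE | re.MULTILINE,
)


def find_suspicious_htaccess(root):
    """Return (path, modification time) for each flagged .htaccess under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".htaccess" not in filenames:
            continue
        path = os.path.join(dirpath, ".htaccess")
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
        if SUSPICIOUS.search(text):
            # The modification time helps to see when the file was last edited.
            hits.append((path, os.path.getmtime(path)))
    return hits
```

A hit is not necessarily malicious, since some sites use rules like this on purpose, and a clean result does not rule out a hack that works differently. An unexpected recent modification time on any .htaccess file is worth a look either way.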
(It’s great to see a hoster act so quickly!)
There’s another way to spot this kind of hack with Google Webmaster Tools: When you submit a Sitemap file, Google will show warnings for URLs that redirect. By design, you should be listing the final URL in your Sitemap file, so if the URL is redirecting for our crawlers (as in this case), we’ll show a warning in your account.
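A rough local approximation of that Sitemap check can be sketched in Python: read the URLs from a Sitemap file and report the ones that answer with a redirect. This is not how Webmaster Tools works internally, just the same idea done by hand. The `fetch` argument is a hypothetical callable that you supply; it should request a URL with a crawler user agent, without following redirects, and return a `(status, location)` pair.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the Sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Extract all <loc> URLs from the text of a Sitemap file."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def redirecting_urls(urls, fetch):
    """Return (url, status, location) for every URL that answers with a 3xx status.

    fetch is a caller-supplied function: fetch(url) -> (status, location).
    """
    flagged = []
    for url in urls:
        status, location = fetch(url)
        if 300 <= status < 400:
            flagged.append((url, status, location))
    return flagged
```

Since a Sitemap file should list final URLs, anything this reports is worth checking, whether it turns out to be a hack or just an outdated entry.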