Riddle, riddle: who is (old, probably outdated)

Warning: This page is pretty old and was imported from another website. Take any content here with a grain of salt.

I like watching the traffic my sites get from Google’s internal network. It’s a bit of an ego thing, I guess :-).

Looking at the statistics (Google Analytics is fun) for yesterday’s joke, I noticed a bit of interesting traffic from Google. It’s normal to see them come by (and a little bit of traffic comes through the Google Web Accelerator proxy with its prefetch commands), but this time it stood out because they came with a referrer.

They came from a query searching for “”.
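For anyone who wants to pull the search phrase out of a referrer like that themselves, here is a minimal sketch in Python. It assumes the search engine passes the phrase in a `q` query parameter; the function name and the sample referrer are made up for illustration and are not taken from the actual log.

```python
from urllib.parse import urlparse, parse_qs


def search_query_from_referrer(referrer):
    """Return the search phrase carried in a referrer URL, or None."""
    parsed = urlparse(referrer)
    # Assumption: the phrase is passed in a "q" parameter.
    values = parse_qs(parsed.query).get("q")
    return values[0] if values else None


# Hypothetical referrer, only here to show the call.
print(search_query_from_referrer("http://www.google.com/search?q=example+query&hl=en"))
```

A referrer without a `q` parameter simply yields `None`, so the function can be run over every line of a log without special-casing.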

We blogged about in our “test-site D” report. Strange accesses from there, just grabbing some content without getting the style sheet.

I actually logged quite a bit of traffic coming from the internal Google network to our test sites. It looks like most (if not all?) of it was actually from the Google Web Accelerator (archive.org), which uses prefetch commands to fetch the contents of pages linked from the currently visible one (see Google (archive.org), 37 Signals (archive.org) and others (archive.org)). So if a user with GWA were to access a page that had a link to our site on it, the GWA would prefetch the linked pages to give the user a faster web experience, should they choose to click on those links. It took me a bit of time to notice that; I almost assumed that Google was manually monitoring websites, getting their external stylesheets through an external proxy :-).
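The "content without the style sheet" pattern is easy to look for in a log file. The following is only a rough sketch, assuming an Apache-style combined log format; the regular expression, the function name and the sample lines (using documentation-only IP addresses) are all invented for illustration. Keep in mind that a normal browser with a cached style sheet would look the same, so this is a hint, not proof of prefetching.

```python
import re
from collections import defaultdict

# Assumed combined log format:
# ip ident user [time] "METHOD path proto" status size "referrer" "agent"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+)[^"]*" \d{3} \S+')


def clients_without_stylesheet(lines):
    """Return sorted IPs that requested pages but never a .css file."""
    pages = defaultdict(int)
    fetched_css = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        ip, path = match.groups()
        path = path.split("?", 1)[0]  # ignore any query string
        if path.endswith(".css"):
            fetched_css.add(ip)
        elif path.endswith((".html", "/")):
            pages[ip] += 1
    return sorted(ip for ip in pages if ip not in fetched_css)


# Made-up sample lines, not real log data.
SAMPLE = [
    '192.0.2.10 - - [01/Jan/2006:10:00:00 +0000] "GET /page.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [01/Jan/2006:10:00:01 +0000] "GET /style.css HTTP/1.1" 200 900 "http://example.com/page.html" "Mozilla/5.0"',
    '192.0.2.20 - - [01/Jan/2006:10:05:00 +0000] "GET /page.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
print(clients_without_stylesheet(SAMPLE))
```

In the sample, only the second client fetched a page without ever asking for the style sheet, so it is the one that gets reported.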

Anyway, our site is number one on Google for “ (archive.org)” for some strange reason (I’m currently pushing it even more, with a bit of keyword-frequency spam, haha).

If you take a look at the other sites listing that IP address, you’ll see an interesting mix of public webserver statistics (that probably shouldn’t be publicly visible). It certainly looks like a normal GWA IP (and from the posts (archive.org) all (archive.org) over (archive.org) the (archive.org) web (archive.org) it is in the range).

But why would someone from Google’s internal network search for it and hit our site here?

No, it wasn’t another GWA user who searched for it :-). And what’s different about this IP compared to the other GWA IPs? Hmm… Log files are fun when you can still look at every single access and try to understand it :-).


