Google

Google and their Sitemap

The Google Sitemap system just turned one year old! The “Big Daddy” infrastructure and the “Crawl Caching Proxy” look like they were made to be a perfect match for Google Sitemaps (but it is more likely the other way around). In theory, Google Sitemaps can tell the Google crawlers more about a website, even without having to crawl it. The attributes can be used to help the proxy determine when actual accesses are necessary, keeping bandwidth use on all sides at a minimum.
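To make the "attributes" a bit more concrete, here is a minimal sketch of how a sitemap with those hints could be generated. It assumes the sitemaps.org 0.9 schema and its optional lastmod, changefreq and priority elements; the function name build_sitemap, the example.com URLs and the dates are made up for illustration and are not taken from any of the test sites.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 protocol (assumed here; early Google
# Sitemaps files used a Google-specific schema URL instead).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Child elements the protocol allows per URL; only "loc" is required.
URL_TAGS = ("loc", "lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Return sitemap XML for a list of dicts keyed by the URL_TAGS names."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for tag in URL_TAGS:
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a frequently changing home page and a quiet archive.
example_pages = [
    {"loc": "https://example.com/", "lastmod": "2006-06-01",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "https://example.com/archive/", "changefreq": "monthly"},
]

sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

Whether and how a crawler or proxy actually uses these values is up to the search engine; the file only offers hints.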

Watching crawler activity and indexed pages over time

Google and the other search engines are constantly changing their software and infrastructure. Google apparently switched to a new infrastructure at the beginning of 2006 and is currently working on optimizing the “settings”. How does all of this show in the test sites? How does it show in a normal site? Does the number of indexed pages for a “spammy” site go down? Does the activity of the crawlers change? What are the other engines doing?

Google Adsense vs. Related Links: speed, relevancy, usefulness

Google “Related Links” looks to be the poor man’s version of Google Adsense (meaning you don’t get money for publishing it, ha ha). Let’s take a quick first look at how they compare (an in-depth comparison will take some time, especially since Adsense is known to adapt over a period of a few days to a week, and Related Links might do the same). How well does it work compared to Adsense?

Google "Related Links": Adsense without the hassle of passing money

Google Labs has released a new service: “Related Links (archive.org)”. According to Google: “Google Related Links use the power of Google to automatically bring fresh, dynamic and interesting content links to any website. Webmasters can place these units on their site to provide visitors with links to useful information related to the site's content, including relevant news, searches, and pages.” Wow! This is great: finally, Adsense for the publishers who don’t want the hassle of specifying a bank account for the payout.

Riddle, riddle: who is 72.14.192.32?

I like watching the traffic my sites get from Google’s internal network. It’s a bit of an ego-thing, I guess :-). Looking at the statistics (Google Analytics is fun) for yesterday’s joke, I noticed that we had a bit of traffic from Google that was interesting. It’s normal to see them come (and a little bit of traffic comes through the Google Web Accelerator proxy with their prefetch commands), but this time it was interesting because they came with a referrer.

Results from our Sitemaps study: Site D

Site D is a normal website, with a little startup funding in the form of deep links from several external sites. It does not use Google Sitemaps or anything else special. There were 4 links, one to each of the 4 levels, in different parts of the site. The site structure is strictly top-down, with links from the parent to about 10 children and a link to the main URL. There are no cross-links and no links from the children to the parent (just to the main URL).

Getting indexed by Google with Google Sitemaps - in what time

How long do you think it takes until a new webpage gets indexed by Google? You might say it depends, and you’re right. But you can help get your webpages indexed better. One approach is to participate in Google Sitemaps - and give Google the URLs to add. People say it takes very long until new webpages appear in the SERPs. This article describes an example of adding a new article to enarion.

To (Google) sitemap or not to sitemap, that is the question

There are lots of ways to get indexed by Google. Using Google Sitemaps is only one way - the way that seems to be a bit trendy at the moment. “In the beginning” (June / July 2005), when Google had first introduced Google Sitemaps, it was a sure-fire way to get indexed within hours. It really worked. I bet it not only worked for us, but for lots of spammer sites, so Google had to button it down a bit.