… or examining a Google automated spam penalty Matt Cutts, a Google engineer, explained on his personal blog how off-topic and affiliate links can change the Google crawlers’ “priority” for a site, or even get it deindexed. The examples shown were quite extreme, but what surprised me was this part: The person said that every page has original content, but every link that I clicked was an affiliate link that went to the site that actually sold the T-shirts.
The Google Sitemaps system just turned one year old! The “Big Daddy” infrastructure and the “Crawl Caching Proxy” look like they were made to be a perfect match for Google Sitemaps (but it is more likely the other way around). In theory, Google Sitemaps can tell the Google crawlers more about a website, even without having to crawl it. The Sitemap attributes (such as the last-modification date and the change frequency of a URL) can be used to help the proxy determine when actual accesses are necessary, keeping bandwidth use on all sides to a minimum.
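To make the attributes a little more concrete, here is a minimal sketch of how a Sitemap file with those hints could be generated. It assumes the sitemaps.org 0.9 namespace (the Google Sitemaps schema in use at the time may have differed), and the function name `build_sitemap` and the example URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace; older Google-specific
# schemas used a different namespace URI.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a Sitemap XML string from a list of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and "priority"
    are optional hints that are written only when present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page; a crawler could skip it if lastmod has not changed.
    print(build_sitemap([
        {"loc": "http://example.com/", "lastmod": "2006-06-01",
         "changefreq": "weekly", "priority": 0.8},
    ]))
```

Whether and how the crawlers actually use these values is up to Google; the file only offers hints.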
Google and the other search engines are constantly changing their software and infrastructure. Google apparently switched to a new infrastructure at the beginning of 2006 and is currently working on optimizing the “settings”. How does all of this show in the test sites? How does it show in a normal site? Does the number of indexed pages for a “spammy” site go down? Does the activity of the crawlers change? What are the other engines doing?
Google “Related Links” looks to be the poor man’s version of Google Adsense (meaning you don’t get money for publishing it, ha ha). Let’s take a quick first look at how they compare. An in-depth comparison will take some time, especially since Adsense is known to adapt over a period of a few days to a week, and Related Links might do the same. How well does it work compared to Adsense?
Google Labs has released a new service: “Related Links (archive.org)”. According to Google: Google Related Links use the power of Google to automatically bring fresh, dynamic and interesting content links to any website. Webmasters can place these units on their site to provide visitors with links to useful information related to the site's content, including relevant news, searches, and pages. Wow! This is great: finally, Adsense for the publishers who don’t want the hassle of specifying a bank account for the payout.
I like watching the traffic my sites get from Google’s internal network. It’s a bit of an ego thing, I guess :-). Looking at the statistics (Google Analytics is fun) for yesterday’s joke, I noticed some interesting traffic from Google. It’s normal to see them come (and a little bit of traffic comes through the Google Web Accelerator proxy with its prefetch commands), but this time it was interesting because they came with a referrer.
Site D is a normal website, with a little startup funding in the form of deep links from several external sites. It does not use Google Sitemaps or anything else special. There were 4 links, one to each of the 4 levels, in different parts of the site. The site structure is strictly top-down: each parent links to about 10 children, and every page links to the main URL. There are no cross-links and no links from the children to the parent (just to the main URL).
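For a feel of what such a structure looks like as a link graph, here is a rough sketch. It assumes the main URL sits above the 4 levels and that every parent has exactly 10 children; the post only says "about 10", and the URL scheme and the function name `build_site` are invented for the example.

```python
def build_site(levels=4, fanout=10):
    """Return {url: [linked urls]} for a strictly top-down site.

    Assumptions: the main URL "/" sits above `levels` levels of pages,
    and every parent has exactly `fanout` children.
    """
    root = "/"
    links = {root: []}
    frontier = [root]
    for _ in range(levels):
        next_frontier = []
        for parent in frontier:
            for i in range(fanout):
                child = f"{parent}p{i}/"
                links[parent].append(child)  # parent -> child
                links[child] = [root]        # child -> main URL only
                next_frontier.append(child)
        frontier = next_frontier
    return links


if __name__ == "__main__":
    site = build_site()
    print(len(site), "pages")
    # A level-2 page links to the main URL and its own children,
    # but not back to its parent "/p0/".
    print(site["/p0/p1/"][:3])
```

With this layout, a crawler that arrives through a deep link can only move downwards or jump back to the main URL, so pages in other branches are reachable only by walking down from the top.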
How long would you guess it takes until a new webpage gets indexed by Google? You might say it depends, and you’d be right. But you can help get your webpages indexed better. One approach is to use Google Sitemaps and give Google the URLs to add. People say it takes a very long time until new webpages appear in the SERPs. This article describes an example of adding a new article to enarion.
Site B is a mixture of Site A (Google Sitemaps, no links) and Site C (Adsense, GoogleBar, no Sitemaps, no links). Site B uses Google Sitemaps along with Adsense blocks, and is visited regularly by a virtual visitor using Microsoft Internet Explorer with the GoogleBar plug-in. Seeing that neither Site A nor Site C was indexed properly by Google, we can only assume that Site B will not be indexed either.
People who are new to the web and want to start a website usually just put it online and hope that visitors come. With Google Sitemaps, the webmaster has a way to let Google know about the site and to try to help Google find all of its pages. I’ll just go through the other sites in the order we had them: site A now, then site B, then site D (we already covered site C), and finally site E.