
A search-engine guide to 301, 302, 307, & other redirects

It’s useful to understand the differences between the common kinds of redirects, so that you know where to use them (and can recognize when they’re used incorrectly). Luckily, when it comes to Google, we’re pretty tolerant of mistakes, so don’t worry too much :). In general, a redirect is between two pages, here called R & S (it works just as well for pages called https://example.com/filename.asp, or pretty much any URL). Very simplified: when you call up page R, it tells you that the content is at S, and browsers then show the content of S right away.
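
To make that concrete, here’s a minimal sketch of a server doing exactly this, using Node’s built-in http module in TypeScript (the /R and /S paths are just placeholders):

    import * as http from "node:http";

    const server = http.createServer((req, res) => {
      if (req.url === "/R") {
        // 301 = permanent: clients & search engines should remember /S.
        // Swap in 302 (temporary) or 307 (temporary, request method
        // preserved) depending on how long-lived the move is.
        res.writeHead(301, { Location: "/S" });
        res.end();
      } else if (req.url === "/S") {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("The content lives here.");
      } else {
        res.writeHead(404);
        res.end();
      }
    });

    server.listen(8080);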

JS SEO in 2016

An update (March 2016) on the current state & recommendations for JavaScript sites / Progressive Web Apps [1] in Google Search. We occasionally see questions about what JS-based sites can do and still be visible in search, so here’s a brief summary of the current state:
# Don’t cloak to Googlebot. Use “feature detection” & “progressive enhancement” [2] techniques (sketched below) to make your content available to all users. Avoid redirecting to an “unsupported browser” page.
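
As a rough sketch of what “feature detection” means in practice (the two helper functions here are hypothetical stand-ins for your own UI code):

    // Check whether the browser supports an API before using it,
    // and fall back to a baseline that works everywhere.
    declare function showNearbyStores(coords: GeolocationCoordinates): void; // hypothetical
    declare function showStoreList(): void; // hypothetical

    if ("geolocation" in navigator) {
      navigator.geolocation.getCurrentPosition((pos) => {
        showNearbyStores(pos.coords); // enhanced experience
      });
    } else {
      showStoreList(); // baseline content, still visible to everyone
    }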

HTTPS Migrations

Planning on moving to HTTPS? Here are 13 FAQs! What’s missing? Let me know in the comments and I’ll expand this over time; perhaps it’s even worth a blog post or help center article. Note that these are specific to moving an existing site from HTTP to HTTPS on the same hostname. Also remember to check out our help center at https://support.google.com/webmasters/answer/6073543
# Do I need to set something in Search Console?
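
Separate from the Search Console side, here’s the redirect piece of such a move as a minimal sketch (Node + TypeScript, assuming the plain-HTTP listener’s only job is to forward everything to HTTPS on the same hostname):

    import * as http from "node:http";

    // Permanently redirect every HTTP request to its HTTPS equivalent,
    // keeping host and path intact (same hostname, as discussed above).
    http
      .createServer((req, res) => {
        res.writeHead(301, { Location: `https://${req.headers.host}${req.url}` });
        res.end();
      })
      .listen(80);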

Soft-404s & your site

We call a URL a soft-404 when it’s essentially a placeholder for content that no longer exists, but doesn’t return a 404 status code. Using soft-404s instead of real 404s is a bad practice: it makes things harder for our algorithms, and it often confuses users too. We’ve been talking about soft-404s since “forever”; here’s a post from 2008, for example: http://googlewebmastercentral.blogspot.com/2008/08/farewell-to-soft-404s.html In 2010 we added information about soft-404s to Webmaster Tools (http://googlewebmastercentral.
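
The fix is usually just to send the right status code along with whatever “not found” page you show. A tiny sketch (TypeScript, with a hypothetical lookupPage helper):

    import * as http from "node:http";

    declare function lookupPage(url: string): string | undefined; // hypothetical

    http
      .createServer((req, res) => {
        const page = lookupPage(req.url ?? "/");
        if (page === undefined) {
          // A friendly "not found" page is fine, but the status code
          // must be 404; returning 200 here would be a soft-404.
          res.writeHead(404, { "Content-Type": "text/html" });
          res.end("<h1>Page not found</h1>");
          return;
        }
        res.writeHead(200, { "Content-Type": "text/html" });
        res.end(page);
      })
      .listen(8080);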

Indexing email confirmation links

Every now and then I hear from someone who accidentally got a bunch of email addresses indexed as parameters to some script on their site. There’s sometimes an easy solution to that, which will (temporarily!) take care of it fairly quickly: If there’s a common part of the path that identifies these URLs, use a “directory” URL removal request in Search Console (verify ownership first). For example, you can submit “email.

Forms

Looking for something simple & easy to do during the holidays? Double-check your site’s contact forms to see that they actually work. We occasionally reach out to webmasters who have technical issues that aren’t easily visible in Webmaster Tools; sometimes not fixing those issues will result in us dropping your website completely from our search results. If your website doesn’t have an email address on it, and your contact form returns an error page, then you’re gonna have a bad time: we won’t be able to warn you about those issues.

robotted resources

I see a bunch of posts about the robotted resources message that we’re sending out. I haven’t had time to go through & review them all (so include URLs if you can :)), but I’ll spend some time double-checking the reports tomorrow. Looking back over the years, blocking CSS & JS used to make sense when search engines weren’t that smart and would end up indexing & ranking those files in search.

robots.txt

I noticed there’s a bit of confusion on how to tweak a complex robots.txt file (aka longer than two lines :)). We have awesome documentation (of course :)), but let me pick out some of the parts that are commonly asked about:
# Disallowing crawling doesn’t block indexing of the URLs. This is pretty widely known, but worth repeating.
# More-specific user-agent sections replace less-specific ones. If you have a section with “user-agent: *” and one with “user-agent: googlebot”, then Googlebot will only follow the Googlebot-specific section (see the example below).
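
Here’s a small robots.txt to illustrate that last point (the paths are made up): because Googlebot has its own section, it ignores the “*” section entirely, so /private/ stays crawlable for it unless you repeat the rule.

    User-agent: *
    Disallow: /private/
    Disallow: /tmp/

    User-agent: Googlebot
    Disallow: /tmp/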

hreflang canonical

For those of you using hreflang for international pages: make sure any rel=canonical you specify matches one of the URLs you use in the hreflang pairs. If the specified canonical URL is not part of the hreflang pairs, the hreflang markup will be ignored. So if you want to use the “?op=1” URLs as canonicals, then the hreflang URLs should refer to those URLs; alternatively, if you want to use just “/page”, then the rel=canonical should refer to that too.
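
A quick sketch of a consistent setup, assuming a hypothetical English/German pair on example.com and keeping the “?op=1” URLs throughout (note how the canonical matches one of the hreflang URLs):

    <!-- On https://example.com/page?op=1 (the English version): -->
    <link rel="canonical" href="https://example.com/page?op=1">
    <link rel="alternate" hreflang="en" href="https://example.com/page?op=1">
    <link rel="alternate" hreflang="de" href="https://example.com/de/page?op=1">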

429 or 503

Here’s one for fans of the hypertext HTTP protocol: should I use 429 or 503 when the server is overloaded? It used to be that we’d only see 503 as a temporary issue, but nowadays we treat them both about the same: both signal a temporary problem, and we tend to slow down crawling if we see a bunch of them. If they persist and no longer look like temporary problems, we tend to start dropping those URLs from our index (until we can recrawl them normally again).
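
Either status works the same way from our side; a minimal sketch (TypeScript, with a hypothetical load check) that also sends the standard HTTP Retry-After hint:

    import * as http from "node:http";

    declare function serverIsOverloaded(): boolean; // hypothetical load check

    http
      .createServer((req, res) => {
        if (serverIsOverloaded()) {
          // 503 (or 429) marks this as temporary; Retry-After suggests,
          // in seconds, when clients should come back.
          res.writeHead(503, { "Retry-After": "120" });
          res.end();
          return;
        }
        res.writeHead(200);
        res.end("OK");
      })
      .listen(8080);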