We call a URL a soft-404 when it is essentially a placeholder for URLs that no longer exist, but doesn’t return a 404 status code. Using soft-404s instead of real 404s is a bad practice, and it makes things harder for our algorithms – and often confuses users too.
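To make the distinction concrete, here is a rough sketch of how you might classify the response to a request for a URL that you know doesn't exist. It is only an illustration, not how our algorithms work: the function name and the labels are made up for this example, and treating 410 the same as 404 is an assumption on my part.

```python
from urllib.parse import urlsplit


def classify_missing_url_response(requested_url, status, final_url):
    """Classify the answer to a request for a URL known NOT to exist.

    requested_url: the URL that was asked for
    status: the final HTTP status code, after following any redirects
    final_url: the URL that actually served the final response
    (Hypothetical helper; names and labels are illustrative only.)
    """
    # A real "not found" answer. 410 (Gone) is included as an assumption.
    if status in (404, 410):
        return "hard-404"

    # A success status for a page that doesn't exist is the soft-404 case.
    if 200 <= status < 300:
        requested = urlsplit(requested_url)
        final = urlsplit(final_url)
        if (requested.netloc, requested.path) != (final.netloc, final.path):
            # e.g. the missing URL was redirected to the homepage
            return "soft-404 (redirected elsewhere)"
        return "soft-404 (placeholder page with success status)"

    # Server errors and anything else are a different problem.
    return "other"
```

You would feed this the status and final URL your HTTP client reports for a made-up path on your own site; if it says soft-404, the server is answering "OK" for something that isn't there.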
We’ve been talking about soft-404s since “forever”; here’s a post from 2008, for example: http://googlewebmastercentral.blogspot.com/2008/08/farewell-to-soft-404s.html . In 2010 we added information about soft-404s to Webmaster Tools ( http://googlewebmastercentral.blogspot.com/2010/06/crawl-errors-now-reports-soft-404s.html ), and somewhere around that time we added the help center article as well ( http://support.google.com/webmasters/bin/answer.py?hl=en&answer=181708 ).
If you’re aware of your site using soft-404s (such as returning or redirecting to the homepage instead of a 404 error), then I’d recommend looking into ways to improve that. Think about ways that you can make your 404 pages useful to users, so that they recognize that the page no longer exists, and so that they can find something else that’s appropriate (you know your users best :)).
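As a minimal sketch of what that can look like, here is a tiny server built on Python's standard http.server that sends a real 404 status together with a page that still helps the visitor, instead of redirecting to the homepage. The page table, the wording of the error page, and the helper name build_response are all hypothetical; your own setup will look different.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content, just for illustration.
KNOWN_PAGES = {
    "/": "<h1>Home</h1>",
    "/about": "<h1>About</h1>",
}

# A useful 404 page: says clearly that the page is gone and offers a way on.
NOT_FOUND_BODY = (
    "<h1>Sorry, this page no longer exists</h1>"
    '<p>Try the <a href="/">homepage</a> or the <a href="/about">about page</a>.</p>'
)


def build_response(path):
    """Return (status_code, html) for a request path."""
    if path in KNOWN_PAGES:
        return 200, KNOWN_PAGES[path]
    # Real 404 status, not a 200 placeholder and not a redirect.
    return 404, NOT_FOUND_BODY


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Ignore the query string when looking up the page.
        status, html = build_response(self.path.split("?", 1)[0])
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Local test server on an arbitrary port.
    HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

The point is that the helpful content and the 404 status go together: users see something useful, and crawlers get an unambiguous signal that the page is gone.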
Returning 404 - and having the URLs listed in the crawl errors in Webmaster Tools - is not a problem. 404s are fine & expected: http://googlewebmastercentral.blogspot.com/2011/05/do-404s-hurt-my-site.html . Even if you have millions of them, if they’re all for pages that no longer exist, then that’s the right way to do it & fine by us.