I noticed there’s a bit of confusion about how to tweak a complex robots.txt file (aka longer than two lines :)). We have awesome documentation (of course :)), but let me pick out some of the parts that are commonly asked about:
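Disallowing crawling doesn’t block indexing of the URLs. A disallowed URL can still end up in the index without being crawled, for example when other pages link to it. This is pretty widely known, but worth repeating.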
More-specific user-agent sections replace less-specific ones. If you have a section with “user-agent: *” and one with “user-agent: googlebot”, then Googlebot will follow only the Googlebot-specific section and ignore the generic one entirely.
The paths / URLs in the robots.txt file are case-sensitive.
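To make the last two points concrete, here’s a minimal sketch using Python’s standard-library `urllib.robotparser`. It is not Google’s parser, and the robots.txt content, the `example.com` URLs, and the `OtherBot` name are made up for illustration, but it happens to treat user-agent sections and path case the same way for this simple file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a generic section and a Googlebot-specific one.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: googlebot
Disallow: /Drafts/
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The Googlebot-specific section replaces the "*" section for Googlebot,
# so the /private/ rule does not apply to it.
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/page.html")
otherbot_private = parser.can_fetch("OtherBot", "https://example.com/private/page.html")

# Paths are case-sensitive: /Drafts/ is blocked, /drafts/ is not.
googlebot_drafts_upper = parser.can_fetch("Googlebot", "https://example.com/Drafts/a.html")
googlebot_drafts_lower = parser.can_fetch("Googlebot", "https://example.com/drafts/a.html")

print(googlebot_private, otherbot_private)            # True False
print(googlebot_drafts_upper, googlebot_drafts_lower)  # False True
```

If you wanted Googlebot to stay out of /private/ as well, you’d have to repeat that rule inside the Googlebot section; it isn’t inherited from the “*” section.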
For non-trivial files, tweaking can be a bit tricky. I strongly recommend using the robots.txt Tester in Search Console: it pinpoints the line that blocks any specific URL, lets you test changes directly, and is the fastest way to let Google know of a changed robots.txt file on your site. Find out more about the tool at https://support.google.com/webmasters/answer/6062598
Here’s the full documentation: https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt