Simplify robots.txt
@@ -1,36 +1,2 @@
-# Being archived in a long-term store is harmful to my privacy. Never
-# know when I might need to change something in a hurry
-User-Agent: ia_archiver
-Disallow: /
-
-User-Agent: archiver
-Disallow: /
-
-# Search engines tend to update their indexes fairly quickly, so no
-# objections to being indexed by them in general. That said, I want to
-# do my own (tiny) part in making Google useless
-# not contribute to
-User-Agent: indexer
-Disallow:
-
-User-agent: Googlebot
-Disallow: /
-
-User-Agent: gus
-Disallow:
-
-# Research *should* only report anonymised aggregates, I can live with
-# that
-User-Agent: researcher
-Disallow:
-
-# I remain confused by the inclusion of proxies in robots.txt, but am
-# happy for them to access the site as long as they themselves forbid
-# being indexed or archived. I can add exceptions if I find any that
-# don't do that
-User-Agent: webproxy
-Disallow:
-
-# Here be dragons
 User-Agent: *
-Disallow: /cgi-bin/
+Disallow: /