Allwiki (天下维客), the online knowledge base you can edit

Robots.txt File Examples

Wikipedia's robots.txt

URL: http://www.wikipedia.org/robots.txt

#
# robots.txt for http://www.wikipedia.org/ and friends
#
# Please note: There are a lot of pages on this site, and there are
# some misbehaved spiders out there that go _way_ too fast. If you're
# irresponsible, your access to the site may be blocked.
#

# advertising-related bots:
User-agent: Mediapartners-Google*
Disallow: /

# Wikipedia work bots:
User-agent: IsraBot
Disallow:

User-agent: Orthogaffe
Disallow:

# Crawlers that are kind enough to obey, but which we'd rather not have
# unless they're feeding search engines.
User-agent: UbiCrawler
Disallow: /

User-agent: DOC
Disallow: /

User-agent: Zao
Disallow: /

# Some bots are known to be trouble, particularly those designed to copy
# entire sites. Please obey robots.txt.
User-agent: sitecheck.internetseer.com
Disallow: /

User-agent: Zealbot
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: Fetch
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: linko
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: Microsoft.URL.Control
Disallow: /

User-agent: Xenu
Disallow: /

User-agent: larbin
Disallow: /

User-agent: libwww
Disallow: /

User-agent: ZyBORG
Disallow: /

User-agent: Download Ninja
Disallow: /

#
# Sorry, wget in its recursive mode is a frequent problem.
# Please read the man page and use it properly; there is a
# --wait option you can use to set the delay between hits,
# for instance.
#
User-agent: wget
Disallow: /

#
# The 'grub' distributed client has been *very* poorly behaved.
#
User-agent: grub-client
Disallow: /

#
# Doesn't follow robots.txt anyway, but...
#
User-agent: k2spider
Disallow: /

#
# Hits many times per second, not acceptable
# http://www.nameprotect.com/botinfo.html
User-agent: NPBot
Disallow: /

# A capture bot, downloads gazillions of pages with no public benefit
# http://www.webreaper.net/
User-agent: WebReaper
Disallow: /

# Don't allow the wayback-maschine to index user-pages
#User-agent: ia_archiver
#Disallow: /wiki/User
#Disallow: /wiki/Benutzer

#
# Friendly, low-speed bots are welcome viewing article pages, but not
# dynamically-generated pages please.
#
# Inktomi's "Slurp" can read a minimum delay between hits; if your
# bot supports such a thing using the 'Crawl-delay' or another
# instruction, please let us know.
#
User-agent: *
Disallow: /w/
Disallow: /trap/
Disallow: /wiki/Especial:Search
Disallow: /wiki/Especial%3ASearch
Disallow: /wiki/Special:Random
Disallow: /wiki/Special%3ARandom
Disallow: /wiki/Special:Search
Disallow: /wiki/Special%3ASearch
Disallow: /wiki/Spesial:Search
Disallow: /wiki/Spesial%3ASearch
Disallow: /wiki/Spezial:Search
Disallow: /wiki/Spezial%3ASearch
Disallow: /wiki/Specjalna:Search
Disallow: /wiki/Specjalna%3ASearch
Disallow: /wiki/Speciaal:Search
Disallow: /wiki/Speciaal%3ASearch
Disallow: /wiki/Speciaal:Random
Disallow: /wiki/Speciaal%3ARandom
Disallow: /wiki/Speciel:Search
Disallow: /wiki/Speciel%3ASearch
Disallow: /wiki/Speciale:Search
Disallow: /wiki/Speciale%3ASearch
Disallow: /wiki/Istimewa:Search
Disallow: /wiki/Istimewa%3ASearch
Disallow: /wiki/Toiminnot:Search
Disallow: /wiki/Toiminnot%3ASearch
#
# ar:
Disallow: /wiki/%D8%AE%D8%A7%D8%B5:Search
Disallow: /wiki/%D8%AE%D8%A7%D8%B5%3ASearch
#
# de:
# http://bugzilla.wikimedia.org/show_bug.cgi?id=4937
# sensible deletion and meta user discussion pages:
Disallow: /wiki/Wikipedia:L%C3%B6schkandidaten/
Disallow: /wiki/Wikipedia:Löschkandidaten/
Disallow: /wiki/Wikipedia:Vandalensperrung/
Disallow: /wiki/Wikipedia:Benutzersperrung/
Disallow: /wiki/Wikipedia:Vermittlungsausschuss/
Disallow: /wiki/Wikipedia:Administratoren/Probleme/
Disallow: /wiki/Wikipedia:Adminkandidaturen/
Disallow: /wiki/Wikipedia:Qualitätssicherung/
Disallow: /wiki/Wikipedia:Qualit%C3%A4tssicherung/
# Search- and random-page
Disallow: /wiki/Spezial:Suche
Disallow: /wiki/Special:Suche
Disallow: /wiki/Spezial:Zufällige_Seite
Disallow: /wiki/Spezial:Zuf%C3%A4llige_Seite
Disallow: /wiki/Special:Zufällige_Seite
Disallow: /wiki/Special:Zuf%C3%A4llige_Seite
# 4937#5
Disallow: /wiki/Wikipedia:Vandalismusmeldung/
Disallow: /wiki/Wikipedia:Gesperrte_Lemmata/
Disallow: /wiki/Wikipedia:Löschprüfung/
Disallow: /wiki/Wikipedia:L%C3%B6schprüfung/
Disallow: /wiki/Wikipedia:Administratoren/Notizen/
Disallow: /wiki/Wikipedia:Schiedsgericht/Anfragen/
Disallow: /wiki/Wikipedia:L%C3%B6schpr%C3%BCfung/
# http://bugzilla.wikimedia.org/show_bug.cgi?id=12111
Disallow: /wiki/Wikipedia:Checkuser/
Disallow: /wiki/Wikipedia_Diskussion:Checkuser/
Disallow: /wiki/Wikipedia_Diskussion:Adminkandidaturen/
#
# Folks get annoyed when VfD discussions end up the number 1 google hit for
# their name. See bugzilla bug #4776
# en:
Disallow: /wiki/Wikipedia:Articles_for_deletion/
Disallow: /wiki/Wikipedia%3AArticles_for_deletion/
Disallow: /wiki/Wikipedia:Votes_for_deletion/
Disallow: /wiki/Wikipedia%3AVotes_for_deletion/
Disallow: /wiki/Wikipedia:Pages_for_deletion/
Disallow: /wiki/Wikipedia%3APages_for_deletion/
Disallow: /wiki/Wikipedia:Miscellany_for_deletion/
Disallow: /wiki/Wikipedia%3AMiscellany_for_deletion/
Disallow: /wiki/Wikipedia:Miscellaneous_deletion/
Disallow: /wiki/Wikipedia%3AMiscellaneous_deletion/
Disallow: /wiki/Wikipedia:Copyright_problems
Disallow: /wiki/Wikipedia%3ACopyright_problems
Disallow: /wiki/Wikipedia:Protected_titles/
Disallow: /wiki/Wikipedia%3AProtected_titles/
#
# fi:
# http://bugzilla.wikimedia.org/show_bug.cgi?id=8695
Disallow: /wiki/Wikipedia:Poistettavat_sivut
Disallow: /wiki/K%C3%A4ytt%C3%A4j%C3%A4:
Disallow: /wiki/Käyttäjä:
Disallow: /wiki/Keskustelu_k%C3%A4ytt%C3%A4j%C3%A4st%C3%A4:
Disallow: /wiki/Keskustelu_käyttäjästä:
Disallow: /wiki/Wikipedia:Yll%C3%A4pit%C3%A4j%C3%A4t/
Disallow: /wiki/Wikipedia:Ylläpitäjät/
#
# fr:
Disallow: /wiki/Wikip%C3%A9dia:Pages_%C3%A0_supprimer/
Disallow: /wiki/Wikip%C3%A9dia:Pages_soup%C3%A7onn%C3%A9es_de_violation_de_copyright/
#
# he:
Disallow: /wiki/%D7%9E%D7%99%D7%95%D7%97%D7%93:Search
Disallow: /wiki/%D7%9E%D7%99%D7%95%D7%97%D7%93%3ASearch
#
# hu:
Disallow: /wiki/Speci%C3%A1lis:Search
Disallow: /wiki/Speci%C3%A1lis%3ASearch
#
# ja:
Disallow: /wiki/%E7%89%B9%E5%88%A5:Search
Disallow: /wiki/%E7%89%B9%E5%88%A5%3ASearch
#
# ru:
Disallow: /wiki/%D0%A1%D0%BF%D0%B5%D1%86%D0%B8%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5:Search
Disallow: /wiki/%D0%A1%D0%BF%D0%B5%D1%86%D0%B8%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5%3ASearch
#
# ja:
# https://bugzilla.wikimedia.org/show_bug.cgi?id=5239
Disallow: /wiki/Wikipedia:%E5%89%8A%E9%99%A4%E4%BE%9D%E9%A0%BC/
Disallow: /wiki/Wikipedia%3A%E5%89%8A%E9%99%A4%E4%BE%9D%E9%A0%BC/
Disallow: /wiki/Wikipedia:%E5%88%A9%E7%94%A8%E8%80%85%E3%83%9A%E3%83%BC%E3%82%B8%E3%81%AE%E5%89%8A%E9%99%A4%E4%BE%9D%E9%A0%BC
Disallow: /wiki/Wikipedia%3A%E5%88%A9%E7%94%A8%E8%80%85%E3%83%9A%E3%83%BC%E3%82%B8%E3%81%AE%E5%89%8A%E9%99%A4%E4%BE%9D%E9%A0%BC
#
# pt:
# https://bugzilla.wikimedia.org/show_bug.cgi?id=5394
Disallow: /wiki/Wikipedia:Páginas_para_eliminar/
Disallow: /wiki/Wikipedia:P%C3%A1ginas_para_eliminar/
Disallow: /wiki/Wikipedia%3AP%C3%A1ginas_para_eliminar/
Disallow: /wiki/Wikipedia_Discussão:Páginas_para_eliminar/
Disallow: /wiki/Wikipedia_Discuss%C3%A3o:P%C3%A1ginas_para_eliminar/
Disallow: /wiki/Wikipedia_Discuss%C3%A3o%3AP%C3%A1ginas_para_eliminar/
#
# zh:
# https://bugzilla.wikimedia.org/show_bug.cgi?id=5104
Disallow: /wiki/Wikipedia:删除投票/侵权
Disallow: /wiki/Wikipedia:%E5%88%A0%E9%99%A4%E6%8A%95%E7%A5%A8/%E4%BE%B5%E6%9D%83
Disallow: /wiki/Wikipedia:删除投票和请求
Disallow: /wiki/Wikipedia:%E5%88%A0%E9%99%A4%E6%8A%95%E7%A5%A8%E5%92%8C%E8%AF%B7%E6%B1%82
Disallow: /wiki/Category:快速删除候选
Disallow: /wiki/Category:%E5%BF%AB%E9%80%9F%E5%88%A0%E9%99%A4%E5%80%99%E9%80%89
Disallow: /wiki/Category:维基百科需要翻译的文章
Disallow: /wiki/Category:%E7%BB%B4%E5%9F%BA%E7%99%BE%E7%A7%91%E9%9C%80%E8%A6%81%E7%BF%BB%E8%AF%91%E7%9A%84%E6%96%87%E7%AB%A0
#
# it: - http://bugzilla.wikimedia.org/show_bug.cgi?id=5545
Disallow: /wiki/Wikipedia:Pagine_da_cancellare
Disallow: /wiki/Wikipedia%3APagine_da_cancellare
Disallow: /wiki/Wikipedia:Utenti_problematici
Disallow: /wiki/Wikipedia%3AUtenti_problematici
Disallow: /wiki/Wikipedia:Vandalismi_in_corso
Disallow: /wiki/Wikipedia%3AVandalismi_in_corso
Disallow: /wiki/Wikipedia:Amministratori
Disallow: /wiki/Wikipedia%3AAmministratori
Disallow: /wiki/Wikipedia:Proposte_di_cancellazione_semplificata
Disallow: /wiki/Wikipedia%3AProposte_di_cancellazione_semplificata
Disallow: /wiki/Categoria:Da_cancellare_subito
Disallow: /wiki/Categoria%3ADa_cancellare_subito
Disallow: /wiki/Wikipedia:Sospette_violazioni_di_copyright
Disallow: /wiki/Wikipedia%3ASospette_violazioni_di_copyright
Disallow: /wiki/Categoria:Da_controllare_per_copyright
Disallow: /wiki/Categoria%3ADa_controllare_per_copyright
# added 2007-01-12
Disallow: /wiki/Progetto:Rimozione_contributi_sospetti
Disallow: /wiki/Progetto%3ARimozione_contributi_sospetti
Disallow: /wiki/Categoria:Da_cancellare_subito_per_violazione_integrale_copyright
Disallow: /wiki/Categoria%3ADa_cancellare_subito_per_violazione_integrale_copyright
Disallow: /wiki/Progetto:Cococo
Disallow: /wiki/Progetto%3ACococo
Disallow: /wiki/Discussioni_progetto:Cococo
Disallow: /wiki/Discussioni_progetto%3ACococo
#
# pl.wikipedia.org
# http://bugzilla.wikimedia.org/show_bug.cgi?id=8067
Disallow: /wiki/Wikipedia:Strony_do_usuni%C4%99cia
Disallow: /wiki/Wikipedia%3AStrony_do_usuni%C4%99cia
Disallow: /wiki/Wikipedia:Do_usuni%C4%99cia
Disallow: /wiki/Wikipedia%3ADo_usuni%C4%99cia
Disallow: /wiki/Wikipedia:SDU/
Disallow: /wiki/Wikipedia%3ASDU/
Disallow: /wiki/Wikipedia:Strony_podejrzane_o_naruszenie_praw_autorskich
Disallow: /wiki/Wikipedia%3AStrony_podejrzane_o_naruszenie_praw_autorskich
#
# en.wikinews:
# https://bugzilla.wikimedia.org/show_bug.cgi?id=5340
Disallow: /wiki/Portal:Prepared_stories/
Disallow: /wiki/Portal%3APrepared_stories/
#
# it.wikinews, http://bugzilla.wikimedia.org/show_bug.cgi?id=9138
Disallow: /wiki/Wikinotizie:Richieste_di_cancellazione
Disallow: /wiki/Wikinotizie:Sospette_violazioni_di_copyright
Disallow: /wiki/Categoria:Da_cancellare_subito
Disallow: /wiki/Categoria:Da_cancellare_subito_per_violazione_integrale_copyright
Disallow: /wiki/Wikinotizie:Storie_in_preparazione
#
# he.wikipedia, http://bugzilla.wikimedia.org/show_bug.cgi?id=9517
Disallow: /wiki/ויקיפדיה:רשימת_מועמדים_למחיקה/
Disallow: /wiki/ויקיפדיה%3Aרשימת_מועמדים_למחיקה/
Disallow: /wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94:%D7%A8%D7%A9%D7%99%D7%9E%D7%AA_%D7%9E%D7%95%D7%A2%D7%9E%D7%93%D7%99%D7%9D_%D7%9C%D7%9E%D7%97%D7%99%D7%A7%D7%94/
Disallow: /wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94%3A%D7%A8%D7%A9%D7%99%D7%9E%D7%AA_%D7%9E%D7%95%D7%A2%D7%9E%D7%93%D7%99%D7%9D_%D7%9C%D7%9E%D7%97%D7%99%D7%A7%D7%94/
#
Disallow: /wiki/ויקיפדיה:ערכים_לא_קיימים_ומוגנים
Disallow: /wiki/ויקיפדיה%3Aערכים_לא_קיימים_ומוגנים
Disallow: /wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94:%D7%A2%D7%A8%D7%9B%D7%99%D7%9D_%D7%9C%D7%90_%D7%A7%D7%99%D7%99%D7%9E%D7%99%D7%9D_%D7%95%D7%9E%D7%95%D7%92%D7%A0%D7%99%D7%9D
Disallow: /wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94%3A%D7%A2%D7%A8%D7%9B%D7%99%D7%9D_%D7%9C%D7%90_%D7%A7%D7%99%D7%99%D7%9E%D7%99%D7%9D_%D7%95%D7%9E%D7%95%D7%92%D7%A0%D7%99%D7%9D
Disallow: /wiki/ויקיפדיה:דפים_לא_קיימים_ומוגנים
Disallow: /wiki/ויקיפדיה%3Aדפים_לא_קיימים_ומוגנים
Disallow: /wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94:%D7%93%D7%A4%D7%99%D7%9D_%D7%9C%D7%90_%D7%A7%D7%99%D7%99%D7%9E%D7%99%D7%9D_%D7%95%D7%9E%D7%95%D7%92%D7%A0%D7%99%D7%9D
Disallow: /wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94%3A%D7%93%D7%A4%D7%99%D7%9D_%D7%9C%D7%90_%D7%A7%D7%99%D7%99%D7%9E%D7%99%D7%9D_%D7%95%D7%9E%D7%95%D7%92%D7%A0%D7%99%D7%9D
#
# sv.wikipedia, #10229
Disallow: /wiki/Wikipedia%3ASidor_f%C3%B6reslagna_f%C3%B6r_radering
Disallow: /wiki/Wikipedia:Sidor_f%C3%B6reslagna_f%C3%B6r_radering
Disallow: /wiki/Wikipedia:Sidor_föreslagna_för_radering
#
Disallow: /wiki/Användare
Disallow: /wiki/Anv%C3%A4ndare
#
Disallow: /wiki/Användardiskussion
Disallow: /wiki/Anv%C3%A4ndardiskussion
#
Disallow: /wiki/Wikipedia:Skyddade_sidnamn
Disallow: /wiki/Wikipedia%3ASkyddade_sidnamn
#
## *at least* 1 second please. preferably more :D
## we're disabling this experimentally 11-09-2006
#Crawl-delay: 1
#
Disallow: /wiki/Wikibooks:Votes_for_deletion
#
# 11291
Disallow: /wiki/Wikipedia:Sidor_som_bör_raderas
Disallow: /wiki/Wikipedia:Sidor_som_b%C3%B6r_raderas
Disallow: /wiki/Wikipedia%3ASidor_som_b%C3%B6r_raderas
#
# 11261
Disallow: /wiki/Wikipedia:Requests_for_arbitration/
Disallow: /wiki/Wikipedia%3ARequests_for_arbitration/
Disallow: /wiki/Wikipedia:Requests_for_comment/
Disallow: /wiki/Wikipedia%3ARequests_for_comment/
Disallow: /wiki/Wikipedia:Requests_for_adminship/
Disallow: /wiki/Wikipedia%3ARequests_for_adminship/
#
# 10288
Disallow: /wiki/Wikipedia_talk:Articles_for_deletion/
Disallow: /wiki/Wikipedia_talk%3AArticles_for_deletion/
Disallow: /wiki/Wikipedia_talk:Votes_for_deletion/
Disallow: /wiki/Wikipedia_talk%3AVotes_for_deletion/
Disallow: /wiki/Wikipedia_talk:Pages_for_deletion/
Disallow: /wiki/Wikipedia_talk%3APages_for_deletion/
Disallow: /wiki/Wikipedia_talk:Miscellany_for_deletion/
Disallow: /wiki/Wikipedia_talk%3AMiscellany_for_deletion/
Disallow: /wiki/Wikipedia_talk:Miscellaneous_deletion/
Disallow: /wiki/Wikipedia_talk%3AMiscellaneous_deletion/
#
# 6746
Disallow: /wiki/Wikipedia:Consultas_de_borrado/
Disallow: /wiki/Wikipedia%3AConsultas_de_borrado/
#
# 11432
Disallow: /wiki/Bruker:
Disallow: /wiki/Bruker%3A
Disallow: /wiki/Brukerdiskusjon
Disallow: /wiki/Wikipedia:Administratorer
Disallow: /wiki/Wikipedia%3AAdministratorer
Disallow: /wiki/Wikipedia-diskusjon:Administratorer
Disallow: /wiki/Wikipedia-diskusjon%3AAdministratorer
Disallow: /wiki/Wikipedia:Sletting
Disallow: /wiki/Wikipedia%3ASletting
Disallow: /wiki/Wikipedia-diskusjon:Sletting
Disallow: /wiki/Wikipedia-diskusjon%3ASletting
Disallow: /wiki/Spesial:
Disallow: /wiki/Spesial%3A
#
# working...
Disallow: /wiki/Fundraising_2007/comments
#
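
To see how a crawler might interpret rules like those above, the following minimal sketch feeds an abridged excerpt of the file to Python's standard `urllib.robotparser` module. The agent name `ExampleBot`, the helper `build_parser`, and the tested paths are illustrative assumptions and do not come from the original file.

```python
from urllib.robotparser import RobotFileParser

# Abridged excerpt of the Wikipedia rules quoted above.
WIKIPEDIA_EXCERPT = """\
User-agent: wget
Disallow: /

User-agent: *
Disallow: /w/
Disallow: /trap/
Disallow: /wiki/Special:Search
"""


def build_parser(robots_text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


wikipedia_rules = build_parser(WIKIPEDIA_EXCERPT)

# "ExampleBot" is a hypothetical crawler name that falls under "User-agent: *".
article_allowed = wikipedia_rules.can_fetch("ExampleBot", "/wiki/Robots_exclusion_standard")
script_allowed = wikipedia_rules.can_fetch("ExampleBot", "/w/index.php")

# wget has its own record with "Disallow: /", so every path is off limits.
wget_allowed = wikipedia_rules.can_fetch("wget", "/wiki/Robots_exclusion_standard")

print(article_allowed, script_allowed, wget_allowed)
```

Under this parser, article pages stay reachable for ordinary bots, while the dynamically generated pages under `/w/` and everything requested by `wget` are refused. This matches the intent stated in the file's comments.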

google.com's robots.txt

User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /relcontent
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /default
Disallow: /m?
Disallow: /m/lcb
Disallow: /m/search?
Disallow: /wml?
Disallow: /wml/search?
Disallow: /xhtml?
Disallow: /xhtml/search?
Disallow: /xml?
Disallow: /imode?
Disallow: /imode/search?
Disallow: /jsky?
Disallow: /jsky/search?
Disallow: /pda?
Disallow: /pda/search?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /gwt/
Disallow: /purchases
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /products?
Disallow: /froogle_
Disallow: /product_
Disallow: /products_
Disallow: /print
Disallow: /books
Disallow: /patents?
Disallow: /scholar?
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /mapstt?
Disallow: /mapslt?
Disallow: /maps/stk/
Disallow: /mapabcpoi?
Disallow: /translate?
Disallow: /ie?
Disallow: /sms/demo?
Disallow: /katrina?
Disallow: /blogsearch?
Disallow: /blogsearch/
Disallow: /blogsearch_feeds
Disallow: /advanced_blog_search
Disallow: /reader/
Disallow: /uds/
Disallow: /chart?
Disallow: /transit?
Disallow: /mbd?
Disallow: /extern_js/
Disallow: /calendar/feeds/
Disallow: /calendar/ical/
Disallow: /cl2/feeds/
Disallow: /cl2/ical/
Disallow: /coop/directory
Disallow: /coop/manage
Disallow: /trends?
Disallow: /trends/music?
Disallow: /notebook/search?
Disallow: /music
Disallow: /browsersync
Disallow: /call
Disallow: /archivesearch?
Disallow: /archivesearch/url
Disallow: /archivesearch/advanced_search
Disallow: /base/search?
Disallow: /base/reportbadoffer
Disallow: /base/s2
Disallow: /urchin_test/
Disallow: /movies?
Disallow: /codesearch?
Disallow: /codesearch/feeds/search?
Disallow: /wapsearch?
Disallow: /safebrowsing
Disallow: /reviews/search?
Disallow: /orkut/albums
Disallow: /jsapi
Disallow: /views?
Disallow: /c/
Disallow: /cbk
Disallow: /recharge/dashboard/car
Disallow: /recharge/dashboard/static/
Disallow: /translate_c?
Disallow: /s2
Disallow: /transconsole/portal/
Disallow: /gcc/
Disallow: /aclk
Disallow: /cse?
Disallow: /tbproxy/
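
The file above opens with an `Allow` line for `/searchhistory/` even though the broader `/search` prefix is disallowed. The sketch below checks that interaction on an abridged excerpt with Python's standard `urllib.robotparser`. The agent name `ExampleBot` and the path `/about` are hypothetical and serve only as test inputs.

```python
from urllib.robotparser import RobotFileParser

# Abridged excerpt of the google.com rules quoted above.
GOOGLE_EXCERPT = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
"""

google_rules = RobotFileParser()
google_rules.parse(GOOGLE_EXCERPT.splitlines())

# /searchhistory/ shares the /search prefix but has its own Allow line.
history_allowed = google_rules.can_fetch("ExampleBot", "/searchhistory/")

# /search itself is covered by "Disallow: /search".
search_allowed = google_rules.can_fetch("ExampleBot", "/search")

# A path that no rule mentions (hypothetical) is allowed by default.
about_allowed = google_rules.can_fetch("ExampleBot", "/about")

print(history_allowed, search_allowed, about_allowed)
```

`urllib.robotparser` applies the first rule that matches, in file order. Other crawlers may resolve `Allow`/`Disallow` conflicts differently, for example by preferring the longest matching path. For this excerpt both approaches give the same answer.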

Xinhuanet's robots.txt

URL: http://www.xinhua.com/robots.txt

# robots.txt to block all bots except bots from Google , MSN , Yahoo
User-agent: Googlebot
Disallow:
User-agent: Slurp
Disallow:
User-agent: MSNBot
Disallow:
User-agent: *
Disallow: /
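
The rules above form an allowlist: an empty `Disallow:` grants the named bots full access, and the final `User-agent: *` record blocks everyone else. As a rough check, the sketch below parses the same rules with Python's standard `urllib.robotparser`. The helper `allowed_agents` and the agent name `ExampleBot` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The Xinhuanet rules quoted above, reproduced without the leading comment.
XINHUA_RULES = """\
User-agent: Googlebot
Disallow:
User-agent: Slurp
Disallow:
User-agent: MSNBot
Disallow:
User-agent: *
Disallow: /
"""

xinhua_rules = RobotFileParser()
xinhua_rules.parse(XINHUA_RULES.splitlines())


def allowed_agents(parser, agents, url="/"):
    """Return the agents from `agents` that may fetch `url`."""
    return [agent for agent in agents if parser.can_fetch(agent, url)]


# "ExampleBot" is a hypothetical crawler that is not named in the file.
permitted = allowed_agents(
    xinhua_rules, ["Googlebot", "Slurp", "MSNBot", "ExampleBot"]
)
print(permitted)
```

With this parser only the three named crawlers pass the check. Any other agent falls through to the catch-all record and is denied.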