For When You Can't Have The Real Thing

robots

Created by Administrator. Last edited by Administrator, 20 years and 125 days ago.
# This file contains User-Agent mappings for robots
#
# WWW Search Engines
Googlebot http://www.googlebot.com/bot.html
FAST-WebCrawler http://fast.no/support/crawler.asp
Slurp http://www.inktomi.com/slurp.html
Teoma http://www.teoma.com/
#
# Syndication Crawler
Syndic8 http://www.syndic8.com
organica http://organica.us 
Popdexter http://www.popdex.com
#
# Robots and crawlers to be ignored
#
ZyBorg http://www.WISEnutbot.com IGNORE
larbon liston@cc.gatech.edu IGNORE
Baiduspider http://www.baidu.com/search/spider.htm IGNORE
Infoseek http://www.infoseek.com/ IGNORE
TurnitinBot http://www.turnitin.com/robot/crawlerinfo.html IGNORE
NIF http://www.newsisfree.com/robot.php IGNORE
NPBot http://www.nameprotect.com/botinfo.html IGNORE
Robot ??? IGNORE
Scooter ??? IGNORE
Mercator ??? IGNORE
EvilBot ??? IGNORE
BlogBot ??? IGNORE
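
The listing above appears to use one whitespace-separated entry per line: a robot name, a URL or contact address ("???" where none is known), and an optional trailing IGNORE flag, with # starting a comment. A minimal sketch of reading that layout follows. The names RobotEntry, parse_robots and match_robot are hypothetical, and the case-insensitive substring match against the User-Agent header is an assumption, not necessarily how SnipSnap itself applies the file.

```python
from typing import List, NamedTuple, Optional


class RobotEntry(NamedTuple):
    name: str     # token to look for in the User-Agent header
    info: str     # URL or contact from the second column ("???" if unknown)
    ignore: bool  # True when the line ends with the IGNORE flag


def parse_robots(text: str) -> List[RobotEntry]:
    """Parse 'name info [IGNORE]' lines, skipping blanks and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        ignore = fields[-1] == "IGNORE"
        if ignore:
            fields = fields[:-1]
        if not fields:
            continue  # a line holding only the flag carries no robot name
        name = fields[0]
        info = fields[1] if len(fields) > 1 else ""
        entries.append(RobotEntry(name, info, ignore))
    return entries


def match_robot(user_agent: str, entries: List[RobotEntry]) -> Optional[RobotEntry]:
    """Return the first entry whose name occurs in the User-Agent (assumed rule)."""
    ua = user_agent.lower()
    for entry in entries:
        if entry.name.lower() in ua:
            return entry
    return None


SAMPLE = """\
# WWW Search Engines
Googlebot http://www.googlebot.com/bot.html
ZyBorg http://www.WISEnutbot.com IGNORE
Scooter ??? IGNORE
"""

if __name__ == "__main__":
    robots = parse_robots(SAMPLE)
    hit = match_robot("Mozilla/5.0 (compatible; Googlebot/2.1)", robots)
    print(hit)
```

Under this reading, a request is treated as coming from a known robot when match_robot returns an entry, and the entry's ignore field says whether it falls in the "to be ignored" group.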
This is a collection of technical information, much of it learned the hard way. Consider it a lab book or a /info directory. I doubt much of it will be of use to anyone else.

snipsnap.org | Copyright 2000-2002 Matthias L. Jugel and Stephan J. Schmidt