Search engine spider - Definition and Overview

See WebCrawler for the specific search engine of that name.


A web crawler (also known as a web spider) is a program that browses the World Wide Web in a methodical, automated manner. A web crawler is one type of bot. Web crawlers not only keep a copy of the pages they visit for later processing (for example, by a search engine) but may also index these pages so that searches can be narrowed down quickly.

In general, the web crawler starts with a list of URLs to visit. As it visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit. The process is either ended manually, or after a certain number of links have been followed.
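The loop described above can be sketched in Python roughly as follows. This is a minimal illustration, not a production crawler: the names `crawl`, `LinkExtractor`, `fetch` and `max_pages` are hypothetical, the `fetch` function is supplied by the caller, and the `seen` set (which prevents revisiting a URL) is an assumption added for the sketch rather than something the text above specifies.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    """Visit pages breadth-first from `seeds`, stopping after `max_pages`.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the page's HTML, or None on failure.
    """
    frontier = deque(seeds)  # URLs still to visit
    seen = set(seeds)        # URLs already queued, to avoid revisits
    visited = []             # URLs in the order they were fetched

    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        visited.append(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links against the current page and drop #fragments.
            absolute, _ = urldefrag(urljoin(url, href))
            if absolute.startswith(("http://", "https://")) and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a dictionary's `get` method is enough to try it on a handful of in-memory pages.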

Web crawlers typically take great care to spread their visits to a particular site over a period of time, because they access many more pages than the normal (human) user and therefore can make the site appear slow to the other users if they access the same site repeatedly.
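One simple way to spread visits over time is to enforce a minimum delay between successive requests to the same host. The sketch below is only an illustration of that idea: the class name `PolitenessTimer`, the 5-second default delay, and the injectable `clock` and `sleep` parameters are all assumptions, not values taken from any particular crawler.

```python
import time
from urllib.parse import urlsplit


class PolitenessTimer:
    """Enforces a minimum delay between requests to the same host."""

    def __init__(self, delay_seconds=5.0, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait_for(self, url):
        """Block until `url`'s host may be contacted; return seconds waited."""
        host = urlsplit(url).netloc
        now = self.clock()
        wait = max(0.0, self.next_allowed.get(host, now) - now)
        if wait > 0:
            self.sleep(wait)
        # The next request to this host must come at least `delay` later.
        self.next_allowed[host] = now + wait + self.delay
        return wait
```

A crawler would call `wait_for(url)` immediately before each fetch. Requests to different hosts do not delay one another, so the crawler can stay busy while still treating each individual site gently.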

For similar reasons, web crawlers are supposed to obey the robots.txt protocol, with which web site owners can indicate which pages should not be spidered.
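As an illustration, Python's standard library includes `urllib.robotparser`, which can evaluate robots.txt rules. In the sketch below the robots.txt content, the helper name `build_checker`, the user-agent string "ExampleBot" and the example.com URLs are all made up for the example; a real crawler would download the file from the site it is about to visit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_checker(robots_text):
    """Parse robots.txt text and return a parser that answers can_fetch()."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


checker = build_checker(ROBOTS_TXT)
allowed = checker.can_fetch("ExampleBot", "http://example.com/index.html")
blocked = checker.can_fetch("ExampleBot", "http://example.com/private/data.html")
```

Here `allowed` is `True` and `blocked` is `False`, so a well-behaved crawler would fetch the first URL and skip the second.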

Because crawlers follow links and do not submit queries to databases, much content that can be reached only through such queries is never visited. This unvisited content is known as the deep web.

See also: Google, PageRank, Data mining

Copyright 2009 WordIQ.com
This article is licensed under the GNU Free Documentation License. It uses material from this Wikipedia article.