Image (Google_screenshot.png): The unusually spartan design, uncluttered appearance and quick loading time of Google's main page have contributed greatly to the site's mass appeal.
Google is a U.S.-based search engine owned by Google Inc. whose mission "is to organize the world's information and make it universally accessible and useful." The largest search engine on the web, Google receives over 200 million queries each day through its various services. In addition to its tool for searching webpages, Google also provides services for searching images, Usenet newsgroups, news websites, videos and items for sale online. As of November 2004, Google has indexed 8.05 billion webpages, 880 million images, and 845 million Usenet messages; in total, over 9.5 billion items. It also caches much of the content that it indexes. Other programs operated by Google include Blogger, Picasa, Keyhole, Froogle, and Google Desktop Search.
Trademark and domain names
"To google," as a verb, has come to mean "to search for something on Google"; because of Google's popularity (currently 52 percent of all web searches [1] (http://www.iht.com/articles/2005/01/30/business/google31.html), though it has been as high as 80 percent) it has also generically come to mean "to search the web." Google officials have discouraged this usage of the company name out of fear of trademark dilution, as it could lead to the name becoming a genericized trademark. Like several other large companies on the internet, Google has purchased a number of similar-sounding domain names (e.g. gogle.com, googel.com) and redirects them to its own site in order to prevent domain hijacking; otherwise, these domain names could have been registered by others. Some humorous ones, such as cheesemuffin.info, are also "secretly" used by Google. This is not a Google-only practice. For example, in addition to the apple.com domain name, Apple Computer owns the mammals.org domain name. Other domains may be found in the "external links" section of this article.
History
Google began as a research project in early 1996 by Larry Page and Sergey Brin, two Stanford graduate students who developed the theory that a search engine based on a mathematical analysis of the relationships between websites would produce better results than the basic techniques then in use. It was originally nicknamed BackRub because the system checked backlinks to estimate a site's importance. Convinced that the pages with the most links to them from other highly relevant webpages must be the most relevant ones, Page and Brin decided to test their thesis as part of their studies, and laid the foundation for their search engine. They formally founded their company, Google Inc., on September 7, 1998 at a friend's garage in Menlo Park, California. google.com went live soon thereafter under the name "Google!" (with an exclamation mark). In February 1999, the company moved into the somewhat notorious office at 165 University Ave., Palo Alto, California, before moving to its current "Googleplex" location later that year.
Google's simple design resulted from Brin's lack of interest in writing HTML and his preference for a simple interface. Advertisements were sold by the keyword so that they would be more relevant to the end user, and the ads were text-based in order to keep page design uncluttered and fast-loading. In September 2001, Google's ranking mechanism PageRank was awarded a U.S. patent. The patent was officially awarded to Stanford University and lists Lawrence Page as the inventor. At its peak in early 2004, Google handled upwards of 80 percent of all search requests on the Internet through its website and clients like Yahoo!, AOL, and CNN. [2] (http://www.onestat.com/html/aboutus_pressbox21.html) Google's share of internet search fell in 2004 when Yahoo! dropped Google's search technology in favor of its own.
Google includes humorous features such as cartoon modifications [3] (http://www.google.com/holidaylogos.html) (called "Google Doodles") of its logo for special occasions, the option to display the site in fictional or humorous languages such as Klingon and Leet, and April Fools' Day jokes about the company. It is conjectured that Google's future is personalized searches, using the data that is gathered from Orkut, Gmail, and Froogle to give results based on an individual's previous actions. In fact, there is a Personalized Google Search (http://labs.google.com/personalized) Beta in Google Labs (http://labs.google.com/), the experimental section of the site.
Etymology
The name "Google" is an accidental misspelling of the word googol, which was coined in 1938 by Milton Sirotta, nephew of U.S. mathematician Edward Kasner, to refer to the number represented by 1 followed by a hundred zeros. Google's use of the term reflects the company's mission to organize the immense amount of information available on the Web.
Controversy
A number of organizations have used the Digital Millennium Copyright Act to demand that Google remove references to allegedly copyrighted material on other sites. Google typically handles this by removing the link as requested and including a link to the complaint in the search results. There have also been complaints that Google's web cache feature violates copyright. However, Google provides mechanisms for requesting that caching be disabled, and it respects such requests. It also honors the robots.txt file, another mechanism that allows operators of a website to request that part or all of their site not be included in search engine results. In 2002, news reports surfaced that the Google search engine had been banned in mainland China. The ban was later lifted, and some reports indicated that it was not Google itself that was targeted.
Rather, Google's web cache feature, which provides a cached version of most websites, would allow Chinese web users to circumvent the ban of a specific website merely by using Google. Google's efforts to refine its database have led to some legal controversy, notably a lawsuit in October 2002 from the company SearchKing, which sought to sell advertisements on pages with inflated Google rankings. In its defense, Google stated that its rankings are its constitutionally protected opinions of the web sites that it indexes. A judge subsequently threw out SearchKing's lawsuit in mid-2003 on precisely these grounds. In late 2003 and early 2004, there were rumors that Google would be sued by the SCO Group over its use of the Linux operating system, in conjunction with SCO's lawsuit against IBM over the claimed ownership of intellectual property rights relating to Linux. See also the Criticisms of Google section below.
The search engine
Physical structure
Google employs server farms of Linux computers around the world to respond to search requests and to index the web. The server farms are built using a shared nothing architecture. The indexing is performed by a program called Googlebot, which periodically requests new copies of web pages it already knows about. The more often a page updates, the more often Googlebot will visit. The links in these pages are examined to discover new pages to be added to its internal database of the web. This index database and web page cache are several terabytes in size. Google has developed its own filesystem, the Google File System, for storing all this data. The exact size and whereabouts of the physical machines in the Google search engine are unknown, and official figures remain intentionally vague.
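The article notes that Google honors the robots.txt file when deciding what to include. As a rough illustration of how a crawler can check such a file before requesting a page, the following sketch uses the robots.txt parser from Python's standard library. It is not Googlebot's implementation; the crawler name "ExampleBot", the example.com URLs and the sample rules are hypothetical and chosen only for demonstration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay
# out of the /private/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/page.html"):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

A well-behaved crawler would run a check of this kind for each URL it discovers and skip the pages that the site operator has asked to exclude.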
According to John Hennessy and David Patterson's Computer Architecture: A Quantitative Approach, Google's server farm computer cluster in the year 2000 consisted of approximately 6000 processors and 12000 common IDE disks (one processor and two disks per machine) at four sites: two in Silicon Valley, CA and two in Virginia. Each site had an OC 48 (2488 Mbit/s) internet connection and an OC 12 (622 Mbit/s) connection to other Google sites. The connections are eventually routed down to 4 x 1 Gbit/s lines connecting up to 64 racks, with 40 machines and an ethernet switch on each side of each rack, so that a rack can hold 80 machines and two ethernet switches. Based on the Google IPO S-1 form released in April 2004, Tristan Louis, the Vice President of application development for the Internet unit of a large financial firm, estimated the current server farm to contain something like the following [4] (http://www.tnl.net/blog/entry/How_many_Google_machines):
According to this estimate, the Google server farm constitutes one of the most powerful supercomputers in the world, at 126-316 teraflops, being able to perform at least three times as many calculations per second as the Earth Simulator.
PageRank and indexing
Google uses an algorithm called PageRank to rank web pages that match a given search string. The PageRank algorithm computes a recursive figure of merit for web pages, based on the weighted sum of the PageRanks of the pages linking to them. The PageRank thus derives from human-generated links, and correlates well with human concepts of importance. Previous keyword-based methods of ranking search results, used by many search engines that were once more popular than Google, would rank pages by how often the search terms occurred in the page, or how strongly associated the search terms were within each resulting page. In addition to PageRank, Google also uses other secret criteria for determining the ranking of pages on result lists.
Google not only indexes and caches HTML files but also 13 other file types [5] (http://www.google.com/help/faq_filetypes.html), which include PDF, Word documents, Excel spreadsheets and plain text files. Except in the case of text files, the cached version is a conversion to HTML, allowing those without the corresponding viewer application to read the file. Google may have difficulty indexing some websites, in particular those that use frames, links embedded within JavaScript or Java, or complex URLs with more than six variables in the query string. Google offers an explanation (http://www.google.com/webmasters/2.html) of why some web pages haven't been included.
Users can customize the search engine somewhat. They can set a default language, use "SafeSearch" filtering technology (which is 'ON' by default), and set the number of results shown on each page.
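The recursive definition of PageRank given above, in which a page's score is a weighted sum of the scores of the pages linking to it, can be approximated by repeated iteration. The sketch below is a minimal illustration of that idea and not Google's actual ranking code, which the article notes also relies on other undisclosed criteria. The damping factor of 0.85, the even redistribution of rank from pages without outgoing links, and the three-page example graph are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by repeated iteration.

    links maps each page to the list of pages it links to.
    damping and the handling of link-less pages are assumptions
    of this sketch, not details taken from the article.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same score for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page web: a links to b and c, b links to c, c links to a.
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(example).items()):
        print(page, round(score, 4))
```

In this small graph, page c is linked to by both other pages and ends up with the highest score, while page b, which receives only half of a's outgoing weight, ends up with the lowest.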
Google has been criticized for placing long-term cookies on users' machines to store these preferences, a tactic which also enables it to track a user's search terms over time. For any query (of which only the first 32 keywords are taken into account), up to the first 1000 results can be shown, with a maximum of 100 displayed per page. Despite Google's immense index, there is also a considerable amount of data in databases which are accessible from websites by means of queries, but not by links. This so-called deep web is not covered by Google and contains, for example, catalogues of libraries, official legislative documents of governments, and phone books. (For an April Fools' parody of PageRank, see Google's PigeonRank page (http://www.google.com/technology/pigeonrank.html).)
Google optimization
Since Google is the most popular search engine, many webmasters have become eager to influence their websites' Google rankings. An industry of consultants has arisen to help websites raise their rankings on Google and on other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings, and then develop a methodology for improving rankings. One of Google's chief challenges is that as its algorithms and results have gained the trust of web users, the profit to be gained by a commercial web site in subverting those results has increased dramatically. Some search engine optimization firms have attempted to inflate specific Google rankings by various artifices, and thereby draw more searchers to their clients' sites. Google has managed to weaken some of these attempts by reducing the ranking of sites known to use them. Search engine optimization encompasses both "on page" factors (like body copy, title tags, H1 heading tags and image alt attributes) and "off page" factors (like anchor text and PageRank).
The general idea is to affect Google's relevance algorithm by incorporating the keywords being targeted in various places "on page," in particular the title tag and the body copy (note: the higher up in the page a keyword appears, the better its prominence and thus the ranking). Too many occurrences of the keyword, however, cause the page to look suspect to Google's spam checking algorithms. One "off page" technique that works particularly well is Google bombing, in which websites link to another site using a particular phrase in the anchor text in order to give the site a high ranking when that phrase is searched for. Google publishes a set of guidelines (http://www.google.com/webmasters/guidelines.html) for website owners who would like to raise their rankings when using legitimate optimization consultants. A more comprehensive guide to SEO ethics standards (http://www.dma.co.nz/pdfs/Standards_for_Search_Engine_Marketing.pdf) is available from the New Zealand DMA (http://www.dma.co.nz/).
Google services
Google Answers
In April 2002, Google launched a new service called "Google Answers". Google Answers is an extension to the conventional search: rather than doing the search themselves, users pay someone else to do the search. Customers ask questions and offer a price for an answer, and researchers answer them. Researchers are screened through an application process that tests their research and communications abilities. Prices for questions range from $2 to $200; Google keeps 25% of the payment, sends the rest to the researchers, and charges an additional $0.50 listing fee. Once a question is answered, it remains available for anyone to browse for free. This service came out of beta in May 2003 and presently receives more than one hundred question postings per day.
Google Catalogs
As of late August 2004, Google's catalogs search feature is in the beta stage. Numerous (over 6,600 at the time of this writing) print catalogs are archived on Google as scanned image files. Through the use of character recognition, users can search for a text string in these catalogs in a fashion similar to how they would for materials on the general web. Matching results are displayed through thumbnails of the page(s) on which the text was found, with the specific area of the page where the search result is found shaded in a yellow box. Another image file next to the thumbnail, an enlarged view of the highlighted area on the thumbnail, highlights the exact location of the search result. Users can then access the page of the catalog (as a larger graphic file) and change pages by using a navigation bar positioned above the page image. The catalogs can also be browsed without running a search.
Google Directory
The directory is a subset of the links in Google's database arranged into hierarchical subcategories, like an advanced Yellow Pages of the web. The original source of the directory and its categorization is the Open Directory Project (ODP), which has its own website at http://dmoz.org/. The ODP publishes an easily parsed version of its database in Resource Description Framework format for other sites, like Google, to use for derivative directories.
Froogle
In December 2003, Google announced Froogle, a price engine-like spin-off that searches catalogues for particular products. This site had been active in beta for some months. It is now offered in Wireless Markup Language (WML) form and can be accessed from cellphones or other wireless devices that have support for WML.
Google Groups
Google maintains a Usenet archive, called Google Groups (formerly an independent site known as Deja News). Google is currently testing a new version of its Groups service, which archives mailing lists in addition to Usenet posts, using the same interface as Gmail (see below).
Google Images
In 2003, Google announced Google Images, which allows users to search the web for image content. The keywords for the image search are based on the filename of the image, the link text pointing to the image, and text adjacent to the image. When searching for an image, a thumbnail of each matching image is displayed.
Google Labs
Google Labs consists of all of Google's experimental technologies. Located at http://labs.google.com/, Google Labs is akin to a directory page that links to all Google technologies under development or in beta that have not yet been made widely available. From the Google Labs home page, a user can access Google Suggest, Google Desktop Search, and other web technologies.
Google Maps
On February 8, 2005, Google introduced a beta release of a US-only online map service which interacts with Google Local to restrict results to certain areas. It features draggable maps, location search and turn-by-turn directions.
Google News
Google introduced a beta release of an automated news compilation service, "Google News", in April 2002. There are different versions of the aggregator for more than 20 languages, with more added all the time. To quell any charges of reporting bias, it is fully automated with no human editors. The service covers the news articles that appeared within the past 30 days on news websites in the language concerned, from various countries; for the English language it covers about 4,500 sites, and fewer for other languages. It provides roughly the first 200 characters of each article and a link to the full text. Some of these websites require a subscription; in that case this is noted in the Google News summary of their articles. Google News provides searching, and the choice of sorting the results by date and time of publishing (not to be confused with the date and time of the news event itself) or grouping them (grouping is also available without searching). In the English version, there is an option to tailor the grouping to a selected national audience. Users can request Google News Alerts on various topics by subscribing with key words. An email is sent when a news article matching the request comes online.
Google Print
In August 2004, Google announced its new "Google Print" service. This tool searches the contents of books submitted by publishers and displays matches above web matches on the search result page. It offers links to purchase the book, as well as content-related advertisements. Google will limit the number of viewable pages from any book through user-tracking. As of early January 2005, this service remains in the beta stage. This feature is similar to a service offered by A9.com. In December 2004, Google announced an extension to its Google Print program [6] (http://www.google.com/googleblog/2004/12/all-booked-up.html). It is a non-exclusive deal with several high-profile university and public libraries, including the University of Michigan, Harvard (Widener Library), Stanford (Green Library), Oxford (Bodleian Library), and the New York Public Library. According to press releases and university librarians, Google plans to have approximately 15 million public domain volumes online within a decade. [7] (http://www.nytimes.com/2004/12/14/technology/14google.html?hp&ex=1103086800&en=9d5c79b92752adff&ei=5094&partner=homepage)[8] (http://www.google.com/press/pressrel/print_library.html)[9] (http://print.google.com/googleprint/library.html)[10] (http://www.webmasterworld.com/forum3/27080.htm)
Google Scholar
In November 2004, Google released "Google Scholar", which indexes and searches academic literature across an array of sources and disciplines. Results are ranked by "relevance", which is based largely on the number of citations and in this sense is similar to PageRank.
Google Special
Allows users to perform special searches such as U.S. Government Search, Linux Search, BSD Search, Apple Macintosh Search, and Microsoft Windows Search.
Google Suggest
Image (Google_Suggest_(Beta)_blacklist_demo_animation_1.gif): animation demonstrating the Google Suggest blacklist.
A new feature called Google Suggest Beta was introduced (http://www.google.com/googleblog/2004/12/ive-got-suggestion.html) on December 10, 2004. It provides an autocomplete functionality that gives the user suggestions as they type. JavaScript is used to rapidly query the server and update the page for each keystroke that the user types. The feature quickly drew widespread praise as an impressive innovation, and so far competitors have not offered anything similarly real-time. It was also quickly noticed that Google attempts to avoid suggesting potentially offensive searches. For instance, there are no suggestions for searches containing the word porn, but there are many for pr0n and other variations that aren't on the blacklist. Although pr0n (with a zero) is allowed, pron is on the blacklist, which has the side-effect of not suggesting searches containing any words that include pron, such as apron, mispronunciation, pronunciation or prone. Unlike pron and sex, the word ass is only blacklisted when it appears with a space after it, so words containing ass, such as associated, are suggested. The blacklist also includes the word lesbian, but not several slurs and profanities that are often included on profanity blacklists.
Google University
Allows users to search within certain university domains.
Google Video
On January 25, 2005, Google introduced a beta of Google Video, allowing users to search through television content based on title, network or a closed caption transcript. [11] (http://video.google.com) Certain (http://video.google.com/videopreview?q=MASH&time=230000&page=1&docid=-6861157170290896009&urlcreated=1106634658&chan=KTVU&prog=M*A*S*H+%7C+Bug+Out&date=Tue+Dec+28+2004+at+12%3A00+AM+PST&hmac=3L+VyR7lPsIOKvOYtvuEGEUv7gI) show links redirect to Wikipedia articles for details.
Other (http://video.google.com/videopreview?q=scrubs&time=0&page=1&docid=-7910565294102232492&urlcreated=1106634885&chan=KNTV&prog=Scrubs+%7C+My+First+Kill&date=Tue+Dec+28+2004+at+9%3A30+PM+PST&hmac=ImFmo7kxrLT2wjrP0FFmQi+yoAY) program summaries clearly show that Google Video is in a developmental stage, outputting such excerpts as $Ithlst klb$di I$Dir$| where script lines should be.
Google Wireless
Allows users to search using Google from wireless devices such as mobile phones and PDAs.
Other tools
Google Tools
Google Browser Buttons
This tool allows users to put links to Google services in their web browsers.
Google Deskbar
In December 2003, Google launched the beta version of the Google Deskbar, a search tool which runs from the Microsoft Windows taskbar, without a browser having to be open. It can return film reviews, stock quotes, dictionary and thesaurus definitions, plus any pre-configured search of a third-party site (e.g. eBay or Amazon).
Google Deskbar API
In November 2004, Google launched an API for Google Deskbar.
Google Desktop Search
Known internally under the codename Puffin, Google Desktop Search runs locally on a PC and will index all Microsoft Outlook and Outlook Express emails, text documents, Microsoft Office documents, AOL Instant Messenger conversations, and the Internet Explorer history on that PC, and allow the user to search them from a browser. Google Desktop Search is an extension of Google Search. After it has indexed a user's files, local results also turn up in normal Google searches made from that computer. Google Desktop Search does not store users' files on the web, and users' personal information is not sent to Google. Google Desktop Search was likely developed in response to file and Web search capabilities that will be offered in the next major release of Microsoft Windows, codenamed Longhorn (slated for release in 2006), features that directly compete with Google's core Internet search business. Currently, Google Desktop Search does not support Google's "Did You Mean" feature. For example, if a user searches his or her computer for "chicke", it will not ask whether he or she meant "chicken". Desktop Search received much attention because it may allow reverse engineering of Google's proprietary search algorithm.
Google Language Tool
This tool allows users to use Google in many different languages.
Google Toolbar
This addition to Microsoft Internet Explorer 5 or later adds Google's searching capabilities in a toolbar in the web browser. The latest version includes pop-up ad blocking, automatic filling of forms, and the ability to show the Google PageRank value for the page currently being viewed. It has been criticized for being a security risk because it updates itself without user intervention. A separately downloadable add-on for the toolbar allows participation in Google Compute, a distributed computing project to help scientific research.
Other browsers, such as Mozilla, Mozilla Firefox, Opera, and Safari, have built-in search tools that offer the same functionality. Mozilla Firefox also has its own version of the Google Toolbar, the Googlebar, which is developed independently of and is not supported by Google or the Mozilla Firefox developers. It expands upon the official Google toolbar to the point that the only feature not replicated is the Google PageRank functionality. There are other tools that bring the PageRank functionality to Mozilla and Firefox, including a modification of Googlebar. Googlebar has also been built into Safari for Apple Computer's new Mac OS X operating system.
Google Translate Tool
This tool allows users to translate webpages and text into other languages.
Google Web API
The Google Web API (or Google Web Services) is Google's public interface for registered developers. Using Simple Object Access Protocol (SOAP), a programmer can write services for search and data mining that rely on Google's results. Also, websurfers can view cached pages and make suggestions for better spelling. By default a developer has a limit of 1,000 requests per day. This program is still in the beta phase. Google is one of the few search engines to make its results available via a public API; Technorati is another good example. Some popular implementations of the Google Web API include the alerting service Google Alert (http://www.googlealert.com) and FindForward (http://www.findforward.com), as well as the Google Dance Tool, which monitors when Google is spidering the Internet.
Criticism of Google
Despite Google's apparent success, it has also managed to become the target of critics. For example, online journalists believe Google News should not treat press releases as news.
Claims of partiality
In February 2003, Google banned the ads of Oceana, a two-and-a-half-year-old non-profit organization, which was protesting the environmental effects of a major cruise ship operation's sewage treatment practices.
Google claimed that its editorial policy states "that Google does not accept advertising if the ad or site advocates against other individuals, groups, or organizations."
Offensive search results
In April 2004, Google received complaints that a search for "Jew" on its site listed the anti-Jewish website Jew Watch at or towards the top of the list. Google insisted this was a result of its content-oblivious PageRank algorithm. [13] (http://www.google.com/explanation.html)
Claims of censorship
Sites advocating ethnocentrism or historical revisionism have been banned for years in the French and German versions of Google, as such speech is censored in those countries. The Chinese version of Google restricts searches on tens of thousands of keywords, acting as a technological partner to the content control policies of the Chinese central government [14] (http://www.weeklystandard.com/Content/Public/Articles/000/000/004/699bevot.asp). Other potentially controversial content, such as pornography, is restricted by a "SafeSearch" filter which can be turned off. When Google's image search feature did not return any results on the Abu Ghraib prisoner abuse scandal for several months in 2004, some Internet users and privacy advocates theorized that this was possibly censorship. [15] (http://yro.slashdot.org/article.pl?sid=04/11/07/1442217) A Google representative responded, "Google's image index is not updated as frequently as it should be" [16] (http://yro.slashdot.org/comments.pl?sid=128815&cid=10747654). The images started appearing in Google's image index in November 2004.
Claims of privacy invasion
Main article: Google and Privacy
Some have pointed out the privacy implications of having a centrally located, widely popular data warehouse of millions of internet users' searches, and how under existing US law, Google would be required to hand over all such information to the US government.
It has been claimed that Google infringes the privacy of visitors by uniquely identifying them using cookies, which are used to track web users' search histories. The cookies possess excessively distant expiry dates, and it is claimed that users' searches are recorded without permission for advertising purposes. In response, Google claims cookies are necessary to maintain user preferences between sessions and offer other search features. Some users believe the processing of email message content by Google's Gmail service goes beyond proper use. The point is often made that people without Gmail accounts, who have not agreed to the Gmail terms of service but send email to Gmail users, have their correspondence analyzed without permission. Google claims that mail sent to or from Gmail is never read by a human being, and is only used to improve relevance of advertisements. Other popular email services such as Hotmail also scan incoming email to try to determine whether it is unsolicited email. Chris Hoofnagle, associate director of the Electronic Privacy Information Center in Washington, DC, warned that "As courts become more frequent integrators of electronic records, there is a greater risk of Google ... becoming a serious privacy threat."
Criticisms of PageRank system
Google's central PageRank system has been criticized, some calling it "undemocratic". Common arguments are that the system is unfairly biased towards large web sites, and that the criteria for a page's importance are not subject to peer review. The system is also highly susceptible to manipulation and fraud through the use of dummy sites.
Index size
(source: Internet Archive copy of google.com (http://web.archive.org/web/%2a/google.com))
Google jargon
Books
Related articles
External links
Copyright 2008 WordIQ.com. This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Google News".