A search engine on the Internet is a computer program designed to help people find files stored on computers, for example on a public server on the web. A search engine allows us to ask for content that meets specific criteria (usually content containing a given word or words) and returns a list of files that match those criteria. Search engines usually use an index (built beforehand and updated regularly) to find the files after the user enters the search criteria.
In the context of the Internet, "search engine" usually refers to engines that search the WWW, not other protocols or areas. In addition, search engines make available data from newsgroups, large databases, or open directories such as DMOZ.org. Because their data collection is done automatically, search engines differ from web directories, which are maintained by human editors.
Most search engines are run by private companies using proprietary algorithms and closed databases - the most popular is Google (with MSN Search and Yahoo! not far behind). There have been several attempts to create open-source search engines, for example Htdig, Nutch, Egothor and OpenFTS. [1]
How search engines work
Web search engines work by storing information about many web pages, which they retrieve from the WWW. These pages are retrieved by a web crawler - an automated web browser that follows every link it sees. The contents of each page are then analyzed to determine how the page should be indexed (for example, words are extracted from the title, subheadings, or special fields called meta tags). Data about web pages are stored in an index database for use in later searches. Some search engines, such as Google, store all or part of the page source (called a cache) as well as information about the web page itself.
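As an illustration only, a very small crawler and indexer could look like the Python sketch below. The seed URL is just a placeholder and real engines do far more (politeness rules, robots.txt, weighting of titles and meta tags, and so on); this is a sketch of the idea, not any engine's actual implementation.

    # Minimal illustrative crawler/indexer sketch; example.com is only a placeholder seed.
    import re
    import urllib.request
    from collections import defaultdict
    from urllib.parse import urljoin

    def crawl(seed_url, max_pages=10):
        """Follow links breadth-first and build an inverted index: word -> set of URLs."""
        index = defaultdict(set)
        to_visit, seen = [seed_url], set()
        while to_visit and len(seen) < max_pages:
            url = to_visit.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "ignore")
            except Exception:
                continue                              # skip pages that cannot be fetched
            for word in re.findall(r"[a-z0-9]+", html.lower()):
                index[word].add(url)                  # real engines also weight titles, meta tags, ...
            for link in re.findall(r'href=["\'](.+?)["\']', html):
                to_visit.append(urljoin(url, link))   # follow every link seen on the page
        return index

    index = crawl("https://example.com/")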
When a user visits a search engine and enters a query, typically by entering keywords, the search engine looks up its index and provides a list of the web pages that best match the criteria, usually accompanied by a brief summary containing the document's title and sometimes parts of its text.
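Continuing the sketch above, answering a keyword query then amounts, in the simplest possible terms, to intersecting the URL sets of the query words (purely illustrative; real engines also rank and summarize the results):

    def search(index, query):
        """Return the URLs whose indexed words contain every keyword in the query."""
        words = query.lower().split()
        if not words:
            return set()
        results = index.get(words[0], set()).copy()
        for word in words[1:]:
            results &= index.get(word, set())   # keep only pages matching all keywords
        return results

    print(search(index, "example domain"))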
There is another kind of search engine: the real-time search engine, such as Orase. Engines like this do not use an index; the information they need is collected only when a new search is made. Compared with index-based systems such as Google, a real-time system is superior in several respects: the information is always up to date, there are (almost) no dead links, and fewer system resources are required. (Google uses nearly 100,000 computers, Orase only one.) But there is also a drawback: searches take longer to complete.
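In the crudest terms, a real-time engine of this kind could fetch a small set of sources at query time instead of consulting a prebuilt index. The sketch below is an assumption-laden illustration of that idea, not a description of how Orase actually works; note how fresh results and the absence of dead links come at the cost of slower queries.

    import urllib.request

    def realtime_search(query, source_urls):
        """Fetch each source at query time and report the ones containing every keyword."""
        words = query.lower().split()
        hits = []
        for url in source_urls:
            try:
                text = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "ignore").lower()
            except Exception:
                continue                  # a dead link simply never appears in the results
            if all(word in text for word in words):
                hits.append(url)
        return hits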
The usefulness of a search engine depends on the relevance of the results it returns. Although there may be millions of web pages containing a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results so that the "best" ones come first. How an engine decides which pages match best, and in what order the pages are shown, varies widely. The methods also change over time as Internet usage changes and new techniques evolve.
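One very simple ranking method, shown below only as an illustration and not as any engine's actual formula, is to score each page by how many of the query words it contains and return the highest-scoring pages first; it reuses the word-to-URL index from the earlier sketch.

    def ranked_search(index, query):
        """Rank pages by how many of the query words they contain (most matches first)."""
        words = query.lower().split()
        scores = {}
        for word in words:
            for url in index.get(word, set()):
                scores[url] = scores.get(url, 0) + 1
        return sorted(scores, key=scores.get, reverse=True)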
Most web search engines are commercial ventures supported by advertising revenue, and as a result some engage in the controversial practice of allowing advertisers to pay to have their pages ranked higher in search results.
Some Tips for Using Search Engines
filetype:
This option is used to search for specific file types. Example: filetype:xls -> to search for MS Excel files; filetype:doc -> to search for MS Word files
inurl:
This option is used to search for particular words that appear in a URL. With this option you can narrow your search to a specific folder (if combined with the option "index of"). Example: inurl:admin -> the search returns websites whose URL contains the word "admin"
site:
This option is used to restrict a search to a particular site. Example: site:torry.net "xp style" -> search with the keywords "xp style" on the site www.torry.net
intitle:
This option is used to search for specific words contained in the title of web pages.
link:
This option is used to find out which sites link to a particular site. Example: link:delphi3000.com -> finds sites that have links to www.delphi3000.com
You can combine the options above to get more specific search results (a small programmatic sketch follows these tips). Example: pdf "rapidshare.de/files" site:rapidshare.de -> to find pdf books or files on rapidshare.de
+inurl:exe rar zip site:rapidshare.de -> to search for programs and applications on rapidshare.de
You can change site:rapidshare.de to site:megaupload.com to search for files on MegaUpload
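If you build such queries programmatically, the operators can simply be concatenated into the query string. The sketch below only assembles the string and a search URL; the helper name build_query is made up here, and while the q parameter is the one used by Google's web search page, the exact endpoint behaviour is an assumption of this example.

    from urllib.parse import urlencode

    def build_query(keywords, **operators):
        """Combine plain keywords with operators such as site:, filetype:, inurl:."""
        parts = [keywords] + [f"{name}:{value}" for name, value in operators.items()]
        return " ".join(parts)

    query = build_query('"xp style"', site="torry.net", filetype="pdf")
    print(query)                                         # "xp style" site:torry.net filetype:pdf
    print("https://www.google.com/search?" + urlencode({"q": query}))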
History
The first search engine was "Wandex", a now-defunct index collected by the World Wide Web Wanderer, a web crawler developed by Matthew Gray at MIT in 1993. Another early search engine, Aliweb, also appeared in 1993 and still runs today. One of the first search engines to grow into a sizable commercial enterprise was Lycos, which started at Carnegie Mellon University as a research project in 1994.
Soon afterwards, many search engines appeared and vied for popularity. These included WebCrawler, Hotbot, Excite, Infoseek, Inktomi, and AltaVista. They competed with popular directories such as Yahoo!. Later, these directories added or combined search engine technology to increase their functionality.
In 2002, Yahoo! acquired Inktomi, and in 2003 it acquired Overture, which had itself acquired AllTheWeb and AltaVista. In 2004, Yahoo! launched its own search engine based on the combined technology of the engines it had acquired, and began giving priority to Web search over its directory.
In December 2003, Orase published the first version of its real-time search technology. The engine has many new functions and greatly improved performance.
Search engines were also among the brightest stars of the Internet investing frenzy of the late 1990s. Several companies entered the market spectacularly, recording record gains during their initial public offerings. Some have since taken down their public search engines and only market enterprise editions; Northern Light, for example, was one of the first 8 or 9 search engines after Lycos appeared.
Before the advent of the Web, there were search engines for other protocols and uses, such as the Archie search engine for anonymous FTP sites and the Veronica search engine for the Gopher protocol.
Osmar R. Zaïane's book From Resource Discovery to Knowledge Discovery on the Internet explains in detail the history of search engine technology before the advent of Google.
Other search engines include a9.com, AllTheWeb, Ask Jeeves, Clusty, Gigablast, Teoma, Wisenut, GoHook, Kartoo, and Vivisimo.
Google
Around 2001, the Google search engine grew more prominent. Its success was based on the concepts of link popularity and PageRank: each page is ranked according to how many sites link to it, on the premise that desirable sites are linked to more often than others. The PageRank of the linking pages and the number of links on those pages are the inputs to a page's own ranking. This makes it possible for Google to order results by how many pages link to each page it finds. Google's user interface was preferred by users, and its style was later imitated by its competitors.
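The idea can be stated compactly: a page's rank is distributed to the pages it links to, and the calculation is iterated until the values settle. The following is a minimal sketch of such an iteration, not Google's actual implementation; the damping factor 0.85 is the commonly cited value, and the tiny three-page link graph is made up for illustration.

    def pagerank(links, damping=0.85, iterations=50):
        """links maps each page to the list of pages it links to."""
        pages = list(links)
        rank = {page: 1.0 / len(pages) for page in pages}
        for _ in range(iterations):
            new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
            for page, outgoing in links.items():
                if not outgoing:
                    continue                  # dangling pages are handled crudely here
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] = new_rank.get(target, 0.0) + share
            rank = new_rank
        return rank

    # A made-up three-page web: A and C link to B, B links back to A.
    print(pagerank({"A": ["B"], "B": ["A"], "C": ["B"]}))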
Researchers at NEC Research Institute claim to have improved upon Google's patented PageRank technology by using web crawlers to find "communities" of websites. Instead of ranking pages, this technology uses an algorithm that follows links on a webpage to find other pages that link back to the first one, and so on from page to page. The algorithm "remembers" where it has been, indexes the number of cross-links, and relates these into groupings. In this way, virtual communities of webpages are found.
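As a very rough sketch of that idea (not NEC's actual algorithm, and with a made-up link graph), one can look for pages that link back to each other and then group the mutually linked pages together:

    def find_communities(links):
        """Group pages that link to each other, directly or through other mutual links."""
        mutual = {page: set() for page in links}
        for page, outgoing in links.items():
            for target in outgoing:
                if page in links.get(target, []):   # a cross-link: each page links to the other
                    mutual[page].add(target)
        # Walk the mutual-link graph to collect each connected group of pages.
        communities, seen = [], set()
        for page in mutual:
            if page in seen:
                continue
            group, stack = set(), [page]
            while stack:
                current = stack.pop()
                if current in group:
                    continue
                group.add(current)
                stack.extend(mutual.get(current, ()))
            seen |= group
            if len(group) > 1:
                communities.append(group)
        return communities

    print(find_communities({"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}))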
Challenges for search engines
The web is growing much faster than any present search engine technology can possibly index (see distributed crawling).
Many web pages are updated frequently, which forces the search engine to revisit them periodically.
The queries one can make are currently limited to searching for keywords, which may result in many false positives.
Dynamically generated sites may be slow or difficult to index, or may produce excessive results from a single site.
Many dynamically generated sites are not indexable by search engines; this phenomenon is known as the invisible web.
Some search engines do not order the results by relevance, but rather according to how much money the sites have paid them.
Some sites use tricks to manipulate the search engines to display them as the first result returned for some keywords. This can lead to some search results being polluted, with more relevant links being pushed down in the result list.
See also
Search engine spammer
Internet History
List of search engines
Data mining
Metasearch Engine
External links
Google Directory Search Engine Indonesia
Blog SEO Indonesia - a blog that discusses technology and news about search engines
How Search Engines Work - a brief explanation of how search engines work
How Google Works - a specific explanation of how Google works
Links to search engines
Search Engine Forum Jual Beli - a search engine specializing in indexing Indonesia's largest buy-and-sell (Jual Beli) forums.
A search engine specializing in Indonesian sites. Search for information / gossip / celebrities / government / health / medicine / employment / media / news / sports / video / etc.
About
Clusty
eoSearch.com
Excite
Google
Search Indonesia
Yahoo!
Yooci
Sapujagat
AMGEM Force (16 languages)
FiLeByType
Metasearch Engine
Dogpile
MetaCrawler - one of the earliest metasearch engines
Vivisimo Clustering
international java Indonesia destination
nowGoogle.com - a multiple search engine
Short Tutorial About MetaSearching
Source : www.wikipedia.com