How Search Engines Work
Contents of How Search Engines Work:
- What is a Search Engine
Internet Search Engines
This category explains how Internet search engines work, which will help you earn a top search engine listing, perhaps one of the hottest subjects in the Internet marketing world.
In practice, even a good-looking site will not interest Internet search engines until you apply the essential optimization rules and techniques. So from the very start of building and optimizing a website, it’s important to understand how search engines work, what their goals are, and how they learn. Copying an experienced person or adopting tricks without understanding them can harm you.
It’s a well-known fact that search portals strive to improve the quality of their search results. Historically, search engine ranking started from the logic of an ordinary Web surfer and attention to keyword distribution. Later, other factors came to the fore, among them “link popularity,” which remains a huge factor in ranking. Both of these parameters, together with the other recognized ranking factors, shape our optimization strategy at the website-building stage and later on, while establishing popularity, increasing traffic, and developing services and resources.
Therefore, the start of any project or site development process should be closely tied to an understanding of search engine features and ranking strategies.
To explain the particulars of Internet search engines, we should begin from the notion itself. A search engine is a searchable online database of Internet resources. It has several components: search engine software, spider software, an index (database), and a relevancy algorithm (rules for ranking). The search engine software consists of a server or a collection of servers dedicated to indexing Internet webpages, storing the results, and returning lists of pages to match user queries. The spidering software constantly crawls the Web, collecting webpage data for the index. The index is a database for storing the data.
Internet Search Engines Explained
We should also mention the four main types of Internet search engines: crawler-based engines (traditional or common engines), directories (human-edited catalogs), hybrid engines (META engines and those using other engines’ results), and paid listings (PPC and paid inclusion engines).
As you can see, spider software belongs to crawler-based Internet search engines. In a nutshell, they work as follows: spiders read your page, index it, and rank it. Finally, it appears on search engine results pages for the words and phrases most common on the indexed webpage.
If you look more closely at the way result listings are generated, you can clearly differentiate pure search engines from directories (the latter group is often mistakenly called search engines, too).
Directories work in the following way: you submit your pages manually to one of the existing categories, and a directory editor then visits and reads your site. Be ready for a long queue, because directories rely on human editors for indexing and reviewing all pages takes much longer. Most directories do not have their own ranking mechanism; they sort URLs by some obvious factor such as alphabetical order or Google PageRank.
Paid inclusion engines charge a fee to list your page; they differ in what the fee covers, such as re-spidering or top ranking for keywords that you choose. Moreover, most major Internet search engines use such schemes as part of their indexing and ranking system. PPC engines use an auction system where keywords and phrases are associated with a cost-per-click (CPC) fee. The fundamental principle at the heart of the PPC process is that the higher you bid, the higher your position will be for the particular search terms.
The significance of the major Internet search engines is explained in detail in our next sections. Without knowledge of ranking algorithms and the way result listings are generated, most of your work can be wasted or invisible to search engines. In other words, the site’s design is meaningless if your pages are ranked low on Internet search engines.
It should be noted that search engines now apply sophisticated techniques to determine how relevant your pages are to search words and phrases. For this purpose, they examine many on-the-page and off-the-page factors and only then give your page a certain position, or rank. This position determines where the page appears in the results for a given search query. To be top ranked, you should also be familiar with the relevancy of your website: your content should cover a particular subject and stay focused on it.
In conclusion, if you’re working toward a highly ranked website and continue to optimize your pages over time, the most important lesson you can learn is that Internet search engines are constantly evolving and actively use their everyday experience to change for the better. That’s why, when considering any trick or tactic for your search engine optimization strategy, you must think about the goals of the engines: will it help them give better results for most searches? Start by reading and understanding what will be sound for your project. An understanding of search engine algorithms and features will guide you to the peaks of ranking.
Spider-based search engines have made their way from simple, spam-vulnerable algorithms to complex and sophisticated mechanisms that are dangerous to play with. Meanwhile, the search engine optimization industry has developed a number of black-hat techniques to abuse automatic site indexing and ranking. These techniques are referred to as search engine spamming. Nowadays, they can be considered neither legitimate nor effective.
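The bidding principle can be sketched in a few lines of Python. This is only an illustration, assuming a plain highest-bid-first ordering for a single keyword; the advertiser names and bid amounts are invented, and a real PPC engine may take other factors into account.

```python
def rank_listings(bids):
    """Order advertisers for one keyword: highest cost-per-click bid first.

    `bids` maps a (hypothetical) advertiser name to its CPC bid.
    Equal bids are ordered alphabetically so the result is deterministic.
    """
    return sorted(bids, key=lambda name: (-bids[name], name))


# Invented bids for a single keyword
bids = {"shop-a.example": 0.35, "shop-b.example": 0.50, "shop-c.example": 0.20}

for position, name in enumerate(rank_listings(bids), start=1):
    print(position, name, bids[name])
```

Raising the bid for `shop-c.example` above 0.50 would move it to the first position, which is the whole idea of the auction.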
- Search Engine Features
Top Search Engines in General
Before getting familiar with the top search engines, let’s dwell on a question: what is a search engine at all? A search engine is a complex program that identifies the most relevant information on the WWW and provides it to the user in response to a given keyword or phrase. However, that is only the tip of the iceberg; the nature of top search engines is complex and constantly improving.
Everybody surfing the Web resorts to search engines for information retrieval; indeed, search engines are often the only means of finding a website without knowing its exact address. Naturally, to get the most relevant information, people use only the top search engines. In fact, they offer much more than just looking up the webpages that contain words from your query. For example, Google can be used for a much wider range of purposes, from finding all sites linking to a given one and looking up definitions of technical terms to Internet shopping, evaluating mathematical expressions, and more.
Types of Top Search Engines
The world of top search engines can be divided into crawler-based, human-powered, hybrid, and pay-per-performance search engines. Crawler-based engines (also known as “traditional” or “common”) use special software that surfs the Internet to supplement their database. A search engine spider scans parameters that are not visible to a human. It does not appreciate the niceties of design; for images it sees only their code, which puts obstacles between it and your content. What matters most to a search engine robot is the relevancy of your text. Frequent use of your keywords and phrases in the page content, especially at the beginning and the end of it, is rewarded. Among the top search engines of the traditional kind are Google, Teoma, AllTheWeb, AltaVista, and HotBot.
Directories (also known as human-powered or human-edited search engines) do not look for websites to index; they prefer that you suggest your site. Most of them have “Add URL” pages where you fill out the required information such as the website title, description, keywords, and email. Basically, you select the proper category for the website, and then, when its turn comes, it is examined by human editors. The editor’s decision can only be to accept or reject the site. As human-powered search engines do not have their own algorithms to rank websites, they order them alphabetically or by Google PageRank. Directories are often used by top search engines as a source of new pages to scan for indexing. Among the largest human-edited search engines are Yahoo!, DMOZ, and Looksmart.
Hybrid search engines use both types of listing. For example, MSN presents human-edited listings from Looksmart and crawler-based listings from its own Web crawler. Nowadays, some top search engines, such as Google and Yahoo!, also show listings of the opposite type. However, these make up a small part of their results, and both remain leading representatives of their classes. There are also so-called meta-search engines, which query a number of other top search engines at the same time and present the combined results to the user. Representatives of this kind include MetaCrawler and Dogpile. For example, MetaCrawler compiles the results of seven search engines, including AltaVista and Lycos.
In pay-per-performance search engines, you pay for your site to be listed, re-spidered, and top-ranked for keywords you choose. Few sites focus only on paid inclusion; the most prominent is Overture. Other top search engines also offer additional paid services, but using them is optional.
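How a meta-search engine might combine several result lists can be sketched as follows. The scoring rule here (each engine gives a URL 1/rank points) is an assumption chosen for simplicity, not the documented method of MetaCrawler or Dogpile, and the result lists are invented.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Merge ranked URL lists from several engines into one ranked list.

    Assumed rule: a URL earns 1/rank points from every engine that
    returns it; URLs with higher totals come first, ties alphabetically.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    return sorted(scores, key=lambda url: (-scores[url], url))


# Invented result lists from three hypothetical engines
engine_one = ["a.example", "b.example", "c.example"]
engine_two = ["b.example", "a.example"]
engine_three = ["b.example", "d.example"]

print(merge_results([engine_one, engine_two, engine_three]))
```

A page that several engines rank highly rises to the top of the merged list, while a page known to only one engine stays lower.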
What is the structure of Crawler-based Top Search Engines?
The structure of such top search engines is similar and consists of three main elements.
The first is a spider, or crawler. It surfs the Web looking for changes such as recently created pages and updates to already indexed pages. Spiders crawl their way through links both inside a site and between websites. If no site links to you, the search engine spider will visit your site only if you suggest it yourself; the majority of top search engines allow this.
The second element of crawler-based search engines is the index, which receives the information gathered by search engine spiders. The index of a crawler-based top search engine is a huge repository holding copies of all the pages found by spiders that the engine treats as relevant and worth listing. Even if a crawler has visited your page, that doesn’t mean it will be indexed promptly. It could take several months if you do not resort to a “paid submission.”
The third element of a top search engine is its software. It scans the index and presents an ordered list of the pages most relevant to your query. Every top search engine has its own algorithm, and every search engine crawler is tuned differently. If you aim to rank high in a top search engine, do not forget that your page will be judged not by a person but by an impartial search engine robot, which will not appreciate your sophisticated design.
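The index and the search software can be illustrated with a toy example. This is a minimal sketch, assuming the spider has already delivered the page texts; the page contents are invented, and the relevance rule (count of query words on the page) is far simpler than any real engine's algorithm.

```python
from collections import defaultdict


def build_index(pages):
    """Build an inverted index: word -> {url: occurrences of the word}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Return URLs containing every query word, most occurrences first."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))

    def score(url):
        # Toy relevance: total occurrences of the query words on the page
        return sum(index[word][url] for word in words)

    return sorted(matches, key=lambda url: (-score(url), url))


# Invented pages, as if already fetched by a spider
pages = {
    "a.example": "cheap flights cheap hotels",
    "b.example": "cheap hotels",
    "c.example": "gardening tips",
}
index = build_index(pages)
print(search(index, "cheap hotels"))
```

Note that nothing about a page's visual design reaches the index: only its words do, which is why the text stresses content over looks.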
Most Popular Top Search Engines
Top search engines do their best to provide the most relevant information, and it’s interesting to find out which of them are most popular. One 2006 study gives the following usage shares for top search engines in the USA:
- Google—42.7%.
- Yahoo!—28%.
- MSN—13.2%.
- AOL—7.6%.
- Ask—5.9%.
All other search engines together account for 2.6% of search requests.
These shares describe the USA only; they do not necessarily reflect search activity in other countries or worldwide, although they may have something in common with it.
Some Top Search Engines in Brief
Google is the leading top search engine of today, with a little more than 8,000,000,000 pages in its index. It has “regional” branches such as “Google Canada” and “Google Australia,” which are modifications of the main knowledge base stored on servers in the appropriate area. If you submit a website to the “Main Google” and it gets into the index, it will be listed at every “branch.” Google handles nearly 350 million searches per day, and it’s really the most authoritative of the top search engines.
Launched in 1994, Yahoo! was among the first and is still one of the most popular search engines, as well as the leading directory. It can take a long time (on the order of a few months) for editors to reach your request. Paid listings are also available and should bring your site to the editors’ attention within a few days.
MSN, owned by Microsoft, is one of the three top search engines. It used to be “powered” by the Inktomi database, now owned by Yahoo!. However, since February 2005, it has used its own database. Its crawler, MSNBot, can find any site on the Web. If your site does not appear in MSN Search, you can suggest your URL to this top search engine yourself.
Relations Between Top Search Engines
In the impetuous rush for the leading place among top search engines, many methods have been used to reach that aim. Some top search engines are fed by others or absorb them. For example, Yahoo! owns AllTheWeb, Overture, and AltaVista and also owns and uses Inktomi's technology and database for its partnering engines. Ask Jeeves owns and uses the results and crawler of the Teoma engine. DMOZ is owned by Netscape, and Lycos owns HotBot and Tripod. Such relationships allow some top search engines to compete for the leading places and others to fight to stay afloat.
Every top search engine, whatever its kind, exists to bring people the best information they need, and you should possess sufficient, up-to-date knowledge to climb to the most highly ranked places in top search engines.
- Algorithms of SE Ranking
Search engines ranking
SEO is a set of skills for gaining high search engine
ranking.
Search engines Ranking is what SEO is all about. You tweak your pages trying to
"meet search engine's requirements" and gain high ranking in the search engines'
results. However, what are these "requirements"? We all know search engines are
nothing but robots, computers that follow a certain program to determine which
site is more relevant to a query and which one is less useful. Yes, when chasing
search engines ranking, we are just struggling against a machine—a powerful
instrument with a sophisticated program though it may be, but a machine
nevertheless. However, we are humans with much more flexible minds, and our
creative approaches will let us crack search engines’ ranking algorithms if we
invest time and skilled effort.
Search engines' ranking algorithms
Search engines keep ranking algorithms secret, to
produce more relevant results.
Search engines do not reveal their ranking algorithms, for doing so would be
equal to giving each SEO a nuclear weapon and pushing them into a mortal combat
against each other. The World Wide Web would turn into a giant heap of
over-optimized pages stuffed with ads and nonsensical content. In other words, if
search engines kept their ranking algos open, they would confine themselves to
showing not RELEVANT, but OPTIMIZED pages on the top, and today, there's indeed a
great gap separating these two sets.
It is true that the search engines' ranking algorithms are secret, and we have to
submit ( :) ) to the fact that we cannot buy this information from the search
engines or find it, say, on Google.
What are the ways of discovering the path to high search engines ranking, then?
Search engines experiments
Search engines' ranking algorithms can be guessed /
discovered by experiment.
The term SEO (search engine optimization) was first heard nearly ten years ago.
Since then, many researchers, driven by financial, professional, or scientific
interests, have tried to crack ranking algos by experimenting. Thus, several years
ago, it was discovered that putting many keywords in the META and TITLE tags can
improve rankings in the search engines for these keywords, giving an optimized
page a competitive edge over those probably more relevant but less optimized. The
search engines' ranking algorithms of those days were simple enough to
allow invisible tags (such as META keywords and META description) to influence
the actual position of a page in search results.
This gap was quickly discovered by spammers of all kinds, who left their
competition no chance of appearing at the top of the search results. The
importance and benefit of good search engine positioning and organic rankings
weren't as well recognized then, so it was mostly advanced optimizers themselves,
driven by sporting passion, who tried to outrank one another.
Search engines ranking
Knowing search engines ranking algos is a powerful
weapon in marketing wars.
But search engines quickly understood their vulnerability and reacted by shifting
the importance from the invisible on-page elements (like the META tags) to
visible and significant page areas (such as the TITLE tag) and to the off-page
factors like inbound links, which made them almost completely bulletproof since
these ranking factors are beyond the scope of a webmaster's direct influence.
By that time, the profitability of organic search engine ranking had become
clear to a larger number of website owners, ranging from small businesses that
did optimization themselves or hired an SEO to big players able to spend
thousands on their search presence and strategy. Therefore, search engines still
needed to keep their algorithms secret and stick to their main principle: same
rules and equal chances for everybody.
Search engines ranking
Evolution of search engines ranking techniques
Search engines' ranking algorithms became more and more sophisticated, and Google
became famous for its art of discouraging search engine optimizers. Search
engines say they are "committed to providing relevant results," but they also
have another, hidden reason to make it harder to achieve high rankings
artificially: a business owner unable to gain rankings for free will agree to pay
a reasonable amount for Web visibility, buying a place in the "sponsored results"
column.
Nowadays, search engine ranking is done by applying algorithms that take into
account on-page elements such as the page contents, the URL of the page, the
TITLE tag, and HTML headings (H1-H6). Also included in the ranking are words
that are in bold, words that are used in links to other pages, and words and
phrases that are used at the beginning and at the end of the text. Off-page
factors include the number of links from other pages and sites that point to
this page, the text used in these links, the authority of the linking site, and
even the number of other links on the linking page. Google is said to apply
additional algorithms such as Hilltop, Sandbox, and PageRank for ranking pages
more relevantly, and all of them will be described and discussed in this section.
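To make the idea of combining several factors concrete, here is a toy scoring
sketch. The weights, the field names, and the sample page are all invented for
illustration; search engines keep their real factors and weights secret, so this
shows only the general shape of a weighted score, not any engine's algorithm.

```python
# Hypothetical weights per page area; real engines do not publish theirs.
WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0, "anchor_text": 2.5}
LINK_WEIGHT = 0.5  # assumed value of one inbound link


def count_word(word, text):
    """Count case-insensitive occurrences of `word` among the words of `text`."""
    return text.lower().split().count(word.lower())


def score_page(page, keyword):
    """Weighted keyword count over page areas plus an inbound-link bonus."""
    on_page = sum(
        weight * count_word(keyword, page.get(field, ""))
        for field, weight in WEIGHTS.items()
    )
    return on_page + LINK_WEIGHT * page.get("inbound_links", 0)


# Invented page record; "anchor_text" stands for text of links pointing here
page = {
    "title": "Cheap flights",
    "headings": "Flights to Paris",
    "body": "Book flights today flights",
    "anchor_text": "cheap flights",
    "inbound_links": 4,
}
print(score_page(page, "flights"))
```

Changing a weight changes which pages win, which is exactly why optimizers try
to guess the weights and why engines keep adjusting them.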
Advanced search engines ranking
Advanced search engines ranking algorithms are Hilltop,
Sandbox, and PageRank.
Hilltop is an algorithm that was created in 1999. Basically, it looks at the
relationship between the "Expert" and "Authority" pages. An "Expert" is a page
that links to many other relevant documents. An "Authority" is a page that has
links pointing to it from the "Expert" pages. Sandbox refers to an algorithm that
detects how old a page is and how long ago it was updated. PageRank is an
absolute value which is regularly calculated by Google for each page it has in
its index. The number of links you've gotten from other sites outside your domain
matters greatly, as does the link quality. The latter means that in order to give
you some weight, the sites linking to yours must themselves have a high PageRank
in addition to being content-rich and regularly updated.
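The link-based idea behind PageRank can be sketched with the simplified formula
from the published description of the algorithm: each page receives a base share
of (1 - d) / N plus d times the rank of every page linking to it, divided by that
page's number of outgoing links. The damping factor d = 0.85, the iteration
count, and the three-page link graph below are assumptions for illustration;
this is not the calculation Google runs in production.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict page -> rank; the ranks sum to about 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Invented link graph: a and b link to c, and c links back to a
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph, page c collects links from two pages and ends up on top, page a
benefits from being linked by the high-ranked c, and page b, which nobody links
to, keeps only the base share.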
- Crawler-based Search Engines
Search Engine Spider – finds. Index – retains. Search Engine Software – presents: The Most Relevant Pages for You.
Search Engine Spider, What is It?
Everybody connected to the Web has heard the term “Search Engine Spider.” From its name, it’s not difficult to understand its nature in general, but let’s investigate it in a bit more detail. A Search Engine Spider (also known as a “robot,” “crawler,” or “worm”) is an integral component of Crawler-based Search Engines, and it is what separates them from the other type of search engine, the Human-powered one. Crawler-based search engines observe websites with the help of special programs called “search engine spiders.” They surf the infinite space of the Internet looking for recently created pages and for updates to pages already indexed. Search Engine Spiders crawl their way through links both inside a site and between websites. If your site has no incoming links, the only way to invite the Search Engine Spider is to suggest the site yourself; the majority of Crawler-based search engines allow it.
The information gathered by Search Engine Spiders goes to the Index, the second element of Crawler-based search engines. The Index is a huge repository holding copies of all the pages found by Search Engine Spiders that the Search Engine treats as relevant. However, don’t take for granted that a page found by a Search Engine Spider will be indexed promptly.
The third element of every Crawler-based Search Engine is the software, which scans the Index and presents an ordered list of the pages most relevant to your query. As Search Engine Spiders constantly surf the Web, they inform the Index about changes, which influences subsequent ranking according to the engine’s Algorithm.
Of course, every Search Engine has its own algorithm, and all Search Engine Spiders are tuned differently. However, if you want to rank highly in Crawler-based search engines, just remember that your page will be judged not by a human and that even an unsurpassed design will not be appreciated by an impartial Search Engine Spider.
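The way a spider moves from link to link, and why a site without incoming links must be suggested by hand, can be shown with a small sketch. It walks an invented in-memory link graph instead of fetching real pages, so the page names and the `crawl` function are purely illustrative.

```python
from collections import deque


def crawl(link_graph, seeds):
    """Breadth-first walk over a link graph, starting from the seed pages.

    `link_graph` maps a page to the pages it links to. Returns the pages
    in the order the spider discovers them; each page is visited once.
    """
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Invented web: nothing links to "orphan", so a crawl from "home" misses it
web = {
    "home": ["about", "blog"],
    "blog": ["post1", "home"],
    "orphan": ["home"],
}
print(crawl(web, ["home"]))
print(crawl(web, ["home", "orphan"]))  # "orphan" was suggested manually
```

The second call models submitting the site yourself: once the page is among the starting points, the spider reaches it even though no other page links to it.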
- Human-powered Engines and Directories
Internet directories can be considered another kind of search engine. Instead of using “crawlers,” Internet directories employ human editors who manually find, sort, and rank relevant webpages to compile large databases. Most Internet directories are covered in this section.
Some time ago, directories were not as numerous as they are today, but their popularity was greater because the technologies powering their competitors, the crawler-based search engines, were not as advanced as those of today, which allowed Internet directories to offer much more refined and relevant content. To be listed in a directory like Yahoo! Directory or DMOZ meant good, targeted traffic.
Today, almost every industry and branch of human knowledge seems to have its own directory on the Internet, so it's not strange to find a “directory of directories.” Such sites aim at categorizing existing catalogues, but in general, they do nothing but make the picture more complicated. Ah, yes. They also tell search engine optimizers which other Internet directories they can submit their sites to in order to get a backlink.
Still, a DMOZ or Yahoo! directory listing can do a lot for your SE rankings because every month, Google begins crawling the Web from the DMOZ pages and Yahoo! from its own directory. Links from established and recognized Internet directories are a powerful contribution to your overall link popularity and Web visibility. However, it's difficult to get there. Before your site gets listed, an editor must review it to decide whether it is appropriate for a certain category. That’s why it is not enough to write a keyword-rich text copy; you must create a website that is attractive, appealing, unique, and easy to read in order to make it into an Internet directory.