Wednesday, October 12

How search engines rank web pages

Search for anything using your favorite crawler-based search engine and, almost instantly, it will sort through the millions of pages it knows about and present you with the ones that match your topic. The matches are ranked, with the most relevant ones coming first.

Imagine going into a book shop and saying 'coffee'. The shop assistant will ask you questions in order to direct you to the type of book you are looking for.

Unfortunately, search engines don't have the ability to ask a few questions to focus your search.

So how do crawler-based search engines determine relevancy when confronted with hundreds of millions of web pages to sort through? They follow a set of rules, known as an algorithm. Exactly how a particular search engine's algorithm works is a closely kept trade secret. However, all major search engines follow the general rules below.

Location & Frequency

One of the main rules in a ranking algorithm involves the location and frequency of keywords on a web page.

So, using the example above, the search engines need to find books matching the request for "coffee". It makes sense that they first look for coffee in the title. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic.

Search engines will also check to see if the search keywords appear near the top of a web page, such as in the headline or in the first few paragraphs of text. They assume that any page relevant to the topic will mention those words right from the beginning.
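The location rule can be sketched in a few lines of Python. This is only an illustration, not how any real search engine works: the function names, the weights and the 50-word cut-off for "near the top" are all made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def location_score(keyword, title, body, top_words=50,
                   title_weight=3.0, top_weight=1.0):
    """Score a page by WHERE the keyword appears.

    The weights and the top_words cut-off are invented for this sketch.
    """
    keyword = keyword.lower()
    score = 0.0
    # A keyword in the title counts the most.
    if keyword in tokenize(title):
        score += title_weight
    # A keyword near the top of the page earns a smaller bonus.
    if keyword in tokenize(body)[:top_words]:
        score += top_weight
    return score

pages = {
    "coffee-guide": ("The Coffee Lover's Guide",
                     "Coffee beans are roasted before brewing."),
    "tea-guide": ("All About Tea",
                  "Tea is popular. " * 30 + "Some people prefer coffee."),
}

for name, (title, body) in pages.items():
    print(name, location_score("coffee", title, body))
```

The first page scores higher because "coffee" is in both its title and its opening words, while the second page only mentions it far down the text.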

Frequency is the other major factor in how search engines determine relevancy. A search engine will analyze how often keywords appear in relation to the other words on a web page. Pages where the keywords make up a larger share of the text are often deemed more relevant than other pages.
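As a rough illustration, frequency can be thought of as the share of a page's words that match the keyword. The sketch below assumes a very simple definition of a "word" and uses a made-up function name; real engines keep their exact formulas secret.

```python
import re

def keyword_frequency(keyword, text):
    """Return the fraction of words in text that equal keyword."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        # An empty page has no words, so the frequency is zero.
        return 0.0
    return words.count(keyword.lower()) / len(words)

sample = "Coffee is great. I love coffee."
print(keyword_frequency("coffee", sample))
```

Here "coffee" is two of the six words on the sample page, so the frequency is one third.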

Varying results

Now it's time to qualify the location/frequency method described above. All the major search engines follow it to some degree. However, no two search engines do it in exactly the same way, which is one reason why the same search on different search engines produces different results.

To begin with, some search engines index more web pages than others, whilst some also index web pages more often than others. The result is that no two search engines have exactly the same collection of web pages to search through. That naturally produces differences when comparing their results.

Spam, spam and more spam ...

Search engines may also penalize pages, or exclude them from the index altogether, if they detect search engine "spamming." One example is repeating a word hundreds of times on a page to increase its frequency and propel the page higher in the listings. Another is adding those words in the same colour as the page background, making them invisible to the viewer. Search engines watch for common spamming methods in a variety of ways, including following up on complaints from their users.
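A crude check for the repeated-word trick might look like the sketch below. It is a guess at the general idea only: the function name and both thresholds are invented for the example, and it does nothing about hidden, same-colour text.

```python
import re
from collections import Counter

def looks_stuffed(text, max_share=0.3, min_words=50):
    """Flag a page if a single word dominates its text.

    max_share and min_words are arbitrary values chosen for this sketch.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < min_words:
        # Too short to judge fairly.
        return False
    # Count of the single most frequent word on the page.
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) > max_share

print(looks_stuffed("coffee " * 200))
print(looks_stuffed("the quick brown fox jumps over the lazy dog " * 10))
```

The first page is nothing but one repeated word, so it is flagged; the second is ordinary text in which no single word takes up more than 30% of the page.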
