Monday, November 5, 2007

Latent Semantic Indexing Concepts

By John Martin

Latent semantic indexing (LSI) is a technique, originally developed in information-retrieval research in the late 1980s, that Google began to employ after its 2003 purchase of a company called Applied Semantics, which had built commercial technology around semantic analysis. LSI was first used by Google in its AdSense program as a way of determining which adverts would be most relevant for a particular site; now, however, it is also being used by Google and other search engines as a way of rating and ranking websites.

What LSI is, in basic and non-mathematical terms, is the ability for a search engine to evaluate websites the way a human would. In other words, the search engine looks for relevance and quality rather than just keywords or links. Keywords and links were how search engines used to do things, but that process had a number of weaknesses. First of all, webmasters or SEO 'experts' using "blackhat" techniques could get top rankings simply by loading a site with irrelevant keywords, using poor-quality content, or relying on link farms. Many sites would seek out links from other, irrelevant sites purely to make money from traffic or AdSense. At the same time, the old system sometimes penalized perfectly good sites: sites with good content that happened to add it quickly, or sites that were simply new. Most Internet users have at some point landed on an irrelevant site from a search engine's top rankings, so Google and other search engines have decided to do their best to create a cleaner, higher-quality Internet experience.

Looking at LSI in more detail, it is easy to see how to structure and build our websites and web pages correctly. The LSI algorithm works by scanning your website for keywords and then comparing the relationships between the various keywords and keyword phrases it finds. It also scans other websites that contain the same keywords, or the same concentration of those keywords, and looks for related words and phrases. LSI goes so far as to check grammar, terminology, and spelling on sites already indexed in addition to your own. Essentially, LSI checks the overall theme of your website to see whether it matches what the user is searching for, and how your site ranks against similar sites in terms of keyword relevance. The most relevant site wins--it ranks the highest.
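The classical LSI technique from information-retrieval research works by factoring a term-by-document matrix with a truncated singular value decomposition (SVD). Here is a minimal sketch in Python with NumPy; the pages, terms, and counts are invented for illustration, and this is the textbook algorithm, not Google's actual ranking code:

```python
import numpy as np

# Rows are terms, columns are three tiny "web pages" (invented for illustration):
# page 0 is about a cellphone, page 1 about a mobile phone, page 2 about baking.
terms = ["cellphone", "mobile", "phone", "battery", "charger",
         "banana", "bread", "recipe"]
A = np.array([
    [1, 0, 0],  # "cellphone" appears only on page 0
    [0, 1, 0],  # "mobile"
    [0, 1, 0],  # "phone"
    [1, 1, 0],  # "battery" is shared by pages 0 and 1
    [1, 1, 0],  # "charger" is shared by pages 0 and 1
    [0, 0, 1],  # "banana"
    [0, 0, 1],  # "bread"
    [0, 0, 1],  # "recipe"
], dtype=float)

# Truncated SVD: keep only the k strongest "latent topics".
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]  # rank-k "semantic" reconstruction

# Raw matrix: "cellphone" never occurs on the mobile-phone page.
print(A[0, 1])    # 0.0
# Reconstruction: LSI assigns it a positive weight there anyway (about 0.49),
# because the two pages share the context words "battery" and "charger".
print(A_k[0, 1])
```

The positive weight in the reconstruction is the "latent" relationship the paragraph above describes: the algorithm infers that the cellphone page and the mobile-phone page share a theme, even with no keyword overlap.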

For example, if you searched for "cellphone" under the old system, the search engine would display the sites with the highest count of the word "cellphone" and/or the most links. Under LSI, a search for "cellphone" also returns sites that use the phrases "mobile phone" or "cellular phone" or other related terms. What this means is that stuffing keywords into sites and articles will not win you a higher ranking, but relevant, quality content and a coherent overall theme will.
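This synonym effect can be sketched with the textbook LSI algorithm (a truncated SVD of a term-by-document matrix). The pages and word counts below are invented for illustration, and this is a sketch of classic LSI, not any search engine's actual implementation:

```python
import numpy as np

# Rows are terms, columns are three tiny "web pages" (invented data):
terms = ["cellphone", "mobile", "phone", "battery", "charger",
         "banana", "bread", "recipe"]
A = np.array([
    [1, 0, 0], [0, 1, 0], [0, 1, 0],  # cellphone / mobile / phone
    [1, 1, 0], [1, 1, 0],             # battery, charger: shared context
    [0, 0, 1], [0, 0, 1], [0, 0, 1],  # banana, bread, recipe
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2

# Each document as a point in the k-dimensional latent space.
docs_k = (np.diag(s[:k]) @ Vt[:k]).T

# A one-word query, "cellphone", folded into the same latent space.
q = np.zeros(len(terms))
q[terms.index("cellphone")] = 1.0
q_k = q @ U[:, :k]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

raw = cosine(q, A[:, 1])         # query vs. the "mobile phone" page, raw keywords
latent = cosine(q_k, docs_k[1])  # the same comparison in latent space
```

On raw keywords the query "cellphone" matches the mobile-phone page not at all (`raw` is 0), but in the latent space the two are nearly identical (`latent` is close to 1), because both terms co-occur with "battery" and "charger". That is exactly the behavior described above: a search for "cellphone" can surface a page that only ever says "mobile phone".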

Website developers and writers who have been using optimization techniques based on ethical, quality principles will finally come out on top, while irrelevant, rubbish sites are pushed off the rankings entirely. With the introduction of latent semantic indexing, the higher the quality and relevance of a site, the better its rankings will be.

John Martin has been working with website optimization techniques since the early 1990s. He is the owner of http://www.LatentSemanticIndexing.com, where you can read articles related to LSI and keep up to date on the latest optimization techniques.
Article Source: http://EzineArticles.com/?expert=John_Martin