Search
engine optimization is a set of methods aimed at improving
the ranking of a website in search engine listings. The term
also refers to an industry of consultants who carry out optimization
projects on behalf of clients' sites. SEO, or "white
hat SEO", is distinguished from "black hat SEO",
or spamdexing, by its methods and objectives. Spamdexing uses a
variety of deceptive techniques in an attempt to manipulate
search engine rankings, whereas legitimate SEO focuses on
building better sites and using honest methods of promotion.
What constitutes an honest, or ethical, method is an issue
that has been the subject of numerous debates.
Search engines display different kinds of listings in the
search engine results pages (SERPs), including: pay-per-click
advertisements, paid inclusion listings, and organic search
results. SEO is primarily concerned with advancing the goals
of a website by improving the number and position of its
organic search results for a wide variety of relevant keywords.
SEO strategies can increase the number of visitors, and the
quality of visitors, where quality means visitors who complete
the action the site intends (e.g. purchase, sign up, learn
something).
Google was started by two PhD students at Stanford University,
Sergey Brin and Larry Page, and brought a new concept to evaluating
web pages. This concept, called PageRank, has been important
to the Google algorithm from the start [1]. PageRank relies
heavily on incoming links and uses the logic that each link
to a page is a vote for that page's value. The more incoming
links a page has, the more "worthy" it is. The value
of each incoming link itself varies directly with the
PageRank of the page it comes from and inversely with the number
of outgoing links on that page.
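The voting logic described above can be sketched in a few lines of Python. This is only an illustrative simplification, not Google's actual implementation: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the handling of pages with no outgoing links are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a PageRank-style score for each page.

    links: dict mapping a page name to the list of pages it links to.
    damping and iterations are illustrative assumptions, not values
    taken from the article.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the same score for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link's value grows with the rank of the linking page
                # and shrinks with the number of outgoing links on it.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its score evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: three pages link to "c", and "c" links to "a".
example_links = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
for page, score in sorted(pagerank(example_links).items()):
    print(page, round(score, 4))
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page it links to inherits much of that score because it is the only outgoing link.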
With help from PageRank, Google proved to be very good at
serving relevant results. Google became the most popular and
successful search engine. Because PageRank measured an off-site
factor, Google felt it would be more difficult to manipulate
than on-page factors.
But manipulated it was. Webmasters had already developed link
manipulation tools and schemes to influence the Inktomi search
engine. These methods proved to be equally applicable to Google's
algorithm. Many sites focused on exchanging, buying, and selling
links on a massive scale. PageRank's reliance on the link
as a vote of confidence in a page's value was undermined as
many webmasters sought to garner links purely to influence
Google into sending them more traffic, irrespective of whether
the link was useful to human site visitors.
It was time for Google, and other search engines, to
look at a wider range of off-site factors. There were other
reasons to develop more intelligent algorithms. The Internet
was reaching a vast population of non-technical users who
were often unable to use advanced querying techniques to reach
the information they were seeking, and the sheer volume and
complexity of the indexed data were vastly different from those
of the early days. Search engines had to develop predictive,
semantic, linguistic and heuristic algorithms.
A proxy for the PageRank metric is still displayed in the
Google Toolbar, but PageRank is only one of more than 100
factors that Google considers in ranking pages.
Today, most search engines keep their methods and ranking
algorithms secret. A search engine may use hundreds of factors
in ranking the listings on its SERPs; the factors themselves
and the weight each carries may change continually.
Much current SEO thinking on what works and what doesn't is
speculation and informed guesswork. Some SEOs have carried
out controlled experiments to gauge the effects of different
approaches to search optimization.
The following, though, are some of the considerations search
engines could be building into their algorithms, and the list
of Google patents [2] may give some indication as to what
is in the pipeline:
»» Age of site
»» Length of time domain has been registered
»» Age of content
»» Regularity with which new content is added
»» Age of link and reputation of linking site
»» Standard on-site factors
»» Negative scoring for on-site factors (for example, a dampening
for sites with extensive keyword meta tags indicative of having
been SEO-ed)
»» Uniqueness of content
»» Related terms used in content (the terms the search engine
associates as being related to the main content of the page)
»» Google PageRank (used only in Google's algorithm)
»» External links, the anchor text in those external
links and in the sites/pages containing those links
»» Citations and research sources (indicating
the content is of research quality)
»» Stem-related terms in the search engine's database
(finance/financing)
»» Incoming backlinks and anchor text of incoming
backlinks
»» Negative scoring for some incoming backlinks
(perhaps those coming from low value pages, reciprocated backlinks,
etc.)
»» Rate of acquisition of backlinks: too many
too fast could indicate "unnatural" link buying
activity
»» Text surrounding outward links and incoming
backlinks. A link following the words "Sponsored
Links" could be ignored
»» Use of "rel=nofollow" to suggest
that the search engine should ignore the link
»» Depth of document in site
»» Metrics collected from other sources, such
as monitoring how frequently users hit the back button when
SERPs send them to a particular page
»» Metrics collected from sources like the Google
Toolbar, Google AdWords/Adsense programs, etc.
»» Metrics collected in data-sharing arrangements
with third parties (like providers of statistical programs
used to monitor site traffic)
»» Rate of removal of incoming links to the site
»» Use of sub-domains, use of keywords in sub-domains
and volume of content on sub-domains… and negative scoring
for such activity
»» Semantic connections of hosted documents
»» Rate of document addition or change
»» IP of hosting service and the number/quality
of other sites hosted on that IP
»» Other affiliations of linking site with the
linked site (do they share an IP? have a common postal address
on the "contact us" page?)
»» Technical matters like use of 301 to redirect
moved pages, showing a 404 server header rather than a 200
server header for pages that don't exist, proper use of robots.txt
»» Hosting uptime
»» Whether the site serves different content to
different categories of users (cloaking)
»» Broken outgoing links not rectified promptly
»» Unsafe or illegal content
»» Quality of HTML coding, presence of coding
errors
»» Actual click through rates observed by the
search engines for listings displayed on their SERPs
»» Hand ranking by humans of the most frequently
accessed SERPs
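Two of the on-page signals in the list above, "rel=nofollow" and robots.txt, are easy to inspect with Python's standard library. The sketch below is only an illustration of how a site owner might check their own pages. The helper names `split_links` and `allowed_by_robots`, the sample markup, and the "ExampleBot" user agent are hypothetical, and nothing here reflects how any search engine actually weighs these signals.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collects (href, is_nofollow) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def split_links(html):
    """Return (followed, nofollowed) lists of link targets found in html."""
    collector = LinkCollector()
    collector.feed(html)
    followed = [href for href, nofollow in collector.links if not nofollow]
    nofollowed = [href for href, nofollow in collector.links if nofollow]
    return followed, nofollowed


def allowed_by_robots(robots_txt, user_agent, url):
    """Check whether robots_txt (given as text) lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page fragment and robots.txt used only for illustration.
sample_html = (
    '<a href="/about">About</a> '
    '<a rel="nofollow" href="https://example.com/ad">Sponsored</a>'
)
sample_robots = "User-agent: *\nDisallow: /private/\n"

print(split_links(sample_html))
print(allowed_by_robots(sample_robots, "ExampleBot",
                        "https://example.com/private/page.html"))
```

Parsing the robots.txt text directly, rather than fetching it over the network, keeps the example self-contained; a real check would read the file from the site being audited.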