Thursday, September 18, 2008

How Teoma Works




Teoma adds a new dimension and level of authority to search results through its

approach, known as Subject-Specific Popularity℠.


Instead of ranking results based upon the sites with the most links leading to them,

Teoma analyzes the Web as it is organically organized—in naturally-occurring

communities that are about or related to the same subject—to determine which sites are

most relevant. Teoma’s search technology can locate communities on the Web within

their specific subject areas, as they actually exist.


To determine the authority—and thus the overall quality and relevance—of a site's

content, Teoma uses Subject-Specific Popularity℠. Subject-Specific Popularity ranks a

site based on the number of same-subject pages that reference it, not just general

popularity. In a recent test performed by Search Engine Watch, Teoma's relevance grade

was raised to an "A" following the integration of Teoma 2.0.


Teoma 2.0: Evolution and Growth


Teoma 2.0 launched in early 2003. The new version brought major improvements to

relevance and expanded the advanced search features. The improvements made in this

version are explained below:


More Communities





Like social networks in the real world, the Web is clustered into local communities.



Communities are groups of Web pages that are about or are closely related to the same

subject. Teoma is the only search technology that can view these communities as they

naturally occur on the Web (displayed under the heading "Refine" on Teoma.com). This

method allows Teoma to generate more finely tuned search results. In other words,

Teoma's community-based approach reveals a 3-D image of the Web, providing it with

more information about a particular Web page than other search engines, which have

only a one-dimensional view of the Web.






Web-Based Spell Check





Teoma's proprietary Spell Check technology identifies query misspellings and offers

corrections that help improve the relevance and precision of search results. The Spell

Check technology, developed by Teoma's team of scientists, leverages the real-time

content of the Web to determine the correct spelling of a word.
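
Teoma's Spell Check is proprietary, so its actual method is not public. As a rough illustration only, the sketch below shows one common way that Web text can drive spelling suggestions: count how often each word appears in a body of text, then suggest the most frequent known word within one edit of the query term. The function names (edits1, suggest) and the tiny corpus are assumptions made for this example.

```python
from collections import Counter
import string


def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def suggest(word, frequencies):
    """Suggest the most frequent known word within one edit of `word`."""
    word = word.lower()
    if word in frequencies:
        return word  # already a known word, nothing to correct
    candidates = [w for w in edits1(word) if w in frequencies]
    if not candidates:
        return word  # no better guess available
    # Break frequency ties alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (frequencies[w], w))


# Hypothetical stand-in for word counts gathered from Web pages.
corpus = "the soprano sang and the soprano bowed while a piano played".split()
frequencies = Counter(corpus)

print(suggest("sopranno", frequencies))
```

A production system would need a far larger corpus and would probably look at whole queries rather than single words, but the idea of letting observed usage decide the correct spelling is the same.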






Dynamic Descriptions℠


Dynamic Descriptions enhance search results by showing the context of search terms as

they actually appear on referring Web pages. This feature provides searchers with

information that helps them to determine the relevance of a given Web page in

association with their query.
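
To make the idea concrete, here is a minimal keyword-in-context sketch in Python. It is not Teoma's code; the function name keyword_in_context and the width parameter are assumptions for illustration. It finds the first occurrence of a search term in a page's text and returns the surrounding characters.

```python
import re


def keyword_in_context(text, term, width=30):
    """Return the first occurrence of `term` with up to `width` characters on each side."""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None  # the term does not appear on this page
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    # Mark the snippet with "..." wherever the page text was cut off.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix


page = "Teoma ranks pages by authority within a subject community on the Web."
print(keyword_in_context(page, "authority", width=10))
```

Because the snippet depends on the query, the same page can show a different description for each search.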







Advanced Search Tools





Teoma's Advanced Search tools allow searchers to search using specific criteria, such as

exact phrase, page location, geographic region, domain and site, date, and other word

filters. Users can also search using 10 Western languages, including Danish, Dutch,

English, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish. A link

to Teoma's Advanced Search tools can be found next to the search box on Teoma.com.






The Teoma Algorithm


In addition to utilizing existing search techniques, Teoma applies what it calls

authority, a new measure of relevance, to deliver search results. For this purpose, Teoma

employs three proprietary techniques:






Refine, Results and Resources.


Refine


First, Teoma organizes sites into naturally occurring communities that are about the

subject of each search query. These communities are presented under the heading

"Refine" on the Teoma.com results page. This tool allows a user to further focus his or

her specific search.






For example, a search for "Soprano" would present a user with a set of refinement

suggestions such as "Marie-Adele McArther" (a renowned soprano), "Three Sopranos"

(the operatic trio), "The Sopranos" (the wildly popular HBO television show) as well as

several other choices. No other technology can dynamically cluster search results into the

actual communities as they exist on the Web.






Results





Next, after identifying these communities, Teoma employs a technique called

Subject-Specific Popularity℠. Subject-Specific Popularity analyzes the relationship of sites

within a community, ranking a site based on the number of same-subject pages that

reference it, among hundreds of other criteria. In other words, Teoma determines the best

answer for a search by asking experts within a specific subject community about who

they believe is the best resource for that subject. By assessing the opinions of a site's

peers, Teoma establishes authority for the search result. Relevant search results ranked by

Subject-Specific Popularity are presented under the heading "Results" on the

Teoma.com results page.
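
The core idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Teoma's algorithm: it counts inbound links from pages inside the same community and ignores the "hundreds of other criteria" mentioned above. The function name, the link list, and the example hostnames are all made up for this sketch.

```python
def subject_specific_popularity(links, community):
    """Rank community pages by how many other community pages link to them."""
    scores = {page: 0 for page in community}
    for source, target in links:
        # Only a link from one community page to a different one counts as a vote.
        if source in community and target in community and source != target:
            scores[target] += 1
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link graph as (source, target) pairs.
links = [
    ("opera-fans.example", "soprano-guide.example"),
    ("voice-teacher.example", "soprano-guide.example"),
    ("car-blog.example", "tv-news.example"),
    ("sports.example", "tv-news.example"),
    ("gossip.example", "tv-news.example"),
    ("opera-fans.example", "tv-news.example"),
]
community = {
    "opera-fans.example",
    "voice-teacher.example",
    "soprano-guide.example",
    "tv-news.example",
}

for page, score in subject_specific_popularity(links, community):
    print(page, score)
```

In this made-up graph, tv-news.example has the most links overall (four), but only one of them comes from inside the community, so soprano-guide.example, with two same-subject links, ranks first.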



In some instances companies pay to have their Web sites included within Teoma's

dataset, otherwise known as the Teoma Index. Like all Web sites, these sites are

processed through Teoma's search algorithms and are not guaranteed placement in the

results. This ensures that relevancy is the primary driver of results.






Resources





Finally, by dividing the Web into local subject communities, Teoma is able to find and

identify expert resources about a particular subject. These sites feature lists of other

authoritative sites and links relating to the search topic.


For example, a professor of Middle Eastern history may have created a page devoted to

his collection of sites that explain the geography and topography of the Persian Gulf. This

site would appear under the heading "Resources" in response to a Persian Gulf-related

query. No previous search technology has been able to find and rank these sites.
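
One simple way to picture how such pages could be spotted is to count how many different pages in a subject community each page links out to. The Python sketch below does just that. It is an assumption-laden illustration, not Teoma's method: the function name resource_pages, the min_links threshold, and the example URLs are all hypothetical.

```python
def resource_pages(links, community, min_links=3):
    """Find pages that link to at least `min_links` distinct pages in `community`."""
    outbound = {}
    for source, target in links:
        if target in community and source != target:
            outbound.setdefault(source, set()).add(target)
    hubs = [
        (page, len(targets))
        for page, targets in outbound.items()
        if len(targets) >= min_links
    ]
    # Pages with the most same-subject links come first; ties are alphabetical.
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


# Hypothetical community of Persian Gulf geography pages.
gulf_pages = {"gulf-maps.example", "gulf-topography.example", "gulf-atlas.example"}
gulf_links = [
    ("professor.example/gulf", "gulf-maps.example"),
    ("professor.example/gulf", "gulf-topography.example"),
    ("professor.example/gulf", "gulf-atlas.example"),
    ("professor.example/gulf", "unrelated.example"),
    ("travel-blog.example", "gulf-maps.example"),
]

print(resource_pages(gulf_links, gulf_pages))
```

Here only the professor's page qualifies, since it links to three pages in the community while the travel blog links to one.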







Sponsored Links



Search results appearing under the heading "Sponsored Links" are provided by

Google®, a third-party provider of pay-for-performance search listings. Google generates

highly relevant sponsored results by allowing advertisers to bid for placement in this area

based on relevant keywords. These results, which are powered by Google's advanced

algorithms, are then distributed across the Internet to some of the world's most popular

and well-known Web sites, including Teoma.com and Ask Jeeves.



Other factors


Boolean Searching





Limited Boolean searching is available. Teoma defaults to an AND between search terms

and supports the use of - for NOT. Either OR or ORR can be used for an OR operation,

but the operator must be in all upper case. Unfortunately, no nesting is available.
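
The rules above are simple enough to model in a short Python sketch. This is a guess at the behavior, not Teoma's parser. In particular, it assumes that OR joins only the two terms on either side of it, which the description above does not confirm. The function names parse_query and matches are made up for the example.

```python
def parse_query(query):
    """Parse a flat query into (groups, excluded).

    Terms inside a group are ORed together, groups are ANDed, and
    excluded terms must not appear. Nesting is not supported.
    """
    groups, excluded = [], []
    join_next = False
    for token in query.split():
        if token in ("OR", "ORR"):  # operators are recognized only in upper case
            join_next = bool(groups)
            continue
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
            join_next = False
            continue
        if join_next:
            groups[-1].append(token.lower())  # OR: extend the previous group
        else:
            groups.append([token.lower()])  # default AND: start a new group
        join_next = False
    return groups, excluded


def matches(page_words, groups, excluded):
    """Check whether a page's words satisfy the parsed query."""
    words = {w.lower() for w in page_words}
    has_required = all(any(term in words for term in group) for group in groups)
    has_excluded = any(term in words for term in excluded)
    return has_required and not has_excluded


groups, excluded = parse_query("soprano opera OR tv -hbo")
print(groups, excluded)
print(matches("The soprano sang on TV".split(), groups, excluded))
```

Note that a lowercase "or" is treated as an ordinary search term here, which mirrors the requirement that the operator be typed in upper case.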


Proximity Searching





Phrase searching is available by using “double quotes” around a phrase or by checking

the "Phrase Match" box. Teoma also supports phrase searching when a dash is used

between words with no spaces. Until Nov. 2002, Teoma's help page stated that "Teoma

returns results which exactly or closely matches the given phrase" which meant that not

every phrase match was necessarily exact. As of Nov. 2002, that appears to have

been corrected and phrase searching now works properly.


Truncation






No truncation is currently available.


Case Sensitivity


Searches are not case sensitive. Search terms entered in lowercase, uppercase, or mixed

case all get the same number of hits.


Stop Words





Teoma ignores frequently occurring words such as 'the', 'of', 'and', and 'or'. However,

as with Google, these stop words can be searched by putting a + in front of them or by

including them within a phrase search.
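
The Python sketch below illustrates this stop-word behavior. It is only an approximation: the stop list holds just the four words named above (Teoma's real list is not given), the function name effective_terms is made up, and Boolean operators are ignored to keep the example short.

```python
import re

# Only the four examples named in the text; the real list is presumably longer.
STOP_WORDS = {"the", "of", "and", "or"}


def effective_terms(query):
    """Return the terms that would actually be searched for."""
    terms = []
    # A token is either a double-quoted phrase or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        if token.startswith('"'):
            terms.append(token.strip('"').lower())  # phrases keep their stop words
        elif token.startswith("+") and len(token) > 1:
            terms.append(token[1:].lower())  # + forces a stop word to be searched
        elif token.lower() not in STOP_WORDS:
            terms.append(token.lower())
    return terms


print(effective_terms("the lord of the rings"))
print(effective_terms("+the who"))
print(effective_terms('"lord of the rings" film'))
```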


Sorting





By default, sites are sorted in order of perceived relevance. Teoma also collapses results

by site, showing only two pages per site, with the rest linked via a “More Results”

message. There is no option for sorting alphabetically, by site, or by date.
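
Site collapsing is easy to picture with a short Python sketch. This is an illustration under assumptions, not Teoma's code: it treats the hostname as the "site", and the function name collapse_by_site and the example URLs are made up.

```python
from urllib.parse import urlsplit


def collapse_by_site(ranked_urls, per_site=2):
    """Keep the first `per_site` results for each host and set the rest aside."""
    shown, hidden, seen = [], {}, {}
    for url in ranked_urls:
        host = urlsplit(url).netloc
        seen[host] = seen.get(host, 0) + 1
        if seen[host] <= per_site:
            shown.append(url)  # stays in the main results, in rank order
        else:
            hidden.setdefault(host, []).append(url)  # reachable via "More Results"
    return shown, hidden


ranked = [
    "http://opera.example/sopranos",
    "http://opera.example/arias",
    "http://tv.example/the-sopranos",
    "http://opera.example/history",
]
shown, hidden = collapse_by_site(ranked)
print(shown)
print(hidden)
```

The relevance order is left untouched. The third page from opera.example is moved out of the main list and grouped under its host.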


Display





Teoma displays the title (roughly the first 60 characters), a two-line keyword-in-context

extract from the page, and the beginning of the URL for each hit. Some will also have a

link to "Related Pages" which finds related records based on identifying Web

communities by analyzing link patterns. Two other sections displayed are the "Refine"

section (formerly folders) that suggests other related searches based on words that Teoma

uses to identify communities on the Web and the "Resources: Link collections from

experts and enthusiasts" (formerly "Experts' Links") which are Web pages that include

numerous links to external resources -- metasites or Internet resource guides. Some

"Sponsored Links" may show up at the top. These are ads from the Google AdWords

program.


Teoma will only display 10 Web page records at a time; however, up to 100 at a time

can be displayed through a change in the preferences and on the advanced search page.

Teoma may also display up to 10 metasites under the "Resources" heading and up to 6

Refine suggestions.
