Choosing Keywords
Matching Strings and Algorithms
By: Simon White
    2004-03-15

    Table of Contents:
  • Matching Strings and Algorithms
  • Equivalence Methods
  • Wildcards and Regular Expressions
  • Similarity Ranking Methods
  • Conclusions



    Matching Strings and Algorithms - Equivalence Methods


    (Page 2 of 5)

    Equivalence methods compare two strings and return true or false according to whether the method deems the strings to be, in some sense, equivalent. In terms of user-interface design, your application can be more forgiving of user input if it accepts equivalent strings instead of only exact matches. A simple example of equivalence is to treat ‘Tweetle-Beetle Battle’ the same as ‘TWEETLE BEETLE BATTLE’, despite the difference in case and the replacement of the hyphen with a space in the second string.
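The Tweetle-Beetle example can be sketched in a few lines of Python. This is only one possible reading of "equivalent": the function names `normalize` and `equivalent` are illustrative assumptions, as is the decision to treat hyphens and runs of whitespace as a single space.

```python
import re


def normalize(text: str) -> str:
    """Reduce a string to a canonical form for equivalence testing."""
    # casefold() is a stronger form of lower() intended for caseless matching.
    folded = text.casefold()
    # Assumption: hyphens and any run of whitespace count as one space.
    return re.sub(r"[\s\-]+", " ", folded).strip()


def equivalent(a: str, b: str) -> bool:
    """Return True if the two strings share the same canonical form."""
    return normalize(a) == normalize(b)


if __name__ == "__main__":
    print(equivalent("Tweetle-Beetle Battle", "TWEETLE BEETLE BATTLE"))
```

Because both strings are reduced to the same canonical form before comparison, any further rules (stripping punctuation, for instance) can be added inside `normalize` without touching the comparison itself.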

    Word Stemming
    Word stemming is a technique that reduces closely related words to a basic canonical form or ‘stem’. For example, the user inputs ‘swims’ and ‘swimming’ can both be reduced to the stem ‘swim’ before performing an exact match against expected inputs. Stemming makes use of a suffix dictionary that lists possible word endings. However, such a list is clearly language-dependent, and even regional variants of the same language must be considered (for example, compare the British spelling ‘standardise’ with the American spelling ‘standardize’). Also, not all languages lend themselves to such treatment, although it has been demonstrated for most languages of the Indo-European family (which includes the Latin-based and Germanic languages). Deriving stemming algorithms is a difficult, time-consuming and error-prone activity. Therefore, for application building, I can only recommend using an existing tool such as Snowball, which comes with a suite of stemming algorithms for many languages.
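To make the idea of a suffix dictionary concrete, here is a deliberately tiny suffix-stripping sketch in Python. It is not Snowball and not a real stemming algorithm: the three-entry suffix list, the minimum stem length, and the names `toy_stem` and `matches_expected` are all assumptions chosen to handle the ‘swims’/‘swimming’ example and little else. For real applications, an established stemmer remains the sensible choice.

```python
# Hypothetical, minimal English suffix dictionary, longest suffix first.
SUFFIXES = ("ing", "es", "s")
MIN_STEM_LENGTH = 3
# Letters that are left doubled after stripping (e.g. 'falling' -> 'fall').
KEEP_DOUBLED = set("aeioulsz")


def toy_stem(word: str) -> str:
    """Strip one known suffix, then undo a doubled final consonant."""
    word = word.casefold()
    if word.endswith("ss"):
        # Words such as 'class' are not plurals; leave them alone.
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            stem = word[: -len(suffix)]
            # 'swimming' -> 'swimm' -> 'swim'
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in KEEP_DOUBLED:
                stem = stem[:-1]
            return stem
    return word


def matches_expected(user_input: str, expected: list[str]) -> bool:
    """Exact match on stems rather than on the raw strings."""
    return toy_stem(user_input) in {toy_stem(item) for item in expected}


if __name__ == "__main__":
    print(toy_stem("swims"), toy_stem("swimming"))
    print(matches_expected("Swimming", ["swim"]))
```

Even this toy version shows why hand-written rules are error-prone: every special case (the ‘ss’ guard, the undoubling rule) is a patch for one language, and each patch can misfire on words it was not written for.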

    Synonyms
    In this approach, synonyms of expected inputs are stored explicitly. For example, alongside the string ‘television’ you might also store ‘TV’ and ‘televisions’, and alongside the string ‘license’ you might also store ‘licence’. As with word stemming, user inputs are converted to a canonical form before any further processing.

    This mechanism can provide a forgiving user interface, and is also language-independent. Unfortunately, it does mean that many synonym strings must be prepared in advance to anticipate every possible user input. You could argue that the approach simply increases the number of expected inputs, rather than providing a better algorithm for finding the strings that are of real interest. However, if you reconsider the problem of retrieving product descriptions from a database, you should see that there are other advantages. Firstly, because synonyms are resolved close to the user interface, you can index your products in the system’s back end using a small, controlled keyword vocabulary. Secondly, the architecture provides a clean separation between these user-interface concerns and the integrity of the data.
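The separation described above might look something like the following Python sketch. The synonym table uses the examples from the text; the product index, its entries, and the names `canonical` and `find_products` are hypothetical stand-ins for a real database and its keyword vocabulary.

```python
# Synonyms are resolved near the user interface: each variant maps
# to one canonical keyword from the controlled vocabulary.
SYNONYMS = {
    "tv": "television",
    "televisions": "television",
    "licence": "license",
}

# Hypothetical back-end index, keyed only by canonical keywords.
PRODUCT_INDEX = {
    "television": ["Example flat-screen set", "Example portable set"],
    "license": ["Example software license"],
}


def canonical(term: str) -> str:
    """Convert user input to its canonical keyword, if one is known."""
    key = term.strip().casefold()
    return SYNONYMS.get(key, key)


def find_products(user_input: str) -> list[str]:
    """Look up products using the canonical keyword only."""
    return PRODUCT_INDEX.get(canonical(user_input), [])


if __name__ == "__main__":
    print(find_products("TV"))
    print(find_products("Licence"))
```

Note that the index never sees ‘TV’ or ‘licence’. New synonyms can be added to the table without touching the stored product data, which is the clean separation the approach is meant to provide.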

    © 2003-2010 by Developer Shed. All rights reserved.