Thursday, July 7, 2011

String Anagrams

You are required to implement a program which is able to print out all anagrams of a specified string. Two strings will become anagrams if letters of one string can be rearranged to obtain the other string. For example, the string "rats" is an anagram of "star". Your first step is to load a reasonable number of words in the dictionary into a hash table enabling the user to efficiently look for anagrams of a given word/s. The user should have the facility to specify anagram queries through stating the required word/s. The output should print out the given string/s along with the number of matching anagrams and the corresponding list of anagrams retrieved from the hash table.

       Example :    traps            2          sprat    strap
                            opt               2          top       pot
                            star              1          rats

      Note: The matching anagrams that the program prints out may be ordered differently.


Algorithm and Implementation 
Our first step is to load all of the words in the dictionary.txt into the hash table. A clever trick that we have used to facilitate this is to first sort the letters of every word we insert into hash table to produce a key for each word. For example, the key for the string for both "star" and "rats" is "arst". We will then use a hash table to store pairs of strings, where the pair consists of the original word and it’s key. When performing insertions, we will compute the hash of the key of the word to compute the correct bucket. This approach guarantees that all words which are anagrams of one another are stored in the same bucket of the hash table. Similarly, when we are searching for anagrams, we will first compute the key of the word we are searching for, then hash the key, then search that bucket for anagram matches. For this question, there are 5 files.

1. list.h ==> implementation of the linked list
2. hashtable.h ==> implementation of the hash table and basic functionality (insert, find) of a chained hash table.
3. main.cpp ==> a main program which loads words from the dictionary.txt into the hash table and then answers anagram queries
4. dictionary.txt ==> this contains all the anagrams of strings.
5. user_string.txt ==> user can be able to search word/words by specifying in this file.


No comments:

Post a Comment