When i was solving some ACM; problems . I found many angrams problems and some having application of anagrams; .During Holidays i was reading Knuths guide to programming ..and i have found this cool algo.
Suppose a problem is given :-
Given a List of dictionary words ( >1,00,000 words )in a file dict.txt ( say ) . and you have to find out all the anagrams in it .
So here is the Knuths algorithm :-
Store the words in an array a1 (say) sort each and every word and store it in a2. [ I mean to say suppose a1[i]=”dog” sort “dog” and then put a2[i]=”dgo” ]
Now sort array a2 according to the string in it & keep swaping the respective indexes in a1 .
Now compare values in a2 and find whether they are anagram or not
that’s it ![]()
Example :-
—————————————————————————————
dict.txt :-
lander –> a1[0]
Earth –> a1[1]
antsier –> a1[2]
Heart –> a1[3]
asterin –> a1[4]
randel –> a1[5]
sort and put in a2
a2[0]=adelnr
a2[1]=aehrth
a2[2]=aeinrst
a2[3]=aehrth
a2[4]=aeinrst
a2[5]=adelnr
Now sort a2 accordin to the str in it
simlilary change a1
a2[0]=adelnr a1[0]=lander
a2[1]=adelnr a1[1]=randel
a2[2]=aehrth a1[2]=Earth
a2[3]=aehrth a1[3]=Heart
a2[4]=aeinrst a1[4]=antsier
a2[5]=aeinrst a1[5]=asterin
Now compare values of a2 to find the grp of anagrams
according to the example
Grp of anagrams are :-
1 lander and randel
2 Earth and Heart
3 antsier and asterin
If some body has a better algorithm or a better way to improve this code plz write a comment





12 comments
Comments feed for this article
July 31, 2006 at 5:50 pm
snigam
seems dat u r all upto programmin goodies
keep it up dude
n keep postin similar posts
August 2, 2006 at 6:59 am
lifeizlikethat
You know ur algo sucks i have a much better algo than this!!!
August 2, 2006 at 8:56 am
marutiborker
@lifeizlikedat
Dood … frst of all this is not ma Algo .. and i think i have requested you to write the algo if u ahave a better one
August 9, 2006 at 7:06 am
ranjeeth
Good to see such discussions. But the original problem is too simple.. try the variants… this, for instance
January 10, 2008 at 2:19 am
Sanket
Is this 3n*log(n) or O ( n * log(n) ) ???
July 21, 2008 at 8:47 am
Peter
I don’t understand this Now sort a2 accordin to the str in it simlilary change a1
a2[0]=adelnr a1[0]=lander
a2[1]=adelnr a1[1]=randel
a2[2]=aehrth a1[2]=Earth
a2[3]=aehrth a1[3]=Heart
a2[4]=aeinrst a1[4]=antsier
a2[5]=aeinrst a1[5]=asterin
Now compare values of a2 to find the grp of anagrams
according to the example
Grp of anagrams are :-
1 lander and randel
2 Earth and Heart
3 antsier and asterin
Please explain.
July 21, 2008 at 9:04 am
Peter
I don’t understand this.
Now sort array a2 according to the string in it & keep swaping the respective indexes in a1 .
August 26, 2008 at 10:44 am
Maruti Borker
@Peter i meant sorting the a1 according to a2. That is suppose that a1 and a2 are mapped to each other and you are sorting based on the string in a2
January 5, 2011 at 12:24 am
Bruno Mosconi
first sign each line of a file for finding anagrams
The input line “stop” gives the output line “opst stop”
1. Sign words so that those in the same anagram class have the same signature, and then collect equal signatures.
Collecting the Classes. Sort the words by their sig- natures.
Sort this way (with a horizontal wave of the hand) then that way (a vertical wave).
September 20, 2011 at 12:11 am
Jack Reilly
You can do this in linear time by doing bucket sort (implemented using a hash map), on the bucket sorted words. never forget bucket sort!
https://gist.github.com/1227148
September 27, 2011 at 1:09 am
Liam
You need to be a little careful using bucket sort here. Bucket sort is only linear if the elements are approximately evenly distributed in a known range. While we know the range (the alphabet), letters in the alphabet are not evenly distributed. You will have to make a version of bucket sort that respects the relative frequency of letters.
September 20, 2011 at 12:14 am
Jack Reilly
sorry, here’s the gist
https://gist.github.com/1227157