## permutations again

Write here if you have problems with your C source code

Moderator: Board moderators

### permutations again

Hi,

I've found lots of posts telling how to generate permutations of a string but none about determining whether a number (with let's say 10 digits) is a permutation of another number.

Right now I convert those numbers to strings, sort them and then strcmp them. It works and is reasonable fast, but is there a better algorithm?
WR
Experienced poster

Posts: 145
Joined: Thu Nov 27, 2003 9:46 am

For such short string (10 digits) I think it's as fast as it can get. For longer strings (say >100 digits), you could calculate two histograms and compare them. That method is O(N), while the sorting method is O(N*logN) (N is number of digits in a string).

little joey
Guru

Posts: 1080
Joined: Thu Dec 19, 2002 7:37 pm

Thanks a lot for the quick response!
WR
Experienced poster

Posts: 145
Joined: Thu Nov 27, 2003 9:46 am

Hmm. I was wrong in my previous post. The complexity of the sorting method is of course dependant on the complexity of the sorting function. In this case the O(N) bucket sort can be used, so the sorting method can be also made O(N).

I wrote a test program to test the three different methods (sorting using quick sort, sorting using bucket sort and the histogram method), and here are the results:
Code: Select all
`     N   QSORT   BSORT   HISTO ----- ------- ------- -------     1      10      10      10     2      20      10      10     5      30      10      10    10      50      20      10    20     120      20      20    50     340      50      20   100     770      80      40   200    1720     140      80   500    4890     320     180  1000   10610     630     360  2000   20370    1220     720  5000   62760    3070    1780 10000  134270    6140    3540`

Here the numbers are the times in milliseconds to perform 10000 tests using two random strings of length N (using the very inaccurate function clock()). The number of succesful tests is about equal to the number of failing tests (in about half the cases the second string is a permutation of the first). The numbers support the O(N*logN) complexity for the QSORT method and the O(N) complexity for the BSORT and HISTO methods. The fact that BSORT is slower then HISTO doesn't mean that HISTO is the better method, but only that my HISTO implementation is more efficient than my BSORT method. Careful inspection of both methods reveal that they in fact are two different implementations of the same algorithm.

For those interested in my code, here it is:
[c]#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define TESTS 10000
#define MAXSTRING 10000

const int DIGITS[]={1,2,5,10,20,50,100,200,500,1000,2000,5000,10000};

int charcompare(const void *p1, const void *p2){

return *((char *)p1) - *((char *)p2);
}

int q_ispermutation(char *s1, char *s2){
static char c1[MAXSTRING+1],c2[MAXSTRING+1];

strcpy(c1,s1);
qsort(c1,strlen(c1),sizeof(char),charcompare);
strcpy(c2,s2);
qsort(c2,strlen(c2),sizeof(char),charcompare);
return !strcmp(c1,c2);
}

void bsort(char *s){
int bucket[10];
int i,j;

for(i=0;i<10;i++) bucket[i]=0;
for(j=strlen(s)-1;j>=0;j--) bucket[s[j]-'0']++;
for(i=0;i<10;i++){
for(j=bucket[i];j>0;j--) *(s++)='0'+i;
}
}

int b_ispermutation(char *s1, char *s2){
static char c1[MAXSTRING+1],c2[MAXSTRING+1];

strcpy(c1,s1);
bsort(c1);
strcpy(c2,s2);
bsort(c2);
return !strcmp(c1,c2);
}

int h_ispermutation(char *s1, char *s2){
int histo[10];
int i;

for(i=0;i<10;i++) histo[i]=0;
for(i=strlen(s1)-1;i>=0;i--) histo[s1[i]-'0']++;
for(i=strlen(s2)-1;i>=0;i--) histo[s2[i]-'0']--;
for(i=0;i<10;i++) if(histo[i]) return 0;
return 1;
}

void randomize_string(char *s){
int i,j,len;
char temp;

len=strlen(s);
for(i=0;i<len-1;i++){
j=i+rand()%(len-i);
temp=s[i];
s[i]=s[j];
s[j]=temp;
}
}

int main(){
int caseno,i,j;
static char s1[TESTS][MAXSTRING+1],s2[TESTS][MAXSTRING+1];
clock_t t0,tq,tb,th;

printf(" N QSORT BSORT HISTO\n");
printf(" ----- ------- ------- -------\n");

for(caseno=0;caseno<sizeof(DIGITS)/sizeof(int);caseno++){
for(i=0;i<TESTS;i++){
for(j=0;j<DIGITS[caseno];j++) s1[i][j]='0'+rand()%10;
s1[i][DIGITS[caseno]]='\0';
if(rand()%2){
for(j=0;j<DIGITS[caseno];j++) s2[i][j]='0'+rand()%10;
s2[i][DIGITS[caseno]]='\0';
}
else{
strcpy(s2[i],s1[i]);
randomize_string(s2[i]);
}
}
t0=clock();
for(i=0;i<TESTS;i++) q_ispermutation(s1[i],s2[i]);
tq=clock()-t0;
t0=clock();
for(i=0;i<TESTS;i++) b_ispermutation(s1[i],s2[i]);
tb=clock()-t0;
t0=clock();
for(i=0;i<TESTS;i++) h_ispermutation(s1[i],s2[i]);
th=clock()-t0;
printf("%6d%8d%8d%8d\n",DIGITS[caseno],tq/1000,tb/1000,th/1000);
}

return 0;
}
[/c]

little joey
Guru

Posts: 1080
Joined: Thu Dec 19, 2002 7:37 pm

i don't understand how in the case of two random strings the number of failure about as equal to the number of success. i'd be suprised if you can get two random strings to be permutations for the alphabet size of N=10000. indeed Bucket sort and histogram are same techniques, however, one could use the histogram technique a bit more efficiently for this specific problem.
bye
abi

abishek
Experienced poster

Posts: 131
Joined: Mon Dec 15, 2003 5:41 am

abishek wrote:i don't understand how in the case of two random strings the number of failure about as equal to the number of success. i'd be suprised if you can get two random strings to be permutations for the alphabet size of N=10000.

It's the if(rand()%2) in my code. In about half the cases a new random string is created for s2[i], in the other half of the cases (else part) s2[i] is just a random permutation of s1[i].

abishek wrote:indeed Bucket sort and histogram are same techniques, however, one could use the histogram technique a bit more efficiently for this specific problem.

Well, my objective was to demonstrate my statement, not to write the most efficient code. I think the code for b_ispermutation() can be optimized to run about as fast as h_ispermutation(), but that was beside my point.

little joey
Guru

Posts: 1080
Joined: Thu Dec 19, 2002 7:37 pm