Good day. I am writing my own hashing algorithm and porting it from PHP to C++, but the C++ result differs from the PHP result. The PHP result contains more than 10 characters, while the C++ result has only 6 to 8. The last 8 characters of the PHP result are the same as the C++ result. Here is the PHP code:

<?php function JL1($text) { 
$text.="XQ";
$length=strlen($text);
$hash=0;        
for($j=0;$j<$length;$j++) {
    $p=$text[$j];
    $s=ord($p);
    if($s%2==0) $s+=9999;
    $hash+=$s*($j+1)*0x40ACEF*0xFF;                         
}       
$hash+=33*0x40ACEF*0xFF;
$hash=sprintf("%x",$hash);
return $hash; } ?>

And here is the C++ code:

char * JL1(char * str){
int size=(strlen(str)+3),s=0; //Edit here (+2 replaced with +3)
if(size<=6) //Edit here (<9 replaced with <=6)
    size=9;
char *final=new char[size],temp;
strcpy(final,str);
strcat(final,"XQ");
long length=strlen(final),hash=0L;
for(int i=0;i<length;i++){
    temp=final[i];
    s=(int)temp;    
    if(s%2==0)s+=9999;
    hash+=((s)*(i+1)*(0x40ACEF)*(0xFF));
}
hash+=33*(0x40ACEF)*(0xFF);
sprintf(final,"%x",hash); //to hex string
final[8]='\0';
return final; }

Example: for the word "Hi!" the C++ result is 053c81be, and the PHP result is 324c053c81be.

Does anyone know where the mistake is and how to fix it, whether in the PHP or in the C++ code? By the way, when I cut off the leading characters of the PHP result I get the C++ result, but that won't help, because the C++ result does not have to be 8 characters long; in some cases it is only 6 characters long.

  • `long hash` in C++ is most likely limited to 32 bits on your platform. PHP's number isn't. Try doing `$hash = ($hash & 0xFFFFFFFF);` after every modification of `$hash`. – DCoder Jul 26 '12 at 12:41
  • @DCOder, nope, I'm getting the same result on a 64-bit platform. – SingerOfTheFall Jul 26 '12 at 12:43
  • It works, but only when the C++ hash is 8 characters long. When hashing "asdf", the C++ result is 6 characters long: 20205e, but the PHP result is 9 characters long: 10020205e –  Jul 26 '12 at 12:46
  • @SingerOfTheFall: Are you on [Windows x64](http://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows) or Linux x64? – DCoder Jul 26 '12 at 12:52
  • Interesting. I think that the (main) problem is in `sprintf(final, "%x", hash)`. `%x` interprets the argument as an *unsigned int*, which is 32 bits on both Windows and Linux x64. So it's interpreting a `long` as an `unsigned int` and in doing so your visible result gets truncated. It seems to me that if you clamp the final value of php's `$hash`, the php code will produce the same results as the C++ code would on x64 Linux. If you clamp it every step of the way, you should get the same results as the C++ code would on 32 bit platforms or Windows x64. – DCoder Jul 26 '12 at 13:10
  • @DCoder You were right the first time, I just explained it to myself wrongly. Here is the fixed PHP code: `$hash+=($s*($j+1)*(0x40ACEF*0xFF)); $hash&=0xffffffff;` (P.S. How do I mark a comment as the answer? I'm new here.) –  Jul 26 '12 at 13:43
  • You cannot mark a *comment* as an answer. But I added my comments as an answer, so you can mark *that*. :) – DCoder Jul 26 '12 at 14:20

3 Answers

Where to begin...

Data types do not have fixed guaranteed sizes in C or C++. As such, hash may overflow every iteration, or it may never do so.

chars can be either signed or unsigned, therefore converting one to an integer may result in negative and positive values on different implementations, for the same character.

You may be writing past the end of final when printing the value of hash into it. You may also be cutting the string off prematurely when setting the 9th character to 0.

strcat will write past the end of final if str is at least 7 characters long.

s, a relatively short-lived temporary variable, is declared way too soon. Same with temp.

Your code looks very crowded with almost no whitespace, and is very hard to read.

The expression "33*(0x40ACEF)*(0xFF)" overflows; did you mean 0x4DF48431L?

Consider using std::string instead of char arrays when dealing with strings in C++.
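Several of the points above (fixed-width arithmetic, unsigned character values, no manual buffer for the input, a correctly sized output buffer) can be combined in one possible rewrite. This is only a sketch, assuming the goal is to reproduce the low 32 bits of the PHP result; the choice of `std::uint32_t` and of the `PRIx32` format macro is an assumption, not something taken from the question.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Sketch of the question's JL1 using std::string and a fixed-width hash.
std::string JL1(const std::string& text) {
    const std::string salted = text + "XQ";  // std::string grows as needed

    std::uint32_t hash = 0;  // unsigned 32-bit: wraps modulo 2^32, no UB
    for (std::size_t i = 0; i < salted.size(); ++i) {
        // Go through unsigned char so bytes above 127 stay positive,
        // which is what PHP's ord() returns.
        std::uint32_t s = static_cast<unsigned char>(salted[i]);
        if (s % 2 == 0) s += 9999;
        hash += s * static_cast<std::uint32_t>(i + 1) * 0x40ACEFu * 0xFFu;
    }
    hash += 33u * 0x40ACEFu * 0xFFu;

    // At most 8 hex digits plus the terminating '\0'.
    char buffer[9];
    std::snprintf(buffer, sizeof buffer, "%" PRIx32, hash);
    return buffer;
}
```

With this sketch, `JL1("asdf")` yields `20205e`, the 6-character value the asker reports for the C++ code.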

aib
  • Converting char to int gives a negative value only when the char holds a non-English character, I mean "ô", "ň", "ä", "ö", "č" etc. `char *final=new char[(at least 9)];` is at least 9 characters long, because the maximum hash size is 8 characters plus the \0 character –  Jul 26 '12 at 13:04

There seems to be a bug here...

int size=(strlen(str)+2),s=0; 
if(size<9)     
    size=9; 
char *final=new char[size],temp; 
strcpy(final,str); 
strcat(final,"XQ");

If strlen was say 10, then size will be 12 and 12 chars will be allocated. You then copy in the original 10 characters, and add XQ, but the final terminating \0 will be outside of the allocated memory.

Not sure if that's your bug or not, but it doesn't look right.
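The size calculation described above could be sketched as follows: the buffer needs room for the input, the two-character suffix, and the terminating `\0`. The helper names `salted_buffer_size` and `append_salt` are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: bytes needed to hold str + "XQ" + '\0'.
std::size_t salted_buffer_size(const char* str) {
    static const char suffix[] = "XQ";
    // sizeof suffix counts the '\0', so subtract it and add it back
    // explicitly to make both parts of the sum visible.
    return std::strlen(str) + (sizeof suffix - 1) + 1;
}

// Hypothetical helper: returns a new[]-allocated copy of str with "XQ"
// appended. The caller owns the result and must delete[] it.
char* append_salt(const char* str) {
    char* out = new char[salted_buffer_size(str)];
    std::strcpy(out, str);   // copies strlen(str) characters plus '\0'
    std::strcat(out, "XQ");  // overwrites that '\0', writes "XQ" and a new '\0'
    return out;
}
```

For a 10-character input this allocates 13 bytes instead of 12, so the terminator stays inside the allocation.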

jcoder
  • Yes, I already fixed it with: int size=(strlen(str)+3),s=0; if(size<=6) size=9; But the results are still the same. –  Jul 26 '12 at 13:06
  1. long hash in C++ is most likely limited to 32 bits on your platform. PHP's number isn't.

  2. sprintf(final, "%x", hash) produces a possibly incorrect result. %x interprets the argument as an unsigned int, which is 32 bits on both Windows and Linux x64. So it's interpreting a long as an unsigned int; if your long is more than 32 bits, your result will get truncated.

  3. See all the issues raised by aib. Especially the premature termination of the result.

You will need to deal with the 3rd point yourself, but I can answer the first two. You need to clamp the result to 32 bits: $hash &= 0xFFFFFFFF;.

If you clamp the final value, the php code will produce the same results as the C++ code would on x64 Linux (that means 64 bit integers for intermediate results).

If you clamp it after every computation, you should get the same results as the C++ code would on 32 bit platforms or Windows x64 (32 bit integers for intermediate results).
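The two clamping strategies can be compared with a small C++ sketch. It assumes fixed-width types from `<cstdint>`; the names `jl1_term`, `jl1_mask_at_end` and `jl1_mask_each_step` are hypothetical and not from the question. Because the hash is built only from additions and multiplications, the low 32 bits come out the same either way as long as the wide accumulator stays an integer. In PHP, an integer that overflows is converted to a float, so masking after every step is the safer choice there.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: one loop term of the question's hash.
// unsigned char mirrors PHP's ord(), which returns 0..255.
static std::uint64_t jl1_term(unsigned char c, std::size_t index) {
    std::uint64_t s = c;
    if (s % 2 == 0) s += 9999;
    return s * (index + 1) * 0x40ACEFu * 0xFFu;
}

// Accumulate in 64 bits and clamp once at the end.
std::uint32_t jl1_mask_at_end(const std::string& text) {
    const std::string salted = text + "XQ";
    std::uint64_t hash = 0;
    for (std::size_t i = 0; i < salted.size(); ++i)
        hash += jl1_term(static_cast<unsigned char>(salted[i]), i);
    hash += 33ull * 0x40ACEF * 0xFF;
    return static_cast<std::uint32_t>(hash & 0xFFFFFFFFu);
}

// Clamp after every modification, like `$hash &= 0xFFFFFFFF;` in PHP.
std::uint32_t jl1_mask_each_step(const std::string& text) {
    const std::string salted = text + "XQ";
    std::uint64_t hash = 0;
    for (std::size_t i = 0; i < salted.size(); ++i) {
        hash += jl1_term(static_cast<unsigned char>(salted[i]), i);
        hash &= 0xFFFFFFFFu;
    }
    hash += 33ull * 0x40ACEF * 0xFF;
    hash &= 0xFFFFFFFFu;
    return static_cast<std::uint32_t>(hash);
}
```

For "asdf", both functions return `0x20205e`, the value reported for the C++ code in the comments on the question.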

DCoder