26

I need a quick checksum (as fast as possible) for small strings (20-500 chars).

I need the source code, and it must be small (about 100 LOC max).

If it could generate strings in Base32/64 (or something similar), it would be perfect. Basically the checksums cannot use any "bad" chars, you know, the usual (){}[].,;:/+-\| etc.

Clarifications

It could be strong/weak, that really doesn't matter since it is only for behind-the-scenes purposes.

It need not contain all the data of the original string, since I will only be comparing generated checksums with each other. I don't expect any sort of "decryption".

John Boker
Robin Rodricks

10 Answers

27

schnaader's implementation is indeed very fast. Here it is in Javascript:

function checksum(s)
{
  var chk = 0x12345678;
  var len = s.length;
  for (var i = 0; i < len; i++) {
      chk += (s.charCodeAt(i) * (i + 1));
  }

  return (chk & 0xffffffff).toString(16);
}

Using Google Chrome, this function takes just 5ms to run for 1-megabyte strings, versus 330ms using a crc32 function.

joelpt
  • @joelpt: this is faster because you forgot to implement the other part of the function provided by schnaader, the one called `encode_int`. Without this part your checksum would increase indefinitely along with the string length, therefore it would be quite useless as a checksum. – Marco Demaio Apr 05 '11 at 15:43
  • @Marco: I see you're right. The original question mentioned 20-500 character strings and I only tested up to 1MB strings. Do you think adding a modulo operation after each addition would be sufficient for very large strings? TBH I don't understand what schnaader's encode_int is doing. – joelpt Jun 07 '11 at 02:26
  • schnaader's `encode_int`, I think, simply converts an int into a string in hex format. – Marco Demaio Jun 13 '11 at 16:32
  • Why remove the "i+1" from schnaader's implementation? Here you will get the same result whatever the first char is. – Alexis Oct 25 '13 at 20:12
  • I decided not to implement encode_int() as Marco noted because it just converts the checksum from an int to a hex string, which for many use cases isn't actually needed. If it is, you can just "return chk.toString(16);" in order to obtain a hex checksum instead. As Marco noted, my function does return a value that grows as string length grows, and for very long strings it can eventually overflow the JS number type. If this is a problem, you can use "chk = (chk + (s.charCodeAt(i) * (i + 1))) % 0xFFFFFFFF;" in the inner loop to prevent overflow/growth. This makes the function about 10x slower. – joelpt Feb 02 '14 at 18:27
  • Your code allows for a small optimization. You're reading the string length in every loop iteration, while you could simply compare against a variable if you minimally modify your `for` statement to this: `for(var i = 0, l = s.length; i < l; i++)`. – Robert Koritnik Feb 24 '15 at 17:36
  • **And** you don't have to execute the modulo in each iteration. Schnaader converted the whole thing to a hex string after calculating the checksum. You could just change the return statement to `return (chk & 0xffffffff).toString(16);` and you'd practically end up with the same result as his code. And it wouldn't have an order-of-magnitude performance impact. – Robert Koritnik Feb 24 '15 at 18:57
  • @RobertKoritnik great observations. Fixed. – joelpt Aug 13 '16 at 01:19
  • Just so it's obvious: this checksum will not be unique. checksum("231") == checksum("203") == "123457a3" – iamio Jul 19 '17 at 17:37
  • The overflow situation isn't handled properly. The leading bit of mask 0xffffffff does nothing but capture a bit that should not be captured, generating a negative hex value (with inverted magnitude). As we are not aiming for perfection here, it is perhaps satisfactory to just mask with 0x7fffffff. That will keep uniqueness just as well as before up to the overflow limit, and will start cycling through available permutations again. Otherwise the function shouldn't return an invalid result and should throw if the 0x10000000 bit is set. – Blaine Nov 19 '20 at 19:21
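One way to address the growth and sign issues raised in the comments above is to fold the running value back to an unsigned 32-bit integer with `>>> 0`. The following is only a sketch, not joelpt's or schnaader's code: the name `checksumUnsigned` and the zero-padding to 8 hex digits are assumptions made for illustration.

```javascript
// Sketch of an unsigned variant (hypothetical name, not from the original answer).
function checksumUnsigned(s) {
  let chk = 0x12345678;
  for (let i = 0; i < s.length; i++) {
    // ">>> 0" converts to an unsigned 32-bit integer, so the running
    // value never grows and never becomes negative.
    chk = (chk + s.charCodeAt(i) * (i + 1)) >>> 0;
  }
  // Always 8 lowercase hex digits, so the output has a fixed length.
  return chk.toString(16).padStart(8, "0");
}

console.log(checksumUnsigned("231")); // same value as for "203", as iamio noted
```

For short strings this gives the same digits as the function in the answer; the two only differ once the sum reaches bit 31, where the signed mask would produce a leading minus sign.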
6

Here's a Javascript implementation of CRC32:

function crc32 ( str ) {
    // http://kevin.vanzonneveld.net
    // +   original by: Webtoolkit.info (http://www.webtoolkit.info/)
    // +   improved by: T0bsn
    // -    depends on: utf8_encode
    // *     example 1: crc32('Kevin van Zonneveld');
    // *     returns 1: 1249991249

    str = utf8_encode(str);
    var table = "00000000 77073096 EE0E612C 990951BA 076DC419 706AF48F E963A535 9E6495A3 0EDB8832 79DCB8A4 E0D5E91E 97D2D988 09B64C2B 7EB17CBD E7B82D07 90BF1D91 1DB71064 6AB020F2 F3B97148 84BE41DE 1ADAD47D 6DDDE4EB F4D4B551 83D385C7 136C9856 646BA8C0 FD62F97A 8A65C9EC 14015C4F 63066CD9 FA0F3D63 8D080DF5 3B6E20C8 4C69105E D56041E4 A2677172 3C03E4D1 4B04D447 D20D85FD A50AB56B 35B5A8FA 42B2986C DBBBC9D6 ACBCF940 32D86CE3 45DF5C75 DCD60DCF ABD13D59 26D930AC 51DE003A C8D75180 BFD06116 21B4F4B5 56B3C423 CFBA9599 B8BDA50F 2802B89E 5F058808 C60CD9B2 B10BE924 2F6F7C87 58684C11 C1611DAB B6662D3D 76DC4190 01DB7106 98D220BC EFD5102A 71B18589 06B6B51F 9FBFE4A5 E8B8D433 7807C9A2 0F00F934 9609A88E E10E9818 7F6A0DBB 086D3D2D 91646C97 E6635C01 6B6B51F4 1C6C6162 856530D8 F262004E 6C0695ED 1B01A57B 8208F4C1 F50FC457 65B0D9C6 12B7E950 8BBEB8EA FCB9887C 62DD1DDF 15DA2D49 8CD37CF3 FBD44C65 4DB26158 3AB551CE A3BC0074 D4BB30E2 4ADFA541 3DD895D7 A4D1C46D D3D6F4FB 4369E96A 346ED9FC AD678846 DA60B8D0 44042D73 33031DE5 AA0A4C5F DD0D7CC9 5005713C 270241AA BE0B1010 C90C2086 5768B525 206F85B3 B966D409 CE61E49F 5EDEF90E 29D9C998 B0D09822 C7D7A8B4 59B33D17 2EB40D81 B7BD5C3B C0BA6CAD EDB88320 9ABFB3B6 03B6E20C 74B1D29A EAD54739 9DD277AF 04DB2615 73DC1683 E3630B12 94643B84 0D6D6A3E 7A6A5AA8 E40ECF0B 9309FF9D 0A00AE27 7D079EB1 F00F9344 8708A3D2 1E01F268 6906C2FE F762575D 806567CB 196C3671 6E6B06E7 FED41B76 89D32BE0 10DA7A5A 67DD4ACC F9B9DF6F 8EBEEFF9 17B7BE43 60B08ED5 D6D6A3E8 A1D1937E 38D8C2C4 4FDFF252 D1BB67F1 A6BC5767 3FB506DD 48B2364B D80D2BDA AF0A1B4C 36034AF6 41047A60 DF60EFC3 A867DF55 316E8EEF 4669BE79 CB61B38C BC66831A 256FD2A0 5268E236 CC0C7795 BB0B4703 220216B9 5505262F C5BA3BBE B2BD0B28 2BB45A92 5CB36A04 C2D7FFA7 B5D0CF31 2CD99E8B 5BDEAE1D 9B64C2B0 EC63F226 756AA39C 026D930A 9C0906A9 EB0E363F 72076785 05005713 95BF4A82 E2B87A14 7BB12BAE 0CB61B38 92D28E9B E5D5BE0D 7CDCEFB7 0BDBDF21 86D3D2D4 F1D4E242 68DDB3F8 1FDA836E 81BE16CD F6B9265B 6FB077E1 18B74777 88085AE6 FF0F6A70 66063BCA 11010B5C 8F659EFF F862AE69 616BFFD3 166CCF45 A00AE278 D70DD2EE 4E048354 3903B3C2 A7672661 D06016F7 4969474D 3E6E77DB AED16A4A D9D65ADC 40DF0B66 37D83BF0 A9BCAE53 DEBB9EC5 47B2CF7F 30B5FFE9 BDBDF21C CABAC28A 53B39330 24B4A3A6 BAD03605 CDD70693 54DE5729 23D967BF B3667A2E C4614AB8 5D681B02 2A6F2B94 B40BBE37 C30C8EA1 5A05DF1B 2D02EF8D";

    var crc = 0;
    var x = 0;
    var y = 0;

    crc = crc ^ (-1);
    for( var i = 0, iTop = str.length; i < iTop; i++ ) {
        y = ( crc ^ str.charCodeAt( i ) ) & 0xFF;
        x = "0x" + table.substr( y * 9, 8 );
        crc = ( crc >>> 8 ) ^ x;
    }

    return crc ^ (-1);
}

Found at kevin.vanzonneveld.net

Michael Cramer
  • Thanks for your trouble, but Schnaader's little snippet works like a charm, and it's fast. Real fast. – Robin Rodricks May 01 '09 at 13:24
  • Yes, and I like that. As a web developer, it doesn't matter if it is fast; I just can't execute C in a browser :P – jebbie Aug 20 '13 at 14:14
  • Well, this one's not particularly optimised, as it does string manipulation. At least this part could benefit from having an array of values instead of putting them all in a string. So instead of string parsing we'd just be referencing a particular table index, getting directly to the number we need. That would make it faster. **Much much faster**. Like [25 times](http://jsperf.com/string-manipulation-vs-table-index) faster... – Robert Koritnik Feb 24 '15 at 20:33
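The array-based lookup suggested in the last comment could look roughly like the sketch below. It is not the code from the answer: the names `CRC_TABLE` and `crc32Array` are made up here, the table is computed from the standard reflected polynomial 0xEDB88320 instead of being parsed from a string, and `TextEncoder` is assumed to be available for the UTF-8 step.

```javascript
// Build the 256-entry table once, as numbers in a typed array (hypothetical names).
const CRC_TABLE = (() => {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      // Reflected CRC-32 polynomial, one bit at a time.
      c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c >>> 0;
  }
  return table;
})();

function crc32Array(str) {
  const bytes = new TextEncoder().encode(str); // UTF-8 bytes of the input
  let crc = 0xFFFFFFFF;
  for (let i = 0; i < bytes.length; i++) {
    // Direct index into the table; no substr() or hex parsing per byte.
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0; // unsigned 32-bit result
}

console.log(crc32Array("123456789").toString(16)); // standard check value: cbf43926
```

Unlike the function in the answer, this sketch returns an unsigned number, so a caller that compares against the signed results would need to convert one side.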
5

Quick implementation in C, no copyrights from my side, so use it as you wish. But please note that this is a very weak "checksum", so don't use it for serious things :) - but that's what you wanted, isn't it?

This returns a 32-bit integer checksum encoded as a string containing its hex value. If the checksum function doesn't satisfy your needs, you can change the chk += ((int)(str[i]) * (i + 1)); line to something better (e.g. multiplication, addition and bitwise rotation would be much better).

EDIT: Following hughdbrown's advice and one of the answers he linked, I changed the for loop so it doesn't call strlen with every iteration.

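#include <stdio.h>
#include <stdlib.h>

static const char* hextab = "0123456789ABCDEF";

/* Encodes a 32-bit value as 8 hex digits, least significant byte first.
   The caller owns the returned buffer and must free() it. */
char* encode_int(unsigned int i) {
  char* c = (char*)malloc(sizeof(char) * 9);

  for (int j = 0; j < 4; j++) {
    c[(j << 1)] = hextab[((i % 256) >> 4)];
    c[(j << 1) + 1] = hextab[((i % 256) % 16)];

    i = (i >> 8);
  }
  c[8] = 0;

  return c;
}

/* Position-weighted sum of the characters. Unsigned arithmetic is used so
   that overflow wraps around instead of being undefined behaviour. */
unsigned int checksum(const char* str) {
  unsigned int i;
  unsigned int chk = 0x12345678;

  for (i = 0; str[i] != '\0'; i++) {
    chk += ((int)(str[i]) * (i + 1));
  }

  return chk;
}

int main() {
  const char* str1 = "Teststring";
  const char* str2 = "Teststring2";
  char* enc1 = encode_int(checksum(str1));
  char* enc2 = encode_int(checksum(str2));

  printf("string: %s, checksum string: %s\n", str1, enc1);
  printf("string: %s, checksum string: %s\n", str2, enc2);

  free(enc1);
  free(enc2);
  return 0;
}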
schnaader
  • Double the speed of the best CRC32 I could get my hands on :) – Robin Rodricks May 01 '09 at 13:22
  • Thanks so much! I guess that answers my question. Plus, I can change it myself without crashing any delicate algorithm. – Robin Rodricks May 01 '09 at 13:22
  • So if you multiply the value of the first character by 0, then two strings that differ only in the first character have the same hash. And all 0- and 1-length strings have the same hash. – hughdbrown Aug 11 '11 at 20:39
  • @hughdbrown: Oh, good catch. Changing that to multiplying with `i + 1`. – schnaader Aug 11 '11 at 22:10
  • No need to calculate strlen() each time through the loop. Why not write your loop like this? `for (int c, j = 0; 0 != (c = *cp++); ) chk += c * ++j;` – hughdbrown Aug 12 '11 at 20:08
  • In this case I trust the compiler to optimize this for me; it might be clever enough to see that `str`, and so `strlen(str)`, doesn't change. Also, it's more readable that way. – schnaader Aug 13 '11 at 00:58
  • Telling the compiler what you want is more reliable than hoping the compiler will do the right thing. http://stackoverflow.com/questions/2049480/how-many-times-will-strlen-be-called-in-this-for-loop http://stackoverflow.com/questions/3388029/strlen-function – hughdbrown Aug 15 '11 at 11:44
  • OK, thanks for the linked SO questions/answers, I edited the code. Though the code you recommended might still be a bit faster, I decided to use this version to keep it as readable as possible. – schnaader Aug 15 '11 at 12:43
  • Tag: Javascript. Accepted answer: C... :( – Stijn de Witt Sep 03 '17 at 16:09
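The answer mentions that mixing in a bitwise rotation would make the checksum better. Since the question is tagged Javascript, here is one possible sketch of that idea in JavaScript. It is an illustration only, not schnaader's code: the helper `rotl32`, the name `checksumRot` and the rotation amount of 5 bits are arbitrary assumptions.

```javascript
// Rotate a 32-bit value left by r bits (0 < r < 32). Hypothetical helper.
function rotl32(x, r) {
  return ((x << r) | (x >>> (32 - r))) >>> 0;
}

// Same weighted sum as in the answer, but the running value is rotated
// before each addition, so earlier characters are spread over more bits.
function checksumRot(s) {
  let chk = 0x12345678;
  for (let i = 0; i < s.length; i++) {
    chk = (rotl32(chk, 5) + s.charCodeAt(i) * (i + 1)) >>> 0;
  }
  return chk.toString(16).padStart(8, "0");
}

// "231" and "203" collide under the plain weighted sum; here they differ.
console.log(checksumRot("231"), checksumRot("203"));
```

This is still a weak checksum and collisions remain possible; it only removes some of the easy ones that a plain sum has.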
4

Pretty much any algorithm you could come up with would satisfy your criteria. E.g.

CHECKSUM = SUM( i=0 .. input.length, input[i] )

to make it "bad-char-safe"

CHECKSUM = 'A' + SUM( i=0 .. input.length, input[i] ) MODULO 26

An attempt that tries to reduce the number of collisions by increasing the output domain

# Assume BASE64[ ] is the safe output alphabet array.
TMP = SUM( i=0 .. input.length, input[i] ) MODULO 2^24
FOR I = 0..3
    CHECKSUM[I] = BASE64[TMP MODULO 64]
    TMP = TMP / 64

A solution that further reduces the number of collisions by calculating different values for different permutations

# Assume BASE64[ ] is the safe output alphabet array.
TMP = SUM( i=0 .. input.length, i*input[i] ) MODULO 2^24
FOR I = 0..3
    CHECKSUM[I] = BASE64[TMP MODULO 64]
    TMP = TMP / 64

In general, all these variations perform pretty well if the input is random enough and sparse enough (of course, "enough" differs in each case).
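As a rough illustration, the last pseudocode variant could be turned into JavaScript along these lines. Everything named here is an assumption: `SAFE64` is a made-up 64-character alphabet (it presumes "_" and "$" are acceptable in your context), `checksum4` is a hypothetical name, and the weight is `i + 1` rather than `i` so that the first character also contributes.

```javascript
// Hypothetical "safe" alphabet: 26 + 26 + 10 + 2 = 64 characters.
const SAFE64 =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_$";

function checksum4(input) {
  // TMP = SUM((i + 1) * input[i]) MODULO 2^24
  let tmp = 0;
  for (let i = 0; i < input.length; i++) {
    tmp = (tmp + (i + 1) * input.charCodeAt(i)) % 0x1000000;
  }
  // Emit four base-64 digits, least significant first (24 bits = 4 * 6 bits).
  let out = "";
  for (let k = 0; k < 4; k++) {
    out += SAFE64[tmp % 64];
    tmp = Math.floor(tmp / 64);
  }
  return out;
}

console.log(checksum4("ab"), checksum4("ba")); // different permutations, different output
```

With only 24 bits of output, collisions become likely fairly quickly, so this suits the "behind-the-scenes comparison" use from the question rather than anything that needs uniqueness.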

MSalters
  • Good idea, but I need something where even 1 unique character would generate a different checksum. – Robin Rodricks May 01 '09 at 13:08
  • As a footnote, this is an algorithm, not source code, and hence it can't be open-source. If you turn this into source code, you get to decide whether it's open source. – MSalters May 01 '09 at 13:09
  • +1 great explanation, thanks. FYI the 1st solution, the one that does not use `MOD`, would create a checksum that increases indefinitely with the string length, but I suppose you included it to make the explanation easier. – Marco Demaio Apr 05 '11 at 15:32
  • I understood that your function creates a checksum, but do you know why the `Luhn mod N alg` (http://en.wikipedia.org/wiki/Luhn_mod_N_algorithm, in some way similar to your implementation) doubles the value of each even-positioned char/digit during the sum? I mean, do you have an idea why it doesn't simply sum the chars/digits `from i to input.length` like your examples? – Marco Demaio Apr 05 '11 at 15:39
3

It is 207 lines, but it is a javascript implementation of md5:

http://www.webtoolkit.info/javascript-md5.html

Couple that with a javascript base64:

http://www.webtoolkit.info/javascript-base64.html

These scripts are totally self contained, so they have some redundancies (such as the UTF-8 encode/decode) that could easily be made common to them.

EDIT: On the same site you can find a javascript crc32:

http://www.webtoolkit.info/javascript-crc32.html

Mike Boers
  • Or in recent installments you could just load crypto js. https://cdnjs.cloudflare.com/ajax/libs/crypto-js/3.1.2/components/core.js https://cdnjs.cloudflare.com/ajax/libs/crypto-js/3.1.2/rollups/md5.js and run them with md5 = CryptoJS.MD5(string); words = CryptoJS.lib.WordArray.create(md5.words); and you'll have all in hex output in literally 4 lines of code. – JasonXA May 21 '17 at 08:19
1

Works in modern (non-IE) browsers; the page has to be served over HTTPS:

https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/digest

const text = `Lorem ipsum dolor sit amet, consectetur adipiscing elit. In nec elit a justo rhoncus blandit. Aenean commodo sem in mattis fermentum. Phasellus pellentesque tortor lectus, a sodales quam cursus in. Aliquam lorem velit, faucibus id nisi sed, tristique tempor odio. Donec sit amet neque non nunc dictum gravida. Vestibulum in suscipit urna, ac porttitor enim. Integer ultrices feugiat justo vel gravida. Cras viverra laoreet lobortis. Mauris non pharetra purus. Aenean sed elementum justo. Vivamus libero enim, consequat eu convallis in, lacinia ut mi. Curabitur eget diam id augue tincidunt tempus ac eu nisi. Sed dui leo, rhoncus vel egestas non, tempor a magna.

Donec congue vehicula nunc sed vestibulum. Interdum et malesuada fames ac ante ipsum primis in faucibus. Donec sodales scelerisque ullamcorper. Sed iaculis aliquet consectetur. Donec vel purus sodales, interdum velit eget, ultricies arcu. Nullam eu lorem vel sem aliquam congue eu ut felis. Praesent auctor vitae massa venenatis bibendum. Morbi a aliquet enim. Mauris ac nisi lacus. Etiam nec sollicitudin nibh. Sed maximus tortor eget lectus maximus, quis ultrices justo faucibus. Suspendisse potenti. Fusce id consequat mi.

Donec cursus, orci vel malesuada porttitor, nisi orci volutpat ipsum, eget pretium nisl est a lorem. Curabitur et egestas tortor, vel mattis tellus. Suspendisse eget nunc varius, pharetra velit sit amet, viverra est. Pellentesque a vehicula risus, eu tincidunt justo. Sed nec ligula a eros sagittis rhoncus. Vestibulum a nulla erat. Nulla facilisi. Aenean elit diam, scelerisque quis sollicitudin non, feugiat a lorem.

Aliquam lacinia mi diam, ut aliquet libero placerat at. Pellentesque sit amet neque varius, pharetra nunc ac, egestas justo. Fusce at dapibus felis, et imperdiet felis. Nullam fringilla mi ut lorem imperdiet cursus. Morbi venenatis, justo vel efficitur euismod, dui lacus tincidunt neque, vel vestibulum velit lectus et velit. Sed non dolor libero. Vivamus nec ligula a nisl eleifend sollicitudin at id nisl. Sed in vestibulum nisi, sed vestibulum nunc. Donec volutpat eu nisi nec venenatis.

Sed rhoncus ut nisl a tristique. Integer lacus massa, congue fringilla mollis in, pretium vitae lorem. Vivamus porttitor quam nisl, vitae suscipit ante egestas at. Ut sed enim vel ante congue euismod sit amet a elit. Phasellus sed placerat nunc. Aenean turpis tortor, convallis eget leo a, fringilla fringilla nisi. Maecenas euismod sapien ut massa ultricies interdum. Donec suscipit dolor dolor.`;

async function digestMessage(message) {
  // encode as (utf-8) Uint8Array
  const msgUint8 = new TextEncoder().encode(message);
  // hash the message
  const hashBuffer = await crypto.subtle.digest('SHA-256', msgUint8);
  // convert buffer to byte array
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  // convert bytes to hex string
  const hashHex = hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
  return hashHex;
}

// digestMessage() resolves to a hex string, so log that string once it is ready
digestMessage(text)
  .then(digestHex => console.log(digestHex));

The above takes about 4.5ms; a short sentence and this long one made no difference. I didn't try it with a very long string.

Dominic
  • I think this is the best solution, because the work is done by the browser, and you can choose which algorithm to use, such as: SHA-1, SHA-256, SHA-384, or SHA-512. – jalbr74 May 24 '22 at 15:41
1

To use just JavaScript, you could possibly use this CRC function: http://www.webtoolkit.info/javascript-crc32.html

/**
*
*  Javascript crc32
*  http://www.webtoolkit.info/
*
**/

function crc32 (str) {

    function Utf8Encode(string) {
        string = string.replace(/\r\n/g,"\n");
        var utftext = "";

        for (var n = 0; n < string.length; n++) {

            var c = string.charCodeAt(n);

            if (c < 128) {
                utftext += String.fromCharCode(c);
            }
            else if((c > 127) && (c < 2048)) {
                utftext += String.fromCharCode((c >> 6) | 192);
                utftext += String.fromCharCode((c & 63) | 128);
            }
            else {
                utftext += String.fromCharCode((c >> 12) | 224);
                utftext += String.fromCharCode(((c >> 6) & 63) | 128);
                utftext += String.fromCharCode((c & 63) | 128);
            }

        }

        return utftext;
    };

    str = Utf8Encode(str);

    var table = "00000000 77073096 EE0E612C 990951BA 076DC419 706AF48F E963A535 9E6495A3 0EDB8832 79DCB8A4 E0D5E91E 97D2D988 09B64C2B 7EB17CBD E7B82D07 90BF1D91 1DB71064 6AB020F2 F3B97148 84BE41DE 1ADAD47D 6DDDE4EB F4D4B551 83D385C7 136C9856 646BA8C0 FD62F97A 8A65C9EC 14015C4F 63066CD9 FA0F3D63 8D080DF5 3B6E20C8 4C69105E D56041E4 A2677172 3C03E4D1 4B04D447 D20D85FD A50AB56B 35B5A8FA 42B2986C DBBBC9D6 ACBCF940 32D86CE3 45DF5C75 DCD60DCF ABD13D59 26D930AC 51DE003A C8D75180 BFD06116 21B4F4B5 56B3C423 CFBA9599 B8BDA50F 2802B89E 5F058808 C60CD9B2 B10BE924 2F6F7C87 58684C11 C1611DAB B6662D3D 76DC4190 01DB7106 98D220BC EFD5102A 71B18589 06B6B51F 9FBFE4A5 E8B8D433 7807C9A2 0F00F934 9609A88E E10E9818 7F6A0DBB 086D3D2D 91646C97 E6635C01 6B6B51F4 1C6C6162 856530D8 F262004E 6C0695ED 1B01A57B 8208F4C1 F50FC457 65B0D9C6 12B7E950 8BBEB8EA FCB9887C 62DD1DDF 15DA2D49 8CD37CF3 FBD44C65 4DB26158 3AB551CE A3BC0074 D4BB30E2 4ADFA541 3DD895D7 A4D1C46D D3D6F4FB 4369E96A 346ED9FC AD678846 DA60B8D0 44042D73 33031DE5 AA0A4C5F DD0D7CC9 5005713C 270241AA BE0B1010 C90C2086 5768B525 206F85B3 B966D409 CE61E49F 5EDEF90E 29D9C998 B0D09822 C7D7A8B4 59B33D17 2EB40D81 B7BD5C3B C0BA6CAD EDB88320 9ABFB3B6 03B6E20C 74B1D29A EAD54739 9DD277AF 04DB2615 73DC1683 E3630B12 94643B84 0D6D6A3E 7A6A5AA8 E40ECF0B 9309FF9D 0A00AE27 7D079EB1 F00F9344 8708A3D2 1E01F268 6906C2FE F762575D 806567CB 196C3671 6E6B06E7 FED41B76 89D32BE0 10DA7A5A 67DD4ACC F9B9DF6F 8EBEEFF9 17B7BE43 60B08ED5 D6D6A3E8 A1D1937E 38D8C2C4 4FDFF252 D1BB67F1 A6BC5767 3FB506DD 48B2364B D80D2BDA AF0A1B4C 36034AF6 41047A60 DF60EFC3 A867DF55 316E8EEF 4669BE79 CB61B38C BC66831A 256FD2A0 5268E236 CC0C7795 BB0B4703 220216B9 5505262F C5BA3BBE B2BD0B28 2BB45A92 5CB36A04 C2D7FFA7 B5D0CF31 2CD99E8B 5BDEAE1D 9B64C2B0 EC63F226 756AA39C 026D930A 9C0906A9 EB0E363F 72076785 05005713 95BF4A82 E2B87A14 7BB12BAE 0CB61B38 92D28E9B E5D5BE0D 7CDCEFB7 0BDBDF21 86D3D2D4 F1D4E242 68DDB3F8 1FDA836E 81BE16CD F6B9265B 6FB077E1 18B74777 88085AE6 FF0F6A70 66063BCA 11010B5C 8F659EFF F862AE69 616BFFD3 166CCF45 A00AE278 D70DD2EE 4E048354 3903B3C2 A7672661 D06016F7 4969474D 3E6E77DB AED16A4A D9D65ADC 40DF0B66 37D83BF0 A9BCAE53 DEBB9EC5 47B2CF7F 30B5FFE9 BDBDF21C CABAC28A 53B39330 24B4A3A6 BAD03605 CDD70693 54DE5729 23D967BF B3667A2E C4614AB8 5D681B02 2A6F2B94 B40BBE37 C30C8EA1 5A05DF1B 2D02EF8D";

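    // Declare crc locally; an undeclared assignment would create a global
    // that keeps its value between calls and corrupts later results.
    var crc = 0;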
    var x = 0;
    var y = 0;

    crc = crc ^ (-1);
    for( var i = 0, iTop = str.length; i < iTop; i++ ) {
        y = ( crc ^ str.charCodeAt( i ) ) & 0xFF;
        x = "0x" + table.substr( y * 9, 8 );
        crc = ( crc >>> 8 ) ^ x;
    }

    return crc ^ (-1);

};

In PHP you could do it in one line with an md5 hash; on a string of length 20-500 it should be pretty fast.

$hash_code = md5($string_to_hash);

Here's some more info: http://us.php.net/md5

Also, if you want to choose your hashing algorithm, you could use the PHP hash function: http://us.php.net/manual/en/function.hash.php

John Boker
0

This may seem a bit late and partly off topic... I used joelpt's JS-implementation of schnaader's solution in an application based on JS and PHP. And, I assume that the PHP implementation might be helpful for others, as well.

function checksum($s) {
    $chk = 0x12345678;
    $len = mb_strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $chk += (mb_ord(mb_substr($s, $i, 1)) * ($i + 1));
    }
    return dechex($chk & 0xffffffff);
}

if (!function_exists('mb_ord')) {
    // Shim from https://stackoverflow.com/a/1365610/336311
    function mb_ord($string) {
        mb_internal_encoding('UTF-8');
        // mb_language('Neutral');
        // mb_detect_order(['UTF-8', 'ISO-8859-15', 'ISO-8859-1', 'ASCII']);

        $result = unpack('N', mb_convert_encoding($string, 'UCS-4BE', 'UTF-8'));

        if (is_array($result)) {
            return $result[1];
        }
        return ord($string);
    }
}
BurninLeo
0

What you are looking for is an algorithm to create a hash code for a string. In C#:

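// Requires: using System; using System.Security.Cryptography; using System.Text;
byte[] bytesToHash = Encoding.UTF8.GetBytes(stringToHash);
HashAlgorithm sha = new SHA1CryptoServiceProvider();
// Hash the UTF-8 bytes computed above
byte[] hash = sha.ComputeHash(bytesToHash);
string result = Convert.ToBase64String(hash);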
Ronald Wildenberg
0

md5sum, sha1sum, sha224sum, sha256sum, sha384sum and sha512sum are available on most *nix distributions.

Andrew Sledge