I'm working with strings that are hundreds of thousands of lines long right now. Is there any way I could compress a string into something that looks like an MD5 hash, then uncompress it?
7
-
Not using a hash function like MD5, SHA, etc. It would need to be an encryption-quality function, or more likely, an actual compression algorithm. – Jared Farrish Jun 01 '12 at 22:00
-
Firstly, MD5 isn't compression; it's a hash, which is one-way only. Secondly, MD5 is old and broken, so don't use it. You say you want compression, not encryption, so consider using something like GZIP. – Jeff Watkins Jun 01 '12 at 22:01
-
What should be used instead of MD5? Why is it broken? – Ash Burlaczenko Jun 01 '12 at 22:04
-
@AshBurlaczenko - MD5 is an "irreversible" hashing function not meant for compression (or encryption). It's meant to be one-way in practice. – Jared Farrish Jun 01 '12 at 22:05
-
SHA2 is considered safe by most security agencies. MD5 can be broken in minutes as it's not collision proof (actually trivially easy to produce collisions). There are sites where you can reverse engineer an MD5 hash by posting it. It's just not safe. However it's only useful for hashing operations, not for compression or two way encryption and decryption. – Jeff Watkins Jun 01 '12 at 22:05
-
The question was asked, "why is it broken?", and I answered that :) – Jeff Watkins Jun 01 '12 at 22:07
-
@JaredFarrish, yeah my bad, I'm not using the site's awesome power correctly. – Jeff Watkins Jun 01 '12 at 22:09
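To make the distinction in the comments above concrete, here is a minimal sketch. The sample inputs are made up for illustration. It shows that an MD5 digest has the same length whatever the input size, so the original cannot be recovered from it, whereas a compression function round-trips the data.

```php
<?php
// Hypothetical sample inputs, chosen only for illustration.
$short = 'a';
$long  = str_repeat("some very long line of text\n", 10000);

$shortHash = md5($short);
$longHash  = md5($long);

// Both digests are 32 hex characters, regardless of input size.
echo strlen($shortHash), "\n";
echo strlen($longHash), "\n";

// Compression, by contrast, is reversible.
$restored = gzuncompress(gzcompress($long));
var_dump($restored === $long);
```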
2 Answers
14
Yes, you can compress and uncompress strings in PHP (Demo):
$str = 'Hello I am a very very very very long string';
// Compress at the maximum level (9), then restore the original.
$compressed = gzcompress($str, 9);
$uncompressed = gzuncompress($compressed);
echo $str, "\n";
echo $uncompressed, "\n";
// The compressed data is binary, so encode it before printing it as text.
echo base64_encode($compressed), "\n";
echo bin2hex($compressed), "\n";
echo urlencode($compressed), "\n";
However, MD5 is not compression but hashing.
See as well: How to compress/decompress a long query string in PHP?
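The snippet above prints encoded forms of the compressed data but does not show the way back. A minimal sketch of the full round trip through `base64_encode`, assuming a made-up sample input, could look like this:

```php
<?php
// Hypothetical sample input for illustration.
$original = str_repeat("line of text\n", 1000);

// Compress, then encode so the result is safe to store or send as text.
$packed = base64_encode(gzcompress($original, 9));

// Reverse the steps in the opposite order: decode first, then uncompress.
$decoded = base64_decode($packed, true); // strict mode returns false on invalid input
if ($decoded === false) {
    throw new RuntimeException('Input is not valid base64');
}
$restored = gzuncompress($decoded);
if ($restored === false) {
    throw new RuntimeException('Input is not valid zlib data');
}

var_dump($restored === $original);
```

The order matters: the text encoding is applied last when packing, so it has to be removed first when unpacking.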
-
Just thought I'd mention that zipping and unzipping "per request" won't result in any savings, as opposed to zipping, then working, then unzipping at some later time. At least without testing, I doubt zipping is always the best approach. – Jared Farrish Jun 01 '12 at 22:12
-
Sure, the string in the example gets no benefit from compression. The following is a better example: http://codepad.viper-7.com/JtvTCk - it only makes sense if the compressed string is shorter than the original. – hakre Jun 01 '12 at 22:15
-
To clarify, I meant MD5 in terms of how the string looks, etc. I didn't mean the hash itself. But thank you. – Jake Jun 01 '12 at 22:16
-
For that you can use `bin2hex`; however, I suggest you use `base64_encode` because the resulting string will be shorter. – hakre Jun 01 '12 at 22:26
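The points raised in these comments can be checked with a small sketch. The inputs are hypothetical: a short string, where the compression overhead outweighs any gain, and a long repetitive one, where compression pays off. It also compares the length of the `bin2hex` and `base64_encode` forms.

```php
<?php
// Hypothetical inputs for illustration.
$short = 'Hello';
$long  = str_repeat("The quick brown fox jumps over the lazy dog.\n", 1000);

$compressedShort = gzcompress($short, 9);
$compressedLong  = gzcompress($long, 9);

// Short input: the format overhead makes the result longer than the input.
printf("short: %d -> %d bytes\n", strlen($short), strlen($compressedShort));
// Long repetitive input: the result is much smaller than the input.
printf("long:  %d -> %d bytes\n", strlen($long), strlen($compressedLong));

// Text encodings of the same binary data.
$hex = bin2hex($compressedLong);       // 2 characters per byte
$b64 = base64_encode($compressedLong); // about 4 characters per 3 bytes
printf("hex: %d chars, base64: %d chars\n", strlen($hex), strlen($b64));
```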
3
Take a look at the ZLib functions provided with PHP: http://us.php.net/manual/en/ref.zlib.php
You can use a combination of gzencode() and gzdecode(), or a combination of gzdeflate() and gzinflate(), or a combination of gzcompress() and gzuncompress(). Just remember to use the decompression function which matches the compression function you used, as all three of these pairs of functions return (or accept) slightly different data.
You will probably need to do some real-world tests to determine which of these pairs is best for you. Good luck!
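As a starting point for such a test, here is a minimal sketch that runs the same made-up input through all three pairs and prints the compressed sizes. The sample data and variable names are illustrative only, and the array destructuring in `foreach` assumes PHP 7.1 or later.

```php
<?php
// Hypothetical sample input; substitute real data for a meaningful test.
$data = str_repeat("example line\n", 500);

// Each compression function paired with its matching decompression function.
$pairs = [
    'gzdeflate/gzinflate'   => ['gzdeflate', 'gzinflate'],     // raw DEFLATE data
    'gzcompress/gzuncompress' => ['gzcompress', 'gzuncompress'], // zlib format
    'gzencode/gzdecode'     => ['gzencode', 'gzdecode'],       // gzip format
];

$sizes = [];
foreach ($pairs as $label => [$compress, $decompress]) {
    $packed   = $compress($data, 9);
    $restored = $decompress($packed);
    if ($restored !== $data) {
        throw new RuntimeException("Round trip failed for $label");
    }
    $sizes[$label] = strlen($packed);
    printf("%-24s %d bytes\n", $label, $sizes[$label]);
}
```

The size differences come from the wrapper each format adds around the compressed data, so on large inputs the three results tend to be close; which pair fits best likely depends more on what will read the data later.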

Jazz