
Does anyone have a better compression algorithm that would allow random reads/writes?

I think you could use any compression algorithm if you write the data in blocks, but ideally I would rather not have to decompress a whole block at a time. If you have suggestions for an easy way to do this, and for how to find the block boundaries, please let me know. If blocks are part of your solution, please also explain what you do when the data you want to read spans a block boundary.

In the context of your answers, please assume the file in question is 100GB; sometimes I'll want to read the first 10 bytes, sometimes the last 19 bytes, and sometimes 17 bytes in the middle.

King User
    Have you thought about just writing the first ten bytes, last nineteen bytes and middle seventeen bytes to a separate uncompressed file? Surely forty-six bytes isn't too onerous :-) – paxdiablo Jun 03 '15 at 06:42

1 Answer


Have these people never heard of "compressed file systems", which have been around since before Microsoft was sued in 1993 by Stac Electronics over compressed file system technology?

I hear that LZS and LZJB are popular algorithms for people implementing compressed file systems, which necessarily require both random-access reads and random-access writes.

Perhaps the simplest and best thing to do is to turn on file system compression for that file, and let the OS deal with the details. But if you insist on handling it manually, perhaps you can pick up some tips by reading about NTFS transparent file compression.
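
If you do go the manual route, here is a minimal sketch of the block idea from the question. It is not how NTFS or any particular file system does it; it just assumes fixed-size uncompressed blocks, each compressed independently with zlib, plus an index of where every compressed block starts. The names `compress_blocks`, `read_range` and `BLOCK_SIZE` are made up for illustration, and everything is held in memory for brevity; for a 100GB file you would keep the index in a header or sidecar file and `seek` to the block instead of slicing.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # uncompressed bytes per block (an assumed value)


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size block independently; return (blob, index).

    index[i] is the offset in blob where compressed block i starts, and
    index[-1] is len(blob), so block i occupies blob[index[i]:index[i + 1]].
    """
    parts, index, pos = [], [0], 0
    for start in range(0, len(data), block_size):
        chunk = zlib.compress(data[start:start + block_size])
        parts.append(chunk)
        pos += len(chunk)
        index.append(pos)
    return b"".join(parts), index


def read_range(blob, index, offset, length, block_size=BLOCK_SIZE):
    """Return `length` uncompressed bytes starting at uncompressed `offset`."""
    if length <= 0:
        return b""
    # Because uncompressed blocks have a fixed size, the block number is
    # plain integer division; no scanning is needed to find the boundary.
    first = offset // block_size
    last = min((offset + length - 1) // block_size, len(index) - 2)
    out = bytearray()
    # A read that spans a boundary simply decompresses every block it touches.
    for i in range(first, last + 1):
        out += zlib.decompress(blob[index[i]:index[i + 1]])
    skip = offset - first * block_size
    return bytes(out[skip:skip + length])
```

A 17-byte read then costs at most two block decompressions, so a smaller block size trades compression ratio for cheaper reads. Writes are the harder part: a modified block usually recompresses to a different size, so it either has to be relocated or the layout needs slack space, which is the bookkeeping a compressed file system does for you.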

Aditya Giri
    If you read those answers which say "impossible", I think you'll discover that the issue of dispute is over terminology. Everyone is agreed that you can have a file format where, if you want the 10000th byte, you can find the chunk which contains that byte, and read through just that one chunk until you get the 10000th byte. Not everyone considers that to be "random-access", which is what the question specified. – King User Jun 03 '15 at 06:40