0

I want to create a compression scheme for 2-bit numbers such that it will reduce the size of any sequence by at least one bit. How can I prove this is not possible?

JWWalker
  • 22,385
  • 6
  • 55
  • 76
benji_r
  • 535
  • 2
  • 16
  • 34
  • 2
    Of course it's possible to always reduce the size by one bit. What is impossible is losslessly reversing the process. – Mark Adler Feb 16 '13 at 04:37

1 Answers1

3

There are 4 possible two-bit numbers and 3 possible shorter bit sequences (the empty sequence of bits and the sequences 0 and 1). By the pigeonhole principle, this means that any mapping from two-bit sequences to shorter sequences must have at least two sequences compressed to the same shorter sequence. As a result, when you want to decompress this shorter sequence, you will not be able to do so because you won't know which of the original two-bit sequences it came from.

This can be generalized to show that n-bit sequences cannot be losslessly compressed to bit sequences of length less than n. This earlier answer details why this is.

Hope this helps!

Community
  • 1
  • 1
templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • 2
    You mean "...must have at least two sequences compressed to _the same_ shorter sequence", I think. (Obvious to anyone who already knows the argument, but perhaps not obvious to a new reader.) – Nemo Feb 16 '13 at 03:18
  • @templatetypedef In most cases if you compress things you have to get rid of stuff, then how could you compress anything and bring back to the original form? – benji_r Feb 16 '13 at 23:12
  • @benbar- In many cases data is highly redundant and it's possible to compress the data by eliminating the redundancy – templatetypedef Feb 17 '13 at 22:15