I assume there is no added protection at all.
-
Please make sure to check out possible side-channel attacks using compression: https://stackoverflow.com/a/30644897/2650622 – Agost Biro Feb 16 '23 at 09:09
9 Answers
There is no difference in the security provided, but because of the way compression algorithms work, you are probably going to get better compression if you compress first then encrypt.
Compression algorithms exploit statistical redundancies in the data (such as those that exist in natural language or in many file formats). Encryption should eliminate those redundancies, so an encrypted message should not compress well.
From the Wikipedia article:
However, lossless data compression algorithms will always fail to compress some files; indeed, any compression algorithm will necessarily fail to compress any data containing no discernible patterns. Attempts to compress data that has been compressed already will therefore usually result in an expansion, as will attempts to compress all but the most trivially encrypted data.
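A rough way to see the size difference for yourself is sketched below. Python's standard library has no AES, so this assumes a toy stand-in cipher (`toy_stream_encrypt`, a hypothetical helper that XORs the data with a SHA-256 counter keystream). It is only there to produce random-looking output and is not a secure cipher; the key and sample text are made up for the demo.

```python
import hashlib
import zlib


def toy_stream_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Illustration only, NOT secure."""
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip() stops at len(data), so any extra keystream bytes are ignored.
    return bytes(b ^ k for b, k in zip(data, keystream))


key = b"hypothetical demo key"
plaintext = b"The quick brown fox jumps over the lazy dog. " * 200

# Order 1: compress the redundant plaintext, then encrypt the small result.
compress_then_encrypt = toy_stream_encrypt(key, zlib.compress(plaintext))
# Order 2: encrypt first, then try to compress the random-looking ciphertext.
encrypt_then_compress = zlib.compress(toy_stream_encrypt(key, plaintext))

print("plaintext:            ", len(plaintext))
print("compress then encrypt:", len(compress_then_encrypt))
print("encrypt then compress:", len(encrypt_then_compress))
```

With highly repetitive input like this, the first order should come out far smaller than the plaintext, while the second should be about the same size as the plaintext or slightly larger.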

-
Hmm, so if it's a lossless compression then it's not worth compressing first? – john Dec 09 '10 at 15:49
-
@john: no, if it's lossless compression then it is not worth compressing *after*. – Reese Moore Dec 09 '10 at 15:52
-
Ahh, the patterns inserted by lossless compression should be hidden by the encryption-- so best to do compression first. – john Dec 09 '10 at 16:01
-
Incidentally, compression will also reduce the size of the plaintext - this may be an infinitesimal increase in security. – Piskvor left the building Dec 09 '10 at 16:10
-
@john: And if you use *lossy* compression *after* then you will be unable to decrypt. – President James K. Polk Dec 09 '10 at 23:29
-
Compression before encryption can help against some attacks. In particular a known plaintext attack where the plaintext that is known is within other unknown plaintext such that the compressed version of the known text is affected by the unknown plaintext. – Slartibartfast Dec 10 '10 at 20:05
-
@Slartibartfast: In my response I assumed that the encryption scheme was resilient to certain cryptanalytic attacks (known-plaintext, frequency analysis, etc). If the encryption scheme is weak to these, then yes, you might gain some security from the compression, but if the system is weak to these attacks then that is a problem in the encryption scheme. – Reese Moore Dec 10 '10 at 20:17
-
There is a benefit in security by compressing first. The lower the entropy of the plain-text, the easier it potentially is to break the encryption. Cribs and other known-plaintext attacks can be leveraged. It is all well and good to say "assumed the encryption scheme was resilient to [these] attacks" but that only helps you theoretically, not practically. Anyway, compression doesn't really work on encrypted files since they have had most redundancy structure stripped from them anyways. Always compress first. (Some content from: "Applied Cryptography, 2nd Ed.", Bruce Schneier, 1996) – Eadwacer Dec 20 '10 at 22:19
-
@GregS: There is no such thing as a lossy compression for arbitrary data, as it's impossible to know what data can and can't be removed – BlueRaja - Danny Pflughoeft Mar 15 '11 at 19:58
-
There *is* a difference in the security provided. Compressing first reveals the compressibility of the original message. For example, it's possible to determine a user's location on a map by examining the size of the map tiles being sent to his or her device. – vroomfondel Feb 05 '18 at 06:01
-
This answer is out of date and should be removed. See John Mellor's answer – jvdh Aug 06 '20 at 14:29
Warning: if an attacker controls part of the plaintext that gets compressed, and can observe the size of the resulting encrypted ciphertext, they may be able to deduce the rest of the plaintext, by adjusting the part that they control until the length of the ciphertext decreases (which implies that there was some repetition between the part of the plaintext they control and the secret part of the plaintext).
See https://en.wikipedia.org/wiki/CRIME for example.
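The length leak can be sketched in a few lines. This is a simplified illustration, not the actual CRIME attack: the request layout, the cookie value and the candidate guesses are all hypothetical, and it assumes a length-preserving cipher, so the compressed length stands in for the ciphertext length the attacker observes. Real attacks typically recover the secret piece by piece rather than comparing whole guesses.

```python
import zlib

SECRET_COOKIE = b"sessionid=7f3a9c2e41b8d605"  # hypothetical secret value


def observed_length(attacker_input: bytes) -> int:
    """Length an eavesdropper would see, assuming a length-preserving cipher."""
    request = (
        b"GET /search?q=" + attacker_input + b" HTTP/1.1\r\n"
        + b"Cookie: " + SECRET_COOKIE + b"\r\n\r\n"
    )
    return len(zlib.compress(request))


# The attacker controls the query string and tries several guesses.
candidates = [
    b"sessionid=c81e04b7a95d2f63",
    b"sessionid=7f3a9c2e41b8d605",
    b"sessionid=2b6d90f15ea347c8",
]
lengths = {guess: observed_length(guess) for guess in candidates}
best_guess = min(lengths, key=lengths.get)

for guess, length in lengths.items():
    print(guess.decode(), length)
print("shortest:", best_guess.decode())
```

The guess that matches the secret lets the compressor replace the whole cookie with a single back-reference, so it produces the shortest output, and that is all the attacker needs to observe.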

-
Are there compression algorithms which avoid this attack vector? Or should you just pad the encrypted data? – Lupilum May 16 '18 at 19:10
-
No, compress-then-encrypt is generally considered a bad practice. This is why TLS 1.3 does not use compression anymore – jvdh Aug 06 '20 at 14:30
-
This should be higher. The side channel vulnerability was found in Threema recently: https://breakingthe3ma.app/files/Threema-PST22.pdf – Agost Biro Feb 15 '23 at 12:17
Encryption works better on short messages, with a uniform distribution of symbols. Compression replaces a message with a non-uniform distribution of symbols by another, shorter sequence of symbols that are more uniformly distributed.
Therefore, it's mathematically safer to compress before encryption. Compression after encryption doesn't affect the encryption, which remains relatively weak due to the non-uniform distribution of plaintext.
Of course, if you use anything like AES256, and the NSA isn't after you, this is all theory.

-
Why is it mathematically safer to compress before encryption? Could you cite your source, please? My understanding has always been that for a good encryption algorithm, such as AES, you cannot distinguish ciphertext from random data. – Steve Jan 22 '13 at 14:41
-
@Steve: Modern attacks against cyphers assume that you have at least some idea of the possible plaintexts. Since well-compressed data is _also_ indistinguishable from random data, it makes it hard to impossible to validate whether you made any progress on breaking the key. And if you don't know whether you made any progress, you have to revert to brute-forcing the decryption. In the extreme, when you're encrypting truly random data with a perfect encryption algorithm, there is no way to determine whether any key is correct. This is for instance trivially true with a One Time Pad. – MSalters Jan 22 '13 at 15:22
-
But compressed data is *not* indistinguishable from random. If you assume the attacker knows the plaintext, or can submit the plaintext for compression & encryption, the attacker can also work out the compressed version of the plaintext. There is no additional randomness (or security) added by compressing the data first. In fact, compressing truly random data would give an attacker some idea of the data because they can start looking for compression headers. – Steve Jan 22 '13 at 15:37
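The point about compression headers is easy to check. The sketch below assumes the zlib container format as the example (other formats such as gzip have their own recognizable header) and compresses truly random bytes. The output still begins with a predictable, checkable header, which is structure that random data would not have.

```python
import os
import zlib

random_bytes = os.urandom(256)
compressed = zlib.compress(random_bytes)

# A zlib stream starts with two header bytes, CMF and FLG.
cmf, flg = compressed[0], compressed[1]
print(f"first two bytes: {cmf:02x} {flg:02x}")
# The low 4 bits of CMF give the compression method; 8 means DEFLATE.
print("compression method is DEFLATE:", cmf & 0x0F == 8)
# The header is built so that CMF*256 + FLG is a multiple of 31.
print("header checksum valid:", (cmf * 256 + flg) % 31 == 0)
print("size change in bytes:", len(compressed) - len(random_bytes))
```

A candidate decryption that starts with a valid header like this is a strong hint that the key is right, which works against the idea that compressed plaintext is indistinguishable from random data.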
You should compress before encrypting.
Encryption turns your data into high-entropy data, usually indistinguishable from a random stream. Compression relies on finding patterns that can be factored out in order to gain any size reduction. Since properly done encryption destroys such patterns, a compression algorithm applied to encrypted data will give you little (if any) reduction in size.
Compression before encryption also slightly increases your practical resistance against differential cryptanalysis (and certain other attacks) if the attacker can only control the uncompressed plaintext, since the resulting output may be difficult to deduce.

There is no added security (as compression is not a security mechanism), but a properly encrypted message shouldn't be easily compressible (i.e. rule of thumb: if you can significantly compress an encrypted message, something is wrong).
Therefore, compress then encrypt.

Look here: Super User thread about compression && encryption or the other way around
They have a complete and detailed answer to your question (which is compress then encrypt, by the way).

According to William Stallings's book "Network Security Essentials: Applications and Standards, 4th edition", published by Pearson, in Chapter 7, page 227:
As a default, PGP compresses the message after applying the signature but before encryption. This has the benefit of saving space both for e-mail transmission and for file storage. The placement of the compression algorithm, indicated by Z for compression and Z⁻¹ for decompression in Figure 7.1, is critical.
1. The signature is generated before compression for two reasons:
- It is preferable to sign an uncompressed message so that one can store only the uncompressed message together with the signature for future verification. If one signed a compressed document, then it would be necessary either to store a compressed version of the message for later verification or to recompress the message when verification is required.
- Even if one were willing to generate dynamically a recompressed message for verification, PGP’s compression algorithm presents a difficulty. The algorithm is not deterministic; various implementations of the algorithm achieve different tradeoffs in running speed versus compression ratio and, as a result, produce different compressed forms. However, these different compression algorithms are interoperable because any version of the algorithm can correctly decompress the output of any other version. Applying the hash function and signature after compression would constrain all PGP implementations to the same version of the compression algorithm.
2. Message encryption is applied after compression to strengthen cryptographic security. Because the compressed message has less redundancy than the original plaintext, cryptanalysis is more difficult.
Although these reasons are given for PGP, they extend readily to other schemes. So it's better to compress before encryption.
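The quoted point that different compressor settings yield different compressed forms of the same message can be checked with a short sketch. It uses Python's `zlib` at two compression levels as a stand-in for "different implementations", and SHA-256 as a stand-in for the hash a signature would cover. Both choices are assumptions for illustration, not what any particular PGP implementation does.

```python
import hashlib
import zlib

message = b"Signed message body. " * 50

fast = zlib.compress(message, level=1)   # favors running speed
small = zlib.compress(message, level=9)  # favors compression ratio

print("compressed outputs identical:", fast == small)
print("both decompress to the original:",
      zlib.decompress(fast) == message and zlib.decompress(small) == message)

# A hash over the uncompressed message is stable across compressor settings;
# a hash over the compressed form depends on which setting produced it.
print("hash of message:   ", hashlib.sha256(message).hexdigest()[:16])
print("hash of fast form: ", hashlib.sha256(fast).hexdigest()[:16])
print("hash of small form:", hashlib.sha256(small).hexdigest()[:16])
```

Signing the uncompressed message means a verifier only needs the original bytes, whichever compressor the sender or a later archiver happened to use.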

There is no difference in security provided.

-
Thanks, I can think of a few reasons why compressing first would be a wiser choice. I just wanted to make sure I wasn't affecting the security. Thanks again!! – john Dec 09 '10 at 15:27