
In Java, the default AES/GCM provider, SunJCE, will - during the decryption process - internally buffer either 1) the encrypted bytes used as input or 2) the decrypted bytes produced as output. Application code doing decryption will notice that Cipher.update(byte[]) returns an empty byte array and Cipher.update(ByteBuffer, ByteBuffer) returns a written length of 0. Then, when the process completes, Cipher.doFinal() returns all the decrypted bytes.
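A minimal, self-contained sketch of that behavior (class name and test string are mine; the 12-byte IV and 128-bit tag are just conventional GCM parameters):

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class GcmBufferingDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        GCMParameterSpec spec = new GCMParameterSpec(128, iv);

        byte[] plaintext = "The quick brown fox jumps over the lazy dog"
                .getBytes(StandardCharsets.UTF_8);

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, spec);
        byte[] ciphertext = enc.doFinal(plaintext); // ciphertext || 16-byte tag

        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, spec);

        // Feed the ciphertext in two halves; with SunJCE, update() buffers
        // internally and typically hands back no plaintext at all.
        byte[] out1 = dec.update(ciphertext, 0, ciphertext.length / 2);
        byte[] out2 = dec.update(ciphertext, ciphertext.length / 2,
                                 ciphertext.length - ciphertext.length / 2);
        System.out.println("update #1 returned "
                + (out1 == null ? 0 : out1.length) + " bytes");
        System.out.println("update #2 returned "
                + (out2 == null ? 0 : out2.length) + " bytes");

        // Only doFinal() releases the (now authenticated) plaintext.
        byte[] recovered = dec.doFinal();
        System.out.println("doFinal returned " + recovered.length + " bytes");
        if (!new String(recovered, StandardCharsets.UTF_8)
                .equals("The quick brown fox jumps over the lazy dog")) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("round trip OK");
    }
}
```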

The first question is: which bytes are being buffered, number 1 or number 2 above?

I assume the buffering occurs only during decryption and not during encryption, for two reasons. First, the problems that arise from this buffering (described shortly) never occur in my Java client, which encrypts files read from disk; they always occur on the server side, which receives those files and decrypts them. Second, it is said so here. Judging only by my own experience, I cannot be sure, because my client uses a CipherOutputStream and does not explicitly invoke methods on the Cipher instance. Hence I cannot deduce whether internal buffering is used, since I cannot see what the update and final methods return.

My real problems arise when the encrypted files I transmit from client to server become large. By large I mean over 100 MB.

What happens then is that Cipher.update() throws an OutOfMemoryError - obviously due to the internal buffer growing and growing.

Also, despite the internal buffering and no result bytes being returned from Cipher.update(), Cipher.getOutputSize(int) continuously reports an ever-growing target buffer length. Hence, my application code is forced to allocate an ever-growing ByteBuffer that is fed into Cipher.update(ByteBuffer, ByteBuffer). If I try to cheat and pass in a byte buffer with a smaller capacity, the update method throws a ShortBufferException #1. Knowing I create huge byte buffers for no use is quite demoralizing.

Given that internal buffering is the root of all evil, the obvious solution for me is to split the files into chunks of, say, 1 MB each - I never have problems sending small files, only large ones. But I struggle to understand why the internal buffering happens in the first place.
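A rough sketch of that chunking idea, under my own assumptions (1 MB chunk size, and a per-chunk IV built from an 8-byte random per-file salt plus a 4-byte chunk counter, so the IV never repeats under the same key). Each chunk is its own GCM message with its own tag:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class ChunkedGcm {
    static final int CHUNK = 1024 * 1024; // hypothetical 1 MB chunk size
    static final int TAG_BITS = 128;

    // Unique 12-byte IV per chunk: 8-byte per-file salt + 4-byte chunk counter.
    static byte[] iv(byte[] salt, int chunkNo) {
        return ByteBuffer.allocate(12).put(salt).putInt(chunkNo).array();
    }

    static List<byte[]> encrypt(SecretKey key, byte[] salt, byte[] data) throws Exception {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0, no = 0; off < data.length; off += CHUNK, no++) {
            int len = Math.min(CHUNK, data.length - off);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv(salt, no)));
            chunks.add(c.doFinal(data, off, len)); // each chunk carries its own tag
        }
        return chunks;
    }

    static byte[] decrypt(SecretKey key, byte[] salt, List<byte[]> chunks) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int no = 0; no < chunks.size(); no++) {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv(salt, no)));
            out.write(c.doFinal(chunks.get(no))); // authenticates each chunk before release
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();
        byte[] salt = new byte[8];
        new SecureRandom().nextBytes(salt);

        byte[] data = new byte[2_500_000]; // spans three chunks
        new SecureRandom().nextBytes(data);

        List<byte[]> chunks = encrypt(key, salt, data);
        byte[] back = decrypt(key, salt, chunks);
        if (!Arrays.equals(data, back)) throw new AssertionError();
        System.out.println("chunks=" + chunks.size() + ", round trip OK");
    }
}
```

One caveat with this sketch: reordered chunks fail to decrypt (the counter is baked into the IV), but silently dropping trailing chunks would go unnoticed, so a real protocol should also authenticate the expected chunk count.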

The previously linked SO answer says that GCM's authentication tag is "added at the end of the ciphertext", but that it "does not have to be put at the end", and that this practice is what "messes up the online nature of GCM decryption".

Why is it that putting the tag at the end only messes up the server's job of doing decryption?

Here is how my reasoning goes. To compute an authentication tag, or MAC if you will, the client uses some kind of hash function. Apparently, MessageDigest.update() does not use an ever-growing internal buffer.
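That claim about MessageDigest can be checked directly. The sketch below (SHA-256 chosen arbitrarily; GCM's actual tag is computed with GHASH, not a plain hash, but the streaming principle is the same) shows that feeding the data in small pieces yields the same digest as one big call, using constant memory:

```java
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.Random;

public class StreamingDigestDemo {
    public static void main(String[] args) throws Exception {
        byte[] data = new byte[1_000_000];
        new Random(42).nextBytes(data); // deterministic test data

        // One-shot digest of the whole array.
        byte[] oneShot = MessageDigest.getInstance("SHA-256").digest(data);

        // Streaming digest: 4 KB pieces, constant memory regardless of input size.
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (int off = 0; off < data.length; off += 4096) {
            md.update(data, off, Math.min(4096, data.length - off));
        }
        byte[] streamed = md.digest();

        if (!Arrays.equals(oneShot, streamed)) throw new AssertionError();
        System.out.println("digests match");
    }
}
```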

Then, on the receiving end, can the server not do the very same thing? For starters, he can decrypt the bytes, albeit unauthenticated ones, feed them into his update function of the hash algorithm, and when the tag arrives, finish the digest and verify the MAC that the client sent.

I'm not a crypto guy so please speak to me as if I am both dumb and crazy but loving enough to care some for =) I thank you wholeheartedly for the time taken to read through this question and perhaps even shed some light!

UPDATE #1

I don't use AD (Associated Data).

UPDATE #2

Wrote software that demonstrates AES/GCM encryption using Java, as well as the Secure Remote Password (SRP) protocol and binary file transfers in Java EE. The front-end client is written in JavaFX and can be used to dynamically change the encryption configuration or send files in chunks. At the end of a file transfer, some statistics are presented about the time used to transfer the file and the time for the server to decrypt it. The repository also contains a document with some of my own GCM- and Java-related research.

Enjoy: https://github.com/MartinanderssonDotcom/secure-login-file-transfer/


#1

It is interesting to note that if my server, who does the decryption, does not handle the cipher himself but instead uses a CipherInputStream, then no OutOfMemoryError is thrown. Instead, the client manages to transfer all bytes over the wire, but somewhere during the decryption, the request thread hangs indefinitely and I can see that one Java thread (might be the same thread) fully utilizes a CPU core, all while leaving the file on disk inaccessible and with a reported file size of 0. Then, after a tremendously long time, the Closeable source is closed and my catch clause manages to catch an IOException caused by: "javax.crypto.AEADBadTagException: Input too short - need tag".

What renders this situation weird is that transmitting smaller files works flawlessly with the exact same piece of code - so obviously the tag can be properly verified. The problem must have the same root cause as when using the cipher explicitly, i.e. an ever-growing internal buffer. I cannot track on the server how many bytes were successfully read/deciphered, because as soon as the reading of the cipher input stream begins, compiler reordering or other JIT optimizations make all my logging statements evaporate into thin air. They are [apparently] not executed at all.

Note that this GitHub project and its associated blog post say CipherInputStream is broken. But the tests provided by that project do not fail for me when using Java 8u25 and the SunJCE provider. And as has already been said, everything works for me as long as I use small files.

Martin Andersson
  • I think there have been some changes made specifically because of this: https://bugs.openjdk.java.net/browse/JDK-8012900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel – Maarten Bodewes Nov 14 '14 at 00:33
  • @owlstead Wrote a piece of software that test AES/GCM encryption in Java (see the question update). I think it could be turned into some kind of a benchmark tool for other providers such as Bouncy Castle. – Martin Andersson Nov 22 '14 at 22:40

3 Answers


The short answer is that update() can't distinguish the ciphertext from the tag. The final() function can.

The long answer: Since Sun's specification requires the tag to be appended to the ciphertext, the tag needs to be stripped from the source buffer (ciphertext) during (or rather, prior to) decryption. However, because the ciphertext can be provided over the course of several update() calls, Sun's code does not know when to pull off the tag (in the context of update()). The last update() call does not know that it is the last update() call.

By waiting until the final() to actually do any crypto, it knows the full ciphertext + tag has been provided, and it can easily strip the tag off the end, given the tag length (which is provided in the parameter spec). It can't do crypto during the update because it would either treat some ciphertext as the tag or vice versa.

Basically, this is the drawback to simply appending the tag to the ciphertext. Most other implementations (e.g. OpenSSL) will provide the ciphertext and tag as separate outputs (final() returns the ciphertext, some other get() function returns the tag). Sun no doubt chose to do it this way in order to make GCM fit with their API (and not require special GCM-specific code from developers).

The reason encryption is more straightforward is that it has no need to modify its input (plaintext) like decryption does. It simply takes all data as plaintext. During the final, the tag is easily appended to the ciphertext output.

What @blaze said regarding protecting you from yourself is a possible rationale, but it is not true that nothing can be returned until all ciphertext is known. Only a single block of ciphertext is needed (OpenSSL, for example, will give it to you). Sun's implementation only waits because it cannot know that that first block of ciphertext is just the first block of ciphertext. For all it knows, you're encrypting less than a block (requiring padding) and providing the tag all at once. Of course, even if it did give you the plaintext incrementally, you could not be sure of authenticity until the final(). All ciphertext is required for that.

There are, of course, any number of ways Sun could have made this work. Passing and retrieving the tag through special functions, requiring the length of the ciphertext during init(), or requiring the tag to be passed through on the final() call would all work. But, like I said, they probably wanted to make the usage as close to the other Cipher implementations as possible and maintain API uniformity.
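To make one of those alternatives concrete: an implementation could in principle release data early by always withholding the last 16 bytes seen so far, since at end-of-stream those 16 bytes must be the tag. The sketch below (plain stream bookkeeping, no crypto; class and method names are mine, and this is not what SunJCE does) shows how a stream of unknown length can be split into body and trailing tag on the fly:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class TagSplitter {
    static final int TAG_LEN = 16; // default GCM tag length in bytes

    // Splits a stream into (body, trailing 16-byte tag) without knowing the
    // total length in advance, releasing body bytes as soon as they provably
    // cannot be part of the tag.
    static byte[][] split(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] tail = new byte[TAG_LEN]; // rolling window over the last bytes seen
        int tailLen = 0;
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            // Combine withheld tail + fresh input, release all but the last 16 bytes.
            byte[] all = new byte[tailLen + n];
            System.arraycopy(tail, 0, all, 0, tailLen);
            System.arraycopy(buf, 0, all, tailLen, n);
            int release = Math.max(0, all.length - TAG_LEN);
            body.write(all, 0, release); // these bytes could go to decryption now
            tailLen = all.length - release;
            System.arraycopy(all, release, tail, 0, tailLen);
        }
        if (tailLen < TAG_LEN) throw new IOException("Input too short - need tag");
        return new byte[][] { body.toByteArray(), Arrays.copyOf(tail, TAG_LEN) };
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        byte[][] parts = split(new ByteArrayInputStream(data));
        System.out.println("body=" + parts[0].length + " tag=" + parts[1].length);
    }
}
```

The caveat, as noted above, is that anything released this way is unauthenticated until the final tag check succeeds.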

  • 2
    It is a bad design decision, we have made a [different design decision for Java Card](https://docs.oracle.com/javacard/3.0.5/api/javacardx/crypto/AEADCipher.html) (where buffering is of course even more expensive, and it also removed over 30% of the code when I tried it on Bouncy Castle code (!)). Couldn't have written a better answer myself. – Maarten Bodewes May 09 '22 at 12:29
  • Good to hear! Thanks very much! – Andrew Michael Felsher Jun 05 '22 at 23:38
  • *Sun's implementation only waits because it cannot know that that first block of ciphertext is just the first block of ciphertext.* Buffering the entire stream is **NOT** the only solution to that even with the tag appended to the ciphertext. All the implementation needs to do is withhold a rolling last 16 bytes (normal tag length) passed in via `update()` from the decryption process until `doFinal()` is called. – Andrew Henle Jul 29 '22 at 00:18
  • 1
    Agreed. I implemented something like that myself in security provider code around the time I answered this question. I omitted that for the sake of simplicity. But that approach does create an issue where the returned plaintext does not correspond exactly to the provided ciphertext (within the context of a given update call), which may or may not be a problem depending on what the application does with the resulting plaintext. – Andrew Michael Felsher Aug 02 '22 at 17:28

I don't know why, but the current implementation writes every encrypted byte you throw at it into a buffer until doFinal(), no matter what you do.

Source can be found here: GaloisCounterMode.java

This method is called from update(), is given the new bytes (plus the buffered ones), and is supposed to decrypt if it can:

int decrypt(byte[] in, int inOfs, int len, byte[] out, int outOfs) {
    processAAD();

    if (len > 0) {
        // store internally until decryptFinal is called because
        // spec mentioned that only return recovered data after tag
        // is successfully verified
        ibuffer.write(in, inOfs, len);
    }
    return 0;
}

But it simply adds the data to ibuffer (a ByteArrayOutputStream) and returns 0 as the number of decrypted bytes. The whole decryption is then done in doFinal().

Given that implementation, your only choices are to avoid that implementation or to manually build blocks of data you know your server can handle. There is no way to provide the tag data in advance and make it behave more nicely.

zapl
  • Thank you so much zapl, your answer was really helpful. At least now I know that SunJCE always do internal buffering and there's no way around that. My solution will be to split the files instead. – Martin Andersson Nov 14 '14 at 15:40
  • I did use chunked file transfer, demonstrated here: [https://github.com/MartinanderssonDotcom/secure-login-file-transfer/](https://github.com/MartinanderssonDotcom/secure-login-file-transfer/) – Martin Andersson Nov 22 '14 at 23:01

Until the entire ciphertext is known, the algorithm can't tell whether it was correct or tampered with. No decrypted bytes can be returned to the caller before decryption and authentication are complete.

Ciphertext buffering may be caused by the reasons @NameSpace mentioned, but plaintext buffering is there to keep you from shooting yourself in the leg.

Your best option is to encrypt the data in small chunks. And don't forget to change the nonce value between them.

blaze
  • Not allowing me to flush is shooting myself in my own leg =) I can handle a bad tag and then simply not use the plaintext, but you know we can do nothing about OutOfMemoryError. I tried to get Bouncy Castle to work and see how BC behaves, but it was a real nightmare bundling the provider with my application. I'll fall back to sending small chunks instead. – Martin Andersson Nov 14 '14 at 15:39
  • It forces you to do right thing, by the threat of shooting your leg :) If programmers are given ability to process plaintext before it is verified, they will screw it up twelve times out of ten (because some of them will screw it twice). Processing, getting some internal error before auth tag is even received and returning this error? Hi there, you just invented your own new kind of padding oracle! Processing, using some numbers, hitting auth tag error, trying to rollback and failing it? All your base are belong to us! Until auth tag is verified, data is random junk. Don't use random junk. – blaze Nov 14 '14 at 19:14
  • haha not sure I followed you all the way there, speaking of oracles and stuff =) Generally speaking, I am against autocracy, or the fact the we many time hurt thousands in order to save just one. I would agree with you on this issue, if the implementation itself made sure no OutOfMemoryError happened, or perhaps if I knew more about cryptography than my current limited skill set. Should we implement in Java that numbers can not overflow? Should we ban null? Should "a =+ a" compile (because the programmer "probably" wanted "a += a")? Should "if (bool = false)" compile? – Martin Andersson Nov 14 '14 at 19:43
  • What can a developer say about his API usage around the world? Having one person think for everybody is risky, the other way around, then it is the client's responsibility. As my case is an example of, there is always always exceptions to the rule. Proper documentation is the only rule we should abide to. But in this field of cryptography, I am a newbie. So I'm not confident enough to say that you're wrong =) Just expressing my thoughts. Thank you blaze for your feedback! – Martin Andersson Nov 14 '14 at 19:47
  • Padding oracle is a type of attack on encryption where returning different error codes on different stages of decryption allows attacker to obtain decrypted data. Wiki have a nice article about it. It is very easy mistake to make, it stayed there for many years before even pros noticed, and it is not the only one. It is not about hurting hundreds to save one. It is about saving them by putting a fence around minefield, and cryptography engineering IS a minefield, and statistics say it looks so straightforward that programmers are just happy to walk into it, resulting in vulnerable apps. – blaze Nov 14 '14 at 20:19
  • BTW, if you need to transfer large amounts of data, why not use SSL/TLS, and get streaming API with all trouble handled by library? – blaze Nov 14 '14 at 20:20
  • Doesn't that require me to have a certificate? I use Secure Remote Protocol (SRP) to compute two symmetric keys and then encryption with AES/GCM. For me, that is both cheaper and easier. No hassle! Pure code only =) – Martin Andersson Nov 14 '14 at 20:36