Suppose I have a large file to process, and I do not want to load it into memory by converting it to a byte array (or a String).

So I have an InputStream instead.

The question is: is there any way to check whether this InputStream is Base64-encoded?

Hakan Dilek
  • Does this answer your question? [How to check that byte array is Base64 encoded?](https://stackoverflow.com/questions/42226385/how-to-check-that-byte-array-is-base64-encoded) – Luca Jul 07 '20 at 09:17
  • No, I do not have a byte array, and I do not want to get one from the InputStream, because the file could be too large. –  Jul 07 '20 at 09:19
  • You can never tell for sure. But if you read, for example, the first 500 bytes and every one of them is one of `A-Z`, `a-z`, `0-9`, `+` or `/`, the probability that the content is Base64-encoded is pretty high – Felix Jul 07 '20 at 09:29
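
Felix's heuristic can be sketched roughly as follows. This is only an illustration, not a definitive check: the class and method names (`Base64Sniffer`, `looksLikeBase64`) are made up for this example, and it assumes the stream supports `mark`/`reset` (for instance, a `BufferedInputStream`) so that the sampled bytes can be read again afterwards. It also assumes that `=` padding and CR/LF line breaks should be tolerated, which goes slightly beyond the comment above.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of any library.
final class Base64Sniffer {
    private Base64Sniffer() {}

    // True for the 64 characters of the standard Base64 alphabet.
    static boolean isAlphabetChar(int b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
                || (b >= '0' && b <= '9') || b == '+' || b == '/';
    }

    // Samples at most sampleSize bytes, then rewinds the stream.
    // A 'true' result only means "probably Base64", never a guarantee.
    static boolean looksLikeBase64(InputStream in, int sampleSize) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException(
                    "stream must support mark/reset, e.g. a BufferedInputStream");
        }
        in.mark(sampleSize);
        try {
            byte[] sample = new byte[sampleSize];
            int total = 0;
            int n;
            // A single read() may return fewer bytes than requested, so loop.
            while (total < sampleSize
                    && (n = in.read(sample, total, sampleSize - total)) != -1) {
                total += n;
            }
            if (total == 0) {
                return false; // nothing to judge
            }
            for (int i = 0; i < total; i++) {
                int b = sample[i] & 0xFF;
                // Assumption: '=' padding and CR/LF line breaks are tolerated.
                if (!isAlphabetChar(b) && b != '=' && b != '\r' && b != '\n') {
                    return false;
                }
            }
            return true;
        } finally {
            in.reset(); // leave the stream positioned where it started
        }
    }
}
```

A call might look like `Base64Sniffer.looksLikeBase64(new BufferedInputStream(new FileInputStream(path)), 500)`, where `path` is whatever file you are processing. Only the sample is held in memory, regardless of the file size.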

1 Answer

In addition to Luca's reference, you might also want to check this out: How to check whether a string is Base64 encoded or not

A regular-expression check can be adapted to your constraint, since a regex only scans characters for matches and can be applied to one chunk of the stream at a time rather than to the whole file. There are also the Apache Commons Codec library and the built-in java.util.Base64 class.
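
As one possible way to use java.util.Base64 without buffering the whole file, here is a minimal sketch built on `Base64.getDecoder().wrap(InputStream)`, which decodes lazily and throws an `IOException` from `read` when it meets bytes it cannot decode. The class and method names (`Base64StreamValidator`, `isValidBase64`) are made up for this example. Two caveats: a genuine I/O failure of the underlying stream is also an `IOException`, so this sketch cannot tell the two apart, and an empty stream counts as valid.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;

// Hypothetical helper for illustration; not part of any library.
final class Base64StreamValidator {
    private Base64StreamValidator() {}

    // Consumes the stream through a decoding wrapper, keeping only one
    // small buffer in memory. Returns false if decoding fails part-way.
    static boolean isValidBase64(InputStream in) {
        InputStream decoded = Base64.getDecoder().wrap(in);
        byte[] buffer = new byte[8192];
        try {
            while (decoded.read(buffer) != -1) {
                // Decoded bytes are discarded; we only care that decoding succeeds.
            }
            return true;
        } catch (IOException e) {
            // Assumption: any IOException is treated as "not valid Base64",
            // which also swallows real I/O errors from the underlying stream.
            return false;
        }
    }
}
```

Unlike a prefix sample, this reads the entire stream, so the stream is consumed and the check costs a full pass over the data. If the content may contain line breaks (MIME style), `Base64.getMimeDecoder()` is the more lenient built-in alternative, though it skips characters outside the alphabet instead of rejecting them.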

Are these answers more along the lines of what you were looking for?

slow-but-steady
  • I've already checked those articles, but the issue is that I cannot load the whole file into memory. A String is not a solution, because the file could be 1 GB, for example, and what if 1000 users upload such files? I cannot work with anything except an InputStream, to save memory and process faster. –  Jul 07 '20 at 09:38
  • Hmm, in that case the [Scanner](https://stackabuse.com/reading-a-file-line-by-line-in-java/) class comes to mind. Would iterating through the file line by line, with a check on each line, work? If so, you could iterate through the file using something like `String line = scanner.nextLine();` and then run the regular expression on `line` – slow-but-steady Jul 07 '20 at 09:44
  • @ShailShouryya and what is a `line` in a Base64 file? If it's really Base64, then there are usually no line breaks (`\n`) in the file. – jps Jul 07 '20 at 09:55