-1

I want to compare the contents of `cis` and `resStream`. My goal is to check whether the re-encryption gave me the same data or different data.

Any leads would be really great. Thanks

final InputStream cis = test.getEncryptingInputStream(is);
final InputStream resStream = test.reEncrypt(cis, "");

How can we check the content of InputStreams?

priyanka
  • 2
    `InputStream` has a `read` method. Start there, try to implement it, and if you get stuck, ask a new question with specific details. – Jorn Jul 25 '23 at 11:42
  • Call `read()` on each and exit reading as soon as the two calls produce a differing value. You can short cut by returning false if the files (if they *are* files) differ in length – g00se Jul 25 '23 at 11:49
  • Read the whole stream, md5 it, if the hashes are different - the content is different. – Shark Jul 25 '23 at 11:58

3 Answers

1

How to compare content of 2 InputStreams Objects ?

You can read the data from each stream and compare it byte by byte. Note that if the streams have different lengths, they are definitely not equal.

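// Requires: import java.io.IOException; import java.io.InputStream;
// The method name contentEquals is only illustrative.
static boolean contentEquals(InputStream cis, InputStream resStream) throws IOException {
    int cisByte = cis.read();
    int resByte = resStream.read();

    // Keep reading until at least one stream reaches its end (-1).
    while (cisByte != -1 && resByte != -1) {
        if (cisByte != resByte) {
            return false; // first differing byte found
        }
        cisByte = cis.read();
        resByte = resStream.read();
    }

    // Equal only if both streams ended at the same time (both values are -1).
    return cisByte == resByte;
}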
Lunatic
  • I am sure `test.reEncrypt(cis, "")` already reads the stream `cis` internally; that is how it produces `resStream`. So reading both streams simultaneously with the function above won't work. When you read from `resStream`, it also reads from `cis` internally to encrypt a byte. So two bytes would be read from `cis` (one by the function above and one internally by `resStream.read()`), but only one from `resStream`. It depends on how the algorithm works, but this mostly won't work. – Ishan Jul 25 '23 at 11:59
  • Is there a way to convert the content in inputStream to String, so that i can compare both the strings if both are equal or not. – priyanka Jul 25 '23 at 12:27
  • @priyanka Yes, you can use a `BufferedReader`. Something like `BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));` and `StringBuilder builder = new StringBuilder();`, then call `append` on the builder inside a loop such as `while`. Feel free to ask another question if you need more details about it. – Lunatic Jul 25 '23 at 12:30
  • If you've got two files then `boolean identical = (Files.mismatch(path1, path2) == -1);` – g00se Jul 25 '23 at 13:02
  • Apache Commons IO has this method: https://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/IOUtils.html#contentEquals-java.io.InputStream-java.io.InputStream- – Rob Spoor Jul 25 '23 at 14:49
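As a rough illustration of the `Files.mismatch` suggestion in the comments above, here is a minimal sketch that assumes the two payloads can be written to disk first. The class and method names (`FileMismatchDemo`, `firstMismatch`) and the temp-file prefixes are made up for this example; `Files.mismatch` itself needs Java 12 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FileMismatchDemo {

    // Writes both byte arrays to temporary files and returns the position of
    // the first differing byte, or -1 if the two files are identical.
    static long firstMismatch(byte[] first, byte[] second) {
        Path a = null;
        Path b = null;
        try {
            a = Files.createTempFile("cis", ".bin");
            b = Files.createTempFile("res", ".bin");
            Files.write(a, first);
            Files.write(b, second);
            return Files.mismatch(a, b); // Java 12+
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Best-effort cleanup of the temporary files.
            try {
                if (a != null) Files.deleteIfExists(a);
                if (b != null) Files.deleteIfExists(b);
            } catch (IOException ignored) {
                // nothing useful to do in a demo
            }
        }
    }

    public static void main(String[] args) {
        byte[] original = {10, 20, 30};
        byte[] changed = {10, 20, 31};
        System.out.println("identical: " + (firstMismatch(original, original) == -1));
        System.out.println("first mismatch at: " + firstMismatch(original, changed));
    }
}
```

When one file is a strict prefix of the other, `Files.mismatch` returns the size of the shorter file, so a length difference is also reported as a mismatch.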
0

Putting just the algorithm, not the real implementation...

final InputStream cis = test.getEncryptingInputStream(is);
Create a ByteArrayOutputStream cisArray / FileOutputStream
Read everything from cis and store in cisArray/FileOutputStream.

Create cisInputStream as ByteArrayInputStream from stored cisArray's bytes OR FileInputStream from file.
final InputStream resStream = test.reEncrypt(cisInputStream, "");

Read resStream & cisInputStream (re-open) byte by byte and compare
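
The steps above could be sketched in Java roughly as follows. This is only a sketch under assumptions: the `reEncrypt` parameter is a stand-in `UnaryOperator<InputStream>` for the question's `test.reEncrypt(cis, "")`, whose real signature is not shown, and the class and method names are invented for illustration. It buffers `cis` in memory; a file could be used instead for large data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.UnaryOperator;

class BufferThenCompare {

    // Reads a stream to its end and returns everything it produced.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out); // Java 9+
        return out.toByteArray();
    }

    // Buffers cis, feeds a fresh copy to the (hypothetical) reEncrypt step,
    // then compares the re-encrypted output with the buffered bytes.
    static boolean sameAfterReEncrypt(InputStream cis, UnaryOperator<InputStream> reEncrypt) {
        try {
            byte[] cisBytes = drain(cis);
            try (InputStream resStream = reEncrypt.apply(new ByteArrayInputStream(cisBytes))) {
                for (byte expected : cisBytes) {
                    // read() returns 0..255, or -1 at end of stream,
                    // so an early end also counts as a difference.
                    if (resStream.read() != (expected & 0xFF)) {
                        return false;
                    }
                }
                // Equal only if resStream has no extra bytes left.
                return resStream.read() == -1;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4};
        // An identity "re-encryption" is used here purely as a placeholder.
        boolean same = sameAfterReEncrypt(new ByteArrayInputStream(data), s -> s);
        System.out.println("same content: " + same);
    }
}
```

Only the first stream is held in memory; the output of the re-encryption step is compared as it is read and never stored.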
Ishan
  • Why would you read everything into memory? – g00se Jul 25 '23 at 12:54
  • At least one stream - cis has to be read into the memory and stored for later comparison with reEncrypt's output. The resStream's reading can be done byte by byte and compared with cisArray. But at least cisArray has to be read fully and remembered. – Ishan Jul 25 '23 at 13:14
  • And depends, the output of the `test.getEncryptingInputStream(is)` can be stored in a file too, if it is too much for the memory. The main point is, we have to read one stream fully and store somewhere, and then create another stream from that stored data (in memory or on disk) and pass that to the `reEncrypt` function. If we directly pass the output of `getEncryptingInputStream` to `reEncrypt`, it won't work. You cannot read the first's output twice. The `reEncrypt`'s output does not have to be stored anywhere, but just compared with the stored output of `getEncryptingInputStream` – Ishan Jul 25 '23 at 13:22
-1

The idea is to read each stream fully into a String, compute the MD5 hash of each string, and then check whether the two hashes are equal. MD5 is deterministic, so hashing the same string always produces the same hash.

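import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public String readInputStreamAsString(InputStream stream) throws IOException {
    int bufferSize = 1024;
    char[] buffer = new char[bufferSize];
    StringBuilder out = new StringBuilder();
    Reader in = new InputStreamReader(stream, StandardCharsets.UTF_8);
    // read(...) returns -1 at end of stream, which ends the loop.
    for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) {
        out.append(buffer, 0, numRead);
    }
    return out.toString();
}

public String hashMD5(String input) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    // Encode explicitly as UTF-8 so the result does not depend on the JVM default charset.
    byte[] messageDigest = md.digest(input.getBytes(StandardCharsets.UTF_8));
    // Zero-pad to 32 hex digits; BigInteger.toString(16) would drop leading zeros.
    return String.format("%032x", new BigInteger(1, messageDigest));
}

public void TestMe() throws IOException, NoSuchAlgorithmException {
    final InputStream cis = test.getEncryptingInputStream(is);
    final InputStream resStream = test.reEncrypt(cis, "");

    // Caveat: if reEncrypt reads from cis, cis may already be consumed here.
    String cisAsString = readInputStreamAsString(cis);
    String resStreamAsString = readInputStreamAsString(resStream);

    String cisMD5 = hashMD5(cisAsString);
    String resMD5 = hashMD5(resStreamAsString);

    if (cisMD5.equals(resMD5)) {
        System.out.println("Streams contents are equal");
    } else {
        System.out.println("Stream contents are NOT equal :(");
    }
}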
Shark
  • Surely it's faster to simply compare the two strings, rather than calculating a hash and then comparing the hash? And there's no chance then that two different strings will happen to hash to the same value. – tgdavies Jul 25 '23 at 12:08
  • For an infinitely long string - sure, comparing individual characters is the way to go. For 10 megabytes of test data - does it really make a difference? I thought the whole **point of the question** was to detect whether there is a difference, not where the difference is. Identical strings will surely hash to the same value. Different strings (thus different contents) will not hash to the same value. – Shark Jul 25 '23 at 12:11
  • @tgdavies look: `My target is to check if the reEncryption gave me the same data or different.`. This does it just fine. Perhaps in a suboptimal way, but an easy peasy way. – Shark Jul 25 '23 at 12:12
  • You've made the solution slower, less reliable, and more complicated for zero benefit. The expression `cisMD5.equals(resMD5)` can simply be `cisAsString.equals(resStreamAsString)` – tgdavies Jul 25 '23 at 12:12
  • @tgdavies Robustness aside, does it not demonstrate an idea to compare two things? Improvement and optimization are left as an exercise to the reader :) Which would probably skip the whole reading into string to begin with. – Shark Jul 25 '23 at 12:19
  • 1
    Who said anything about `String`s? – g00se Jul 25 '23 at 12:56
  • 1
    This will result in a lossy comparison, because not all byte combinations are valid in UTF-8, and result in Unicode replacement characters (not to mention the unqualified usage of `String.getBytes()` also being potentially lossy if the default JVM character set is not UTF-8), so two different streams may produce the same string. If you want to determine a hash, use a `DigestInputStream` and read through it (discarding the bytes). – Mark Rotteveel Jul 25 '23 at 12:58
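
The `DigestInputStream` suggestion in the last comment could look roughly like the sketch below, which hashes the raw bytes and never converts them to a `String`. Note that it swaps in SHA-256 where the answer uses MD5; that is a choice made for this example, not something the thread prescribes. The class and method names are also made up here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamDigest {

    // Reads the stream to its end and returns the SHA-256 digest of its bytes.
    static byte[] sha256(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            try (DigestInputStream dis = new DigestInputStream(in, md)) {
                byte[] buffer = new byte[8192];
                while (dis.read(buffer) != -1) {
                    // Bytes are discarded; the digest is updated as a side effect.
                }
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if both streams produce the same digest.
    static boolean sameDigest(InputStream a, InputStream b) {
        return MessageDigest.isEqual(sha256(a), sha256(b));
    }

    public static void main(String[] args) {
        // Two different single bytes, neither of which is valid UTF-8.
        InputStream first = new ByteArrayInputStream(new byte[]{(byte) 0xFF});
        InputStream second = new ByteArrayInputStream(new byte[]{(byte) 0xFE});
        System.out.println("same digest: " + sameDigest(first, second));
    }
}
```

Because the digest is computed over bytes, inputs that would both decode to the Unicode replacement character as UTF-8 still produce different digests. As with any approach here, each stream can be read only once, so the caveat about `reEncrypt` consuming `cis` still applies.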