I have a CSV file that is encrypted line by line, i.e.:

  • encrypted_row1<new_line_char>
  • encrypted_row2<new_line_char>
  • encrypted_row3<new_line_char>

Encryption key: 3B047F64FC0DC09ECFA7C59C2C84FBB703F074474459B4A8690A9229297B77F6

Example file:

    FE8C7D902C44957624DBA215F4DE1DF08E5B44A29E57176C222D1B040201ED520C920063A8F83A623EC7590F96DEB9E714DF52E826E1219915936765D86C19FEC8B7974023711A9706458A203BADEA36A44530A262B470D0983716A5A533472A
    03269E5BC0BC1BF9411E44A0AE7E9490AE27F1E42770020B342BCB1171F1D757AFDAC75F7FA8F6C753CA9DA5AA831FD4878C761ECEBBE261BE0D67B12F15DBFB03311C72138120FB174C56AD4676AF19
    F8E7A3201B9DDFAA11A3017DC3756706E832DA95A387C7889FE37DB586A20102793E70A7378A54AFFEF1E1239F56BD1589A7B9348847D1BE7A78759C93BA6A534F83A0C5FC59BCFABAB3E47510C354D4
    CF813CADE9778381CF68E613E6E9D86E22F0C413A76A4B6C429FB9EDA20C7F2582B293D35FD2206C6E7AEB96464DA0EF22005D019274E3AC32A5861D7C068EFFA5395D86DEB48C4A31E928A7B9720708

I have a function that takes java.io.Reader as input. I need to decrypt this file then provide a java.io.Reader of the decrypted file content on the fly.

My code:

    String path = "path-to-file";
    File file = new File(path);

    // Hex-encoded AES key (placeholder value)
    String encryptionKey = "encryptionKey";
    SecretKeySpec secKeySpec = new SecretKeySpec(Hex.decode(encryptionKey), "AES");
    byte[] ivBytes = new byte[16]; // all-zero IV
    IvParameterSpec ivParameterSpec = new IvParameterSpec(ivBytes);
    Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
    cipher.init(Cipher.DECRYPT_MODE, secKeySpec, ivParameterSpec);

    // try-with-resources closes the reader (and the underlying file) when done
    try (BufferedReader rd = new BufferedReader(
            new InputStreamReader(new FileInputStream(file), UTF_8))) {
        String line;
        while ((line = rd.readLine()) != null) {
            // Each line is hex-decoded, then decrypted independently
            System.out.println(new String(cipher.doFinal(Hex.decode(line)), UTF_8));
        }
    }

I need to wrap the hex-decoded and decrypted string in a java.io.Reader in the most efficient way possible (file sizes can be very large). I have a function that uses a java.io.Reader to read chars and apply some business logic to them. I don't want to decrypt and write the file, then re-read it. I need to do this on the fly, or otherwise in a secure way.

What is the best way to do this?

Selim Alawwa
  • You're decrypting in `cipher.doFinal(...)`. Why not implement your business logic there? – h0r53 Aug 31 '20 at 13:59
  • Could you be more specific about the requirements? The ability to process (i.e. decrypt and apply business logic to) the file line by line can be achieved without providing a java.io.Reader, with a simple Iterator (which may be based on a Reader, why not). So do you need your decryption API to be a java.io.Reader, or do you just need it to be an iterator (`next()` semantics)? `Reader` semantics are harder to achieve (e.g. reading just one char is part of the Reader API, but if you only need lines, then why try to conform to the Reader API?) – GPI Aug 31 '20 at 14:16
  • Another question: are the hex data in string format, delimited by a line break? If yes, it's just a simple text line reader, followed by a "hexStringToByteArray" conversion (you will find dozens of examples here on SO), after which you feed the byte[] to your decryption method. – Michael Fehr Aug 31 '20 at 14:22
  • Is each line a single block of encryption or can an encrypted block overlap to the next line? – Zuko Sep 01 '20 at 06:20
  • @GPI I have a CSV reader that takes a java.io.Reader as input; I can't change this CSV reader as it's used by other classes. Before encryption, we passed an InputStreamReader over a FileInputStream to the CSV read method. After encryption, each row is a hex string of the encrypted row data, which I need to hex-decode and decrypt. Decoding/decryption is done line by line, while my CSV reader reads the input java.io.Reader char by char. So I need to wrap the InputStreamReader in another reader that does the decryption line by line and returns the decrypted char for readChar(). – Selim Alawwa Sep 01 '20 at 11:50
  • @MichaelFehr I want a function that returns a java.io.Reader object that outputs the decrypted char when readChar() is called – Selim Alawwa Sep 01 '20 at 11:52
  • @OluSmith each line is a single block of encryption. – Selim Alawwa Sep 01 '20 at 11:53
  • @h0r53 how can this be done? In any case I need a method that at the end returns some implementation of java.io.Reader – Selim Alawwa Sep 01 '20 at 11:53

2 Answers

This can be achieved with the CipherInputStream/CipherOutputStream classes.
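
A minimal sketch of the CipherInputStream idea, under assumptions that are not from the question: the ciphertext is raw binary produced in one pass over the whole content (not hex text encrypted line by line), the mode is AES/CBC/PKCS5Padding, and the key and IV are all-zero demo values. The class and method names are hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

class CipherStreamSketch {

    /** Wraps raw (binary, not hex) ciphertext in a Reader that decrypts as it is read. */
    static Reader decryptingReader(InputStream rawCiphertext, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding"); // assumed mode
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return new InputStreamReader(new CipherInputStream(rawCiphertext, cipher), StandardCharsets.UTF_8);
    }

    /** Encrypts the text in memory, then reads it back char by char through the decrypting Reader. */
    static String roundTrip(String plaintext) throws GeneralSecurityException, IOException {
        byte[] key = new byte[32]; // demo-only all-zero key
        byte[] iv = new byte[16];  // demo-only all-zero IV
        Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] ciphertext = enc.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));

        StringBuilder out = new StringBuilder();
        try (Reader reader = decryptingReader(new ByteArrayInputStream(ciphertext), key, iv)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(roundTrip("id,name\n1,alice\n"));
    }
}
```

For hex-encoded input that was encrypted line by line, the stream would first have to be split per line and hex-decoded, which CipherInputStream does not do by itself.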

However, there is an important caveat here: Some encryption modes have authentication in them, and if authentication is part of your encryption mode, then it is impossible to decrypt on the fly securely.

The point is the following: If an attacker sends you a message, and you start running code based on data you decrypt, but you don't know the data is authentic, the data may very well be malicious. Therefore, to act securely on encrypted data, you must decrypt and validate the authentication before acting on the data received.

sonOfRa
  • Agreed about message auth. But it **may** not be an issue. Remember it's not the file that is encrypted, it is each line. My takeaway is that if you decrypt a whole line at once, you can and will detect authentication errors. So in that sense, you should be able to process the file line by line. – GPI Aug 31 '20 at 14:13
  • @sonOfRa: in the example given by the user the algorithm/mode is "AES/CBC/NoPadding", so there seems to be NO authentication in use. – Michael Fehr Aug 31 '20 at 14:19
  • That is correct, though that would be the next problem to solve; unauthenticated encryption is insecure in almost all scenarios. We would need additional information from @selim-alawa – sonOfRa Aug 31 '20 at 18:31
  • @sonOfRa I have tried CipherInputStream, but it doesn't work. It either decrypts without hex-decoding, which gives wrong values, or throws IllegalBlockSizeException because the input length is not a multiple of 16 bytes. I wrapped the FileInputStream in a CipherInputStream: new InputStreamReader(new CipherInputStream(new FileInputStream(file), cipher), UTF_8); – Selim Alawwa Sep 01 '20 at 12:11
  • @sonOfRa There is no authentication – Selim Alawwa Sep 01 '20 at 12:12

Your use case calls for a mixture of tools that the standard stream/reader implementations do not quite provide.

What makes it special is that you start with a reader-based process (the lines of the encrypted file), then each line has to drop down to a byte-based process (the Cipher operations), and you want the output to have character semantics again (a Reader) to pass on to a CSV parser. This char/byte/char round trip is not trivial.

The approach I would use is to create another reader, which I call LineByLineProcessingReader, that consumes the input line by line, processes each line, then makes the output of each line available as a Reader.

The processing itself does not really matter for this purpose. Yours would be to hex-decode, then decipher, then convert back to a string, but it could be just about anything.

The tricky part is all in making the process conform to the Reader API.

When conforming to the Reader API, you have a choice between extending FilterReader or Reader itself. The usual choice is the filter version.

I chose not to extend the filter version because, on the whole, my goal is to never expose the original file's contents. As the filter's implementation always falls back to the original reader's, this masking would imply re-implementing everything, which is a lot of work.

On the other hand, if I extend Reader directly, I only have one method to get right, because all Reader really has to express is the read(cbuf, off, len) method (plus close()); all the others are implemented on top of it.
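
To illustrate that point, here is a small sketch with a hypothetical MinimalReader that serves the chars of a fixed String (the class, its data and the demo method are assumptions for illustration only). It implements just the two abstract methods, yet the inherited read(), skip(long) and a wrapping BufferedReader all work through them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical Reader over a fixed String; only the two abstract methods are implemented. */
class MinimalReader extends Reader {
    private final String data;
    private int position = 0;

    MinimalReader(String data) {
        this.data = data;
    }

    @Override
    public int read(char[] cbuf, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (position >= data.length()) {
            return -1; // end of input
        }
        // Returning fewer chars than requested is allowed by the Reader contract.
        int count = Math.min(len, data.length() - position);
        data.getChars(position, position + count, cbuf, off);
        position += count;
        return count;
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }

    /** Uses only inherited or wrapping APIs; each call ends up in read(char[], int, int). */
    static String demo() throws IOException {
        Reader reader = new MinimalReader("abcdef\nsecond");
        char first = (char) reader.read(); // inherited single-char read()
        reader.skip(2);                    // inherited skip(long), here skipping "bc"
        try (BufferedReader buffered = new BufferedReader(reader)) {
            return first + "|" + buffered.readLine() + "|" + buffered.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```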

The strategy I chose is akin to creating your own Iterator implementation over some source of data. I prefetch a line of the input file and make it available as a Reader in an internal variable.

That being done, I only need to make sure that this reader variable (i.e. the current line) is refreshed each time the previous one has been fully read, and not before.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class LineByLineProcessingReader extends Reader {

    /** What to use as a line break. Because BufferedReader does not report the line breaks, we insert them manually using this. */
    final String lineBreak;
    /** The original input being read */
    final BufferedReader input;

    /** A reader for the current processed line. */
    private StringReader currentReader;
    private boolean closedOrFinished = false;

    /**
     * Creates a reader that will ingest an input line by line,
     * then process it (default implementation is a no-op),
     * then recreate a reader for it, until there are no more lines.
     *
     * @param in the buffered reader providing the underlying lines
     * @param lineBreak the line break appended after each processed line
     */
    protected LineByLineProcessingReader(BufferedReader in, String lineBreak) {
        this.input = in;
        this.lineBreak = lineBreak;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        ensureNextLine();
        // Check end of input
        if (currentReader == null) {
            return -1;
        }
        int read = currentReader.read(cbuf, off, len);
        // Edge case : if current reader was at its end
        if (read < 0) {
            currentReader = null;
            // Recurse to go fetch next line.
            return read(cbuf, off, len);
        }
        // General case, we have our result.
        // We may have read less than was asked (in length), but it's contractually OK.
        return read;
    }

    /**
     * Advances the underlying input to the next line, and makes it available
     * for reading inside the {@link #currentReader}
     */
    private void ensureNextLine() throws IOException {
        // Do not try to read if closed or already finished
        if (closedOrFinished) {
            return;
        }
        // Check if there is still data to be read
        if (currentReader != null) {
            return;
        }

        String nextLine = input.readLine();
        if (nextLine == null) {
            // Nothing was left to read, we are bailing out.
            currentReader = null;
            closedOrFinished = true;
            return;
        }
        // We have a new line, process it and publish it as a reader
        String processedLine = processRawLine(nextLine);
        currentReader = new StringReader(processedLine+lineBreak);
    }

    /**
     * Performs a process of the raw line read from the underlying source.
     * @param rawLine the raw line read
     * @return a processed line
     */
    protected String processRawLine(String rawLine) {
        return rawLine;
    }


    @Override
    public void close() throws IOException {
        input.close();
        closedOrFinished = true;
    }


}

What is left to do is to plug your decryption process into the processRawLine method.
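
As a hedged sketch of what that decryption step might look like, the hypothetical HexLineDecryptor below hex-decodes one line and decrypts it with AES/CBC/NoPadding, as in the question's loop. The class name, the hand-written hex helpers and the all-zero demo key and IV are assumptions for illustration. Because NoPadding is used, any padding bytes present in the original rows would remain in the returned string.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

/** Hypothetical helper: hex-decodes and decrypts one line at a time. */
class HexLineDecryptor {
    private final Cipher cipher;

    HexLineDecryptor(byte[] key, byte[] iv) throws GeneralSecurityException {
        cipher = Cipher.getInstance("AES/CBC/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    }

    /** Decrypts a single hex-encoded line; not thread-safe because the Cipher is shared. */
    String decryptLine(String hexLine) {
        try {
            // doFinal resets the cipher to its initialised state, so it can be reused per line.
            return new String(cipher.doFinal(hexToBytes(hexLine)), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not decrypt line", e);
        }
    }

    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd-length hex string");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex string");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    static String bytesToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    /** Encrypts one row (length must be a multiple of 16 bytes), then decrypts it twice with one instance. */
    static String roundTrip(String row) throws GeneralSecurityException {
        byte[] key = new byte[32]; // demo-only all-zero key
        byte[] iv = new byte[16];  // all-zero IV, as in the question
        Cipher enc = Cipher.getInstance("AES/CBC/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        String hexLine = bytesToHex(enc.doFinal(row.getBytes(StandardCharsets.UTF_8)));

        HexLineDecryptor decryptor = new HexLineDecryptor(key, iv);
        decryptor.decryptLine(hexLine);        // first use
        return decryptor.decryptLine(hexLine); // second use of the same Cipher instance
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("1,alice,engineer"));
    }
}
```

An override of processRawLine could then simply return decryptor.decryptLine(rawLine), with the decryptor held in a field of the subclass.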

A very quick test for the class (you might want to check it further).

import junit.framework.TestCase;
import org.junit.Assert;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LineByLineProcessingReaderTest extends TestCase {

    public void testRead() throws IOException {
        String input = "a\nb";

        // Reading a char one by one
        try (Reader r = new LineByLineProcessingReader(new BufferedReader(new StringReader(input)), "\n")) {
            String oneByOne = readExcatlyCharByChar(r, 3);
            Assert.assertEquals(input, oneByOne);
        }

        // Reading lines
        List<String> lines = readAllLines(
            new LineByLineProcessingReader(
                new BufferedReader(new StringReader(input)),
                "\n"
            )
        );
        Assert.assertEquals(Arrays.asList("a", "b"), lines);

        String[] moreComplexInput = new String[] {"Two households, both alike in dignity",
            "In fair Verona, where we lay our scene",
            "From ancient grudge break to new mutiny",
            "Where civil blood makes civil hands unclean." +
            "From forth the fatal loins of these two foes" +
            "A pair of star-cross'd lovers take their life;" +
            "Whose misadventured piteous overthrows" +
            "Do with their death bury their parents' strife." +
            "The fearful passage of their death-mark'd love",
            "And the continuance of their parents' rage",
            "Which, but their children's end, nought could remove",
            "Is now the two hours' traffic of our stage;" +
            "The which if you with patient ears attend",
            "What here shall miss, our toil shall strive to mend."};
        lines = readAllLines(new LineByLineProcessingReader(
            new BufferedReader(new StringReader(String.join("\n", moreComplexInput))), "\n") {
            @Override
            protected String processRawLine(String rawLine) {
                return rawLine.toUpperCase();
            }
        });
        Assert.assertEquals(Arrays.stream(moreComplexInput).map(String::toUpperCase).collect(Collectors.toList()), lines);
    }

    private String readExcatlyCharByChar(Reader reader,int numberOfReads) throws IOException {
        int nbRead = 0;
        try (StringWriter output = new StringWriter()) {
            while (nbRead < numberOfReads) {
                int read = reader.read();
                if (read < 0) {
                    throw new IOException("Expected " + numberOfReads + " but were only " + nbRead + " available");
                }
                output.write(read);
                nbRead++;
            }
            return output.toString();
        }
    }

    private List<String> readAllLines(Reader reader) throws IOException {
        try (BufferedReader b = new BufferedReader(reader)) {
            return b.lines().collect(Collectors.toList());
        }
    }

}
GPI
  • This looks like a good solution; I will test it and let you know. In the meantime I also figured out another solution, which is to implement a CipherBufferedReader that extends BufferedReader, overriding the read(char cbuf[], int off, int len) method. I use BufferedReader's readLine method to read a line, decrypt it, then save it to the buffer char array. What do you think of this solution? – Selim Alawwa Sep 01 '20 at 14:29
  • Try calling `skip(10)` at random points in your reading process and see if it still works :-) That really is the only question : can you preserve Reader's semantics whatever the order / length of read/readLine/skip/mark/reset calls. – GPI Sep 01 '20 at 14:34
  • Calling skip(10) leads to an IllegalBlockSizeException, while calling it with a different value, e.g. skip(129), leads to wrong decrypted values – Selim Alawwa Sep 01 '20 at 14:39
  • That's because your implementation is not strict enough. E.g. skip(10) probably skips 10 chars of the raw (encrypted) file without presenting them to the cipher, which leads to an inconsistent state, and to the failure. If you want to go this route, you have to make sure that *all* of BufferedReader's methods act consistently (read from decrypted data) and that the only way to call down to the encrypted file feeds data to the cipher. That's why my implementation *hides* the raw Reader, and does not filter/extend it. It's far simpler to achieve this way. Encapsulate. – GPI Sep 01 '20 at 14:51
  • I am using your class, but all values seem to be returning as null / empty. What value of lineBreak should be used? – Selim Alawwa Sep 02 '20 at 14:10
  • Using '\n' should be fine. I recommend you do step-by-step debugging. – GPI Sep 03 '20 at 07:24