0

I thought that it's such a common question, but I can't find anything which helps me. I'm relatively new to java and exercising for a job application. For that I've started writing a tool to transform data. (eg. read CSV, translate some columns and write it as SQL inserts to a file)

If you're interested you find it here, I'll copy some code for my question: https://github.com/fvosberg/datatransformer

I've started with a Class which should read the CSV (and which will get more complex by enclosing some fields which should contain the seperator and so on). My IDE (IntellJ IDEA) suggested to use as strict access modifiers for my methods as possible. Why should I hide these methods (with private) from subclasses?

package de.frederikvosberg.datatransformer;

import java.io.BufferedReader;
import java.io.Reader;
import java.util.*;

class CSVInput {
    private final BufferedReader _reader;
    private String separator = ",";

    public CSVInput(Reader reader) {
        _reader = new BufferedReader(reader);
    }

    public List<SortedMap<String, String>> readAll() throws java.io.IOException {
        List<SortedMap<String, String>> result = new LinkedList<>();
        List<String> headers = readHeaders();

        String line;
        while ((line = _reader.readLine()) != null) {
            result.add(
                    colsFromLine(headers, line)
            );
        }
        _reader.close();
        return result;
    }

    private List<String> readHeaders() throws java.io.IOException {
        List<String> headers = new ArrayList<>();

        String line = _reader.readLine();
        if (line == null) {
            throw new RuntimeException("There is no first line for the headers in the CSV");
        }

        return valuesFromLine(line);
    }

    public void setSeparator(String separator) {
        this.separator = separator;
    }

    /**
     * creates a list of values from a CSV line
     * it uses the separator field
     *
     * @param line a line with values separated by this.separator
     * @return a list of values
     */
    private List<String> valuesFromLine(String line) {
        return Arrays.asList(
                line.split(this.separator)
        );
    }

    private SortedMap<String, String> colsFromLine(List<String> headers, String line) {
        SortedMap<String, String> cols = new TreeMap<>();
        List<String> values = valuesFromLine(line);
        Iterator<String> headersIterator = headers.iterator();
        Iterator<String> valuesIterator = values.iterator();
        while (headersIterator.hasNext() && valuesIterator.hasNext()) {
            cols.put(headersIterator.next(), valuesIterator.next());
        }
        if (headersIterator.hasNext() || valuesIterator.hasNext()) {
            throw new RuntimeException("The size of a row doesn't fit with the size of the headers");
        }
        return cols;
    }
}

Another downside are the unit test. I would like to write separate tests for my methods. Especially the CSVInput::valuesFromLine method which will get more complex. My Unit test for this class is testing so much and I really don't want to have to many things in my head when developing.

Any suggestions from experienced Java programmers?

Thanks in advance


Replies to comments

Thank you for your comments. Let me answer to the comments here, for the sake of clarity.

"Why should I hide these methods (with private) from subclasses?" Why do you keep your car keys away from your front door?

For security purposes, but why does it affect security when I change the access modifier of the colsFromLine method? This method accepts the headers as a parameter, so it doesn't rely on any internal state nor change it.

The next advantage of strict access modifiers I can think of is to help other developers to show them which method he should use and where the logic belongs to.

Don't write your test to depend on the internal implementation of the functionality, just write a test to verify the functionality.

I don't. It depends on what do you mean with internal implementation. I don't check any internal states or variables. I just want to test the algorithm which is going to parse the CSV in steps.

"My Unit test for this class is testing so much" - If too many tests on a class, you should rethink your design. It's very likely that your class literally is doing too much, and should be broken up.

I don't have many tests on my class, but when I'm going the way I started, I'm going to write many tests for the same method (parsing the CSV) because it has many edge cases. The tests will grow in size because of different boilerplates. That is my concern why I'm asking here

fvosberg
  • 677
  • 10
  • 30
  • 2
    "Why should I hide these methods (with private) from subclasses?" Why do you keep your car keys away from your front door? – Andy Turner Jan 31 '17 at 22:27
  • 2
    Don't write your test to depend on the internal implementation of the functionality, just write a test to verify the functionality. If you later decide to change your implementation, your test should be able to tell you that your new implementation still produces the correct results. If you are testing your implementation, then this isn't going to work. – Jason Jan 31 '17 at 22:28
  • 2
    "My Unit test for this class is testing so much" - If too many tests on a class, you should rethink your design. It's very likely that your class literally *is* doing too much, and should be broken up. – azurefrog Jan 31 '17 at 22:30
  • Thanks, that was quick ;-) – GhostCat Feb 03 '17 at 08:04

1 Answers1

3

To answer your direct question: you always strive to hide as much as possible from either client code, but also from subclasses.

The point is: you want (theoretically) to be able to change parts/all of your implementation without affecting other elements in your system. When client/subclass code knows about such implementation details ... sooner or later, such code starts relying on them. To avoid that, you keep them out of sight. The golden rule is: good OO design is about behavior (or "contracts") of objects and methods. You absolutely do not care how some method does its job; you only care about the what it does. The how part should be invisible!

Having said that, sometimes it does make sense to give "package protected" visibility to some of your methods; in order to make to make them available within your unit tests.

And beyond that: I don't see much point in extending your CsvInput (prefer camel case, even for class names!) class anyway. As usual: prefer composition over inheritance!

In any case, such kind of "assignments" are excellent material for practicing TDD. You write a test (that checks one aspect); and then you write the code to pass that test. Then you write another test checking "another" condition; and so on.

Community
  • 1
  • 1
GhostCat
  • 137,827
  • 25
  • 176
  • 248
  • Thank you very much for your answer, it helps me a lot because I didn't think of the "what can change" part. With extending I meant extending the method itself, not the class with a subclass. Because of edgecases like linebreaks in enclosed fields and so on. I have to think more about this case and I'm going to comment in the evening. Thanks! – fvosberg Feb 01 '17 at 09:27
  • You are very welcome. I updated my answer to mention that such problems are *perfect* for using TDD. – GhostCat Feb 01 '17 at 09:30
  • Thanks. I'm practicing TDD with this project. That is the point why I wanted to have the colsFromLine method accessible from outside. To just test this method on it's own. – fvosberg Feb 01 '17 at 09:32