I thought that it's such a common question, but I can't find anything which helps me. I'm relatively new to java and exercising for a job application. For that I've started writing a tool to transform data. (eg. read CSV, translate some columns and write it as SQL inserts to a file)
If you're interested you find it here, I'll copy some code for my question: https://github.com/fvosberg/datatransformer
I've started with a Class which should read the CSV (and which will get more complex by enclosing some fields which should contain the seperator and so on). My IDE (IntellJ IDEA) suggested to use as strict access modifiers for my methods as possible. Why should I hide these methods (with private) from subclasses?
package de.frederikvosberg.datatransformer;
import java.io.BufferedReader;
import java.io.Reader;
import java.util.*;
class CSVInput {
private final BufferedReader _reader;
private String separator = ",";
public CSVInput(Reader reader) {
_reader = new BufferedReader(reader);
}
public List<SortedMap<String, String>> readAll() throws java.io.IOException {
List<SortedMap<String, String>> result = new LinkedList<>();
List<String> headers = readHeaders();
String line;
while ((line = _reader.readLine()) != null) {
result.add(
colsFromLine(headers, line)
);
}
_reader.close();
return result;
}
private List<String> readHeaders() throws java.io.IOException {
List<String> headers = new ArrayList<>();
String line = _reader.readLine();
if (line == null) {
throw new RuntimeException("There is no first line for the headers in the CSV");
}
return valuesFromLine(line);
}
public void setSeparator(String separator) {
this.separator = separator;
}
/**
* creates a list of values from a CSV line
* it uses the separator field
*
* @param line a line with values separated by this.separator
* @return a list of values
*/
private List<String> valuesFromLine(String line) {
return Arrays.asList(
line.split(this.separator)
);
}
private SortedMap<String, String> colsFromLine(List<String> headers, String line) {
SortedMap<String, String> cols = new TreeMap<>();
List<String> values = valuesFromLine(line);
Iterator<String> headersIterator = headers.iterator();
Iterator<String> valuesIterator = values.iterator();
while (headersIterator.hasNext() && valuesIterator.hasNext()) {
cols.put(headersIterator.next(), valuesIterator.next());
}
if (headersIterator.hasNext() || valuesIterator.hasNext()) {
throw new RuntimeException("The size of a row doesn't fit with the size of the headers");
}
return cols;
}
}
Another downside are the unit test. I would like to write separate tests for my methods. Especially the CSVInput::valuesFromLine method which will get more complex. My Unit test for this class is testing so much and I really don't want to have to many things in my head when developing.
Any suggestions from experienced Java programmers?
Thanks in advance
Replies to comments
Thank you for your comments. Let me answer to the comments here, for the sake of clarity.
"Why should I hide these methods (with private) from subclasses?" Why do you keep your car keys away from your front door?
For security purposes, but why does it affect security when I change the access modifier of the colsFromLine
method? This method accepts the headers as a parameter, so it doesn't rely on any internal state nor change it.
The next advantage of strict access modifiers I can think of is to help other developers to show them which method he should use and where the logic belongs to.
Don't write your test to depend on the internal implementation of the functionality, just write a test to verify the functionality.
I don't. It depends on what do you mean with internal implementation. I don't check any internal states or variables. I just want to test the algorithm which is going to parse the CSV in steps.
"My Unit test for this class is testing so much" - If too many tests on a class, you should rethink your design. It's very likely that your class literally is doing too much, and should be broken up.
I don't have many tests on my class, but when I'm going the way I started, I'm going to write many tests for the same method (parsing the CSV) because it has many edge cases. The tests will grow in size because of different boilerplates. That is my concern why I'm asking here