Initializing empty instance variables in constructor

Question

I have a LogAnalyzer class that looks at a web server log, creates LogEntry objects and puts those objects into HashMaps for analyzing.

My LogAnalyzer class has these fields:

private int totalVisits;
private int uniqueVisits;
private ArrayList<LogEntry> records;
private HashMap<String, ArrayList<LogEntry>> uniqueIPs; //<address, log entries>
private HashMap<String, ArrayList<LogEntry>> dailyRecords; // <date, log entries>

My constructor looks like this:

public LogAnalyzer() {

    records = new ArrayList<>();
    dailyRecords = new HashMap<>();
    uniqueIPs  = new HashMap<>();

}

And then I have this method:

public void initializeRecords(String path){
    readFile(path); //reads the web log file and fills out the records and dailyRecords fields
    getUniqueIPs(); //fills out the uniqueIPs HashMap field.
    uniqueVisits = uniqueIPs.size(); //fills out the uniqueVisits field
    totalVisits = records.size(); //fills out the totalVisits field
}

So my question:

I have read (but don't really understand) it's "bad" to call methods inside the constructor. However it seems like the constructor is pointless here, since it is actually the initializeRecords that is doing all of the meaningful work of "creating" the object.

I don't have the background in Java or programming to understand the explanations I've found so far. There is a lot of talk of overriding things, and I think that's what I'm not clear on. I'm wondering why I should keep my constructor and this method seperated, in simple terms that a beginner can understand.

**EDIT: ** Here's the code for readFile():

public void readFile(String filename) {
    FileResource fr = new FileResource(filename);
    for (String line : fr.lines()){
        LogEntry le = WebLogParser.parseEntry(line);
        String date = le.getAccessTime().toString().substring(4, 10);
        if (dailyRecords.keySet().contains(date)){
            dailyRecords.get(date).add(le);
        }
        else{
            ArrayList<LogEntry> entries = new ArrayList<>();
            entries.add(le);
            dailyRecords.put(date, entries);
        }
        records.add(le); 
    }

Where did you read that it's bad to call methods inside the constructor? — mapeters, Jan 16 '20 at 19:44
@mapeters https://stackoverflow.com/questions/5230565/can-i-call-methods-in-constructor-in-java, https://stackoverflow.com/questions/51592673/using-methods-inside-a-constructor, https://stackoverflow.com/questions/18348797/why-is-it-considered-bad-practice-in-java-to-call-a-method-from-within-a-constru are a fiew examples... Obviously this question has been asked before, but I'm not understanding the answers. — rocksNwaves, Jan 16 '20 at 19:50
You're not calling any instance methods, you're initializing instance variables, one of which is overwritten later. The last question you reference is the best one, re: overridable methods and leaking `this`--that is ultimately what you're asking about. Whether or not it makes sense to initialize the variables you use *later* is a matter of opinion and usage. — Dave Newton, Jan 16 '20 at 19:55
@DaveNewton Ok, I think I just have to keep learning and revisit this question later. I'm a little shaky on what "this" means, as well as what "derived classes" and "overridable methods" are. I'm hoping these terms will pop up in my online courses later. — rocksNwaves, Jan 16 '20 at 20:08
@rocksNwaves `this` is the current instance (standard, basic Java). "Derived class" is a subclass. "Overridable method" is... a method that can be overridden by a derived class. — Dave Newton, Jan 16 '20 at 20:10

score 1 · Answer 1 · answered Jan 16 '20 at 20:03

It's not a good practice to call methods from within constructor, because Java always calls the most derived method, which means we could call a method on a half-initialized object.

To answer your question,

public LogAnalyzer() {

    records = new ArrayList<>();
    dailyRecords = new HashMap<>();
    uniqueIPs  = new HashMap<>();

}

What the above part exactly does is, it gives the variables, records, dailyRecods and uniqueIPs a physical address in the memory stack.

When we write something like private ArrayList<LogEntry> records; in the class, at this time only a reference is generated, but actual initialization happens only when records = new ArrayList<>(); this line gets executed.

Hope this clarifies your doubt!!!

"*we could call a method on a half-initialized object*" Care to elaborate? — PM 77-1, Jan 16 '20 at 21:48
@PM77-1 please have a look at these articles: ```https://www.javaspecialists.eu/archive/Issue086.html``` and ```https://www.javaspecialists.eu/archive/Issue086b.html``` — prashant.kr.mod, Jan 16 '20 at 21:53

score 1 · Answer 2 · answered Jan 16 '20 at 20:06

As you can see in readFile() it uses the following instruction

dailyRecords.keySet().contains(date)

without initializing dailyRecords prior to it. So if you do not initialize dailyRecords either while declaring it or in constructor you will face NullPointerException.

In your case instead of using constructor for initialization you can use declaration part like this

private HashMap<String, ArrayList<LogEntry>> dailyRecords = new HashMap<>();

score 1 · Accepted Answer · answered Jan 16 '20 at 20:08

1

Keeping the two methods allows you more flexibility in the use of your code. You can instantiate your LogAnalyzer without knowing the path to your log. I would rename initializeRecords to processRecords which IMO is more descriptive of what you are doing there.

We should create the object first and then call methods on it. If your readFile method were to throw an Exception because it can't read the file for example. I would find it very odd to get that exception when I am constructing the object. The point of the constructor is to provide an object that can be used to do something.

answered Jan 16 '20 at 20:08

Schleis

41,516
7
68
87

When you say "The point of the constructor is to provide an object that can be used to do something." I think you hit the heart of my question on the head. The way that my code is set up, it seems like I am providing an object that is useless, because it doesn't contain anything but empty variables. It doesn't become useful until I call `processRecords()`. However, it seems like you are saying that I should still keep the two things separate? Or am I misunderstanding your answer? – rocksNwaves Jan 16 '20 at 20:14
You are basically getting it. The other thing is to not think about the data as being the "useful" part of the object. The point of OOP is to provide useful abstractions that make things useful to understand. – Schleis Jan 16 '20 at 20:18
Thank you for the clarification, specifically on "The point of OOP". That's also at the heart of my question. I took some OOP classes in college long ago, but never got a job in programming, since I joined the military. Now, I'm having to re-learn everything without the benefit of instructors to answer all of my questions. – rocksNwaves Jan 16 '20 at 20:22

Initializing empty instance variables in constructor

3 Answers3