
I am reading data from a file of students where each line is a student. I turn each line into a Student object and want to return an array of Student objects. Currently I store each Student in an ArrayList and then return it as a plain Student[]. Is it better to use an ArrayList as a dynamically sized collection and convert it to an array for the return, or should I first count the number of lines in the file, create a Student[] of that size, and populate it? Or is there a better way entirely to do this?

Here is the code if it helps:

public Student[] readStudents() {
    String[] lineData;
    ArrayList<Student> students = new ArrayList<>();
    while (scanner.hasNextLine()) {
        lineData = scanner.nextLine().split(" ");
        students.add(new Student(lineData));
    }
    return students.toArray(new Student[students.size()]);
}
Reporter
Thomas Briggs
  • There is a great answer for CSV files handling (which is just like your case except the separator is ` ` and not `,`) https://stackoverflow.com/a/55085305/4062197 – ymz Feb 12 '20 at 12:58
  • I'll check it out now thanks – Thomas Briggs Feb 12 '20 at 13:00
  • The return type can be anything; the result is just returned and printed in main. I only used an array because I don't need the features of any other data type – Thomas Briggs Feb 12 '20 at 13:13
  • Hello, can you use streams in your assignment ? – dariosicily Feb 12 '20 at 13:17
  • This isn't for an assignment. I mean, it was, and I did it using an ArrayList. I'm just wondering what the standard way to do this is. I feel like I'm using an ArrayList for no real reason. So to answer your question: yes, anything can be used – Thomas Briggs Feb 12 '20 at 13:20
  • Counting the number of lines first means that you need to read the file twice or hold the whole content in memory, and neither is a good choice. If the file is relatively small, you won't notice much difference. But if the file is large, I'd just read it once and build a list. – ntalbs Feb 12 '20 at 13:25

2 Answers


Which is better depends on what you need and on the size of your data set. Your priority could be the simplest code, the fastest load, the least memory usage, fast iteration over the resulting data set, and so on. The options include:

  1. For a one-off script or a small data set (tens of thousands of elements), probably anything will do.
  2. Do not store the elements at all and process them as you read them. This uses the least memory and suits very large data sets.
  3. Use a pre-allocated array if you know the data set size in advance. This guarantees the fewest memory allocations, but counting the elements may itself be expensive.
  4. If unsure, use an ArrayList to collect the elements. It works most efficiently if you can estimate an upper bound for the data set size in advance. Say you know there are normally no more than 5000 elements: in that case create the ArrayList with an initial capacity of 5000. It will still resize itself if the backing array fills up.
  5. LinkedList is probably the most conservative choice: it allocates space as you go, but it needs more memory per element and iterates more slowly than arrays or ArrayLists.
  6. Your own data structure optimized for your needs. Usually the effort is not worth it, so use this option only when you already know the problem you want to solve.
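
As a rough illustration of option 2, the sketch below hands each student to a callback as soon as its line is read, so nothing is kept in memory. The `StreamingStudents` class, the `forEachStudent` method, and the minimal `Student` stand-in are hypothetical names invented for this example; skipping blank lines is also an assumption, not something the question specifies.

```java
import java.util.Scanner;
import java.util.function.Consumer;

public class StreamingStudents {

    /** Hypothetical stand-in for the question's Student class. */
    static final class Student {
        final String[] fields;

        Student(String[] fields) {
            this.fields = fields;
        }
    }

    /**
     * Reads one student per line and passes it straight to the handler.
     * No collection is built; only the number of students is returned.
     */
    static int forEachStudent(Scanner scanner, Consumer<Student> handler) {
        int count = 0;
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines are not students
            }
            handler.accept(new Student(line.split(" ")));
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // In-memory sample data; a real program would wrap a file instead.
        Scanner scanner = new Scanner("Ada Lovelace\nAlan Turing\n");
        int n = forEachStudent(scanner,
                s -> System.out.println(String.join(" ", s.fields)));
        System.out.println(n + " students processed");
    }
}
```

This only works when each student can be handled independently, for example when printing them. If the caller needs random access or several passes over the data, a collection is still required.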

Note on ArrayList: it pre-allocates a backing array with a set number of slots, which are then filled without further memory allocation. Once the backing array is full, a new, larger one is allocated and all elements are copied into it. In OpenJDK the new array is about 1.5 times the size of the previous one. Normally this is not a problem, but it can cause an out-of-memory error if a large enough contiguous block cannot be allocated for the new array.
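
A minimal sketch of passing a size estimate to the ArrayList constructor, assuming the caller has a rough idea of the row count. The `PresizedList` class, the `readRows` method, and the `expectedRows` parameter are hypothetical names for this example, and raw `String[]` rows stand in for Student objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class PresizedList {

    /**
     * Collects the whitespace-separated fields of every line.
     * expectedRows is only a capacity hint: the list still grows
     * automatically if the input turns out to be longer.
     */
    static List<String[]> readRows(Scanner scanner, int expectedRows) {
        List<String[]> rows = new ArrayList<>(expectedRows);
        while (scanner.hasNextLine()) {
            rows.add(scanner.nextLine().split(" "));
        }
        return rows;
    }

    public static void main(String[] args) {
        // In-memory sample data; the hint of 2 is deliberately too small.
        Scanner scanner = new Scanner("a 1\nb 2\nc 3\n");
        List<String[]> rows = readRows(scanner, 2);
        System.out.println(rows.size() + " rows read");
    }
}
```

A hint that is too low costs a few extra array copies, and one that is too high wastes some unused slots, so a rough estimate is usually good enough.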

Petr Gladkikh

Use an array when the size is fixed. That is not the case for the students, so an ArrayList is better suited, as you saw when reading the file. Converting the ArrayList to an array is superfluous.

Then use the most general type, here the List interface. Whether the implementation is an ArrayList or a LinkedList is then a technical detail. You might later switch to another implementation with different runtime behavior.

Your code can then handle all kinds of Lists, which is a really powerful generalisation.

Here is an incomplete list of useful interfaces with some implementations:

  • List - ArrayList (fast, a tiny bit of memory overhead), LinkedList
  • Set - HashSet (fast), TreeSet (is a SortedSet)
  • Map - HashMap (fast), TreeMap (is a SortedMap), LinkedHashMap (order of inserts)

So:

public List<Student> readStudents() {
    List<Student> students = new ArrayList<>();
    while (scanner.hasNextLine()) {
        String[] lineData = scanner.nextLine().split(" ");
        students.add(new Student(lineData));
    }
    return students;
}

In a code review one would comment on the constructor Student(String[] lineData), which ties the class to the file's current field layout and is fragile if the data format changes.
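
One possible way to address that review comment is to keep the parsing in a single factory method and give the constructor typed parameters. The sketch below is only an illustration: the `StudentParsing` wrapper, the `parse` method, and the three-field layout (first name, last name, age) are assumptions, since the question does not show the real Student class or file format.

```java
public class StudentParsing {

    /** Hypothetical Student with an assumed layout: first name, last name, age. */
    static final class Student {
        final String firstName;
        final String lastName;
        final int age;

        Student(String firstName, String lastName, int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.age = age;
        }

        /** Parses one line; the file format is known only to this method. */
        static Student parse(String line) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) {
                throw new IllegalArgumentException(
                        "Expected 3 fields but got " + parts.length + ": " + line);
            }
            return new Student(parts[0], parts[1], Integer.parseInt(parts[2]));
        }

        @Override
        public String toString() {
            return firstName + " " + lastName + " (" + age + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(Student.parse("Ada Lovelace 36"));
    }
}
```

With this shape, a change in the file format touches only `parse`, and the loop in readStudents would call `Student.parse(scanner.nextLine())` instead of splitting the line itself.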

Joop Eggen