5

I want a List of n Sets of Integers and initially this list should be filled with null. A lot of the Sets will be initialised later, and some will remain null.

I have tried different methods to implement this, some of them are included here:

List<HashSet<Integer>> List_of_Sets = Arrays.asList(new HashSet[n]);
ArrayList<HashSet<Integer>> List_of_Sets = new ArrayList<>(n);
while(n-- > 0) List_of_Sets.add(null);

Is there a faster way to do this?

For clarification an example for arrays would be Arrays.fill() used to be slower than:

/*
 * initialize a smaller piece of the array and use the System.arraycopy 
 * call to fill in the rest of the array in an expanding binary fashion
 */
public static void bytefill(byte[] array, byte value) {
  int len = array.length;

  if (len > 0){
    array[0] = value;
  }

  //Value of i will be [1, 2, 4, 8, 16, 32, ..., len]
  for (int i = 1; i < len; i += i) {
    System.arraycopy(array, 0, array, i, ((len - i) < i) ? (len - i) : i);
  }
}

^above code is from Ross Drew's answer to Fastest way to set all values of an array?

RipeGorilla
  • 70
  • 1
  • 5
  • 3
    1) The Answer is No (AFAIK). 2) You should be able to check which of the alternatives is fastest by benchmarking (using the correct techniques!!), but I think that the `Arrays.asList` version should be marginally faster. 3) Is this "premature optimization"? – Stephen C Feb 09 '21 at 00:14
  • By the way, if you are only concerned with the speed of creating / initializing the fixed sized list (as your title says), it makes no difference to that speed what operations you do afterwards. – Stephen C Feb 09 '21 at 00:18
  • 1
    Finally, if speed is really important to you, you could just replace the `ArrayList` with an array. That will be faster, both to create and to access / update. (Just a bit less convenient in some cases.) From what I can tell, you won't be making use of the `List` API's ability to add and remove list elements .... – Stephen C Feb 09 '21 at 00:21
  • Thanks @Stephen C, I will edit the question :D. My code does everything I want it to so I figured now was the time to optimize, is there a step I missed? – RipeGorilla Feb 09 '21 at 00:21
  • 2
    `Collections.fill()` doesn't do anything on an empty list, so it's a bit puzzling if you are satisfied with it. There is `Collections.nCopies()` which may work with `null`. – tevemadar Feb 09 '21 at 00:30
  • 1
    I would just `List> List_of_Sets = Arrays.asList(new HashSet[n]);` and move on – Bohemian Feb 09 '21 at 00:36
  • @tevemadar thanks for pointing that out, i do indeed get an IndexOutOfBoundsException when running it. nCopies() doesn't seem to work for this use either. Note to self: run every example next time... – RipeGorilla Feb 09 '21 at 00:42
  • Why do you need it be faster? If this is the slowest part of your application, I guess either you do something wrong, or you measure it wrong. BTW, the mentioned answer about `Arrays.fill` comes from an article published in 2000. 21 years ago! The article was not even about HotSpot JVM. – apangin Feb 09 '21 at 02:42
  • @apangin I need it to be faster since my solution is ranked by time taken on the website i submit it to: https://open.kattis.com/ It probably isn't the slowest part of my aplication, but since i have often wondered about the question i thought i would post it. I tried submitting a solution using `Arrays.fill` and it is indeed just as fast in this instance, thanks. I didn't want to dump my entire code here even though it isn't that large and ask "How do i make this faster?" so i split it in to multiple questions, and out of those i could not find this one asked before so i asked. – RipeGorilla Feb 09 '21 at 12:00
  • Thanks for all the help, I have now edited the question to be more general for the benefit of future readers since I already got my rather specific (and therefor useless to others) answer. @StephenC if you want credit for answering please formulate your response as an answer and I will accept it since Chris Foley suggests not using his answer. – RipeGorilla Feb 09 '21 at 12:34
  • 1
    You could also write a `native` method. – MC Emperor Feb 09 '21 at 14:12
  • @MCEmperor I don't know if I am allowed to, but I will certainly give it a try. Thanks for the suggestion! – RipeGorilla Feb 09 '21 at 18:00

2 Answers2

3

Is there a faster way to do this?

As far as I am aware, no. Certainly, there is no easy way that is faster.

Based on how it works, I think (but I have not tested) that the Arrays.asList(new HashSet[n]) should be the fastest solution.

It would be possible to implement a custom List implementation that is like an ArrayList but is pre-initialized to N null values. But under the hood the initialization will be pretty much identical with what happens in the List implementation that asList returns. So I doubt that any performance improvements would be significant ... or worth the effort.

If you want to be sure of this, you could write a benchmark of the various options. However, I don't think this is the right approach in this case.

Instead I would recommend benchmarking and profiling your entire application to determine if operations on this list are a real performance hotspot.

  • If it is not a hotspot, my recommendation would be to just use the Arrays.asList approach and spend your time on something more important.

  • If it is a hotspot, you should consider replacing the List with an array. From your earlier description it seemed you are going to use the List like an array; i.e. using positional get and set operations, and no operations that change the list size. If that is the case, then using a real array should be more efficient. It saves memory, and avoids a level of indirection and (possibly) some bounds checking.

    One reason not to do this would be if you needed to pass the array to some other code that requires a List.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
1

If resizing is not important to you then implementing your own list might be fast. It might also be buggy. It would at least be interesting to benchmark compared to Java's lists. One strange effect that you might see is that standard lists might be optimised by the JIT sooner, as they could be used internally by Java's standard library.

Here is my attempt, although I suggest you don't use it. Use a standard list implementation instead.

import java.util.*;

public class FastListOfNullsDemo {
    public static void main(String[] args) {
        Set<Integer>[] arr = new Set[100_000]; // all set to null by default.
        List<Set<Integer>> myList = new ArrayBackedList<>(arr);
        
        myList.set(3, new TreeSet<Integer>());
        
        myList.get(3).add(5);
        myList.get(3).add(4);
        myList.get(3).add(3);
        myList.get(3).add(2);
        myList.get(3).add(1);
        
        // Let's just print some because 100,000 is a lot!
        for (int i = 0; i < 10; i++) {
            System.out.println(myList.get(i));
        }
    }
}

class ArrayBackedList<T> extends AbstractList<T> {
    private final T[] arr;
    
    ArrayBackedList(T[] arr) {
        this.arr = arr;
    }
    
    @Override
    public T get(int index) {
        return arr[index];
    }

    @Override
    public int size() {
        return arr.length;
    }
    
    @Override
    public T set(int index, T value) {
        T result = arr[index];
        arr[index] = value;
        return result;
    }
}

Another possibility would be implementing an always-null, fixed-size list. Use that to initialise the ArrayList. I won't promise that it is fast but you could try it out.

import java.util.*;

public class FastListOfNullsDemo {
    public static void main(String[] args) {
        List<Set<Integer>> allNull = new NullList<>(100_000);
        List<Set<Integer>> myList = new ArrayList<>(allNull);
        
        myList.set(3, new TreeSet<Integer>());
        
        myList.get(3).add(5);
        myList.get(3).add(4);
        myList.get(3).add(3);
        myList.get(3).add(2);
        myList.get(3).add(1);
        
        System.out.println(myList.size());
        // Let's just print some because 100,000 is a lot!
        for (int i = 0; i < 10; i++) {
            System.out.println(myList.get(i));
        }
    }
}

class NullList<T> extends AbstractList<T> {
    private int count;
    
    NullList(int count) {
        this.count = count;
    }
    
    @Override
    public T get(int index) {
        return null;
    }

    @Override
    public int size() {
        return count;
    }
}
Chris Foley
  • 454
  • 3
  • 5
  • Thanks! I ended up using an array of n HashSets so I will follow your suggestion and not use it. I have edited the question to be more general for the benefit of future readers. – RipeGorilla Feb 09 '21 at 12:17