I have a list of strings (["A", "B", ...])
and a list of sizes ([4,7,...])
. I would like to sample without replacement from the set of strings where initially string in position i
appears sizes[i]
times. I have to perform this operation k
times. Clearly, if I pick string i
, then sizes[i]
decreases by 1. The current naive solution that I developed is to generate the entire input set, shuffle it, and iteratively pop the first element of the array. This is clearly inefficient since if a string appears 1 million times I would have to generate 1 million entries.
public static void main(String[] args) {
String[] elems = { "A", "B", "C", "D", "E" };
Integer[] sizes = { 10, 5, 4, 7, 3 };
int k = 3;
ArrayList<String> bag = new ArrayList<>();
for (int i = 0; i < elems.length; i++) {
for (int j = 0; j < sizes[i]; j++) {
bag.add(elems[i]);
}
}
Collections.shuffle(bag);
for (int i = 0; i < k; i++) {
System.out.println(bag.remove(0));
}
}
Is there a better and more efficient way to perform this operation? Thanks.