    public List<ArrayList<String>> removeRow(int columnIndex, Set<String> masterData, List<ArrayList<String>> rowColumnData) {

        List<ArrayList<String>> finalData = new ArrayList<ArrayList<String>>();

        for (ArrayList<String> data : rowColumnData) {
            String columnVal = data.get(columnIndex);
            // Keep the row only if its column value is present in masterData.
            if (masterData.contains(columnVal)) {
                finalData.add(data);
            }
        }

        return finalData;
    }

I need to filter out rows whose value in a specific column does not match any value in a given set. My masterData contains 30,000 records. My rowColumnData will hold 2M records; each row is an ArrayList of its column values, and the entire table is a List<ArrayList<String>>.

How can I write this with the Stream API in a way that gives better performance?

1 Answer


I'm not an expert on stream performance, but you can do this with parallelStream, where multiple threads process the data. Parallel streams are not always faster, so it is worth reading "Java 8's streams: why parallel stream is slower?" first.

List<List<String>> result = rowColumnData
                            .parallelStream()
                            .filter(l->masterData.contains(l.get(columnIndex)))
                            .collect(Collectors.toList());

But be aware of the exceptions these calls can throw:

List.get(int index) throws IndexOutOfBoundsException if the index is out of range (index < 0 || index >= size()).

Set.contains(Object o) throws NullPointerException if the specified element is null and this set does not permit null elements (optional).
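To avoid the two exceptions above, the filter can check each row before reading the column. The following is only a minimal sketch: the class name RowFilter and the method name keepMatchingRows are made up for illustration, and dropping rows that are null, too short, or hold null in the column is an assumption, since the question does not say how such rows should be handled. It uses a sequential stream; whether switching to parallelStream() helps on 2M rows would need to be measured on the real data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original question or answer.
class RowFilter {

    // Keeps only rows whose value at columnIndex is present in masterData.
    // Assumption: rows that are null, too short, or hold null in that column
    // are dropped instead of causing an exception.
    static List<ArrayList<String>> keepMatchingRows(int columnIndex,
                                                    Set<String> masterData,
                                                    List<ArrayList<String>> rowColumnData) {
        return rowColumnData.stream()
                // Guard against IndexOutOfBoundsException from List.get.
                .filter(row -> row != null
                        && columnIndex >= 0
                        && columnIndex < row.size())
                // Guard against NullPointerException from Set.contains.
                .filter(row -> {
                    String value = row.get(columnIndex);
                    return value != null && masterData.contains(value);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> masterData = new HashSet<>(Arrays.asList("A1", "B2"));

        List<ArrayList<String>> rows = new ArrayList<>();
        rows.add(new ArrayList<>(Arrays.asList("A1", "x")));
        rows.add(new ArrayList<>(Arrays.asList("C3", "y")));
        rows.add(new ArrayList<>());                          // too short, dropped
        rows.add(new ArrayList<>(Arrays.asList("B2", "z")));

        // Prints [[A1, x], [B2, z]]
        System.out.println(keepMatchingRows(0, masterData, rows));
    }
}
```

If masterData is a HashSet, each contains call takes expected constant time, so the work is one pass over rowColumnData regardless of the 30,000 entries in the set.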

Ryuzaki L
  • Can you explain how would parallel streams used in the current case help "improve the performance"? – Naman Apr 21 '19 at 19:12