I am trying to solve a problem where I have a large CSV file with the structure below.

Dataset: order_id,product_id,add_to_cart_order,reordered

I want a list of product_ids for each order_id, so while reading the dataset I am building a HashMap (Map<order_id, HashSet<product_id>>), keeping both order_id and product_id as Strings. When I try to populate this map I get a "GC overhead limit exceeded" error.

I know this is not an optimized solution, so please suggest a better approach. The dataset contains around 90K entries.
Map<String, Set<String>> orderProductMap = new HashMap<>();
File file = new File(csvFile);
CsvReader csvReader = new CsvReader();
try (CsvParser csvParser = csvReader.parse(file, StandardCharsets.UTF_8)) {
    CsvRow row;
    while ((row = csvParser.nextRow()) != null) {
        // field 0 = order_id, field 1 = product_id
        if (!orderProductMap.containsKey(row.getField(0))) {
            orderProductMap.put(row.getField(0), new HashSet<>());
        }
        orderProductMap.get(row.getField(0)).add(row.getField(1));
    }
}
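For reference, the map-population step can also be written with `Map.computeIfAbsent`, which creates the set on first sight of an `order_id` and avoids the separate containsKey/put/get lookups. Below is a minimal, self-contained sketch of that idiom; the class name `OrderProducts` and the in-memory sample rows are made up for illustration and stand in for the CSV parsing:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OrderProducts {

    // Groups product IDs by order ID; each row is {order_id, product_id, ...}.
    static Map<String, Set<String>> group(List<String[]> rows) {
        Map<String, Set<String>> orderProductMap = new HashMap<>();
        for (String[] row : rows) {
            // computeIfAbsent creates the HashSet once per order_id,
            // then returns the existing set on later rows.
            orderProductMap.computeIfAbsent(row[0], k -> new HashSet<>())
                           .add(row[1]);
        }
        return orderProductMap;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[]{"1", "49302"},
                new String[]{"1", "11109"},
                new String[]{"2", "33120"});
        Map<String, Set<String>> m = group(rows);
        System.out.println(m.get("1").size()); // prints 2
    }
}
```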