Context: I am reading Excel files in a certain format using the Apache POI library. Each file has a single sheet and follows a fixed template. I am able to read the sheet, perform some manipulation on the values, store them as a POJO, and then convert them to XML using a JAXB implementation.
Problem: I am reading only a few Excel files (say 100) for now, but I want to design my application so that it is scalable enough to read around 1000 to 10000 files. Can you suggest a good architecture for this? Also, should I be using multithreading (say, a thread pool of 10 threads) to read 10 sheets at once, or would that be a bad design, considering that each sheet holds separate data that is not interlinked with any other sheet?
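To make the question concrete, this is the kind of design I have in mind. The `processFile` worker below is a hypothetical stand-in for my proprietary POI-plus-JAXB logic; everything else is just the plain `java.util.concurrent` thread-pool pattern:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchConverter {

    // Hypothetical worker: in my real code this would open the workbook
    // with POI, build the POJO, and marshal it to XML with JAXB.
    // Here it only returns the name of the XML file it would produce.
    static String processFile(Path excelFile) {
        return excelFile.getFileName() + ".xml";
    }

    public static void main(String[] args) throws Exception {
        // Pretend we discovered 100 input files.
        List<Path> files = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            files.add(Paths.get("sheet" + i + ".xlsx"));
        }

        // One task per file; 10 files processed concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(10);
        List<Future<String>> results = new ArrayList<>();
        for (Path f : files) {
            results.add(pool.submit(() -> processFile(f)));
        }
        for (Future<String> r : results) {
            r.get(); // blocks until that file is done; rethrows any failure
        }
        pool.shutdown();
    }
}
```

Since the files are independent, my assumption is that each file can be one task and the only shared state is the pool itself, but I would like to know whether this is the right direction before scaling it up.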
Note: I cannot share any code snippets since that is proprietary code, but for the sake of discussion, assume each sheet has 50 rows and each row has 6 to 10 columns, with plain text in every cell. Since the files are small, I am loading each file entirely into memory and then processing it. I am using Apache POI to iterate through the rows and columns (sample below):
try (FileInputStream fileInputStream = new FileInputStream(file);
     XSSFWorkbook workbook = new XSSFWorkbook(fileInputStream)) {
    XSSFSheet sheet = workbook.getSheetAt(0);
    // outer loop over all rows
    for (int i = sheet.getFirstRowNum(); i <= sheet.getLastRowNum(); i++) {
        XSSFRow row = sheet.getRow(i);
        // inner loop over all columns in a row
        for (int j = row.getFirstCellNum(); j < row.getLastCellNum(); j++) {
            String value = row.getCell(j).getStringCellValue();
            // use 'value' as and when required
        }
    }
}
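I also cannot post the JAXB part, but to show the shape of the output: each sheet becomes one XML document, roughly like the sketch below. This uses the stdlib StAX writer purely as a stand-in for my actual JAXB marshalling, and the element names are made up:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class SheetXmlSketch {

    // Stand-in for the JAXB step: takes the cell values of one sheet
    // and writes them as <sheet>/<row>/<cell> elements.
    // (Hypothetical element names; my real schema is proprietary.)
    static String toXml(String[][] rows) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newFactory().createXMLStreamWriter(out);
        w.writeStartDocument();
        w.writeStartElement("sheet");
        for (String[] row : rows) {
            w.writeStartElement("row");
            for (String cell : row) {
                w.writeStartElement("cell");
                w.writeCharacters(cell);
                w.writeEndElement();
            }
            w.writeEndElement();
        }
        w.writeEndElement();
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(new String[][] {{"a", "b"}}));
    }
}
```

So the end-to-end unit of work is: one .xlsx in, one XML string (or file) out, with no dependency on any other file.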
P.S. This is my first question on SO, so please feel free to suggest any changes/improvements to my question.
Thanks and Regards, Sid