I've been reading the previous answers to this question and trying the code. None have helped so far. All of them result in a GC Out of Memory error.
I am running Eclipse 2019-09 on a Windows 10 PC with 12 GB RAM. In Eclipse I deployed a WildFly 12 internal server. I have JDK 1.8.0 152. In the WildFly startup line, I changed the option from -Xmx512m to -Xmx2048m.
I tried the simple POI. Also added monitorjbl. I then removed everything from the code but still can't read large files. My XLSX has 12 sheets, each with 500K rows. Each row is of the same format: 1 serial number, 2 (string) id's with numbers and dashes, and a (string) name column.
The program runs for about 12 minutes before giving the memory error.
import com.monitorjbl.xlsx.StreamingReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
/**
* Servlet implementation class StreamImport
*/
@WebServlet(description = "Import data using XLSX stream", urlPatterns = { "/StreamImport" })
public class StreamImport extends HttpServlet {
private static final long serialVersionUID = 1L;
/**
* @see HttpServlet#HttpServlet()
*/
public StreamImport() {
super();
// TODO Auto-generated constructor stub
}
/**
* @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
*/
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
final String strFile = "C:\\Users\\Hussain\\Downloads\\2019819515128395ATL_IT1.xlsx";
Integer iRows = 0;
PrintWriter out = response.getWriter();
out.println("<html><body>");
out.println("<h1>Importing Using Stream</h1>");
out.println("<p>Input Excel File: " + strFile + "</p>");
// Open file
FileInputStream fisFile = new FileInputStream(new File(strFile));
//Create Workbook instance holding reference to .xlsx file
Workbook workbook = StreamingReader.builder()
.rowCacheSize(100) // number of rows to keep in memory (defaults to 10)
.bufferSize(12000) // buffer size to use when reading InputStream to file (defaults to 1024)
.open(fisFile); // InputStream or File for XLSX file (required)
//Get first/desired sheet from the workbook
Sheet sheet = workbook.getSheetAt(0);
for(Row row : sheet) {
iRows++;
}
workbook.close();
fisFile.close();
out.println("Read " + iRows.toString() + " rows <br>");
out.println("</body></html>");
}
}
Some of the answers say that the monitorjbl is no longer needed as the SXSSF method does its own streaming. Others say that SXSSF is only for writing.
Edit: Yes, as I said, I know this is a duplicate question. First, I don't know if the monitorjbl solution has been superseded by SXSSF. And, secondly, after following other solutions, I am still getting out of memory errors.