I am trying to parse a PDF file into an Excel file, using Apache PDFBox to extract the text and Apache POI to write the workbook:
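import java.io.File;
import java.io.FileOutputStream;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Load the PDF
PDDocument document = Loader.loadPDF(new File("Example.pdf"));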
// Extract the text from the PDF
PDFTextStripper stripper = new PDFTextStripper();
String text = stripper.getText(document);
// Create a new Excel workbook
XSSFWorkbook workbook = new XSSFWorkbook();
// Create a new sheet in the workbook
XSSFSheet sheet = workbook.createSheet("PDF Data");
// Split the text into rows and columns
String[] rows = text.split("\n");
for (int i = 0; i < rows.length; i++) {
    String[] cols = rows[i].split("\t");
    Row row = sheet.createRow(i);
    for (int j = 0; j < cols.length; j++) {
        Cell cell = row.createCell(j);
        cell.setCellValue(cols[j]);
    }
}
// Save the Excel file; try-with-resources closes the stream even if write() fails
try (FileOutputStream fileOut = new FileOutputStream("outputFile.xlsx")) {
    workbook.write(fileOut);
}
workbook.close();
// Close the PDF document
document.close();
My PDF file contains a table, but this code does not put the data into the correct columns. Instead of a row like this:
| name | some data | some data 2| some data 3 | some data 4 | some data 5 |
+----------+----------------+------------+-------------+-------------------+--------------+
the code inserts every value into a single column, one value per row:
name
some data
some data 2
some data 3
some data 4
some data 5
Please help me distribute the data across several columns instead of one.
My end goal is to write the data from the Excel file to a database.
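One possible workaround, sketched below, is to regroup the extracted lines into rows before creating any cells. It assumes that every cell arrives on its own line (as in the output above) and that the table has a fixed, known number of columns (six in my example). The class name `CellGrouper` and the method `toRows` are hypothetical names made up for this illustration; they are not part of PDFBox or POI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of PDFBox or POI.
class CellGrouper {

    /**
     * Groups a flat list of cell values into rows of columnCount cells each.
     * Values are trimmed, blank lines are skipped, and a trailing partial
     * row (fewer than columnCount cells) is kept as-is.
     */
    static List<List<String>> toRows(List<String> cells, int columnCount) {
        if (columnCount <= 0) {
            throw new IllegalArgumentException("columnCount must be positive");
        }
        List<List<String>> rows = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String cell : cells) {
            String value = cell.trim();
            if (value.isEmpty()) {
                continue; // ignore blank lines in the extracted text
            }
            current.add(value);
            if (current.size() == columnCount) {
                rows.add(current);          // row is complete
                current = new ArrayList<>(); // start the next row
            }
        }
        if (!current.isEmpty()) {
            rows.add(current); // keep an incomplete last row rather than drop data
        }
        return rows;
    }
}
```

Each inner list would then map to one `sheet.createRow(r)` call, with one `row.createCell(c)` per value. This only holds if no cell in the PDF is empty and no cell wraps onto a second line; otherwise the values shift into the wrong columns.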