I have to parse an XLS file with PHP. The file is generated by some other code, and it seems to be poorly written.

I've tried parsing it with PHPExcel, letting it detect the file type automatically:

$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
echo 'filetype: '.$inputFileType.'<br>';
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($inputFileName);

Which returns:

filetype: CSV

The file is opened, but it is not read correctly: the data is not recognized properly, the content does not end up in the right cells, and some cells give errors. I've tried all the other PHPExcel file types, and every one of them returns an error.

I've opened it in a text editor (Notepad++) and the file is binary, not a simple CSV. The extension is XLS, but since the file is written by a script, the extension cannot be relied on to identify the actual format version.
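One way to check what the file really is, independent of the extension, would be to look at its first bytes. The sketch below is only an illustration: `sniffSpreadsheetFormat` is a hypothetical helper (not part of PHPExcel), and it only knows three signatures: the OLE2 compound-file header used by Excel 97-2003, the ZIP header used by xlsx, and a bare BIFF `BOF` record, which some scripts write directly without the OLE2 container.

```php
<?php
// Hypothetical helper: guess the container format from the leading bytes
// of the file instead of trusting the .xls extension.
function sniffSpreadsheetFormat(string $path): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $head = (string) fread($handle, 8);
    fclose($handle);

    // OLE2 compound file: the container used by Excel 97-2003 workbooks.
    if ($head === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'ole2';
    }
    // ZIP local file header: an xlsx (or ods) saved with the wrong extension.
    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    // Bare BIFF stream: starts with a BOF record whose id is
    // 0x0009, 0x0209, 0x0409 or 0x0809 (stored little-endian).
    if (strlen($head) >= 2
        && $head[0] === "\x09"
        && in_array($head[1], ["\x00", "\x02", "\x04", "\x08"], true)
    ) {
        return 'raw-biff';
    }
    // Anything else: return the bytes as hex for manual inspection.
    return 'unknown:' . bin2hex($head);
}
```

Calling `sniffSpreadsheetFormat($inputFileName)` before choosing a reader would at least show whether the file is a stream without the usual container, which might explain why it gets identified as CSV.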

If I open the file with Excel, it opens fine and I can save it in another format (for example as a new xlsx file); after that I can read it correctly.

Thinking it might be encoded in some very old format, I also tried another library, SimpleExcel, and got this error: File extension XLS doesn't match with xml

Is there a way to "correct" the format before parsing it?

Tiziano Mischi
  • You'd either get lucky with the newer PhpSpreadsheet, or will have to pre-convert it with Excel then, or salvage it via headless [soffice](https://stackoverflow.com/questions/37825864/converting-xls-to-semicolon-delimited-csv-with-soffice-c) if it really needs automation. – mario Apr 12 '20 at 09:38
  • I have to fully automate the process. Can I convert it somehow via the command line? – Tiziano Mischi Apr 12 '20 at 09:39
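
The headless conversion mentioned in the comments could be scripted from PHP roughly as follows. This is a sketch under assumptions, not a tested fix for this particular file: it assumes LibreOffice is installed with `soffice` on the PATH, and that LibreOffice can open the file at all. `buildConvertCommand` and `convertToXlsx` are hypothetical helper names.

```php
<?php
// Hypothetical helper: build the LibreOffice command that re-saves a
// spreadsheet as xlsx. Assumes a `soffice` binary is on the PATH.
function buildConvertCommand(string $inputFile, string $outputDir): string
{
    return sprintf(
        'soffice --headless --convert-to xlsx --outdir %s %s',
        escapeshellarg($outputDir),
        escapeshellarg($inputFile)
    );
}

// Hypothetical helper: run the conversion and return the converted path.
// LibreOffice writes <input base name>.xlsx into the output directory.
function convertToXlsx(string $inputFile, string $outputDir): string
{
    exec(buildConvertCommand($inputFile, $outputDir) . ' 2>&1', $output, $exitCode);
    $converted = $outputDir . '/' . pathinfo($inputFile, PATHINFO_FILENAME) . '.xlsx';
    if ($exitCode !== 0 || !is_file($converted)) {
        throw new RuntimeException("Conversion failed:\n" . implode("\n", $output));
    }
    return $converted;
}
```

The path returned by `convertToXlsx($inputFileName, sys_get_temp_dir())` could then be passed to `PHPExcel_IOFactory::identify()` and `load()` as in the question.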
