10

I have a large xlsx file that is 90MB using phpexcel it is giving me

Warning: simplexml_load_string(): Memory allocation failed : growing buffer

I tried to load the file using every methods documented here, and also changed php.ini memory_limit = -1 .

I am trying to convert the xlsx file to a csv file so it can be easily loaded.

Is there any way to convert xlsx file to csv without using phpexcel?

DrakaSAN
  • 7,673
  • 7
  • 52
  • 94
snehal
  • 429
  • 5
  • 11
  • 25
  • Its huge file, You can use Office Excel itself to save file as csv if its only one file – Saqueib Nov 21 '13 at 05:40
  • i want to write script to load this file contain and put it to mysql database. – snehal Nov 21 '13 at 05:44
  • 90MB is to much, you can split this file in parts that way you can use PHPExcel is the way to go – Saqueib Nov 21 '13 at 05:49
  • you can try putting ini_set('max_execution_time', 0); before any code, don't forget to remove after you done importing – Saqueib Nov 21 '13 at 05:52
  • if i split then is there chance of loosing data? – snehal Nov 21 '13 at 06:00
  • @ Saquib can you please tell me how to split file contain.so it can read all contain of the file. – snehal Nov 21 '13 at 06:52
  • see this link http://www.extendoffice.com/documents/excel/628-excel-split-workbook.html, i would go with VBA Macro – Saqueib Nov 21 '13 at 07:03
  • @ Saquib thank you for the link. can you please tell me how to give the file path in this macro or ThisWorkbook.Path i have to change.and i also want to know that how to schedule this macro so that every hour it split my file. then my other php script read it and load in mysql. – snehal Nov 21 '13 at 07:15
  • @ Saquib for ThisWorkbook.Path i given my xlsx file path that is "D:\Data" it is creating 3 sheets but all files are empty.please can you help me on this. – snehal Nov 21 '13 at 07:28
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/41598/discussion-between-saquib-and-snehal) – Saqueib Nov 21 '13 at 07:33

4 Answers4

3

You can use python:

wb = xlrd.open_workbook(os.path.join(filepath, 'result.xls'))
sheet = wb.sheet_by_index(0)
fp = open(os.path.join(filepath, 'result.csv'), 'wb')
wr = csv.writer(fp, quoting=csv.QUOTE_ALL)
for rownum in xrange(sheet.nrows):
  wr.writerow([unicode(val).encode('utf8') for val in sheet.row_values(rownum)])
hd1
  • 33,938
  • 5
  • 80
  • 91
2

XLSX files are compressed zip files. If you decompress your XLSX file, look at the folder xl/worksheets, which contains a xml file for each sheet of the file.

You may want to extract these XML files first and then parse the (xml) content, element by element, so that the buffer to get each xml element does not need to be so big. This way, you can make your own script in php to read the extracted file, or use some xml parser, to transform the sheets into xml objects and them dump your csv.

The structure of the resulting xml is something like this example (the important information is inside sheetData):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
    <dimension ref="A1:J42"/>
    <sheetViews>
        <sheetView workbookViewId="0">
            <selection activeCell="C7" sqref="C7"/>
        </sheetView>
    </sheetViews>
    <sheetFormatPr defaultRowHeight="12.75" x14ac:dyDescent="0.2"/>
    <cols>
        <col min="1" max="1" width="18.140625" style="1" customWidth="1"/>
        <col min="2" max="16384" width="9.140625" style="1"/>
    </cols>
    <sheetData>
        <row r="1" spans="1:10" x14ac:dyDescent="0.2">
            <c r="B1" s="1" t="s"><v>0</v></c>
            <c r="C1" s="1" t="s"><v>1</v></c>
            <c r="D1" s="1" t="s"><v>2</v></c>
        </row>
        <row r="2" spans="1:10" x14ac:dyDescent="0.2">
            <c r="A2" s="1" t="s"><v>4</v></c><c r="B2" s="1"><v>200</v></c>
            <c r="C2" s="1"><v>200</v></c>
            <c r="D2" s="1"><v>100</v></c><c r="E2" s="1"><v>200</v></c>
        </row>
        <row r="3" spans="1:10" x14ac:dyDescent="0.2">
            <c r="A3" s="1" t="s"><v>10</v></c><c r="C3" s="1"><f>6*125</f><v>750</v></c>
            <c r="H3" s="1" t="s"><v>6</v></c><c r="I3" s="1"><v>130</v></c>
        </row>
    </sheetData>
    <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
    <pageSetup paperSize="0" orientation="portrait" horizontalDpi="0" verticalDpi="0" copies="0"/>
</worksheet>

That is, you need to look at each cell (c tag) of each row (row tag) the xml has:

worksheet.sheetData.row[i].c[j].v

and take the content of the value (v tag).

Eduardo Poço
  • 2,819
  • 1
  • 19
  • 27
  • Not the best answer in my imagination, but at least this provide some hints in solving the problem by reading line by line. Parse as XML content is not possible for my case as even all the RAM in my computer is not enough for opening the file, so reading as string using PHP and then parse is not possible. – cytsunny Aug 10 '16 at 09:21
  • 1
    There can be an hybrid approach: you locate the begining of sheetdata by char reading and then parse one row element by read, until it reaches the end. I know that there are java xml parsers that can read one tag by time, so I think there may be some analog parser for PHP too. – Eduardo Poço Aug 10 '16 at 14:31
1

Online converter up to 100MB file size:

http://www.zamzar.com/convert/xlsx-to-csv/

3 way's tutorial:

http://www.ehow.com/how_6921725_convert-xlsx-file-csv.html

hope this helps...

0

You can do this with ; with the below, you will scan A1 -> A10 and export the 5 first columns of the "DATA" tab of current workbook.

Sub exportCSV()

  Dim wkRange As Range
  Dim cpSheet As Worksheet

  Dim myPath As String, myFileName As String
  Dim fn As Integer ' File number
  Dim cLine As String ' current line to be writen to file

  ' create output file:
  myPath = "C:\local\"
  myFileName = "out.csv"
  fn = FreeFile
  Open myPath & myFileName For Append As #fn
  Set wkRange = ThisWorkbook.Sheets("DATA").Range("$A1:$A10")
  For Each c In wkRange
  ' select your columns with "offset"
    cLine = c.Offset(0, 0).Value & ","
    cLine = cLine & c.Offset(0, 1).Value & ","
    cLine = cLine & c.Offset(0, 2).Value & ","
    cLine = cLine & c.Offset(0, 3).Value & ","
    cLine = cLine & c.Offset(0, 4).Value
    Print #fn, cLine
  Next
  Close #fn
  MsgBox "done!"

End Sub
J. Chomel
  • 8,193
  • 15
  • 41
  • 69
  • How to run a DBA script without opening Excel? – cytsunny Aug 04 '16 at 05:40
  • (its a vba, not DBA script); you could search, and find its not really possible: https://search.lilo.org/searchweb.php?q=run%20excel-vba%20macro%20without%20opening%20excel > http://stackoverflow.com/questions/12759229/running-macros-without-opening-excel – J. Chomel Aug 04 '16 at 06:11