8

Looking for help with this issue.

I have an excel file of size 5.7 MB, when I save the file as csv it will increase to 14.3 MB! When I click on the created csv file and save it again as csv, the size will be reduced to 1KB!

My question, is there a way to save my Excel file directly to csv and strip out all formatting so that it will be save as 1KB csv file?

Thanks in advance.

Asma
  • 133
  • 1
  • 1
  • 6
  • 2
    There's no formatting in a CSV file. Your Excel file may be smaller then the first CSV because the xlsx format is really a ZIP archive, and so has some level of compression. Can't explain what is going on with the second save. – Tim Williams Oct 01 '17 at 06:38

4 Answers4

20

A CSV file will often be larger than the XLSX it was created from. This is because in XLSX is a actually a compressed (zipped) file - you can unzip it with a standard compression tool and check it out for yourself.

You will see smaller XLSX files if there is a lot of repeat data. XLSX actually pulls out every text value, stores it in a lookup table, and then replaces it with a smaller reference number to the lookup table. This means that it only has to use up space once per text string (a little more than "once" because of the references, but still much less space).

A CSV file will list every occurrence in full which takes up much more space. I've seen files balloon to 10x the size after saving to CSV. The only time you should expect a savings with a CSV is if there is a lot of formatting and not too much reused text. Then the CSV strips out the formatting and the final product is smaller.

In reference to your 1KB file, that's a placeholder temporary file while the OS loads data to the final version. After the save is complete, it will be the same size as the other CSV. Either that, or the file is corrupt.

ProfessorFluffy
  • 331
  • 3
  • 9
  • Could it be because of the different ways Excel and CSV store numeric values? I guess Excel can treat numbers as float value, whereas CSV always stores the text. My file has no repeated string values, but a large amount of numeric values. – Xinlu Mar 22 '23 at 02:48
2

I had same problem, my original file size was about 250KB, after some processing in Excel, it increased to about 5.5 MB and then when I save as CSV, it went to 184MB. Then I have discovered that I was somehow saving all >milloin rows in the Excel sheet and that was making file size enormous. When I select only the data rows/columns and copy/paste into another sheet, then the file size became 115KB.

  • Thank you for this comment, I had been troubleshooting this for quite a while before I came across your answer and it resolved my issue. Do you know what causes it to try and save all rows instead of only the ones with actual content? – Kender Feb 07 '23 at 15:22
0

Doing File > Save As and choosing .csv as the file type will always save it with all formatting stripped out.

Saving it as a 1KB file doesn't make sense. You may just be seeing the size of the file placeholder when it is first created and before windows explorer has refreshed to show the full filesize after it is fully saved.

When you open the 1KB file, does it actually contain all of your data? What does the properties of the file say when you right click on the file in windows explorer?

Michael
  • 4,563
  • 2
  • 11
  • 25
-2

If the Excel file is an XLS file, than the CSV file should be significantly smaller than the Excel file, but I doubt that so small as 1KB. It should be around 1-2MB.

If the Excel file is an XLSX file, than the CSV file should be a little smaller than the Excel file.

The numbers assumes that the Excel files does not contain images and the data in cells is dominant.

The size of the CSV also depends by the encoding of the file. Check also the encoding for the two CSV files that you have.

Later edit: If you don't want to export the last rows, you can try to export only a range of cells. You have 2 options:
1. Export a range using VBA
2. Create a new sheet, create formulas that points to the first sheet, export this new sheet as CSV.

alex.pulver
  • 2,107
  • 2
  • 31
  • 31
  • Hi alex, I noticed that the saved CSV file has several empty rows because I added some formulas to the column. Is there a way to avoid exporting these lines? – Asma Oct 03 '17 at 05:10
  • Maybe to export only a range of cells. Check my edit of the answer. – alex.pulver Oct 03 '17 at 14:27
  • @alex.pulver I've tested converting xls/x to csv using apache poi library. Always the size of csv is either equal or greater than xls/x file size. Sometimes it is going to 10x size also. Its again based on the data(repetative) in the file. – Pavan Feb 07 '22 at 04:16