I am working on a site that allows teachers to upload documents and students to download them. However, there is a problem: Microsoft Word (.docx) files download perfectly, but when I download an Excel (.xlsx) file, Excel shows a "This file is corrupt and cannot be opened" dialog. Any help with this would be greatly appreciated!

My download code is as follows:

case 'xlsx':

    header('Content-type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet');
    header('Content-Disposition: attachment; filename="' . $filename . '"');
    header('Content-Transfer-Encoding: binary');
    header('Expires: 0');
    header('Pragma: no-cache');
    readfile('./uploads/resources/courses/' . $filename);

break;
cbenjafield
  • Is any whitespace being output, e.g. before the opening `<?php` tag? Are there any warnings or notices? Those will corrupt the file. Download the file from an FTP source and open it to verify that the original file is not corrupt. – MrCode Dec 18 '12 at 17:20
  • What does the downloaded file look like? Is it empty? Is it cut off? In the latter case you could try sending a `Content-Length` header. Another issue might be that the download takes too long and PHP times out. Use `set_time_limit()` to raise the timeout. – Halcyon Dec 18 '12 at 17:20
  • What happens after this `switch` block? Usually you `exit;` after you call `readfile`. – gen_Eric Dec 18 '12 at 17:25
  • Not that I can see in regard to the whitespace. The file opens fine if I open it from the uploads folder on the server. No warnings or notices are produced. If there was any whitespace, wouldn't it affect the docx files too? – cbenjafield Dec 18 '12 at 17:25
  • There's an exit after the switch block. I tried adding one after the readfile(), but the file is still corrupt. – cbenjafield Dec 18 '12 at 17:28
  • Whitespace won't affect the file on the server... it's whitespace, a BOM, or echoed messages/warnings/errors from your script that get sent to php://output alongside the file itself that are the most likely cause – Mark Baker Dec 18 '12 at 17:28
  • Try to open the file generated by PHP in a hex editor, and compare it with the original file. – gen_Eric Dec 18 '12 at 17:37
  • Really confusing now! Notepad++ says that both files match! – cbenjafield Dec 18 '12 at 17:42
  • @cbenjafield: Are you sure this only happens with this `case` statement? – gen_Eric Dec 18 '12 at 17:44
  • Yes, I've only tried docx and xlsx, but the docx with the docx MIME type works fine! – cbenjafield Dec 18 '12 at 17:59
  • How did you compare the files using Notepad++? I do not think it has a binary comparison utility. Use diff instead; it will at least show you whether the files differ or not. – dualed Dec 18 '12 at 19:11
  • I used the compare plugin... – cbenjafield Dec 18 '12 at 20:06
  • Oh, Diff has proved there is nothing in the downloaded file! – cbenjafield Dec 18 '12 at 20:12
  • I had this issue and placing `exit;` after `readfile();` solved it! – Nick Rolando Oct 15 '13 at 16:35

6 Answers

I had this problem, and the cause was a BOM (byte order mark).

How to notice it

unzip: Checking the output file with unzip, I saw a warning at the second line.

$ unzip -l file.xlsx 
Archive:   file.xlsx
warning file:  3 extra bytes at beginning or within zipfile
...

xxd (hex viewer): I inspected the first 5 bytes with the following command:

head -c5 file.xlsx | xxd -g 1
0000000: ef bb bf 50 4b                             ...PK

Notice the first 3 bytes, ef bb bf: that's the UTF-8 BOM! The file should start with 50 4b (PK), the ZIP signature.

Why?

Maybe a PHP file saved with a BOM, or earlier output from a library.

You have to find which file or command emits the BOM. In my case I didn't have time to track it down, so I worked around it with output buffering.

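<?php
ob_start(); // buffer all output, including a stray BOM from an include

// ... code, includes, etc

ob_get_clean(); // discard whatever was buffered so far (the BOM)
// headers ...
readfile($file);
exit; // stop so nothing else is appended to the download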
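
To track down where the BOM comes from, one option is to scan the project's PHP files for the three bytes ef bb bf at the start. This is only a minimal sketch: `hasUtf8Bom()` and `findFilesWithBom()` are hypothetical helper names, not part of PHP or any library, and the scan assumes the culprit is a `.php` file rather than output generated at runtime.

```php
<?php
/** Hypothetical helper: true if the file at $path starts with the UTF-8 BOM (EF BB BF). */
function hasUtf8Bom(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable files are simply skipped
    }
    $firstBytes = fread($handle, 3);
    fclose($handle);
    return $firstBytes === "\xEF\xBB\xBF";
}

/** Hypothetical helper: recursively lists .php files under $dir that start with a BOM. */
function findFilesWithBom(string $dir): array
{
    $matches = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()
            && $file->getExtension() === 'php'
            && hasUtf8Bom($file->getPathname())
        ) {
            $matches[] = $file->getPathname();
        }
    }
    return $matches;
}
```

Calling `print_r(findFilesWithBom(__DIR__));` from the project root would list the candidates; re-saving those files as UTF-8 without a BOM should then remove the extra bytes at the source.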
Felipe Buccioni

This works fine on my local XAMPP setup regardless of the extension, so from my point of view no case statement is needed, unless I'm missing something.

I've tested with docx, accdb, xlsx, mp3, anything...

$filename = "equiv1.xlsx";

header('Content-type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Pragma: no-cache');
cristi _b
  • Thanks, but that gives me a longer error message saying that the file is not in the right format! – cbenjafield Dec 18 '12 at 19:03
  • How do you store your files? Do you change their names or extensions? If you try to open a file that you know for sure is xlsx and it gives that error, OK, but how do you save files on your server in the first place? – cristi _b Dec 18 '12 at 19:06
  • OK, the user uploads the file; the script takes `$_FILES['resource']['name']`, removes the extension, replaces spaces with underscores, adds a four-digit number, and adds the extension back. The file is then moved from tmp_name to the new path using `move_uploaded_file()`. However, when I go into the upload folder on the server, the newly uploaded file opens fine; it only breaks when it is downloaded. – cbenjafield Dec 18 '12 at 19:22
  • I have removed the processing of the filename, so it now just stores the original file in the folder. I also removed the switch block, but the error has reverted to the short message. – cbenjafield Dec 18 '12 at 20:00
  • the new code is simply as you wrote, however, I was thinking that the actual path of the file is not being stated with that anywhere. The filename will be 'excel1.xlsx' but that's not the path... – cbenjafield Dec 18 '12 at 20:17
  • Well, that depends on where you store your files. I have an app where I store PDF files in a folder called files under the site folder, so my path would be files/file.xlsx, given that the script runs in site. Think of it like this: /site/file_download.php sends the HTTP headers to download /site/files/file.xlsx – cristi _b Dec 18 '12 at 20:36
  • So where would the path header go/what would it be? because that was where the read file came in handy! – cbenjafield Dec 18 '12 at 20:46
  • That depends on where you store your files; you should know these details. – cristi _b Dec 18 '12 at 20:47
  • The file is uploaded to a folder, /uploads/resources/file.xlsx, but the download script is located within an MVC framework (CodeIgniter), so this proves difficult. The download link goes to /download/resource/file.xlsx, where file.xlsx acts as a $_GET variable. – cbenjafield Dec 18 '12 at 20:55
  • Sorry to be such a pain, but the problem seems to be when the file is saved rather than opened. If I click open file in firefox, it opens without problems, but if I say save file, it corrupts it. DOCX still works soundly, jpg and png work soundly... it is only xlsx that refuses to work. – cbenjafield Dec 18 '12 at 21:41

Try this:

header("Content-Disposition: attachment; filename=\"$filename\"");
header("Content-Type: application/vnd.ms-excel");
GabCas

Try:

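<?php
// disable gzip (apache_setenv() only exists under the Apache module)
if (function_exists('apache_setenv')) {
    apache_setenv('no-gzip', '1');
}
// set download attachment
header('Content-Disposition: attachment; filename="filename.xlsx"');
// clean the output buffer, if one is active
if (ob_get_level() > 0) {
    ob_clean();
}
// output file
readfile('filepath/filename.xlsx');
// stop here so nothing after this line is appended to the file
exit;
?>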
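
`ob_clean()` only empties the innermost buffer, so when several buffers are nested (a framework may open its own), stray output in an outer one can still reach the client. A minimal sketch of clearing every level follows; `discardOutputBuffers()` is a hypothetical helper name. It can only discard output that is still buffered: anything already sent to the browser cannot be recalled.

```php
<?php
/**
 * Hypothetical helper: closes every active output buffer and
 * returns the number of bytes that were thrown away.
 */
function discardOutputBuffers(): int
{
    $discarded = 0;
    while (ob_get_level() > 0) {
        $discarded += (int) ob_get_length();
        // Stop if a buffer refuses to close, to avoid looping forever.
        if (!@ob_end_clean()) {
            break;
        }
    }
    return $discarded;
}
```

Calling it right before the `header()` lines, and logging a non-zero return value, would also show how many stray bytes were about to be prepended to the file.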
N. Peters

Try adding an additional header:

header('Content-Length: ' . filesize('./uploads/resources/courses/' . $filename));
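
Putting the header suggestions from this thread together, a download routine might look like the sketch below. `buildDownloadHeaders()` and `sendDownload()` are hypothetical helper names, and the generic `application/octet-stream` type is an assumption taken from another answer here. `basename()` is applied to the download name as a precaution, since in the question the name arrives via the URL.

```php
<?php
/**
 * Hypothetical helper: builds the header lines for a file download.
 * Returns null when the file is missing or unreadable.
 */
function buildDownloadHeaders(string $path, string $downloadName): ?array
{
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }
    return [
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . basename($downloadName) . '"',
        'Content-Length: ' . filesize($path),
    ];
}

/** Hypothetical helper: sends the headers and the file body, then stops. */
function sendDownload(string $path, string $downloadName): void
{
    $headers = buildDownloadHeaders($path, $downloadName);
    if ($headers === null) {
        http_response_code(404);
        exit;
    }
    foreach ($headers as $line) {
        header($line);
    }
    readfile($path);
    exit; // nothing after the file body may reach the client
}
```

With an exact `Content-Length`, a truncated or padded response is easier to spot, because the size the browser saves no longer matches the size on the server.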
Elitmiar

This is probably misleading information from Windows that has nothing to do with the code, the Excel library, or the server; the file itself may be perfectly valid. Windows blocks some files downloaded from the Internet (such as .xlsx) from opening, and instead of asking whether you want to open an unsafe file, it simply reports that the file is corrupt. In Windows 10, right-click the file, open Properties, and select "Unblock" (you can read more here, for example: https://winaero.com/blog/how-to-unblock-files-downloaded-from-internet-in-windows-10/).

boryn