Why are .docx files being corrupted when downloading from an ASP.NET page?

Question

I have the following code for serving page attachments to the user:

private void GetFile(string package, string filename)
{
    var stream = new MemoryStream();

    try
    {
        using (ZipFile zip = ZipFile.Read(package))
        {
            zip[filename].Extract(stream);
        }
    }
    catch (System.Exception ex)
    {
        throw new Exception("Resources_FileNotFound", ex);
    }

    Response.ClearContent();
    Response.ClearHeaders();
    Response.ContentType = "application/unknown";

    if (filename.EndsWith(".docx"))
    {
        Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    }

    Response.AddHeader("Content-Disposition", "attachment;filename=\"" + filename + "\"");
    Response.BinaryWrite(stream.GetBuffer());
    stream.Dispose();
    Response.Flush();
    HttpContext.Current.ApplicationInstance.CompleteRequest();
}

The problem is that all other supported file types work properly (jpg, gif, png, pdf, doc, etc.), but .docx files are corrupted when downloaded and have to be repaired by Office before they can be opened.

At first I didn't know whether the problem was in uncompressing the zip file that contained the .docx. So instead of only writing the output file to the response, I saved it to disk first, and that file opened successfully. The problem must therefore be in how the response is written.

Any idea what might be happening?

This tripped me up when outputting PDF. Turns out that PDF viewers will tolerate unexpected garbage after the end of valid data, and I was adding rendered HTML of the page to every PDF file I was sending. Might be the same for other binary file formats, they don't care about unexpected data appended to valid data. — millimoose, Apr 08 '13 at 01:11

score 33 · Accepted Answer · edited Nov 15 '21 at 08:14

I also ran into this problem and actually found the answer here:

It turns out that the docx format needs to have Response.End() right after the Response.BinaryWrite.

Geoff Tanaka, answered May 17 '10 at 20:12

Just saved me hours of trawling the net looking for this, thanks!! It makes sense too because the file is having some other bits appended to the end of it from the stream and it's ending up slightly larger than on the server. – David Swindells Aug 17 '12 at 10:42
Adding a Response.End() AND limiting the output to the original file size (as in Randall Spychalla's solution below) were both required for me to resolve this – Simon Molloy Aug 18 '14 at 17:05
Response.End isn't mandatory, but setting the file length is so it knows when it's done. – Shawn Apr 01 '15 at 13:35
I'm using a using for my binaryreader and have that wrapped in a try/catch so I had to put the Response.End in a finally block or else it would error about aborting a thread. But it worked as advertised! – Jeremy Jan 11 '17 at 21:22
The link above is broken. – Felix Cen Feb 24 '17 at 18:04
Strangely, I faced this problem only on one particular server. Elsewhere my code works fine without Response.End (I have Response.Flush instead). I believe Response.End is known to raise a ThreadAbortException – Vipul bhojwani Jan 31 '18 at 05:10

score 4 · Answer 2 · answered Aug 19 '11 at 13:22

When storing a binary file in SQL Server, keep in mind that a file is padded to the nearest word boundary, so an extra byte can end up appended to it. The solution is to store the original file size in the database when you store the file, and to pass that value as the length to the Stream object's write function: "Stream.Write(bytes(), 0, length)". This is the ONLY reliable way of getting the correct file size, which is very important for Office 2007 and later files, which do not tolerate extra bytes at the end (most other file types, such as JPEGs, don't care).
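The "write only the stored length" idea above can be sketched in C# roughly as follows. This is a minimal illustration, not the answerer's actual code: the class name ExactLengthWrite, the method WriteExact and the parameter originalLength are hypothetical, and it assumes the original size was saved alongside the file.

```csharp
using System;
using System.IO;

public static class ExactLengthWrite
{
    // Hypothetical helper: copies only the first originalLength bytes of a
    // possibly padded buffer to the output stream, dropping any trailing bytes.
    public static void WriteExact(Stream output, byte[] stored, int originalLength)
    {
        if (output == null) throw new ArgumentNullException(nameof(output));
        if (stored == null) throw new ArgumentNullException(nameof(stored));
        if (originalLength < 0 || originalLength > stored.Length)
            throw new ArgumentOutOfRangeException(nameof(originalLength));

        output.Write(stored, 0, originalLength);
    }
}
```

In an ASP.NET page the output stream would presumably be Response.OutputStream, with originalLength read from the same database row as the file bytes.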

score 3 · Answer 3 · answered Mar 19 '10 at 13:46

You should not use stream.GetBuffer() because it returns the buffer array which might contain unused bytes. Use stream.ToArray() instead. Also, have you tried calling stream.Seek(0, SeekOrigin.Begin) before writing anything?

Best Regards,
Oliver Hanappi
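The difference between GetBuffer() and ToArray() described in this answer can be seen in a small standalone sketch. The class and method names (BufferDemo, Measure) are made up for illustration; only the MemoryStream members are real API.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Returns { bytes written, GetBuffer() length, ToArray() length }
    // for a payload written to a fresh MemoryStream.
    public static int[] Measure(byte[] payload)
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(payload, 0, payload.Length);

            // GetBuffer() exposes the whole internal array, including
            // unused capacity beyond stream.Length.
            int bufferLength = stream.GetBuffer().Length;

            // ToArray() copies only the bytes that were actually written.
            int arrayLength = stream.ToArray().Length;

            return new[] { (int)stream.Length, bufferLength, arrayLength };
        }
    }
}
```

For a small payload the second value is typically larger than the first, and those extra bytes are what would be appended to the download when GetBuffer() is passed to Response.BinaryWrite. Passing stream.ToArray() instead, as suggested above, sends only the written bytes.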

score 2 · Answer 4 · answered Feb 25 '15 at 23:13

For what it's worth, I also ran into the same problem listed here. For me the issue was actually with the upload code, not the download code:

    Public Sub ImportStream(FileStream As Stream)
        'Use this method with FileUpload.PostedFile.InputStream as a parameter, for example.
        Dim arrBuffer(FileStream.Length) As Byte
        FileStream.Seek(0, SeekOrigin.Begin)
        FileStream.Read(arrBuffer, 0, FileStream.Length)
        Me.FileImage = arrBuffer
    End Sub

The problem in this example is that the Byte array arrBuffer is declared one byte too large: in VB.NET, Dim arr(n) specifies the upper bound, so the array holds n + 1 elements. The extra null byte is then saved with the file image to the DB and reproduced on download. The corrected code would be:

        Dim arrBuffer(FileStream.Length - 1) As Byte

Also for reference my HttpResponse code is as follows:

    context.Response.Clear()
    context.Response.ClearHeaders()
    'SetContentType() is a function which looks up the correct MIME type
    'and also adds an informational header about the lookup process...
    context.Response.ContentType = SetContentType(objPostedFile.FileName, context.Response)
    context.Response.AddHeader("content-disposition", "attachment;filename=" & HttpUtility.UrlPathEncode(objPostedFile.FileName))
    'For reference: Public Property FileImage As Byte()
    context.Response.BinaryWrite(objPostedFile.FileImage)
    context.Response.Flush()

score 1 · Answer 5 · edited Nov 15 '21 at 08:16

If you use the approach above which calls Response.Close(), download managers such as the one in IE10 will say 'cannot download file' because the byte lengths do not match the headers. See the documentation. Do NOT use Response.Close. EVER.

However, calling CompleteRequest alone does not stop bytes from being written to the output stream, so XML-based applications such as Word 2007 will see the .docx as corrupted.

In this case, break the rule to NEVER use Response.End. The following code solves both problems. Your results may vary:

'*** transfer package file memory buffer to output stream
Response.ClearContent()
Response.ClearHeaders()
Response.AddHeader("content-disposition", "attachment; filename=" + NewDocFileName)
Me.Response.ContentType = "application/vnd.ms-word.document.12"
Response.ContentEncoding = System.Text.Encoding.UTF8
strDocument.Position = 0
strDocument.WriteTo(Response.OutputStream)
strDocument.Close()
Response.Flush()
'See documentation at http://blogs.msdn.com/b/aspnetue/archive/2010/05/25/response-end-response-close-and-how-customer-feedback-helps-us-improve-msdn-documentation.aspx
HttpContext.Current.ApplicationInstance.CompleteRequest() 'This is the preferred method
'Response.Close() 'BAD pattern. Do not use this approach; it causes 'cannot download file' in IE10 and other download managers that compare the content length header to the actual byte count
Response.End() 'BAD pattern as well. However, CompleteRequest does not terminate sending bytes, so Word and other XML-based apps will see the file as corrupted. So use this to solve it.

score 0 · Answer 6 · edited May 23 '17 at 12:26

Take a look at this: Writing MemoryStream to Response Object

I had the same problem and the only solution that worked for me was:

    Response.Clear();
    Response.ContentType = "Application/msword";
    Response.AddHeader("Content-Disposition", "attachment; filename=myfile.docx");
    Response.BinaryWrite(myMemoryStream.ToArray());
    // myMemoryStream.WriteTo(Response.OutputStream); //works too
    Response.Flush();
    Response.Close();
    Response.End();

score 0 · Answer 7 · answered Mar 19 '10 at 13:40

It all looks ok. My only idea is to try calling Dispose on your stream after calling Response.Flush instead of before, just in case the bytes aren't entirely written before flushing.

Ray, answered Mar 19 '10 at 13:40

I did this also, without success. – Victor Rodrigues Mar 22 '10 at 13:04
just taking a wild-ass guess here... try it using content type "application/octet-stream" and see if you get a valid file downloaded. btw - this is probably more appropriate than "application/unknown" when you don't know the file type. – Ray Mar 22 '10 at 14:14

score 0 · Answer 8 · edited Jan 09 '15 at 14:13

I had the same problem when trying to open .docx and .xlsx documents. I solved it by setting the cacheability to ServerAndPrivate instead of NoCache.

Here is my method that serves the document:

public void ProcessRequest(HttpContext context)
{
    var fi = new FileInfo(context.Request.Path);
    var mediaId = ResolveMediaIdFromName(fi.Name);
    if (mediaId == null) return;

    int mediaContentId;
    if (!int.TryParse(mediaId, out mediaContentId)) return;

    var media = _repository.GetPublicationMediaById(mediaContentId);
    if (media == null) return;

    var fileNameFull = string.Format("{0}{1}", media.Name, media.Extension);
    context.Response.Clear();
    context.Response.AddHeader("content-disposition", string.Format("attachment;filename={0}", fileNameFull));
    context.Response.Charset = "";
    context.Response.Cache.SetCacheability(HttpCacheability.ServerAndPrivate);
    context.Response.ContentType = media.ContentType;
    context.Response.BinaryWrite(media.Content);
    context.Response.Flush();
    context.Response.End();
}

score 0 · Answer 9 · answered May 18 '21 at 15:34

Geoff Tanaka's answer also works for Response.WriteFile, not just BinaryWrite: adding Response.End() after it gets rid of the Office document corruption error "Word found unreadable content". Turns out all the messing about with Response.ContentType was unnecessary, and I can now revert to "application/octet-stream". Another afternoon I'll never get back.
