
My question is a duplicate of How to encode the filename parameter of Content-Disposition header in HTTP? But since that question was asked a long time ago and there is still no satisfying answer (in my opinion), I would like to ask again.

I develop a C++ CGI application that delivers files that can contain special characters in their names like
"weird # € = { } ; filename.txt"

There seems to be no way to set the HTTP Content-Disposition header so that it works in every browser, such as

  • Internet Explorer
  • Firefox
  • Chrome
  • Opera
  • Safari

I would be happy with a different solution for every browser.
This is how far I have come:

Internet Explorer (added double quotes and replaced # and ; )

Content-Disposition: attachment; filename="weird %23 € = { } %3B filename.txt"

Firefox (double quotes seem to work. nothing more to do):

Content-Disposition: attachment; filename="weird # € = { } ; filename.txt"

Another working alternative:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt
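
The percent-encoded value in the filename* form above can be produced with a small helper. This is only a sketch, assuming the filename is already a UTF-8 byte string; the function name encodeRfc5987 is made up for illustration. It escapes everything except letters, digits and - . _ ~, which is stricter than RFC 5987 requires but matches the escaped strings used in this question.

```cpp
#include <string>

// Percent-encode a UTF-8 byte string for the filename*=UTF-8''... form.
// Assumption: `utf8` already holds valid UTF-8. Only ASCII letters, digits
// and - . _ ~ are passed through; every other byte becomes %XX.
std::string encodeRfc5987(const std::string& utf8)
{
    static const char hex[] = "0123456789ABCDEF";
    static const std::string safe = "-._~";
    std::string out;
    for (unsigned char c : utf8) {
        const bool alnum = (c >= '0' && c <= '9') ||
                           (c >= 'A' && c <= 'Z') ||
                           (c >= 'a' && c <= 'z');
        if (alnum || safe.find(static_cast<char>(c)) != std::string::npos) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

The hex digits come out in upper case (%E2%82%AC rather than %e2%82%ac); percent-escapes are case-insensitive, so both spellings describe the same bytes.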

Chrome

when using only double quotes these problems arise:

  • = disappears in filenames
  • € will be replaced by -

but this works:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt

Opera

Using double quotes or using the syntax: filename*=UTF-8''... produces the following problems:

  • Multiple consecutive spaces in filenames are reduced to one
  • { and } disappear: "ab{}cd.txt" -> "abcd.txt"
  • filenames get cut off after a ; in them: "abc ; def.txt" -> "abc"

EDIT 2: This was because of filename length limitations. This syntax works with Opera:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt

Safari

  • € will be replaced by an invisible character (using double quotes)

    no solution that prevents that little problem

The suggestion from the other thread (mentioned above) using

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%80%20%3D%20%7B%20%7D%20%3B%20filename.txt

didn't work for me. The escape characters won't be translated back or the browser wants to save the file with the name of my CGI application. That was because my encoding was wrong: I did not encode according to RFC 5987. But Safari isn't using this encoding anyway, so there is no solution for the € character so far.

BTW: A UTF-8 converter: http://www.rishida.net/tools/conversion/

I used the latest version of every browser for these tests:

  • Firefox 7
  • Internet Explorer 9
  • Chrome 15
  • Opera 11.5
  • Safari 5.1

PS: I tried all special characters on my keyboard. In this thread I only used the ones that caused trouble.

EDIT:

I also tried a filename with all special characters on my keyboard (that are possible in a filename) and that did not work as it did with the test string above:

Complete Test string:

0 ! § $ % & ( ) = ` ´ { }    [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg

Encoded Test String:

0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg

Using this method:

Content-Disposition: attachment; filename*=UTF-8''0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg

I had the following results:

  • Firefox works
  • Chrome works
  • IE: $ % & ( ) = ` ´ { } [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg (removed the first 6 characters). EDIT 2: This was because of the browser's filename length limitations. It started to cut off the filename from the start of the string. I didn't go deep into this, but it looks like normal filenames can be about 200 characters long, and filenames with many escape sequences even longer, but less than 250. But that's OK.
  • Opera: 0 ! § $ % & ( ) = ` ´ [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg (missing some characters as before). EDIT 2: I shortened my test string because I suspected filename length "problems" with Opera, as there are with IE, and it worked there too.
  • Safari doesn't work with that syntax. That was expected.

EDIT 2:

Status so far is that the syntax filename*=UTF-8''[percent-encoded filename] works with every browser except Safari. And the only character that gets replaced in Safari is the €. I guess I can live with that. Thank you!

EDIT 3: Filename length

I noticed some filename length issues.

  • Internet Explorer: File names can be 147 characters long. If the string doesn't contain escape sequences, then that's the length of the filename. If it does, the resulting file name is shorter than 147 characters, but by a varying amount: with 2 escape sequences the file name was shortened by 5 characters, and with many escape sequences it was shortened by only 2 characters. I couldn't find a rule here.
  • The other browsers don't seem to have that problem. They would save the file if the file system can handle it. I tried, for instance, 250 characters, and the browsers either said I had to reduce the file name (Chrome) or did it themselves, shortening it to 220 (Opera) or 210 (Firefox) characters. Opera cut off the file ending though. Safari tried to save that long file name and ended up not saving it and writing "-1" in the download list as the filename.
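
One way to stay clear of these limits is to shorten the name on the server before building the header, keeping the extension intact so that no browser cuts off the file ending. The following is a rough sketch under assumptions: the name is valid UTF-8, "characters" are counted as code points, and the limit (for example the 147 observed with IE 9 above) is passed in by the caller. The helper names are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Count code points: every byte that is not a UTF-8 continuation byte
// (10xxxxxx) starts a new character.
std::size_t utf8Length(const std::string& s)
{
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// First maxChars code points of s, never splitting a multi-byte sequence.
std::string utf8Prefix(const std::string& s, std::size_t maxChars)
{
    std::size_t n = 0, i = 0;
    for (; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            if (n == maxChars) break;  // next character would exceed the limit
            ++n;
        }
    }
    return s.substr(0, i);
}

// Shorten name to at most maxChars code points, trimming the stem and
// keeping the extension (the part from the last '.') where possible.
std::string shortenFilename(const std::string& name, std::size_t maxChars)
{
    if (utf8Length(name) <= maxChars) return name;
    const std::size_t dot = name.rfind('.');
    const std::string ext =
        (dot == std::string::npos || dot == 0) ? "" : name.substr(dot);
    const std::string stem = name.substr(0, name.size() - ext.size());
    const std::size_t extLen = utf8Length(ext);
    if (extLen >= maxChars) return utf8Prefix(name, maxChars);
    return utf8Prefix(stem, maxChars - extLen) + ext;
}
```

Since IE's effective limit varied with the number of escape sequences, a caller would probably want to pass a value somewhat below the observed maximum.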
juergen d
  • possible duplicate of [How to encode the filename parameter of Content-Disposition header in HTTP?](http://stackoverflow.com/questions/93551/how-to-encode-the-filename-parameter-of-content-disposition-header-in-http) – Jim Nov 01 '11 at 15:01
  • If you want to call attention to an old question, you should post a bounty on it. Reposting is spammy. – Jim Nov 01 '11 at 15:03
  • If you want to fix the browsers, talk with the vendors. That might be more productive. Until then, provide file names every browser understands; why make it more complicated than it must be? – hakre Nov 01 '11 at 15:40
  • @hakre: The user can choose any filename he wants. I don't like it either, but I have to live with it and want it to work with every browser. – juergen d Nov 02 '11 at 06:40
  • @juergend: Technically, the user cannot choose any filename she wants. You can't code without a specification, otherwise you run into problems like these. I can honor your willingness to give users a broad choice, but keep in mind that you cannot fulfill everybody's wishes. For example, control characters in the filename. Take care. – hakre Nov 02 '11 at 09:22
  • @juergend: Safari works as expected in that it is a known shortcoming that they do not support the syntax. – Julian Reschke Nov 02 '11 at 14:17
  • http://stackoverflow.com/a/216777/1586797 works for me. – bronze man Dec 04 '14 at 04:32

1 Answer


Firefox, MSIE (starting with version 9), Opera, Konqueror and Chrome support the encoding defined in RFC 5987; MSIE 8 and Safari do not support it; support in other browsers is unknown.

Note that in

  Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%80%20%3D%20%7B%20%7D%20%3B%20filename.txt

you got the encoding for the Euro character wrong: its UTF-8 encoding is not %80 (the correct encoding is %e2%82%ac). Fixing this should make it work everywhere except Safari.
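
To see where %e2%82%ac comes from: the Euro sign is the Unicode code point U+20AC, and its UTF-8 form is the three bytes E2 82 AC. The single byte 0x80 is the Euro sign in Windows-1252, not in UTF-8. A minimal sketch of the code point to UTF-8 conversion (the function name is made up for illustration, and it does not validate its input):

```cpp
#include <cstdint>
#include <string>

// Encode one Unicode code point as UTF-8.
// Assumption: cp is a valid scalar value (no surrogates, <= 0x10FFFF);
// this sketch performs no validation.
std::string codePointToUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {                    // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {            // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {          // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                            // 4 bytes: 11110xxx followed by three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Each of the resulting bytes is then percent-encoded separately, which is why one character can turn into several %XX escapes.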

Test case at:

http://greenbytes.de/tech/tc2231/#attwithfn2231utf8

Julian Reschke
  • You are right! It looks like I am using the wrong encoding here. I'll look into this and report back. – juergen d Nov 02 '11 at 09:19
  • How can you provide a fall-back for user-agents that do not support it? How to handle user-agents that mask or do not provide the user-agent string? – hakre Nov 02 '11 at 09:42
  • hakre: I recommend always using the new RFC 5987 variant, and adding a plain ASCII variant for legacy browsers such as Safari and IE before version 9. See also: greenbytes.de/tech/webdav/rfc6266.html#examples – Julian Reschke Nov 02 '11 at 11:08
  • @JulianReschke: The test string above works like you said. I tried an even worse string: "0 ! § $ % & ( ) = ` ´ { } [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg". But that one doesn't work. I encoded it into 0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg. Firefox and Chrome then work, IE removes the first 6 characters, and Opera still has the same problem I mentioned earlier. – juergen d Nov 02 '11 at 13:25
  • juergen - can you make minimal test cases demonstrating the IE and Opera issues? I then can add them to my test suite. Thanks. BTW: note that it's totally ok for UAs to filter out certain characters, such as controls or path separators; but it would be interesting to see in which way they do it differently. – Julian Reschke Nov 02 '11 at 14:13
  • @JulianReschke: From what I know, the fallback does not always work as intended and will offer for some browsers you've listed, the ASCII variant instead of the UTF-8 variant. Can you add to your answer how the fallback works and for the browser-list given, which browsers support it and how? - +1 for your work anyway. Just seeing your authoring that page! – hakre Nov 02 '11 at 18:31
  • hakre - the fallback depends on ordering; IE8 needs to see the all-ASCII variant first. See http://greenbytes.de/tech/tc2231/#attfnboth and http://greenbytes.de/tech/tc2231/#attfnboth2. – Julian Reschke Nov 03 '11 at 07:57
  • @JulianReschke: what kind of test cases do you have in mind? Do you mean example Content-Dispositions with smaller filenames that are reduced to the core characters making trouble? – juergen d Nov 03 '11 at 08:08
  • @JulianReschke: I found the problem. It was due to filename length limitations. Using many special characters in a filename leads to very long escape sequence filenames. See above. Thanks for the help! – juergen d Nov 03 '11 at 16:40
  • @juergen d - length limitations are interesting as well -- which one did you see? – Julian Reschke Nov 03 '11 at 16:56
  • @JulianReschke: See EDIT3 above. – juergen d Nov 04 '11 at 15:39
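
The fallback discussed in the comments (a plain ASCII filename first, then the RFC 5987 filename*) could be assembled roughly like this. It is a sketch, not a definitive implementation: the function names are made up, the input is assumed to be valid UTF-8, and the choice of "_" as the replacement for non-ASCII characters, quotes, backslashes and control characters is an assumption rather than something the thread prescribes.

```cpp
#include <string>

// Percent-encode every byte except ASCII letters, digits and - . _ ~
// (a conservative subset of what RFC 5987 allows unescaped).
std::string percentEncode(const std::string& utf8)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8) {
        const bool keep = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                          (c >= 'a' && c <= 'z') ||
                          c == '-' || c == '.' || c == '_' || c == '~';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// ASCII-only fallback for legacy browsers. Assumption: each non-ASCII
// character, quote, backslash or control character becomes one '_'.
std::string asciiFallback(const std::string& utf8)
{
    std::string out;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80) continue;  // continuation byte of a character already replaced
        if (c < 0x20 || c >= 0x7F || c == '"' || c == '\\')
            out += '_';
        else
            out += static_cast<char>(c);
    }
    return out;
}

// Build the header line with the all-ASCII variant first, then filename*.
std::string contentDisposition(const std::string& utf8Name)
{
    return "Content-Disposition: attachment; filename=\"" +
           asciiFallback(utf8Name) + "\"; filename*=UTF-8''" +
           percentEncode(utf8Name);
}
```

For the test name from the question this yields filename="weird # _ = { } ; filename.txt" followed by the percent-encoded filename* value. Browsers that understand filename* should prefer it, while the others fall back to the ASCII name.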