How to set response header in JAX-RS so that Chinese chars are displayed properly within the custom generated file

Question

How would I properly generate a "javax.ws.rs.core.Response" (to be returned) that supports Chinese character encoding within an Excel file?

To clarify, I have a CSV file (opened in Excel) that contains some Chinese content, and I need to return a javax.ws.rs.core.Response so that the Chinese characters in the document are displayed properly on the client side.

Currently I'm doing the following:

return Response.status( 200 )
        .header( "content-disposition", 
                 "attachment;filename=SampleCSV.csv;charset=Unicode" )
        .entity( result )
        .build();

but when this response is built and returned to the client side (and a popup is displayed asking to download the file), the Chinese content of the Excel file is garbled.

Any suggestion will be highly appreciated.

score 5 · Accepted Answer · edited Oct 07 '21 at 05:54

The RFC that defines the content-disposition header doesn't mention a charset clause.

Try also adding a proper content-type header to the response:

.header("Content-Type", "text/csv; charset=utf-8")

Be sure to use utf-8, and not unicode. If that works, then you can remove the charset clause from the content-disposition header.
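
To illustrate why the charset belongs on Content-Type, here is a minimal sketch of what a client roughly does with that header: it looks for a charset parameter and uses it to decode the body. The class name, the charsetOf helper and the fallback argument are hypothetical and only for illustration; real clients have their own parsing and default rules.

```java
import java.nio.charset.Charset;

final class ContentTypeCharset {
    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or the given fallback when no usable charset parameter is present.
    static Charset charsetOf(String contentType, Charset fallback) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type (e.g. "text/csv"); parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String[] keyValue = parts[i].trim().split("=", 2);
            if (keyValue.length == 2 && keyValue[0].trim().equalsIgnoreCase("charset")) {
                String name = keyValue[1].trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

With "text/csv; charset=utf-8" the helper returns UTF-8; with a bare "text/csv" it returns whatever fallback was passed in, which is how a missing charset can end up as the wrong decoding on the client.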


answered May 05 '12 at 10:27

Sean Reilly


score 3 · Answer 2 · edited May 23 '17 at 12:07

You specify charset=Unicode, which is not valid because Unicode is not a single encoding. It's a character set with a family of encodings. UTF-8 and UTF-16 are commonly-used encodings.

You can control the response header, to affect how the browser/client interprets the response, using the @Produces annotation. I've seen different opinions about whether this works:

"I've tried @Produces("text/html; charset=UTF-8") but that was ignored and only text/html was send with the HTTP header"
"It is also possible to use ResponseBuilder.header(...) method to set the content type with the charset."
The @Produces annotation sends the correct charset in the header, but doesn't change the encoding of the body.
"@Produces("text/html; charset=UTF-8") works with current versions of the reference implementation Jersey."
It worked for me in Jersey 1.12.

I'm fairly certain that this only changes the encoding declared in the response headers; it doesn't change the encoding that's actually used to convert the response string into bytes to send over the network. These two must match, otherwise the browser/client will misinterpret the response, because it believes that you used a different encoding than you actually did.
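
The mismatch can be reproduced with plain Java, without any JAX-RS involved. This is only a sketch: the class and method names are made up for illustration, and ISO-8859-1 stands in for whatever wrong charset the client might assume.

```java
import java.nio.charset.StandardCharsets;

final class EncodingMismatch {
    // The body is sent as UTF-8 bytes and the client also decodes as UTF-8.
    static String decodeWithMatchingCharset(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.UTF_8);
    }

    // The body is sent as UTF-8 bytes, but the client decodes them as
    // ISO-8859-1 (an assumed wrong charset), turning every byte into one char.
    static String decodeWithWrongCharset(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }
}
```

For a two-character Chinese string, the matching version returns the original text, while the mismatched version returns six unrelated Latin-1 characters, which is the kind of garbage described in the question.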

If you return a java.lang.String object, JAX-RS uses a system default encoding to convert it to a byte stream. If the JAX-RS server is running on Unix this is usually UTF-8, which works well, but on Windows it's typically a legacy code page (windows-1252, for example) that can't represent Chinese characters.

Therefore you should force a specific encoding yourself, rather than letting JAX-RS apply the default conversion.

To be specific, if result is a java.lang.String object in your code, note that an OutputStreamWriter wraps an OutputStream, not a String, so it can't be constructed around result directly. The simplest way to control the bytes that JAX-RS writes to the network is to encode the string yourself, with an explicit encoding such as UTF-8, and return the byte array as the entity. I haven't tested this code, but it might work:

.entity(result.getBytes(java.nio.charset.StandardCharsets.UTF_8))

I had this problem with Tika, which sends a StreamingOutput instead of a Response, and constructs it with a default OutputStreamWriter, which uses the system's default encoding instead of something predictable.

I modified Tika to specify the encoding when constructing the OutputStreamWriter, and added a charset to the @Produces annotation, and that fixed it for me.
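
The kind of change described for Tika can be sketched roughly as follows. This is not Tika's actual code: the class and method names are hypothetical, and the write method only mimics the shape of a streaming callback that is handed the raw output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class Utf8CsvWriter {
    // Writes the text through a writer with an explicit charset, instead of
    // new OutputStreamWriter(out), which would use the platform default.
    static void write(String csv, OutputStream out) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(csv);
        writer.flush(); // flush only; closing the stream is left to the caller
    }

    // Convenience wrapper for trying the method out against an in-memory buffer.
    static byte[] toUtf8Bytes(String csv) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            write(csv, buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Whatever charset is passed to the OutputStreamWriter should be the same one declared in the Content-Type header or @Produces annotation.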
