As I was working with the WebClient class, I noticed that a simple call like this
string downloadedString = new WebClient().DownloadString("http://whatever");
produced a string using an incorrect encoding, even though the response contained a proper Content-Type header: application/json; charset=utf-8.
When I looked at the source code, I found that DownloadString doesn't look at the response headers at all. Instead, it reads the charset from request.ContentType, and if no charset is present there, it uses the Encoding property (which has to be set beforehand; otherwise it is the system default).
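For illustration, setting the encoding beforehand looks roughly like this. It is only a sketch: the class and method names are hypothetical, and UTF-8 is an assumption about what the server actually sends.

```csharp
using System.Net;
using System.Text;

public static class WebClientEncodingExample
{
    // Hypothetical helper: returns a WebClient whose Encoding is preset to UTF-8.
    public static WebClient CreateUtf8Client()
    {
        var client = new WebClient();
        // Assumption: the server responds in UTF-8. Without this assignment,
        // DownloadString falls back to the system default encoding.
        client.Encoding = Encoding.UTF8;
        return client;
    }

    // Downloads the body at the given URL and decodes it with the preset encoding.
    public static string Download(string url)
    {
        using (WebClient client = CreateUtf8Client())
        {
            return client.DownloadString(url);
        }
    }
}
```

This only works when the charset is known in advance, which is exactly the limitation I am asking about.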
It seems weird that we have to explicitly tell the WebClient object which encoding to use before sending the request (by adding a Content-Type header or setting the encoding directly). That makes DownloadString rather pointless: if we want the right encoding, we have to use DownloadData or plain old WebRequest and write code that parses the response headers manually in order to get the correct response string.
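The manual workaround I have in mind looks roughly like the sketch below. The names WebClientCharset, ResolveEncoding and DownloadStringUsingResponseCharset are hypothetical, and falling back to UTF-8 when the response declares no charset is my own assumption, not something WebClient does.

```csharp
using System;
using System.Net;
using System.Net.Mime;
using System.Text;

public static class WebClientCharset
{
    // Returns the encoding named by a Content-Type value,
    // or the fallback when no usable charset is present.
    public static Encoding ResolveEncoding(string contentType, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentType)) return fallback;
        try
        {
            // System.Net.Mime.ContentType parses "type/subtype; charset=..."
            string charset = new ContentType(contentType).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return fallback; }   // malformed header value
        catch (ArgumentException) { return fallback; } // unknown charset name
    }

    // Downloads raw bytes, then decodes them using the charset of the RESPONSE.
    public static string DownloadStringUsingResponseCharset(string url)
    {
        using (var client = new WebClient())
        {
            byte[] data = client.DownloadData(url);
            // ResponseHeaders is populated once the download has completed.
            string contentType = client.ResponseHeaders?[HttpResponseHeader.ContentType];
            // Assumption: UTF-8 when the server declares no charset.
            return ResolveEncoding(contentType, Encoding.UTF8).GetString(data);
        }
    }
}
```

Note that this sketch does not strip a byte order mark from the data, so it is not a complete replacement for DownloadString.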
Does anyone know the reason for such behavior?
Is there a better way in .NET to properly download an HTTP string response than manually parsing the response Content-Type header?