I am using

HttpClient.GetStringAsync(url)

to get the content of a web page and then read a certain tag's innerText. By default the response decodes fine, but when the target page declares the following meta tag, the returned string comes back garbled:

<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />

So how can I detect the page's charset and use it like this?

var byteData = await client.GetByteArrayAsync(url);
var html = Encoding.GetEncoding("name").GetString(byteData);
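
One possible way to fill in "name" is sketched below, as a rough illustration rather than a definitive solution: take the charset from the Content-Type response header if the server sends one, otherwise look for a charset declaration in a meta tag near the start of the raw bytes, and fall back to UTF-8 when neither is found. The names CharsetSniffer, FindMetaCharset, ResolveEncoding and GetHtmlAsync, the 4096-byte scan limit, and the UTF-8 fallback are all assumptions made for this example. Note also that on .NET Core / .NET 5+ code-page encodings such as gb2312 are only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) from the System.Text.Encoding.CodePages package.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper; all names here are illustrative, not part of HttpClient.
public static class CharsetSniffer
{
    // Matches <meta charset="..."> as well as
    // <meta http-equiv="Content-Type" content="text/html; charset=...">.
    private static readonly Regex MetaCharset = new Regex(
        @"<meta[^>]*?charset\s*=\s*[""']?\s*([A-Za-z0-9_\-]+)",
        RegexOptions.IgnoreCase);

    // Returns the charset name declared in a meta tag, or null if none is found.
    public static string FindMetaCharset(byte[] data)
    {
        // Scan only the head of the document (4096 bytes is an arbitrary limit).
        // ISO-8859-1 maps every byte to one char, so the ASCII markup survives
        // even when the real encoding is something else.
        int count = Math.Min(data.Length, 4096);
        string head = Encoding.GetEncoding("ISO-8859-1").GetString(data, 0, count);
        Match m = MetaCharset.Match(head);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Prefers the HTTP header charset, then the meta tag, then UTF-8.
    public static Encoding ResolveEncoding(string headerCharset, byte[] data)
    {
        string name = headerCharset ?? FindMetaCharset(data);
        if (!string.IsNullOrWhiteSpace(name))
        {
            try
            {
                // Header values are sometimes quoted, e.g. charset="utf-8".
                return Encoding.GetEncoding(name.Trim('"', '\'', ' '));
            }
            catch (ArgumentException) { /* unknown name: fall through */ }
            catch (NotSupportedException) { /* unsupported on this platform */ }
        }
        return Encoding.UTF8;
    }

    // Downloads the page as bytes and decodes it with the resolved encoding.
    public static async Task<string> GetHtmlAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            response.EnsureSuccessStatusCode();
            byte[] data = await response.Content.ReadAsByteArrayAsync();
            string headerCharset = response.Content.Headers.ContentType?.CharSet;
            return ResolveEncoding(headerCharset, data).GetString(data);
        }
    }
}
```

With this in place, the call would look like var html = await CharsetSniffer.GetHtmlAsync(client, url); in place of GetStringAsync. Pages that declare no charset anywhere still need a guess; the UTF-8 fallback here is only one choice, and a crawler might prefer a byte-level detection library instead.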
paul cheung
  • [This](http://stackoverflow.com/a/11018883/492258) may be the answer to your question – asdf_enel_hak Oct 26 '15 at 11:10
  • Or this: http://stackoverflow.com/questions/24829440/how-to-get-the-html-encoding-right-in-c – sbouaked Oct 26 '15 at 11:13
  • @asdf_enel_hak I have seen it before, but it doesn't seem relevant to me. If the target web page does not define a charset at all (I actually use this in a web crawler application), HttpClient.GetStringAsync(url) also returns garbled text. – paul cheung Oct 27 '15 at 07:41

0 Answers