I need to get the source code of a particular URL using Java code. I was able to get the source for a UTF-8 encoded web page, but not for a page encoded in ISO-8859-1. Is it possible to get the source code of a website that uses ISO-8859-1 from a Java program? Please help.
Show us your code. Probably you are using your default system encoding and the `ISO` encoding must be specified explicitly somewhere. – Tomasz Nurkiewicz Jun 25 '12 at 11:58
What was the code you used for getting utf-8 page and where does it fail for the other? – mmmmmm Jun 25 '12 at 11:58
1 Answer
If you are reading the page with the following approach, you need to specify the character set explicitly when you create the reader:
URL url = new URL(URL_TO_READ);
BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), "ISO-8859-1"));
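For completeness, here is a minimal sketch of the same idea as a full program. The class name `PageSourceReader` and the helper `readAll` are made up for illustration, and the URL is taken from the first command-line argument; the only point is that the charset passed to `InputStreamReader` has to match the encoding of the page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class PageSourceReader {

    // Decodes the whole stream with the given charset and returns it as text.
    // Line endings are normalized to '\n'.
    static String readAll(InputStream stream, Charset charset) throws IOException {
        StringBuilder source = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(stream, charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                source.append(line).append('\n');
            }
        }
        return source.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the URL of a page served as ISO-8859-1.
        URL url = new URL(args[0]);
        System.out.print(readAll(url.openStream(), StandardCharsets.ISO_8859_1));
    }
}
```

If the wrong charset is passed (for example UTF-8 for an ISO-8859-1 page), accented characters typically come out as replacement characters rather than causing an exception, which is why the failure can look like "the source cannot be read".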
However, if your requirement also involves some parsing, I would suggest using jsoup. It reads the character set from the server's response, and you can also set the charset explicitly.
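If you would rather stay with plain `java.net`, the "read the character set from the response" part can be approximated by hand. The sketch below is not how jsoup does it; it only looks at the `charset=` parameter of a `Content-Type` header value and falls back to a default otherwise. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of jsoup or the JDK.
class CharsetSniffer {

    // Returns the charset named in a Content-Type header value such as
    // "text/html; charset=ISO-8859-1", or the fallback if none is usable.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a leading "charset=".
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

The header value itself can be obtained with `url.openConnection().getContentType()`, and the resulting `Charset` can then be passed to `InputStreamReader`. Pages that declare their encoding only in an HTML `meta` tag are not covered by this sketch.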