I'm new to Java and I'm stuck on this function:

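import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public String getFromUrl(String url){
    StringBuilder content = new StringBuilder();
    try{
        URLConnection conn = new URL(url).openConnection();
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
        // Decode the response bytes as UTF-8; try-with-resources closes the reader
        try(BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))){
            String line;
            while((line = reader.readLine()) != null) content.append(line).append("\r\n");
        }
    }
    catch(Exception e){
        e.printStackTrace(); // report failures instead of silently returning ""
    }
    return content.toString();
}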

I always get question marks instead of UTF-8 symbols! What am I doing wrong?

I read this post.

First: I can't understand why a byte array is used.

Second: what should the while loop look like in this case? If I write

while((line = reader.readLine()) != null)content = line.getBytes("UTF-8");

Eclipse says something like "the local variable content may not have been initialized".

Third: how should I convert the byte array back into a string?

Then I read this one. I didn't even try the approach from that post, because I'm trying to write a function that simulates a browser's GET and POST requests. It seems I've found out how to do that with the URL class, so I don't want to use any other classes or methods.

Now the only problem I have is how to handle UTF-8 content.

Any help appreciated!

SuperYegorius

1 Answer

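Print this:

String encoding = conn.getContentEncoding();

If it is not null, you can use it for the reader. (Strictly speaking, getContentEncoding() returns the Content-Encoding header, such as gzip; the charset is normally given in the charset parameter of conn.getContentType().)

Also print any exception you catch instead of swallowing it.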
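
As a rough illustration of picking the reader's charset from the response headers, here is a minimal sketch. The class name CharsetSniffer and the method charsetFrom are made up for this example, and falling back to UTF-8 when no charset is declared is an assumption, not something the server guarantees.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSniffer {
    // Hypothetical helper: pulls the charset parameter out of a Content-Type
    // value such as "text/html; charset=ISO-8859-1".
    // Assumption: fall back to UTF-8 when the charset is missing or unknown.
    static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive check for the "charset=" prefix
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // malformed or unsupported charset name
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(charsetFrom("text/html; charset=ISO-8859-1"));
        System.out.println(charsetFrom("text/html"));
    }
}
```

In the function from the question, this could be used as new InputStreamReader(conn.getInputStream(), CharsetSniffer.charsetFrom(conn.getContentType())).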

Joop Eggen
  • Well guys, I just tried adding these commands: File f = new File("C:\\output.txt"); FileUtils.write(f, content, "UTF-8"); All UTF-8 symbols display fine in output.txt, and now I'm a little confused. Is it that my Eclipse doesn't display UTF-8 symbols properly? – SuperYegorius Feb 22 '13 at 22:04
  • Maybe Eclipse Window / Preferences / Workspace / Text file encoding. Setting Eclipse to UTF-8 seems best for internationally minded projects. – Joop Eggen Feb 22 '13 at 22:17
  • Yes, that's it: setting Window / Preferences / Workspace / Text file encoding to UTF-8, and everything works just fine! – SuperYegorius Feb 23 '13 at 12:21
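
To illustrate why a correctly decoded string can still show up as question marks, here is a small sketch of what happens when text is printed through a stream whose encoding cannot represent it. The class ConsoleEncodingDemo and the helper printedBytes are hypothetical names for this example; the thread itself fixed the problem through the Eclipse setting, not through code.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class ConsoleEncodingDemo {
    // Hypothetical helper: returns the bytes a PrintStream emits for the text
    // when it is configured with the given encoding.
    static byte[] printedBytes(String text, String encoding) throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, encoding);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "\u0416"; // a Cyrillic letter, not representable in US-ASCII

        // Byte counts differ: the ASCII stream substitutes a replacement byte,
        // the UTF-8 stream writes the real multi-byte sequence.
        System.out.println(printedBytes(text, "US-ASCII").length);
        System.out.println(printedBytes(text, "UTF-8").length);

        // Printing through an explicitly UTF-8 stream; the console showing the
        // output must also be set to UTF-8 for the character to display.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(text);
    }
}
```

The string in memory is the same in both cases; only the encoding applied on output differs, which matches the observation that the file written as UTF-8 looked correct.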