-1

The following is the code I use to read the file.

InputStreamReader iReader = new InputStreamReader(new FileInputStream("myrecords.txt"),"ISO-8859-1");
BufferedReader bReader = new BufferedReader(iReader);

public static List<String> bufferedReaderToStringList(BufferedReader bReader) throws IOException {
    List<String> stringList = new ArrayList<String>();
    String text;
    while ((text = bReader.readLine()) != null) {
        stringList.add(text);
    }
    bReader.close();
    return stringList;
}

When I fetch the data from the strings and print it, all the special characters are distorted. They are distorted on my PuTTY screen, and they are still distorted when I save them to the database.

Can anyone please point out what I am doing wrong?

Dolvenh�yda is the distorted one. It contains a Norwegian character.
Dolvenhøyda is the correct one.
fatherazrael

2 Answers

1

How did you print the text to the console? It might be that the console assumes the characters are in UTF-8 while they are actually in ISO-....

'Converting' the printed string might fix the issue: Charset.forName("UTF-8").encode(myString)

uniknow
1

The InputStreamReader wraps an InputStream (binary data) together with its encoding (ISO-8859-1 here) to read text, which Java holds internally as Unicode. The encoding must match the one the file was actually written in.

InputStreamReader iReader = new InputStreamReader(
        new FileInputStream("myrecords.xml"), "ISO-8859-1");

The BufferedReader simply deals with the (presumably correct) text.

BufferedReader bReader = new BufferedReader(iReader);

Hence only the InputStreamReader could be wrong. You can check this against the XML file.

XML is in UTF-8 by default, overridden by the encoding given in <?xml ... encoding=... ?>. In some cases that declaration could be a lie, but opening the XML (for example in a browser) will easily show whether it is correct.

Now Reader, String and such should be right, given the correct encoding.
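
A minimal sketch of how that claim could be checked without relying on any console: print the code points of the decoded string as numbers. The class and method names (CodePointDump, dump) are hypothetical and not from the original post. The letter ø is U+00F8, so if that value shows up the reading step is right and the problem is on the output side.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original post.
class CodePointDump {

    // Render every code point of the string as U+XXXX, separated by spaces.
    static String dump(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // The single byte 0xF8 is the letter 'ø' in ISO-8859-1.
        byte[] latin1 = {(byte) 0xF8};
        String good = new String(latin1, StandardCharsets.ISO_8859_1);
        // Decoding the same byte as UTF-8 is malformed input and yields U+FFFD.
        String bad = new String(latin1, StandardCharsets.UTF_8);
        System.out.println(dump(good)); // U+00F8
        System.out.println(dump(bad));  // U+FFFD
    }
}
```

Seeing U+FFFD (or a different letter) in such a dump would point at the reader's encoding; seeing U+00F8 would point at the console or the database.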

However outputting to the console (System.out) uses the Operating System encoding, which might mangle the given text.
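
As a rough sketch of one way around that, the console stream can be wrapped in a PrintStream with an explicit encoding. This assumes the terminal (PuTTY here) is itself configured to interpret incoming bytes as UTF-8; the class and method names are hypothetical.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper for illustration; not part of the original post.
class ConsoleEncoding {

    // Wrap a stream so that printed text is encoded with the named charset
    // instead of the platform default. Auto-flush is enabled.
    static PrintStream printerFor(OutputStream out, String charsetName)
            throws UnsupportedEncodingException {
        return new PrintStream(out, true, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Assumption: the terminal expects UTF-8.
        PrintStream utf8Out = printerFor(System.out, "UTF-8");
        utf8Out.println("Dolvenh\u00F8yda");
    }
}
```

If the terminal expects a different encoding, the charset name passed in would have to match it instead.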

Writing the text back out to a file requires specifying the desired encoding of the file content. One would also need to keep the encoding in <?xml encoding=... ?> consistent with it.
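
A small sketch of writing with an explicit encoding, using Files.newBufferedWriter. The class name, method name, and temp-file target are made up for illustration; the point is only that the charset is passed explicitly rather than left to the platform default.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original post.
class EncodedFileWriter {

    // Write the lines to the target file, encoding them with the given charset.
    // An existing file is truncated and overwritten.
    static void writeLines(Path target, List<String> lines, Charset charset)
            throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, charset)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp file stands in for the real output file.
        Path out = Files.createTempFile("records", ".txt");
        writeLines(out, Arrays.asList("Dolvenh\u00F8yda"), StandardCharsets.UTF_8);
        System.out.println("Wrote " + out);
    }
}
```

For an XML file, the charset passed here should be the same one named in the XML declaration.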

Joop Eggen
  • Thanks for the response. I tried it with a simple text file but I get the same error, both when displaying on screen and when inserting into the database. When I run it locally it is fine. – fatherazrael Jul 08 '16 at 06:35
  • You could try **ISO-8859-4** too for Norwegian. Or, instead of Latin-1 (ISO-8859-1), use Windows Latin-1 (**Windows-1252**), which is what browsers actually use, for instance. XML is nice because opening it in the browser applies the encoding specified in the XML, so one can check the encoding visually and then use it to read the XML. The reading can be correct while the output to the screen is encoded incorrectly, or the console may not be able to use that encoding. As a last resort, one needs to dump the character codes as numbers and check them. The database is an additional step. – Joop Eggen Jul 08 '16 at 06:45
  • You might use an encoding aware programmer's editor like Notepad++ or JEdit. – Joop Eggen Jul 08 '16 at 06:46
  • I am unable to print it properly using PuTTY and a text box. I have tried both ISO-8859-1 and ISO-8859-4, but in vain. – fatherazrael Jul 08 '16 at 11:08
  • First step: determine the encoding of the XML file, using a hex dump (Linux `hexdump`), JEdit, or Notepad++. Or make a text file in the right encoding and test with that. – Joop Eggen Jul 08 '16 at 11:51