How can I append a UTF-8 string to a properties file? My code is given below.

public static void addNewAppIdToRootFiles() {
    Properties properties = new Properties();
    try {
        FileInputStream fin = new FileInputStream("C:\\Users\\sarika.sukumaran\\Desktop\\root\\root.properties");
        properties.load(new InputStreamReader(fin, Charset.forName("UTF-8")));
        String propertyStr = new String(("قسيمات").getBytes("iso-8859-1"), "UTF-8");
        BufferedWriter bw = new BufferedWriter(new FileWriter(directoryPath + rootFiles, true));
        bw.write(propertyStr);
        bw.newLine();
        bw.flush();
        bw.close();
        fin.close();
    } catch (Exception e) {
        System.out.println("Exception : " + e);
    }
}

But when I open the file, the string "قسيمات" that I wrote to it shows up as "??????". Please help me.

B770
skmaran.nr.iras
  • And what's your problem? – AlexR Jan 09 '12 at 07:52
  • @AlexR: When I write the string "قسيمات" to the file, it is written as "??????". – skmaran.nr.iras Jan 09 '12 at 07:58
  • Are these lines in your code, "BufferedWriter bw = new BufferedWriter(new FileWriter(directoryPath + rootFiles, true)); bw.write(propertyStr);", doing the job for you? Are you getting any exception? – Rajesh Pantula Jan 09 '12 at 07:59
  • Did you check if your system property "file.encoding" is set to UTF-8? – PhilW Jan 09 '12 at 08:03
  • @AlexR : I have edited the code in my question. Now you can easily understand what I meant. No exception. – skmaran.nr.iras Jan 09 '12 at 08:06
  • @PhilW : While creating a new file with this string it works properly. But I want to append a utf-8 string to an existing file. – skmaran.nr.iras Jan 09 '12 at 08:08
  • Are you viewing the output file using an editor that understands UTF-8? Have you tried to read in the value via `properties.load()`? – Bohemian Jan 09 '12 at 08:12
  • `("قسيمات").getBytes("iso-8859-1")`... Why did you convert Arabic string into iso-8859-1 (ISO Latin 1) which doesn't support Arabic code points? http://htmlhelp.com/reference/charset/ ...Just use `getBytes()` which will return the proper Unicode code points for the Arabic string... – ee. Jan 09 '12 at 08:20
  • @Bohemian: I had tried that also. But no way. – skmaran.nr.iras Jan 09 '12 at 08:26
  • @ee : I tried. But no change :( – skmaran.nr.iras Jan 09 '12 at 08:32
  • OK..is your input `root.properties` file really in UTF-8 encoding since you have made the assumption by using `Charset.forName("UTF-8")` to open it? If it is a new file to be created, just use `properties.load(new InputStreamReader(fin));` so it will treat the new file as a Unicode-encoded file. – ee. Jan 09 '12 at 08:37
  • If not, I suspect your text editor is not using the correct font that supports Arabic code points in Unicode. But, usually incorrect font will produce rectangular boxes instead of multiple ?'s, depending on the editor's font glyph implementation. – ee. Jan 09 '12 at 08:44
  • @ee : Any other solution? While creating a new file, it works properly (Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(fileDir), "UTF-8")));. But I need to append. – skmaran.nr.iras Jan 09 '12 at 08:48
  • I don't get it; what do you mean by `Bit Ineed to append`? Or, do you mean `Bit I need to append`? – ee. Jan 09 '12 at 08:50
  • It shall be "UTF-8", not "UTF8"... – ee. Jan 09 '12 at 08:51
  • @ee : It is an issue not only for Arabic. Sorry! I made a mistake while typing. – skmaran.nr.iras Jan 09 '12 at 08:52
  • I will check this again on my way back home if nobody else provides a convincing explanation of your problem. – ee. Jan 09 '12 at 08:54

2 Answers

OK, your first mistake is getBytes("iso-8859-1"). You should not do this conversion at all: ISO-8859-1 cannot represent Arabic, so every character is replaced by "?" before anything reaches the file. Java strings are Unicode internally, so if you want to write Unicode text to a file, just open the file with a writer that uses the encoding you want and write the string as it is.
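As a rough illustration of that advice, here is a minimal sketch of appending a line as UTF-8. The class name `AppendUtf8`, the helper `appendLine`, and the temporary file are made up for this example and are not from the question. It assumes Java 7 or later and wraps a `FileOutputStream` opened in append mode in an `OutputStreamWriter` with an explicit charset, instead of using `FileWriter`, which falls back to the platform default encoding. The Arabic text is written with Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class AppendUtf8 {

    // Appends one line to the file, always encoding it as UTF-8.
    static void appendLine(Path file, String line) throws IOException {
        try (BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(file.toFile(), true), // true = append mode
                StandardCharsets.UTF_8))) {
            bw.write(line); // no getBytes()/new String() round trip needed
            bw.newLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the real root.properties file.
        Path file = Files.createTempFile("root", ".properties");
        appendLine(file, "key1=value1");
        // The Arabic word from the question, written as Unicode escapes.
        appendLine(file, "\u0642\u0633\u064A\u0645\u0627\u062A=aaaaa");
        System.out.println("Wrote " + Files.size(file) + " bytes to " + file);
    }
}
```

The resulting file has to be opened with an editor set to UTF-8, otherwise the Arabic line will still look wrong on screen.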

You do have to care about the charset when you are reading the file. By the way, you are doing that part correctly.

But you do not have to use file manipulation tools to append something to a properties file. You can just call prop.setProperty("yourkey", "yourvalue") and then call prop.store(new FileOutputStream(youfilename), null). Load the existing properties first, because store() rewrites the whole file.
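A minimal sketch of that load, set, store sequence might look like the following. The class `Utf8PropertiesUpdate`, the method `addProperty`, and the temporary file are hypothetical names for illustration. It assumes the file is UTF-8 and therefore uses the `load(Reader)` and `store(Writer, String)` overloads (available since Java 6) with an explicit charset rather than the stream overloads.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

class Utf8PropertiesUpdate {

    // Loads the existing entries, adds one, and writes everything back as UTF-8.
    static void addProperty(Path file, String key, String value) throws IOException {
        Properties props = new Properties();
        try (Reader in = new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.UTF_8)) {
            props.load(in); // keep what is already in the file
        }
        props.setProperty(key, value);
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            props.store(out, null); // null = no comment header text
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for an existing UTF-8 properties file.
        Path file = Files.createTempFile("root", ".properties");
        Files.write(file, "existing=value\n".getBytes(StandardCharsets.UTF_8));
        addProperty(file, "\u0642\u0633\u064A\u0645\u0627\u062A", "aaaaa");
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

Note that store() rewrites the file from the in-memory table, so comments and the original ordering of entries are not preserved. If that matters, appending a raw line with a UTF-8 writer is the simpler route.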

AlexR
  • ("yourkey", "yourvalue") means? – skmaran.nr.iras Jan 09 '12 at 08:55
  • It is just an example... in your case, it shall be `prop.setProperty("قسيمات", "whatever");` so it will write `قسيمات=whatever` in youfilename properties file even though Arabic is right-to-left (RTL) when calling `prop.store(new FileOutputStream(youfilename));` – ee. Jan 09 '12 at 09:07
  • @Alex, ee : Properties properties = new Properties(); properties.setProperty("قسيمات", "aaaaa"); properties.store(new FileOutputStream(directoryPath + rootFiles), ""); Did you mean this? It does not work for me. – skmaran.nr.iras Jan 09 '12 at 09:25
  • Also, I lost all the properties that were already in the file. Is there any option to set UTF-8 on a FileWriter object? – skmaran.nr.iras Jan 09 '12 at 09:27

OK, I have checked the specification for the Properties class. If you use load(InputStream) or store(OutputStream, String), the stream is assumed to be ISO-8859-1 encoded. (The overloads that take a Reader or a Writer use whatever encoding that reader or writer was created with.) Therefore, you have to be careful about a few things:

The characters used in French, German and Portuguese are ISO-8859-1 (Latin-1) compatible, so they normally work fine in ISO-8859-1 and you don't have to worry much. But others, like Arabic and Hebrew characters, are not Latin-1 compatible, so you need to be careful with the choice of encoding for them. If you have a mix of French and Arabic characters, you have no choice but to use Unicode.

What is the encoding of your current input file, if it already exists, when you pass it to the load() method of Properties? If it is not the default ISO-8859-1, you need to figure out what it is before opening the file. If the input file is UTF-8, use properties.load(new InputStreamReader(new FileInputStream("infile"), "UTF-8")); and then stick to this encoding till the end. Match the file encoding with the character encoding as well.

If it is a new input file to be used with the load() method of Properties, choose a file encoding that can represent your characters. Then stick to this encoding till the end.

Your output file's encoding should be the same as the one used with the load() method before you call store(). If it is not the default ISO-8859-1, you need to figure out what it is before saving the file, and stick to it till the end. Match the file encoding with the character encoding as well. If the output file should be UTF-8, explicitly use UTF-8 when saving the file, for example by passing store() a Writer created with that encoding. But if store() still ends up producing an output file in ISO-8859-1, then you need to do what is suggested next...

If you stick to the default ISO-8859-1, it works fine for characters like those in French. But if the characters are not ISO-8859-1 (Latin-1) compatible, you need to use Unicode escapes instead, for example \uFE94 for an Arabic character. For me, this escaping is too tedious, and normally we use the native2ascii utility provided with the JDK to convert a properties file from one encoding to another. Of course, there are other ways... just check the references below... For me, it is better to use a properties file in XML format, since it is UTF-8 by default...
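The two alternatives mentioned in the last point can be compared with a small sketch. The class and method names below are made up for illustration, and everything is kept in memory with byte arrays so that no real file is assumed. `store(OutputStream, String)` writes ISO-8859-1 and turns characters outside that range into Unicode escapes on its own, while `storeToXML(OutputStream, String)` writes an XML document that is UTF-8 by default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesEncodingAlternatives {

    // Classic format: ISO-8859-1 bytes, non-Latin-1 characters become Unicode escapes.
    static byte[] storeEscaped(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.store(out, null);
        return out.toByteArray();
    }

    static Properties loadEscaped(byte[] data) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(data)); // decodes the escapes again
        return props;
    }

    // XML format: storeToXML without an explicit encoding uses UTF-8.
    static byte[] storeXml(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, null);
        return out.toByteArray();
    }

    static Properties loadXml(byte[] data) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(new ByteArrayInputStream(data));
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Hypothetical key; the value is the Arabic word from the question.
        props.setProperty("coupons", "\u0642\u0633\u064A\u0645\u0627\u062A");

        byte[] escaped = storeEscaped(props);
        System.out.println(new String(escaped, StandardCharsets.ISO_8859_1));
        System.out.println("escaped round trip ok: " + loadEscaped(escaped).equals(props));

        byte[] xml = storeXml(props);
        System.out.println(new String(xml, StandardCharsets.UTF_8));
        System.out.println("XML round trip ok: " + loadXml(xml).equals(props));
    }
}
```

Either format survives a round trip through the matching load method; the escaped form is just harder to read and edit by hand, which is the tedium described above.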

References:

Community
ee.