
I am reading multiple files and writing them all into one new file. However, when the new file is generated, what look like EOF characters are inserted into it.

They appear in the merged file at each point where one of the source files ended:

|ýÿ

Note: I'm using UTF-16LE because it seems to be the only encoding that properly handles prime quotation marks.

BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(exportFile), "UTF-16LE"));

for (File f : files) {
    System.out.println("merging: " + f.getName());
    try {
        // Each source file is decoded as UTF-16LE, matching the writer above.
        Reader reader = new InputStreamReader(new FileInputStream(f), "UTF-16LE");
        BufferedReader in = new BufferedReader(reader);
        String aLine;
        while ((aLine = in.readLine()) != null) {
            out.write(aLine);
            out.newLine();
        }
        in.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
out.close(); // flush the merged output once all files are copied
tai
  • These are not end-of-file characters. They are probably artifacts produced because you are using UTF-16 when the source files are not actually in UTF-16. The first thing to do is to find out what the original files' encoding actually is. How do you view the files? – RealSkeptic Nov 20 '15 at 18:49
  • Sweet!! I was opening the files via Notepad. You actually helped me figure out the main issue I was having which is why I was using UTF-16. Forgot to check what encoding the files were (ANSI) and am now using the proper one. – tai Nov 20 '15 at 19:05
  • As far as the artifacts, they seemed very similar to the EOF symbols which is why I assumed that's what they meant. – tai Nov 20 '15 at 19:07
  • I am not actually sure what you mean by "end of file symbols". Some operating systems use ctrl-Z as EOF, but usually, end-of-file is a condition, not a character. But well, if the problem is solved, I suppose it doesn't matter. – RealSkeptic Nov 20 '15 at 19:14
  • I stumbled upon this in my searches and found a similar thread in regards to Java: [link](http://stackoverflow.com/questions/4906341/why-do-i-get-a-%C3%BF-char-after-every-include-that-is-extracted-by-my-parser-c), so I assumed it must be something similar to what I was seeing. Thanks again. – tai Nov 22 '15 at 16:35
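
For anyone landing here with the same symptom, here is a minimal sketch of what the comments describe. It assumes that Notepad's "ANSI" means windows-1252 on the machine in question (it can be a different code page), and the class name `MergeDemo` and the helper `viewMisdecodedAsAnsi` are made up for illustration. One likely explanation for the `ýÿ` is that a file with an odd number of bytes leaves one byte over when decoded as UTF-16LE; the decoder substitutes U+FFFD for it, which is written back as the bytes FD FF and displayed as `ýÿ` by an ANSI viewer. With the helper below, `"abc"` comes out as `abýÿ`.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

public class MergeDemo {
    // Assumption: Notepad's "ANSI" corresponds to windows-1252 on this machine.
    static final Charset ANSI = Charset.forName("windows-1252");

    /** Copies every line of each input file into exportFile, using one charset throughout. */
    static void merge(List<File> files, File exportFile, Charset charset) throws IOException {
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(exportFile), charset))) {
            for (File f : files) {
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(new FileInputStream(f), charset))) {
                    String aLine;
                    while ((aLine = in.readLine()) != null) {
                        out.write(aLine);
                        out.newLine();
                    }
                }
            }
        }
    }

    /**
     * Decodes ANSI bytes as UTF-16LE, re-encodes the result as UTF-16LE, and
     * returns what an ANSI viewer would display for the bytes written.
     */
    static String viewMisdecodedAsAnsi(String text) {
        byte[] ansiBytes = text.getBytes(ANSI);
        String wrong = new String(ansiBytes, StandardCharsets.UTF_16LE);
        byte[] written = wrong.getBytes(StandardCharsets.UTF_16LE);
        return new String(written, ANSI);
    }

    public static void main(String[] args) throws IOException {
        // Three bytes: two form one UTF-16 code unit, the third is left over.
        System.out.println(viewMisdecodedAsAnsi("abc"));

        // Hypothetical sample files, written as ANSI, including a prime quote (U+2019).
        File a = File.createTempFile("part1", ".txt");
        File b = File.createTempFile("part2", ".txt");
        File merged = File.createTempFile("merged", ".txt");
        Files.write(a.toPath(), "it\u2019s one\r\ntwo".getBytes(ANSI));
        Files.write(b.toPath(), "three".getBytes(ANSI));

        // Reading and writing with the files' real encoding avoids the artifact.
        merge(Arrays.asList(a, b), merged, ANSI);
        System.out.print(new String(Files.readAllBytes(merged.toPath()), ANSI));
    }
}
```

Passing the charset into `merge` keeps the reader and the writer in agreement. If the merged file really does need to be UTF-16LE, the two streams could take different charsets instead, as long as the input one matches what the source files actually contain.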
