I'm trying to write the prime numbers between 2 and 1000 to a text file, but the file ends up containing other characters.

Here is the contents of my TreeSet:

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89,
97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181,
191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281,
283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397,
401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503,
509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619,
631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743,
751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863,
877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997]

Here is the snippet of my code that retrieves the prime numbers above and prints them to the text file "PrimeNumbersList.txt":

        pw = new PrintWriter("PrimeNumbersList.txt");
        int count = 0;
        for(Long l: treeSet){
            if(count==0){
                pw.print(""+l);
                pw.flush();
                count++;
                continue;
            }
            pw.print("\n"+l);
            pw.flush();
        }

Expected output:

2
3
5
.
.
.
997

The output I got:

ਲਲ਼ਵ਷ㄱㄊਲ਼㜱ㄊਹ㌲㈊ਹㄳ㌊਷ㄴ㐊ਲ਼㜴㔊ਲ਼㤵㘊਱㜶㜊਱㌷㜊ਹ㌸㠊ਹ㜹ㄊ㄰ㄊ㌰ㄊ㜰ㄊ㤰ㄊ㌱ㄊ㜲ㄊㄳㄊ㜳ㄊ㤳ㄊ㤴ㄊㄵㄊ
㜵ㄊ㌶ㄊ㜶ㄊ㌷ㄊ㤷ㄊㄸㄊㄹㄊ㌹ㄊ㜹ㄊ㤹㈊ㄱ㈊㌲㈊㜲㈊㤲㈊㌳㈊㤳㈊ㄴ㈊ㄵ㈊㜵㈊㌶㈊㤶㈊ㄷ㈊㜷㈊ㄸ㈊㌸㈊㌹㌊㜰㌊
ㄱ㌊㌱㌊㜱㌊ㄳ㌊㜳㌊㜴㌊㤴㌊㌵㌊㤵㌊㜶㌊㌷㌊㤷㌊㌸㌊㤸㌊㜹㐊㄰㐊㤰㐊㤱㐊ㄲ㐊ㄳ㐊㌳㐊㤳㐊㌴㐊㤴㐊㜵㐊ㄶ㐊㌶㐊
㜶㐊㤷㐊㜸㐊ㄹ㐊㤹㔊㌰㔊㤰㔊ㄲ㔊㌲㔊ㄴ㔊㜴㔊㜵㔊㌶㔊㤶㔊ㄷ㔊㜷㔊㜸㔊㌹㔊㤹㘊㄰㘊㜰㘊㌱㘊㜱㘊㤱㘊ㄳ㘊ㄴ㘊㌴㘊
㜴㘊㌵㘊㤵㘊ㄶ㘊㌷㘊㜷㘊㌸㘊ㄹ㜊㄰㜊㤰㜊㤱㜊㜲㜊㌳㜊㤳㜊㌴㜊ㄵ㜊㜵㜊ㄶ㜊㤶㜊㌷㜊㜸㜊㜹㠊㤰㠊ㄱ㠊ㄲ㠊㌲㠊㜲㠊
㤲㠊㤳㠊㌵㠊㜵㠊㤵㠊㌶㠊㜷㠊ㄸ㠊㌸㠊㜸㤊㜰㤊ㄱ㤊㤱㤊㤲㤊㜳㤊ㄴ㤊㜴㤊㌵㤊㜶㤊ㄷ㤊㜷㤊㌸㤊ㄹ㤊㜹
mps88
  • This is an encoding issue. PrintWriter doesn't print in ASCII, so if you're just opening the file with cat or a text editor, it is likely being read with the wrong encoding. Read it in using a Reader and you'll probably see the right data. I believe PrintWriter uses the 2-byte encoding that Java uses by default for characters, although the documentation doesn't actually promise a particular encoding. Apparently that 2-byte encoding looks like Korean if misinterpreted. – Gabe Sechan Oct 31 '21 at 21:38
  • Actually you may get this to work with just new PrintWriter("PrimeNumbersList.txt", "UTF-8"); – Gabe Sechan Oct 31 '21 at 21:44
  • https://docs.oracle.com/javase/7/docs/api/java/io/PrintWriter.html#print(java.lang.String) says "the string's characters are converted into bytes according to the platform's default character encoding, and these bytes are written in exactly the manner of the write(int) method" - what's your platform default encoding? You can find it out like this: https://stackoverflow.com/questions/1749064/how-to-find-the-default-charset-encoding-in-java – Janos Vinceller Nov 01 '21 at 14:08
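
The platform default encoding asked about in the comment above can be looked up with a few lines of Java. This is only a minimal sketch: the class name `DefaultCharsetCheck` and its method are made up for illustration, and the value printed depends on the JVM and operating system in use.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Returns the name of the charset that is used when none is passed explicitly,
    // for example by new PrintWriter(String fileName).
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
    }
}
```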

1 Answer

It took me a while to figure out what is going on. IMO, it is only minimally related to the character set in use. You have the following code:

        pw = new PrintWriter("PrimeNumbersList.txt");
        int count = 0;
        for(Long l: treeSet){
            if(count==0){
                pw.print(""+l);
                pw.flush();
                count++;
                continue;
            }
            pw.print("\n"+l);
            pw.flush();
        }

First, the "\n" line separator is being ignored, so that won't work regardless. What surprised me, though, is the conversion from the long to a string. In the code above, if you put the string "\n" before the long in the print statement, you get the garbled, CJK-looking output. If you put the string after the long, you get the expected numeric output, but without the line separators. I believe this comes from the way the string is parsed and written to the underlying OutputStreamWriter: prepending "\n" to the long seems to cause the entire string to be interpreted differently than appending it (which, as I said, doesn't produce line breaks in either case).

To simplify the code above and get the desired output, I would recommend the following:

  • use try-with-resources to both catch any exception and close the file.
  • use println(long) to print the value followed by a newline.
  • the automatic close at the end of the try block also flushes the output buffer, so no explicit flush() is needed.

        try (PrintWriter pw = new PrintWriter("f:/PrimeNumbersList.txt")) {
            for (long l : longs) {
                pw.println(l);
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }

The above printed the values, one per line, in the target file. I did not have to specify a character set, although in practice specifying one is recommended, usually based on the Locale.
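
If you do want to pin the character set, one possible way is sketched below. It is an illustration under assumptions, not the code I actually ran: the class name `PrimeFileSketch`, the helper `roundTrip`, and the choice of UTF-8 with a temporary file are all made up for the example.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

public class PrimeFileSketch {
    // Writes each value on its own line, encoding the text as UTF-8.
    public static void writeLines(Path target, Collection<Long> values) {
        try (PrintWriter pw = new PrintWriter(
                Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            for (long value : values) {
                pw.println(value); // number followed by the platform line separator
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes to a temporary file and reads the lines back as UTF-8.
    public static List<String> roundTrip(Collection<Long> values) {
        try {
            Path tmp = Files.createTempFile("primes", ".txt");
            try {
                writeLines(tmp, values);
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TreeSet<Long> primes = new TreeSet<>(Arrays.asList(2L, 3L, 5L, 7L, 11L));
        System.out.println(roundTrip(primes)); // prints [2, 3, 5, 7, 11]
    }
}
```

Because println ends every line, including the last one, with a line separator, the file written this way always finishes with a line break.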

WJS
  • That’s not what is going on. I would have been very surprised if the `PrintWriter` had such strange behavior. The only thing that differs between the OP’s code and your improved version is that you always write a line break after the number, whereas the OP’s code does not write a line break after the last line. That missing line break at the end of the file seems to be enough to cause the built-in Windows editor to interpret the file as UTF-16. Open it in a different editor and you’ll see that there is almost no difference. (Just writing a single space at the end, so the file size is odd, also helps…) – Holger Jul 20 '22 at 17:25
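
The misinterpretation described in this last comment can be reproduced without any editor. The sketch below is an illustration, not part of the original thread: it assumes the file holds plain single-byte digits and "\n" characters and that the reader guesses little-endian UTF-16. The class name `Utf16MisreadDemo` and its method are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Utf16MisreadDemo {
    // Encodes the text as single-byte ASCII, then decodes the same bytes as
    // UTF-16LE, which pairs every two bytes into one 16-bit character.
    public static String misreadAsUtf16(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Start of the OP's file: 10 bytes, so the bytes pair up evenly.
        String written = "2\n3\n5\n7\n11";
        String misread = misreadAsUtf16(written);
        for (int i = 0; i < misread.length(); i++) {
            System.out.printf("U+%04X%n", (int) misread.charAt(i));
        }
    }
}
```

This prints U+0A32, U+0A33, U+0A35, U+0A37 and U+3131, which are the first five characters of the garbled output in the question: each digit byte (0x32, 0x33, ...) is combined with the following newline byte (0x0A) into one character, and the two bytes of "11" become U+3131.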