I have the following code:

int zeichen = System.in.read();

System.out.println(zeichen);

The documentation for read() says: Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255.

So for ASCII chars it converts the char into its corresponding value, but how does it convert chars that are out of the ASCII charset?

For example: £ or ₩ is converted to 239 and 丌 is converted to 228

How does this conversion happen, and how can I "calculate" the corresponding value for each char?

  • depends on your local charset – njzk2 Dec 16 '20 at 19:10
  • The examples you named were just the first byte of the conversion. Have you heard of UTF-8? – xehpuk Dec 16 '20 at 19:21
  • `System.in` is an `InputStream` instance designed to work with bytes, not characters. – terrorrussia-keeps-killing Dec 16 '20 at 19:22
  • @njzk2 I am using UTF-8 – GPSforLEGENDS Dec 16 '20 at 19:34
  • So, you have your answer then. In UTF-8, characters are composed of 1 or more bytes. `read()` returns the next byte. You may need to call `read()` multiple times to get all the bytes of your character. – njzk2 Dec 16 '20 at 19:36
  • Is your question "How does this work technically?" or "How would I do this in Java?" – that other guy Dec 16 '20 at 19:37
  • The value of a `byte` ranges from -128 to 127. Bytes are signed in Java. – NomadMaker Dec 16 '20 at 20:48
  • If you want to read text from the console you should use [`java.io.Console`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/io/Console.html), which handles all the decoding for you. Though note that `Console` is likely not available when you run your program from the IDE ([relevant question](https://stackoverflow.com/questions/104254/java-io-console-support-in-eclipse-ide)). – Marcono1234 Dec 16 '20 at 21:19
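
The points raised in the comments can be tried out with a minimal sketch, assuming the input really is UTF-8 as stated above. The class name `Utf8Demo` and the helpers `unsignedUtf8Bytes` and `decodeUtf8` are hypothetical names chosen for illustration. The first helper lists the values that successive `read()` calls would return for a given text; the second lets an `InputStreamReader` reassemble the bytes into characters.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Demo {

    // Hypothetical helper: returns the UTF-8 bytes of text as unsigned
    // values (0 to 255), i.e. what successive read() calls would return.
    public static int[] unsignedUtf8Bytes(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        int[] values = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            // Java bytes are signed (-128 to 127); masking yields 0 to 255.
            values[i] = raw[i] & 0xFF;
        }
        return values;
    }

    // Hypothetical helper: decodes the whole stream as UTF-8 text.
    // Assumption: the stream really contains UTF-8 encoded data.
    public static String decodeUtf8(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        int unit;
        // Reader.read() returns one UTF-16 code unit at a time, or -1 at end.
        while ((unit = reader.read()) != -1) {
            text.append((char) unit);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "\u4E0C" is the character 丌 from the question.
        System.out.println(Arrays.toString(unsignedUtf8Bytes("\u4E0C"))); // [228, 184, 140]

        // Feeding the same three bytes back through a reader restores the character.
        byte[] bytes = {(byte) 228, (byte) 184, (byte) 140};
        System.out.println(decodeUtf8(new ByteArrayInputStream(bytes)));
    }
}
```

To read typed text instead of fixed bytes, `System.in` could be passed to `decodeUtf8` in place of the `ByteArrayInputStream`, provided the terminal actually sends UTF-8.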
