How to filter ANSI/Terminal Control sequences from InputStream?

Question

I'm building a simple terminal emulator using the pty4j library. My program has a print() method that renders the text to a canvas using the GraphicsContext.fillText() method from JavaFX. I connect the emulator to an instance of cmd and read its output from a buffered reader. Sadly, the text it receives also includes ANSI escape characters (see image). However, if I print the output to the IDE or system console, it works fine.

I tried using the readLine() method from the BufferedReader and then applying a regex, but because not all input received from the terminal is terminated by a \n, it blocks on the last line.

Thread terminalReaderThread = new Thread() {
   public void run() {
      try {
         int c;
         while (terminal.isRunning() && (c = terminal.getReader().read()) != -1) {
            if(c != 0){
               print(Character.toString((char)c));
            }
         }

      } catch (IOException e) {
         e.printStackTrace();
      }
   }
};
terminalReaderThread.start();

Is there an effective way to filter these escape codes from the inputStream?

You only have to write yourself a `FilterReader` subclass to solve this problem. — user207421, Mar 06 '20 at 02:49
@user207421 Some of the control sequences are multiple characters long. how would that be possible with a `FilterReader`? — rempas, Mar 06 '20 at 15:42
Depending on your needs, you might get away with complicated regexp filtering. Since this is error-prone (some sequences can be interleaved), a better way is to pass the data through a terminal parser recognizing the sequences. — jerch, Apr 01 '20 at 12:59

score 1 · Accepted Answer · answered Mar 17 '22 at 10:03

The answers I received to my question (Safely ignoring unknown ANSI, ESC/P, ESC/POS sequences, know the length) should also answer yours. If you read the standard (ECMA-48, https://www.ecma-international.org/wp-content/uploads/ECMA-48_5th_edition_june_1991.pdf) you will see that the sequences always start with the ESCAPE character and always end with a FINAL BYTE whose value lies in a defined range. That information is enough to detect the start and end of each sequence, for example with a regex. The newline character (like the other C0 codes) is also not allowed inside an escape sequence, so a sequence is always located entirely within one line.
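As a rough illustration of that start/end detection on a character stream, here is a minimal sketch of a `FilterReader` subclass that keeps a small state machine across reads, so it never has to wait for a newline. The class name `AnsiStrippingReader` and the helper `strip` are made up for this example and are not part of pty4j or JavaFX. It assumes 7-bit sequences only: control sequences run from ESC [ to a final byte in 0x40 to 0x7E, short escape sequences are ESC plus optional intermediate bytes (0x20 to 0x2F) and one final byte, and string sequences such as OSC (ESC ]) end at ST (ESC \). Accepting BEL as an OSC terminator is an xterm convention rather than part of ECMA-48, and malformed sequences are handled only loosely.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: drops escape sequences from a character stream.
// The parser state survives between read() calls, so a sequence that is
// split across two reads is still recognized.
class AnsiStrippingReader extends FilterReader {
    private static final char ESC = 0x1B;
    private static final char BEL = 0x07;

    private enum State { TEXT, ESCAPE, INTERMEDIATE, CSI, STRING, STRING_ESCAPE }
    private State state = State.TEXT;

    AnsiStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (keep((char) c)) return c;
        }
        return -1;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        while (true) {
            int n = in.read(buf, off, len);
            if (n == -1) return -1;
            int kept = 0;
            for (int i = 0; i < n; i++) {       // compact the kept chars in place
                char c = buf[off + i];
                if (keep(c)) buf[off + kept++] = c;
            }
            if (kept > 0) return kept;          // otherwise read more input
        }
    }

    /** Advances the state machine; returns true if c is ordinary text. */
    private boolean keep(char c) {
        switch (state) {
            case TEXT:
                if (c != ESC) return true;
                state = State.ESCAPE;
                return false;
            case ESCAPE:
                if (c == '[') state = State.CSI;
                else if (c == ']' || c == 'P' || c == 'X' || c == '^' || c == '_')
                    state = State.STRING;               // OSC, DCS, SOS, PM, APC
                else if (c >= 0x20 && c <= 0x2F) state = State.INTERMEDIATE;
                else if (c != ESC) state = State.TEXT;  // final byte, e.g. ESC c
                return false;
            case INTERMEDIATE:
                if (c == ESC) state = State.ESCAPE;
                else if (c < 0x20 || c > 0x2F) state = State.TEXT;
                return false;
            case CSI:
                if (c == ESC) state = State.ESCAPE;
                else if (c >= 0x40 && c <= 0x7E) state = State.TEXT; // final byte
                return false;
            case STRING:
                if (c == BEL) state = State.TEXT;       // xterm-style terminator
                else if (c == ESC) state = State.STRING_ESCAPE;
                return false;
            default: // STRING_ESCAPE: ESC seen inside a string sequence
                if (c == '\\') state = State.TEXT;      // ST = ESC \
                else if (c != ESC) state = State.STRING;
                return false;
        }
    }

    /** Convenience wrapper for trying the filter on a String. */
    static String strip(String s) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[64];
        try (Reader r = new AnsiStrippingReader(new StringReader(s))) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) out.append(buf, 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

In the loop from the question, the reader returned by terminal.getReader() could be wrapped once in `new AnsiStrippingReader(...)` (assuming it is a `java.io.Reader`) and the wrapper read from instead. Note that this only hides the sequences: cursor movement and screen clearing are discarded too, so programs that redraw the screen will still look wrong until the emulator interprets those sequences rather than dropping them.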
