
Some context first:
I'm making a device that turns an electronic typewriter into a serial printer/terminal. (Don't ask why; I know it doesn't make much practical sense.)
The device is inserted between the typewriter's controller and its keyboard.
It can:

  • let the keyboard through, transparently,
  • obtain key presses, with or without blocking the typewriter from seeing them,
  • insert additional key presses.

With this I can make the typewriter work in different modes:

  • normal typewriter,
  • typewriter with each typed character logged through the serial port,
  • serial printer,
  • serial terminal.

For the serial printer/terminal modes I want to accept and understand some of the ANSI (for terminal) and ESC/P, ESC/POS (for printer) escape sequences, depending on the mode.

And here comes the problem. Because the device is limited, it can accept only a very small subset of the escape sequences: those that can actually be performed on the typewriter. I want to simply ignore any unsupported sequences.
The problem is that the sequences have different lengths.
When a sequence the device does not recognise arrives, is there a general way to determine how many bytes long it will be, so that I know how many characters to ignore? (For example, some simple rules based on the first character or characters?)
Or am I forced to prepare a long lookup table (which takes precious flash space) covering all possible sequences, so that I always know how many bytes to ignore?

I want to avoid:

  • ignoring actual valid data which comes after the sequence and not printing it
  • printing parts of the escape sequences on paper
  • interpreting parts of unknown sequences as start of a new sequence

Of course, I could define my own sequences, but then I would need a custom driver for my device. I prefer to use an existing standard.

Edited to add: as @Raymond Chen shows in the comment below, for ANSI sequences it is possible to detect where they terminate, so no problem there. However, for the ESC/P sequences (in printer mode) I haven't found a similar way to tell.

  • The [Wikipedia page](https://en.wikipedia.org/wiki/ANSI_escape_code#Description) spells out the syntax. Or you can [read the formal specification](https://www.ecma-international.org/wp-content/uploads/ECMA-48_5th_edition_june_1991.pdf). All you care about is detecting termination, which is relatively straightforward. (1) ESC followed by a single "uppercase letter". (2) ESC followed by [, then digits and punctuation, then a "letter". (3) ESC followed by X, ], ^ or _, then an arbitrary string terminated by ESC+\. (I use quotation marks because they are a little more than letters.) – Raymond Chen Feb 16 '22 at 14:54
  • Yes, that solves it for ANSI, but the issue remains for ESC/P and ESC/POS. – bicyclesonthemoon Feb 16 '22 at 15:05
  • You'll have to look at the ESC/P and ESC/POS specs to see if there's a pattern for those escapes. – Raymond Chen Feb 16 '22 at 18:37
  • In which programming language do you code this project? – Marc Balmer Feb 17 '22 at 09:43
  • @MarcBalmer I write this in C. I also have Python scripts that convert button press sequences defined in a text file into .c and .h files. – bicyclesonthemoon Feb 17 '22 at 10:17
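
The termination rules from the first comment can be sketched as a small state machine in C. This is a minimal sketch rather than a complete ECMA-48 parser, and all names in it (`ansi_skipper`, `ansi_feed`, `ansi_strip`) are hypothetical. It assumes 7-bit input, treats ESC P like the other string introducers, and also swallows ESC followed by intermediate bytes (0x20 to 0x2F) plus one final byte; both of those go slightly beyond what the comment lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ST_TEXT,       /* ordinary printable data                        */
    ST_ESC,        /* just saw ESC                                   */
    ST_NF,         /* ESC + intermediate bytes, waiting for a final  */
    ST_CSI,        /* ESC [ ... waiting for a final byte 0x40..0x7E  */
    ST_STRING,     /* inside a string command, waiting for ESC \     */
    ST_STRING_ESC  /* inside a string command, just saw ESC          */
} ansi_state;

typedef struct { ansi_state state; } ansi_skipper;

/* Feed one byte. Returns true if the byte is ordinary data,
 * false if it belongs to an escape sequence and should be ignored. */
static bool ansi_feed(ansi_skipper *s, uint8_t b)
{
    switch (s->state) {
    case ST_TEXT:
        if (b != 0x1B) return true;
        s->state = ST_ESC;
        return false;
    case ST_ESC:
        if (b == '[')
            s->state = ST_CSI;
        else if (b == 'P' || b == 'X' || b == ']' || b == '^' || b == '_')
            s->state = ST_STRING;      /* terminated by ESC \ */
        else if (b >= 0x20 && b <= 0x2F)
            s->state = ST_NF;          /* e.g. ESC ( B */
        else
            s->state = ST_TEXT;        /* two-byte sequence, done */
        return false;
    case ST_NF:
        if (b < 0x20 || b > 0x2F) s->state = ST_TEXT;
        return false;
    case ST_CSI:
        if (b >= 0x40 && b <= 0x7E) s->state = ST_TEXT;
        return false;
    case ST_STRING:
        if (b == 0x1B) s->state = ST_STRING_ESC;
        return false;
    case ST_STRING_ESC:
        if (b == '\\') s->state = ST_TEXT;
        else if (b != 0x1B) s->state = ST_STRING;
        return false;
    }
    return true;
}

/* Helper for illustration: copy a NUL-terminated string to `out`,
 * dropping every escape sequence. Returns the number of bytes kept. */
static size_t ansi_strip(const char *in, char *out)
{
    ansi_skipper s = { ST_TEXT };
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (; *in != '\0'; in++)
        if (ansi_feed(&s, (uint8_t)*in)) out[n++] = *in;
    out[n] = '\0';
    return n;
}
```

The whole parser state is a single enum value, so no lookup table is needed. On the device, the supported sequences would be matched first and only the rest passed to a skipper like this one.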

1 Answer

ESC/P and ESC/POS have specifications published by EPSON, but they are only de facto standards, not formally standardized ones.
Other vendors that adopt them do not necessarily comply with them and frequently add their own extensions.

EPSON itself has made various extensions, and there are further specifications such as ESC/P2, ESC/Page, and ESC/Label (Zebra ZPL II compatible?).

For example, ESC/POS is here.
ESC/POS Command Reference for TM Printers

And here is ESC/P.
EPSON ESC/P Reference Manual

If you look elsewhere, you will find these.
ESC/P - Wikipedia
ESC/P 2 and FX Commands
ESC/Label Command Reference Guide - Epson
ESC/Page Command Reference, 4th Edition (in Japanese; membership registration is required).


There are loose heuristics for interpreting them, but there are no strictly standardized rules that apply to all commands.

You have several options, from interpreting every documented command faithfully to supporting only a subset and giving up on the details.

There is a tool for this that is probably well known.
ESC/POS command-line tools

Included utilities
esc2text
esc2text extracts text and line breaks from binary ESC/POS files.

It's not finished yet, but I'm writing a similar tool myself.
EscPosUtils

If you search, you will find other similar tools.

kunif
  • Thanks for the answer. I had hoped there would be a better, more general solution. From reading some of the documents I already see that some sequences have fixed lengths and some are terminated. The task is slightly bigger than expected. Thanks anyway. I'm not sure yet which way I will approach this. – bicyclesonthemoon Feb 17 '22 at 20:55
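
One way to keep such a table small is to store only how each command is delimited, not what it does. The following is a rough sketch in C under several assumptions: the names (`escp_entry`, `escp_feed`, `escp_strip`) are hypothetical, the table is a tiny illustrative subset that would have to be checked against the EPSON reference for the targeted printer, and any command missing from the table is assumed to take no parameters. Commands whose data length depends on a mode byte, such as ESC *, do not fit this scheme and would need special handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* How the bytes after "ESC <cmd>" are delimited. */
typedef enum {
    LEN_FIXED,   /* exactly `fixed` parameter bytes                    */
    LEN_COUNT16, /* `fixed` bytes, then nL nH, then nL + 256*nH bytes  */
    LEN_NUL      /* parameter list terminated by a NUL byte            */
} len_kind;

typedef struct { uint8_t cmd, kind, fixed; } escp_entry;

/* Illustrative subset only; three bytes of flash per entry. */
static const escp_entry escp_table[] = {
    { '-', LEN_FIXED,   1 },  /* ESC - n   underline on/off        */
    { '!', LEN_FIXED,   1 },  /* ESC ! n   master select           */
    { '3', LEN_FIXED,   1 },  /* ESC 3 n   line spacing            */
    { 'J', LEN_FIXED,   1 },  /* ESC J n   advance paper           */
    { '$', LEN_FIXED,   2 },  /* ESC $ nL nH  horizontal position  */
    { 'K', LEN_COUNT16, 0 },  /* ESC K nL nH d1..dk  graphics      */
    { '(', LEN_COUNT16, 1 },  /* ESC ( x nL nH d1..dk  (ESC/P2)    */
    { 'D', LEN_NUL,     0 },  /* ESC D n1..nk NUL  horizontal tabs */
    { 'B', LEN_NUL,     0 },  /* ESC B n1..nk NUL  vertical tabs   */
};

typedef enum { P_TEXT, P_CMD, P_FIXED, P_NL, P_NH, P_DATA, P_NUL } escp_state;
typedef struct { escp_state state; uint8_t kind; uint16_t remaining; } escp_skipper;

/* Feed one byte; returns true if it is printable data, false if skipped. */
static bool escp_feed(escp_skipper *s, uint8_t b)
{
    size_t i;
    switch (s->state) {
    case P_TEXT:
        if (b != 0x1B) return true;
        s->state = P_CMD;
        return false;
    case P_CMD:
        s->state = P_TEXT;               /* unknown: assume no parameters */
        for (i = 0; i < sizeof escp_table / sizeof escp_table[0]; i++) {
            if (escp_table[i].cmd != b) continue;
            s->kind = escp_table[i].kind;
            s->remaining = escp_table[i].fixed;
            if (s->kind == LEN_NUL)          s->state = P_NUL;
            else if (s->remaining > 0)       s->state = P_FIXED;
            else if (s->kind == LEN_COUNT16) s->state = P_NL;
            break;
        }
        return false;
    case P_FIXED:
        if (--s->remaining == 0)
            s->state = (s->kind == LEN_COUNT16) ? P_NL : P_TEXT;
        return false;
    case P_NL:
        s->remaining = b;
        s->state = P_NH;
        return false;
    case P_NH:
        s->remaining = (uint16_t)(s->remaining | ((uint16_t)b << 8));
        s->state = s->remaining ? P_DATA : P_TEXT;
        return false;
    case P_DATA:
        if (--s->remaining == 0) s->state = P_TEXT;
        return false;
    case P_NUL:
        if (b == 0) s->state = P_TEXT;
        return false;
    }
    return true;
}

/* Helper for illustration: copy `len` bytes to `out` without the
 * escape sequences. Returns the number of bytes kept. */
static size_t escp_strip(const uint8_t *in, size_t len, uint8_t *out)
{
    escp_skipper s = { P_TEXT, LEN_FIXED, 0 };
    size_t n = 0, i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++)
        if (escp_feed(&s, in[i])) out[n++] = in[i];
    return n;
}
```

Because zero-parameter commands are the default, only commands that carry parameters cost table space. The trade-off is that an unlisted command with parameters will leak its parameter bytes onto the paper, so the table still has to cover every parameterised command the host driver is expected to send.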