8

Example: I start recording with `script` and try to type `echo test`, but omit the `o`, so I backspace to correct it.

When I `cat typescript`, everything looks normal, since the codes are interpreted, but if I use `less` or `vim` I see `ech test^H^[[K^H^[[K^H^[[K^H^[[K^H^[[Ko test^M`
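To reproduce this without recording a session, here is a minimal sketch. It assumes a POSIX `printf` and a `cat` that supports `-v`; the temporary file is a made-up stand-in for a real typescript. It writes the same bytes and prints them in caret notation:

```shell
# Hypothetical sample file standing in for a real typescript.
sample=$(mktemp)

# "ech test", five backspace + erase-to-end-of-line pairs, then "o test" and CR LF.
printf 'ech test\b\033[K\b\033[K\b\033[K\b\033[K\b\033[Ko test\r\n' > "$sample"

# cat -v shows control characters visibly instead of letting the terminal act on them.
raw=$(cat -v "$sample")
printf '%s\n' "$raw"
```

Here `^H` is the backspace, `^[[K` is the erase-to-end-of-line sequence, and `^M` is the carriage return.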

I fully understand what this is and why it's happening, but is there any way to "burn in" the codes and just see the result in a file? My kludgy method is to cat the file, then copy/paste the text out of the terminal, but surely some combination of cat, sed, awk, or something else can get me there more easily?

Chris Martin
Joe Fruchey
  • Try `less -r typescript` or `less -R typescript`. – John1024 Feb 01 '15 at 23:50
  • Ah, that's cool, I didn't know about -r. Any way to save that to a file? `less -r typescript > newfile` didn't work. – Joe Fruchey Feb 02 '15 at 00:00
  • Seems like a dup. of [this stackexchange question](http://unix.stackexchange.com/questions/14684/removing-control-chars-including-console-codes-colours-from-script-output). I like best the last one. – rodrigo Feb 02 '15 at 00:26
  • Joe, I just added an answer with a `sed` command that should remove most of those sequences. – John1024 Feb 02 '15 at 00:26
  • @John1024: I'm afraid that `less -r/-R` will do no better than a plain `cat` for this problem. – rodrigo Feb 02 '15 at 00:27
  • @rodrigo Did you give it a try? It works fine for me at displaying the typescript log in full color. – John1024 Feb 02 '15 at 00:29
  • @John1024: Yes, so does `cat`. But IIUC OP's problem is how to convert a file with control codes into a file without control codes, without manually copying/pasting the terminal output. `less` doesn't do that. – rodrigo Feb 02 '15 at 00:33
  • @rodrigo Yes, of course, you are right. That is why I gave the OP info on both a `less` command (for display purposes) and a `sed` command (for file conversion). – John1024 Feb 02 '15 at 00:41

2 Answers

6

To display a file that contains ANSI sequences,

less -r typescript

Or,

less -R typescript

To remove ANSI and backspace sequences from a file, creating a clean newfile, try:

sed -r ':again; s/[^\x08]\x08\x1b\[K//; t again; s/\x1b_[^\x1b]*\x1b[\]//g; s/\x1B\[[^m]*m//g' typescript >newfile

How it works

  • -r

    This turns on extended regular expressions. (On BSD systems, -r should be replaced with -E. Modern versions of GNU sed will accept either -r or -E.)

  • :again; s/[^\x08]\x08\x1b\[K//; t again

    This removes the backspace sequences: each match is one character followed by a backspace and the erase-to-end-of-line code (Esc [ K). They are removed one at a time in a loop.

  • s/\x1b_[^\x1b]*\x1b[\]//g

    As an xterm extension (see documentation), Esc _ something Esc \ will do nothing. This command removes these sequences.

  • s/\x1B\[[^m]*m//g

    This removes the remaining ANSI sequences which set colors, etc.

This covers all the control sequences that I normally run into. There are a wide variety of extended control sequences and, if your output has some that I haven't seen, the code may need to be extended.
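As a quick check of the command above, the following sketch runs it on a small hand-made sample. It assumes GNU sed (for `-r` and the `\xHH` escapes); the sample bytes and the temporary file are made up for illustration:

```shell
# Hypothetical sample: the corrected "echo test" line plus one colored word.
sample=$(mktemp)
printf 'ech test\b\033[K\b\033[K\b\033[K\b\033[K\b\033[Ko test\r\n\033[1;31mred\033[0m\n' > "$sample"

# Run the sed command from above, then pipe through cat -v so that any
# control characters still present are shown in caret notation.
clean=$(sed -r ':again; s/[^\x08]\x08\x1b\[K//; t again; s/\x1b_[^\x1b]*\x1b[\]//g; s/\x1B\[[^m]*m//g' "$sample" | cat -v)
printf '%s\n' "$clean"
```

With this sample the first line comes out as `echo test^M`: the backspace sequences are gone, but the trailing carriage return is left in place.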

POSIX or BSD sed

On a BSD or POSIX system, individual commands have to be chained together with -e options instead of semicolons. Thus, try:

sed -e ':again' -e 's/[^\x08]\x08\x1b\[K//' -e 't again' -e 's/\x1b_[^\x1b]*\x1b[\]//g' -e 's/\x1B\[[^m]*m//g' typescript >newfile
John1024
3

The suggested answer using "sed -r" relies on GNU sed, so it is not really portable. The same functionality is possible with POSIX sed, but it has to be written differently: POSIX does not provide for passing a whole script in a command option as that answer does, so the (POSIX) way to implement a loop is to put the script in a separate file and pass it to sed with the "-f" option. Likewise, the hexadecimal constants are not portable. With these changes, a functionally equivalent script can be used on the BSDs and other Unix systems.

The suggested answer also does not cover some of the uses of carriage returns which are fairly common (for instance in yum output), nor does it filter out "most" ANSI sequences (since it focuses on the SGR "m" final character). Finally, it refers to

escape _ text _

as an xterm extension. But no such extension is provided by xterm, because the two characters "escape" and "_" begin an Application Program Command sequence (and xterm implements none).

The resulting sed-script looks like this ("^[" is the escape character):

s/^[[[][<=>?]\{0,1\}[;0-9]*[@-~]//g
s/^[[]][^^[]*^G//g
s/^[[]][^^[]*^[\\//g
:loop
s/[^^H]^H\(.\)/\1/g
t loop
s/^M^M*$//g
s/^.*^M//g
s/^[[^[]//g

A more complete script, named "script2log" can be found here. There are, however, things (such as CSI K) which are not suited to a sed script.
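Since the script above needs literal control characters, one way to build and try it is sketched below. This is an illustration under assumptions, not the author's `script2log`: the file names and sample data are made up, `printf` generates the control characters, the last rule is read as "escape followed by any character other than `[`", and `LC_ALL=C` is set so that `.` and the `[@-~]` range work byte-wise.

```shell
# Literal control characters, generated portably with printf.
ESC=$(printf '\033'); BEL=$(printf '\007'); BS=$(printf '\b'); CR=$(printf '\r')

work=$(mktemp -d)   # hypothetical scratch directory

# Unquoted here-document: ${...} expands, \\ becomes \, \$ becomes $;
# other backslashes (\{ \( \1) are passed through unchanged.
cat > "$work/strip.sed" <<EOF
s/${ESC}[[][<=>?]\{0,1\}[;0-9]*[@-~]//g
s/${ESC}[]][^${ESC}]*${BEL}//g
s/${ESC}[]][^${ESC}]*${ESC}\\\\//g
:loop
s/[^${BS}]${BS}\(.\)/\1/g
t loop
s/${CR}${CR}*\$//g
s/^.*${CR}//g
s/${ESC}[^[]//g
EOF

# Made-up sample: a corrected command line, a window title plus a colored
# word, and a progress line redrawn with carriage returns.
{
  printf 'ech test\b\033[K\b\033[K\b\033[K\b\033[K\b\033[Ko test\r\n'
  printf '\033]0;title\007\033[1;31mred\033[0m\n'
  printf '10%%\r50%%\r100%%\n'
} > "$work/typescript"

clean=$(LC_ALL=C sed -f "$work/strip.sed" "$work/typescript")
printf '%s\n' "$clean"
```

With this sample the three lines reduce to `echo test`, `red`, and `100%`. The erase sequence is simply dropped here, which happens to give the right text only because nothing follows the cursor at the point where each `CSI K` occurs.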

Thomas Dickey