34

I'm tailing logs and they output \n instead of newlines.

I thought I'd pipe the tail to awk and do a simple replace; however, I cannot seem to escape the newline in the regex. Here I'm demonstrating my problem with cat instead of tail:

test.txt:

John\nDoe
Sara\nConnor

cat test.txt | awk -F'\\n' '{ print $1 "\n" $2 }'

Desired output:

John
Doe
Sara
Connor

Actual output:

John\nDoe

Sara\nConnor

So it looks like \\n does not match the \n between the first and last names in test.txt but instead the newline at the end of each line.

It looks like \\n is not the right way of escaping in the terminal, right? This way of escaping works fine in e.g. Sublime Text:

[Screenshot: regex working in ST3]

Benjamin W.
Cotten

7 Answers

49

How about this?

$ cat file
John\nDoe
Sara\nConnor

$ awk '{gsub(/\\n/,"\n")}1' file
John
Doe
Sara
Connor
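
For the tailing use case from the question, the same `gsub` call can be wrapped in a small shell function and placed at the end of a pipeline. This is only a sketch; the name `unescape_newlines` is made up for illustration.

```shell
# Hypothetical helper: turn every literal backslash-n into a real newline.
# Reads the named files, or standard input when no file is given.
unescape_newlines() {
    awk '{ gsub(/\\n/, "\n"); print }' "$@"
}

# A line may contain several escaped newlines; gsub replaces all of them.
printf '%s\n' 'John\nDoe\nJr' | unescape_newlines
```

Inside the regex literal, `\\` matches one literal backslash, so `/\\n/` matches the two characters `\` and `n` rather than a newline.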
Avinash Raj
  • Why are you using `awk`? `sed` seems more appropriate for *editing* files. – hek2mgl Jul 04 '14 at 12:41
  • The OP tagged the question with `awk`. – Avinash Raj Jul 04 '14 at 12:42
  • You could replace `"\n"` with `RS` and get `awk '{gsub(/\\n/,RS)}1'` – Jotne Jul 04 '14 at 12:42
  • @AvinashRaj OK, that is a valid reason. :) Although in most situations you can use either `sed` or `awk`, I think `sed` is meant for such *editing* tasks and `awk` more for analytical tasks. Also, you should never forget `grep` (not relevant to this question). – hek2mgl Jul 04 '14 at 12:49
  • Maybe I should not have tagged this `awk`, but I did, and this answer works. Still, `sed` seems better suited to this task? The problem is that `sed` behaves oddly on Mac OS X; see the answers by @DarkDust and @Ed Morton. – Cotten Jul 04 '14 at 13:28
  • @Cotten I'm not compelling you. Go for any other solution that seems best to you. – Avinash Raj Jul 04 '14 at 13:30
14

Using GNU's sed, the solution is pretty simple as @hek2mgl already answered (and that IMHO is the way it should work everywhere, but unfortunately doesn't).

But it's a bit tricky when doing it on Mac OS X and other *BSD UNIXes.

The best way looks like this:

sed 's/\\n/\'$'\n''/g' <<< 'ABC\n123'

Then of course there's still AWK, @AvinashRaj has the correct answer if you'd like to use that.

DarkDust
  • This means that BSD `sed` isn't POSIX compatible. – hek2mgl Jul 04 '14 at 13:18
  • Err, no. It's the other way around: GNU's `sed` is extending POSIX `sed`; that's why it needs `--posix` in the first place. – DarkDust Jul 04 '14 at 13:20
  • Sure, but the code works with GNU `sed` and `--posix`. – hek2mgl Jul 04 '14 at 13:21
  • I think your syntax is a little off - it should be `sed 's/\\n/\'$'\n''/g'` to insert a literal newline (generated by `$'\n'`) in the script before sed processes it. I'm not sure what the shell is making of the standalone `$` between the 2 halves of your sed script (`'s/\\n/\'` and `'\n/g'`). – Ed Morton Jul 04 '14 at 13:43
  • So, let's actually look at the [POSIX standard](http://pubs.opengroup.org/onlinepubs/009695399/utilities/sed.html): the problem is that the standard does not specify whether the second part of `s` (the "replacement") shall interpret `\n` or not. Since it's not a BRE and the "\" has special meaning here, I'd say it shouldn't. The BSD sed's [POSIX notes](https://github.com/freebsd/freebsd/blob/master/usr.bin/sed/POSIX) state that historic versions didn't and discarded the "\" (see point 16). So both are POSIX compatible, since the standard doesn't specify the behavior. – DarkDust Jul 04 '14 at 13:45
  • @EdMorton: You're probably right, but interestingly both versions work (the one I've cited and your version). – DarkDust Jul 04 '14 at 13:48
  • Yeah, what's happening is that the shell is evaluating `$'\n/g'` before sed tries to execute the script, and that expands to a literal newline followed by `/g`, so it "works" by coincidence. It would not work with different characters after the `/` that the shell would expand - `/g` just happens to be harmless. – Ed Morton Jul 04 '14 at 13:51
  • +1, but note that the use of an [ANSI C-quoted string](http://www.gnu.org/software/bash/manual/bash.html#ANSI_002dC-Quoting), `$'\n'`, makes the solution shell-dependent; fortunately, though, popular shells (`bash`, `zsh`, `ksh`) DO support that, but POSIX-features-only ones such as `dash` do not. – mklement0 Jul 04 '14 at 19:38
9

This will work with any sed on any system as it is THE portable way to use newlines in sed:

$ sed 's/\\n/\
/' file
John
Doe
Sara
Connor
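
The command above replaces only the first `\n` on each line, which is enough for the sample file. For log lines that may hold several escaped newlines, a sketch with the `g` flag added could look like this (the function name `sed_unescape_all` is made up for illustration):

```shell
# Hypothetical helper: same backslash-newline replacement, with the g flag
# so that every \n on a line is replaced, not just the first.
# The line after the backslash must start in column 0; any indentation
# there would become part of the replacement text.
sed_unescape_all() {
    sed 's/\\n/\
/g' "$@"
}

printf '%s\n' 'a\nb\nc' | sed_unescape_all
```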

If it is possible for your input to contain a line like foo\\nbar and the \\ is intended to be an escaped backslash then you cannot use a simple substitution approach like you've asked for.
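
If escaped backslashes do need to be honored, one option is to scan each line from left to right and decode one escape at a time. The following is a minimal sketch, assuming that `\\` and `\n` are the only escapes and that every other backslash sequence should be left untouched; `unescape_strict` is a made-up name.

```shell
# Hypothetical helper: decode \\ and \n only, scanning left to right.
unescape_strict() {
    awk '{
        out = ""
        rest = $0
        # Consume the line up to each backslash and decode what follows it.
        while ((i = index(rest, "\\")) > 0) {
            c = substr(rest, i + 1, 1)
            if (c == "n")       repl = "\n"     # \n -> newline
            else if (c == "\\") repl = "\\"     # \\ -> one backslash
            else                repl = "\\" c   # anything else: unchanged
            out = out substr(rest, 1, i - 1) repl
            rest = substr(rest, i + 2)
        }
        print out rest
    }' "$@"
}

# foo\\nbar keeps a literal \n; John\nDoe is split into two lines.
printf '%s\n' 'foo\\nbar' 'John\nDoe' | unescape_strict
```

Because each backslash is consumed together with the character after it, the `n` in `foo\\nbar` is never mistaken for the start of an escape.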

Benjamin W.
Ed Morton
  • Yes, you are right. Could you explain why `GNU sed --posix` behaves differently than BSD sed (which also claims to be POSIX compatible)? Is the problem caused by `GNU sed --posix` not working properly or by BSD sed not being POSIX compatible? – hek2mgl Jul 04 '14 at 13:25
  • Sorry, no, I'd need to read the POSIX spec for sed to figure that one out and life's too short... I will say, though, that BSD awk is broken in some ways (e.g. parsing of unparenthesized ternary expressions in print statements) so maybe their sed is too? – Ed Morton Jul 04 '14 at 13:28
  • I would need to as well; I thought maybe you knew it already... :) Thanks. +1 for the solution – hek2mgl Jul 04 '14 at 13:29
  • @hek2mgl, as per POSIX, _The meaning of a `<backslash>` immediately followed by any character other than `'&'`, `<backslash>`, a digit, or the delimiter character used for this command, is unspecified._, so the GNU sed behaviour is conformant, as would be a sed that outputs `\n` or one that reboots your computer. A _script_ that uses `\n` there would be non-conformant (for the very reason that the behaviour is unspecified). – Stephane Chazelas Nov 22 '15 at 12:52
9

Why use either awk or sed for this? Use perl!

perl -pe 's/\\n/\n/g' file

By using perl you avoid having to think about POSIX compliance; it will typically perform better and behave consistently across (most) platforms.

user239558
7

I have struggled with this problem before, but I discovered that the cleanest way is to use the builtin printf:

printf "$(cat file.txt)" | less

Here is a real-world example dealing with an AWS IAM JSON policy embedded in the output. The file file.txt contains:

{
  "registryId": "111122223333",
  "repositoryName": "awesome-repo",
  "policyText": "{\n  \"Version\" : \"2008-10-17\",\n  \"Statement\" : [ {\n    \"Sid\" : \"AllowPushPull\",\n    \"Effect\" : \"Allow\",\n    \"Principal\" : {\n      \"AWS\" : [ \"arn:aws:iam::444455556666:root\", \"arn:aws:iam::444455556666:user/johndoe\" ]\n    },\n    \"Action\" : [ \"ecr:BatchCheckLayerAvailability\", \"ecr:BatchGetImage\", \"ecr:CompleteLayerUpload\", \"ecr:DescribeImages\", \"ecr:DescribeRepositories\", \"ecr:GetDownloadUrlForLayer\", \"ecr:InitiateLayerUpload\", \"ecr:PutImage\", \"ecr:UploadLayerPart\" ]\n  } ]\n}"
}

After applying the above (without the less) you get:

{
  "registryId": "111122223333",
  "repositoryName": "awesome-repo",
  "policyText": "{
  "Version" : "2008-10-17",
  "Statement" : [ {
    "Sid" : "AllowPushPull",
    "Effect" : "Allow",
    "Principal" : {
      "AWS" : [ "arn:aws:iam::444455556666:root", "arn:aws:iam::444455556666:user/johndoe" ]
    },
    "Action" : [ "ecr:BatchCheckLayerAvailability", "ecr:BatchGetImage", "ecr:CompleteLayerUpload", "ecr:DescribeImages", "ecr:DescribeRepositories", "ecr:GetDownloadUrlForLayer", "ecr:InitiateLayerUpload", "ecr:PutImage", "ecr:UploadLayerPart" ]
  } ]
}"
}

Note that the value for "policyText" is itself a string containing JSON.
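
One caveat with passing file contents as the format string: any `%` in the file is treated as a conversion specifier. A hedged alternative is to keep the format fixed and let `%b` expand the backslash escapes in the argument instead. The sketch below assumes a POSIX `printf`; `expand_escapes` is a made-up name, and how `%b` treats sequences other than the standard ones (for example `\"`) can differ between shells.

```shell
# Hypothetical helper: expand backslash escapes without using the data
# as a printf format string.
expand_escapes() {
    # $(cat ...) drops trailing newlines; the format adds one back.
    printf '%b\n' "$(cat "$@")"
}

# The % sign is printed as-is because it is an argument, not the format.
printf '%s\n' '100% done\nnext line' | expand_escapes
```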

AZAhmed
5

I would use sed:

sed 's/\\n/\n/g' file
hek2mgl
  • That does not work for me: `sed 's/\\n/\n/g' test.txt` output: line1: `JohnnDoe` line2: `SaranConnor` – Cotten Jul 04 '14 at 12:44
  • It should. Try: `sed 's/\\n/\n/g' <<< 'ABC\n123'` – hek2mgl Jul 04 '14 at 12:46
  • `sed 's/\\n/\n/g' <<< 'ABC\n123'` gives me `ABCn123`. Is this a platform issue? I'm on OS X using zsh – Cotten Jul 04 '14 at 12:50
  • `sed --posix 's/\\n/\n/g' <<< 'ABC\n123'` works. Sorry, use Linux. I'm not OS X support; I'm tired of supporting this and will never get why a Mac should be used for hacking. (It's not against you personally.) – hek2mgl Jul 04 '14 at 12:55
  • @hek2mgl: Then you obviously hate the *BSDs as well; Mac's UNIX tools derive from FreeBSD. Pointless and uninformed bashing. – DarkDust Jul 04 '14 at 13:01
  • `sed --posix 's/\\n/\n/g' <<< 'ABC\n123'` gives `sed: illegal option --` ....... (fearing another bashing) :) – Cotten Jul 04 '14 at 13:02
  • @Cotten No bashing :) The `--posix` option is specific to GNU `sed`, to ensure sed will work in POSIX mode. It (might|will) not be available in other versions of `sed`. If your version of `sed` does not understand this option, but claims to be POSIX compatible, then just omit it. – hek2mgl Jul 04 '14 at 13:05
  • BSD's `sed` doesn't have a `--posix`; it is POSIX-conformant already. GNU's `sed` has a lot of (useful) extensions to POSIX that `--posix` is supposed to disable. I found an [answer with a solution that works on all systems](http://stackoverflow.com/a/19883696/400056) (tested on Mac and Linux). – DarkDust Jul 04 '14 at 13:18
  • OK, thanks guys. However, even when omitting `--posix`, `sed 's/\\n/\n/g' <<< 'ABC\n123'` gives me `ABCn123`. I have an `-E` flag saying: `-E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.` However, `sed -E 's/\\n/\n/g' <<< 'ABC\n123'` still gives me back `ABCn123` – Cotten Jul 04 '14 at 13:23
  • For a (hopefully) comprehensive discussion of the differences between FreeBSD and GNU `sed` (including POSIX compliance), see http://stackoverflow.com/a/24276470/45375 – mklement0 Jul 04 '14 at 20:34
  • @DarkDust, `--posix` doesn't remove GNU extensions, it makes GNU `sed` (more) POSIX-conformant. POSIX neither mandates nor forbids expanding `\n` to a newline character there; it leaves the behaviour unspecified, which is why you should not use `\n` there. POSIX clearly specifies that backslash followed by a newline should be used to insert a newline. And GNU sed does it, with or without `--posix`. – Stephane Chazelas Nov 22 '15 at 12:55
1

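In addition to the accepted answer: the OP asked about tail, and on some Unix variants, e.g. Ubuntu (where awk is typically mawk), you need to add -W interactive to awk. This mawk option makes writes to stdout unbuffered and reads from stdin line-buffered, so each line is processed as soon as tail emits it: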

tail -f error.log | awk -W interactive '{gsub(/\\n/,"\n")}1'
Mal