Shell script for reading a file and replacing newlines with the | symbol

Question

I have a text file like the one below.

abc
def
ghi

123
456
789

The expected output is:

abc|def|ghi
123|456|789

I want to replace each newline with a pipe symbol (|) so that I can use the result in egrep. After an empty line, the output should start a new line.

What platform and version of awk? Solaris? Which awk flavor? – Jose Ricardo Bustos M. Feb 24 '17 at 22:48
Can you think of a fix after `tr '\n' '|' < file.txt` ? – Walter A Feb 24 '17 at 23:06

score 6 · Answer 1 · edited Jun 20 '20 at 09:12

You can try awk:

awk -v RS= -v OFS="|" '{$1=$1}1' file

You get:

abc|def|ghi
123|456|789

Explanation

Set RS to a null (empty) value to put awk in paragraph mode, where each block of lines separated by one or more blank lines is read as a single record.

From the POSIX specification for awk:

RS

The first character of the string value of RS shall be the input record separator; a <newline> by default. If RS contains more than one character, the results are unspecified. If RS is null, then records are separated by sequences consisting of a <newline> plus one or more blank lines, leading or trailing blank lines shall not result in empty records at the beginning or end of the input, and a <newline> shall always be a field separator, no matter what the value of FS is.

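$1=$1 assigns the first field to itself, which forces awk to rebuild the record with OFS as the field separator. The trailing 1 is a pattern that is always true, so every record is printed.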
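Since the question mentions egrep, here is a minimal sketch of how the joined lines might be used as patterns. It assumes a POSIX awk and grep; the names join_blocks and sample, and the two lines being searched, are hypothetical and only there for illustration.

```shell
#!/bin/sh
# join_blocks is a hypothetical wrapper around the awk command above.
# RS= selects paragraph mode; $1=$1 rebuilds each record with OFS.
join_blocks() {
  awk -v RS= -v OFS='|' '{ $1 = $1 } 1' "$@"
}

# Sample data from the question, supplied on stdin instead of a file.
sample() {
  printf '%s\n' abc def ghi '' 123 456 789
}

# Print the joined blocks, one per line.
sample | join_blocks

# Use the first joined line as an extended regular expression.
# The two input lines below are made up for illustration.
pattern=$(sample | join_blocks | head -n 1)
printf '%s\n' 'xx def yy' 'no match here' | grep -E "$pattern"
```

One thing to keep in mind: with the default FS, whitespace inside a line also separates fields, so a line such as "foo bar" would come out as "foo|bar". Adding -F '\n' to the awk call should restrict field splitting to newlines.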

answered Feb 24 '17 at 23:16

Jose Ricardo Bustos M.

@EdMorton You're right, I thought the assignment always returned `true` ( my mistake) ..... thanks a lot – Jose Ricardo Bustos M. Feb 24 '17 at 23:49
@FloHe I add explanation in post – Jose Ricardo Bustos M. Feb 26 '17 at 01:27
@jose Ricardo I am getting the error below: `bash-3.00# awk -v RS= -v OFS="|" '{$1=$1}1' mm` fails with `awk: syntax error near line 1` and `awk: bailing out near line 1` – naveen Feb 26 '17 at 06:54
@naveen What platform and version of awk? – Jose Ricardo Bustos M. Feb 26 '17 at 11:50

score 2 · Answer 2 · edited May 23 '17 at 12:17

Here's one using GNU sed:

cat file | sed ':a; N; $!ba; s/\n/|/g; s/||/\n/g'

If you're using BSD sed (the flavor packaged with Mac OS X), you will need to pass in each expression separately, and use a literal newline instead of \n (more info):

cat file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/|/g' -e 's/||/\
/g'

If file is:

abc
def
ghi

123
456
789

You get:

abc|def|ghi
123|456|789

This replaces each newline with a | (credit to this answer), and then || (i.e. what was a pair of newlines in the original input) with a newline.

The caveat here is that | can't appear at the beginning or end of a line in your input; otherwise, the second substitution (s/||/\n/g) will add newlines in the wrong places. To work around that, you can use another character that won't be in your input as an intermediate value, and then replace singletons of that character with | and pairs with \n.

EDIT

Here's an example that implements the workaround above, using the NUL character \x00 (which should be highly unlikely to appear in your input) as the intermediate character:

cat file | sed ':a;N;$!ba; s/\n/\x00/g; s/\x00\x00/\n/g; s/\x00/|/g'

Explanation:

:a;N;$!ba; puts the entire file in the pattern space, including newlines
s/\n/\x00/g; replaces all newlines with the NUL character
s/\x00\x00/\n/g; replaces all pairs of NULs with a newline
s/\x00/|/g replaces the remaining singletons of NULs with a |
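The caveat and the workaround can be compared side by side. This is a small sketch that assumes GNU sed (the \n and \xHH escapes in s commands are GNU extensions); the function names join_direct and join_via_nul and the two-line input are made up for illustration.

```shell
#!/bin/sh
# Assumes GNU sed: \n and \xHH in s commands are GNU extensions.

# join_direct: the first command from this answer (| then || -> newline).
join_direct() {
  sed ':a; N; $!ba; s/\n/|/g; s/||/\n/g'
}

# join_via_nul: the variant that uses NUL as the intermediate character.
join_via_nul() {
  sed ':a; N; $!ba; s/\n/\x00/g; s/\x00\x00/\n/g; s/\x00/|/g'
}

# Made-up input: a single block whose first line ends in a pipe.
# join_direct turns "a|" + newline + "b" into "a||b" and then splits
# it at the "||", so the block is broken into two lines.
printf '%s\n' 'a|' 'b' | join_direct

# join_via_nul never confuses the literal pipe with a separator,
# so the block stays on one line.
printf '%s\n' 'a|' 'b' | join_via_nul
```

With GNU sed, the first call should print a and b on separate lines, while the second should print a||b.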

BSD version:

sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\x00/g' -e 's/\x00\x00/\
/g' -e 's/\x00/|/g'

EDIT 2

For a more direct approach (GNU sed only), provided by @ClaudiuGeorgiu:

sed -z 's/\([^\n]\)\n\([^\n]\)/\1|\2/g; s/\n\n/\n/g'

Explanation:

-z uses NUL characters as line-endings (so newlines are not given special treatment and can be matched in the regular expression)
s/\([^\n]\)\n\([^\n]\)/\1|\2/g; replaces every 3-character sequence of <non-newline><newline><non-newline> with <non-newline>|<non-newline>
s/\n\n/\n/g replaces all pairs of newlines with a single newline
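One case worth checking with this approach is lines that are a single character long. Each match consumes the character on both sides of the newline, and matches found by the g flag cannot overlap, so back-to-back one-character lines are only partly joined. A sketch, assuming a GNU sed recent enough to support -z; the name join_z and the second input are hypothetical.

```shell
#!/bin/sh
# Assumes GNU sed with -z support.
# join_z is a hypothetical wrapper around the command above.
join_z() {
  sed -z 's/\([^\n]\)\n\([^\n]\)/\1|\2/g; s/\n\n/\n/g'
}

# Sample from the question: every line is three characters long,
# so no two matches need to share a character.
printf '%s\n' abc def ghi '' 123 456 789 | join_z

# Made-up input with single-character lines: the match covering
# a, newline, b consumes "b", so the newline between "b" and "c"
# has no unconsumed character before it and is left alone.
printf '%s\n' a b c | join_z
```

For the question's sample this gives the expected two lines. For the second input it should print a|b followed by c on its own line, rather than a|b|c.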

@ClaudiuGeorgiu This is good but adds an extra `|` at the end of each line – tavnab Feb 24 '17 at 23:40
You're right. Here's the updated version without trailing `|`s: `sed -z "s/\(\w\)\n\(\w\)/\1|\2/g; s/\n\n/\n/g"`. If necessary, empty lines can be kept by removing the last part of the command. – Gabriel Claudiu Georgiu Feb 24 '17 at 23:55
@ClaudiuGeorgiu this works so long as the input has no non-word characters (except for the newlines). If I added a space at the end of a line, it would break. The OP didn't mention anything other than their input, so I think your suggestion works for their case. – tavnab Feb 25 '17 at 00:09
This will work for any non-word character (including spaces): `sed -z "s/\([^\n]\)\n\([^\n]\)/\1|\2/g; s/\n\n/\n/g"`. – Gabriel Claudiu Georgiu Feb 25 '17 at 00:21
@ClaudiuGeorgiu It is not working in the bash shell: `bash-3.00# echo $0` prints `bash`, and `bash-3.00# sed -z "s/\(\w\)\n/\1|/g" mm` fails with `sed: illegal option -- z` – naveen Feb 26 '17 at 06:48
@naveen this is probably due to your version of `sed`. BSD (and OS X) sed and GNU sed differ in what they support (http://unix.stackexchange.com/q/13711/207645). I'll update my answer for BSD sed. I'm not sure if @ClaudiuGeorgiu's answer has a BSD equivalent since there is no -z equivalent for BSD sed. – tavnab Feb 27 '17 at 14:57

score 1 · Answer 3 · answered Feb 24 '17 at 23:24

In native bash:

#!/usr/bin/env bash
curr=
while IFS= read -r line; do
  if [[ $line ]]; then
    curr+="|$line"
  else
    printf '%s\n' "${curr#|}"
    curr=
  fi
done
[[ $curr ]] && printf '%s\n' "${curr#|}"

Tested:

$ f() { local curr= line; while IFS= read -r line; do if [[ $line ]]; then curr+="|$line"; else printf '%s\n' "${curr#|}"; curr=; fi; done; [[ $curr ]] && printf '%s\n' "${curr#|}"; }
$ f < <(printf '%s\n' 'abc' 'def' 'ghi' '' 123 456 789)
abc|def|ghi
123|456|789

score 1 · Answer 4 · edited Feb 25 '17 at 00:49

Use rs. For example:

rs -C'|' 2 3 < file

rs reshapes a data array. Here I'm specifying 2 rows, 3 columns, and a pipe as the output column separator.

answered Feb 25 '17 at 00:41

gregory
