Remove newlines within each blank-line-separated block of a file

Question

Below is part of a file that lists Zimbra accounts (500+), with the accounts separated by an empty line:

cn: Jack 
displayName: Jack Johnson
givenName: Jack
sn: johnson
zimbraMailDeliveryAddress: Jack@example.com

cn: james ryan
displayName: James Ryan
givenName: James
sn: Ryan
zimbraMailDeliveryAddress: James@example.com

....

I want the file to look like the following, so that I can import the accounts into a new server using zmprov:

cn: Jack displayName: Jack Johnson givenName: Jack sn: johnson zimbraMailDeliveryAddress: Jack@example.com
cn: james ryan displayName: James Ryan givenName: James sn: Ryan zimbraMailDeliveryAddress: James@example.com

I tried writing a script that extracts the values without removing the newlines, but I couldn't get it to work so far:

for line in `cat /tmp/account3.txt`;
do
    echo $line | grep "zimbraMailDeliveryAddress:" > /dev/null
    RESULT=$?

        if [ $RESULT -eq 0 ];  then 
    email=`echo $line | cut -d' ' -f2`  > /dev/null
    continue

    elif   echo $line | grep "sn:"   > /dev/null
    RESULT=$?
    if [ $RESULT -eq 0 ];  then
    sn=`echo $line | awk '{ print $2; }'`  > /dev/null
    continue

        elif  echo $line | grep "givenName:"  > /dev/null
    RESULT=$?
    if [ $RESULT -eq 0 ];  then 
    givenName=`echo $line | awk '{ print $2; }'`  > /dev/null
        continue

    elif  echo $line | grep "displayName:"  > /dev/null
    RESULT=$?
    if [ $RESULT -eq 0 ];  then  
    displayName=`echo $line | awk '{ print $2; }'`  > /dev/null
        continue

        elif echo $line | grep "cn:" > /dev/null
    RESULT=$?
    if [ $RESULT -eq 0 ];  then 
    cn=`echo $line | cut -d' ' -f2`  > /dev/null
        continue
    fi
        else
          :
    fi
        echo $email $sn $cn $displayName $givenName
done
# awk '/cn:|displayName:|givenName:|sn:|zimbraMailDeliveryAddress:/{printf "%s ", $0; next} 1' /tmp/account2.txt

Instead of forking `grep` so many times, have a look at this answer: [How to test presence of substring in string](http://stackoverflow.com/a/20460402/1765658) — F. Hauri - Give Up GitHub, Jul 10 '16 at 07:36
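
A minimal sketch of what that comment suggests, assuming a POSIX-style shell: read the input line by line and match each prefix with `case` instead of piping every line through `grep`. The function name `print_accounts` and the flush-on-empty-line logic are illustrative assumptions, not part of the question; the output order mirrors the `echo` in the original loop.

```shell
# Hypothetical helper: prints "email sn cn displayName givenName" once per
# blank-line-separated record read from stdin.
print_accounts() {
    email='' sn='' cn='' displayName='' givenName=''
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip trailing blanks (the sample has "cn: Jack " with a space).
        line=${line%"${line##*[![:blank:]]}"}
        case $line in
            zimbraMailDeliveryAddress:*) email=${line#*: } ;;
            sn:*)          sn=${line#*: } ;;
            givenName:*)   givenName=${line#*: } ;;
            displayName:*) displayName=${line#*: } ;;
            cn:*)          cn=${line#*: } ;;
            '')  # an empty line ends the current record
                [ -n "$email" ] && printf '%s\n' "$email $sn $cn $displayName $givenName"
                email='' sn='' cn='' displayName='' givenName=''
                ;;
        esac
    done
    # Flush the last record if the input did not end with an empty line.
    [ -n "$email" ] && printf '%s\n' "$email $sn $cn $displayName $givenName"
    return 0
}
```

It could be called as `print_accounts < /tmp/account3.txt`. No external process is started per line, which matters with 500+ accounts.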

score 6 · Accepted Answer · answered Jul 10 '16 at 08:20


$ awk -v RS= '{$1=$1}1' file
cn: Jack displayName: Jack Johnson givenName: Jack sn: johnson zimbraMailDeliveryAddress: Jack@example.com
cn: james ryan displayName: James Ryan givenName: James sn: Ryan zimbraMailDeliveryAddress: James@example.com

Ed Morton

1

Can you shed some more light on your answer? It's short and simple too. – sherpaurgen Jul 10 '16 at 09:02
3

Sure, `RS=` is telling awk to read each blank-line-separated block of text as a record, then recompiling the record by replacing the default input Field Separator `FS` (any chain of white space, including newlines) with the default Output Field Separator `OFS` (a single blank char) then printing the resulting record courtesy of a true condition `1` causing that default action to occur. I recommend you read the book Effective Awk Programming, 4th Edition, by Arnold Robbins if you'll be doing **any** text manipulation for the rest of your life :-). – Ed Morton Jul 10 '16 at 09:07
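
The two steps described in that comment can be seen separately in a small sketch. The two-field sample data is a shortened stand-in for the question's file, and the variable names `sample` and `joined` are only for illustration.

```shell
# Sample input: two records separated by an empty line; the first line
# carries a trailing blank, as in the question.
sample=$(printf 'cn: Jack \nsn: johnson\n\ncn: james ryan\nsn: Ryan\n')

# Step 1, RS= alone: each block is one record, and with the default FS the
# newlines inside the block also separate fields, so NF counts the
# whitespace-separated words of the whole block.
printf '%s\n' "$sample" | awk -v RS= '{ print NR ": " NF " fields" }'

# Step 2, assigning a field (even to itself) makes awk rebuild $0 from the
# fields joined by OFS; the trailing 1 is a true condition that prints it.
joined=$(printf '%s\n' "$sample" | awk -v RS= '{ $1 = $1 } 1')
printf '%s\n' "$joined"
```

Because the record is rebuilt from its fields, runs of blanks (including the trailing one after `Jack`) collapse to a single space.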

anubhava · Answer 2 · 2016-07-10T08:38:17.057

1

awk can handle this easily with empty RS:

awk -v RS= '{gsub(/\n/, " ")} 1' file

cn: Jack displayName: Jack Johnson givenName: Jack sn: johnson zimbraMailDeliveryAddress: Jack@example.com
cn: james ryan displayName: James Ryan givenName: James sn: Ryan zimbraMailDeliveryAddress: James@example.com

Setting `RS=` (an empty record separator) makes awk split the input into records at empty lines, so each account block, ending after its zimbraMailDeliveryAddress line, is read as one record.

edited Jul 10 '16 at 08:38

answered Jul 10 '16 at 08:18


1

Note that's concatenating `James` and `sn:` to `Jamessn:`, etc. ITYM `gsub(/[[:blank:]]*\n/," ")`. – Ed Morton Jul 10 '16 at 08:30
1

Ah that's true, replacement should be by space. Thanks. – anubhava Jul 10 '16 at 08:38
1

Yeah, and the OP has blank chars at the end of some lines which is why it looked like your script was working for parts of the input. – Ed Morton Jul 10 '16 at 08:40
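
A small sketch of the difference discussed in these comments, using a two-line sample whose first line ends in a blank. The bracket expression is written as `[ \t]` rather than `[[:blank:]]` only on the assumption that some awk versions lack POSIX character classes; the variable names are illustrative.

```shell
sample=$(printf 'cn: Jack \ndisplayName: Jack Johnson\n')

# Replacing only the newline keeps the trailing blank, so two spaces
# end up between "Jack" and "displayName:".
plain=$(printf '%s\n' "$sample" | awk -v RS= '{ gsub(/\n/, " ") } 1')

# Consuming any blanks in front of the newline leaves exactly one space.
trimmed=$(printf '%s\n' "$sample" | awk -v RS= '{ gsub(/[ \t]*\n/, " ") } 1')

# Brackets make the spacing visible.
printf '[%s]\n[%s]\n' "$plain" "$trimmed"
```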

score 1 · Answer 3 · answered Jul 10 '16 at 18:28

This might work for you (GNU sed):

sed ':a;N;/\n\s*$/!s/\s*\n/ /;ta;s/\n//p;d' file

Read two or more lines into the pattern space (PS), replacing zero or more whitespace characters followed by a newline with a single space as long as the last line read is not an empty line. If the last line read is empty, remove it, print the lines in the PS, and then delete the PS.

N.B. This also caters for the last empty line not being present.
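
The same script, spread over several lines with a comment per command, may be easier to follow. This is only a sketch that assumes GNU sed (for `\s` and for printing the pattern space when `N` runs out of input); the wrapper name `join_blocks` is hypothetical.

```shell
join_blocks() {
    sed '
        :a
        # Append the next input line to the pattern space.
        N
        # Unless the line just read is empty (or only whitespace), replace
        # the newline, plus any whitespace before it, with one space.
        /\n\s*$/!s/\s*\n/ /
        # If that substitution happened, loop back and read another line.
        ta
        # Otherwise the record is complete: drop the newline, print the
        # record, and clear the pattern space before the next cycle.
        s/\n//p
        d
    ' "$@"
}
```

It can be used as `join_blocks file`, or in a pipeline reading from stdin.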

If the format of the file is fixed as in the example text:

 sed 'N;N;N;N;N;s/\s*\n/ /g;s/ $//' file

may suffice.

sjsam · Answer 4 · 2016-07-10T08:45:25.130

0

My awk solution would be:

awk 'BEGIN{RS="";FS="\n";}
      {
      for(i=1;i<=NF;i++)
        printf "%s%s", $i, (i<NF?OFS:ORS)
      }' file

Output

cn: Jack  displayName: Jack Johnson givenName: Jack sn: johnson zimbraMailDeliveryAddress: Jack@example.com
cn: james ryan displayName: James Ryan givenName: James sn: Ryan zimbraMailDeliveryAddress: James@example.com

edited Jul 10 '16 at 08:45

answered Jul 10 '16 at 08:36


1

Instead of hard-coding the characters that separate output fields and terminate records, use the builtin values: `printf "%s%s", $i, (i – Ed Morton Jul 10 '16 at 08:42
1

@EdMorton Wow! Much better, Ed. Thank you, I've adopted the technique. – sjsam Jul 10 '16 at 08:46
1

You're welcome. That's the idiomatic way to write loops when you know the terminating loop value. When you don't but you do know you'll start at 1 (or some other value) then it's `for (i=1;i in arr;i++) printf "%s%s", (i>1?OFS:""), $i; print ""` and when you don't know either end it's `for (i in arr) printf "%s%s", (++c>1?OFS:""), $i; print ""`. I used array indices in the loops just to demonstrate a case when you don't necessarily know the start/end values. – Ed Morton Jul 10 '16 at 08:49
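
As a rough illustration of the first two idioms in that comment: the sketch below runs both on the sample line `a b c`. Unlike the comment, the second loop prints `arr[i]` from an array filled by `split`, which is an adaptation for the sake of a self-contained example.

```shell
# Known end value: pick the separator by comparing i with NF, so the last
# field is followed by ORS instead of OFS.
known_end=$(echo 'a b c' | awk '{
    for (i = 1; i <= NF; i++) printf "%s%s", $i, (i < NF ? OFS : ORS)
}')

# Known start only: walk a 1-based array until the index is no longer
# present, printing the separator before every element except the first.
known_start=$(echo 'a b c' | awk '{
    split($0, arr)
    for (i = 1; i in arr; i++) printf "%s%s", (i > 1 ? OFS : ""), arr[i]
    print ""
}')

printf '%s\n%s\n' "$known_end" "$known_start"
```

The `for (i in arr)` form from the comment is left out here because awk does not define the order in which it visits the indices.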

score 0 · Answer 5 · answered Jul 11 '16 at 05:43


Another way with sed:

sed '/^$/!{H;d};/^$/{x;G;s/\n/ /g;s/^ //}' file
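
That command gathers lines in the hold space and only emits them when an empty line arrives, so a last record that is not followed by an empty line is never printed, and blanks at the end of a line survive the join. A hedged variant that also flushes on the last input line and trims those blanks could look like this; the function name `join_paragraphs` is hypothetical, and separator lines are assumed to be truly empty.

```shell
join_paragraphs() {
    # Non-empty line: append it to the hold space and, unless it is the
    # last line of input, start the next cycle without printing.
    # Empty line or last line: swap the collected block into the pattern
    # space, drop the leading newline left by H, join the lines with single
    # spaces, and strip blanks at the end before the automatic print.
    sed -e '/./{H;$!d;}' \
        -e 'x' \
        -e 's/^\n//' \
        -e 's/[[:blank:]]*\n/ /g' \
        -e 's/[[:blank:]]*$//' "$@"
}
```

It is called the same way as the original, for example `join_paragraphs file`.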

SLePort
