I have a data set:

<START
   col1=value;
   col2=value;
   col3=value;
   col4=value;
   col5=value;
<END
<START
  col1=value;
  col2=value;
  col4=value;
<END
<START
  col1=value;
  col2=value;
  col3=value;
  col4=value;
  col6=value;
<END

I want the output as

col1|col2|col3|col4|col5|col6
value|value|value|value|value|null
value|value|null|value|null|null
value|value|value|value|null|value

I am using:

tr -s '\n' ',' < file.txt > Output.txt

This puts the entire output on a single line. I then tried replacing the "START" string with \n to split the values into rows, but I run out of memory on my laptop.

Is there a more efficient solution to this problem using awk or sed?

ThePatBan

1 Answer

This prints the columns in an arbitrary order, since I don't know what order you want (first seen, alphabetical, something else?):

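$ cat tst.awk
BEGIN { FS="="; OFS="|" }
# Pass 1 (first read of the file): collect every column name.
NR==FNR { if (!/^</) names[$1]; next }
# Pass 2, first line: print the header row.
FNR==1  {
    numNames = length(names)
    nameCnt = 0
    for (name in names) {
        printf "%s%s", name, (++nameCnt<numNames ? OFS : ORS)
    }
}
# End of a record: print its values, using "null" for missing columns.
/^<END/ {
    nameCnt = 0
    for (name in names) {
        printf "%s%s", (name in vals ? vals[name] : "null"), (++nameCnt<numNames ? OFS : ORS)
    }
    delete vals
    next
}
# Any other line in pass 2: remember the value for this record.
{ vals[$1] = $2 }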

$ awk -f tst.awk file file
col1|col2|col3|col4|col5|col6
value|value|value|value|value|null
value|value|null|value|null|null
value|value|value|value|null|value

With GNU awk 4.0 or later you can control the output order by setting PROCINFO["sorted_in"] (see the gawk man page).
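
The sample in the question has leading indentation and trailing semicolons, so splitting on "=" alone would treat "   col1" and "  col1" as different names and keep the ";" in each value. The following is a minimal sketch of one way to handle that, not a tested drop-in for the real data: it trims both, and prints the columns in first-seen order using only plain awk features, so it does not rely on PROCINFO["sorted_in"]. The function name flatten, the temporary file, and the values a1, b1, and so on are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample mirroring the layout in the question
# (indented keys, trailing semicolons); the values are invented.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<START
   col1=a1;
   col2=b1;
   col3=c1;
   col4=d1;
   col5=e1;
<END
<START
  col1=a2;
  col2=b2;
  col4=d2;
<END
<START
  col1=a3;
  col2=b3;
  col3=c3;
  col4=d3;
  col6=f3;
<END
EOF

# flatten FILE: read FILE twice, first to collect column names in
# first-seen order, then to print one pipe-separated row per record.
flatten() {
    awk '
    BEGIN { OFS = "|" }

    # Start of the second pass: all names are known, print the header.
    FNR == 1 && NR != FNR {
        for (i = 1; i <= n; i++)
            printf "%s%s", order[i], (i < n ? OFS : ORS)
    }

    # Marker lines: <END closes a record (acted on in the second pass only).
    /^[ \t]*</ {
        if (NR != FNR && /<END/) {
            for (i = 1; i <= n; i++)
                printf "%s%s", ((order[i] in vals) ? vals[order[i]] : "null"), (i < n ? OFS : ORS)
            split("", vals)              # portable way to empty the array
        }
        next
    }

    # Skip blank lines and anything without a key=value shape.
    !/=/ { next }

    {
        line = $0
        sub(/^[ \t]+/, "", line)         # drop leading indentation
        sub(/;[ \t\r]*$/, "", line)      # drop trailing semicolon
        eq  = index(line, "=")
        key = substr(line, 1, eq - 1)
        val = substr(line, eq + 1)
    }

    # First pass: remember each new key in the order it appears.
    NR == FNR {
        if (!(key in seen)) { seen[key] = 1; order[++n] = key }
        next
    }

    # Second pass: store the value for the current record.
    { vals[key] = val }
    ' "$1" "$1"
}

flatten "$sample"
```

On the sample above this prints:

col1|col2|col3|col4|col5|col6
a1|b1|c1|d1|e1|null
a2|b2|null|d2|null|null
a3|b3|c3|d3|null|f3

Only the column names and the current record are held in memory, so usage should grow with the number of distinct columns rather than with the size of the file.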

Ed Morton
  • Thank you @Ed Morton. I am getting a blank file in the output. Any suggestions? – ThePatBan Sep 22 '16 at 09:10
  • edit your question to show the input file and the command you are running. As you can see the command I posted does produce the output shown given the input file you provided so you must be running a different command or running the same command against input that doesn't look like what you posted. Make sure you copy/paste **exactly** what you are doing from your terminal into the SO editor if you'd like us to help you debug it. – Ed Morton Sep 22 '16 at 13:25
  • My apologies for the delay. There is an indent before the "col1=value" string, and the values end with a semicolon. I have updated the question. – ThePatBan Sep 26 '16 at 03:30
  • My apologies too but I'm not interested in revisiting this question once a week. Good luck. – Ed Morton Sep 26 '16 at 15:24