
Below is sample data from a file.

It has 4 TAB-separated columns, with the last column holding values separated by commas. The 3rd column shows the number of values in the 4th column.

6338838 ESR 3   173812,10547556,10518181
6338822 ESR 2   7219086,12761162

Expected output:

6338838 ESR 3   173812
6338838 ESR 3   10547556
6338838 ESR 3   10518181
6338822 ESR 2   7219086
6338822 ESR 2   12761162

I tried with AWK, but was not able to make it work.

rkj

6 Answers


EDIT: How about simply using gsub to get rid of commas here :)

awk -F" +" '{gsub(",",ORS $1 OFS $2 OFS $3 OFS,$4)} 1' Input_file | column -t

Change -F to -F"\t" in case your Input_file is TAB delimited.
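For example, assuming the Input_file really is TAB separated, that variant might look like this (untested sketch):

awk -F"\t" '{gsub(",",ORS $1 OFS $2 OFS $3 OFS,$4)} 1' Input_file | column -t

Setting -v OFS='\t' instead would keep the inserted separators as tabs without needing the column -t step.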



How about simply using awk's -F and printing the relevant field values:

awk -F" +|," '{for(i=4;i<=NF;i++){print $1,$2,$3,$i}}'  Input_file

Append | column -t to the above command in case you need the output aligned in columns.
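For example, the complete pipeline would then be:

awk -F" +|," '{for(i=4;i<=NF;i++){print $1,$2,$3,$i}}' Input_file | column -t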

As per Cyrus's and ghoti's comments, also adding the following now, in case your Input_file is TAB delimited.

awk -F '[\t,]' -v OFS='\t' '{for(i=4; i<=NF; i++) print $1,$2,$3,$i}' Input_file
RavinderSingh13
  • Nice. Though, beware that the input file is TAB delimited, not space. And instead of piping through `column -t`, you could set your OFS as I mentioned in my answer. – ghoti Jun 15 '18 at 03:29
  • e.g. `awk -F '[\t,]' -v OFS='\t' '{for(i=4; i<=NF; i++) print $1,$2,$3,$i}' file` – Cyrus Jun 15 '18 at 03:35
  • @ghoti, thanks Ghoti, you made my day. Cyrus, sure, added this to the code now too. – RavinderSingh13 Jun 15 '18 at 03:39
  • My goodness, that's a clever approach in your edit. Very nicely done! (And it looks as if it's tied with my second solution as the shortest awk code to produce the desired results.) – ghoti Jun 15 '18 at 04:10
  • @ghoti, thank you sir :) I will keep this comment saved with me to get encouraged every time too. – RavinderSingh13 Jun 15 '18 at 04:16
  • Upon reflection, you could shave off six more characters by removing unnecessary spaces: `'{gsub(",",ORS$1OFS$2OFS$3OFS,$4)} 1'`. This definitely gets you top spot for brevity, though it is less readable and might not work in all versions of awk. – ghoti Jun 15 '18 at 10:52
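For reference, a sketch of that condensed form applied to the full command (as the comment above notes, the unspaced concatenation may not work in every awk):

awk -F" +" '{gsub(",",ORS$1OFS$2OFS$3OFS,$4)} 1' Input_file | column -t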

This should work:

awk '{n = split($4,x,","); for (i = 1; i <= n; ++i) {printf "%s %s %s %s\n", $1, $2, $3, x[i]} }' yourfile
lubgr
  • I do wonder, sometimes, why the awk community prefers one-liners. Isn't https://gist.github.com/charles-dyfis-net/8162e47224e74f564e31859fd14e0901 -- the same code, but formatted such that one can look at the alignment to see where each block and command begins and ends -- far more readable? – Charles Duffy Jun 14 '18 at 19:25
  • Shell scripts allow literal newlines inside single-quoted strings, so still perfectly pasteable; just need a leading `awk '` line at the beginning and `'` at the end. – Charles Duffy Jun 14 '18 at 19:26
  • @CharlesDuffy When navigating through your local history though, aren't one-liners easier to reuse to e.g. pipe the output through another command? – lubgr Jun 14 '18 at 19:28
  • In my local history search, up-arrowing into a command with a multi-line literal string shows me... that original command with its multi-line string, whitespace and indentation still intact, and with navigation across line boundaries available for editing. – Charles Duffy Jun 14 '18 at 19:29
  • @CharlesDuffy I am somewhat convinced :) – lubgr Jun 14 '18 at 19:31
  • Heh. Another practice available -- if you want to pipe multi-line chunks of code around, it's probably a good time to encapsulate that code in a shell function anyhow. Throw a `foo() {` on the beginning and a `}` on the end, and then you can `foo | ...` or `... | foo` to your heart's content. – Charles Duffy Jun 14 '18 at 19:33
  • split returns you the length, just capture it there and use it in the for loop. – karakfa Jun 14 '18 at 19:42
  • `split()` returns the number of fields. – Cyrus Jun 15 '18 at 03:45
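As a rough sketch of the multi-line formatting and shell-function wrapping discussed in the comments above (the function name split_last_col is an illustrative choice, not from the thread):

split_last_col() {
  awk '
    {
      n = split($4, x, ",")                        # number of comma-separated values
      for (i = 1; i <= n; ++i) {
        printf "%s %s %s %s\n", $1, $2, $3, x[i]   # repeat first three fields per value
      }
    }
  ' "$@"
}

split_last_col yourfile   # or: some_command | split_last_col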

With GNU awk:

awk 'BEGIN{FS=OFS="\t"} {c1to3=$1 FS $2 FS $3; columns=split($4,array,","); for(i=1; i<=columns; i++) print c1to3,array[i]}' file

or shorter:

awk -v OFS='\t' '{columns=split($4,array,","); for(i=1; i<=columns; i++) print $1,$2,$3,array[i]}' file

or

awk 'BEGIN{OFS="\t"} {c=split($4,a,","); NF=3; for(i=1; i<=c; i++) print $0,a[i]}' file

Output:

6338838 ESR     3       173812
6338838 ESR     3       10547556
6338838 ESR     3       10518181
6338822 ESR     2       7219086
6338822 ESR     2       12761162
Cyrus

I love these "who can do it shorter" contests. :-)

If we cared to use the item count from $3, we could do this:

awk '{split($4,a,",");for(i=1;i<=$3;i++){$4=a[i];print}}' OFS='\t' input.txt

But the following produces similar results in fewer bytes of code. Output is in the reverse order of subfields in $4.

awk '{for(i=split($4,a,",");i;i--){$4=a[i];print}}' OFS='\t' input.txt

Not bothering to set FS because your sample input doesn't appear to include spaces within the fields.
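If the real input is TAB delimited, as the question states, a sketch of the same approach with FS set explicitly (and tab-separated output) might be:

awk -F'\t' '{for(i=split($4,a,",");i;i--){$4=a[i];print}}' OFS='\t' input.txt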

ghoti

In native bash:

while IFS=$'\t' read -r one two three four; do
    IFS=, read -r -a pieces <<<"$four"
    for piece in "${pieces[@]}"; do
      printf '%s\t%s\t%s\t%s\n' "$one" "$two" "$three" "$piece"
    done
done <yourfile
Charles Duffy

Here is another awk, without referencing unused fields.

$ awk '{n=split($NF,a,",");      # split the last field on commas into array a
        for(i=1;i<=n;i++)        # for each comma-separated value
          {sub($NF"$",a[i]);     # swap the current last field for that value ($0 is re-split)
           print}}' file.t
karakfa