Bash: look in the first column of a table, if elements in row are the same combine the rows

Question

I have a table like this:

N1.txt N04_28  31022   39154   t1-8133/8133     
N1.txt N04_28  40504   47604   1-7101/7101    
N1.txt N05_159 1       6348    t1-6348/8133     
N1.txt N05_159 7698    14798   1-7101/7101     
N2.txt N06_30  1       6490    t1-6490/8133    
N2.txt N06_30  7840    14940   1-7101/7101    
N3.txt N07_170 1       6285    t1-6285/8133     
N4.txt N07_170 7635    14735   t1-7101/7101

I would like to look only at the first column and, if several rows contain the same string there, combine them into a single row. The output should look something like this:

    N1.txt N04_28  31022   39154   t1-8133/8133   N04_28  40504   47604   1-7101/7101 N05_159 1       6348    t1-6348/8133   N05_159 7698    14798   1-7101/7101    
    N2.txt N06_30  1       6490    t1-6490/8133   N06_30  7840    14940   1-7101/7101   
    N3.txt N07_170 1       6285    t1-6285/8133     
    N4.txt N07_170 7635    14735   t1-7101/7101

I thought I could do that in awk, but I am afraid my skills are limited. I looked at this question, which looked similar, but of course it merges everything if I replace the /@/ with /*.txt/.

I am doing these things over and over and I really would like to learn how to do it properly and efficiently. Thank you

Maybe `awk '$1!=a{if(b)print b;b=""}a=$1{$1="";if(!b)b=a;b=b$0}END{print b}' file` will do? Looks like `awk -v ORS="" 'a!=$1{a=$1; $0=RS $0} a==$1{ sub($1":",";") } 1' file` also works... — Wiktor Stribiżew, Oct 02 '19 at 07:58
Thanks, @WiktorStribiżew, both seem to work, but the first command does not give tab-delimited output, and with the latter the merged rows are not separated. But in general, this is a great help. Would you mind guiding me through this? — efrem, Oct 02 '19 at 12:36

score 0 · Answer 1 · answered Oct 02 '19 at 08:11

You may use awk:

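awk 'BEGIN{ PROCINFO["sorted_in"]="@ind_str_asc" }  # gawk only: loop over keys in ascending string order
{
    s=$0
    # drop the first column and the blanks that follow it
    sub(/^[[:blank:]]*[^[:blank:]]+[[:blank:]]+/, "", s)
    # append the rest of the row to whatever is already stored for this key
    a[$1] = (a[$1] == "" ? "" : a[$1] OFS) s
}
END {
   # one output line per distinct first-column value
   for (i in a) print i, a[i]
}' file | column -t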

N1.txt  N04_28   31022  39154  t1-8133/8133  N04_28  40504  47604  1-7101/7101  N05_159  1  6348  t1-6348/8133  N05_159  7698  14798  1-7101/7101
N2.txt  N06_30   1      6490   t1-6490/8133  N06_30  7840   14940  1-7101/7101
N3.txt  N07_170  1      6285   t1-6285/8133
N4.txt  N07_170  7635   14735  t1-7101/7101
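
The follow-up comment asks for tab-delimited output. A minimal sketch of the same array-based idea that does not rely on gawk's `PROCINFO` or on `column -t` might look like the following. The function name `merge_by_first_column` and the array names are made up for illustration, and the sketch assumes whitespace-separated fields with no spaces inside a field. It keeps keys in the order they first appear instead of sorting them, and, like the answer above, it merges rows with the same first column even when they are not adjacent.

```shell
# Hypothetical helper: merge rows that share the same first column,
# joining all remaining fields with tabs.
merge_by_first_column() {
    awk '
    NF == 0 { next }                     # skip blank lines
    {
        key = $1
        if (!(key in merged)) {          # first time this key is seen
            order[++n] = key             # remember first-seen order
            merged[key] = key            # output line starts with the key
        }
        # append every field after the first, tab-separated
        for (i = 2; i <= NF; i++)
            merged[key] = merged[key] "\t" $i
    }
    END {
        for (j = 1; j <= n; j++)
            print merged[order[j]]
    }' "$@"
}
```

It can be called as `merge_by_first_column file` or fed from a pipe. Since every row is held in memory until the end, this is only a reasonable choice for tables that fit comfortably in RAM.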
