I want to extract certain columns from a file based on the header name. Some of the names have spaces (the file is tab-delimited). I cannot remove or substitute the spaces since downstream applications will be affected. What I am looking for is to extract some column based on header names and output them in the order I desire
Here is an example of my file:
Sample Note Intragenic Rate ABCDE_177447
1032 NA 0.97867626 0.9300704670625763 0.72782564
ABCDE_177447 NA 0.97836965 1.0 0.87218356
ABCDE_188399 NA 0.97859967 0.905527730405171 0.81188565
ABCDE_189595 NA 0.9787659 0.9059075892313707 0.8089241
ABCDE_189596 NA 0.9788054 0.9065243881070291 0.8092951
My desired output;
Sample Intragenic ABCDE_177447
1032 0.97867626 0.9300704670625763 0.72782564
ABCDE_177447 0.97836965 0.87218356
ABCDE_188399 0.97859967 0.81188565
ABCDE_189595 0.9787659 0.8089241
ABCDE_189596 0.9788054 0.8092951
I have tried this solution AWK extract columns from file based on header selected from 2nd file
but it doesn't work with spaces on names and the same for this solution Extracting columns from a file
I also have tried this
$cat cols.awk
BEGIN {
n=split(cols,col)
for (i=1; i<=n; i++) s[col[i]]=i
}
NR==1 {
for (f=1; f<=NF; f++)
if ($f in s) c[s[$f]]=f
next
}
{ sep=""
for (f=1; f<=n; f++) {
printf("%c%s",sep,$c[f])
sep=FS
}
print ""
}
But when I ran my script like awk -F\t -f cols.awk.sh -v cols="Note,Sample,Intragenic Rate" metrics.txt
I got the following error:
awk: illegal field $(), name "1"
input record number 2, file metrics.txt
source line number 12