
I am trying to reformat the reference legend files to make them compatible with bcftools.

Essentially, I need to go from this:

id position a0 a1 TYPE AFR AMR EAS EUR SAS ALL
1:123:A:T 123 A T SNP 0.01 0.01 0 0 0 0.01
1:679:A:T 123 A T SNP 0.01 0.01 0 0 0 0.01

to this:

id position a0 a1 TYPE AFR AMR EAS EUR SAS ALL
1:123_A_T 123 A T SNP 0.01 0.01 0 0 0 0.01
1:679_A_T 123 A T SNP 0.01 0.01 0 0 0 0.01

ideally using bash.

FranjoIM

1 Answer

If sed is an option:

sed 's/:/_/2; s/:/_/2' file > reformatted_file

(Each s/:/_/2 replaces the second ":" on the line with an underscore. The first pass changes the original second ":". The second pass changes what was originally the third ":", which by then has become the second ":" because the one before it is already an underscore.)
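
To check the two-pass substitution on a single line before running it on the whole file, something like the following could be used. It is only a sketch: the function name reformat_ids is made up for illustration, and the sample line is taken from the question.

```shell
# Hypothetical helper (name not from the original answer): applies the same
# two substitutions as the sed one-liner to whatever arrives on stdin.
reformat_ids() {
    sed 's/:/_/2; s/:/_/2'
}

# Try it on one sample line from the question.
printf '%s\n' '1:123:A:T 123 A T SNP 0.01 0.01 0 0 0 0.01' | reformat_ids
# prints: 1:123_A_T 123 A T SNP 0.01 0.01 0 0 0 0.01
```

Note that sed counts colons across the whole line, not just the first column, so this assumes no other field contains a ":".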

Or with only bash:

while IFS= read -r line
do
    # Replace every ":" with "_", then turn the first "_" back into ":"
    tmp="${line//:/_}"
    printf '%s\n' "${tmp/_/:}"
done < file > reformatted_file

(*This works with your example, but it replaces every ":" with an underscore and then changes the first underscore back to a ":", which might have unintended effects on your file, e.g. it might mess up a header that contains an underscore or a colon.)
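
If the header or the other columns are a concern, one possible alternative is to leave the first line alone and rewrite only the first column. The sketch below uses awk from the shell; the function name reformat_legend is hypothetical, and it assumes the first line is the header and that fields are separated by single spaces (awk rebuilds each edited line with single spaces between fields).

```shell
# Hypothetical helper (name not from the original answer).
# Assumptions: line 1 is the header; fields are separated by single spaces.
reformat_legend() {
    awk 'NR == 1 { print; next }          # pass the header through unchanged
         {
             n = split($1, parts, ":")    # split only the id column on ":"
             if (n > 1) {
                 id = parts[1] ":" parts[2]   # keep the first ":" as it is
                 for (i = 3; i <= n; i++)
                     id = id "_" parts[i]     # join the remaining parts with "_"
                 $1 = id
             }
             print
         }' "$@"
}

# Usage: reformat_legend file > reformatted_file
```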

jared_mamrot