
I would like to insert part of each file's name into the contents of that file.

The file name is:

GeneName_something.fas.

And my files have this format:

'>Speciesa
atgaatatagatata
'>Speciesb
atagtagctatgat

I would like to append the gene name after each species name. The output should be:

'>Speciesa-GeneName
atgaatatagatata
'>Speciesb-GeneName
atagtagctatgat

I would like to use bash, perhaps with awk or sed, and run it as a loop over the files in my folder. Thanks.

James Brown
Nico64
  • Possible duplicate of [add text at the end of each line](https://stackoverflow.com/questions/15978504/add-text-at-the-end-of-each-line) – Aserre Mar 28 '18 at 11:42

2 Answers


Using GNU awk (BEGINFILE could be replaced with FNR==1 but I'm using -i inplace):

$ awk '
  BEGINFILE { split(FILENAME,f,"_") }  # split filename on _
  /^\47/ { $0=$0"-"f[1] }              # add to quote-starting records
  1' GeneName_something.fas            # output
'>Speciesa-GeneName
atgaatatagatata
'>Speciesb-GeneName
atagtagctatgat

This version writes the changed data to stdout, but you can use awk -i inplace to edit the file in place (see here).

Note that there is no check that the filename actually contains a GeneName part. If it is missing, a bare - will be appended anyway.
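
Since the question asks for a loop over a folder, here is a minimal sketch that combines the loop with the missing check. It assumes the header lines really start with > (as clarified in the comments below) and that the files match *_*.fas. It writes through a temporary file rather than -i inplace so that any POSIX awk works. The scratch directory and the sample file are made up for illustration.

```shell
#!/bin/sh
# Demo setup (hypothetical data): a scratch directory with one sample file.
dir=$(mktemp -d)
printf '>Speciesa\natgaatatagatata\n>Speciesb\natagtagctatgat\n' \
  > "$dir/GeneName_something.fas"

for f in "$dir"/*_*.fas; do
  [ -e "$f" ] || continue          # the glob matched nothing
  base=${f##*/}                    # drop the directory part
  gene=${base%%_*}                 # keep everything before the first _
  if [ -z "$gene" ]; then
    echo "skipping $f: no gene name in file name" >&2
    continue
  fi
  # Append -gene to header lines; all other lines pass through unchanged.
  awk -v gene="$gene" '/^>/ { $0 = $0 "-" gene } 1' "$f" > "$f.tmp" &&
    mv "$f.tmp" "$f"
done

cat "$dir/GeneName_something.fas"
```

With GNU awk 4.1 or later, the temporary file and mv could be replaced by awk -i inplace.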

James Brown
  • Hi, thanks. I tried it but it is not working. Can you explain the /^\47/ part, please? What is the 47? Maybe you can explain why this is not working. In fact my lines start like this: >Speciesa. There is no ' (I don't know how to show it correctly on Stack Overflow). – Nico64 Mar 28 '18 at 12:06
  • `\47` is the single quote `'` as in your question lines start with it. If they start with something else (`>` perhaps) replace `\47` with `>`. – James Brown Mar 28 '18 at 12:19
  • Yes, sorry, my bad. It works now. At least now I know \47 = ' – Nico64 Mar 28 '18 at 12:25
  • If you stick with awk, you'll need that info. – James Brown Mar 28 '18 at 12:34
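
To illustrate the \47 escape discussed in the comments above: 47 is the octal form of decimal 39, the ASCII code of the single quote. The escape is handy because a literal ' cannot appear inside a single-quoted shell string. A small sketch, with made-up sample input:

```shell
# Octal 47 is decimal 39, the ASCII code of the single quote.
q=$(awk 'BEGIN { print "\47" }')
printf '%s\n' "$q"

# Count the lines that start with a single quote, using the same pattern
# as the answer. Only the first of the two sample lines qualifies.
n=$(printf "'>Speciesa\natgaat\n" | awk '/^\47/ { c++ } END { print c + 0 }')
printf '%s\n' "$n"
```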

The following awk commands could also help here.

Solution 1:

awk 'FNR==1{val=FILENAME;sub(/_.*/,"",val)} />Species/{$0=$0"-"val;} 1' GeneName_something.fas

Solution 2:

awk 'FNR==1{val=FILENAME;sub(/_.*/,"",val)} />Species/{print $0"-"val;next} 1' GeneName_something.fas
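
One thing to watch when either command is given a path, for example from a loop: FILENAME holds the name exactly as passed on the command line, so data/GeneName_something.fas would yield data/GeneName. A minimal sketch that strips the directory first, assuming the header lines start with > and using a made-up scratch directory:

```shell
# Demo setup (hypothetical data): one sample file in a scratch directory.
dir=$(mktemp -d)
printf '>Speciesa\natgaatatagatata\n' > "$dir/GeneName_something.fas"

out=$(awk '
  FNR == 1 {
    val = FILENAME
    sub(/.*\//, "", val)   # strip any leading directory
    sub(/_.*/, "", val)    # strip from the first _ onward
  }
  /^>/ { $0 = $0 "-" val } # append the gene name to header lines
  1                        # print every line
' "$dir"/*.fas)
printf '%s\n' "$out"
```

Because FNR resets for each input file, the same command handles several files in one invocation.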
RavinderSingh13