How can I replace the 2nd column name in a .csv file with the file's own name, for all files within a directory? Does anyone know how to do this with shell scripting (sed or awk)?

Input file name: CDXV1.csv

Gene,RPKM(26558640 pairs)
ENSTGUG00000013338 (GAPDH),971.678203888
ENSTGUG00000005054 (CAMKV),687.81249397
ENSTGUG00000006651 (ARPP19),634.296191033
ENSTGUG00000002582 (ITM2A),613.756010638

Output file name: CDXV1.csv (same)

Gene,CDXV1(26558640 pairs)
ENSTGUG00000013338 (GAPDH),971.678203888
ENSTGUG00000005054 (CAMKV),687.81249397
ENSTGUG00000006651 (ARPP19),634.296191033
ENSTGUG00000002582 (ITM2A),613.756010638
user2314737
  • What did you try yourself? Post your research attempts even if they aren't successful. – Inian Jun 08 '17 at 06:53

2 Answers

awk -F, -v OFS=,  'NR==1{split(FILENAME,a,".");split($2,b,"(");$2= a[1] "(" b[2]}1' CDXV1.csv
Gene,CDXV1(26558640 pairs)
ENSTGUG00000013338 (GAPDH),971.678203888
ENSTGUG00000005054 (CAMKV),687.81249397
ENSTGUG00000006651 (ARPP19),634.296191033
ENSTGUG00000002582 (ITM2A),613.756010638

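If your awk supports in-place editing (GNU Awk 4.1.0 or later), use -i inplace. When several files are passed, test FNR==1 rather than NR==1: NR keeps counting across files, so NR==1 would only match the header of the first file.

awk -i inplace -F, -v OFS=,  'FNR==1{split(FILENAME,a,".");split($2,b,"(");$2= a[1] "(" b[2]}1' *.csv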
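
For an awk without -i inplace (gawk 4.0.1, mawk, BSD awk), a loop that writes to a temporary file and copies it back is one possible workaround. This is only a sketch under some assumptions: the scratch directory and the two sample files are made up for the demonstration, the variable name `name` is arbitrary, and file names are assumed not to contain `&` or backslashes (both are special in the replacement text of sub).

```shell
#!/bin/sh
# Demo data in a scratch directory (hypothetical; point dir at your own directory instead).
dir=$(mktemp -d)
printf 'Gene,RPKM(26558640 pairs)\nENSTGUG00000013338 (GAPDH),971.678203888\n' > "$dir/CDXV1.csv"
printf 'Gene,RPKM(100 pairs)\nENSTGUG00000005054 (CAMKV),687.81249397\n' > "$dir/CDXV2.csv"

for f in "$dir"/*.csv; do
    name=${f##*/}       # drop the directory part
    name=${name%%.*}    # drop everything from the first dot
    tmp="$f.tmp.$$"     # temporary file next to the original
    # Only one file is read per awk call, so NR==1 is that file's header line.
    awk -F, -v OFS=, -v name="$name" '
        NR == 1 { sub(/^[^(]*/, name, $2) }   # replace the text before "(" in column 2
        1                                     # print every line
    ' "$f" > "$tmp" && cat "$tmp" > "$f"      # copy back, keeping the file permissions
    rm -f "$tmp"
done

# Show the rewritten header of each file.
for f in "$dir"/*.csv; do head -n 1 "$f"; done
```

Passing the name with -v avoids depending on FILENAME, which would include the directory prefix when the files are given with a path.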
P....
  • OP needs to apply this to all `.csv` files and also needs some way to modify the files in place. – Inian Jun 08 '17 at 07:04
  • Thank you for the nice code, PS. Could you please let me know which version of awk supports this? I have gawk 4.0.1 and it does not have this option. – RavinderSingh13 Jun 08 '17 at 11:02
  • @RavinderSingh13 GNU Awk has had the "inplace" file editing option since version 4.1.0. For more info, see https://stackoverflow.com/questions/16529716/awk-save-modifications-in-place – P.... Jun 08 '17 at 11:16

sed solution:

for f in yourdir/*.csv; do b=${f##*/}; sed -i "1s~^\([^,]*\),\([^(]*\)~\1,${b%%.*}~" "$f"; done

Details:

  • for f in yourdir/*.csv - iterates over the csv file names

  • b=${f##*/} - strips the directory part, so that yourdir/ does not end up in the header

  • -i - modifies the file in place (GNU sed syntax; BSD/macOS sed expects -i '')

  • 1s - performs the substitution only on the 1st line

  • ~ - used as the delimiter of the s command instead of /

  • ^\([^,]*\),\([^(]*\) - captures the 1st field and the 2nd field's text up to (but not including) the first (

  • ${b%%.*} - bash parameter expansion: removes everything from the first . onward, leaving CDXV1
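
To see how the prefix- and suffix-stripping expansions behave on a path that includes a directory, here is a small sketch. The helper name `stem` and the second sample path are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# stem PATH: print the file name without directory and without anything after the first dot.
stem() {
    b=${1##*/}                 # remove the longest prefix ending in "/"
    printf '%s\n' "${b%%.*}"   # remove the longest suffix starting with "."
}

f=yourdir/CDXV1.csv
echo "${f%%.*}"    # yourdir/CDXV1  (the directory part is kept)
stem "$f"          # CDXV1

g=run.1/sample.v2.csv
echo "${g%%.*}"    # run            (cut at the first dot, which is in the directory name)
echo "${g%.*}"     # run.1/sample.v2 (single % removes only the last extension)
stem "$g"          # sample
```

Stripping the directory first matters whenever the loop iterates over a path such as yourdir/*.csv rather than *.csv in the current directory.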

RomanPerekhrest