
I have 20 text files named samp_100.out, samp_200.out, samp_300.out, ..., samp_2000.out. The naming is consistent, and the number increases by 100 up to 2000. I want to write a short script that (1) deletes the first line of each file, and (2) runs the following command on each file: sed 's/ \+/,/g' ifile.txt > ofile.csv, keeping the same base name but with a .csv extension.

I am assuming I need to use a for loop, but I am not sure how to iterate through the file names.

John Kugelman
user19619903
  • You could iterate over the numbers (see ["How do I iterate over a range of numbers defined by variables in Bash?"](https://stackoverflow.com/questions/169511/how-do-i-iterate-over-a-range-of-numbers-defined-by-variables-in-bash), though you'll need to increment by 100 rather than just 1), or iterate over the filenames (["How to loop over files in directory and change path and add suffix to filename"](https://stackoverflow.com/questions/20796200/how-to-loop-over-files-in-directory-and-change-path-and-add-suffix-to-filename) has examples of how to do things like this). – Gordon Davisson Sep 24 '22 at 23:30

1 Answer

This might work for you (GNU sed and parallel):

parallel "sed '1d;s/ \+/,/g' samp_{}.out > samp_{}.csv" ::: {100..2000..100}

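This uses GNU sed, GNU parallel, and brace expansion. For each of the desired files, sed deletes the first line (1d) and replaces every run of one or more spaces with a comma, writing the result to a new .csv file with the same base name. The original .out files are left unchanged.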
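The brace expansion {100..2000..100} produces the numbers 100, 200, ..., 2000. The step form needs bash 4 or newer, which an older cluster login shell may not provide. As a hedged fallback, the same sequence can be generated with plain shell arithmetic. The helper name list_inputs is made up for this sketch and is not part of the answer.

```shell
#!/bin/sh
# Hypothetical helper: print the 20 input file names that
# {100..2000..100} would generate, without using brace expansion.
list_inputs() {
    i=100
    while [ "$i" -le 2000 ]; do
        printf 'samp_%s.out\n' "$i"
        i=$((i + 100))      # step by 100
    done
}

list_inputs
```

Each printed name could then be fed to the same sed command, for example from a while read loop.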

Alternative:

for i in {100..2000..100}
do
    sed '1d;s/ \+/,/g' "samp_${i}.out" > "samp_${i}.csv"
done
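
Since the question asks how to iterate over the file names themselves, a glob-based loop is another option. This sketch is an addition rather than part of the answer. It assumes every samp_*.out file in the current directory should be converted. It first creates two made-up demo files in a scratch directory so that it can be run safely, and it uses the POSIX spelling of the pattern (two spaces followed by a star) in place of the GNU-only " \+".

```shell
#!/bin/sh
# Demo setup (hypothetical data): two small sample files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'header line\n1  2   3\n4 5 6\n' > samp_100.out
printf 'header line\n7 8  9\n' > samp_200.out

# Loop over every file matching the pattern instead of counting by 100.
for f in samp_*.out; do
    [ -e "$f" ] || continue     # skip the literal pattern if nothing matched
    # 1d drops the first line; "  *" means one or more spaces.
    # ${f%.out} strips the .out suffix so the base name is kept.
    sed '1d;s/  */,/g' "$f" > "${f%.out}.csv"
done

cat samp_100.csv
```

For the real files, drop the demo setup and run only the loop in the directory that holds them. The glob visits files in lexical order (samp_1000.out before samp_200.out), which does not matter here because each file is processed independently.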
potong
  • Unfortunately, I get the error "bash: parallel: command not found...". I am working on a cluster, and I do not think I am authorized to change or install anything on the Linux system. Is there any other way I can achieve this without parallel? Thanks – user19619903 Sep 24 '22 at 23:46
  • @user19619903 see alternative – potong Sep 24 '22 at 23:59