
I need to generate a filename from three parts: two fixed strings and one variable.

for f in `cat files.csv`; do echo fastq/$f\_1.fastq.gze; done

files.csv has the following lines:

Sample_11
Sample_12

I need to generate the following:

fastq/Sample_11_1.fastq.gze
fastq/Sample_12_1.fastq.gze

My problem is that I get the following output instead:

_1.fastq.gze_11
_1.fastq.gze_12

The string after the variable overwrites the string before it.

I appreciate any help

Regards

shellter
forever

3 Answers


Your best bet, generally, is to wrap the variable name in braces. So, in this case:

echo "fastq/${f}_1.fastq.gze"

See this answer for some details about the general concept, as well.
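To illustrate why the braces matter, here is a minimal sketch. The second variable f_1 is hypothetical and exists only to expose the ambiguity: without braces, the shell reads the longest valid name after the dollar sign. (The backslash in the question's $f\_1 also ends the variable name, so the question's original line was already expanding $f.)

```shell
f=Sample_11
f_1=other   # hypothetical variable, defined only to show the ambiguity

# Without braces the shell expands the variable named f_1, not f
without_braces="fastq/$f_1.fastq.gze"

# With braces the name ends at the closing brace, so _1 stays literal
with_braces="fastq/${f}_1.fastq.gze"

echo "$without_braces"
echo "$with_braces"
```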

Edit: Looking at the now-provided output, I suspect this isn't a coding problem at all, but rather a conflict between the file's line endings and the terminal/console program.

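Specifically, if the lines of the CSV file end with a carriage return (ASCII/Unicode 13), for example because of Windows-style CRLF line endings, the carriage return after Sample_11 moves the cursor back to the start of the line, and the text that follows overwrites what was already printed.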
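One way to check this hypothesis is to look at the raw bytes. The sketch below builds a throwaway file with CRLF endings (an assumption standing in for the real files.csv) and uses od -c, which shows a carriage return as \r, plus a simple count of carriage returns.

```shell
# Throwaway file with Windows-style CRLF endings, standing in for files.csv
tmp=$(mktemp)
printf 'Sample_11\r\nSample_12\r\n' > "$tmp"

# od -c prints every byte; a \r before each \n confirms carriage returns
od -c "$tmp"

# Keep only the carriage returns and count them (tr -d ' ' trims wc padding)
cr_count=$(tr -cd '\r' < "$tmp" | wc -c | tr -d ' ')
echo "carriage returns found: $cr_count"

rm -f "$tmp"
```

Running the same od -c and count against the real files.csv should show whether it has the same problem.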

In that case, based loosely on this article, I'd recommend replacing cat with something that strips the carriage returns (assuming you understandably don't want to re-architect the script around something like while), such as:

for f in $(tr -cd '\011\012\040-\176' < files.csv); do echo "fastq/${f}_1.fastq.gze"; done

As the cited article explains, octal 11 is a tab, 12 is a line feed, and 40-176 are the printable ASCII characters (Unicode will require more thinking), so tr -cd deletes everything else, including the carriage returns. If the file contains no line feeds at all, for some reason, you probably want to use tr '\015' '\012' instead, which converts the carriage returns to line feeds.

Of course, at that point, better is to find whatever produces the file and ask them to put reasonable line-endings into their file...
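If re-architecting the loop is acceptable after all, a while/read loop can drop the carriage return itself. This is only a sketch under the assumption that the input has CRLF endings; the temporary file stands in for files.csv.

```shell
# Throwaway file with CRLF endings, standing in for files.csv
tmp=$(mktemp)
printf 'Sample_11\r\nSample_12\r\n' > "$tmp"

cr=$(printf '\r')   # a literal carriage return, portable across POSIX shells

result=$(
  # "|| [ -n "$f" ]" also handles a last line without a trailing newline
  while IFS= read -r f || [ -n "$f" ]; do
    f=${f%"$cr"}             # remove one trailing carriage return, if present
    [ -n "$f" ] || continue  # skip empty lines
    echo "fastq/${f}_1.fastq.gze"
  done < "$tmp"
)
printf '%s\n' "$result"

rm -f "$tmp"
```

The ${f%"$cr"} expansion removes the carriage return only when it is the last character, so files that already have plain line feeds pass through unchanged.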

John C
  • Thank you, John. I got the same line: _1.fastq.gze_11. I do not know why the second string eats the first string. What is the problem? – forever Nov 23 '17 at 22:19
  • @forever, after some more investigation and experimentation, I don't think it's a programming problem, but rather a literal "coding" problem (as in choice of characters), so I've updated the answer. As an outside tip, you can spot the carriage return characters in `vi`, to confirm, by looking for `^M` (character 13) at the ends of lines. And if you need to type it in the editor, like to run a command to delete them all, it's CTRL-V, CTRL-M. – John C Nov 24 '17 at 12:48

By the way, your idiom for f in `cat files.csv` should be avoided. See: Dangerous Backticks

# Read one line at a time; IFS= and -r keep whitespace and backslashes intact
while IFS= read -r f
do
    echo "fastq/${f}_1.fastq.gze"
done < files.csv
Yoda
  • I do not know what the problem is: any string I add after the variable f replaces the string before f. This is the result of your code: /_1.fastq.ge_11 – forever Nov 23 '17 at 22:26
  • Can you set `xtrace` and `verbose`, run your script, and post the output? Use `#!/bin/bash -xv` – Yoda Nov 23 '17 at 22:36
  • While the right thing to do, this won't solve the problem at hand. It would have been more appropriate as a comment than an answer. – chepner Nov 23 '17 at 23:07

You can make it a one-liner with xargs and printf.

xargs printf 'fastq/%s_1.fastq.gze\n' <files.csv

printf applies its first argument (the format string) to each remaining argument in turn, reusing the format as many times as needed until all arguments are consumed.

xargs runs the command with as many arguments from the input as fit onto one command line, splitting the work into multiple invocations if the input file is too large for a single command line (the limit is governed by the ARG_MAX constant in your kernel).
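If the input file turns out to contain carriage returns, as suspected elsewhere in this thread, they would be passed through to printf as part of each name. A hedged variant of the one-liner, assuming CRLF endings and using a throwaway file in place of files.csv, removes them with tr first:

```shell
# Throwaway file with CRLF endings, standing in for files.csv
tmp=$(mktemp)
printf 'Sample_11\r\nSample_12\r\n' > "$tmp"

# tr -d '\r' deletes the carriage returns before xargs splits the input
paths=$(tr -d '\r' < "$tmp" | xargs printf 'fastq/%s_1.fastq.gze\n')
printf '%s\n' "$paths"

rm -f "$tmp"
```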

tripleee