
I have a directory of files (/c/Users/Roy/DataReceived) that I want to grep for some information, storing the results as txt files in another directory (/c/Users/Roy/Documents/Result).

For example: imagine I have 20 files with different information about cities, and I want to grep the information for the cities listed in a txt file. The matches for each city will then be stored in a separate txt file named after that city (NewYork.txt, Rome.txt, etc.).

The following code runs:

#!/bin/bash

declare INPUT_DIRECTORY=/c/Users/Roy/DataReceived
declare OUTPUT_DIRECTORY=/c/Users/Roy/Documents/Result

while read -r city; do
  echo $city
  zgrep -Hwi "$city" "${INPUT_DIRECTORY}/"*.vcf.gz > "${OUTPUT_DIRECTORY}/${city}.txt"
done < list_of_cities.txt

However, the generated .txt files have unexpected names:

ls: cannot access 'NewYork'$'\n''.txt': No such file or directory
'NewYork'$'\n''.txt'

EDIT: I know what it is. The redirection > "${OUTPUT_DIRECTORY}/${city}.txt" is not working properly: the files are being stored as .txt/c/Users/Roy/Documents/Result/NewYork. I'm not sure how to solve it.

RoyBatty
  • I'm guessing your input file (`list_of_cities.txt`) contains Windows/DOS line endings (`\r\n`), so the string `${city}.txt` actually looks like `NewYork\r.txt`, and the `\r` moves the cursor to the start of the line before `.txt` is printed. You can verify this with `head -2 list_of_cities.txt | od -c`; in the output you should see the two-character sequence `\r \n`. The easy fix is to run `dos2unix list_of_cities.txt`, either before the `while` loop or once before running your script. NOTE: you only have to run `dos2unix` once, as it permanently updates the source file. – markp-fuso Oct 06 '22 at 22:57
  • Do the answers to ["Are shell scripts sensitive to encoding and line endings?"](https://stackoverflow.com/questions/39527571/are-shell-scripts-sensitive-to-encoding-and-line-endings) solve your problem? – Gordon Davisson Oct 06 '22 at 23:39
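
For anyone who wants to try the check from the first comment without touching their real data, here is a minimal sketch. It assumes GNU `grep` and `od` are available; the temporary file and the `demo_list` and `crlf_lines` names are made up for the demonstration and merely stand in for `list_of_cities.txt`.

```shell
#!/bin/bash
# Hypothetical demo file with Windows (CRLF) line endings,
# standing in for list_of_cities.txt.
demo_list=$(mktemp)
printf 'NewYork\r\nRome\r\n' > "$demo_list"

# Count the lines that end in a carriage return.
# A count of 0 would mean the file already uses LF-only endings.
crlf_lines=$(grep -c $'\r$' "$demo_list")
echo "lines ending in CR: $crlf_lines"

# Dump the first line byte by byte; in a CRLF file the \r
# appears immediately before the \n.
head -n 1 "$demo_list" | od -c

# Clean up the temporary demo file.
rm -f "$demo_list"
```

If the count is non-zero for the real list, the file names built from each line will carry a stray `\r` just before `.txt`.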

1 Answer


From the error, it looks like your file contains CRLF (Windows line endings), while Bash, even on Windows (this was not the case some years ago...), expects LF.

You must either save the file with LF line endings in your editor or convert it with dos2unix before the loop reads it:

while ... 
done < <(cat list_of_cities.txt | dos2unix)
NoDataFound
  • Thanks @NoDataFound. Why the double < though? Is it a typo? – RoyBatty Oct 06 '22 at 23:04
  • @RoyBatty It's two separate syntactic elements. `<` means "take input from this file", and `<(somecommand)` means "run this command with its output going to a pipe, and give me the name of the pipe so I can read from it" (it's called a "process substitution"). So together, they redirect the output of `cat ... | dos2unix` into the input of the `while ... done` loop. BTW, another option is to have `read` trim the CR characters with `while IFS=$IFS$'\r' read ...`. – Gordon Davisson Oct 06 '22 at 23:35
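
If `dos2unix` is not installed, the carriage return can also be trimmed inside the loop itself. The sketch below is a variant of that idea, not the method from the answer: it uses the parameter expansion `${city%$'\r'}` rather than `dos2unix` or the `IFS` trick. The temporary file and the `demo_list` and `names` variables are hypothetical, and the array only collects the output file names so the result can be inspected; in the real script the `zgrep` line from the question would go in its place.

```shell
#!/bin/bash
# Hypothetical demo input with CRLF endings, standing in for list_of_cities.txt.
demo_list=$(mktemp)
printf 'NewYork\r\nRome\r\n' > "$demo_list"

names=()
# The "|| [ -n "$city" ]" part also processes a last line that lacks a newline.
while IFS= read -r city || [ -n "$city" ]; do
  city=${city%$'\r'}          # drop one trailing carriage return, if present
  [ -n "$city" ] || continue  # skip empty lines
  names+=("${city}.txt")      # the question's zgrep command would go here
done < "$demo_list"

# Print the names shell-quoted, so any leftover control character is visible.
printf '%q\n' "${names[@]}"

# Clean up the temporary demo file.
rm -f "$demo_list"
```

Unlike running `dos2unix` on the file, this leaves the source file unchanged, so it keeps working if the list is regenerated on Windows later.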