
I'm writing a script to process a data file, and started with these two command lines:

proc(){ echo $1 ; }

while read -r line ; do proc "$line";  done <data.csv | less

To my surprise, I found a listing of the current directory in the middle of the output, and realised it appears where the data file contains an asterisk character.

The expansion occurs before proc is called. Could someone please explain why this is happening? Is it happening in the command-line processor, in read, or in the file stream?

I can do this:

cat data.csv | tr '*' '_' | while read -r line ; do proc "$line";  done | less

but I'd like to know if there's a way to prevent this expansion.

Alternatively, is there a way to filter input using the redirection form rather than the pipe, and what other special characters (e.g. [, ], ?, -) would I need to replace?

Cheers, bitrat

PS: An example data file:

gfh fg hfgh fgh fg hgh 
7 4 674 547767 56 7 56756
ghdghh gh fg h fgh fg hf gh
8 678 678 * 67 867 8 678 
gfh fg hfgh fgh fg hgh 
7 4 674 547767 56 7 56756
ghdghh gh fg h fgh fg hf gh
8 678 678 67 867 8 678 

And example output:

gfh fg hfgh fgh fg hgh
7 4 674 547767 56 7 56756
ghdghh gh fg h fgh fg hf gh
8 678 678 data.csv file1.txt file2.txt file3.txt  67 867 8 678
gfh fg hfgh fgh fg hgh
7 4 674 547767 56 7 56756
ghdghh gh fg h fgh fg hf gh
8 678 678 67 867 8 678

UPDATE:

The asterisk is only treated as a wildcard if it has spaces on either side, so must be getting tokenised at some point.

UPDATE2:

The expansion doesn't occur if echo is called directly:

while read -r line ; do echo "$line";  done <data.csv
bitrat
  • An asterisk expands to a directory listing. I don't think it matters what else is in the data file, but I've added an example. :-) – bitrat Nov 20 '22 at 21:38
  • What did http://shellcheck.net tell you about your script? As the [bash](https://stackoverflow.com/questions/tagged/bash) tag you used says - `For shell scripts with syntax or other errors, please check them at https://shellcheck.net before posting here`. – Ed Morton Nov 20 '22 at 21:42
  • `$ shellcheck myscript No issues detected!` – bitrat Nov 20 '22 at 21:49
  • shellcheck *definitely* detects the unquoted expansion as a bug. Just a minute and I'll link to it running. – Charles Duffy Nov 20 '22 at 21:50
  • Thanks for shellcheck.net. Handy to know. Yes, it does detect the missing quotes, thanks! For some reason I typed the variable in quotes there and not on my command line... – bitrat Nov 20 '22 at 21:51
  • https://replit.com/@CharlesDuffy2/LittleQuarrelsomeIntelligence -- wherein shellcheck emits the following warning: `SC2086: Double quote to prevent globbing and word splitting.` (relevant wiki page @ https://www.shellcheck.net/wiki/SC2086) -- if you can fork that and link me to a repl where it _isn't_ detected, I'll be interested to see it (and glad to analyze for why it didn't flag in whichever case is given). – Charles Duffy Nov 20 '22 at 21:54
  • Regarding `The asterisk is only treated as a wildcard if it has spaces on either side` - no, it just didn't match anything. `*` matches any string, `*foo` just matches strings that end in `foo` so apparently you don't have any files in your directory that end in whatever string is concatenated to the end of `*` in your input. – Ed Morton Nov 20 '22 at 22:06
  • Regarding `The expansion doesn't occur if echo is called directly..` - no, the problem has nothing to do with echo being called directly vs inside a function, it's entirely about your lack of quotes. See https://mywiki.wooledge.org/Quotes. – Ed Morton Nov 20 '22 at 22:08

1 Answer


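Why: Pathname expansion happens later than variable expansion. The unquoted $1 is first replaced by the contents of the line, the result is split into words, and only then is each word containing * treated as a glob pattern. Search for "order of expansions" in man bash.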

The expansion does not occur before proc is called; it happens inside proc, when the unquoted $1 is passed to echo.
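
One way to check this, as a rough sketch: the probe function below is a hypothetical stand-in for proc that reports how many arguments it received and what the first one contains. The scratch directory and file names are assumptions made up for the demonstration.

```shell
# Scratch directory with known files (names are made up for this demo).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch file1.txt file2.txt

# Hypothetical stand-in for proc: print the argument count and the first
# argument, quoted so that nothing inside it is expanded any further.
probe() { printf '%s|%s\n' "$#" "$1"; }

line='8 678 678 * 67 867 8 678'
received=$(probe "$line")
echo "$received"    # a single argument, with the asterisk still literal

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

Because the call site quotes "$line", the function gets the whole line as one argument; any globbing can only come from how $1 is used afterwards.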

Solution: Add double quotes:

echo "$1"
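
A minimal side-by-side sketch of the two forms, assuming bash with default options. The function names proc_unquoted and proc_quoted and the files in the scratch directory are made up for illustration; only the quoting differs between them.

```shell
# Scratch directory with known files (names are made up for this demo).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch data.csv file1.txt

proc_unquoted() { echo $1; }      # $1 is split into words, then globbed
proc_quoted()   { echo "$1"; }    # $1 is passed to echo unchanged

line='8 678 * 67'
unquoted=$(proc_unquoted "$line")
quoted=$(proc_quoted "$line")

printf '%s\n' "$unquoted" "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

With the quotes in place, no characters in the data need to be replaced. If lines might start with something echo could interpret as an option, such as -n, printf '%s\n' "$1" is a more predictable choice.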
choroba