
I need to write a script that checks more than 20k files against more than 2k search patterns, and it needs to be flexible, so I came up with this script:

#!/bin/bash
# This script checks all files in a given directory against a list of criteria 

shopt -s expand_aliases
source ~/.bashrc

TIMESTAMP=$(date "+%Y-%m-%d-%T")
ROOT_DIR=/data
PROJECT_NAME=$1
FILE_DIR=$ROOT_DIR/projects/$1/$2
RESULT_DIR=$ROOT_DIR/projects/$1/check_result
SEARCHTEXT_FILE=$ROOT_DIR/scripts/$3

OIFS="$IFS"
IFS=$'\n'
files=$(find $FILE_DIR -type f -name '*.json')
for file in $files; do

  while read line; do
         grep -H -o $line "$file" >> $RESULT_DIR/check_result_$TIMESTAMP.log
  done < $SEARCHTEXT_FILE

done
IFS="$OIFS"

This script produces only an empty log file, $RESULT_DIR/check_result_$TIMESTAMP.log, with the correct name.
Because the file names sometimes contain spaces, I added the IFS... statements and enclosed $file in double quotes (copied from another post). The content of $SEARCHTEXT_FILE is, for example:

'Tel alt........'
'City ..........'

If I place an echo before the grep like this

echo grep -H -o $line "$file"

then the output I get is

grep -H -o 'Tel alt........' /data/projects/DNAR/input/report-157538.json

and I can execute this line as is and get the correct result.

I tried various combinations of ", ', `, (), and {} around different parts of this grep command, but nothing changed. I also read somewhere that aliases might matter; the alias set for grep is

alias grep='grep --color=auto'

After many hours of searching on the internet, I couldn't find any post that helped me, as most of them cover wrong quoting or inline bash issues. What am I missing here?

Katie
  • I'm speculating that your input file contains DOS line feeds, but hard to tell without further diagnostics. See also https://stackoverflow.com/questions/39527571/are-shell-scripts-sensitive-to-encoding-and-line-endings – tripleee Dec 22 '20 at 09:08
  • I used dos2unix on all files and I also edited the files in vi but nothing changed. – Katie Dec 22 '20 at 10:25
  • There was a deleted comment which I think explained how your loop was running `grep` on the wrong things but I haven't studied the script in enough detail to tell whether that actually explains the lack of output. I was certainly also thinking that you are probably getting your arguments mixed up in that loop, but it's hard to tell when we can't see the contents of the file or the directory structure that it's supposed to operate on. – tripleee Dec 22 '20 at 10:27

1 Answer


The simple and obvious workaround is to remove all that complexity and simply use the features of the commands you are running anyway.

find "$FILE_DIR" -type f -name '*.json' \
  -exec grep -H -o -f "$SEARCHTEXT_FILE" {} + > "$RESULT_DIR/check_result_$TIMESTAMP.log"

Notice also the quoting fixes; see "When to wrap quotes around a shell variable". To avoid mishaps, you should also switch to lower case for your private variables (see "Correct Bash and shell script variable capitalization").
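
A related quoting point, offered as a possible explanation rather than a confirmed diagnosis: quotes are shell syntax and are removed only when the shell parses them on a command line. Quotes stored inside a data file stay part of the data. Assuming the search text file literally contains the single quotes shown in the question, grep would look for those quote characters in the JSON files too, which would also be consistent with the echoed command working when pasted into a terminal. The sketch below uses made-up throwaway files (the names and JSON content are hypothetical) to compare a quoted and an unquoted pattern file.

```shell
#!/bin/bash
# Hypothetical throwaway data, not the asker's real files.
tmp=$(mktemp -d)
printf '%s\n' '{"Tel alt": "12345678"}' > "$tmp/report.json"

# Pattern file whose line includes literal single quotes, as in the question.
printf '%s\n' "'Tel alt........'" > "$tmp/quoted.txt"
# The same pattern without the surrounding quotes.
printf '%s\n' 'Tel alt........' > "$tmp/plain.txt"

# grep -c prints the number of matching lines; "|| true" ignores the
# non-zero exit status grep returns when nothing matches.
quoted_hits=$(grep -c -f "$tmp/quoted.txt" "$tmp/report.json" || true)
plain_hits=$(grep -c -f "$tmp/plain.txt" "$tmp/report.json" || true)

echo "with literal quotes: $quoted_hits"
echo "without quotes:      $plain_hits"

rm -rf "$tmp"
```

If the quoted variant finds nothing while the unquoted one matches, removing the quotes from the search text file is worth trying.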

shopt -s expand_aliases and source ~/.bashrc merely look superfluous, but could contribute to whatever problem you are trying to troubleshoot; they should basically never be part of a script you plan to use in production.
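
Putting these suggestions together, a minimal sketch of the script might look like the following. The function name check_files, its parameters, and the demonstration data are all hypothetical; the original script's directory layout and timestamped log name are left out to keep the example short.

```shell
#!/bin/bash
# Sketch only: check_files and its parameter names are hypothetical.
# No "shopt -s expand_aliases" and no "source ~/.bashrc" are needed.

# check_files FILE_DIR SEARCHTEXT_FILE RESULT_FILE
# Searches every *.json file below FILE_DIR for the patterns listed one
# per line in SEARCHTEXT_FILE and writes the matches to RESULT_FILE.
check_files () {
    local file_dir="$1"
    local searchtext_file="$2"
    local result_file="$3"
    find "$file_dir" -type f -name '*.json' \
        -exec grep -H -o -f "$searchtext_file" {} + > "$result_file"
}

# Throwaway demonstration data; one file name contains a space.
demo=$(mktemp -d)
mkdir -p "$demo/input"
printf '%s\n' '{"Tel alt": "12345678"}' > "$demo/input/report 1.json"
printf '%s\n' '{"Name": "nobody"}' > "$demo/input/report2.json"
printf '%s\n' 'Tel alt........' > "$demo/searchtext.txt"

check_files "$demo/input" "$demo/searchtext.txt" "$demo/result.log"

# Keep the result in variables so it can be inspected after cleanup.
result=$(cat "$demo/result.log")
match_count=$(grep -c . "$demo/result.log" || true)
echo "$result"

rm -rf "$demo"
```

Because find passes the file names to grep directly, no IFS changes are needed for names that contain spaces.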

tripleee