For each subject I have a folder with two files (.json and .tsv) per task (gram, plaus, and sem), for a total of 6 files per subject. The two files in each .tsv/.json pair have the same name apart from the file extension. For example, one subject's folder might contain: xxx.tsv, xxx.json, yyy.tsv, yyy.json, zzz.tsv, zzz.json.

I want to look through each .json file, see whether it contains the string "Gram", "Plaus", or "Sem", and rename the corresponding .tsv file to contain _Gram, _Plaus, or _Sem before the file extension based on which is found. Right now, my code (after changing to my subject folder) looks like this:

find -type f -name "*_regressors.json" -print0 | while IFS= read -r -d '' filename
do
    if [[grep -q 'Sem' "$filename"]]; then
        sem_name="${filename%.*}" 
    mv ${sem_name}.tsv ${sem_name}_sem.tsv
    fi 
    
    if [[grep -q 'Plaus' "$filename"]]; then
    plaus_name="${filename%.*}"
    mv ${plaus_name}.tsv ${plaus_name}_plaus.tsv
    fi
    
    if [[grep -q 'Gram' "$filename"]]; then
        gram_name="${filename%.*}"
    mv ${gram_name}.tsv ${gram_name}_gram.tsv
    fi
done

I'm wondering if an awk command might work better? I'm new to scripting with bash and unix in general, so any ideas are much appreciated!

anamarz
  • I suggest removing all `[[` and `]]`. – Cyrus Jul 25 '22 at 19:15
  • This looks like a duplicate of ["Checking the success of a command in a bash `if [ .. ]` statement"](https://stackoverflow.com/questions/36371221/checking-the-success-of-a-command-in-a-bash-if-statement) and ["Bash conditional based on exit code of command"](https://stackoverflow.com/questions/49849957/bash-conditional-based-on-exit-code-of-command). Also, if you did need the `[[ ]]`, you'd need to put spaces around them. – Gordon Davisson Jul 25 '22 at 19:28
  • What if a file contains more than one of the three terms? – jhnc Jul 25 '22 at 23:54
  • Your code is broken if a file contains more than one of the terms (the next `if` will try to rename a file which no longer exists with the original name). What should happen in this scenario? – tripleee Jul 26 '22 at 06:06
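
Putting the comments above together, here is a minimal sketch of the per-file logic with the `[[ ]]` removed, so that `if` tests grep's exit status directly, and with `elif` so that at most one rename is attempted per file. The helper names `task_suffix` and `rename_tsv` are made up for illustration, and the precedence Sem, then Plaus, then Gram is an assumption, since the question does not say what should happen when a file contains more than one term.

```shell
# Hypothetical helper: print sem, plaus, or gram for the JSON file in $1,
# or return non-zero if none of the terms is present.
# Assumption: if several terms occur, Sem wins over Plaus, which wins over Gram.
task_suffix() {
    if grep -q 'Sem' "$1"; then
        echo sem
    elif grep -q 'Plaus' "$1"; then
        echo plaus
    elif grep -q 'Gram' "$1"; then
        echo gram
    else
        return 1
    fi
}

# Hypothetical helper: rename the .tsv that sits next to the JSON file in $1.
rename_tsv() {
    base=${1%.json}                          # path without the .json extension
    suffix=$(task_suffix "$1") || return 0   # no term found: leave the .tsv alone
    mv -- "$base.tsv" "${base}_$suffix.tsv"
}
```

With these defined, the body of the `while` loop in the question would reduce to `rename_tsv "$filename"`.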

2 Answers

It does make sense to use awk instead of grep in this case:

#!/bin/bash

find . -type f -name "*_regressors.json" -print0 |
while IFS= read -r -d '' filename
do
    prefix=${filename%.*}
    suffix=$(
        awk '
            match($0,/Sem|Plaus|Gram/) {
                print tolower(substr($0,RSTART,RLENGTH))
                exit
            }
        ' "$filename"
    )
    # Skip the rename when none of the terms was found (empty suffix).
    [[ -n $suffix ]] && mv -- "$prefix.tsv" "${prefix}_$suffix.tsv"
done

However, trying to match a literal string inside a JSON file without parsing it might yield unexpected results.
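
To illustrate that caveat, here is a small sketch using a made-up JSON file (the field names and contents are hypothetical, not taken from the question). An unanchored match also fires inside longer words in unrelated fields, whereas grep's `-w` option restricts it to whole words:

```shell
tmp=$(mktemp)

# Hypothetical content: "Grammar" appears in a description, but the task is Sem.
printf '%s\n' \
    '{' \
    '  "Description": "Grammar-free baseline",' \
    '  "TaskName": "Sem"' \
    '}' > "$tmp"

# Unanchored match: the first hit is the "Gram" inside "Grammar".
first_plain=$(grep -oE 'Sem|Plaus|Gram' "$tmp" | head -n 1)

# Whole-word match (-w): "Grammar" no longer counts, so the first hit is "Sem".
first_word=$(grep -owE 'Sem|Plaus|Gram' "$tmp" | head -n 1)

echo "plain: $first_plain, whole-word: $first_word"
# prints: plain: Gram, whole-word: Sem

rm -f "$tmp"
```

Whole-word matching only narrows the problem; a term that appears as a separate word in an unrelated field would still match, which is why a JSON-aware tool is the more robust option.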

Fravadona

Would you please try the following:

#!/bin/bash

find . -type f -name "*_regressors.json" -print0 | while IFS= read -r -d '' f; do
    if str=$(grep -owE "Sem|Plaus|Gram" "$f"); then                 # search the json file; -o prints only the matched words
        str=$(head -n 1 <<< "$str" | tr '[:upper:]' '[:lower:]')    # pick the 1st match and lower the case
        base=${f%.json}                                             # remove the extension
        echo mv -- "${base}.tsv" "${base}_${str}.tsv"               # print the rename command (dry run)
    fi
done
  • The head command picks the first match in case there are several. (This may be overkill.)
  • If the printed commands look good, remove the echo before mv and run the script again.
tshiono