Renaming a .tsv file with bash if a .json file with the same name contains a certain string

Question

For each subject I have a folder with two files (.json and .tsv) per task (gram, plaus, and sem), for a total of 6 files per subject. Each pair of .tsv/.json files has the same name apart from the file extension. For example, one subject's folder might contain: xxx.tsv, xxx.json, yyy.tsv, yyy.json, zzz.tsv, zzz.json.

I want to look through each .json file, see whether it contains the string "Gram", "Plaus", or "Sem", and rename the corresponding .tsv file to contain _Gram, _Plaus, or _Sem before the file extension based on which is found. Right now, my code (after changing to my subject folder) looks like this:

find -type f -name "*_regressors.json" -print0 | while IFS= read -r -d '' filename
do
    if [[grep -q 'Sem' "$filename"]]; then
        sem_name="${filename%.*}" 
    mv ${sem_name}.tsv ${sem_name}_sem.tsv
    fi 
    
    if [[grep -q 'Plaus' "$filename"]]; then
    plaus_name="${filename%.*}"
    mv ${plaus_name}.tsv ${plaus_name}_plaus.tsv
    fi
    
    if [[grep -q 'Gram' "$filename"]]; then
        gram_name="${filename%.*}"
    mv ${gram_name}.tsv ${gram_name}_gram.tsv
    fi
done

I'm wondering if an awk command might work better? I'm new to scripting with bash and unix in general, so any ideas are much appreciated!

This looks like a duplicate of ["Checking the success of a command in a bash `if [ .. ]` statement"](https://stackoverflow.com/questions/36371221/checking-the-success-of-a-command-in-a-bash-if-statement) and ["Bash conditional based on exit code of command"](https://stackoverflow.com/questions/49849957/bash-conditional-based-on-exit-code-of-command). Also, if you did need the `[[ ]]`, you'd need to put spaces around them. — Gordon Davisson, Jul 25 '22 at 19:28
Your code is broken if a file contains more than one of the terms (the next `if` will try to rename a file which no longer exists with the original name). What should happen in this scenario? — tripleee, Jul 26 '22 at 06:06
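As a minimal sketch of what the two comments above describe: `if` can test the exit status of `grep -q` directly, with no `[[ ]]` around it, and an `if`/`elif` chain guarantees at most one rename per file. The function names `task_suffix` and `rename_tsv_files` are made up for illustration, and the priority order Sem, Plaus, Gram is an assumption, since the question does not say which term should win when several are present.

```shell
#!/bin/bash

# Hypothetical helper: print the suffix for the first term found in a file.
# The Sem > Plaus > Gram priority is an assumption, not from the question.
task_suffix() {
    local file=$1
    if grep -q 'Sem' "$file"; then
        echo sem
    elif grep -q 'Plaus' "$file"; then
        echo plaus
    elif grep -q 'Gram' "$file"; then
        echo gram
    else
        return 1    # none of the terms was found
    fi
}

# Hypothetical wrapper: rename the matching .tsv for every *_regressors.json
# found below the given directory.
rename_tsv_files() {
    local dir=$1
    find "$dir" -type f -name "*_regressors.json" -print0 |
    while IFS= read -r -d '' filename
    do
        local base=${filename%.json}
        # Rename only when a term was found and the .tsv actually exists.
        if suffix=$(task_suffix "$filename") && [[ -f $base.tsv ]]; then
            mv -- "$base.tsv" "${base}_$suffix.tsv"
        fi
    done
}

rename_tsv_files .
```

Because each file goes through a single `if`/`elif` chain, a .json file that mentions two of the terms is renamed once, using whichever term comes first in the chain.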

Fravadona · Answer 1 · 2022-07-26T08:53:07.400

It does make sense to use awk instead of grep in this case:

#!/bin/bash

find . -type f -name "*_regressors.json" -print0 |
while IFS= read -r -d '' filename
do
    prefix=${filename%.*}
    suffix=$(
        awk '
            match($0,/Sem|Plaus|Gram/) {
                print tolower(substr($0,RSTART,RLENGTH))
                exit
            }
        ' "$filename"
    )
    # rename only if one of the terms was found
    [[ -n $suffix ]] && mv -- "$prefix.tsv" "${prefix}_$suffix.tsv"
done

However, matching a literal string inside a JSON file without parsing it might yield unexpected results, for example when the term also appears in an unrelated key or value.

tshiono · Answer 2 · 2022-07-26T07:37:33.157

Would you please try the following:

#!/bin/bash

find . -type f -name "*_regressors.json" -print0 | while IFS= read -r -d '' f; do
    if str=$(grep -owE "Sem|Plaus|Gram" "$f"); then                 # search the json file; -o prints only the matched word
        str=$(head -n 1 <<< "$str" | tr '[:upper:]' '[:lower:]')    # pick the 1st match and lower the case
        base=${f%.json}                                             # remove the extension
        echo mv -- "${base}.tsv" "${base}_${str}.tsv"               # rename the file (dry run)
    fi
done

The head command picks the first match in case there are several. (This may be overkill.)
If the printed commands look good, remove the echo before mv and run the script again.
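One detail worth checking in the dry run: without `-o`, `grep` prints the whole matching line rather than just the matched word, so the new file name could end up containing JSON punctuation. Here is a small sketch of the difference, using a made-up JSON line (the `TaskName` key is an assumption, not taken from the question):

```shell
#!/bin/bash

line='  "TaskName": "Plaus",'    # hypothetical JSON content

# Without -o: the entire matching line is printed.
whole=$(grep -wE 'Sem|Plaus|Gram' <<< "$line")

# With -o: only the matched word is printed; head keeps the first match.
word=$(grep -owE 'Sem|Plaus|Gram' <<< "$line" | head -n 1 | tr '[:upper:]' '[:lower:]')

printf 'whole line: %s\n' "$whole"
printf 'match only: %s\n' "$word"
```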
