0

I have a set of files that follow the form L2_error_with_G_at_#_#.csv, where each # can be any number between 1 and 16. I want to rename the files using bash and sed so that any single-digit number in the filename is padded with a single leading 0; e.g. the filenames L2_error_with_G_at_2_9.csv and L2_error_with_G_at_1_16.csv would become L2_error_with_G_at_02_09.csv and L2_error_with_G_at_01_16.csv. I've seen solutions to this sort of problem that use the rename utility from Perl. I don't have access to it, as I'm working on a system where I don't have installation privileges. Ideally the solution will use only the more basic features of bash, which is why I suggested a solution using sed.

EDIT: I do not need to use sed, I can use anything so long as it's a standard part of bash

Here's what I've tried

for file in L2_error_with_G_at_*.csv
do
    new=$(echo "$file" | sed 's/_\([0-9]\)/_0/g; s/_0\([0-9][0-9]\)/_\1/g;');
    mv "$file" "$new";
done;

David G.
  • please update the question with your (coding) attempts to solve the problem – markp-fuso Mar 20 '23 at 13:35
  • will all filenames (sans ext) consist of 7 `_`-delimited fields, with only the last 2 fields being numeric? – markp-fuso Mar 20 '23 at 13:35
  • `sed` seems like a poor choice for something like this vs bash builtins or `awk`. – Ed Morton Mar 20 '23 at 13:40
  • I don't necessarily need to use `sed`, I just need to use some of the more basic functionality of bash. If `awk` is builtin then it should work fine with me – David G. Mar 20 '23 at 14:13
  • @markp-fuso I count 6, but yes the number of underscores, `_`, is constant across all file names – David G. Mar 20 '23 at 14:16
  • `awk` is, like `sed`, an external command and not built into the shell. To see the builtins of bash, look at the _bash_ man page and search for the section named _SHELL BUILTIN COMMANDS_. – user1934428 Mar 20 '23 at 15:02
  • @DavidG. I think you're confusing builtins (part of the shell) with mandatory POSIX commands (e.g. `awk` and `sed` but not `perl`) that are required to be present on all POSIX-compliant Unix boxes – Ed Morton Mar 20 '23 at 16:57
  • @EdMorton-SOstopbullying I do apologize for being a bit loose with the terminology. I want solutions that use functionality typically available on more bare-bones Linux systems. I'm working on a cluster with minimal packages – David G. Mar 21 '23 at 17:56

5 Answers

4

Assumptions:

  • all filenames have the same format: a_b_c_d_e_n1_n2.ext where ...
  • a..e and ext do not contain white space
  • n1/n2 are numbers

One bash idea:

for fname in L2_error_with_G_at_2_9.csv L2_error_with_G_at_1_16.csv
do
    printf -v new "%s_%s_%s_%s_%s_%02d_%02d.%s" ${fname//[._]/ }
    echo mv "${fname}" "${new}"
done

This generates:

mv L2_error_with_G_at_2_9.csv L2_error_with_G_at_02_09.csv
mv L2_error_with_G_at_1_16.csv L2_error_with_G_at_01_16.csv

Once the output is verified the echo can be removed from the code so that on the next run the mv occurs.


If the list of filenames is in a file:

while read -r fname
do
    printf -v new "%s_%s_%s_%s_%s_%02d_%02d.%s" ${fname//[._]/ }
    echo mv "${fname}" "${new}"
done < file.list

If the list of filenames is coming from a command (eg, find or ls):

while read -r fname
do
    printf -v new "%s_%s_%s_%s_%s_%02d_%02d.%s" ${fname//[._]/ }
    echo mv "${fname}" "${new}"
done < <(command that generates list of filenames)
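
The printf call above depends on the unquoted expansion ${fname//[._]/ } being split into eight words. As a hedged alternative sketch, the same split can be done with read -a and a custom IFS, which keeps every expansion quoted. The function name pad_fields and the eight-field check are illustrative assumptions, not part of the code above, and the numeric fields are assumed to carry no leading zero, as in the question's samples.

```shell
#!/usr/bin/env bash

# pad_fields NAME: print NAME with its two numeric fields zero-padded.
# Hypothetical helper; assumes the a_b_c_d_e_n1_n2.ext layout described above
# and numeric fields without a leading zero (e.g. 2, 9, 16).
pad_fields() {
    local fname=$1 new
    local -a parts
    # Split on "." and "_" without relying on unquoted word splitting.
    IFS='._' read -r -a parts <<< "$fname"
    # Expect exactly 8 pieces: five words, two numbers, one extension.
    (( ${#parts[@]} == 8 )) || return 1
    printf -v new '%s_%s_%s_%s_%s_%02d_%02d.%s' "${parts[@]}"
    printf '%s\n' "$new"
}

pad_fields L2_error_with_G_at_2_9.csv    # prints L2_error_with_G_at_02_09.csv
```

Because nothing is expanded unquoted, glob characters in a filename are not expanded by the shell; names with a different number of fields are rejected with a non-zero status rather than being mangled.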
markp-fuso
  • Using `${fname//[._]/ }` unquoted exposes you to all the usual issues with unquoted variables, see https://mywiki.wooledge.org/Quotes, so YMMV depending on the original file names, etc. – Ed Morton Mar 21 '23 at 12:50
  • @EdMorton-SOstopbullying in this case wrapping in double quotes means a single string being fed to `printf` (ie, the entire string goes into the first `%s`) ... try it; this also goes back to one of the assumptions ... *does not contain white space* (which could probably be expanded to include all the other *issues with unquoted variables*); could the answer be expanded (or replaced) ... sure, but at this point it would be overkill since these types of issues don't exist in OP's samples; if OP runs into issues then we'll need to see a more representative example of their filenames – markp-fuso Mar 21 '23 at 13:37
  • Yes, I know you can't add double quotes as-is. Maybe just change `does not contain white space` to `does not contain anything the shell might interpret/expand in an unquoted variable, see https://mywiki.wooledge.org/Quotes` or similar? I commented mostly for the benefit of the next person with a similar issue who reads this and may not have exactly the same format of file names as the OP does. – Ed Morton Mar 21 '23 at 13:41
2
$ cat tst.sh
#!/usr/bin/env bash

olds=( L2_error_with_G_at_2_9.csv L2_error_with_G_at_1_16.csv )

for old in "${olds[@]}"; do
    if [[ "$old" =~ ^(.*_)([0-9]+)(_)([0-9]+)([^_]*)$ ]]; then
        printf -v new '%s%02d%s%02d%s' "${BASH_REMATCH[@]:1:5}"
        echo mv "$old" "$new"
    fi
done

$ ./tst.sh
mv L2_error_with_G_at_2_9.csv L2_error_with_G_at_02_09.csv
mv L2_error_with_G_at_1_16.csv L2_error_with_G_at_01_16.csv

Populate olds as you see fit, or change the loop to for old in L2_error_with_G_at_*.csv (or whatever you need) to get the list of old file names to iterate over. Remove the echo once you're happy that it will do what you want.
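
As a sketch of the glob-driven loop suggested above, the regex can be applied to every matching file in a directory while only printing the planned mv commands. The function name plan_renames and its directory argument are assumptions made for illustration. The 10# prefix is also an addition: it forces base-10 arithmetic, because printf's %d rejects already-padded values such as 08 as invalid octal if the script is run a second time.

```shell
#!/usr/bin/env bash

# plan_renames DIR: print one "mv OLD NEW" line per file in DIR that still
# needs padding. Hypothetical dry-run helper; nothing is actually renamed.
plan_renames() {
    local dir=$1 path old new
    for path in "$dir"/L2_error_with_G_at_*.csv; do
        [[ -e $path ]] || continue      # the glob matched nothing
        old=${path##*/}                 # strip the directory part
        if [[ $old =~ ^(.*_)([0-9]+)(_)([0-9]+)([^_]*)$ ]]; then
            # 10# makes bash read 08 and 09 as decimal rather than octal.
            printf -v new '%s%02d%s%02d%s' \
                "${BASH_REMATCH[1]}" "$((10#${BASH_REMATCH[2]}))" \
                "${BASH_REMATCH[3]}" "$((10#${BASH_REMATCH[4]}))" \
                "${BASH_REMATCH[5]}"
            # Skip names that are already in the padded form.
            [[ $new == "$old" ]] || printf 'mv %s %s\n' "$old" "$new"
        fi
    done
}
```

One way to try it safely is to point it at a scratch directory created with mktemp -d and populated with touch, then compare the printed lines with what you expect before replacing the final printf with a real mv.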

Ed Morton
1

Try this Shellcheck-clean pure Bash (except for the mv) code:

#! /bin/bash -p

shopt -s nullglob

filename_rx='^(.+)_([[:digit:]]{1,2})_([[:digit:]]{1,2})[.]csv$'

for file in L2_error_with_G_at_*_*.csv; do
    if [[ $file =~ $filename_rx ]]; then
        printf -v newfile '%s_%02d_%02d.csv' "${BASH_REMATCH[@]:1:3}"

        [[ $newfile == "$file" ]] || echo mv -v -- "$file" "$newfile"
    fi
done
  • shopt -s nullglob makes globs expand to nothing when nothing matches (otherwise they expand to the glob pattern itself, which is almost never useful in programs). That means the loop below will do nothing if nothing matches the L2_..._*_*.csv pattern.
  • See mklement0's excellent answer to How do I use a regex in a shell script? for an explanation of [[ ... =~ ... ]].
  • Remove the echo from echo mv -v ... if you are happy that the code will do what you want.
  • The code in the loop should work for filenames containing any characters (including spaces, newlines, or glob characters).
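
The effect of nullglob described in the first bullet can be checked in isolation. The sketch below counts loop iterations over the same pattern with the option on and off. count_glob_matches is a hypothetical helper, and the subshell is there only so that the shopt change and the cd do not leak into the caller.

```shell
#!/usr/bin/env bash

# count_glob_matches DIR on|off: print how many times a for loop over the
# pattern iterates in DIR, with nullglob switched on or off.
# Hypothetical helper for illustration only.
count_glob_matches() {
    local dir=$1 mode=$2
    (
        cd -- "$dir" || exit 1
        if [[ $mode == on ]]; then shopt -s nullglob; else shopt -u nullglob; fi
        n=0
        for f in L2_error_with_G_at_*_*.csv; do
            n=$((n + 1))
        done
        printf '%s\n' "$n"
    )
}
```

In an empty directory this prints 1 with nullglob off, because the loop runs once with the literal pattern as the "filename", and 0 with nullglob on.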
pjh
1

Using parameter expansion (substring removal) and printf. This requires at_ to be part of the filename.

Only these 2 constructs are used:

  • ${var%pattern*} removes everything from the last occurrence of "pattern" to the end of the string
  • ${var#*pattern} removes everything from the start of the string through the first occurrence of "pattern"
for i in L2_error_with_G_at_2_9.csv L2_error_with_G_at_1_16.csv
do 
   base="${i%at_*}at_"
   num="${i#*at_}"
   num_2="${num#*_}"
   echo "$i" "${base}$(printf %02d "${num%_*}")_$(printf %02d "${num_2%.*}").csv"
done
L2_error_with_G_at_2_9.csv L2_error_with_G_at_02_09.csv
L2_error_with_G_at_1_16.csv L2_error_with_G_at_01_16.csv
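
To make the two constructs concrete, the sketch below prints the intermediate pieces for one of the sample names. split_name is a hypothetical helper introduced only for illustration; it mirrors the expansions in the loop above but stops before the printf padding step.

```shell
#!/usr/bin/env bash

# split_name NAME: print "prefix|first|second" for a name of the form
# ..._at_N_M.ext. Hypothetical helper; assumes "at_" occurs in NAME.
split_name() {
    local name=$1
    local base="${name%at_*}at_"    # drop "at_..." from the end, then re-add "at_"
    local rest="${name#*at_}"       # drop everything through "at_"  -> N_M.ext
    local first="${rest%_*}"        # drop "_M.ext" from the end     -> N
    local second="${rest#*_}"       # drop "N_" from the front       -> M.ext
    second="${second%.*}"           # drop the extension             -> M
    printf '%s|%s|%s\n' "$base" "$first" "$second"
}

split_name L2_error_with_G_at_2_9.csv    # prints L2_error_with_G_at_|2|9
```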
Andre Wildberg
1

Using GNU AWK

awk -F'[._]'  -v OFS='_' '
    {
        srcfile = $0
        for (i=NF-1; i>=NF-2; i--){    
             $i = sprintf("%02d", $i)
        }
        trgfile = gensub(/^(.*)_([^_]*)$/,"\\1.\\2",1)
        if(srcfile != trgfile)
            print |"mv \"" srcfile "\" \"" trgfile "\""
    }
' <(ls L2_error_with_G_at_*.csv)
ufopilot
  • `print |"mv " srcfile " " trgfile` should be `print |"mv \047" srcfile "\047 \047" trgfile "\047"` or similar so you're not passing those strings to the shell unquoted. – Ed Morton Mar 21 '23 at 13:45