As an example, let's say I have a folder containing these folders:

Universal 2023 02 15 Some Name
Universal 2023 02 15 Some Name and Words After
Sony Some Name 2023 02 15
Sony Some Name 2023 02 15 and Words After

Desired output

Some Name - 2023 02 15 - Universal
Some Name - 2023 02 15 - And Words After - Universal
Some Name - 2023 02 15 - Sony
Some Name - 2023 02 15 - and Words After - Sony

I wrote a command for every name structure.

1. « Universal 2023 02 15 Some Name » will be renamed « Some Name - 2023 02 15 - Universal » with this command:

rename -v 's/([\s\S]+)\s((\d{4})\s(\d{2})\s(\d{2}))\s(([\s\S]+)\s([\s\S]+))/$6 - $2 - $1/g' *

2. « Universal 2023 02 15 Some Name and Words After » will be renamed « Some Name - 2023 02 15 - And Words After - Universal » with this command:

rename -v 's/([\s\S]+)\s((\d{4})\s(\d{2})(\s)(\d{2}))\s((\w+)\s(\w+))\s([\s\S]+)/$7 - $2 - $10 - $1/g' *

3. « Sony Some Name 2023 02 15 » will be renamed « Some Name - 2023 02 15 - Sony » with this command:

rename -v 's/([\s\S]+)\s((\w+)\s(\w+))\s((\d{4})\s(\d{2})\s(\d{2}))/$2 - $5 - $1/g' *

4. Finally, « Sony Some Name 2023 02 15 and Words After » will be renamed « Some Name - 2023 02 15 - and Words After - Sony » with this command:

rename -v 's/([\s\S]+)\s((\d{4})\s(\d{2})\s(\d{2}))\s(([\w]+)\s([\w]+))\s([\s\S]+)/$6 - $2 - $9 - $1/g' *

When I want to rename these folders, I have to move them into separate folders, run the corresponding command in each one, then move them all back into the same folder when I'm done. This is very annoying, so I thought of writing a Bash script that does everything in the main folder without sorting the folders first. In VS Code, everything seems to work fine except for the renaming commands. This line is colored orange, which I assume means that something is missing, but I don't know what it is:

's/([\s\S]+)\s((\d{4})\s(\d{2})\s(\d{2}))\s(([\w]+)\s([\w]+))\s([\s\S]+)/$6 - $2 - $9 - $1/g'

See this link to view the script with VS Code's syntax colors: https://i.stack.imgur.com/tosSv.png

My script :

for i in $*/; do
        # for Universal 2023 02 15 Some Name
        if [[ "$i" =~ ([\s\S]+)\s((\d{4})\s(\d{2})\s(\d{2}))\s(([\s\S]+)\s([\s\S]+)) ]];
                then
                        rename -v 's/([\s\S]+)\s((\d{4})\s(\d{2})\s(\d{2}))\s(([\s\S]+)\s([\s\S]+))/$6 - $2 - $1/g' *
        
        # for Universal 2023 02 15 Some Name and Words After
        elif [[ "$i" =~ ([\s\S]+)\s((\d{4})\s(\d{2})(\s)(\d{2}))\s((\w+)\s(\w+))\s([\s\S]+) * ]];
                then
                        rename -v 's/([\s\S]+)\s((\d{4})\s(\d{2})(\s)(\d{2}))\s((\w+)\s(\w+))\s([\s\S]+)/$7 - $2 - $10 - $1/g' *

        # for Sony Some Name 2023 02 15
        elif [[ "$i" =~ ([\s\S]+)\s((\w+)\s(\w+))\s((\d{4})\s(\d{2})\s(\d{2})) ]];
                then
                        rename -v 's/([\s\S]+)\s((\w+)\s(\w+))\s((\d{4})\s(\d{2})\s(\d{2}))/$2 - $5 - $1/g' *
        
        # for Sony Some Name 2023 02 15 and Words After
        else [[ "$i" =~ ([\s\S]+)\s((\w+)\s(\w+))\s((\d{4})\s(\d{2})\s(\d{2})) ]];
                then
                        rename -v 's/([\s\S]+)\s((\d{4})\s(\d{2})\s(\d{2}))\s(([\w]+)\s([\w]+))\s([\s\S]+)/$6 - $2 - $9 - $1/g' *
        
        fi

done

The script as colored by VS Code. My commands are all orange...

Can anyone help me, please? Many thanks! Martin

  • `What is the best way...` is asking for an opinion-based answer which is one of the reasons listed for closing questions ("Opinion-based - This question is likely to be answered with opinions rather than facts and citations. It should be updated so it will lead to fact-based answers.") so you might want to rephrase this to ask a specific question about some part of your code. Also "it doesn't work" is the worst possible problem statement as it tells us nothing that we could use to help you debug the problem. It's like dropping your car off at the garage to fix and just saying "it doesn't work" – Ed Morton May 14 '23 at 16:04
  • How do you determine where to split `Some Name and Words After` into `Some Name` and `and Words After`? Will `Some Name` be 2-words long systematically? – Fravadona May 14 '23 at 16:37
  • Of course not. But I have other orders for that. Let's just say that if I can solve my problem with 2 words, I can move on with the rest... – Martin Julien May 14 '23 at 16:40
  • @Ed Morton I'm sorry for the mistake. This is the first time I've asked a question... In fact when I click on the image I attached, I notice that my rename commands are all colored orange. This indicates that something is not working. But I don't know if there is something to add to make the command run. Is this more accurate? – Martin Julien May 14 '23 at 17:06
  • @EdMorton I rephrased my initial question to point out the problem I think I have. – Martin Julien May 14 '23 at 17:39
  • Your Subject line still starts with `What is the best way...` so it's likely to get closed no matter the content. Aside from that - as the [tag:bash] tag you used instructs "For shell scripts with syntax or other errors, please check them at https://shellcheck.net before posting them here.". So as a starting point fix the issues that shellcheck tells you about then update your question with the fixed script if you still have issues. – Ed Morton May 14 '23 at 18:04
  • Great, now please do the other thing I suggested and run your code through http://shellcheck.net and fix the issues it tells you about. – Ed Morton May 14 '23 at 21:10

2 Answers

3

This script can rename all four directories (apart from the capitalization in And Words After):

rename -n 's/(\w+) (.* )?(\d{4} \d{2} \d{2})( \w+ \w+)?(.*)/
          ($2 ? $2 : (substr($4,1). " ")) .
          "- " .
          $3 .
          " -" .
          ($2 ? ($4 ? $4 : "") . $5 : $5) . ($5 ? " - " : " ") . $1
          /e' *

Remove -n once you are satisfied with the result.
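For readers who find the Perl expression above hard to follow, here is a rough Bash sketch of a similar idea: one regular expression with optional groups, followed by a small decision on where the name sits. It is not a translation of the one-liner. The function name reorder_name and the variable names are hypothetical, it assumes (like the question) that the company is one word and the name is exactly two words, and it only prints the new names without renaming anything.

```shell
#!/bin/bash

# Hypothetical helper: prints the reordered name for one directory name,
# or returns 1 if the name does not fit the assumed layouts.
reorder_name() {
    local date_rx='[0-9]{4} [0-9]{2} [0-9]{2}'
    # company, optional text before the date, the date, optional text after it
    local rx="^([^[:space:]]+) ((.+) )?(${date_rx})( (.+))?\$"
    local two_words_rx='^([^ ]+ [^ ]+)( (.+))?$'

    [[ $1 =~ $rx ]] || return 1
    local company="${BASH_REMATCH[1]}"
    local name="${BASH_REMATCH[3]}"    # empty when the date follows the company
    local date="${BASH_REMATCH[4]}"
    local tail="${BASH_REMATCH[6]}"    # everything after the date, may be empty
    local after=''

    if [[ -n $name ]]; then
        # "Sony Some Name 2023 02 15 [and Words After]"
        after=$tail
    elif [[ $tail =~ $two_words_rx ]]; then
        # "Universal 2023 02 15 Some Name [and Words After]"
        name="${BASH_REMATCH[1]}"
        after="${BASH_REMATCH[3]}"
    else
        return 1
    fi

    # ${after:+ - $after} adds the extra part only when it is non-empty.
    printf '%s - %s%s - %s\n' "$name" "$date" "${after:+ - $after}" "$company"
}

# Dry run on the sample names from the question (nothing is renamed).
samples=(
    'Universal 2023 02 15 Some Name'
    'Universal 2023 02 15 Some Name and Words After'
    'Sony Some Name 2023 02 15'
    'Sony Some Name 2023 02 15 and Words After'
)
for s in "${samples[@]}"; do
    reorder_name "$s"
done
```

Like the one-liner, this keeps the original capitalization of the words after the name.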

Philippe
  • I didn't know that rename could use conditionals and functions +1 – Fravadona May 15 '23 at 07:40
  • @Fravadona Note the `/e` at the end for Evaluate. – Philippe May 15 '23 at 08:19
  • @Philippe Your code renames the folders in the right way. Thanks a lot for this ;-) The only problem for me is that I don't understand anything ;-( I'm not good enough in bash yet. I learn it by myself and I put a lot of time into it. The structure of pjh is more understandable for me. That said, I find your two solutions really useful because I will study them and learn new things. Wonderful! – Martin Julien May 15 '23 at 13:06
  • @MartinJulien `rename` uses `perl` syntax. `rename` is more performant than the bash version, which spawns a `mv` process for every single directory. – Philippe May 15 '23 at 13:13
  • @pjh Sorry for my ignorance… but what does the -p option after the shebang do? – Martin Julien May 15 '23 at 13:47
  • @MartinJulien `-p` means reading startup files even if the effective user is different from real user. – Philippe May 15 '23 at 14:27
  • @Philippe, `-p` means *not* reading startup files, and *not* using some environment variables that can cause Bash programs to behave strangely. There's (now) a bit more information in my answer. – pjh May 15 '23 at 21:31
  • @pjh From `man bash` -> `If the shell is started with the effective user (group) id not equal to the real user (group) id, and the -p option is not supplied, no startup files are read` – Philippe May 15 '23 at 21:59
  • @Philippe, the description on the Bash manual page is wrong (or, at least, misleading). The description of the `-p` option in the [set builtin](https://www.gnu.org/software/bash/manual/bash.html#index-set) section of the [Bash Reference Manual](https://www.gnu.org/software/bash/manual/bash.html) is accurate. Note that it's not a Bash version issue. Bash "privileged mode" has always behaved the same way. – pjh May 15 '23 at 22:08
  • @pjh Curiously, there is no mention of `startup files` in the `-p` section of the link you posted. – Philippe May 15 '23 at 22:15
  • @Philippe, it's another inaccuracy in the man page. I guess that by "startup files" it means the files specified by the `BASH_ENV` and `ENV` environment variables, which are ignored in privileged mode. – pjh May 15 '23 at 22:24
2

Try this Shellcheck-clean code:

#! /bin/bash -p

sep_rx='[[:space:]]+'
part_rx='[^[:space:]]+'
company_rx=$part_rx
name_rx="${part_rx}${sep_rx}${part_rx}"
date_rx="[[:digit:]]{4}${sep_rx}[[:digit:]]{2}${sep_rx}[[:digit:]]{2}"
after_rx="${part_rx}(${sep_rx}${part_rx})*"

cdn_rx="^($company_rx)$sep_rx($date_rx)$sep_rx($name_rx)\$"
cdna_rx="^($company_rx)$sep_rx($date_rx)$sep_rx($name_rx)$sep_rx($after_rx)\$"
cnd_rx="^($company_rx)$sep_rx($name_rx)$sep_rx($date_rx)\$"
cnda_rx="^($company_rx)$sep_rx($name_rx)$sep_rx($date_rx)$sep_rx($after_rx)\$"

for d in */; do
    dir=${d%/}
    if [[ $dir =~ $cdn_rx ]]; then
        company=${BASH_REMATCH[1]}
        date=${BASH_REMATCH[2]}
        name=${BASH_REMATCH[3]}
        newdir="$name - $date - $company"
    elif [[ $dir =~ $cdna_rx ]]; then
        company=${BASH_REMATCH[1]}
        date=${BASH_REMATCH[2]}
        name=${BASH_REMATCH[3]}
        words_after=${BASH_REMATCH[4]}
        newdir="$name - $date - $words_after - $company"
    elif [[ $dir =~ $cnd_rx ]]; then
        company=${BASH_REMATCH[1]}
        name=${BASH_REMATCH[2]}
        date=${BASH_REMATCH[3]}
        newdir="$name - $date - $company"
    elif [[ $dir =~ $cnda_rx ]]; then
        company=${BASH_REMATCH[1]}
        name=${BASH_REMATCH[2]}
        date=${BASH_REMATCH[3]}
        words_after=${BASH_REMATCH[4]}
        newdir="$name - $date - $words_after - $company"
    else
        printf 'ERROR: Failed to match: %s\n' "$dir" >&2
        exit 1
    fi
    mv -v -- "$dir" "$newdir"
done
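
Before running a loop like this on real data, it may help to see what the `*/` pattern expands to in a throwaway directory. This is only a sketch: the scratch variable and the sample names are made up for the illustration, and it assumes mktemp is available.

```shell
#!/bin/bash

# Work in a temporary directory so nothing real is touched.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# Two visible directories, one hidden directory, and one regular file.
mkdir -- 'Universal 2023 02 15 Some Name' \
         'Sony Some Name 2023 02 15 and Words After' \
         '.hidden 2023 02 15 Some Name'
touch 'a regular file'

# */ matches directories only, keeps a trailing slash on each name,
# and skips names that begin with a dot.
dirs=( */ )
printf '%s\n' "${dirs[@]}"

# Clean up the scratch directory.
cd / && rm -rf -- "$scratch"
```

Only the two visible directories are listed, each with a trailing slash, which is why the loop strips the slash before matching.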
  • The long, and duplicated, regular expressions in the original code are very difficult to read, so I've tried to break them down into named parts.
  • Regular expression extensions such as \s, \S, and \d don't work consistently with =~ in Bash, so I've used portable character classes instead (e.g. [^[:space:]] for \S).
  • See mklement0's excellent answer to How do I use a regex in a shell script? to learn more about using regular expressions in Bash code.
  • See Bash Pitfalls #35 (if [[ $foo =~ 'some RE' ]]) for an explanation of why I put all the regular expressions in variables.
  • The rename utility isn't available on all systems, and there are at least two very different versions of it in circulation, so I've used the standard mv utility instead. See Why is the rename utility on Debian/Ubuntu different than the one on other distributions, like CentOS?.
  • The code works on the given examples, but it may well fail on other directory names. You'll need to check the regular expressions and modify them as necessary.
  • The -p in the #! /bin/bash -p shebang prevents Bash from reading configuration files and environment variables that could change how it behaves (e.g. by defining functions that override standard utilities, or by defining environment variables that make standard utilities behave in non-standard ways). It makes Bash programs more reliable, and reduces the "it works on my machine" effect. It may also avoid some security issues (see Shell Script Security - Apple Developer).
  • The parentheses in regular expression strings like "^($company_rx)$sep_rx($date_rx)$sep_rx($name_rx)\$" delimit "capture groups". Matches for the parts of a regular expression between parentheses are copied into the BASH_REMATCH array. For instance, the second set of parentheses in the given string surrounds the date pattern, so the matched date is copied into index 2 of BASH_REMATCH (${BASH_REMATCH[2]}). The \$ at the end of the string produces a literal dollar character in the regular expression, where it is the metacharacter that matches the end of the string being matched. The backslash is needed to prevent the dollar from causing an expansion (which it normally does within double quotes). See POSIX Extended Regular Expressions for a full description of the regular expressions that Bash uses. (Some implementations, inconsistently, also support extensions like \s.)
  • The */ in for d in */ expands to the list of slash-terminated names (excluding names beginning with the dot character (.)) of directories under the current directory. See glob - Greg's Wiki.
  • dir=${d%/} causes dir to get the value of d with a trailing slash removed. See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)).
  • printf 'ERROR: Failed to match: %s\n' "$dir" prints the string ERROR: Failed to match: %s with a trailing newline and with %s replaced by the value of dir. It is a safer version of echo "ERROR: Failed to match: $dir", which doesn't work in general. See the accepted, and excellent, answer to Why is printf better than echo? for more information. See the POSIX printf page for detailed information about the printf utility.
  • The >&2 at the end of printf 'ERROR: ...' "$dir" >&2 causes the output to go to the "standard error" stream ("stderr") instead of the "standard output" stream ("stdout"). One practical consequence of this is that the error message will be visible even if the (standard) output of the program is redirected. It is normal to do that for error messages, and other diagnostic messages (warning, debugging, ...). See BashGuide/InputAndOutput - Greg's Wiki (wooledge.org).
  • The -- in mv -v -- "$dir" "$newdir" is to ensure that there will not be a problem if the code is ever used with names that begin with hyphen/dash (-), even if the code is copied into a different program. Without the -- leading hyphens would cause the directory names to be interpreted as strings of options to mv. See Bash Pitfalls #2 (cp $file $target) and Bash Pitfalls #3 (Filenames with leading dashes).
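
Some of the points above can be tried in isolation. The following is a small sketch with made-up sample values (sample, d, and odd are names chosen for this illustration); it does not touch any directory.

```shell
#!/bin/bash

# Portable character classes instead of \d and \s, with the regex kept in a
# variable so that it is used unquoted on the right-hand side of =~.
date_rx='([[:digit:]]{4})[[:space:]]+([[:digit:]]{2})[[:space:]]+([[:digit:]]{2})'
sample='Universal 2023 02 15 Some Name'

if [[ $sample =~ $date_rx ]]; then
    # Index 0 holds the whole match; 1, 2, 3 hold the parenthesised groups
    # in the order of their opening parentheses.
    whole=${BASH_REMATCH[0]}
    year=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
    day=${BASH_REMATCH[3]}
    printf 'match=%s year=%s month=%s day=%s\n' "$whole" "$year" "$month" "$day"
    # prints: match=2023 02 15 year=2023 month=02 day=15
fi

# ${d%/} removes one trailing slash, if there is one.
d='Sony Some Name 2023 02 15/'
dir=${d%/}
printf '%s\n' "$dir"    # prints: Sony Some Name 2023 02 15

# printf with %s prints any value verbatim. With Bash's builtin echo,
# echo "$odd" would print nothing here because -n is taken as an option.
odd='-n'
printf '%s\n' "$odd"    # prints: -n
```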
pjh
  • Thanks for the hints. Now I have a lot of research to do ;-) Your solutions are certainly better than mine, even though each command works if launched separately. Regards, Martin – Martin Julien May 14 '23 at 20:45
  • @pjh I tested your clean code and it worked perfectly! Now I'll do my studies line by line to understand it all. Amazing code!!! – Martin Julien May 14 '23 at 21:20
  • @pjh I studied all your code and I understand 99% of it. I still have a few questions to get the missing 1% ;-) Q01: what does the ``` -p ``` after the shebang do? Q02: for example, in the line ``` cdn_rx="^($company_rx)$sep_rx($date_rx)$sep_rx($name_rx)\$" ``` why is it important to put the variables between () and what exactly does the \$ do? I guess this indicates the end of the string? Q03: here ``` for d in */; do ``` I guess that the / after * specifies that I want to modify directories only? Q04: in this line ``` dir=${d%/} ``` I'm not sure of the meaning of ```%/ ``` – Martin Julien May 15 '23 at 16:26
  • @pjh I had to split the questions haha. Second part. Only 2 left... Q05: in these lines ``` printf 'ERROR: Failed to match: %s\n' "$dir" >&2 exit 1 ``` I'm not good enough to understand the meaning of ``` %s\n' "$dir" >&2 ```. Finally, at the end, you put ``` mv -v -- "$dir" "$newdir" ```. What is the use of ``` -- ``` ? Many thanks for your help, regards, Martin – Martin Julien May 15 '23 at 16:27
  • WOW!!! I can't thank you enough for the help you gave me. For taking the time to rethink the process to get the result I wanted. Thank you also for the time you spent explaining the parts I didn't understand. This was my first time asking a question and I learned more in three days than I have in the last month! Very much appreciated! I would love to be able to help someone else someday when I have increased my skill level. Sincerely, Martin – Martin Julien May 16 '23 at 15:27