114

Objective

Change these filenames:

  • F00001-0708-RG-biasliuyda
  • F00001-0708-CS-akgdlaul
  • F00001-0708-VF-hioulgigl

to these filenames:

  • F0001-0708-RG-biasliuyda
  • F0001-0708-CS-akgdlaul
  • F0001-0708-VF-hioulgigl

Shell Code

To test:

ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'

To perform:

ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh

My Question

I don't understand the sed code. I understand what the substitution command

$ sed 's/something/mv'

means. And I understand regular expressions somewhat. But I don't understand what's happening here:

\(.\).\(.*\)

or here:

& \1\2/

The former, to me, just looks like it means: "a single character, followed by a single character, followed by any length sequence of a single character"--but surely there's more to it than that. As far as the latter part:

& \1\2/

I have no idea.

SherylHohman
Daniel Underwood
  • See also: [Changing file extensions with sed](https://stackoverflow.com/questions/44620236/changing-file-extensions-with-sed/44620449#44620449) – agc Jun 19 '17 at 03:06

13 Answers

176

First, I should say that the easiest way to do this is to use the prename or rename commands.

On Ubuntu, OSX (Homebrew package rename, MacPorts package p5-file-rename), or other systems with perl rename (prename):

rename s/0000/000/ F0000*

or on systems with rename from util-linux-ng, such as RHEL:

rename 0000 000 F0000*

That's a lot more understandable than the equivalent sed command.

But as for understanding the sed command, the sed manpage is helpful. If you run man sed and search for & (using the / command to search), you'll find it's a special character in s/foo/bar/ replacements.

  s/regexp/replacement/
         Attempt  to match regexp against the pattern space.  If success‐
         ful,  replace  that  portion  matched  with  replacement.    The
         replacement may contain the special character & to refer to that
         portion of the pattern space  which  matched,  and  the  special
         escapes  \1  through  \9  to refer to the corresponding matching
         sub-expressions in the regexp.

Therefore, \(.\) matches the first character, which can be referenced by \1. Then the bare . matches the next character (in these filenames it is always a 0) without capturing it. Finally, \(.*\) matches the rest of the filename, which can be referenced by \2.

The replacement string puts it all together using & (the original filename) and \1\2 which is every part of the filename except the 2nd character, which was a 0.
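To see the pieces separately, here is a small sketch that feeds one of the question's filenames through sed three times, printing \1, \2, and the full replacement in turn. The variable names (name, first, rest, cmd) are just for illustration; nothing is renamed.

```shell
name='F00001-0708-RG-biasliuyda'

# \1 captures the first character; the bare . consumes the second; \2 captures the rest.
first=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/\1/')
rest=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/\2/')

# & stands for the entire matched text, which here is the whole filename.
cmd=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/mv & \1\2/')

printf 'first: %s\nrest:  %s\ncmd:   %s\n' "$first" "$rest" "$cmd"
```

The last line printed is the mv command that would be handed to sh in the original pipeline.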

This is a pretty cryptic way to do this, IMHO. If for some reason the rename command was not available and you wanted to use sed to do the rename (or perhaps you were doing something too complex for rename?), being more explicit in your regex would make it much more readable. Perhaps something like:

ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh

Being able to see what's actually changing in the s/search/replacement/ makes it much more readable. Also it won't keep sucking characters out of your filename if you accidentally run it twice or something.
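A quick way to check the "run it twice" point is to feed an already-renamed name to both expressions. This is only a sketch on a string, not on real files, and the variable names are made up for the example. (In the exact pipeline above, the F00001-0708-* glob would also stop matching after the first run.)

```shell
renamed='F0001-0708-RG-biasliuyda'   # a name that has already been fixed

# The positional pattern matches any name, so it would strip another character.
positional=$(printf '%s\n' "$renamed" | sed 's/\(.\).\(.*\)/mv & \1\2/')

# The explicit pattern no longer matches, so sed passes the line through unchanged.
explicit=$(printf '%s\n' "$renamed" | sed 's/F0000\(.*\)/mv & F000\1/')

# With -n and the p flag, sed prints only lines where the substitution happened.
guarded=$(printf '%s\n' "$renamed" | sed -n 's/F0000\(.*\)/mv & F000\1/p')

printf 'positional: %s\nexplicit:   %s\nguarded:    [%s]\n' "$positional" "$explicit" "$guarded"
```

Note that an unmatched line passed through unchanged would still reach sh as if it were a command, which is why the -n ... p variant is the more cautious form if you pipe to a shell at all.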

Jan
Edward Anderson
  • on my RHEL server, the rename syntax would be "rename 0000 000 F0000*" – David LeBauer Dec 02 '10 at 23:12
  • On my ArchLinux installation, `s///` should be `-s///`, or else it complains about "too many arguments". – Danilo Bargen Aug 24 '11 at 15:30
  • It is most likely that `rename` is itself a *"renamed"* link. ie `rename` has been *"renamed"* from `prename`.. eg, in Ubuntu: `readlink -f $(which rename)` outputs `/usr/bin/prename` ... The `rename` mentioned by *David* is a different program entirely. – Peter.O Apr 20 '12 at 11:19
  • Good point, Peter. I've updated the answer to address both rename utilities. – Edward Anderson Apr 23 '12 at 03:57
  • To debug this, remove the pipe into sh at the end. The commands will echo out to the screen. – Ben Mathews May 19 '14 at 15:09
  • Why the hell is this symlinked to rename in Ubuntu? I tried to use rename expecting the same syntax in Arch and got a rude surprise. This is something that can break shell scripts easily. – Braden Best Aug 13 '15 at 20:08
  • Are you sure it's a good advice to give to pipe random data through `sh`? this is potentially dangerous as arbitrary code can be executed (you're treating data as code). – gniourf_gniourf Dec 29 '16 at 21:36
  • @gniourf_gniourf That's a good thing to be wary of. I made the assumption here that the filenames are pretty basic and not malicious. Beware when running this without knowing that the filenames are safe, not containing characters that are meaningful to the shell. – Edward Anderson Dec 30 '16 at 19:25
  • The rename tool even allows for sub-expression matches using (man rename) the `-e` switch. This way you can even move a matched pattern around within the string easily. as in `rename -e 's/.CR2.~([1-9])~/-shot-on-different-sdcard\1.CR2/' IMG_*` – daniel.kahlenberg Feb 24 '17 at 15:46
  • On Ubuntu, `rename` prefers `$1` over `\1` for matched pattern references. – Kyle Jan 14 '20 at 13:34
  • on CentOS/RHEL, I believe if you want to replace a string with an empty string, `""` is needed. – HCSF Feb 07 '22 at 06:41
  • Thanks a ton. working on an old server with no rename and no install permission. saved me a headache. – goodguy5 Mar 22 '23 at 20:18
56

You've had your sed explanation; now you can use just the shell, with no need for external commands:

for file in F0000*
do
    echo mv "$file" "${file/#F0000/F000}"
    # ${file/#F0000/F000} replaces F0000 only where it matches at the beginning of the string (bash syntax).
    # The echo makes this a dry run; remove it to actually rename.
done
ghostdog74
46

I wrote a small post with examples on batch renaming using sed a couple of years ago:

http://www.guyrutenberg.com/2009/01/12/batch-renaming-using-sed/

For example:

for i in *; do
  mv "$i" "$(echo "$i" | sed "s/regex/replace_text/")"
done

If the regex contains groups (e.g. \(subregex\)), then you can use them in the replacement text as \1, \2, etc.
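As a concrete sketch of that loop, the following runs in a throwaway directory created with mktemp, so nothing real is touched. The pattern s/^F0000\(.*\)/F000\1/ and the file names are assumptions taken from the question, and the guard that skips unchanged names is my addition, since mv onto the same name is an error.

```shell
# Work in a throwaway directory so no real files are renamed.
old=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
touch 'F00001-0708-RG-biasliuyda' 'F00001-0708-CS-akgdlaul' 'unrelated.txt'

for i in *; do
  # Quote "$i" everywhere so names containing spaces stay intact.
  new=$(printf '%s\n' "$i" | sed 's/^F0000\(.*\)/F000\1/')
  # Skip names the pattern did not change.
  [ "$new" = "$i" ] || mv -- "$i" "$new"
done

listing=$(ls)
printf '%s\n' "$listing"

# Clean up the scratch directory.
cd "$old" && rm -rf "$dir"
```

Because mv is called directly with quoted arguments, no generated text is ever handed to a shell for interpretation.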

Guy
25

The easiest way would be:

for i in F00001*; do mv "$i" "${i/F00001/F0001}"; done

or, portably,

for i in F00001*; do mv "$i" "F0001${i#F00001}"; done

This replaces the F00001 prefix in the filenames with F0001. Credit to mahesh here: http://www.debian-administration.org/articles/150
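To see what the portable form computes without touching any files, here is a small sketch that only prints the old and new names. The second sample name, with spaces in it, is hypothetical and is there to show that quoting keeps it in one piece.

```shell
out=$(
  for i in 'F00001-0708-RG-biasliuyda' 'F00001 with spaces'; do
    # ${i#F00001} removes the leading F00001 from $i; F0001 is then put in front.
    printf '%s -> %s\n' "$i" "F0001${i#F00001}"
  done
)
printf '%s\n' "$out"
```

The # operator is plain POSIX parameter expansion, so this works in sh as well as bash, unlike the ${i/F00001/F0001} form.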

gniourf_gniourf
Mike
8

The sed command

s/\(.\).\(.*\)/mv & \1\2/

means to replace:

\(.\).\(.*\)

with:

mv & \1\2

just like a regular sed command. However, the parentheses, & and \N markers (\1, \2, and so on) change it a little.

The search string matches (and remembers as pattern 1) the single character at the start, followed by a single character, followed by the rest of the string (remembered as pattern 2).

In the replacement string, you can refer to these matched patterns to use them as part of the replacement. You can also refer to the whole matched portion as &.

So that sed command builds a mv command whose source is the original filename and whose destination is character 1 followed by characters 3 onward, effectively removing character 2. It will give you a series of lines in the following format:

mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
mv abcdef acdef

and so on.
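Those generated lines can be inspected before anything runs them. The sketch below wraps the sed expression in a helper I have called preview (a made-up name) that only prints. The second sample name, with a space in it, is hypothetical and shows why the output is not always safe to hand to a shell.

```shell
preview() {
  # Print the mv line sed would generate for each name; nothing is executed.
  printf '%s\n' "$@" | sed 's/\(.\).\(.*\)/mv & \1\2/'
}

plain=$(preview 'abcdef')
spaced=$(preview 'my file')

printf '%s\n%s\n' "$plain" "$spaced"
```

For the name with a space, a shell would split the generated line into four arguments after mv rather than two, so the rename would not do what was intended.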

paxdiablo
  • This was a good explanation, but it could be useful to point out how you use the sed command with other commands to actually rename the files. For example: `ls | sed "s/\(.\).\(.*\)/mv & \1\2/" | bash` – jcarballo Mar 04 '14 at 21:10
  • @jcarballo: it is dangerous to parse `ls`, pipe through `sed` and _then pipe through a shell!_ it's subject to arbitrary code execution with forged filenames. The problem is that data should be treated as data, and here it's typically serialized into code without any precautions whatsoever. I wish paxdiablo could delete this answer as it really doesn't show good practice. (I stumbled on this question because a beginner randomly piped `| sh` after a command that didn't work and after seeing this question and the answers thought it would work better—I'm horrified!) `:)`. – gniourf_gniourf Dec 29 '16 at 22:07
7

Using perl rename (a must have in the toolbox):

rename -n 's/0000/000/' F0000*

Remove the -n switch to rename for real once the output looks good.

Warning: there are other tools with the same name which may or may not be able to do this, so be careful.

The rename command that is part of the util-linux package won't.

If you run the following command (GNU)

$ rename

and you see perlexpr, then this seems to be the right tool.

If not, to make it the default (usually already the case) on Debian and derivatives like Ubuntu:

$ sudo apt install rename
$ sudo update-alternatives --set rename /usr/bin/file-rename

For archlinux:

pacman -S perl-rename

For RedHat-family distros:

yum install prename

The 'prename' package is in the EPEL repository.


For Gentoo:

emerge dev-perl/rename

For *BSD:

pkg install gprename

or p5-File-Rename


For Mac users:

brew install rename

If you don't have this command with another distro, search your package manager to install it or do it manually:

cpan -i File::Rename

Old standalone version can be found here


man rename


This tool was originally written by Larry Wall, the creator of Perl.

Gilles Quénot
3

The backslash-paren stuff means, "while matching the pattern, hold on to the stuff that matches in here." Later, on the replacement text side, you can get those remembered fragments back with "\1" (first parenthesized block), "\2" (second block), and so on.
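For instance, here is a minimal sketch with two remembered fragments that are put back in the opposite order. The input string is just the start of one of the question's filenames, and the variable name swapped is made up for the example.

```shell
# Two groups: everything before the first hyphen, and everything after it.
swapped=$(printf '%s\n' 'F00001-0708' | sed 's/\([^-]*\)-\(.*\)/\2-\1/')
printf '%s\n' "$swapped"
```

The hyphen between the groups is matched but not remembered, so it has to be written out again in the replacement.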

Pointy
2
for i in *; do mv $i $(echo $i|sed 's/AAA/BBB/'); done
Digvijay S
  • Welcome to SO. Please consider adding explanation of your code. It will help other users in understanding it. – Digvijay S Apr 30 '20 at 03:26
  • This answer is good but it's a near duplicate answer of a highly upvoted answer above. – Eric Leschinski May 01 '20 at 03:41
  • As written, this code may fail for filenames containing spaces. To fix the problem, one should add quotes `"` around the `$(echo ...)`. – Asker Mar 07 '23 at 06:39
1

Here's what I would do:

for file in *.[Jj][Pp][Gg] ;do 
    echo mv -vi \"$file\" `jhead "$file"|
                           grep Date|
                           cut -b 16-|
                           sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done

Then if that looks ok, add | sh to the end. So:

for file in *.[Jj][Pp][Gg] ;do 
    echo mv -vi \"$file\" `jhead "$file"|
                           grep Date|
                           cut -b 16-|
                           sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done | sh
agc
Chris Po
1

If all you're really doing is removing the second character, regardless of what it is, you can do this:

s/.//2

but your command is building a mv command and piping it to the shell for execution.

This is no more readable than your version:

find -type f | sed -n 'h;s/.//4;x;s/^/mv /;G;s/\n/ /g;p' | sh

The fourth character is removed because find is prepending each filename with "./".
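To make the hold-space steps easier to follow, here is a sketch that runs the same sed script on a single sample line instead of on find output, and does not pipe anything to sh. The sample path and the variable names are assumptions for illustration.

```shell
line='./F00001-0708-RG-biasliuyda'

# s/.//2 deletes the second character of a bare name.
short=$(printf '%s\n' 'F00001' | sed 's/.//2')

# s/.//4 deletes the fourth character, which is the first 0 after the "./" prefix.
step1=$(printf '%s\n' "$line" | sed 's/.//4')

# h   copy the original line to the hold space
# s   shorten the copy in the pattern space
# x   swap, so the original is in the pattern space and the new name is held
# s   prefix the original with "mv "
# G   append a newline plus the held new name
# s   turn that newline into a space; p prints the result
full=$(printf '%s\n' "$line" | sed -n 'h;s/.//4;x;s/^/mv /;G;s/\n/ /g;p')

printf '%s\n%s\n%s\n' "$short" "$step1" "$full"
```

Dropping the final | sh, as done here, is also a convenient way to review the generated commands first.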

Dennis Williamson
  • I wish you could delete this answer. While it was maybe good in the very specific case of the OP, there are lots of people seeing answers like this and don't understand it, and randomly pipe `| sh` after a command that doesn't work, in the hope that it'll work better. It's horrifying! (and besides, that's not good practice). I hope you'll understand! – gniourf_gniourf Dec 29 '16 at 21:48
0

The parentheses capture particular strings for use by the backslashed numbers.

Ewan Todd
0
 ls F00001-0708-*|sed 's|^F0000\(.*\)|mv & F000\1|' | bash
ghostdog74
  • Horrible! subject to arbitrary code execution (maybe not in the specific context of the question, but there are lots of people seeing answers like this and try to randomly type something that look like it, and it's scaring dangerous!). I wish you could delete this answer (besides, you have another good one on here, that I upvoted). – gniourf_gniourf Dec 29 '16 at 21:47
0

Some examples that work for me:

$ tree -L 1 -F .
.
├── A.Show.2020.1400MB.txt
└── Some Show S01E01 the Loreming.txt

0 directories, 2 files

## remove "1400MB" (I: ignore case) ...

$ for f in *; do mv 2>/dev/null -v "$f" "`echo $f | sed -r 's/.[0-9]{1,}mb//I'`"; done;
renamed 'A.Show.2020.1400MB.txt' -> 'A.Show.2020.txt'

## change "S01E01 the" to "S01E01 The"
## \U& : change (here: regex-selected) text to uppercase;
##       note also: no need here for `\1` in that regex expression

$ for f in *; do mv 2>/dev/null "$f" "`echo $f | sed -r "s/([0-9] [a-z])/\U&/"`"; done

$ tree -L 1 -F .
.
├── A.Show.2020.txt
└── Some Show S01E01 The Loreming.txt

0 directories, 2 files
$ 
Victoria Stuart