This has been a very eye-opening thread. I'm bringing to the table a solution to my own problem and hopefully clarifying a thing or two for you and other users looking for robustness (like I was).
In my case, my Mac had a bunch of duplicate photos. When a Mac makes duplicates, it appends a space and a number to the file name, just before the extension. So "IMG_0001.JPG" might be accompanied by "IMG_0001 2.JPG", "IMG_0001 3.JPG", and so on. In my case, this went on and on, making up about 2,600 useless files.
To get started, I navigated to the folder in question.
cd ~/Pictures/
Next, let's prove to ourselves that we can list all the files in the directory. Notice that the regex has to include the "." that says "look in this directory", because find matches the pattern against the whole path it prints, not just the file name. The ".+" is then needed to catch all the other characters.
find -E . -regex '\..+'
As expected, the results are the full strings that you'll have to match, including the "." I mentioned earlier, the slash "/", and everything else.
./IMG_1788.JPG
./IMG_1789.JPG
./IMG_1790.JPG
./IMG_1791.JPG
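To see where that leading "./" comes from, here is a minimal sketch. It assumes a throwaway directory made with mktemp and a hypothetical file name, so nothing in ~/Pictures is touched. The prefix is simply whatever starting point you hand to find.

```shell
# Scratch directory with one hypothetical file (not the real photo folder).
demo_dir=$(mktemp -d)
touch "$demo_dir/IMG_1788.JPG"

# Starting from "." gives paths that begin with "./".
from_dot=$(cd "$demo_dir" && find . -type f)

# Starting from a full path gives paths that begin with that path instead.
from_abs=$(find "$demo_dir" -type f)

echo "$from_dot"
echo "$from_abs"
```

This is why a pattern that hard-codes "\./" only fits searches started from ".", while one that begins with ".*/" keeps working if you start from some other directory.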
So I can't write this to find duplicates because it doesn't include the "./"
find -E . -regex 'IMG_[0-9]{4} .+'
but I can write this to find duplicates because it does include the "./"
find -E . -regex '\./IMG_[0-9]{4} .+'
The fancier version with ".*/", as mentioned by @jackjr300, does the same thing.
find -E . -regex '.*/IMG_[0-9]{4} .+'
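If you'd rather check this on throwaway files first, something like the following sketch should reproduce the difference. The sample file names and the find_ere helper are my own inventions for illustration. The helper is only there so the same test also runs with GNU find, which, as far as I know, wants -regextype posix-extended instead of -E.

```shell
# Scratch directory with hypothetical sample files: one original, two duplicates.
demo_dir=$(mktemp -d)
touch "$demo_dir/IMG_0001.JPG" "$demo_dir/IMG_0001 2.JPG" "$demo_dir/IMG_0001 3.JPG"

# find_ere is a hypothetical helper: BSD/macOS find takes -E,
# GNU find takes -regextype posix-extended (assumption for non-Mac readers).
if find -E . -maxdepth 0 >/dev/null 2>&1; then
  find_ere() { find -E . -regex "$1"; }
else
  find_ere() { find . -regextype posix-extended -regex "$1"; }
fi

# Count matches for each pattern, run from inside the scratch directory.
no_prefix=$(cd "$demo_dir" && find_ere 'IMG_[0-9]{4} .+' | wc -l | tr -d ' ')
dot_prefix=$(cd "$demo_dir" && find_ere '\./IMG_[0-9]{4} .+' | wc -l | tr -d ' ')
any_prefix=$(cd "$demo_dir" && find_ere '.*/IMG_[0-9]{4} .+' | wc -l | tr -d ' ')

echo "without ./ : $no_prefix"
echo "with \\./   : $dot_prefix"
echo "with .*/   : $any_prefix"
```

With these three files, the pattern without a prefix should find nothing, and both prefixed patterns should find the two duplicates while leaving IMG_0001.JPG alone.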
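Lastly, the confusing part. "\d" isn't recognized by BSD find; "[0-9]" works just as well. Other users' answers cited the re_format manual, which lists how to write common patterns that replace things like "\d" with a funny square-colon syntax that looks like this: "[:digit:]". I tried and tried, but it never worked on its own. The catch is that these classes only have their special meaning inside a bracket expression, so the working form is "[[:digit:]]"; a bare "[:digit:]" is read as an ordinary bracket expression listing the characters ":", "d", "i", "g", and "t". Just using "[0-9]" is simpler. I also wasted a bunch of time thinking I should have used "[:space:]" instead of a literal space, but I found (as usual!) that I just needed to breathe and really read the regex. It turned out to be my mistake. :)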
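For anyone who does want the character classes, here is a small sketch of the form that should work, with the class nested inside an outer pair of brackets. I'm using grep -E -x as a stand-in for find -E -regex, on the assumption that both use POSIX extended regexes and that -x forces a whole-line match the way find forces a whole-path match. The path is a made-up example.

```shell
# Hypothetical path, written the way find prints it.
path='./IMG_0001 2.JPG'

# Character classes nested inside a bracket expression: [[:digit:]], [[:space:]].
if printf '%s\n' "$path" | grep -Eqx '\./IMG_[[:digit:]]{4}[[:space:]].+'; then
  class_result="match"
else
  class_result="no match"
fi

# The plain version: [0-9] and a literal space.
if printf '%s\n' "$path" | grep -Eqx '\./IMG_[0-9]{4} .+'; then
  plain_result="match"
else
  plain_result="no match"
fi

echo "with classes: $class_result"
echo "plain       : $plain_result"
```

Both patterns should match the sample path, so the choice between them is mostly about readability.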
Hope this helps someone!