I'm trying to design a single regex that handles both of the following cases:

foobar_foobar_190412_foobar_foobar.jpg  =>  190412
foobar_20190311_2372_foobar.jpg         =>  20190311

The regex I came up with is close, but I can't figure out how to make it only output the first number:

.*_(\d+)_(\d*).*                        =>  $1

foobar_foobar_190412_foobar_foobar.jpg  =>  190412
foobar_20190311_2372_foobar.jpg         =>  (no match)

Anyone got an idea?

Wiktor Stribiżew

4 Answers

1

With the options -P (Perl-compatible regex) and -o (print only the matching part):

grep -Po '^\D+\K\d+' file.txt
190412
20190311

Explanation:

^           # beginning of line
  \D+       # 1 or more non-digits; use \D* for 0 or more non-digits
  \K        # forget all we have seen until this position
  \d+       # 1 or more digits
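
JavaScript has no `\K`, so outside PCRE the same idea can be sketched with a capture group instead. This is only an illustration, not part of the original answer, and the helper name `firstNumber` is hypothetical:

```javascript
// Hypothetical helper: returns the first run of digits that follows
// a non-digit prefix, or null when the line has no such run.
function firstNumber(line) {
  // ^\D+ consumes the leading non-digits; (\d+) captures what \K\d+ would print.
  const m = /^\D+(\d+)/.exec(line);
  return m === null ? null : m[1];
}

const sampleNames = [
  "foobar_foobar_190412_foobar_foobar.jpg",
  "foobar_20190311_2372_foobar.jpg",
];
for (const name of sampleNames) {
  console.log(firstNumber(name)); // 190412, then 20190311
}
```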

Edit, following a misunderstanding of the grep tag

You can do:

  • Find: ^\D+(\d+)_.*$
  • Replace: $1
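
A quick way to check a find/replace pair like this outside InDesign is `String.prototype.replace` in JavaScript, which also uses `$1` for the first group. The sketch below assumes `\D+` (one or more non-digits) in front of the group so that the whole prefix is consumed; the function name `extractFirstNumber` is hypothetical:

```javascript
// Find pattern: non-digit prefix, first number (group 1), underscore, rest of line.
const findPattern = /^\D+(\d+)_.*$/;

// Hypothetical helper: replaces the whole matched line with group 1.
// Strings that do not match are returned unchanged.
function extractFirstNumber(name) {
  return name.replace(findPattern, "$1");
}

console.log(extractFirstNumber("foobar_foobar_190412_foobar_foobar.jpg")); // 190412
console.log(extractFirstNumber("foobar_20190311_2372_foobar.jpg"));        // 20190311
```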
Toto
  • I'd probably use `\D*`. – melpomene Jun 04 '19 at 16:31
  • 1
    @melpomene: True, I've added `*` as an option. – Toto Jun 04 '19 at 16:33
  • I can't use this. I need a regex of the form find/replace. This is being used in the InDesign grep panel. – Patrick Hennessey Jun 04 '19 at 16:44
  • 2
    @PatrickHennessey: So why have you tagged your question with "grep"? Grep can find a pattern but can't do replacement. What language/tool are you using? Please edit your question and add an [MCVE](https://meta.stackoverflow.com/q/366988/372239) – Toto Jun 04 '19 at 16:49
0

If you care about matching the underscores, here is a sed version:

sed -E 's/[^0-9]*_([0-9]+)_.*/\1/' file
karakfa
0

This is what I was looking for:

find:    \D+_(\d+)_.*
replace: $1

I didn't know about the "non-digit" character!
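
To see why `\D+` makes the difference, here is a small comparison run in JavaScript rather than InDesign, so the engines may not behave identically (the question reports no match where JavaScript returns the second number). The variable names are made up for the illustration:

```javascript
const questionPattern = /.*_(\d+)_(\d*).*/; // pattern from the question
const answerPattern = /\D+_(\d+)_.*/;       // pattern from this answer

const fileName = "foobar_20190311_2372_foobar.jpg";

// The greedy .* first swallows the whole string, then backtracks only as far
// as the LAST "_digits_" run, so group 1 ends up being the second number.
const viaQuestion = fileName.replace(questionPattern, "$1");

// \D+ cannot cross a digit, so it has to stop in front of the FIRST number.
const viaAnswer = fileName.replace(answerPattern, "$1");

console.log(viaQuestion); // 2372
console.log(viaAnswer);   // 20190311
```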

-1

If we wish to capture the first number, we can likely use this simple expression:

_([0-9]+)?_

or

.+?_([0-9]+)?_.+

and replace it with $1.

Demo

This snippet just shows how the capturing group works:

const regex = /_([0-9]+)?_/gm;
const str = `foobar_foobar_190412_foobar_foobar.jpg
foobar_20190311_2372_foobar.jpg`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
Emma