I have a version regex that works in PCRE, but I am having trouble getting it to work with sed using match groups.

Regex:

((^[[:alnum:]]+.*)-(\d+\.\d+\.\d+-VERS|\d+\.\d+\.\d+))

Input:

aaa1-bbb2-ccc3-dddd4-ffff5-1.0.0-VERS
aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS
zzz1-bbb2-ccc3-1.0.1
zzz1-1.0.1-VERS

Expected output (split each string and separate out the version string):

group2="aaa1-bbb2-ccc3-dddd4-ffff5"
group3="1.0.0-VERS"
group2="aaa1-bbb2-ccc3-dddd4-ffff5"
group3="11.22.33-VERS"
group2="zzz1-bbb2-ccc3"
group3="1.0.1"
group2="zzz1"
group3="1.0.1-VERS"

With PCRE, the regex produces the output above as expected.

However, using the same regex with sed does not work. What am I missing?

echo "aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS" | sed -E 's#((^[[:alnum:]]+.*)-(\d+\.\d+\.\d+-VERS|\d+\.\d+\.\d+))#\3 \2#p'
askb
  • Regex syntax and features vary a lot between sed and PCRE (see also https://unix.stackexchange.com/questions/119905/why-does-my-regular-expression-work-in-x-but-not-in-y). For example, `\d` doesn't work in sed as you expect; you need `[0-9]`. If that alone solves your issue, please mark it as a duplicate of https://stackoverflow.com/questions/14671293/why-doesnt-d-work-in-regular-expressions-in-sed – Sundeep Jul 30 '18 at 01:30
  • Possible duplicate of [Why doesn't `\d` work in regular expressions in sed?](https://stackoverflow.com/questions/14671293/why-doesnt-d-work-in-regular-expressions-in-sed) – askb Jul 30 '18 at 01:53
  • Besides, `+` is a gnu extension. – revo Jul 30 '18 at 02:40
  • @revo No, `+` is an ERE metachar which can be enabled in multiple sed versions with `\+` or the `-E` option. – Ed Morton Jul 30 '18 at 04:17
  • @askb why is there no output for the input line `zzz1-bbb2-ccc3-1.0.1`? Please add it or explain why it's missing. – Ed Morton Jul 30 '18 at 04:18
  • @EdMorton Actually I was thinking about `\+` while typing that comment. `+` is part of POSIX standard. – revo Jul 30 '18 at 05:14
  • Right, `+` is part of the POSIX standard for EREs. See "BRE Special Characters" vs "ERE Special Characters" in the POSIX spec, http://pubs.opengroup.org/onlinepubs/9699919799/. Some sed versions (e.g. GNU and BSD/OSX) let you use EREs instead of BREs by adding the `-E` flag, and GNU sed also lets you use them by preceding the ERE metachars with a backslash. – Ed Morton Jul 30 '18 at 12:08
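
As a rough illustration of the BRE/ERE distinction discussed in the comments above, the sketch below runs the same substitution twice: once as an ERE with `-E`, and once as a default BRE using POSIX intervals (`\{1,\}`, `\{0,1\}`) in place of `+` and `?`. The variable names `line`, `ere`, and `bre` are only illustrative, and `-E` is assumed to be available (it is in GNU and BSD sed).

```shell
line='aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS'

# ERE (needs -E): + and ? are metacharacters, groups are plain ( ).
ere=$(printf '%s\n' "$line" |
  sed -E 's/^(.*)-([0-9]+\.[0-9]+\.[0-9]+(-VERS)?)$/\2 \1/')

# BRE (sed default): groups are \( \), and the intervals \{1,\} and \{0,1\}
# stand in for + and ?. Note that [0-9] replaces \d in both forms.
bre=$(printf '%s\n' "$line" |
  sed 's/^\(.*\)-\([0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}\(-VERS\)\{0,1\}\)$/\2 \1/')

printf '%s\n%s\n' "$ere" "$bre"
```

Both commands should print the version followed by the prefix for this input; the BRE spelling is longer but does not depend on the `-E` option.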

4 Answers

I think \d isn't recognised by sed. This works for me on OSX.

sed -E 's/([[:alnum:]]+.*)-([0-9]+\.[0-9]+\.[0-9]+|[0-9]+\.[0-9]+\.[0-9]+-VERS)/\1 \2/'

Input:

aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS
aaa1-bbb2-ccc3-dddd4-ffff5-1.0.0-VERS
zzz1-bbb2-ccc3-1.0.1
zzz1-1.0.1-VERS

Output:

aaa1-bbb2-ccc3-dddd4-ffff5 11.22.33-VERS
aaa1-bbb2-ccc3-dddd4-ffff5 1.0.0-VERS
zzz1-bbb2-ccc3 1.0.1
zzz1 1.0.1-VERS
Matt

As @Sundeep pointed out, \d+ does not work with sed; [0-9]+ should be used instead.

echo "aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS" | sed -E 's#((^[[:alnum:]]+.*)-([0-9]+\.[0-9]+\.[0-9]+-VERS|[0-9]+\.[0-9]+\.[0-9]+))#\3 \2#g'
askb
  • `[0-9]+\.[0-9]+\.[0-9]+-VERS|[0-9]+\.[0-9]+\.[0-9]+` can be written as `[0-9]+\.[0-9]+\.[0-9]+(-VERS)?` which in turn can be written more concisely as `([0-9]+\.){2}[0-9]+(-VERS)?` but in reality all you need given your data is `[0-9.]+(-VERS)?`. See https://stackoverflow.com/a/51586900/1745001. – Ed Morton Jul 30 '18 at 04:30
  • Ah, thanks ... this saved me some time in making it compact. :) – askb Jul 30 '18 at 06:16
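
To check the simplification suggested in the comment above, here is a small sketch that applies the three version patterns to the sample lines from the question and compares the results. The helper `split_with` and the variable names are hypothetical, introduced only for this illustration; `sed -E` is assumed to be available.

```shell
# Sample lines from the question.
input='aaa1-bbb2-ccc3-dddd4-ffff5-1.0.0-VERS
aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS
zzz1-bbb2-ccc3-1.0.1
zzz1-1.0.1-VERS'

# split_with VERSION_ERE: print "prefix version" for each sample line,
# using the given ERE as the version part (hypothetical helper).
split_with() {
  printf '%s\n' "$input" | sed -E "s/^(.*)-($1)\$/\\1 \\2/"
}

long=$(split_with '[0-9]+\.[0-9]+\.[0-9]+(-VERS)?')
counted=$(split_with '([0-9]+\.){2}[0-9]+(-VERS)?')
loose=$(split_with '[0-9.]+(-VERS)?')

# All three variants should agree on this data.
[ "$long" = "$counted" ] && [ "$long" = "$loose" ] && printf '%s\n' "$long"
```

The loosest form, `[0-9.]+`, would also accept strings such as `1..2`, so it is only a safe shortcut when the input is known to contain well-formed versions, as it does here.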

Why such a complicated regexp?

$ sed -E 's/(.*)-([0-9.]+(-VERS)?)$/\2\t\1/' file
1.0.0-VERS      aaa1-bbb2-ccc3-dddd4-ffff5
11.22.33-VERS   aaa1-bbb2-ccc3-dddd4-ffff5
1.0.1   zzz1-bbb2-ccc3
1.0.1-VERS      zzz1

or:

$ sed -E 's/(.*)-([^-]+-[^-]+)$/\2\t\1/' file
1.0.0-VERS      aaa1-bbb2-ccc3-dddd4-ffff5
11.22.33-VERS   aaa1-bbb2-ccc3-dddd4-ffff5
ccc3-1.0.1      zzz1-bbb2
1.0.1-VERS      zzz1

depending on what the output should be for input zzz1-bbb2-ccc3-1.0.1.
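
If the two parts are needed as shell variables rather than as rewritten text, one possible approach is to let sed emit them separated by a space and then `read` them. This is only a sketch: `split_version`, `name`, and `version` are hypothetical names, and it assumes the prefix never contains whitespace.

```shell
# split_version LINE: print "prefix version" for a single input line
# (hypothetical helper built on the first regex above).
split_version() {
  printf '%s\n' "$1" | sed -E 's/^(.*)-([0-9.]+(-VERS)?)$/\1 \2/'
}

# Read the two space-separated fields into shell variables.
read -r name version <<EOF
$(split_version 'zzz1-1.0.1-VERS')
EOF

printf 'group2="%s"\ngroup3="%s"\n' "$name" "$version"
```

The final `printf` reproduces the `group2=`/`group3=` layout used in the question for that one line.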

Ed Morton

This might work for you (GNU sed):

sed -r 'h;s/^(([[:alnum:]]+-?)+)-(([[:digit:]]+\.?){3}(-VERS)?)/group1="\1"/p;g;s//group3="\3"/p;d' file

However a simpler regexp would be:

sed -r 'h;s/^(.*)-([0-9].*)/group1="\1"/p;g;s//group2="\2"/p;d' file
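
For readers unfamiliar with the hold-space idiom, the simpler one-liner can be spelled out as a commented multi-line script. This is a sketch of the same commands, wrapped in a hypothetical function `show_groups` that reads lines on standard input; it uses `-E` rather than `-r` on the assumption that the installed sed accepts it.

```shell
# show_groups: print group1=... and group2=... for each line on stdin
# (hypothetical wrapper around the simpler script above).
show_groups() {
  sed -E '
    # Save a copy of the original line in the hold space.
    h
    # Replace the match with the prefix and print it.
    s/^(.*)-([0-9].*)/group1="\1"/p
    # Restore the original line from the hold space.
    g
    # An empty regex reuses the last regex applied, so \2 is the version.
    s//group2="\2"/p
    # Delete the pattern space to suppress the default print.
    d
  '
}

out=$(printf '%s\n' 'zzz1-bbb2-ccc3-1.0.1' | show_groups)
printf '%s\n' "$out"
```

Because both substitutions end in `p` and the script ends in `d`, only lines that match the regex produce any output.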
potong