3

I cannot seem to figure out how to come up with the correct regex for my bash command line. Here's what I am doing:

echo "XML-Xerces-2.7.0-0.tar.gz" | sed -e's/^\(.*\)-[0-9].*/\1/g'

This gives me the output of ...

XML-Xerces-2.7.0

... but what I need is the output to be ...

XML-Xerces

... I guess I could do this ...

 echo "XML-Xerces-2.7.0-0.tar.gz" | sed -e's/^\(.*\)-[0-9].*/\1/g' | sed -e's/^\(.*\)-[0-9].*/\1/g'

... but I would like to understand sed regex a little better.

Update:

I tried this ...

echo "XML-Xerces-2.7.0-0.tar.gz" | sed -e's/^\([^-]*\)-[0-9].*/\1/g'

... as suggested, but that outputs XML-Xerces-2.7.0-0.tar.gz

Red Cricket

2 Answers

6

You can't do non-greedy matching in sed, but you can do something like this instead:

echo "XML-Xerces-2.7.0-0.tar.gz" | sed -e 's/^\(\([^-]\|-[^0-9]\)*\).*/\1/g'

This captures everything up to the first - that is followed by a digit ([0-9]).
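To make the idea concrete, here is a minimal sketch of the same substitution wrapped in a function, next to a shorter variant. The function names `strip_version_gnu` and `strip_version_posix` are made up for illustration. The first form uses `\|`, which is a GNU extension to basic regular expressions, so it assumes GNU sed. The second form is an alternative, not part of the answer above: it relies on sed always picking the leftmost match, so it simply deletes from the first `-` followed by a digit to the end of the line.

```shell
#!/usr/bin/env bash

# Hypothetical helper names, for illustration only.

# The substitution from the answer above: the inner group consumes either a
# non-hyphen character or a hyphen followed by a non-digit, so it stops at
# the first "-<digit>". Assumes GNU sed because of the \| alternation.
strip_version_gnu() {
  printf '%s\n' "$1" | sed -e 's/^\(\([^-]\|-[^0-9]\)*\).*/\1/'
}

# Alternative sketch using only basic regex syntax: sed matches at the
# leftmost possible position, so this removes everything from the first
# "-<digit>" onward without needing a capture group.
strip_version_posix() {
  printf '%s\n' "$1" | sed -e 's/-[0-9].*//'
}

strip_version_gnu   "XML-Xerces-2.7.0-0.tar.gz"   # prints XML-Xerces
strip_version_posix "XML-Xerces-2.7.0-0.tar.gz"   # prints XML-Xerces
```

For this input both functions print the same result. The original attempt fails because the leading `\(.*\)` is greedy and grabs as much as it can, which pushes the match of `-[0-9]` to the last possible position.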

Paul
3

You actually don't need sed when you're in bash:

shopt -s extglob
V='XML-Xerces-2.7.0-0.tar.gz'
echo "${V%%-+([0-9]).+([0-9])*}"
konsolebox
  • Pretty slick! But I am not sure I understand how to read `${V%%-+([0-9]).+([0-9])*}`. Could you explain that part? – Red Cricket Sep 13 '13 at 17:24
  • 2
    @RedCricket It's an extended glob. See [here](http://www.gnu.org/software/bash/manual/html_node/Pattern-Matching.html). The feature is not enabled by default, so we enable it with `shopt -s extglob`. The `%%` expansion deletes the longest match found at the end of the variable's value. Expansion methods are explained [here](http://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html). The pattern `-+([0-9]).+([0-9])*` matches `-2.7.0-0.tar.gz` of `XML-Xerces-2.7.0-0.tar.gz`, so that part is deleted. As a regex it is roughly `-[0-9]+\.[0-9]+.*$`. – konsolebox Sep 13 '13 at 17:40
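
The explanation in the comment above can be sketched as a small script that does the same trimming twice: once with the extended glob, and once with the roughly equivalent regex via bash's `[[ =~ ]]` operator. The function names `strip_with_glob` and `strip_with_regex` are hypothetical, and the regex variant is only an illustration of the comparison, not something the answer itself proposes.

```shell
#!/usr/bin/env bash
# extglob has to be switched on before bash parses any line that uses +(...).
shopt -s extglob

# Hypothetical helper names, for illustration only.

# %% removes the longest suffix matching the pattern:
#   -          a literal hyphen
#   +([0-9])   one or more digits
#   .          a literal dot (in a glob, "." is not a wildcard)
#   +([0-9])   one or more digits
#   *          anything that follows
strip_with_glob() {
  local v=$1
  printf '%s\n' "${v%%-+([0-9]).+([0-9])*}"
}

# Same idea with a regex: find the leftmost "-digits.digits" and everything
# after it, then strip that matched text from the end of the value.
strip_with_regex() {
  local v=$1
  local re='-[0-9]+\.[0-9]+.*$'
  if [[ $v =~ $re ]]; then
    # Quoting the match makes it a literal suffix rather than a pattern.
    printf '%s\n' "${v%"${BASH_REMATCH[0]}"}"
  else
    printf '%s\n' "$v"
  fi
}

V='XML-Xerces-2.7.0-0.tar.gz'
strip_with_glob  "$V"   # prints XML-Xerces
strip_with_regex "$V"   # prints XML-Xerces
```

The parameter expansion runs entirely inside the shell, so no external process is started. Values without a matching version suffix are returned unchanged by both functions.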