2

I'm trying to make sure the input to my shell script follows the format Name_Major_Minor.extension

where Name is any number of digits/characters/"-" followed by "_"

Major is any number of digits followed by "_"

Minor is any number of digits followed by "."

and Extension is any number of characters followed by the end of the file name.

I'm fairly certain my regular expression is just slightly off. Any file I currently run through it evaluates to "yes", but if I put "[A-Z]$" in place of "*$" it always evaluates to "no". Regular expressions confuse the hell out of me, as you can probably tell.

if echo $1 | egrep -q [A-Z0-9-]+_[0-9]+_[0-9]+\.*$
then
    echo "yes"
else
    echo "nope"
    exit
fi

edit: realized I am missing the pattern for "minor". Still doesn't work after adding it though.

Ruslan Osmanov
Tremors

2 Answers

4

Use =~ operator

Bash supports regular expression matching through its =~ operator, and there is no need for egrep in this particular case:

if [[ "$1" =~ ^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$ ]]
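For context, here is a minimal sketch of how that test could sit in a complete script. The function name `is_valid_name` is made up for illustration, and the sketch assumes the script runs under Bash, since `[[ ... =~ ... ]]` is not available in plain `sh`. Unlike the script in the question, it does not `exit` in the failure branch.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not from the original script): succeeds when the
# argument looks like Name_Major_Minor.extension.
is_valid_name() {
    # Keep the pattern in a variable and expand it unquoted on the right of =~
    # so Bash treats it as a regular expression rather than a literal string.
    local re='^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'
    [[ "$1" =~ $re ]]
}

if is_valid_name "${1:-}"; then
    echo "yes"
else
    echo "nope"
fi
```

Note that `\..*` accepts an empty extension (a name ending in a dot). If at least one character is required after the dot, `\..+` is the stricter choice.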

Errors in your regular expression

The \.*$ sequence in your regular expression means "zero or more dots". You probably meant "a dot and some characters after it", i.e. \..*$.

Your regular expression is anchored only at the end of the string ($), so it can match the tail of the input and ignore whatever comes before it. You likely want to match the whole string. To do that, add the ^ anchor so the match must also start at the beginning of the line.
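To see both problems in isolation, the following sketch runs a few sample names through the question's pattern (quoted, so the shell leaves it alone) and through the corrected one. The helper `matches_pattern` and the sample names are invented for the demonstration.

```bash
# Hypothetical helper: matches_pattern PATTERN STRING
matches_pattern() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

old='[A-Z0-9-]+_[0-9]+_[0-9]+\.*$'          # pattern from the question, quoted
new='^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'     # anchored, "a dot, then anything"

for name in 'APP_1_2.txt' '???APP_1_2' 'APP_1_2...'; do
    for pattern in "$old" "$new"; do
        if matches_pattern "$pattern" "$name"; then verdict=yes; else verdict=nope; fi
        printf '%-12s %-40s %s\n' "$name" "$pattern" "$verdict"
    done
done
```

With the old pattern, `APP_1_2.txt` is rejected because only dots may follow the minor number, while `???APP_1_2` is accepted because nothing ties the match to the start of the string. The corrected pattern reverses both verdicts.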

Escape the command line arguments

If you still want to use egrep, quote the pattern, as you should quote any command-line argument that contains special characters; otherwise the shell may reinterpret it (glob expansion, backslash removal) before egrep ever sees it. Wrap the argument in single or double quotes, e.g.:

if echo "$1" | egrep -q '^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'

Use printf instead of echo

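Don't use echo to print arbitrary data: its behavior varies between shells, and it may treat an argument such as -n or -e as an option or interpret backslash escapes. Use printf instead: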

printf '%s\n' "$1"
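
Combining the quoted pattern with printf, the whole check might look like the sketch below. `check_name` is a hypothetical name, and `grep -E` is used as the standard spelling of egrep.

```bash
# Hypothetical helper combining the quoting and printf suggestions.
check_name() {
    # printf '%s\n' prints the argument verbatim, even if it is "-n" or
    # contains backslashes; the single quotes keep the shell away from the regex.
    printf '%s\n' "$1" | grep -Eq -- '^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'
}

if check_name "${1:-}"; then
    echo "yes"
else
    echo "nope"
fi
```

This form does not depend on Bash, so it should also work in a script run by a plain POSIX `sh`.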
Ruslan Osmanov
1

Try this regex instead: ^[A-Za-z0-9-]+(?:_[0-9]+){2}\..+$

  • [A-Za-z0-9-]+ matches Name
  • _[0-9]+ matches _ followed by one or more digits
  • (?:...){2} matches the group two times: _Major_Minor
  • \..+ matches a period followed by one or more characters

The problem in your regex seems to be at the end with \.*, which matches a period \. any number of times (including zero). Also, [A-Z0-9-] matches uppercase letters but not lowercase ones, which might not be what you wanted.
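
One caveat: `(?:...)` is Perl-style syntax, and egrep (`grep -E`) works with POSIX extended regular expressions, which have no non-capturing groups. A rough sketch of the same idea with an ordinary group follows. The function name `is_versioned_file` and the sample names are made up for illustration.

```bash
# Hypothetical helper using a plain (capturing) group instead of (?:...).
is_versioned_file() {
    printf '%s\n' "$1" | grep -Eq -- '^[A-Za-z0-9-]+(_[0-9]+){2}\..+$'
}

for name in 'Name-x_10_2.tar.gz' 'Name_10.txt' 'Name_1_2.'; do
    if is_versioned_file "$name"; then
        printf '%s: yes\n' "$name"
    else
        printf '%s: nope\n' "$name"
    fi
done
```

Because this pattern ends in `\..+` rather than `\..*`, a name with nothing after the final dot is rejected.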

Nicolas
  • ./version_buddy.sh: line 4: `if echo $1 | egrep -q ^[A-Za-z0-9-]+(?:_[0-9]+){2}\..+$' I get this error with that – Tremors Dec 03 '16 at 02:45
  • I figured it out, I think. I added your "a-z" (I didn't realize it was case sensitive), and I added a * before $ at the end. – Tremors Dec 03 '16 at 02:48
  • My regex might not be working because of the `(?:...)`, which doesn't seem to be supported. You can try `^[A-Za-z0-9-]+(_[0-9]+){2}\..+$` instead. Don't forget to upvote and accept! ;) – Nicolas Dec 03 '16 at 02:49
  • Use the POSIX character class `[:alnum:]` instead of hard-coding `A-Za-z0-9`. Yes, grep needs the `-P` arg to support lookarounds and other PCRE constructs but it's unnecessary in this case. The `+$` at the end is doing nothing useful - the same match occurs with or without it. – Ed Morton Dec 03 '16 at 15:05
  • Actually `[:alnum:]` matches underscore too. And `.+$` at the end matches the extension, it is necessary. – Nicolas Dec 03 '16 at 16:22