I want to write a regular expression for QR8.4_Z4J25 in a shell script. How can I do it?
Is this correct?
[QR][0-9][.][0-9][_][A-Z][0-9][A-Z][0-9][0-9]
It's wrong because it'll only match Q8.4_Z4J25 or R8.4_Z4J25, but not QR8.4_Z4J25.
A bracket expression matches any one of the characters listed in it, so you'd want to write:
[Q][R][0-9][.][0-9][_][A-Z][0-9][A-Z][0-9][0-9]
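A quick way to check the difference is to run both patterns through grep. This is a minimal sketch assuming a POSIX grep; the -x option is my addition and makes the pattern match the whole line. Without it, the original pattern would still report a match, because it finds the substring R8.4_Z4J25 inside the input.

```shell
#!/bin/sh
# Sample string from the question.
s='QR8.4_Z4J25'

# Original pattern: [QR] matches a single Q or R, so the whole string fails.
if printf '%s\n' "$s" | grep -qx '[QR][0-9][.][0-9][_][A-Z][0-9][A-Z][0-9][0-9]'; then
    original='match'
else
    original='no match'
fi

# Corrected pattern: [Q][R] matches Q followed by R.
if printf '%s\n' "$s" | grep -qx '[Q][R][0-9][.][0-9][_][A-Z][0-9][A-Z][0-9][0-9]'; then
    fixed='match'
else
    fixed='no match'
fi

echo "original: $original"
echo "fixed: $fixed"
```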
You don't need to use brackets for a single character, though, so it can be simplified to
QR[0-9]\.[0-9]_[A-Z][0-9][A-Z][0-9][0-9]
Be sure to escape the dot if it's outside of a bracket because it would otherwise match any single character.
In case you want to match QR9.1_8A9YK as well, you should change it to
QR[0-9]\.[0-9]_[A-Z0-9]\{5\}
If you're using Extended Regular Expressions, usually by supplying the option -E to the tool you're using, then you shouldn't escape the braces:
QR[0-9]\.[0-9]_[A-Z0-9]{5}
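To compare the two syntaxes side by side, here is a small sketch that assumes grep is the tool in question (the answer doesn't name one). The -c option counts matching lines and -x, again my addition, anchors the pattern to the whole line.

```shell
#!/bin/sh
s='QR9.1_8A9YK'

# Basic regular expressions (grep's default): the braces are escaped.
bre=$(printf '%s\n' "$s" | grep -cx 'QR[0-9]\.[0-9]_[A-Z0-9]\{5\}')

# Extended regular expressions (-E): the braces are written bare.
ere=$(printf '%s\n' "$s" | grep -Ecx 'QR[0-9]\.[0-9]_[A-Z0-9]{5}')

echo "BRE matching lines: $bre"
echo "ERE matching lines: $ere"
```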
Square brackets in regular expressions denote a collection of characters. [MX_5] will match one character that is M, X, _ or 5. [0-9] will match one character that is between 0 and 9. [a-z] will match one character that is between lowercase a and z.
Notice the pattern? The square brackets match a single character. In order to match multiple characters they need to be followed by a +, a * or {} to denote how many of those characters it should match.
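The single-character behaviour can be seen with a rough sketch like the one below. It assumes a grep that supports -o, which prints each match on its own line; that option is a common extension (GNU and BSD grep have it) rather than something POSIX requires.

```shell
#!/bin/sh
# A bare bracket matches one character at a time: "abc" yields three matches.
single=$(printf 'abc\n' | grep -o '[a-z]' | wc -l | tr -d ' ')

# With + the bracket matches a run of one or more characters.
plus=$(printf 'abc\n' | grep -Eo '[a-z]+')

# With {2} the bracket matches exactly two characters.
exact=$(printf 'abc\n' | grep -Eo '[a-z]{2}')

echo "matches for [a-z]: $single"
echo "match for [a-z]+: $plus"
echo "match for [a-z]{2}: $exact"
```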
However, in your case, you just want to match the actual letters QR in that order, so simply don't use square brackets.
QR[0-9]\.[0-9]_[A-Z][0-9][A-Z][0-9][0-9]
The same goes for characters like the underscore which are always in the same place. Note that the . was escaped with a \ because it has a special meaning in regex.
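To illustrate why the escape matters, the sketch below feeds grep a made-up input with an x where the dot should be. The input string is my own example, and -x (whole-line match) is an assumption about how you'd want to test it.

```shell
#!/bin/sh
# Hypothetical bad input: an "x" sits where the literal dot belongs.
s='QR8x4_Z4J25'

# Unescaped dot: matches any single character, so the bad input is accepted.
if printf '%s\n' "$s" | grep -qx 'QR[0-9].[0-9]_[A-Z][0-9][A-Z][0-9][0-9]'; then
    unescaped='match'
else
    unescaped='no match'
fi

# Escaped dot: matches only a literal ".", so the bad input is rejected.
if printf '%s\n' "$s" | grep -qx 'QR[0-9]\.[0-9]_[A-Z][0-9][A-Z][0-9][0-9]'; then
    escaped='match'
else
    escaped='no match'
fi

echo "unescaped dot: $unescaped"
echo "escaped dot: $escaped"
```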
Going back to matching multiple characters with square brackets, if the order of the last 5 characters doesn't matter, you can further reduce your expression using a single square bracket and a {} to match all your trailing characters after the underscore.
QR[0-9]\.[0-9]_[A-Z0-9]{5}
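Since the question asks about a shell script, here is one possible way to wrap that expression in a reusable check. It is only a sketch: is_valid_id is a hypothetical helper name, and it assumes grep -E with -x so that the bare braces work and the whole string has to match.

```shell
#!/bin/sh
# is_valid_id is a hypothetical helper, not part of any standard tool.
# It succeeds (exit status 0) only if the whole argument matches the pattern.
is_valid_id() {
    printf '%s\n' "$1" | grep -Eqx 'QR[0-9]\.[0-9]_[A-Z0-9]{5}'
}

# The last two inputs are made-up negative cases: a suffix that is one
# character short, and a prefix that is missing the R.
for id in QR8.4_Z4J25 QR9.1_8A9YK QR8.4_Z4J2 Q8.4_Z4J25; do
    if is_valid_id "$id"; then
        echo "$id: valid"
    else
        echo "$id: invalid"
    fi
done
```

In bash specifically, the same test could also be written with [[ $id =~ ... ]] and explicit ^ and $ anchors, which avoids spawning grep.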