76

Why can't I match the string

"1234567-1234567890"

with the given regular expression

\d{7}-\d{10}

with egrep from the shell like this:

egrep \d{7}-\d{10} file

?

jww
user377622

5 Answers

96

egrep doesn't recognize the \d shorthand for the digit character class, so you need to use e.g. [0-9] instead.

Moreover, while it's not absolutely necessary in this case, it's a good habit to quote the regex to prevent misinterpretation by the shell. Thus, something like this should work:

egrep '[0-9]{7}-[0-9]{10}' file
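A quick way to check this without a file is to pipe sample lines into grep. This is only a sketch: the sample lines and the variable names are made up for illustration, and it uses `grep -E`, which is the same thing as `egrep`.

```shell
# Hypothetical sample input: one line that should match, one that should not.
sample=$(printf '%s\n' '1234567-1234567890' '123-456')

# Single quotes keep the shell from touching the pattern.
matched=$(printf '%s\n' "$sample" | grep -E '[0-9]{7}-[0-9]{10}')

# Show what grep selected.
printf '%s\n' "$matched"
```

Only the first sample line should come back, since the second one has too few digits on both sides of the hyphen.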

polygenelubricants
  • Actually he only needs to quote the regex if it contains shell meta-characters. And now that it no longer contains backslashes, it doesn't, so quoting is optional. – sepp2k Jul 06 '10 at 10:49
  • @sepp2k: do you need quotes for a space? I think you do. I guess you can argue that a space is a shell metacharacter. Anyway, I think it's best to always quote, à la "it's best to always use curly braces". – polygenelubricants Jul 06 '10 at 10:51
  • Then how would it work with grep instead? I'm interested in the \d prefix. – user377622 Jul 06 '10 at 10:53
  • @persistent: according to the comparison chart I linked, neither POSIX ERE (egrep) nor POSIX BRE (grep) knows `\d`, `\s`, `\w`, `\b`, etc. Also, `\d` is not a prefix; it's a shorthand for the digit character class, supported by many but not all flavors. – polygenelubricants Jul 06 '10 at 10:55
  • Well, that's odd; where are they specified, then, if not in (e)grep? – user377622 Jul 06 '10 at 10:58
  • It's not a prefix; inside the syntax it must be specified as a prefix before the brackets; thanks for the correction anyway ;) – user377622 Jul 06 '10 at 11:01
  • @persistent: different flavors of regex do things differently; that's why it's important to mention which flavor you're using when asking regex questions, etc. I'll guess that Perl popularized the `\d` shorthand, and everyone else followed later. – polygenelubricants Jul 06 '10 at 11:01
  • @polygenelubricants: Yes, you need quotes with spaces (or put a backslash before every space). And sure, it doesn't hurt to always quote. – sepp2k Jul 06 '10 at 11:02
  • @persistent: you can't use `\d` with grep/egrep; you can use its expanded form `[0-9]`, which is practically the same thing, but slightly longer. In some flavors that support Unicode, `\d` is not the same as `[0-9]` because it also includes some other Unicode digit characters. – polygenelubricants Jul 06 '10 at 11:04
  • Well mate, I already knew about the [block]{} form; I was interested in \d. Thanks. – user377622 Jul 06 '10 at 11:08
  • I don't ever use `[0-9]`, when I really mean `[[:digit:]]`. Plus these character classes are supported almost everywhere, and are defined in POSIX. – J. M. Becker Sep 01 '12 at 22:42
  • '\d' wasn't working for me, but '\s' and '\w' did today. grep (GNU grep) 2.23 Packaged by Homebrew. – Pysis Aug 18 '16 at 17:30
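On the quoting point discussed in the comments above: it can help to look at what the shell actually hands to the command. The sketch below assumes bash or a POSIX sh and uses printf in place of egrep just to display the argument; the variable names are made up for illustration.

```shell
# Unquoted: the shell removes the backslashes before the command sees the pattern.
unquoted=$(printf '%s\n' \d{7}-\d{10})

# Single-quoted: the pattern is passed through unchanged.
quoted=$(printf '%s\n' '\d{7}-\d{10}')

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

So with the unquoted command from the question, egrep is asked for seven letter d's, a hyphen, and ten more d's. Quoting fixes that part, but the quoted \d is still not understood by egrep, which is why [0-9] is needed as well.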
27

For completeness:

egrep does in fact support character classes, in the form of POSIX bracket classes. The classes are:

  • [:alnum:]
  • [:alpha:]
  • [:cntrl:]
  • [:digit:]
  • [:graph:]
  • [:lower:]
  • [:print:]
  • [:punct:]
  • [:space:]
  • [:upper:]
  • [:xdigit:]

Example (note the double brackets):

egrep '[[:digit:]]{7}-[[:digit:]]{10}' file
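To try the bracket-class form without a file, something like the following should do. It is a sketch with made-up sample lines and a made-up variable name, using `grep -E` in place of `egrep`.

```shell
# Hypothetical sample input: a digit line and a letter line with the same shape.
class_matched=$(printf '%s\n' '1234567-1234567890' 'abcdefg-abcdefghij' \
    | grep -E '[[:digit:]]{7}-[[:digit:]]{10}')

# Only the digit line should be selected.
printf '%s\n' "$class_matched"
```

The outer brackets form the bracket expression and the inner `[:digit:]` names the class, which is why both pairs are needed.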
André Laszlo
  • Just a complaint about grep: `[[:digit:]]` is worse than `[0-9]` in every possible way. None of these are shorthand, and they are harder to remember than the default regex syntax. E.g.: `[[:lower:]]` is harder to remember, read and write than `[a-z]` – Zombies Feb 13 '16 at 08:41
  • @Zombies But `grep -E` i.e. `egrep` supports both `[:digit:]` and `[0-9]` so where is the complaint? If you're comparing with `\d` it's arguable. d could stand for anything, a bit like one letter variable names. Still seems \d has become more popular. I think grep character classes pre-date Perl `\d` – Jason S Jun 21 '17 at 04:55
22

You can use \d if you pass grep the "Perl regex" option (-P), e.g.:

grep -P "\d{9}"
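Applied to the string from the question, that could look like the sketch below. It assumes a GNU grep built with PCRE support; other greps may not have -P at all, so the snippet probes for it first. The variable name and the fallback message are made up for illustration.

```shell
# Probe whether this grep accepts -P (a GNU extension that needs PCRE support).
if printf '1\n' | grep -P '\d' >/dev/null 2>&1; then
    # Single quotes keep the backslashes intact for grep.
    pcre_result=$(printf '%s\n' '1234567-1234567890' | grep -P '\d{7}-\d{10}')
else
    pcre_result='grep -P not supported here'
fi

printf '%s\n' "$pcre_result"
```

If the probe fails, falling back to `grep -E` with `[0-9]` or `[[:digit:]]` is the portable route.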

rogerdpack
11

Use [0-9] instead of \d. egrep doesn't know \d.

sepp2k
-3

try this one:

egrep '(\d{7}-\d{10})' file
Nikhil Jain
  • Traditional egrep did not support the { metacharacter, and some egrep implementations support \{ instead, so portable scripts should avoid { in egrep patterns and should use [{] to match a literal {. – Nikhil Jain Jul 06 '10 at 11:05
  • However, neither traditional egrep nor GNU egrep supports \d, and that's why this does not work, not because of the {. Though it'd be useful to keep the { thing in mind if you ever have to be compatible with traditional egrep. – sepp2k Jul 06 '10 at 11:13
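For the literal-brace case mentioned in the comments above, a small sketch of the `[{]` idiom follows. The sample lines and the variable name are made up for illustration, and `[}]` is used for the closing brace only for symmetry.

```shell
# Hypothetical input: one line containing literal braces, one without.
brace_line=$(printf '%s\n' 'x{3}' 'xxx' | grep -E 'x[{]3[}]')

# Only the line with the literal braces should be selected.
printf '%s\n' "$brace_line"
```

Inside a bracket expression the brace is just an ordinary character, so this does not depend on how a given egrep treats a bare {.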