I'm having trouble with egrep: I can't figure out an AND-like operator for regular expression pattern matching. I need to match all the strings in a given list that satisfy multiple conditions at once. Here are the questions I'm having problems with:

1) Find the set of words that contain two consecutive e's AND also contain at least two i's (ieei is valid)

2) Find the set of words that are at least 5 characters long AND do not contain any vowels

I tried using lookaheads, (?=.*?ee)(?=.*?i.*i), but it doesn't work. What am I missing here?

revo
  • If you have GNU grep, try using `grep -P`. By default, grep supports only BRE/ERE, and those do not have lookarounds; see https://unix.stackexchange.com/questions/119905/why-does-my-regular-expression-work-in-x-but-not-in-y – Sundeep Apr 03 '19 at 12:20
  • another option is to use grep multiple times, see https://stackoverflow.com/questions/4487328/how-to-use-grep-to-match-string1-and-string2 – Sundeep Apr 03 '19 at 12:22
  • I've been using "egrep -e" from the UNIX egrep utility, and what you said seems to be the case. Is there a way to check for multiple requirements using BRE/ERE without using multiple lines? – Laventio_19 Apr 03 '19 at 17:32
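
The multiple-grep approach suggested in the comments above can be sketched roughly as follows. This is only an illustration: the word list is a made-up sample, and it assumes the input has one word per line, so that each `grep` in the pipeline filters whole words.

```shell
# Hypothetical sample word list, one word per line (not from the original question).
sample_words='ieei
feet
iiee
eeii
feeling
bike'

# Each grep in the pipeline acts as one condition of the AND:
# the first keeps lines containing "ee", the second keeps lines with at least two "i"s.
chained=$(printf '%s\n' "$sample_words" | grep 'ee' | grep 'i.*i')
printf '%s\n' "$chained"
```

Both patterns are plain BRE, so this should not depend on lookaround support.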

1 Answer

As mentioned by Sundeep, your grep implementation might support PCRE through the -P flag, in which case the following would work:

grep -P '(?=.*?ee)(?=.*?i.*i)'

Otherwise, you can use the following ERE pattern instead:

[^ ]*(i[^ ]*ee[^ ]*i|i[^ ]*i[^ ]*ee|ee[^ ]*i[^ ]*i)[^ ]*

It matches words that conform to one of these three patterns, which together cover every possible ordering of the "ee" pair and the two i's:

  • the word contains an i followed by two consecutive e's followed by another i
  • the word contains an i followed by another i followed by two consecutive e's
  • the word contains two consecutive e's followed by an i followed by another i
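
As a quick check of the ERE pattern, here is a minimal sketch run against a made-up word list. The sample words and variable names are assumptions for illustration only; the pattern itself is the one given above.

```shell
# Hypothetical sample word list, one word per line (not from the original question).
word_list='ieei
feet
iiee
eeii
feeling
bike'

# ERE from the answer: one alternative per ordering of "ee" and the two "i"s.
ere_pattern='[^ ]*(i[^ ]*ee[^ ]*i|i[^ ]*i[^ ]*ee|ee[^ ]*i[^ ]*i)[^ ]*'

# grep -E enables ERE syntax (the same dialect egrep uses).
ere_matches=$(printf '%s\n' "$word_list" | grep -E "$ere_pattern")
printf '%s\n' "$ere_matches"
```

Here "feet" and "bike" are rejected because each fails one of the two conditions, and "feeling" is rejected because it has only one i.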
Aaron
  • Thank you, the ERE pattern works, but I'm still stuck on the 2nd question because it seems impractical to list out all the patterns that have at least 5 letters and do not contain any vowels – Laventio_19 Apr 03 '19 at 17:36
  • @Laventio_19 research negated character classes. I use them in my ERE regex to avoid matching more than a word: I use a character class that matches any character but a space, `[^ ]` – Aaron Apr 03 '19 at 17:39
  • I'm aware "....." and "[^aeiou]" would give me the answer I'm looking for, but I'm having problems combining those two expressions into a single expression – Laventio_19 Apr 04 '19 at 04:32
  • @Laventio_19 you shouldn't use `.` as it might match vowels. You could use multiple occurrences of `[^aeiou]`, but you'd be better off using a quantifier such as `{5,}`, "5 or more occurrences of the previous token". You might also need to use `\b` word boundaries to make sure you match the whole word. Consider following a quick regex tutorial; given the level of the questions, you should already be familiar with quantifiers – Aaron Apr 04 '19 at 08:26
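
Putting the last comment's hints together, a sketch for the second question might look like the following. It assumes one word per line, so `^` and `$` anchors stand in for the `\b` word boundaries mentioned above, and it lists uppercase vowels explicitly in the negated class. The sample words are hypothetical. Note that `[^aeiouAEIOU]` also accepts digits and punctuation, which may or may not be what you want.

```shell
# Hypothetical sample word list, one word per line (not from the original question).
candidates='rhythm
crypt
sky
strength
glyphs
QUEUE'

# A negated character class repeated at least 5 times, anchored to the whole line:
# every character must be a non-vowel, and there must be 5 or more of them.
no_vowel_matches=$(printf '%s\n' "$candidates" | grep -E '^[^aeiouAEIOU]{5,}$')
printf '%s\n' "$no_vowel_matches"
```

The single quantified class expresses both conditions at once: the length requirement comes from `{5,}` and the no-vowel requirement from the class itself, so no AND operator is needed.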