regular expression Language of all those words that don’t contain the substring aaaa

Question

can anyone help me with the following regular expression Language of all those words that don’t contain the substring aaaa. i just want to get the idea. i came up with the following R.E

" (ab)*(aaaaa)*"

any help will be appreciated. thanks

The use of the term "language" indicates that you might be approaching this from a more computer sciencey perspective, rather than a programming perspective. Are we allowed to use constructs like negative lookaheads, which aren't in CS definitions of regular expressions? In a practical program, we would just search for a match for `aaaa` and negate the result. — user2357112, Apr 11 '15 at 20:19
@pashaahmed: how is this a part of a *Modeling and Simulation course*, this should be part of *Fundamentals*, or *Automata and Computability*, but this has almost nothing to do with modeling nor simulating... — Willem Van Onsem, Apr 11 '15 at 20:30
well i dont know :) i am students of bs in bioinformatics. it's our 7th sem course. — pasha ahmed, Apr 11 '15 at 20:46

Mazdak · Answer 1 · 2015-04-11T20:40:11.463

1

You can use a negative look-ahead :

^((?!aaaa).)*$

This will match any combinations of length 0 or more of characters that not followed by aaaa.

Regular expression visualization

aaaa matches the characters aaaa literally (case sensitive)

. matches any character (except newline)

* Between zero and unlimited times

Debuggex Demo

Also as an alternative and more efficient way in general as @CommuSoft says you can use the following pattern :

^(a{1,3}[^a]|[^a])*a{0,3}$

edited Apr 11 '15 at 20:40

answered Apr 11 '15 at 20:09

Mazdak

105,000
18
159
188

thanks. i want to learn more about these regular expressions. can you share an easy source @kasra – pasha ahmed Apr 11 '15 at 20:17
@pashaahmed welcome but as far as i know this is the best and more elegant way for this problem! for more detail you can read http://stackoverflow.com/questions/406230/regular-expression-to-match-text-that-doesnt-contain-a-word – Mazdak Apr 11 '15 at 20:21
If the regex engine doesn't use dynamic programming - which unfortunately many don't seem to do - it can result in a computationally intensive task... Another way to do this is `^(a{1,3}[^a]|[^a])*$` I think... – Willem Van Onsem Apr 11 '15 at 20:24
@CommuSoft As far as i know except bash mot of the popular lang support `look-around` and also dynamic programming – Mazdak Apr 11 '15 at 20:27
@Kasra: I don't say they don't support it, and `grep` does this as well btw (Java for instance doesn't). But it is in many cases not implemented with an efficient algorithm. Which is too bad... – Willem Van Onsem Apr 11 '15 at 20:28
@CommuSoft yep, in that case, i'm agree! i'll update the answer thanks for attention! – Mazdak Apr 11 '15 at 20:30
1

@Kasra: indeed, there is a small part missing `^(a{1,3}[^a]|[^a])*a{0,3}$`... Furthermore, dont forget the `^` and `$` markers in your answer, otherwise the regex will [simply match a small substring](https://regex101.com/r/oT0sX0/1). – Willem Van Onsem Apr 11 '15 at 20:32

regular expression Language of all those words that don’t contain the substring aaaa

1 Answers1