
Given a sentence,

Scheme is such a bizarre programming language.

Any sentence that contains both `is` and `language` should match. I found that `|` means "or", but I couldn't find a symbol that means "and".

Thanks,

roxrook
  • "ab" means "a" and then "b". Now you know and. – tchrist Apr 23 '11 at 05:03
  • @tchrist: Thanks. But if I write `islanguage`, how does the regex know whether I mean one whole word or two separate words? – roxrook Apr 23 '11 at 05:06
  • You don't need regular expressions to find such strings (although a regular expression is guaranteed to exist because [regular languages are closed under intersection](http://en.wikipedia.org/wiki/Regular_language#Closure_properties) and the sets of strings that have "is" or "language" as substrings are regular). You could just perform two substring searches. Is it a requirement that "is" appear before "language"? – Josh Rosen Apr 23 '11 at 05:14
  • It knows if you say "something and then is and then something and then language and then something", so you need more than one "and". – Howard Apr 23 '11 at 05:15

5 Answers

3

Try the following regex:

\bis\b.*\blanguage\b

This matches only if the two words appear in exactly that order. `\b` (a word boundary) ensures that each word stands alone rather than appearing inside a longer word.
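As a quick check of the pattern above, here is a minimal Python sketch. The function name `has_is_then_language` and the extra test strings are made up for illustration; only the pattern itself comes from this answer.

```python
import re

# Pattern from the answer: standalone "is", later followed by standalone "language".
ORDERED = re.compile(r"\bis\b.*\blanguage\b")

def has_is_then_language(sentence: str) -> bool:
    """Return True if the whole word "is" appears before the whole word "language"."""
    return ORDERED.search(sentence) is not None

print(has_is_then_language("Scheme is such a bizarre programming language."))  # True
print(has_is_then_language("islanguage"))                 # False: no word boundaries
print(has_is_then_language("This language is bizarre."))  # False: wrong order
```

Note that `.` does not match a newline by default, so this sketch assumes both words are on the same line.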

Howard
3

You can use the lookahead idiom:

(?=expr)

For example,

(?=.*word1)(?=.*word2)

For more details, please refer to this thread.
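To see how the lookaheads combine into an "and", here is a small Python sketch. The helper name `contains_all_words` is hypothetical, and the `^` anchor and `\b` word boundaries are my additions (assuming whole-word matching is wanted, as in the question), not part of the idiom as written above.

```python
import re
from typing import Iterable

def contains_all_words(text: str, words: Iterable[str]) -> bool:
    """Return True if every word occurs in text as a whole word, in any order."""
    # One lookahead per word: (?=.*\bword\b). Each one scans from the same
    # starting position, so all of them must succeed for the match to succeed.
    lookaheads = "".join(rf"(?=.*\b{re.escape(w)}\b)" for w in words)
    # DOTALL lets ".*" cross line breaks (an assumption; drop it for single lines).
    return re.search("^" + lookaheads, text, flags=re.DOTALL) is not None

print(contains_all_words("Scheme is such a bizarre programming language.", ["is", "language"]))  # True
print(contains_all_words("This language is odd.", ["is", "language"]))  # True: order does not matter
print(contains_all_words("islanguage", ["is", "language"]))             # False
```

Because each lookahead consumes no characters, adding a third or fourth required word only means adding another lookahead.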

jumperchen
0

Kinda ugly, but it should work (regardless of how 'is' and 'language' are ordered):

(.*is.*language.*|.*language.*is.*)
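A short Python sketch of how this pattern behaves. The function name `matches_either_order` and the test strings are illustrative only. One thing it shows: without word boundaries, the pattern also accepts "is" when it appears inside another word.

```python
import re

# Pattern from the answer: either order, but no word boundaries.
UNBOUNDED = re.compile(r"(.*is.*language.*|.*language.*is.*)")

def matches_either_order(text: str) -> bool:
    """Return True if "is" and "language" both occur as substrings, in any order."""
    return UNBOUNDED.search(text) is not None

print(matches_either_order("Scheme is such a bizarre programming language."))  # True
print(matches_either_order("A language Scheme is."))  # True: reversed order
print(matches_either_order("This language"))          # True: "is" found inside "This"
print(matches_either_order("Scheme is odd."))         # False: no "language"
```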
dmitrii
0

In C# (I know you didn't ask about C#, but it illustrates how this can be done much more quickly)...

 string s = "Scheme is such a bizarre programming language.";
 // Each word must have a space on at least one side.
 if ((s.Contains(" is") || s.Contains("is ")) &&
     (s.Contains(" language") || s.Contains("language ")))
 {
     // Both words were found.
 }

Regexes can be slow and hard to read for whoever maintains your code. Simple substring searches are generally faster.

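EDIT: This doesn't care about the order of the words, and it only handles plain spaces as separators. It can also match a word that merely starts or ends with "is" or "language" (for example, "This " contains "is ").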
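For comparison, here is a regex-free sketch in Python that checks whole words rather than substrings. It is not a translation of the C# above: the helper name `contains_words` is hypothetical, and splitting on whitespace and stripping ASCII punctuation is an assumption about what counts as a word.

```python
import string
from typing import Iterable

def contains_words(sentence: str, required: Iterable[str]) -> bool:
    """Return True if every required word appears as a whole token (case-sensitive)."""
    # Split on whitespace, then strip leading/trailing punctuation such as "." or ",".
    tokens = {token.strip(string.punctuation) for token in sentence.split()}
    return all(word in tokens for word in required)

print(contains_words("Scheme is such a bizarre programming language.", ["is", "language"]))  # True
print(contains_words("This island has languages", ["is", "language"]))  # False
```

Building a set of tokens once keeps each lookup cheap if the same sentence is checked against several words.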

0

Try this one if you don't care about the order of the words in the sentence:

\bis\b.*\blanguage\b|\blanguage\b.*\bis\b
Wes