
I'm using this regex to match words that contain no digits, and it works well:

(?:searchForThis|\G).+?(\b[^\d\s]+?\b)

The problem is that the regex searches the entire document, not only the line that contains searchForThis.

So if searchForThis appears twice, the words after it are captured twice.

I want it to stop after the first line that contains searchForThis, so that it does not search the lines that follow. Any help, please?

I'm using the regex with PHP.

Example of the problem here: http://www.rubular.com/r/vPhk8VbqZR

In the example you will see :

Match 1
1.  word
Match 2
1.  worldtwo
Match 3
1.  wordfive
Match 4
1.  word
Match 5
1.  worldtwo
Match 6
1.  wordfive

But I need only :

Match 1
1.  word
Match 2
1.  worldtwo
Match 3
1.  wordfive

As you can see, every word is matched twice.

=========== Edit: more details, as requested ===========

In my PHP I have:

define('CODE_REGEX', '/(?:searchForThis|\G(?<!^)).*?(\b[a-zA-Z]+\b)/iu');

Output :

if (preg_match_all(CODE_REGEX, $content, $result))
    return trim($result[1][0].' '.$result[1][1].' '.$result[1][2].' '.$result[1][3].' '.$result[1][4].' '.$result[1][5]);
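
The return line above reads six fixed indexes, which raises undefined-offset notices whenever fewer than six words are captured. A minimal sketch of a more tolerant version follows; the function name `firstWords` and the `$limit` parameter are hypothetical and only there for illustration:

```php
<?php
// Same pattern as CODE_REGEX in the question.
define('CODE_REGEX', '/(?:searchForThis|\G(?<!^)).*?(\b[a-zA-Z]+\b)/iu');

// Hypothetical helper: joins at most $limit captured words with single spaces.
function firstWords(string $content, int $limit = 6): string
{
    if (!preg_match_all(CODE_REGEX, $content, $result)) {
        return ''; // no match at all (or a regex error)
    }
    // array_slice() returns only the entries that exist, so no index is read blindly.
    return implode(' ', array_slice($result[1], 0, $limit));
}

echo firstWords("searchForThis 12 word 34 worldtwo"), "\n"; // word worldtwo
```

Since `implode()` adds no leading or trailing separator, the `trim()` call is no longer needed.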

Thank you

amorino

2 Answers


You can use this pattern instead:

(?:\A[\s\S]*?searchForThis|\G).*?(\b[a-z]+\b)/iu

or

(?:\A(?s).*?searchForThis|\G)(?-s).*?(\b[a-z]+\b)/iu

To handle several lines between the first "searchForThis" and the next one (or the end of the string), you can use this pattern (with your example string you will obtain "After" and "this"):

(?:\A.*?searchForThis|\G)(?>[^a-z]++|\b[a-z]++\S)*?(?!searchForThis)(\b[a-z]+\b)/ius

Note: in all three patterns you can replace \A with ^, since multiline mode is not used. Be careful with Rubular, which is designed for Ruby regexes: the m flag in Ruby corresponds to the s flag in PHP (dotall/singleline mode), whereas m in PHP is multiline mode (^ can match at the start of each line).
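
To illustrate the note about \A, ^ and the m flag, here is a small sketch. The sample text is made up (the original test string is only on the linked Rubular page), and the variable names are just for the demo:

```php
<?php
// Hypothetical sample text with two lines that contain the marker.
$text = "intro line\n"
      . "searchForThis 12 word 34 worldtwo 56 wordfive\n"
      . "other 78 line\n"
      . "searchForThis 9 again";

// First pattern from this answer: \A only matches at the very start of the subject,
// so the first alternative can succeed once; later matches must continue from \G.
preg_match_all('/(?:\A[\s\S]*?searchForThis|\G).*?(\b[a-z]+\b)/iu', $text, $m);
$withA = $m[1];

// Same pattern with ^ AND the m flag: ^ now also matches after each newline,
// so the first alternative can restart on a later line.
preg_match_all('/(?:^[\s\S]*?searchForThis|\G).*?(\b[a-z]+\b)/ium', $text, $m);
$withCaretMultiline = $m[1];

echo implode(', ', $withA), "\n";              // word, worldtwo, wordfive
echo implode(', ', $withCaretMultiline), "\n"; // word, worldtwo, wordfive, again
```

So ^ is only interchangeable with \A here as long as the m flag stays off.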

Casimir et Hippolyte
  • Thank youuuuuuuuu a lot I was searching for this for 7 days and nights Thank you thank you Casimir et Hippolyte – amorino Dec 06 '13 at 14:31
  • Great regex, but it won't work if we've got another useless line before. – zessx Dec 06 '13 at 14:31
  • Does `\A[\s\S]*?` tell it to search only on that line? – amorino Dec 06 '13 at 14:32
  • No, it means `Start of the input string, followed by any char, any number of times`. Note that `\A` differs from `^` in that it matches the start of the **input string**, not of the **line**. – zessx Dec 06 '13 at 14:36
  • @zessx: Yes however, you can use `^` instead, since the multiline mode is not set, it has the same meaning. – Casimir et Hippolyte Dec 06 '13 at 14:44
  • Interesting... @amorino You can use `(?s:^.*?searchForThis|\G).*?(?<words>\b[a-z]+\b)/iu` to get all words in `$results['words']` – zessx Dec 06 '13 at 14:48
  • @Casimir et Hippolyte and @zessx Thank you a lot for all that precious help. Solution 1 worked like a charm; 2 and 3 did not. I added zessx's suggestion, so: `(?:\A[\s\S]*?searchForThis|\G).*?(?<words>\b[a-z]+\b)/iu` and `$results['words']`. You saved my life ;) – amorino Dec 06 '13 at 15:03
  • @amorino: Test 2 and 3 directly in php, or with http://regex.larsolavtorvik.com . Rubular is for Ruby – Casimir et Hippolyte Dec 06 '13 at 15:07
  • @Casimir et Hippolyte directly in php and in the rubular site – amorino Dec 06 '13 at 15:11

You can do it in two stages:

// get the first line with 'searchForThis'
preg_match('/searchForThis(?<line>.*)\n/m', $text, $results);
$line = $results['line'];

// get every word from this line
preg_match_all('/\b[a-z]+\b/i', $line, $results);
$words = $results[0];

Another way, based on Casimir's great answer (just for readability):

preg_match_all('/(?s:^.*?searchForThis|\G).*?(?<words>\b[a-z]+\b)/iu', $str, $results);
$words = $results['words'];
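
The two-stage approach can also be wrapped in a function. This is only a sketch under a few assumptions: the name `wordsAfterMarker` is hypothetical, the marker is matched case-sensitively, and the trailing `\n` is dropped from the first pattern so that a marker on the last line (with no newline after it) is still found:

```php
<?php
// Hypothetical helper: returns the words that follow the first occurrence
// of $marker, up to the end of that line.
function wordsAfterMarker(string $text, string $marker): array
{
    // preg_quote() escapes regex metacharacters in the marker.
    // Without the s flag, "." does not match a newline, so (?<line>.*) stops at the line end.
    $pattern = '/' . preg_quote($marker, '/') . '(?<line>.*)/';
    if (!preg_match($pattern, $text, $found)) {
        return []; // marker not present
    }
    // Collect every purely alphabetic word on the rest of that line.
    preg_match_all('/\b[a-z]+\b/i', $found['line'], $words);
    return $words[0];
}

$sample = "intro line\nsearchForThis 12 word 34 worldtwo 56 wordfive\nsearchForThis 9 again";
print_r(wordsAfterMarker($sample, 'searchForThis')); // word, worldtwo, wordfive
```

Returning an empty array when the marker is missing avoids reading `$results['line']` from a failed match.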
zessx