I would like to combine a positive and a negative regular expression in one preg_match() call. Is that possible?

I currently use these nested if conditions:

// Note: no trailing "|" in the patterns; an empty alternative would match every string.
if ( preg_match('/bot|crawler|spider/i', $user_agent) )
{
    if ( preg_match('/google|duckduck|yandex|baidu|yahoo/i', $user_agent) )
    {
        $post['url'] = $url;
    }
    else
    {
        // nothing
    }
}

I would like to simplify the code to a single preg_match() call, something like this:

if ( preg_match('/bot|crawler|spider|/^(?!/google|duckduck|yandex|baidu|yahoo|/).i', $user_agent) )
{
    $post['url'] = $url;
}

But it does not work that way.

  • So you're trying to match any user agent that contains `bot|crawler|spider` and does not contain `google|duckduck|yandex|baidu|yahoo` anywhere in the string? Your original PHP code is not doing that, by the way: it requires both patterns to match rather than negating the second one. – BadHorsie Sep 30 '20 at 16:56
  • Hi BadHorsie, yes, that's right. I'm trying to match user agents that contain words like "Bot" but not words like "Google". – bummerrang1 Sep 30 '20 at 16:59

1 Answer

If you want to ensure that the word 'bot' is present anywhere, and the word 'google' is not present anywhere (before or after 'bot'), you can use positive and negative lookaheads together.

if (preg_match('/^(?=.*(bot|crawler|spider))(?!.*(google|duckduck|yandex|baidu|yahoo)).*/i', $user_agent)) {
    // $user_agent contains a bot keyword and none of the listed search engine names
}
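
To check the pattern against a few strings, here is a minimal sketch. The function names `isUnlistedBot` and `isUnlistedBotTwoCalls` and the sample user agents other than the Yandex one are made up for illustration. The second function is an alternative with two separate calls and a negation, which should give the same result for single-line user agent strings and may be easier to read.

```php
<?php
// Hypothetical helper: true when the user agent contains a bot keyword
// and none of the listed search engine names (pattern from the answer above).
function isUnlistedBot(string $userAgent): bool
{
    $pattern = '/^(?=.*(bot|crawler|spider))(?!.*(google|duckduck|yandex|baidu|yahoo)).*/i';
    return preg_match($pattern, $userAgent) === 1;
}

// Alternative without lookaheads: one positive check and one negated check.
// Assumes the user agent is a single line (no newline characters).
function isUnlistedBotTwoCalls(string $userAgent): bool
{
    return preg_match('/bot|crawler|spider/i', $userAgent) === 1
        && preg_match('/google|duckduck|yandex|baidu|yahoo/i', $userAgent) !== 1;
}

$samples = [
    // Contains "bot" but also "yandex", so the negative lookahead rejects it.
    'Mozilla/5.0 (compatible; Yandexbot/2.0; +http://www.yandex.com/yandexbot.htm)',
    // Made-up crawler name: contains "crawler" and no listed engine.
    'ExampleCrawler/1.0',
    // Ordinary browser-style string: no bot keyword at all.
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
];

foreach ($samples as $ua) {
    echo $ua, ' => ', var_export(isUnlistedBot($ua), true), PHP_EOL;
}
```

Only the second sample should come out as `true`. The lookahead version keeps everything in one `preg_match()` call, as asked; the two-call version trades that for patterns that are simpler to maintain.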
  • I tested it with this user agent string: "Mozilla/5.0 (compatible; Yandexbot/2.0; +http://www.yandex.com/yandexbot.htm)", but the request is still blocked by the preg_match. The first part seems to work correctly, but there seems to be a mistake in the second part. – bummerrang1 Sep 30 '20 at 17:20
  • BadHorsie, you are my hero. Now it is working. Thanks so much for your help. – bummerrang1 Sep 30 '20 at 17:27