I wrote a regex:

(^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) (\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)

to match a string like:

141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233

and I am running it with PHP's preg_match.

When I remove the first part, 141.243.1.172, from the string, preg_match returns:

array(6
 0  =>  [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
 1  =>  // correctly empty
 2  =>  [29:23:53:25]
 3  =>  "GET /Software.html HTTP/1.0"
 4  =>  200
 5  =>  233
 )

Here index 1 is correctly empty. But if I remove [29:23:53:25] from the string instead, preg_match gives me an empty array. How can I get the same kind of result as above, with only the corresponding index empty rather than the whole array?

Stefano Maglione

2 Answers

Removing the first part works because of the .*, which can also match an empty string. To be able to remove the second part as well, you could make both groups optional and the first one non-greedy. Also move the space into the second group, so that it becomes optional together with the timestamp.

Note that you don't have to escape the double quote, and that the quantifier {1} is superfluous and can be omitted.
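As a quick check of these two points, the sketch below compares the pattern from the question with a simplified version that drops {1} and the backslashes before the quotes. The variable names are only illustrative; on the sample line both patterns should capture the same groups.

```php
<?php
// Sample line from the question.
$line = '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233';

// Pattern as written in the question: {1} quantifiers and escaped quotes.
$original = '/(^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) (\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)/';

// Same pattern without {1} and without escaping the double quotes.
$simplified = '/(^.*)(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\]) (".+") ([0-9]+) ([0-9-]+)/';

preg_match($original, $line, $a);
preg_match($simplified, $line, $b);

// Both arrays hold the full match plus the same five groups.
var_dump($a === $b); // bool(true)
```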

Only a single pair of double quotes follows the first match, but to prevent possible over-matching you could make that match non-greedy as well, or use a negated character class ("[^"]+") instead, which also avoids unnecessary backtracking.

(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?(".+?") ([0-9]+) ([0-9-]+)

For example

$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '"GET /Software.html HTTP/1.0" 200 233'
];

$pattern = '/(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?(".+?") ([0-9]+) ([0-9-]+)/';

foreach ($strings as $string) {
    preg_match($pattern, $string, $matches);
    print_r($matches);
}

Result

Array
(
    [0] => 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => 141.243.1.172 
    [2] => [29:23:53:25] 
    [3] => "GET /Software.html HTTP/1.0"
    [4] => 200
    [5] => 233
)
Array
(
    [0] => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => 
    [2] => [29:23:53:25] 
    [3] => "GET /Software.html HTTP/1.0"
    [4] => 200
    [5] => 233
)
Array
(
    [0] => "GET /Software.html HTTP/1.0" 200 233
    [1] => 
    [2] => 
    [3] => "GET /Software.html HTTP/1.0"
    [4] => 200
    [5] => 233
)
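The case from the question, where the timestamp is removed but the address is kept, is not in the list above. Here is a small sketch of that case, using the negated character class variant mentioned earlier. The variable names are illustrative, and the values in the comments follow from the lazy first group expanding up to the opening quote.

```php
<?php
// Timestamp removed, address kept (the case asked about in the question).
$line = '141.243.1.172 "GET /Software.html HTTP/1.0" 200 233';

// Same pattern as above, with "[^"]+" in place of ".+?".
$pattern = '/(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?("[^"]+") ([0-9]+) ([0-9-]+)/';

if (preg_match($pattern, $line, $matches)) {
    // Groups 1 and 2 include a trailing space when they match; rtrim() removes it.
    $ip = rtrim($matches[1]);        // "141.243.1.172"
    $timestamp = rtrim($matches[2]); // "" because the timestamp group did not take part
    print_r($matches);
}
```

Only index 2 comes back empty here, while the address, request, status and size are still captured.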

The fourth bird

Change the regex to this:

((^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) )?(\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)

For 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233

the result would be:

Array
(
    [0] => 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => 141.243.1.172 [29:23:53:25] 
    [2] => 141.243.1.172
    [3] => [29:23:53:25]
    [4] => "GET /Software.html HTTP/1.0"
    [5] => 200
    [6] => 233
)

And for [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233

the result would be:

Array
(
    [0] => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => [29:23:53:25] 
    [2] => 
    [3] => [29:23:53:25]
    [4] => "GET /Software.html HTTP/1.0"
    [5] => 200
    [6] => 233
)
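
To reproduce these results, a minimal PHP sketch along the following lines could be used (the variable names are illustrative). Two details are worth checking when running it. Group 2 keeps the space that precedes the bracket, which is easy to miss in printed output. And because ^ sits inside the optional group, a line that keeps the address but drops the timestamp still matches, but only from the opening quote onward, so the address does not end up in any group.

```php
<?php
$pattern = '/((^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) )?(\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)/';

$lines = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    // Address kept, timestamp removed: the optional outer group is skipped.
    '141.243.1.172 "GET /Software.html HTTP/1.0" 200 233',
];

$results = [];
foreach ($lines as $line) {
    // preg_match() overwrites $matches on every call.
    preg_match($pattern, $line, $matches);
    $results[] = $matches;
    print_r($matches);
}
```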
codegames