I'm trying to format the following file:
[30-05-2013 15:45:54] A A
[26-06-2013 14:44:44] B A
[26-06-2013 14:44:44] C A
[26-06-2013 14:43:16] Some lines are so large, they take multiple lines, so explode('\n') won't work because
I need the complete message
[26-06-2013 14:44:44] E A
[26-06-2013 14:44:44] F A
[26-06-2013 14:44:44] G A
Expected output:
Array
(
[0] => [30-05-2013 15:45:54] A A
[1] => [26-06-2013 14:44:44] B A
[2] => [26-06-2013 14:44:44] C A
[3] => [26-06-2013 14:43:16] Some lines are so large, they take multiple lines, so
explode('\n') won't work because
I need the complete message
[4] => [26-06-2013 14:44:44] E A
...
)
Based on "How do I include the split delimiter in results for preg_split()?", I tried to use a positive lookbehind to keep the timestamps and came up with the following regex (tested on Regex101):
(?<=\[)(.+)(?<=\])(.+)
This is used in the following PHP code:
#!/usr/bin/env php
<?php
class Chat {
    function __construct() {
        // Read chat file
        $this->f = file_get_contents(__DIR__ . '/testchat.txt');

        // Split on '[\d]'
        $r = "/(?<=\[)(.+)(?<=\])(.+)/";
        $l = preg_split($r, $this->f, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

        var_dump(count($l));
        var_dump($l);
    }
}

$c = new Chat();
This gives me the following output:
array(22) {
[0]=>
string(1) "["
[1]=>
string(20) "30-05-2013 15:45:54]"
[2]=>
string(4) " A A"
[3]=>
string(2) "
["
[4]=>
string(20) "26-06-2013 14:44:44]"
[5]=>
string(4) " B A"
[6]=>
string(2) "
["
[7]=>
string(20) "26-06-2013 14:44:44]"
[8]=>
string(4) " C A"
[9]=>
string(2) "
["
[10]=>
string(20) "26-06-2013 14:43:16]"
[11]=>
string(87) " Some lines are so large, they take multiple lines, so explode('\n') won't work because"
[12]=>
string(30) "
I need the complete message
["
Questions
- Why is the first [ being ignored?
- How should I change the regex to get the desired output?
- Why are there still empty strings with PREG_SPLIT_NO_EMPTY?
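For reference, here is a minimal sketch of the direction this seems to point in. It assumes every message starts at the beginning of a line with a timestamp in exactly the format shown above. The function name splitChatMessages is hypothetical. Instead of a lookbehind with capture groups, it splits on a zero-width lookahead, so nothing is consumed as a delimiter and each timestamp stays attached to its message. It may well need adjusting for the real file.

```php
<?php
// Hypothetical helper (the name is illustrative only): split a chat log into
// messages, where a message starts with "[dd-mm-yyyy hh:mm:ss]" at the
// beginning of a line and may continue over several lines.
function splitChatMessages(string $text): array
{
    // ^ with the m modifier matches at every line start; the lookahead (?=...)
    // only checks that a timestamp follows and consumes no characters.
    $pattern = '/^(?=\[\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\])/m';

    // PREG_SPLIT_NO_EMPTY drops the empty piece before the first timestamp.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

    // Each piece still ends with its line break; strip trailing whitespace.
    return array_map('rtrim', $parts);
}
```

Calling print_r(splitChatMessages(file_get_contents(__DIR__ . '/testchat.txt'))) should then list one array entry per message, with continuation lines kept inside the entry they belong to.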