Assuming the file contains one valid JSON document per line, a possible code fragment for your request is:
foreach (file($logpath) as $line) {
    $entry = json_decode($line, TRUE);
    foreach ($entry as $item) {
        echo('IP: '.$item['ip'].'; prop1: '.$item['prop1']); // etc
    }
}
If the file is large, this approach no longer works because of memory limitations. You can use fopen()/fgets()/fclose() to read one line at a time and process it:
$fh = fopen($logpath, 'r');
// fgets() returns FALSE at end of file, so test its result instead of feof()
while (($line = fgets($fh)) !== FALSE) {
    $entry = json_decode($line, TRUE);
    foreach ($entry as $item) {
        echo('IP: '.$item['ip'].'; prop1: '.$item['prop1']); // etc
    }
}
fclose($fh);
But if the assumption of one valid JSON document per line is not met, neither of the code fragments above works. In that case you'll have to implement a JSON parser yourself (or find an existing one) that reads as much data from the input as it needs until it has a complete JSON document.
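Short of a full streaming parser, one cheap approximation is to keep appending lines to a buffer and try to decode the buffer after each line. The sketch below does that; the function name readJsonDocuments is made up for illustration, and it assumes that every JSON document ends at the end of a line (two documents sharing a line would not be separated). It also re-decodes the buffer after every line, so it is only reasonable when individual documents are small.

```php
<?php
/**
 * Yield each JSON document from a stream in which a document may span
 * several lines. Assumption: every document ends at the end of a line.
 *
 * @param resource $fh an open, readable stream
 */
function readJsonDocuments($fh): Generator
{
    $buffer = '';
    while (($line = fgets($fh)) !== false) {
        $buffer .= $line;
        if (trim($buffer) === '') {
            $buffer = '';           // skip blank lines between documents
            continue;
        }
        $entry = json_decode($buffer, true);
        if (json_last_error() === JSON_ERROR_NONE) {
            yield $entry;           // the buffer holds one complete document
            $buffer = '';
        }
        // otherwise the document is still incomplete: keep reading
    }
    if (trim($buffer) !== '') {
        throw new RuntimeException('Trailing data is not valid JSON');
    }
}
```

It can be used as foreach (readJsonDocuments($fh) as $entry) { ... } with $fh opened by fopen() as above. Only one document is held in memory at a time.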
Update
You say in a comment that the file does not contain one JSON document per line. This renders the code above useless. However, if the file is not large and its entire content fits in memory, there is hope: you can load the content of the file into memory, patch it into valid JSON, and then decode it.
If all the JSONs in the file look like the ones you posted in the question (i.e. an array of objects), you can try to identify the sequences of characters ] and [ (or }] and [{) separated only by whitespace characters. This is where one JSON ends (}]) and the next one begins ([{). If you insert a comma between each pair of ] and [ and wrap everything in [ and ], the result should be valid JSON that, when decoded, produces an array. Each element of that array is the array that was used to generate one of the JSONs in the input file.
Let's try to write the code:
// Get the entire content of the log file in memory in $text
$text = file_get_contents($logpath);
// Try to patch the content of the file to generate a larger JSON
$fixed = '['.preg_replace('/]\s*\[/', '],[', $text).']';
// Decode the JSON to arrays
$all = json_decode($fixed, TRUE);
// json_decode() returns NULL (not FALSE) when the text is not valid JSON
if (is_array($all)) {
    foreach ($all as $entry) {
        // $entry is one entry from the original log;
        // it was an array of objects in the source,
        // but we decoded the objects to associative arrays
        foreach ($entry as $item) {
            echo('IP: '.$item['ip'].'; prop1: '.$item['prop1']); // etc
        }
    }
}
The regexp
The regular expression used to identify the boundaries of the original JSONs, split into pieces:
]     # the ']' character, there is nothing special about it
\s    # match a whitespace character (space, tab, newline etc.)
*     # the previous sub-expression (\s) repeated zero or more times
\[    # match the '[' character; it is a special character in regexps
      # and needs to be escaped here to make it "unspecial".
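To see the pattern at work, here is a small self-contained run on two made-up log entries (the IP addresses and prop1 values are invented for illustration). One caveat worth knowing: the pattern also matches a ] followed by spaces and a [ inside a string value, and would insert a comma there. Outside strings, valid JSON never has ] directly followed by [, so only string contents are at risk.

```php
<?php
// Two made-up log entries, concatenated without a separator
$text = '[{"ip":"10.0.0.1","prop1":"a"}]' . "\n"
      . '[{"ip":"10.0.0.2","prop1":"b"}]';

// Insert a comma at each boundary and wrap everything in one outer array
$fixed = '[' . preg_replace('/]\s*\[/', '],[', $text) . ']';
// $fixed is now:
// [[{"ip":"10.0.0.1","prop1":"a"}],[{"ip":"10.0.0.2","prop1":"b"}]]

$all = json_decode($fixed, true);

foreach ($all as $entry) {
    foreach ($entry as $item) {
        echo 'IP: ' . $item['ip'] . '; prop1: ' . $item['prop1'] . "\n";
    }
}
// prints:
// IP: 10.0.0.1; prop1: a
// IP: 10.0.0.2; prop1: b
```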