I am using PHP to scrape a webpage, and I get this string:

'[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]'

which is not valid JSON, the key names are encapsulated ...

I use preg_replace to create valid json:

$x = '[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]';
$j = preg_replace('/(\w+)\s{0,1}:/', '"\1":', $x);

and get this value:

'[{"endTime":"2019-06-"05T17":"15":00.000+"10":00","startTime":"2019-06-"05T17":"00":00.000+"10":00"}]'

but I want this value:

'[{"endTime":"2019-06-05T17:15:00.000+10:00","startTime":"2019-06-05T17:00:00.000+10:00"}]'

How do I solve this problem?

Emma
David Bray

2 Answers

RegEx 1

Your original expression is close. It fails because (\w+)\s{0,1}: also matches the digits in front of the colons inside the timestamp values. We can modify it slightly to:

([{,])(\w+)(\s+)?:

This should work for the sample input. We are adding a left boundary, so a key is matched only when it directly follows { or , :

([{,])

and a right boundary:

:

and the key name is captured in this group:

(\w+)
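
To see what the left boundary changes, here is a minimal sketch that lists what each pattern captures on the sample string from the question. The variable names $loose and $anchored are only illustrative:

```php
<?php
// Sample input from the question.
$x = '[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]';

// Pattern from the question: any word followed by a colon,
// which includes parts of the timestamp values.
preg_match_all('/(\w+)\s{0,1}:/', $x, $loose);

// Pattern with the left boundary: a word only counts as a key
// when it directly follows "{" or ",".
preg_match_all('/([{,])(\w+)(\s+)?:/', $x, $anchored);

print_r($loose[1]);    // endTime, 05T17, 15, 10, startTime, 05T17, 00, 10
print_r($anchored[2]); // endTime, startTime
```

The first list shows the timestamp fragments that end up wrongly quoted in the question's output; the second contains only the two key names.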

RegEx 2

We can expand our first expression to:

([{,])(\s+)?(\w+)(\s+)?:

in case there are spaces before the key name:

Demo

Test 1

$re = '/([{,])(\w+)(\s+)?:/m';
$x = '[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]';
$subst = '$1"$2":';

$result = preg_replace($re, $subst, $x);

echo $result;

Test 2

$re = '/([{,])(\s+)?(\w+)(\s+)?:/m';
$x = '[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]';
$subst = '$1"$3":';

$result = preg_replace($re, $subst, $x);

echo $result;

Output

[{"endTime":"2019-06-05T17:15:00.000+10:00","startTime":"2019-06-05T17:00:00.000+10:00"}]
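
As a quick sanity check, the result can be passed to json_decode to confirm that it now parses. This is a sketch that assumes the scraped string always has the flat shape shown in the question; the error handling is just one reasonable choice:

```php
<?php
$x = '[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]';

// Quote the key names using the second expression from above.
$json = preg_replace('/([{,])(\s+)?(\w+)(\s+)?:/', '$1"$3":', $x);

// Decode into associative arrays and fail loudly if the string is still invalid.
$data = json_decode($json, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

echo $data[0]['endTime'], PHP_EOL;   // 2019-06-05T17:15:00.000+10:00
echo $data[0]['startTime'], PHP_EOL; // 2019-06-05T17:00:00.000+10:00
```

Note that this approach may still break on inputs where a value itself contains a comma or brace followed by a word and a colon, so checking the decode result is worthwhile.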

Emma

Use this pattern:

([{,])([^:]+):

It matches any text that follows { or , and runs up to the next colon.

Then use this as the replacement:

$1"$2":

This adds a double quote on each side of the key name.
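
Applied with preg_replace, the pattern and replacement above might be used like this. It is a minimal sketch on the sample string from the question:

```php
<?php
$x = '[{endTime:"2019-06-05T17:15:00.000+10:00",startTime:"2019-06-05T17:00:00.000+10:00"}]';

// $1 is the "{" or "," in front of the key, $2 is everything up to the next colon.
$result = preg_replace('/([{,])([^:]+):/', '$1"$2":', $x);

echo $result, PHP_EOL;
// [{"endTime":"2019-06-05T17:15:00.000+10:00","startTime":"2019-06-05T17:00:00.000+10:00"}]
```

Because [^:]+ accepts any character except a colon, whitespace around a key or a comma inside a value would also be captured, so this variant is best suited to input as compact as the sample.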

Hamed Ghasempour