10

I currently have JSON in the format below. Some of the keys are not properly formatted: they are missing their double quotes (").

How do I add double quotes to these keys?

    {
        Name: "test",
        Address: "xyz",
        "Age": 40,
        "Info": "test"
    }

Required:

    {
        "Name": "test",
        "Address": "xyz",
        "Age": 40,
        "Info": "test"
    }

Using the post below, I was able to find such keys in the invalid JSON above. However, I could not find an efficient way to replace the matches with double-quoted versions.

import re

s = "Example: String"
out = re.findall(r'\w+:', s)

How to Escape Double Quote inside JSON

Georgy
mssqlsense

5 Answers

12

Using Regex:

import re
data = """{ Name: "test", Address: "xyz"}"""
print(re.sub(r"(\w+):", r'"\1":', data))

Output:

{ "Name": "test", "Address": "xyz"}
Rakesh
  • This regex approach is very unsafe. It will alter values that happen to contain colons. See my answer for a safe solution. – Inigo May 04 '20 at 19:49
  • In JavaScript: `const result = jsonWithoutDoubleQuotes.replace(/((?=\D)\w+):/gm, '"$1":');` – Matthis Kohli May 17 '22 at 09:58
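
The concern raised in the comments can be illustrated with a small sketch. It is not part of the original answer: the pattern and the names `KEY_PATTERN` and `quote_keys` are assumptions chosen for illustration. The idea is to quote a bare word only when it directly follows `{` or `,`, which is where a key would normally sit. It can still be fooled by a string value that itself contains text such as `, word:`, so treat it as a narrower heuristic rather than a parser.

```python
import json
import re

# Hypothetical helper: quote a bare identifier only when it sits in key
# position, i.e. right after '{' or ',' and right before ':'.
KEY_PATTERN = re.compile(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)')


def quote_keys(text):
    """Wrap unquoted keys in double quotes, leaving everything else as is."""
    return KEY_PATTERN.sub(r'\1"\2"\3', text)


dirty = '{ Name: "test", Note: "see: docs", "Age": 40 }'
fixed = quote_keys(dirty)
print(fixed)

# json.loads raises ValueError if the repaired text is still not valid JSON.
print(json.loads(fixed))
```

Here the value `"see: docs"` is left alone because `see:` follows a double quote rather than `{` or `,`, whereas the plain `(\w+):` pattern would rewrite it.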
7

You can use PyYAML. Since JSON is a subset of YAML, PyYAML may be able to cope with the missing quotes.

Example

import yaml

dirty_json = """
{
  key: "value",
  "key2": "value"
}
"""
# SafeLoader only builds plain Python objects (dict, list, str, ...)
print(yaml.load(dirty_json, yaml.SafeLoader))

hectorcanto
5

I ran into a few more issues with my JSON, so I thought I would share the final solution that worked for me.

import re

jsonStr = re.sub(r"((?=\D)\w+):", r'"\1":', jsonStr)
jsonStr = re.sub(r": ((?=\D)\w+)", r':"\1"', jsonStr)
  1. The first line adds the missing double quotes around keys, e.g. Name: "test"
  2. The second line adds the missing double quotes around values, e.g. "Info": test

Also, because of the (?=\D) lookahead, words that start with a digit are skipped, so the numeric parts of a timestamp that contains : (colon) are not double-quoted.
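
As a rough sketch of how the two substitutions behave together, the snippet below wraps them in a function and checks the result with `json.loads`. The function name `repair`, the sample string, and the extra negative lookahead for `true`, `false` and `null` are assumptions added for illustration, not part of the answer above. Without that lookahead, the second pattern would also turn a bare `true` into the string `"true"`.

```python
import json
import re

# Same key pattern as in the answer: a word not starting with a digit,
# immediately followed by a colon.
KEY = re.compile(r"((?=\D)\w+):")

# Value pattern from the answer, plus an assumed guard that leaves the
# JSON literals true, false and null unquoted.
VALUE = re.compile(r": (?!(?:true|false|null)\b)((?=\D)\w+)")


def repair(text):
    """Quote bare keys first, then bare word values."""
    text = KEY.sub(r'"\1":', text)
    return VALUE.sub(r':"\1"', text)


sample = '{Name: "test", "Info": test, Active: true, "Age": 40}'
repaired = repair(sample)
print(repaired)

# Parsing the result is a cheap way to confirm the repair produced valid JSON.
print(json.loads(repaired))
```

Like the original two lines, this still rewrites any word followed by a colon inside a string value, so it only suits input where values do not contain such text.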

mssqlsense
  • It works well for me, including the cases where a timestamp has a : in it, but it fails for another key-value pair in the JSON data where the value contains a : in it, e.g. address:"2600:0000:cf02:ff64:0000:0000:345e:e820". This regex converts that value to address":\"2600:0000:"cf02":4"e3f":7"c7a":2887:97"c9":40cd\" . – sgmbd Jun 09 '20 at 17:06
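
One way to avoid the failure described in this comment is to let the regex consume whole string literals and leave them untouched, quoting only bare words found outside them. The following is a minimal sketch under that assumption; `TOKEN` and `quote_bare_keys` are hypothetical names, and the approach still does not handle comments, single-quoted strings, or unquoted values.

```python
import json
import re

# First alternative: a complete double-quoted string (with escapes), which is
# matched only so that it can be skipped. Second alternative: a bare
# identifier that is followed by a colon, i.e. an unquoted key.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|(?<!\w)([A-Za-z_]\w*)(?=\s*:)')


def quote_bare_keys(text):
    """Quote unquoted keys while leaving existing string literals intact."""
    def replace(match):
        if match.group(1) is None:
            # A string literal was matched: return it unchanged.
            return match.group(0)
        return '"' + match.group(1) + '"'

    return TOKEN.sub(replace, text)


dirty = '{address:"2600:0000:cf02:ff64:0000:0000:345e:e820", time: "12:30:45"}'
fixed = quote_bare_keys(dirty)
print(fixed)
print(json.loads(fixed))
```

Because the address and the timestamp are matched as complete strings first, the colons inside them never reach the key branch of the pattern.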
5

You can use an online formatter. Most of them throw an error when the double quotes are missing, but the one below seems to handle it nicely!

JSON Formatter

Jay Shukla
0

The regex approach can be brittle. I suggest you find a library that can parse the JSON text that is missing quotes.

For example, in Kotlin 1.4, the standard way to parse a JSON string is using Json.decodeFromString. However, you can use Json { isLenient = true }.decodeFromString to relax the requirements for quotes. Here is a complete example in JUnit.

import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.json.Json
import org.junit.jupiter.api.Assertions
import org.junit.jupiter.api.Test

@Serializable
data class Widget(val x: Int, val y: String)

class JsonTest {

    @Test
    fun `Parsing Json`() {
        val w: Widget = Json.decodeFromString("""{"x":123, "y":"abc"}""")
        Assertions.assertEquals(123, w.x)
        Assertions.assertEquals("abc", w.y)
    }

    @Test
    fun `Parsing Json missing quotes`() {
        // Json.decodeFromString("{x:123, y:abc}") failed to decode due to missing quotes
        val w: Widget = Json { isLenient = true }.decodeFromString("{x:123, y:abc}")
        Assertions.assertEquals(123, w.x)
        Assertions.assertEquals("abc", w.y)
    }
}
Big Pumpkin
  • Although you're correct that regex can be brittle for parsing recursive structures like JSON, this question is about Python, not Kotlin. – ggorlen Mar 24 '21 at 14:16