You can parse your weird input data with a series of nested regular expressions, using RegExp.exec() and one String.split().
const data = "[{Date: 2002,Entity Taking Action: Maryland Board ,Action Taken: Reprimand},{Date: 2002,Entity Taking Action: Massachusetts Board,Action Taken: Consent order because of Maryland Board},{Date: 2007,Entity Taking Action: North Carolina Medical Board,Action Taken: Consent order because of Maryland Board action},{Date: 2013,Entity Taking Action: NC Medical Board,Action Taken: Letter of concern for not reporting previous NC consent order on reactivation of NC licence}]"
const array = []
console.log('data', data)

// Outermost pattern: capture everything between the enclosing [ and ].
const regex1 = RegExp(/^\[(.*?)\]$/, 'g')
while (true) {
  const rr1 = regex1.exec(data)
  if (!rr1) break
  const r1 = rr1[1]

  // Middle pattern: capture the text inside each {...} item.
  const regex2 = RegExp(/{(.+?)},?/, 'g')
  while (true) {
    // Search the captured group r1, not the match array rr1.
    // Passing rr1 coerces it to "fullMatch,group1" and yields every item twice.
    const rr2 = regex2.exec(r1)
    if (!rr2) break
    const r2 = rr2[1]
    const item = {}
    console.log('item', r2)

    // Split the item into its Tag: value pairs.
    const splits = r2.split(',')
    for (const tagValue of splits) {
      console.log('tagValue', tagValue)
      // Innermost pattern: separate the tag from its value.
      const regex4 = RegExp(/ *(.+): *(.+) */, 'g')
      while (true) {
        const rr4 = regex4.exec(tagValue)
        if (!rr4) break
        const tag = rr4[1]
        const val = rr4[2]
        item[tag] = val
        console.log('field', tag, val)
      }
    }
    array.push(item)
    console.log('item', item)
  }
}
console.log(JSON.stringify(array))
The outermost RegExp(/^\[(.*?)\]$/, 'g') removes the [] delimiters.
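If you want to see that outer step on its own, here is a minimal sketch. The shortened `sample` string and the variable names are made up for illustration; only the pattern comes from the code above.

```javascript
// Hypothetical shortened input with the same shape as the real data.
const sample = "[{Date: 2002,Action Taken: Reprimand},{Date: 2007,Action Taken: Consent order}]"

// ^\[ and \]$ anchor on the outer brackets; group 1 is everything between them.
const outer = /^\[(.*?)\]$/.exec(sample)
const inner = outer ? outer[1] : null
console.log(inner)
// {Date: 2002,Action Taken: Reprimand},{Date: 2007,Action Taken: Consent order}
```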
The next one, RegExp(/{(.+?)},?/, 'g'), splits up your {some thing},{another thing},{yet another} data into some thing, another thing, and yet another, removing the {} curly braces as it does so.
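Here is a small sketch of that middle step in isolation, using the placeholder text from the sentence above as input. The names `body`, `itemRegex`, and `items` are illustrative. It relies on the g flag, which makes each exec() call resume where the previous match ended.

```javascript
// Hypothetical bracket-free input, as the outer step would produce it.
const body = "{some thing},{another thing},{yet another}"

// With the g flag, repeated exec() calls walk through the string match by match.
const itemRegex = /{(.+?)},?/g
const items = []
let match
while ((match = itemRegex.exec(body)) !== null) {
  items.push(match[1]) // group 1 is the text between the braces
}
console.log(JSON.stringify(items))
// ["some thing","another thing","yet another"]
```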
The String.split() call turns your Tag: value, Tag: value sequence into individual Tag: value items.
And the innermost RegExp(/ *(.+): *(.+) */, 'g') turns those into tag and val items.
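The last two steps can be tried together on a single item. This is only a sketch: `itemText` is the first record from your data, the variable names are mine, and the `trim()` call is an addition of mine, because the greedy second group otherwise keeps the trailing space in "Maryland Board ".

```javascript
// One item, as captured by the middle regex (first record of the sample data).
const itemText = "Date: 2002,Entity Taking Action: Maryland Board ,Action Taken: Reprimand"

const record = {}
for (const pair of itemText.split(',')) {
  // Group 1 is the tag before the colon, group 2 the value after it.
  const m = / *(.+): *(.+) */.exec(pair)
  // trim() is an addition here: it drops the trailing space the greedy group keeps.
  if (m) record[m[1]] = m[2].trim()
}
console.log(JSON.stringify(record))
// {"Date":"2002","Entity Taking Action":"Maryland Board","Action Taken":"Reprimand"}
```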
You know, it's said that if you solve a problem with a regular expression, you then have two problems. This solution uses three regular expressions, so now you have four problems. The point is that regular-expression parsing is brittle. For example, if one of your Action Taken items says "Disbarred, then thrown in federal prison", that extra comma will wreck this parser.
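To make that failure concrete, here is a sketch that runs the split and the innermost pattern over a made-up item. The date 2015 and the variable names are hypothetical. The text after the extra comma has no colon, so the pattern doesn't match it and it gets dropped without any error.

```javascript
// Hypothetical item whose value contains a comma.
const itemWithComma = "Date: 2015,Action Taken: Disbarred, then thrown in federal prison"

const parsed = {}
for (const pair of itemWithComma.split(',')) {
  const m = / *(.+): *(.+) */.exec(pair)
  // " then thrown in federal prison" has no colon, so m is null for that piece.
  if (m) parsed[m[1]] = m[2]
}
console.log(JSON.stringify(parsed))
// {"Date":"2015","Action Taken":"Disbarred"}
```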
You could write a robust parser for this stuff. But your better bet is to tell the provider of your data you need real JSON.
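For comparison, here is roughly what you'd have if the provider sent real JSON. The `json` string is my guess at how two of these records might look when properly quoted, not something your provider actually sends. A comma inside a quoted value is no longer a problem, and the built-in JSON.parse() does all the work.

```javascript
// Hypothetical: two records as the provider could send them in real JSON.
const json = '[{"Date": 2002, "Entity Taking Action": "Maryland Board", "Action Taken": "Reprimand"},' +
  '{"Date": 2015, "Action Taken": "Disbarred, then thrown in federal prison"}]'

const records = JSON.parse(json)
console.log(records.length)             // 2
console.log(records[1]["Action Taken"]) // Disbarred, then thrown in federal prison
```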