
I want to remove trailing commas from JSON-like input such as:

{
  "key1": "value1",
  "object": {
    "key2": "value2", // <- remove comma
  },
  "key3": "value3", // <- remove comma
}

I came up with,

tr -d '\n' | \
sed -E 's:,(\s*}):\1:g' | \
jq .

and it works but I want to get this fully in sed.

I came up with,

sed -E '/,\s*$/ { N; s:,\s*(\n\s*},?):\1: }'

which works for above input but fails for

{
  "key1": "value1",
  "object": {
    "key2": "value2",
  },
  "key3": "value3",
  "key4": "value4", // <- remove comma
}

as N reads the next line and starts over from the line after next.

// output of sed -E '/,\s*$/ { N;l }', using the l (list) command to show the pattern space
{
  "key1": "value1",\n  "object": {$
  "key1": "value1",
  "object": {
    "key2": "value2",\n  },$
    "key2": "value2",
  },
  "key3": "value3",\n  "key4": "value4",$
  "key3": "value3",
  "key4": "value4",
}
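The pairing that N produces can be mimicked outside sed to see why the last comma is missed. The following is only a sketch in Python, not sed itself, and the function name group_like_sed_N is made up for illustration: a line ending in a comma is joined with the next line, and scanning resumes after that pair.

```python
import re

def group_like_sed_N(lines):
    """Group lines the way '/,\\s*$/ { N }' pairs them (illustrative only)."""
    groups = []
    i = 0
    while i < len(lines):
        # A line ending in a comma pulls in the next line, as N does.
        if re.search(r",\s*$", lines[i]) and i + 1 < len(lines):
            groups.append((lines[i], lines[i + 1]))
            i += 2  # the next cycle starts after the pair
        else:
            groups.append((lines[i],))
            i += 1
    return groups

sample = [
    '{',
    '  "key1": "value1",',
    '  "object": {',
    '    "key2": "value2",',
    '  },',
    '  "key3": "value3",',
    '  "key4": "value4",',
    '}',
]
for group in group_like_sed_N(sample):
    print(group)
```

The "key4" line ends up in a pair with "key3", so it is never seen together with the closing } that follows it.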

Update:

Adding another example for testing:

{
  "key1": "value1",
  "object1": {
    "object2": {
      "key2": "value2"
    },
  },
  "key3": "value3",
}

Update:

This is working for whatever I've thrown at it.

sed -E -n 'H; x; s:,(\s*\n\s*}):\1:; P; ${x; p}' | \
    sed '1 d'

Explanation:

sed -E -n 'H; x; P; ${x; p}'

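-n 'H; x' appends each line to the previous one, so the pattern space always holds two consecutive lines, and P prints the first of them. The last line is simply printed with ${x; p}. The first cycle prints an empty line, which the second sed ('1 d') removes.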

and

s:,(\s*\n\s*}):\1:;

to remove the trailing comma in the pattern space.
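For comparison, the same look-at-the-next-line idea can be sketched in Python. This is an assumption-laden illustration rather than a translation of the sed command: the function name strip_trailing_commas is hypothetical, it also treats ] as a closer (as suggested in the comments), and like the sed version it only handles a comma at the end of a line and knows nothing about strings.

```python
import json
import re

def strip_trailing_commas(lines):
    """Drop a line's trailing comma when the next line starts with } or ]."""
    out = []
    for i, line in enumerate(lines):
        # Peek at the following line; the last line has no successor.
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if re.match(r"\s*[}\]]", nxt):
            line = re.sub(r",\s*$", "", line)
        out.append(line)
    return out

sample = [
    '{',
    '  "key1": "value1",',
    '  "object1": {',
    '    "object2": {',
    '      "key2": "value2"',
    '    },',
    '  },',
    '  "key3": "value3",',
    '}',
]
cleaned = "\n".join(strip_trailing_commas(sample))
print(json.dumps(json.loads(cleaned), indent=2))
```

Because every line is compared with its successor, no pair is skipped, which is the property the H; x approach provides in sed.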

hIpPy
  • Can you just fix the thing that is generating the invalid JSON instead? Because this isn't really the sort of problem you're going to solve robustly using `sed`. – larsks Nov 08 '18 at 03:15
  • There are JSON libraries for practically all languages. If you're getting invalid JSON like this, it's a sure sign that the creator rolled their own code instead of using a proper library, and they didn't know what they were doing. Tell them to fix it. – Barmar Nov 08 '18 at 03:20
  • Because if they can't even get commas right, they've probably got other problems as well, such as escaping special characters in strings. You'll drive yourself crazy trying to work around all their bugs. – Barmar Nov 08 '18 at 03:21
  • Anything you do will probably fail for something like `key: "foo,}"`, it will remove the comma that's inside the string. – Barmar Nov 08 '18 at 03:22
  • *"I want to remove trailing comma from json"* -- your input is not JSON (as [JSON](https://json.org)s do not have trailing commas). Assuming your input **is** JSON, `sed` is not the tool for handling it. Use [`jq`](https://stedolan.github.io/jq/) or a programming language. – axiac Nov 08 '18 at 05:59
  • I understand that it's not valid JSON, but it's convenient for ad hoc testing. I'm editing the JSON input for quick testing of APIs. I already have a sed command to remove `//` and wanted something to remove the trailing comma. – hIpPy Nov 13 '18 at 03:21
  • @axiac `jq` does not allow comments and trailing commas, hence this post. I'm piping the stripped output to it. – hIpPy Nov 16 '18 at 02:11
  • A little update to your regex did it for me -- thanks! -- now it works with closing } and ] as well: sed -E -n 'H; x; s:,(\s*\n\s*[]}]):\1:; P; ${x; p}' | sed '1 d' – zertyz Mar 08 '22 at 01:34
  • I've `'H; x; s:,(\s*\n\s*(}|])):\1:; P'`, and it's working for me. – hIpPy Mar 09 '22 at 08:53

7 Answers


Since the input seems to be some kind of extension of JSON, you could use a command-line tool intended for such extensions. For example:

$ hjson -j < input.txt

or:

$ any-json --input-format=hjson input.txt

Output in both cases

{
  "key1": "value1",
  "object": {
    "key2": "value2"
  },
  "key3": "value3"
}
peak

Using the hold buffer:

sed '/^ *\}/{H;x;s/\([^}]\),\n/\1\n/;b};x;/^ *}/d' input

This is just a sed exercise; I don't think sed is the right tool for this job. It also requires the file to end either with a newline or with a }.

perreal
  • I added another sample json for testing where this `sed` fails. – hIpPy Nov 13 '18 at 09:54
  • This is not working for the new sample json I added. Are you sure this works? I'm on `sed (GNU sed) 4.5` with `GNU bash, version 4.4.19(2)-release (x86_64-pc-msys)`, so I'm interested in the difference. Btw, I updated the question with an answer. – hIpPy Nov 15 '18 at 23:01
  • Can you comment on this: https://tio.run/##K05N0U3PK/3/Xz9OQSumVr/aw7rCulg/RiM6rjY2RlMnJk8/xhBIWCfVAiVAimr1U/7/r@ZSUFDKTq00VLJSUCpLzClNNVTSAYnlJ2WlJpeAhEFK4AJGcAGIPiO4PiMlsHAtSDeYAEkbw6WNgcbWAgA – perreal Nov 16 '18 at 03:56
  • Overall, the accepted answer is a better answer for the question as it is now titled, since hjson handles comments (both line and block) and trailing commas, but thanks for pointing me in the right direction with `sed`. – hIpPy Nov 18 '18 at 18:06

Not an answer with sed but a (python) solution:

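import json

# Python dict literals tolerate trailing commas, so the input can be
# pasted into the source as-is (after removing any // comments)
d = {
  "key1": "value1",
  "object": {
    "key2": "value2",
  },
  "key3": "value3",
}

print(json.dumps(d, indent=2))  # valid JSON string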
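If the input lives in a file or a string rather than in the source code, the same tolerance for trailing commas is available through ast.literal_eval. The sketch below is a hedged illustration, not part of the original answer: the function name relaxed_to_json is made up, and it assumes the text contains no // comments and no true, false, or null, since those are not Python literals.

```python
import ast
import json

def relaxed_to_json(text):
    """Parse JSON-like text with trailing commas via Python literal syntax."""
    # literal_eval only evaluates literals (dicts, lists, strings, numbers),
    # and Python's literal syntax accepts a trailing comma.
    data = ast.literal_eval(text)
    return json.dumps(data, indent=2)

text = """{
  "key1": "value1",
  "object": {
    "key2": "value2",
  },
  "key3": "value3",
}"""
print(relaxed_to_json(text))
```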
Reut Sharabani

https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json seems to be your answer

# this works
echo '{"a": 1,}' | jq -n -f /dev/stdin

# as well as this
cat <<EOF | jq -n -f /dev/stdin
{
  "key1": "value1",
  "object": {
    "key2": "value2",
  },
  "key3": "value3",
}
EOF
Heechul Ryu

I solved it by loading the JSON as YAML with Python's PyYAML library, which worked fine.

So for this example:

$ echo '{"a": 1,}' | jq
parse error: Expected another key-value pair at line 1, column 9

pyyaml fixes the input:

$ echo '{"a": 1,}' | python3 -c "import sys, json, yaml; print(json.dumps(yaml.safe_load(sys.stdin)))" | jq

{
  "a": 1
}

A more complex example:

$ echo '{"a": [1,],}' | python3 -c "import sys, json, yaml; print(json.dumps(yaml.safe_load(sys.stdin)))" | jq

{
  "a": [
    1
  ]
}
MagMax

Here is one in GNU awk. It uses " as the field separator and removes commas that precede [ \n]*} in odd-numbered fields, which lie outside quotes (it will probably fail for strings containing an escaped \"). I added "key4": "value4,}", to the file to test this:

$ cat file
{
  "key1": "value1",
  "object": {
    "key2": "value2",
  },
  "key3": "value3",
  "key4": "value4,}",
}

The script processes the whole file as a single record (RS="^$") so it might not work for big files as-is:

$ awk '
BEGIN {
    FS=OFS="\""
    RS="^$"
}
{
    for(i=1;i<=NF;i++) {                         # or i+=2 and remove the if
        if(i%2)
            $i=gensub(/,([ \n]*\})/,"\\1","g",$i)
    }
}1' file

Output:

{
  "key1": "value1",
  "object": {
    "key2": "value2"
  },
  "key3": "value3",
  "key4": "value4,}"
}
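The escaped-quote limitation mentioned above can be avoided by scanning character by character and tracking whether the scanner is inside a string. The following Python sketch shows that idea under stated assumptions: the function name strip_commas_outside_strings is hypothetical, strings use double quotes with backslash escapes as in JSON, ] is treated as a closer as well as }, and comments are assumed to be already removed.

```python
import json

def strip_commas_outside_strings(text):
    """Remove commas that precede } or ] while leaving string contents alone."""
    out = []
    in_string = False
    escaped = False
    n = len(text)
    for i, ch in enumerate(text):
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this character was escaped by a backslash
            elif ch == "\\":
                escaped = True       # the next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ",":
            # Look ahead past whitespace to the next significant character.
            j = i + 1
            while j < n and text[j].isspace():
                j += 1
            if not (j < n and text[j] in "}]"):
                out.append(ch)       # keep commas that separate values
        else:
            out.append(ch)
    return "".join(out)

sample = '{"key4": "value4,}", "q": "a \\" quote,]", "list": [1, 2,],}'
print(strip_commas_outside_strings(sample))
```

Like the awk script, this reads the whole input into memory, so it is meant for small test files rather than large ones.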
James Brown

The sed and awk commands weren't working for me, so I ended up writing a small remove-json-trailing-comma.js file:

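const { readFileSync, writeFileSync } = require("fs");

// Match a comma unless the next non-space character starts another value:
// {, [, a quote, a word character, or - (negative numbers).
// Note: a comma inside a string is also removed if it is not followed
// by one of those characters.
const regex = /,(?!\s*?[{["'\w-])/g;

// argv[0] is node and argv[1] is this script; the rest are file paths
const files = process.argv.slice(2);
for (const file of files) {
    const input = readFileSync(file).toString();
    const correct = input.replace(regex, '');
    writeFileSync(file, correct); // overwrites the file in place
}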

Then I could use it like this: node remove-json-trailing-comma.js file.json (it also works with multiple files, like node remove-json-trailing-comma.js **/package.json)

Jon