
I have a JSON file with the following structure:

    {
    "id": 2,
    "image_id": 2,
    "segmentation": [
        [
            913.0,
            659.5,
            895.0,
        ],
        [   
            658.5,
            875.0,
            652.5,
            659.5
        ],
    ],
    "iscrowd": 0,
    "bbox": [
        4.5,
        406.5,
        1098.0,
        1096.0
    ],
    "area": 579348.0,
    "category_id": 0
    },

Now I need to split each entry into separate entries, one per segmentation, like these:

    {
    "id": 2,
    "image_id": 2,
    "segmentation": [
        [
            658.5,
            875.0,
            652.5,
            659.5
        ],
    ],
    "iscrowd": 0,
    "bbox": [
        4.5,
        406.5,
        1098.0,
        1096.0
    ],
    "area": 579348.0,
    "category_id": 0
    },
    {
    "id": 3,
    "image_id": 2,
    "segmentation": [
        [
            913.0,
            659.5,
            895.0,
        ],
    ],
    "iscrowd": 0,
    "bbox": [
        4.5,
        406.5,
        1098.0,
        1096.0
    ],
    "area": 579348.0,
    "category_id": 0
    },

Each new entry should keep the same `image_id`, `iscrowd`, `bbox`, `area` and `category_id` as the original entry, but get a new (incremental) `id` and contain only one segmentation in its `segmentation` list. So if the original entry had 15 segmentations, the code would split it into 15 entries with unique IDs.

Any tips on how to do this? I have found some posts on how to merge entries based on a key value, but not on how to split them.

Deamoon
  • a) I'm confused about what has changed in the new structure. b) What have you tried already? c) Have you seen [how to ask](https://stackoverflow.com/help/how-to-ask)? – blueteeth Sep 05 '22 at 06:54
  • @blueteeth a) In the original file, there are multiple segmentations under one annotation ID (which refers to a single image ID). I need to split those segmentations so that each has a unique annotation ID (one segmentation under each), all referring to the original image ID. b) I haven't found anything that would help me yet. c) Yes. – Deamoon Sep 05 '22 at 07:36

2 Answers

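import json

# original_json is assumed to be the already-loaded top-level list of
# annotation dicts, e.g. the result of json.load() on the source file.
new_json = []
ids = 0  # running counter so every new entry gets a unique id

for i in original_json:
    segms = i["segmentation"]
    for j in segms:
        # Shallow copy keeps image_id, iscrowd, bbox, area and category_id.
        dummy = dict(i)
        dummy["id"] = ids
        # Wrap the single segmentation in a list to keep the original nesting.
        dummy["segmentation"] = [j]
        ids += 1
        new_json.append(dummy)

with open("new_json_file.json", "w") as f:
    json.dump(new_json, f)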

Hope this helps

iamtrappedman
  • Thanks. I have tried this, however I get a "list indices must be integers or slices, not str" error on the `segms = original_json["segmentation"]` line – Deamoon Sep 05 '22 at 11:31
  • How are you reading the original JSON file? Try `segms = original_json[0]["segmentation"]` – iamtrappedman Sep 05 '22 at 11:33
  • Using `segms = original_json[0]["segmentation"]` works for reading the original JSON, however the output JSON has the same structure; no segmentations have been split – Deamoon Sep 05 '22 at 11:48
  • So it seems that `segms = original_json[0]["segmentation"]` results in segms being only a list of floats of the first segmentation (would be `segms = [[658.5, 875.0, 652.5, 659.5]]` from my example) – Deamoon Sep 05 '22 at 12:09
  • Can you share your new output? I also haven't done anything to increment the ID, you also need to implement that. – iamtrappedman Sep 05 '22 at 12:37
  • Added as a new answer so I can share the code better: – Deamoon Sep 05 '22 at 12:48
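
The "list indices must be integers or slices, not str" error from the comments is what Python raises when a list is indexed with a string key, which suggests that the top level of the file is a JSON array of annotation objects and not a single object. A minimal sketch of that situation follows; the inline `raw` string is an illustrative, trimmed stand-in for the real file, and `segment_counts` is a name introduced only for this example.

```python
import json

# Illustrative stand-in for the file contents: a JSON array of annotations.
raw = '[{"id": 0, "image_id": 0, "segmentation": [[465.0, 1198.5], [525.0, 2424.5]]}]'
original_json = json.loads(raw)

# Indexing the list with a string key reproduces the error from the comments.
try:
    original_json["segmentation"]
except TypeError as exc:
    error_message = str(exc)

# Iterating over the list yields one annotation dict at a time,
# and each dict can then be indexed by key.
segment_counts = [len(annotation["segmentation"]) for annotation in original_json]
```

Using `original_json[0]` avoids the error but only ever looks at the first annotation, so looping over the whole list is probably what is wanted here.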

So the code provided by @iamtrappedman sort of works:

import json

test_loc = "/content/TEST.json"
with open(test_loc) as j_f:
  original_json = json.load(j_f)

  segms = original_json[0]["segmentation"]
  new_json = []

  for i in segms:
    original_json[0]["segmentation"] = i
    new_json.append(original_json)

  with open("new_json_file.json", "w") as f:
    json.dump(new_json, f,indent=4)

If I input the following JSON:

[
{
    "id": 0,
    "image_id": 0,
    "segmentation": [
        [
            465.0,
            1198.5,
            432.0,
            1190.5
        ],
        [
            525.0,
            2424.5,
            1257.0,
            2578.5
        ]
    ],
    "iscrowd": 0,
    "bbox": [
        0.5,
        407.5,
        869.0,
        791.0
    ],
    "area": 425968.25,
    "category_id": 0
}
]

I get a JSON that is split, but both entries are identical:

[
[
    {
        "area": 425968.25,
        "bbox": [
            0.5,
            407.5,
            869.0,
            791.0
        ],
        "category_id": 0,
        "id": 0,
        "image_id": 0,
        "iscrowd": 0,
        "segmentation": [
            525.0,
            2424.5,
            1257.0,
            2578.5
        ]
    }
],
[
    {
        "area": 425968.25,
        "bbox": [
            0.5,
            407.5,
            869.0,
            791.0
        ],
        "category_id": 0,
        "id": 0,
        "image_id": 0,
        "iscrowd": 0,
        "segmentation": [
            525.0,
            2424.5,
            1257.0,
            2578.5
        ]
    }
]
]
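
The identical entries are most likely an aliasing effect: `new_json.append(original_json)` stores a reference to the same object on every pass, so the last assignment to `original_json[0]["segmentation"]` is visible through every appended entry. The sketch below contrasts that with copying before modifying. The helper `sample_annotation` and the trimmed data are illustrative assumptions, not part of the original code.

```python
import copy

def sample_annotation():
    # Trimmed, illustrative version of the annotation from the post.
    return {"id": 0, "segmentation": [[465.0, 1198.5], [525.0, 2424.5]]}

# Aliased: the same dict object is appended on every pass, so both list
# entries point at one dict and show whatever was assigned last.
shared = sample_annotation()
aliased = []
for seg in list(shared["segmentation"]):
    shared["segmentation"] = seg
    aliased.append(shared)

# Copied: each pass works on its own deep copy, so the entries stay distinct
# and the source annotation is left untouched.
source = sample_annotation()
copied = []
for seg in source["segmentation"]:
    entry = copy.deepcopy(source)
    entry["segmentation"] = [seg]
    copied.append(entry)
```

Here `aliased[0] is aliased[1]` holds, while `copied` contains two separate dicts with one segmentation each.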

EDIT: Now for a JSON with two annotations:

[
{
    "id": 0,
    "image_id": 0,
    "segmentation": [
        [
            465.0,
            1198.5,
            432.0,
            1190.5
        ],
        [
            525.0,
            2424.5,
            1257.0,
            2578.5
        ]
    ],
    "iscrowd": 0,
    "bbox": [
        0.5,
        407.5,
        869.0,
        791.0
    ],
    "area": 425968.25,
    "category_id": 0
},
{
    "id": 1,
    "image_id": 2,
    "segmentation": [
        [
            4241.0,
            14.5,
            141.0,
            7557.5
        ],
        [
            578.0,
            2424.5,
            141.0,
            965.5
        ]
    ],
    "iscrowd": 0,
    "bbox": [
        0.5,
        407.5,
        869.0,
        791.0
    ],
    "area": 425968.25,
    "category_id": 0
}
]

It does not split the annotations but duplicates them:

[
[
    {
        "id": 0,
        "image_id": 0,
        "segmentation": [
            525.0,
            2424.5,
            1257.0,
            2578.5
        ],
        "iscrowd": 0,
        "bbox": [
            0.5,
            407.5,
            869.0,
            791.0
        ],
        "area": 425968.25,
        "category_id": 0
    },
    {
        "id": 1,
        "image_id": 2,
        "segmentation": [
            [
                4241.0,
                14.5,
                141.0,
                7557.5
            ],
            [
                578.0,
                2424.5,
                141.0,
                965.5
            ]
        ],
        "iscrowd": 0,
        "bbox": [
            0.5,
            407.5,
            869.0,
            791.0
        ],
        "area": 425968.25,
        "category_id": 0
    }
],
[
    {
        "id": 0,
        "image_id": 0,
        "segmentation": [
            525.0,
            2424.5,
            1257.0,
            2578.5
        ],
        "iscrowd": 0,
        "bbox": [
            0.5,
            407.5,
            869.0,
            791.0
        ],
        "area": 425968.25,
        "category_id": 0
    },
    {
        "id": 1,
        "image_id": 2,
        "segmentation": [
            [
                4241.0,
                14.5,
                141.0,
                7557.5
            ],
            [
                578.0,
                2424.5,
                141.0,
                965.5
            ]
        ],
        "iscrowd": 0,
        "bbox": [
            0.5,
            407.5,
            869.0,
            791.0
        ],
        "area": 425968.25,
        "category_id": 0
    }
]
]
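
In the two-annotation case the code above only ever touches `original_json[0]` and appends the whole list, so the second annotation is never split and the output gains an extra level of nesting. A possible way around this, sketched under the assumption that the file is a flat list of annotation dicts, is to loop over every annotation and every segmentation with one running id. The names `split_annotations` and `start_id` are hypothetical, chosen for this example only.

```python
import copy

def split_annotations(annotations, start_id=0):
    """Return one entry per segmentation, each with a fresh running id."""
    result = []
    next_id = start_id
    for annotation in annotations:
        for segment in annotation["segmentation"]:
            # Copy so that image_id, iscrowd, bbox, area and category_id
            # carry over without sharing state with the original entry.
            entry = copy.deepcopy(annotation)
            entry["id"] = next_id
            # Keep the original nesting: a list holding one segmentation.
            entry["segmentation"] = [list(segment)]
            result.append(entry)
            next_id += 1
    return result

# The two-annotation input from the post.
annotations = [
    {"id": 0, "image_id": 0,
     "segmentation": [[465.0, 1198.5, 432.0, 1190.5], [525.0, 2424.5, 1257.0, 2578.5]],
     "iscrowd": 0, "bbox": [0.5, 407.5, 869.0, 791.0], "area": 425968.25, "category_id": 0},
    {"id": 1, "image_id": 2,
     "segmentation": [[4241.0, 14.5, 141.0, 7557.5], [578.0, 2424.5, 141.0, 965.5]],
     "iscrowd": 0, "bbox": [0.5, 407.5, 869.0, 791.0], "area": 425968.25, "category_id": 0},
]
split = split_annotations(annotations)
```

With this input, `split` holds four entries with ids 0 to 3, two per original `image_id`. Note that `bbox` and `area` are copied unchanged, as requested in the question, so they still describe the whole original annotation and not the individual segmentation.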
Deamoon