
I am writing a parser that goes through a list of data formatted roughly like this:

{
    "teachers": [
        {
            "fullName": "Testing",
            "class": [
                {
                    "className": "Counselor",
                    "school": {
                        "id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
                    }
                }
            ]
        },
        ...
}

The parser is supposed to check for duplicate names within this JSON object and, when it finds one, append the duplicate's class to the class array of the first entry with that name.

So for example:

{
    "teachers": [
        {
            "fullName": "Testing",
            "class": [
                {
                    "className": "Counselor",
                    "school": {
                        "id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
                    }
                }
            ]
        },
        {
            "fullName": "Testing",
            "class": [
                {
                    "className": "Math 8",
                    "school": {
                        "id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
                    }
                }
            ]
        },
        ...
}

would return:

{
    "teachers": [
        {
            "fullName": "Testing",
            "class": [
                {
                    "className": "Counselor",
                    "school": {
                        "id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
                    }
                },
                {
                    "className": "Math 8",
                    "school": {
                        "id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
                    }
                }
            ]
        },
        ...
}

My current parser works fine for most objects, but for some reason it misses some duplicates even though the names are exactly the same. It also appends the string

}7d-48d6-b0b5-3d44ce4da21c"
                    }
                }
            ]
        }
    ]

to the end of the JSON document. I am not sure why it does this, since I am only dumping the modified JSON (and only the array is modified).

My parser code is:

import json

from colorama import Fore  # assumption: Fore.RED / Fore.RESET come from colorama

i_duplicates = []
name_duplicates = []

def converter():
  global i_duplicates

  file = open("final2.json", "r+")
  infinite = json.load(file)
  for i, teacher in enumerate(infinite["teachers"]):
    class_name = teacher["class"][0]["className"]
    class_data = {
        "className": class_name,
        "school": {
          "id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
        }
      }
    d = {
      "fullName": teacher["fullName"], 
      "index": i
    }
    c = {
      "fullName": teacher["fullName"]
    }
    if c in name_duplicates:
      infinite["teachers"][search(i_duplicates, c["fullName"])]["class"].append(class_data)
      infinite["teachers"].pop(i)
      file.seek(0)
      json.dump(infinite, file, indent=4)
    else:
      i_duplicates.append(d)
      name_duplicates.append(c)

def search(a, t):
  for i in a:
    if i["fullName"] == t:
      return i["index"]
  print(Fore.RED + "not found" + Fore.RESET)

I know I am going about this inefficiently, but I am not sure how to fix the issues the current algorithm is having. Any feedback appreciated.
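One likely cause of the missed duplicates is that `infinite["teachers"].pop(i)` runs while `enumerate` is still walking the same list: every later element shifts left by one, so the entry right after a removed duplicate is never visited. A minimal sketch of one way around this, assuming the structure shown above, is to group entries by `fullName` in a dict and build a new list instead of popping. The function name `merge_duplicate_teachers` is hypothetical, and unlike the code above it copies each duplicate's class entries as they are rather than rebuilding them with a fixed school id.

```python
import json


def merge_duplicate_teachers(data):
    """Return a new dict in which teachers sharing a fullName are merged."""
    merged = {}  # fullName -> merged entry; dicts preserve insertion order
    for teacher in data["teachers"]:
        name = teacher["fullName"]
        if name in merged:
            # Duplicate name: append its classes to the first entry seen.
            merged[name]["class"].extend(teacher["class"])
        else:
            # First occurrence: copy the entry so the input is not mutated.
            merged[name] = {**teacher, "class": list(teacher["class"])}
    return {**data, "teachers": list(merged.values())}


school = {"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"}
sample = {
    "teachers": [
        {"fullName": "Testing", "class": [{"className": "Counselor", "school": school}]},
        {"fullName": "Testing", "class": [{"className": "Math 8", "school": school}]},
    ]
}
print(json.dumps(merge_duplicate_teachers(sample), indent=4))
```

Because nothing is removed from the list being iterated, consecutive duplicates are all visited, and the name lookup is a dict access rather than a scan through `name_duplicates`.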

infecting
  • Looks to me like you're only partially overwriting the file. E.g. you have a file with the contents `{"foo": 100}`, and you replace it with a file with the contents `{"foo": 2}`, and end up only rewriting the first 10 characters of the file, which would leave you with `{"foo": 2}0}`. Have you tried truncating the file after the writes? https://stackoverflow.com/questions/11469228/replace-and-overwrite-instead-of-appending/11469328 – Nick ODell Nov 04 '21 at 01:42
  • At a brief glance, your issue is probably here: `file.seek(0)`. If the new data is shorter than the previous data, some of the old data will stay there. Add a `file.truncate()` after the `json.dump`. – Nullman Nov 04 '21 at 01:42
  • @Nullman, yep, `file.truncate()` did the trick. Thank you! – infecting Nov 04 '21 at 02:16
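The fix discussed in the comments can be sketched as follows. This is only an illustration under stated assumptions: the helper name `rewrite_json` and the temporary demo file are hypothetical and not part of the original code. It rewrites a JSON file in place in `r+` mode and calls `truncate()` after the dump so that leftover bytes from the longer, older content are cut off.

```python
import json
import os
import tempfile


def rewrite_json(path, transform):
    """Load JSON from path, apply transform, and overwrite the file in place."""
    with open(path, "r+") as f:
        data = transform(json.load(f))
        f.seek(0)                      # go back to the start before writing
        json.dump(data, f, indent=4)
        f.truncate()                   # drop whatever remains of the old content


# Demo on a throwaway file whose new content is shorter than the old content.
fd, demo_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump({"foo": 100, "bar": [1, 2, 3]}, f)

rewrite_json(demo_path, lambda d: {"foo": 2})

with open(demo_path) as f:
    demo_result = json.load(f)  # parses cleanly because the tail was truncated
os.remove(demo_path)
```

An alternative with the same effect is to read the file, close it, and reopen it in `"w"` mode for writing, since `"w"` empties the file on open. Either way, dumping once after the loop finishes avoids rewriting the file for every duplicate found.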
