0

I've written a function to take excel columns and put them into a dictionary for outputting into json. When I write the json output the key values appear to be randomly ordered. Is there a way I can enforce the key values to be in the order that I add them?

def excel_to_json(filename):
    wb = xlrd.open_workbook(filename)
    sh = wb.sheet_by_index(0)

    # List to hold dictionaries
    c_list = []

    # Iterate through each row in worksheet and fetch values into dict
    for rownum in range(1, sh.nrows):
        cars = OrderedDict()
        row_values = sh.row_values(rownum)
        cars['Name'] = row_values[0]
        cars['Extensions'] = row_values[1]
        cars['Patterns'] = row_values[2]
        cars['RansomNoteFilenames'] = row_values[3]
        cars['Comment'] = row_values[4]
        cars['EncryptionAlgorithm'] = row_values[5]
        cars['AlternateNames'] = row_values[6]
        cars['Decryptor'] = row_values[7]
        cars['AdditionalInfo1'] = row_values[8]
        cars['AdditionalInfo2'] = row_values[9]
        cars['Screenshots'] = row_values[10]

        c_list.append(cars)

    # Serialize the list of dicts to JSON
    return formatJson(json.dumps(c_list))


def formatJson(input):
    return json.dumps(json.loads(input), indent=4)

An example json output looks like the following:

 {
        "Name": "dummydata",
        "Comment": "",
        "AlternateNames": "",
        "Screenshots": "dummydata",
        "RansomNoteFilenames": "dummydata",
        "Decryptor": "",
        "Extensions": "dummydata",
        "AdditionalInfo2": "",
        "Patterns": "",
        "EncryptionAlgorithm": "dummydata",
        "AdditionalInfo1": "dummydata",
    },

As you can see the keys here are in a different order to the for loop within excel_to_json() which makes readability of the json difficult if processed manually.

Michael A
  • 9,480
  • 22
  • 70
  • 114

2 Answers2

1

Do this

def formatJson(inp):
    return json.dumps(inp, indent=4)

And call it like

formatJson(c_list)

You are dumping and loading it again an extra time.

Also a slight improvement to your code

for rownum in range(1, sh.nrows):
        cars = OrderedDict()
        row_values = sh.row_values(rownum)
        row_names = ['Name','Extensions','Patterns','Comment',....]
        for i in range(len(row_values)):
            cars[row_names[i]] = row_values[i]

        c_list.append(cars)
Abhishek J
  • 2,386
  • 2
  • 21
  • 22
1

This depends on whether the json serializer you use actually honors the order within the OrderedDict. You could try a different one (e.g. simplejson) and see if it works. They are all API compatible.

Also, most json.dumps implementations have a sort_keys parameter which will sort all keys alphabetically. This is useful when you want to have a stable representation of your data, for e.g. diffing, etc.

p.s.

Also, your code can be made much more simple if you use the zip builtin.

columns = ['Name', 'Extension', 'Patterns']
row = ['Foo', 'Bar', 'Baz']
OrderedDict(zip(columns, row))

gives

OrderedDict([('Name', 'Foo'), ('Extension', 'Bar'), ('Patterns', 'Baz')])
pi.
  • 21,112
  • 8
  • 38
  • 59