How to parse a JSON/dict-like value from an Excel sheet and copy over to another column?

Question

I have an Excel sheet, where data is entered in this format:

{"id":"03CD2B3C22558D83","imp":6667, floor:7777}

How can I in python read and extract a certain value of an attribute in Excel, and copy this on another column? For example, I want to obtain the value 6667.

Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. — Community, Mar 24 '22 at 18:56
Read [How to read data form excel sheets in Python](https://www.sitepoint.com/using-python-parse-spreadsheet-data/) and [How to write data to a specific cell in Python](https://stackoverflow.com/questions/49681392/python-pandas-how-to-write-in-a-specific-column-in-an-excel-sheet) — Alias Cartellano, Mar 24 '22 at 19:01
Is that entire JSON/dict-like value entered in one cell? I'm assuming `floor` is a typo, and it's actually a string `"floor"` like `"id"` and `"imp"` — Gino Mempin, Mar 25 '22 at 12:17

score 1 · Answer 1 · answered Mar 24 '22 at 20:32

1

This works, simple and no error checking:

MID(A1,FIND("imp",A1,1)+5,4)

answered Mar 24 '22 at 20:32

Solar Mike

7,156
4
17
32

I think the question was asking for a Python solution: "*how i can in python to read and extract certain the value*". It was also tagged with python. – Gino Mempin Mar 24 '22 at 23:13
1

@GinoMempin if you look it is **also** tagged with excel, so I was being generous. – Solar Mike Mar 25 '22 at 04:41
@GinoMempin and given the poor use of English it is difficult to be absolutely sure - as pointed out in another comment. – Solar Mike Mar 25 '22 at 04:44
@GinoMempin and some other user may find the excel answer useful in the future - part of the reason people post on here, or don’t you understand that? – Solar Mike Mar 25 '22 at 05:04
@GinoMempin So where is your python solution? At least I was positive with my post… – Solar Mike Mar 25 '22 at 05:05

Gino Mempin · Answer 2 · 2022-04-25T08:29:16.747

You can use openpyxl to work with Excel files.
The openpyxl tutorial should help you get started on how to use it.

I'll assume this kind of value:

{"id":"03CD2B3C22558D83","imp":6667, floor:7777}

is stored on each cell, like this:

Using openpyxl, you first read the value of each cell as a string.

from openpyxl import load_workbook

# Assume Excel file named "test.xlsx"
# Assume sheet named "Sheet1"
wb = load_workbook(filename="test.xlsx")
ws = wb["Sheet1"]

# Get the value of cell A1 - A5
# In openpyxl, rows/columns start at index 1
for row_idx in range(1, 6):
    cell = ws.cell(row=row_idx, column=1)
    print(type(cell.value), cell.value)

$ python test.py
<class 'str'> {"id":"03CD2B3C22558D83","imp":6667, floor:7777}
<class 'str'> {"id":"1111111111111111","imp":1111, floor:1000}
<class 'str'> {"id":"2222222222222222","imp":2222, floor:2000}
<class 'str'> {"id":"3333333333333333","imp":3333, floor:3000}
<class 'str'> {"id":"4444444444444444","imp":4444, floor:4000

Then, once you can access each cell value, convert it to a regular Python dictionary.

Now, normally you can use json module's .loads to convert. But the problem is that the JSON string is malformed, it's missing double-quotes "" around floor. You can check on https://jsonlint.com/:

It would have worked nicely if floor was wrapped in double-quotes:

>>> import json
>>> s = '{"id":"03CD2B3C22558D83","imp":6667, "floor":7777}'
>>> d = json.loads(s)
>>> d["imp"]
6667

If you have no control over how the data was written to the Excel file, check the workarounds for parsing a somewhat wrong JSON with Python. A quick workaround I usually use is the dirtyjson library:

>>> import dirtyjson
>>> s = '{"id":"03CD2B3C22558D83","imp":6667, floor:7777}'
>>> d = dirtyjson.loads(s)
>>> d
AttributedDict([('id', '03CD2B3C22558D83'), ('imp', 6667), ('floor', 7777)])
>>> d = dict(d)
>>> d
{'id': '03CD2B3C22558D83', 'imp': 6667, 'floor': 7777}
>>> d["imp"]
6667

So, assuming there are no typos in the example data, let's use dirtyjson to convert to a dict, then access the imp value:

import dirtyjson
from openpyxl import load_workbook

# Assume Excel file named "test.xlsx"
# Assume sheet named "Sheet1"
wb = load_workbook(filename="test.xlsx")
ws = wb["Sheet1"]

# For each value of cell A1 - A5
#   Convert the value to a dictionary
for row_idx in range(1, 6):  # In openpyxl, rows/columns start at index 1
    cell = ws.cell(row=row_idx, column=1)
    data = dict(dirtyjson.loads(cell.value))
    print(data, data["imp"])

$ python test.py
{'id': '03CD2B3C22558D83', 'imp': 6667, 'floor': 7777} 6667
{'id': '1111111111111111', 'imp': 1111, 'floor': 1000} 1111
{'id': '2222222222222222', 'imp': 2222, 'floor': 2000} 2222
{'id': '3333333333333333', 'imp': 3333, 'floor': 3000} 3333
{'id': '4444444444444444', 'imp': 4444, 'floor': 4000} 4444

Finally, simply write the value to another column. Let's assume, you want to copy it over to the B column on the same row. You can pick any other column or cell, just remember that openpyxl processes cell rows and columns with 1-index notation, so for the B column, that would be column=2.

import dirtyjson
from openpyxl import load_workbook

# Assume Excel file named "test.xlsx"
# Assume sheet named "Sheet1"
wb = load_workbook(filename="test.xlsx")
ws = wb["Sheet1"]

# For each value of cell A1 - A5
#   Convert the value to a dictionary
#   Write the "imp" value to the B column
for row_idx in range(1, 6):  # In openpyxl, rows/columns start at index 1
    cell = ws.cell(row=row_idx, column=1)
    data = dict(dirtyjson.loads(cell.value))
    ws.cell(
        row=row_idx,
        column=2,  # B
        value=data["imp"],
    )

# DO NOT FORGET TO SAVE AND CLOSE WHEN DONE!
# Also, keep the file closed when writing
wb.save(filename="test_output.xlsx")
wb.close()

Tested with

Python 3.9.10
openpyxl 3.0.9
dirtyjson 1.0.7
macOS 10.15.7 with Microsoft Excel for Mac 16.6.6

How to parse a JSON/dict-like value from an Excel sheet and copy over to another column?

2 Answers2