I have a dict, fieldData, like this:

{
    'StreetAddress': {
        'type': 'string',
        'valueString': '4| 7 RULE CHIE N',
        'text': '4| 7 RULE CHIE N',
        'page': 1,
        'boundingBox': [
            76.0,
            342.0,
            829.0,
            342.0,
            829.0,
            382.0,
            76.0,
            382.0
        ],
        'confidence': 1.0
    },
    'Phone': {
        'type': 'string',
        'valueString': '5 4 3 | 9 7 | 0 0 1',
        'text': '5 4 3 | 9 7 | 0 0 1',
        'page': 1,
        'boundingBox': [
            77.0,
            465.0,
            648.0,
            465.0,
            648.0,
            502.0,
            77.0,
            502.0
        ],
        'confidence': 1.0
    },
    'FirstName': {
        'type': 'string',
        'valueString': 'HENRI',
        'text': 'HENRI',
        'page': 1,
        'boundingBox': [
            73.0,
            291.0,
            341.0,
            291.0,
            341.0,
            322.0,
            73.0,
            322.0
        ],
        'confidence': 1.0
    },
    'LastName': {
        'type': 'string',
        'valueString': 'AC THANICE',
        'text': 'AC THANICE',
        'page': 1,
        'boundingBox': [
            138.0,
            224.0,
            521.0,
            224.0,
            521.0,
            258.0,
            138.0,
            258.0
        ],
        'confidence': 0.986
    },
    'City': {
        'type': 'string',
        'valueString': 'MIO N REAL',
        'text': 'MIO N REAL',
        'page': 1,
        'boundingBox': [
            67.0,
            398.0,
            527.0,
            398.0,
            527.0,
            451.0,
            67.0,
            451.0
        ],
        'confidence': 0.997
    },
    'PostalCode': {
        'type': 'string',
        'valueString': 'H 3 B O A 2',
        'text': 'H 3 B O A 2',
        'page': 1,
        'boundingBox': [
            927.0,
            411.0,
            1249.0,
            411.0,
            1249.0,
            444.0,
            927.0,
            444.0
        ],
        'confidence': 1.0
    },
    'Province': {
        'type': 'string',
        'valueString': 'Q C',
        'text': 'Q C',
        'page': 1,
        'boundingBox': [
            792.0,
            410.0,
            842.0,
            410.0,
            842.0,
            436.0,
            792.0,
            436.0
        ],
        'confidence': 1.0
    },
    'FormId': {
        'type': 'string',
        'valueString': 'C1234567',
        'text': 'C1234567',
        'page': 1,
        'boundingBox': [
            972.0,
            712.0,
            1268.0,
            712.0,
            1268.0,
            766.0,
            972.0,
            766.0
        ],
        'confidence': 0.96
    },
    'DonationDate': {
        'type': 'string',
        'valueString': '2 6 0 6',
        'text': '2 6 0 6',
        'page': 1,
        'boundingBox': [
            72.0,
            165.0,
            395.0,
            165.0,
            395.0,
            198.0,
            72.0,
            198.0
        ],
        'confidence': 1.0
    },
    'Email': {
        'type': 'string',
        'valueString': 'h- lach @ gmail.com',
        'text': 'h- lach @ gmail.com',
        'page': 1,
        'boundingBox': [
            252.0,
            516.0,
            685.0,
            516.0,
            685.0,
            559.0,
            252.0,
            559.0
        ],
        'confidence': 0.57
    },
    'DonationAmount': {
        'type': 'string',
        'valueString': '600.00',
        'text': '600.00',
        'page': 1,
        'boundingBox': [
            623.0,
            162.0,
            776.0,
            162.0,
            776.0,
            191.0,
            623.0,
            191.0
        ],
        'confidence': 1.0
    },
    'UnitAddress': None
}

I need to create a dict like this:

resultsData = {'StreetAddress': '4| 7 RULE CHIE N',
 'Phone': '5 4 3 | 9 7 | 0 0 1',
 'FirstName': 'HENRI',
 'LastName': 'AC THANICE',
 'City': 'MIO N REAL',
 'PostalCode': 'H 3 B O A 2',
 'Province': 'Q C',
 'FormId': 'C1234567',
 'DonationDate': '2 6 0 6',
 'Email': 'h- lach @ gmail.com',
 'DonationAmount': '600.00'}

Each item in resultsData pairs an outer key of fieldData (e.g. 'StreetAddress', 'Phone', 'FirstName', ...) with the inner value stored under the inner key 'text' (e.g. outer_key == 'FirstName', inner_key == 'text', inner_value == 'HENRI').

I have tried a for loop, based on guidance from DataCamp on dictionaries:

resultsData = {}
for (outer_k, outer_v) in fieldData.items():
    for (inner_k, inner_v) in outer_v.items():
        if inner_k == 'text':
            resultsData.update({outer_k:inner_v})

This creates the object I am looking for (as shown above) but throws an error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-27-dbae6918239f> in <module>
      1 resultsData = {}
      2 for (outer_k, outer_v) in fieldData.items():
----> 3     for (inner_k, inner_v) in outer_v.items():
      4         if inner_k == 'text':
      5             resultsData.update({outer_k:inner_v})

AttributeError: 'NoneType' object has no attribute 'items'

I have also tried a dict comprehension, based on the pattern from "Nested dictionary comprehension python":

data = {outer_key: {inner_value: inner_key for inner_value, inner_key in outer_value.items()} 
        for outer_key, inner_value in fieldData.items()}

That returns an error:

NameError                                 Traceback (most recent call last)
<ipython-input-114-c09167c38348> in <module>
      1 data = {outer_key: {inner_value: inner_key for inner_value, inner_key in outer_value.items()} 
----> 2         for outer_key, inner_value in fieldData.items()}

<ipython-input-114-c09167c38348> in <dictcomp>(.0)
      1 data = {outer_key: {inner_value: inner_key for inner_value, inner_key in outer_value.items()} 
----> 2         for outer_key, inner_value in fieldData.items()}

NameError: name 'outer_value' is not defined

I would greatly appreciate any advice on resolving this.

GLarose
Thank you all for your answers. They led me to add a function that cleans None values out of the dictionary; without it, null reference errors would occur in the web app that edits the contents of the fields. – GLarose Jun 07 '20 at 19:49

3 Answers

A simple solution is to use a dictionary comprehension that iterates over all the keys and values and creates a new dictionary:

>>> {field: value_dict['valueString'] for field, value_dict in fieldData.items()}
{'StreetAddress': '4| 7 RULE CHIE N', 'Phone': '5 4 3 | 9 7 | 0 0 1', 'FirstName': 'HENRI'}
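Note that, as written, this comprehension raises a TypeError if any field's value is None (as 'UnitAddress' is in the full dictionary in the question), and it reads 'valueString' rather than the 'text' key the question asks for. A minimal sketch of a None-safe variant follows; the shortened fieldData here is a made-up stand-in for illustration only:

```python
# Shortened stand-in for the question's fieldData (illustration only).
fieldData = {
    'FirstName': {'type': 'string', 'valueString': 'HENRI', 'text': 'HENRI'},
    'Province': {'type': 'string', 'valueString': 'Q C', 'text': 'Q C'},
    'UnitAddress': None,  # a field with no detected value
}

# Skip fields whose value is None before indexing into them.
resultsData = {
    field: value_dict['text']
    for field, value_dict in fieldData.items()
    if value_dict is not None
}
print(resultsData)
```

With this sample, 'UnitAddress' is simply left out of resultsData.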
Samuel Dion-Girardeau

At least one of your dictionary keys has a None value at the outer level. To keep your program from crashing, check for None before calling any dict methods on the value.

resultsData = {}
for outer_k, outer_v in fieldData.items():
    if outer_v is None:
        continue
    text = outer_v.get('text')
    if text is not None:
        resultsData[outer_k] = text
print(resultsData)

Output:

{'StreetAddress': '4| 7 RULE CHIE N', 'Phone': '5 4 3 | 9 7 | 0 0 1', 'FirstName': 'HENRI', 'LastName': 'AC THANICE', 'City': 'MIO N REAL', 'PostalCode': 'H 3 B O A 2', 'Province': 'Q C', 'FormId': 'C1234567', 'DonationDate': '2 6 0 6', 'Email': 'h- lach @ gmail.com', 'DonationAmount': '600.00'}
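If the None check is needed in more than one place, it could be factored into a small helper. The sketch below is one possible way to do that, not a definitive implementation; drop_none and extract_text are hypothetical names (not from any library), and the sample dict is made up for illustration:

```python
def drop_none(mapping):
    """Return a copy of mapping without the keys whose value is None."""
    return {key: value for key, value in mapping.items() if value is not None}


def extract_text(field_data):
    """Map each field name to its 'text' entry, skipping missing ones."""
    cleaned = drop_none(field_data)
    texts = {key: value.get('text') for key, value in cleaned.items()}
    # A field dict may itself lack a 'text' key; drop those as well.
    return drop_none(texts)


sample = {
    'FirstName': {'text': 'HENRI', 'confidence': 1.0},
    'FormId': {'confidence': 0.96},  # no 'text' key
    'UnitAddress': None,             # no value at all
}
print(extract_text(sample))
```

Only 'FirstName' survives here, because the other two fields have no usable 'text' value.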
Balaji Ambresh

Now that you have shared the complete fieldData dictionary, and seeing how you want your output resultData to look, I think the following should be enough for your purposes:

resultData = {k: v['text'] for k, v in fieldData.items() if v}

Which outputs:

{'City': 'MIO N REAL',
 'DonationAmount': '600.00',
 'DonationDate': '2 6 0 6',
 'Email': 'h- lach @ gmail.com',
 'FirstName': 'HENRI',
 'FormId': 'C1234567',
 'LastName': 'AC THANICE',
 'Phone': '5 4 3 | 9 7 | 0 0 1',
 'PostalCode': 'H 3 B O A 2',
 'Province': 'Q C',
 'StreetAddress': '4| 7 RULE CHIE N'}
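As for the NameError in the question's comprehension: the outer loop binds the name inner_value, but the inner comprehension iterates over outer_value, which is never defined. A sketch of the same nested pattern with consistent names and a None guard, on a shortened, made-up sample:

```python
# Shortened stand-in for the question's fieldData (illustration only).
fieldData = {
    'FirstName': {'type': 'string', 'text': 'HENRI'},
    'UnitAddress': None,
}

# The outer loop must bind the same name the inner comprehension iterates over.
data = {
    outer_key: {inner_key: inner_value
                for inner_key, inner_value in outer_value.items()}
    for outer_key, outer_value in fieldData.items()
    if outer_value is not None
}
print(data)
```

This only copies each nested dict; picking out 'text' still needs a comprehension like the one-liner above.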
revliscano