5

I'm trying to fill a DynamoDB database with data from an old event store made out of a PostgreSQL database. After it ran through a good portion of the db entries, this error was thrown when attempting to call the put_item function.

botocore.exceptions.ClientError:

An error occurred (ValidationException) when calling the PutItem operation: One or more parameter values were invalid: An AttributeValue may not contain an empty string

I decided to rerun the code and see what was happening by dumping out all of the table attributes right before they were inserted.

I can see that the only empty string is in the answer_string attribute of a dictionary called details, shown below:

Importing event type 5 completed by user: 1933
1933 5 {'answer': {'difficulty': 1, 'answer_string': ''}, 'card_id': '13448', 'review_id': '153339', 'time_spent': 2431}
62 153339
2017-01-18 00:46:48.373009+00:00 2017-01-18 00:46:48.364217+00:00

I'm pretty certain this is what's causing the error to be thrown, as none of the other table attributes are incorrect.

My problem is that the details dictionary can come from dozens of different locations, and each details dictionary can have different attributes. The one with the answer_string attribute is just one of many possible configurations. I can't realistically check every possible configuration by hand to verify that none of them contain empty strings.

Is there a way I can do a one time overall check on a dictionary and see if any one part of it is empty?

Peter Haddad
danielschnoll
  • Note that if you try to write an empty string into the dynamodb web page, it lets you. I think this is a bug in boto, not an actual limitation of dynamodb – falsePockets Nov 11 '20 at 01:17

4 Answers

5

Or, if you want to replace all empty strings with None values:

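def removeEmptyString(dic):
    # Replaces empty strings with None in place; expects a dict at the top level.
    for e in dic:
        if isinstance(dic[e], dict):
            dic[e] = removeEmptyString(dic[e])
        elif isinstance(dic[e], str) and dic[e] == "":
            dic[e] = None
        elif isinstance(dic[e], list):
            # Recurse into dict entries; replace empty-string entries directly.
            for i, entry in enumerate(dic[e]):
                if isinstance(entry, dict):
                    removeEmptyString(entry)
                elif entry == "":
                    dic[e][i] = None
    return dic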

dictionaryWithEmptyStringsReplacedWithNone = removeEmptyString(dictionaryWithEmptyStrings)

It is far from perfect but it works.
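For the one-time check the question asks about, a read-only walk may be enough. The following is a minimal sketch: `find_empty_strings` is a hypothetical helper name, and it assumes the data consists only of dicts, lists, and scalar values. It reports where the empty strings are without changing anything:

```python
def find_empty_strings(obj, path=()):
    """Return the path (tuple of keys/indexes) to every empty string in obj."""
    found = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            found.extend(find_empty_strings(value, path + (key,)))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            found.extend(find_empty_strings(value, path + (index,)))
    elif obj == "":
        found.append(path)
    return found


# The details dictionary from the question.
details = {
    "answer": {"difficulty": 1, "answer_string": ""},
    "card_id": "13448",
    "review_id": "153339",
    "time_spent": 2431,
}
print(find_empty_strings(details))  # [('answer', 'answer_string')]
```

An empty result list means the item contains no empty strings at any depth.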

vladimirror
  • 729
  • 12
  • 8
4

If you want to get a dictionary just containing all keys with empty values, you can simply apply a dictionary comprehension to the details-dict to get all key-value pairs with empty values. E.g.:

empty_values = {key: value for key, value in details.items() if not value}

If you instead want to filter out the key-value pairs with empty values, so you're left with a dictionary where all keys have values, simply use the same comprehension without the not:

details = {key: value for key, value in details.items() if value}
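One thing to watch with both comprehensions: `if value` and `if not value` test truthiness, so 0, False, None, and empty lists or dicts count as empty too, not only ''. A sketch of a stricter variant that compares against the empty string explicitly (the sample data here is made up for illustration):

```python
details = {"answer_string": "", "time_spent": 0, "card_id": "13448"}

# Truthiness test: also drops the legitimate value 0.
truthy_only = {key: value for key, value in details.items() if value}

# Explicit comparison: drops only empty strings.
non_empty = {key: value for key, value in details.items() if value != ""}

print(truthy_only)  # {'card_id': '13448'}
print(non_empty)    # {'time_spent': 0, 'card_id': '13448'}
```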
Dunedan
  • Thanks for the advice - unfortunately this only works for the outermost key value pairs. It is possible that a key's value is another dictionary, and that dictionary can have empty values as keys. Is there a way to extend this dictionary comprehension to cover a deeper scope? – danielschnoll Jul 31 '17 at 19:20
  • Of course, e.g. check out the following question and its answers: https://stackoverflow.com/questions/10756427/loop-through-all-nested-dictionary-values – Dunedan Jul 31 '17 at 19:22
  • Thank you very much - I'll probably have to write my own helper function to get this done then. I'll give this a shot – danielschnoll Jul 31 '17 at 19:51
3

@PedoDorf's function worked for me, though I had to add a check because it would sometimes raise "TypeError: string indices must be integers" when it received a string:

def removeEmptyString(dic):
  # Strings: an empty string becomes None, anything else is returned unchanged.
  if isinstance(dic, str):
    return None if dic == "" else dic
  # Lists: clean each entry and store the result back.
  if isinstance(dic, list):
    for i, entry in enumerate(dic):
      dic[i] = removeEmptyString(entry)
    return dic
  # Numbers, None, etc. have nothing to clean.
  if not isinstance(dic, dict):
    return dic

  for e in dic:
    dic[e] = removeEmptyString(dic[e])
  return dic

Thanks

0

If you need to account for nested objects, and clean them as well, give this a try. Requires some recursion:

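def clean_ddb_data(obj):
    # Build a new dict, skipping any key whose value is an empty string.
    cleaned = {}
    for k, v in obj.items():
        if isinstance(v, dict):
            # Recurse into nested dicts.
            cleaned[k] = clean_ddb_data(v)
        elif isinstance(v, str):
            if len(v) > 0:
                cleaned[k] = v
        else:
            cleaned[k] = v
    return cleaned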
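The function above walks nested dicts only, so a list of dicts would pass through unchanged. A possible extension, offered as a sketch: `clean_ddb_value` is a hypothetical name, and dropping empty strings from lists (rather than replacing them with something else) is an assumption about what you want.

```python
def clean_ddb_value(value):
    """Return a copy of value with empty strings removed from nested dicts and lists."""
    if isinstance(value, dict):
        # Skip keys whose value is an empty string, recurse into the rest.
        return {k: clean_ddb_value(v) for k, v in value.items() if v != ""}
    if isinstance(value, list):
        # Skip empty-string entries, recurse into the rest.
        return [clean_ddb_value(v) for v in value if v != ""]
    # Numbers, non-empty strings, None, etc. are returned unchanged.
    return value
```

Because it builds new containers instead of mutating the input, the original item stays available for logging if the write still fails.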
parquar