0

I am trying to parse some tweet data that I collected in a JSON file. The problem is that some of the tweets don't have 'user' or 'place' in them. As a result, I get messages like:

  File "<stdin>", line 18, in <module>
  KeyError: 'user'

So I tried to add an if-else statement, but it is still giving me the error message. What is the next step?

for line in lines:
    try:
        tweet = json.loads(line)

        # Ignore retweets!
        if "retweeted_status" in tweet or "text" not in tweet:
            continue

        # Fetch text from tweet
        text = tweet["text"].lower()

        # Ignore 'manual' retweets, i.e. messages starting with RT
        if text.find("rt ") > -1:
            continue

        tweets_text.append(text)
        # I added an if-else statement, but it still gives me the error message
        if tweet['user']:
            tweets_location.append(tweet['user']['location'])
        else:
            tweets_location.append("")

    except ValueError:
        pass
Bach
user3781579

3 Answers

2

Use dict.get.

if tweet.get('user'):
    tweets_location.append(tweet['user'].get('location', ''))
else:
    tweets_location.append("")

See also the question "Why dict.get(key) instead of dict[key]?"
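
To make the difference concrete, here is a minimal sketch comparing plain indexing with dict.get. The sample record is made up for illustration and is not taken from the asker's data.

```python
# Hypothetical sample record: a tweet that has text but no 'user' key.
tweet = {"text": "hello world"}

# Indexing a missing key raises KeyError.
try:
    tweet["user"]
    raised = False
except KeyError:
    raised = True

# dict.get returns None (or the supplied default) instead of raising.
user = tweet.get("user")
location = tweet.get("user", {}).get("location", "")

print(raised, user, repr(location))
```

Because get never raises for a missing key, the if test in the snippet above is safe even when 'user' is absent.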

Community
metatoaster
1

You are getting a KeyError. If you want to check whether the key is in the dictionary, do:

if 'user' in tweet:
    tweets_location.append( tweet['user']['location'] )

Or you could embed it in a try..except:

try:
    tweets_location.append( tweet['user']['location'] )
except KeyError:
    tweets_location.append('')

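Alternatively, you may use the get method of dict, as suggested in another answer. The get method gives you a convenient way of providing a default value, so you can do it all in one line:

tweets_location.append( tweet.get('user', {}).get('location', '') )

Note that the default for 'user' must be an empty dict, not an empty string: a string has no get method, so tweet.get('user', '').get(...) would raise an AttributeError whenever 'user' is missing. With {} as the default, the result is the empty string if 'user' is not a key in tweet, and also the empty string if 'location' is not a key of tweet['user'].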
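
One case the one-liner does not cover is a key that is present but null in the JSON (for example "user": null), since get then returns None rather than the default. Whether the asker's data contains such records is an assumption. A small sketch of a helper that handles both cases follows; get_location and the sample strings are hypothetical names and data used only for illustration.

```python
import json

def get_location(tweet):
    """Return the user's location, or '' when it is missing or null."""
    # `or {}` covers both a missing 'user' key and an explicit JSON null.
    user = tweet.get("user") or {}
    # Same idea for 'location', which may be absent or null.
    return user.get("location") or ""

# Made-up sample records, one JSON object per string.
samples = [
    '{"text": "a", "user": {"location": "Paris"}}',
    '{"text": "b", "user": {}}',
    '{"text": "c", "user": null}',
    '{"text": "d"}',
]
locations = [get_location(json.loads(s)) for s in samples]
print(locations)
```

In the loop from the question, this would be used as tweets_location.append(get_location(tweet)).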

Jacob Lee
0

By evaluating tweet['user'] in the if statement, you are assuming that the key 'user' exists; when it does not, the lookup itself raises the KeyError before the if can test anything. You can check whether the key is in the dict with if 'user' in tweet. Alternatively, you can handle the KeyError in the same way you already handle the ValueError:

try:
    ...  # rest of the loop body
    try:
        tweets_location.append( tweet['user']['location'] )
    except KeyError:
        tweets_location.append("")
except ValueError:
    pass
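
For reference, the nested try/except can be sketched as a complete loop. This is only an illustration under some assumptions: each input line holds one JSON object, the function name extract and the sample lines are made up, and the manual-retweet check uses startswith to match the comment in the question rather than the original find call.

```python
import json

def extract(lines):
    """Collect lowercased tweet texts and user locations ('' if absent)."""
    tweets_text, tweets_location = [], []
    for line in lines:
        try:
            tweet = json.loads(line)  # raises ValueError on invalid JSON

            # Skip retweets and records without text.
            if "retweeted_status" in tweet or "text" not in tweet:
                continue

            text = tweet["text"].lower()
            # Skip 'manual' retweets, i.e. messages starting with RT.
            if text.startswith("rt "):
                continue

            tweets_text.append(text)
            try:
                tweets_location.append(tweet["user"]["location"])
            except KeyError:
                # Either 'user' or 'location' is missing.
                tweets_location.append("")
        except ValueError:
            pass
    return tweets_text, tweets_location

# Made-up sample input.
lines = [
    '{"text": "Hello", "user": {"location": "Paris"}}',
    '{"text": "No user here"}',
    'not json',
    '{"text": "RT @x: hi"}',
]
texts, locations = extract(lines)
print(texts, locations)
```

Appending the text and the location in the same iteration keeps the two lists aligned, which is presumably what the original if-else was meant to ensure.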
XrXr