I have what seems like a rather simple question, but I can't wrap my head around it.

I have a pandas dataframe for Tweets. The location of the users is registered in a variable named "Location" in various ways:

When the location is well recorded, I often get:

{'country_code': 'tr', 'state': 'Central Anatolia Region', 'county': 'Çankaya', 'city': 'Ankara'}

or

{'country_code': 'tr', 'state': 'Black Sea Region', 'city': 'Trabzon'}

But sometimes, all I get is:

{'country_code': 'tr'}

{'country_code': 'tr', 'state': 'Batman'}

and often, there's nothing and all that's registered is this:

{}

I want to write a script that can create new variables in my pandas dataframe for these individual values. In other words, if country_code is registered for a specific row, then I want the value in question to be recorded in a variable named country_code. And so on for state, county, and city. If nothing is there, it can simply input a blank or an NA for all the missing variables in question (county, state, city).

The end result should be four new variables in my dataframe: country_code, state, county, and city, each filled from the values registered in the "Location" variable, or left blank when nothing is registered.

Can someone help by any chance?

Thank you so much!

saladin1991
  • I am confused because you are describing a `DataFrame` but showing a `dict`. Is it a `list` of `dict` that you are referring to? – Inyoung Kim 김인영 Nov 10 '20 at 02:44
  • Thanks for the reply Inyoung! The variable Location in my pandas dataframe has these values--they seem to be registered as a series: `type(newdf2['Location']) Out[31]: pandas.core.series.Series` – saladin1991 Nov 10 '20 at 03:10
  • pandas will automatically fill missing variables with NULL. Try printing some rows from `newdf2`. – Inyoung Kim 김인영 Nov 10 '20 at 05:14
  • I understand, thanks Inyoung. But the problem is that I want to create four new variables based on the values registered for country_code, state, county, and city in the variable "Location". – saladin1991 Nov 10 '20 at 12:43

1 Answer

I was able to fix the problem by working with the original JSON file directly. I stored the location data in the different categories I was looking for using a for loop with if checks, similar to what others suggest here, instead of trying to use pandas-specific functions to split the data registered in the variable "Location" into separate variables in my dataset.
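The exact code isn't shown above, so here is only a minimal sketch of what such a loop might look like. It assumes the JSON has already been loaded into a list of records, each with a "Location" dict. The sample `records`, the `FIELDS` list, and the `split_locations` helper are hypothetical names made up for illustration.

```python
# Hypothetical sample records standing in for the original JSON file.
records = [
    {"Location": {"country_code": "tr", "state": "Central Anatolia Region",
                  "county": "Çankaya", "city": "Ankara"}},
    {"Location": {"country_code": "tr", "state": "Black Sea Region",
                  "city": "Trabzon"}},
    {"Location": {"country_code": "tr"}},
    {"Location": {}},
]

# The four variables the question asks for.
FIELDS = ["country_code", "state", "county", "city"]


def split_locations(records, fields=FIELDS):
    """Return one list per field, with None where a value is missing."""
    columns = {field: [] for field in fields}
    for record in records:
        # Treat a missing or empty "Location" as an empty dict.
        location = record.get("Location") or {}
        for field in fields:
            if field in location:
                columns[field].append(location[field])
            else:
                columns[field].append(None)
    return columns


columns = split_locations(records)
```

Each list in `columns` has one entry per record, so, assuming the dataframe rows are in the same order as the records, the lists can be assigned as new columns (for example `newdf2[field] = values` for each item in `columns`).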

saladin1991