I am converting a dict into a dataframe:

states = {'OH': 'Ohio', 'KY': 'Kentucky', 'AS': 'American Samoa', 'NV': 'Nevada', 'WY': 'Wyoming', 'NA': 'National', 'AL': 'Alabama', 'MD': 'Maryland', 'AK': 'Alaska', 'UT': 'Utah', 'OR': 'Oregon', 'MT': 'Montana', 'IL': 'Illinois', 'TN': 'Tennessee', 'DC': 'District of Columbia', 'VT': 'Vermont', 'ID': 'Idaho', 'AR': 'Arkansas', 'ME': 'Maine', 'WA': 'Washington', 'HI': 'Hawaii', 'WI': 'Wisconsin', 'MI': 'Michigan', 'IN': 'Indiana', 'NJ': 'New Jersey', 'AZ': 'Arizona', 'GU': 'Guam', 'MS': 'Mississippi', 'PR': 'Puerto Rico', 'NC': 'North Carolina', 'TX': 'Texas', 'SD': 'South Dakota', 'MP': 'Northern Mariana Islands', 'IA': 'Iowa', 'MO': 'Missouri', 'CT': 'Connecticut', 'WV': 'West Virginia', 'SC': 'South Carolina', 'LA': 'Louisiana', 'KS': 'Kansas', 'NY': 'New York', 'NE': 'Nebraska', 'OK': 'Oklahoma', 'FL': 'Florida', 'CA': 'California', 'CO': 'Colorado', 'PA': 'Pennsylvania', 'DE': 'Delaware', 'NM': 'New Mexico', 'RI': 'Rhode Island', 'MN': 'Minnesota', 'VI': 'Virgin Islands', 'NH': 'New Hampshire', 'MA': 'Massachusetts', 'GA': 'Georgia', 'ND': 'North Dakota', 'VA': 'Virginia'}

myStates = pd.DataFrame(states.items(), columns=['a','b'])

Jupyter Notebook throws the following error:

     13 
---> 14     myStates = pd.DataFrame(states.items(), columns=['a','b'])

PandasError: DataFrame constructor not properly called!

But I get no error in PyCharm with Python 2.7, where the output looks like:

     a                         b 
 0   WA                Washington
 1   WI                 Wisconsin
 2   WV             West Virginia
 3   FL                   Florida
 4   WY                   Wyoming
 5   NH             New Hampshire

Is there a workaround for this?

Thanks, P

jpp
Peter
  • What sort of `DataFrame` is this *supposed* to result in? What would columns `a` and `b` be? This `dict` looks like it should make a `pd.Series`: `my_states = pd.Series(states)` – juanpa.arrivillaga Feb 01 '18 at 22:45

3 Answers

pd.DataFrame.from_dict is likely what you need.

myStates = pd.DataFrame.from_dict(states, orient='index').reset_index()
myStates.columns = ['a', 'b']

myStates.head(5)

#     a               b
# 0  OH            Ohio
# 1  KY        Kentucky
# 2  AS  American Samoa
# 3  NV          Nevada
# 4  WY         Wyoming
jpp
  • That works. What's wrong with my approach and why is it not throwing errors in PyCharm but in Jupyter? – Peter Feb 01 '18 at 22:54
  • @Peter, see Alex's answer for PyCharm vs Jupyter. Looks like different python versions. – jpp Feb 01 '18 at 23:07

You are using Python 3 in Jupyter and Python 2 in PyCharm. In Python 2, dict.items() returns a list, whereas in Python 3 it returns a dictionary view object (thanks for the correction @juanpa.arrivillaga). Here's a SO answer that gives a bit more detail, but by and large this is to conserve memory and speed up computation time.

Here is a reference to the 2to3 docs that specifically calls out that dict.items() in Python2 is logically equivalent to list(dict.items()) in Python3.
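
To make the difference concrete, here is a minimal sketch of how the view behaves in Python 3. It uses a shortened three-entry dict as a stand-in for the full `states` mapping; the names `items_view` and `items_list` are only illustrative.

```python
# A small dict standing in for the full `states` mapping from the question.
states = {'OH': 'Ohio', 'KY': 'Kentucky', 'NV': 'Nevada'}

items_view = states.items()          # Python 3: a dict_items view, not a list
items_list = list(states.items())    # a real list of (key, value) tuples

print(type(items_view).__name__)     # dict_items
print(isinstance(items_view, list))  # False
print(items_list[0])                 # ('OH', 'Ohio')

# The view reflects later changes to the dict; the list is a snapshot.
states['WY'] = 'Wyoming'
print(len(items_view), len(items_list))  # 4 3
```

Because the view is not a list, code that expects list-like input may reject it, which is consistent with the error seen in the notebook.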

So all you need to do is wrap the call in list() when running in Python 3:

myStates = pd.DataFrame(list(states.items()), columns=['a','b'])

You can check which version of Python you are running at runtime with sys:

import sys
print(sys.version)  # '3.6.3 ...'
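
If you need to test the version in code rather than just print it, a rough sketch is to compare `sys.version_info` against a tuple instead of parsing the version string. The `PY3` flag and the shortened dict below are illustrative names, not part of the original code.

```python
import sys

# sys.version_info is a named tuple, so it compares cleanly against tuples.
PY3 = sys.version_info >= (3,)

# Shortened stand-in for the `states` dict from the question.
states = {'OH': 'Ohio', 'KY': 'Kentucky', 'NV': 'Nevada'}

# list() copies the list on Python 2 and materialises the view on Python 3,
# so the same line works on both without branching on PY3.
pairs = list(states.items())

print(PY3)
print(pairs)
```

In practice the list() call makes the version check unnecessary for this particular problem; the flag is only useful where behaviour genuinely has to differ.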

Note: See @jp_data_analysis's answer for how to handle this in a version-agnostic way.

Alex

Using a Series:

pd.Series(states).to_frame('State').head()
Out[471]: 
             State
AK          Alaska
AL         Alabama
AR        Arkansas
AS  American Samoa
AZ         Arizona
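
If you want the two columns `a` and `b` from the question rather than the abbreviations as the index, one possible sketch (assuming a reasonably recent pandas, and using a shortened dict for brevity) is to name the index and reset it:

```python
import pandas as pd

# Shortened stand-in for the `states` dict from the question.
states = {'OH': 'Ohio', 'KY': 'Kentucky', 'NV': 'Nevada'}

my_states = (
    pd.Series(states)           # keys become the index, values the data
      .rename_axis('a')         # name the index so it becomes column 'a'
      .reset_index(name='b')    # move the index to a column; values become 'b'
)

print(my_states)
```

The variable name `my_states` is just illustrative. This avoids dict.items() entirely, so it does not depend on whether items() returns a list or a view.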
BENY