Why are there missing records when I convert from pandas df to dictionary?

Question

I am trying to convert a DBF of about 3233 records created from a shapefile of US counties to a dataframe; then I want to take two of the columns from that dataframe and convert to a dictionary where column1 is the key and column2 is the value. However, the resulting dictionary doesn't have the same number of records as my dataframe.

I use arcpy to call in the shapefile for all US Counties. When I use arcpy.GetCount_management(county_shapefile), this returns a feature count of 3233 records.
In order to convert to a dataframe, I converted to a dbf first with arcpy.TableToTable_conversion(); this returns a dbf with 3233 records.
After converting to a df using Dbf5 from simpledbf, I get a df with 3233 records.
I then convert the first two columns to a dictionary which returns 56 records. Can anyone tell me what's going on here? (I recently switched to Python 3 from Python 2, could that be part of the issue?)

Code:

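import arcpy
from simpledbf import Dbf5

# Export the shapefile's attribute table to a standalone DBF
county_shapefile = "U:/Shapefiles/tl_2018_us_county/tl_2018_us_county.shp"
dbf = arcpy.TableToTable_conversion(county_shapefile, "U:/", "county_data.dbf")

# Load the DBF into a pandas DataFrame
dbfile = Dbf5(str(dbf))
df = dbfile.to_dataframe()

# Build a dict from the first two columns (column 0 -> key, column 1 -> value)
df_dict = {row[0]: row[1] for row in df.values}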

I have also tried doing this with the .to_dict() function, but I'm not getting the desired dictionary structure {column1:column2,column1:column2...}

from simpledbf import Dbf5
dbfile=Dbf5(str(dbf))
df=dbfile.to_dataframe()
subset=df[["STATEFP","COUNTYFP"]]
subset=subset.set_index("COUNTYFP")
dict=subset.to_dict()

In the end, I'm hoping to create a dictionary where the key is the County FIPS code (COUNTYFP) and the value is the State FIPS code (STATEFP). I do not want to have any nested dictionaries, just a simple dictionary with the format...

dict={
   COUNTYFP1:STATEFP1,
   COUNTYFP2:STATEFP2,
   COUNTYFP3:STATEFP3,
   ....
}

Have you played with the `orient=` keyword in `df.to_dict()` (`'dict'`, `'list'`, `'series'`, `'split'`, `'records'`, `'index'`) to see if one of the available options gets you to the correct format? – G. Anderson, Aug 09 '19 at 16:47
There aren't missing records; just multiple records for the same key. **56 keys is suspiciously close to the ~56 FIPS State(/District/Territory) codes for 50 states+DC+ AS, FM, GU, PR, VI** — smci, Feb 08 '22 at 03:38

score 3 · Accepted Answer · answered Aug 09 '19 at 17:08


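Are you sure that column1 has no duplicates? Python dictionaries cannot hold duplicate keys: when the same key is assigned more than once, the later value overwrites the earlier one, so the dictionary ends up with one entry per distinct key. If you want every row to be represented, you'll have to find a workaround, such as building a key that is unique per row.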
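To illustrate, here is a minimal sketch using a made-up five-row table in place of the real county data (the values are invented; only the column names STATEFP and COUNTYFP come from the question). It runs the same comprehension as in the question and compares the row count with the number of distinct keys:

```python
import pandas as pd

# Toy stand-in for the county table; the values are made up for illustration.
df = pd.DataFrame({
    "STATEFP": ["01", "01", "02", "02", "02"],
    "COUNTYFP": ["001", "003", "001", "003", "005"],
})

# Same comprehension as in the question: column 0 is the key, column 1 the value.
df_dict = {row[0]: row[1] for row in df.values}

n_rows = len(df)                          # number of records in the dataframe
n_unique_keys = df["STATEFP"].nunique()   # number of distinct values in the key column

print(n_rows, n_unique_keys, len(df_dict))
print(df_dict)  # each key keeps only the last value assigned to it
```

If `len(df_dict)` matches `nunique()` of the key column rather than `len(df)`, the "missing" records were overwritten by later rows with the same key.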


Sammit


I was able to concatenate two of the identifying columns to create a unique key. Thank you – user147793 Aug 13 '19 at 15:58
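A rough sketch of that kind of workaround, assuming the two identifying columns are STATEFP and COUNTYFP stored as strings (the comment does not say which columns were used, and `FIPS_KEY` is a hypothetical column name chosen for this example):

```python
import pandas as pd

# Toy data; the values are made up for illustration.
df = pd.DataFrame({
    "STATEFP": ["01", "01", "02"],
    "COUNTYFP": ["001", "003", "001"],
})

# Concatenate the two string columns into one key per row.
# FIPS_KEY is a hypothetical name, not a column from the original file.
df["FIPS_KEY"] = df["STATEFP"] + df["COUNTYFP"]

# Fail early if the combined key still contains duplicates.
if not df["FIPS_KEY"].is_unique:
    raise ValueError("combined key is not unique")

# Flat dictionary: combined key -> state code, one entry per row.
county_to_state = dict(zip(df["FIPS_KEY"], df["STATEFP"]))
print(county_to_state)
```

With a unique key, the dictionary has as many entries as the dataframe has rows.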
