
I want to use GeoPandas to visualize some election data. I have two DataFrames - a first DataFrame that includes the geometry data and precinct labels and a second DataFrame that includes the voting data. I want to add some of the voting data from the second DataFrame to the first DataFrame.

Here is the structure of the first DataFrame:

     Precinct_2020   geometry
345  Precinct 4-8    POLYGON ((-95.93331 41.22970, -95.93330 41.230...
346  Precinct 4-9    POLYGON ((-95.95904 41.23577, -95.95889 41.235...
347  Precinct 4-3    POLYGON ((-95.94178 41.20966, -95.94178 41.211...
348  Precinct 2-17   POLYGON ((-95.95277 41.26891, -95.95255 41.270...
349  Precinct 8-83   POLYGON ((-96.04293 41.33597, -96.04294 41.337...

Here is the structure of the second DataFrame:

    Precinct_2020   diff
0   Precinct 1-2    67
1   Precinct 1-3    67
2   Precinct 1-4    27
3   Precinct 1-5    63
4   Precinct 1-7    43

I tried doing this by matching on precinct labels with two nested for loops as shown below:

for entry in douglas_county_df:
  for item in voting_diff:
    if item['Precinct_2020'] in entry['Precinct_2020']:
      entry['diff'] = item['diff']

Essentially, I want to add the vote difference value 'diff' from the second DataFrame to the corresponding precincts in the first DataFrame. Instead, I get the error "string indices must be integers". What is the best way to handle this?

Expected output:

     Precinct_2020   geometry                                            diff
345  Precinct 4-8    POLYGON ((-95.93331 41.22970, -95.93330 41.230...  [diff for 4-8]
346  Precinct 4-9    POLYGON ((-95.95904 41.23577, -95.95889 41.235...  [diff for 4-9]
347  Precinct 4-3    POLYGON ((-95.94178 41.20966, -95.94178 41.211...  [diff for 4-3]
348  Precinct 2-17   POLYGON ((-95.95277 41.26891, -95.95255 41.270...  [diff for 2-17]
349  Precinct 8-83   POLYGON ((-96.04293 41.33597, -96.04294 41.337...  [diff for 8-83]

Thanks!

tbb
  • Kindly share your expected output. – David Erickson Nov 09 '20 at 03:22
  • Is there a reason why you aren't doing a simple `merge`? You can do `entry = entry.merge(item, how='left', on='Precinct_2020')`. – David Erickson Nov 09 '20 at 03:26
  • Will give this a try. Thanks! – tbb Nov 09 '20 at 03:30
  • Also see: [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – David Erickson Nov 09 '20 at 03:31
  • If you have two dataframes that you want to merge, use `pandas.merge(df_left, df_right, how="left", on="Precinct_2020")`. One additional note: `df_left` and `df_right` are your dataframes. If you only want a few columns from `df_right` dataframe, use `df_right[list_of_target_columns]` as `df_right`. Refer to: [`pandas.DataFrame.merge` - Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html) – CypherX Nov 09 '20 at 03:41

1 Answer


Solution

If you have two dataframes that you want to merge, use pandas.merge, as David Erickson also mentioned in the comments:

import pandas

COLUMN_TO_MERGE_ON = "Precinct_2020"
merged = pandas.merge(df_left, df_right, how="left", on=COLUMN_TO_MERGE_ON)

Note:

  • A left join (how="left") keeps every row of df_left and adds the matching values from df_right, which is what you asked for.
  • df_left and df_right are your dataframes. If you only want a few columns from df_right, pass df_right[list_of_target_columns] as df_right; the list must include the merge column.
  • See also: "Left join using merge in geopandas", a Stack Exchange question that shows how.
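
To make the notes above concrete, here is a minimal sketch of the left join on the question's two frames. It assumes the frames are named douglas_county_df and voting_diff as in the question; the sample rows, the diff value 12, and the extra turnout column are made up for illustration. Geometry is written as plain strings so the sketch runs with pandas alone. With a real GeoDataFrame, calling .merge on the GeoDataFrame itself (as here) should keep the result a GeoDataFrame.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
# Geometry values are placeholder strings, not real polygons.
douglas_county_df = pd.DataFrame({
    "Precinct_2020": ["Precinct 4-8", "Precinct 4-9", "Precinct 1-2"],
    "geometry": ["POLYGON A", "POLYGON B", "POLYGON C"],
})
voting_diff = pd.DataFrame({
    "Precinct_2020": ["Precinct 1-2", "Precinct 4-8"],
    "diff": [67, 12],          # illustrative values
    "turnout": [0.61, 0.55],   # made-up extra column we do not want to carry over
})

# Keep only the merge key and the columns to add.
target_columns = ["Precinct_2020", "diff"]

# Left join: every row of douglas_county_df is kept, in its original order;
# precincts without a match in voting_diff get NaN in "diff".
merged = douglas_county_df.merge(
    voting_diff[target_columns], how="left", on="Precinct_2020"
)
print(merged)
```

The loop in the question fails because iterating over a DataFrame yields its column labels, which are strings here, so entry['Precinct_2020'] tries to index a string with a string. The merge avoids row-by-row iteration entirely.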


CypherX