1

Below is an example of my data set. I want to combine duplicated points and sum the values in columns a, b and c into a single row.

I have looked at a previous example using groupby.sum() here: How do I Pandas group-by to get sum?. Because I am dealing with geometries, I can't get my code to work.

geometry a b c
point a 2 4 6
point a 3 1 7
point b 1 2 3

This is what I want:

geometry a b c
point a 5 5 13
point b 1 2 3
Ben Watson
  • 2
    Your `geometry` is not geometry in the context of geopandas' capabilities. It is `text string` in the current form. – swatchai Mar 03 '23 at 17:09
  • If this is actually a geometry, as it seems to be from your question text, then groupby will not work, as you note. But please provide a more realistic example, ideally as a minimal reproducible example, or at least by copying the result of print(df) into the question. – Michael Delgado Mar 04 '23 at 09:19
  • 1
    Frankly, since geometries are not hashable, I don’t know if this is possible using tools available in geopandas. You may be stuck with a horrible double for loop using shapely to compare each shape to all others? But maybe others have ideas. – Michael Delgado Mar 04 '23 at 09:27

2 Answers

2

Convert the geometry to WKT and group on it:

import geopandas as gpd
df = df.groupby(df['geometry'].to_wkt()).agg({'a': 'sum', 'b': 'sum', 'c': 'sum'}).reset_index()

Then back to geometry:

df['index'] = gpd.GeoSeries.from_wkt(df['index'])

df = gpd.GeoDataFrame(df, geometry='index')
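
For reference, here is a minimal end-to-end sketch of the same WKT round trip. The sample coordinates, the variable names (`gdf`, `key`, `summed`, `result`) and the `wkt` column name are made up for illustration; only the column names a, b and c come from the question. It assumes geopandas and shapely are installed.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical sample data: the first two rows share the same point.
gdf = gpd.GeoDataFrame(
    {"a": [2, 3, 1], "b": [4, 1, 2], "c": [6, 7, 3]},
    geometry=[Point(0, 0), Point(0, 0), Point(1, 1)],
)

# Use the WKT text of each geometry as a plain string grouping key.
key = gdf.geometry.to_wkt().rename("wkt")

# Sum the value columns per distinct WKT string.
summed = gdf[["a", "b", "c"]].groupby(key).sum().reset_index()

# Parse the WKT keys back into geometries and rebuild the GeoDataFrame.
summed["geometry"] = gpd.GeoSeries.from_wkt(summed["wkt"])
result = gpd.GeoDataFrame(summed.drop(columns="wkt"), geometry="geometry")

print(result)
```

Note that this treats two geometries as duplicates only when their WKT strings are identical, so the result may depend on how coordinates are rounded when the WKT is written.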

Ben Watson
0

Code:

import pandas as pd

df = pd.DataFrame({
    'geometry': ['point a', 'point a', 'point b'],
    'a': [2, 3, 1],
    'b': [4, 1, 2],
    'c': [6, 7, 3]})

res = df.groupby('geometry', as_index=False).sum()  

print(res)

Output:

  geometry  a  b   c
0  point a  5  5  13
1  point b  1  2   3
Yash Mehta
  • 2
    The question specifically notes this won’t work since the column contains geometries. These are not hashable and cannot be grouped on. – Michael Delgado Mar 04 '23 at 09:20