I'm working in geopandas with a large number (around 4.5 million) objects, where each has a unique ID number ('PARCEL_SPI') and also another code ('PC_PLANNO').
What I would like to do is write some code that, for each object, finds all other objects with the same PLANNO and adds their ID number as a list in a new attribute, say 'Same_code' for the object. The df is called spine_copy.
Here's a quick sample of what I have:
PARCEL_SPI | PC_PLANNO |
---|---|
23908 | LP12345 |
90435 | LP12345 |
329048 | LP90803 |
6409 | LP2399 |
34534 | LP90803 |
092824 | LP12345 |
and what I want out:
PARCEL_SPI | PC_PLANNO | Same_code |
---|---|---|
23908 | LP12345 | [90435, 092824] |
90435 | LP12345 | [23908,092824] |
329048 | LP90803 | 34534 |
6409 | LP2399 | None |
34534 | LP90803 | 329048 |
092824 | LP12345 | [23908, 90435] |
I'm not too sure how to do this, but here's my attempt using groupby:
spine_copy.groupby('PC_PLANNO')['PARCEL_SPI'].apply(list)
However, this doesn't add the list as a new attribute for each object, and I'm unsure how to do this.
Thanks in advance!