5

I am going through my coordinate data and I see some duplicate coordinates with different attribute values, caused by earlier preprocessing. I want to merge the attributes of the matching coordinates and get a simplified result. To clarify what I mean, here is an example:

X = [1.0, 2.0, 3.0, 2.0]
Y = [8.0, 3.0, 4.0, 3.0]
A = [13, 16, 20, 8]

The above data is read as follows: point (1.0, 8.0) has a value of 13 and (2.0, 3.0) has a value of 16. Notice that the second point and fourth point have the same coordinates but different attribute values. I want to be able to remove the duplicates from the lists of coordinates and sum the attributes so the results would be new lists:

New_X = [1.0, 2.0, 3.0]
New_Y = [8.0, 3.0, 4.0]
New_A = [13, 24, 20]

24 is the sum of 16 and 8 from the second and fourth points with the same coordinates, therefore one point is kept and the values are summed.

I am not sure how to do this. I thought of using nested for loops over zips of the coordinates, but I am not sure how to formulate it to sum the attributes.

Any help is appreciated!

mb567
  • 691
  • 6
  • 21

6 Answers

4

I think that maintaining 3 lists is a bit awkward. Something like:

D = dict()
for x,y,a in zip(X,Y,A):
    D[(x,y)] = D.get((x,y),0) + a

would put everything together in one place.

If you'd prefer to decompose it back into 3 lists:

newX, newY, newA = [], [], []  # the output lists must exist before appending
for (x,y),a in D.items():
    newX.append(x)
    newY.append(y)
    newA.append(a)
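
For reference, the two snippets above can be combined into one self-contained sketch using the data from the question. The function name `merge_duplicates` is made up for illustration. It assumes coordinates match exactly (floats compared with `==`) and that plain dicts keep insertion order, which holds on Python 3.7+.

```python
def merge_duplicates(X, Y, A):
    """Sum the attributes of points that share the same (x, y) coordinates."""
    totals = {}
    for x, y, a in zip(X, Y, A):
        # The first time a coordinate is seen, .get() falls back to 0.
        totals[(x, y)] = totals.get((x, y), 0) + a

    # Split the {(x, y): total} mapping back into three parallel lists.
    new_x = [x for x, _ in totals]
    new_y = [y for _, y in totals]
    new_a = list(totals.values())
    return new_x, new_y, new_a


X = [1.0, 2.0, 3.0, 2.0]
Y = [8.0, 3.0, 4.0, 3.0]
A = [13, 16, 20, 8]

New_X, New_Y, New_A = merge_duplicates(X, Y, A)
print(New_X, New_Y, New_A)
# [1.0, 2.0, 3.0] [8.0, 3.0, 4.0] [13, 24, 20]
```

Each point is visited once, so this stays linear in the number of points, and the merged points come out in the order their coordinates first appeared.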
dashiell
  • 812
  • 4
  • 11
2

Another option here is to use itertools.groupby. But since this only groups consecutive keys, you'll have to first sort your coordinates.

First we can zip them together to create tuples of the form (x, y, a). Then sort these by the (x, y) coordinates:

sc = sorted(zip(X, Y, A), key=lambda P: (P[0], P[1]))  # sorted coordinates
print(sc)
#[(1.0, 8.0, 13), (2.0, 3.0, 16), (2.0, 3.0, 8), (3.0, 4.0, 20)]

Now we can groupby the coordinates and sum the values:

from itertools import groupby
print([(*a, sum(c[2] for c in b)) for a, b in groupby(sc, key=lambda P: (P[0], P[1]))])
#[(1.0, 8.0, 13), (2.0, 3.0, 24), (3.0, 4.0, 20)]

And since zip is its own inverse, you can get New_X, New_Y, and New_A via:

New_X, New_Y, New_A = zip(
    *((*a, sum(c[2] for c in b)) for a, b in groupby(sc, key=lambda P: (P[0], P[1])))
)
print(New_X)
print(New_Y)
print(New_A)
#(1.0, 2.0, 3.0)
#(8.0, 3.0, 4.0)
#(13, 24, 20)

Of course, you can do this all in one line but I broke it up into pieces so that it's easier to understand.
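
As a rough sketch, the same steps can be wrapped in a small helper. The name `merge_sorted` and the sample data are only for illustration, and `operator.itemgetter(0, 1)` stands in for the `lambda P: (P[0], P[1])` key used above. Note that because of the sort, the result is ordered by coordinate rather than by first appearance in the input.

```python
from itertools import groupby
from operator import itemgetter


def merge_sorted(X, Y, A):
    """Merge duplicate (x, y) points by sorting, grouping, and summing."""
    get_xy = itemgetter(0, 1)  # same key as lambda P: (P[0], P[1])
    points = sorted(zip(X, Y, A), key=get_xy)
    merged = [
        (x, y, sum(p[2] for p in group))
        for (x, y), group in groupby(points, key=get_xy)
    ]
    if not merged:  # zip(*[]) would yield nothing to unpack
        return [], [], []
    new_x, new_y, new_a = map(list, zip(*merged))
    return new_x, new_y, new_a


# Input deliberately not in coordinate order.
result = merge_sorted([3.0, 1.0, 3.0], [4.0, 8.0, 4.0], [20, 13, 5])
print(result)
# ([1.0, 3.0], [8.0, 4.0], [13, 25])
```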

pault
  • 41,343
  • 15
  • 107
  • 149
1

You could put the (x, y) coordinates in a dictionary:

totals = {}  # not named `dict`, which would shadow the built-in type
for i in range(len(X)):  # len(X) == len(Y) == len(A)
    if (X[i], Y[i]) not in totals:
        totals[(X[i], Y[i])] = A[i]
    else:
        totals[(X[i], Y[i])] += A[i]
SamAtWork
  • 455
  • 5
  • 17
1

You can use a dictionary:

d = {}

for i in range(len(X)):
    tup = (X[i], Y[i])
    if tup in d:
        d[tup] += A[i]
    else:
        d[tup] = A[i]

New_X = []
New_Y = []
New_A = [] 
for key in d.keys():
    New_X.append(key[0])
    New_Y.append(key[1])
    New_A.append(d[key])
Sam
  • 1,542
  • 2
  • 13
  • 27
1

Try this list comprehension:

keys = list(dict.fromkeys(zip(X, Y)))  # unique (x, y) pairs in first-seen order
x, y, a = zip(*[(kx, ky, sum(v for px, py, v in zip(X, Y, A) if (px, py) == (kx, ky))) for kx, ky in keys])
whackamadoodle3000
  • 6,684
  • 4
  • 27
  • 44
1

A dict seems like a more appropriate data structure here. This will build one.

from collections import Counter

D = sum((Counter({(x, y): a}) for x, y, a in zip(X, Y, A)), Counter())
print(D)
#Counter({(2.0, 3.0): 24, (3.0, 4.0): 20, (1.0, 8.0): 13})

You can unpack these back into separate lists using:

New_X, New_Y, New_A = map(list, zip(*[(x,y,a) for (x,y),a in D.items()]))
print(New_X)
print(New_Y)
print(New_A)
#[1.0, 2.0, 3.0]
#[8.0, 3.0, 4.0]
#[13, 24, 20]
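
One caveat, in case the attributes can be negative or cancel out: adding `Counter` objects with `+` keeps only positive totals, so a point whose attributes sum to zero or less would disappear from the result. Below is a sketch of an alternative using `collections.defaultdict`, which keeps every point. The sample data is made up to show the difference.

```python
from collections import Counter, defaultdict

X = [1.0, 2.0, 2.0]
Y = [8.0, 3.0, 3.0]
A = [13, 5, -5]  # the second and third points cancel out

# Counter addition silently drops totals that are zero or negative.
with_counter = sum((Counter({(x, y): a}) for x, y, a in zip(X, Y, A)), Counter())

# A defaultdict accumulates in place and keeps every coordinate.
totals = defaultdict(int)
for x, y, a in zip(X, Y, A):
    totals[(x, y)] += a

print(dict(with_counter))  # {(1.0, 8.0): 13}
print(dict(totals))        # {(1.0, 8.0): 13, (2.0, 3.0): 0}
```

The in-place loop also avoids building a new `Counter` for every point, which matters for long lists.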
pault
  • 41,343
  • 15
  • 107
  • 149
bphi
  • 3,115
  • 3
  • 23
  • 36
  • @pault While your edit is constructive, I'd prefer if you didn't edit my answer so substantially. Especially considering you have already answered the question yourself. – bphi Jun 27 '18 at 21:32
  • my apologies- I just wanted to show how to get the final result OP showed. Feel free to rollback if you wish and I can add that last part as a comment. – pault Jun 27 '18 at 21:44