How to take the maximum value of a row if it has repeated values elsewhere in the column and return a new matrix?

Question

I have created a matrix by concatenating two arrays as column vectors, so I have something like the following:

ErrKappa	error
1	0.5
2	0.76
2	0.5
3	0.15
4	0.5
4	0.9
2	0.5
3	0.5

I then need to produce another matrix that keeps only the maximum error for each repeated ErrKappa value, so the new matrix will look like the following:

ErrKappa	error
1	0.5
2	0.76
3	0.5
4	0.9

Please note that ErrKappa doesn't need to be in order; it just happened to appear that way in this toy example. Any help is massively appreciated. Thanks!

score 0 · Accepted Answer · answered Aug 29 '22 at 17:17

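import pandas as pd

# Sample data from the question
EK = [1, 2, 2, 3, 4, 4, 2, 3]
er = [0.5, 0.76, 0.5, 0.15, 0.5, 0.9, 0.5, 0.5]

# The dictionary keys must be strings; they become the column names
df = pd.DataFrame({'ErrorKappa': EK, 'error': er})

# Group rows by ErrorKappa and keep the largest error in each group
df.groupby('ErrorKappa').max()

Output:

            error
ErrorKappa       
1            0.50
2            0.76
3            0.50
4            0.90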
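If the data is already in NumPy arrays and a plain matrix is wanted back, the same grouping can probably be done without pandas. The following is only a sketch, not part of the original answer: the variable names `kappa`, `error`, `keys`, and `max_error` are made up for illustration, and it assumes both arrays are one-dimensional and of equal length.

```python
import numpy as np

# Same sample data as the answer above (names are illustrative)
kappa = np.array([1, 2, 2, 3, 4, 4, 2, 3])
error = np.array([0.5, 0.76, 0.5, 0.15, 0.5, 0.9, 0.5, 0.5])

# keys: sorted distinct kappa values
# inverse: for every row, the index of its kappa value within keys
keys, inverse = np.unique(kappa, return_inverse=True)

# Start every group at -inf, then take an unbuffered element-wise maximum
# so that repeated indices in `inverse` are all accounted for
max_error = np.full(keys.shape, -np.inf)
np.maximum.at(max_error, inverse, error)

# Stack the two columns back into a (n_groups, 2) matrix
result = np.column_stack((keys, max_error))
print(result)
```

Note that `np.unique` returns the keys sorted, which is harmless here since the question says the order does not matter.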

Mark Bower


Thanks for this. However, I get the error "TypeError: unhashable type: 'numpy.ndarray'", even though I am passing in lists, for some strange reason. – Jamie North Aug 29 '22 at 18:02
Hmm. An internet search for that error string returns a number of links: https://stackoverflow.com/questions/9022656/typeerror-unhashable-type-numpy-ndarray, https://linuxhint.com/unhashable-type-numpy-ndarray/, and so on. I would read through those. One possible cause is using a NumPy array as a dictionary key. That would happen if you forgot the quotes around the key: "... pd.DataFrame({EK: EK, ..." where "EK" is the NumPy array, instead of "... pd.DataFrame({'EK': EK, ..." – Mark Bower Aug 29 '22 at 19:55
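The cause suggested in the comment above can be reproduced in a few lines. This is a sketch under the assumption that `EK` and `er` are NumPy arrays at the point where the DataFrame is built; whether that matches the asker's actual code is not known. The names `error_message` and `result` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed setup: the inputs are NumPy arrays rather than plain lists
EK = np.array([1, 2, 2, 3, 4, 4, 2, 3])
er = np.array([0.5, 0.76, 0.5, 0.15, 0.5, 0.9, 0.5, 0.5])

# Wrong: the arrays themselves are used as dictionary keys.
# Dictionary keys must be hashable and NumPy arrays are not,
# so building the dict fails before pandas is even involved.
try:
    bad = {EK: EK, er: er}
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Right: quoted strings as keys, arrays as values
df = pd.DataFrame({'ErrorKappa': EK, 'error': er})
result = df.groupby('ErrorKappa', as_index=False)['error'].max()
print(result)
```

Passing `as_index=False` keeps `ErrorKappa` as an ordinary column, which may be more convenient if the result is later converted back to an array with `result.to_numpy()`.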
