I have created a matrix by concatenating two arrays as column vectors, so I have something like the following:

ErrKappa error
1 0.5
2 0.76
2 0.5
3 0.15
4 0.5
4 0.9
2 0.5
3 0.05

I then need to produce another matrix that keeps only the maximum error for each distinct ErrKappa value, so the new one would look like the following:

ErrKappa error
1 0.5
2 0.76
3 0.5
4 0.9

Please note that ErrKappa doesn't need to be put in order, it just so happened that it appeared like that in this toy example. Any help is massively appreciated. Thanks!

Timus

1 Answer

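import pandas as pd

# ErrKappa labels and the error recorded for each one
EK = [1, 2, 2, 3, 4, 4, 2, 3]
er = [0.5, 0.76, 0.5, 0.15, 0.5, 0.9, 0.5, 0.5]

df = pd.DataFrame({'ErrorKappa': EK, 'error': er})
# Group rows by ErrorKappa and keep the largest error in each group
print(df.groupby('ErrorKappa').max())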
            error
ErrorKappa       
1            0.50
2            0.76
3            0.50
4            0.90
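
If the data already lives in the two-column matrix described in the question, the same grouping can be applied without splitting it back into lists. This is only a minimal sketch: `mat`, `result`, and `out` are hypothetical names, the values are the ones used in the lists above, and it assumes column 0 holds ErrKappa and column 1 holds the error.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the concatenated matrix:
# column 0 is ErrKappa, column 1 is the error.
mat = np.array([
    [1, 0.5], [2, 0.76], [2, 0.5], [3, 0.15],
    [4, 0.5], [4, 0.9], [2, 0.5], [3, 0.5],
])

df = pd.DataFrame(mat, columns=['ErrKappa', 'error'])

# as_index=False keeps ErrKappa as an ordinary column;
# sort=False keeps the groups in order of first appearance.
result = df.groupby('ErrKappa', as_index=False, sort=False)['error'].max()

# Convert back to a plain two-column NumPy matrix.
out = result.to_numpy()
print(out)
```

Because the matrix has a single float dtype, the ErrKappa column comes back as floats (1.0, 2.0, ...); cast it with `astype(int)` if integer labels are needed.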
Mark Bower
  • Thanks for this. However, for some strange reason I get "TypeError: unhashable type: 'numpy.ndarray'" even though I am passing in lists. – Jamie North Aug 29 '22 at 18:02
  • Hmm. An internet search for that error string returns a number of links: https://stackoverflow.com/questions/9022656/typeerror-unhashable-type-numpy-ndarray, https://linuxhint.com/unhashable-type-numpy-ndarray/, and so on. I would read through those. One possible cause is using a NumPy array as a dictionary key. That would happen if you forgot the quotation marks around the column name: "... pd.DataFrame({EK: EK, ..." where "EK" is the NumPy array, instead of "... pd.DataFrame({'EK': EK, ..." – Mark Bower Aug 29 '22 at 19:55
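
The possible cause mentioned in the last comment can be reproduced with a short sketch. It only illustrates that one scenario, an array used as a dictionary key, and is not necessarily what went wrong in the asker's code. The arrays below are made-up sample data and `message` is a hypothetical helper variable.

```python
import numpy as np
import pandas as pd

# Made-up sample data, stored as NumPy arrays rather than lists.
EK = np.array([1, 2, 2, 3])
er = np.array([0.5, 0.76, 0.5, 0.15])

# Missing quotation marks: the array itself becomes the dictionary key.
# Building the dict fails before pandas is even called.
message = None
try:
    pd.DataFrame({EK: EK, 'error': er})
except TypeError as exc:
    message = str(exc)
    print(message)  # the text complains about an unhashable numpy.ndarray

# With a string key, the same data builds a DataFrame as expected.
df = pd.DataFrame({'ErrorKappa': EK, 'error': er})
print(df.groupby('ErrorKappa').max())
```

Strings are hashable and arrays are not, so quoting the column name is enough to avoid the error in this scenario.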