Find max 2 (or n) values in a column from a CSV file (Python)

Question

I want to find the max values in a column imported from a CSV file. For the max value I used this code (I noticed it also prints the value of the left column next to the max; why?):

data = pandas.read_csv(dataset, sep=',', usecols=[1])
maxValue=data.max(axis=0)[1]

How can I get the first 2 (or n) max values (using pandas, scipy or numpy)? I tried the following, but it doesn't work:

secondM=data[data[1]!=maxValue][data[1]].max()

See if this solves it : http://stackoverflow.com/questions/6910641/how-to-get-indices-of-n-maximum-values-in-a-numpy-array — Divakar, Oct 16 '16 at 16:05

MaxU - stand with Ukraine · Accepted Answer · 2016-10-16T18:02:09.660

UPDATE: a more general solution that shows the N largest values for all columns:

In [393]: df
Out[393]:
   a  b  c
0  2  9  9
1  4  8  0
2  8  6  3
3  0  8  3
4  3  6  0

In [394]: N = 2
     ...: pd.DataFrame([df[c].nlargest(N).values.tolist() for c in df.columns],
     ...:              index=df.columns,
     ...:              columns=['{}_largest'.format(i) for i in range(1, N+1)]).T
     ...:
Out[394]:
           a  b  c
1_largest  8  9  9
2_largest  4  8  3

In [395]: N = 3
     ...: pd.DataFrame([df[c].nlargest(N).values.tolist() for c in df.columns],
     ...:              index=df.columns,
     ...:              columns=['{}_largest'.format(i) for i in range(1, N+1)]).T
     ...:
Out[395]:
           a  b  c
1_largest  8  9  9
2_largest  4  8  3
3_largest  3  8  3
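
Since the question also mentions numpy, the same table can be sketched without `nlargest` by sorting each column in descending order and keeping the first N rows. This is an alternative sketch, not part of the original answer: the frame is rebuilt by hand from the `df` printed above, the name `top_n` is only illustrative, and the approach assumes the columns are numeric and contain no missing values.

```python
import numpy as np
import pandas as pd

# Same data as the df printed above (rebuilt by hand so the example is self-contained).
df = pd.DataFrame({'a': [2, 4, 8, 0, 3],
                   'b': [9, 8, 6, 8, 6],
                   'c': [9, 0, 3, 3, 0]})
N = 2

# np.sort with axis=0 sorts each column independently in ascending order;
# reversing the rows gives descending order, and [:N] keeps the N largest.
values = np.sort(df.to_numpy(), axis=0)[::-1][:N]

# top_n is a hypothetical name; the row labels mirror the '{}_largest' scheme above.
top_n = pd.DataFrame(values,
                     columns=df.columns,
                     index=['{}_largest'.format(i) for i in range(1, N + 1)])
print(top_n)
```

Sorting the whole array is simpler to read than a per-column loop, at the cost of sorting values that are then discarded.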

OLD answer:

I assume that you want the 2 (or n) largest values for a single column (since you used usecols=[1]):

In [279]: df
Out[279]:
   a  b  c
0  1  0  2
1  0  7  7
2  7  7  9
3  5  1  6
4  7  0  3
5  4  0  4
6  0  6  1
7  8  3  6
8  2  8  8
9  2  9  2

In [280]: df['a'].nlargest(2)
Out[280]:
7    8
2    7
Name: a, dtype: int32
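
pandas prints a Series with its index labels on the left (here 7 and 2, the rows where the values occur), which is likely the "left column" the question noticed. A minimal sketch of separating the values from their labels, with the frame rebuilt by hand from the `df` above; the names `top2` and `second_largest` are only illustrative.

```python
import pandas as pd

# Same data as the df printed above (rebuilt by hand so the example is self-contained).
df = pd.DataFrame({'a': [1, 0, 7, 5, 7, 4, 0, 8, 2, 2],
                   'b': [0, 7, 7, 1, 0, 0, 6, 3, 8, 9],
                   'c': [2, 7, 9, 6, 3, 4, 1, 6, 8, 2]})

top2 = df['a'].nlargest(2)

print(top2.tolist())        # values only, largest first
print(top2.index.tolist())  # row labels where those values occur

# The last element of the top 2 is the second-largest value.
second_largest = top2.iloc[-1]
print(second_largest)
```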

NOTE: if your CSV file doesn't have labels (column names), you can read it this way (assuming that you want to read only the second (1) and fourth (3) columns from the CSV file):

df = pd.read_csv(r'/path/to/file.csv', sep=',', usecols=[1,3],
                 header=None, names=['col1','col2'])
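
As a rough end-to-end sketch of the call above, the following reads a header-less CSV and takes the two largest values of one column, both by name and by position. The CSV content in `csv_text` is made up for illustration and stands in for a real file; `by_name` and `by_position` are hypothetical names.

```python
import io
import pandas as pd

# Hypothetical CSV content without a header row, standing in for a real file.
csv_text = "10,1.5,x,7\n11,9.0,y,3\n12,4.2,z,8\n13,6.1,w,5\n"

# Load only the second (1) and fourth (3) columns and give them names.
df = pd.read_csv(io.StringIO(csv_text), sep=',', usecols=[1, 3],
                 header=None, names=['col1', 'col2'])

by_name = df['col1'].nlargest(2)         # select by the name passed in names=
by_position = df.iloc[:, 0].nlargest(2)  # select the first loaded column by position
print(by_name)
```

Selecting by position with `iloc` is handy when the loaded columns have no meaningful names.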

Thanks Max, that should be right, but I am quite new and still have some problems. If I don't have labels, is this correct: — Joe, Oct 16 '16 at 16:34
`data = pandas.read_csv(dataset, sep=',')`; `df = pandas.DataFrame(data)`; `max2 = df[1].nlargest(2)` — Joe, Oct 16 '16 at 16:35
@Giuseppe, you can do it this way: `df.iloc[:, 0].nlargest(2)`, where `0` is your column number — MaxU - stand with Ukraine, Oct 16 '16 at 16:37
@Giuseppe, you are welcome! Please consider [accepting](http://meta.stackexchange.com/a/5235) an answer if you think it has answered your question — MaxU - stand with Ukraine, Oct 16 '16 at 17:02
