How do I select a single row by value of column[0] in pandas data frame fast and easy

Question

I have loaded a two column csv file in pandas data frame that has a unique key in column[0] (integer index) of every row and a value in column[1] of every row. It has no header.

I want simply to look up the single value in column[1] (or the whole row series) by using a key to match against the value column[0] to locate the row.

of course efficient etc. As if it were a dictionary with the string keys like value = dict.get(key)

example csv file:

a, "valueForA"
x, "valueForX"
z, "valueForZ"

...

df = pandas.read_csv(fileAbove)

row = df.wayToFindRowByColumn0Value('x')  # row is a series of 2 elements

row[0] should be the key 'x'
row[1] should be 'valueForX'

row = df.wayToFindRowByColumn0Value('notThere')  # row should be None if not found

Add small example dataset and what you expected output is. [How to make a good reproducible pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — Erfan, Aug 11 '19 at 21:55

GZ0 · Answer 1 · 2019-08-11T23:27:59.600

3

My suggestion is to store the values as a simple pandas Series (rather than a data frame) using the key field as index and the value field as series values. This can be achieved by calling the read_csv method in this way (some details are from the OP's answer).

s = pd.read_csv('/data/companyDataByCusip.csv', encoding='utf_8', names=['cusip','value'], index_col='cusip', squeeze=True)

Now you can access the Series just like a dictionary:

val = s[key]

If a data frame is really needed, you could remove the squeeze argument from read_csv and access the value using df.at[key,'value'].

edited Aug 11 '19 at 23:27

answered Aug 11 '19 at 23:10

GZ0

4,055
1
10
21

thanks another way - works great for single values in a row. – peterk Aug 12 '19 at 16:30

alexbclay · Answer 2 · 2019-08-11T22:44:57.460

1

Use df.at[index, column]. See documentation here.

For example:

import pandas as pd

df = pd.DataFrame(data=[{'value': x*x} for x in range(10)], 
                  index=[x for x in range(10)])
print(df)
print('At 4: {}'.format(df.at[4, 'value']))

Output:

Update:

To get a dataframe as a return, can try something like this:

row = df[df.index.values == 4]
print(row)

Output:

   value
4     16

edited Aug 11 '19 at 22:44

answered Aug 11 '19 at 21:52

alexbclay

1,389
14
19

almost right except I want to get the row series or value in integer indexed column [1] for the row with the non-integer string name in column[0] of the row. df.at takes two non integer lookup names, and df.iat takes two integer indices like indexing a matrix. – peterk Aug 11 '19 at 22:17
Does it need to be a pandas Series? Or can you just build your own list? ie: `row = [4, df.at[4, 'value']]` – alexbclay Aug 11 '19 at 22:35
no I could parse and use a dict or other map type. I have been asked to do this in pandas. – peterk Aug 11 '19 at 22:44

kederrac · Answer 3 · 2019-08-11T22:00:14.873

0

You can try:

df[df.column0==key].column1.iloc[0]

edited Aug 11 '19 at 22:00

answered Aug 11 '19 at 21:51

kederrac

16,819
6
32
55

score 0 · Answer 4 · answered Aug 11 '19 at 22:12

0

A simple loc can do th e job:

df.loc[[integer index]]

answered Aug 11 '19 at 22:12

M-M

440
2
16

score 0 · Answer 5 · answered Aug 11 '19 at 22:50

Ok poking around looks like using .loc() is the way but it reqires you set a column explicitly as the row index field, and a set explicit column names. It works, though I havn't speed tested it.

The doc I found is here:

https://www.kdnuggets.com/2019/04/pandas-dataframe-indexing.html

#  read in the file of the cusip database to a data frame
df = pandas.read_csv('/data/companyDataByCusip.csv',encoding='utf_8', names=['cusip','value'])
df.set_index('cusip',inplace=True)

rec = df.loc[cusip,:]
print(rec) # series for row

How do I select a single row by value of column[0] in pandas data frame fast and easy

5 Answers5