
I have a huge CSV file with several columns. I want to extract records based on particular column values.

So far I have read the CSV file into pandas, and now I want to extract rows based on particular column values. Below are the code and sample CSV data.

import pandas as pd
import csv
data=pd.read_csv("raw_data_for_dryad.csv",usecols=["CaptureEventID","Species"])
print(data)

x=data.query('Species =="leopard" or Species=="cheetah" or Species=="buffalo" or Species=="human"',inplace=True)
print(x)

I tried the code above, but it prints None. This is a sample of my CSV:

CaptureEventID  Species
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  blank
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  human
ASG0000001  blank
ASG0000001  human
ASG0000002  gazelleThomsons
ASG0000002  gazelleThomsons
ASG0000002  gazelleThomsons

I want to extract only the rows in which the value of the Species column is either human or gazelleThomsons. How can this be done?
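A note on the None result: `DataFrame.query(..., inplace=True)` modifies the DataFrame in place and returns None, so assigning its result to `x` leaves `x` empty. A minimal sketch of the difference follows, assuming a comma-separated file; the in-memory `csv_text` is a hypothetical stand-in for raw_data_for_dryad.csv, and `in_place_result` is just an illustrative name.

```python
import io

import pandas as pd

# Hypothetical stand-in for raw_data_for_dryad.csv (assumed comma-separated).
csv_text = """CaptureEventID,Species
ASG0000001,human
ASG0000001,blank
ASG0000002,gazelleThomsons
"""
data = pd.read_csv(io.StringIO(csv_text), usecols=["CaptureEventID", "Species"])

# With inplace=True, query() filters the frame it is called on and returns None.
# A copy is used here so that `data` itself stays untouched.
in_place_result = data.copy().query('Species == "human"', inplace=True)
print(in_place_result)

# Without inplace=True, query() returns a new, filtered DataFrame.
x = data.query('Species == "human" or Species == "gazelleThomsons"')
print(x)
```

Dropping `inplace=True` from the original call should be enough to make `print(x)` show the matching rows.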

Ankit
  • Try looking at `isin`. – BENY Apr 25 '19 at 18:06
  • `data.loc[ data.Species.isin('human', 'gazelleThomsons'), :]` – eliu Apr 25 '19 at 18:11
  • @eliu If I give 5 arguments it throws an error. How can I modify the line above to handle that? – Ankit Apr 25 '19 at 18:26
  • `data.loc[ data.Species.isin('human', 'gazelleThomsons', 'leopard', 'more animals'), :]` like that? – eliu Apr 25 '19 at 18:32
  • @eliu Yes, something like that, but this gives me boolean values as output. How can I get the actual values instead of the boolean values? – Ankit Apr 25 '19 at 18:39
  • Can you copy and paste my line exactly as written into your code, right after you read your CSV into "data", and then put a print() around it? – eliu Apr 25 '19 at 18:41
  • @eliu `z=data.loc[ data.Species.isin('leopard', 'cheetah','buffalo','human'), :]` raises `TypeError: isin() takes 2 positional arguments but 5 were given`. I tried putting the names in a list and passing that list to isin(), but it gives output only for the first animal, i.e. leopard. – Ankit Apr 25 '19 at 18:44
  • haha, I was wrong: `isin( ['human', 'gazelleThomsons', 'leopard', 'more animals'] )`. The argument to `isin()` should be a list. – eliu Apr 25 '19 at 18:47
  • @eliu same concept :) but this worked. Thanks a lot :) – Ankit Apr 25 '19 at 18:51
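
To summarize the fix reached in the comments: `Series.isin()` takes a single list-like argument and returns a boolean mask, and indexing the DataFrame with that mask gives the actual rows. The sketch below assumes a comma-separated file; `csv_text`, `wanted`, `mask`, and `filtered` are hypothetical names used only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file (assumed comma-separated).
csv_text = """CaptureEventID,Species
ASG0000001,human
ASG0000001,blank
ASG0000002,gazelleThomsons
ASG0000003,leopard
"""
data = pd.read_csv(io.StringIO(csv_text))

# isin() takes ONE list-like argument, not several positional strings.
wanted = ["human", "gazelleThomsons"]

# The result of isin() is a boolean Series: one True/False per row.
mask = data["Species"].isin(wanted)
print(mask)

# Indexing with the mask returns the matching rows, not the booleans.
filtered = data.loc[mask, :]
print(filtered)
```

The comparison is case-sensitive, so "HUMAN" would not match rows that contain "human".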
