
I have a dataset where I have to drop rows that match any of several values in a column. I tried this, but I do not know how to do it with multiple values:

import pandas as pd
df = pd.read_csv("data.csv")
new_df = df[df.location == 'New York']
new_df.count()

I also tried another method, but again I do not know how to do it with multiple values:

import pandas as pd
df = pd.read_csv("data.csv")
df.drop(df[df['location'] == 'New York'].index, inplace=True)

I want to delete the rows with the values New York, Boston, and Austin, and keep the rows for all other locations.

Also, I want to replace the values of the location column with numbers: if San Francisco, change the value to 1; if Miami, change it to 2; and so on, so that every value in location is replaced.

  • Does this answer your question? [How to filter Pandas dataframe using 'in' and 'not in' like in SQL](https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql) – mcskinner Apr 19 '20 at 00:59

3 Answers


You can use the query method with a variable holding all the cities you want to filter out:

import numpy as np
import pandas as pd

np.random.seed(0)
cities = ['New York', 'Chicago', 'Miami']
data = pd.DataFrame(dict(cities=np.random.choice(cities, 10),
                         values=np.random.choice(10, 10)))

data.cities.unique()  # array(['New York', 'Chicago', 'Miami'], dtype=object)
exclude = ['New York', 'Chicago']  # renamed from `filter` to avoid shadowing the builtin
data_filtered = data.query('cities not in @exclude').copy()
data_filtered.cities.unique()  # array(['Miami'], dtype=object)

To replace the values, you can set them manually:

data_filtered.loc[data_filtered.cities == 'Miami', 'values'] = 2
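If every location ultimately needs its own numeric code, you can also replace them all in one step with a dict and Series.map. A minimal sketch, where city_codes is a hypothetical mapping (not part of the original answer):

city_codes = {'San Francisco': 1, 'Miami': 2}  # hypothetical city -> number mapping
data_filtered['cities'] = data_filtered['cities'].map(city_codes)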
jcaliz

I don't quite follow what you mean by dropping rows with multiple columns, but to check for multiple values you could use: new_df = df[df.location.isin(['New York', 'Boston'])]. Note that Python's plain in operator does not work element-wise on a Series, which is why isin is needed here.
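Since the question asks to delete those rows rather than keep them, here is a minimal sketch that inverts the mask with ~ (assuming the same location column as in the question):

import pandas as pd

df = pd.read_csv("data.csv")
# Keep only the rows whose location is NOT in the list
new_df = df[~df.location.isin(['New York', 'Boston', 'Austin'])]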

Ralvi Isufaj

You can try:

# Drop the rows with location "New York", "Boston", "Austin" (1)
df = df[~df["location"].isin(["New York", "Boston", "Austin"])]

# Replace locations with numbers: (2)
loc_map = {"San Francisco": 1, "Miami": 2, ...}
df["location"] = df["location"].map(loc_map)

For step (2), if you have many values, you can build loc_map automatically:

loc_map = {loc: i + 1 for i, loc in enumerate(df.location.unique())}
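For example, if the remaining unique locations were San Francisco and Miami (an illustrative assumption, in that order), this would produce:

loc_map  # {'San Francisco': 1, 'Miami': 2}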

Hope this helps.

Hoa Nguyen