1

I have the following dataset for Manhattan neighborhoods with the most common venues in each neighborhood:

df

I made a list of venues:

fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place', 'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym', 'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']

I want to add a column to the dataframe (call it "Fit Neighborhood", for example) by comparing the most common venues of each neighborhood (5 columns) with the list `fit_venues`, and store the result in that column (Yes/No or True/False). For example, the first two rows should return Yes/True and the third row should return No/False.

Any help?

Feras
  • 11
  • 1
  • 2
  • Welcome! Can you please add a minimal sample of your input data as text, along with the corresponding output you want? Please also include any attempts you have already made to produce the output. – smj Nov 19 '18 at 20:03
  • Please edit your question to add the output of `print(yourdataframe.head())` so we can see what kind of data you are working with. – soundstripe Nov 20 '18 at 16:53
  • There's a similar question with an extended explanation [here](https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql) – Julio Aug 06 '20 at 17:39

2 Answers

1

See if this works:

fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place', 'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym', 'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']

df["binary_check"] = df[df["5th Most Common Venue"].isin(fit_venues)]
pizza lover
  • 503
  • 4
  • 15
  • It gives me an error: `ValueError: Wrong number of items passed 7, placement implies 1`. However, I need to compare against all of the common venue columns, from 1st to 5th. – Feras Nov 20 '18 at 07:39
  • Not sure why this worked for my case but == did not work. This worked for me **final_table[final_table["Items"].isin([['I2','I3']])]** but this did not work **final_table[final_table.Items == "['I1', 'I2']"].Items** but – Vengenzz Vicky Jan 22 '23 at 14:21
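
The `ValueError` in the first comment most likely arises because `df[mask]` returns the filtered rows as a whole DataFrame (7 columns here), which cannot be stored in a single column; the boolean mask itself is what belongs in the new column. A minimal sketch of a row-wise check across all five venue columns follows. The sample rows, the `Neighborhood` column, the `Fit Count` column, and every column name other than "5th Most Common Venue" are assumptions made for illustration, not taken from the asker's data.

```python
import pandas as pd

fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place',
              'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym',
              'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']

# Assumed column names; only "5th Most Common Venue" appears in the thread.
venue_cols = ['1st Most Common Venue', '2nd Most Common Venue',
              '3rd Most Common Venue', '4th Most Common Venue',
              '5th Most Common Venue']

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame(
    [
        ['Battery Park City', 'Park', 'Coffee Shop', 'Hotel', 'Gym', 'Wine Shop'],
        ['Chelsea', 'Coffee Shop', 'Art Gallery', 'Bakery', 'Ice Cream Shop', 'Nightclub'],
        ['Chinatown', 'Chinese Restaurant', 'Bakery', 'Cocktail Bar', 'Spa', 'Dessert Shop'],
    ],
    columns=['Neighborhood'] + venue_cols,
)

# Boolean DataFrame: True wherever a cell's value is in fit_venues.
matches = df[venue_cols].isin(fit_venues)

# True if at least one of the five venue columns matches in that row.
df['Fit Neighborhood'] = matches.any(axis=1)

# Optional: how many of the five venues matched, useful for a stricter rule.
df['Fit Count'] = matches.sum(axis=1)

print(df[['Neighborhood', 'Fit Neighborhood', 'Fit Count']])
```

If a neighborhood should only count as fit when every one of its five venues is in the list, `matches.all(axis=1)` would be the stricter variant; the question does not say which rule is intended.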
1

Have you tried using DataFrame.isin()?

You didn't give me the names of your most common venue columns, so I'll assume they are the only columns in the DataFrame (df):

fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place', 'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym', 'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']

# axis=1 reduces across the columns, giving one True/False per row
df['Fit Neighborhood'] = df.isin(fit_venues).any(axis=1)
soundstripe
  • 1,454
  • 11
  • 19
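
A note on the reduction axis with `DataFrame.isin()`: called with no arguments, `.any()` reduces down each column and returns one value per column, whereas `.any(axis=1)` reduces across each row, which is what a per-neighborhood flag needs. The sketch below shows the difference on a small frame and maps the result to Yes/No. The shortened venue list and the column names `Venue A` and `Venue B` are hypothetical.

```python
import pandas as pd

fit_venues = ['Coffee Shop', 'Park', 'Gym']  # shortened list, for illustration only

# Hypothetical two-column frame standing in for the venue columns.
df = pd.DataFrame({
    'Venue A': ['Park', 'Bakery', 'Spa'],
    'Venue B': ['Gym', 'Coffee Shop', 'Bar'],
})

hits = df.isin(fit_venues)

# Default axis=0: one boolean per column, indexed by column name.
per_column = hits.any()

# axis=1: one boolean per row, indexed like the DataFrame's rows.
per_row = hits.any(axis=1)

# Map the booleans to Yes/No if text labels are preferred.
df['Fit Neighborhood'] = per_row.map({True: 'Yes', False: 'No'})

print(per_column)
print(df)
```

Assigning `per_column` to a new column would not raise an error, but its index holds column names rather than row labels, so pandas would align on labels that do not match and the column would end up filled with missing values.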