As you will understand, I am pretty new to pandas and have found myself stuck on the problem below. Let's say I have the following Facebook data (completely randomised for the sake of the example):

    Ad Set Name                    Impresions  Link Clicks
0   253-Page.Visitors.10.Days             100            3
1   254-Cart.Abandoners.10.Days           300            9
2   253-Page.Visitors.10.Days             900           27
3   256-LAL.5%.Add.to.Cart              2,700           81
4   256-LAL.5%.Freq.Visits              8,100          243
5   254-Cart.Abandoners.10.Days        24,300          729
6   254-Cart.Abandoners.10.Days        72,900        2,187

Now what I want to do is to create a new column called 'audience' and populate it based on these 3 conditions:

  • If the column 'Ad Set Name' contains 'Page.Visitors', the corresponding cell in the new 'audience' column should be populated with 'Page Visitors'.
  • If it contains 'Cart.Abandoners', it should be populated with 'Cart Abandoners'.
  • Finally, if it contains 'LAL', it should be populated with 'Lookalikes'.

This is how I tried to do it:

for i in data['Ad Set Name']:
    if 'Page.Visitors' in i:
        data.loc[i,'audience'] = 'Page Visitors'
    elif 'Cart.Abandoners' in i:
        data.loc[i,'audience'] = 'Cart Abandoners'
    else:
        data.loc[i,'audience'] = 'Lookalikes'
data.head()

but the column I get back is filled with NaN.

Any help would be much appreciated!

Sajan
truelis
  • Take a look at numpy.where - https://numpy.org/doc/1.18/reference/generated/numpy.where.html – Sajan May 19 '20 at 19:32
  • Does this answer your question? [Pandas conditional creation of a series/dataframe column](https://stackoverflow.com/questions/19913659/pandas-conditional-creation-of-a-series-dataframe-column) – Davide Fiocco May 19 '20 at 20:46
  • Thanks a lot, guys. So I think numpy.where will work only if I have two choices, right? @DavideFiocco yes, the provided answer helps, as it has some useful solutions like numpy.select, which is also mentioned in this thread – truelis May 20 '20 at 00:30
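
On the question of numpy.where and more than two outcomes: calls to numpy.where can be nested, so that the "else" branch of one call is itself another call. A minimal sketch, assuming a small made-up frame with the same 'Ad Set Name' column (the three sample rows and the helper variable names are illustrative, not the original data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows mirroring the question's 'Ad Set Name' values.
df = pd.DataFrame({
    "Ad Set Name": [
        "253-Page.Visitors.10.Days",
        "254-Cart.Abandoners.10.Days",
        "256-LAL.5%.Add.to.Cart",
    ]
})

names = df["Ad Set Name"]

# regex=False makes '.' match a literal dot instead of any character.
is_visitor = names.str.contains("Page.Visitors", regex=False)
is_abandoner = names.str.contains("Cart.Abandoners", regex=False)

# The inner np.where supplies the "else" branch of the outer one.
df["audience"] = np.where(
    is_visitor,
    "Page Visitors",
    np.where(is_abandoner, "Cart Abandoners", "Lookalikes"),
)
```

Nesting becomes hard to read beyond two or three levels, which is where numpy.select tends to be the more comfortable choice.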

3 Answers

What you need is the following. First, import numpy:

import numpy as np

Then set the 'audience' column to NaN and fill it with your data like this:

df['audience'] = np.nan
df['audience'] = df['audience'].astype(object)  # object dtype so the column can hold strings

# A boolean mask can be passed to .loc directly; regex=False matches '.' literally
df.loc[df['Ad Set Name'].str.contains('Page.Visitors', regex=False), 'audience'] = 'Page Visitors'
df.loc[df['Ad Set Name'].str.contains('Cart.Abandoners', regex=False), 'audience'] = 'Cart Abandoners'
df.loc[df['Ad Set Name'].str.contains('LAL', regex=False), 'audience'] = 'Lookalikes'

This will give you the following dataframe:

    Ad Set Name                    Impresions  Link Clicks  audience
0   253-Page.Visitors.10.Days             100            3  Page Visitors
1   254-Cart.Abandoners.10.Days           300            9  Cart Abandoners
2   253-Page.Visitors.10.Days             900           27  Page Visitors
3   256-LAL.5%.Add.to.Cart               2700           81  Lookalikes
4   256-LAL.5%.Freq.Visits               8100          243  Lookalikes
5   254-Cart.Abandoners.10.Days         24300          729  Cart Abandoners
6   254-Cart.Abandoners.10.Days         72900         2187  Cart Abandoners
coco18
  • Thanks for answering this! I tried applying this method but it gives me back the error "list index out of range" – truelis May 19 '20 at 23:52

I think you can use numpy.select:

import numpy

conditions = [
    df["Ad Set Name"].str.contains("Page.Visitors", regex=False),
    df["Ad Set Name"].str.contains("Cart.Abandoners", regex=False),
]

choices = ["Page Visitors", "Cart Abandoners"]

# Rows matching neither condition fall through to the default
df["audience"] = numpy.select(conditions, choices, default="Lookalikes")
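
To try this end to end, here is a minimal sketch that assumes a small frame rebuilt from a few of the question's sample rows. It adds an explicit 'LAL' condition and keeps the default for names that match nothing. The 'Unknown' label and the extra '257-Newsletter.Signups' row are illustrative assumptions, not part of the original data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Ad Set Name": [
        "253-Page.Visitors.10.Days",
        "254-Cart.Abandoners.10.Days",
        "256-LAL.5%.Add.to.Cart",
        "257-Newsletter.Signups",  # hypothetical row that matches no rule
    ]
})

names = df["Ad Set Name"]

# Conditions are checked in order; the first True one wins for each row.
conditions = [
    names.str.contains("Page.Visitors", regex=False),
    names.str.contains("Cart.Abandoners", regex=False),
    names.str.contains("LAL", regex=False),
]
choices = ["Page Visitors", "Cart Abandoners", "Lookalikes"]

# Rows where every condition is False receive the default label.
df["audience"] = np.select(conditions, choices, default="Unknown")
```

Spelling out the third condition means an unexpected ad set name shows up as 'Unknown' rather than being silently labelled 'Lookalikes'.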
D. Seah

Another user initially posted the answer below, which now appears to have been deleted. It also worked perfectly and was very easy for me, as a pandas newbie, to read and understand:

def replace_ad_set(text):
    if "Page.Visitors" in text:
        return "Page Visitors"
    elif 'Cart.Abandoners' in text:
        return "Cart Abandoners"
    elif "LAL" in text:
        return "Lookalikes"

data["audience"] = data["Ad Set Name"].apply(replace_ad_set)

data
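
As a side note, the loop in the question most likely returned NaN because `for i in data['Ad Set Name']` yields the cell values, so `data.loc[i, ...]` treats each ad set name as a row label. Those labels do not exist in the index, so pandas appends new rows instead of filling the existing ones. A minimal sketch of the same loop iterating over index labels instead, assuming a three-row made-up frame (the sample rows are illustrative):

```python
import pandas as pd

# Hypothetical sample rows mirroring the question's 'Ad Set Name' values.
data = pd.DataFrame({
    "Ad Set Name": [
        "253-Page.Visitors.10.Days",
        "254-Cart.Abandoners.10.Days",
        "256-LAL.5%.Freq.Visits",
    ]
})

# Create the target column up front so every row starts out empty.
data["audience"] = None

# .items() yields (index label, value) pairs, so .loc gets a real row label.
for idx, name in data["Ad Set Name"].items():
    if "Page.Visitors" in name:
        data.loc[idx, "audience"] = "Page Visitors"
    elif "Cart.Abandoners" in name:
        data.loc[idx, "audience"] = "Cart Abandoners"
    elif "LAL" in name:
        data.loc[idx, "audience"] = "Lookalikes"
```

The apply version above does the same thing with less code, so this loop is mainly useful for seeing where the original attempt went wrong.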
truelis