1

I have the following DataFrame (this is a condensed version - it goes back a long time)

Week Commencing     A1    A2    A3      A4
2016-01-03          28    1375  1999    1345
2016-01-10          48    1552  2428    1337
2016-01-17          43    1895  2615    1420
2016-01-24          29    1950  2568    1385
2016-01-31          41    1912  2577    1277
2016-02-07          29    2176  2771    1403
2016-02-14          50    2229  3013    1450
2016-02-21          60    2271  3029    1489
2016-02-28          43    2140  3133    1594
2016-03-06          51    2080  3140    1498

I want to create a new column that specifies a label based on a specific period in time. IE: if row is before a certain date, return a word.

I have tried the following:

def action(x):
    if x == "True":
        return "Before Migration"
    if x == "False":
        return "After Migration"

df.index.apply(action, axis=1)

I get the following error: "AttributeError: 'DatetimeIndex' object has no attribute 'apply'".

I have changed this to a string, tried resetting the index so I can apply to a column rather than index and it doesn't work.

I have also tried this:

if df.index < '2016-02-14':
    df["Migration_Type"] = "Before Migration"
else:
    df["Migration_Type"] = "After Migration"

Error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Any suggestions on better ways are also appreciated.

AdrianC
  • 383
  • 4
  • 18
  • I used this and it worked - thanks for this. Sometimes hard knowing what to search for. Happy for you to remove if need be. Apologies for this – AdrianC Feb 22 '18 at 02:34

2 Answers2

2

Try something like this:

# First, initialize a new column, set it to before migration by default
df = df.assign(Migration_Type = "Before Migration")
# Then, assign "after migration" to all rows after your chosen date
df.loc[df.index >= '2016-02-14', 'Migration_Type'] = "After Migration"
sacuL
  • 49,704
  • 8
  • 81
  • 106
  • This worked really well. Nice and simple too. More for my own learning, but was there anything wrong with my approach? Can you get a similar result? – AdrianC Feb 22 '18 at 02:34
  • It's actually very similar to what you were trying to do with your second attempt (the loop), but in a more pandas-like (vectorized) fashion. As far as your apply method, you were calling `.apply` as a method on the index of your dataframe, which doesn't have such a method. `.apply` is good for applying a simple function to a dataframe or series, where the arguments can be implied easily. Your function was just returning "Before Migration" if `x` is True, but there wasn't even any indication of what `x` was, nor where the "Before Migration" was meant to be returned to. – sacuL Feb 22 '18 at 02:40
  • Thank you very much for taking the time to help me. If it isn't too much trouble, can you show me how you would do this with a for loop (but the right way) - you've already helped me enough so don't feel obliged. – AdrianC Feb 22 '18 at 02:42
  • This would work (if you can imply the indentation), but is not recommended... would be slow and clunky. `df['Migration_Type'] = ""` `for i, row in df.iterrows(): if i >= '2016-02-14': row['Migration_Type'] = "After Migration" else: row['Migration_Type'] = "Before Migration"` – sacuL Feb 22 '18 at 02:46
2
df.assign(
    Migration_Type=np.where(
        df.index < '2016-02-14',
       'Before Migration',
       'After Migration'
    )
)


                 A1    A2    A3    A4    Migration_Type
Week Commencing                                        
2016-01-03       28  1375  1999  1345  Before Migration
2016-01-10       48  1552  2428  1337  Before Migration
2016-01-17       43  1895  2615  1420  Before Migration
2016-01-24       29  1950  2568  1385  Before Migration
2016-01-31       41  1912  2577  1277  Before Migration
2016-02-07       29  2176  2771  1403  Before Migration
2016-02-14       50  2229  3013  1450   After Migration
2016-02-21       60  2271  3029  1489   After Migration
2016-02-28       43  2140  3133  1594   After Migration
2016-03-06       51  2080  3140  1498   After Migration
piRSquared
  • 285,575
  • 57
  • 475
  • 624