1

I downloaded a dataset from the internet which contains the data related to daily transactions in the following format [dummy values mentioned below] :

ID  Date        Item1 Item2 Item3 Item4 Item5
0   2017-08-28  1234   5678 91011  1213 1415
1   2017-07-27  1234   5678 91011  1213 1415
2   2017-06-26  1234   5678 91011  1213 1415
3   2017-05-25  1234   5678 91011  1213 1415

How do I use the date column to get the values for the transactions that happened in the last week , month and the last 3 months ?

I used this to format the date in the pandas format: df['week'] = pd.to_datetime(df['week'])

And tried to get the values in the last week using this :

range_max = df['week'].max() 
range_min = range_max - dt.timedelta(days=7)
# take slice with final week of data 

sliced_df = df[(df['week'] >= range_min) & (df['week'] <= range_max)] 

This works for retrieving last week's data , but how do I retrieve values for , say, a specific month or week ?

Aditya Kaushik
  • 213
  • 1
  • 4
  • 20
  • 2
    What did you try? – Anton vBR Oct 05 '17 at 10:50
  • 1
    Possible duplicate of [Select dataframe rows between two dates](https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates) – Bart Oct 05 '17 at 10:52
  • @AntonvBR I used this to format the date in the pandas format: df['week'] = pd.to_datetime(df['week']) And tried to get the values in the last week using this : range_max = df['week'].max() range_min = range_max - dt.timedelta(days=7) # take slice with final week of data sliced_df = df[(df['week'] >= range_min) & (df['week'] <= range_max)] – Aditya Kaushik Oct 05 '17 at 10:54
  • 1
    @Abe put that in the question, and could you include what that gave you and why it wasn't what you expected? – Stael Oct 05 '17 at 10:56

1 Answers1

1

Here are some examples:

import pandas as pd
import datetime
import io

string = u"""ID,Date,Item1,Item2,Item3,Item4,Item5
0,2017-08-28,1234,5678,91011,1213,1415
1,2017-07-27,1234,5678,91011,1213,1415
2,2017-06-26,1234,5678,91011,1213,1415
3,2017-05-25,1234,5678,91011,1213,1415"""

df = pd.read_csv(io.StringIO(string))
df["Date"] = pd.to_datetime(df["Date"])

now = datetime.datetime.now().date()

mask1 = (df.Date < now) & (df.Date > now - datetime.timedelta(days=90)) # 90 days
mask2 = (df.Date.dt.month == 7) # specific month
mask3 = (df.Date.dt.week == 21) # specific week
mask4 = (df.Date.dt.month > now.month-3) # 3 months back
mask5 = (df.Date.dt.strftime("%B").isin(["May","July"])) # stfrtime and you can use this, quite clear!
mask6 = (df.Date.dt.strftime("%A").isin(["Monday","Tuesday"])) # stfrtime and you can use this, quite clear!


# Lastly
df[mask1]
#df[mask2]
#df[mask3]
...

It really boils down to what you exactly are looking for. Updated with months and weeksdays too.

Anton vBR
  • 18,287
  • 5
  • 40
  • 46