0

Dataset

                  starttime   User Type
0         7/1/2015 00:00:03  Subscriber
1         7/1/2015 00:00:06  Subscriber
2         7/1/2015 00:00:17  Subscriber
3         7/1/2015 00:00:23  Subscriber
4         7/1/2015 00:00:44  Subscriber
5         7/1/2015 00:01:00  Subscriber
6         7/1/2015 00:01:03  Subscriber
7         7/1/2015 00:01:06  Subscriber
8         7/1/2015 00:01:25    Customer
9         7/1/2015 00:01:41  Subscriber
10        7/1/2015 00:01:50    Customer
11        7/1/2015 00:01:58  Subscriber
12        7/1/2015 00:02:06  Subscriber
13        7/1/2015 00:02:07  Subscriber
14        7/1/2015 00:02:26  Subscriber
15        7/1/2015 00:02:26  Subscriber
16        7/1/2015 00:02:35  Subscriber
17        7/1/2015 00:02:43    Customer
18        7/1/2015 00:02:47    Customer
19        7/1/2015 00:02:47  Subscriber
20        7/1/2015 00:03:05  Subscriber
21        7/1/2015 00:03:16    Customer
22        7/1/2015 00:03:27  Subscriber
23        7/1/2015 00:03:34  Subscriber
24        7/1/2015 00:03:48  Subscriber
25        7/1/2015 00:03:56  Subscriber
26        7/1/2015 00:03:57  Subscriber
27        7/1/2015 00:03:58    Customer
28        7/1/2015 00:04:03  Subscriber
29        7/1/2015 00:04:17  Subscriber
...                     ...         ...
1085646  7/31/2015 23:57:25  Subscriber
1085647  7/31/2015 23:57:29    Customer
1085648  7/31/2015 23:57:32  Subscriber
1085649  7/31/2015 23:57:33  Subscriber
1085650  7/31/2015 23:57:44  Subscriber
1085651  7/31/2015 23:57:54  Subscriber
1085652  7/31/2015 23:58:03  Subscriber
1085653  7/31/2015 23:58:08  Subscriber
1085654  7/31/2015 23:58:12    Customer
1085655  7/31/2015 23:58:15  Subscriber
1085656  7/31/2015 23:58:18    Customer
1085657  7/31/2015 23:58:24  Subscriber
1085658  7/31/2015 23:58:27  Subscriber
1085659  7/31/2015 23:58:42  Subscriber
1085660  7/31/2015 23:58:43  Subscriber
1085661  7/31/2015 23:58:51    Customer
1085662  7/31/2015 23:58:53  Subscriber
1085663  7/31/2015 23:58:58  Subscriber
1085664  7/31/2015 23:59:04  Subscriber
1085665  7/31/2015 23:59:10  Subscriber
1085666  7/31/2015 23:59:24  Subscriber
1085667  7/31/2015 23:59:23    Customer
1085668  7/31/2015 23:59:24  Subscriber
1085669  7/31/2015 23:59:24  Subscriber
1085670  7/31/2015 23:59:38  Subscriber
1085671  7/31/2015 23:59:40  Subscriber
1085672  7/31/2015 23:59:41  Subscriber
1085673  7/31/2015 23:59:42    Customer
1085674  7/31/2015 23:59:56  Subscriber
1085675  7/31/2015 23:59:59  Subscriber

Question

Create a pandas DataFrame with the number of rides by Hour and User Type for Workdays and Weekends. Use starttime to determine each ride's hour.

Output should be like

    User 
    Type
    Hour    Customer    Subscriber

Weekday     0   124     2194
            1   120     1238
            2   53      716
            3   30      520
....    ....    ....    ....
Weekend     0   152     1879
            1   82      1222
            2   45      718
            3   34      431
            4   29      288
....    ....    ....    ....

Graphs should be like this

enter image description here enter image description here

My Code

def a11(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True)

    hours_cats = ['12 AM', '01 AM', '02 AM', '03 AM', '04 AM', '05 AM', '06 AM', '07 AM', '08 AM', '09 AM', '10 AM', '11 AM', '12 PM', '01 PM', '02 PM', '03 PM', '04 PM', '05 PM', '06 PM', '07 PM', '08 PM', '09 PM', '10 PM', '11 PM']
    dates = pd.Categorical(rides.starttime.dt.strftime('%I %p'), categories=hours_cats, ordered=True)
    df = pd.crosstab(dates, rides['User Type'])

I have no idea How can I separate data-frame in weekdays and weekends as mention in Question.

RaTh0D
  • 323
  • 3
  • 19
  • To start: figure out how to get the day of the week for each ride, based on the start time for the ride. This might help get you pointed in the right direction: https://stackoverflow.com/questions/30222533/create-a-day-of-week-column-in-a-pandas-dataframe-using-python – payne Sep 09 '17 at 10:58
  • Use [pandas.DatetimeIndex.weekday](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DatetimeIndex.weekday.html) which returns numbers for each day from 0=Monday to 6= Sunday. Add this data to your df by using [merge, join, concat].(https://pandas.pydata.org/pandas-docs/stable/merging.html) Third use [pandas.pivot_table](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html) to sqeeze the data in the desired form – 2Obe Sep 09 '17 at 12:56

1 Answers1

1

For desired DataFrame you need create new column:

def a11(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True)
    rides['Type'] = np.where(rides['starttime'].dt.dayofweek < 5, 'Weekday', 'weekend')
    return pd.crosstab([rides['Type'], rides['starttime'].dt.dayofweek], rides['User Type'])

print (a11(rides))

User Type          Customer  Subscriber
Type    starttime                      
Weekday 2                 6          24
        4                 6          24

For graphs output is a bit different - changed dates and for select by first level of MultiIndex was used DataFrame.xs:

print (rides)
             starttime   User Type
0   7/01/2015 10:00:03    Customer
1   7/02/2015 02:00:06  Subscriber
2   7/02/2015 02:00:17    Customer
3   7/02/2015 04:00:23  Subscriber
4   7/03/2015 05:00:44    Customer
5   7/11/2015 07:01:00  Subscriber
6   7/12/2015 07:01:03    Customer
7   7/12/2015 08:01:06  Subscriber
8   7/12/2015 00:01:25    Customer
9   7/12/2015 09:01:41  Subscriber
10  7/12/2015 10:01:50    Customer

def a11(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True)
    types = np.where(rides['starttime'].dt.dayofweek < 5, 'Weekday', 'Weekend')
    hours_cats = ['12 AM', '01 AM', '02 AM', '03 AM', '04 AM', '05 AM', '06 AM', '07 AM', '08 AM', '09 AM', '10 AM', '11 AM', '12 PM', '01 PM', '02 PM', '03 PM', '04 PM', '05 PM', '06 PM', '07 PM', '08 PM', '09 PM', '10 PM', '11 PM']
    dates = pd.Categorical(rides.starttime.dt.strftime('%I %p'), categories=hours_cats, ordered=True)
    return pd.crosstab([types, dates], rides['User Type']).assign(All=lambda x: x.sum(axis=1))

a11(rides).xs('Weekday').plot()
a11(rides).xs('Weekend').plot()

g1

g2

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252