2

So i have this dataset in pandas, and i want to do a time series analysis on it. I used the pd.read_csv to import the dataset and named it 'df'. Then i checked the first 5 columns of the dataset with the df.head() function.

df.head()

        Date                ARTICLES                Detail  
0   2018-070-2 10:54:00   DUBIC MALT                CASH
1   2018-070-2 10:54:00  ANGEL COTTON-BUD           CASH
2   2018-070-2 10:54:00  RIBENA 1LITTER             CASH
3   2018-07-02 10:54:00  BOTTLE WATER 75CL          CASH
4   2018-07-02 10:54:00  LWAYS CLASSIC THICK LONG   CASH

Now i want to split the Date column into 'Day' like 2018-07-02 and Time like 10:54:00 in pandas.

I've tried using the rsplit function but it's throwing errors.

Adeola Adesina
  • 81
  • 3
  • 10
  • 1
    Do you really have a need for separate `Date` and `Time` columns? If you instead convert the column to a `datetime` dtype with `pd.to_datetime` you'll have all of that information readily accessible and you gain access to tons of built in methods that makes working with dates and times far easier than manipulating strings. See the [docs](http://pandas.pydata.org/pandas-docs/stable/timeseries.html) for how time series analysis is suggested with `pandas` – ALollz Dec 06 '18 at 14:53
  • Please give the code you try out and thew error message. – Mirko Ebert Dec 06 '18 at 14:53
  • 1
    Your first 3 records have an odd date format. And, @ALollz makes a good point, you a lot of time don't need to separate Date and Time. – Scott Boston Dec 06 '18 at 14:57
  • your data is corrupt ... – BENY Dec 06 '18 at 15:03

3 Answers3

3

First, you need to convert your 'Date' column into Datetime of Pandas:

In [1991]: df.Date = df.Date.apply(pd.to_datetime)

Then , you can split the Date column into 2 new columns like this:

In [1993]: df['Day'] = [d.date() for d in df['Date']]
      ...: df['Time'] = [d.time() for d in df['Date']]
      ...: 

In [1994]: df
Out[1994]: 
                 Date    Day         Time
0 2018-07-02 10:54:00  2018-07-02  10:54:00
1 2018-07-02 10:54:00  2018-07-02  10:54:00
2 2018-07-02 10:54:00  2018-07-02  10:54:00
3 2018-07-02 10:54:00  2018-07-02  10:54:00
4 2018-07-02 10:54:00  2018-07-02  10:54:00
Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58
2

assign and dt accessor

df.assign(Date=df.Date.dt.date, Time=df.Date.dt.time)

         Date      Time
0  2018-07-02  10:54:00
1  2018-07-02  10:54:00
2  2018-07-02  10:54:00
3  2018-07-02  10:54:00
4  2018-07-02  10:54:00
piRSquared
  • 285,575
  • 57
  • 475
  • 624
  • This worked very well. Thank you so much. But how can i adjust the code if i had some other columns in the original dataset and i want the dataframe to reflect the Date, Time, and the other columns? – Adeola Adesina Dec 06 '18 at 15:53
  • `assign` provides a copy of the dataframe with the new columns specified in the call to `assign` – piRSquared Dec 06 '18 at 15:54
0

You can try this way to create two new columns e.g Day and Time from the df['Date']

df['Day'] = [d.date() for d in df['Date']]
df['Time'] = [d.time() for d in df['Date']]

DEMO:

import pandas as pd
df = pd.DataFrame({'Date': pd.date_range('2018-07-02 10:54:00', periods=5)})
df['Day'] = [d.date() for d in df['Date']]
df['Time'] = [d.time() for d in df['Date']]
print(df)

See: https://repl.it/repls/HonoredFirstMicrokernel

A l w a y s S u n n y
  • 36,497
  • 8
  • 60
  • 103