
I have a DataFrame in this format:

var1  date
A     2017/01/01
A     2017/01/02
...

I want to convert the date to YYYY-MM format, but df['date'].dtype is object.
How can I remove the day part from the date while keeping the data type as datetime?

Expected Output -

A - 2017/01

Thanks

koPytok
  • You need to parse the current dates first; there's a parse_dates argument when you read a CSV file. Then you can convert to whichever output you want. Alternatively, keep the values as strings and just slice them with a lambda function: df.date.apply(lambda x: x[0:4] + "/" + x[5:7]) – alex314159 Jun 19 '18 at 08:34
  • `How can I remove the day part from date while keeping the data type as datetime?`. This is not possible. As in real life, each date has a day. Choose what you want: a string (with whatever components you like) or datetime (with all components, even if they aren't all *displayed*). – jpp Jun 19 '18 at 09:03
  • @jpp, we can use a `period` dtype as a compromise between `datetime` and `object` dtypes... – MaxU - stand with Ukraine Jun 19 '18 at 09:10
  • @MaxU, Fair point. Thanks for reopening with a valid compromise :) – jpp Jun 19 '18 at 09:11

2 Answers


You can't have a custom representation for the datetime dtype, but you have the following options:

  1. use strings: you can have any representation you wish, but all datetime methods and attributes are lost
  2. use datetime, but set the day part to 1 (as @koPytok has already shown)
  3. use the period dtype, which still allows you to use some date arithmetic

Demo:

In [207]: df
Out[207]:
  var1       date
0    A 2018-12-31
1    A 2017-09-07
2    B 2016-02-29

In [208]: df['new'] = df['date'].dt.to_period('M')

In [209]: df
Out[209]:
  var1       date     new
0    A 2018-12-31 2018-12
1    A 2017-09-07 2017-09
2    B 2016-02-29 2016-02

In [210]: df.dtypes
Out[210]:
var1            object
date    datetime64[ns]
new             object
dtype: object

In [211]: df['new'] + 8
Out[211]:
0   2019-08
1   2018-05
2   2016-10
Name: new, dtype: object
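
The session above can be reproduced as a standalone script. This is a minimal sketch, assuming the dates start out as strings in the question's YYYY/MM/DD format; the `as_text` column and the `shifted` variable are names introduced here for illustration. Note that the `object` dtype shown for `new` reflects the pandas version used at the time; recent pandas releases should report `period[M]` instead.

```python
import pandas as pd

# Sample data in the question's string format (same dates as the demo above).
df = pd.DataFrame({
    "var1": ["A", "A", "B"],
    "date": ["2018/12/31", "2017/09/07", "2016/02/29"],
})

# Parse the strings first; until then the column holds plain Python strings.
df["date"] = pd.to_datetime(df["date"], format="%Y/%m/%d")

# Option 1: formatted strings. Any layout is possible, but date methods are lost.
df["as_text"] = df["date"].dt.strftime("%Y-%m")

# Option 3: monthly periods. The day is dropped and date arithmetic still works.
df["new"] = df["date"].dt.to_period("M")

print(df.dtypes)

# Adding an integer shifts each period by that many months.
shifted = df["new"] + 8
print(shifted)
```

If a real timestamp is needed again later, a period column can be converted back, at the cost of having to pick a day within the month.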
MaxU - stand with Ukraine

It is possible to replace every date with the first day of its month:

pd.to_datetime(df["date"], format="%Y/%m/%d").apply(lambda x: x.replace(day=1))

Result:

0 2017-01-01
1 2017-01-01
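
For larger frames, the same result can probably be obtained without a Python-level `apply`. The following is a rough sketch, not a benchmark: it round-trips through a monthly period and back to a timestamp, which should land on the first day of each month. The third sample date and the names `by_apply` and `by_period` are assumptions added for illustration.

```python
import pandas as pd

# Two dates from the question plus one illustrative extra value.
dates = pd.Series(["2017/01/01", "2017/01/02", "2017/02/15"])
parsed = pd.to_datetime(dates, format="%Y/%m/%d")

# Element-wise version, as in the answer above.
by_apply = parsed.apply(lambda x: x.replace(day=1))

# Vectorised version: truncate to a monthly period, then convert back
# to a timestamp, which defaults to the start of the period.
by_period = parsed.dt.to_period("M").dt.to_timestamp()

print(by_period)
```

Both columns stay datetime, so `.dt` accessors keep working; the day component is still there, it is just always 1.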
koPytok