I have a DataFrame df:

                                 ID    month
0  0001ee12f919a1b570658024bb59d118  2014-02
1  0001ee12f919a1b570658024bb59d118  2014-03
2  0001ee12f919a1b570658024bb59d118  2014-04
3  0001ee12f919a1b570658024bb59d118  2014-05
4  0001ee12f919a1b570658024bb59d118  2014-06
5  0001ee12f919a1b570658024bb59d118  2014-07

and I am trying to convert the month column (year-month strings) to datetime. I use df['month'] = pd.to_datetime(df.month), but it raises ValueError: Unknown string format. How can I fix that?

1 Answer

You need to pass the format string '%Y-%m', as to_datetime can't infer the format from your strings:

In [42]:
df['date'] = pd.to_datetime(df['month'], format='%Y-%m')
df

Out[42]:
                                 ID    month       date
0  0001ee12f919a1b570658024bb59d118  2014-02 2014-02-01
1  0001ee12f919a1b570658024bb59d118  2014-03 2014-03-01
2  0001ee12f919a1b570658024bb59d118  2014-04 2014-04-01
3  0001ee12f919a1b570658024bb59d118  2014-05 2014-05-01
4  0001ee12f919a1b570658024bb59d118  2014-06 2014-06-01
5  0001ee12f919a1b570658024bb59d118  2014-07 2014-07-01

In [43]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 6 entries, 0 to 5
Data columns (total 3 columns):
ID       6 non-null object
month    6 non-null object
date     6 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(2)
memory usage: 192.0+ bytes
EdChum
  • I have a problem with this string: `ValueError: time data '\\N' does match format specified` –  Aug 18 '16 at 11:10
  • Well then you have duff data. You can force the conversion with `df['date'] = pd.to_datetime(df['month'], format='%Y-%m', errors='coerce')`, which turns the errant values into `NaT` – EdChum Aug 18 '16 at 11:11
  • How can I turn it into this form: `YYYY-MM`? –  Aug 18 '16 at 11:19
  • @jezrael, can you tell me how to convert `YYYY-MM-DD` to `YYYY-MM`? I tried `df1['date'] = [w[:6] for w in df1['date']]` but it raises `TypeError: 'Timestamp' object has no attribute '__getitem__'` –  Aug 18 '16 at 11:33
  • Why do you want this? Here you get a `datetime` dtype. You can't get that format in the displayed output without converting it to a string, and if you do that it's no different from what you started with – EdChum Aug 18 '16 at 11:59
  • Next I need to run `df1.set_index('date', inplace=True); df1 = df1.groupby('ID').resample('M').size().reset_index(name='val')`, but I need this dataframe by month only –  Aug 18 '16 at 12:02
  • I need to create the value `val` and use it per month only. I don't need every day. Earlier I used `df['month'] = pd.to_datetime(df.month); df.set_index('month', inplace=True); df = df.groupby('ID').resample('D').size().reset_index(name='val')` and it worked well. But now I have a problem –  Aug 18 '16 at 12:07
  • Your problem bears no resemblance to your question, I suggest accepting my answer and posting a new question with raw data, code and desired output – EdChum Aug 18 '16 at 12:09
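
For readers following the comments above, here is a minimal self-contained sketch of the two points raised there: coercing bad values to NaT, and getting a YYYY-MM representation back from the parsed dates. The sample frame, including the '\\N' row, and the column names month_str and period are made up for illustration and are not from the original data. Using dt.strftime or dt.to_period is one possible approach here, not something the answer itself proposed.

```python
import pandas as pd

# Hypothetical sample data: valid year-month strings plus one bad value ('\\N').
df = pd.DataFrame({
    "ID": ["0001ee12f919a1b570658024bb59d118"] * 4,
    "month": ["2014-02", "2014-03", "\\N", "2014-05"],
})

# Parse with an explicit format; values that do not match become NaT
# instead of raising ValueError.
df["date"] = pd.to_datetime(df["month"], format="%Y-%m", errors="coerce")

# A plain 'YYYY-MM' string built from the parsed dates
# (the NaT row ends up as a missing value).
df["month_str"] = df["date"].dt.strftime("%Y-%m")

# A monthly Period keeps date semantics while showing only year and month.
df["period"] = df["date"].dt.to_period("M")

print(df[["month", "date", "month_str", "period"]])
```

As the answer's author points out, the string form is no different from the original column, so the Period column is likely the more useful of the two if the goal is grouping by month.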