0

I have a series of 40 year data in the format stn;yyyymmddhh:rainfall. I want to convert the data into datetime64 format. When i convert it to datetime with the below code, i get the following format pandas._libs.tslibs.timestamps.Timestamp But, i need it to be in pandas datetime format. Basically, i want to convert for example 1981010100 which is numpy.int64 into datetime64.

import pandas as pd
df = pd.read_csv('data.txt', delimiter = ";")
df['yyyy'] = df['yyyymmddhh'].astype(str).str[:4]
df = pd.to_datetime(data.yyyy, format='%Y-%m-%d')

Stn;yyyymmddhh;rainfall 
xyz;1981010100;0.0
xyz;1981010101;0.0
xyz;1981010102;0.0
xyz;1981010103;0.0
xyz;1981010104;0.0
xyz;1981010105;0.0
xyz;1981010106;0.0
xyz;1981010107;0.0
xyz;1981010108;0.0
xyz;1981010109;0.4
xyz;1981010110;0.6
xyz;1981010111;0.1
xyz;1981010112;0.1
xyz;1981010113;0.0
xyz;1981010114;0.1
xyz;1981010115;0.6
xyz;1981010116;0.0
xyz;1981010117;0.0
xyz;1981010118;0.2
xyz;1981010119;0.0
xyz;1981010120;0.0
xyz;1981010121;0.0
xyz;1981010122;0.0
xyz;1981010123;0.0
xyz;1981010200;0.0
Dawar
  • 69
  • 8
  • Please supply the expected [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) (MRE). We should be able to copy and paste a contiguous block of your code, execute that file, and reproduce your problem along with tracing output for the problem points. This lets us test our suggestions against your test data and desired output. Please [include a minimal data frame](https://stackoverflow.com/questions/52413246/how-to-provide-a-reproducible-copy-of-your-dataframe-with-to-clipboard) as part of your MRE. – Prune Apr 29 '21 at 18:55
  • please find the sample added – Dawar Apr 29 '21 at 19:11
  • Please follow linked posting guidelines. – Prune Apr 29 '21 at 19:13
  • Hi Dawar, you can also see my solution. You don't need extra `.dt.strftime` to format the date in your desired format. The default datetime64 display format is already in your desired format. If you don't want to modify the existing column `yyyymmddhh`, you can just use another column to hold it. My solution should be more simple and achieve your requirement. – SeaBean Apr 29 '21 at 20:34

2 Answers2

1

You can use pd.to_datetime() together with format= parameter, as follows:

df['yyyymmddhh'] = pd.to_datetime(df['yyyymmddhh'], format='%Y%m%d%H')

Output:

print(df)

    Stn          yyyymmddhh  rainfall 
0   xyz 1981-01-01 00:00:00        0.0
1   xyz 1981-01-01 01:00:00        0.0
2   xyz 1981-01-01 02:00:00        0.0
3   xyz 1981-01-01 03:00:00        0.0
4   xyz 1981-01-01 04:00:00        0.0
5   xyz 1981-01-01 05:00:00        0.0
6   xyz 1981-01-01 06:00:00        0.0
7   xyz 1981-01-01 07:00:00        0.0
8   xyz 1981-01-01 08:00:00        0.0
9   xyz 1981-01-01 09:00:00        0.4
10  xyz 1981-01-01 10:00:00        0.6
11  xyz 1981-01-01 11:00:00        0.1
12  xyz 1981-01-01 12:00:00        0.1
13  xyz 1981-01-01 13:00:00        0.0
14  xyz 1981-01-01 14:00:00        0.1
15  xyz 1981-01-01 15:00:00        0.6
16  xyz 1981-01-01 16:00:00        0.0
17  xyz 1981-01-01 17:00:00        0.0
18  xyz 1981-01-01 18:00:00        0.2
19  xyz 1981-01-01 19:00:00        0.0
20  xyz 1981-01-01 20:00:00        0.0
21  xyz 1981-01-01 21:00:00        0.0
22  xyz 1981-01-01 22:00:00        0.0
23  xyz 1981-01-01 23:00:00        0.0
24  xyz 1981-01-02 00:00:00        0.0
SeaBean
  • 22,547
  • 3
  • 13
  • 25
  • I just realized that with this method, the column yyyymmddhh is in Timestamps format, but what i actually need is pandas date time format. How do i convert timestamp to datetime – Dawar May 08 '21 at 11:04
  • @Dawar Where did you see it in Timestamps format ? You can check the data type by using `df.info()` and refer to the `Dtype` column, you will see its dtype is `datetime64[ns]`. Also the function used for conversion [`pd.to_datetime()`](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html), you can refer to the doc and look at the `Returns` section that for input of a Series: the return type is: `Series of datetime64 dtype` . This function returns a Timestamp only when you pass it a scalar, i.e. a single element instead of a Series. – SeaBean May 08 '21 at 13:50
  • Thanks for the clarification. I checked it with **type(df.yyyymmddhh[0])** – Dawar May 08 '21 at 14:20
  • @Dawar No worry! The date column is still in datetime64 format. It's just the type() function will see the element type as Timestamp. You can further check using Pandas property with `df1.yyyymmddhh.dtype` and see the dtype as `dtype(' – SeaBean May 08 '21 at 14:29
0

I believe this should fit the bill for you.

import pandas as pd
df = pd.read_csv('data.txt', delimiter = ";")

df['date'] = pd.to_datetime(df['yyyymmddhh'], format='%Y%m%d%H')
df['formatted'] = pd.to_datetime(df['date'].dt.strftime('%Y-%m-%d %H:%M:%S'))

Jake Steele
  • 103
  • 8