0

Data looks like this in Excel/CSV

I have a large CSv file (80 000 rows) that has one column with dates in dd.mm.yyyy format and one column with time in hh:mm:ss format and I try to plt a graph that has date and time on the x-axis.

I have tried the parse_dates function, but that one uses yyyy-mm-dd and converts some to that format but some not.

Dates go from 9th of August to 18th of August.

1       2021-09-08
2       2021-09-08
3       2021-09-08
4       2021-09-08
           ...    
83771   2021-08-18
83772   2021-08-18
83773   2021-08-18
83774   2021-08-18
83775   2021-08-18

I can use a string but from what I saw it takes way longer to plot a graph with a string as x-axis, so I try to avoid that.

Is there a way to combine date and time and save it as datetime in any consistent format and save not as an object?

Thanks in advance.

Schlumpi
  • 21
  • 2
  • 2
    Your data shows only date. You have another with time ? Or the same column some with data and some with time ? What is actual data look like ? – EBDS Sep 28 '21 at 07:50
  • 2
    https://stackoverflow.com/questions/17978092/combine-date-and-time-columns-using-python-pandas Does this answer your question ? – EBDS Sep 28 '21 at 07:51
  • My biggest problem at the moment is, that, pars_dates doesnt recognize this format. I also tries to_datetime, but it says that it doesn't include this format – Schlumpi Sep 28 '21 at 08:04
  • I edited the question to include a picture of the data – Schlumpi Sep 28 '21 at 08:06
  • Thanks for the question update. If my codes below answers your questions, kindly accept the answer by checking it. Thanks. – EBDS Sep 28 '21 at 09:22

4 Answers4

0

try use Pandas:

example: enter image description here


import pandas as pd

df = pd.read_csv('date.csv',parse_datas = ['date'])
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['monthDay'] = df['date'].dt.day
df['weekDay'] = df['date'].dt.weekday

print(df)

It can be modified as required.

HackerDev-Felix
  • 226
  • 2
  • 5
  • Thanks for awnsering, but my problem is that the dates I have are not in the format that you are using, but instead look like 09.08.2021 for 9th of August – Schlumpi Sep 28 '21 at 08:01
0

I have found a way, so if anyone else is ever looking for this, here it is

datetime= Date+"/"+Time

Datum=pd.to_datetime(datetime, format="%d.%m.%Y/%H:%M:%S")

print(Datum)

0       2021-08-09 00:00:00
1       2021-08-09 00:00:10
2       2021-08-09 00:00:20
3       2021-08-09 00:00:30
4       2021-08-09 00:00:40
                ...        
83771   2021-08-18 17:22:30
83772   2021-08-18 17:22:40
83773   2021-08-18 17:22:50
83774   2021-08-18 17:23:00
83775   2021-08-18 17:23:10
Length: 83776, dtype: datetime64[ns]´´´
Schlumpi
  • 21
  • 2
0

You can actually do it when reading the read_csv but I've done it this way.

Code:

import pandas as pd

df = pd.read_csv('DT.csv') 

df['DateTime'] = df['Date'] + ' ' + df['Time']

df['Date'] = pd.to_datetime(df['Date'])
df['Time'] = pd.to_datetime(df['Time'])
df['DateTime'] = pd.to_datetime(df['DateTime'])
df.info()
df

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9 entries, 0 to 8
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype         
---  ------    --------------  -----         
 0   Date      9 non-null      datetime64[ns]
 1   Time      9 non-null      datetime64[ns]
 2   DateTime  9 non-null      datetime64[ns]
dtypes: datetime64[ns](3)
memory usage: 344.0 bytes

Date    Time    DateTime
0   2021-09-08  2021-09-28 00:00:00 2021-09-08 00:00:00
1   2021-09-08  2021-09-28 00:00:10 2021-09-08 00:00:10
2   2021-09-08  2021-09-28 00:00:20 2021-09-08 00:00:20
3   2021-09-08  2021-09-28 00:00:30 2021-09-08 00:00:30
4   2021-09-08  2021-09-28 00:00:40 2021-09-08 00:00:40
5   2021-09-08  2021-09-28 00:00:50 2021-09-08 00:00:50
6   2021-09-08  2021-09-28 00:01:00 2021-09-08 00:01:00
7   2021-09-08  2021-09-28 00:01:10 2021-09-08 00:01:10
8   2021-09-08  2021-09-28 00:01:20 2021-09-08 00:01:20
EBDS
  • 1,244
  • 5
  • 16
0

Generally speaking you don't need pandas for simply parsing dates/times:

from datetime import datetime

dstr = "09.08.2021"
tstr = "14:45:00"
d=datetime.strptime(f"{dstr} {tstr}", "%d.%m.%Y %H:%M:%S")
d
datetime.datetime(2021, 8, 9, 14, 45)
gimix
  • 3,431
  • 2
  • 5
  • 21