When I try to calculate the difference between a date series and today, I get a "Timestamp subtraction must have the same timezones or no timezones" error.

Loading data

import pandas as pd

raw_data = pd.read_json('resultados_finales_completos10000.json')
print(raw_data['fecha_publicacion'][0])

2017-09-24T15:04:22.000Z

Convert the object-type column to datetime

raw_data['fecha_publicacion'] =  pd.to_datetime(raw_data['fecha_publicacion'])
print(raw_data['fecha_publicacion'][0])
print(raw_data['fecha_publicacion'][0].tzinfo, type(today.tzinfo))

2017-09-24 15:04:22+00:00

UTC <class 'datetime.timezone'>

Then I set today's value

from datetime import datetime, timezone

today = datetime.now(tz=timezone.utc)
print(today)
print(today.tzinfo, type(today.tzinfo))

2021-08-13 21:31:16.031605+00:00

UTC <class 'datetime.timezone'>

In both cases I have the same timezone settings.

Finally, I try to create a new column to store the time difference as follows, and I get the aforementioned error.

raw_data['meses_venta'] = today - raw_data['fecha_publicacion']

I tried the following posts without much success. Any clues are welcome. Thanks in advance!

Francisco Ghelfi
  • Please supply the expected [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) (MRE). We should be able to copy and paste a contiguous block of your code, execute that file, and reproduce your problem along with tracing output for the problem points. This lets us test our suggestions against your test data and desired output. Please [include a minimal data frame](https://stackoverflow.com/questions/52413246/how-to-provide-a-reproducible-copy-of-your-dataframe-with-to-clipboard) as part of your MRE. – Prune Aug 13 '21 at 21:55

1 Answer

As the error says, both sides of the subtraction must use the same timezone (or no timezone at all).

One way to do so:

from datetime import datetime, timezone
import pandas as pd
x = pd.to_datetime('2017-09-24T15:04:22.000Z')
today = datetime.now(tz=x.tz)

today - x # Timedelta('1419 days 06:59:48.134906')
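
The same idea can be applied to the whole column rather than a single timestamp. The following is only a sketch: the small DataFrame is a hypothetical stand-in for the JSON file in the question, and `meses_venta_naive` is an illustrative column name. It shows two options, reusing the column's own tz object for "today", and dropping the timezone on both sides.

```python
import pandas as pd

# Hypothetical stand-in for the question's JSON data (assumption).
raw_data = pd.DataFrame(
    {"fecha_publicacion": ["2017-09-24T15:04:22.000Z", "2020-01-15T08:30:00.000Z"]}
)
raw_data["fecha_publicacion"] = pd.to_datetime(raw_data["fecha_publicacion"], utc=True)

# Option 1: build "today" from the column's own tz object,
# so both operands carry exactly the same timezone.
col_tz = raw_data["fecha_publicacion"].dt.tz
today = pd.Timestamp.now(tz=col_tz)
raw_data["meses_venta"] = today - raw_data["fecha_publicacion"]

# Option 2: drop the timezone on both sides.
# The values stay as UTC wall-clock times, only the tz label is removed.
naive_col = raw_data["fecha_publicacion"].dt.tz_localize(None)
naive_today = pd.Timestamp.now(tz="UTC").tz_localize(None)
raw_data["meses_venta_naive"] = naive_today - naive_col

print(raw_data.dtypes)
```

Either way the result is a timedelta column. If months are needed, as the column name suggests, the timedelta still has to be converted, for example by dividing the number of days by an average month length.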
Mohammad