The Wikidata date format looks like this:

+2018-03-26T00:00:00Z

So, given that Douglas Adams was born on +1952-03-11T00:00:00Z, I can use Python to get the timestamp like this:

from datetime import datetime
from dateutil.parser import parse
datestring = '+1952-03-11T00:00:00Z'
dt_obj = parse(datestring[1:])
print(dt_obj.timestamp())

>>> -562032000.0

As you can see, I have to strip the leading + or - sign, which indicates whether the date is AD or BC, so that information is lost.

Moreover, I cannot parse an incomplete date, where the month or day is given as 00.

For example, Genghis Khan: +1162-00-00T00:00:00Z (day and month missing)

The same goes for BC dates that are also incomplete, such as Plato: -0427-00-00T00:00:00Z
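
To make the three cases above concrete, here is a minimal sketch that splits such a string into a signed year, a month and a day using only the standard library. The helper name `split_wikidata_time` and the regular expression are assumptions made up for illustration, not part of any Wikidata client, and a field given as 00 is reported as `None`.

```python
import re

# Hypothetical helper, not part of any library: matches a sign, a year of
# any length, then two-digit month and day fields up to the 'T' separator.
WIKIDATA_TIME = re.compile(r'^([+-])(\d+)-(\d{2})-(\d{2})T')

def split_wikidata_time(value):
    match = WIKIDATA_TIME.match(value.strip())
    if match is None:
        raise ValueError('unrecognised time string: {!r}'.format(value))
    sign, year, month, day = match.groups()
    # Keep the sign instead of throwing it away
    year = int(year) * (-1 if sign == '-' else 1)
    # A month or day of 00 means "unknown", reported here as None
    month = int(month) or None
    day = int(day) or None
    return year, month, day

print(split_wikidata_time('+1952-03-11T00:00:00Z'))  # (1952, 3, 11)
print(split_wikidata_time('+1162-00-00T00:00:00Z'))  # (1162, None, None)
print(split_wikidata_time('-0427-00-00T00:00:00Z'))  # (-427, None, None)
```

This only separates the fields; turning them into a timestamp is still the open part of the question.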

pajamas
  • Related question: [BC dates in Python](https://stackoverflow.com/questions/15857797/bc-dates-in-python). – skovorodkin Mar 27 '18 at 09:40
  • You can also look at [How to create negative datetime in python 2.7](https://stackoverflow.com/questions/32161570/how-to-create-negative-datetime-in-python-2-7), where they suggest using [Astropy](http://www.astropy.org/) or [datautil](https://pypi.org/project/datautil) (not the same as [dateutil](https://pypi.org/project/python-dateutil/), confusingly enough) instead of the standard [`datetime`](https://docs.python.org/3/library/datetime.html) objects. – jdehesa Mar 27 '18 at 09:45

1 Answer

The standard datetime module cannot handle negative (BC) dates, but NumPy can. It parses both positive and negative dates, although for some reason it accepts only an unsigned year (assumed positive) or a leading minus sign, not an explicit plus sign; maybe it's worth raising an issue, as the ISO 8601 standard is supposed to support it. The missing months and days, though, are not (afaik) part of the standard; you can split by '-00' as a kinda-hacky-but-effective solution. A complete function could look like this:

import numpy as np

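def get_timestamp(date_str):
    # Probably not necessary
    date_str = date_str.strip()
    # Split off the sign; NumPy accepts '-' but not an explicit '+'
    sign = '-' if date_str[0] == '-' else ''
    date_str = date_str.lstrip('+-')
    # Remove missing month/day (done after removing the sign, so that a
    # year such as -0050 is not mistaken for a missing field)
    date_str = date_str.split('-00', maxsplit=1)[0]
    # Drop the trailing 'Z'; NumPy datetimes are timezone-naive and
    # recent versions warn when given a timezone designator
    date_str = date_str.rstrip('Z')
    # Parse date
    dt = np.datetime64(sign + date_str)
    # As Unix timestamp (choose preferred dtype)
    return dt.astype('<M8[s]').astype(np.int64)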

date1 = '+1952-03-11T00:00:00Z'
date2 = '-0427-00-00T00:00:00Z'
print('Timestamp for {}: {}'.format(date1, get_timestamp(date1)))
# Timestamp for +1952-03-11T00:00:00Z: -562032000
print('Timestamp for {}: {}'.format(date2, get_timestamp(date2)))
# Timestamp for -0427-00-00T00:00:00Z: -75641990400
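
If NumPy is not available, the same numbers can be reproduced with plain integer arithmetic. The following is only a sketch: it uses the widely known "days from civil" calculation for the proleptic Gregorian calendar, which is a swapped-in technique and not necessarily what NumPy does internally. The function names are made up for illustration, a missing month or day is assumed to fall back to 1, and years use astronomical numbering (with a year 0), which is what the NumPy results above correspond to.

```python
def days_from_civil(year, month, day):
    """Days since 1970-01-01 in the proleptic Gregorian calendar."""
    # Treat March as the first month so the leap day falls at year end
    if month <= 2:
        year -= 1
    era = year // 400                   # 400-year cycle; floors for negatives
    year_of_era = year - era * 400      # 0..399
    shifted_month = month - 3 if month > 2 else month + 9  # Mar=0 .. Feb=11
    day_of_year = (153 * shifted_month + 2) // 5 + day - 1
    day_of_era = (year_of_era * 365 + year_of_era // 4
                  - year_of_era // 100 + day_of_year)
    # 146097 days per 400 years; 719468 days from 0000-03-01 to 1970-01-01
    return era * 146097 + day_of_era - 719468

def unix_timestamp(year, month=None, day=None):
    # Assumption: a missing month or day is treated as 1
    return days_from_civil(year, month or 1, day or 1) * 86400

print(unix_timestamp(1952, 3, 11))  # -562032000
print(unix_timestamp(-427))         # -75641990400
```

Both values agree with the NumPy output above, so the two approaches can be used to cross-check each other.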
jdehesa