Hi, I'm looking for help replacing a string pattern in my data. I have a large list of JSON data from an API, but a date field is being returned as the text of a Python datetime.datetime(...) call instead of as plain date text. Below is an example of the started_at attribute in the raw JSON data that is causing problems:

'started_at': datetime.datetime(2022, 2, 7, 16, 37, 26)

Instead of the actual date and timestamp as text, the value is written as a Python datetime constructor call. When I run json.loads() on this data, it fails because that value is not valid JSON. My potential solution is to replace the datetime.datetime(2022, 2, 7, 16, 37, 26) text with a simple variable like current_date = str(datetime.now().date()), so it will just be 2022-02-07 instead of the whole datetime string.

I've tried raw_json.replace(datetime.datetime(*, *, *, *, *, *), current_date), but it doesn't work, and I'm not sure how to write a regex that captures ANY value in that format.

Putting in some reproducible code below if anybody wants to try it out:

import re
from datetime import datetime

current_date = str(datetime.now().date())
sample_json = str({'started_at': 'datetime.datetime(2022, 2, 7, 16, 37, 26)'})


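# Attempt that does not work: str.replace() matches literal text, so the * characters are not wildcards
formatted_json = sample_json.replace('datetime.datetime(*, *, *, *, *, *)', current_date)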

Wiktor Stribiżew
jyablonski
  • Instead of `str`, use the `strftime` method to convert the datetime object to a string. You don't need a regex. – mkrieger1 Feb 07 '22 at 19:37
  • And use `json` module to convert a Python dictionary to a proper JSON string. – mkrieger1 Feb 07 '22 at 19:39
  • Does this answer your question? [Convert datetime object to a String of date only in Python](https://stackoverflow.com/questions/10624937/convert-datetime-object-to-a-string-of-date-only-in-python) – mkrieger1 Feb 07 '22 at 19:42
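
The comments above suggest fixing this at the source: while the data is still a Python dictionary, convert the datetime with `strftime` and let the `json` module build the string. A minimal sketch of that idea, assuming the API client hands back a dict like the hypothetical `record` below; the helper name `encode_datetime` is illustrative and not from the original post.

```python
import json
from datetime import datetime

# Hypothetical record, shaped like the example in the question.
record = {'started_at': datetime(2022, 2, 7, 16, 37, 26)}

def encode_datetime(value):
    """Fallback for json.dumps: turn datetime objects into date-only strings."""
    if isinstance(value, datetime):
        return value.strftime('%Y-%m-%d')
    raise TypeError(f'Cannot serialize {type(value).__name__}')

# json.dumps calls the default hook for any object it cannot serialize itself.
json_text = json.dumps(record, default=encode_datetime)
print(json_text)  # {"started_at": "2022-02-07"}
```

This keeps the real date from the record and always produces valid JSON, but it only applies if you still have access to the dictionary before it is turned into text.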

1 Answer

After some trial and error I found a regex that works.

import re
from datetime import datetime

# JSON string values must be wrapped in double quotes
current_date = f'"{datetime.now().date()}"'
sample_json = str({'started_at': 'datetime.datetime(2022, 2, 7, 16, 37, 26)'})
# Raw string so the backslashes reach the regex engine; \. matches a literal dot
formatted_json = re.sub(r'datetime\.datetime[\(\[].*?[\)\]]', current_date, sample_json)

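The [\(\[].*?[\)\]] part matches an opening parenthesis (or bracket), everything inside it, and the first closing parenthesis (or bracket) that follows. Putting datetime.datetime in front makes that prefix part of the match as well, so the whole call is replaced with current_date.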
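
If you want to keep each record's own date rather than substituting today's date, the same regex idea can capture the arguments and rebuild the value. This is only a sketch under a few assumptions: the raw text is a Python dict literal with the datetime unquoted (as in the question's first example), the calls contain only integer arguments (no `tzinfo`), and the names `raw_text`, `DATETIME_REPR`, and `to_date_literal` are hypothetical.

```python
import ast
import json
import re
from datetime import datetime

# Assumed raw text, shaped like the unquoted example in the question.
raw_text = "{'started_at': datetime.datetime(2022, 2, 7, 16, 37, 26)}"

# Matches datetime.datetime(...) calls whose arguments are all integers.
DATETIME_REPR = re.compile(r'datetime\.datetime\(([\d,\s]+)\)')

def to_date_literal(match):
    """Rebuild the datetime from the captured integers and return a quoted date."""
    parts = [int(p) for p in match.group(1).split(',')]
    return repr(datetime(*parts).date().isoformat())

# re.sub accepts a function; it is called once per match.
python_literal = DATETIME_REPR.sub(to_date_literal, raw_text)
record = ast.literal_eval(python_literal)  # parses Python literals without eval()
json_text = json.dumps(record)
print(json_text)  # {"started_at": "2022-02-07"}
```

Going through `ast.literal_eval` and `json.dumps` also takes care of the single quotes, which `json.loads()` would otherwise reject.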

jyablonski