I am having a problem with splitting this string:

"published": "2018-08-15T08:04:57Z",

I would like to split the 2018-08-15 part from the T08... part and then remove the T08... part. This should be applied to every "published": entry in the .json file.

I'll have to do this with Python, as I also convert the XML file to JSON.

So during the conversion I would like to remove the T08... part.

I hope someone can help me and if more clarification is needed, I don't mind giving it :)

I searched the internet and looked into the .split, .pop etc. methods. I am still a rookie at Python, but I want to learn.

Here is my current code:

import xmltodict
import json

#Searching for .xml file to convert
with open('../../get_url/chocolatey.xml') as fd:
    xmlString = fd.read()

#Converting .xml file
print("XML Input (../../get_url/chocolatey.xml):")
print(xmlString)

#Removing certain Characters from strings in file
jsonString = json.dumps(xmltodict.parse(xmlString), indent=4)
jsonString = jsonString.replace("#", "")
jsonString = jsonString.replace("m:", "")
jsonString = jsonString.replace("d:", "")
#jsonString = jsonString.replace('"', '')

#Printing output in Json format
print("\nJson Output (../../get_url/chocolatey.json):")
print(jsonString)

#Applying output to .json file
with open("chocolatey.json", 'w') as fd:
    fd.write(jsonString)

Example of the JSON file

},
                "published": "2018-08-15T08:04:57Z",
                "updated": "2018-08-15T08:04:57Z",
                "author": {
                    "name": "Microsoft"
                },
CodeIt
  • Manipulate the dictionary before doing json.dumps. – Alex Hall Jul 22 '19 at 08:56
  • Using the split function: `yourstring.split("T")` will break your string into parts at the `T`. To get the first part, index the returned list with [0], which gives `yourstring.split("T")[0]` in one command. – Mayeul sgc Jul 22 '19 at 08:58
  • Basically, you are trying to manipulate a date. I feel it should be more comfortable (and maintainable) to use datetime methods for those manipulations. – Clément Berthou Jul 22 '19 at 09:05
  • By the way, you should make your title more explicit, as you may not get the best response with this. – Clément Berthou Jul 22 '19 at 09:06
  • @ClémentBerthou Thanks will do that :) –  Jul 22 '19 at 09:10
  • @Mayeulsgc Thanks will try that. –  Jul 22 '19 at 09:12
  • I somehow need to call the "published": line so it prints the string. –  Jul 22 '19 at 09:18
  • As @ClémentBerthou pointed out it is better to use datetime methods to get this done. See my answer [here](https://stackoverflow.com/a/57143746/3091398) – CodeIt Jul 22 '19 at 10:21
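
The suggestion in the comments to manipulate the dictionary before calling json.dumps could look roughly like the sketch below. The helper name `trim_published` and the sample data are hypothetical; the sample only stands in for whatever xmltodict.parse returns, and the sketch assumes each "published" value is a plain string rather than a nested element with attributes.

```python
import json


def trim_published(node, key="published"):
    """Return a copy of node where every `key` string is cut at the first 'T'."""
    if isinstance(node, dict):
        return {
            k: v.split("T")[0] if k == key and isinstance(v, str)
            else trim_published(v, key)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [trim_published(item, key) for item in node]
    # Strings, numbers and None are returned unchanged
    return node


# Hypothetical stand-in for the result of xmltodict.parse(xmlString)
parsed = {
    "feed": {
        "entry": [
            {"published": "2018-08-15T08:04:57Z", "updated": "2018-08-15T08:04:57Z"},
            {"published": "2019-01-08T09:59:12.8293611Z", "author": {"name": "Microsoft"}},
        ]
    }
}

cleaned = trim_published(parsed)
json_string = json.dumps(cleaned, indent=4)
print(json_string)
```

Because the walk is recursive, no individual "published" entry has to be named; only keys that match exactly are touched, so "updated" keeps its full timestamp.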

3 Answers

2

You can try it like this:

timestamp = "2018-08-15T08:04:57Z"
timestamp = timestamp.split("T")[0]

Output:

2018-08-15
Faizan Naseer
  • Every '"published":' date is different, so it needs to be variable. –  Jul 22 '19 at 09:10
  • You can apply this to the string itself, like: `"published":"2018-08-15T08:04:57Z".split("T")[0]`. Defining a variable is not necessary; it was just to show how to do it. – Faizan Naseer Jul 22 '19 at 10:45
1

You can use the dateutil.parser for this.

from dateutil import parser
d = "2018-08-15T08:04:57Z"
dt = parser.parse(d)  # parses most common date string formats and returns a datetime object
print(dt, type(dt))
# outputs 2018-08-15 08:04:57+00:00 <class 'datetime.datetime'>

You can then use strftime to get the date only, or the date and time, in any format.

print(dt.strftime('%Y-%m-%d')) # You can specify any format you need
# outputs 2018-08-15

Read more about how to get date string from datetime object in any format here.

Example code:

import json
from dateutil import parser

jsonDict = {"published": "2018-08-15T08:04:57Z", "updated": "2018-08-15T08:04:57Z", "author": { "name": "Microsoft"},}

# converting a dictionary object to json String
jsonString = json.dumps(jsonDict)

# converting a json string back to a Python dict
jsonObj = json.loads(jsonString)

# replacing the "published" value with the date only
jsonObj["published"] = parser.parse(jsonObj["published"]).strftime('%Y-%m-%d')

# printing the result
print(jsonObj["published"])
# outputs 2018-08-15

# converting back to json string to print
jsonString = json.dumps(jsonObj)

# printing the json string
print(jsonString)

# outputs
'''
{"published": "2018-08-15", "updated": "2018-08-15T08:04:57Z", "author": {"name": "Microsoft"}}
'''

You can test the code here

CodeIt
  • I need to replace this: "2018-08-15T08:04:57Z" with only this: 2018-08-15, and that for every "published" value, because the dates vary and so does the part after them. In my example we have T08, but I also have T14, T09, T21 etc. So what I think I need is to remove everything from the 'T' character onwards in the '"published":' string. –  Jul 22 '19 at 10:29
  • @Appollonius333 It uses the parser to parse the date string. This converts the date string into a `datetime` object, which can then be converted into the desired date string. – CodeIt Jul 22 '19 at 10:35
  • @Appollonius333 Can you please post the XML string ? I will show you a demo. – CodeIt Jul 22 '19 at 10:41
  • @Codelt This is the XML string 2019-01-08T09:59:13Z –  Jul 22 '19 at 10:47
  • @Appollonius333 That's understandable. I have updated my answer with sample code. – CodeIt Jul 22 '19 at 10:50
  • @Codelt Your example helps a lot, but now I need to do this for every published: string with different data in it, like this: Published m:type="Edm.DateTime">2019-01-08T09:59:12.8293611Z Published m:type="Edm.DateTime">2019-01-08T09:59:13.5017015Z –  Jul 22 '19 at 10:56
  • @Appollonius333 It will work for any valid date time strings. Check this example [here](https://repl.it/repls/WorseThoughtfulAutocad) – CodeIt Jul 22 '19 at 10:57
  • @Codelt Yes, but then I have to define every single 'published' in the file, and this file has a lot of entries with "published": in them. So what I need is something that replaces all "published": data with only the dates. –  Jul 22 '19 at 11:03
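
One possible way to cover every "published" entry without naming each one is the `object_hook` parameter of json.loads, which is called for every JSON object while the string is decoded. This is only a sketch: the function name `strip_time` and the sample JSON are invented for illustration, and it uses a plain split instead of dateutil so it runs with the standard library alone.

```python
import json


def strip_time(obj):
    """Hook called by json.loads for each decoded JSON object (a dict)."""
    value = obj.get("published")
    if isinstance(value, str):
        # Keep only the part before the first "T", e.g. "2019-01-08"
        obj["published"] = value.split("T")[0]
    return obj


# Hypothetical sample with nested entries and differing timestamps
json_string = """
{
    "feed": {
        "entry": [
            {"published": "2019-01-08T09:59:12.8293611Z", "title": "a"},
            {"published": "2019-01-08T21:59:13.5017015Z", "title": "b"}
        ],
        "updated": "2018-08-15T08:04:57Z"
    }
}
"""

data = json.loads(json_string, object_hook=strip_time)
result = json.dumps(data, indent=4)
print(result)
```

If validation matters, the split inside the hook could be swapped for `parser.parse(value).strftime('%Y-%m-%d')` from the answer above.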
0

Something like this, using a custom JSONEncoder:

import json


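class PublishedEncoder(json.JSONEncoder):
    def encode(self, o):
        # Only the top-level dict is inspected; nested dicts are left untouched
        if isinstance(o, dict) and 'published' in o:
            # Work on a copy so the caller's dict is not modified
            o = dict(o, published=o['published'].split('T')[0])
        return super().encode(o)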


data = {1: 'X', 'published': '2018-08-15T08:04:57Z'}

json_str = json.dumps(data, cls=PublishedEncoder)
print(json_str)

Output:

{"1": "X", "published": "2018-08-15"}
balderman