I have a .XML file. Anh I want to find all date values in string then replace each value with new value after convert the date to integer. So how can I define a function in Python to do that? Please help me. Thank you. For Example, I have date in middle date and I want convert and replace date string to integer like that: 12345678 And all other data is not change. Here is my XML structure:
<?xml version="1.0" encoding="iso-8859-15"?>
<output>
<response>
<error>0</error>
<response>
<entries>
<entry>
<line>0</line>
<id></id>
<first>2014-03-08</first>
<last>0</last>
<md5>12e00ed477ad6</md5>
<virustotal></virustotal>
<vt_score></vt_score>
<scanner></scanner>
<virusname></virusname>
<url>eva247.com/image/</url>
<recent></recent>
<response></response>
<ip></ip>
<as></as>
<review></review>
<domain>eva247.com</domain>
<country></country>
<source></source>
<email></email>
<inetnum></inetnum>
<netname>Tam Nhin Moi</netname>
<descr></descr>
<ns1></ns1>
<ns2></ns2>
<ns3></ns3>
<ns4></ns4>
<ns5></ns5>
</entry>
<entry>
<line>1</line>
<id></id>
<first>2014-02-06</first>
<last>0</last>
<md5>de7db4cbe86d34373eb70</md5>
<virustotal></virustotal>
<vt_score></vt_score>
<scanner></scanner>
<virusname>N/A</virusname>
<url>files.downloadsmart.net</url>
<recent></recent>
<response></response>
<ip></ip>
<as>7643</as>
<review></review>
<domain>files.do</domain>
<country></country>
<source></source>
<email></email>
<inetnum></inetnum>
<netname>(VNPT)</netname>
<descr></descr>
<ns1></ns1>
<ns2></ns2>
<ns3></ns3>
<ns4></ns4>
<ns5></ns5>
</entry>
Here is my code:
fp = open("2014-03-12_16.19.xml","r")
tmp = fp.read()
for line in tmp:
fileopen = open("danh_sach_xml_new.xml","a")
match = re.search(r"(\d{4}-\d{2}-\d{2})", line)
if match:
result = match.group(1)
newline = result.replace(result,_timestamp_(result))
fileopen.write(newline)
else:
fileopen.write(line)
fileopen.close()
fp.close()
and
def _timestamp_(date):
list_date = date.split("-")
pprint.pprint(list_date)
t = (int(list_date[0]), int(list_date[1]), int(list_date[2]),0,0,0)
time_stamp = time.mktime(t)
return time_stamp