How to extract data in XML format?

Question

I am calling a service that returns data in XML format, and I want to extract the server address from the response. How can I do that? This is what I have tried so far:

from xml.dom import minidom
import requests


url="http://172.10.3.2:51106/GetConnectionStrings.asmx"

#headers = {'content-type': 'application/soap+xml'}
headers = {'content-type': 'text/xml'}
body = """<?xml version='1.0' encoding='utf-8'?>
                            <soap:Envelope xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>
                              <soap:Body>
                                <DatabaseConnectionString xmlns='http://tempuri.org/'>
                                  <DatabaseName>ELMA</DatabaseName>
                                  <ApplicationName>MonitoringSystem</ApplicationName>
                                </DatabaseConnectionString>
                              </soap:Body>
                            </soap:Envelope>"""

response = requests.post(url,data=body,headers=headers)
#print response.content
doc = minidom.parseString(response.content)

# doc.getElementsByTagName returns NodeList
name = doc.getElementsByTagName("DatabaseConnectionStringResult")[0]
print(name.firstChild.data)

This is what I get when I call the service:

Data Source=172.10.3.3;Initial Catalog=Elma;User ID=User11021969;Password=ILoveMyMOM;MultipleActiveResultSets=True;Min Pool Size=5;Max Pool Size=5000;Connect Timeout=180;Application Name=MonitoringSystem

I want to extract the Data Source 172.10.3.3 and save it as a string.

[Duplicate](http://stackoverflow.com/questions/1912434/how-do-i-parse-xml-in-python) — Mayur Koshti, Dec 23 '16 at 10:05

score 0 · Answer 1 · answered Dec 23 '16 at 16:14

I am not sure what you mean by "the Data Source 172.10.3.3", because that IP address does not appear anywhere in the body of your request.

To search for and extract information from a body of text, you can use a regular expression. Given your body:

body = """<?xml version='1.0' encoding='utf-8'?>
                        <soap:Envelope xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>
                          <soap:Body>
                            <DatabaseConnectionString xmlns='http://tempuri.org/'>
                              <DatabaseName>ELMA</DatabaseName>
                              <ApplicationName>MonitoringSystem</ApplicationName>
                            </DatabaseConnectionString>
                          </soap:Body>
                        </soap:Envelope>"""

If you want to extract the URL of the xsi namespace, use the following code:

import re
url = re.findall("xsi='(.*?)'", body)[0]

If you wanted the database name:

import re
databaseName = re.findall("<DatabaseName>(.*?)</DatabaseName>", body)[0]

The key here is that the text outside of (.*?) is the literal string on the left and right sides of what you want (for example, your XML tags), and the (.*?) itself means "extract this information for me."
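The same left/right idea should also work on the connection string that your minidom code prints. Here is a minimal sketch, assuming the string looks like the one in your question (I shortened it and left out the credentials, and the variable names are my own):

```python
import re

# Shortened copy of the connection string printed in the question (an assumption
# about what name.firstChild.data holds; the credentials are left out on purpose).
conn = "Data Source=172.10.3.3;Initial Catalog=Elma;MultipleActiveResultSets=True"

# "Data Source=" is the text on the left, ";" is the text on the right,
# and (.*?) captures whatever sits between them.
data_source = re.findall("Data Source=(.*?);", conn)[0]
print(data_source)  # 172.10.3.3
```

In your script you would pass name.firstChild.data in place of conn.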

As long as you know which XML tags you are looking for, you can extract anything that this service gives you. The function re.findall returns a list of everything that matches your pattern. The code above assumes that only one thing will match, so it takes only the first element of the list.
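One caveat: if nothing matches, re.findall returns an empty list and the [0] raises an IndexError. A small wrapper can guard against that. This is only a sketch; first_match is a hypothetical helper name, not something from the re module:

```python
import re

def first_match(pattern, text, default=None):
    """Return the first captured group of pattern in text, or default if nothing matches."""
    # Hypothetical helper for illustration; assumes the pattern has exactly one group,
    # so re.findall returns a list of plain strings.
    matches = re.findall(pattern, text)
    return matches[0] if matches else default

# The text is found, so the captured value comes back.
print(first_match("<DatabaseName>(.*?)</DatabaseName>", "<DatabaseName>ELMA</DatabaseName>"))

# The tag is absent, so the default (None) comes back instead of an exception.
print(first_match("<Missing>(.*?)</Missing>", "<DatabaseName>ELMA</DatabaseName>"))
```

This way you can check for None before using the result, instead of wrapping every lookup in a try/except.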
