1

This is probably very simple but I can't seem to find a way of doing this.

I'm using the Bing Maps service to get me a City name from a lat/long.

It gives me a large amount of XML which I have downloaded as a String like this:

<Name>
High Street, Lincoln, LN5 7
</Name>
<Point>
<Latitude>
53.226592540740967
</Latitude>
<Longitude>
-0.54169893264770508
</Longitude>
</Point>
<BoundingBox>
<SouthLatitude>
53.22272982317029
</SouthLatitude>
<WestLongitude>
-0.55030130347707928
</WestLongitude>
<NorthLatitude>
53.230455258311643
</NorthLatitude>
<EastLongitude>
-0.53309656181833087
</EastLongitude>
</BoundingBox>
<EntityType>
Address
</EntityType>
<Address>
<AddressLine>
High Street
</AddressLine>
<AdminDistrict>
England
</AdminDistrict>
<AdminDistrict2>
Lincs
</AdminDistrict2>
<CountryRegion>
United Kingdom
</CountryRegion>
<FormattedAddress>
High Street, Lincoln, LN5 7
</FormattedAddress>
<Locality>
Lincoln
</Locality>
<PostalCode>
LN5 7
</PostalCode>
</Address>

Is there a simple way of just getting the city name that is in between the two locality tags?

samil90

4 Answers

3

I'm actually surprised people use regex and things like IndexOf here. Processing XML that way can give you a nasty surprise or two, for example if Bing decides to start using CDATA.
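
To make the CDATA point concrete, here is a minimal sketch. The fragment and the CDataDemo class are made up for illustration (the sample in the question contains no CDATA); it compares slicing the string by hand with letting an XML parser read it:

```csharp
using System;
using System.Xml.Linq;

static class CDataDemo
{
    // Hypothetical fragment: the same element, but with its text wrapped in CDATA.
    public const string Fragment = "<Locality><![CDATA[Lincoln]]></Locality>";

    // String slicing returns whatever raw markup sits between the two tags.
    public static string BySubstring(string xml)
    {
        int start = xml.IndexOf("<Locality>", StringComparison.Ordinal) + "<Locality>".Length;
        int end = xml.IndexOf("</Locality>", StringComparison.Ordinal);
        return xml.Substring(start, end - start).Trim();
    }

    // An XML parser unwraps the CDATA section and returns only the text.
    public static string ByParser(string xml)
    {
        return XElement.Parse(xml).Value.Trim();
    }
}
```

With this fragment, BySubstring returns the raw markup `<![CDATA[Lincoln]]>`, while ByParser returns `Lincoln`.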

.NET fortunately also has quite good support for XML, which is just as easy to use, so I'd always use that:

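// Requires: using System; using System.Xml;
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);   // xml must be the complete, single-rooted response
var nav = doc.CreateNavigator();
var iterator = nav.Select("//Locality");
while (iterator.MoveNext())
{
    // Value is the element's text content, with any CDATA section unwrapped
    Console.WriteLine("{0}", iterator.Current.Value.Trim());
}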

Note that you will probably need to declare a namespace resolver for the XML namespaces (xmlns) that Bing uses. Since I don't have that part of the XML, I can't include it in this example, but it is easy to add.
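
For reference, a namespace-aware lookup could look roughly like the sketch below. The namespace URI is an assumption about what Bing declares on its root element, so check the xmlns attribute in your real response; LocalityReader, ReadLocality and the "b" prefix are names chosen for this example only:

```csharp
using System.Xml;

static class LocalityReader
{
    // Assumption: the default namespace declared on the root of the Bing response.
    // Replace it with the xmlns value from your actual XML if it differs.
    const string BingNamespace = "http://schemas.microsoft.com/search/local/ws/rest/v1";

    // Returns the trimmed text of the first <Locality> element,
    // or an empty string if the document has none.
    public static string ReadLocality(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath has no notion of a default namespace, so map a prefix
        // of our own choosing ("b") to the document's namespace URI.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("b", BingNamespace);

        var node = doc.SelectSingleNode("//b:Locality", ns);
        return node != null ? node.InnerText.Trim() : string.Empty;
    }
}
```

Without the prefix, an XPath such as //Locality only matches elements that are in no namespace, which is why the plain query can come back empty on the full response.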

atlaste
0

You can do this by defining a constant string to use as the regular expression pattern. Try this:

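// Requires: using System.Text.RegularExpressions;
const string HTML_TAG_PATTERN = "<.*?>";

// Removes every tag from the input and keeps only the text between tags.
static string StripHTML(string inputString)
{
    return Regex.Replace(inputString, HTML_TAG_PATTERN, string.Empty);
}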

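Call it wherever you need the city name:

// localityFragment (hypothetical name): only the "<Locality>...</Locality>" part, not the whole response
string cityname = StripHTML(localityFragment).Trim();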
nrsharma
  • Hi nrsharma, thanks for replying. I'm not familiar with RegEx whatsoever, that pattern didn't work and just returned the whole String back – samil90 Feb 09 '13 at 13:36
  • You have to loop through with the xml node and then pass values to function StripHTML(the value) one by one. It will give you exact value. – nrsharma Feb 11 '13 at 03:53
0

A simple way to parse that kind of string is to use the string.IndexOf method:

// I have saved your xml in this file to test
string xmlResult = File.ReadAllText(@"D:\temp\locality.txt");

int startPos = xmlResult.IndexOf("<Locality>");
int endPos = xmlResult.IndexOf("</Locality>");

if(endPos != -1 && startPos != -1)
{
    string result = xmlResult.Substring(startPos + 10, endPos-startPos-10).Trim();
    Console.WriteLine(result);
}

Search for the term <Locality>, then search for the term </Locality>. If both terms are found in your string, use the Substring method to extract the part between them (10 is the length of the <Locality> term).

A side note: although your example is very simple, it is bad practice to use regular expressions to parse XML or HTML files. While not strictly related to your question, this famous answer (one of the most upvoted on SO) explains why it is not a good idea to use regex to parse non-regular languages.

If you have one problem, after Regex you will have two problems.

Steve
0

I also recommend that you use proper XML parsing for this. However, note that the XML you gave isn't well-formed for use as an XML document because it has multiple root nodes. That's easily fixed, though.

If you use XML parsing, you'll easily be able to get at all the other data too, without any fiddly parsing.

This is so easy to do, and so much more robust than rolling your own XML parsing code, that you really should use it if you can.

Here's a one-line example, which assumes your XML is in a string variable called xml:

string locality = XElement.Load(new StringReader("<Root>" + xml + "</Root>")).XPathSelectElement("Address/Locality").Value.Trim();

And here's a proper example:

using System;
using System.IO;
using System.Xml.Linq;
using System.Xml.XPath;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Fix original XML, which has multiple root nodes!
            // We fix it just by enclosing it in a root level element called "Root":

            string xml = "<Root>" + originalXml() + "</Root>";  

            // Read the XML as an XML element.

            var xElement = XElement.Load(new StringReader(xml));

            // Easily access 'Locality' or any other node by name:

            string locality = xElement.XPathSelectElement("Address/Locality").Value.Trim();
            Console.WriteLine("Locality = " + locality);
        }

        // Note: This XML isn't well-formed, because it has multiple root nodes.

        private static string originalXml()
        {
            return
@"<Name>
High Street, Lincoln, LN5 7
</Name>
<Point>
<Latitude>
53.226592540740967
</Latitude>
<Longitude>
-0.54169893264770508
</Longitude>
</Point>
<BoundingBox>
<SouthLatitude>
53.22272982317029
</SouthLatitude>
<WestLongitude>
-0.55030130347707928
</WestLongitude>
<NorthLatitude>
53.230455258311643
</NorthLatitude>
<EastLongitude>
-0.53309656181833087
</EastLongitude>
</BoundingBox>
<EntityType>
Address
</EntityType>
<Address>
<AddressLine>
High Street
</AddressLine>
<AdminDistrict>
England
</AdminDistrict>
<AdminDistrict2>
Lincs
</AdminDistrict2>
<CountryRegion>
United Kingdom
</CountryRegion>
<FormattedAddress>
High Street, Lincoln, LN5 7
</FormattedAddress>
<Locality>
Lincoln
</Locality>
<PostalCode>
LN5 7
</PostalCode>
</Address>";
        }
    }
}
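
As an illustration of reaching the other data with the same approach, the sketch below reads the coordinates from the wrapped fragment. PointReader and ReadPoint are hypothetical names, and the code assumes a <Point> element with <Latitude> and <Longitude> children is present, as in the sample from the question:

```csharp
using System.Globalization;
using System.Xml.Linq;

static class PointReader
{
    // Wraps the rootless fragment in a <Root> element, then reads <Point>.
    // Assumes <Point>, <Latitude> and <Longitude> all exist in the fragment.
    public static (double Latitude, double Longitude) ReadPoint(string fragment)
    {
        var root = XElement.Parse("<Root>" + fragment + "</Root>");
        var point = root.Element("Point");

        // Parse with the invariant culture so "." is always the decimal separator,
        // whatever the regional settings of the machine are.
        double latitude = double.Parse(point.Element("Latitude").Value.Trim(), CultureInfo.InvariantCulture);
        double longitude = double.Parse(point.Element("Longitude").Value.Trim(), CultureInfo.InvariantCulture);
        return (latitude, longitude);
    }
}
```

It could be called as `var (lat, lon) = PointReader.ReadPoint(originalXml());` from the program above.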
Matthew Watson