
My project needs to read the contents of an XML file, and I have that working.

Just out of curiosity, I also tried keeping the XML content in a string and reading only the values inside the element tags. That works too. My code is below.

string xml = @"<Login-Form>
                 <User-Authentication>
                     <username>Vikneshwar</username>
                     <password>xxx</password>
                 </User-Authentication>

                 <User-Info>
                     <firstname>Vikneshwar</firstname>
                     <lastname>S</lastname>
                     <email>xxx@xxx.com</email>
                 </User-Info>
             </Login-Form>";
XDocument document = XDocument.Parse(xml);

var block = from file in document.Descendants("User-Authentication")
            select new
            {
                Username = file.Element("username").Value,
                Password = file.Element("password").Value,
            };

foreach (var file in block)
{
    Console.WriteLine(file.Username);
    Console.WriteLine(file.Password);
}

Similarly, I obtained my other set of elements (firstname, lastname, and email). Now my curiosity draws me on again: can I do the same using only string functions?

The same string as in the code above is to be used. I'm trying not to use any XML-related classes, that is, XDocument, XmlReader, etc. The same output should be produced using only string functions. I have not been able to do that. Is it possible?

Peter Mortensen
Vikneshwar
  • Yes, it is possible, but it is not a recommended or standard way of doing it. – Furqan Hameedi Jul 07 '12 at 07:12
  • Just give me some hints to do that. I'm not that familiar with string functions. – Vikneshwar Jul 07 '12 at 07:17
  • Does your XML contain special (escaped) chars, namespaces, cdata? What you're asking for is to rebuild XElement.Parse() in a more or less reduced fashion. Just forget about it. – H H Jul 07 '12 at 07:20
  • The thing I asked for is just to know if there is another way to do it. Otherwise I have already met my requirements. I was just trying to read XML using string functions. Thanks for your help. – Vikneshwar Jul 07 '12 at 07:32
  • If you want to write a parser, start with something simpler than XML. JSON is awesome for this: much simpler rules than XML. – Alexei Levenkov Jul 07 '12 at 08:14

4 Answers


Don't do it. XML is more complex than it may appear, with intricate rules surrounding nesting, character escaping, named entities, namespaces, ordering (attributes vs elements), comments, unparsed character data, and whitespace. For example, just add

<!--
    <username>evil</username>
-->

Or

<parent xmlns="this:is-not/the/data/you/expected">
    <username>evil</username>
</parent>

Or maybe the same in a CDATA section - and see how well basic string-based approaches work. Hint: you'll get a different answer to what you get via a DOM.
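To make the comment case concrete, here is a minimal sketch. The class name CommentTrapDemo, its method names, and the trimmed-down sample document are all made up for illustration; the naive method simply takes whatever follows the first "<username>" in the text, while the parsed method goes through XDocument.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class CommentTrapDemo
{
    // Hypothetical sample: a commented-out <username> precedes the real one.
    public const string Xml = @"<Login-Form>
    <!--
        <username>evil</username>
    -->
    <User-Authentication>
        <username>Vikneshwar</username>
    </User-Authentication>
</Login-Form>";

    // Naive string search: takes the text after the first "<username>",
    // without knowing whether that text sits inside a comment.
    public static string NaiveUsername(string xml)
    {
        const string open = "<username>";
        const string close = "</username>";
        int start = xml.IndexOf(open, StringComparison.Ordinal) + open.Length;
        int end = xml.IndexOf(close, start, StringComparison.Ordinal);
        return xml.Substring(start, end - start);
    }

    // Parser-based lookup: a comment is not an element, so it is skipped.
    public static string ParsedUsername(string xml)
    {
        return XDocument.Parse(xml).Descendants("username").First().Value;
    }
}
```

With this sample, NaiveUsername returns "evil" and ParsedUsername returns "Vikneshwar", which is exactly the mismatch described above.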

Using a dedicated tool designed for reading XML is the correct approach. At the minimum, use XmlReader - but frankly, a DOM (such as your existing code) is much more convenient. Alternatively, use a serializer such as XmlSerializer to populate an object model, and query that.
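As a rough sketch of the XmlReader option, the helper below collects the text of every element with a given name. The class and method names are hypothetical, and it assumes the target elements contain only text (ReadElementContentAsString throws if it meets a child element).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlReaderSketch
{
    // Collects the text content of every element named elementName.
    // Assumption: those elements hold text only, with no child elements.
    public static List<string> ReadValues(string xml, string elementName)
    {
        var values = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // Reads the text and moves past the end tag, so no extra Read() here.
                    values.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return values;
    }
}
```

Calling XmlReaderSketch.ReadValues(xml, "username") on the question's document should give a single-item list, and comments or CDATA sections are handled by the reader rather than by your own code.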

Trying to parse XML and XML-like data by hand does not end well; see "RegEx match open tags except XHTML self-contained tags".

Community
Marc Gravell

You could use methods such as IndexOf, Equals, and Substring, provided by the String class, to do this.

Using Regex is an option worth considering too.

But it's advisable to use the XmlDocument class for this purpose.

yogi
  • Actually, now that [`XDocument`](http://msdn.microsoft.com/en-us/library/system.xml.linq.xdocument.aspx) exists, I would *not* recommend using the old and troublesome `XmlDocument`. [Jon Skeet agrees](http://stackoverflow.com/a/1542101/1106367). – Adam Jul 07 '12 at 08:53

It can be done without regular expressions, like this:

string[] elementNames = new string[]{ "<username>", "<password>"};
foreach (string elementName in elementNames)
{
    int startingIndex = xml.IndexOf(elementName);
    string value = xml.Substring(startingIndex + elementName.Length,
        xml.IndexOf(elementName.Insert(1, "/")) 
        - (startingIndex + elementName.Length));
    Console.WriteLine(value);
}

With a regular expression:

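// Requires: using System.Text.RegularExpressions;
string[] elementNames2 = new string[]{ "<username>", "<password>"};
foreach (string elementName in elementNames2)
{
    // Lazy (.*?) stops at the first closing tag rather than the last one on the line.
    string value = Regex.Match(xml, String.Concat(elementName, "(.*?)",
        elementName.Insert(1, "/"))).Groups[1].Value;
    Console.WriteLine(value);
}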

Of course, the only recommended thing is to use the XML parsing classes.

Peter Mortensen
Ivan Golović
  • Ivan, I used the XML classes. I just wanted to know if there are other options available. Your post was a great help to me. Thanks a lot. – Vikneshwar Jul 07 '12 at 08:18

Build an extension method that will get the text between tags like this:

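// Requires: using System;
public static class StringExtension
{
    // Returns the text between the first occurrence of start and the next
    // occurrence of end, or null if either marker is missing.
    public static string Between(this string content, string start, string end)
    {
        int startIndex = content.IndexOf(start, StringComparison.Ordinal);
        if (startIndex < 0) return null;
        startIndex += start.Length;
        // Look for the end marker only after the start marker.
        int endIndex = content.IndexOf(end, startIndex, StringComparison.Ordinal);
        if (endIndex < 0) return null;
        return content.Substring(startIndex, endIndex - startIndex);
    }
}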
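
A helper like the one above returns only the first match. If the same tag can occur more than once, the idea can be extended with a loop; the following is only a sketch, with a hypothetical class and method name, and it assumes both markers are non-empty strings. Like every plain-string approach, it knows nothing about comments, CDATA, or namespaces.

```csharp
using System;
using System.Collections.Generic;

public static class StringScanSketch
{
    // Hypothetical helper: returns every substring found between start and end.
    // Assumption: start and end are non-empty.
    public static List<string> AllBetween(string content, string start, string end)
    {
        var results = new List<string>();
        int position = 0;
        while (true)
        {
            int startIndex = content.IndexOf(start, position, StringComparison.Ordinal);
            if (startIndex < 0) break;      // no further opening marker
            startIndex += start.Length;
            int endIndex = content.IndexOf(end, startIndex, StringComparison.Ordinal);
            if (endIndex < 0) break;        // opening marker without a closing one
            results.Add(content.Substring(startIndex, endIndex - startIndex));
            position = endIndex + end.Length; // continue after this match
        }
        return results;
    }
}
```

For example, StringScanSketch.AllBetween(xml, "<username>", "</username>") would collect each username value in document order.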
Kris Bonev
HatSoft