-2

I have an XML document in the following format:

<Employee>
  <ID>..</ID>
  <E-mail>..</E-mail>
  ...
  <custom_1>..</custom_1>
  <custom_2>..</custom_2>
  <custom_3>..</custom_3>
</Employee>

My requirement is to find all the tags in the XML whose names start with "custom_". I'm using Groovy and hence doing something like this (using XmlParser in Groovy)

Could anyone please guide me here?

Thanks, Vipin

vipinn
  • 6
    [Regex is not a good way to parse XML](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). Did you try reading the tags using XmlSlurper's `breadthFirst().children()`? – Will Dec 10 '14 at 17:08
  • I don't know Groovy, but this will work in most languages with regex capabilities. However, like @WillP says, regex isn't a great (or reliable) tool for XML parsing. Nested tags with the same name are really painful to deal with in regex. `<(custom_[\w-]+)(\s+[^>]+)*>[\s\S]*?</\1>` This captures the name of the tag as backreference 1 and uses it when finding the closing tag. I don't know whether Groovy supports this, but most regex implementations do. However, for instance `` would give you messy results. – Regular Jo Dec 10 '14 at 17:12
  • And besides all the "thou shalt not use regexp for XML": show us your code. What have you tried so far? – cfrick Dec 10 '14 at 18:01
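
For reference, the XmlSlurper route suggested in the comments might look roughly like the sketch below. It assumes Groovy 3 or later (on Groovy 2.x, drop the import, since `groovy.util.XmlSlurper` is imported by default), uses `depthFirst()` rather than `breadthFirst()` so that matches come back in document order, and fills the elided `..` values with made-up sample data.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; not needed on Groovy 2.x

// Sample document; the element values are illustrative placeholders
def xml = '''
<Employee>
  <ID>42</ID>
  <E-mail>someone@example.com</E-mail>
  <custom_1>foo</custom_1>
  <custom_2>bar</custom_2>
  <custom_3>baz</custom_3>
</Employee>
'''

def employee = new XmlSlurper().parseText(xml)

// depthFirst() walks the root and all of its descendants;
// keep only the elements whose tag name starts with "custom_"
def customTags = employee.depthFirst().findAll { it.name().startsWith('custom_') }

customTags.each { println "${it.name()} = ${it.text()}" }
```

Because the filter works on element names from a real parser, nested or repeated tags do not need the special handling a regex would.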

1 Answer

0

Try this:

import javax.xml.xpath.*
import javax.xml.parsers.DocumentBuilderFactory

def testxml = '''
                <Employee>
                  <ID>..</ID>
                  <E-mail>..</E-mail>
                  <custom_1>foo</custom_1>
                  <custom_2>bar</custom_2>
                  <custom_3>base</custom_3>
                </Employee>
  '''

// Parse the XML string into a DOM and evaluate an XPath query against its root element
def processXml( String xml, String xpathQuery ) {
  def xpath       = XPathFactory.newInstance().newXPath()
  def builder     = DocumentBuilderFactory.newInstance().newDocumentBuilder()
  def inputStream = new ByteArrayInputStream( xml.getBytes( 'UTF-8' ) )
  def records     = builder.parse( inputStream ).documentElement
  // NODESET makes evaluate() return an org.w3c.dom.NodeList
  xpath.evaluate( xpathQuery, records, XPathConstants.NODESET )
}

// Select every element whose name starts with "custom_"
def result = processXml( testxml, '//*[starts-with(name(), "custom_")]' )
result.length.times {
    println result.item(it).textContent
}
Gilles Quénot
  • all this http://stackoverflow.com/questions/27407810/iterate-over-all-xpath-results for that? – cfrick Dec 10 '14 at 18:20
  • 1
    This can be one line in Groovy, but in an attempt to [check SO rules](http://meta.stackoverflow.com/questions/260828/do-we-need-a-close-reason-for-zero-effort-questions), it'd be nice if the OP wrote something they tried – Will Dec 10 '14 at 18:21
  • @cfrick: not only for that, but to learn how to do it in Groovy. I'm totally new to Groovy. – Gilles Quénot Dec 10 '14 at 23:30
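
As a rough illustration of the shorter form mentioned in the comments, here is a sketch using Groovy's bundled XmlParser, which the question names. It is one possible way to do it, not necessarily what the commenter had in mind. It assumes Groovy 3 or later (on Groovy 2.x, drop the import), assumes the custom tags are direct children of the root, and uses made-up sample values.

```groovy
import groovy.xml.XmlParser  // Groovy 3+; not needed on Groovy 2.x

// Sample document; the element values are illustrative placeholders
def xml = '''
<Employee>
  <ID>42</ID>
  <E-mail>someone@example.com</E-mail>
  <custom_1>foo</custom_1>
  <custom_2>bar</custom_2>
  <custom_3>baz</custom_3>
</Employee>
'''

// The filtering itself fits on one line: keep the child nodes whose name starts with "custom_".
// name() can be a String or a QName, so it is normalised with toString() first.
def custom = new XmlParser().parseText(xml).children().findAll { it instanceof Node && it.name().toString().startsWith('custom_') }

// Collect the matches into a tag-name -> text map for convenience
def values = custom.collectEntries { [(it.name().toString()): it.text()] }
println values
```

If the custom tags can appear deeper in the tree, `depthFirst()` can replace `children()` in the same expression.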