This has been asked before in "REXML - How to extract a single element", but the answer there doesn't work for me. Apparently, the text method is no longer available.

I have an XML file:

<?xml version="1.0" encoding="UTF-8"?>
<ice_cream>
    <flavor>Vanilla</flavor>
</ice_cream>

and I can place its contents into an array using REXML:

flavors = xml_file.get_elements('//flavor')

I get an array:

puts flavors[0]

Which prints:

<flavor>Vanilla</flavor>

Instead, I want:

Vanilla

I've tried:

flavors = xml_file.get_elements('//flavor').text

But I get:

NoMethodError: undefined method `text' for #<Array:0x007fa7a3b94220>

What's the correct way to accomplish this? I'm open to using other libraries, too.

jcarpio

2 Answers

Use Nokogiri. Your code will thank you.

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<ice_cream>
    <flavor>Vanilla</flavor>
</ice_cream>
EOT

doc.search('flavor') # => [#<Nokogiri::XML::Element:0x3feb8182fc60 name="flavor" children=[#<Nokogiri::XML::Text:0x3feb8182fa44 "Vanilla">]>]
doc.search('flavor').map(&:text) # => ["Vanilla"]

search finds all nodes that match the CSS selector 'flavor' and returns them as a NodeSet.

search('flavor').map(&:text) walks the NodeSet and calls the text method on each Node, returning an array containing each node's text content as a string.

If your XML is actually something more complex:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<ice_cream>
    <flavor>Vanilla</flavor>
    <flavor>Chocolate</flavor>
    <flavor>Strawberry</flavor>
</ice_cream>
EOT

doc.search('flavor') # => [#<Nokogiri::XML::Element:0x3fcc2a577afc name="flavor" children=[#<Nokogiri::XML::Text:0x3fcc2a5778e0 "Vanilla">]>, #<Nokogiri::XML::Element:0x3fcc2a5776c4 name="flavor" children=[#<Nokogiri::XML::Text:0x3fcc2a5774bc "Chocolate">]>, #<Nokogiri::XML::Element:0x3fcc2a5772b4 name="flavor" children=[#<Nokogiri::XML::Text:0x3fcc2a572c78 "Strawberry">]>]
doc.search('flavor').map(&:text) # => ["Vanilla", "Chocolate", "Strawberry"]

Or:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<ice_creams>
  <ice_cream>
      <flavor>Vanilla</flavor>
  </ice_cream>
  <ice_cream>
      <flavor>Chocolate</flavor>
  </ice_cream>
  <ice_cream>
      <flavor>Strawberry</flavor>
  </ice_cream>
</ice_creams>
EOT
ice_cream = doc.search('ice_cream') # => [#<Nokogiri::XML::Element:0x3fe6a91f6b00 name="ice_cream" children=[#<Nokogiri::XML::Text:0x3fe6a91f68f8 "\n      ">, #<Nokogiri::XML::Element:0x3fe6a91f681c name="flavor" children=[#<Nokogiri::XML::Text:0x3fe6a91f6600 "Vanilla">]>, #<Nokogiri::XML::Text:0x3fe6a91f63f8 "\n  ">]>, #<Nokogiri::XML::Element:0x3fe6a91f1de4 name="ice_cream" children=[#<Nokogiri::XML::Text:0x3fe6a91f1bdc "\n      ">, #<Nokogiri::XML::Element:0x3fe6a91f1ac4 name="flavor" children=[#<Nokogiri::XML::Text:0x3fe6a91f1880 "Chocolate">]>, #<Nokogiri::XML::Text:0x3fe6a91f1678 "\n  ">]>, #<Nokogiri::XML::Element:0x3fe6a91f13f8 name="ice_cream" children=[#<Nokogiri::XML::Text:0x3fe6a91f1074 "\n      ">, #<Nokogiri::XML::Element:0x3fe6a91f0e80 name="flavor" children=[#<Nokogiri::XML::Text:0x3fe6a91f0a98 "Strawberry">]>, #<Nokogiri::XML::Text:0x3fe6a91f0840 "\n  ">]>]
ice_cream.search('flavor').map(&:text) # => ["Vanilla", "Chocolate", "Strawberry"]

For searching, Nokogiri supports both CSS and XPath selectors. search accepts either kind of selector, and css and xpath are its selector-specific counterparts. at returns a single Node, much like search('some_node').first, and has at_css and at_xpath as its selector-specific counterparts.

the Tin Man

Here is the code:

require 'rexml/document'

doc = <<-xml
<?xml version="1.0" encoding="UTF-8"?>
<ice_cream>
    <flavor>Vanilla</flavor>
</ice_cream>
xml

xml_doc = REXML::Document.new(doc)
xml_doc.get_elements('//flavor').class # => Array
xml_doc.get_elements('//flavor')[0].class # => REXML::Element
xml_doc.get_elements('//flavor')[0].text # => "Vanilla"

xml_doc.get_elements('//flavor') returns an Array of REXML::Element objects, and Array has no text method, which is why calling .text on the result raises NoMethodError. Index into the array, or iterate over it, and call #text on each REXML::Element to get its text.
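To illustrate the iteration, here is a minimal sketch. The three-flavor document and the variable names are made up for the example, and it assumes the rexml gem is available (it ships with standard Ruby installs):

```ruby
require 'rexml/document'

# Sample document invented for this illustration.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <ice_cream>
    <flavor>Vanilla</flavor>
    <flavor>Chocolate</flavor>
    <flavor>Strawberry</flavor>
  </ice_cream>
XML

xml_doc = REXML::Document.new(xml)

# get_elements returns an Array, so map over it and call #text on each element.
flavor_names = xml_doc.get_elements('//flavor').map(&:text)
# => ["Vanilla", "Chocolate", "Strawberry"]

# When only the first match matters, REXML::XPath.first returns a single element.
first_flavor = REXML::XPath.first(xml_doc, '//flavor').text
# => "Vanilla"
```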

Arup Rakshit