I have trouble printing simple text from a <h1> element:

require 'nokogiri'

doc = Nokogiri::HTML("<h1><em>Name</em>A Johnson </h1>")
puts doc.at_xpath("//h1").content

It outputs:

NameA Johnson

I want just A Johnson in the output. Is it possible to select just this text using XPath or CSS selectors?

the Tin Man
Иван Бишевац
    Selecting just the text nodes with XPath, as suggested, is best. You could also use this hack: `doc.at('h1').children.last.text` – singpolyma Sep 14 '12 at 00:41

2 Answers

How about using the XPath text() node test? Like this (untested though):

require 'nokogiri'

doc = Nokogiri::HTML("<h1><em>Name</em>A Johnson </h1>")
puts doc.at_xpath("//h1/text()").content
raina77ow
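These solutions may tell only part of the story. If the <h1> contains another element between its text nodes, each of them returns just one fragment of the text. Consider: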

doc = Nokogiri::HTML("<h1><em>Name</em>A <br>Johnson </h1>")
puts doc.at_xpath("//h1/text()").content

=> A

puts doc.at('h1').children.last.text

=> Johnson

Or my suggestion, which selects every text node directly under the h1 and concatenates them:

puts doc.search("h1/text()").text

=> A Johnson
the Tin Man
pguardiario