I would like to extract the values of some form tags. The values are not known to me ahead of time; I only see them at runtime.

I have found several threads that come close, but they all focus on HTML parsing and scraping.

I already have the HTML source and the names of the form fields that I need the value for.

Example:

<input type="hidden" name="currentRackU" id="currentRackU" value="11">

I can use a regex to match up to 'id="currentRackU" value=', but I still need to capture the characters that follow, up to the closing quote.

billt

3 Answers

How about this one-liner with nokogiri?

require 'nokogiri'
s = '<input type="hidden" name="currentRackU" id="currentRackU" value="11">'
# Read the 'value' attribute, which is what the question asks for
Nokogiri::XML.parse(s).root.attributes['value'].value # => "11"

You might need to run gem install nokogiri if you don't have nokogiri installed.

Lin Jen-Shin

When it comes to extracting data from HTML/XML documents, I usually use the gem nokogiri - it does the job well and in an elegant manner.

ksol

While it's true that HTML/XML shouldn't necessarily be parsed with a regular expression, here's something that may help you. It scans the tag and returns a hash of the attributes and their values:

html = '<input type="hidden" name="currentRackU" id="currentRackU" value="11">'
Hash[html.scan(/(\w+)="(.*?)"/)]
#=> {"type"=>"hidden", "name"=>"currentRackU", "id"=>"currentRackU", "value"=>"11"}
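If you only want the value of one named field out of a larger page, the same idea could be narrowed down roughly like this. Treat it as a sketch: `field_value` is a hypothetical helper, and it assumes double-quoted attributes with no `>` inside attribute values.

```ruby
# Hypothetical helper: find the <input> tag whose name matches,
# then pull out its value attribute. Returns nil if nothing matches.
def field_value(html, field_name)
  # Grab each whole <input ...> tag first so attribute order does not matter.
  tag = html.scan(/<input\b[^>]*>/i).find do |t|
    t[/\bname="([^"]*)"/, 1] == field_name
  end
  tag && tag[/\bvalue="([^"]*)"/, 1]
end

html = '<input type="hidden" name="currentRackU" id="currentRackU" value="11">'
puts field_value(html, 'currentRackU') # prints 11
```

Matching the tag first and the attributes second avoids depending on `value` coming right after `id`, which is where the regex in the question would break.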
Michael Kohl