I'm completely new to Ruby so I was just wondering if someone could help me out.

I have the following String:

"<planKey><key>OR-J8U</key></planKey>"

What is the regex I have to write to get the center part OR-J8U?

Paymon Wang-Lotfi
  • Does the string include the `"` character? – Ely Jun 28 '17 at 17:46
  • In general, parsing XML should be done with an XML parser. Regex can be used for quick and dirty solutions, but be advised that they can be tripped up by otherwise valid XML. – Mark Thomas Jun 28 '17 at 17:54
  • [Don't parse XML with regular expressions](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)! – Sergio Tulentsev Jun 28 '17 at 18:05
  • https://blog.engineyard.com/2010/getting-started-with-nokogiri – Sagar Pandya Jun 28 '17 at 18:09
  • see this [question](https://stackoverflow.com/questions/10799136/get-text-directly-inside-a-tag-in-nokogiri) also. Not sure if this is good code but you can do `require 'nokogiri'; Nokogiri::XML(str).at_xpath('planKey/key/text()').to_s #=> "OR-J8U"` – Sagar Pandya Jun 28 '17 at 18:40
  • `What is the regex I have to write to get the center part OR-J8U?` What do you mean by the center part? Like, which center part do you mean `"asdfOR-J8Uasdfasdf"` –  Jun 28 '17 at 21:16

2 Answers

Use the following:

str = "<planKey><key>OR-J8U</key></planKey>"
str[/(?<=\<key\>).*(?=\<\/key\>)/]
 #=> "OR-J8U" 

This matches everything between the opening and closing 'key' tags, using a lookbehind for the opening tag and a lookahead for the closing tag, so the tags themselves are not part of the result.
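One thing to watch for: `.*` is greedy, so if the string ever contains more than one 'key' element the match will run from the first opening tag to the last closing tag. A minimal sketch of the difference, assuming a made-up input with two keys (the `multi` string and the second key value are not from the question):

```ruby
# Hypothetical input with two <key> elements (not from the question).
multi = "<key>OR-J8U</key><key>AB-123</key>"

# Greedy: runs up to the LAST closing tag.
greedy = multi[/(?<=<key>).*(?=<\/key>)/]
#=> "OR-J8U</key><key>AB-123"

# Lazy (.*?): stops at the FIRST closing tag.
lazy = multi[/(?<=<key>).*?(?=<\/key>)/]
#=> "OR-J8U"

# String#scan with the lazy pattern returns every key.
all_keys = multi.scan(/(?<=<key>).*?(?=<\/key>)/)
#=> ["OR-J8U", "AB-123"]
```

The `<` and `>` characters are written unescaped here; only the `/` needs a backslash inside a `/.../` literal.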

Graham
Sagar Pandya

If you only want to match the literal string OR-J8U, you can use that string itself as the regular expression; the - character does not need to be escaped outside a character class:

/OR-J8U/

Though, I believe you want any string that is enclosed within <planKey><key> and </key></planKey>. In that case ice's answer is useful if you allow for an empty string:

/(?<=\<key\>).*(?=\<\/key\>)/

If you don't allow for an empty string, replace the * with +:

/(?<=\<key\>).+(?=\<\/key\>)/
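To see what the choice between `*` and `+` changes in practice, here is a small sketch using an empty 'key' element. The `empty` string is a made-up input, not from the question:

```ruby
# Hypothetical input: same structure as the question, but the key is empty.
empty = "<planKey><key></key></planKey>"

# With *, an empty key still counts as a match and yields an empty string.
star = empty[/(?<=<key>).*(?=<\/key>)/]
#=> ""

# With +, at least one character is required, so there is no match.
plus = empty[/(?<=<key>).+(?=<\/key>)/]
#=> nil
```

So with `+` you get `nil` back and can treat an empty key as "not found".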

If you prefer a more general approach (any string enclosed within any tags), then I believe the common opinion is not to use a regular expression. Instead, consider using an XML/HTML parser. On Stack Overflow you can find several questions and answers on that topic.
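As a rough sketch of the parser route, here is one way to do it with REXML. This assumes REXML is available in your Ruby install (it has shipped with Ruby for a long time, as a bundled gem in recent versions); Nokogiri, mentioned in the comments, would work along similar lines. The `doc` and `key_text` names are just for illustration:

```ruby
require 'rexml/document'

str = "<planKey><key>OR-J8U</key></planKey>"

# Parse the string; the root element is <planKey>.
doc = REXML::Document.new(str)

# Look up the first <key> child of the root and read its text.
key_text = doc.root.elements['key'].text
#=> "OR-J8U"
```

Unlike the regex, this keeps working if the XML gains attributes, whitespace or extra elements.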

Ely