
I have an array of strings, as shown below:

["<table class=\"size-table _size-table\">\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255138\" role=\"option\" aria-disabled=\"true\" aria-label=\"2\">\n<td class=\"size-name _size-name\">2",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255136\" role=\"option\" aria-disabled=\"true\" aria-label=\"3\">\n<td class=\"size-name _size-name\">3",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255137\" role=\"option\" aria-disabled=\"true\" aria-label=\"4\">\n<td class=\"size-name _size-name\">4",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255135\" role=\"option\" aria-disabled=\"true\" aria-label=\"5\">\n<td class=\"size-name _size-name\">5",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255134\" role=\"option\" aria-disabled=\"true\" aria-label=\"6\">\n<td class=\"size-name _size-name\">6",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255133\" role=\"option\" aria-disabled=\"true\" aria-label=\"7\">\n<td class=\"size-name _size-name\">7",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255132\" role=\"option\" aria-disabled=\"true\" aria-label=\"8\">\n<td class=\"size-name _size-name\">8",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n<tr class=\"product-size _product-size disabled _disabled\" data-sku=\"3255131\" role=\"option\" aria-disabled=\"true\" aria-label=\"9\">\n<td class=\"size-name _size-name\">9",
 "\n<td class=\"subscribe\">",
 "\n</tr>\n</table>\n"]

I want to extract the value of each aria-label attribute. In the example above these are numbers, but in some cases they could be values such as S, M, L, or XL.

I am trying to iterate over each array element and select the aria-label value with a regular expression in Ruby, but I have not been able to get it to work. Please help.

tin tin
  • You shouldn't use regexps to parse HTML, and [this is the reason why](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Use an HTML/XML library like [nokogiri](http://www.nokogiri.org/) instead. – spickermann Jun 27 '16 at 11:34
  • @spickermann : Thanks for your suggestion, and I agree with it. My main intention in posting this question is to understand how to work with regexes, which is why the title specifies regular expressions. Thanks :) – tin tin Jun 27 '16 at 11:55

1 Answer


While it is often said that HTML should not be parsed with regexps, in this particular case it is arguably fine, IMHO, since the input is a set of string fragments rather than a complete HTML document.

inp.map { |e| e[/(?<=aria-label=").+?(?=")/] }

#⇒ ["2", nil, "3", nil, "4", nil, "5", nil, "6", 
#     nil, "7", nil, "8", nil, "9", nil, nil]

To retrieve only the meaningful values, drop the nils with compact:

inp.map { |e| e[/(?<=aria-label=").+?(?=")/] }.compact
#⇒ ["2", "3", "4", "5", "6", "7", "8", "9"]
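The same extraction can also be written with a capture group instead of lookarounds, which some readers find easier to follow. The sketch below is only an illustration: the shortened `inp` array and the variable names `label_re`, `per_element`, and `labels` are assumptions made for the example, and one label was changed to "XL" to show that non-numeric sizes are matched too.

```ruby
# Hypothetical, shortened input in the same shape as the array in the question.
inp = [
  "<tr class=\"product-size\" data-sku=\"3255138\" aria-label=\"2\">\n<td class=\"size-name\">2",
  "\n<td class=\"subscribe\">",
  "\n</tr>\n<tr class=\"product-size\" data-sku=\"3255136\" aria-label=\"XL\">\n<td class=\"size-name\">XL",
  "\n</tr>\n</table>\n"
]

# One capture group holding everything between the attribute's double quotes.
label_re = /aria-label="([^"]*)"/

# String#[] with a regexp and a capture index returns that capture,
# or nil when the element contains no aria-label.
per_element = inp.map { |e| e[label_re, 1] }
# per_element is ["2", nil, "XL", nil]

# Alternatively, join the fragments and scan once. Unlike map + [],
# this also picks up several labels that sit inside a single element.
labels = inp.join.scan(label_re).flatten
# labels is ["2", "XL"]
```

Using `[^"]*` rather than `.+?` keeps the match inside the quotes by construction and also accepts an empty attribute value; whether an empty label should count as a result depends on the data.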
Aleksei Matiushkin