I'm not getting what I want from my regex.

I'm looking through @my_string, which contains:

"Black Multi/Wide Calf":[{"large":"http://ecx.images-willy.com/images
/I/41suNF66r4L.jpg","variant":"MAIN","hiRes":"http://ecx.images-willy.com
/images/I/51knTtAU6mL._UL1000_.jpg","thumb":"http://ecx.images-willy.com
/images/I/41suNF66r4L._US40_.jpg","main":{"http://ecx.images-willy.com/images
/I/51knTtAU6mL._UY500_.jpg":["500","500""]}}],"Dark Brown":
[{"large":"http://ecx.images......

And I have a variable which is:

@color = "Black Multi"

And my regex looks like:

/^#{@color}(.*)\d+(}\])$/i.match(@my_string)

I want the string that starts with "Black Multi" and ends with }]:

Black Multi/Wide Calf":[{"large":"http://ecx.images-willy.com/images
/I/41suNF66r4L.jpg","variant":"MAIN","hiRes":"http://ecx.images-willy.com
/images/I/51knTtAU6mL._UL1000_.jpg","thumb":"http://ecx.images-willy.com
/images/I/41suNF66r4L._US40_.jpg","main":{"http://ecx.images-willy.com/images
/I/51knTtAU6mL._UY500_.jpg":["500","500""]}}]

I'm getting nil with what I have. Where did I jack this up?

the Tin Man
ToddT
  • It's not clear what @my_string is supposed to contain there. It looks more like the contents of a hash that has been cut off towards the end. – Max Williams Jun 09 '15 at 16:27

2 Answers

It looks like your string is a JSON-encoded object. Don't try to parse it using regex. Instead parse it using a JSON parser, then access its contents like normal.

require 'json'

my_string = '{"Black Multi/Wide Calf":[{"large":"http://ecx.images-willy.com/images/I/41suNF66r4L.jpg","variant":"MAIN","hiRes":"http://ecx.images-willy.com/images/I/51knTtAU6mL._UL1000_.jpg","thumb":"http://ecx.images-willy.com/images/I/41suNF66r4L._US40_.jpg","main":{"http://ecx.images-willy.com/images/I/51knTtAU6mL._UY500_.jpg":["500","500"]}}]}'
obj = JSON[my_string]
# => {"Black Multi/Wide Calf"=>
#      [{"large"=>"http://ecx.images-willy.com/images/I/41suNF66r4L.jpg",
#        "variant"=>"MAIN",
#        "hiRes"=>
#         "http://ecx.images-willy.com/images/I/51knTtAU6mL._UL1000_.jpg",
#        "thumb"=>
#         "http://ecx.images-willy.com/images/I/41suNF66r4L._US40_.jpg",
#        "main"=>
#         {"http://ecx.images-willy.com/images/I/51knTtAU6mL._UY500_.jpg"=>
#           ["500", "500"]}}]}

Because it's now a regular object, in this case a hash, it's easy to access its key/value pairs:

obj["Black Multi/Wide Calf"] # => [{"large"=>"http://ecx.images-willy.com/images/I/41suNF66r4L.jpg", "variant"=>"MAIN", "hiRes"=>"http://ecx.images-willy.com/images/I/51knTtAU6mL._UL1000_.jpg", "thumb"=>"http://ecx.images-willy.com/images/I/41suNF66r4L._US40_.jpg", "main"=>{"http://ecx.images-willy.com/images/I/51knTtAU6mL._UY500_.jpg"=>["500", "500"]}}]

And it's easy to drill down:

obj["Black Multi/Wide Calf"][0]['large'] # => "http://ecx.images-willy.com/images/I/41suNF66r4L.jpg"
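The question only knows the prefix "Black Multi", not the full key, so a lookup by prefix may be closer to what is wanted. Here is a minimal sketch, assuming the data parses as JSON; the shortened string, the example.com URLs, and the local names `color`, `key`, `images` and `large_url` are placeholders made up for illustration:

```ruby
require 'json'

# Shortened, hypothetical stand-in for the page data; the URLs are placeholders.
my_string = '{"Black Multi/Wide Calf":[{"large":"http://example.com/a.jpg","variant":"MAIN"}],' \
            '"Dark Brown":[{"large":"http://example.com/b.jpg","variant":"MAIN"}]}'
color = "Black Multi"

obj = JSON.parse(my_string)

# Hash#find yields [key, value] pairs and returns the first matching pair, or nil.
key, images = obj.find { |name, _| name.start_with?(color) }

# Guard against a missing color before drilling down.
large_url = images && images.first["large"]
```

If no key starts with the given prefix, both `key` and `images` end up nil, so the guard on the last line avoids a NoMethodError.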
the Tin Man
  • Awesome.. thanks for going above and beyond! But it's the source from a webpage.. So I'm assuming it can't be JSON.. but heck I will try! – ToddT Jun 09 '15 at 18:45
  • It *is* the source for a [JSON](http://json.org) representation of a JavaScript object. Don't use regex for this, instead use the right tools. Regular expressions will break badly if that string is dynamically built, which is very likely. – the Tin Man Jun 09 '15 at 18:55
  • Also, if this is in a web page, how are you retrieving the JSON string itself? If you're using regex to get to that point you're doing it the hard and unrecommended way. You should use an HTML parser, such as Nokogiri. – the Tin Man Jun 09 '15 at 19:04
  • Nice! Ok, that is great.. I'm using Mechanize and Nokogiri.. but I had no idea what it was and was just using regex and string match to get to where I was going.. – ToddT Jun 09 '15 at 19:09
  • Well, it's possible to do it via regular expressions, but it's very fragile and error-prone, which is why we don't use them for parsing HTML or XML *except* when it's an extremely simple pattern, we aren't matching tag starts/ends or quotes, there's no chance of the text being broken across a line boundary, case won't change, and.... See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags. The answers discuss the possible problems well. – the Tin Man Jun 09 '15 at 22:45

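You need to add the "multiline" flag (/m) to the regex so that . also matches newlines. The pattern here also drops the ^ and $ anchors and uses a non-greedy .*? so the match stops at the first }]: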

str[/Black Multi.*?\}\]/m]
  #=> "Black Multi/Wide Calf\"......\"500\"\"]}}]" 
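To see the effect of the flag in isolation, here is a small sketch that runs the same pattern with and without /m. The string is a shortened, hypothetical stand-in for the one in the question (placeholder URLs, one embedded newline), and `Regexp.escape` is added on the assumption that the color name could contain regex metacharacters:

```ruby
# Hypothetical, shortened stand-in for the question's string, with an embedded newline.
str = '"Black Multi/Wide Calf":[{"large":"http://example.com/images' + "\n" +
      '/I/a.jpg","main":{"x":["500","500"]}}],"Dark Brown":[{"large":"http://example.com/b.jpg"}]'
color = "Black Multi"

# Without /m, "." stops at the newline, so no "}]" is reachable from "Black Multi".
without_m = str[/#{Regexp.escape(color)}.*?\}\]/]

# With /m, "." also matches "\n"; the lazy ".*?" stops at the first "}]".
with_m = str[/#{Regexp.escape(color)}.*?\}\]/m]

p without_m
p with_m
```

Note that the lazy match ends at the first }] it reaches, so this only works while no earlier }] appears inside the block being extracted.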
Cary Swoveland