
Sample Text:

- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Michael"
    int_value: 
    id: "35972390"
    date_value: 
    name: first_name
  attributes_cache: {}

- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Johnson"
    int_value: 
    id: "35533149"
    date_value: 
    name: last_name
  attributes_cache: {}

Target:

I'm trying to extract the value after "string_value" where the "name" equals some string. Let's say it equals last_name. The attributes are not in any particular order. I've explored using capture groups but I did not get very far.

Any help on this would be appreciated. Thanks!

Moody
  • This should be of help: http://stackoverflow.com/questions/19193251/regex-to-get-the-words-after-matching-string – Travis Dec 11 '16 at 03:42
  • Thanks @Travis This helps. However, in that example, "Object Name" is unique. In my case, string_value can repeat an infinite number of times. I can retrieve the string after "string_value", but "name" needs to equal a certain value within the same block. – Moody Dec 11 '16 at 03:48
  • The answer to that depends on the following: What tool are you using to perform the regex? And what do you want to do with each match? Depending on your tool, the simplest solution would just be to loop and search again from the previous location + 1 until no more matches are found. *(that would necessarily use a language other than regex to drive the search, though)* – Travis Dec 11 '16 at 03:52
  • This would be part of a SQL search. I would display the value. – Moody Dec 11 '16 at 03:55
  • Got it. Yeah this was the first path I wanted to take to see if this can be done with regex. I'll go a more traditional route and script it. – Moody Dec 11 '16 at 03:55

1 Answer


You can try this regex:

string_value:(?=(?:(?!attributes_cache).)*name: last_name)\s+\"(\w+)\".*?attributes_cache

Explanation

  1. string_value: matches the literal characters string_value:
  2. The positive lookahead (?=(?:(?!attributes_cache).)*name: last_name) checks that name: last_name appears later in the same block. The inner (?!attributes_cache) stops the scan at attributes_cache, so the lookahead cannot run into the next block, which may have its own name: last_name. Because it only looks forward, name has to come after string_value within the block.
  3. \s+ matches one or more whitespace characters (equal to [\r\n\t\f\v ]), as many as possible, giving back as needed (greedy)
  4. \" matches the character " literally
  5. 1st Capturing Group (\w+): \w+ matches one or more word characters (equal to [a-zA-Z0-9_]); this is the text that you want to capture.
  6. .*?attributes_cache consumes the rest of the block up to attributes_cache, so the next match starts in a later block. The blocks span several lines, so . must be allowed to match newlines (the /m flag in the Ruby sample).

The capture group 1 contains the text that you are looking for.

You haven't said which programming language you are using, so the following sample is in Ruby:

re = /string_value:(?=(?:(?!attributes_cache).)*name: last_name)\s+\"(\w+)\".*?attributes_cache/m
str = '\\- !ruby/object:DynamicAttribute 
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Johnson1"
    int_value: 
    id: "35533149"
    date_value: 
    name: last_name
  attributes_cache: {}

\\- !ruby/object:DynamicAttribute 
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Michael"
    int_value: 
    id: "35972390"
    date_value: 
    name: first_name
  attributes_cache: {}

\\- !ruby/object:DynamicAttribute 
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Johnson2"
    int_value: 
    id: "35533149"
    date_value: 
    name: last_name
  attributes_cache: {}'

# Print the match result
str.scan(re) do |match|
    puts match.to_s
end
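
If the quoted value can contain spaces or punctuation, (\w+) will not match it. The following is a minimal sketch of one possible variant, not the pattern from the answer above: it swaps in ([^"]*) and takes the attribute name as a parameter. The helper name string_value_for and the shortened sample data are hypothetical, and the sketch assumes that values never contain an escaped double quote and that name comes after string_value in each block.

```ruby
# Hypothetical helper: returns every string_value whose block has the given name.
# Assumptions: each block ends with "attributes_cache", values contain no
# escaped double quotes, and name: appears after string_value: in the block.
def string_value_for(text, name)
  re = /string_value:(?=(?:(?!attributes_cache).)*name: #{Regexp.escape(name)}\s*$)\s+"([^"]*)"/m
  # scan returns one array per match (one element per capture group)
  text.scan(re).flatten
end

sample = <<~YAML
  - !ruby/object:DynamicAttribute
    attributes:
      string_value: "Mary Ann"
      name: first_name
    attributes_cache: {}
  - !ruby/object:DynamicAttribute
    attributes:
      string_value: "van der Berg"
      name: last_name
    attributes_cache: {}
YAML

puts string_value_for(sample, "last_name")
```

The \s*$ after the name keeps a search for last_name from also matching a longer name such as last_name_suffix.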
Mustofa Rizwan
  • Thank you @Maverick_Mrt, this works great. I modified a couple of things to account for multiple words. It turns out MySQL does not allow capture groups, so I had to resort to parsing the YAML in Ruby, iterating through the objects, and testing each string_value/name. – Moody Dec 12 '16 at 01:23
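
The scripted route described in that last comment (parse the YAML, iterate through the objects, test each name) might look like the minimal sketch below. It is not the code Moody actually used. The helper name string_values_by_name and the shortened sample data are hypothetical, and the sketch assumes the !ruby/object tags can simply be dropped so that plain hashes are loaded and the DynamicAttribute class does not have to be defined.

```ruby
require "yaml"

# Hypothetical helper: returns string_value for every entry with the given name.
# Assumption: stripping the !ruby/object tags leaves valid plain YAML, so
# safe_load can return ordinary arrays and hashes.
def string_values_by_name(yaml_text, name)
  plain = yaml_text.gsub(%r{!ruby/object:\S+}, "")
  entries = YAML.safe_load(plain) || []
  entries
    .map { |entry| entry["attributes"] }
    .select { |attrs| attrs && attrs["name"] == name }
    .map { |attrs| attrs["string_value"] }
end

dump = <<~YAML
  - !ruby/object:DynamicAttribute
    attributes:
      resource_id: "1"
      name: last_name
      string_value: "Johnson"
      int_value:
    attributes_cache: {}
  - !ruby/object:DynamicAttribute
    attributes:
      resource_id: "1"
      string_value: "Michael"
      name: first_name
    attributes_cache: {}
YAML

p string_values_by_name(dump, "last_name")
```

Because the lookup goes through parsed hashes, the order of name and string_value inside a block does not matter, and multi-word values need no special handling.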