2

My Rails app is retrieving data from a third-party service that doesn't allow me to attach arbitrary data to records, but does have a description area that supports Markdown. I am attempting to pass data for each record over to my Rails app within the description content, via Markdown comments:

[//]: # (POST:28|USERS:102,78,90)

... additional Markdown content.

I found the [//]: # (...) syntax in this answer for embedding comments in Markdown, and my idea was to then pass pipe-separated key-value pairs in as the comment content.

Using my example above, I would like to be able to parse the description content string to interpret the key-value pairs. In this case, POST=28 and USERS=102,78,90. If it helps, this comment will always appear at the first line of the Markdown content.

I imagine Regex is the way to go here? I would really appreciate any help!

Community
  • 1
  • 1
Jody Heavener
  • 2,704
  • 5
  • 41
  • 70
  • I think your idea is good, but I suggest *not* using your own artisanal serialization format and instead using something standard like JSON after the `#`. It would simplify both the matching and the parsing significantly. – Jordan Running Jul 26 '16 at 20:38

3 Answers3

1

You can use \G:

(?:^\[//\]:[^(]+\(\K # match your token [//]: at the beginning
|                    
\G(?!\A)\|           # or right after the previous match
)
(\w+):([\w,]+)       # capture word chars (=key)
                     # followed by :
                     # followed by word chars and comma (=val) 

See a demo on regex101.com.

Jan
  • 42,290
  • 8
  • 54
  • 79
0

You'll need 2 steps to properly parse this:

First find the comment: ^\[\/\/\]: # \(([^)]*)\)

  • This captures the comment's content.

Then parse the content: (\w+):([^|]+) (with the global flag)

  • This captures the key and value separately.
Whothehellisthat
  • 2,072
  • 1
  • 14
  • 14
0

As I mentioned in my comment above, you could simplify things a lot by using a standard data serialization format like JSON after the #. For example:

require "json"

MATCH_DATA_COMMENT = %r{(?<=^\[//\]: # ).*}

markdown = <<END
[//]: # {"POST":28,"USERS":[102,78,90]}

... additional Markdown content.
END

p JSON.parse(markdown[MATCH_DATA_COMMENT])
# => { "POST" => 28, "USERS" => [102, 78, 90] }

The regular expression %r{(?<=^\[//\]: # ).*} uses negative lookbehind to match anything that follows "[//]: #".

Jordan Running
  • 102,619
  • 17
  • 182
  • 182