In Ruby:

Given the following example string:

str = 'foo,baz(some,other,stuff),hello,goodbye'

I would like to parse the string so that the commas within the parentheses are not treated as delimiters. The following fields would be captured from this example:

  1. foo
  2. baz(some,other,stuff)
  3. hello
  4. goodbye

Help much appreciated!

bjlevine
  • Note that it is impossible (in the strict mathematical sense!) to handle _nested_ parentheses using regular expressions. http://stackoverflow.com/a/133684/239816 – Paul Cantrell Oct 28 '15 at 20:32
  • @PaulCantrell: what is commonly called "regular expression" or "regex" is different from what is called "regular expression" in computer science, and this tool doesn't have this kind of limitation (in particular in Ruby). – Casimir et Hippolyte Oct 28 '15 at 20:36
  • @CasimiretHippolyte: You’re right: since the questioner did specify Ruby, which I didn’t notice at first, they can use the `\g` extension. – Paul Cantrell Oct 28 '15 at 20:40
  • Your example string needs to be bracketed by single or double quotes. Everybody knows what you mean, of course, but you should know that some will downvote for such an omission. Also, whenever you give an example it's helpful to assign each input object to a variable (e.g., `str = "foo,...goodbye"`). That way, readers can refer to the variables (`str`) in answers and comments without having to define them. – Cary Swoveland Oct 28 '15 at 21:20
  • If you found either answer helpful please consider selecting one. – Cary Swoveland Mar 31 '17 at 21:56
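
For the nested case raised in the comments above, here is a minimal sketch of the `\g` subexpression call. The input string and the group names `field` and `paren` are made up for illustration, and the pattern assumes the parentheses are balanced.

```ruby
# Hypothetical input with one level of nesting (not from the question).
nested = 'foo,baz(some,(nested,list),stuff),hello'

# A field is a run of non-comma, non-paren characters and/or balanced
# parenthesized groups. \g<paren> re-enters the named group, which is
# what lets the pattern follow nesting to any depth.
field_re = /(?<field>(?:[^,()]|(?<paren>\((?:[^()]|\g<paren>)*\)))+)/

# With capture groups present, scan returns one array of groups per
# match; the first group holds the whole field.
fields = nested.scan(field_re).map(&:first)
p fields
#=> ["foo", "baz(some,(nested,list),stuff)", "hello"]
```

Empty fields (two commas in a row) are skipped by this pattern, since a field must contain at least one character.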

2 Answers

Use the following regex:

[^,(]*(?:\([^)]*\))*[^,]*

Pranav C Balan
  • This seems to work in that it matches the pattern. However, I'm having trouble defining the capture groups such that I can capture the fields (as mentioned in my original post) – bjlevine Nov 02 '15 at 16:37
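
One possible way to collect the fields with this pattern, sketched here as a suggestion rather than as the answerer's own method, is `String#scan`. No capture groups are needed, because `scan` returns the whole match when the pattern has none. The variable names are illustrative.

```ruby
str = 'foo,baz(some,other,stuff),hello,goodbye'

# The pattern from the answer, unchanged.
field_re = /[^,(]*(?:\([^)]*\))*[^,]*/

# Every part of the pattern is optional, so it can also match the empty
# string; reject(&:empty?) drops any such matches.
fields = str.scan(field_re).reject(&:empty?)
p fields
#=> ["foo", "baz(some,other,stuff)", "hello", "goodbye"]
```

Note that rejecting empty matches also discards genuinely empty fields, so an input such as `'a,,b'` would yield two fields instead of three.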

Here's a non-regex solution that makes use of Ruby's little-used flip-flop operator:

str = "foo,baz(some,other,stuff),hello,goodbye"

str.split(',').chunk { |s| s.include?('(') .. s.include?(')') ? true : false }.
               flat_map { |tf, a| tf ? a.join(',') : a }
  #=> ["foo", "baz(some,other,stuff)", "hello", "goodbye"]

The steps:

arr = str.split(',')
  #=> ["foo", "baz(some", "other", "stuff)", "hello", "goodbye"] 

enum = arr.chunk { |s| s.include?('(') .. s.include?(')') ? true : false }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fdf9d01d2e8>:each> 

Aside: the flip-flop operator is only recognized inside a conditional (here the ternary operator; an `if` or `unless` would also do), so this cannot be simplified to:

enum = arr.chunk { |s| s.include?('(') .. s.include?(')') }
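
To make the flip-flop's behaviour visible on its own, here is a small sketch that maps each piece to the value the conditional produces. The names `pieces` and `flags` are illustrative only.

```ruby
pieces = ["foo", "baz(some", "other", "stuff)", "hello", "goodbye"]

# The flip-flop turns on at the first piece containing '(' and stays on
# up to and including the next piece containing ')'.
flags = pieces.map { |s| s.include?('(') .. s.include?(')') ? true : false }
p flags
#=> [false, true, true, true, false, false]
```

These are the keys that `chunk` uses to group consecutive pieces.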

We can convert this enumerator to an array to see the values it will pass on to Enumerable#flat_map:

enum.to_a
  #=> [[false, ["foo"]], [true, ["baz(some", "other", "stuff)"]],
  #    [false, ["hello", "goodbye"]]] 

Lastly:

enum.flat_map { |tf, a| tf ? a.join(',') : a }
  #=> ["foo", "baz(some,other,stuff)", "hello", "goodbye"]
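
One caveat with the chunk approach: two parenthesized fields in a row, as in `'a(b,c),d(e,f)'`, produce consecutive `true` keys and end up joined into a single field. If that matters, a different technique (a plain depth counter, not the flip-flop method above) can be sketched as follows. The method name `split_outside_parens` is hypothetical.

```ruby
# Split on commas that are not enclosed in parentheses by tracking the
# current nesting depth character by character.
def split_outside_parens(str)
  fields = [String.new]
  depth = 0
  str.each_char do |ch|
    if ch == ',' && depth.zero?
      fields << String.new          # top-level comma: start a new field
    else
      depth += 1 if ch == '('
      depth -= 1 if ch == ')' && depth > 0
      fields.last << ch             # everything else belongs to the field
    end
  end
  fields
end

p split_outside_parens('foo,baz(some,other,stuff),hello,goodbye')
#=> ["foo", "baz(some,other,stuff)", "hello", "goodbye"]
p split_outside_parens('a(b,c),d(e,f)')
#=> ["a(b,c)", "d(e,f)"]
```

This version also copes with nested parentheses, at the cost of being longer than either one-liner.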
Cary Swoveland