I asked the original question here and got a practical answer that mixed Ruby with regular expressions. Now the purist in me wants to know: can this be done in regular expressions alone? My gut says it can. There's an ABNF floating around for bash 2.0, though it doesn't include string escapes.

The Spec

Given an input line that is either (1) a variable ("key") assignment from a bash-flavored script or (2) a key-value setting from a typical configuration file like postgresql.conf, this regex (or pair of regexen) should capture the key and value in such a way that I can use those captures to substitute a new value for that key.

You may use a different regular expression for shell-flavored and config-flavored lines; the caller will know which to use.

There will be a 50-point bounty here. I can't add a bounty for two days, so I won't accept an answer till then, but you can start answering immediately. You earn points for:

  • Readability (named capture groups, definitions via (?(DEFINE)...) or {0})
  • Using a single regex instead of two
  • Teaching me something about DFA
  • Regex performance, if relevant
  • Getting upvoted
  • First to use a technique

Example:

Given the input

export RAILS_ENV=production

I should be able to write in Ruby:

match = THE_REGEX.match("export RAILS_ENV=production")
newline = "export #{match[:key]}=#{match[:value]}"

Test cases: shell style

RAILS_ENV=development     # Don't forget to change this for TechCrunch
HOSTNAME=`cat /etc/hostname`
plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

# Optional bonus input: "#" present in the string
FORMAT="  ##0.00 passe\`" #comment

Test cases: config style

listen_addresses = 127.0.0.1 #localhost only by default
# listen_addresses = 0.0.0.0 commented out, should not match

For the purpose of this challenge, "regular expression" and "regex" mean the same thing and both can refer to any common flavor you like, though I prefer Ruby 1.9-compatible.

Jay Levitt

1 Answer

I'm not sure about the full specs and what exactly you want in the value capturing group, but this should work for your test cases:

/
^\s*+

(?:export\s++)?
(?<key>\w++)

\s*+
=
\s*+

(?<value>
  (?>  "(?:[^"\\]+|\\.)*+"
  |    '(?:[^'\\]+|\\.)*+'
  |    `(?:[^`\\]+|\\.)*+`
  |    [^#\n\r]++
  )
)

\s*+
(?:\#.*+)?
$
/mx;

It handles trailing comments and quoted values that contain escapes.

The pattern uses Perl/PCRE flavor and quoting.
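Since the question prefers Ruby 1.9, here is a rough sketch of how the same pattern might be ported. The names `ASSIGNMENT_RE` and `parse_assignment` are made up for illustration. It assumes Ruby's regex engine, where `^` and `$` already anchor at line boundaries, so Perl's `/m` flag is dropped (in Ruby `/m` makes `.` match newlines instead). One thing to be aware of: the bare-value branch also consumes any blanks in front of a trailing comment, so the sketch trims them.

```ruby
# Hypothetical Ruby port of the pattern above (names are illustrative).
ASSIGNMENT_RE = /
  ^\s*+
  (?:export\s++)?               # optional shell "export" prefix
  (?<key>\w++)
  \s*+ = \s*+
  (?<value>
    (?>  "(?:[^"\\]+|\\.)*+"    # double-quoted, backslash escapes allowed
    |    '(?:[^'\\]+|\\.)*+'    # single-quoted
    |    `(?:[^`\\]+|\\.)*+`    # backtick command substitution
    |    [^\#\n\r]++            # bare value, up to a comment or end of line
    )
  )
  \s*+
  (?:\#.*+)?                    # optional trailing comment
  $
/x

# Returns [key, value] for an assignment line, or nil when nothing matches.
def parse_assignment(line)
  m = ASSIGNMENT_RE.match(line)
  return nil unless m
  # The bare-value branch swallows blanks before a trailing comment,
  # so trim them here.
  [m[:key], m[:value].rstrip]
end

p parse_assignment("export RAILS_ENV=production")   # => ["RAILS_ENV", "production"]
p parse_assignment("# listen_addresses = 0.0.0.0")  # => nil
```

The `#` characters are escaped because, under `/x`, an unescaped `#` outside a character class starts a comment.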


Example usage in Perl:

my $re = qr/
    ^\s*+

    (?:export\s++)?
    (?<key>\w++)

    \s*+
    =
    \s*+

    (?<value>
      (?>  "(?:[^"\\]+|\\.)*+"
      |    '(?:[^'\\]+|\\.)*+'
      |    `(?:[^`\\]+|\\.)*+`
      |    [^#\n\r]++
      )
    )

    \s*+
    (?:\#.*+)?
    $
/mx;

my $str = <<'_TESTS_';
RAILS_ENV=development     # Don't forget to change this for TechCrunch
HOSTNAME=`cat /etc/hostname`
plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

# Optional bonus input: "#" present in the string
FORMAT="  ##0.00 passe\`" #comment

listen_addresses = 127.0.0.1 #localhost only by default
# listen_addresses = 0.0.0.0 commented out, should not match

TEST="foo'bar\"baz#"
TEST='foo\'bar"baz#\\'
_TESTS_


for(split /[\r\n]+/, $str){
    print "line: $_\n";
    print /$re/? "match: $1, $2\n": "no match\n";
    print "\n";
}

Output:

line: RAILS_ENV=development     # Don't forget to change this for TechCrunch
match: RAILS_ENV, development

line: HOSTNAME=`cat /etc/hostname`
match: HOSTNAME, `cat /etc/hostname`

line: plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`
match: plist, `cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

line: # Optional bonus input: "#" present in the string
no match

line: FORMAT="  ##0.00 passe\`" #comment
match: FORMAT, "  ##0.00 passe\`"

line: listen_addresses = 127.0.0.1 #localhost only by default
match: listen_addresses, 127.0.0.1

line: # listen_addresses = 0.0.0.0 commented out, should not match
no match

line: TEST="foo'bar\"baz#"
match: TEST, "foo'bar\"baz#"

line: TEST='foo\'bar"baz#\\'
match: TEST, 'foo\'bar"baz#\\'
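
Coming back to the question's goal of substituting a new value for a key, one possible approach is to splice the replacement in at the offsets of the `value` capture, which leaves any `export` prefix, spacing, and trailing comment untouched. This is only a sketch in Ruby: `ASSIGNMENT_RE` and `replace_value` are hypothetical names, the pattern is an assumed Ruby port of the one above, and the caller is assumed to quote the new value as needed.

```ruby
# Assumed Ruby port of the pattern above (the name is illustrative).
ASSIGNMENT_RE = /
  ^\s*+ (?:export\s++)? (?<key>\w++) \s*+ = \s*+
  (?<value>
    (?>  "(?:[^"\\]+|\\.)*+"
    |    '(?:[^'\\]+|\\.)*+'
    |    `(?:[^`\\]+|\\.)*+`
    |    [^\#\n\r]++
    )
  )
  \s*+ (?:\#.*+)? $
/x

# Returns the line with its value replaced; lines that do not match
# (for example commented-out settings) come back unchanged.
def replace_value(line, new_value)
  m = ASSIGNMENT_RE.match(line)
  return line unless m
  # A bare value captures the blanks before a trailing comment;
  # keep them so the comment stays where it was.
  padding = m[:value][/\s*\z/]
  line[0...m.begin(:value)] + new_value + padding + line[m.end(:value)..-1]
end

puts replace_value("listen_addresses = 127.0.0.1 #localhost only by default", "0.0.0.0")
puts replace_value("# listen_addresses = 0.0.0.0 commented out", "10.0.0.1")
```

Splicing by offset avoids rebuilding the line from `key` and `value`, so whatever the pattern skipped over is preserved exactly as written.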
Qtax