
I am trying to parse `!Name=John OR !Address="1234 Place Lane" AND tall` in JavaScript, extracting each "parameter" as a key/value pair if it has a `!` in front and as a simple string otherwise. Example code is below:

var input = '!Name=John OR !Address="1234 Place Lane" AND tall';
var params = input.match(/.../g); // <--??  what is the proper regex?
var i = params.length;
while(i--) {
  // strip the double quotes around quoted values
  params[i] = params[i].replace(/"/g, "");
  if( params[i].indexOf("!")==0 ) {
     params[i] = params[i].substring(1);
       // Address=1234 Place Lane
     var position = params[i].indexOf('=');
     var key = params[i].substring(0,position);
     var value = params[i].substring(1+position);
     params[i] = {"key": key, "value": value};
       // {key: "Address", value: "1234 Place Lane"}
       // {key: "Name", value: "John"}
  }
}
// params = [ {key:"Name",value:"John"}, "OR", {...}, "AND", "tall" ];

Related question: javascript split string on space or on quotes to array

verideskstar
  • This kind of problem is generally better to solve with a state machine that just walks through the string char by char collecting the pieces it encounters. – jfriend00 May 13 '13 at 14:19
  • Is it possible to use `json` rather than string parsing? – Ryan Gates May 13 '13 at 14:19
  • The *separating keywords* can only be `OR` or `AND`? If yes, I would rather split by them and parse each substring, instead of using a complicated regex. – sp00m May 13 '13 at 14:30
  • I have had success implementing a similar query micro-language with Jison: http://zaach.github.io/jison/ — it complicates the toolchain, but gives you certainty. – Vasiliy Faronov May 13 '13 at 15:02
  • State machine seems the quickest option, I'll try that. Thanks. – verideskstar May 13 '13 at 15:15
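
The state-machine idea from the comments above could be sketched roughly as follows. This is only a minimal illustration, not the asker's eventual code: `tokenize` and `toParams` are hypothetical names, and the sketch assumes that tokens are separated by single spaces, that double quotes are the only quoting mechanism, and that quotes are never escaped.

```javascript
// Walk the string character by character. The only state is whether we
// are currently inside a double-quoted section.
function tokenize(input) {
  const tokens = [];
  let current = "";
  let inQuotes = false;
  for (const ch of input) {
    if (ch === '"') {
      inQuotes = !inQuotes; // toggle the state and drop the quote itself
    } else if (ch === " " && !inQuotes) {
      if (current !== "") tokens.push(current); // a space outside quotes ends a token
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") tokens.push(current);
  return tokens;
}

// Turn "!key=value" tokens into objects; leave everything else as a string.
function toParams(tokens) {
  return tokens.map(function (token) {
    const position = token.indexOf("=");
    if (token.charAt(0) !== "!" || position === -1) return token;
    return {
      key: token.substring(1, position),
      value: token.substring(position + 1)
    };
  });
}

const result = toParams(tokenize('!Name=John OR !Address="1234 Place Lane" AND tall'));
console.log(result);
// [ {key:"Name",value:"John"}, "OR", {key:"Address",value:"1234 Place Lane"}, "AND", "tall" ]
```

Because `OR` and `AND` come out as ordinary tokens, this version keeps them in the result without any special handling.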

1 Answer


Description

Consider the following PowerShell example of a regex.

(?i)(?:\sor\s|\sand\s|^)!?([^=]*)(?:=?["]?([^"]*)["]?)?(?=\sor\s|\sand\s|$)

Example

$Matches = @()
$String = '!Name=John OR !Address="1234 Place Lane" AND tall'
Write-Host start with 
write-host $String
Write-Host
Write-Host found
([regex]'(?i)(?:\sor\s|\sand\s|^)!?([^=]*)(?:=?["]?([^"]*)["]?)?(?=\sor\s|\sand\s|$)').matches($String) | foreach {
    write-host "key at $($_.Groups[1].Index) = '$($_.Groups[1].Value)'`t= value at $($_.Groups[2].Index) = '$($_.Groups[2].Value)'"
    } # next match

Yields

start with
!Name=John OR !Address="1234 Place Lane" AND tall

found
key at 1 = 'Name'   = value at 6 = 'John'
key at 15 = 'Address'   = value at 24 = '1234 Place Lane'
key at 45 = 'tall'  = value at 49 = ''

Summary

You will still need your original logic after matching: take each key/value pair, build your result set, and test whether a value is actually present (it is empty for a bare term such as tall).

[Image: diagram of the regex]

  • (?i) make the match case-insensitive
  • (?:\sor\s|\sand\s|^) require the start of the string, "or", or "and". Both "or" and "and" must be surrounded by whitespace
  • !? consume the exclamation point if it exists
  • ([^=]*) capture the run of characters that are not equals signs (the key)
  • (?: start a non-capture group
  • =? consume the equals sign if it exists
  • ["]? consume the opening quote if it exists
  • ([^"]*) capture all non-quote characters using a greedy match (the value)
  • ["]? consume the closing quote if it exists
  • ) close the non-capture group
  • ? make the entire non-capture group optional
  • (?=\sor\s|\sand\s|$) look ahead for the next "and", "or", or end of string. This forces the greedy capture to stop at the and/or/end-of-string breaks
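
Since the question is about JavaScript, here is a rough sketch of how the pattern explained above might be ported. It makes two adaptations that are assumptions rather than part of the regex as posted: the inline `(?i)` becomes the `i` flag, and the separator is wrapped in a capture group so that `OR`/`AND` can be kept in the result. `pattern` and `parseQuery` are hypothetical names, and the sketch has only been checked against the sample input from the question.

```javascript
// Group 1: separator (OR/AND), group 2: key, group 3: value.
const pattern = /(?:\s(or|and)\s|^)!?([^=]*)(?:=?"?([^"]*)"?)?(?=\sor\s|\sand\s|$)/gi;

function parseQuery(input) {
  const params = [];
  // matchAll needs the g flag and copes with empty matches on its own.
  for (const m of input.matchAll(pattern)) {
    const operator = m[1];
    const key = m[2];
    const value = m[3]; // undefined or "" when the term has no value
    if (operator) params.push(operator);
    if (value) {
      params.push({ key: key, value: value });
    } else {
      params.push(key); // bare term such as "tall"
    }
  }
  return params;
}

console.log(parseQuery('!Name=John OR !Address="1234 Place Lane" AND tall'));
// [ {key:"Name",value:"John"}, "OR", {key:"Address",value:"1234 Place Lane"}, "AND", "tall" ]
```

The `if (value)` check is the "is a value actually present" test mentioned in the summary. In JavaScript the value group can be `undefined` rather than an empty string when the optional group does not take part in the match, so a plain truthiness check covers both cases.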
Ro Yo Mi
  • What tool did you use to generate the graph? – sp00m May 13 '13 at 20:24
  • Hey sp00m, I used http://www.debuggex.com/ which is a bit faster and more realtime than http://www.regexper.com/. But they both do about the same thing. – Ro Yo Mi May 13 '13 at 20:32