0

Say I have an array of strings of the following format:

"array[5] = 10"

What would be the best solution to parse it in JavaScript?

Ashamedly not being familiar with regular expressions, I can come up only with something like this:

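for (var i = 0; i < lines.length; i++) {
    var start = lines[i].indexOf("array[");

    if (start >= 0) {
        // Skip past "array[" (6 characters), then split on "=".
        var pair = lines[i].substring(start + 6).trim().split('=');
        // Drop the trailing "]" from the left-hand side.
        var index = pair[0].trim().substring(0, pair[0].trim().length - 1);
        var value = pair[1].trim();
    }
}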

Is there a more elegant way to parse something like this? If the answer is using regex, would it make the code slower?

msgmaxim

4 Answers

4

Don't ask which approach is faster; measure it!

This is a regular expression that should match what you've implemented in your code:

/array\[(\d+)]\s*=\s*(.+)/

To help you learn regular expressions, you can use a tool like Regexper to visualize them. Here's a visualization of the above expression:

[Image: Regexper visualization of the expression]

Note that I assumed the index should be an integer, while any characters are accepted for the value. Your code doesn't specify whether the index or the value should be numbers, so these are assumptions on my part. I leave it as an exercise to the reader to tweak the expression into something more fitting if need be.
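As a minimal sketch of how the expression above could be applied to each line, assuming the strings look like the one in the question (the helper name `parseLine` and the sample `lines` array are hypothetical, introduced only for illustration):

```javascript
// Hypothetical helper: returns { index, value }, or null when the line does not match.
function parseLine(line) {
    var match = /array\[(\d+)]\s*=\s*(.+)/.exec(line);
    if (match === null) {
        return null;
    }
    return {
        index: parseInt(match[1], 10), // first capture group: digits inside the brackets
        value: match[2].trim()         // second capture group: everything after "="
    };
}

// Sample input, assumed for illustration.
var lines = ["array[5] = 10", "not a match", "array[12]=hello"];
var parsed = lines.map(parseLine);
```

Returning null for non-matching lines keeps the caller in charge of what to do with malformed input.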

Marcus Stade
  • Oh wow, V8 has some seriously optimized regex. I was going to suggest matching `.` instead of `\d` for a better comparison since the `\d` penalizes the regex engine by doing the extra number match but the results astounded me - regex FTW! – slebetman Oct 23 '13 at 06:12
  • It probably doesn't matter much whether `\d` is penalized, but it's not a bad suggestion. For this answer in particular I figured the educational aspect was much more important anyway. Hopefully the OP will have some better tools to learn regexp now. – Marcus Stade Oct 23 '13 at 13:56
1

If you want a regular expression approach, something like this will do the trick: ^".*?\[(\d+)\]\s*=\s*(\d+)"$. It matches and extracts the number inside the square brackets (\[(\d+)\]) as well as the number at the end, just before the closing " character.

Once matched, the numbers are captured in groups that you can then access. Please check this previous SO post to see how you can access said groups.
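A short sketch of reading those groups in JavaScript, assuming the double quotes really are part of each string (the expression requires them); the variable names are illustrative only:

```javascript
var pattern = /^".*?\[(\d+)\]\s*=\s*(\d+)"$/;

// Assumed sample input: the surrounding quotes are part of the string itself.
var input = '"array[5] = 10"';

// Without the g flag, match() returns [fullMatch, group1, group2] or null.
var groups = input.match(pattern);
var index = groups ? groups[1] : null;
var value = groups ? groups[2] : null;

// The same line without the quotes does not match this expression.
var unquoted = 'array[5] = 10'.match(pattern);
```

Captured groups are strings, so convert them with parseInt if numbers are needed.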

I can't comment on speed, but usually regular expressions make string processing code more compact, the drawback of which is that the code is usually more difficult to read (depending on the complexity of the expression).

npinti
1

Regex is slower than working by finding the index of a given char, regardless of the language.

In your case, don't use split but only substring at given index.

Moreover, some hints to improve performance: pair[0].trim() is called twice, and the trim() before split is redundant because each part is trimmed again afterwards.

It's all about algorithms…

Here is a faster implementation:

for (var i = 0; i < lines.length; i++) {
    var i1 = lines[i].indexOf("[");
    var i2 = lines[i].indexOf("]");
    var i3 = lines[i].indexOf("=");

    if (i1 >= 0) {
        // Text between "[" and "]", excluding the brackets.
        var index = lines[i].substring(i1 + 1, i2);
        // Everything after "=", with surrounding whitespace removed.
        var value = lines[i].substring(i3 + 1).trim();
    }
}
Yann Moisan
  • Not true in this case. See the answer by macke - regex is 6 times faster in chrome and only around 0.2 times slower in firefox – slebetman Oct 23 '13 at 06:11
0

If all you want to do is extract the index and value, you don't need to parse the string (which implies tokenising and processing). Just find the bits you want and extract them.

If your strings are always like "array[5] = 10" and the values are always integers, then:

var nums = s.match(/\d+/g);
var index = nums[0];
var value = nums[1];

should do the trick. If there is a chance that there will be no matches, then you might want:

var index = nums && nums[0];
var value = nums && nums[1];

and deal with cases where index or value are null to avoid errors.
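One way to fold the extraction and the null check together is sketched below. It assumes a global (`g`) flag so that `match` returns every run of digits rather than only the first; `extractPair` is a hypothetical name used only for illustration:

```javascript
// Hypothetical helper: returns { index, value } as strings, or null when
// fewer than two runs of digits are found.
function extractPair(s) {
    var nums = s.match(/\d+/g); // null when the string contains no digits
    if (nums === null || nums.length < 2) {
        return null;
    }
    return { index: nums[0], value: nums[1] };
}

var ok = extractPair("array[5] = 10");
var bad = extractPair("no digits here");
```

This treats any string with at least two numbers as valid, so it only suits input already known to follow the "array[5] = 10" shape.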

If you genuinely want to parse the string, there's a bit more work to do.

RobG