5

Let's say I have these two examples:

  1. (A = 1) and ( B = 2)
  2. (A = 1)(B = 2 ()).

I need a way to get the following array:

  1. [(],[A],[=],[1],[)],[and],[(],[B],[=],[2],[)]
  2. [(],[A],[=],[1],[)],[(],[B],[=],[2],[(],[)],[)]

What I tried to do is the following:

Find the delimiters using the following function (in this case the delimiters are the space " " and the brackets ( and )):

function findExpressionDelimeter(textAreaValue) {
    var delimiterPositions = [];
    var bracesDepth = 0;
    var squareBracketsDepth = 0;
    var bracketsDepth = 0;

    for (var i = 0; i < textAreaValue.length; i++) {
        switch (textAreaValue[i]) {
            case '(':
                bracketsDepth++;
                delimiterPositions.push(i);
                break;
            case ')':
                bracketsDepth--;
                delimiterPositions.push(i);
                break;
            case '[':
                squareBracketsDepth++;
                break;
            case ']':
                squareBracketsDepth--;
                break;
            default:
                if (squareBracketsDepth == 0 && textAreaValue[i] == ' ') {
                    delimiterPositions.push(i);
                }
        }
    }
    return delimiterPositions;
}

Then I tried to loop through the returned positions and extract the values using substring. The issue is that when I hit a ( or ) I need to get the bracket itself as well as the next substring. This is where I am stuck.

    function getTextByDelimeter(delimiterPositions, value) {
            var output = [];
            var index = 0;
            var length = 0;
            var string = "";

            for (var j = 0; j < delimiterPositions.length; j++) {

                if (j == 0) {
                    index = 0;
                } else {
                    index = delimiterPositions[j - 1] + 1;
                }

                length = delimiterPositions[j];


                string = value.substring(index, length);
                output.push(string);
            }
            string = value.substring(length, value.length);
            output.push(string);
            return output;
        }

Any help would be appreciated.

trooper
Moddinu
  • So you're trying to write a syntactic parser of some sort? – Evan Knowles May 21 '14 at 09:03
  • I'm trying to break up that expression so that I can check that it is correct and give an "id" to each item: A would be a term, = would be an operator, and the brackets, well, remain brackets :) – Moddinu May 21 '14 at 09:05
  • I'd recommend checking out this and related links: http://stackoverflow.com/questions/9957873/creating-a-parser-for-a-simple-pseudocode-language – Evan Knowles May 21 '14 at 09:08
  • Have you tried splitting by `\b`? – sp00m May 21 '14 at 09:17
  • I read the article. From what I can tell, the best thing to do is to parse the input while I find the delimiters, not afterwards. – Moddinu May 21 '14 at 09:18

3 Answers

1

You could just match the tokens you are interested in:

var str = "(A = 1) and ( B = 2)";
var arr = str.match(/[()]|[^()\s]+/g);

Result:

["(", "A", "=", "1", ")", "and", "(", "B", "=", "2", ")"]

The regex with some comments:

[()]     # match a single character token
|        # or
[^()\s]+ # match everything else except spaces

If you would like to add more single-character tokens, for example =, just add them to both character classes, i.e. [()=]|[^()=\s]+
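
As a quick check of that extension, here is a minimal sketch that wraps the pattern in a helper and runs it on both inputs from the question, the second one written without spaces around the first =. The name tokenize is made up for illustration.

```javascript
// Hypothetical helper around the regex; `tokenize` is an illustrative name.
function tokenize(str) {
    // [()=]       one of the single-character tokens
    // [^()=\s]+   otherwise, a run of characters up to whitespace or such a token
    // match() returns null when nothing matches, so fall back to an empty array.
    return str.match(/[()=]|[^()=\s]+/g) || [];
}

console.log(tokenize("(A = 1) and ( B = 2)"));
// ["(", "A", "=", "1", ")", "and", "(", "B", "=", "2", ")"]

console.log(tokenize("(A=1)(B = 2 ())"));
// ["(", "A", "=", "1", ")", "(", "B", "=", "2", "(", ")", ")"]
```

Because = is now a token of its own, A=1 and A = 1 produce the same three tokens.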

Qtax
  • I think I need to learn more regex :) as its not my area of expertise. It's quite an elegant solution thank you. – Moddinu May 21 '14 at 09:51
0

A similar question with an answer is here.

You can split your string into single characters (string.split('')) and then delete the whitespace entries from the array, or just check that array[i] != ' ' before your switch block.
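
Splitting into characters alone would also break and into a, n, d, so the rough sketch below goes one step further than the suggestion: it walks the characters once, skips spaces, and groups everything that is not a bracket into words. The name splitTokens and the inner flush helper are hypothetical, and the square-bracket depth tracking from the question's function is left out.

```javascript
// Sketch only: `splitTokens` and `flush` are illustrative names, not from the question.
function splitTokens(text) {
    const tokens = [];
    let word = "";                       // characters of the token being built

    function flush() {                   // emit the pending word, if there is one
        if (word !== "") {
            tokens.push(word);
            word = "";
        }
    }

    const chars = text.split('');
    for (let i = 0; i < chars.length; i++) {
        switch (chars[i]) {
            case ' ':
                flush();                 // a space only ends the current word
                break;
            case '(':
            case ')':
                flush();                 // a bracket ends the current word...
                tokens.push(chars[i]);   // ...and is a token itself
                break;
            default:
                word += chars[i];
        }
    }
    flush();                             // last word when the text does not end on a delimiter
    return tokens;
}

console.log(splitTokens("(A = 1) and ( B = 2)"));
// ["(", "A", "=", "1", ")", "and", "(", "B", "=", "2", ")"]
```

Pushing the bracket inside the loop avoids the second substring pass where the question got stuck.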

Community
0

What you want is a lexical analyser (a lexer).

Regular expressions alone won't let you parse a language (a mathematical expression is one): they cannot produce the tree decomposition of the formula.

However, regexes can be used to discriminate tokens. This is usually done by reading the stream of characters; once you've detected a lexeme, you generate the token.

If you want to check the validity of the formula or compute its value, you need a parser (a syntactic analyser). This can't be done with regexes alone.
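
To make the split between the two stages concrete, here is a minimal sketch, not a full parser. The lexer uses the regex pattern [()=]|[^()=\s]+ suggested elsewhere on this page to find lexemes and tags each one; the type names and the choice to treat and as an operator are assumptions based on the question's example. The second function is the simplest possible validity check, nesting depth, which needs a counter rather than a pattern.

```javascript
// Sketch of a lexer: find lexemes with a regex, then tag each one.
// The type names "bracket", "operator" and "term" are assumptions.
function lex(text) {
    const lexemes = text.match(/[()=]|[^()=\s]+/g) || [];
    return lexemes.map(function (value) {
        let type = "term";
        if (value === "(" || value === ")") {
            type = "bracket";
        } else if (value === "=" || value === "and") {
            type = "operator";
        }
        return { type: type, value: value };
    });
}

// Minimal validity check: every ")" must close an earlier "(".
// Tracking arbitrary nesting depth is the part a plain regex cannot do.
function bracketsBalanced(tokens) {
    let depth = 0;
    for (const token of tokens) {
        if (token.value === "(") depth++;
        if (token.value === ")") depth--;
        if (depth < 0) return false;     // closing bracket with nothing open
    }
    return depth === 0;                  // nothing may be left open
}

const tokens = lex("(A = 1)(B = 2 ())");
console.log(tokens[1]);                  // { type: "term", value: "A" }
console.log(bracketsBalanced(tokens));   // true
console.log(bracketsBalanced(lex("(A = 1"))); // false
```

A real parser would go further and check the order of terms and operators, typically by building a tree from the token list.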

M'vy