I need something that takes a string and divides it into an array. I want to split it on every space, so that this -

"Hello everybody!" turns into ---> ["Hello", "everybody!"]

However, I want it to ignore spaces in between apostrophes. So, for example -

"How 'are you' today?" turns into ---> ["How", "'are you'", "today?"]

Now I wrote the following code (which works), but something tells me that what I did is pretty much horrible and that it can be done with probably 50% less code. I'm also pretty new to JS so I guess I still don't adhere to all the idioms of the language.

function getFixedArray(text) {

    var textArray = text.split(' '); //Create an array from the string, splitting by spaces.

    var finalArray = [];
    var bFoundLeadingApostrophe = false;
    var bFoundTrailingApostrophe = false;
    var leadingRegExp = /^'/;
    var trailingRegExp = /'$/;
    var concatenatedString = "";

    for (var i = 0; i < textArray.length; i++) {
        var text = textArray[i];

        //Found a leading apostrophe
        if (leadingRegExp.test(text) && !bFoundLeadingApostrophe && !trailingRegExp.test(text)) {
            concatenatedString = concatenatedString + text;
            bFoundLeadingApostrophe = true;
        }

        //Found the trailing apostrophe
        else if (trailingRegExp.test(text) && !bFoundTrailingApostrophe) {

            concatenatedString = concatenatedString + ' ' + text;
            finalArray.push(concatenatedString);

            concatenatedString = "";

            bFoundLeadingApostrophe = false;
            bFoundTrailingApostrophe = false;
        }

        //Found no trailing apostrophe even though the leading flag indicates true, so we want this string.
        else if (bFoundLeadingApostrophe && !bFoundTrailingApostrophe) {
            concatenatedString = concatenatedString + ' ' + text;
        }

        //Regular text
        else {
            finalArray.push(text);
        }

    }

    return finalArray;

}

I would deeply appreciate it if somebody could go through this and teach me how this should be rewritten, in a more correct & efficient way (and perhaps a more "JS" way).

Thanks!

Edit -

Well, I just found a few problems. Some of them I fixed, and some I'm not sure how to handle without making this code too complex (for example, the string "hello 'every body'!" doesn't split properly).

thomas

1 Answer

You could try matching instead of splitting:

string.match(/(?:['"].+?['"])|\S+/g)

The above regex will match anything in between quotes (including the quotes), or anything that's not a space otherwise.
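
As a rough sketch, here is that regex wrapped in a small function and run on the two strings from the question. The name `splitKeepingQuotes` is made up for illustration, and the `|| []` fallback is my own addition, since `match` returns `null` when nothing matches:

```javascript
// Hypothetical helper name; the regex is the one from the line above.
function splitKeepingQuotes(text) {
    // First alternative: a quote, then as few characters as possible, then a quote.
    // Second alternative: a run of non-whitespace characters.
    var tokens = text.match(/(?:['"].+?['"])|\S+/g);
    // match() with the g flag returns null when there is no match at all.
    return tokens || [];
}

console.log(splitKeepingQuotes("Hello everybody!"));
// [ 'Hello', 'everybody!' ]
console.log(splitKeepingQuotes("How 'are you' today?"));
// [ 'How', "'are you'", 'today?' ]
console.log(splitKeepingQuotes("hello 'every body'!"));
// [ 'hello', "'every body'", '!' ]
```

Note that in the last call the "!" right after the closing quote ends up as its own element, because the quoted alternative stops at the quote.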

If you also want to match a character right after the closing quote, like ? or !, you can try:

/(?:['"].+?['"]\W?)|\S+/g

For "hello 'every body'!" it will give you this array:

["hello", "'every body'!"]

Note that \W matches a space as well. If you only want to match punctuation, you can be explicit and use a character class in place of \W:

[,.?!]

Or simply trim the strings after matching:

string.match(regex).map(function(x){return x.trim()})
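
To compare the two options side by side, here is a minimal sketch. The function and variable names are hypothetical, and the `|| []` guard for a `null` result from `match` is an assumption of mine rather than part of the snippets above:

```javascript
// Option 1: only allow specific punctuation after the closing quote.
var punctuationRegex = /(?:['"].+?['"][,.?!]?)|\S+/g;

// Option 2: keep the broader \W? and trim whitespace afterwards.
var broadRegex = /(?:['"].+?['"]\W?)|\S+/g;

function splitWithPunctuation(text) {
    // match() returns null on no match, so fall back to an empty array.
    return text.match(punctuationRegex) || [];
}

function splitAndTrim(text) {
    return (text.match(broadRegex) || []).map(function (token) {
        // Removes the space that \W? may have captured after a closing quote.
        return token.trim();
    });
}

// Without trimming, \W? swallows the space after the closing quote:
console.log("How 'are you' today?".match(broadRegex));
// [ 'How', "'are you' ", 'today?' ]

console.log(splitWithPunctuation("How 'are you' today?"));
// [ 'How', "'are you'", 'today?' ]
console.log(splitAndTrim("hello 'every body'!"));
// [ 'hello', "'every body'!" ]
```

Both functions give the same result for the two sample strings here. They differ when the character after the quote is something outside `[,.?!]`, such as `;`, which only the `\W?` version would attach to the quoted part.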
elclanrs
  • I found this question previously answered. http://stackoverflow.com/questions/2817646/javascript-split-string-on-space-or-on-quotes-to-array I think Sean Kinsey's JS combined with your regular expression should get the job done pretty well. I tried modifying his regular expression and was able to make it work pretty well, particularly with matching single quotes also, but it fell apart when I tried adding in punctuation after the quotes. – cyk Aug 10 '14 at 21:05
  • That's pretty epic, considering that the RegExp does what ~40 lines of code barely do. Could you just explain why you have quotes in the square brackets? – thomas Aug 10 '14 at 21:09
  • The quotes within the square brackets indicate the specific characters you'd like to match (i.e., `'` and `"`). – Anthony Forloney Aug 10 '14 at 21:12
  • @elclanrs, I did find one issue with your regex--it's matching the space directly after a closing quote. Not sure how to negate that one. – cyk Aug 10 '14 at 21:14
  • @cyk: the second regex? yes, it does because `\W` is a bit too broad. To match punctuation you could be explicit like `[,.?!]?` or simply trim the strings after matching `string.match(regex).map(function(x){return x.trim()})` – elclanrs Aug 10 '14 at 21:18