I have a regex that splits my string into an array wherever it finds NULL\n or '\n.

My string is:

"'<xml↵ data>', NULL↵'abc', '<xml↵ data>'↵'abc', 'abc'"

(The string holds comma-separated values. All values are wrapped in single quotes, except NULL values. Rows are separated by newlines, but my real problem is that values can also contain newlines.)

With /NULL\n|'\n/ I get this result:

["'<xml↵ data>', ", "'abc', '<xml↵ data>", "'abc', 'abc'"] 

But now I would like to keep the NULL and ' parts of the delimiter (I'm also OK if \n is preserved). The result would then look like this:

["'<xml↵ data>', NULL", "'abc', '<xml↵ data>'", "'abc', 'abc'"] 

My code so far:

var data = "'<xml\n data>', NULL\n'abc', '<xml\n data>'\n'abc', 'abc'"
var result = data.split(/NULL\n|'\n/)
console.log(result)

Thank you very much for your help. I know similar threads exist (like this one), but I'm not good with regex, so I was not successful in adapting those solutions to my needs.

EDIT: Working solution (if anybody else needs it)

Based on @Michael Sanchez's answer, I created this working function using indexOf (although I'm a little worried about performance, because in my case the loop must go over a 4 MB text):

Live demo: http://jsfiddle.net/ngr97jz7/3/

function ConvertToArray(text){
    var rows = [];
    var i = 1;
    while(i != -1 && i != 0){
        // find the first occurrence of each delimiter in the remaining text
        var a = text.indexOf("NULL\n");
        var b = text.indexOf("'\n");
        // cut after "NULL" (4 chars) or after "'" (1 char), whichever delimiter comes first
        i = ((a < b && a != -1) || (a > b && b == -1)) ? a+4 : b+1;
        if(i == 0 || i == -1){
            // no delimiter left: the remainder is the last row
            rows.push( text );
            break;
        }
        rows.push( text.substring(0,i) );
        // skip the "\n" and continue with the rest of the text
        text = text.substring(i+1, text.length);
    }
    return rows;
}
user44387
  • You have to split on an empty (zero-width) match. That is doable with grouping in a regex, but since lookbehind is not supported by JavaScript regex, I'm not sure how to do that. If you want to know more, here's a PCRE example (it won't work in JavaScript, but it will give you an idea of why your code behaves like that), or you could look for a library that adds lookbehind support to JavaScript. http://regex101.com/r/lR8wG9/1 – Javier Diaz Sep 02 '14 at 12:50
  • http://stackoverflow.com/questions/12317499/javascript-and-regex-split-and-keep-delimiter?rq=1 – Evan Knowles Sep 03 '14 at 05:50
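
The grouping idea from the first comment can be sketched without lookbehind. When the split pattern contains a capturing group, split() also returns the captured text, so NULL or ' can be glued back onto the row it ended. The function name splitKeepingDelimiter is made up for this illustration, and the sketch assumes (as the question does) that NULL\n and '\n never occur inside a value:

```javascript
// Hypothetical helper: split on "\n" only when it follows NULL or a quote,
// keeping NULL / ' at the end of each row.
function splitKeepingDelimiter(text) {
  // With a capturing group, split() returns:
  // [row, captured, row, captured, ..., lastRow]
  var parts = text.split(/(NULL|')\n/);
  var rows = [];
  for (var i = 0; i < parts.length; i += 2) {
    // Re-attach the captured NULL or ' (there is none after the last row).
    rows.push(parts[i] + (parts[i + 1] || ""));
  }
  return rows;
}

var data = "'<xml\n data>', NULL\n'abc', '<xml\n data>'\n'abc', 'abc'";
console.log(splitKeepingDelimiter(data));
```

For the sample string this yields the three rows asked for in the question. In newer engines that implement ES2018 lookbehind assertions, data.split(/(?<=NULL|')\n/) should give the same rows in one call.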

1 Answer

One approach I can think of is to first find all the indices of your delimiters throughout the whole string, using a loop and .indexOf(<string>, <index>).

Then use a second loop to extract the desired substrings of the main string from the indices you collected. You can add those to a list, then turn it into an array afterwards:

List<String> list = new ArrayList<String>();
// after list is populated
String[] arr = list.toArray(new String[list.size()]);

EDIT: My bad, your problem is in JavaScript. Just disregard the list step.
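
In JavaScript, the indexOf idea above might look roughly like the sketch below. It is only one way to do it: the two loops are folded into one, the function name splitRows is invented for this example, and the delimiters NULL\n and '\n are taken from the question. Passing a start index to indexOf means the text is never re-sliced and each search resumes where the previous row ended, which may help with large inputs.

```javascript
// Hypothetical helper; delimiters are the ones from the question.
function splitRows(text) {
  var rows = [];
  var start = 0; // index where the current row begins
  var a = text.indexOf("NULL\n", start);
  var b = text.indexOf("'\n", start);
  while (a !== -1 || b !== -1) {
    // Pick whichever delimiter comes first in the remaining text.
    var useNull = a !== -1 && (b === -1 || a < b);
    // Keep "NULL" (4 chars) or "'" (1 char) in the row, drop the "\n".
    var end = useNull ? a + 4 : b + 1;
    rows.push(text.substring(start, end));
    start = end + 1;
    // Re-search only a delimiter whose last hit is now behind us.
    if (a !== -1 && a < start) a = text.indexOf("NULL\n", start);
    if (b !== -1 && b < start) b = text.indexOf("'\n", start);
  }
  // Whatever is left after the last delimiter is the final row.
  rows.push(text.substring(start));
  return rows;
}

var data = "'<xml\n data>', NULL\n'abc', '<xml\n data>'\n'abc', 'abc'";
console.log(splitRows(data));
```

For the sample string this produces the same three rows as the ConvertToArray function in the question.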

Michael Sanchez