2

Given this text:

1/12/2011
I did something.

10/5/2013
I did something else.

Here is another line.

And another.

5/17/2014
Lalala.
More text on another line.

I would like to use regex (or maybe some other means?) to get this:

["1/12/2011", "I did something.", "10/5/2013", "I did something else.\n\nHere is another line.\n\nAnd another.", "5/17/2014", "Lalala.\nMore text on another line."]

The date part and content part are each separate entries, alternating.

I've tried using [^] instead of the dot, since . in JavaScript does not match newlines (as "Matching multiline Patterns" says), but then the match is greedy and consumes too much, so the resulting array has only one entry:

var split_pattern = /\b(\d\d?\/\d\d?\/\d\d\d\d)\n([^]+)/gm;
var array_of_mems = contents.match(split_pattern);

// => ["1/12/2011↵I did something else..."]

If I add a question mark to get [^]+?, which according to "How to make Regular expression into non-greedy?" makes the match non-greedy, then I only get the first character of the content part.
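
A minimal sketch of the two behaviours described above. The `contents` string is a shortened, made-up sample, and the variable names `greedy` and `lazy` are only illustrative.

```javascript
// Shortened sample input (an assumption for illustration).
var contents = '1/12/2011\nI did something.\n\n10/5/2013\nI did something else.';

// Greedy: [^]+ runs to the end of the input, so everything lands in one match.
var greedy = contents.match(/\b(\d\d?\/\d\d?\/\d\d\d\d)\n([^]+)/g);
// greedy has a single entry equal to the whole input

// Lazy with nothing after it: [^]+? is already satisfied by one character.
var lazy = contents.match(/\b(\d\d?\/\d\d?\/\d\d\d\d)\n([^]+?)/g);
// lazy is ['1/12/2011\nI', '10/5/2013\nI']
```

A lazy quantifier only expands as far as the rest of the pattern forces it to, so with nothing after it, it stops at the minimum of one character.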

What's the best method? Thanks in advance.

dmonopoly

2 Answers

2
(\d{1,2}\/\d{1,2}\/\d{4})\n|((?:(?!\n*\d{1,2}\/\d{1,2}\/\d{4})[\s\S])+)

You can try this. Grab the captures: each match fills either group 1 (a date) or group 2 (a content block). See the demo.

https://regex101.com/r/sJ9gM7/126

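var re = /(\d{1,2}\/\d{1,2}\/\d{4})\n|((?:(?!\n*\d{1,2}\/\d{1,2}\/\d{4})[\s\S])+)/gim;
var str = '1/12/2011\nI did something.\n\n10/5/2013\nI did something else.\n\nHere is another line.\n\nAnd another.\n\n5/17/2014\nLalala.\nMore text on another line.';
var result = [];
var m;

// Loop over every match; a single `if` would only see the first one.
while ((m = re.exec(str)) !== null) {
  // Guard against an infinite loop on zero-length matches.
  if (m.index === re.lastIndex) {
    re.lastIndex++;
  }
  // Group 1 holds a date, group 2 holds a content block; only one is set per match.
  result.push(m[1] !== undefined ? m[1] : m[2]);
}
// result now holds the dates and content blocks, alternating.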
vks
  • Thanks for the great answer! Does the "?:" apply to "(?!\n*\d{1,2}\/\d{1,2}\/\d{4})" only, or does it apply to "((?!\n*\d{1,2}\/\d{1,2}\/\d{4})[\s\S])+" ? The syntax tells me the latter, but then I don't understand how the content like "I did something" gets matched, because "?:" means to not capture the match. – dmonopoly Apr 18 '15 at 00:30
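
On the `?:` question in the comment above: in that pattern, `?:` belongs to the group that wraps both the lookahead and `[\s\S]`. A non-capturing group still matches and consumes text; it only skips creating a numbered group. The plain parentheses around it are what capture the content. A small sketch with made-up input and illustrative variable names:

```javascript
var sample = 'abcabc';

// Outer (...) captures; inner (?:...) only groups so that + can repeat it.
var withNonCapturing = /((?:abc)+)/.exec(sample);
// withNonCapturing[1] is 'abcabc', and there is no group 2

// With a capturing inner group, an extra group holds the last repetition.
var withCapturing = /((abc)+)/.exec(sample);
// withCapturing[1] is 'abcabc', withCapturing[2] is 'abc'
```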
1

You can use the exec() method in a loop to get your desired results.

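// str is assumed to hold the input text from the question.
var re = /^([\d/]+)\s*((?:(?!\s*^[\d/]+)[\S\s])+)/gm,
    matches = [],
    m;

// Each iteration yields one date (group 1) and its content (group 2).
while ((m = re.exec(str)) !== null) {
  matches.push(m[1]);
  matches.push(m[2]);
}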

Output

[ '1/12/2011',
  'I did something.',
  '10/5/2013',
  'I did something else.\n\nHere is another line.\n\nAnd another.',
  '5/17/2014',
  'Lalala.\nMore text on another line.' ]

eval.in
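
As a different technique from the exec() loop above, `String.prototype.split` with a capturing group also keeps the captured dates in the result. This is only a sketch: it assumes every date sits alone on its own line, and the names `parts` and `result` are illustrative.

```javascript
var str = '1/12/2011\nI did something.\n\n10/5/2013\nI did something else.\n\nHere is another line.\n\nAnd another.\n\n5/17/2014\nLalala.\nMore text on another line.';

// Splitting on a capturing group keeps the captured dates in the output array.
var parts = str.split(/^(\d{1,2}\/\d{1,2}\/\d{4})\n/m);

// Trim the blank lines around each piece and drop the empty piece before the first date.
var result = parts
  .map(function (s) { return s.trim(); })
  .filter(function (s) { return s.length > 0; });
```

Note that the filter step would also drop the content entry of a date that has no text under it, which would break the date/content alternation for such input.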

hwnd
  • Could I get some explanation on the regex - how does the [\d/]+ work? And it seems a key idea is to use [\S\s]... what is this exactly? Or just key points about the regex in general, since I would like to understand it instead of just plainly copying it. I will look into the ?: and ?! - non-capturing group & negative lookahead I think... those are key ideas I wasn't considering when trying to write my own regex. Thanks! – dmonopoly Apr 17 '15 at 15:21
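
A rough sketch of the three pieces asked about in the comment above, each shown in isolation. The inputs and variable names are made up for illustration and are not part of the answer.

```javascript
// [\d/]+ : one or more characters that are each a digit or a slash.
var dateLike = /^[\d/]+/.exec('5/17/2014\nLalala.');
// dateLike[0] is '5/17/2014'

// . stops at a newline; [\S\s] (non-whitespace or whitespace) matches any character.
var dotMatch = /.+/.exec('a\nb');      // dotMatch[0] is 'a'
var anyMatch = /[\S\s]+/.exec('a\nb'); // anyMatch[0] is 'a\nb'

// (?!X)[\S\s] : take one character only if X does not start at this position.
// Repeating that consumes text up to, but not including, the next X.
var upToStop = /(?:(?!STOP)[\S\s])+/.exec('line 1\nline 2STOPrest');
// upToStop[0] is 'line 1\nline 2'
```

`[\d/]+` is a loose date check: it would also accept strings such as `///` or `123`, so the stricter `\d{1,2}\/\d{1,2}\/\d{4}` from the other answer may be preferable if the text can contain such lines.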