
So here's what I want to do:

Input string: "abc From: blah". I want to split this so that the result is

["abc" "From: blah"] or ["abc" "From" "blah"]

I have several other patterns to match, e.g. ["abcd" "To:" "blah"].

So I have the following regex

val datePattern = """((.*>.*)|(.*(On).*(wrote:)$)|(.*(Date):.*(\+\d\d\d\d)?$)|(.*(From):.*(\.com)?(\]|>)?$))"""
val reg = datePattern.r

If I do a match the result comes out fine. If I do a split on the same regex I get an empty list.

inputStr match {
  case reg(_*) => "Match"
  case _ => "Output: None"
}

on the input string :

"abc From: blah blah"

returns Match

Split

inputStr.split(datePattern)

returns an empty array. What am I missing?

Pradeep Banavara

1 Answer


Since the regex matches the entire string, split treats the whole string as a single separator and removes it.
By default, split does not return the two empty strings on either side of that separator. It discards trailing empty strings, so you get an empty array, as the split specification states.
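A minimal sketch of that behavior, using a simplified stand-in for the question's pattern (only the From: alternative; the names fromPattern, pieces and withEmpties are illustrative, not from the question). It assumes the standard java.lang.String.split that Scala strings use, where a negative limit keeps trailing empty strings:

```scala
// Simplified stand-in for the question's pattern (assumption: only the From: alternative).
val fromPattern = """.*(From):.*$"""
val input = "abc From: blah blah"

// The pattern matches the whole input, so the whole input is one separator.
// With the default limit, the empty strings around it are discarded.
val pieces = input.split(fromPattern)

// A negative limit keeps empty strings, exposing the two pieces
// that sit before and after the separator.
val withEmpties = input.split(fromPattern, -1)

println(pieces.length)      // 0
println(withEmpties.length) // 2
```

Both remaining pieces are empty because nothing in the input lies outside the match.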

https://stackoverflow.com/a/14602089/1287856

As for why your regex matches the string in its entirety, you might find this website useful (it covers your example directly):

https://regex101.com/r/zY0lX9/1
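The same effect can be checked from Scala. This is a small sketch under assumptions: the pattern is a simplified version of the question's From: alternative, and the value names greedy, bare, greedyMatch and bareMatch are made up for illustration. It compares what findFirstIn reports with and without the surrounding .*:

```scala
val input = "abc From: blah blah"

// Simplified version of the question's From: alternative (assumption).
val greedy = """.*(From):.*$""".r
// The bare keyword, with no .* around it.
val bare = """From:""".r

// The leading .* consumes "abc " and the trailing .* consumes the rest,
// so the reported match is the entire input, not just "From:".
val greedyMatch = greedy.findFirstIn(input)

// Without the surrounding .*, only the keyword itself is matched.
val bareMatch = bare.findFirstIn(input)

println(greedyMatch) // Some(abc From: blah blah)
println(bareMatch)   // Some(From:)
```

The .* on each side is part of the match, which is why a split on such a pattern has nothing left over to return.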

Split finds every match of the whole regex and removes all of those occurrences from the string, returning the pieces in between as an array. You may want to split on something like "(?=From:)", a zero-width lookahead, so that it does not remove anything.
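A short sketch of the lookahead approach. The first split is the one suggested above. The second pattern, headerSplit, is a hypothetical extension that also consumes the whitespace before the keyword and covers a few header names taken from the question; adjust the list to your own patterns:

```scala
val input = "abc From: blah"

// Zero-width lookahead: splits just before "From:" and consumes nothing,
// so the left-hand piece keeps its trailing space.
val keepAll = input.split("(?=From:)")

// Hypothetical extension: consume the whitespace before the keyword
// and accept several header names.
val headerSplit = """\s+(?=(?:From|To|Date):)"""
val trimmed = input.split(headerSplit)
val other = "abcd To: blah".split(headerSplit)

println(keepAll.mkString("[", "|", "]")) // [abc |From: blah]
println(trimmed.mkString("[", "|", "]")) // [abc|From: blah]
println(other.mkString("[", "|", "]"))   // [abcd|To: blah]
```

Because the lookahead itself matches no characters, the keyword stays attached to the piece that follows it.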

Mikaël Mayer
  • Thank you. That was very helpful. I'm still a bit stumped as to why "abc From: def" is matched in its entirety. I'm expecting only the 'From:' substring to match, as evidenced by the regex. So shouldn't I get "abc" "From: def" or something like that upon splitting? – Pradeep Banavara Dec 03 '15 at 15:33