
If I have a List of type String,

scala> val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
items: List[java.lang.String] = List(Apple, Banana, Orange, Tomato, Grapes, BREAK, Salt, Pepper, BREAK, Fish, Chicken, Beef)

how can I split it into n separate lists based on a certain string/pattern ("BREAK", in this case)?

I've thought about finding the position of "BREAK" with indexOf and splitting the list that way, or using a similar approach with takeWhile (i => i != "BREAK"), but I'm wondering if there's a better way.

If it helps, I know there will only ever be 3 sets of items in the items list (thus 2 "BREAK" markers).
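
A minimal sketch of the takeWhile-style idea, assuming exactly two "BREAK" markers as described above. It uses span (takeWhile and dropWhile in one call); the names first, second, third, rest1 and rest2 are only illustrative:

```scala
val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")

// span splits at the first element that fails the predicate,
// so the second half still starts with the "BREAK" marker
val (first, rest1) = items.span(_ != "BREAK")

// drop(1) discards the marker before looking for the next one
val (second, rest2) = rest1.drop(1).span(_ != "BREAK")

// whatever follows the second marker is the third group
val third = rest2.drop(1)
```

This only works for a fixed number of groups, which is why a general splitting function is more useful.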

Tomer Shetah
jbnunn

6 Answers

def splitBySeparator[T](l: List[T], sep: T): List[List[T]] = {
  l.span( _ != sep ) match {
    case (hd, _ :: tl) => hd :: splitBySeparator(tl, sep)
    case (hd, _) => List(hd)
  }
}
val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
splitBySeparator(items, "BREAK")

Result:

res1: List[List[String]] = List(List(Apple, Banana, Orange, Tomato, Grapes), List(Salt, Pepper), List(Fish, Chicken, Beef))

UPDATE: The above version, while concise and effective, has two problems: it does not handle edge cases well (such as List("BREAK") or List("BREAK", "Apple", "BREAK")), and it is not tail-recursive. So here is another (imperative) version that fixes this:

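import scala.collection.mutable.ListBuffer

def splitBySeparator[T](l: Seq[T], sep: T): Seq[Seq[T]] = {
  // Start with one empty group; each separator opens a new group
  val b = ListBuffer(ListBuffer[T]())
  l foreach { e =>
    if (e == sep) {
      // Only open a new group if the current one already has elements
      if (b.last.nonEmpty) b += ListBuffer[T]()
    }
    else b.last += e
  }
  // Convert to immutable lists so the result also type-checks on Scala 2.13+
  b.map(_.toList).toList
}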

It internally uses a ListBuffer, much like the implementation of List.span that I used in the first version of splitBySeparator.
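
For comparison, a tail-recursive variant is also possible without mutable state. This is only a sketch: splitBySeparatorTailRec is a hypothetical name, and it assumes that empty groups (from leading, trailing or consecutive separators) should be dropped entirely, which differs slightly from the imperative version above:

```scala
import scala.annotation.tailrec

// Hypothetical name; assumes empty groups should never appear in the result
def splitBySeparatorTailRec[T](l: List[T], sep: T): List[List[T]] = {
  @tailrec
  def loop(rest: List[T], acc: List[List[T]]): List[List[T]] =
    rest.span(_ != sep) match {
      case (Nil, Nil)     => acc.reverse          // nothing left
      case (Nil, _ :: tl) => loop(tl, acc)        // empty group: skip the separator
      case (hd, Nil)      => (hd :: acc).reverse  // last group, no trailing separator
      case (hd, _ :: tl)  => loop(tl, hd :: acc)  // keep the group, continue after the separator
    }
  loop(l, Nil)
}
```

Groups are prepended to the accumulator and reversed once at the end, so each step stays cheap.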

Tomer Shetah
Régis Jean-Gilles

Another option:

val l = Seq(1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5)

l.foldLeft(Seq(Seq.empty[Int])) {
  (acc, i) =>
    if (i == 9) acc :+ Seq.empty
    else acc.init :+ (acc.last :+ i)
}

// produces:
List(List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5))
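
Note that Seq(...) builds a List by default, where init, last and :+ are all linear in the length of the list, so the fold above does more work per step as the result grows. A sketch of the same fold backed by Vector, where those operations are effectively constant time; splitFold is a hypothetical name:

```scala
// Hypothetical helper: the same fold, generalized over the element type
// and accumulating into Vectors instead of Lists
def splitFold[T](xs: Seq[T], sep: T): Vector[Vector[T]] =
  xs.foldLeft(Vector(Vector.empty[T])) { (acc, x) =>
    if (x == sep) acc :+ Vector.empty[T]  // a separator opens a new, empty group
    else acc.init :+ (acc.last :+ x)      // otherwise append to the current group
  }
```

As in the original fold, a trailing separator leaves an empty group at the end of the result.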
Tomer Shetah
Ryan LeCompte

How about this: use scan to figure out which section every element in the list belongs to.

val l = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
val count = l.scanLeft(0) { (n, s) => if (s=="BREAK") n+1 else n } drop(1)
val paired = l zip count
(0 to count.last) map { sec => 
  paired flatMap { case (x, c) => if (c==sec && x!="BREAK") Some(x) else None }  
}
// Vector(List(Apple, Banana, Orange, Tomato, Grapes), List(Salt, Pepper), List(Fish, Chicken, Beef))
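
To make the section numbering easier to follow, here is the same idea on a smaller list with the intermediate values spelled out. The names small, sections, tagged and bySection are only illustrative:

```scala
val small = List("a", "BREAK", "b", "c", "BREAK", "d")

// Running count of separators seen so far; drop(1) removes the initial seed 0
val sections = small.scanLeft(0) { (n, s) => if (s == "BREAK") n + 1 else n }.drop(1)
// sections: List(0, 1, 1, 1, 2, 2)

// Pair every element with the section it belongs to
val tagged = small.zip(sections)
// tagged: List((a,0), (BREAK,1), (b,1), (c,1), (BREAK,2), (d,2))

// Collect the non-separator elements of each section in turn
val bySection = (0 to sections.last).map { sec =>
  tagged.collect { case (x, c) if c == sec && x != "BREAK" => x }
}
```

Each "BREAK" is tagged with the section it opens, which is why it has to be filtered out again when collecting.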
Kane

This is not tail-recursive either, but it does ok with edge cases:

def splitsies[T](l:List[T], sep:T) : List[List[T]] = l match {
  case head :: tail =>
    if (head != sep)
      splitsies(tail,sep) match {
        case h :: t => (head :: h) :: t
        case Nil => List(List(head))
      }
    else
      List() :: splitsies(tail, sep)
  case Nil => List()
}

The only annoying thing:

scala> splitsies(List("BREAK","Tiger"),"BREAK")
res6: List[List[String]] = List(List(), List(Tiger))

If you want to handle lists that start with a separator better, consider an approach similar to the use of span in Martin's answer (to a slightly different question).
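
One way to avoid the empty leading list is to skip separators before taking each group. A minimal sketch, assuming empty groups are never wanted; splitSkippingSeparators is a hypothetical name, and this is not the code from Martin's answer:

```scala
// Hypothetical helper: drop any run of separators, then take the next group with span
def splitSkippingSeparators[T](l: List[T], sep: T): List[List[T]] =
  l.dropWhile(_ == sep) match {
    case Nil => Nil  // only separators (or nothing) left
    case trimmed =>
      val (group, rest) = trimmed.span(_ != sep)
      group :: splitSkippingSeparators(rest, sep)
  }
```

Like splitsies, this is not tail-recursive, so very long lists with many groups could still exhaust the stack.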

Community
Francois G

Using List.unfold (Scala 2.13 and above):

val p: String => Boolean = _ != "BREAK"

val result: List[List[String]] = List.unfold(items) {
  case Nil =>
    None
  case l if p(l.head) =>
    Some(l.span(p))
  case _ :: tail =>
    Some(tail.span(p))
}

Code run at Scastie.
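
If unfold is unfamiliar: it builds a list from an initial state by repeatedly applying a function that returns either None (stop) or Some((element, nextState)). A tiny illustration, unrelated to the splitting problem and requiring Scala 2.13 or later; oneToFive is just an example name:

```scala
// Count from 1 to 5: the state is the next number to emit
val oneToFive: List[Int] = List.unfold(1) { n =>
  if (n > 5) None        // returning None ends the list
  else Some((n, n + 1))  // emit n, continue with state n + 1
}
```

In the splitting code above, the state is the remaining list and each emitted element is one group, which is exactly the pair that span returns.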

Using reverse + foldLeft:

def splitAtElement[T](list: List[T], element: T): List[List[T]] = {
  list.reverse.foldLeft(List(List[T]()))((l, currentElement) => {
    if (currentElement == element) {
      List() :: l
    } else {
      (currentElement :: l.head) :: l.tail
    }
  })
}

Code run at Scastie.

Using foldRight:

def splitBySeparator[T](list: List[T], sep: T): List[List[T]] = {
  list.foldRight(List(List[T]()))((s, l) => {
    if (sep == s) {
      List() :: l
    } else {
      (s :: l.head) :: l.tail
    }
  }).filter(_.nonEmpty)
}

Code run at Scastie.

Tomer Shetah

val q = items.mkString(",").split("BREAK").map("(^,|,$)".r.replaceAllIn(_, "")).map(_.split(","))

Here "," is a unique separator that does not appear in any of the strings in the items list. We could choose a different separator if needed.

items.mkString(",") combines everything into a string

.split("BREAK") // which we then split using "BREAK" as delimiter to get a list

.map("(^,|,$)".r.replaceAllIn(_, "")) // removes the leading/trailing commas of each element of the list in previous step

.map(_.split(",")) // splits each element using comma as separator to give an array of arrays


scala> val q = items.mkString(",").split("BREAK").map("(^,|,$)".r.replaceAllIn(_, "")).map(_.split(","))
q: Array[Array[String]] = Array(Array(Apple, Banana, Orange, Tomato, Grapes), Array(Salt, Pepper), Array(Fish, Chicken, Beef))

scala> q(0)
res21: Array[String] = Array(Apple, Banana, Orange, Tomato, Grapes)

scala> q(1)
res22: Array[String] = Array(Salt, Pepper)

scala> q(2)
res23: Array[String] = Array(Fish, Chicken, Beef)
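
The assumption that "," never appears in the items matters. A small sketch of what happens when it does not hold; the list withComma and its contents are made up for illustration:

```scala
// Hypothetical input where one item contains the joining separator
val withComma = List("Salt, coarse", "BREAK", "Fish")

val q2 = withComma.mkString(",")
  .split("BREAK")
  .map(s => "(^,|,$)".r.replaceAllIn(s, ""))  // strip leading/trailing commas
  .map(_.split(","))

// Convert to Lists so the groups can be compared with ==
val groups = q2.map(_.toList).toList
// "Salt, coarse" has been split into two elements: List(List(Salt,  coarse), List(Fish))
```

In that situation a different joining separator is needed, or one of the list-based approaches that never goes through a string.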
Jonas Czech