9

Say I have a string

"Hello! How do you do? Good day!"

and I want to split it using `?` and `!` as delimiters. With the `split` function, the result would be:

`[Hello, How do you do, Good day]`

However, I want it to be:

`[Hello, !, How do you do, ?, Good day, !]`
Towfik Alrazihi
Dina Kleper

4 Answers

13

Here is a similar question in Java: How to split a string, but also keep the delimiters?

Use lookahead and lookbehind. In Kotlin, the code may look like this:

fun main(args: Array<String>) {
    val str = "Hello! How do you do? Good day!"

    val reg = Regex("(?<=[!?])|(?=[!?])")

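    // Trim each piece and drop empty ones: the raw split keeps the space after
    // each delimiter and yields a trailing "" when the string ends with a delimiter.
    val list = str.split(reg).map { it.trim() }.filter { it.isNotEmpty() }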

    println(list)
}

The output of this is:

[Hello, !, How do you do, ?, Good day, !]
Cactus
zhumengzhu
  • 1
    This solution seems to create empty list entries if the string starts or ends with the split pattern. Also, it does not seem to work with a split pattern like `\d+` to match multiple digits. – sschuberth May 10 '22 at 15:14
1

This is my version of such a function:

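// Splits on a single delimiter and keeps it as a separate list entry.
fun String.splitKeeping(str: String): List<String> {
    return this.split(str).flatMap { listOf(it, str) }.dropLast(1).filterNot { it.isEmpty() }
}

// Applies the single-delimiter version once per delimiter, re-splitting the earlier pieces.
fun String.splitKeeping(vararg strs: String): List<String> {
    var res = listOf(this)
    strs.forEach { str ->
        res = res.flatMap { it.splitKeeping(str) }
    }
    return res
}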

//USAGE:
"Hello! How do you do? Good day!".splitKeeping("!", "?")

It is not very fast (quadratic complexity), but it works well for relatively short strings.

voddan
  • 1
    Any chance you can come up with a function that would do that but for regular-expressions delimiters? – Dina Kleper May 10 '16 at 08:08
  • @user3601872 This is a piece of code from one of my past projects. If you can improve it, feel free to post it here, so I could benefit from it too. – voddan May 10 '16 at 08:12
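For regular-expression delimiters, as asked in the comment above, one possible approach is to walk the matches with `Regex.findAll` and collect both the text between matches and the matches themselves. This is only a sketch: `splitKeepingMatches` is a hypothetical name, not part of the standard library or of any answer here. Because it does not rely on lookbehind, it should also cope with variable-width patterns such as `\d+`, and it skips empty pieces at the start or end of the string.

```kotlin
// Hypothetical helper (not from the answers above): splits on every regex match
// and keeps each matched delimiter as its own list entry.
fun String.splitKeepingMatches(delimiter: Regex): List<String> {
    val parts = mutableListOf<String>()
    var last = 0 // index just past the previous delimiter
    for (match in delimiter.findAll(this)) {
        if (match.range.isEmpty()) continue // ignore zero-width matches
        // Text between the previous delimiter and this one, if any.
        if (match.range.first > last) parts.add(substring(last, match.range.first))
        parts.add(match.value) // the delimiter itself
        last = match.range.last + 1
    }
    // Remaining text after the final delimiter, if any.
    if (last < length) parts.add(substring(last))
    return parts
}

fun main() {
    println("Hello! How do you do? Good day!".splitKeepingMatches(Regex("[!?]")))
    println("abc123def45".splitKeepingMatches(Regex("\\d+"))) // [abc, 123, def, 45]
}
```

Note that the pieces are not trimmed, so the space after each delimiter stays in the following entry; add `.map { it.trim() }` if that is not wanted.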
1

Here's an extension function wrapping the code discussed here:

private const val withDelimiter = "((?<=%1\$s)|(?=%1\$s))"

fun Regex.splitWithDelimiter(input: CharSequence) = 
    Regex(withDelimiter.format(this.pattern)).split(input)
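
A possible way to call it on the string from the question is sketched below. The extension is repeated so the snippet runs on its own; the `trim` and `filter` steps are an addition of this sketch, used to drop the space after each delimiter and the empty entry produced when the input ends with a delimiter.

```kotlin
private const val withDelimiter = "((?<=%1\$s)|(?=%1\$s))"

fun Regex.splitWithDelimiter(input: CharSequence) =
    Regex(withDelimiter.format(this.pattern)).split(input)

fun main() {
    val parts = Regex("[!?]")
        .splitWithDelimiter("Hello! How do you do? Good day!")
        .map { it.trim() }           // remove the space that follows each delimiter
        .filter { it.isNotEmpty() }  // drop the empty entry at the end
    println(parts) // [Hello, !, How do you do, ?, Good day, !]
}
```

Since the pattern is placed inside a lookbehind, this only suits delimiters whose match length is bounded, such as a character class.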
s1m0nw1
0

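Create a new extension modeled on the standard library's string-delimiter `split`, with one simple modification: each delimiter is also added to the result.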

private fun CharSequence.splitWithDelimiters(delimiter: String, ignoreCase: Boolean = false, limit: Int = 0): List<String> {
    require(limit >= 0) { "Limit must be non-negative, but was $limit" }

    var currentOffset = 0
    var nextIndex = indexOf(delimiter, currentOffset, ignoreCase)
    if (nextIndex == -1 || limit == 1) {
        return listOf(this.toString())
    }

    val isLimited = limit > 0
    val result = ArrayList<String>(if (isLimited) limit.coerceAtMost(10) else 10)
    do {
        result.add(substring(currentOffset, nextIndex))
        // Adding delimiter(s) 
        result.add(substring(nextIndex, nextIndex + delimiter.length))
        currentOffset = nextIndex + delimiter.length
        // Do not search for next occurrence if we're reaching limit
        if (isLimited && result.size == limit - 1) break
        nextIndex = indexOf(delimiter, currentOffset, ignoreCase)
    } while (nextIndex != -1)

    result.add(substring(currentOffset, length))
    return result
}
neteinstein