I'm trying to split the text of a string into lines no longer than 72 characters, the usual Usenet quoting line length. Each break should replace a space with a newline, choosing the space closest to the limit so that every line is <= 72 characters.

The text is held in a String and may also contain emoji or other symbols.

I have tried different approaches, but because I cannot split a word and must break the text only at a space, I have not found a solution yet.

Does anyone know how to achieve this in Swift? A regular expression would also be fine if needed.

Cue
  • Duplicate of https://stackoverflow.com/questions/19600372/how-do-i-get-word-wrap-information-with-the-new-ios-7-apis ? Just pick a width and let the text system wrap for you. – matt Oct 14 '18 at 11:09
  • Thank you for the comment, Matt, very useful. As a beginner, I don't understand Objective-C very well, and I can't tell which method is used in the example provided. Would you have a pointer for Swift? Thanks in advance. – Cue Oct 14 '18 at 11:28
  • Do you want a hard-break (make every line exactly 72 characters, even if it breaks a word onto multiple lines) or break-by-space (choose the closest space so that every line is <= 72 characters)? – Code Different Oct 14 '18 at 11:45
  • Hi Code Different, I would like break-by-space. – Cue Oct 14 '18 at 11:48
  • Well, it doesn't matter whether you know Objective-C. The linked code is still what I suggest you do. It's a waste of your time to line-break manually at a certain width when Text Kit already knows exactly how to do that for you. So just let Text Kit wrap for you, and then ask it where the line fragments are. – matt Oct 14 '18 at 11:52
  • It would help a lot if you explained _why_ you think you need to do this. If you just display the text in a label, it will wrap _automatically_. So why do _you_ need to line-break it yourself? – matt Oct 14 '18 at 11:53
  • Hi Matt, thank you. I need this because I would like to break lines to the usual Usenet quoting line length. – Cue Oct 14 '18 at 11:56
  • Oh, I see. Well, I still think the linked question and answer is the sort of thing you need. If you ask Text Kit to draw your text wrapped, in a monospace font, at a width known to be 72 times the character width, it will automatically wrap just the way you want, and then you can find out the range of text of each line and insert a return character at the start of each range. – matt Oct 14 '18 at 12:03

1 Answer

In many other languages you can index a string with an integer. Not so in Swift: you have to go through String.Index values derived from the string itself, which can be a pain in the neck if you are not familiar with them.
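As a rough illustration of that point (the sample text and variable names here are made up for the example), this is what reaching a character by offset looks like with String.Index. Offsets count Characters, i.e. grapheme clusters, so an emoji counts as one:

```swift
let text = "Héllo 👋 world"

// text[6] does not compile: String cannot be subscripted with an Int.
// Walk from startIndex to obtain a String.Index instead.
let seventh = text.index(text.startIndex, offsetBy: 6)
print(text[seventh])        // 👋

// Slicing with a partial range of indices yields a Substring.
let firstSeven = text[..<text.index(text.startIndex, offsetBy: 7)]
print(firstSeven.count)     // 7

// The limitedBy: variant returns nil instead of trapping past the end.
let tooFar = text.index(text.startIndex, offsetBy: 100, limitedBy: text.endIndex)
print(tooFar == nil)        // true
```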

Try this:

// Wraps a single line (no embedded line breaks expected) so that no piece
// holds more than n characters.
private func split(line: Substring, byCount n: Int, breakableCharacters: [Character]) -> String {
    var line = String(line)
    var lineStartIndex = line.startIndex

    while line.distance(from: lineStartIndex, to: line.endIndex) > n {
        let maxLineEndIndex = line.index(lineStartIndex, offsetBy: n)

        if breakableCharacters.contains(line[maxLineEndIndex]) {
            // If line terminates at a breakable character, replace that character with a newline
            line.replaceSubrange(maxLineEndIndex...maxLineEndIndex, with: "\n")
            lineStartIndex = line.index(after: maxLineEndIndex)
        } else if let index = line[lineStartIndex..<maxLineEndIndex].lastIndex(where: { breakableCharacters.contains($0) }) {
            // Otherwise, find the last breakable character between lineStartIndex and maxLineEndIndex
            line.replaceSubrange(index...index, with: "\n")
            // Continue after the newline so it does not count toward the next line's length
            lineStartIndex = line.index(after: index)
        } else {
            // Finally, forcibly break a word
            line.insert("\n", at: maxLineEndIndex)
            lineStartIndex = line.index(after: maxLineEndIndex)
        }
    }

    return line
}

func split(string: String, byCount n: Int, breakableCharacters: [Character] = [" "]) -> String {
    precondition(n > 0)
    guard !string.isEmpty && string.count > n else { return string }

    var string = string
    var startIndex = string.startIndex

    repeat {
        // Find the end of the current line: the next newline, or the end of the string.
        var endIndex = string[startIndex...].firstIndex(of: "\n") ?? string.endIndex
        if string.distance(from: startIndex, to: endIndex) > n {
            let wrappedLine = split(line: string[startIndex..<endIndex], byCount: n, breakableCharacters: breakableCharacters)
            string.replaceSubrange(startIndex..<endIndex, with: wrappedLine)
            endIndex = string.index(startIndex, offsetBy: wrappedLine.count)
        }

        // Step over the newline itself so it is not counted as part of the next line.
        startIndex = endIndex < string.endIndex ? string.index(after: endIndex) : endIndex
    } while startIndex < string.endIndex
    return string
}

let str1 = "Iragvzvyn vzzntvav chooyvpngr fh Vafgntenz r pv fbab gnagvffvzv nygev unfugnt, qv zvabe fhpprffb, pur nttertnab vzzntvav pba y’vzznapnovyr zntyvrggn"
let str2 = split(string: str1, byCount: 72)
print(str2)

Edit: this turned out to be more complicated than I thought. The updated answer improves on the original by processing the text line by line. You may ask why I wrote my own loop to find the lines instead of simply splitting the string on "\n". The reason is to preserve blank lines: split(separator: "\n") omits empty subsequences by default, so consecutive blank lines would collapse into one. (components(separatedBy: "\n") does keep the empty strings, so it would also work if you rejoin the pieces with "\n".)
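As a quick check of how the standard splitting APIs treat blank lines, here is a small sketch (the sample text and variable names are made up for the example). It compares Foundation's components(separatedBy:) with the standard library's split(separator:), with and without omittingEmptySubsequences:

```swift
import Foundation

let text = "first\n\n\nsecond"

// components(separatedBy:) keeps the empty strings between consecutive newlines.
let viaComponents = text.components(separatedBy: "\n")

// split(separator:) omits empty subsequences unless told otherwise.
let viaSplit = text.split(separator: "\n")
let viaSplitKeepingEmpty = text.split(separator: "\n", omittingEmptySubsequences: false)

print(viaComponents.count)          // 4
print(viaSplit.count)               // 2
print(viaSplitKeepingEmpty.count)   // 4

// Rejoining the pieces that kept the empty entries reproduces the original text.
print(viaComponents.joined(separator: "\n") == text)   // true
```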

Code Different
  • Thank you Code Different, a truly complete solution to the problem. – Cue Oct 14 '18 at 14:08
  • Hi Code Different, your function works fine the first time I use it, but I noticed that if I run it again on already processed text, it processes the text again. I also tested it in Playgrounds with the output from your post, and it splits the already shortened lines. – Cue Oct 14 '18 at 22:55
  • Hm... good question. It doesn't consider already existing line breaks. You can split the string by new line, then process each line individually. I'm away from a Mac right now. Will take a look later – Code Different Oct 14 '18 at 23:01
  • Thank you for the advice. I will process each line individually. BTW, I saw that in some cases the last line, after processing, can have more than 72 characters. For example, after processing this text, the last line has 79 characters: "Iragvzvyn vzzntvav chooyvpngr fh Vafgntenz r pv fbab gnagvffvzv nygev unfugnt, qv zvabe fhpprffb, pur nttertnab vzzntvav pba y’vzznapnovyr zntyvrggn." What could be the cause? – Cue Oct 16 '18 at 09:17
  • Updated my answer to make it more robust – Code Different Oct 16 '18 at 12:38
  • It works perfectly, very nice code. Thanks for your help. – Cue Oct 19 '18 at 02:06
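
The suggestion from the comments (split the text by newline, then process each line individually) can also be sketched with a simpler greedy word wrap. This is not the answer's code: wrap(_:width:) is a hypothetical helper written for illustration, and it makes two simplifying assumptions. A word longer than the width is left unbroken on its own line, and runs of spaces are collapsed to a single space.

```swift
// Hypothetical helper: greedy word wrap that respects existing line breaks.
// Assumptions: overlong words are not broken; repeated spaces are collapsed.
func wrap(_ text: String, width: Int) -> String {
    precondition(width > 0)
    var output: [String] = []

    // Keep empty subsequences so that blank lines survive.
    for line in text.split(separator: "\n", omittingEmptySubsequences: false) {
        var current = ""
        for word in line.split(separator: " ") {
            if current.isEmpty {
                current = String(word)
            } else if current.count + 1 + word.count <= width {
                // The word still fits on the current line, after one space.
                current += " "
                current.append(contentsOf: word)
            } else {
                // The word does not fit: close the line and start a new one.
                output.append(current)
                current = String(word)
            }
        }
        output.append(current)
    }
    return output.joined(separator: "\n")
}

let sample = "aaa bbb ccc ddd\n\neee"
let once = wrap(sample, width: 7)
let twice = wrap(once, width: 7)
print(once == "aaa bbb\nccc ddd\n\neee")   // true
print(once == twice)                       // true
```

Because every existing line is wrapped on its own, running the function a second time on its own output leaves the text unchanged, and lengths are counted in Characters, so an emoji counts as one.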