
I'm looking for an existing Swift 2 function that splits a string on whitespace while preserving whitespace inside quoted substrings.

I have read Stack Overflow question 25678373. My question does not appear to be a duplicate.

I also searched CocoaPods for similar functionality and did not find it.

If no equivalent of shlex.split exists in Swift 2, what is an effective alternative way to split a string while preserving whitespace inside quoted substrings?

Here's an example of what I mean in Python:

$ python
Python 2.7.6 (default, Jun 22 2015, 18:00:18) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> input=""" alpha 2 'chicken with teeth' 4 'cat with wings' 6 turkey"""
>>> results = shlex.split(input)
>>> type(results)
<type 'list'>
>>> results[0]
'alpha'
>>> results[2]
'chicken with teeth'
>>> for term in results:
...     print(term)
... 
alpha
2
chicken with teeth
4
cat with wings
6
turkey
>>> 
StandardEyre
  • `I'm looking for an existing swift2 function to [...]` There is no such function in the Swift standard library. – Eric Aya Apr 03 '16 at 20:29
  • BTW, the [source code for shlex.split](http://scons.org/doc/1.3.1/HTML/scons-api/SCons.compat._scons_shlex-pysrc.html) is rather complex; it uses a lexical parser to achieve this task. – Eric Aya Apr 04 '16 at 10:37

2 Answers


As @EricD notes in his comment, no such native Swift function exists. You can, however, quite readily write your own split function, e.g.

import Foundation   /* needed for componentsSeparatedByString and containsString */

extension String {

    func shlexSplit() -> [String] {
        /* separate words by spaces */
        var bar = self.componentsSeparatedByString(" ")

        /* identify array idx ranges of quoted (') sets of words */
        var accumulating = false
        var from = 0
        var joinSegments : [(Int, Int)] = []

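        for (i, str) in bar.enumerate() {
            /* a word with an odd number of quotes opens or closes a quoted run;
               a word such as 'single' holds both quotes and needs no joining */
            let quoteCount = str.characters.filter { $0 == "'" }.count
            if quoteCount % 2 == 1 {
                if accumulating { joinSegments.append((from, i)) }
                else { from = i }
                accumulating = !accumulating
            }
        }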

        /* join matching word ranges with " " */
        for (from, through) in joinSegments.reverse() {
            bar.replaceRange(from...through, 
                with: [bar[from...through].joinWithSeparator(" ")])
        }

        return bar
    }
}

Usage example

/* example usage */
let foo = "alpha 2 'chicken with teeth' 4 'cat with wings' 6 turkey"
let bar = foo.shlexSplit()

bar.forEach{ print($0) }
/* alpha
 2
 'chicken with teeth'
 4
 'cat with wings'
 6
 turkey */

Note that the above assumes that the input string has matching pairs of the quote delimiter ', and that the quotes themselves are kept in the output.
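
For comparison, here is a minimal sketch of a character-by-character scanner that behaves more like the Python example in the question: it treats any run of whitespace as a separator, removes the quote characters, and reports unbalanced quotes. The function name `shlexLikeSplit` is hypothetical, and the code assumes a current Swift toolchain (Swift 5 or later) rather than Swift 2. It also accepts double quotes and does not handle backslash escapes, so it is not a full port of shlex.split.

```swift
// Hypothetical helper, written for Swift 5 or later (not Swift 2).
// Splits on whitespace, keeps whitespace inside single or double quotes,
// and drops the quote characters. Returns nil if a quote is never closed.
func shlexLikeSplit(_ input: String) -> [String]? {
    var tokens: [String] = []
    var current = ""
    var hasToken = false              // true once a token has started, even an empty one such as ''
    var openQuote: Character? = nil   // the quote character we are currently inside, if any

    for ch in input {
        if let quote = openQuote {
            // inside quotes: everything except the closing quote is literal
            if ch == quote { openQuote = nil } else { current.append(ch) }
        } else if ch == "'" || ch == "\"" {
            openQuote = ch
            hasToken = true
        } else if ch.isWhitespace {
            // whitespace outside quotes ends the current token, if there is one
            if hasToken {
                tokens.append(current)
                current = ""
                hasToken = false
            }
        } else {
            current.append(ch)
            hasToken = true
        }
    }

    // an unterminated quote is reported rather than guessed at
    if openQuote != nil { return nil }
    if hasToken { tokens.append(current) }
    return tokens
}

if let parts = shlexLikeSplit(" alpha 2 'chicken with teeth' 4 'cat with wings' 6 turkey") {
    parts.forEach { print($0) }
}
```

Returning an optional is one possible design choice; throwing an error on an unclosed quote would work equally well.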

dfrib

A 'pure' Swift (no Foundation) example:

extension String {
    // split at the first occurrence of the character c; the tail is empty if c is absent
    func split(c: Character)->(String,String) {
        var head: String = "", tail: String = ""
        if let i = characters.indexOf(c) {
            let j = startIndex.distanceTo(i)
            head = String(characters.prefix(j))
            tail = String(characters.dropFirst(j + 1))
        } else {
            head = self
        }
        return (head, tail)
    }
}

// what you are looking for

func split(str: String)->[String] {
    // internal state: ((current segment, remainder), results so far, current segment is quoted)
    var state:((String,String), [String], Bool) = (str.split("'"), [], false)
    repeat {
        if !state.2 {
            // you can define more whitespace characters
            state.1
                .appendContentsOf(state.0.0.characters.split{" \t\n\r".characters.contains($0)}
                    .map(String.init))
            state.2 = true
        } else {
            state.1.append(state.0.0)
            state.2 = false
        }
        state.0 = state.0.1.split("'")
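    // the loop ends at the first empty segment, so an empty quoted string ('')
    // stops parsing and is not added to the result
    } while !state.0.0.isEmpty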
    return state.1
}

Usage

let str = "a 2  'b   c'   d  ''"
dump(split(str))
/*
 ▿ 4 elements
   - [0]: a
   - [1]: 2
   - [2]: b   c
   - [3]: d
 */
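
The same idea of alternating between unquoted and quoted segments can be sketched with the standard library's `split` on a current toolchain. This is only an illustration, assuming Swift 5 or later and balanced single quotes; the name `splitKeepingQuoted` is hypothetical. Unlike the loop above, it keeps empty quoted strings, because the empty pieces are preserved when splitting on the quote character.

```swift
// Hypothetical helper for Swift 5 or later; assumes balanced single quotes.
func splitKeepingQuoted(_ input: String) -> [String] {
    // Keep empty pieces so the even/odd position of each piece stays meaningful.
    let pieces = input.split(separator: "'", omittingEmptySubsequences: false)
    var result: [String] = []
    for (index, piece) in pieces.enumerated() {
        if index % 2 == 0 {
            // even positions lie outside quotes: split them on whitespace
            result += piece.split(whereSeparator: { $0.isWhitespace }).map(String.init)
        } else {
            // odd positions lie inside quotes: keep them verbatim, even when empty
            result.append(String(piece))
        }
    }
    return result
}

dump(splitKeepingQuoted("a 2  'b   c'   d  ''"))
```

With unbalanced quotes, the text after the last quote is treated as quoted, so input validation would be needed before relying on this.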
user3441734