347

From the documentation, it's not clear how to split a string in Rust. In Java you could use the split method like so:

"some string 123 ffd".split("123");
maxcountryman
Incerteza
  • 2
    https://doc.rust-lang.org/std/string/struct.String.html – Dulguun Otgon Oct 29 '16 at 02:10
  • @bow Is there a way to make it a String array instead of a vector? – Greg Nov 07 '17 at 07:25
  • I'm not aware of any way to do that, directly at least. You'd probably have to manually iterate over the `Split` and set it into the array. Of course this means the number of items in each split must be the same since arrays are fixed size and you have to have the array defined before. I imagine this may be more trouble than simply creating a `Vec`. – bow Nov 07 '17 at 07:37
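For the fixed-size case discussed in the comments above, one possible sketch is to collect into a `Vec` first and then convert it with `TryFrom`, which fails if the number of pieces does not match the array length. The helper name `split_into_three` and the comma separator are assumptions made for illustration only.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: split on commas and require exactly three fields.
fn split_into_three(line: &str) -> Option<[&str; 3]> {
    let parts: Vec<&str> = line.split(',').collect();
    // The conversion returns Err unless the Vec holds exactly 3 items.
    <[&str; 3]>::try_from(parts).ok()
}

fn main() {
    println!("{:?}", split_into_three("a,b,c"));
    println!("{:?}", split_into_three("a,b"));
}
```

This still allocates a `Vec` along the way, so it is mainly useful when the caller really needs an array type.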

7 Answers

404

Use split()

let parts = "some string 123 content".split("123");

This gives an iterator, which you can loop over, or collect() into a vector. For example:

for part in parts {
    println!("{}", part)
}

Or:

let collection = parts.collect::<Vec<&str>>();
dbg!(collection);

Or:

let collection: Vec<&str> = parts.collect();
dbg!(collection);
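As a complete program, the question's Java example might translate roughly as follows. Note that the pieces keep the whitespace around the separator; the `split_trimmed` helper is a hypothetical addition that trims each piece, not something from the standard library.

```rust
/// Hypothetical helper: split on a separator and trim whitespace from each piece.
fn split_trimmed<'a>(input: &'a str, sep: &str) -> Vec<&'a str> {
    input.split(sep).map(str::trim).collect()
}

fn main() {
    // Plain split: surrounding spaces are preserved in each piece.
    let raw: Vec<&str> = "some string 123 ffd".split("123").collect();
    println!("{:?}", raw);

    // Trimmed variant.
    println!("{:?}", split_trimmed("some string 123 ffd", "123"));
}
```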
Alan W. Smith
Manishearth
197

There are four simple ways:

  1. By separator:

     s.split("separator")  |  s.split('/')  |  s.split(char::is_numeric)
    
  2. By whitespace:

     s.split_whitespace()
    
  3. By newlines:

     s.lines()
    
  4. By regex: (using regex crate)

     Regex::new(r"\s").unwrap().split("one two three")
    

The result of each kind is an iterator:

let text = "foo\r\nbar\n\nbaz\n";
let mut lines = text.lines();

assert_eq!(Some("foo"), lines.next());
assert_eq!(Some("bar"), lines.next());
assert_eq!(Some(""), lines.next());
assert_eq!(Some("baz"), lines.next());

assert_eq!(None, lines.next());
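The variants differ in how they treat repeated separators and line endings, which is easy to miss. A small sketch (the helper names are made up for this example):

```rust
/// Split on a single space character; consecutive spaces yield empty pieces.
fn by_space(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

/// Split on runs of whitespace; never yields empty pieces.
fn by_whitespace(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

/// Split on '\n' only; a preceding '\r' stays attached to the piece.
fn by_newline_char(s: &str) -> Vec<&str> {
    s.split('\n').collect()
}

fn main() {
    println!("{:?}", by_space("a  b "));
    println!("{:?}", by_whitespace("a  b "));
    println!("{:?}", by_newline_char("foo\r\nbar\n"));
    println!("{:?}", "foo\r\nbar\n".lines().collect::<Vec<_>>());
}
```

So `split(' ')` is the right choice when empty fields are meaningful (for example in delimited data), while `split_whitespace()` and `lines()` are more convenient for free-form text.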
DenisKolodin
57

There is a split method on str, which is also available on String through deref:

fn split<'a, P>(&'a self, pat: P) -> Split<'a, P> where P: Pattern<'a>

Split by char (here, into words):

let v: Vec<&str> = "Mary had a little lamb".split(' ').collect();
assert_eq!(v, ["Mary", "had", "a", "little", "lamb"]);

Split by delimiter:

let v: Vec<&str> = "lion::tiger::leopard".split("::").collect();
assert_eq!(v, ["lion", "tiger", "leopard"]);

Split by closure:

let v: Vec<&str> = "abc1def2ghi".split(|c: char| c.is_numeric()).collect();
assert_eq!(v, ["abc", "def", "ghi"]);
Saurabh
Denis Kreshikhin
39

split returns an Iterator, which you can convert into a Vec using collect: split_line.collect::<Vec<_>>(). Going through an iterator instead of returning a Vec directly has several advantages:

  • split is lazy. This means that it won't really split the line until you need it. That way it won't waste time splitting the whole string if you only need the first few values: split_line.take(2).collect::<Vec<_>>(), or even if you need only the first value that can be converted to an integer: split_line.filter_map(|x| x.parse::<i32>().ok()).next(). This last example won't waste time attempting to process the "23.0" but will stop processing immediately once it finds the "1".
  • split makes no assumption on the way you want to store the result. You can use a Vec, but you can also use anything that implements FromIterator<&str>, for example a LinkedList or a VecDeque, or any custom type that implements FromIterator<&str>.
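Both points can be sketched in a few lines. The input `"x,1,23.0"` is an assumed example (chosen so that "1" comes before "23.0"), and the `inspect` counter is only there to show how many pieces were actually produced before the search stopped.

```rust
use std::collections::VecDeque;

/// Returns the first piece that parses as an i32, plus how many pieces
/// the iterator had to produce to find it.
fn first_int(line: &str) -> (Option<i32>, usize) {
    let mut seen = 0;
    let first = line
        .split(',')
        .inspect(|_| seen += 1) // count pieces as they are produced
        .filter_map(|x| x.parse::<i32>().ok())
        .next();
    (first, seen)
}

/// Collect into a VecDeque instead of a Vec; any FromIterator<&str> type works.
fn to_deque(line: &str) -> VecDeque<&str> {
    line.split(',').collect()
}

fn main() {
    // Assumed input: only "x" and "1" are produced, "23.0" is never reached.
    println!("{:?}", first_int("x,1,23.0"));
    println!("{:?}", to_deque("x,1,23.0"));
}
```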
Shepmaster
Jmb
  • 2
    Thank you for your detailed answer, any ideas why `let x = line.unwrap().split(",").collect::<Vec<_>>();` does not work unless it is separated into two separate lines: `let x = line.unwrap();` and `let x = x.split(",").collect::<Vec<_>>();`? The error message says: `temporary value created here ^ temporary value dropped here while still borrowed` – Greg Nov 07 '17 at 10:53
  • 1
    However it works as expected if I use `let x = line.as_ref().unwrap().split(",").collect::<Vec<_>>();` – Greg Nov 07 '17 at 10:58
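The behaviour described in these comments follows from the fact that the pieces produced by `split` borrow from the string being split. `line.unwrap()` yields an owned `String` that is only a temporary, so it is dropped at the end of the statement while the collected `&str` values still point into it. A minimal sketch, assuming `line` is an `io::Result<String>` as returned by something like `BufRead::lines` (the `fields` helper is hypothetical):

```rust
use std::io;

/// Borrow the String inside the Result instead of moving it out, so the
/// returned slices can live as long as `line` does.
fn fields(line: &io::Result<String>) -> Vec<&str> {
    line.as_ref().unwrap().split(',').collect()
}

fn main() {
    // Stand-in for a line read from a file or stdin.
    let line: io::Result<String> = Ok(String::from("a,b,c"));

    // Does not compile: the String from unwrap() is a temporary.
    // let x = line.unwrap().split(',').collect::<Vec<_>>();

    // Works: the String stays owned by `line`.
    println!("{:?}", fields(&line));

    // Also works: bind the owned String to a variable first.
    let owned = line.unwrap();
    let x: Vec<&str> = owned.split(',').collect();
    println!("{:?}", x);
}
```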
15

There's also split_whitespace()

fn main() {
    let words: Vec<&str> = "   foo   bar\t\nbaz   ".split_whitespace().collect();
    println!("{:?}", words);
    // ["foo", "bar", "baz"] 
}
jayelm
4

If you are looking for the Python-flavoured split where you tuple-unpack the two ends of the split string, you can do

if let Some((a, b)) = line.split_once(' ') {
    // ...
}
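A slightly fuller sketch of the same idea, using a made-up `key=value` input: `split_once` cuts at the first occurrence of the separator, `rsplit_once` at the last, and `splitn(2, ...)` is an iterator-based way to get a similar result.

```rust
/// Split at the first '='; returns None if there is no '='.
fn first_cut(line: &str) -> Option<(&str, &str)> {
    line.split_once('=')
}

/// Split at the last '='; returns None if there is no '='.
fn last_cut(line: &str) -> Option<(&str, &str)> {
    line.rsplit_once('=')
}

fn main() {
    println!("{:?}", first_cut("key=a=b"));
    println!("{:?}", last_cut("key=a=b"));
    println!("{:?}", first_cut("no separator"));

    // splitn(2, ..) yields at most two pieces; the rest stays in the last one.
    let parts: Vec<&str> = "key=a=b".splitn(2, '=').collect();
    println!("{:?}", parts);
}
```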
Piotr Ostrowski
  • Nice addition, related to https://stackoverflow.com/questions/41517187/split-string-only-once-in-rust – Code4R7 Dec 29 '22 at 18:51
3

The OP's question was how to split on a multi-character string. Here is a way to get the results as two Strings, part1 and part2, instead of a vector.
The example splits on the non-ASCII string "☄☃" in place of "123":

let s = "☄☃";  // also works with non-ASCII characters
let mut part1 = "some string ☄☃ ffd".to_string();
let part2;
if let Some(idx) = part1.find(s) {
    // everything after the separator moves into part2
    part2 = part1.split_off(idx + s.len());
    // drop the separator itself from the end of part1
    part1.truncate(idx);
} else {
    part2 = String::new();
}

This gives:
part1 = "some string "
part2 = " ffd"

If "☄☃" is not found, part1 contains the untouched original String and part2 is empty.
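For comparison, the same result can be sketched with `split_once` (a different technique from the `split_off` approach above), copying both halves into owned Strings. The function name `split_owned` is hypothetical; the not-found behaviour mirrors the description above.

```rust
/// Split `input` at the first occurrence of `sep` into two owned Strings.
/// If `sep` is not found, the first String is the whole input and the
/// second is empty.
fn split_owned(input: &str, sep: &str) -> (String, String) {
    match input.split_once(sep) {
        Some((head, tail)) => (head.to_string(), tail.to_string()),
        None => (input.to_string(), String::new()),
    }
}

fn main() {
    let (part1, part2) = split_owned("some string ☄☃ ffd", "☄☃");
    println!("{:?} {:?}", part1, part2);
}
```

Unlike `split_off`, this allocates two new Strings rather than reusing the original buffer, so the in-place version may be preferable when the input is already an owned String.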


Here is a nice example from Rosetta Code, "Split a character string based on change of character", showing a short solution that uses split_off:

fn main() {
    let mut part1 = "gHHH5YY++///\\".to_string();
    if let Some(mut last) = part1.chars().next() {
        let mut pos = 0;
        while let Some(c) = part1.chars().find(|&c| {if c != last {true} else {pos += c.len_utf8(); false}}) {
            let part2 = part1.split_off(pos);
            print!("{}, ", part1);
            part1 = part2;
            last = c;
            pos = 0;
        }
    }
    println!("{}", part1);
}

The task it solves: split a (character) string into comma (plus a blank) delimited strings based on a change of character (left to right).
Kaplan