I'm trying to split an std::string::String using regex::Regex and get a Vec<String>. The minimal code looks as follows:

let words: Vec<String> = Regex::new(r"\W+")
    .unwrap()
    .split(&file_raw.to_owned())
    .collect()
    .map(|x| x.to_string());

On the last line I'm getting an error: method not found in 'std::vec::Vec<&str>' with a note:

the method `map` exists but the following trait bounds were not satisfied:
`std::vec::Vec<&str>: std::iter::Iterator`
which is required by `&mut std::vec::Vec<&str>: std::iter::Iterator`
`[&str]: std::iter::Iterator`
which is required by `&mut [&str]: std::iter::Iterator`

This question bears a strong resemblance to this one, but the solution offered there is exactly what I tried, and it fails. I'd be grateful if someone could explain the type difference between splitting with the str::split method and with Regex::split.

Vilda

2 Answers

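Your .collect() returns a Vec, which is not an iterator, so it has no .map() method. Regex::split, like str::split, returns an iterator that yields &str, so there is no real type difference between the two; the problem is only the order of the calls. You could of course obtain an iterator from the Vec with .iter() and then apply .map(), but I suspect what you really want is to .map() before you .collect():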

let words: Vec<String> = Regex::new(r"\W+")
    .unwrap()
    .split(&file_raw.to_owned())
    .map(|x| x.to_string())
    .collect();
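
The same ordering rule can be tried without the regex crate. The following is only a minimal sketch using the standard library: split_words and split_words_two_step are hypothetical helper names, and the closure passed to str::split is a rough stand-in for r"\W+" (unlike \W, it also treats underscores as separators). The first function maps before collecting; the second collects a Vec<&str> first and then turns it back into an iterator, which is the alternative mentioned above.

```rust
// Hypothetical helper (not from the question): splits on non-alphanumeric
// characters using only std, as an approximation of the regex r"\W+".
fn split_words(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        // Adjacent separators produce empty pieces, so drop them.
        .filter(|piece| !piece.is_empty())
        // Still an iterator at this point, so `map` is available.
        .map(|piece| piece.to_string())
        // Only now build the Vec<String>.
        .collect()
}

// Hypothetical helper: collect first, then go back to an iterator.
fn split_words_two_step(text: &str) -> Vec<String> {
    let borrowed: Vec<&str> = text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|piece| !piece.is_empty())
        .collect();
    // A Vec is not an iterator, but `into_iter()` produces one.
    borrowed
        .into_iter()
        .map(|piece| piece.to_string())
        .collect()
}

fn main() {
    let words = split_words("hello, world! foo");
    println!("{:?}", words);
    assert_eq!(words, split_words_two_step("hello, world! foo"));
}
```

The two-step version allocates an intermediate Vec<&str>, so mapping before collecting is usually the simpler choice.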
eggyal

If you want to split on whitespace but keep single- or double-quoted strings together as one item, you can match the tokens with find_iter instead of splitting. Here is an example:

use regex::Regex;

fn main() {
    let pattern = Regex::new(r#"[^\s"']+|"([^"]*)"|'([^']*)'"#).unwrap();
    let s = "swap drop rot number(3) \"item list\" xget 'item count'";
    let matches: Vec<String> = pattern
        .find_iter(s)
        .map(|m| m.as_str().to_string())
        .collect();
    println!("Matches: {}\n  - {}", matches.len(), matches.join("\n  - "));
}

The output:

Matches: 7
  - swap
  - drop
  - rot
  - number(3)
  - "item list"
  - xget
  - 'item count'
Paul Chernoch
  • Do you have an idea how to preserve a string when the quotes are included? Like: `-metadata service_provider='Test Inc.'`. This should give: `["-metadata", "service_provider='Test Inc.'"]` – jb_alvarado Mar 21 '22 at 16:48