Is there a simple way to use str::matches case-insensitively?
28

Shepmaster

Ala Douagi
- Convert both strings to lowercase first using `fn to_lowercase(&self) -> String` – ChickenFeet Nov 15 '17 at 02:19
- *If I want to get the occurrences count of a string*: `matches(...).count()` – Shepmaster Nov 15 '17 at 02:35
- If you just want to do a case-insensitive string comparison with ASCII strings, then use [eq_ignore_ascii_case](https://doc.rust-lang.org/std/string/struct.String.html#method.eq_ignore_ascii_case). – Meet Sinojia Oct 05 '20 at 14:49
2 Answers
24
You can always convert both strings to the same casing. This will work for some cases:
```rust
fn main() {
    let needle = "μτς";
    let haystack = "ΜΤΣ";

    // Fold both strings to the same case before searching.
    let needle = needle.to_lowercase();
    let haystack = haystack.to_lowercase();

    for i in haystack.matches(&needle) {
        println!("{:?}", i);
    }
}
```
See also `str::to_ascii_lowercase` for ASCII-only variants.
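As a minimal sketch of the ASCII-only route, combined with the `matches(...).count()` idea from the comments: the helper name `count_ascii_ci` is hypothetical, and the approach assumes that only ASCII letters need folding. Non-ASCII characters are compared exactly, so the Greek example above would not match here.

```rust
/// Counts non-overlapping occurrences of `needle` in `haystack`,
/// ignoring ASCII case only. `count_ascii_ci` is a hypothetical helper
/// name, not part of the standard library.
fn count_ascii_ci(haystack: &str, needle: &str) -> usize {
    // Only the ASCII letters A-Z are folded; everything else is left untouched.
    let haystack = haystack.to_ascii_lowercase();
    let needle = needle.to_ascii_lowercase();
    haystack.matches(needle.as_str()).count()
}

fn main() {
    // ASCII text matches regardless of case.
    println!("{}", count_ascii_ci("Rust rust RUST", "rUsT"));
    // Non-ASCII text is not folded, so these two strings do not match.
    println!("{}", count_ascii_ci("ΜΤΣ", "μτς"));
}
```

Both strings are copied during the conversion, which is usually fine for short inputs but may matter for very large haystacks.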
In other cases, the regex crate might do enough case-folding (potentially Unicode) for you:
```rust
use regex::RegexBuilder; // 1.4.3

fn main() {
    let needle = "μτς";
    let haystack = "ΜΤΣ";

    let needle = RegexBuilder::new(needle)
        .case_insensitive(true)
        .build()
        .expect("Invalid Regex");

    for i in needle.find_iter(haystack) {
        println!("{:?}", i);
    }
}
```
However, remember that ultimately Rust's strings are UTF-8. Yes, you need to deal with all of UTF-8. This means that picking upper- or lower-case might change your results. Likewise, the only correct way to change text casing requires that you know the language of the text; it's not an inherent property of the bytes. Yes, you can have strings which contain emoji and other exciting things beyond the Basic Multilingual Plane.
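To illustrate how the choice of casing can change the result, here is a small sketch using the German "ß", which `to_uppercase` expands to "SS" while `to_lowercase` leaves it unchanged. The helper names `count_lower` and `count_upper` are hypothetical and exist only for this comparison.

```rust
/// Counts matches after lowercasing both strings (hypothetical helper).
fn count_lower(haystack: &str, needle: &str) -> usize {
    let haystack = haystack.to_lowercase();
    let needle = needle.to_lowercase();
    haystack.matches(needle.as_str()).count()
}

/// Counts matches after uppercasing both strings (hypothetical helper).
fn count_upper(haystack: &str, needle: &str) -> usize {
    let haystack = haystack.to_uppercase();
    let needle = needle.to_uppercase();
    haystack.matches(needle.as_str()).count()
}

fn main() {
    // Lowercasing: "strasse" does not contain "ß", so nothing matches.
    println!("lowercase: {}", count_lower("STRASSE", "ß"));
    // Uppercasing: "ß" becomes "SS", which does occur in "STRASSE".
    println!("uppercase: {}", count_upper("STRASSE", "ß"));
}
```

Neither answer is wrong in the abstract; which one is appropriate depends on the text and the language it is written in.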

Shepmaster
- I know it wasn't meant like this, but statements like "other exciting things beyond the Basic Multilingual Plane" really have sort of Lovecraftian connotations. As if strings, like the old gods, are beyond human comprehension and to delve into a _true_ understanding of them would drive you insane. After the Rust book introduced me to the notion of grapheme clusters... I kind of agree! – fostandy Dec 24 '19 at 03:59
- The question is about strings, so the [`escape`](https://docs.rs/regex/1.3.3/regex/fn.escape.html) function could be useful: `RegexBuilder::new(&regex::escape(needle))` – diralik Jan 17 '20 at 23:40
8
If you're using the regex crate, you can make the pattern case insensitive:
```rust
use regex::Regex;

let re = Regex::new("(?i)μτς").unwrap(); // (?i) enables case-insensitive matching
let mat = re.find("ΜΤΣ").unwrap();
```

Rumpelstiltskin Koriat