How would you access an element in a borrowed string by index?

Straightforward in Python:

my_string_list = list(my_string)
print(my_string_list[0])
print(my_string[0])                # same as above

Rust (attempt 1):

let my_string_vec = vec![my_string];    // doesn't work
println!("{}", my_string_vec[0]);       // prints the whole of `my_string`

Rust (attempt 2):

let my_string_vec = my_string.as_bytes();  // returns a &[u8]
println!("{}", my_string_vec[0]);          // prints nothing

My end goal is to stick it into a loop like this:

for pos in 0..my_string_vec.len() {
    while shift <= pos && my_string_vec[pos] != my_string_vec[pos-shift] {
        shift += shifts[pos-shift];
    }
    shifts[pos+1] = shift;
}

for ch in my_string_vec {
    let pos = 0;    // simulate some runtime index
    if my_other_string_vec[pos] != ch {
        ...
    }
}

I think it's possible to use my_string_vec.as_bytes()[pos] and my_string_vec.as_bytes()[pos-shift] in my condition, but I feel that this has a bad code smell.

sjagr
elleciel
  • possible duplicate of [How to index a String in Rust](http://stackoverflow.com/questions/24542115/how-to-index-a-string-in-rust) – Shepmaster Apr 10 '15 at 13:43

1 Answer

You can use char_at(index) to access a specific character (it is an unstable method, so it needs a nightly compiler; see the comments below). If you want to iterate over the characters in a string, you can use the chars() method, which yields an iterator over the characters in the string.

The reason indexing syntax was deliberately left out is, IIRC, that it would give the impression of accessing a character in a typical C-style string, where reading the character at a given index is a constant-time operation (just reading a single byte of an array). Strings in Rust, on the other hand, are UTF-8 encoded Unicode, and a single character may take more than one byte. Finding the nth character is therefore a linear-time operation, and it was decided to make that performance difference explicit and clear.
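
To make the byte/character distinction concrete, here is a minimal sketch. The helper names `byte_and_char_len` and `nth_char` are made up for illustration; the calls inside them (`len`, `chars`, `count`, `nth`) are standard `str` and iterator methods.

```rust
/// Returns (length in bytes, length in characters) for `s`.
/// The two differ as soon as `s` contains a non-ASCII character.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Returns the `n`th character of `s`, counted in characters.
/// This walks the string from the start, so it is O(n), not O(1).
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}
```

For a string such as "héllo", the byte at index 2 is the second half of 'é', not a character of its own, which is why plain indexing would be misleading.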

As far as I know, there is no method available for swapping characters in a string (see this question). Note that this wouldn't have been possible anyway through an immutably borrowed string, since such a string isn't yours to modify. You would most likely have to use a String, or perhaps a &mut str if you're strictly swapping, but I'm not too familiar with Unicode's intricacies.

Instead, I recommend building up a new String the way you want it, so that you don't have to worry about the mutability of the borrowed string. You'd read from the borrowed string and write into the output string according to your logic.
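
As a rough illustration of that approach, the sketch below reads from a borrowed `&str` and writes into a fresh `String`. The function `swap_chars` and its parameters are hypothetical, chosen only to show the pattern; positions are counted in characters, not bytes.

```rust
/// Builds a new String equal to `input` with the characters at
/// character positions `i` and `j` swapped. `input` is only read.
/// Hypothetical helper for illustration; panics if `i` or `j` is out of range.
fn swap_chars(input: &str, i: usize, j: usize) -> String {
    // Collect once so that lookups by character position are O(1).
    let chars: Vec<char> = input.chars().collect();
    assert!(i < chars.len() && j < chars.len(), "position out of range");

    let mut out = String::with_capacity(input.len());
    for (pos, &ch) in chars.iter().enumerate() {
        let picked = if pos == i {
            chars[j]
        } else if pos == j {
            chars[i]
        } else {
            ch
        };
        out.push(picked);
    }
    out
}
```

Because the output is a separate `String`, it does not matter that the two swapped characters may have different byte widths.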

So this:

for pos in 0..my_string_vec.len() {
    while shift <= pos && my_string_vec[pos] != my_string_vec[pos-shift] {
        shift += shifts[pos-shift];
    }
    shifts[pos+1] = shift;
}

Might become something like this (not tested; not clear what your logic is for):

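// char_at() takes a byte offset, so it lines up with `pos` only for ASCII input
for (pos, ch) in my_string.chars().enumerate() {
    while shift <= pos && ch != my_string.char_at(pos - shift) {
        // assuming shifts is a vec; not clear in question
        shift += shifts[pos - shift];
    }

    shifts.push(shift);
}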
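
If constant-time access by character position matters more than avoiding an allocation, one option is to collect the characters into a `Vec<char>` first. The sketch below runs the question's loop that way. The function name `shift_table` is invented here, and the starting values (a `shifts` vector of all 1s with one extra slot, and `shift` starting at 1) are assumptions, since the question does not show how they are initialised.

```rust
/// Runs the question's loop over `pattern` and returns the `shifts` table.
/// ASSUMPTION: `shifts` starts as all 1s with `len + 1` slots and `shift`
/// starts at 1; the question does not show the initial values.
fn shift_table(pattern: &str) -> Vec<usize> {
    // Collect once so that indexing by character position is O(1).
    let chars: Vec<char> = pattern.chars().collect();
    let mut shifts = vec![1usize; chars.len() + 1];
    let mut shift = 1usize;

    for pos in 0..chars.len() {
        // `shift <= pos` is checked first, so `pos - shift` cannot underflow.
        while shift <= pos && chars[pos] != chars[pos - shift] {
            shift += shifts[pos - shift];
        }
        shifts[pos + 1] = shift;
    }
    shifts
}
```

The same loop also works on `pattern.as_bytes()` without the extra allocation, at the cost of comparing bytes rather than characters.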

Your last for loop:

for ch in my_string_vec {
    let pos = 0;    // simulate some runtime index
    if my_other_string_vec[pos] != ch {
        ...
    }
}

That kind of seems like you want to compare a given character in string A with the corresponding character (in the same position) of string B. For this I would recommend zipping the chars iterator of the first with the second, something like:

for (left, right) in my_string.chars().zip(my_other_string.chars()) {
    if left != right {
        // ...
    }
}

Note that zip() stops iterating as soon as either iterator stops, meaning that if the strings are not the same length, it'll only go as far as the shorter string.

If you need access to the "character index" information, you could add .enumerate() to that, so the above would change to:

for (index, (left, right)) in my_string.chars().zip(my_other_string.chars()).enumerate() {
    // ...
}
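
Putting zip() and enumerate() together, a small self-contained sketch might look like the following. The function `mismatch_positions` is a made-up name for illustration; it collects the character positions where the two strings differ and, because of how zip() behaves, ignores anything past the end of the shorter string.

```rust
/// Returns the character positions at which `a` and `b` differ,
/// comparing only up to the length of the shorter string.
fn mismatch_positions(a: &str, b: &str) -> Vec<usize> {
    a.chars()
        .zip(b.chars()) // stops when either string runs out
        .enumerate() // attaches the character position to each pair
        .filter(|(_, (left, right))| left != right)
        .map(|(index, _)| index)
        .collect()
}
```

If a length difference should itself count as a mismatch, that has to be checked separately, since zip() silently drops the extra characters.
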
Community
Jorge Israel Peña
  • Thanks for the detailed response. Now the history of why the indexing syntax isn't available in Rust makes sense. I haven't gone through the last snippet you gave me but the `.char_at()` method offends the compiler. I get `lib.rs:10:1: 10:22 error: unstable feature lib.rs:10 #![feature(str_char)]`. Will get back to you when I get that fixed. – elleciel Apr 10 '15 at 04:44
  • That error should have a `note` message that appears under it that tells you how to get around it, basically add `#![feature(str_char)]` to the top of your crate root (e.g. the top of your main.rs file if it's binary, or lib.rs if it's a library). – Jorge Israel Peña Apr 10 '15 at 04:46
  • Yes, the original compilation gave me an error requiring me to add the `str_char` feature; I get the above error after I have added `#[feature(str_char)]` to my crate attributes. I guess the functionality is deprecated. I'm looking into it now. – elleciel Apr 10 '15 at 04:50
  • I don't think it's deprecated or it would've told you that. Are you by any chance running on the rust beta? Unstable features are not usable within the beta and other "stable" releases; they're only usable in nightly releases. If you want to stick to the beta and not use a nightly, I believe an alternative to `char_at(index)` would be something like `chars().nth(index).unwrap()`. – Jorge Israel Peña Apr 10 '15 at 04:53
  • Speaking of which I just found the issue report: https://github.com/rust-lang/rust/issues/23973 - I'll try the alternative that you've suggested! – elleciel Apr 10 '15 at 04:56
  • `chars().nth(index).unwrap()` is not a full replacement for `char_at(index)`. The former gives the `index`th `char` in the `String` in O(index) time, while the latter gives the `char` starting at the `index`th byte in O(1) time. – huon Apr 10 '15 at 07:54
  • For completeness, a full replacement of `s.char_at(i)` is `s[i..].chars().next().unwrap()`. (Both yield the Unicode character starting at byte position `i` in `s` in `O(1)` time.) – BurntSushi5 Apr 10 '15 at 11:29
  • Question: would `s[i..(i+1)].chars().next().unwrap()` be any faster for long strings? I'm guessing not since `chars` is an iterator. – anderspitman Apr 10 '15 at 17:40
  • @anders It wouldn't be faster (slicing just does pointer arithmetic, and `chars` indeed is an iterator and thus lazy) and it would be wrong and crash for everything but ASCII. –  Apr 10 '15 at 18:07
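
To summarise the comments above, here is a small sketch contrasting the two lookups. The helper names `char_at_byte` and `nth_char_by_count` are invented for illustration; the bodies use only the expressions quoted in the comments, returning an `Option` instead of calling `unwrap()`.

```rust
/// Character starting at byte offset `i`, as `char_at(i)` was described above.
/// Slicing panics if `i` is out of range or not on a character boundary.
fn char_at_byte(s: &str, i: usize) -> Option<char> {
    s[i..].chars().next()
}

/// The `n`th character, counted in characters rather than bytes.
/// Walks the string from the start, so it takes O(n) time.
fn nth_char_by_count(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}
```

For "aé€z" the characters start at byte offsets 0, 1, 3 and 6, so the two functions agree only at index 0. A one-byte slice such as `s[1..2]` would cut 'é' in half, which is the crash mentioned in the last comment; `s.get(1..2)` returns `None` in that case instead of panicking.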