Problem:

I'm new to Rust, and I'm trying to implement a macro that simulates sscanf from C. So far it works with any numeric type, but not with strings, since the input I am parsing is already a string.

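macro_rules! splitter {
    // The inner braces make the expansion a block expression,
    // so it can appear on the right-hand side of a `let`.
    ( $string:expr, $sep:expr) => {{
        let iter:Vec<&str> = $string.split($sep).collect();
        iter
    }}
}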

macro_rules! scan_to_types {    
    ($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
        let res = splitter!($buffer,$sep);
        let mut i = 0;
        $(
            $x = res[i].parse::<$y>().unwrap_or_default();
            i+=1;
        )*        
    };
}

fn main() {
   let mut a :u8;   let mut b :i32;   let mut c :i16;   let mut d :f32;
   let buffer = "00:98;76,39.6";
   let sep = [':',';',','];
   scan_to_types!(buffer,sep,[u8,i32,i16,f32],a,b,c,d);  // this will work
   println!("{} {} {} {}",a,b,c,d);
}

This obviously won't work, because the expanded code tries to parse a string slice into &str, which fails at compile time:

let a :u8;   let b :i32;   let c :i16;   let d :f32;   let e :&str;
let buffer = "02:98;abc,39.6";
let sep = [':',';',','];
scan_to_types!(buffer,sep,[u8,i32,&str,f32],a,b,e,d);
println!("{} {} {} {}",a,b,e,d);

The compiler rejects the expanded line:

$x = res[i].parse::<$y>().unwrap_or_default();
   |        ^^^^^ the trait `FromStr` is not implemented for `&str`

What I have tried

I have tried comparing types using TypeId, with an if/else condition inside the macro to skip the parsing, but the same error occurs: both branches are still expanded and type-checked for every type, so the macro does not expand to valid code:

macro_rules! scan_to_types {    
    ($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
        let res = splitter!($buffer,$sep);
        let mut i = 0;
        $(
            if TypeId::of::<$y>() == TypeId::of::<&str>(){
                $x = res[i];
            }else{                
                $x = res[i].parse::<$y>().unwrap_or_default();
            }
            i+=1;
        )*        
    };
}

Is there a way to set conditions or skip a repetition inside a macro? Or is there a better approach to building sscanf using macros? I have already written functions that parse those strings, but I couldn't pass types as arguments or make them generic.

mateusns12
  • Replace `&str` with `String` and your macro [works](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=fc4ac5a64730242bd5e744baa067005c). – user4815162342 May 07 '22 at 18:44
  • Yes, that works. Thanks for the clever answer. It wouldn't make sense to pass a &str variable anyway, since it is immutable. I will be using it [here](https://github.com/mateusns12/RUST-SRT-FileParser). – mateusns12 May 07 '22 at 19:24
  • It actually _would_ make sense to use an `&str` because my approach incurs an unnecessary allocation instead of a slice that could point to the inside of the string you are scanning. (While an `&str` is immutable, the variable that holds it may well be mutable.) But as you discovered, it's more work, and likely requires special-casing the string case in the macro (or specialization, which is not yet stable). – user4815162342 May 07 '22 at 19:51

1 Answer

Note before the answer: you probably don't want to emulate sscanf() in Rust. There are many very capable parsers in Rust, so you should probably use one of them.

Simple answer: the simplest way to address your problem is to replace the use of &str with String, which makes your macro compile and run. If your code is not performance-critical, that is probably all you need. If you care about performance and about avoiding allocation, read on.
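
A minimal sketch of that String variant follows. It assumes the splitter macro is wrapped in an extra pair of braces so that it expands to an expression, and it uses a hypothetical helper function, scan_with_string, that is not part of the question and exists only to make the result easy to inspect. The scan_to_types macro itself is the one from the question, apart from formatting.

```rust
// Split `$string` on `$sep` and collect the pieces. The inner braces turn the
// expansion into a block expression, so it can be used after `let res =`.
macro_rules! splitter {
    ($string:expr, $sep:expr) => {{
        let parts: Vec<&str> = $string.split($sep).collect();
        parts
    }};
}

// The question's macro: parse each field into its target type and fall back
// to the type's default value when parsing fails.
macro_rules! scan_to_types {
    ($buffer:expr, $sep:expr, [$($y:ty),+], $($x:expr),+) => {
        let res = splitter!($buffer, $sep);
        let mut i = 0;
        $(
            $x = res[i].parse::<$y>().unwrap_or_default();
            i += 1;
        )*
    };
}

// Hypothetical helper (an assumption for illustration, not from the question).
// The final `i += 1` in the expansion is never read, hence the `allow`.
#[allow(unused_assignments)]
fn scan_with_string(buffer: &str) -> (u8, i32, String, f32) {
    let a: u8;
    let b: i32;
    let e: String; // owned copy of the text field
    let d: f32;
    let sep = [':', ';', ','];
    scan_to_types!(buffer, sep, [u8, i32, String, f32], a, b, e, d);
    (a, b, e, d)
}
```

Calling scan_with_string("02:98;abc,39.6") from main should yield the four fields as a tuple, with the text field copied into a new String. A numeric field that fails to parse silently becomes the type's default, and a buffer with fewer fields than targets panics on the index, which may or may not be what you want.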

A downside of String is that under the hood it copies the string data from the string you're scanning into a freshly allocated owned string. Your original approach of using an &str should have allowed for your &str to directly point into the data that was scanned, without any copying. Ideally we'd like to write something like this:

trait MyFromStr {
    fn my_from_str(s: &str) -> Self;
}

// when called on a type that impls `FromStr`, use `parse()`
impl<T: FromStr + Default> MyFromStr for T {
    fn my_from_str(s: &str) -> T {
        s.parse().unwrap_or_default()
    }
}

// when called on &str, just return it without copying
impl MyFromStr for &str {
    fn my_from_str(s: &str) -> &str {
        s
    }
}

Unfortunately that doesn't compile, complaining of a "conflicting implementation of trait MyFromStr for &str", even though there is no conflict between the two implementations, as &str doesn't implement FromStr. But the way Rust currently works, a blanket implementation of a trait precludes manual implementations of the same trait, even on types not covered by the blanket impl.

In the future this might be resolved by specialization. Specialization is not yet part of stable Rust, and might not come to stable Rust for years, so we have to turn to another solution. Since you're already using a macro, we can just let the compiler "specialize" for us by creating two separate traits which share the name of the method. (This is similar to the autoref-based specialization invented by David Tolnay, but even simpler because it doesn't require autoref resolution to work, as we have the types provided explicitly.)

We create separate traits for parsed and unparsed values, and implement them as needed:

use std::str::FromStr;

trait ParseFromStr {
    fn my_from_str(s: &str) -> Self;
}

impl<T: FromStr + Default> ParseFromStr for T {
    fn my_from_str(s: &str) -> T {
        s.parse().unwrap_or_default()
    }
}

pub trait StrFromStr {
    fn my_from_str(s: &str) -> &str;
}

impl StrFromStr for &str {
    fn my_from_str(s: &str) -> &str {
        s
    }
}

Then in the macro we just call <$y>::my_from_str() and let the compiler generate the correct code. Since macros are untyped, this is one of the rare cases where a duck-typing-style approach works in Rust. This is because we never need to provide a single "trait bound" that would disambiguate which my_from_str() we want. (Such a trait bound would require specialization.)

macro_rules! scan_to_types {
    ($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
        #[allow(unused_assignments)]
        {
            let res = splitter!($buffer,$sep);
            let mut i = 0;
            $(
                $x = <$y>::my_from_str(&res[i]);
                i+=1;
            )*
        }
    };
}

Complete example in the playground.
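
For reference, here is a self-contained sketch that assembles the pieces above. It is not a copy of the playground: the splitter macro is assumed to use double braces so that it expands to an expression, res[i] is passed by value rather than by reference, and scan_with_slice is a hypothetical helper added only so the result can be checked.

```rust
use std::str::FromStr;

// Parsed values: any type that implements `FromStr + Default`.
trait ParseFromStr {
    fn my_from_str(s: &str) -> Self;
}

impl<T: FromStr + Default> ParseFromStr for T {
    fn my_from_str(s: &str) -> T {
        s.parse().unwrap_or_default()
    }
}

// Unparsed values: `&str` is returned as a borrowed slice, without copying.
trait StrFromStr {
    fn my_from_str(s: &str) -> &str;
}

impl StrFromStr for &str {
    fn my_from_str(s: &str) -> &str {
        s
    }
}

// Double braces make the expansion usable as an expression.
macro_rules! splitter {
    ($string:expr, $sep:expr) => {{
        let parts: Vec<&str> = $string.split($sep).collect();
        parts
    }};
}

// `<$y>::my_from_str` resolves to whichever of the two traits is implemented
// for `$y`, so no single trait bound is ever needed.
macro_rules! scan_to_types {
    ($buffer:expr, $sep:expr, [$($y:ty),+], $($x:expr),+) => {
        #[allow(unused_assignments)]
        {
            let res = splitter!($buffer, $sep);
            let mut i = 0;
            $(
                $x = <$y>::my_from_str(res[i]);
                i += 1;
            )*
        }
    };
}

// Hypothetical helper (an assumption for illustration). The returned `&str`
// borrows from `buffer`; nothing is allocated for it.
fn scan_with_slice(buffer: &str) -> (u8, i32, &str, f32) {
    let a: u8;
    let b: i32;
    let e: &str;
    let d: f32;
    let sep = [':', ';', ','];
    scan_to_types!(buffer, sep, [u8, i32, &str, f32], a, b, e, d);
    (a, b, e, d)
}
```

Calling scan_with_slice on the buffer from the question should return the text field as a slice that points into the original buffer, which is the copy-free behavior the String version cannot provide.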

user4815162342
  • That's really interesting. I will be studying it. I thought about adding the trait, but couldn't find a way to do that. About using parsers: as I'm coming from C, I have this habit of using as few external libraries as possible, but it seems that's not the case with crates in Rust. So far my code is not critical, so I've decided to implement my own, more to understand macros than to "get the job done". I will also be studying ownership and references. That was my biggest struggle when trying to use a function instead of a macro. – mateusns12 May 07 '22 at 21:52
  • @mateusns12 The issue with using a function is that Rust doesn't support functions with variable number of arguments yet (and also the trick described in this answer wouldn't work). But I really do encourage you to find a nice parser and use it. Rust has a truly minimal standard library and a great ecosystem of crates, so there is no reason to reinvent the wheel. – user4815162342 May 07 '22 at 22:33
  • I'm still learning how the Rust ecosystem works. So far I've just been writing "C-like" code, not really using the strengths of the language. I will take time to explore crates.io. Thanks for the explanation, and also thanks for linking the work of David Tolnay; it is a great reference. – mateusns12 May 08 '22 at 02:41