  • I'm new to Rust.
  • I'm reading SHA1-as-hex-strings from a file - a lot of them, approx. 30 million.
  • In the text file, they are sorted ascending numerically.
  • I want to be able to search the list, as fast as possible.
  • I (think I) want to read them into a (sorted) Vec<primitive_types::U256> for fast searching.

So, I've tried:

log("Loading haystack.");
// total_lines read earlier
let mut the_stack = Vec::<primitive_types::U256>::with_capacity(total_lines);
if let Ok(hay) = read_lines(haystack) { // Get BufRead
  for line in hay { // Iterate over lines
    if let Ok(hash) = line {
      the_stack.push(U256::from(hash));
    }
  }
}
log(format!("Read {} hashes.", the_stack.len()));

The error is:

$ cargo build
   Compiling nsrl v0.1.0 (/my_app)
error[E0277]: the trait bound `primitive_types::U256: std::convert::From<std::string::String>` is not satisfied
  --> src/main.rs:55:24
   |
55 |         the_stack.push(U256::from(hash));
   |                        ^^^^^^^^^^ the trait `std::convert::From<std::string::String>` is not implemented for `primitive_types::U256`
   |
   = help: the following implementations were found:
             <primitive_types::U256 as std::convert::From<&'a [u8; 32]>>
             <primitive_types::U256 as std::convert::From<&'a [u8]>>
             <primitive_types::U256 as std::convert::From<&'a primitive_types::U256>>
             <primitive_types::U256 as std::convert::From<&'static str>>
           and 14 others
   = note: required by `std::convert::From::from`

This code works if instead of the variable hash I have a string literal, e.g. "123abc".

I think I should be able to use the implementation std::convert::From<&'static str>, but I don't understand how I'm meant to keep hash in scope?

I feel like what I'm trying to achieve is a pretty normal use case:

  • Iterate over the lines in a file.
  • Add the line to a vector.

What am I missing?

Bridgey
  • I think you should represent SHA1 hashes as `[u8; 20]` and store them in a `HashSet<[u8; 20]>` for fast lookup. This isn't really an answer to your question, but would shift the problem to decoding a hex string to binary data (which incidentally is your actual problem anyway). – Sven Marnach Aug 03 '20 at 08:34
  • The conversion from a static string to `U256` seems to parse a decimal string into a number. That's useless for your use case – you need to parse hex strings. – Sven Marnach Aug 03 '20 at 08:38
  • @SvenMarnach, even though the hashes are already sorted when I read them, i.e. I can read them straight into a sorted `Vec<...>`, you'd still recommend `HashSet<...>` for speed? – Bridgey Aug 03 '20 at 08:56
  • Closely related: [How can I convert a hex string to a u8 slice?](https://stackoverflow.com/q/52987181) – Sven Marnach Aug 03 '20 at 09:18

1 Answer


You probably want something like this:

U256::from_str(&hash)?

There is a conversion from &str in the FromStr trait called from_str. It returns a Result<T, E> value, because parsing a string may fail.

> I think I should be able to use the implementation std::convert::From<&'static str>, but I don't understand how I'm meant to keep hash in scope?

You can’t give hash a 'static lifetime: it is a String created at runtime and dropped at the end of each loop iteration. The From<&'static str> implementation looks like a convenience for using string constants in your program, and it is really nothing more than U256::from_str(s).unwrap().
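To make the borrowing point concrete, here is a minimal sketch of the loading loop that parses each line through a short-lived &str borrow and propagates failures instead of unwrapping. It uses only the standard library, so u128::from_str_radix stands in for U256::from_str; u128 is too narrow for a real SHA-1, and the function name load_hex_lines is made up for this illustration.

```rust
use std::error::Error;
use std::io::{BufRead, Cursor};

/// Parse every line of `reader` as a hexadecimal number.
/// ASSUMPTION: `u128` is only a stand-in for `U256`; with primitive_types
/// you would call `U256::from_str(line.trim())` at the marked spot instead.
fn load_hex_lines<R: BufRead>(reader: R) -> Result<Vec<u128>, Box<dyn Error>> {
    let mut values = Vec::new();
    for line in reader.lines() {
        // Each `line` is an owned String that lives for one iteration only.
        let line = line?;
        // `line.trim()` borrows it as a plain `&str`; no 'static needed.
        let value = u128::from_str_radix(line.trim(), 16)?; // <- swap in U256 here
        values.push(value);
    }
    Ok(values)
}

fn main() {
    // A Cursor plays the role of the opened file in this sketch.
    let file = Cursor::new("0a\n1f\nff\n");
    match load_hex_lines(file) {
        Ok(values) => println!("Read {} values.", values.len()),
        Err(e) => eprintln!("Failed to load: {}", e),
    }
}
```

Returning a Result here means a malformed line stops the load with an error rather than being silently skipped; whether that is the right policy for 30 million lines is a design choice.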

However…

If you want a SHA-1, the best type is probably [u8; 20] or maybe [u32; 5].

You want a base 16 decoder, something like base16::decode_slice. Here’s how that might look in action:

/// Error if the hash cannot be parsed.
struct InvalidHash;

/// Type for SHA-1 hashes.
type SHA1 = [u8; 20];

fn read_hash(s: &str) -> Result<SHA1, InvalidHash> {
    let mut hash = [0; 20];
    match base16::decode_slice(s, &mut hash[..]) {
        Ok(20) => Ok(hash),
        _ => Err(InvalidHash),
    }
}
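If you would rather not add a crate, the same decoding can be sketched with the standard library alone. The version below is an illustration, not a replacement for base16: parse_sha1 and hex_val are hypothetical helper names, and it returns Option rather than the InvalidHash error above to stay short. It also shows the lookup side: arrays compare byte by byte, which matches the numeric order of fixed-width hex strings, so a Vec loaded from the already-sorted file can be searched with binary_search.

```rust
/// A SHA-1 digest is 20 raw bytes.
type Sha1 = [u8; 20];

/// Decode one ASCII hex digit (0-9, a-f, A-F) to its value.
fn hex_val(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}

/// Parse exactly 40 hex characters into 20 bytes; `None` on malformed input.
fn parse_sha1(s: &str) -> Option<Sha1> {
    let bytes = s.as_bytes();
    if bytes.len() != 40 {
        return None;
    }
    let mut out = [0u8; 20];
    for (i, pair) in bytes.chunks_exact(2).enumerate() {
        // High nibble first, then low nibble.
        out[i] = (hex_val(pair[0])? << 4) | hex_val(pair[1])?;
    }
    Some(out)
}

fn main() {
    // Stand-in for the file contents, already sorted ascending.
    let lines = vec![
        format!("{}1", "0".repeat(39)),
        "da39a3ee5e6b4b0d3255bfef95601890afd80709".to_string(),
        "f".repeat(40),
    ];
    // Malformed lines are dropped here; a real loader might report them.
    let haystack: Vec<Sha1> = lines.iter().filter_map(|s| parse_sha1(s)).collect();

    if let Some(needle) = parse_sha1("DA39A3EE5E6B4B0D3255BFEF95601890AFD80709") {
        // binary_search relies on the Vec being sorted.
        println!("found: {}", haystack.binary_search(&needle).is_ok());
    }
}
```

At 20 bytes per entry, 30 million hashes take roughly 600 MB in a Vec, and each lookup needs about 25 comparisons.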
Dietrich Epp
  • Thanks! This looks good. I've not seen that type notation before in your "However" - guessing it's some kind of "20 u8s in a row"/"5 u32s in a row"? Could you provide a link so I can read more? I tried Google-ing "concatenated types" and similar but no joy - just results about concatenating strings! – Bridgey Aug 03 '20 at 08:45
  • @Bridgey that's just an array. https://doc.rust-lang.org/std/primitive.array.html – justinas Aug 03 '20 at 08:46
  • @Bridgey: Yeah, the syntax for arrays in Rust is a bit unusual. – Dietrich Epp Aug 03 '20 at 08:47