I wrote an example, and it compiles without errors:

use std::collections::HashSet;
use std::error::Error;

use html5ever::rcdom;
use html5ever::rcdom::Handle;
use reqwest;
use soup::{prelude, QueryBuilder};
use soup::prelude::*;

use testquestion::testtrait::Test;

fn main() -> Result<(), Box<dyn Error>> {
    let resp = reqwest::get("https://docs.rs/soup/0.1.0/soup/")?;
    let soup = Soup::from_reader(resp)?;
    let result = soup
        .tag("section")
        .attr("id", "main")
        .find()
        .and_then(|section:Handle| -> Option<String> {
            section
                .tag("span")
                .attr("class", "in-band")
                .find()
                .map(|span:Handle| -> String {
                    (&span as  &rcdom::Node).text();
                    (&span as  &Handle).text()
                }
                )
        });
    assert_eq!(result, Some("Crate soup".to_string()));
    Ok(())
}

But I'm confused about these two lines:

 (&span as  &rcdom::Node).text();
 (&span as  &Handle).text()

The trait NodeExt has a text method, and both &rcdom::Node and Handle implement it. But why can I convert a reference to a Handle into other references (to Handle and to Node) without a compiler error? Is it safe? I'm a complete novice in Rust.

pub trait NodeExt: Sized {
 /// Retrieves the text value of this element, as well as it's child elements
    fn text(&self) -> String {
        let node = self.get_node();
        let mut result = vec![];
        extract_text(node, &mut result);
        result.join("")
    }
}

impl<'node> NodeExt for &'node rcdom::Node {
    #[inline(always)]
    fn get_node(&self) -> &rcdom::Node {
        self
    }
}

impl NodeExt for Handle {
    #[inline(always)]
    fn get_node(&self) -> &rcdom::Node {
        &*self
    }
}
edwardw
王奕然

1 Answer

The compiler normally doesn't allow unsafe operations outside unsafe blocks, although there are known soundness issues in the compiler and in the language itself. Some crates, and especially the standard library, rely on unsafe implementations under the hood to provide safe abstractions.

You are not 100% protected, as in any language, but in practice you can be confident that if a safe Rust program compiles, then it will run without undefined behavior.

Your code is safe, because it contains no unsafe blocks and it compiled without errors.
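As a hedged aside on why the two casts in the question type-check at all: assuming Handle is a type alias for Rc<Node> (as in html5ever's rcdom), `&span as &rcdom::Node` can be read as a deref coercion written with `as`, and `&span as &Handle` as a cast to the type the expression already has. The sketch below uses hypothetical stand-in types (Node with a single name field, a cut-down NodeExt), not the real soup or html5ever definitions.

```rust
use std::rc::Rc;

// Hypothetical stand-ins for rcdom::Node and rcdom::Handle (assumption:
// Handle is an alias for Rc<Node>).
struct Node {
    name: String,
}
type Handle = Rc<Node>;

// Cut-down version of the trait from the question.
trait NodeExt: Sized {
    fn get_node(&self) -> &Node;

    fn text(&self) -> String {
        self.get_node().name.clone()
    }
}

impl<'node> NodeExt for &'node Node {
    fn get_node(&self) -> &Node {
        self
    }
}

impl NodeExt for Handle {
    fn get_node(&self) -> &Node {
        &*self
    }
}

fn text_via_casts(span: Handle) -> (String, String) {
    // &Rc<Node> -> &Node: a deref coercion, spelled explicitly with `as`.
    let as_node = (&span as &Node).text();
    // &Rc<Node> -> &Rc<Node>: the cast does not change the type.
    let as_handle = (&span as &Handle).text();
    (as_node, as_handle)
}

fn main() {
    let span: Handle = Rc::new(Node { name: "Crate soup".to_string() });
    let (as_node, as_handle) = text_via_casts(span);
    println!("{} / {}", as_node, as_handle);
}
```

Neither cast reinterprets memory: both are conversions the compiler would also perform implicitly where a `&Node` or `&Handle` is expected.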

The core problem that this part of the code tries to solve is method call ambiguity.

                    (&span as  &rcdom::Node).text();
                    (&span as  &Handle).text()

Consider what happens if I change it to span.text(). Which method should the compiler call? Handle::text? rcdom::Node::text? The Rust compiler has no rules to decide what to call in this particular case.

We have two options. The first is to use fully qualified syntax:

rcdom::Node::text(&span);
Handle::text(&span)

This is actually what rustc suggests:

error[E0034]: multiple applicable items in scope
  --> src/main.rs:20:5
   |
20 |     A::test();
   |     ^^^^^^^ multiple `test` found
   |
note: candidate #1 is defined in an impl of the trait `Trait1` for the type `A`
  --> src/main.rs:12:5
   |
12 |     fn test() {}
   |     ^^^^^^^^^
   = help: to disambiguate the method call, write `Trait1::test(...)` instead
note: candidate #2 is defined in an impl of the trait `Trait2` for the type `A`
  --> src/main.rs:16:5
   |
16 |     fn test() {}
   |     ^^^^^^^^^
   = help: to disambiguate the method call, write `Trait2::test(...)` instead

(Playground link)
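The playground code behind that error is not reproduced here, so the following is only a sketch of the same situation, assuming two traits Trait1 and Trait2 that both declare a test method and a unit struct A that implements both. The `&self` receiver and the return type are assumptions made for illustration.

```rust
trait Trait1 {
    fn test(&self) -> &'static str;
}

trait Trait2 {
    fn test(&self) -> &'static str;
}

struct A;

impl Trait1 for A {
    fn test(&self) -> &'static str {
        "Trait1"
    }
}

impl Trait2 for A {
    fn test(&self) -> &'static str {
        "Trait2"
    }
}

fn call_both(a: &A) -> (&'static str, &'static str) {
    // `a.test()` would be rejected with E0034, because both traits are in
    // scope and both provide `test` for `A`.
    // Naming the trait (or using the longer `<Type as Trait>::` form) picks one.
    (Trait1::test(a), <A as Trait2>::test(a))
}

fn main() {
    let (first, second) = call_both(&A);
    println!("{} {}", first, second);
}
```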

The second is to cast the types. A cast from &A to &dyn TraitN is always safe if A implements the trait TraitN. (More about the dyn keyword for trait objects.)

Trait objects are unsized: the compiler can't know how much memory to allocate for a particular trait object, because the trait can be implemented by many types and exists only through its implementers. The size can vary between the different types implementing the trait.

Because of that, you can't cast A directly. Even when the compiler knows that the trait object will always be A in this case, trait objects are unsized by nature.


error[E0620]: cast to unsized type: `A` as `dyn Trait1`
  --> src/main.rs:20:5
   |
20 |     (A as Trait1).test()
   |     ^^^^^^^^^^^^^
   |
help: consider using a box or reference as appropriate
  --> src/main.rs:20:6
   |
20 |     (A as Trait1).test()
   |      ^

You can borrow a reference to A, which has a fixed size, and cast it to a &dyn TraitN, whose size is also fixed (because it is a reference). The result is a value of type &dyn TraitN on which the method call can be made without ambiguity.

    let a: &dyn Trait1 = &A as &dyn Trait1;
    a.test()

You are essentially erasing the type of A and treating the memory the reference points to as a trait object.
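Put together, the cast approach might look like the sketch below. It assumes the same Trait1, Trait2 and A as in the error messages, with a `&self` receiver and a string return value added for illustration.

```rust
trait Trait1 {
    fn test(&self) -> &'static str;
}

trait Trait2 {
    fn test(&self) -> &'static str;
}

struct A;

impl Trait1 for A {
    fn test(&self) -> &'static str {
        "Trait1"
    }
}

impl Trait2 for A {
    fn test(&self) -> &'static str {
        "Trait2"
    }
}

fn via_trait_objects(a: &A) -> (&'static str, &'static str) {
    // Explicit cast: &A -> &dyn Trait1. Only Trait1's methods are visible
    // through this reference, so `test` is unambiguous.
    let first = (a as &dyn Trait1).test();
    // The same conversion happens implicitly with a type annotation.
    let second: &dyn Trait2 = a;
    (first, second.test())
}

fn main() {
    let (first, second) = via_trait_objects(&A);
    println!("{} {}", first, second);
}
```

Compared with fully qualified syntax, this goes through dynamic dispatch, so it is mostly useful when a trait object is wanted anyway.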

Inline