
I am trying to create a library, and I want to ship some binary (or text) files with it that contain data to be parsed at runtime.

My intention is to keep control over these files, update them frequently, and bump the library version with each update.

Is this possible via Cargo? If so, how can I access these files from my library?

A workaround I thought of is to include some `.rs` files with structs and/or constants (such as `&str`) that store the data, but I find that kind of ugly.

EDIT:

I have changed the accepted answer to the one that better fits my case; however, take a look at Shepmaster's answer, as it may be more suitable for yours.

Otobo
  • **which will be parsed at runtime** => why? Since the data is static, it would be more efficient to store the already parsed data in the binary than to parse it at run-time. Rust is fairly limited in what it can represent at compile-time (quite unfortunately); however, you mention maintaining a `.rs` file, so it seems possible in your case. If so, I advise using a `build.rs` file, a "build script" that lets you parse the source file(s) and generate `.rs` files right before building "proper". – Matthieu M. Sep 24 '15 at 06:31
  • To be more accurate, the data is already parsed and needs to be moved into structs such as `HashMap`, which cannot be initialized statically (at least not without the `lazy_static` crate). This is why I need to "parse" it at runtime. Having it in a binary (or text) file makes it easier to replace only that file when I want to update the data. However, using `build.rs` in combination with the `lazy_static` crate sounds like a better alternative and I'll give it a try. Thanks for the suggestion! – Otobo Sep 24 '15 at 08:49

2 Answers


Disclaimer: I mentioned it in a comment, but let me re-iterate here, as it gives me more space to elaborate.

As Shepmaster said, it is possible to include text or binary data verbatim in a Rust library or executable using the `include_bytes!` and `include_str!` macros.

In your case, however, I would avoid it. By deferring the parsing of the content to run-time:

  • you allow a flawed artifact to be built (malformed data is only discovered at run-time).
  • you incur (more) run-time overhead (parsing time).
  • you incur (more) space overhead (the parsing code ships in the artifact).

Rust acknowledges this issue, and offers multiple code-generation mechanisms intended to overcome those limitations:

  • macros: if the logic can be encoded into a macro, it can be included directly in a source file
  • plugins: powered-up macros that can encode arbitrary logic and generate elaborate code (see `regex!`); compiler plugins were never stabilized, and procedural macros now fill this role
  • `build.rs`: an independent "Rust script" that runs ahead of the compilation proper and whose role is to generate `.rs` files
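
As a rough illustration of the macro option, here is a minimal sketch in which a small table is written as a macro invocation and expanded into a lookup function at compile time. The macro name `lookup_table!`, the function `port_for`, and the data are all hypothetical, made up for this example.

```rust
// Hypothetical macro: expands a `key => value` table into a lookup function.
macro_rules! lookup_table {
    ($name:ident { $($key:literal => $value:literal),* $(,)? }) => {
        fn $name(key: &str) -> Option<u32> {
            match key {
                // One match arm is generated per table entry.
                $($key => Some($value),)*
                _ => None,
            }
        }
    };
}

// The "data file" is now ordinary source code, checked by the compiler.
lookup_table!(port_for {
    "http" => 80,
    "https" => 443,
});

fn main() {
    println!("{:?}", port_for("https"));
}
```

A malformed entry (for example, a missing value) is rejected when the crate is compiled rather than when it runs, but the data has to be written in the macro's syntax.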

In your case, the `build.rs` script sounds like a good fit:

  • by moving the parsing code there, you deliver a lighter artifact
  • by parsing ahead of time, you deliver a faster artifact
  • by parsing ahead of time, you deliver a correct artifact

The result of your parsing can be encoded in different ways, from functions to statics (possibly `lazy_static!`), as `build.rs` can generate any valid Rust code.

The Cargo documentation explains how to use `build.rs`: how to integrate it with Cargo, how to create files, and more.
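
To make this more concrete, here is a minimal sketch of what such a build script could look like. The data file `data/ports.txt`, its `name=port` line format, the generated file name `ports.rs`, and the `PORTS` static are assumptions made up for this example, not something from the question.

```rust
// build.rs (sketch): turn a hypothetical `data/ports.txt` into Rust source
// before the crate itself is compiled.
use std::{env, fs, path::Path};

/// Converts `name=port` lines into the source of a `static PORTS` table.
/// Panics on malformed input, so bad data fails the build, not the run.
fn generate_source(input: &str) -> String {
    let mut out = String::from("pub static PORTS: &[(&str, u16)] = &[\n");
    for (idx, line) in input.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // skip blank lines and comments
        }
        let (name, port) = line
            .split_once('=')
            .unwrap_or_else(|| panic!("line {}: expected `name=port`", idx + 1));
        let port: u16 = port
            .trim()
            .parse()
            .unwrap_or_else(|e| panic!("line {}: bad port: {}", idx + 1, e));
        // `{:?}` writes the name as a quoted, escaped string literal.
        out.push_str(&format!("    ({:?}, {}),\n", name.trim(), port));
    }
    out.push_str("];\n");
    out
}

fn main() {
    // Cargo sets OUT_DIR for build scripts; outside Cargo there is nothing to do.
    let out_dir = match env::var_os("OUT_DIR") {
        Some(dir) => dir,
        None => return,
    };
    // Ask Cargo to rerun this script whenever the data file changes.
    println!("cargo:rerun-if-changed=data/ports.txt");
    let input = fs::read_to_string("data/ports.txt").expect("cannot read data/ports.txt");
    let dest = Path::new(&out_dir).join("ports.rs");
    fs::write(dest, generate_source(&input)).expect("cannot write ports.rs");
}
```

The library would then pull in the generated file with `include!(concat!(env!("OUT_DIR"), "/ports.rs"));`. Updating the data means editing the data file and publishing a new version; the parsing code itself never ships in the library.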

Matthieu M.

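The `include_bytes!` macro seems close to what you want. It only gives you a reference to a byte array, though, so you'd have to do any parsing starting from that. The path is resolved relative to the source file that invokes the macro (this example uses an absolute path):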

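static HOST_FILE: &[u8] = include_bytes!("/etc/hosts");

fn main() {
    // Fails at run-time if the embedded file is not valid UTF-8.
    let host_str = std::str::from_utf8(HOST_FILE).unwrap();

    // `get` avoids a panic when the file is shorter than 42 bytes
    // or byte 42 is not on a character boundary.
    println!("Hosts are:\n{}", host_str.get(..42).unwrap_or(host_str));
}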

If you have UTF-8 content, you can use `include_str!`, as pointed out by Benjamin Lindley:

static HOST_FILE: &str = include_str!("/etc/hosts");

fn main() {
    // Print at most the first 42 bytes without panicking on short files.
    println!("Hosts are:\n{}", HOST_FILE.get(..42).unwrap_or(HOST_FILE));
}
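
Since the question asks for the embedded data to end up in a structure such as `HashMap`, here is one possible sketch of parsing it once, on first use, with the standard library's `std::sync::OnceLock` (used here in place of the `lazy_static` crate mentioned in the comments). The `name=port` format and every name in the snippet are hypothetical, and an inline string stands in for the `include_str!` call so that the example does not depend on a file.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Stand-in for `include_str!("../data/ports.txt")`; the path and the
// `name=port` format are assumptions made up for this example.
static RAW_PORTS: &str = "http=80\nhttps=443\n";

/// Parses the embedded text once, on first use, and caches the resulting map.
fn ports() -> &'static HashMap<&'static str, u16> {
    static PORTS: OnceLock<HashMap<&'static str, u16>> = OnceLock::new();
    PORTS.get_or_init(|| {
        let mut map = HashMap::new();
        for line in RAW_PORTS.lines() {
            // Malformed lines are skipped here; a real library may prefer to report them.
            if let Some((name, port)) = line.split_once('=') {
                if let Ok(port) = port.trim().parse::<u16>() {
                    map.insert(name.trim(), port);
                }
            }
        }
        map
    })
}

fn main() {
    println!("{:?}", ports().get("https"));
}
```

Every later call to `ports()` returns the cached map, so the parsing cost is paid at most once per process.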
Shepmaster
  • If you want text (rather than binary data), and it's already UTF-8, can't you just use `include_str!` instead of using `include_bytes!` and then converting it? *i.e.* `let host_str = include_str!("/etc/hosts");` – Benjamin Lindley Sep 23 '15 at 22:04
  • @BenjaminLindley hmm, good point! I saw `include!` and realized it wasn't right but skipped right over `include_str!`. – Shepmaster Sep 23 '15 at 22:12