
The include_bytes! and include_str! macros seem like a mystery to me. I understand that the file is included in the binary, but how does it work at runtime?

  1. When is the file loaded into memory?
  2. Is there any reason not to store the result of include_bytes! / include_str! as a top-level const? Will the file then be in memory for the entire duration of the application runtime?
  3. Are there any penalties for including a "big" file, other than the binary size?
Shepmaster
cambunctious

1 Answer


There is no runtime CPU cost.

  1. The file is included in the binary at compile time. When the operating system loads the binary, it is placed into memory along with the executable code.
  2. I would encourage using a static or a const for the result of include_bytes! / include_str!. The included data will be in memory regardless, unless the compiler determines that it is unused and optimizes it out.
  3. No.
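
A minimal sketch of the static approach from point 2. The path `assets/logo.bin` is hypothetical, so a byte string literal stands in for the macro call here; it has the same `&'static [u8; N]` type that `include_bytes!` produces, which keeps the example compilable without an extra file.

```rust
// Stand-in for `include_bytes!("assets/logo.bin")` (hypothetical path).
// A byte string literal has the same type as the macro's result,
// `&'static [u8; N]`, so no extra file is needed to compile this sketch.
static ASSET: &[u8; 11] = b"hello world";

// With a real file next to the source, this would instead be:
// static ASSET: &[u8] = include_bytes!("assets/logo.bin");

/// Returns the embedded bytes. No file I/O or heap allocation happens here;
/// the slice refers to data that is already part of the compiled program.
fn asset() -> &'static [u8] {
    ASSET
}

fn main() {
    let bytes = asset();
    println!("embedded {} bytes", bytes.len());
    assert_eq!(&bytes[..5], b"hello");
}
```

A `static` has one fixed address for the whole program, while a `const` is inlined at each use site, so `static` is the more natural fit when the data is large.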


Shepmaster
  • Looking at that link and [this one](https://stackoverflow.com/a/11698458/365102), it looks as if `static` data is stored in a different location than the stack, which means it won't accidentally bloat the area around the program instructions. (CMIIW.) – Mateen Ulhaq May 06 '20 at 01:46
  • Well, except that the program takes longer to load. I don't think it's noticeable unless you include something like a 2 GB file. – Stargateur May 06 '20 at 02:19
  • Note that the OS will usually place the data into _virtual_ memory when it loads the program. It isn't loaded into RAM until and unless you access it. – Jmb May 06 '20 at 06:50
  • I appreciate the responses so far. The virtual memory point seems significant. I wonder if the OS would quickly determine that the `include_bytes` data should be allocated to the disk in the case of a large file that does not fit in memory. Does the OS distinguish between data that is data and data that is executable instructions? Ultimately I guess I am looking for guidelines to decide between `include_bytes` and shipping a separate file. – cambunctious May 06 '20 at 14:43
  • There's a good chance the OS will just flag that the virtual memory is backed by the executable file. It would then only load the things it actively needs to change, like references to the addresses of shared libraries. Everything else, code or data, will only be loaded when it's actually required by the processor. – user1937198 May 06 '20 at 15:11
  • In which case, include_bytes may actually be more efficient than using a file open. – user1937198 May 06 '20 at 15:12
  • @user1937198 Nonsense, that would be the same. Both are strictly equivalent, so include_bytes has more disadvantages. But in terms of performance, reading the file directly is equivalent. – Stargateur May 06 '20 at 21:25
  • So by using include_bytes, you are deferring some responsibility to the OS in deciding what should be loaded into memory when. And I suppose the OS can be trusted to do this optimally. – cambunctious May 06 '20 at 21:36
  • @Stargateur It can be more efficient in two ways: 1) If you don't actually need all the bytes, the OS may be able to skip loading some pages, where a naive open and load into a buffer would force the pages to be loaded. 2) If you are loading a significant amount of data and the OS then needs to page that data out for some reason, with include_bytes it may be able to avoid writing the data to swap, because it knows it can load it from the executable. – user1937198 May 06 '20 at 23:48
  • But in terms of pure read, if you are going to need the whole thing, and you have enough memory to avoid needing to swap, file open could be more efficient because of read-ahead. So they are not strictly equivalent. What would be equivalent would be using mmap to perform the same mapping but from a file. – user1937198 May 06 '20 at 23:57
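
For comparison with the "file open" alternative discussed in the comments, here is a rough sketch of loading a separately shipped file at runtime using only the standard library. The helper name `load_at_runtime` and the temporary file name are made up for illustration. Unlike embedded data, this copies the whole file into a heap buffer and can fail if the file is missing.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads the whole file into a heap-allocated buffer at runtime.
/// (Hypothetical helper; the name is not from the discussion above.)
fn load_at_runtime(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}

fn main() -> io::Result<()> {
    // Hypothetical asset file, created here only so the sketch runs on its own.
    let path = std::env::temp_dir().join("include_bytes_demo_asset.bin");
    fs::write(&path, b"hello world")?;

    let bytes = load_at_runtime(&path)?;
    println!("read {} bytes at runtime", bytes.len());

    // Clean up the temporary file.
    fs::remove_file(&path)?;
    Ok(())
}
```

The mmap approach mentioned in the last comment is not shown, since the Rust standard library does not expose memory mapping directly.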