
G'Day!

I have an executable (Unix or Windows; it should be cross-platform). If one opens this executable in any editor and writes some data to the end, the application still runs perfectly. On execution, the application with all its data is loaded into RAM, so the user-written part of the file is also loaded into memory.

Is there any chance to read this data?

I need fast access to this data. Other workarounds are not OK because they take too much time:

  1. Reading directly from a file (on the hard disk) or mapping it is not fine, because the application has to read this file on each run, and this application is launched many times per second.
  2. Using shared memory with another process (something like a server that holds the data) is not cross-platform.
  3. Using pipes between the app and the so-called server is not fast enough, IMHO.

That's why I decided to write some data to the end of the application.

Thanks in advance!

serenheit
  • I'm not sure, so only a comment: you should not do something like this. Manipulating a built executable and adding data to the end may corrupt your binary. You should create an extra data file. Depending on the kind of data, you should open the file as a resource on startup and store the content in memory. Or you have to compile the data into your executable. – Thomas Berger Aug 19 '11 at 10:37
  • Thomas, the thing is that adding data to the end doesn't corrupt the binary itself. Using an extra file is time-consuming. The stored data is basically ordinary text. – serenheit Aug 19 '11 at 10:40
  • What do you mean 'time consuming'? That's what the rest of the world does! Also, this has implications for code signing etc. – Joe Aug 19 '11 at 10:43
  • @serenheit Editing a binary with a text editor (in the worst case, Notepad on Windows) WILL corrupt the binary. You should not do something like that. Not with a text editor. – Thomas Berger Aug 19 '11 at 10:45
  • @Joe Time-consuming means that reading the file takes about 50% of the program's running time. This part must be really fast, because I have a server with lots of requests to this binary. – serenheit Aug 19 '11 at 11:03
  • @serenheit: You do realize that there is such a thing as a page cache? Reading the same file into memory thousands of times per second will simply remap pages from the page cache to userland, in the worst case doing a memcpy... that is even more true for memory mapping. – Damon Aug 19 '11 at 11:13

1 Answer


Are you re-inventing

I also think you might be optimizing the wrong things.

Reading directly from a file (on the hard disk) or mapping it is not fine, because the application has to read this file on each run, and this application is launched many times per second.

The kernel[1] is way smarter than we are and is perfectly capable of caching the mapped stuff. Heck, if you map it READ-ONLY there will be no difference from directly accessing data in your program's base image.

[1]: this goes for both Windows and Unix
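
To make the read-only mapping idea concrete, here is a minimal sketch in Python, whose standard `mmap` module works on both Unix and Windows. It is only an illustration, not the asker's code: the helper name `read_mapped` and the file name `app_data.txt` are hypothetical, and the demo creates its own temporary file so it can run anywhere.

```python
import mmap
import os
import tempfile


def read_mapped(path):
    """Map the file at `path` read-only and return a copy of its bytes."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. ACCESS_READ is accepted on both
        # Unix and Windows. Note: mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


if __name__ == "__main__":
    # Hypothetical data file, created here only so the sketch is runnable.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app_data.txt")
        with open(path, "wb") as f:
            f.write(b"ordinary text payload\n")
        # Map the same file repeatedly; after the first access the pages
        # typically come from the OS cache rather than from the disk.
        for _ in range(3):
            data = read_mapped(path)
        print(data.decode(), end="")
```

Whether this is fast enough for many launches per second is something to measure on the target machines; the sketch only shows that the mapping itself needs no platform-specific code.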

sehe
  • If I'm not mistaken, I map the file at the start of the process and then, at the end, I have to unmap it. I don't think the kernel would cache it. But it could be the way. I'll try to find some information about it. – serenheit Aug 19 '11 at 10:51
  • @serenheit The kernel will cache it; that's the reason why it has a filesystem cache. – Thomas Berger Aug 19 '11 at 11:01
  • @serenheit In terms of reading the file. Mapping within a binary has nothing to do with the filesystem cache. – Thomas Berger Aug 19 '11 at 11:06
  • @Thomas The file will be cached, and the next time I read it, will it be loaded from the cache? And does the same hold for mapping? – serenheit Aug 19 '11 at 11:11
  • @serenheit The file content will be delivered from the filesystem cache (RAM), that's right. Define what you mean by mapping. If you mean mapping the file into memory, that should be faster than reading it from the file. – Thomas Berger Aug 19 '11 at 11:15
  • Regardless of mapping or reading, there is block cache. The actual blocks are already in RAM, so they are just remapped to the process space (and shared IFF the maps are readonly). Reopening the same map does not require disk access unless the file changed. – sehe Aug 19 '11 at 11:25
  • @Sehe Or unless another process has refilled the cache with its own stuff. – serenheit Aug 19 '11 at 11:31
  • @serenheit: That's called cache eviction. You could try to prevent that from happening using VirtualLock (or mlock on UNIX). Mind the caveats, though; it has limited usefulness and limited guarantees – sehe Aug 19 '11 at 12:13
  • @Sehe Thanks, I'll try it and run some really tough tests to establish the best way. – serenheit Aug 19 '11 at 12:19