
The program I am working on at the moment processes a large amount of data (>32 GB). Due to "pipelining", however, a maximum of around 600 MB is present in main memory at any given time (I checked that; it works as planned).

When the program has finished, however, and I switch back to the workspace with, for example, Firefox open (but other programs as well), it takes a while until I can use it again (the HDD is also highly active for a while). This makes me wonder whether Linux (the operating system I use) swaps out other programs while my program is running, and if so, why?

I have 4 GB of RAM installed on my machine, and while my program is active, memory utilization never goes above 2 GB.

My program allocates/deallocates dynamic memory in only two different sizes: 32 MB and 64 MB chunks. It is written in C++, and I use new and delete. Shouldn't Linux be smart enough to reuse these blocks once I have freed them and leave my other memory untouched?

Why does Linux kick my stuff out of memory? Is this some other effect I have not considered? Can I work around this problem without writing a custom memory management system?

Joel
Lazarus535
    If you are doing file I/O, the OS will try to cache your file data, and "old" program data is less important to maintain in memory, which is probably what you are seeing, rather than anything to do with allocations. – Mats Petersson Mar 16 '15 at 21:53
  • Unless you have overloaded `new` and `delete` to skip over your compiler's heap manager implementation, it is your compiler's heap manager that controls the inner workings of how your application interacts with the OS w.r.t. memory management, and not directly `new` and `delete`. – PaulMcKenzie Mar 16 '15 at 21:57
  • Thanks to you both! @Mats Petersson: Yes, I am doing a lot of file I/O from the HDD. So a custom memory manager (reusing the blocks myself) would not be of any help? Can I disable this behavior by telling Linux that it should not keep the read file chunks in memory? – Lazarus535 Mar 16 '15 at 22:02
  • 1
    @PaulMcKenzie: The word I would use is the "C++ runtime library's `new` and `delete`", rather than the compiler's. And most likely, the runtime library will reuse blocks - if it doesn't, you'd see increase in memory usage! – Mats Petersson Mar 16 '15 at 22:02
  • @Lazarus535: There are a number of "tunable" parameters that can be configured in Linux. I'm not sure exactly which ones you should tune to improve this, and you will likely affect SOMETHING else (e.g. Firefox will also be slower), so it's not just "change parameter X to 28 instead of 42". Here's something I just found googling for cache settings; though it's trying to achieve the OPPOSITE of what you are doing, it does show what some of the settings are, with links to explanations. http://unix.stackexchange.com/questions/30286/can-i-configure-my-linux-system-for-more-aggressive-file-system-caching – Mats Petersson Mar 16 '15 at 22:16
  • I suggest you start with the `vmstat 1` command to confirm your guesses. Note the `si` and `so` columns, which relate to swap reads/writes. – myaut Mar 16 '15 at 22:38
  • Sounds like someone needs to learn about [posix_fadvise](http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_fadvise.html). – David Schwartz Mar 17 '15 at 01:14

2 Answers


The most likely culprit is file caching. The good news is that you can disable file caching. Without caching, your software will run more quickly, but only if you don't need to reload the same data later.

You can do this directly with the Linux APIs, but I suggest you use a library such as Boost.Asio. If your software is I/O bound, you should additionally make use of asynchronous I/O to improve performance.

Sophit
  • Without caching, your software _may_ run more quickly. It may also run more slowly. – Mooing Duck Mar 16 '15 at 22:54
  • It's like you didn't even read what I wrote. – Sophit Mar 16 '15 at 23:12
  • No, you wrote "but only if you don't need to reload the same data later". Without file caching, it can also be slower under many other common circumstances. – Mooing Duck Mar 16 '15 at 23:13
  • He's processing > 32 GB of data sequentially using 4 GB of RAM. Caching would be faster than non-caching in the extraordinary case that he processes each file as it's written to disk, and only if he can process files as quickly as they're written. Were that true, I'd still question whether it'd be true for future versions of the software. – Sophit Mar 16 '15 at 23:21
  • I am using asynchronous I/O (own implementation, but happy with it :-D). I am trying to stick with the standard library for this project, because of later uses where Boost might not be easily available. I have used boost::asio for another project (networking mostly). So boost::asio can provide flags for disabling caching in a platform-independent way (important)? In that case I will consider switching. Thx. – Lazarus535 Mar 16 '15 at 23:25
  • The >32 GB refers to the input file, so reading only. I probably should have mentioned that. The written file (the result) is quite small (under 5 MB). – Lazarus535 Mar 16 '15 at 23:30
  • I've always used my own libraries too. I assume Asio has the feature, but I could be wrong. If you prefer, use aio.h directly on Linux to get unbuffered asynchronous reads. – Sophit Mar 16 '15 at 23:37
  • The program will later on be multi-platform, so I can't do Linux-only. And I don't want to write platform-specific code :-D. I will have a look at Boost... we'll see. – Lazarus535 Mar 16 '15 at 23:47
  • I might have been wrong, judging from the ASIO documentation I just skimmed and also [this question](http://stackoverflow.com/questions/378515/whats-the-deal-with-boost-asio-and-file-i-o). I'm sorry that I don't have another library to suggest. Otherwise you're stuck writing an implementation for each platform. – Sophit Mar 16 '15 at 23:55

All the recently used file pages are squeezing older pages out of the disk cache. As a result, when some other program runs, its pages have to be paged back in from disk.

What you want to do is use posix_fadvise (or posix_madvise if you're memory mapping the file) to eject pages you've forced the OS to cache so that your program doesn't have a huge cache footprint. This will let older pages from other programs remain in cache.

David Schwartz
  • I was using memory-mapped files at one point. But if there is more than one file to read in parallel (one index from file one followed by one index from the other file), Linux will only read 4K pages at a time for each file, which basically cripples I/O performance (at least it did for me). I used the Boost mapped file in an earlier version. Unfortunately it does NOT expose the madvise (or a similar platform-independent) flag. And as I mentioned: NO platform-specific code. Otherwise nice answer. Thx. – Lazarus535 Mar 17 '15 at 01:42