26

Basic situation:

I am copying some NTFS disks in openSUSE. Each one is 2 TB. When I do this, the system runs slowly.

My guesses:

I believe it is likely due to caching. Linux decides to discard useful caches (for example, KDE 4 bloat, virtual machine disks, LibreOffice binaries, Thunderbird binaries, etc.) and instead fills all available memory (24 GB total) with data from the disks being copied, which will be read only once, then written and never used again. So any time I use these applications (or KDE 4), the disk needs to be read again, and rereading the bloat off the disk makes things freeze/hiccup.

Because the cache is gone and these bloated applications need lots of cache, the system becomes horribly slow.

Since it is USB, the disk and disk controller are not the bottleneck, so using ionice does not make it faster.

I believe it is the cache rather than just the motherboard going too slow, because if I stop everything copying, it still runs choppy for a while until it recaches everything.

And if I restart the copying, it takes a minute before it is choppy again. I can also limit the copying to around 40 MB/s, and then the system runs faster again (not because it has the right things cached, but because the motherboard buses have plenty of spare bandwidth left for the system disks). I can fully accept a performance loss from my motherboard's I/O capability being completely consumed (100% used means 0% wasted capacity, which makes me happy), but I can't accept that this caching mechanism performs so terribly in this specific use case.
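
For reference, the throttling can be done with rsync's --bwlimit option. This is only a sketch: the paths are placeholders and --bwlimit takes a value in KB/s.

$ rsync -a --bwlimit=40000 /mnt/ntfs-source/ /mnt/backup/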

# free
             total       used       free     shared    buffers     cached
Mem:      24731556   24531876     199680          0    8834056   12998916
-/+ buffers/cache:    2698904   22032652
Swap:      4194300      24764    4169536

I also tried the same thing on Ubuntu, which causes a total system hang instead. ;)

And to clarify, I am not asking how to leave memory free for the "system", but for "cache". I know that cache memory is automatically given back to the system when needed, but my problem is that it is not reserved for caching of specific things.

Is there some way to tell these copy operations to limit memory usage so some important things remain cached, and therefore any slowdowns are a result of normal disk usage and not rereading the same commonly used files? For example, is there a setting of max memory per process/user/file system allowed to be used as cache/buffers?

Peter
  • BTW I am using rsync, and have many disks (currently 8 at once). Some are transferred locally, some with USB 3.0. Some are transferred over 1Gbps network. – Peter Apr 11 '12 at 12:34
  • when copying nothing: # free total used free shared buffers cached Mem: 24731556 24474096 257460 0 16478072 6342668 -/+ buffers/cache: 1653356 23078200 Swap: 4194300 22564 4171736 Seems there is a memory leak with buffers. – Peter Apr 12 '12 at 06:57

8 Answers

28

The nocache command is the general answer to this problem! It is also in Debian and Ubuntu 13.10 (Saucy Salamander).

Thanks, Peter, for alerting us to the "--drop-cache" option in rsync. But that was rejected upstream (Bug 9560 – drop-cache option) in favor of a more general solution: the new "nocache" command, based on the rsync work with fadvise.

You just prepend "nocache" to any command you want. It also has nice utilities for describing and modifying the cache status of files. For example, here are the effects with and without nocache:

$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%)  [filesize=7776.2K, pagesize=4K]
$ ./nocache cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%)  [filesize=7776.2K, pagesize=4K]
$ cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 1945/1945 (100.0%)  [filesize=7776.2K, pagesize=4K]

So hopefully that will work for other backup programs (rsnapshot, duplicity, rdiff-backup, amanda, s3sync, s3ql, tar, etc.) and other commands that you don't want trashing your cache.
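
For instance, to keep a large backup run from evicting the page cache (a sketch; the paths are placeholders, and it assumes nocache is installed and on your PATH):

$ nocache rsync -a /source/ /backup/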

nealmcb
  • @Peter I think that the answer by nealmcb with 'nocache' is more appropriate as not all rsync have the drop-cache option, and nocache is more widely used – Krzysztof Krasoń Sep 20 '15 at 07:08
  • from my perspective, every Linux distro I tested (except Manjaro, and also not FreeBSD) has applied the patch to rsync... and nocache wasn't in openSUSE back when I used it, and isn't in the main repos in arch/manjaro. And I like the nocache idea better (just as I like nice, ionice, trickle, etc.) but unless I see it in more main repos, it's kinda subjective which is the best solution for everyone. – Peter Sep 22 '15 at 15:26
  • @Peter Debian doesn't have it – Krzysztof Krasoń Oct 01 '15 at 06:18
  • @krzyk Which debian is nocache not in? Looks like it is in wheezy-backports, jessie, stretch and sid: https://packages.debian.org/sid/utils/nocache – nealmcb Oct 01 '15 at 17:00
  • I was replying to @Peter about --drop-cache in rsync – Krzysztof Krasoń Oct 01 '15 at 18:43
  • The `nocache` command has some (to me) unexpected performance issues. Be careful how you apply it, and to which commands. I made the mistake of applying it to a shell script, and things did not go well. Try running `time /bin/true` and `time nocache /bin/true` -- the use of `nocache` adds a full second of CPU overhead to every command I try. (Ubuntu 20.04.4 LTS on an Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz) – Ian D. Allen Apr 01 '22 at 07:57
3

Kristof Provost was very close, but in my situation, I didn't want to use dd or write my own software, so the solution was to use the "--drop-cache" option in rsync.

I have used this many times since creating this question, and it seems to fix the problem completely. One exception was when using rsync to copy from a FreeBSD machine, which doesn't support "--drop-cache". So I wrote a wrapper to replace the /usr/local/bin/rsync command and strip that option, and now copying from there works too.
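
A minimal sketch of such a wrapper (not the exact script; it assumes the real rsync binary has been moved aside to /usr/local/bin/rsync.real, which is an arbitrary choice):

    #!/bin/sh
    # Drop the unsupported --drop-cache option, then call the real rsync.
    for arg in "$@"; do
        shift
        [ "$arg" = "--drop-cache" ] && continue
        set -- "$@" "$arg"
    done
    exec /usr/local/bin/rsync.real "$@"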

It still uses a huge amount of memory for buffers and seems to keep almost no cache, but it runs smoothly anyway.

$ free
             total       used       free     shared    buffers     cached
Mem:      24731544   24531576     199968          0   15349680     850624
-/+ buffers/cache:    8331272   16400272
Swap:      4194300     602648    3591652
Peter
3

You have essentially two choices:

  1. Limit the maximum disk buffer size: the problem you're seeing is probably caused by the default kernel configuration, which allows a huge piece of RAM to be used for disk buffering. When you try to write lots of data to a really slow device, you end up dedicating a lot of your precious RAM to write caching for that slow device.

    The kernel does this because it assumes the processes can keep doing useful work while they are not being slowed down by the slow device, and that the RAM can be freed automatically when needed, simply by writing the pages out to storage (the slow USB stick), but the kernel doesn't consider how slow that write-out actually is. The quick fix:

     # Wake up background writing process if there's more than 50 MB of dirty memory
     echo 50000000 > /proc/sys/vm/dirty_background_bytes
     # Block writers once there's more than 200 MB of dirty memory (hard limit; source: http://serverfault.com/questions/126413/limit-linux-background-flush-dirty-pages)
     echo 200000000 > /proc/sys/vm/dirty_bytes
    

    Adjust the numbers to match the RAM you're willing to spend on the disk write cache. A sensible value depends on your actual write performance, not on the amount of RAM you have. You should aim to have just enough RAM for caching to allow full write performance for your devices. Note that this is a global setting, so you have to choose it according to the slowest device you're using.

  2. Reserve a minimum amount of memory for each task you want to keep running fast. In practice this means creating cgroups for the stuff you care about and defining the minimum memory you want reserved for any such group (see the sketch below). That way, the kernel can use the remaining memory as it sees fit. For details, see this presentation: SREcon19 Asia/Pacific - Linux Memory Management at Scale: Under the Hood
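
    A minimal sketch of this approach, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup with the memory controller enabled; the group name "protected" and the 2 GiB figure are arbitrary examples:

     # Create a group and reserve 2 GiB of memory (page cache included) for its members
     sudo mkdir /sys/fs/cgroup/protected
     echo $((2 * 1024 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/protected/memory.min
     # Move the current shell (and anything started from it) into the group
     echo $$ | sudo tee /sys/fs/cgroup/protected/cgroup.procs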

Update (2022):

You can also try creating a new file /etc/udev/rules.d/90-set-default-bdi-max_ratio-and-min_ratio.rules with the following contents:

# For every BDI device, set max cache usage to 30% and min reserved cache to 2% of the whole cache
# https://unix.stackexchange.com/a/481356/20336
ACTION=="add|change", SUBSYSTEM=="bdi", ATTR{max_ratio}="30", ATTR{min_ratio}="2"

The idea is to put a per-device limit on maximum cache utilization. With the above limit (30%), you can have two totally stalled devices and still have 40% of the disk cache available for the rest of the system. If you have 4 or more stalled devices in parallel, even this workaround cannot help alone. That's why I have also added a minimum cache space of 2% for every device, but I don't know how to check whether this is actually effective (a possible check is sketched below). I've been running with this configuration for about half a year and I think it's working nicely.

See https://unix.stackexchange.com/a/481356/20336 for details.
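
One way to check whether the rule took effect is to read the values back from sysfs; this is a sketch, assuming the backing devices are exposed under /sys/class/bdi:

$ udevadm trigger --subsystem-match=bdi --action=change
$ grep -H . /sys/class/bdi/*/max_ratio /sys/class/bdi/*/min_ratio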

Mikko Rantalainen
2

The kernel cannot know that you won't use the cached data from the copy again. This is your information advantage.

But you could set the swappiness to 0: sudo sysctl vm.swappiness=0. This makes Linux prefer dropping the page cache over swapping out application memory (libraries, etc.).

It works nicely for me too, and it performs especially well in combination with a large amount of RAM (16-32 GB).
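
To keep the setting across reboots, a minimal sketch, assuming a distribution that reads /etc/sysctl.d:

$ echo "vm.swappiness = 0" | sudo tee /etc/sysctl.d/99-swappiness.conf
$ sudo sysctl --system    # reload all sysctl configuration files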

MPW
1

It's not possible if you're using plain old cp, but if you're willing to reimplement or patch it yourself, setting posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE) on both the input and output files will probably help.

posix_fadvise() tells the kernel about your intended access pattern. In this case, you'd only use the data once, so there isn't any point in caching it. The Linux kernel honours these flags, so it shouldn't be caching the data any more.

Kristof Provost
1

Try using dd instead of cp.

Or mount the filesystem with the sync flag.

I'm not completely sure whether these methods bypass the page cache, but they may be worth a try.
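
A rough sketch of both ideas (the device names and the mount point are placeholders; iflag=direct/oflag=direct require GNU dd):

# Whole-device copy with direct I/O, bypassing the page cache
sudo dd if=/dev/sdX of=/dev/sdY bs=4M iflag=direct oflag=direct

# Or remount the destination filesystem with synchronous writes
sudo mount -o remount,sync /mnt/usbdisk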

KurzedMetal
  • In my experience, it is always a good idea to use rsync (without the -u option which ruins everything). If not, I'll end up with undetected partial files when the transfer is interrupted. – Peter Apr 11 '12 at 12:26
  • On first try, it didn't work on a remote copy, since the server does not support that option. (both have rsync version 3.0.9 protocol version 30; Linux supports it, but FreeBSD 8.2 does not) And on local transfers, it seems to limit speed quite a bit. – Peter Apr 11 '12 at 12:34
  • @Peter yeah, i usually try to avoid `dd` i forgot about `rsync`... so many utilities :D – KurzedMetal Apr 11 '12 at 12:39
  • 2
    Insert this comment before 2nd comment... got deleted somehow: Thanks for the suggestion. This led me to read the rsync man page, in which I found the "--drop-cache" option, which actually seems to work well for local transfers. – Peter Apr 11 '12 at 12:49
  • look at the other answer, you may even speed up the syncing too. – KurzedMetal Apr 11 '12 at 12:55
1

I am copying some NTFS disks [...] the system runs slow. [...] Since it is USB [...]

The slowdown is a known memory management issue.

Use a newer Linux kernel. The older ones have a problem with USB data and Transparent Huge Pages. See this LWN article. Very recently this issue was addressed; see "Memory Management" in LinuxChanges.

Turbo J
  • I am using openSuSE with kernel 3.1.0-1.2-desktop. I don't know if this applies; what are the symptoms of this particular issue? – Peter Apr 12 '12 at 06:56
  • 1
    Interesting - but I'm confused about the status of this. It seems to me that the patch from Mel Gorman discussed on LWN isn't in the current kernel (3.13-rc5) and I couldn't tell where it was addressed on the LinuxChanges page you reference. What was the resolution, and which kernel did it appear in? Thanks. – nealmcb Dec 28 '13 at 17:56
  • 1
    @nealmcb It is fixed in Linux 3.3. The LinuxChanges line is "Compaction combined with Transparent Huge Pages can cause significant stalls with USB sticks or browser. Recommended [LWN article](https://lwn.net/Articles/467328/)". [The same LWN article as linked in this answer]. – sourcejedi Jun 24 '19 at 21:57
  • 1
    @nealmcb Or where I first found it: https://lore.kernel.org/lkml/1323877293-15401-1-git-send-email-mgorman@suse.de/ . Commits with the same titles as patches 1-10 were merged in v3.3-rc1. You can see them listed together in the github interface here: https://github.com/torvalds/linux/commits/0cee34fd72c582b4f8ad8ce00645b75fb4168199 The cover letter says patch 11 was "a prototype", and the discussion sounds like it didn't work out. I haven't tried to dig into the details for that part. But in general, this seems pretty conclusive. – sourcejedi Jun 24 '19 at 22:00
  • See also: https://unix.stackexchange.com/q/714267/20336 – Mikko Rantalainen Sep 12 '22 at 13:30
0

OK, now that I know that you're using rsync, I could dig a bit more:

It seems that rsync can be inefficient when used with tons of files at the same time. There's an entry in their FAQ, and it's not a Linux/cache problem; it's rsync using too much RAM.

Googling around, I found recommendations to split the syncing into multiple rsync invocations, for example along the lines of the sketch below.
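
Something like this (a hypothetical sketch; the paths are placeholders):

# One rsync invocation per top-level directory instead of one huge run
for dir in /source/*/; do
    rsync -a "$dir" "/backup/$(basename "$dir")/"
done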

KurzedMetal
  • Good info, but I don't get any out of memory issues. I have 22032652 bytes free. This may be due to the fact that most of these files of mine are many GB. When rsync starts up, it says "18658 files to consider". I've also used it when I have 100s of millions of files, and it also works. But thanks to your dd suggestion, and finding that --drop-cache option, it actually seems to be completely solved (as long as I don't do too many remote transfers when the server doesn't support --drop-cache). – Peter Apr 11 '12 at 13:05