I am currently using chokidar to watch a directory. The directory contains a large number of files and is constantly being written to. I am also using polling, because I need to watch folders on a network. I noticed that when I start watching the directory, my CPU usage gets really high.

From what I understand, watchers are also being created for each file in the directory?

I only need to be notified when a file has been added; I don't need to monitor changes to the files themselves. So it feels like a lot of overhead is being created for what I need. Is this possible with chokidar in any way, or should I look for another solution?

Update: I added a snippet showing how I create my watcher instance. I'm not really doing anything special. The CPU usage spikes as soon as I create the watcher. The directory has about 20k files in it.

var chokidar = require('chokidar');

var fileWatcher = chokidar.watch('path to directory', {
  ignored: '*.txt',
  ignoreInitial: true,
  usePolling: true,
  interval: 600,
  depth: 0
});

fileWatcher.on('add', function(path) {
  //Do something when a new file is created in the watched directory
});
  • "From what I understand, watchers are also being created for each file in the directory?" what code are you using? I thought you were polling not using watchers. I'd say, yeah use watchers... but, what have you tried? what makes you think you are making one per file? – Theraot Jan 11 '20 at 03:08
  • Does this answer your question? [Watch a folder for changes using node.js, and print file paths when they are changed](https://stackoverflow.com/questions/13695046/watch-a-folder-for-changes-using-node-js-and-print-file-paths-when-they-are-cha) – Theraot Jan 11 '20 at 03:13
  • Don't suppose you have `CHOKIDAR_INTERVAL` set as an environment variable? [The docs](https://www.npmjs.com/package/chokidar#performance) are a little unclear, but may be saying the environment variable could take precedence? Also, are the files in a deeply nested tree? If so, do you need to watch the whole tree, or could you set a `depth` limit to avoid polling too deeply? Regardless, repeatedly polling a network resource with thousands of files seems likely to be pretty expensive; just listing them probably occupies most of your 600 ms `interval`. Can you not find a better way? – ShadowRanger Jan 15 '20 at 01:45
  • @ShadowRanger I'll edit my post, but the depth is set to 0. I'll probably have to organize the files a bit better, by date created for example. At least I won't be polling thousands of files, but just the folder with the newest date –  Jan 15 '20 at 01:58
  • I notice the `interval: 600` part of your snippet, which means polling runs every 600 ms. Setting it to a larger value, such as `interval: 10000`, might help. – Stream Huang Nov 15 '22 at 06:01

1 Answer

I found a solution that works for me. If all you need is to be notified when a new file is created in the directory, without the overhead of watching every file in a large directory, you can do something like this:

fileWatcher.on('ready', function() {
  // Handle anything that needs to be done on ready.

  // At the end of the function, unwatch everything in the directory.
  // With a large directory this will significantly decrease CPU usage.
});

fileWatcher.on('add', function(path) {
  // Do what you need to do when a new file is created.

  // Unwatch the newly created file, since we do not care about monitoring it.
  fileWatcher.unwatch(path);
});
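
The "unwatch everything in the directory" step in the `ready` handler is only described in a comment, so here is a rough sketch of one way it might be written. It relies on chokidar's `getWatched()` (which returns an object keyed by directory, with arrays of item names as values) and `unwatch()`. The helper name `unwatchExistingFiles` is made up for illustration, and the sketch assumes that no `cwd` option is set, so the keys returned by `getWatched()` are absolute paths.

```javascript
const path = require('path');

// Hypothetical helper (not part of chokidar): unwatch every item that the
// watcher currently tracks directly inside `dir`, while leaving the
// directory itself watched so that new files still trigger 'add'.
// Assumption: `dir` is the same path string that was passed to
// chokidar.watch(), and the `cwd` option is not used.
function unwatchExistingFiles(watcher, dir) {
  // getWatched() maps each watched directory to the names it contains.
  const watched = watcher.getWatched();
  const names = watched[path.resolve(dir)] || [];

  // Only entries under `dir` are used; the parent directory's entry also
  // lists `dir` itself, and unwatching that would stop all notifications.
  const files = names.map(function (name) {
    return path.join(dir, name);
  });

  if (files.length > 0) {
    watcher.unwatch(files);
  }
  return files.length;
}
```

Inside the `ready` handler this could be called as `unwatchExistingFiles(fileWatcher, 'path to directory')`. Whether it lowers CPU usage as much as expected will depend on the chokidar version and the network share, so it is worth measuring before relying on it.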