
I am using Logstash to push all the text logs from our storage to Elasticsearch. The storage holds about 1 TB of logs. To start, I have begun pushing 368 GB of data (possibly a few hundred thousand files) to Elasticsearch, but Logstash is failing with the following error.

{:timestamp=>"2014-05-15T00:41:12.436000-0700", :message=>"/root/share/archive_data/sessionLogs/965c6f46-1a5e-4820-a68d-7c32886972fc/Log.txt: file grew, old size 0, new size 1557420", :level=>:debug, :file=>"filewatch/watch.rb", :line=>"81"}
{:timestamp=>"2014-05-15T00:41:12.437000-0700", :message=>":modify for /root/share/archive_data/sessionLogs/965c6f46-1a5e-4820-a68d-7c32886972fc/Log.txt, does not exist in @files", :level=>:debug, :file=>"filewatch/tail.rb", :line=>"77"}
{:timestamp=>"2014-05-15T00:41:12.441000-0700", :message=>"_open_file: /root/share/archive_data/sessionLogs/965c6f46-1a5e-4820-a68d-7c32886972fc/Log.txt: opening", :level=>:debug, :file=>"filewatch/tail.rb", :line=>"98"}
{:timestamp=>"2014-05-15T00:41:12.441000-0700", :message=>"(warn supressed) failed to open /root/share/archive_data/sessionLogs/965c6f46-1a5e-4820-a68d-7c32886972fc/Log.txt: Permission denied - /root/share/archive_data/sessionLogs/965c6f46-1a5e-4820-a68d-7c32886972fc/Log.txt", :level=>:debug, :file=>"filewatch/tail.rb", :line=>"110"}

The share is network mounted. I am starting Logstash as the root user, which should have all the access it needs to the mount. The share directory has the following permissions:

    drwxr-xr-x 44 root root 0 May 13 08:36 share

Also, my log files are static; they don't change.

So my question is: is there any way to tell Logstash not to keep a file handle open once it has processed a log file? I think the error above occurs because the number of log files is huge.
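For reference, the ingestion goes through the standard file input, which tails every matched file and keeps a handle open on it. Below is a minimal sketch of that kind of configuration for Logstash 1.4; the path glob, sincedb location, and output host are illustrative assumptions, not the exact settings in use:

    input {
      file {
        # Matches the Log.txt files referenced in the errors above (assumed glob)
        path           => "/root/share/archive_data/sessionLogs/*/Log.txt"
        # The archive files are static, so read each one from the beginning
        start_position => "beginning"
        # Where Logstash records how far it has read in each file (assumed path)
        sincedb_path   => "/var/lib/logstash/sincedb"
      }
    }

    output {
      elasticsearch {
        host => "localhost"   # assumed; point this at the actual Elasticsearch node
      }
    }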

I have already filed a bug, and there is an existing Logstash issue reporting that it does not cope well when the number of log files is large.

I see some similar questions here, but I would like to know whether anybody has experience with this kind of issue.

Rocky
  • How many files do you have? You need to raise the "Max open files" limit on your system! Look up this: http://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux – Ban-Chuan Lim May 16 '14 at 00:42
  • As it is a log archive, I am thinking there would be at least 100 thousand. I will see if I can increase the limit to that level and keep the system healthy. – Rocky May 16 '14 at 03:45
  • I have this same issue. I have a folder of 500,000 JSONs with 777 access. I get the same error. – Joseph Sep 10 '14 at 05:15
  • Ben Lim's suggestion works if you increase the open file handle limit; I increased it to the maximum, but half a million files would be too many to deal with that way. My suggestion is to develop a StaticFile plugin that reads and processes a file and then forgets about it, rather than keeping file handles open on all the files. If all of your files are changing, i.e. a stream of logs, then I think the only option is the file plugin, but Logstash will have to come up with an option for that use case. Right now, Logstash holds open file handles on every file it is monitoring. – Rocky Sep 11 '14 at 06:05

1 Answer


I think, for Logstash 1.4.2, the only answer is to:

  • move or delete the files from the monitored directory
  • restart logstash

I don't think there's any other way to have Logstash release file handles for logs that have already been processed and won't be written to any more.
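One way to apply that workaround is to keep the watched glob narrow and move each file out of it once it has been indexed, then restart Logstash so the handles on the moved files are released and never re-opened. A sketch, with directory names that are purely illustrative:

    input {
      file {
        # Watch only an "incoming" tree; after a session's Log.txt has been
        # indexed, move its directory somewhere outside this glob
        # (e.g. /root/share/processed/) and restart Logstash.
        path           => "/root/share/incoming/sessionLogs/*/Log.txt"
        start_position => "beginning"
        sincedb_path   => "/var/lib/logstash/sincedb"
      }
    }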

mooreds
  • As my case was a log archive where I had to keep all the logs, I wrote a quick plugin to deal with it: it copies the file, reads the copy, and then deletes the copy of the original file. That solved my problem. – Rocky Apr 30 '15 at 23:33