57

I have a huge file that I am splitting into approximately 450 files, and I am getting a "too many open files" error. I searched the web and found a solution, but it is not helping.

import resource
resource.setrlimit(resource.RLIMIT_NOFILE, (1000,-1))
>>> len(pureResponseNames) #Filenames 
434
>>> resource.getrlimit(resource.RLIMIT_NOFILE)
(1000, 9223372036854775807)
>>> output_files = [open(os.path.join(outpathDirTest, fname) + ".txt", "w") for fname in pureResponseNames]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 24] Too many open files: 'icd9_737.txt'
>>> 

I also changed ulimit from the command line as below:

$ ulimit -n 1200
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1200
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited
$ 

I am still getting the same error. PS: I also restarted my system and ran the program again, but with no success.

learner
  • That is a crapload of files. Do you really need to have them all open at the same time? – user2357112 Aug 16 '13 at 19:21
    I highly suggest you develop some sort of queue system so that all those file handles aren't left open, that is highly inefficient. –  Aug 16 '13 at 19:23
  • Since the input file is huge, I want to read it only once. Also, if opening multiple files is supported by Python, then why not use it? It made my life a lot easier as long as the number of open files was less than 256. – learner Aug 16 '13 at 19:42
  • Strange. I just tried running your code and it worked for me (up to 1000 files). – ron rothman Aug 16 '13 at 21:32
  • You need a better algorithm. You really don't need that many files open to do this. – Keith Aug 17 '13 at 01:24
  • @enginefree While it might be wrong to have this many file handles open at once in this particular scenario, what is this "high inefficiency" that you mention in the general case? Will anything be slower if a process has thousands of file handles open at the same time? – josch Feb 15 '18 at 18:18

8 Answers

27

"Too many open files" errors are always tricky – you not only have to twiddle with ulimit, but you also have to check system-wide limits and OSX-specifics. This SO post gives more information on open files in OSX. (Spoiler alert: the default is 256).

However, it is often easy to limit the number of files that have to be open at the same time. If we look at Stefan Bollmann's example, we can easily change that to:

import os

pureResponseNames = ['f' + str(i) for i in range(434)]
outpathDirTest = "testCase/"
output_files = [os.path.join(outpathDirTest, fname) + ".txt" for fname in pureResponseNames]

for i, filename in enumerate(output_files):
    with open(filename, 'w') as f:
        f.write('This is a test of file nr.' + str(i))
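
If the goal is to split one huge input file across those ~450 outputs, you don't need all the handles open at once at all. Here is a minimal sketch of that approach (the input format and file name are my assumptions – one "name<TAB>data" record per line – since the question doesn't show how lines map to files):

import os

outpathDirTest = "testCase/"

with open("huge_input.txt") as big_file:            # hypothetical input file
    for line in big_file:
        name, _, data = line.partition("\t")        # assumed "name<TAB>data" layout
        target = os.path.join(outpathDirTest, name) + ".txt"
        with open(target, "a") as out:              # open, append, close per line
            out.write(data)

Opening and closing per line is slower than keeping handles around; if that matters, a small cache of the most recently used handles gives the same effect while staying well under the descriptor limit.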
OmG
publysher
26

You should try $ ulimit -n 50000 instead of 1200.

OmG
naveenkumar.s
  • `/usr/bin/ulimit: line 4: ulimit: open files: cannot modify limit: Invalid argument` Apparently 50000 is invalid, however 1200 did work. – luckydonald Nov 15 '21 at 11:50
  • Worked to access a VMDK split into 1026 files! BTW, if you are testing and lower the limit from, let's say, 50000 to 1000 and then try to raise it again, it won't work; I had to close the terminal, and it worked in a new one :) – Aquarius Power Jun 01 '22 at 13:14
  • Just adding a reference to [this answer](https://stackoverflow.com/questions/39537731#answer-53661748) on similar problems from @dave4jr: note that setting ulimit is limited to the current terminal and will be scrapped once you start a new session. To use this trick on a task (for instance), you will need to alter limits.conf. – tgrandje Aug 30 '22 at 08:08
14

I changed my ulimit from 1024 to 4096 and it worked. The procedure is as follows:

Check your current file descriptor limit using:

ulimit -n

For me it was 1024, and I updated it to 4096 and it worked.

ulimit -n 4096
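
As a sanity check (my own addition, not part of the answer), you can confirm from inside Python that the new limit actually applies to the interpreter's process:

import resource

# Soft and hard RLIMIT_NOFILE as seen by this Python process; the soft
# value should match what `ulimit -n` reports in the same shell.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)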
devil in the detail
  • 2,905
  • 17
  • 15
9

In case you can't close the files for some reason (e.g. you're using a 3rd-party module), you may consider raising the soft limit to the hard maximum instead of a predefined hard-coded limit (it will throw ValueError if you try to set hard+1):

import resource
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

And I want to make it clear that even if you manually delete the files created while the Python process is still running, it will still throw such an error later.
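
A slightly more defensive variant of the same idea (my sketch, not from the answer): on some platforms, macOS in particular, getrlimit reports the hard limit as "infinity" but the kernel still rejects a matching setrlimit call, so it can help to guard it:

import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
try:
    # Ask for the full hard limit.
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
except ValueError:
    # The kernel refused (e.g. macOS caps the value below RLIM_INFINITY);
    # fall back to a concrete number that still covers the ~450 files here.
    resource.setrlimit(resource.RLIMIT_NOFILE, (4096, hard))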

林果皞
2

sudo vim /etc/security/limits.conf

add

*         hard    nofile      500000
*         soft    nofile      500000

to the file.

wsdzbm
1

I strongly discourage you from increasing the ulimit.

  1. For example, your database may grow a lot and generate many more files than it used to, so many that the number would exceed the limit you set and thought was sufficient.
  2. It's a time-consuming/error-prone maintenance task because you would have to make sure that every environment/server has that limit properly set and never changed.

You should ensure that open is used in combination with close, or that the with statement is used (which is more Pythonic).

Third-party libraries might give you issues (for example, pyPDF2's PdfFileMerger.append keeps files open until the write method is called on it). The way I tracked this down is pretty ugly, but trying a couple of things on the server while monitoring the number of open files did the trick (my local development computer runs Mac OS X and the server is CentOS):

watch 'lsof | grep "something-created-filenames-have-in-common" | wc -l'
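
If you would rather check from inside the Python process itself, something along these lines works on Linux (a sketch of mine, not part of the answer; /proc is not available on Mac OS X):

import os

def count_open_fds():
    # Each entry in /proc/self/fd is a file descriptor currently held open
    # by this process (the listing itself briefly adds one descriptor).
    return len(os.listdir('/proc/self/fd'))

print(count_open_fds())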
Q Caron
0

If the given solutions did not help, first try to restart the machine (not just the terminal) and then try the solutions again. For me, this was enough:

ulimit -n 102400
Melih
-2

A minimal working example would be nice. I got the same results as ron rothman using the following script with Python 3.3.2, GCC 4.2.1 on Mac OS X 10.6.8. Do you get errors using it?

    import os, sys
    import resource

    # Mirror the question: soft limit 1000, hard limit left at "unlimited" (-1).
    resource.setrlimit(resource.RLIMIT_NOFILE, (1000, -1))
    pureResponseNames = ['f' + str(i) for i in range(434)]
    try:
        os.mkdir("testCase")
    except FileExistsError:
        print('Maybe the folder is already there.')
    outpathDirTest = "testCase/"
    # Open all 434 files at once, exactly as in the question.
    output_files = [open(os.path.join(outpathDirTest, fname) + ".txt", "w") for fname in pureResponseNames]
    for i, f in enumerate(output_files):
        f.write('This is a test of file nr.' + str(i))
        f.close()
Stefan Bollmann