I have about 1 million files (simulation outputs). I want to extract a specific piece of information from each of them and store it in a single file. I have a for loop that iterates over all 1M files, with a counter to track its progress. The script gets killed somewhere between iterations 875,000 and 900,000. I thought it might be a disk-space problem, but when I run df -h or df /, I have about 68G available. What are other possible reasons that a Python script may be killed? How can I investigate this further?

Shannon
  • Can you post the script that you are using? Are you closing the files after reading them? – Ma0 Nov 21 '17 at 08:40
  • What do you mean by "killed"? Do you mean its process just disappears with no error message? – BoarGules Nov 21 '17 at 08:42
  • This question doesn't seem to have anything to do with Python; it sounds like your process is being killed by the OS. So we ought to ask: What OS is it? And might this question be a duplicate of https://stackoverflow.com/questions/726690/who-killed-my-process-and-why ? – Chris Martin Nov 21 '17 at 09:00
  • @Ev.Kounis, yes, I am closing them. – Shannon Nov 24 '17 at 05:12
  • @BoarGules, yes, with no errors – Shannon Nov 24 '17 at 05:12
  • @ChrisMartin, thanks for the link. I guess it is a RAM problem (based on htop) – Shannon Nov 24 '17 at 05:13

3 Answers

On a Linux system, check the output of dmesg. If the process is getting killed by the kernel, there will be an explanation there. The most probable reasons: out of memory, or out of file descriptors.
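
If you want to automate that check, here is a minimal sketch in Python (assuming a Linux host where dmesg is on the PATH; some distributions restrict it to root) that scans the kernel log for out-of-memory kills:

```python
import subprocess

# Dump the kernel ring buffer and look for OOM-killer activity.
# `capture_output` requires Python 3.7+.
log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
for line in log.splitlines():
    if "Out of memory" in line or "oom-killer" in line:
        print(line)
```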

Guillaume

Usually, you get a Killed message when the program runs out of RAM (as opposed to hard-disk space, which you have in plenty). You should keep an eye on main memory: run top and watch the memory taken by your program, or alternatively use a tool like guppy (https://pypi.python.org/pypi/guppy/) to track memory utilization programmatically.
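
If you prefer the standard library over an external tool, here is a minimal sketch (assuming a Unix system, since the resource module is Unix-only) that logs peak memory use as the loop runs:

```python
import resource

# Peak resident set size of this process so far.
# On Linux ru_maxrss is in kilobytes (on macOS it is in bytes).
def peak_memory_kb():
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

for i in range(1_000_000):
    ...  # your per-file work goes here
    if i % 10_000 == 0:
        print(f"iteration {i}: peak memory so far {peak_memory_kb()} kB")
```

If the reported number climbs steadily toward your machine's RAM size, you have found the culprit.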

I would hazard a guess that you are building some big in-memory data structure while processing the files, and perhaps not de-allocating it as you iterate through them.
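
If so, the fix is to write each result out as soon as it is extracted instead of accumulating a million of them. A hypothetical sketch, where filenames and extract() stand in for your file list and parsing logic:

```python
# `filenames` and `extract()` are placeholders for your own file list
# and parsing logic. Each result is written immediately, so memory use
# stays flat no matter how many files there are.
with open("summary.txt", "w") as out:
    for i, path in enumerate(filenames):
        with open(path) as f:   # `with` guarantees the file is closed
            value = extract(f)
        out.write(f"{value}\n")
        if i % 10_000 == 0:
            print("processed", i)
```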

Vivek Pandey

A snippet of code would help. However, I presume you're loading each file into memory in one go, and since the files are huge, that might have bloated RAM completely, making the script die. If your use case is to get a particular line or value from every file, I would recommend using the re module to match the pattern and reading the files line by line.
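
For instance, a minimal sketch along those lines (the regex shown is a made-up example; adapt it to the value you actually need):

```python
import re

# Hypothetical pattern; reading line by line keeps at most one line
# of each file in memory at a time.
pattern = re.compile(r"energy\s*=\s*(\S+)")

def find_value(path):
    with open(path) as f:
        for line in f:
            match = pattern.search(line)
            if match:
                return match.group(1)
    return None  # pattern not found in this file
```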

Please also refer to syslog; on Ubuntu you can find it under /var/log/. syslog will give you hints about the possible reasons for the script's failure.
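
For example, a quick sketch (assuming Ubuntu's default /var/log/syslog; reading it may require appropriate permissions):

```python
# Scan syslog for kernel kill messages.
with open("/var/log/syslog") as log:
    for line in log:
        if "Killed process" in line or "Out of memory" in line:
            print(line.rstrip())
```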

Ajay2588