117

I am implementing Kosaraju's Strongly Connected Components (SCC) graph search algorithm in Python.

The program runs fine on small data sets, but when I run it on a super-large graph (more than 800,000 nodes), it crashes with "Segmentation fault".

What might be the cause of it? Thank you!


Additional info: first I got this error when running on the super-large data set:

"RuntimeError: maximum recursion depth exceeded in cmp"

Then I reset the recursion limit using

sys.setrecursionlimit(50000)

but got a 'Segmentation fault'.

Believe me, it's not an infinite loop; it runs correctly on relatively smaller data. Is it possible that the program exhausted some resource?

xiaolong
  • 13
    Maybe you can have a look at [CrashingPython](http://wiki.python.org/moin/CrashingPython) – Abhijit Apr 05 '12 at 20:32
  • 2
    Is this running in pure Python or are you using a C extension module? If it's pure Python then it's a bug there and congratulations. If you're using a C module, then the segfault is probably coming from there. – aaronasterling Apr 05 '12 at 20:32
  • It's pure Python. The program runs great on a relatively small data set, which made me think that the code is correct. – xiaolong Apr 05 '12 at 20:44
  • 2
    According to the Python documentation: "The highest possible limit is platform-dependent. A user may need to set the limit higher when she has a program that requires deep recursion and a platform that supports a higher limit. This should be done with care, because a too-high limit can lead to a crash." You didn't specify an OS. The reference to _crash_ might mean _segmentation fault_ on your OS. Try a smaller stack. But IIRC the algorithm you're using puts the entire SCC on the stack, so you may run out of stack. – James Thiele Apr 05 '12 at 21:17
  • @MattyW Yup. Later I translated the Python into C/C++ and didn't see the problem when I stored the graph as a global variable. Seems like Python relies more on the system stack. [see solution code](http://codehiker.wordpress.com/2012/04/06/kosarajus-scc/) – xiaolong Apr 08 '12 at 20:33

8 Answers

97

This happens when a Python extension (written in C) tries to access memory beyond its reach.

You can trace it in the following ways:

  • Add a `sys.settrace` call at the very first line of the code (see the sketch after this list).
  • Use gdb as described by Mark in this answer. At the command prompt:

    gdb python
    (gdb) run /path/to/script.py
    ## wait for segfault ##
    (gdb) backtrace
    ## stack trace of the c code
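
For the first suggestion, here is a minimal sketch of what such a trace hook might look like (the `trace_calls` function name and the output format are just illustrative, not part of the standard library). The last line printed before the crash shows which Python call was running when it happened:

    import sys

    def trace_calls(frame, event, arg):
        # Print every function call; the last line printed before the
        # crash points at the code that was executing when it died.
        if event == "call":
            code = frame.f_code
            print("%s:%d %s" % (code.co_filename, frame.f_lineno, code.co_name))
        return trace_calls

    sys.settrace(trace_calls)  # install before the code you want to trace
    # ... run the rest of the script here ...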
    
Shiplu Mokaddim
66

I understand you've solved your issue, but for others reading this thread, here is the answer: you have to increase the stack that your operating system allocates for the Python process.

The way to do it is operating-system dependent. On Linux, you can check your current value with `ulimit -s` and increase it with `ulimit -s <new_value>`.

Try doubling the previous value and continue doubling if it does not work, until you find one that does or run out of memory.

Davide
  • Also a good way to check if you are coming up against a ulimit max is to run `lsof` and use `grep` or `wc -l` to keep track of everything. – cdated Feb 05 '13 at 17:57
  • I concur. This actually worked for my Kosaraju's SCC implementation by fixing the segfault on both the Python and C++ implementations. For my Mac, I found out the possible maximum via `ulimit -s -H` (see my answer below). – Rock Nov 18 '16 at 03:59
  • 4
    note that the ulimit value is modified only for the particular shell it is executed in, so that you do not accidentally modify the value for your whole system – Tanmay Garg Dec 04 '16 at 06:40
  • 2
    I did this and ended up with `ulimit -s 16384`; however, after running I still got a segmentation fault. – Sreehari R Dec 29 '17 at 09:44
  • @SreehariR Try increasing it even more. However, it could also be an issue with a Python extension (if you are using any), which [this other answer](https://stackoverflow.com/a/10035594/25891) suggests how to debug. – Davide Dec 30 '17 at 03:20
28

A segmentation fault is a generic error; there are many possible reasons for it:

  • Low memory
  • Faulty RAM
  • Fetching a huge data set from the database with a query (if the size of the fetched data exceeds swap memory)
  • A wrong query / buggy code
  • Very deep recursion, e.g. a long recursive loop (see the sketch below)
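
For the last cause, a tiny illustrative sketch (the function and the numbers are made up for demonstration): raising the recursion limit tells CPython not to stop early, so a deep enough call chain can overrun the OS stack and crash instead of raising the usual "maximum recursion depth exceeded" error.

    import sys

    def depth(n):
        # Each level of recursion consumes stack space; with a very high
        # recursion limit, CPython may exhaust the OS stack and segfault
        # instead of stopping with a clean recursion-depth error.
        return 0 if n == 0 else 1 + depth(n - 1)

    sys.setrecursionlimit(1000000)
    # depth(900000)  # likely to crash on a default-sized (e.g. 8 MB) stack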
Sadheesh
5

Updating the ulimit worked for my Kosaraju's SCC implementation by fixing the segfault on both the Python (a Python segfault... who knew!) and C++ implementations.

For my Mac, I found out the possible maximum via:

$ ulimit -s -H
65532
Rock
  • How do you update that value? And what unit is that value in? – Pablo Apr 21 '20 at 22:41
  • 1
    If you KNOW what you need your limits to be (and you know your platform will never change away from Linux), you could execute that command from within your Python code. I personally have added it to my .bashrc file. – trumpetlicks Dec 10 '20 at 13:59
5

A Google search brought me to this question, and I did not see the following "personal solution" discussed.


My recent annoyance with Python 3.7 on Windows Subsystem for Linux is that on two machines with the same Pandas library, one gives me a segmentation fault and the other reports a warning. It was not clear which one was newer, but "re-installing" pandas solved the problem.

The command that I ran on the buggy machine:

conda install pandas

More details: I was running identical scripts (synced through Git) on two Windows 10 machines with WSL + Anaconda. On the machine where command-line Python complained about "Segmentation fault (core dumped)", JupyterLab simply restarted the kernel every single time. Worse still, no warning was given at all.



Update a few months later: I quit hosting Jupyter servers on the Windows machine. I now use WSL on Windows to forward remote ports opened on a Linux server and run all my jobs on the remote Linux machine. I have not experienced any execution errors for a good number of months :)

llinfeng
1

I was experiencing this segmentation fault after upgrading dlib on a Raspberry Pi. I traced back the stack as suggested by Shiplu Mokaddim above, and it settled on an OpenBLAS library.

Since OpenBLAS is also multi-threaded, using it in a multi-threaded application will multiply threads until a segmentation fault occurs. For multi-threaded applications, set OpenBLAS to single-threaded mode.

In a Python virtual environment, tell OpenBLAS to use only a single thread by editing:

    $ workon <myenv>
    $ nano .virtualenv/<myenv>/bin/postactivate

and add:

    export OPENBLAS_NUM_THREADS=1 
    export OPENBLAS_MAIN_FREE=1

After a reboot I was able to run all my image-recognition apps on the RPi 3B, which were previously crashing it.

reference: https://github.com/ageitgey/face_recognition/issues/294
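
If you are not using virtualenvwrapper's postactivate hook, an alternative sketch (assuming your own entry script controls the import order) is to set the variable in Python before NumPy/OpenBLAS is first imported, since it is only read when the library loads:

    import os

    # Must be set before NumPy (and thus OpenBLAS) is imported, otherwise
    # the OpenBLAS thread pool has already been created.
    os.environ["OPENBLAS_NUM_THREADS"] = "1"

    import numpy as np  # OpenBLAS should now run single-threaded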

Digitalf8
0

Looks like you are out of stack memory. You may want to increase it as Davide stated. To do it in Python code, you can run your "main()" in a new thread, because threads you create get their own stack whose size threading.stack_size() controls:

import sys
import threading

def main():
    pass  # write your code here

sys.setrecursionlimit(2097152)    # adjust numbers
threading.stack_size(134217728)   # for your needs

main_thread = threading.Thread(target=main)
main_thread.start()
main_thread.join()

Source: c1729's post on Codeforces. Running it with PyPy is a bit trickier.

Rustam A.
0

I'd run into the same error. I learned from another SO answer that you need to raise the recursion limit through the sys module and the stack limit through the resource module (a minimal sketch follows below).
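
For reference, a minimal sketch of that pattern on Linux/macOS (the numbers are just examples, and on some platforms the new stack limit only applies to threads created after the call, so you may still need the threading approach from another answer):

    import resource
    import sys

    # Raise this process's soft stack limit to the hard limit (Unix only),
    # then raise Python's own recursion limit to allow the deeper recursion.
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    sys.setrecursionlimit(10 ** 6)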

Aravind
  • Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – jahantaila Nov 12 '21 at 20:14