
I'm currently using PyCall to load a Python library for LZ-77-based data compression into Julia. The Python library is sweetsourcod, and I have it installed in my home directory. Within that library, I am using the module lempel_ziv for some entropy measurements. I load the Python module following PyCall's example. This is how I'm loading it into Julia:

using PyCall

sc = pyimport("sweetsourcod.lempel_ziv")

PyObject <module 'sweetsourcod.lempel_ziv' from '/Users/danielribeiro/sweetsourcod/sweetsourcod/lempel_ziv.cpython-38-darwin.so'>

The use of this Python library seems to be causing a segmentation fault within Julia; however, when I write the same code in Python, the segmentation fault does not take place. The following Julia example triggers the segmentation fault:

using PyCall

L = 1000000
nbins = [2*i for i = 1:2:15]
sc = pyimport("sweetsourcod.lempel_ziv")
# loop through all n
for n in nbins
    # loop through all configurations
    for i = 1:65
        # analogous to reading a configuration from scratch
        config = rand(0:255, L)
        # calculate entropy
        # 1.1300458785794484e6 --> cid of random sequence of same L
        entropy = sc.lempel_ziv_complexity(config, "lz77")[2] / 1.1300458785794484e6
    end
end

The line entropy = sc.lempel_ziv_complexity(config, "lz77")[2] / 1.1300458785794484e6 is what triggers the segfault. This is the minimal working example I was able to write in Julia that generates the segfault. The function lempel_ziv_complexity() compresses the array and returns a tuple with the LZ factors and the approximate size of the compressed file. When I write identical code in Python, the segfault is not triggered. This is the working example in Python:

import numpy as np
from sweetsourcod.lempel_ziv import lempel_ziv_complexity

L = 1000000
nbins = [2*i for i in range(1, 15, 2)]

for n in nbins:
    for i in range(1, 65, 1):
        config = np.random.randint(0, 256, L)
        entropy = lempel_ziv_complexity(config, "lz77")[1] / 1.1300458785794484e6

I suspect the segfault has to do with PyCall's internals, with which I am unfamiliar. I have also tried precompiling sweetsourcod into a module, as suggested in PyCall's README. Does anyone have any suggestions on how to address this issue? Thank you in advance!

  • In your Julia code the range is from 1 to 65, while in the Python code it is from 1 to 64. Should it be that way? – Przemyslaw Szufel Jun 26 '21 at 00:12
  • I'm experiencing it too and the message `free(): invalid pointer` seems to be almost always involved. The furthest I've been able to figure out is that it happens while freeing up memory of complex objects. – David Feb 21 '22 at 12:15
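
On the loop-bounds point raised in the first comment: Julia ranges include both endpoints, while Python's range excludes the stop value, so the two examples in the question do not make the same number of calls. The sketch below only counts iterations using the bounds as written in the question. It does not call sweetsourcod, and it does not by itself explain the segfault. The variable names are illustrative, not from the original code.

```python
# Julia's 1:65 includes 65; Python's range(1, 65) stops at 64.
julia_inner = list(range(1, 66))    # Python equivalent of Julia's `1:65`
python_inner = list(range(1, 65))   # as written in the Python example

# Julia's 1:2:15 includes 15; Python's range(1, 15, 2) stops at 13.
julia_nbins = [2 * i for i in range(1, 16, 2)]    # equivalent of `[2*i for i = 1:2:15]`
python_nbins = [2 * i for i in range(1, 15, 2)]   # as written in the Python example

# Total number of lempel_ziv_complexity calls each version would make.
julia_calls = len(julia_nbins) * len(julia_inner)
python_calls = len(python_nbins) * len(python_inner)

print("Julia-equivalent calls:", julia_calls)
print("Python calls as written:", python_calls)
```

For a like-for-like comparison, the Python version could use range(1, 16, 2) and range(1, 66) so that both programs exercise the library the same number of times before concluding that only the Julia side crashes.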
