
I'm maintaining a project that uses a number of Python libraries such as numpy, pandas, and netcdf4, which in turn depend on native libraries such as libhdf5, ATLAS, and LAPACK. I installed these native libraries through my system package manager before installing the Python packages with pip. I now need to list every dependency the project requires, including the C/Fortran ones. (The Python side is easy with pip freeze and pipdeptree.) Is there any way to show which linked C/Fortran libraries are being used? Failing that, is there any way to show the build options of the Python libraries that have C dependencies?

EDIT: this answer details how to do this for numpy and perhaps other libraries with C dependencies through ldd. What's the recommended approach across the board?

benjwadams

1 Answer


On Linux systems you can have the dynamic linker dump debug information from which this can be gathered (see ld.so(8)). For example, I have a Python program called plot_all, and if I invoke it as:

LD_DEBUG=libs plot_all 2> ld-libs-output

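then the dynamic linker writes its library-loading information to the file ld-libs-output (the debug output goes to stderr, hence the 2> redirect). This covers every shared library actually loaded during that particular run, so code paths that were not exercised will not show up. If further processed, e.g.: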

grep "calling init" ld-libs-output | cut -f3 -d: | sort | uniq > LDLibs

you will get a sorted list of all unique libraries loaded in the course of executing the Python script. To turn this into dependency information, use your distribution's tools for mapping files to packages. On Gentoo, I can query the packages that own these libraries with a command like:

equery -q b -n $(cat LDLibs) | sort | uniq

The output of this command is a sorted list of all the packages that own at least one of the libraries dynamically loaded during my script execution:

app-arch/bzip2
app-arch/lz4
app-arch/xz-utils
dev-lang/python
dev-libs/expat
dev-libs/glib
dev-libs/icu
dev-libs/libffi
dev-libs/libpcre
dev-libs/libxml2
dev-libs/openssl
dev-python/h5py
dev-python/matplotlib
dev-python/mpi4py
dev-python/numpy
dev-python/pillow
dev-python/PyQt5
dev-python/sip
dev-qt/qtcore
dev-qt/qtdbus
dev-qt/qtgui
dev-qt/qtsvg
dev-qt/qtwidgets
media-gfx/graphite2
media-libs/fontconfig
media-libs/freetype
media-libs/harfbuzz
media-libs/jpeg
media-libs/libpng
media-libs/openjpeg
media-libs/tiff
sci-libs/blas-reference
sci-libs/cblas-reference
sci-libs/hdf5
sci-libs/lapack-reference
sci-libs/szip
sys-apps/attr
sys-apps/dbus
sys-apps/hwloc
sys-apps/systemd
sys-apps/util-linux
sys-cluster/openmpi
sys-devel/gcc
sys-libs/glibc
sys-libs/libcap
sys-libs/zlib
sys-process/numactl
x11-drivers/nvidia-drivers
x11-libs/libICE
x11-libs/libpciaccess
x11-libs/libSM
x11-libs/libX11
x11-libs/libXau
x11-libs/libxcb
x11-libs/libXcursor
x11-libs/libXdmcp
x11-libs/libXext
x11-libs/libXfixes
x11-libs/libXi
x11-libs/libxkbcommon
x11-libs/libXrender
x11-libs/xcb-util
x11-libs/xcb-util-image
x11-libs/xcb-util-keysyms
x11-libs/xcb-util-renderutil
x11-libs/xcb-util-wm

This list is quite verbose: it reaches deep into the dependency tree, well past what we really need, and includes some packages that are environment-dependent (e.g. the dependency on nvidia-drivers would not apply to someone without an NVIDIA graphics card). To turn it into a useful list, look at the dependency graph and depend only on the top-level packages, since those implicitly pull in the ones below them. Analyzing the first-level dependencies of these packages, all of them can be pulled in with a minimum list of:

dev-python/h5py
dev-python/matplotlib
dev-python/pillow
sys-libs/glibc
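
The reduction to top-level packages can be sketched in Python. This is only an illustration under stated assumptions: the function name top_level is made up, the dependency edges in the example are hypothetical rather than real Gentoo metadata, and the graph is assumed to be acyclic. It also follows dependencies transitively, which is stricter than the first-level analysis described above, so it may drop a package (such as glibc) that you would still want to list explicitly.

```python
def top_level(packages, deps):
    """Return the packages that no other package in the set pulls in.

    packages: iterable of package names found for the script
    deps:     dict mapping a package name to the names it depends on
              (hypothetical data; obtain it from your distro's tools)
    """
    packages = set(packages)
    implied = set()
    for pkg in packages:
        # Walk everything pkg pulls in, directly or transitively.
        stack = list(deps.get(pkg, ()))
        seen = set()
        while stack:
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            stack.extend(deps.get(dep, ()))
        implied |= seen
    # Keep only packages that nothing else in the set implies.
    return sorted(packages - implied)


# Hypothetical dependency edges, for illustration only.
example_deps = {
    "dev-python/h5py": ["dev-python/numpy", "sci-libs/hdf5"],
    "dev-python/numpy": ["sci-libs/lapack-reference"],
    "sci-libs/hdf5": ["sys-libs/zlib"],
}
example_packages = [
    "dev-python/h5py",
    "dev-python/numpy",
    "dev-python/pillow",
    "sci-libs/hdf5",
    "sci-libs/lapack-reference",
    "sys-libs/zlib",
]
print(top_level(example_packages, example_deps))
```

With the toy graph above, only h5py and pillow remain, because every other package is reachable from h5py.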

I would then repeat this for every other script in my Python package and consolidate all of the information into a master dependency list for the package.
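
If several scripts are traced this way, the per-script linker output can be merged before querying the package manager. The following is a minimal sketch, assuming each file was produced with LD_DEBUG=libs as shown earlier and that loaded libraries appear on "calling init" lines; the function names are made up for this example.

```python
from pathlib import Path


def loaded_libraries(text):
    """Extract library paths from the 'calling init' lines of LD_DEBUG=libs output."""
    libs = set()
    for line in text.splitlines():
        if "calling init" not in line:
            continue
        # Line shape: "<pid>:<tab>calling init: <path>". Splitting at most
        # twice keeps the whole path, even if it contains a colon.
        fields = line.split(":", 2)
        if len(fields) == 3:
            libs.add(fields[2].strip())
    return libs


def consolidate(outputs):
    """Merge the libraries found in several LD_DEBUG outputs into one sorted list."""
    merged = set()
    for text in outputs:
        merged |= loaded_libraries(text)
    return sorted(merged)


def consolidate_files(paths):
    """Same as consolidate(), but reads the outputs from files on disk."""
    return consolidate(Path(p).read_text() for p in paths)
```

For example, consolidate_files(["ld-libs-output", "other-script-output"]) would return one deduplicated list covering both runs, where the second file name is a placeholder.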


This should give you an idea of a general workflow for discovering the distro packages that a Python script depends on. In my case, all C/Fortran dependencies external to Python are brought in by the distro's Python packages, but this process would have discovered any other top-level packages needed. You will need to adapt the workflow to your distro's tools for matching files to packages and analyzing dependencies.
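
As a rough sketch of how the file-to-package step might be adapted, the helper below wraps a per-distro lookup command. Everything here is an assumption for illustration: the TOOLS table and the owners function are invented names, the Debian and RPM entries use dpkg -S and rpm -qf, and the output parsing handles only the simple one-owner case. Files that no package owns (for example shared objects bundled inside a pip wheel) are skipped.

```python
import subprocess

# Assumed lookup commands per distro family; each takes a file path as its
# last argument. The second item turns one output line into a package name.
TOOLS = {
    "gentoo": (["equery", "-q", "b", "-n"], str.strip),
    # dpkg -S prints "package: /path"; keep the part before the last ": ".
    "debian": (["dpkg", "-S"], lambda line: line.rsplit(": ", 1)[0]),
    # rpm -qf prints the owning package on its own line.
    "rpm": (["rpm", "-qf"], str.strip),
}


def owners(paths, distro, run=subprocess.run):
    """Return the sorted set of packages owning the given files.

    run is injectable so the function can be exercised without the real tools.
    """
    command, parse = TOOLS[distro]
    found = set()
    for path in paths:
        result = run(command + [path], capture_output=True, text=True)
        if result.returncode != 0:
            continue  # not owned by any package known to the package manager
        for line in result.stdout.splitlines():
            if line.strip():
                found.add(parse(line))
    return sorted(found)
```

On a Debian-like system, owners(library_paths, "debian") would then play the role of the equery call above, with library_paths being the list of loaded libraries.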

casey