
I'm packaging up a minimal Ubuntu distro to fit in a 4GB disk image, for use on a VPS. This image is a (C++) webapp which (among other things) writes and runs simple Python scripts to handle conversions between csv and xls files, with csvkit and XlsxWriter doing the heavy lifting. My entire Python knowledge is unfortunately limited to writing and running these scripts.

Problem: I install pip in the image to handle the download and install of csvkit and XlsxWriter. This creates a huge amount of cruft, including what seems to be a C++ development environment, just to install what I imagine (presumably incorrectly) is simply Python source code. I can't really afford this in a 4GB distribution.

Is there a lightweight alternative to using pip to do this? Can I just copy over a handful of files from the dev machine, for example? I suppose one alternative is simply to uninstall pip after use, but I'd rather keep the disk image clean if possible (if nothing else, it will compress better).

QF0
  • XlsxWriter doesn't have any dependencies, so it can be installed from source using 'python setup.py install'. See the XlsxWriter [installation docs](https://xlsxwriter.readthedocs.io/getting_started.html). However, csvkit has a number of [dependencies](https://github.com/wireservice/csvkit/blob/master/setup.py) and may prove difficult to install from source, which is why it is best to stick with pip, since it handles them for you. Either way, I don't think pip should be that big an install. Maybe try installing it via [get-pip](https://github.com/pypa/get-pip). – jmcnamara Jan 25 '22 at 14:46

2 Answers


If you are using Python 3.4 or newer, you can use ensurepip from the standard library. It installs pip if pip was not installed alongside Python. After running

python -m ensurepip

you should be able to use pip as if it was installed together with python.
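Some distribution builds of Python leave ensurepip out of the default install, so it may be worth checking what the interpreter can actually import before relying on it. A minimal sketch using only the standard library (the helper name `module_available` is made up for illustration):

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be located by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Report whether ensurepip and pip are present in this Python install.
    for name in ("ensurepip", "pip"):
        status = "available" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

If ensurepip is reported as missing, `python -m ensurepip` will fail with an import error and pip has to come from somewhere else.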

Daweo
  • Thanks - just got back to this - unfortunately, it doesn't work. An Ubuntu 20.04 server install gets Python 3.8.10, but `python3 -m ensurepip` just returns 'no module named ensurepip'. Removing `pip` after a manual install (`apt install python3-pip`) only cleans up a couple of MB and leaves gcc installed, unsurprisingly. – QF0 Jan 31 '22 at 19:56

XlsxWriter doesn't have any dependencies, so it can be installed from source using `python setup.py install`. See the XlsxWriter [installation docs](https://xlsxwriter.readthedocs.io/getting_started.html).

However, csvkit has a number of [dependencies](https://github.com/wireservice/csvkit/blob/master/setup.py) and may prove difficult to install from source. That is why it is best to stick with pip, since it handles them for you, as I'm sure you know.
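To see how much each package pulls in, the requirements it declares can be read from its installed metadata on the dev machine. This is a rough sketch, assuming Python 3.8 or newer and that the packages are already installed there. The helper name `declared_requirements` is made up for illustration.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares, or None if it is not installed."""
    try:
        # requires() returns None when the package declares no requirements.
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("XlsxWriter", "csvkit"):
        reqs = declared_requirements(name)
        if reqs is None:
            print(f"{name}: not installed")
        elif not reqs:
            print(f"{name}: no declared requirements")
        else:
            print(f"{name}:")
            for req in reqs:
                print(f"  {req}")
```

This lists only direct requirements. Each of those may have requirements of its own, which is the part pip resolves for you.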

Maybe try installing pip via [get-pip](https://github.com/pypa/get-pip), which should give the smallest possible bootstrapping installation:

$ curl -sSL https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python get-pip.py
jmcnamara