Parse output of pip list / pip freeze in python

Question

Hello, I have a string like this:

AdvancedHTMLParser (8.0.1)\nappdirs (1.4.3)\nbeautifulsoup4 (4.6.0)\nchardet (3.0.4)\nchrome-gnome-shell (0.0.0)\ncupshelpers (1.0)\ncycler (0.10.0)\nCython (0.27.3)

I want to split this into a list of tuples, so that each list item is a tuple with two values: the name and the version (without the brackets).

I was only able to split the string by newline, but I don't know how to properly grab the numbers in the brackets. Can someone explain how I can do this?

EDIT: I am trying to parse the output of pip list --local:

 def get_installed_modules(self):
    data = subprocess.check_output(["pip", "list", "--local"])
    result = [tuple(line.replace('(', '').replace(')', '').split())
              for line in data.splitlines()]
    print(result)

The problem is that I can't just split the string; it requires a bytes-like object:

TypeError: a bytes-like object is required, not 'str'

If you are trying to parse the output of `pip freeze`, you can do that programmatically through Python. — cs95, Dec 18 '17 at 13:56

score 5 · Answer 1 · answered Nov 02 '18 at 12:26

The accepted answer no longer works with recent versions of pip (10.0 and later).

All those methods are now in private packages. For example, the freeze module now lives in _internal/operations. You can still use it, but I don't think relying on internal packages is a good idea: they may be moved or changed without notice in a new version.

What you can do instead is keep using the pip CLI, pass the --format json option to get machine-readable output, and parse that in Python.

 import subprocess
 import json
 data = subprocess.check_output(["pip", "list", "--format", "json"])
 parsed_results = json.loads(data)
 [(element["name"], element["version"]) for element in parsed_results]
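A minimal sketch of the same idea wrapped in functions, in case it helps. The names `parse_pip_json` and `list_installed` are hypothetical, and running pip as `sys.executable -m pip` is an assumption made here so that the pip queried belongs to the interpreter running the script (for example, the one inside an active virtualenv).

```python
import json
import subprocess
import sys


def parse_pip_json(payload):
    """Turn the JSON text (str or bytes) printed by `pip list --format json`
    into a list of (name, version) tuples."""
    return [(pkg["name"], pkg["version"]) for pkg in json.loads(payload)]


def list_installed(local_only=True):
    """Run pip for the current interpreter and parse its JSON output."""
    cmd = [sys.executable, "-m", "pip", "list", "--format", "json"]
    if local_only:
        cmd.append("--local")
    return parse_pip_json(subprocess.check_output(cmd))


# Parsing a hand-written sample, so the example does not depend on pip itself.
sample = '[{"name": "appdirs", "version": "1.4.3"}, {"name": "Cython", "version": "0.27.3"}]'
print(parse_pip_json(sample))
# [('appdirs', '1.4.3'), ('Cython', '0.27.3')]
```

Calling `list_installed()` should return the same kind of list for the real environment, provided pip is installed for that interpreter.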

cs95 · Accepted Answer · 2017-12-18T14:22:37.327

Option 1
If you're getting these outputs from pip, you can do it programmatically, using pip.operations.freeze -

from pip.operations import freeze  

modules = list(
    map(lambda x: x.split('=='), freeze.freeze(local_only=True))
)

print(modules)

[['aiodns', '1.1.1'],
 ['aiohttp', '1.2.0'],
 ['appdirs', '1.4.0'],
 ['appnope', '0.1.0'],
 ['argparse', '1.4.0'],
...
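If all you have is the text printed by `pip freeze` rather than the freeze module, the same `==` split can be applied line by line. The sketch below is an illustration, not part of the original answer: `parse_freeze_lines` is a hypothetical helper, the sample text is made up, and lines without `==` (such as editable `-e` installs) are simply skipped.

```python
def parse_freeze_lines(text):
    """Parse 'name==version' lines into (name, version) tuples.

    Blank lines, comments, and lines without '==' (for example
    editable installs starting with '-e') are skipped.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-e")) or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pairs.append((name, version))
    return pairs


# Made-up sample; the URL is a placeholder, not a real repository.
sample = (
    "aiodns==1.1.1\n"
    "aiohttp==1.2.0\n"
    "-e git+https://example.invalid/repo.git#egg=demo\n"
    "appdirs==1.4.0\n"
)
print(parse_freeze_lines(sample))
# [('aiodns', '1.1.1'), ('aiohttp', '1.2.0'), ('appdirs', '1.4.0')]
```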

Option 2
You could also use get_installed_distributions, taken from here:

import pip

modules = []
for i in pip.utils.get_installed_distributions():
    modules.append((i.key, i.version))

print(modules)

[('pytreebank', '0.2.4'),
 ('cssselect', '1.0.1'),
 ('numba', '0.36.0.dev0+92.g2818dc9e2'),
 ('llvmlite', '0.0.0'),
 ('yarl', '0.8.1'),
 ('xlwt', '1.3.0'),
 ('xlrd', '1.1.0'),
 ...
]

Option 3
A third method is using pip.main -

import pip
pip.main(['list', '--local'])

However, this writes to stdout.
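For readers who want to avoid importing pip altogether, the standard library offers a different route. This is not one of the options above: it is a sketch that assumes Python 3.8 or newer, where `importlib.metadata` is available, and `installed_packages` is a hypothetical helper name.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) tuples for the distributions
    visible to the current interpreter."""
    pairs = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name is None:
            # Skip distributions whose metadata is incomplete.
            continue
        pairs.append((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(name, version)
```

Because this reads metadata for whatever interpreter runs the script, running it with a virtualenv's Python reports the packages that interpreter can see.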

edited Dec 18 '17 at 14:22

answered Dec 18 '17 at 14:14


Perfect ! Thanks! – n00b.exe Dec 18 '17 at 14:29
@user4042470 You're welcome. To be sure, please try them all out and let me know which one _didn't_ work. Cheers. – cs95 Dec 18 '17 at 14:29
Is there also a way to check this for a certain virtualenv? Since freeze.freeze is for the local packages outside the virtualenv, I believe? – n00b.exe Dec 18 '17 at 14:31
@user4042470 Maybe you need to pass `isolated=True` to `freeze`? – cs95 Dec 18 '17 at 14:33
Unfortunately, there are no docs on using pip programmatically, so I don't even know what half the arguments do. There must be a way, but I'm not really sure about it. – cs95 Dec 18 '17 at 14:34
All those methods have been moved starting from pip version > 10. I provided another answer below. – Fabian Nov 02 '18 at 12:28

score 2 · Answer 3 · answered Dec 18 '17 at 13:48

You can also use regular expressions:

>>> import re
>>> s = "AdvancedHTMLParser (8.0.1)\nappdirs (1.4.3)\nbeautifulsoup4 (4.6.0)\nchardet (3.0.4)\nchrome-gnome-shell (0.0.0)\ncupshelpers (1.0)\ncycler (0.10.0)\nCython (0.27.3)"
>>> re.findall(r"(.+) \((.+)\)", s)
[('AdvancedHTMLParser', '8.0.1'),
 ('appdirs', '1.4.3'),
 ('beautifulsoup4', '4.6.0'),
 ('chardet', '3.0.4'),
 ('chrome-gnome-shell', '0.0.0'),
 ('cupshelpers', '1.0'),
 ('cycler', '0.10.0'),
 ('Cython', '0.27.3')]
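A slightly stricter variant, offered as a sketch rather than a correction: anchoring the pattern to whole lines with `re.MULTILINE` and excluding `)` from the version group means stray lines that do not look like `name (version)` are ignored. The pattern name `LINE_RE` and the extra header line in the sample are assumptions for illustration.

```python
import re

# Sample with one line that is not in 'name (version)' form.
s = (
    "Some header line\n"
    "AdvancedHTMLParser (8.0.1)\n"
    "appdirs (1.4.3)\n"
    "chrome-gnome-shell (0.0.0)\n"
    "cupshelpers (1.0)"
)

# ^ and $ match at each line boundary because of re.MULTILINE.
LINE_RE = re.compile(r"^(\S+) \(([^)]+)\)$", re.MULTILINE)

pairs = LINE_RE.findall(s)
print(pairs)
# [('AdvancedHTMLParser', '8.0.1'), ('appdirs', '1.4.3'),
#  ('chrome-gnome-shell', '0.0.0'), ('cupshelpers', '1.0')]
```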

Found a mistake, let me fix it... Fixed; you can retract your downvote now (if that was the reason) — tobias_k, Dec 18 '17 at 13:50

score 1 · Answer 4 · answered Dec 18 '17 at 13:47

Straightforwardly:

data = 'AdvancedHTMLParser (8.0.1)\nappdirs (1.4.3)\nbeautifulsoup4 (4.6.0)\nchardet (3.0.4)\nchrome-gnome-shell (0.0.0)\ncupshelpers (1.0)\ncycler (0.10.0)\nCython (0.27.3)'
result = [tuple(line.replace('(', '').replace(')', '').split())
          for line in data.splitlines()]

print(result)

The output:

[('AdvancedHTMLParser', '8.0.1'), ('appdirs', '1.4.3'), ('beautifulsoup4', '4.6.0'), ('chardet', '3.0.4'), ('chrome-gnome-shell', '0.0.0'), ('cupshelpers', '1.0'), ('cycler', '0.10.0'), ('Cython', '0.27.3')]
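The `TypeError` from the question's EDIT most likely comes from the fact that, on Python 3, `subprocess.check_output` returns `bytes`, so `line.replace('(', '')` mixes bytes with str. A minimal sketch of one fix is to decode first. The helper name `parse_legacy_pip_list` and the UTF-8 encoding are assumptions here, and the input is expected in the `name (version)` form shown in the question.

```python
def parse_legacy_pip_list(raw):
    """Parse 'name (version)' lines, given as bytes or str,
    into a list of (name, version) tuples."""
    # check_output returns bytes on Python 3; decode before using str methods.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    result = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, version = line.partition(" (")
        result.append((name, version.rstrip(")")))
    return result


# Bytes input, as subprocess.check_output would return it.
sample = b"AdvancedHTMLParser (8.0.1)\nappdirs (1.4.3)\ncupshelpers (1.0)"
print(parse_legacy_pip_list(sample))
# [('AdvancedHTMLParser', '8.0.1'), ('appdirs', '1.4.3'), ('cupshelpers', '1.0')]
```

Alternatively, passing `universal_newlines=True` to `subprocess.check_output` makes it return a str directly, in which case the list comprehension above works unchanged.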

RomanPerekhrest


Thanks for the replies. – n00b.exe Dec 18 '17 at 13:56
When I try to query the data directly and split it like this: `data = subprocess.check_output(["pip", "list", "--local"]) result = [tuple(line.replace('(', '').replace(')', '').split()) for line in data.splitlines()] print(result)` I get the exception `TypeError: a bytes-like object is required, not 'str'`. What can I do to fix this? I thought data should be a string. – n00b.exe Dec 18 '17 at 13:57
No, I use Python 3.6 – n00b.exe Dec 18 '17 at 14:06

score 1 · Answer 5 · answered Dec 18 '17 at 13:48

Split each line on the opening paren & remove the closing one:

self.__all_modules = [tuple(x[:-1].split(" (")) for x in data.splitlines()]

Scott Hunter

