I can do !pip list to see a list of all the packages.

I can do this to count all the subfolders in the Python 3.7 dist-packages folder:

import os
containing_folder = '/usr/local/lib/python3.7/dist-packages'


f = []
for (dirpath, dirnames, filenames) in os.walk(containing_folder):
    f.extend(dirnames)
    break

print('there are', len(f), 'folders in the python 3.7 module')

But the number of folders does not equal the number of modules, as there appear to be more files than modules.

So how can I identify all the modules (and not folders)? (i.e. count only the pip-installed folders.)

D.L
  • Why can't you just use the results of running `pip list`? And why do you care how many modules are installed? – CryptoFool Oct 22 '22 at 21:31
  • Hi @CryptoFool. Because I would need to manually count each one from the list. That is essentially my question... does `!pip list` return a list, or can I get the pip list as a **list**? – D.L Oct 22 '22 at 21:36
  • `!pip list | nl` adds numbers. – tripleee Oct 22 '22 at 21:37
  • @tripleee, this works, so if one wanted to programmatically get the last value of that list, could that be done ? – D.L Oct 22 '22 at 21:46
  • @D.L - yes, it should work fine if you use the `subprocess` module to run `pip`. What I was trying was the same thing... `pip list | wc -l`. If you do use `subprocess` for this, make sure you add `shell=True` as a parameter. Your command needs to run via a shell for the pipe to work. – CryptoFool Oct 22 '22 at 22:14

2 Answers

Regular Python packages are marked by a file named __init__.py in their directory (namespace packages, which have no such file, are not covered here). So if you want to count packages in terms of that formal definition, you can just count the number of files you find with this name. Here's code that will do that:

import os
containing_folder = '/usr/local/lib/python3.7/dist-packages'

f = []
for (dirpath, dirnames, filenames) in os.walk(containing_folder):
    if '__init__.py' in filenames:
        f.append(os.path.basename(dirpath))

print(f)

print('there are', len(f), 'packages (including subpackages) under', containing_folder)

If you just want to count the number of packages at the first level of a directory, which is probably what you want, here's code that does that:

import os
containing_folder = '/usr/local/lib/python3.7/dist-packages'

r = []
for entity in os.listdir(containing_folder):
    # A top-level package is a directory that directly contains __init__.py
    package_dir = os.path.join(containing_folder, entity)
    if os.path.isdir(package_dir) and os.path.exists(os.path.join(package_dir, '__init__.py')):
        r.append(entity)

print(len(r))

When I ran this code on one of my Python installs and compared it with what I get from pip list | wc -l on that same install, I got almost the same result: 125 for the Python code, 129 for pip.
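
The small gap is not surprising: pip list reports installed distributions rather than importable directories, its default output starts with two header lines that wc -l also counts, and a distribution that ships a single .py file has no directory to find. A minimal sketch of counting distributions directly, assuming Python 3.8 or newer where importlib.metadata is in the standard library (on 3.7 the importlib_metadata backport would be needed); the helper name installed_distributions is made up for illustration:

```python
from importlib import metadata  # standard library since Python 3.8


def installed_distributions():
    """Return a sorted list of unique distribution names visible to this interpreter."""
    names = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries with broken or missing metadata
            names.add(name.lower())
    return sorted(names)


names = installed_distributions()
print(len(names), "distributions installed")
```

This looks at the environment of the interpreter that runs it, so the count should be close to pip list for that same interpreter, though the two are not guaranteed to agree exactly.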

CryptoFool
  • I think the answer will be something like this, but there appear to be `5729` files when I run the above... which seems to be way too many. – D.L Oct 22 '22 at 21:42
  • I'm pretty sure that the code is doing what I say it's doing. I ran `find . -name "__init__.py" | wc -l` on the same target directory and got the exact same result as what Python gave me. I also got a number that seemed pretty high. But I can't think of a way that both the Python code and the `find` command could be getting it wrong in exactly the same way. Now, if you choose to define what a "package" is differently, then you need another mechanism per that definition. That's why I said "count packages in terms of that formal definition". – CryptoFool Oct 22 '22 at 21:50
  • I hear what you are saying. My _"definition of a package"_ is the result of `!pip list`, assuming that it is a valid definition... – D.L Oct 22 '22 at 21:55
  • I just wrote the list that is generated to a file and took a look at it. It seems right to me. Larger packages have a TON of subpackages. Pandas, for example, has 147 sub-packages. If you only want the primary modules, you could look for `__init__.py` files at just one level of directory depth, or something like that, or possibly stop when you first find a `__init__.py` file. – CryptoFool Oct 22 '22 at 21:55
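
The idea in the last comment, stopping at the first __init__.py on each branch, can be sketched with os.walk by emptying dirnames in place so the walk does not descend further. This is only an illustration; the function name top_level_packages is hypothetical and the path is the one from the question:

```python
import os


def top_level_packages(root):
    """Return directories under root that contain __init__.py, without descending into them."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "__init__.py" in filenames:
            found.append(dirpath)
            dirnames[:] = []  # prune in place: do not walk into subpackages
    return sorted(found)


containing_folder = '/usr/local/lib/python3.7/dist-packages'
print(len(top_level_packages(containing_folder)), 'outermost packages found')
```

Unlike the one-level os.listdir version, this also finds packages nested inside plain directories that have no __init__.py of their own.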

You could use pip programmatically (though frankly, finding the documentation for this is a bit daunting). In theory, you could import pip and use its internal APIs to fetch the information you want, but the documentation discourages this, and instead suggests you use pip like any non-native external utility.

import subprocess

packages = subprocess.run(
    ['pip', 'list'],
    check=True, text=True, capture_output=True)
# One line of output per entry; the default column format begins with two header lines
packagelist = packages.stdout.splitlines()

There are many details about subprocess which could benefit from a more detailed explanation, but that has been done many times before. For this discussion, mainly note that passing the first argument as a list of tokens is required on Unix-like systems when you want to avoid shell=True (which you do, whenever you can).
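
If the goal is a real Python list rather than lines of text, one option is to ask pip for JSON output and parse it. A rough sketch, assuming pip is installed for the interpreter running the code; the helper names parse_pip_json and list_installed are made up for illustration:

```python
import json
import subprocess
import sys


def parse_pip_json(text):
    """Turn the output of `pip list --format=json` into a list of package names."""
    return [entry["name"] for entry in json.loads(text)]


def list_installed():
    """Run pip for the current interpreter and return the installed package names."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--format=json"],
        check=True, text=True, capture_output=True)
    return parse_pip_json(result.stdout)
```

With this, len(list_installed()) gives the count without any header lines to subtract, and running pip through sys.executable makes sure the pip that answers belongs to the same Python that is asking.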

tripleee