I have yet to come across an answer that makes me WANT to start using virtual environments. I understand how they work, but what I don’t understand is how can someone (like me) have hundreds of Python projects on their drive, almost all of them use the same packages (like Pandas and Numpy), but if they were all in separate venv’s, you’d have to pip install those same packages over and over and over again, wasting so much space for no reason. Not to mention if any of those also require a package like tensorflow.
The only real benefit I can see to using venv’s in my case is to mitigate version issues, but for me, that’s really not as big of an issue as it’s portrayed. Any project of mine that becomes out of date, I update the packages for it.
Why install the same dependency for every project when you can just do it once for all of them on a global scale? I know you can also specify —-global-dependencies or whatever the tag is when creating a new venv, but since ALL of my python packages are installed globally (hundreds of dependencies are pip installed already), I don’t want the new venv to make use of ALL of them? So I can specify only specific global packages to use in a venv? That would make more sense.
What else am I missing?
UPDATE
I’m going to elaborate and clarify my question a bit as there seems to be some confusion.
I’m not so much interested in understanding HOW venv’s work, and I understand the benefits that can come with using them. What I’m asking is:
Why would someone with (for example) have 100 different projects that all require tensorflow to be installed into their own venv’s. That would mean you have to install tensorflow 100 separate times. That’s not just a “little” extra space being wasted, that’s a lot.
I understand they mitigate dependency versioning issues, you can “freeze” packages with their current working versions and can forget about them, great. And maybe I’m just unique in this respect, but the versioning issue (besides the obvious difference between python 2 and 3) really hasn’t been THAT big of an issue. Yes I’ve run into it, but isn’t it better practise to keep your projects up to date with the current working/stable versions than to freeze them with old, possibly no longer supported versions? Sure it works, but that doesn’t seem to be the “best” option to me either.
To reiterate on the second part of my question, what I would think is, if I have (for example) tensorflow installed globally, and I create a venv for each of my 100 tensorflow projects, is there not a way to make use of the already globally installed tensorflow inside of the venv, without having to install it again? I know in pycharm and possibly the command line, you can use a — system-site-packages argument (or whatever it is) to make that happen, but I don’t want to include ALL of the globally installed dependencies, cuz I have hundreds of those too. Is —-system-site-packages -tensorflow for example a thing?
Hope that helps to clarify what I’m looking for out of this discussion because so far, I have no use for venv’s, other than from everyone else claiming how great they are but I guess I see it a bit differently :P
(FINAL?) UPDATE
From the great discussions I've had with the contributors below, here is a summation of where I think venv's are of benefit and where they're not:
USE a venv:
- You're working on one BIG project with multiple people to mitigate versioning issues among the people
- You don't plan on updating your dependencies very often for all projects
- To have a clearer separation of your projects
- To containerize your project (again, for distribution)
- Your portfolio is fairly small (especially in the data science world where packages like Tensorflow are large and used quite frequently across all of them as you'd have to pip install the same package to each venv)
DO NOT use a venv:
- Your portfolio of projects is large AND requires a lot of heavy dependencies (like tensorflow) to mitigate installing the same package in every venv you create
- You're not distributing your projects across a team of people
- You're actively maintaining your projects and keeping global dependency versions up to date across all of them (maybe I'm the only one who's actually doing this, but whatever)
As was recently mentioned, I guess it depends on your use case. Working on a website that requires contribution from many people at once, it makes sense to all be working out of one environment, but for someone like me with a massive portfolio of Tensorflow projects, that do not have versioning issues or the need for other team members, it doesn't make sense. Maybe if you plan on containerizing or distributing the project it makes sense to do so on an individual basis, but to have (going back to this example) 100 Tensorflow projects in your portfolio, it makes no sense to have 100 different venv's for all of them as you'd have to install tensorflow 100 times into each of them, which is no different than having to pip install tensorflow==2.2.0
for specific old projects that you want to run, which in that case, just keep your projects up to date.
Maybe I'm missing something else major here, but that's the best I've come up with so far. Hope it helps someone else who's had a similar thought.