I have been learning data science using Python for about a year now. I have become quite comfortable with the syntax and model creation. I have exclusively used Google Colab because of how convenient it is, and I love the notebook style. However, one thing I do not understand is environments. Although I use Colab, I do have Python and Anaconda on my machine and have installed various packages using exactly this format: pip install (package name). When I open my terminal, the prompt is prefixed with (base), and when I check the Environments tab in Anaconda Navigator, it appears that I installed all of these packages into an environment named base (root). Is that right? If so, what would my environment's name be? What is a base environment compared to a venv?

I am asking because, if I ever decide to use an IDE in the future, I would need to set my environment to be able to run packages, correct?

Just for fun, I want to try using R and its reticulate package, which lets you use Python from R. As stated in the answer to this question, I need to set my virtual environment before I can use Python in R. Would my virtual environment be base (root)?

I'm a complete noob about all of this environment stuff. Again, I just opened my terminal and typed pip install (package name) for all packages I've installed. Thanks for any help in advance.

ipj
bismo

1 Answer

So from your description, it sounds like your default Python installation on your computer is through Anaconda. If that's the case, base is actually going to be the name of the conda virtual environment that you're using.

Virtual environments can be tricky, so I'll walk you through what I usually do here.

First, you can always check which Python installation you're currently using with the which command on Mac/Linux. On Windows, the equivalent command is where (this answer might be helpful: equivalent of 'which' in Windows).

(base) ➜  ~ which python
/Users/steven/miniconda3/bin/python

From the above, you can see that my default Python is through Miniconda, which is just a small version of Anaconda.

This means that when you use pip to install packages, those are getting installed into this base conda environment. And, by the way, you can use the which command with pip as well, just to double-check that you're using the version of pip that's in your current environment:

(base) ➜  ~ which pip
/Users/steven/miniconda3/bin/pip
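The same check can also be made from inside Python, which is handy in a notebook or IDE where the shell prompt isn't visible. This is only a rough sketch: the function name describe_interpreter is made up for illustration, and it assumes that conda sets the CONDA_DEFAULT_ENV variable when an environment is activated.

```python
import os
import sys


def describe_interpreter():
    """Return basic facts about the Python interpreter running this code."""
    return {
        # Full path to the python binary, comparable to `which python`.
        "executable": sys.executable,
        # Root folder of the active environment (for conda, the env directory).
        "prefix": sys.prefix,
        # Name conda exports on activation; None if conda is not active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # True inside a venv/virtualenv environment; conda envs report False.
        "is_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the executable path points into your Anaconda or Miniconda folder and conda_env reads base, you are most likely running from the base conda environment described above.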

If you want to see the list of packages currently installed, you can do pip freeze, or conda env export. Both pip and conda are package managers, and if you're using an Anaconda Python installation then you can (generally) use either to install packages into your virtual environment.
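If you'd rather look at the installed packages from inside Python, something like the following should work on Python 3.8 or newer. It's a minimal sketch using the standard library's importlib.metadata; the helper name installed_packages is hypothetical, and the output only approximates what pip freeze prints.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) visible to this interpreter."""
    packages = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip broken installs that have no readable metadata or no name.
        name = meta.get("Name") if meta is not None else None
        if name:
            packages.add((name, meta.get("Version")))
    return sorted(packages, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name}=={version}")
```

Because this inspects whichever interpreter runs it, running it in two different environments should give two different lists.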

(Quick side note: "virtual environments" are a general concept that can be implemented in different ways. Both conda and virtualenv are ways to use virtual environments in Python. I'm also a data scientist, and I use conda for all of my virtual environments.)

If you want to create a new virtual environment using conda, it's very straightforward. First, you can create the environment and install some packages right away, like pandas and matplotlib. Then you can activate that environment, check your version of python, and then deactivate it.

(base) ➜  ~ conda create -n my-new-environment pandas matplotlib
(base) ➜  ~ which python
/Users/steven/miniconda3/bin/python
(base) ➜  ~ conda activate my-new-environment
(my-new-environment) ➜  ~ which python
/Users/steven/miniconda3/envs/my-new-environment/bin/python
(my-new-environment) ➜  ~ conda deactivate
(base) ➜  ~ which python
/Users/steven/miniconda3/bin/python

And, if you want to see which conda virtual environments you currently have available, you can run conda env list.
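If you ever need that list inside a script, conda can print it as JSON. Here's a small sketch, assuming conda is on your PATH and that conda env list --json returns an object with an "envs" list of folder paths; the function names are made up for this example.

```python
import json
import shutil
import subprocess


def parse_env_paths(raw_json):
    """Extract the environment folders from `conda env list --json` output."""
    return json.loads(raw_json).get("envs", [])


def list_conda_envs():
    """Ask conda for its environments; return [] if conda is not on PATH."""
    conda = shutil.which("conda")
    if conda is None:
        return []
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_env_paths(result.stdout)


if __name__ == "__main__":
    for path in list_conda_envs():
        print(path)
```

The first path is usually the base install, and named environments typically sit under its envs folder, matching the which python output above.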

Here's the documentation for conda environments, which I reference all the time: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

I hope this is helpful!

Steven Rouk
  • Thank you so much for taking the time to respond. Yes, I am on Mac and installed Python through Anaconda. My pip and python are located in the same place, as are yours. Why would one need to create a new virtual environment? Don't I already have all of the packages I need in the base environment? Sorry, it's difficult for me to grasp the virtual environments concept. – bismo Aug 24 '20 at 20:47
  • @bismo -- If you already have everything that you need in your base environment, then you're good to go! I just wanted to show you how to create new environments and switch to them, in case you need that. It also (for me) helps me understand how different environments work. But yes, there's no problem working directly out of your `base` environment, unless you need to install conflicting package versions at any point. If you're ever working on a different project that needs conflicting versions with what you currently have, that's when you'd create a new virtual environment. – Steven Rouk Aug 24 '20 at 20:50
  • Ahh, I see. Thanks for being so helpful. One last thing. When I try to activate my environment using reticulate in R, I run the following: ```library(reticulate)``` ```use_virtualenv("base")```, but when I try to import pandas, I get the same error that the person in the question I linked above did. What could be the issue here? – bismo Aug 24 '20 at 20:54
  • Don't use pip if you have Anaconda python installed. The conda command replaces pip. Use conda like you use pip. – Natsfan Aug 24 '20 at 23:16
  • @bismo -- Unfortunately I don't have much experience with R, so I'm not sure what's going on there or how to troubleshoot. You might want to look into how R loads packages and what paths it uses, etc. The main goal would be to figure out which version of Python it's using, and what folder it's loading the libraries from. – Steven Rouk Aug 25 '20 at 17:57
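For the troubleshooting goal in the last comment (which Python is running, and which folder a library is loaded from), a sketch like this may help when run in whatever Python session the tool starts. The helper name locate_package is hypothetical, and pandas is used only because it is the package from the question.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file a top-level package would load from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("search path:")
    for entry in sys.path:
        print("  ", entry)
    # Prints None when pandas is not installed for this interpreter.
    print("pandas found at:", locate_package("pandas"))
```

If the interpreter path is not the one inside your conda folder, or pandas comes back as None, the session is probably using a different Python than the one you installed packages into.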