For a Sandbox environment like yours, the sandbox image is built on a Linux OS (CentOS). The Zeppelin Notebook, in all probability, points to the Python installation that ships with the OS.
If you wish to have your own installation of Python and your own set of libraries for data analysis, like those in the SciPy stack, you need to install Anaconda on your virtual machine. Your VM needs to be connected to the internet so that you can download and install the Anaconda package.
You can then point Zeppelin to Anaconda's Python binary at `/home/user/anaconda3/bin/python`, where `user` is your username.
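As a quick sanity check (the path in the comment below is just an example for a default Anaconda install), you can print which binary the interpreter is actually running from a Zeppelin paragraph:

```python
# Run inside a Zeppelin %pyspark paragraph to confirm which Python
# binary the interpreter is using.
import sys

print(sys.executable)  # e.g. /home/user/anaconda3/bin/python after repointing
print(sys.version)     # version string of that Python installation
```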
Zeppelin's configuration also confirms that, by default, it uses the Python installation at `/usr/bin/python`. You can go through its documentation for more information.
UPDATE
Hi Joseph, Spark installations, by default, use the Python interpreter and the Python libraries that are installed on your OS. The folder structure you have shown only tells you the location of the PySpark module. That module is a library, just like Pandas or NumPy.
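As a small illustration (nothing Zeppelin-specific here, just standard Python), you can ask any installed module for its own location, PySpark included:

```python
# Every imported module exposes its file location; PySpark is just
# another package in this respect.
import pyspark
print(pyspark.__file__)  # prints the path to the PySpark package,
                         # matching the folder structure you found
```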
What you can do is install the SciPy stack (NumPy, Pandas, Matplotlib, etc.) via the command `pip install <package-name>` and then import those libraries directly into your Zeppelin Notebook.
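For example, a minimal paragraph along these lines should work once the packages are installed (the data here is made up purely for illustration):

```python
# Example Zeppelin %pyspark paragraph using pip-installed libraries.
import numpy as np
import pandas as pd

arr = np.arange(5)
df = pd.DataFrame({"x": arr, "x_squared": arr ** 2})
print(df)
```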
Run the following command in the terminal of your sandbox:

```
whereis python
```

The output would give you something like:

```
/usr/bin/python /usr/bin/python2.7 ....
```
In your Zeppelin configuration, for the property `zeppelin.pyspark.python`, you can set the first value from the output of the previous command, i.e. `/usr/bin/python`. Now all the libraries you installed via the `pip install` command will be available to you in Zeppelin.
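After restarting the PySpark interpreter, one way to verify the setting took effect (assuming you installed NumPy and Pandas as above) is:

```python
# Run in a fresh %pyspark paragraph after restarting the interpreter.
import sys
import numpy
import pandas

print(sys.executable)      # should now print /usr/bin/python
print(numpy.__version__)   # pip-installed packages resolve against it
print(pandas.__version__)
```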
This process would only work for your Sandbox environment. In a real production cluster, your administrator needs to install all these libraries on all the nodes of your Spark cluster.