Hello guys,
My question is about the Zeppelin notebook. I am new to the Zeppelin environment. I have an AWS account and I am working on an EMR cluster. I want to use pandas and matplotlib in Zeppelin, but I get the error "No module named pandas" (and the same for matplotlib). I found this tutorial and followed it up to Step 8, but I still get the same error. Zeppelin has an interpreter setting. I tried changing the Python path there, and even though I am sure the path is correct, I still get the same error. I also found this link on the topic. If anyone has experience with these issues, please help me.
%pyspark
import os
import numpy
import pandas
import matplotlib
print("Numpy "+numpy.__version__)
print("Pandas "+pandas.__version__)
print("Matplotlib "+matplotlib.__version__)
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-444747300595843376.py", line 367, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-444747300595843376.py", line 355, in <module>
    exec(code, _zcUserQueryNameSpace)
  File "<stdin>", line 3, in <module>
ImportError: No module named pandas
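To narrow this down, a small diagnostic like the one below could be run in a %pyspark paragraph. It is only a sketch, not something taken from the tutorial: it prints which Python executable the interpreter is really using and which directories it searches, then tries each import separately. The helper name check_modules is made up for illustration. If the printed executable is not the Python where pandas was installed, that may explain why changing the path did not help.

```python
import importlib
import sys


def check_modules(names):
    """Return {module name: version string, or None if the import fails}.

    check_modules is a hypothetical helper written for this post,
    not part of Zeppelin or Spark.
    """
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every module defines __version__, so fall back to "unknown".
            found[name] = str(getattr(module, "__version__", "unknown"))
        except ImportError:
            found[name] = None
    return found


# Which Python binary is this paragraph actually running under?
print("Interpreter: " + str(sys.executable))

# Which directories does it search for modules?
for entry in sys.path:
    print("  search path: " + str(entry))

# Try each import on its own so one failure does not hide the others.
results = check_modules(["numpy", "pandas", "matplotlib"])
for name in sorted(results):
    version = results[name]
    if version is None:
        print(name + ": NOT importable from this interpreter")
    else:
        print(name + ": " + version)
```

Comparing the "Interpreter" line with the Python where pip installed pandas should show whether the two match.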