I am running a boo.py script on AWS EMR using spark-submit (Spark 2.0).
The script finishes successfully when I run
python boo.py
However, it fails when I run
spark-submit --verbose --deploy-mode cluster --master yarn boo.py
The output of yarn logs -applicationId ID_number shows:
Traceback (most recent call last):
File "boo.py", line 17, in <module>
import boto3
ImportError: No module named boto3
The Python interpreter and boto3 module I am using are:
$ which python
/usr/bin/python
$ pip install boto3
Requirement already satisfied (use --upgrade to upgrade): boto3 in /usr/local/lib/python2.7/site-packages
How do I add this library path so that spark-submit can find the boto3 module?
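
To see which interpreter and module search path the driver actually uses in cluster mode, a small diagnostic could be placed at the top of boo.py, before the boto3 import. This is only a sketch: describe_environment is a hypothetical helper name (not part of Spark or boto3), and it assumes that stderr ends up in the YARN container logs, as the traceback above did.

```python
import sys


def describe_environment(module_name="boto3"):
    """Report the running interpreter, its sys.path, and whether module_name imports."""
    try:
        __import__(module_name)
        importable = True
    except ImportError:
        importable = False
    return {
        "executable": sys.executable,
        "path": list(sys.path),
        "importable": importable,
    }


if __name__ == "__main__":
    info = describe_environment()
    # Write to stderr so the lines appear next to any traceback in the logs.
    sys.stderr.write("python executable: %s\n" % info["executable"])
    sys.stderr.write("boto3 importable: %s\n" % info["importable"])
    for entry in info["path"]:
        sys.stderr.write("sys.path entry: %s\n" % entry)
```

Comparing this output between python boo.py and the spark-submit run should show whether /usr/local/lib/python2.7/site-packages is missing from the search path on the node where the driver runs.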