I'm trying to run a PySpark script that works fine when I run it on my local machine. The issue is that I want to fetch the input files from S3.
No matter what I try, though, I can't seem to find where to set the access key ID and secret. I found some answers covering specific files, e.g. Locally reading S3 files through Spark (or better: pyspark), but I want to set the credentials for the whole SparkContext, as I reuse the SQL context all over my code.
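Conceptually, something like the sketch below is what I'm after: set the credentials once on the context so every S3 read picks them up. (The `hadoopConfiguration` route and the `fs.s3n.*` property names are my guesses at how this might work, and `my-bucket` is a placeholder.)

```python
from pyspark import SparkContext

sc = SparkContext(appName="s3-input")

# Guess: set the keys once on the underlying Hadoop configuration
# so every s3n:// read on this context authenticates.
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3n.awsAccessKeyId", "MY_ACCESS_KEY_ID")
hadoop_conf.set("fs.s3n.awsSecretAccessKey", "MY_SECRET_ACCESS_KEY")

# my-bucket is a placeholder path
rdd = sc.textFile("s3n://my-bucket/input/*.gz")
print(rdd.count())
```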
So the question is: how do I set the AWS access key and secret for Spark?
P.S. I tried the $SPARK_HOME/conf/hdfs-site.xml and environment variable options; neither worked.
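For reference, the environment-variable attempt looked roughly like this (standard AWS variable names, set before the context is created):

```python
import os

# Set the standard AWS variables before creating the SparkContext;
# this is (roughly) the attempt that did not work for me.
os.environ["AWS_ACCESS_KEY_ID"] = "MY_ACCESS_KEY_ID"
os.environ["AWS_SECRET_ACCESS_KEY"] = "MY_SECRET_ACCESS_KEY"

from pyspark import SparkContext

sc = SparkContext(appName="s3-env-test")
rdd = sc.textFile("s3n://my-bucket/input/*.gz")  # my-bucket is a placeholder
print(rdd.count())
```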
Thank you