I want my Spark app (Scala) to be able to read S3 files:
spark.read.parquet("s3://my-bucket-name/my-object-key")
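For context, here's roughly how the job is wired up. This is a minimal sketch; the app name, master, and bucket/key are placeholders, not my real values, and I assume hadoop-aws plus the AWS SDK are on the classpath:

import org.apache.spark.sql.SparkSession

// Minimal local setup on the dev machine.
val spark = SparkSession.builder()
  .appName("s3-read-example")
  .master("local[*]")
  .getOrCreate()

// The failing read; judging by the error below, the bucket lookup
// goes through the hadoop-aws (S3A) connector's credential chain.
val df = spark.read.parquet("s3://my-bucket-name/my-object-key")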
On my dev machine I can access S3 files with the awscli, using a pre-configured profile from ~/.aws/config or ~/.aws/credentials, like this:
aws --profile my-profile s3 ls s3://my-bucket-name/my-object-key
But when I try to read those files from Spark, with the profile provided via the AWS_PROFILE environment variable, I get the following error:
doesBucketExist on my-bucket-name: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
I also tried providing the profile as a JVM system property (-Daws.profile=my-profile), with no luck.
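For what it's worth, one direction I'm wondering about is pointing the S3A connector at the AWS SDK's profile-based credentials provider explicitly. Below is an untested sketch of what I mean; the fs.s3a.aws.credentials.provider key and the SDK v1 ProfileCredentialsProvider class are my guesses at the right knobs, not something I've confirmed works:

// Untested sketch: tell S3A to load credentials from the shared
// ~/.aws profile files instead of its default provider chain.
// ProfileCredentialsProvider (AWS SDK v1) should honor AWS_PROFILE.
spark.sparkContext.hadoopConfiguration.set(
  "fs.s3a.aws.credentials.provider",
  "com.amazonaws.auth.profile.ProfileCredentialsProvider"
)
val df = spark.read.parquet("s3a://my-bucket-name/my-object-key")

Is something like that the intended way to make Spark honor a named profile, or is there a cleaner approach?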
Thanks for reading.