I need to connect Spark to my Redshift instance to generate data. I am using Spark 1.6 with Scala 2.10, and I am writing the job in PySpark. I have used a compatible JDBC driver and the spark-redshift connector, but I am facing a weird problem:
df = sqlContext.read \
    .format("com.databricks.spark.redshift") \
    .option("query", "select top 10 * from fact_table") \
    .option("url", "jdbc:redshift://redshift_host:5439/events?user=username&password=pass") \
    .option("tempdir", "s3a://redshift-archive/") \
    .load()
When I do df.show(), it fails with a permission-denied error on my bucket. This is weird, because I can see files being created in the bucket, but they cannot be read back.
P.S. I have also set the access key and secret access key.
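For reference, this is roughly how I am setting the keys, on the SparkContext's Hadoop configuration (a sketch; "MY_ACCESS_KEY" and "MY_SECRET_KEY" are placeholders for my real, redacted values):

# Sketch: setting S3 credentials on the underlying Hadoop configuration
# (placeholder values -- my real keys are redacted)
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.access.key", "MY_ACCESS_KEY")
hadoop_conf.set("fs.s3a.secret.key", "MY_SECRET_KEY")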
P.S. I am also confused about the difference between the s3a and s3n file systems. Connector used: https://github.com/databricks/spark-redshift/tree/branch-1.x
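If it matters, my understanding is that s3n and s3a read their credentials from different Hadoop property names, so the keys may need to be set under the prefix that matches the tempdir scheme (please correct me if I have these wrong):

# s3a reads these property names:
hadoop_conf.set("fs.s3a.access.key", "MY_ACCESS_KEY")
hadoop_conf.set("fs.s3a.secret.key", "MY_SECRET_KEY")
# s3n reads these (older) property names:
hadoop_conf.set("fs.s3n.awsAccessKeyId", "MY_ACCESS_KEY")
hadoop_conf.set("fs.s3n.awsSecretAccessKey", "MY_SECRET_KEY")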