I have a DynamoDB table that I want to query with Spark SQL on EMR. I launched an EMR Spark cluster with release label emr-4.6.0, which has Spark 1.6.1 installed.
I am following the document "Analyse DynamoDB Data with Spark".
After connecting to the master node, I run the command:
spark-shell --jars /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar
It gives a warning:
Warning: Local jar /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar does not exist, skipping.
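To confirm what the warning says, the path can be checked directly on the master node. This is only a minimal sketch: check_jar is a helper name made up for illustration, and the search assumes (without confirmation) that the connector jar, if installed at all, sits somewhere under /usr with a name starting with emr-ddb.

```shell
# Path copied from the spark-shell command above.
JAR=/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar

# Hypothetical helper: report whether a given path exists as a regular file.
check_jar() {
  if [ -f "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

check_jar "$JAR"

# Assumption: a differently named or versioned copy may exist under /usr.
# Errors from unreadable directories are discarded.
find /usr -name 'emr-ddb*.jar' 2>/dev/null || true
```

If the search prints a path, that path could be passed to --jars in place of the one that does not exist.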
Later, when I import the DynamoDB input and output formats using
import org.apache.hadoop.dynamodb.read.DynamoDBInputFormat
import org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat
I get the following errors:
error: object dynamodb is not a member of package org.apache.hadoop
import org.apache.hadoop.dynamodb.read.DynamoDBInputFormat
error: object dynamodb is not a member of package org.apache.hadoop
import org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat
I think the missing jar is causing these errors. Where can I get emr-ddb-hadoop.jar?