When using the Spark MongoDB connector in a Scala application, you can import the MongoSpark companion object via `import com.mongodb.spark._`, then run `val rdd = MongoSpark.load(spark)` to load your collection. I want to do the same in a Python application, but how do I make the MongoSpark object available there? There is no Python package to install and import. What is the workaround?
Please see the Spark Connector Python Guide for more information.
Below is a short example of connecting to MongoDB from PySpark:
from pyspark.sql import SparkSession
spark = SparkSession \
.builder \
.appName("myApp") \
.config("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.coll") \
.config("spark.mongodb.output.uri", "mongodb://127.0.0.1/test.coll") \
.getOrCreate()
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
df.printSchema()
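The example above assumes the MongoDB Spark Connector jar is already on Spark's classpath. When launching Spark from a plain Python process (for example an IDE) rather than via `pyspark`/`spark-submit`, one common workaround is to set `PYSPARK_SUBMIT_ARGS` before the first `pyspark` import. A minimal sketch, assuming Spark 2.x built against Scala 2.11 (adjust the connector coordinates to match your own Spark/Scala versions):

```python
import os

# Pull the connector from Maven before pyspark launches its JVM.
# The JVM is started lazily on the first SparkSession, and reads
# this variable at that point; set it BEFORE importing pyspark.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages org.mongodb.spark:mongo-spark-connector_2.11:2.0.0 pyspark-shell"
)

# Only after the variable is set (requires a local Spark install):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.appName("myApp").getOrCreate()
```

With this in place, the SparkSession from the answer above can resolve `com.mongodb.spark.sql.DefaultSource` without any changes to the read code.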

Ross
-
It gives the exception: Py4JJavaError: An error occurred while calling o71.load. : java.lang.ClassNotFoundException: Failed to find data source: com.mongodb.spark.sql.DefaultSource. Please find packages at http://spark.apache.org/third-party-projects.html. – yashar Apr 26 '17 at 15:26
-
How should I make com.mongodb.spark.sql.DefaultSource available in a Python application, say in the Spyder IDE? – yashar Apr 26 '17 at 15:28
-
You need to include the jar / package. When running pyspark you can add: `--packages org.mongodb.spark:mongo-spark-connector_2.11:2.0.0` – Ross Apr 26 '17 at 16:51
-
I am using Spyder as the development IDE; is there any way to start Spyder with these .jar packages already available? – yashar Apr 26 '17 at 18:27
-
I found this link, which is very relevant: http://dataxwying.blogspot.nl/2016/02/setup-spyder-for-spark-step-by-step.html . Though I am still missing the MongoSpark object in Python. – yashar Apr 28 '17 at 12:52
-
For spark-submit: `./bin/spark-submit --packages org.mongodb.spark:mongo-spark-connector_2.11:2.0.0 --master spark://ip-or-domain-here:7077 sparkapps/test.py` – Phyticist Jun 17 '17 at 11:09
-
worked for me, thanks! – Artem Aug 30 '18 at 15:26
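As an alternative to passing `--packages` on every launch, a persistent setup is possible via Spark's configuration file. A sketch (paths and version are assumptions; use the connector build matching your Spark/Scala versions) that any SparkSession, including one started from an IDE such as Spyder, would pick up:

```
# $SPARK_HOME/conf/spark-defaults.conf
# Equivalent to passing --packages on the command line.
spark.jars.packages  org.mongodb.spark:mongo-spark-connector_2.11:2.0.0
```

Spark resolves the coordinates from Maven on first startup, so no manual jar download is needed.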