Questions tagged [mongodb-hadoop]

29 questions
3
votes
1 answer

How to delete documents (records) with the Mongo-Hadoop connector for Spark

I am using the Mongo-Hadoop connector to work with Spark and MongoDB. I want to delete the documents in an RDD from MongoDB; it looks like there is a MongoUpdateWritable to support document updates. Is there a way to do deletion with Mongo-Hadoop… (see the sketch below)
Tom
  • 5,848
  • 12
  • 44
  • 104
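The connector's MongoUpdateWritable expresses updates and upserts rather than deletes, so a common workaround is to call the MongoDB Java driver directly from the executors. A minimal sketch, assuming the 3.x Java driver and hypothetical database/collection names "mydb"/"mycoll":

```scala
import com.mongodb.MongoClient
import com.mongodb.client.model.Filters
import org.apache.spark.rdd.RDD

object MongoDeletes {
  // Delete the documents whose _id values are held in an RDD by calling the
  // MongoDB Java driver from each partition (names and use of _id are assumptions).
  def deleteIds(ids: RDD[Object]): Unit = {
    ids.foreachPartition { partition =>
      val client = new MongoClient("localhost", 27017) // one client per partition
      try {
        val coll = client.getDatabase("mydb").getCollection("mycoll")
        partition.foreach(id => coll.deleteOne(Filters.eq("_id", id)))
      } finally {
        client.close()
      }
    }
  }
}
```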
3
votes
1 answer

Save MongoDB data to parquet file format using Apache Spark

I am a newbie with Apache Spark as well as the Scala programming language. What I am trying to achieve is to extract the data from my local MongoDB database and then save it in Parquet format using Apache Spark with the hadoop-connector. This is…
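A minimal sketch of that pipeline, assuming a local MongoDB with a hypothetical database "mydb", collection "mycoll", and documents carrying "name" and "age" fields (Spark 1.x with SQLContext):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.bson.BSONObject
import com.mongodb.hadoop.MongoInputFormat

object MongoToParquet {
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MongoToParquet"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Point the connector at the source collection.
    val mongoConfig = new Configuration()
    mongoConfig.set("mongo.input.uri", "mongodb://localhost:27017/mydb.mycoll")

    // Each record is a (key, document) pair: (ObjectId, BSONObject).
    val docs = sc.newAPIHadoopRDD(mongoConfig, classOf[MongoInputFormat],
      classOf[Object], classOf[BSONObject])

    // Project the BSON documents into a typed DataFrame and write Parquet.
    val df = docs.map { case (_, doc) =>
      Person(doc.get("name").toString, doc.get("age").toString.toInt)
    }.toDF()

    df.write.parquet("people.parquet")
  }
}
```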
2
votes
1 answer

Spark Task not Serializable Hadoop-MongoDB-Connector Enron

I am trying to run the EnronMail example of the Hadoop-MongoDB Connector for Spark. To do so, I am using the Java code example from… (see the sketch below)
Ulrich Zendler
  • 141
  • 1
  • 10
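"Task not serializable" in this setup usually means the closure passed to a Spark transformation captures a non-serializable enclosing object. A minimal sketch of the usual fix, keeping the per-record logic in a standalone object so nothing from the driver side is captured (the "headers"/"From" field layout is an assumption based on the Enron dataset):

```scala
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD
import org.bson.BSONObject

object EnronCounts {
  // Pure function: no reference to a non-serializable outer class.
  def senderOf(doc: BSONObject): String = {
    val headers = doc.get("headers").asInstanceOf[BSONObject] // assumed field layout
    String.valueOf(headers.get("From"))
  }

  // Count messages per sender from the (id, document) pairs the connector yields.
  def countBySender(docs: RDD[(Object, BSONObject)]): RDD[(String, Int)] =
    docs.map { case (_, doc) => (senderOf(doc), 1) }.reduceByKey(_ + _)
}
```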
2
votes
1 answer

"ERROR 6000, Output location validation failed" using PIG MongoDB-Hadoop Connector on EMR

I get an "output location validation failed" exception in my pig script on EMR. It fails when saving data back S3. I use this simple script to narrow the problem: REGISTER /home/hadoop/lib/mongo-java-driver-2.13.0.jar REGISTER…
d0x
  • 11,040
  • 17
  • 69
  • 104
2
votes
1 answer

Apache Spark Mongo-Hadoop Connector class not found

So I'm trying to run this example, https://github.com/plaa/mongo-spark/blob/master/src/main/scala/ScalaWordCount.scala, but I keep getting this error: Exception in thread "main" java.lang.NoClassDefFoundError: com/mongodb/hadoop/MongoInputFormat at…
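That NoClassDefFoundError almost always means mongo-hadoop-core (and the Mongo Java driver) are missing from the application's classpath at runtime. A minimal build.sbt sketch; the Scala/Spark/driver versions shown are illustrative assumptions, and the jars still need to reach the cluster (e.g. via an assembly jar or spark-submit --jars):

```scala
// build.sbt sketch: pull in Spark plus the mongo-hadoop connector and driver.
name := "mongo-spark-wordcount"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark"         %% "spark-core"        % "1.3.1" % "provided",
  "org.mongodb.mongo-hadoop" %  "mongo-hadoop-core" % "1.3.1",
  "org.mongodb"              %  "mongo-java-driver" % "2.13.0"
)
```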
2
votes
2 answers

Hadoop with MongoDB Concept

Hi, I am new to Hadoop and NoSQL technologies. I started learning with the word-count program, reading a file stored in HDFS and processing it. Now I want to use Hadoop with MongoDB. I started from the program here. My confusion now is that it… (see the sketch below)
Abhendra Singh
  • 1,959
  • 4
  • 26
  • 46
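Conceptually, the only thing that changes from the HDFS word count is where the splits come from: the job reads a MongoDB collection through MongoInputFormat instead of files through a FileInputFormat. A minimal job-setup sketch, with hypothetical database/collection names:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job
import com.mongodb.hadoop.{MongoInputFormat, MongoOutputFormat}
import com.mongodb.hadoop.util.MongoConfigUtil

object MongoWordCountSetup {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // Input and output are Mongo collections, not HDFS paths.
    MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.input")
    MongoConfigUtil.setOutputURI(conf, "mongodb://localhost:27017/mydb.output")

    val job = Job.getInstance(conf, "mongo word count")
    job.setInputFormatClass(classOf[MongoInputFormat])
    job.setOutputFormatClass(classOf[MongoOutputFormat[_, _]])
    // Mapper and reducer classes are set exactly as in the HDFS version.
  }
}
```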
2
votes
4 answers

Update an existing collection in MongoDB using Java-Hadoop connector

Is it possible to update an existing MongoDB collection with new data? I am using a Hadoop job to read and write data to Mongo. The required scenario is: say the first collection in Mongo is { "_id" : 1, "value" : "aaa", "value2" : null }; after reading… (see the sketch below)
Abhishek bhutra
  • 1,400
  • 1
  • 11
  • 29
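The update semantics being asked for (fill in a missing field on an existing document, keyed by _id) can also be expressed with the plain MongoDB Java driver as an alternative to the connector's MongoUpdateWritable. A minimal sketch, assuming the 3.x driver and hypothetical "mydb"/"mycoll" names:

```scala
import com.mongodb.MongoClient
import com.mongodb.client.model.{Filters, UpdateOptions, Updates}

object MongoUpsertExample {
  def main(args: Array[String]): Unit = {
    val client = new MongoClient("localhost", 27017)
    val coll = client.getDatabase("mydb").getCollection("mycoll")

    // Set "value2" on the document with _id 1, creating the document if absent.
    coll.updateOne(
      Filters.eq("_id", 1),
      Updates.set("value2", "bbb"),
      new UpdateOptions().upsert(true))

    client.close()
  }
}
```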
1
vote
0 answers

mongo-hadoop Java connector: iterate through all collections

I am trying to use this Hadoop Mongo connector: https://github.com/mongodb/mongo-hadoop. I have seen many examples of connecting to a particular Mongo collection using something like this: mongodbConfig.set("mongo.input.uri",… (see the sketch below)
user3400864
  • 17
  • 1
  • 5
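Since mongo.input.uri addresses exactly one collection, one way to cover a whole database is to list the collection names with the Java driver and build one RDD per collection, then union them. A minimal sketch, assuming the 3.x driver and a hypothetical database "mydb":

```scala
import scala.collection.JavaConverters._
import com.mongodb.MongoClient
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.bson.BSONObject
import com.mongodb.hadoop.MongoInputFormat

object AllCollections {
  def allCollections(sc: SparkContext): RDD[(Object, BSONObject)] = {
    // Discover the collection names up front with the plain driver.
    val client = new MongoClient("localhost", 27017)
    val names = client.getDatabase("mydb").listCollectionNames().iterator().asScala.toList
    client.close()

    // One connector-backed RDD per collection, unioned into a single RDD.
    val rdds = names.map { name =>
      val conf = new Configuration()
      conf.set("mongo.input.uri", s"mongodb://localhost:27017/mydb.$name")
      sc.newAPIHadoopRDD(conf, classOf[MongoInputFormat], classOf[Object], classOf[BSONObject])
    }
    sc.union(rdds)
  }
}
```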
1
vote
1 answer

Spark Mongo Hadoop Connector not mapping data

I am attempting to map data from the mongodb-hadoop connector inside a Spark application. I have no other errors prior to this one, so I'm assuming that the connection to MongoDB was successful. I'm using the following code to map: JavaRDD logs =… (see the sketch below)
D.Asare
  • 103
  • 3
  • 14
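A common stumbling block here is that the connector yields key/value pairs, not bare documents, so the mapping has to run over the BSONObject value side. A minimal sketch, with a hypothetical "logs" collection and "message" field:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.bson.BSONObject
import com.mongodb.hadoop.MongoInputFormat

object LogLoader {
  def loadLogs(sc: SparkContext): RDD[String] = {
    val conf = new Configuration()
    conf.set("mongo.input.uri", "mongodb://localhost:27017/mydb.logs")

    // Records arrive as (ObjectId, BSONObject) pairs.
    val pairs = sc.newAPIHadoopRDD(conf, classOf[MongoInputFormat],
      classOf[Object], classOf[BSONObject])

    // The value is the whole document; pull out the field(s) you need here.
    pairs.map { case (_, doc) => String.valueOf(doc.get("message")) }
  }
}
```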
1
vote
2 answers

Hadoop with mongoDB : NoClassDefFoundError MongoConfigUtil

I'm learning how to write a map/reduce job in Hadoop with MongoDB data as input. So I followed this example, but I got the following error: Exception in thread "main" java.lang.NoClassDefFoundError: com/mongodb/hadoop/util/MongoConfigUtil at…
1
vote
1 answer

Hive Table Creation Using MongoDB Hadoop Driver

I am trying to connect from a Hive database to a collection in MongoDB using a driver (jars) provided on the wiki site. Here are the steps I took: I created a collection in MongoDB called "Diamond" under a database called "Moe", and it has got 20…
Mario
  • 35
  • 5
1
vote
1 answer

mongo-hadoop: how not to propagate MongoDB document deletion

I want to synchronize MongoDB and Hadoop, but when I delete a document from MongoDB, that document must not be deleted in Hadoop. I tried using mongo-hadoop and Hive. This is the Hive query: CREATE EXTERNAL TABLE SubComponentSubmission ( id STRING, …
irakli2692
  • 127
  • 2
  • 9
1
vote
0 answers

Getting error " Hadoop Release '%s' is an invalid/unsupported release. Valid entries are in 2.6.0"

I am working with the mongodb-hadoop connector. For this, I am building the MongoDB adapter; after editing the build.sbt file, I try to build the adapter with ./sbt package, and then I get the error: Hadoop Release '%s' is an invalid/unsupported…
Prabjot Singh
  • 4,491
  • 8
  • 31
  • 51
1
vote
0 answers

MongoDB Hadoop error: no FileSystem for scheme: mongodb

I'm trying to get a basic Spark example running using the MongoDB Hadoop connector. I'm using Hadoop version 2.6.0 and version 1.3.1 of mongo-hadoop. I'm not sure where exactly to place the jars for this Hadoop version. Here are the locations… (see the sketch below)
Navin Viswanath
  • 894
  • 2
  • 13
  • 22
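"No FileSystem for scheme: mongodb" typically means a mongodb:// URI was handed to an API that expects a Hadoop FileSystem path (such as sc.textFile or a FileInputFormat path). The connector URI belongs in mongo.input.uri and is read through MongoInputFormat. A minimal sketch of the distinction, with a hypothetical collection name:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext
import org.bson.BSONObject
import com.mongodb.hadoop.MongoInputFormat

object ReadFromMongo {
  def readFromMongo(sc: SparkContext) = {
    // Wrong: sc.textFile("mongodb://localhost:27017/mydb.mycoll")
    //   -> java.io.IOException: No FileSystem for scheme: mongodb

    // Right: pass the URI to the connector and read via MongoInputFormat.
    val conf = new Configuration()
    conf.set("mongo.input.uri", "mongodb://localhost:27017/mydb.mycoll")
    sc.newAPIHadoopRDD(conf, classOf[MongoInputFormat], classOf[Object], classOf[BSONObject])
  }
}
```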
1
vote
1 answer

MongoDB Hadoop connector streaming not running

I want to launch the MongoDB Hadoop Streaming connector, so I downloaded a compatible version of Hadoop (2.2.0; see https://github.com/mongodb/mongo-hadoop/blob/master/README.md#apache-hadoop-22). I cloned the mongo-hadoop git repository, changed…
Julien Fouilhé
  • 2,583
  • 3
  • 30
  • 56