
I am using the Mongo-Hadoop connector to work with Spark and MongoDB. I want to delete the documents in an RDD from MongoDB. It looks like there is a MongoUpdateWritable to support document updates. Is there a way to do deletion with the Mongo-Hadoop connector?

Thanks

Tom

1 Answer


If you only want to remove records from the RDD itself, use the functions of the Spark API, such as map, reduce, and filter.

If you want to save the results afterwards, use MongoUpdateWritable.

Take a look at the basics: Mongo-Hadoop-Spark

  • Thanks @Cristu for the reply. What I want to do is delete the documents contained in the RDD from MongoDB (each BSON object in the RDD is used as a query to find the documents in MongoDB). It looks like Mongo-Hadoop-Spark doesn't support this. Do you mean that I should write the deletion logic on my own? – Tom Sep 14 '16 at 08:24
  • You're welcome! If you provide a code example, maybe I can help you with the code when you need it. – Cristu Naranjo Sep 14 '16 at 08:27
  • Thanks @Cristu, I am able to use Mongo-Hadoop-Spark to do document insertion and update, but I didn't find how to delete with it. So I am asking whether I need to write my own logic to do the deletion. – Tom Sep 14 '16 at 08:30
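
Nothing in this thread points to a delete operation in the connector, so one option is to write the deletion by hand, as the last comment suggests. Below is a minimal sketch of that idea in PySpark with pymongo, not a feature of Mongo-Hadoop itself. It assumes each RDD element is a dict usable as a MongoDB query filter; the function names, the connection URI, and the `mydb`/`mycoll` names are hypothetical placeholders. Each partition opens its own client and deletes in batches, combining the query documents with `$or`.

```python
from itertools import islice


def delete_matching(queries, collection, batch_size=500):
    """Delete every document in `collection` matching any filter in `queries`.

    `queries` is an iterable of dicts, each a MongoDB query filter.
    Returns the total number of documents reported as deleted.
    """
    it = iter(queries)
    deleted = 0
    while True:
        # Pull at most batch_size filters so one $or does not grow unbounded.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        result = collection.delete_many({"$or": batch})
        deleted += result.deleted_count
    return deleted


def delete_partition(queries):
    """Run on each Spark partition: one MongoDB client per partition."""
    # Imported here so the module loads on machines without pymongo;
    # the executors still need pymongo installed.
    from pymongo import MongoClient

    # Hypothetical connection details; replace with your own.
    client = MongoClient("mongodb://localhost:27017")
    try:
        delete_matching(queries, client["mydb"]["mycoll"])
    finally:
        client.close()
```

With an RDD of query dicts, this would be invoked as `rdd.foreachPartition(delete_partition)`. Using `foreachPartition` rather than `foreach` keeps the number of connections down to one per partition instead of one per document.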