0

I have some data I am trying to retrieve from mongodb into R using rmongodb package. At some point in time, the ordering of the fields in the stored documents changed.

I am trying to force my projection query to keep the order of the fields projected fixed by explicitly specifying approach attempted in: SO question as follows:

data <- mongo.find.all(mongo_conn, table,
                          fields = list('id1' = 1, 'id2' = 2,
                                        'time' = 3, 'latitude' = 4,
                                        'longitude' = 5, '_id' = 0))

I can't seem to find a good answer to this. It returns the fields in the order they are in the DB, which changed, as a list of course.

That means, it wreaks obvious havoc in what kind of loopy code I have to write to organize the returned results into a data frame like structure.

Any idea how I can get the fields in specified order and not what is in DB?

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
Gopala
  • 10,363
  • 7
  • 45
  • 77
  • [$project](https://docs.mongodb.com/manual/reference/operator/aggregation/project/) will be your solution I think. – Shrabanee Jun 18 '16 at 03:15
  • Not entirely sure on what you're asking; can you provide example data and/or an expected result? Any reason why you can't do the sorting in R, and that you're not using the aggregation framework? – SymbolixAU Jun 19 '16 at 04:13
  • In the database, at some point, the order of storing the above fields in each document has changed. MongoDB is returning these fields in the order they appear in the database. I want them returned in the order I am specifying. Of course, I can re-order in R, which I am doing. But, that requires me to run through entire data (list returned) and sort each entry of list. – Gopala Jun 19 '16 at 12:57
  • Also, when specifying the fields you want returned, you can only use `0` (don't return) or `1` (return). – SymbolixAU Jun 21 '16 at 00:22

1 Answers1

1

In the answer you link to it says

Simple answer is that you can't do this.

Also see the related mongodb ticket

However, to get your results in a data.frame-like structure, use mongolite, it's much easier to work with

Consider this example using the mtcars data

data("mtcars")

library(mongolite)  

mongo <- mongo(db = "test",
               collection = "mtcars",
               url = "mongodb://localhost")

## insert into database
mongo$insert(mtcars)
# Complete! Processed total of 32 rows.
# [1] TRUE

mongolite::find will automatically simplify the query results into a data.frame structure

df_results <- mongo$find()
# Imported 32 records. Simplifying into dataframe...

head(df_results)
#                     mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Or, using the aggregation framework

mongo$aggregate(pipeline = '[{ "$project" : { "mpg" : 1, "wt" : 1, "_id" : 0}  },
                             { "$limit" : 5 }]')

# Imported 5 records. Simplifying into dataframe...
#     mpg    wt
# 1 21.0 2.620
# 2 21.0 2.875
# 3 22.8 2.320
# 4 21.4 3.215
# 5 18.7 3.440

And now for a bit of shameless self-promotion. I've been working on an extension to mongolite that returns a data.table object. The idea here is to increase the speed of returned objects, but only if the returned result set can be coerced using rbindlist.

The package is mongolitedt, and still in development.

# devtools::install_github("SymbolixAU/mongolitedt")
library(mongolitedt)

bind_mongolitedt(mongo)

mongo$aggregatedt(pipeline = '[{ "$project" : { "mpg" : 1, "wt" : 1, "_id" : 0}  },
                             { "$limit" : 5 }]')

## now have a data.table object returned
#  Imported 5 records.
#     mpg    wt
# 1: 21.0 2.620
# 2: 21.0 2.875
# 3: 22.8 2.320
# 4: 21.4 3.215
# 5: 18.7 3.440

## clean up
rm(mongo); gc()
SymbolixAU
  • 25,502
  • 4
  • 67
  • 139