29

I am using mongoexport to export some data to a .json file; however, the output has a large size overhead introduced by the _id:IDVALUE pairs.

I found a similar post, "Is there a way to retrieve data from MongoDB without the _id field?", on how to omit the _id field when retrieving data from mongo, but not when exporting. It suggests using .Exclude("_id"). I tried to rewrite the --query parameter of mongoexport to somehow include the .Exclude("_id") call, but all of my attempts have failed so far.

Please suggest the proper way of doing this, or should I fall back on some post-export technique?

Thanks

Nik

13 Answers

20

There appears to be no way to exclude a field (such as _id) using mongoexport.

Here's an alternative that has worked for me on moderately sized databases:

mongo myserver/mydb --quiet --eval "db.mycoll.find({}, {_id:0}).forEach(printjson);" > out.txt

On a large database (many millions of records) it can take a while, and running it will affect other operations people try to do on the system.

quux00
13

This works:

mongoexport --db db_name --collection collection_name | sed '/"_id":/s/"_id":[^,]*,//' > file_name.json
Gianfranco P.
  • @Gianfranco P., if the mongo documents contain the "_id" keyword in other locations, will those also be replaced? – Joseph D Apr 08 '21 at 08:13
  • @JosephD good question. it doesn't as `sed` is matching the double-quotes. Here's the test I did https://gist.github.com/gianpaj/df6470cb24f82659580200828f6873bc – Gianfranco P. Apr 08 '21 at 11:02
  • Thanks @Gianfranco P., and that sed is checking for matches in "_id" only, right? – Joseph D Apr 08 '21 at 13:22
  • 1
    @JosephD yep. see https://linuxize.com/post/how-to-use-sed-to-find-and-replace-string-in-files/ – Gianfranco P. Apr 08 '21 at 18:32
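
To make the discussion above concrete, here is a small sketch of how the sed expression behaves on two sample lines. The sample documents and the `strip_id` helper name are made up for illustration, and the compact spacing (no space around the colon) is an assumption about the mongoexport output; if your version formats lines differently, the pattern would need adjusting.

```shell
# Hypothetical helper wrapping the sed expression from the answer above.
strip_id() {
  sed '/"_id":/s/"_id":[^,]*,//'
}

# Typical line: the leading top-level _id is removed, the nested one is kept.
printf '%s\n' '{"_id":{"$oid":"507f1f77bcf86cd799439011"},"name":"Ann","ref":{"_id":"x1"}}' | strip_id
# prints: {"name":"Ann","ref":{"_id":"x1"}}

# Compound _id whose value contains a comma: the match stops at the first
# comma, so the result is no longer valid JSON.
printf '%s\n' '{"_id":{"a":1,"b":2},"name":"Ann"}' | strip_id
# prints: {"b":2},"name":"Ann"}
```

Because there is no `g` flag, only the first `"_id":` match on each line is removed, which is why the nested key survives. The second case shows the limit of the approach: an `_id` whose value itself contains a comma is cut in the middle, so a JSON-aware tool is safer for such data.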
10

Pipe the output of mongoexport into jq and remove the _id field there.

mongoexport --uri=mongodb://localhost/mydb --collection=my_collection \
  | jq 'del(._id)'

Update: adding link to jq.
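
By default jq pretty-prints each document over several lines. If one document per line is preferred, as in the original mongoexport output, the `-c` (compact) flag can be added. A minimal sketch; the `strip_id_jq` helper name and the `out.json` file name are hypothetical:

```shell
# Hypothetical helper: read JSON documents on stdin and write each one
# back on a single line without its top-level _id field.
strip_id_jq() {
  jq -c 'del(._id)'
}

# Usage with the command from this answer (not run here, it needs a MongoDB server):
# mongoexport --uri=mongodb://localhost/mydb --collection=my_collection \
#   | strip_id_jq > out.json
```

For an input line such as `{"_id":{"$oid":"abc"},"name":"Ann"}` this should emit `{"name":"Ann"}`. Nested `_id` keys are left alone because `del(._id)` only addresses the top level of each document.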

sebastian
8

I know you specified that you wanted to export as JSON, but if you can use CSV data instead, the native mongo export will work and will be a lot faster than the above solutions:

mongoexport --db <dbName> --collection <collectionName> --csv --fields "<fieldOne>,<fieldTwo>,<fieldThree>" > mongoex.csv
Peter Perháč
A_funs
5

mongoexport doesn't seem to have such an option.

With ramda-cli stripping the _id would look like:

mongoexport --db mydb --collection mycoll -f name,age | ramda 'omit ["_id"]'
raine
  • Trying this but I keep getting the _id. ```mongoexport --db enso --collection places --out placesType.json -f "name,city,latitude,longitude,objectID,type" --jsonArray | R 'omit ["_id"]'``` I get "must specify --save, --no-save, or --vanilla". I tried each one and got the same output with _id and this console error ```ARGUMENT 'omit' __ignored__ ARGUMENT '["_id"]' __ignored__```. Any ideas? – armand Dec 07 '16 at 16:50
  • I also get error for "specify --save, --no-save, or --vanilla" – KiwenLau Apr 13 '18 at 07:04
4

I applied quux00's solution, but forEach(printjson) prints MongoDB Extended JSON notation in the output (for instance "last_update" : NumberLong("1384715001000")).

It is better to use the following instead:

db.mycoll.find({}, {_id:0}).forEach(function (doc) {
    print( JSON.stringify(doc) );
});
noleto
2
mongo <server>/<database> --quiet --eval "db.<collection>.find({}, {_id:0,<field>:1}).forEach(printjson);" > out.txt

If you need a query condition, wrap the --eval argument in '' instead of "" and write the condition inside find using "", like find({"age":13}).
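
The quoting advice above can be sketched as follows. Single quotes keep the shell from touching the double quotes inside the JavaScript, so the filter can be written as ordinary JSON. The server, database, collection and `name` field are placeholders, and the `mongo` call is left commented out because it needs a running server:

```shell
# Single-quoted so the inner double quotes reach the mongo shell unchanged.
# {"age": 13} is the query filter; {_id: 0, name: 1} is the projection.
EVAL='db.mycoll.find({"age": 13}, {_id: 0, name: 1}).forEach(printjson);'

# Show exactly what would be passed to --eval.
printf '%s\n' "$EVAL"

# mongo myserver/mydb --quiet --eval "$EVAL" > out.txt
```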

xring
0

The simplest way to exclude sub-document information such as the "_id" is to export the data as CSV, then use a tool to convert the CSV into JSON.

Matt
0

mongoexport cannot omit "_id".

sed is a powerful command that can do it:

mongoexport --db mydb --collection mycoll -f name,age | sed '/"_id":/s/"_id":[^,]*,//'

The original answer is from Exclude _id field using MongoExport command

KiwenLau
0

Just use the --type=csv option in the mongoexport command.

mongoexport --db=<db_name> --collection=<collection_name> --type=csv --fields=<fields> --out=<Outfilename>.csv

For MongoDB version 3.4, you can also use the --noHeaderLine option in the mongoexport command to exclude the field header line from the CSV export.

For details: https://docs.mongodb.com/manual/reference/program/mongoexport/

Saman
0

Export to a file, then use a regular expression to replace the _id entry with an empty string. In my case the entry looked like this:

"_id": "f5dc48e1-ed04-4ef9-943b-b1194a088b95"

so I used the pattern "_id": "(\w|-)*",
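
The same replacement can also be scripted instead of done in an editor. A rough sketch with `sed -E`, assuming string-valued ids made of letters, digits, underscores and hyphens as in the example above. The `strip_string_id` name and the sample line are made up, and `\w` is spelled out as a POSIX character class for portability:

```shell
# Hypothetical helper: drop a leading string-valued "_id" entry, its
# trailing comma, and any spaces that follow the comma.
strip_string_id() {
  sed -E 's/"_id": "[[:alnum:]_-]*", *//'
}

printf '%s\n' '{"_id": "f5dc48e1-ed04-4ef9-943b-b1194a088b95", "name": "Ann"}' | strip_string_id
# prints: {"name": "Ann"}
```

Like the original pattern, this only matches when a comma follows the id, so a document whose only field is `_id` is left unchanged.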

Afsanefda
0

With jq this can be achieved easily:

mongoexport -d database -c collection --jsonArray | jq 'del(.[]._id)'
Hüseyin
-7

Have you tried specifying your fields with the --fields flag? All fields that are not mentioned are excluded from the export.

For maintainability you can also write your fields into a separate file and use --fieldFile.

zemirco
  • 1
    Yes, I did so from the very beginning, and was surprised to find the _id field in the exported file. I just double checked that, and the _id field is indeed exported. – Nik Oct 21 '12 at 18:22
  • I just iterated and removed items with "_id" key. – Nik Oct 21 '12 at 19:40
  • From the [documentation](https://docs.mongodb.com/manual/reference/program/mongoexport/#cmdoption--fields) > For JSON output formats, mongoexport includes only the specified field(s) and the _id field, and if the specified field(s) is a field within a sub-document, the mongoexport includes the sub-document with all its fields, not just the specified field within the document. – aandis Aug 25 '16 at 12:03