373

I'd like to get the names of all the keys in a MongoDB collection.

For example, from this:

db.things.insert( { type : ['dog', 'cat'] } );
db.things.insert( { egg : ['cat'] } );
db.things.insert( { type : [] } );
db.things.insert( { hello : []  } );

I'd like to get the unique keys:

type, egg, hello
Braiam
Steve

25 Answers

385

You could do this with MapReduce:

mr = db.runCommand({
  "mapreduce" : "my_collection",
  "map" : function() {
    for (var key in this) { emit(key, null); }
  },
  "reduce" : function(key, stuff) { return null; }, 
  "out": "my_collection" + "_keys"
})

Then run distinct on the resulting collection so as to find all the keys:

db[mr.result].distinct("_id")
["foo", "bar", "baz", "_id", ...]
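For intuition, the key-emission that the map/reduce performs can be sketched in plain Python on the question's sample documents (this only simulates the logic; it does not touch MongoDB):

```python
# Simulate the MapReduce key-emission on the question's sample documents.
docs = [
    {"type": ["dog", "cat"]},
    {"egg": ["cat"]},
    {"type": []},
    {"hello": []},
]

emitted = {}
for doc in docs:
    for key in doc:          # the map step: emit(key, null)
        emitted[key] = None  # the reduce step collapses duplicates

print(sorted(emitted))  # these keys play the role of distinct("_id")
```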
kris
  • 3
    Hi there! I've just posted a follow-up to this question asking how to make this snippet work even with keys located at deeper levels into the data structure (http://stackoverflow.com/questions/2997004/using-map-reduce-for-mapping-the-properties-in-a-collection). – Andrea Fiore Jun 08 '10 at 14:53
  • 1
    @kristina : How is it possible that I get entire *things* listed with the keys when using this on the *things* collection. It looks related to the history mechanism because I get *things* which I have modified in the past.. – Shawn Sep 26 '11 at 02:54
  • Why does the method above take longer than when I collate the keys externally using Python? – MFB Aug 23 '12 at 06:06
  • 3
    I know this is an old thread, but I seem to have a similar need. I'm using the nodejs mongodb native driver. The resulting temporary collection seems to empty always. I'm using the mapreduce function in the collection class for this. Is that not possible? – Deepak May 22 '14 at 14:15
  • 7
    This may be obvious, but if you want to get a list of all the unique keys in a subdocument, just modify this line: `for (var key in this.first_level.second_level.nth_level) { emit(key, null); }` – dtbarne Jan 07 '16 at 22:51
  • 1
    This function takes so long when using on a huge collection – Sercan Ozdemir Feb 16 '16 at 13:33
  • 4
    Instead of saving to a collection then running distinct on that, I use map(): `db.runCommand({..., out: { "inline" : 1 }}).results.map(function(i) { return i._id; });` – Ian Stanley Mar 03 '17 at 14:01
  • 1
    @dtbarne -- tried but not working, returning empty array. – Venkat Jun 27 '18 at 09:35
  • 1
    This doesn't work in environments when js engine is disabled as running javascript can bring security concerns, this is the case in many production environments – raspacorp Dec 19 '19 at 20:31
  • And to filter by type, just add e.g. `if (typeof(this[key]) == 'number')` before `emit(key, null)`. – Skippy le Grand Gourou Jan 12 '20 at 22:31
  • how can I achieve similar solution from scala ? – theAccidentalDeveloper Jun 29 '20 at 18:48
  • are you using pymongo? because `If you meant to call the 'runCommand' method on a 'Database' object it is failing because no such method exists.` – AhmadDeel Dec 06 '22 at 12:25
234

With Kristina's answer as inspiration, I created an open source tool called Variety which does exactly this: https://github.com/variety/variety

James Cropcho
  • 17
    This is a fantastic tool, congratulations. It does exactly what the question asks, and can be configured with limits, depth etc. Recommended by any who follows. – Paul Biggar Jun 10 '12 at 20:35
149

You can use aggregation with the $objectToArray operator (new in MongoDB 3.4.4) to convert all top-level key-value pairs into arrays of documents, followed by $unwind and $group with $addToSet to get the distinct keys across the entire collection. (Use $$ROOT to reference the top-level document.)

db.things.aggregate([
  {"$project":{"arrayofkeyvalue":{"$objectToArray":"$$ROOT"}}},
  {"$unwind":"$arrayofkeyvalue"},
  {"$group":{"_id":null,"allkeys":{"$addToSet":"$arrayofkeyvalue.k"}}}
])

You can use the following query to get the keys of a single document:

db.things.aggregate([
  {"$match":{_id: "<<ID>>"}}, /* Replace with the document's ID */
  {"$project":{"arrayofkeyvalue":{"$objectToArray":"$$ROOT"}}},
  {"$project":{"keys":"$arrayofkeyvalue.k"}}
])
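What the first (whole-collection) pipeline computes can be mimicked in plain Python on made-up sample documents, which may help when debugging the stages (this is only a simulation of the operators, not a driver call):

```python
docs = [
    {"_id": 1, "type": ["dog", "cat"]},
    {"_id": 2, "egg": ["cat"]},
]

# $project + $objectToArray: each document becomes an array of {k, v} pairs
array_of_kv = [[{"k": k, "v": v} for k, v in d.items()] for d in docs]

# $unwind: one element per key-value pair
unwound = [kv for pairs in array_of_kv for kv in pairs]

# $group + $addToSet: distinct keys across the collection
allkeys = {kv["k"] for kv in unwound}
print(sorted(allkeys))
```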
Victor Le Pochat
s7vr
  • 30
    This is really the best answer. Solves the issue without involving some other programming language or package, and works with all drivers that support the aggregate framework (even Meteor!) – Micah Henning Nov 16 '17 at 21:48
  • 2
    If you want to return an array rather than a cursor containing a single map entry with an "allkeys" key, you can append `.next()["allkeys"]` to the command (assuming the collection has at least one element). – M. Justin Apr 09 '20 at 22:09
  • I would just note that aggregate from @kristina answer takes 11 sec on my set, and Map Recude 2 sec). I did not expect that. – seven Jul 13 '20 at 13:20
  • 1
    This worked for me on a collection with millions of documents where the map reduce timed out. – A. L. Strine Jul 19 '21 at 21:45
  • 1
    I vote for this too.. It's native afterall... – Romeo Sierra Sep 17 '21 at 17:29
  • This even works with pymongo by just changing `null` to `None`. Best answer. – anishtain4 Jan 27 '22 at 15:06
  • Thanks for this! Translated it into a rubyish syntax, used it with mongoid, worked like a charm – orthodoX Jan 17 '23 at 16:37
23

A cleaned-up and reusable solution using pymongo:

from pymongo import MongoClient
from bson import Code

def get_keys(db, collection):
    # Note: Collection.map_reduce was removed in PyMongo 4.0; this needs PyMongo 3.x
    client = MongoClient()  # connects to localhost by default
    db = client[db]
    map = Code("function() { for (var key in this) { emit(key, null); } }")
    reduce = Code("function(key, stuff) { return null; }")
    result = db[collection].map_reduce(map, reduce, "myresults")
    return result.distinct('_id')

Usage:

get_keys('dbname', 'collection')
>> ['key1', 'key2', ... ]
Wolkenarchitekt
  • 2
    Works great. Finally got my problem solved....this is the simplest solution i saw in stack overflow.. – Smack Alpha Jul 09 '19 at 13:10
  • 1
    And to filter by type, just add e.g. `if (typeof(this[key]) == 'number')` before `emit(key, null)`. – Skippy le Grand Gourou Jan 12 '20 at 22:31
  • 1
    Note: using MongoDB free tier, I get errror `pymongo.errors.OperationFailure: CMD_NOT_ALLOWED: mapReduce, full error: {'ok': 0, 'errmsg': 'CMD_NOT_ALLOWED: mapReduce', 'code': 8000, 'codeName': 'AtlasError'}` apparently because `mapReduce` is not supported in free tier [MongoDB unsupported-commands](https://docs.atlas.mongodb.com/reference/unsupported-commands/) – curtisp May 31 '21 at 17:49
20

If you are using MongoDB 3.4.4 or above, you can use the aggregation below with the $objectToArray and $group stages:

db.collection.aggregate([
  { "$project": {
    "data": { "$objectToArray": "$$ROOT" }
  }},
  { "$project": { "data": "$data.k" }},
  { "$unwind": "$data" },
  { "$group": {
    "_id": null,
    "keys": { "$addToSet": "$data" }
  }}
])


Ashh
  • This is the best answer. You can also use `$match` at the beginning of the aggregation pipeline to only get the keys of documents that match a condition(s). – RonquilloAeon Aug 27 '19 at 18:21
19

If your target collection is not too large, you can try this under mongo shell client:

var allKeys = {};

db.YOURCOLLECTION.find().forEach(function(doc) {
  Object.keys(doc).forEach(function(key) { allKeys[key] = 1; });
});

allKeys;
Li Chunlin
12

Try this:

doc = db.things.findOne();
for (var key in doc) print(key);
Jeff Loughlin
Carlos LM
  • 58
    incorrect answer since this only outputs fields for a single document in a collection - the others may all have completely different keys. – Asya Kamsky Mar 31 '14 at 23:41
  • 17
    It is still the most useful answer to me, being a simple reasonable minimum. – Boris Burkov Jul 31 '14 at 16:13
  • 12
    It's not useful? How is it useful if it gives you the wrong answer? – Zlatko Jun 27 '15 at 07:48
  • 4
    The context show what is usefull: if data is normalized (ex. origen from CSV file), it is useful... For data imported from SQL is useful. – Peter Krauss Sep 22 '15 at 10:17
  • 5
    it is not a good answer it's an answer on how to get keys of **one** element in the collection not **all** keys in the collection! – yonatan Jan 07 '16 at 08:57
  • 2
    I think this answer doesn't work for the question but it did solve my problem: finding keys in a doc. +1 – limbo Aug 12 '16 at 18:51
  • 4
    This will resolve the issue like `db.thinks.find().forEach( function(doc) { for (key in doc) print(key); } );` – Kanagavelu Sugumar Jan 04 '17 at 17:17
11

Using Python (pymongo). This returns the set of all top-level keys in the collection:

# Using pymongo and a database handle named 'db'
from functools import reduce  # built in to Python 2; this import is needed in Python 3

reduce(
    lambda all_keys, rec_keys: all_keys | set(rec_keys),
    map(lambda d: d.keys(), db.things.find()),
    set()
)
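The same fold can be checked without a database by substituting plain dicts for the cursor (the sample data here is made up):

```python
from functools import reduce

# hypothetical documents standing in for db.things.find()
fake_cursor = [{"type": ["dog"]}, {"egg": ["cat"]}, {"type": []}]

keys = reduce(
    lambda all_keys, rec_keys: all_keys | set(rec_keys),
    map(lambda d: d.keys(), fake_cursor),
    set(),
)
print(sorted(keys))
```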
Laizer
9

Here is a sample in Python that returns the results inline:

from pymongo import MongoClient
from bson.code import Code

client = MongoClient()  # adjust connection details as needed
db = client.test        # assumes a database named 'test'

mapper = Code("""
    function() {
        for (var key in this) { emit(key, null); }
    }
""")
reducer = Code("""
    function(key, stuff) { return null; }
""")

distinctThingFields = db.things.map_reduce(
    mapper, reducer,
    out={'inline': 1},
    full_response=True)
## do something with distinctThingFields['results']
BobHy
7

I am surprised no one here has answered using simple JavaScript and Set logic to automatically filter duplicate values. A simple example in the mongo shell:

var allKeys = new Set();
db.collectionName.find().forEach(function (o) { for (var key in o) allKeys.add(key); });
for (let key of allKeys) print(key);

This will print all unique keys in the collection collectionName.

krishna Prasad
6

I think the best way to do this, as mentioned here, is in MongoDB 3.4.4+, but without using the $unwind operator and with only two stages in the pipeline. Instead we can use the $mergeObjects and $objectToArray operators.

In the $group stage, we use the $mergeObjects operator to return a single document whose keys/values come from all documents in the collection.

Then comes $project, where we use $map and $objectToArray to return the keys.

let allTopLevelKeys =  [
    {
        "$group": {
            "_id": null,
            "array": {
                "$mergeObjects": "$$ROOT"
            }
        }
    },
    {
        "$project": {
            "keys": {
                "$map": {
                    "input": { "$objectToArray": "$array" },
                    "in": "$$this.k"
                }
            }
        }
    }
];

Now, if we have nested documents and want to get their keys as well, this is doable. For simplicity, let's consider a document with a simple embedded document that looks like this:

{field1: {field2: "abc"}, field3: "def"}
{field1: {field3: "abc"}, field4: "def"}

The following pipeline yields all the keys (field1, field2, field3, field4).

let allFistSecondLevelKeys = [
    {
        "$group": {
            "_id": null,
            "array": {
                "$mergeObjects": "$$ROOT"
            }
        }
    },
    {
        "$project": {
            "keys": {
                "$setUnion": [
                    {
                        "$map": {
                            "input": {
                                "$reduce": {
                                    "input": {
                                        "$map": {
                                            "input": {
                                                "$objectToArray": "$array"
                                            },
                                            "in": {
                                                "$cond": [
                                                    {
                                                        "$eq": [
                                                            {
                                                                "$type": "$$this.v"
                                                            },
                                                            "object"
                                                        ]
                                                    },
                                                    {
                                                        "$objectToArray": "$$this.v"
                                                    },
                                                    [
                                                        "$$this"
                                                    ]
                                                ]
                                            }
                                        }
                                    },
                                    "initialValue": [

                                    ],
                                    "in": {
                                        "$concatArrays": [
                                            "$$this",
                                            "$$value"
                                        ]
                                    }
                                }
                            },
                            "in": "$$this.k"
                        }
                    }
                ]
            }
        }
    }
]

With a little effort, we can get the keys for all subdocuments in an array field whose elements are objects as well.

styvane
  • Yes `$unwind` will explode collection (no.of fields * no.of docs), we can avoid that by using `$mergeObjects` on all versions > `3.6`.. Did the same, Should've seen this answer before, my life would've been easier that way (-_-) – whoami - fakeFaceTrueSoul Feb 11 '20 at 00:42
4

To get a list of all the keys minus _id, consider running the following aggregate pipeline:

var keys = db.collection.aggregate([
    { "$project": {
       "hashmaps": { "$objectToArray": "$$ROOT" } 
    } }, 
    { "$group": {
        "_id": null,
        "fields": { "$addToSet": "$hashmaps.k" }
    } },
    { "$project": {
            "keys": {
                "$setDifference": [
                    {
                        "$reduce": {
                            "input": "$fields",
                            "initialValue": [],
                            "in": { "$setUnion" : ["$$value", "$$this"] }
                        }
                    },
                    ["_id"]
                ]
            }
        }
    }
]).toArray()[0]["keys"];
chridam
3

This works fine for me:

var arrayOfFieldNames = [];

var items = db.NAMECOLLECTION.find();

while(items.hasNext()) {
  var item = items.next();
  for(var index in item) {
    arrayOfFieldNames[index] = index;
   }
}

for (var index in arrayOfFieldNames) {
  print(index);
}
jimm101
ackuser
3

Maybe slightly off-topic, but you can recursively pretty-print all keys/fields of an object:

function _printFields(item, level) {
    if ((typeof item) != "object") {
        return
    }
    for (var index in item) {
        print(" ".repeat(level * 4) + index)
        if ((typeof item[index]) == "object") {
            _printFields(item[index], level + 1)
        }
    }
}

function printFields(item) {
    _printFields(item, 0)
}

Useful when all objects in a collection have the same structure.

qed
2

I know I am late to the party, but if you want a quick solution in Python for finding all keys (even the nested ones), you can do so with a recursive function:

def get_keys(dl, keys=None):
    if keys is None:  # "keys or []" would drop an empty accumulator passed by a recursive call
        keys = []
    if isinstance(dl, dict):
        keys += dl.keys()
        list(map(lambda x: get_keys(x, keys), dl.values()))
    elif isinstance(dl, list):
        list(map(lambda x: get_keys(x, keys), dl))
    return list(set(keys))

and use it like:

dl = db.things.find_one({})
get_keys(dl)

if your documents do not have identical keys, you can union over all of them:

dl = db.things.find({})
list(set().union(*map(get_keys, dl)))

but this solution can surely be optimized.

In general, this solution just finds keys in nested dicts, so it is not MongoDB-specific.
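Here is a self-contained check of the recursive idea on an in-memory nested document (no database needed; the explicit accumulator handling is my variation):

```python
def get_keys(dl, keys=None):
    # collect keys recursively from nested dicts and lists
    if keys is None:
        keys = []
    if isinstance(dl, dict):
        keys += dl.keys()
        for v in dl.values():
            get_keys(v, keys)
    elif isinstance(dl, list):
        for item in dl:
            get_keys(item, keys)
    return list(set(keys))

# hypothetical document with a nested dict and a list of subdocuments
doc = {"field1": {"field2": "abc"}, "field3": [{"field4": 1}]}
print(sorted(get_keys(doc)))
```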

gustavz
1

Based on @Wolkenarchitekt's answer (https://stackoverflow.com/a/48117846/8808983), I wrote a script that can find patterns in all keys in the db, and I think it can help others reading this thread:

"""
Python 3
This script gets a list of patterns and prints the collections that contain fields matching those patterns.
"""

import argparse

import pymongo
from bson import Code


# initialize mongo connection:
def get_db():
    client = pymongo.MongoClient("172.17.0.2")
    db = client["Data"]
    return db


def get_commandline_options():
    description = "To run use: python db_fields_pattern_finder.py -p <list_of_patterns>"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument('-p', '--patterns', nargs="+", help='List of patterns to look for in the db.', required=True)
    return parser.parse_args()


def report_matching_fields(relevant_fields_by_collection):
    print("Matches:")

    for collection_name in relevant_fields_by_collection:
        if relevant_fields_by_collection[collection_name]:
            print(f"{collection_name}: {relevant_fields_by_collection[collection_name]}")

    # pprint(relevant_fields_by_collection)


def get_collections_names(db):
    """
    :param pymongo.database.Database db:
    :return list: collections names
    """
    return db.list_collection_names()


def get_keys(db, collection):
    """
    See: https://stackoverflow.com/a/48117846/8808983
    :param db:
    :param collection:
    :return:
    """
    map = Code("function() { for (var key in this) { emit(key, null); } }")
    reduce = Code("function(key, stuff) { return null; }")
    result = db[collection].map_reduce(map, reduce, "myresults")
    return result.distinct('_id')


def get_fields(db, collection_names):
    fields_by_collections = {}
    for collection_name in collection_names:
        fields_by_collections[collection_name] = get_keys(db, collection_name)
    return fields_by_collections


def get_matches_fields(fields_by_collections, patterns):
    relevant_fields_by_collection = {}
    for collection_name in fields_by_collections:
        relevant_fields = [field for field in fields_by_collections[collection_name] if
                           [pattern for pattern in patterns if
                            pattern in field]]
        relevant_fields_by_collection[collection_name] = relevant_fields

    return relevant_fields_by_collection


def main(patterns):
    """
    :param list patterns: List of strings to look for in the db.
    """
    db = get_db()

    collection_names = get_collections_names(db)
    fields_by_collections = get_fields(db, collection_names)
    relevant_fields_by_collection = get_matches_fields(fields_by_collections, patterns)

    report_matching_fields(relevant_fields_by_collection)


if __name__ == '__main__':
    args = get_commandline_options()
    main(args.patterns)
Rea Haas
0

As per the MongoDB documentation, a combination of distinct

Finds the distinct values for a specified field across a single collection or view and returns the results in an array.

and the indexes collection operation, which

Returns an array that holds a list of documents that identify and describe the existing indexes on the collection,

is what would return all possible values for a given key, or index.

So one could use a method like the following to query a collection for all its registered indexes and return, say, an object with the indexes as keys (this example uses async/await for NodeJS, but obviously you could use any other asynchronous approach):

async function GetFor(collection, index) {

    let currentIndexes;
    let indexNames = [];
    let final = {};
    let vals = [];

    try {
        currentIndexes = await collection.indexes();
        await ParseIndexes();
        //Check if a specific index was queried, otherwise, iterate for all existing indexes
        if (index && typeof index === "string") return await ParseFor(index, indexNames);
        await ParseDoc(indexNames);
        await Promise.all(vals);
        return final;
    } catch (e) {
        throw e;
    }

    function ParseIndexes() {
        return new Promise(function (result) {
            let err;
            for (let ind in currentIndexes) {
                let index = currentIndexes[ind];
                if (!index) {
                    err = "No Key For Index "+index; break;
                }
                let Name = Object.keys(index.key);
                if (Name.length === 0) {
                    err = "No Name For Index"; break;
                }
                indexNames.push(Name[0]);
            }
            return result(err ? Promise.reject(err) : Promise.resolve());
        })
    }

    async function ParseFor(index, inDoc) {
        if (inDoc.indexOf(index) === -1) throw "No Such Index In Collection";
        try {
            await DistinctFor(index);
            return final;
        } catch (e) {
            throw e
        }
    }
    function ParseDoc(doc) {
        return new Promise(function (result) {
            let err;
            for (let index in doc) {
                let key = doc[index];
                if (!key) {
                    err = "No Key For Index "+index; break;
                }
                vals.push(new Promise(function (pushed) {
                    DistinctFor(key)
                        .then(pushed)
                        .catch(function (err) {
                            return pushed(Promise.resolve());
                        })
                }))
            }
            return result(err ? Promise.reject(err) : Promise.resolve());
        })
    }

    async function DistinctFor(key) {
        if (!key) throw "Key Is Undefined";
        try {
            final[key] = await collection.distinct(key);
        } catch (e) {
            final[key] = 'failed';
            throw e;
        }
    }
}

So querying a collection with the basic _id index, would return the following (test collection only has one document at the time of the test):

Mongo.MongoClient.connect(url, function (err, client) {
    assert.equal(null, err);

    let collection = client.db('my db').collection('the targeted collection');

    GetFor(collection, '_id')
        .then(function () {
            //returns
            // { _id: [ 5ae901e77e322342de1fb701 ] }
        })
        .catch(function (err) {
            //manage your error..
        })
});

Mind you, this uses methods native to the NodeJS Driver. As some other answers have suggested, there are other approaches, such as the aggregate framework. I personally find this approach more flexible, as you can easily create and fine-tune how to return the results. Obviously, this only addresses top-level attributes, not nested ones. Also, to guarantee that all documents are represented should there be secondary indexes (other than the main _id one), those indexes should be set as required.

jlmurph
0

We can achieve this using a mongo JS file. Add the code below to a getCollectionName.js file and run it from the Linux console as shown below:

mongo --host 192.168.1.135 getCollectionName.js

db_set = connect("192.168.1.135:27017/database_set_name"); // for Local testing
// db_set.auth("username_of_db", "password_of_db"); // if required

db_set.getMongo().setSlaveOk();

var collectionArray = db_set.getCollectionNames();

collectionArray.forEach(function(collectionName){

    if ( collectionName == 'system.indexes' || collectionName == 'system.profile' || collectionName == 'system.users' ) {
        return;
    }

    print("\nCollection Name = "+collectionName);
    print("All Fields :\n");

    var arrayOfFieldNames = []; 
    var items = db_set[collectionName].find();
    // var items = db_set[collectionName].find().sort({'_id':-1}).limit(100); // if you want fast & scan only last 100 records of each collection
    while(items.hasNext()) {
        var item = items.next(); 
        for(var index in item) {
            arrayOfFieldNames[index] = index;
        }
    }
    for (var index in arrayOfFieldNames) {
        print(index);
    }

});

quit();

Thanks @ackuser

Irshad Khan
0

Following the thread from @James Cropcho's answer, I landed on the following tool, which I found super easy to use. It is a binary tool, which is exactly what I was looking for: mongoeye.

Using this tool, it took about 2 minutes to get my schema exported from the command line.

paneer_tikka
0

I know this question is 10 years old, but there is no C# solution, and this took me hours to figure out. I'm using the .NET driver and System.Linq to return a list of the keys.

var map = new BsonJavaScript("function() { for (var key in this) { emit(key, null); } }");
var reduce = new BsonJavaScript("function(key, stuff) { return null; }");
var options = new MapReduceOptions<BsonDocument, BsonDocument>();
var result = await collection.MapReduceAsync(map, reduce, options);
var list = result.ToEnumerable().Select(item => item["_id"].ToString());
Joe Mayo
Andrew Samole
0

This one-liner extracts all keys from a collection into a comma-separated, sorted string:

db.<collection>.find().map((x) => Object.keys(x))
    .reduce((a, e) => { for (const el of e) { if (!a.includes(el)) { a.push(el); } } return a; }, [])
    .sort((a, b) => a.toLowerCase().localeCompare(b.toLowerCase()))
    .join(", ")

The result of this query typically looks like this:

_class, _id, address, city, companyName, country, emailId, firstName, isAssigned, isLoggedIn, lastLoggedIn, lastName, location, mobile, printName, roleName, route, state, status, token
gil.fernandes
0

We could use the open source tool Variety for this, but it lacks compatibility with mongosh. Follow the steps below to get it working with mongosh:

  1. First Install Mongosh CLI

  2. Then download variety.js from https://github.com/CodyDWJones/variety/blob/master/variety.js and move it into your Ubuntu $HOME

  3. Use below command to generate schema

    mongosh "mongodb://mongoURIexample" variety.js --eval "use dbexample" --eval "var collection = 'collectionexample'"

Make sure you replace the URI, db and collection name appropriately.

Dulshan
  • Variety is already introduced in another [highly upvoted, bounty-rewarded answer](https://stackoverflow.com/a/10366065) – ray Aug 21 '23 at 21:09
  • Yes true, but you will have trouble setting it up with mongosh cli if you dont follow above steps. mongosh is the latest mongo cli. There has to be certain tweaks applied in order to get it working with mongosh. The link I posted has fixed it and is still a open PR. – Dulshan Aug 22 '23 at 06:04
-1

I extended Carlos LM's solution a bit so it's more detailed.

Example of a schema:

var schema = {
    _id: 123,
    id: 12,
    t: 'title',
    p: 4.5,
    ls: [{
            l: 'lemma',
            p: {
                pp: 8.9
            }
        },
         {
            l: 'lemma2',
            p: {
               pp: 8.3
           }
        }
    ]
};

Type into the console:

var schemafy = function(schema, i, limit) {
    var i = (typeof i !== 'undefined') ? i : 1;
    var limit = (typeof limit !== 'undefined') ? limit : false;
    var type = '';
    var array = false;

    for (key in schema) {
        type = typeof schema[key];
        array = (schema[key] instanceof Array) ? true : false;

        if (type === 'object') {
            print(Array(i).join('    ') + key+' <'+((array) ? 'array' : type)+'>:');
            schemafy(schema[key], i+1, array);
        } else {
            print(Array(i).join('    ') + key+' <'+type+'>');
        }

        if (limit) {
            break;
        }
    }
}

Run:

schemafy(db.collection.findOne());

Output

_id <number>
id <number>
t <string>
p <number>
ls <object>:
    0 <object>:
    l <string>
    p <object>:
        pp <number> 
va5ja
  • 3
    his answer is wrong and you built on top of it. the whole point is to output *all* the fields of *all* the documents, not the first document which may have different fields than each next one. – Asya Kamsky Mar 31 '14 at 23:43
-1

I was trying to write this in Node.js and finally came up with this:

db.collection('collectionName').mapReduce(
function() {
    for (var key in this) {
        emit(key, null);
    }
},
function(key, stuff) {
    return null;
}, {
    "out": "allFieldNames"
},
function(err, results) {
    var fields = db.collection('allFieldNames').distinct('_id');
    fields
        .then(function(data) {
            var finalData = {
                "status": "success",
                "fields": data
            };
            res.send(finalData);
            delteCollection(db, 'allFieldNames');
        })
        .catch(function(err) {
            res.send(err);
            delteCollection(db, 'allFieldNames');
        });
 });

After reading the newly created collection "allFieldNames", delete it.

db.collection("allFieldNames").remove({}, function (err,result) {
     db.close();
     return; 
});
Gautam
-3

I have a simpler workaround.

While inserting a document into your main collection "things", also insert its attributes into a separate collection, say "things_attributes".

Every time you insert into "things", fetch the document from "things_attributes", compare its keys with your new document's keys, and if any new key is present, append it to that document and re-insert it.

"things_attributes" will then hold a single document of unique keys, which you can easily fetch whenever you need it with findOne().
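A rough sketch of that insert-time bookkeeping in Python (the names and in-memory structures are hypothetical; real code would use two pymongo collections):

```python
# In-memory stand-ins for the two collections.
things = []
things_attributes = {"_id": "keys", "keys": []}  # single document of unique keys

def insert_thing(doc):
    # insert into the main collection...
    things.append(doc)
    # ...and merge any new keys into the attributes document
    known = set(things_attributes["keys"])
    things_attributes["keys"] = sorted(known | set(doc))

insert_thing({"type": ["dog", "cat"]})
insert_thing({"egg": ["cat"]})
print(things_attributes["keys"])
```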

Paresh Behede
  • For databases with many entries where queries for all keys are frequent and inserts are infrequent, caching the result of the "get all keys" query would make sense. This is one way to do that. – Him Apr 02 '19 at 15:51