
Although this question has been asked many times, none of the answers seems to provide the desired result. Is there a way in MongoDB to list all keys/fields of all documents in a collection, including those of nested documents? To illustrate the issue, here is a short example:

Given the following collection:

[
  {
    "A": "B",
    "C": {
      "D": "E",
      "F": "G"
    }
  },
  {
    "H": "I",
    "J": {
      "K": {
        "L": "M",
        "N": "O"
      },
      "P": "Q"
    }
  }
]

I want to get the following set of keys as output:

{
  "keys": ["A", "C.D", "C.F", "H", "J.K.L", "J.K.N", "J.P"]
}

All other solutions I found so far only return the top-level keys:

{
  "keys": ["A", "C", "H", "J"]
}

I've already experimented a lot with the aggregation pipeline, but for the life of me could not figure out how to convince it to give me the desired result. Maybe someone could help me out here.

D. SM
evermean
  • Can you post the expected output? I understood it as something like [this](https://mongoplayground.net/p/oHbfsSe87XD) – J.F. Nov 24 '20 at 15:07
  • Thanks for the answer. The proposed solution works if I have a clear understanding of my documents and know in advance which keys might exist. In my case I don't know which keys exist, so I have to "discover" them. I have updated my question accordingly – evermean Nov 24 '20 at 16:07

1 Answer


Using "Get names of all keys in the collection" as a guide, a map-reduce with a recursive map function can walk the nested documents:

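var mr = db.runCommand({
  "mapReduce" : "t",  // name of the source collection
  "map" : function() {
    // Recursively emit the dotted path of every field. Intermediate
    // fields such as "C" or "J.K" are emitted as well as the leaves.
    var r = function(prefix, value) {
      var emit_prefix = prefix;
      if (prefix) {
        emit_prefix += '.';
      }
      for (var key in value) {
        if (value.hasOwnProperty(key)) {
          emit(emit_prefix + key, null);
          var subv = value[key];
          // typeof null is also 'object', so exclude it explicitly
          if (subv !== null && typeof subv === 'object') {
            r(emit_prefix + key, subv);
          }
        }
      }
    };
    r('', this);
  },
  // Every emitted value is null; only the distinct keys matter.
  "reduce" : function(key, stuff) { return null; },
  "out": "t" + "_keys"  // results are written to the t_keys collection
})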

No separate distinct step is required: map-reduce groups the emitted values by key, so each key appears exactly once in the output collection (t_keys).
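
To check the recursion without a MongoDB server, the same walk can be sketched in plain JavaScript and run against the two documents from the question. This is only an illustration, not part of the map-reduce job: the helper name collectLeafKeys is hypothetical, and unlike the map function it records leaf paths only, which is what the question lists as the desired output.

```javascript
// Hypothetical helper: collect the dotted paths of all leaf fields of one document.
function collectLeafKeys(value, prefix, out) {
  for (const key of Object.keys(value)) {
    const path = prefix ? prefix + "." + key : key;
    const sub = value[key];
    if (sub !== null && typeof sub === "object" && !Array.isArray(sub)) {
      collectLeafKeys(sub, path, out); // descend into the nested document
    } else {
      out.add(path); // scalar, null, or array: treat as a leaf
    }
  }
  return out;
}

// The sample collection from the question.
const docs = [
  { A: "B", C: { D: "E", F: "G" } },
  { H: "I", J: { K: { L: "M", N: "O" }, P: "Q" } },
];

// A Set plays the role of the "distinct" step.
const leafKeys = new Set();
for (const doc of docs) {
  collectLeafKeys(doc, "", leafKeys);
}
const result = { keys: [...leafKeys].sort() };
console.log(JSON.stringify(result));
// prints {"keys":["A","C.D","C.F","H","J.K.L","J.K.N","J.P"]}
```

Note that an empty nested document contributes no key in this sketch, because it has no leaves.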

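Your question didn't include arrays of documents, so the code above does not treat them specially: for an array, the loop iterates over its indices and emits keys such as "arr.0.field".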
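
If arrays of documents do occur, one possible way to handle them is to descend into each element without adding the index to the path. The following plain JavaScript sketch shows the idea; the function name collectKeysThroughArrays and the sample document are hypothetical, and whether array elements should share one path is a design choice rather than something the question specifies.

```javascript
// Hypothetical helper: collect dotted leaf paths, looking through arrays
// so that [{x: 1}, {y: 2}] under "items" yields "items.x" and "items.y".
function collectKeysThroughArrays(value, prefix, out) {
  if (Array.isArray(value)) {
    // Array elements share the path of the array itself.
    for (const item of value) {
      collectKeysThroughArrays(item, prefix, out);
    }
  } else if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) {
      const path = prefix ? prefix + "." + key : key;
      collectKeysThroughArrays(value[key], path, out);
    }
  } else if (prefix) {
    out.add(prefix); // scalar or null: record the path as a leaf
  }
  return out;
}

// Made-up document with an array of sub-documents.
const sample = { A: "B", items: [{ x: 1 }, { y: 2, z: { w: 3 } }] };
const arrayKeys = [...collectKeysThroughArrays(sample, "", new Set())].sort();
console.log(arrayKeys);
// prints [ 'A', 'items.x', 'items.y', 'items.z.w' ]
```

Empty arrays and empty nested documents contribute no key in this sketch.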

D. SM
  • Thank you very much for the proposed solution. Unfortunately I could not check if it works correctly as it took too long to finish. After about 30 min I stopped the running process. Is it possible that I did something wrong? I believe that this query should not take that long. The collection on which I run the query has about 1.1 million documents, so not too many. – evermean Nov 27 '20 at 10:32
  • Try on a smaller collection on a local server. – D. SM Nov 27 '20 at 15:56
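
One way to try this on a subset without copying data is the limit field of the mapReduce command, which caps the number of documents passed to the map function. The sketch below builds such a command; the helper name buildKeyDiscoveryCommand and the value 1000 are hypothetical, and the inline output is an assumption that the resulting key list is small. It assumes the legacy mongo shell, as in the answer; other clients may need the functions passed as strings.

```javascript
// Hypothetical helper: build a mapReduce command that inspects at most
// maxDocs documents and returns the keys inline instead of writing a collection.
function buildKeyDiscoveryCommand(collectionName, maxDocs) {
  return {
    mapReduce: collectionName,
    map: function () {
      // Same recursive walk as in the answer: emit every dotted path.
      var walk = function (prefix, value) {
        for (var key in value) {
          if (!value.hasOwnProperty(key)) continue;
          var path = prefix ? prefix + "." + key : key;
          emit(path, null);
          if (value[key] !== null && typeof value[key] === "object") {
            walk(path, value[key]);
          }
        }
      };
      walk("", this);
    },
    reduce: function (key, values) { return null; },
    limit: maxDocs,      // only the first maxDocs input documents are mapped
    out: { inline: 1 },  // return results in the command response
  };
}

// Usage in the shell (not executed here):
//   db.runCommand(buildKeyDiscoveryCommand("t", 1000))
```

Keys that occur only in documents beyond the limit are missed, so this is a quick check of correctness and speed rather than a full inventory.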