0

I have records of this form:

{
    "_id" : ObjectId("57993e64498e9bebb535154f"),
    "fooKey" : "123|a|b|c|||d",
    "locationId" : 1,
    "type" : "FOO"
}

{
    "_id" : ObjectId("579e0a3d498e9bebb545ff96"),
    "fooKey" : "123|x|y|z|||v",
    "locationId" : 1,
    "type" : "FOO"
}

{
    "_id" : ObjectId("57a5443b498e381a40a26afb"),
    "fooKey" : "123|a|b|c|||d",
    "locationId" : 2,
    "type" : "FOO"
}

{
    "_id" : ObjectId("57a63fef498e381a40a60347"),
    "fooKey" : "123|x|y|z|||v",
    "locationId" : 2,
    "type" : "FOO"
}

{
    "_id" : ObjectId("579ab3ce498e9538125052ca"),
    "fooKey" : "456|h|j|j|||k",
    "locationId" : 2,
    "type" : "BAR"
}

I went through the documentation and this seems like it could be complex given that I need this today (and I am not an expert in Mongo). What I need an aggregation query (for only records with "type" : "FOO") to return groups grouped by:

  • The first field in the pipe-delimited string in the "fooKey" field (for example "123"
  • The locationId

and then where the resulting count of the type field (where it is specifically equal to "FOO" is greater than 1.

That is given the records above I need something along the lines of records 1 and 2 returned in a group along with records 3 and 4 aggregated in a group... along with a count of the group size.

Expected Output

Something like this:

{ 
    "foo": "123", 
    "locationId": 1, 
    "type": "FOO", 
    "total": 2 
}, 
{ 
    "foo": "123", 
    "locationId": 2,
    "type": "FOO", 
    "total": 2 
} 
Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
TonyW
  • 73
  • 1
  • 9
  • Something like this:{ foo: “123", "locationId" : 1, type: "FOO", total: 2 }, { foo: “123", "locationId" : 2, type: "FOO", total: 2 } – TonyW Aug 15 '16 at 19:48
  • Also there could be many distinct types of foo's for fooKeys.... foo is like a person Id. – TonyW Aug 15 '16 at 19:50
  • 2
    [Most efficient way to change a string field value to its substring](https://stackoverflow.com/questions/38832525/most-efficient-way-to-change-a-string-field-value-to-its-substring) – styvane Aug 15 '16 at 20:01
  • Unless your first token in the pipe delimited fooKey will always be the same length, I don't think you can do this in the current version of MongoDB using aggregate. – Wake Aug 15 '16 at 20:19
  • Is there a regex way to do it? – TonyW Aug 15 '16 at 22:00
  • It looks like the split operation can split the tokens of the pipe-delimited string into an array to take the first element. – TonyW Aug 15 '16 at 22:06
  • This returns me back what I want -- IF I had the "foo" field in the original records:db.somecollection.aggregate([ { "$group": { "_id": { "foo": "$foo", "locationId": "$locationId" }, "count": { "$sum": 1 } }}, {$match: {count: {$gt: 1}}} ]) but I really don't want to be adding extra fields on a production document database. – TonyW Aug 15 '16 at 22:33
  • I am on Mongo 2.4. According to this link (where someone is asking an almost identical question), there is no way to accomplish the above (without using MR)?http://stackoverflow.com/questions/21118774/how-can-i-create-new-fields-from-an-existing-string-with-project – TonyW Aug 15 '16 at 23:08

0 Answers0