0

Doc 1:

{
  typeId: 'A1',
  name: 'EAGLE-25'
}

Doc 2:

{
  typeId: 'A1',
  name: 'EAGLE-32'
}

Doc 3:

{
  typeId: 'B1',
  name: 'FOX5'
}

Doc 4:

{
  typeId: 'B1',
  name: 'FOX15'
}

Wanted result after aggregated query:

[
  {
     typeId: 'A1',
     commonName: 'EAGLE',
     names: ['EAGLE-25', 'EAGLE-32']
  },
  {
     typeId: 'B1',
     commonName: 'FOX',
     names: ['FOX5', 'FOX15']
  }
]

Is this possible with the aggregation framework?

Joe
  • 4,274
  • 32
  • 95
  • 175
  • 1
    Possible duplicate of [Match with substring in mongodb aggregation](https://stackoverflow.com/questions/20066279/match-with-substring-in-mongodb-aggregation) and also https://stackoverflow.com/questions/28282334/mongodb-group-by-field-using-substr – dnickless Jul 25 '17 at 18:42

1 Answers1

1

Here you go:

db.getCollection('test').aggregate
(
  {
    $group:
    {
      _id:
      {
        "typeId": "$typeId",
        "commonName": { "$substrCP": [ "$name", 0, { "$indexOfCP": [ "$name", "-" ] } ] } // group by substring of "name" property 
      },
      "names": { $push: "$name" } // create the "names" array per group
    }
  },
  {
    $project:
    {
      "_id": 0, // get rid of _id field
      "typeId": "$_id.typeId", // flatten "_id.typeId" into "typeId"
      "commonName": "$_id.commonName", // flatten "_id.commonName" into "commonName"
      "names": "$names" // include "names" array the way it is
    }
  }
)

As always with MongoDB aggregations you can get an idea of what's going on by simply reducing the projection stages step by step starting from the end of the query.

EDIT:

My answer above doesn't make too much sense anymore after your latest change to your question. And I cannot think of a way to make your generic "least common denominator" query work.

But more importantly, I think your are missing something in your specification. Imagine you have the following elements in your database:

{
  typeId: 'A1',
  name: 'EAGLE-25'
}

{
  typeId: 'A1',
  name: 'EATS-26'
}

{
  typeId: 'A1',
  name: 'EVERYTHING-27'
}

What you would get with your "least common denominator" concept is this:

[
  {
     typeId: 'A1',
     commonName: 'E',
     names: ['EAGLE-25', 'EATS-26', 'EVERYTHING-27']
  }
]

It would appear that this result does not make a lot of sense anymore...?!

EDIT 2:

I had an idea which works on the assumption that you can define a maximum length of the "common prefix". We get pretty close to what you want, I think:

db.getCollection('eagle').aggregate
(
  {
    $project:
    {
      "range": {$range: [ 1, 10, 1 ]}, // adjust the '10' to match the maximum length of your "least common prefix"
      "typeId": "$typeId",
      "name": "$name"
    }
  },
  { $unwind: "$range" },
  {
    $project:
    {
      "typeId": "$typeId",
      "name": "$name",
      "commonName": { $substrCP: ["$name", 0, "$range"] } // extract the first couple of characters from the name
    }
  },
  {
    $group: { _id: {"typeId": "$typeId", "commonName": "$commonName"}, "names": { $addToSet: "$name" } }
  },
  {
    $project:
    {
      "_id": 0, // get rid of _id field
      "typeId": "$_id.typeId", // flatten "_id.typeId" into "typeId"
      "commonName": "$_id.commonName", // flatten "_id.commonName" into "commonName"
      "names": "$names" // include "names" array the way it is
    }
  }
)
dnickless
  • 10,733
  • 1
  • 19
  • 34
  • I have other strings than in the example. – Joe Jul 25 '17 at 18:48
  • Give me some more details and I shall happily refine the query. – dnickless Jul 25 '17 at 18:49
  • Nice, but the string don't always have a "-" as a "seperator". The string can be anything. – Joe Jul 26 '17 at 12:52
  • Anything as in what? What's the pattern that you would want to split by? – dnickless Jul 26 '17 at 12:58
  • I want "lowest common denominator" the string starts with. E.g 'ABCD' and 'ABCF' have 'ABC' as the commonName. – Joe Jul 26 '17 at 13:03
  • I can understand it dosen't make sense but it makes sense when you have the full picture. My data comes from a very "loose" source. I guess I have to take care of it manually. Thx for your effort. – Joe Jul 26 '17 at 13:46