15

Because mongodb will index sparse, compound indexes that contain 1 or more of the indexed fields, it is causing my unique, sparse index to fail because one of those fields is optional, and is being coerced to null by mongodb for the purpose of the index.

I need database-level ensurance of uniqueness for the combination of this field and a few others, and having to manage this at the application level via some concatenated string worries me.

As an alternative, I considered setting the default value of the possibly null indexed field to 'null ' + anObjectId, because it would allow me to keep the index without causing errors. Does this seem like a sensisble (although hacky) solution? Does anyone know of a better way I could enforce database-level uniqueness on a compound index?

Edit: I was asked to elaborate on the actual problem domain a bit more, so here it goes.

We get large data feeds from our customers that we need to integrate into our database. These feeds include various (3) unique identifiers supplied by the customer that we use for updating the versions we store in our database when the data feeds refresh. I need to tie uniqueness of these identifiers to the customer, because the same identifier could appear from multiple sources, and we want to allow that.

The document structure looks like this:

{
  "identifiers": {
      "identifierA": ...,
      "identifierB": ...,
      "identifierC": ...
  },
  "client": ...
}

Because the each individual identifier is optional (at least one of the three is required), I need to uniquely index the combination of the index with the client (e.g. one index is the combination of client plus identifierA). However, this index must only occur when the identifier exists, but this is not supported my mongodb (see the hyperlink above).

I was considering the above solution, but I would like to hear if anyone else has solved this or has suggestions.

jtmarmon
  • 5,727
  • 7
  • 28
  • 45
  • Too wordy an unclear. Try posting some examples of what you want to achieve. – Neil Lunn Jan 28 '15 at 01:34
  • 1
    Can you explain why it's unclear? I thought I was pretty explicit... 1) I want database-level insurance of uniqueness for the combination of multiple fields 2) mongodb doesn't support this 3) does my solution of appending a generated objectId to a string make sense? – jtmarmon Jan 28 '15 at 01:35
  • You kind of locked this in to "show me how to create and index to do that" and "I think this is the right way but it does not work". I'm trying to give you an opportunity to explain your actual problem you are trying to solve by creating an index. because people who know better may know another way. Better than hacky possibly? – Neil Lunn Jan 28 '15 at 01:38
  • @NeilLunn see edits :) – jtmarmon Jan 28 '15 at 01:49
  • May I suggest the problem here is that you have "keys" in a sub-document rather than say using an array with documents like `[{ "type": "A", "data": "Something },{ "type": "B", "data": "Something else" }]`. But this is what I largely mean by "explaining the actual problem". If you rather told us what type of data you are storing, what the constraints are and how you are using it, then it's clear to see other ways to solve the issue. – Neil Lunn Jan 28 '15 at 02:04
  • Ah that's an interesting solution. I might do that.. And was my elaboration not clear, or were you just elaborating on your request for elaboration? – jtmarmon Jan 28 '15 at 02:10

3 Answers3

15

https://docs.mongodb.org/manual/core/index-partial/

As of mongoDB 3.2 you can create partial index to support this as well.

db.users.createIndex(
   { name: 1, email: 1 },
   { unique: true, partialFilterExpression: { email: { $exists: true } } }
)
KCD
  • 9,873
  • 5
  • 66
  • 75
dCoder
  • 374
  • 3
  • 14
  • 2
    Actually since we would need something like: partialFilterExpression: { email: { $exists: true }, email : {$ne: null} } and that is not supported what you proposed does not work – ricardoespsanto Jun 20 '16 at 22:25
  • You can use $and operator at the top level to support this. Can you provide some error message as well as mongo version? – dCoder Jun 21 '16 at 03:37
  • mongo v 3.2.0 and error message was unsupported expression in partial index: $not wristband == "" with a given partialFilterExpression of: partialFilterExpression: {wristband: {$exists: true}, wristband: {$ne: ''}} – ricardoespsanto Jun 21 '16 at 08:42
  • 2
    Your partial expression is not valid. Use $and: [ {email: { $exists: true} }, email: {$ne: null}}] – dCoder Jun 21 '16 at 10:56
  • same error message: unsupported expression in partial index: $not wristband == "" I think that partial indexes only support equality not unequality operators at least that's what I read from: https://docs.mongodb.com/manual/core/index-partial/ – ricardoespsanto Jun 21 '16 at 13:05
  • 1
    http://stackoverflow.com/a/34292680/3343029. You are right. You may use $type then – dCoder Jun 21 '16 at 13:11
  • 2
    That fixed it! thanks. TL;DR: $type: "string" implies it is a string and exists so it takes care of the nulls, undefineds and non existing properties – ricardoespsanto Jun 21 '16 at 13:52
3

A sparse index avoids indexing a field that doesn't exist. A unique index avoid documents being inserted that have the same field values. Unfortunately as of MongoDB 2.6.7, the unique constraint is always enforced even when creating a compound index (indexing two or more fields) with the sparse and unique properties.

Example:

db = db.connect("test");
db.a.drop();
db.a.insert([
    {},
    {a : 1},
    {b : 1},
    {a : 1, b : 1}
]);
db.a.ensureIndex({a:1,b:1},  { sparse: true, unique: true } );
db.a.insert({a : 1}); // throws Error but wanted insert to be valid.

However, it works as expected for a single index field with sparse and unique properties. I feel like this is a bug that will get fixed in future releases.

Anyhow, here are two solutions to get around this problem.

1) Add a non-null hash field to each document that is only computed when all the required fields for checking the uniqueness are supplied. Then create a sparse unique index on the hash field.

function createHashForUniqueCheck(obj){
    if( obj.firstName && obj.id){
        return MD5( String( obj.firstName) + String(obj.id) );
    }
    return null;
}

2) On the application side, check for uniqueness before insertion into Mongodb. :-)

Larry Battle
  • 9,008
  • 4
  • 41
  • 55
0

A hash index ended up being sufficient for this

jtmarmon
  • 5,727
  • 7
  • 28
  • 45