I have some code that inserts a record, and I want to first delete any existing records with matching tuples. This code is called rapidly from a number of executables:

public void AddMemberEligibility(long memberId, string internalContractKey, int planSponsorId, int vendorId, string vendorContractKey) {
    using (IDocumentSession session = Global.DocumentStore.OpenSession()) {
        var existingMember = session.Query<MemberEligibility>().FirstOrDefault(x => x.VendorId == vendorId 
                               && x.MemberId == memberId && x.PlanSponsorId == planSponsorId);
        if (existingMember != null) {
            session.Delete<MemberEligibility>(existingMember);
            session.SaveChanges();
        }

        MemberEligibility elig = new MemberEligibility() {
            InternalContractKey = internalContractKey,
            MemberId = memberId,
            PlanSponsorId = planSponsorId,
            VendorId = vendorId
        };

        session.Store(elig);
        session.SaveChanges();
    }
}

This doesn't seem to be enough to protect against duplicates. Any suggestions?

Daniel
  • When I have similar situations that use a database, I add a column containing an MD5 hash of the designated properties, so I only have to check one value. I also keep the existing row instead of deleting and replacing it, so that I don't fragment my indexes. – Mad Myche Apr 26 '17 at 15:18
  • @MadMyche "keep the existing row instead of deleting" -- this is good advice, thank you. – Daniel Apr 28 '17 at 12:58

2 Answers


A hash collection would fix this problem nicely enough.

It calls hashCode() in its put and contains functions to keep the collection organized into buckets, then equals() to compare entries whose hash codes overlap. This combination makes put and contains typically O(1); though if, say, all the hash codes are the same, contains degrades to O(log n) in Java 8+ (which turns large buckets into trees) or O(n) in simpler implementations.

Most likely a concurrent hash collection would be preferable. If you are in Java (which it looks like), you can use a concurrent set such as the one returned by ConcurrentHashMap.newKeySet().

Code Eyez
  • Could you point me in the right direction in the RavenDB documentation or provide me with a little example? Thanks! (Also, language is C#.) – Daniel Apr 26 '17 at 14:08
  • [Here's the link to the C# version of a concurrent HashSet, which should work in your use case](https://stackoverflow.com/questions/18922985/concurrent-hashsett-in-net-framework) I'm not 100% sure about the time complexities in C#, but there is no reason they should differ much, if at all. – Code Eyez Apr 26 '17 at 15:47
  • I think maybe you aren't fully understanding the question. I understand the concept of concurrent hash sets in respect to, say, a multithreaded app that needs to store things in memory. But how can I ensure uniqueness within my choice of persistence engine (RavenDB) across numerous atomic consumers? RavenDB is an "eventually consistent" object database. I think a good answer would be provided by someone who knows the nuances of the RavenDB engine. – Daniel Apr 26 '17 at 16:28

What I ended up doing, after taking Oren Eini's advice on the Raven Google group, was to use the Unique Constraints Bundle.

My DTO now looks something like this:

using Raven.Client.UniqueConstraints;

public class MemberEligibility {
    [UniqueConstraint]
    public string EligibilityKey { get { return $"{MemberId}_{VendorId}_{PlanSponsorId}_{VendorContractKey}"; } }
    public long MemberId { get; set; }
    public int VendorId { get; set; }
    public int PlanSponsorId { get; set; }
    public string VendorContractKey { get; set; }
    // other fields
}

and my add/update looks like this:

public void AddMemberEligibility(long memberId, int planSponsorId, int vendorId, string vendorContractKey, ...) {
    using (IDocumentSession session = Global.DocumentStore.OpenSession()) {
        MemberEligibility elig = new MemberEligibility() {
            MemberId = memberId,
            PlanSponsorId = planSponsorId,
            VendorId = vendorId,
            VendorContractKey = vendorContractKey,
            //other stuff
        };

        var existing = session.LoadByUniqueConstraint<MemberEligibility>(x => x.EligibilityKey, elig.EligibilityKey);
        if (existing != null) {
            // set some fields
        } else {
            session.Store(elig);
        }
        session.SaveChanges();
    }
}

At this point I'm not 100% certain this is the solution I'll push to production, but it works. Keep in mind session.SaveChanges() will throw an exception if there's already a document with the same [UniqueConstraint] property in the store. Also I started with this property typed as a Tuple<...>, but Raven's serializer couldn't figure out how to work with it, so I settled on a string for now.
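
For anyone who wants to see the "look up by composite key, then update or store" flow without a RavenDB instance, here is a minimal sketch. It is only an illustration: the `Dictionary` is a hypothetical in-memory stand-in for the document store, and `BuildKey` and `Upsert` are names made up for this example, not part of the Raven client. It gives no protection across processes; that is the part the Unique Constraints Bundle handles on the server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the document store:
// composite key -> InternalContractKey of the stored record.
var store = new Dictionary<string, string>();

// Builds the same composite string as the EligibilityKey property above.
string BuildKey(long memberId, int vendorId, int planSponsorId, string vendorContractKey) =>
    $"{memberId}_{vendorId}_{planSponsorId}_{vendorContractKey}";

// Returns true when a new record was stored, false when an existing
// record with the same key was updated in place (no delete-then-insert).
bool Upsert(long memberId, int vendorId, int planSponsorId,
            string vendorContractKey, string internalContractKey)
{
    string key = BuildKey(memberId, vendorId, planSponsorId, vendorContractKey);
    bool isNew = !store.ContainsKey(key);
    store[key] = internalContractKey;
    return isNew;
}

bool first = Upsert(42, 7, 3, "VC-1", "IC-1");   // no record yet, so it is stored
bool second = Upsert(42, 7, 3, "VC-1", "IC-2");  // same tuple, so it is updated
Console.WriteLine($"{first} {second} {store.Count}"); // prints "True False 1"
```

The second call reuses the existing entry, which mirrors the `existing != null` branch in the method above.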

Daniel