0

I have three database tables which represent comparison scores for common name strings. They're separated into individual tables to speed scanning time. All of them share the same basic structure of four columns: an ID, two strings and score represtning how different the strings are.

The DB is accessed via Entity Framework. In the implementation code, this leads me to having three near-identitical functions, one for each object type:

    private bool CheckSurnameJw(string string1, string string2, double threshold)
    {
        JwDistanceSurname jw = _rep.GetJwDistanceSurname(string1, string2);

        if (jw == null)
        {
            double dist = JaroWinklerProximity(string1, string2);
            JwDistanceSurname newJw = new JwDistanceSurname { Surname1 = string1, Surname2 = string2, JwScore = dist };
            _rep.Update(newJw);
            _rep.SaveChanges();

            return dist >= surnameProximityThreshold;
        }
        else
        {
            return jw.JwScore >= threshold;
        }
    }

Looking at this, while I could nip and tuck slightly, I can't see any clear areas where the function could be sensibly improved by farming code out to another function. But it annoys the hell out of me to have to re-implement the same logic block three times, to deal with three different identity types.

I wrapped all three classes in an interface specifiying the four columns to see if that would help me tidy things up. But it doesn't: I can't use a generic "get" function because each is querying a different table and, likewise, when I create a new instance of the class I need to give it the appropriate type.

Is there a way I can improve this using reflection/generics?

Bob Tway
  • 9,301
  • 17
  • 80
  • 162
  • 1
    Have you considered some sort of interitance? http://www.entityframeworktutorial.net/code-first/inheritance-strategy-in-code-first.aspx – Rand Random Nov 10 '17 at 14:51
  • Can you post your other near identical code? Kind of hard to refactor just seeing 1 function without seeing other to compare their common codes. – penleychan Nov 10 '17 at 14:54
  • 1
    Have you try to make a repo who target a subset of the dbcontext by calling _dbContext.Set() ? IEntity will be a generic type implementing the interface who specify the common fields of your 3 tables. https://msdn.microsoft.com/en-us/library/gg696521(v=vs.113).aspx – Jon Lopez Garcia Nov 10 '17 at 14:54
  • @12seconds they're literally identical except the type names change - so JwDistanceSurname becomes JwDistanceForename and JwDistanceTitle respectively. – Bob Tway Nov 10 '17 at 14:56
  • @RandRandom Thanks for the suggestion, but we're using DbFirst and I'm not entirely sure that approach will work? – Bob Tway Nov 10 '17 at 14:57
  • 1
    https://stackoverflow.com/questions/22263189/entity-framework-db-first-implement-inheritance – Rand Random Nov 10 '17 at 14:59
  • 1
    Would this work? `private bool CheckSurnameJw(string string1, string string2, double threshold) where T: new()`? Replace `JwDistanceSurname` with `T` and for `newJw`, `var newJw = new T { Surname1 = string1, Surname2 = string2, JwScore = dist };` Usage would be like `CheckSurnameJw("test", "test", 2);` – penleychan Nov 10 '17 at 14:59

1 Answers1

1

If your four columns all have the same columnNames, I'd definitely go for inheritance using Table Per Type or some kind of Composition

However, if the four columns do not have a common meaning, I'd use an interface.

Table per Type approach

Use this if the four columns represent typically the same thing and you would think that your tables are special kinds of this common thing:

abstract class Common
{
    public string String1 {get; set;}
    public string string2 {get; set;}
    public double Threshold {get; set;} 
}

class Table1 : Common
{
     public int Id {get; set;}
     ...
}

class Table2 : Common
{
     public int Id {get; set;}
     ...
}

Your DbContext will be like:

class MyDbContext : DbContext
{
    public DbSet<Table1> Table1s {get; set;}
    public DbSet<Table2> Table2s {get; set;}

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Table1>().ToTable("Table1s");
        modelBuilder.Entity<Table2>().ToTable("Table2s");
    }
}

This is enough for entity framework to design two tables, where each table has the four common columns.

A DbSet represents the table in the database. The database doesn't know how CheckSurName. Hence I choose not to let class Common know how to CheckSurName. I create an extension function of Common instead.

See Extension Methods Demystified

static class CommonExtensions
{
    public static bool CheckSurName(this Common common)
    {
        JwDistanceSurname jw = _rep.GetJwDistanceSurname(common.String1, common.String2);
        ...
    }
}

Usage:

IEnumerable<Table1> myTableCollection = myDbContext.Table1s
    .Where(table => table...)
foreach (Table1 table1 in myTableCollection)
{
     bool surNameOk = table1.CheckSurName();
     Process(surNameOk);
}

Composition Approach

As most developers, I favour composition over inheritance. The approach will be similar, but instead of Inheritance composition is used

class Common ...

class Table1
{
    public int Id {get; set;}
    public Common Common {get; set;}
}

etc. This will also lead to one Table per type, every table containing all Common properties. The extension function will be similar. The only difference is that you don't perform the check on surnames on your retrieve tables, but on the Common of your retrieved tables:

IEnumerable<Table1> retrievedTables = ...
foreach (Table1 table in retrievedTables)
{
    bool SurnameOk = table1.Common.CheckSurName();
    ...
}

If Common represents something like a person, and your tables represents Items that have a Person, like a SchoolClass with a Teacher, and a School with a HeadMaster, I'd definitely go for this approach. After all a School is not a Person.

Interface approach

You described your columns as if only their types where common, not the names, nor their meaning. You just have two strings and one double, (and an Id, which is lost in your CheckSurName), and the only common thing is that they are two strings and a double. In that case I'd go for an Interface.

Objects that have properties that are needed to CheckSurName will implement ISurNameCheckable:

interface ISurnameCheckable
{
    public string String1 {get;}
    public string String2 {get;}
    public double Threshold {get;}
}

class Table1 : ISurnameCheckable
{
    public int Id {get; set;}
    public string Street {get; set;}
    public string City {get; set}

    // implementation of ISurnameCheckable
    public string String1 {get{return this.Street;}}
    public string String2 {get{return this.City;}}
    ...
}

The extension function is almost the same:

public static bool CheckSurName(this ISurnameCheckable surnameCheckable)
{
    JwDistanceSurname jw = _rep.GetJwDistanceSurname(
       surnameCheckable.String1, surnameCheckable.String2);
    ...
}
Harald Coppoolse
  • 28,834
  • 7
  • 67
  • 116