I am currently working on a project in which we read, process and store products. We are using Entity Framework 6 to read from and write to a MySQL database.
After building the prototype and gathering some statistics, we found that storing new products in the database takes a relatively long time. I have been asked to improve this, but I can't figure out what the best option is.
Currently, every read and write happens in a using block. Since this was my first time using Entity Framework 6, I did my research, and the vast majority of StackOverflow answers said that you should always use a using block. So I did.
Here is a snippet of how it looks now:
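    public int GetSomeId(string SomeStringToMatchWith)
    {
        // A fresh, short-lived context for every read.
        using (var db = new MyDbContext())
        {
            // Note: FirstOrDefault() returns null when nothing matches,
            // so this line throws a NullReferenceException in that case.
            return db.SomeTable.Where(t => t.SomeString == SomeStringToMatchWith).FirstOrDefault().Id;
        }
    }

    public void SaveSomeData(int SomeId)
    {
        // A fresh context and one SaveChanges() call per inserted row.
        using (var db = new MyDbContext())
        {
            db.SomeTable.Add(new SomeTable { Id = SomeId });
            db.SaveChanges();
        }
    }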
I have been told that MySQL works faster when it is given bulk inserts rather than single-row inserts. Also, after reading this question, I figured it might be better for the program not to write the data to the database immediately (and therefore not to use using), but to create a simple Repository that holds the data and writes it to the database after a certain amount of time. Since I want to access the Repository from multiple threads, I figured a Singleton design would suffice.
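To make the buffering idea concrete, here is a minimal sketch of what I have in mind, independent of Entity Framework. BufferedWriter and its flush delegate are hypothetical names of my own, not part of any library; in the real project the delegate would presumably open a short-lived MyDbContext, add the whole batch and call SaveChanges() once. Whether that actually reduces the time spent in MySQL is exactly what I am unsure about.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of EF6): collects items in memory and hands
// them to a flush delegate one batch at a time.
public sealed class BufferedWriter<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _pending = new List<T>();
    private readonly int _batchSize;
    private readonly Action<IReadOnlyList<T>> _flush;

    public BufferedWriter(int batchSize, Action<IReadOnlyList<T>> flush)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        if (flush == null) throw new ArgumentNullException("flush");
        _batchSize = batchSize;
        _flush = flush;
    }

    // Safe to call from several threads; flushes once the batch is full.
    public void Add(T item)
    {
        lock (_gate)
        {
            _pending.Add(item);
            if (_pending.Count >= _batchSize) FlushLocked();
        }
    }

    // Writes whatever is still buffered, e.g. on a timer or at shutdown.
    public void Commit()
    {
        lock (_gate)
        {
            if (_pending.Count > 0) FlushLocked();
        }
    }

    // Must be called while holding _gate. Flushing inside the lock keeps
    // the sketch simple: other threads wait while the batch is written.
    private void FlushLocked()
    {
        T[] batch = _pending.ToArray(); // copy, so the delegate never sees later changes
        _flush(batch);
        _pending.Clear();
    }
}
```

With a batch size of 3, the first two Add calls only buffer, the third triggers one flush of three items, and a later Commit() writes whatever remains.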
There is one nasty requirement, though: the products have to be matched by certain values, and it could happen that product #2 has a match with product #1. In other words, I always need to be able to access the most recent data, including data that has not been written to the database yet.
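One way I imagine satisfying this requirement is to check the not-yet-written data first and only then fall back to the database. The sketch below is only an illustration: PendingProductIndex, AddPending, FindId and ClearAfterCommit are hypothetical names, and the queryDatabase delegate stands in for a query such as GetSomeId. With a single long-lived context, the Local property of the DbSet might serve a similar purpose.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers ids of products that are buffered but not
// yet written, so lookups can see them before the next commit.
public sealed class PendingProductIndex
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, int> _pendingIds = new Dictionary<string, int>();

    // Call when a product is buffered for a later write.
    public void AddPending(string matchValue, int id)
    {
        lock (_gate) { _pendingIds[matchValue] = id; }
    }

    // Looks in the pending data first, then asks the database.
    // Returns null when neither knows the value.
    public int? FindId(string matchValue, Func<string, int?> queryDatabase)
    {
        lock (_gate)
        {
            int id;
            if (_pendingIds.TryGetValue(matchValue, out id)) return id;
        }
        return queryDatabase(matchValue);
    }

    // Call after the buffered products have been saved successfully.
    public void ClearAfterCommit()
    {
        lock (_gate) { _pendingIds.Clear(); }
    }
}
```

The pending entries must only be cleared after the write has succeeded; otherwise there is a window in which a product is visible neither in memory nor in the database.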
Something like this came to mind:
    public class Repository
    {
        private static readonly object Lock = new object();
        private MyDbContext context { get; set; }

        private Repository()
        {
            context = new MyDbContext();
        }

        private static Repository _Instance;
        public static Repository Instance
        {
            get
            {
                // Double-checked locking: only take the lock on first access.
                if (_Instance == null)
                {
                    lock (Lock)
                    {
                        if (_Instance == null)
                        {
                            _Instance = new Repository();
                        }
                    }
                }
                return _Instance;
            }
        }

        //This method is called once in a while
        public void Commit()
        {
            context.SaveChanges();
            context.Dispose(); //Get rid of entities
            context = new MyDbContext(); //Create a fresh DbContext
        }

        //Other Read/Write methods
    }
I have a few questions:
- Is it hard to make this whole class thread safe? Do I need to add a lock for every table to do so, or is there a smarter way?
- Would it actually increase performance? The question linked above makes me think it would.
- Since this is not the recommended way and most of you will (probably) disagree; would there be a better way to implement this?
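Regarding the first question, the simplest alternative to per-table locks that I can think of is one private lock around every access to the shared state, with Lazy<T> replacing the hand-written double-checked locking. The sketch below assumes that approach; LockedRepository is a hypothetical name, and a plain list stands in for the DbContext (which is documented as not thread-safe) so that the example runs without a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: a singleton whose shared state is guarded by one lock.
public sealed class LockedRepository
{
    // Lazy<T> is thread-safe by default, so no manual double-checked locking.
    private static readonly Lazy<LockedRepository> LazyInstance =
        new Lazy<LockedRepository>(() => new LockedRepository());

    public static LockedRepository Instance
    {
        get { return LazyInstance.Value; }
    }

    private readonly object _gate = new object();

    // Stand-in for the DbContext and its tracked-but-unsaved entities.
    private readonly List<int> _unsaved = new List<int>();
    private int _savedCount;

    private LockedRepository() { }

    public void Add(int id)
    {
        lock (_gate) { _unsaved.Add(id); }
    }

    public bool IsPending(int id)
    {
        lock (_gate) { return _unsaved.Contains(id); }
    }

    // Stand-in for SaveChanges() followed by swapping in a fresh context.
    // Returns the total number of items saved so far.
    public int Commit()
    {
        lock (_gate)
        {
            _savedCount += _unsaved.Count;
            _unsaved.Clear();
            return _savedCount;
        }
    }
}
```

The obvious trade-off is that a single lock serializes all reads and writes, so every thread waits while Commit() runs. I do not know whether that cancels out whatever is gained by batching.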
Please note that the current version works as-is. The program needs to process about 2,500,000 products, and the only bottleneck seems to be writing to the database. As a final note, I have also read Implementing the Repository and Unit of Work Patterns... which clearly tells me how to do it, but does not tell me why I should or should not use it.