
We are working on a shopping cart website with more than 1 lakh (100,000) products, built on top of the popular e-commerce application NopCommerce, version 2.3. (Just to introduce you to NopCommerce: it's one of the best known and most popular open source e-commerce applications, built on top of ASP.NET 4 and MVC 3.) The site is published with two languages and a single currency.

With around 80 categories and 30-40k products it worked fairly well. I mean, not great, but not bad either. As soon as more products were added, performance issues started, with symptoms like long response times (more than 40-50 seconds to load a page) and high CPU usage (90-100%) with just 10-20 users.

The server is a quad core Xeon with 16 GB of RAM, running Windows Server 2008 R2, and it works fine with one more e-commerce website with 50k products on custom-developed code, which takes hardly 4-8% CPU.

We used caching to keep the home page featured products and the category menu in memory and avoid db calls. That improved the home page only.
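
Just to give an idea, the home page caching is roughly along these lines. This is a simplified sketch; the cache key, the 60-minute lifetime and the FeaturedProduct type are illustrative, not our actual nopCommerce code:

using System;
using System.Collections.Generic;
using System.Runtime.Caching;

public class FeaturedProduct { public int Id; public string Name; }

public static class HomePageCache {
  private static readonly MemoryCache Cache = MemoryCache.Default;

  public static IList<FeaturedProduct> GetFeaturedProducts (Func<IList<FeaturedProduct>> loadFromDb) {
    // served straight from memory when present, so the db is not touched
    var cached = Cache.Get ("home.featured-products") as IList<FeaturedProduct>;
    if (cached != null)
      return cached;

    // db call happens only on a cache miss; result is kept for 60 minutes
    var products = loadFromDb ();
    Cache.Set ("home.featured-products", products, DateTimeOffset.UtcNow.AddMinutes (60));
    return products;
  }
}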

Later on, to fix the issue, we profiled and found that it was the catalog listing that was causing most of the delay fetching data from the db, which is highly normalized. SQL Server was taking 80-90% CPU and w3wp 30-40%, which kept the CPU pegged at 100% constantly with just a few visitors on the website. We consulted a few experts, and they suggested we store de-normalized data on disk in binary format to bypass the expensive database calls. We did some research and used protobuf (Protocol Buffers) to store de-normalized, serialized object data on disk, keeping only those fields which are necessary for the catalog / product listing page. To support the specification (filtering) functionality we ended up creating three binary files: one for the product objects and another for the category specification objects (these two files are per category), plus one more file for the product-to-specification mapping, which is almost 5 MB. When a request comes in, we read from the serialized binary files and populate the objects. The mapping file is only read when someone filters products by specification.
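
To show roughly what we mean, here is a simplified sketch of the per-category files, assuming the protobuf-net library; the ProductListingItem DTO, its fields and the file handling are illustrative, not our actual classes:

using System.Collections.Generic;
using System.IO;
using ProtoBuf;

[ProtoContract]
public class ProductListingItem {
  [ProtoMember (1)] public int     Id         { get; set; }
  [ProtoMember (2)] public string  Name       { get; set; }
  [ProtoMember (3)] public decimal Price      { get; set; }
  [ProtoMember (4)] public string  PictureUrl { get; set; }
}

public static class CatalogFileStore {
  // writes the de-normalized rows for one category as a compact binary payload
  public static void Save (string path, List<ProductListingItem> items) {
    using (var fs = File.Create (path))
      Serializer.Serialize (fs, items);
  }

  // reads the binary payload back into objects without touching the db
  public static List<ProductListingItem> Load (string path) {
    using (var fs = File.OpenRead (path))
      return Serializer.Deserialize<List<ProductListingItem>> (fs);
  }
}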

So now, whenever a request for a category product listing page comes in, we check whether a binary file has been created for that category. If it hasn't, we generate the data using a stored procedure and save the objects to binary for later use; if the file exists, we read directly from it. With this we avoided 90% of the db calls while loading this page. With just a few users (approx. 30-40) it works like a charm, and we were able to reduce the response time to 700-800 ms per page load. This is a great improvement in loading time, but CPU is still on the higher side. The difference is that now w3wp uses 60-70% CPU with 20-30 visitors and SQL hardly uses 5-8%.
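
The read-or-build flow for a category page boils down to something like this. It is simplified, reuses the illustrative CatalogFileStore sketch above, and the cache folder and method names are made up rather than our real code:

using System;
using System.Collections.Generic;
using System.IO;

public static class CatalogPageLoader {
  public static List<ProductListingItem> GetCatalogPage (int categoryId, Func<int, List<ProductListingItem>> loadFromStoredProcedure) {
    var path = Path.Combine (@"D:\CatalogCache", "category-" + categoryId + ".bin");

    // most requests take this branch and never touch the db
    if (File.Exists (path))
      return CatalogFileStore.Load (path);

    // first request for a category: run the stored procedure once and persist the result
    var items = loadFromStoredProcedure (categoryId);
    CatalogFileStore.Save (path, items);
    return items;
  }
}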

But when we hit more users, approx. 100-120, the server starts to hang and w3wp stays pegged at 100% CPU constantly. Requests are no longer served in under a second; instead they take more than 20-25 seconds to load, and then most requests are never served at all. We notice this when multiple requests are coming to the site.

We're not experts at serialization and binary formatters, but we think the high CPU usage is caused either by the file read operations or by the de-serialization being performed on each catalog page load.
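
A rough way to check this, which we have not done yet, would be to time the two steps separately, something like the sketch below (it reuses the illustrative ProductListingItem type from above):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using ProtoBuf;

public static class CatalogTimings {
  public static void Measure (string path) {
    // cost of the raw disk read alone
    var readTimer = Stopwatch.StartNew ();
    byte[] bytes = File.ReadAllBytes (path);
    readTimer.Stop ();

    // cost of the deserialization alone, from an in-memory copy of the same bytes
    var deserializeTimer = Stopwatch.StartNew ();
    List<ProductListingItem> items;
    using (var ms = new MemoryStream (bytes))
      items = Serializer.Deserialize<List<ProductListingItem>> (ms);
    deserializeTimer.Stop ();

    Console.WriteLine ("Read: {0} ms, Deserialize: {1} ms, Items: {2}",
      readTimer.ElapsedMilliseconds, deserializeTimer.ElapsedMilliseconds, items.Count);
  }
}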

We're now looking for a probable solution to address the high CPU usage. What could be the problem, and where should we look to fix it? What do you think: is it the file read operation or the de-serialization causing this? Should we store the de-normalized objects in the db instead? What alternatives do we have to address this issue?

Awaiting your expert opinion on the same.

Thanks in advance.

E.J. Brennan
Krunal
  • Profile your code. There's no need to 'think' what is causing high CPU usage. Find out. – AakashM Feb 10 '12 at 15:49
  • You should profile your code (http://stackoverflow.com/questions/3927/what-are-some-good-net-profilers) and figure out where the problem is. Also, if you look at the NopCommerce forums, you will see that there are many complaining about performance issues... which are not resolved... – Aleksandar Vucetic Feb 10 '12 at 15:56
  • It's a late response, but we have managed to fix this issue. We fixed it by optimizing nopCommerce with several optimization techniques and integrating Apache Solr with it. We built a plugin for the same that can be used to fix the issue. More details can be found here: http://www.nopaccelerate.com/?utm_source=stackoverflow&utm_medium=forum&utm_campaign=reference – Krunal Mar 02 '13 at 13:36

3 Answers


Since you are having CPU issues, I suspect deserialization is the main culprit. In that case, you can make serialization and deserialization nearly 100 times faster by implementing the ISerializable interface yourself. I have used this technique before on large object graphs and the improvement was phenomenal.

Say you have a class like this:

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

[Serializable]
public class TestObject : ISerializable {
  public long     id1;
  public long     id2;
  public long     id3;
  public string   s1;
  public string   s2;
  public string   s3;
  public string   s4;
  public DateTime dt1;
  public DateTime dt2;
  public bool     b1;
  public bool     b2;
  public bool     b3;
  public byte     e1;
  public IDictionary<string,object> d1;
}

Implement ISerializable so that you can do custom serialization and deserialization.

// SerializationWriter / SerializationReader below are custom helper classes
// (not part of the .NET Framework) that pack the fields into a compact byte array.
public void GetObjectData (SerializationInfo info, StreamingContext ctxt) {
  SerializationWriter sw = SerializationWriter.GetWriter ();
  sw.Write (id1);
  sw.Write (id2);
  sw.Write (id3);
  sw.Write (s1);
  sw.Write (s2);
  sw.Write (s3);
  sw.Write (s4);
  sw.Write (dt1);
  sw.Write (dt2);
  sw.Write (b1);
  sw.Write (b2);
  sw.Write (b3);
  sw.Write (e1);
  sw.Write<string,object> (d1);
  sw.AddToInfo (info);
}

public TestObject (SerializationInfo info, StreamingContext ctxt) {
  SerializationReader sr = SerializationReader.GetReader (info);
  id1 = sr.ReadInt64 ();
  id2 = sr.ReadInt64 ();
  id3 = sr.ReadInt64 ();
  s1  = sr.ReadString ();
  s2  = sr.ReadString ();
  s3  = sr.ReadString ();
  s4  = sr.ReadString ();
  dt1 = sr.ReadDateTime ();
  dt2 = sr.ReadDateTime ();
  b1  = sr.ReadBoolean ();
  b2  = sr.ReadBoolean ();
  b3  = sr.ReadBoolean ();
  e1  = sr.ReadByte ();
  d1  = sr.ReadDictionary<string,object> ();
}

This will not only make the payload 10-100 times smaller, but also improve the performance by 10x to sometimes 100x.

Another thing: see if you have any large loops that iterate over thousands of objects, or suboptimal LINQ queries inside them. Those are sometimes CPU hogs.
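
For example, this is the kind of loop I mean; the types here are made up, but running a LINQ query inside a loop over thousands of products is a common CPU hog, and building a lookup once is usually far cheaper:

using System.Collections.Generic;
using System.Linq;

public class Product     { public int Id; }
public class SpecMapping { public int ProductId; public int SpecOptionId; }

public static class SpecFilter {
  // Slow: re-scans the whole mapping list for every product (O(n*m)).
  public static List<Product> FilterSlow (List<Product> products, List<SpecMapping> specs, int specOptionId) {
    var result = new List<Product> ();
    foreach (var p in products)
      if (specs.Any (s => s.ProductId == p.Id && s.SpecOptionId == specOptionId))
        result.Add (p);
    return result;
  }

  // Faster: build the set of matching product ids once, then probe it per product.
  public static List<Product> FilterFast (List<Product> products, List<SpecMapping> specs, int specOptionId) {
    var matching = new HashSet<int> (specs.Where (s => s.SpecOptionId == specOptionId).Select (s => s.ProductId));
    return products.Where (p => matching.Contains (p.Id)).ToList ();
  }
}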

And finally, I would recommend reading about the top ten caching mistakes I have seen developers make, especially when using a distributed cache:

http://www.codeproject.com/Articles/115107/Ten-Caching-Mistakes-that-Break-your-App

oazabir
  • If you implement this, you need to generate serialized binary for all the objects again. Otherwise existing files won't deserialize. – oazabir Feb 12 '12 at 12:32
  • We have stored the de-serialized objects in an object cache, and we retrieve them from the db whenever they are evicted from memory. We also changed to storing the binary objects in the db instead of the file system, although we still need to test it. – Krunal Feb 15 '12 at 10:08
  • Did you try out the ISerializable implementation? Seeing improvement? – oazabir Feb 16 '12 at 15:08

The answer to database "problems" is to fix your poorly designed database. Database "problems" are rarely a fundamental limitation of the database engine; they are a problem with your design.

Fixes come in many forms, but fixing the design is always the answer. Database "problems" are the same underlying issue showing up in many different flavors.

The moral of the story: never take database advice from someone who doesn't know how to fix database problems and suggests you apply duct tape instead. The answer to database problems is to move the data and the calculations as close to the database as possible.

The further you move data away from the database, the more you exacerbate the problem and the lower the scalability of your solution. Don't listen to non-database developers trying to "fix" your database.

Jeff Fischer

Question 1: What all is running on this box? If I read correctly, you have one site with 50,000 products (no mention of users or hits) and another with a lot more. As you stack sites, you will see some degradation, even if your code is very tight.

Question 2: Do you have all layers on a single box? If so, you have competing concerns and might block CPU-bound threads because of I/O operations.

Question 3: Have you done code reviews to ensure proper development concepts and methodologies (SOLID, etc.)? If not, you could be holding resources longer than needed and causing issues.

Question 4: Have you profiled? I mean both SQL Server and the web application. If not, you have no clue where the issue might be, and I doubt anyone can help you in this forum.

Even with millions of "products", a properly designed database and site should be fairly fast. But many different factors come together to determine performance; all of the pieces on all layers can affect the application.

As an example, I consulted once for a company that had built a high performance eCommerce application that was dying. All of the parts seemed fine in code reviews. In tests, both the pages and the database worked fine. But they had never stressed the system. If they had, they would have caught this little bit of insanity.

 //let's not focus on the magic string, okay? Think about static
 private static SqlConnection connection = new SqlConnection("{conn string here}");

The entire site was funneled through a single SqlConnection, because one developer did not understand the underlying connection pool and thought object initialization would be more of a hit than pushing everything through a static "always on" connection.
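
For contrast, the conventional pattern is to create the connection per operation and let ADO.NET connection pooling handle the reuse. A minimal sketch (the query and connection string are placeholders):

using System.Data.SqlClient;

public static class ProductQueries {
  public static int GetProductCount () {
    // a new SqlConnection here is cheap: Open() draws one from the pool,
    // and Dispose() at the end of the using block returns it to the pool
    using (var connection = new SqlConnection ("{conn string here}"))
    using (var command = new SqlCommand ("SELECT COUNT(*) FROM Product", connection)) {
      connection.Open ();
      return (int) command.ExecuteScalar ();
    }
  }
}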

Until you profile the application, you don't have a question here that can be answered. Once you find an issue and ask, someone can step up and say "here is how you solve that". You can add more information to this question, but until there is a problem identified, rather than a generic symptom, you are going nowhere.

Gregory A Beamer
  • Hi Gregory, thank you for your insights. Here are the answers: 1. On the same server we have one more shopping cart site with 50k products and approx. 1000+ visitors a day. – Krunal Feb 11 '12 at 04:45
  • 2. Yes, they're all on one server, including both sites' admin panels. 3. Yes, we have reviewed it and modified it where there was scope. 4. We have profiled both SQL and the application. Earlier SQL was causing the issue, but now it's the application. The controller where we read from the binary file and return it to the model is where the application suffers. My question is: 1. Is it the right approach to use binary files to store de-normalized data and read from them when requests come? What do you suggest and why? – Krunal Feb 11 '12 at 04:59
  • The answer to your questions really depends on a lot of factors. In general, where is the biggest pain point? For denormalized data, a database can work fine, as long as it has structure (this is a blurb, this is a long description, etc.). With unstructured data, a CMS is an option; for example, SharePoint with FAST can work. It really depends on the nature of the data. Now, what work is being done in the controller? And can some of that work be done ahead of time and then stored somewhere? If you can find the long path and shorten it, you get great benefits. – Gregory A Beamer Feb 23 '12 at 15:21