
I have this simple line:

var records = db.MyDbTable.Where(x => x.SupplierId.HasValue).ToList();

And it takes one hour to execute. I have 272 rows in the table. The table has one column that is NVARCHAR(MAX) and inside there are huge strings. How can I optimize the statement? Afterwards in code I have a foreach which goes through the records and processes them.

The select statement in db takes 5 minutes.

When I use SQL profiler to check, this is the statement that gets translated from C#:

SELECT 
    [Extent1].[id] AS [id], 
    [Extent1].[xml] AS [xml], 
    [Extent1].[name] AS [name], 
    [Extent1].[create] AS [create], 
    [Extent1].[sale] AS [sale], 
    [Extent1].[SupplierId] AS [SupplierId], 
    [Extent1].[Success] AS [Success], 
    [Extent1].[IId] AS [IId]
    FROM [dbo].[MyDbTable] AS [Extent1]
    WHERE [Extent1].[SupplierId] IS NOT NULL
Lance U. Matthews
petko_stankoski
  • I think we're going to need a little bit of info about your database structure. Also, it would help if you told us what results were returned after the hour-long execution. Usually these problems are the result of improper joins on tables/views, but your question seems to indicate this is a standard lookup in a single table. Can you clarify? – Greg T. Apr 26 '18 at 06:45
  • How should we know? From the little info you gave I can't see much to optimize; in fact, the query already is as simple as it can be, and at least from the client side (your code) I doubt there's much you can do about it. Are you sure that statement takes so much time? Did you use a profiler for it? – MakePeaceGreatAgain Apr 26 '18 at 06:47
  • If SupplierId is a foreign key, and the referenced field and the foreign key field don't have the same datatype, then this issue may occur. – Code Name Jack Apr 26 '18 at 06:48
  • Also make sure that there are no uncommitted transactions locking this particular table from reading. Try the same query with "WITH (NOLOCK)" to make sure that this is not a transaction problem. – Binu Vijayan Apr 26 '18 at 06:49
  • What if you run the SQL version of the query directly against the database from Management Studio? – Chetan Apr 26 '18 at 06:49
  • Is a direct query faster? What are the results? What is the SQL translation of your query? Does it contain something weird? – Drag and Drop Apr 26 '18 at 06:49
  • @GregT. MyDbTable has no foreign keys to other tables. It has 6 columns: one is the primary key, one is the NVARCHAR(MAX) I mentioned, and there are 4 more (a name of varchar(255), one datetime, and two integers). – petko_stankoski Apr 26 '18 at 06:51
  • @ChetanRanpariya The select statement in Management Studio takes almost 5 minutes. – petko_stankoski Apr 26 '18 at 06:52
  • Did you try SQL Server Profiler to find out the actual SQL statement used? Are you sure you plan to use all properties of all objects in MyDbTable? – Harald Coppoolse Apr 26 '18 at 06:52
  • So, how huge are those strings? – Evk Apr 26 '18 at 06:54
  • `The table has one column that is NVARCHAR(MAX) and inside there are huge strings.` How big are we talking about here? Quantify it for us. 100 characters? 1000? 10000? – Flater Apr 26 '18 at 06:54
  • How much time does the following query take: var records = db.MyDbTable.Where(x => x.SupplierId.HasValue).Select(x => x.SupplierId).ToList(); – Denis Fedak Apr 26 '18 at 06:55
  • @Evk They are XMLs stored in the db table, so huge. I know it is a bad structure; that's what I'm trying to change. – petko_stankoski Apr 26 '18 at 07:11
  • And you actually use that xml in further processing? Still interesting to know how huge exactly they are (100MB, 10MB)? – Evk Apr 26 '18 at 07:14
  • @Evk The largest one has 7 million characters. – petko_stankoski Apr 26 '18 at 07:18
  • If you don't need that xml, you can just not include it in the select. But if you need it for further processing, there is not much you can do. I still don't see why it takes 1 hour, unless your network connection to the database is very slow. On the other hand, if each row is about 10MB and you select, say, 200 rows, that's 2GB of network traffic. It might take an hour in bad conditions. – Evk Apr 26 '18 at 07:22
  • No DDL? No execution plan? How is anyone supposed to help you if you give no useful information in your question? – MJH Apr 26 '18 at 08:17
  • If the NVARCHAR(MAX) column has **huge values**, then you should put it in a separate table. – DxTx Apr 26 '18 at 08:24
  • Why don't you answer the key question asked a number of times: DO YOU NEED THE XML? Or is it enough to get a sub selection of fields from the table? – Gert Arnold Apr 26 '18 at 09:57
  • @GertArnold I need the xml. I process it later on. – petko_stankoski Apr 26 '18 at 21:43
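Several of the comments above boil down to the same experiment: project away the NVARCHAR(MAX) column and see whether the time drops. A minimal sketch of that projection (entity and column names taken from the question; the in-memory list is only a stand-in for the EF DbSet to show the shape of the query):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring MyDbTable from the question.
class MyDbRow
{
    public int id;
    public string xml;        // the huge NVARCHAR(MAX) column
    public int? SupplierId;
}

static class ProjectionSketch
{
    static void Main()
    {
        var MyDbTable = new List<MyDbRow>
        {
            new MyDbRow { id = 1, xml = new string('x', 1000000), SupplierId = 5 },
            new MyDbRow { id = 2, xml = "small", SupplierId = null },
        };

        // Against EF this becomes a SELECT that never reads the xml column;
        // only the projected columns cross the wire.
        var slim = MyDbTable
            .Where(x => x.SupplierId.HasValue)
            .Select(x => new { x.id, x.SupplierId })
            .ToList();

        Console.WriteLine(slim.Count);               // 1
        Console.WriteLine(slim[0].SupplierId.Value); // 5
    }
}
```

If the runtime collapses with the projection, the cost is the NVARCHAR(MAX) transfer, not the filter; that only helps, of course, if the xml is not needed for the later processing.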

2 Answers


As the OP stated in a comment, the xml field is required. Then, as DxTx said, you should store it in a separate table. Further, you can store this field after COMPRESSING it in C#. I am giving some stats that I gathered testing with BLToolkit and also Entity Framework.

I am reading a text file with 70,000 lines, each containing 70 characters, for a total length of 5,039,998. Below are the stats I want to share. I tested just 35 records because, without compression, each row required around 9 MB of space in the database.

[screenshot: timing stats for the 35 test records]

As you can see, there was an OutOfMemory exception; I didn't dig into that, as I just wanted to see the timings. What I actually did is:

When you insert data into that table, COMPRESS the string before inserting. I used varbinary(MAX) as the datatype for the field. When you fetch data, only fetch it when you actually want to process it, and don't forget to DECOMPRESS it. In SQL Server, fetching the compressed records takes almost no time.

Below is the code for compression and decompression. Reference For The Code

    // Requires: using System.IO; using System.IO.Compression; using System.Text;

    // Manual stream copy; on .NET 4+ the built-in Stream.CopyTo does the same.
    public static void CopyTo(Stream src, Stream dest)
    {
        byte[] bytes = new byte[4096];

        int cnt;

        while ((cnt = src.Read(bytes, 0, bytes.Length)) != 0)
        {
            dest.Write(bytes, 0, cnt);
        }
    }

    // Compress a string to a GZip byte array (stored as varbinary(MAX)).
    public static byte[] Zip(string str)
    {
        var bytes = Encoding.UTF8.GetBytes(str);

        using (var msi = new MemoryStream(bytes))
        using (var mso = new MemoryStream())
        {
            using (var gs = new GZipStream(mso, CompressionMode.Compress))
            {
                //msi.CopyTo(gs); // .NET 4+
                CopyTo(msi, gs);
            }

            return mso.ToArray();
        }
    }

    // Decompress a GZip byte array back to the original string.
    public static string Unzip(byte[] bytes)
    {
        using (var msi = new MemoryStream(bytes))
        using (var mso = new MemoryStream())
        {
            using (var gs = new GZipStream(msi, CompressionMode.Decompress))
            {
                //gs.CopyTo(mso); // .NET 4+
                CopyTo(gs, mso);
            }

            return Encoding.UTF8.GetString(mso.ToArray());
        }
    }
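As a self-contained usage sketch of the same idea (using .NET 4+'s built-in Stream.CopyTo in place of the manual helper above), the round trip shows that repetitive xml compresses drastically and decompresses back to the original:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

static class GzipRoundTrip
{
    // Compress a string to a GZip byte array.
    static byte[] Zip(string str)
    {
        var mso = new MemoryStream();
        using (var gs = new GZipStream(mso, CompressionMode.Compress))
        {
            var bytes = Encoding.UTF8.GetBytes(str);
            gs.Write(bytes, 0, bytes.Length);
        }
        return mso.ToArray();
    }

    // Decompress a GZip byte array back to the original string.
    static string Unzip(byte[] bytes)
    {
        using (var msi = new MemoryStream(bytes))
        using (var gs = new GZipStream(msi, CompressionMode.Decompress))
        using (var mso = new MemoryStream())
        {
            gs.CopyTo(mso);
            return Encoding.UTF8.GetString(mso.ToArray());
        }
    }

    static void Main()
    {
        // Repetitive xml-like data compresses very well.
        string xml = string.Concat(Enumerable.Repeat("<item>value</item>", 10000));
        byte[] packed = Zip(xml);

        Console.WriteLine(packed.Length < xml.Length); // True
        Console.WriteLine(Unzip(packed) == xml);       // True
    }
}
```

The byte array returned by Zip is what goes into the varbinary(MAX) column; Unzip is applied only to the rows you are about to process.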

Hope that helps you.

DhavalR

Try adding an index on your WHERE field (SupplierId) in your database and see if it helps.
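For what it's worth, since SupplierId is an int column (the nvarchar(max) column is a different one), a filtered index matching the predicate would look like this (index name is made up; table and column names are from the question, and as the comments note, with 272 rows this is unlikely to be the real bottleneck):

```sql
-- Filtered index covering the WHERE [SupplierId] IS NOT NULL predicate.
CREATE NONCLUSTERED INDEX IX_MyDbTable_SupplierId
    ON dbo.MyDbTable (SupplierId)
    WHERE SupplierId IS NOT NULL;
```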

  • An index on an NVARCHAR(MAX)? May I advise building a [checksum](https://learn.microsoft.com/en-us/sql/t-sql/functions/checksum-transact-sql?view=sql-server-2017) and indexing that instead? – Drag and Drop Apr 26 '18 at 06:52
  • While that is a common solution to problems like these, the column type (nvarchar(max)) makes indexes inefficient; and for a 1-hour runtime on a 272-row table, I'd expect there to be more issues than just a missing index. – Flater Apr 26 '18 at 06:53
  • The SupplierId field is int; another field is nvarchar(max). – petko_stankoski Apr 26 '18 at 07:03