-1

Some T-SQL developers use 0 by default to show that if (for example) user is active and when the user is passive it's value is 1. The followin code (in my example) shows currently active users:

SELECT * FROM USERS WHERE ISACTIVE = 0 

My question is that, is this query processing faster than the following? =>

SELECT * FROM USERS WHERE ISACTIVE = 1 
Eldar Zeynalov
  • 379
  • 5
  • 19
  • 7
    It will generate the same execution plan. – Felix Pamittan Apr 20 '15 at 07:47
  • 1
    Why don't you try once and see for yourself if it generates a different plan. If query plan is different then there will be difference in execution time else not. – Rahul Apr 20 '15 at 07:50
  • 1
    For future reference, here is how you can find the execution plan: [find execution plan](http://stackoverflow.com/questions/7359702/how-do-i-obtain-a-query-execution-plan) – danielhc Apr 20 '15 at 07:58
  • 5
    Don't try to learn a billion little rules for writing the fastest possible code. That set of rules doesn't exist (and even if it did, would have billions of caveats, conditions that must be considered and would be in constant flux). Write *simple, clear, understandable* code and set performance *goals*. Once the code is written, *measure* the performance, if it meets the goals, then don't do anything more. Only if it doesn't meet the goals should you start picking apart *where* the performance issue is. I pretty well guarantee it won't come down to your choice of meaning for `bit` values. – Damien_The_Unbeliever Apr 20 '15 at 08:07
  • 1 is associated with true. You can use where ISACTIVE = 'true' – paparazzo Apr 20 '15 at 13:28

3 Answers3

5

In the simplest scenario there will be no difference. As with any performance question, the key is to test, so I set up the following table with 1,000,000 randomly distributed rows (500,000 each for 1 and 0).

CREATE TABLE #T (ID INT IDENTITY PRIMARY KEY, Filler CHAR(1000), Active BIT NOT NULL);
INSERT #T (Active)
SELECT  Active
FROM    (   SELECT  TOP 500000 Active = 1
            FROM    sys.all_objects AS a
                    CROSS JOIN sys.all_objects AS b
            UNION ALL
            SELECT  TOP 500000 Active = 0
            FROM    sys.all_objects AS a
                    CROSS JOIN sys.all_objects AS b
        ) AS t
ORDER BY NEWID();

The next step is a simple test of how long a clustered index scan takes on each:

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT COUNT(Filler) FROM #T WHERE Active = 1;
SELECT COUNT(Filler) FROM #T WHERE Active = 0;

The execution plan is exactly the same for both:

enter image description here

As is the IO:

Scan count 5, logical reads 143089, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Then looking at the elapsed time, over 10 runs (not really enough but the picture is fairly clear) the elapsed times were (in ms)

Active = 1  Active = 0
---------------------------
125         132
86          86
89          61
83          89
88          89
63          64
85          93
126         125
100         117
66          68
--------------------------
91.1        92.4    (Mean)

So a mean difference of approx 1ms, which is not significant enough to be considered material. So in your case no, there is no difference.

I then though perhaps it makes a difference with a sorted index on the column, so added one:

CREATE INDEX IX_T__Active ON #T (Active) INCLUDE (Filler);

And again the results showed no (relevant) difference:

Active = 1  Active = 0
--------------------------
57          55
42          48
56          57
58          55
44          42
46          41
41          42
42          52
43          43
52          59
--------------------------
48.1        49.4

In summary, it does not make a material difference, and I am pretty sure this is the exact kind of premature optimisation that Donald Knuth was referring too.

GarethD
  • 68,045
  • 10
  • 83
  • 123
1

You can quickly test this on tempdb.

    create table #tmp
    (
    id int, flag bit default(0)
    );

        DECLARE @max AS INT, @rc AS INT;
        SET @max = 200000;
        SET @rc = 1;


        INSERT INTO #tmp VALUES(1, 0);
        WHILE @rc * 2 <= @max
        BEGIN
          INSERT INTO #tmp SELECT id + @rc, 0 as flag FROM #tmp;
          SET @rc = @rc * 2;
        END

        INSERT INTO #tmp
          SELECT id + @rc, 1 as flag FROM #tmp WHERE id + @rc <= @max;
GO

set statistics time on
Go
select * from #tmp where flag = 1
set statistics time off
Go
set statistics time on
Go

select * from #tmp where flag = 0
set statistics time off
Go

Try creating indexes as well, you will see more differences i.e. how bit column work in case of indexes with different values.

  1. Result will be same when count of both flag is equal with/without index on this column.
  2. Result will be same when count of both flag is equal or close to equal when column has index
  3. Result will vary when count of both flag is unequal and index is present on flag column
Shantanu Gupta
  • 20,688
  • 54
  • 182
  • 286
0

I don't think this design is there because of performance as both of you queries will generate the same execution plan. More important aspect is if you have index on this column or not.

There are multiple other reasons why to mark inactive users with value 1.

Some reasons on why I would do so:

1) 0 is default value of int and bool

In some ORMs (for example, EF6) you don't have to specify any value to status column and it will be set to 0. So, user will be active by default.

In most systems most of the users will be active. If user is inactive, it is an special case which needs to be covered. Not all around.

2) Future value considerations

This column might contain different values in future to indicate that user has been suspended, deleted, etc.

It would not make much sense to have

0-inactive, 1-active, 2-suspended, etc.

instead of

0-active, 1-inactive, 2-suspended, etc.

That would allow querying problematic users by simple expression status > 0.

Kaspars Ozols
  • 6,967
  • 1
  • 20
  • 33
  • 0 is NOT the default value of an int, nor a bit. NULL is the default value of everything unless specified otherwise – t-clausen.dk Apr 20 '15 at 08:06
  • I was talking about ORMs like Entity Framework 6. If you generate EF6 model on SQL database where `status` is non-nullable `bit` or non-nullable `int` column, in the model you will get C# `int` or `bool` which is not nullable and has default value of 0 and false (converts to 0). So, when creating a record there will be no way to pass in `NULL`. – Kaspars Ozols Apr 20 '15 at 08:18