I have a table with over 3,000,000 rows; each row has an id and an xml field of type varchar(6000).
If I do SELECT id FROM bigtable it takes about 2 minutes to complete. Is there any way to get this down to 30 seconds?
Build a clustered index on the id column.
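As a minimal sketch of that suggestion, assuming SQL Server and that bigtable does not already have a clustered index (a table can have only one), the statement might look like the following. The index name IX_bigtable_id is hypothetical.

```sql
-- Hypothetical index name; assumes bigtable has no clustered index yet.
-- Building it rewrites the table in id order, so expect it to take a while on 3 million rows.
CREATE CLUSTERED INDEX IX_bigtable_id
ON bigtable (id);
```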
You could apply indexes to your tables; in your case, a clustered index.
Clustered indexes:
http://msdn.microsoft.com/en-gb/library/aa933131(v=sql.80).aspx
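Before adding an index, it may be worth checking which ones already exist. A quick way to do that, assuming SQL Server, is the system stored procedure sp_helpindex:

```sql
-- Lists each index on bigtable with its description (clustered or nonclustered) and key columns.
EXEC sp_helpindex 'bigtable';
```

If the output already shows a clustered index on id, a missing index is probably not the cause of the slow query.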
I would also suggest filtering your query so it doesn't return all 3 million rows each time; this can be done by using TOP or WHERE.
TOP:
SELECT TOP 1000 ID
FROM bigtable
WHERE:
SELECT ID
FROM bigtable
WHERE id IN (1,2,3,4,5)
First of all, 3 million records don't make a table 'huge'.
To optimize your query, you should do the following.
- Filter your query. Why do you need to get ALL your IDs?
- Create a clustered index on the ID column, so the engine has a smaller lookup structure to search first before pointing to the selected row.
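The two points above can be combined. If every id really is needed, one possible sketch is to fetch them in batches, seeking on id each time instead of pulling 3 million rows in one result set. The variable @last_id and the batch size of 1000 are illustrative assumptions, not something from the question.

```sql
-- @last_id is a hypothetical variable holding the highest id of the previous batch.
-- Starting at 0 assumes ids are positive integers.
DECLARE @last_id int;
SET @last_id = 0;

-- With an index on id, this seeks straight to the next batch.
SELECT TOP 1000 id
FROM bigtable
WHERE id > @last_id
ORDER BY id;
```

The caller would set @last_id to the last id returned and repeat until no rows come back.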
Okay, why are you returning all the Ids to the client?
Even if your table has no clustered index (which I doubt), the vast majority of your processing time will be client-side, transferring the Id values over the network and displaying them on the screen.
Querying for all values rather defeats the point of having a query engine.
The only reason I can think of (perhaps I lack imagination) for getting all the Ids is some sort of misguided caching.
If you want to know how many you have, do
SELECT count(*) FROM [bigtable]
If you want to know if an Id exists, do
SELECT count([Id]) FROM [bigtable] WHERE [Id] = 1 /* or some other Id */
This will return 1 row with a 1 or 0 indicating existence of the specified Id.
Both these queries will benefit massively from a clustered index on Id and will return minimal data with maximal information.
Both of these queries will return in less than 30 seconds, and in less than 30 milliseconds if you have a clustered index on Id.
Selecting all the Ids will provide no more useful information than these queries, and all it will achieve is a workout for your network and client.
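As an alternative sketch of the existence check above, EXISTS expresses the same question directly and lets the engine stop at the first matching row. The column alias IdExists is only illustrative.

```sql
-- Returns a single row containing 1 if the Id is present, otherwise 0.
SELECT CASE
         WHEN EXISTS (SELECT 1 FROM [bigtable] WHERE [Id] = 1 /* or some other Id */)
         THEN 1
         ELSE 0
       END AS IdExists;
```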
You could index your table for better performance.
There are additional options you could use to improve performance as well, such as the partitioning feature.
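A rough sketch of what partitioning by id could look like is shown below. It assumes SQL Server 2005 or later (table partitioning does not exist in SQL Server 2000), an edition that supports it, an id column of type int, and a table with no existing clustered index. All object names and the boundary values are hypothetical.

```sql
-- Hypothetical names and boundaries: three partitions split at ids 1,000,000 and 2,000,000.
CREATE PARTITION FUNCTION pf_bigtable_id (int)
AS RANGE LEFT FOR VALUES (1000000, 2000000);

-- Map every partition to the PRIMARY filegroup for simplicity.
CREATE PARTITION SCHEME ps_bigtable_id
AS PARTITION pf_bigtable_id ALL TO ([PRIMARY]);

-- Building the clustered index on the scheme moves the table onto the partitions.
CREATE CLUSTERED INDEX IX_bigtable_id_partitioned
ON bigtable (id)
ON ps_bigtable_id (id);
```

Partitioning mainly helps queries that filter on the partitioning column; it is unlikely to speed up a query that still reads every id.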