8

I need to fetch and show data on a webpage whose number of records may vary based on filters from around 500 records to 1 million records.

Will caching be of any use here as I think million record in memory is not a good thought. SqldataReader?

Paging is a must to implement of course. Showing 1 million records is a worst case scenario(the stupid All filter in use-cases!).

Should I use connected architecture(SqlDataReader) or disconnected architecture(DataSets)?

Manish
  • 6,106
  • 19
  • 64
  • 90
  • 4
    i'm just trying to work out if there are enough hours in a day to page through a million rows of data. let's say 40 rows per page so 25,000 pages required, say 1 min of browsing per page would be 25,000 mins or about 17 days - eeeek – Jon Black Oct 16 '10 at 17:37
  • 3
    It makes absolutely no sense to display a million records if you ask me. What is the scenario here? – Brian Rasmussen Oct 16 '10 at 17:39
  • As an aside, assuming your records have an average of 150 bytes (that's like a name, a short description, a couple of ints and a couple bools). 1 million records would be less than 150MB. Not really too much to store in the cache. However, it is worth noting that your database server (probably SQL Server) is already doing caching. In fact, if queries like that are common and you have a lot of memory, I would think it is possible that the whole table is in memory. – tster Oct 16 '10 at 17:40
  • @f00: showing million record is a worst case scenario, edited the question accordingly. – Manish Oct 16 '10 at 17:42
  • Definitely use SqlDataReader if you have to deal with millions of records. DataSets will likely become very ineffective because of the architecture. See my answer below. – Mike Dinescu Oct 16 '10 at 17:47
  • 2
    after edit... get rid of the "all" filter. Enable more advanced filtering so that a user can drill down into exactly what they want. – Chase Florell Oct 16 '10 at 17:47
  • personally i'd be using disconnected datatables (not sets) or roll my own serialisable business objects populated by a datareader as they are far more scaleable under concurent load than datareaders - http://msdn.microsoft.com/en-us/library/ms978388.aspx – Jon Black Oct 16 '10 at 17:53

4 Answers4

13

First of all, think about it like this: displaying 1 million records makes absolutely no sense to any user. So, you have to think about what the user expects to see. Maybe a summary?! Maybe paginate the records in pages of say 25, or 50 or 100 records. Either of these approaches will not require you to hold 1 M records at a time in memory.

Also, when you run a query against a SQL database and use a SqlDataReader you will not be receiving all the records but instead the SQL driver will send the query to the SQL server, the server will execute the query, prepare a result set and create a forward-only cursor on the server. Then the driver will fetch a record at a time, every time you call Read() on your SqlDataReader. The behavior is very similar if you use LINQ to SQL which uses deferred execution. The result set is not transferred over in full until (or unless) you specifically request each and every row.

So, a simple pagination query will do the trick. Or in other cases some sort of summary report that aggregates the data from those 1 million records one or two pages of relevant data.

Of course if you do need to move back and forth through the pages, some sort of caching might make sense but again, think about it: how often will a user actually want to browse through 1 million records - probably never.

As a last note, if you do implement pagination - make sure that the method you use to implement the pagination relies on the SQL server sending data one page at a time and not reading all 1 million records into ASP.NET and then paginating the local copy of the data because that would be very inefficient and slow. Here is an example of a SQL Server query that performs pagination: SO Question #109232

Community
  • 1
  • 1
Mike Dinescu
  • 54,171
  • 16
  • 118
  • 151
  • Nice detailed explaination. Thanks! But still a question: Won't DataSets be of any help here? – Manish Oct 16 '10 at 17:47
  • 1
    A DataSet works differently. As opposed to the SqlDataReader which has a forward-only behavior, a DataSet has to bring all data from the server locally before operating on it. This makes it easier to use in some cases but if you're query could return 1 million records DatSets are not the solution. In the end it all depends on the query. – Mike Dinescu Oct 16 '10 at 17:50
3

I concur with the rest of the answerers. displaying 1M records is ludicrous. However, you can display the first X records, and page through.

The trick is in the Stored Procedure doing the fetching

ALTER PROCEDURE [dbo].[MyHugeTable_GetWithPaging] 
( 
        @StartRowIndex      int, 
        @MaximumRows        int 
) 

AS 
SET NOCOUNT ON 

Select 
    RowNum, 
    [UserName]
From 
    (Select 
        [ID], 
        [UserName]
        Row_Number() Over(Order By [ID] Desc) As RowNum 
        From dbo.[MyHugeTable] t) 
As DerivedTableName 
Where RowNum Between @StartRowIndex And (@StartRowIndex + @MaximumRows) 
Chase Florell
  • 46,378
  • 57
  • 186
  • 376
1

If your server can't cache 1 million records how do you think your user's web browser is going to handle 1 million records worth of HTML coming at it?

Consider paging (here is an example with 1 million records)

Also consider that the user never wants more than about 30 to 50 records. You are either showing them too low level of detail, or you need more filtering.

Chase Florell
  • 46,378
  • 57
  • 186
  • 376
tster
  • 17,883
  • 5
  • 53
  • 72
  • That link was cool. Sort of what I want. But what is it using in the background..DataSets? I see the retrieval is quite quick. – Manish Oct 16 '10 at 17:52
  • @Manish, the background is agnostic to that grid. Using any language you can connect to any database. Then the trick is to write SQL queries which utilize indexes correctly such that returning rows n-m is fast. – tster Oct 17 '10 at 14:43
0

I will suggest use dynamic query with paging . so, when you click on specific page fetch records only for those pages. To fetch record from database from specific range use following query.

like this.

Create proc Test

@take smallint,
@skip smallint,
@orderBy nvarchar(20),
@subscriptionid smallint,


as 

DECLARE @SQLQuery AS NVARCHAR(max)

SET @SQLQuery=' Select ROW_NUMBER() OVER (ORDER BY P.ProductId desc) as RowNum,* from product"

set @SQLQuery=@SQLQuery + ' and Subscriptionid='+CONVERT(nvarchar, @subscriptionid) 

set @SQLQuery= ';WITH Results_CTE AS ( '+@SQLQuery

    set @SQLQuery= @SQLQuery +' ) SELECT * FROM Results_CTE WHERE RowNum > '+CONVERT(nvarchar, @skip)+' AND RowNum <= '+CONVERT(nvarchar, @skip+@take)  --//paging';
END

EXECUTE sp_executesql @SQLQuery 
virender
  • 4,539
  • 5
  • 27
  • 31