2

I am trying to fetch chunk by chunk data from a MySQL table.

I have a table like below:

HistoryLogs = 10986119

Now I would like to fetch chunk by chunk data from MyQSL and pass it to sqlbulk copy for processing. I have decided batch size as 1000.

For example, if I have records of 10000 then my query will be like below:

SELECT * FROM tbl LIMIT 0,1000;
SELECT * FROM tbl LIMIT 1000,2000;
SELECT * FROM tbl LIMIT 2000,3000;
SELECT * FROM tbl LIMIT 9000,10000;

So first I will fetch total records from table and then trying like below:

 private int FetchCount(string table)
        {
            using (MySqlCommand cmd = new MySqlCommand("SELECT COUNT(*) FROM " + table, conn))
            {
                cmd.CommandTimeout = 0;
                return cmd.ExecuteNonQuery();
            }
        }

string query = string.Empty;
string table ="HistoryLogs";
int records = FetchCount(table);
for (int i = 0; i < records / 1000 ; i++) //
 {
     here I would like to create chunk by chunk query and pass it to Process method
 }


 private MySqlDataReader Process(MySqlConnection conn, string query)
        {
            using (MySqlCommand cmd = new MySqlCommand(query, conn))
            {
                cmd.CommandTimeout = 0;
                MySqlDataReader reader = cmd.ExecuteReader();
                return reader;
            }
        }

So I am not getting how to create pagination query, and not sure whether I am thinking in the right way.

halfer
  • 19,824
  • 17
  • 99
  • 186
I Love Stackoverflow
  • 6,738
  • 20
  • 97
  • 216

1 Answers1

4

It is a bit uncertain what technique you want to use to get the data from the database.

Because of the Linq keyword I assume you want a Linq statement that gives you a page of items with a given pageSize.

However it is a uncertain how you want to fetch your data for page X. Do you have an IQueryable of the records you want to divide into pages (as in Entity Framework - highly recommended), or do you want to change your SQL statement so that it will give you page X?

The IQueryable method

Suppose you want pages of records of type T and you have an IQueryable<T> to get all records of type T.

 IQueryable<T> allRecords = ...;

You want to divide this sequence into pages. Each page has a PageSize, a PageNr and a sequence of records:

class Page<T>
{
    public int PageSize {get; set;}
    public int PageNr {get; set;}
    public IEnumerable<T> Contents {get; set;}
}

Now to divide AllRecords into a sequence of pages, I use an extension method:

public static class PagingExtensions
{
    public static IQueryable<Page<T>> ToPages<T>(this IQueryable<T> allRecords, int pageSize)
    {
        return allRecords.Select( (record, i) => new
        {
            PageNr = i / pageSize,
            Record = record,
        })
        .GroupBy(item => item.PageNr)
        // intermediate result: sequence of IGrouping<int, T>
        // where key is pageNr
        // and each element in the group are the records for this page
        .Select(group => new Page<T>
        {
            PageNr = group.Key,
            PageSize = pageSize,
            Contents = (IEnumerable<T>) group
        });
    }
}

The code to divide the sequence of MyRecords into pages will be:

const int pageSize = 1000;
IQueryable<MyRecord> allMyRecords = ...
IQueryable<Page<MyRecord>> pages = allMyRecords.ToPages(1000);

// do what you want with the pages, for example:
foreach (Page<MyRecord> page in pages)
{
    Console.WriteLine($"Page {page.PageNr}");
    foreach (MyRecord record in Page.Contents)
    {
       Console.WriteLine(record.ToString());
    }
}

Be aware, that all used functions use deferred execution. The records are not fetched until you enumerate them.

If you want to be able to divide a collection in pages that is in local memory instead of in a database use IEnumerable<T> instead of IQueryable<T>.

Method without IQueryable

If you don't have an IQueryable to fetch all records, you either have to create a class that implements this yourself or adjust your SQL queries depending on the page you want to fetch. I wouldn't recommend the first method though.

class Page<T>
{
    public Page(SqlConnection conn, int pageNr, int pageSize)
    {
        this.PageNr = pageNr;
        this.PageSize = pageSize;
    }
    private readonly SqlConnection conn;
    public int PageSize {get; private set;}
    public int PageNr {get; private set;}

    public IEnumerable<T> ReadContents()
    {
        int offset = this.PageNr * this.PageSize;
        int fetch = this.PageSize;
        string cmdText = "SELECT col1, col2, ..."
          + " FROM ... "
          + " WHERE ... "
          + " ORDER BY -- " 
          // this is a MUST there must be ORDER BY statement
          //-- the paging comes here
          + $" OFFSET {offset} ROWS"
          + $" FETCH NEXT {fetch} ROWS ONLY;";

        using (SqlCommand cmd = new SqlCommand("cmdText, conn))
        {
            using (var sqlDataReader = cmd.ExecuteQuery())
            {
                List<T> readItems = sqlDataReader...;
                // you know better than I how to use the SqlDataReader                    
                return readItems
            }
        }
    }
}

The idea for Fetch / Offset instead of Enumerable Skip / Take came from Implement paging in SQL on stackoverflow.

Ed S.
  • 122,712
  • 22
  • 185
  • 265
Harald Coppoolse
  • 28,834
  • 7
  • 67
  • 116
  • 1
    This is really really awesome.Thank you so much for providing the answer and i really appreciate your efforts – I Love Stackoverflow Dec 21 '16 at 14:09
  • There are a couple of issues with the IQueryable version. The first is easy to fix; `Contents = (IEnumerable) group` should be `Contents = group.Select(g => g.Record)`. The second not so much if you're using EF; you cannot use the `(T, int)` overload of `Select` in a query. – Ed S. Nov 12 '19 at 18:45