I want to read data from an XML file, but I do not want to read all of it and then sort and filter. Basically, I do not want to load all the XML data into memory; I would rather read a small amount of data at a time using pagination.

We often read data from a database with pagination, and I want to read data from an XML file the same way. Which class would be best for this?

My XML data looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Orders>
    <OrderID>10248</OrderID>
    <CustomerID>VINET</CustomerID>
    <EmployeeID>5</EmployeeID>
    <OrderDate>1996-07-04T00:00:00</OrderDate>
    <RequiredDate>1996-08-01T00:00:00</RequiredDate>
    <ShippedDate>1996-07-16T00:00:00</ShippedDate>
    <ShipVia>3</ShipVia>
    <Freight>32.3800</Freight>
    <ShipName>Vins et alcools Chevalier</ShipName>
    <ShipAddress>59 rue de l'Abbaye</ShipAddress>
    <ShipCity>Reims</ShipCity>
    <ShipPostalCode>51100</ShipPostalCode>
    <ShipCountry>France</ShipCountry>
  </Orders>
</Root>

I am currently reading it with LINQ like this:

XDocument document = XDocument.Load(@"c:\users\WindowsFormsApplication5\Orders.xml");
var query = from r in document.Descendants("Orders")
            select new
            {
                OrderID = r.Element("OrderID").Value,
                CustomerID = r.Element("CustomerID").Value,
                EmployeeID = r.Element("EmployeeID").Value,
            };

//setup query result ordering,
//assume we have variable to determine ordering mode : bool isDesc = true/false
if (isDesc) query = query.OrderByDescending(o => o.OrderID);
else query = query.OrderBy(o => o.OrderID);

//set up pagination,
//e.g. display page 2 where each page holds 100 records
var page = 2;
var pageSize = 100;
query = query.Skip((page - 1) * pageSize).Take(pageSize);

//execute the query to get the actual result
var items = query.ToList();

Anyone who looks at the code above will notice how I am reading the data from the XML file: it loads all the data into memory. If I want to read 10 records at a time, I want only those 10 records loaded into memory, not all of them.

The approach above will not work well if there are 10,000,000 records in the XML file. So what would be the best way to read a large XML file in parts, with a pagination-like concept?

Peter B
Monojit Sarkar
  • Since you sort your data you need to have it all, so you will need to read it all. – Guru Stron Oct 06 '16 at 10:09
  • http://stackoverflow.com/questions/2441673/reading-xml-with-xmlreader-in-c-sharp -> Check the answer of Jon Skeet – mybirthname Oct 06 '16 at 10:10
  • You didn't explain the problem, but I guess it's a poor performance of above code. See [this](http://stackoverflow.com/q/14000846/1997232). For huge xml you have to traverse it manually (without `XDocument`). Another problem is sorting you want to apply, that won't work well without indexing or having it initially sorted. – Sinatr Oct 06 '16 at 10:10

1 Answer

Use XmlReader

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication16
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
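            // XmlReader is a forward-only, streaming reader, so the whole file is never held in memory.
            using (XmlReader reader = XmlReader.Create(FILENAME))
            {
                while (!reader.EOF)
                {
                    // After ReadFrom the reader may already be on the next Orders element; otherwise skip ahead to it.
                    if (reader.Name != "Orders")
                    {
                        reader.ReadToFollowing("Orders");
                    }
                    // ReadToFollowing leaves the reader at EOF when there are no more Orders elements.
                    if (!reader.EOF)
                    {
                        // Materialize only this one Orders element as an XElement.
                        XElement order = (XElement)XElement.ReadFrom(reader);
                        int OrderID = (int)order.Element("OrderID");
                        string CustomerID = (string)order.Element("CustomerID");
                        int EmployeeID = (int)order.Element("EmployeeID");
                    }
                }
            }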
        }
    }
}
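
The loop above visits every Orders element. As a possible extension (a minimal sketch, not part of the original answer), the same streaming read can be wrapped in an iterator so that LINQ's Skip/Take select a single page. The names XmlPager, StreamElements and ReadPage are hypothetical and were chosen only for this illustration; pages are assumed to be 1-based and in document order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
static class XmlPager
{
    // Lazily yields each element with the given name; only the current
    // element is materialized as an XElement at any time.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string elementName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
            {
                // ReadFrom consumes the whole element and leaves the reader
                // on the node that follows it, so no extra Read() is needed here.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    // page is 1-based; returns at most pageSize elements in document order.
    public static List<XElement> ReadPage(XmlReader reader, string elementName, int page, int pageSize)
    {
        return StreamElements(reader, elementName)
            .Skip((page - 1) * pageSize)   // skipped elements are parsed, then discarded
            .Take(pageSize)                // stops reading once the page is full
            .ToList();
    }
}
```

It could be called as `using (var reader = XmlReader.Create(FILENAME)) { var items = XmlPager.ReadPage(reader, "Orders", 2, 100); }`. Memory stays bounded by one page, but the reader still has to parse every element before the requested page, and the result is in file order; sorting by OrderID would still require a full pass over the file.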
jdweng
  • What does the ReadToFollowing() function do? – Monojit Sarkar Oct 06 '16 at 11:32
  • Why do you check reader.EOF twice in the code? – Monojit Sarkar Oct 06 '16 at 11:32
  • ReadToFollowing() advances the reader to the next Orders element in the code posted. Two EOF checks are needed since ReadToFollowing() may not find any more elements: the last Orders element may not be the last node in the file, so after reading it the while condition will not yet see EOF. – jdweng Oct 06 '16 at 15:17