
At present I am fetching data from an XML file using LINQ. The problem is that I load the file with XDocument, and the XDocument class loads the whole XML document into memory. So if there are 10,000 records in my XML file, XDocument will load all 10,000 records into memory. I have been told that if I read the XML data with the XmlReader class instead, it will not load the full data into memory.

My XML data looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Orders>
    <OrderID>10248</OrderID>
    <CustomerID>VINET</CustomerID>
    <EmployeeID>5</EmployeeID>
    <OrderDate>1996-07-04T00:00:00</OrderDate>
    <RequiredDate>1996-08-01T00:00:00</RequiredDate>
    <ShippedDate>1996-07-16T00:00:00</ShippedDate>
    <ShipVia>3</ShipVia>
    <Freight>32.3800</Freight>
    <ShipName>Vins et alcools Chevalier</ShipName>
    <ShipAddress>59 rue de l'Abbaye</ShipAddress>
    <ShipCity>Reims</ShipCity>
    <ShipPostalCode>51100</ShipPostalCode>
    <ShipCountry>France</ShipCountry>
  </Orders>
</Root>

Here is the code I currently use to fetch data from the XML file, with ordering and paging:

XDocument document = XDocument.Load(@"c:\users\documents\visual studio 2010\Projects\WindowsFormsApplication5\WindowsFormsApplication5\Orders.xml");
bool isDesc = true;

// set up the basic query
var query = from r in document.Descendants("Orders")
            select new
            {
                OrderID = r.Element("OrderID").Value,
                CustomerID = r.Element("CustomerID").Value,
                EmployeeID = r.Element("EmployeeID").Value,
            };

// set up the result ordering;
// isDesc selects the ordering mode (true = descending)
if (isDesc)
    query = query.OrderByDescending(o => o.OrderID);
else
    query = query.OrderBy(o => o.OrderID);

// set up pagination,
// e.g. displaying page 1 where each page shows 5 records
var page = 1;
var pageSize = 5;
// parentheses are required: page - 1 * pageSize evaluates as page - (1 * pageSize)
query = query.Skip((page - 1) * pageSize).Take(pageSize);

// execute the query to get the actual result
dataGridView1.DataSource = query.ToList();

Could someone tell me how I could use XmlReader to read data from the XML file while still supporting pagination and an order-by clause?

I got one hint, but I do not understand how to use it for my purpose:

using( var reader = XmlReader.Create( . . . ) )
{
       reader.MoveToContent();
       reader.ReadToDescendant( "book" );
       // skip N <book> elements
       for( int i = 0; i < N; ++i )
       {
              reader.Skip();
              reader.ReadToNextSibling( "book" );
       }
       // read M <book> elements
       for( int i = 0; i < M; ++i )
       {
              var s = reader.ReadOuterXml();
              Console.WriteLine( s );
              reader.ReadToNextSibling( "book" );
       }
}

Please look at the code above and help me write code that uses XmlReader to fetch paginated data.

shilovk
Mou
  • What about : XmlReader reader = XmlReader.Create(@"c:\users\documents\visual studio 2010\Projects\WindowsFormsApplication5\WindowsFormsApplication5\Orders.xml"); XDocument document = XDocument.Load(reader); – jdweng Jul 19 '15 at 13:26
  • You need all the data in memory since you are trying to order all elements by id. The approach above is what is referred to as the hybrid approach. See Jon's answer : http://stackoverflow.com/questions/8096564/xmltextreader-vs-xdocument – jdweng Jul 19 '15 at 13:34
  • The 'approach above' is pretty much the same as the current `XDocument.Load("...")` - see the [source](http://referencesource.microsoft.com/#System.Xml.Linq/System/Xml/Linq/XLinq.cs,5db314795a18e20b) for confirmation. It's not the 'hybrid approach' Jon refers to; it still loads the entire document. – Charles Mager Jul 19 '15 at 16:56
  • There's no obvious advantage to be gained by using `XmlReader` for this. It's more complicated to use, and you still have to read and store every element to be able to sort and paginate. – Charles Mager Jul 20 '15 at 07:20
  • Please tell me the best approach for reading XML data without loading all of it into memory. I want to paginate data from an XML file but do not want to load a huge amount of data into memory, so please advise on the best option. Thanks. – Mou Jul 20 '15 at 07:28

1 Answer


Try this code. Your latest posting is locked, so I had to answer here.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
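        static void Main(string[] args)
        {
            Order order = new Order();
            // Dispose the reader (and its file handle) when finished.
            using (XmlReader reader = XmlReader.Create(FILENAME))
            {
                reader.MoveToContent();
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "Orders")
                    {
                        // ReadFrom consumes one <Orders> element and leaves the reader
                        // on the following node, so adjacent elements are not skipped.
                        XElement element = (XElement)XNode.ReadFrom(reader);
                        order.Add(element);
                    }
                    else
                    {
                        reader.Read();
                    }
                }
            }
            order.Sort();
        }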
    }
    public class Order
    {
        public void Add(XElement element)
        {
            if (orders == null)
            {
                orders = new List<Order>();
            }
            // The explicit XElement conversions parse values culture-invariantly,
            // so "32.3800" is read the same way whatever the current culture is.
            Order newOrder = new Order();
            newOrder.OrderId = (int)element.Element("OrderID");
            newOrder.CustomerId = (string)element.Element("CustomerID");
            newOrder.EmployeeId = (int)element.Element("EmployeeID");
            newOrder.OrderDate = (DateTime)element.Element("OrderDate");
            newOrder.RequiredDate = (DateTime)element.Element("RequiredDate");
            newOrder.ShippedDate = (DateTime)element.Element("ShippedDate");
            newOrder.ShipVia = (int)element.Element("ShipVia");
            newOrder.Freight = (double)element.Element("Freight");
            newOrder.ShipName = (string)element.Element("ShipName");
            newOrder.ShipAddress = (string)element.Element("ShipAddress");
            newOrder.ShipCity = (string)element.Element("ShipCity");
            // Postal codes are text, not numbers (letters and leading zeros occur).
            newOrder.ShipPostalCode = (string)element.Element("ShipPostalCode");
            newOrder.ShipCountry = (string)element.Element("ShipCountry");
            orders.Add(newOrder);
        }

        // Sorts the shared list by OrderId, ascending.
        public void Sort()
        {
            orders = orders.OrderBy(x => x.OrderId).ToList();
        }

        public static List<Order> orders { get; set; }
        public int OrderId { get; set; }
        public string CustomerId { get; set; }
        public int EmployeeId { get; set; }
        public DateTime OrderDate { get; set; }
        public DateTime RequiredDate { get; set; }
        public DateTime ShippedDate { get; set; }
        public int ShipVia { get; set; }
        public double Freight { get; set; }
        public string ShipName { get; set; }
        public string ShipAddress { get; set; }
        public string ShipCity { get; set; }
        public string ShipPostalCode { get; set; }
        public string ShipCountry { get; set; }
    }

}
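
If sorting is not needed and a page in document order is enough, the reading loop can be made lazy so that reading stops as soon as the page is full. This is only a minimal sketch under that assumption; the names `OrderStream`, `StreamOrders` and `ReadPage` are made up for illustration and are not part of any library.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class OrderStream
{
    // Lazily yields one <Orders> element at a time; only the current
    // element is materialized, never the whole document.
    public static IEnumerable<XElement> StreamOrders(XmlReader reader)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "Orders")
            {
                // XNode.ReadFrom consumes the element and leaves the reader
                // on the node that follows it, so no extra Read() is needed.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Returns one page in document order (no sorting); page is 1-based.
    public static List<XElement> ReadPage(string path, int page, int pageSize)
    {
        using (XmlReader reader = XmlReader.Create(path))
        {
            // ToList() runs inside the using block, before the reader is disposed.
            return StreamOrders(reader)
                .Skip((page - 1) * pageSize)
                .Take(pageSize)
                .ToList();
        }
    }
}
```

Because `Take` stops pulling from the iterator once the page is full, the rest of the file is never parsed. The skipped elements are still parsed one by one, though, so later pages cost more than earlier ones.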
jdweng
  • Thanks, but I found that Skip and Take are missing for pagination. – Mou Jul 20 '15 at 09:25
  • Why do you need Skip and Take if you are loading the entire XML? My code reads one Orders element at a time and parses it into the class Order, so the entire XML isn't in memory at one time. – jdweng Jul 20 '15 at 09:45
  • I saw from your new posting that you are using DataSource. Do you want to put the parsed data into a DataTable, or maybe use an IList<> instead of a List<>? – jdweng Jul 20 '15 at 12:49
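
On the missing Skip and Take: as the comments point out, ordering by OrderID means every record has to be seen once, but it does not mean every full element has to be kept. The sketch below keeps only the three columns shown in the grid, sorts them, and then pages. `OrderRow`, `OrderPager` and `GetSortedPage` are hypothetical names used for illustration, and the code assumes every `Orders` element contains `OrderID`, `CustomerID` and `EmployeeID`.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical row type holding only the columns shown in the grid.
public class OrderRow
{
    public int OrderID { get; set; }
    public string CustomerID { get; set; }
    public int EmployeeID { get; set; }
}

public static class OrderPager
{
    // page is 1-based; rows are sorted numerically by OrderID.
    public static List<OrderRow> GetSortedPage(XmlReader reader, int page, int pageSize, bool isDesc)
    {
        var rows = new List<OrderRow>();
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "Orders")
            {
                // Parse one element, copy the needed fields, then let it go.
                var e = (XElement)XNode.ReadFrom(reader);
                rows.Add(new OrderRow
                {
                    OrderID = (int)e.Element("OrderID"),
                    CustomerID = (string)e.Element("CustomerID"),
                    EmployeeID = (int)e.Element("EmployeeID")
                });
            }
            else
            {
                reader.Read();
            }
        }

        IEnumerable<OrderRow> sorted = isDesc
            ? rows.OrderByDescending(r => r.OrderID)
            : rows.OrderBy(r => r.OrderID);

        // Parenthesize (page - 1) so the offset is computed correctly.
        return sorted.Skip((page - 1) * pageSize).Take(pageSize).ToList();
    }
}
```

The returned list can be assigned directly, for example `dataGridView1.DataSource = OrderPager.GetSortedPage(reader, 2, 5, true);`. Memory use still grows with the number of orders, but only by three small fields per order rather than a full element tree.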