6

I am very new to parsing XML, and I have started learning about LINQ, which I think might be the best solution here. I am mostly interested in performance, as the application I am creating will read stock exchange prices, which can sometimes change very rapidly. I receive the following message from the server:

<?xml version="1.0" encoding="utf-16"?>
    <events>
        <header>
            <seq>0</seq>
        </header>
        <body>
            <orderBookStatus>
                <id>100093</id>
                <status>Opened</status>
            </orderBookStatus>
            <orderBook>
                <instrumentId>100093</instrumentId>
                <bids>
                    <pricePoint>
                        <price>1357.1</price>
                        <quantity>20</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1357.0</price>
                        <quantity>20</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1356.9</price>
                        <quantity>71</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1356.8</price>
                        <quantity>20</quantity>
                    </pricePoint>
                </bids>
                <offers>
                    <pricePoint>
                        <price>1357.7</price>
                        <quantity>51</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1357.9</price>
                        <quantity>20</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1358.0</price>
                        <quantity>20</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1358.1</price>
                        <quantity>20</quantity>
                    </pricePoint>
                    <pricePoint>
                        <price>1358.2</price>
                        <quantity>20</quantity>
                    </pricePoint>
                </offers>
                <lastMarketClosePrice>
                    <price>1356.8</price>
                    <timestamp>2011-05-03T20:00:00</timestamp>
                </lastMarketClosePrice>
                <dailyHighestTradedPrice />
                <dailyLowestTradedPrice />
                <valuationBidPrice>1357.1</valuationBidPrice>
                <valuationAskPrice>1357.7</valuationAskPrice>
                <lastTradedPrice>1328.1</lastTradedPrice>
                <exchangeTimestamp>1304501070802</exchangeTimestamp>
            </orderBook>
        </body>
    </events>

My aim is to parse price point elements

<pricePoint>
      <price>1358.2</price>
      <quantity>20</quantity>
</pricePoint>

into a dictionary of the following structure:

Dictionary<double, PriceLevel> 

where price should be a double and PriceLevel is a class

class PriceLevel
{
     int bid;
     int offer;

     public PriceLevel(int b, int o)
     {
          bid = b;
          offer = o;
     }


}

The quantity should be assigned according to the element that contains the price point (either bids or offers): if the price point is in bids, the quantity should be assigned to bid and 0 to offer. Conversely, if the price point is in offers, the quantity should be assigned to offer and 0 to bid.

I hope my explanation is clear, however if you have any problems understanding it, please do not hesitate to ask for clarification in comments. I would greatly appreciate help in solving this problem.

+++++++++++++++++++++++++++++++++++++++++ Update:

I have looked deeper into the stream I am trying to read, and it is not going to be as simple as I expected. I found out that the stream will not always contain the whole document, so I will have to read it with XmlReader and process the stream on an ongoing basis. In this case, how can I read bids and offers? I have something like this:

// Verbatim string (@) so the backslashes are not treated as escape sequences.
StreamReader sr = new StreamReader(@"..\..\videos.xml");

        XmlReader xmlReader = XmlReader.Create(sr);
        while (xmlReader.Read())
        {
            if (xmlReader.HasValue)
            {
                OnXmlValue(this, new MessageEventArgs(true, xmlReader.Value));//saxContentHandler.Content(xmlReader.Value);
            }
            else
            {
                if (xmlReader.IsEmptyElement)
                {
                    OnStartElement(this, new MessageEventArgs(false, xmlReader.Name));
                    OnEndElement(this, new MessageEventArgs(false, xmlReader.Name));
                }
                else if (xmlReader.IsStartElement())
                {
                    OnStartElement(this, new MessageEventArgs(false, xmlReader.Name));
                }
                else
                {
                    OnEndElement(this, new MessageEventArgs(false, xmlReader.Name));
                }
            }
        }

but I am struggling to link an element name to its value, i.e. how can I know which price point I am currently reading, and whether it is in bids or offers? Thank you for your help.

Macin
  • What is the maximum accuracy of your price points? In your example they all have 1 decimal place, with 4 digits before the decimal point. Is that how it will be for all your price points? – Matt Ellen May 04 '11 at 10:09
  • How fast does fastest have to be? Seconds, milliseconds, microseconds? – Ishtar May 04 '11 at 10:28
  • I think that a maximum accuracy of 6 decimal places should be enough. As for speed and performance, there will be lots of similar messages per second, so we are talking milliseconds at least ;) – Macin May 04 '11 at 11:24
  • I would definitely consider programmer efficiency as well here. XmlReader is the lowest-level XML API in .NET, which all other XML APIs in .NET use under the hood. It's a streaming API, so it is the fastest. For programmer productivity I prefer LINQ to XML, i.e. XDocument. Anyway, for some raw numbers: https://www.altamiracorp.com/blog/employee-posts/performance-linq-to-sql-vs, http://blogs.msdn.com/b/codejunkie/archive/2008/10/08/xmldocument-vs-xelement-performance.aspx – nawfal Aug 19 '15 at 08:07

4 Answers

4

When you are using an event-based interface, similar to the one presented in your update, you will need to remember the name of the previous start element event. It is often worthwhile to keep a stack to track the events. I would probably do something similar to the following:

public class PriceLevel
{
    private decimal? bid = null;
    private decimal? offer = null;

    public decimal? Bid {
        get { return bid; }
        set { bid = value; }
    }

    public decimal? Offer {
        get { return offer; }
        set { offer = value; }
    }
}

public delegate void OnPriceChange(long instrumentId, Dictionary<decimal, PriceLevel> prices);

public class MainClass
{
    private Stack<String> xmlStack = new Stack<String>();
    private Dictionary<decimal, PriceLevel> prices = new Dictionary<decimal, PriceLevel>();
    private bool isBids = false;
    private decimal? currentPrice = null;
    private long instrumentId;
    private OnPriceChange _priceChangeCallback;

    // A constructor has no return type; "void" here would not compile.
    public MainClass(OnPriceChange priceChangeCallback) {
        this._priceChangeCallback = priceChangeCallback;
    }

    public void XmlStart(object source, MessageEventArgs args) {
        xmlStack.Push(args.Value);

        if (!isBids && "bids" == args.Value) {
            isBids = true;
        }
    }

    public void XmlEnd(object source, MessageEventArgs args) {
        xmlStack.Pop();

        if (isBids && "bids" == args.Value) {
            isBids = false;
        }

        // Finished parsing the orderBookEvent
        if ("orderBook" == args.Value) {
            _priceChangeCallback(instrumentId, prices);
        }
    }

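    public void XmlContent(object source, MessageEventArgs args) {
        // Ignore content reported before any element is open (e.g. the XML declaration);
        // Peek() on an empty stack would throw.
        if (xmlStack.Count == 0) {
            return;
        }

        switch (xmlStack.Peek()) {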
        case "instrumentId":
            instrumentId = long.Parse(args.Value);
            break;

        case "price":
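            // Parse with the invariant culture so "1357.1" reads the same on every machine.
            currentPrice = decimal.Parse(args.Value, System.Globalization.CultureInfo.InvariantCulture);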
            break;

        case "quantity":

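            if (currentPrice != null) {
                decimal quantity = decimal.Parse(args.Value, System.Globalization.CultureInfo.InvariantCulture);
                decimal key = currentPrice.Value; // unwrap the nullable before using it as a key

                // Create the level only when this price has not been seen yet.
                if (!prices.ContainsKey(key)) {
                    prices[key] = new PriceLevel();
                }
                PriceLevel priceLevel = prices[key];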

                if (isBids) {
                    priceLevel.Bid = quantity;
                } else {
                    priceLevel.Offer = quantity;
                }
            }
            break;
        }
    }
}
Michael Barker
2

First you need to get all the offers and all the bids:

XDocument xmlDoc = XDocument.Load("TestFile.xml");

// Select the <pricePoint> children, not the <bids>/<offers> containers themselves.
var bids = (from b in xmlDoc.Descendants("bids").Elements("pricePoint")
           select b).ToList();

var offers = (from o in xmlDoc.Descendants("offers").Elements("pricePoint")
           select o).ToList();

Then you just iterate through the bids and offers and add them to the dictionary. But as someone said before, you may have the problem that a price level will have both bid and offer set if a bid and an offer share the same price.

To iterate through the list, you just do this:

// dictionary is the Dictionary<double, PriceLevel> from the question.
foreach (XElement e in bids)
{
   // The explicit XElement conversions parse the element text.
   double price = (double)e.Element("price");
   int quantity = (int)e.Element("quantity");
   // A bid carries 0 as its offer quantity.
   dictionary.Add(price, new PriceLevel(quantity, 0));
}

You do the same for the offers, but again, you probably have to check whether the key already exists.

Ivan Crojach Karačić
0

First of all, I believe your method of putting the data into a dictionary would result in an error. If I am not wrong, a dictionary cannot contain the same key twice, so since you are using price as the key, there is a very high chance you will hit this issue.

I can't speak for the speed; you will have to test it. But so far XDocument runs fine for me.
Using XDocument, load the whole XML message into a variable, for instance:

XDocument doc = XDocument.Load(message);

With doc, you can use Linq to group them into bid and ask.

Once you achieve this, there should be no problem in presenting your data as you already got the price and separated them into bid and ask
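As a rough illustration of the grouping described above, here is a minimal sketch. It assumes the message arrives as a complete XML string (so it uses XDocument.Parse rather than XDocument.Load), uses decimal keys instead of the double from the question, and gives PriceLevel public fields; the OrderBookParser name is hypothetical. A bid and an offer at the same price are merged into one entry instead of causing a duplicate-key error.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Simplified stand-in for the PriceLevel class in the question (assumption: public fields).
public class PriceLevel
{
    public int Bid;
    public int Offer;
}

// Hypothetical helper name; not part of any library.
public static class OrderBookParser
{
    // Builds price -> (bid quantity, offer quantity) from one complete message.
    public static Dictionary<decimal, PriceLevel> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        var levels = new Dictionary<decimal, PriceLevel>();

        foreach (string side in new[] { "bids", "offers" })
        {
            foreach (XElement point in doc.Descendants(side).Elements("pricePoint"))
            {
                // The explicit XElement conversions throw if the child element is missing.
                decimal price = (decimal)point.Element("price");
                int quantity = (int)point.Element("quantity");

                // Reuse the existing level when this price was already seen.
                PriceLevel level;
                if (!levels.TryGetValue(price, out level))
                {
                    level = new PriceLevel();
                    levels[price] = level;
                }

                if (side == "bids") level.Bid = quantity;
                else level.Offer = quantity;
            }
        }
        return levels;
    }
}
```

Calling OrderBookParser.Parse(message) on the sample message from the question should give one entry per distinct price, with the side that is absent left at 0. This only works when the whole document is available; for a partial stream, an XmlReader-based approach is still needed.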

C_Rance
0

I managed to get something like this:

public void messageParser()
    {
        int i = 0;
        bool readingBids = false;
        bool readingOffers = false;
        decimal price=0;
        int qty = 0;

        StreamReader sr = new StreamReader("..\\..\\sampleResponse.xml");

        XmlReader xmlReader = XmlReader.Create(sr);
        DateTime startTime = DateTime.Now;
        while (xmlReader.Read())
        {
            #region reading bids
            if (xmlReader.IsStartElement("bids"))
            {
                readingBids = true; 
                readingOffers = false; 
            }

            if (xmlReader.NodeType == XmlNodeType.EndElement && xmlReader.Name == "bids")
            {
                readingBids = false;
                readingOffers = false;
            }

            if (readingBids == true)
            {
                if (xmlReader.IsStartElement("price"))
                    price = xmlReader.ReadElementContentAsDecimal();

                if (xmlReader.IsStartElement("quantity"))
                {
                    qty = xmlReader.ReadElementContentAsInt();
                    OnPricePointReceived(this, new MessageEventArgs(price, qty, "bid"));
                }
            }
            #endregion

            #region reading offers
            if (xmlReader.IsStartElement("offers"))
            { 
                readingBids = false; 
                readingOffers = true; 
            }

            if (xmlReader.NodeType == XmlNodeType.EndElement && xmlReader.Name == "offers")
            {
                readingBids = false;
                readingOffers = false;
            }

            if (readingOffers == true)
            {
                if (xmlReader.IsStartElement("price"))
                    price = xmlReader.ReadElementContentAsDecimal();

                if (xmlReader.IsStartElement("quantity"))
                {
                    qty = xmlReader.ReadElementContentAsInt();
                    OnPricePointReceived(this, new MessageEventArgs(price, qty, "offer"));
                }
            }
            #endregion
        }
        DateTime stopTime = DateTime.Now;
        Console.WriteLine("time: {0}",stopTime - startTime);
        Console.ReadKey();
    }
}

Is this a proper solution for the problem? I have some doubts regarding this piece of code:

 if (readingBids == true)
        {
            if (xmlReader.IsStartElement("price"))
                price = xmlReader.ReadElementContentAsDecimal();

            if (xmlReader.IsStartElement("quantity"))
            {
                qty = xmlReader.ReadElementContentAsInt();
                OnPricePointReceived(this, new MessageEventArgs(price, qty, "bid"));
            }
        }

I only fire the OnPricePointReceived event once I have managed to read both price and qty. However, it is possible that there will be no quantity for a given price. How can I implement validation to avoid errors caused by incomplete messages?
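One possible way to approach the validation, sketched under some assumptions: read each pricePoint through XmlReader.ReadSubtree, collect price and quantity as nullable values, and report the point only when both were found and parsed. The PricePointReader class and its tuple result are hypothetical stand-ins for the OnPricePointReceived event, and catching XmlException is just one way to treat input that breaks off mid-document.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;

// Hypothetical helper; returns tuples instead of raising OnPricePointReceived.
public static class PricePointReader
{
    // Returns (side, price, quantity) for each <pricePoint> that has both values.
    public static List<Tuple<string, decimal, int>> Read(TextReader input)
    {
        var points = new List<Tuple<string, decimal, int>>();
        string side = null; // "bids" or "offers" while inside one of them, otherwise null

        try
        {
            using (XmlReader reader = XmlReader.Create(input))
            {
                while (reader.Read())
                {
                    bool isSide = reader.Name == "bids" || reader.Name == "offers";

                    if (reader.NodeType == XmlNodeType.EndElement && isSide)
                    {
                        side = null;
                    }
                    else if (reader.NodeType == XmlNodeType.Element && isSide)
                    {
                        side = reader.IsEmptyElement ? null : reader.Name;
                    }
                    else if (reader.NodeType == XmlNodeType.Element
                             && reader.Name == "pricePoint" && side != null)
                    {
                        decimal? price = null;
                        int? quantity = null;

                        // The sub-reader sees only this <pricePoint>; closing it leaves
                        // the outer reader on the matching end tag.
                        using (XmlReader point = reader.ReadSubtree())
                        {
                            while (!point.EOF)
                            {
                                decimal p;
                                int q;
                                if (point.NodeType == XmlNodeType.Element && point.Name == "price")
                                {
                                    // ReadElementContentAsString also moves past </price>.
                                    if (decimal.TryParse(point.ReadElementContentAsString(),
                                            NumberStyles.Number, CultureInfo.InvariantCulture, out p))
                                        price = p;
                                }
                                else if (point.NodeType == XmlNodeType.Element && point.Name == "quantity")
                                {
                                    if (int.TryParse(point.ReadElementContentAsString(),
                                            NumberStyles.Integer, CultureInfo.InvariantCulture, out q))
                                        quantity = q;
                                }
                                else
                                {
                                    point.Read();
                                }
                            }
                        }

                        // Report the point only when both values were present and valid.
                        if (price.HasValue && quantity.HasValue)
                            points.Add(Tuple.Create(side, price.Value, quantity.Value));
                    }
                }
            }
        }
        catch (XmlException)
        {
            // Input ended or broke mid-document: keep the points that were complete.
        }
        return points;
    }
}
```

With this shape, a pricePoint that lacks a quantity (or whose text does not parse) is silently skipped; whether to skip, log, or raise a separate event in that case is a design choice that depends on what the feed guarantees.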

Macin