
A certain API call returns two arrays. One array contains property names, e.g.:

Properties[0] = "Heartbeat"
Properties[1] = "Heartbeat 2"
Properties[2] = "Some Other discovery method"

The other array contains the values for the Properties array, e.g.:

Values[0] = "22/01/2007"
Values[1] = "23/02/2007"
Values[2] = "5/06/2008"

The values and properties array elements match up, e.g. Values[0] is always the value of Properties[0], etc.

My aim is to get the most recent "Heartbeat*" value. Note that the Heartbeat properties and values are not always at the same positions in the arrays, so I need to search through them.

Code looks like this:

static DateTime GetLatestHeartBeat(string[] Properties, string[] Values)
{
    DateTime latestDateTime = new DateTime();
    for (int i = 0; i < Properties.Length; i++)
    {
        if (Properties[i].Contains("Heart"))
        {
            DateTime currentDateTime;
            DateTime.TryParse(Values[i], out currentDateTime);
            if (currentDateTime > latestDateTime)
                latestDateTime = currentDateTime;
        }
    }
    return latestDateTime;
}

The above code gives me the desired result; the only issue is that the loop continues even after there are no more Heartbeat values to find. Is there a more efficient way of performing the above?

svick
    If performance is critical, storing these as a custom class instead of 2 strings will help a ton. Parsing DateTime in a loop is pretty slow... – Reed Copsey Jul 07 '14 at 23:32
  • @Reed Copsey: just a minor note: "Parsing DateTime in a loop is pretty slow" --- this phrase may give a newbie the false sense that it is the loop that makes it slow (hence they must avoid loops). Instead I'd say that performing it multiple times is what makes it slow, regardless of whether it's done in a loop or just copied a thousand times. – zerkms Jul 07 '14 at 23:37
  • Appreciate the comments. Performance is critical as the code will be searching 100s of thousands of these property/value sets. Wouldn't storing them as a custom class add the overhead of having to convert the two arrays into my custom class? – Mohamed El masri Jul 07 '14 at 23:42
  • In essence my code is creating a custom class with the `new` keyword. You could do the same with a new custom class type and set the properties just as I set `p` and `v`. The `new` constructor seems very clean in this case however. – crthompson Jul 07 '14 at 23:56
  • 1
    Nothing against Paqogomez' answer, but if performance is critical as you said in your comment, why did you flag his answer (which should execute slower than the code given in your question)? Note, that Paqogomez answer is correct in the context of your question as it is more "efficient" in the sense that it is shorter and thus better readable/maintainable code. Thus, it deserves the answer flag. However, i can't shake the feeling that you do not really understand the performance costs of the code in that answer (i.e., the costs of creating dynamic objects and doing an OrderByDescending) –  Jul 08 '14 at 00:23
  • I ran a couple of tests throwing a few thousand property/result sets at each, and Paqogomez' code ran faster, not significantly but still faster. I ran it again just then and now it is slower. I'm beginning to think that the bottleneck is not within my code but within the API from which I'm grabbing the arrays. For my own learning, I'll unmark Paqogomez' answer and see if anyone is able to come up with a 'faster' and not necessarily shorter/more maintainable solution. – Mohamed El masri Jul 08 '14 at 00:30
  • Don't do such quick tests, they produce meaningless results... I once wrote something about "you don't measure what you think you are measuring" ([see here](http://stackoverflow.com/questions/16471759/is-bitarray-faster-in-c-sharp-for-getting-a-bit-value-than-a-simple-conjuction-w/19027941#19027941)). Most likely your test results were skewed by .NET's lazy type loading/initialization... –  Jul 08 '14 at 00:31
  • Hmm, if you call GetLatestHeartBeat only once for any given properties/values array, there is not much faster you can do (unless you use an API which provides the data in other/better data structures)... There might be a marginal performance gain if you replace `Properties[i].Contains("Heart")` with `Properties[i].StartsWith("Heart")` (although I don't know if this is feasible for you), but this is already entering the domain of evil premature micro-optimizations... –  Jul 08 '14 at 00:41
  • However, by any chance does the API perhaps guarantee that the property names in the array are in a certain (alphabetical) order? –  Jul 08 '14 at 00:44
  • elgonzo, the API returns an unsorted array of property names, unfortunately. – Mohamed El masri Jul 08 '14 at 00:55
  • Look at Chad's answer. I was stupid not thinking about Parallel (still stuck in the old pre-.NET 4.5 world, I guess :)). Although you might perhaps logically partition your array into bigger chunks (a few dozen elements at least) instead of creating a task for every single element, I guess, but I might be wrong about this hunch of mine... –  Jul 08 '14 at 00:55

2 Answers


While this doesn't address performance, I would optimize the query like this:

var latestDateTime = Properties.Select((p, index) =>
                                   new { p, v = DateTime.Parse(Values[index]) })
                               .Where(e => e.p.Contains("Heart"))
                               .OrderByDescending(e => e.v).First();

Perhaps moving the parse after the Where clause would limit the number of times it parses:

var latestDateTime = Properties.Select((p, index) =>
                                   new { p, v = Values[index] })
                               .Where(e => e.p.Contains("Heart"))
                               .Select(e => DateTime.Parse(e.v))
                               .Max();

EDIT: Per @dbc's comments, changed `.OrderByDescending(e => e.v).First()` to `.Max()`.
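If some values might not parse (the original question uses TryParse rather than Parse), the same query can be made tolerant of bad input. A sketch, with sample data assumed for illustration (ISO date strings are used here because plain TryParse of "22/01/2007" is culture-sensitive; unparseable entries fall back to DateTime.MinValue):

```csharp
using System;
using System.Linq;

// Hypothetical sample data mirroring the question's arrays.
string[] Properties = { "Heartbeat", "Heartbeat 2", "Some Other discovery method" };
string[] Values = { "2007-01-22", "2007-02-23", "not a date" };

var latestDateTime = Properties
    .Select((p, index) => new { p, v = Values[index] })
    .Where(e => e.p.Contains("Heart"))
    .Select(e =>
    {
        // TryParse never throws; bad input becomes DateTime.MinValue
        DateTime d;
        return DateTime.TryParse(e.v, out d) ? d : DateTime.MinValue;
    })
    .Max();
```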

crthompson
  • 1
    Sorting the entire array via "OrderByDescending" just to compute the largest value could hurt performance. Suggest using [Max](http://msdn.microsoft.com/en-us/library/vstudio/system.linq.enumerable.max%28v=vs.100%29.aspx) instead. – dbc Jul 08 '14 at 02:52

I'd find the indices that contain "Heart" (or another keyword) in a parallel for loop to speed it up, then iterate over those indices to find the latest one.

    // Requires: using System.Collections.Concurrent; using System.Threading.Tasks;
    static DateTime GetLatestHeartBeat(string[] props, string[] vals)
    {
        ConcurrentBag<int> heartIndxs = new ConcurrentBag<int>();
        // find the indices of "Heart" in parallel
        Parallel.For(0, props.Length,
            index =>
            {
                if (props[index].Contains("Heart"))
                {
                    heartIndxs.Add(index);
                }
            });
        // loop over each heart index to find the latest one
        DateTime latestDateTime = new DateTime();
        foreach (int i in heartIndxs)
        {
            DateTime currentDateTime;
            if (DateTime.TryParse(vals[i], out currentDateTime) && (currentDateTime > latestDateTime))
                latestDateTime = currentDateTime;
        }

        return latestDateTime;
    }

If DateTime.TryParse is really too slow, you could use a Regex to parse the date string and do your own comparison. Honestly, I'm not sure that would be faster than DateTime.TryParse. Here is a discussion of that topic: Which is Quicker: DateTime.TryParse or Regex
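Another option worth mentioning: if the feed is guaranteed to use one fixed format (the sample values "22/01/2007" and "5/06/2008" suggest day/month/year, though that is an assumption), DateTime.TryParseExact skips the format and culture probing that plain TryParse performs, which is usually cheaper. A sketch:

```csharp
using System;
using System.Globalization;

// Assumption: every value arrives as day/month/year with a 1- or 2-digit day,
// e.g. "22/01/2007" or "5/06/2008" ("d" in the pattern accepts both).
DateTime d;
bool ok = DateTime.TryParseExact("5/06/2008", "d/MM/yyyy",
    CultureInfo.InvariantCulture, DateTimeStyles.None, out d);
```

Using InvariantCulture pins the interpretation of "/" and the field order, so results do not change with the machine's regional settings.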

Chad Dienhart
  • Don't you think changing his code as follows makes more sense? `if (DateTime.TryParse(vals[i], out currentDateTime) && (currentDateTime > latestDateTime))` ... – Johnny Jul 08 '14 at 01:27
  • Yes, that would speed it up if there are cases in which the TryParse fails. – Chad Dienhart Jul 08 '14 at 01:37