9

Given a populated SortedList<DateTime, double>. I would like to get all the keys (or their index range, what should be a closed int interval (missed I something?)) for a given low and high DateTime interval.

Note: The low and high values are not necessary are actually in the SortedList.


If anyone has better idea how to do this without a SortedList, here is a bit wider scope what I would like to do, and I found SortedList may be suitable:

  • I would like to cache doubles with a DateTime key.
  • Access performance to the double having the given key is preferable over the performance of adding keys an removing keys
  • Here is the thing: I must "invalidate" the cache for a given key range (delete the keys) Again there is no guarantee that the range min and max is exactly found in the cache.
Christoph Fink
  • 22,727
  • 9
  • 68
  • 113
g.pickardou
  • 32,346
  • 36
  • 123
  • 268
  • So your question is basically get a range of keys? – Sriram Sakthivel May 22 '14 at 12:07
  • 3
    Are you wedded to using `SortedList`? Could you perhaps use a `SortedSet>` and a comparer that compares by `DateTime`? Then you could use `GetViewBetween`. – Jon Skeet May 22 '14 at 12:07
  • @Skeet SortedSet is not a problem, just I was hoped that simple task can be accomplished with a "Sorted" thing like SortedList. I am afraid I was missing something about SortedList (never used) – g.pickardou May 22 '14 at 12:13
  • You could also binary search the `Keys` such that the "lower" value is passed back to you on a search miss. `Keys` supports efficient indexed access and does not require copying to materialize, so everything should go quite nicely. – Jon May 22 '14 at 12:13
  • Related: [Binary Search on Keys of SortedList](https://stackoverflow.com/questions/6101820/binary-search-on-keys-of-sortedlistk-v) – Theodor Zoulias Apr 25 '22 at 16:57

3 Answers3

5

I wondered why SortedList<TKey, TValue> doesn't provide a BinarySearch when it's already sorted by the keys. It also uses the method itself(f.e. in IndexOf), but the array used is a private field. So i've tried to create an extension method for this. Have a look:

public static class SortedListExtensions
{
    public static int BinarySearch<TKey, TValue>(this SortedList<TKey, TValue> sortedList, TKey keyToFind, IComparer<TKey> comparer = null)
    {
        TKey[] keyArray = sortedList.GetKeyArray();
        if (comparer == null) comparer = Comparer<TKey>.Default;
        int index = Array.BinarySearch<TKey>(keyArray, keyToFind, comparer);
        return index;
    }

    public static IEnumerable<TKey> GetKeyRangeBetween<TKey, TValue>(this SortedList<TKey, TValue> sortedList, TKey low, TKey high, IComparer<TKey> comparer = null)
    {
        int lowIndex = sortedList.BinarySearch(low, comparer);
        if (lowIndex < 0)
        {
            // list doesn't contain the key, find nearest behind
            // If not found, BinarySearch returns the complement of the index
            lowIndex = ~lowIndex;
        }

        int highIndex = sortedList.BinarySearch(high, comparer);
        if (highIndex < 0)
        {
            // list doesn't contain the key, find nearest before
            // If not found, BinarySearch returns the complement of the index
            highIndex = ~highIndex - 1;
        }

        IList<TKey> keys = sortedList.Keys;
        for (int i = lowIndex; i < highIndex; i++)
        {
            yield return keys[i];
        }
    }
    
    private static TKey[] GetKeyArray<TKey, TValue>(this SortedList<TKey, TValue> sortedList)
    {
        // trying to resolve array with reflection because SortedList.keys is a private array
        Type type = typeof(SortedList<TKey, TValue>);
        FieldInfo keyField = type.GetField("keys", BindingFlags.NonPublic | BindingFlags.Instance);
        if(keyField != null && keyField.GetValue(sortedList) is TKey[] keyArrayFromReflection)
        {
            return keyArrayFromReflection;      
        }
        
        // fallback: fill a new array from the public Keys property, you might want to log this since you should change the reflection implementation
        IList<TKey> keyList = sortedList.Keys;
        TKey[] keyArray = new TKey[keyList.Count];
        for (int i = 0; i < keyArray.Length; i++)
            keyArray[i] = keyList[i];
        return keyArray;
    }
}

Create a sample SortedList:

DateTime start = DateTime.Today.AddDays(-50);
var sortedList = new SortedList<DateTime, string>();
for(int i = 0; i < 50; i+=2)
{
    var dt = start.AddDays(i);
    sortedList.Add(dt, string.Format("Date #{0}: {1}", i, dt.ToShortDateString()));
}

DateTime low = start.AddDays(1);   // is not in the SortedList which contains only every second day
DateTime high = start.AddDays(10);

Now you can use the extension method to get the range of keys between a low- and high-key:

IEnumerable<DateTime> dateRange = sortedList.GetKeyRangeBetween(low, high).ToList();

Result:

04/04/2014
04/06/2014
04/08/2014
04/10/2014

Note that this is built from scratch and not really tested.

Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • 1
    Hi @Tim, the moment you added the for loop to copy the contents to new array, it became O(n) instead of O(log n) for binary search. Is there any way to avoid this copy? – Kiran Nov 03 '18 at 14:42
  • @Kiran: better late answer than never. Changed the extension little bit to try to resolve the `TKey[]` with reflection and fill that new array only as fallback. – Tim Schmelter May 13 '21 at 09:21
3

As the list is sorted you can use binary search to locate the endpoints of your interval. Worst case performance will be O(log n).

Jonas Jämtberg
  • 573
  • 2
  • 6
  • 2
    The endpoints are not necessary in the list. It will not work. – g.pickardou May 22 '14 at 12:15
  • The worst case would be that all the keys in the list are equal, which would degenerate to O(n) unless the binary search is very carefully optimized to find the ends of the range in logarithmic time. – Jon May 22 '14 at 12:15
  • 1
    `SortedList.Keys` is an `IList` which is neither a `List` nor an `DateTime[]`, so how do you use `BinarySearch`? – Tim Schmelter May 22 '14 at 12:16
  • @g.pickardou: It can certainly be made to work -- when binary search misses, it has effectively found the position where the search target would be *if it existed*. Consider than in C++, [`equal_range`](http://www.cplusplus.com/reference/algorithm/equal_range/), which does exactly what you want, is logarithmic. – Jon May 22 '14 at 12:18
  • @TimSchmelter We can implement our own `BinarySearch` or a hack: use reflection to read the private array field. – Sriram Sakthivel May 22 '14 at 12:19
  • Yes, I understand now: _I_ have to implement the bin search. – g.pickardou May 22 '14 at 12:23
  • @TimSchmelter I dont see your issue. – Jonas Jämtberg May 22 '14 at 12:25
  • @JonasJämtberg: the issue is that you cannot use `BinarySearch` on a `SortedList` because it doesn't support it. It uses it itself(for example in `IndexOf`), but the array used is private as Siriam has mentioned. Reflection is not an option. – Tim Schmelter May 22 '14 at 12:26
  • @TimSchmelter I'm talking about implementing a binary search for the purpose. – Jonas Jämtberg May 22 '14 at 12:27
  • 1
    btw do not take this as a complaining, but I can not believe I have to implement a binary search in _2014_ over a sorted thing just to get an item what is <= or >= – g.pickardou May 22 '14 at 12:29
  • @g.pickardou I'm sure you can find a library with somewhere that will do it for you. However, implementing the algorithm is so simple that it is probably the quickest option. – Jonas Jämtberg May 22 '14 at 13:21
2

You can solve the problem by running an adapted binary search on the Keys twice to find the indexes that bound the range of interest in the Keys collection.

Since IList<T> does not offer binary search facilities you need to write your own. Fortunately, there is also the option of stealing the ready-made implementation from How to perform a binary search on IList.

Here's an adapted version to find the lower bound:

public static int LowerBound<T>(this IList<T> list, T value, IComparer<T> comparer = null)
{
    if (list == null)
        throw new ArgumentNullException("list");

    comparer = comparer ?? Comparer<T>.Default;

    int lower = 0, upper = list.Count - 1;

    while (lower <= upper)
    {
        int middle = lower + (upper - lower) / 2;
        int comparisonResult = comparer.Compare(value, list[middle]);

        // slightly adapted here
        if (comparisonResult <= 0)
            upper = middle - 1;
        else
            lower = middle + 1;
    }

    return lower;
}

To implement UpperBound, simply change

if (comparisonResult <= 0)

to

if (comparisonResult < 0)

It's now trivial to do this:

var low = set.Keys.LowerBound(value);
var high = set.Keys.UpperBound(value);

// These extra comparisons are required because the adapted binary search
// does not tell us if it actually found the needle. They could be rolled
// into the methods themselves, but this would require another out parameter.
if (set.Keys[low] != value) ++low;
if (set.Keys[high] != value) --high;

if (low <= high) /* remove keys in the range [low, high] */
Community
  • 1
  • 1
Jon
  • 428,835
  • 81
  • 738
  • 806