1

I would like your opinion on the best way to manage time series in C#. I need a two-dimensional, matrix-like structure with DateTime objects as the row index (ordered and without duplicates), where each column holds the stock values for the corresponding DateTime. I would also like to know whether any existing type can handle missing data for a date: adding a column or a time series should add any missing dates to the row index and fill in "null" or "N/A" for values that are missing at existing dates.

A lot more is already available in C# than in C++, and I don't want to miss something obvious.

gdoron
pam
  • 2
    Are you concerned about memory usage? Random retrieval speed? Traversal speed? What is your major concern? – zmbq Jan 29 '12 at 11:26

5 Answers

4

TeaFiles.Net is a library for storing time series in flat files. As I understand it, you only want to keep the data in memory, in which case you can create a MemoryStream and pass it to the Create method.

using System;
using System.IO;
using TeaTime; // namespace of the TeaFiles.Net library

// the time series item type
struct Tick
{
    public DateTime Time;
    public double Price;
    public int Volume;
}

// create file and write some values
var ms = new MemoryStream();
using (var tf = TeaFile<Tick>.Create(ms))
{
    tf.Write(new Tick { Price = 5, Time = DateTime.Now, Volume = 700 });
    tf.Write(new Tick { Price = 15, Time = DateTime.Now.AddHours(1), Volume = 1700 });
    // ...
}

ms.Position = 0; // reset the stream

// read typed
using (var tf = TeaFile<Tick>.OpenRead(ms))
{
    Tick value = tf.Read();
    Console.WriteLine(value);
}

https://github.com/discretelogics/TeaFiles.Net

You can install the library through the NuGet Package Manager (package "TeaFiles.Net").
A VSIX sample project is also available in the VS Gallery.

citykid
4

You could use a mapping between the date and the stock value, such as Dictionary<DateTime, decimal>. This way the dates can be sparse.

If you need the prices of multiple stocks at each date, and not every stock appears for every date, then you could choose between Dictionary<DateTime, Dictionary<Stock, decimal>> and Dictionary<Stock, Dictionary<DateTime, decimal>>, depending on how you want to access the values afterwards (or even both if you don't mind storing the values twice).
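
A minimal sketch of the date-first layout, assuming stocks are identified by a string symbol. PriceTable and its members are hypothetical names used only for illustration, and a SortedDictionary is used in place of the outer Dictionary so that the dates stay ordered. A missing cell is reported as null:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of any library): a date-indexed table whose
// columns are stock symbols. Cells without a value are reported as null.
public class PriceTable
{
    // SortedDictionary keeps the dates ordered and free of duplicates.
    private readonly SortedDictionary<DateTime, Dictionary<string, decimal>> rows =
        new SortedDictionary<DateTime, Dictionary<string, decimal>>();
    private readonly SortedSet<string> symbols = new SortedSet<string>();

    public IEnumerable<DateTime> Dates => rows.Keys;
    public IEnumerable<string> Symbols => symbols;

    // Adding a series creates rows for dates the table has not seen yet.
    public void AddSeries(string symbol, IEnumerable<KeyValuePair<DateTime, decimal>> series)
    {
        symbols.Add(symbol);
        foreach (var point in series)
        {
            if (!rows.TryGetValue(point.Key, out var row))
            {
                row = new Dictionary<string, decimal>();
                rows.Add(point.Key, row);
            }
            row[symbol] = point.Value;
        }
    }

    // Returns null when the symbol has no value at that date.
    public decimal? Get(DateTime date, string symbol)
    {
        if (rows.TryGetValue(date, out var row) && row.TryGetValue(symbol, out var price))
            return price;
        return null;
    }
}
```

With this layout, adding a second series whose dates only partly overlap the first extends the row index automatically, and Get returns null for every (date, symbol) pair that was never supplied.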

Matthew Strawbridge
  • This might be a valid solution. – pam Jan 29 '12 at 17:58
  • 2
    Beware: Dictionary does not guarantee order preservation. If you only load the data once you should be fine (with the current implementation), but otherwise the order could change! Better use OrderedDictionary instead. – Erwin Mayer Dec 27 '13 at 19:46
1

DateTime in C# is a value type, which means it is initialized to its default value. That default is DateTime.MinValue: year 1, month 1, day 1, at 00:00:00 (midnight, which is displayed as 12:00:00 AM in 12-hour formats).
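
The default value can be checked directly. A minimal sketch (DefaultDateTimeCheck is a hypothetical name used only for illustration):

```csharp
using System;

public static class DefaultDateTimeCheck
{
    // Returns true when an uninitialized DateTime equals DateTime.MinValue.
    public static bool DefaultIsMinValue()
    {
        DateTime d = default(DateTime);
        // Year 1, month 1, day 1, and a time of day of exactly zero.
        return d == DateTime.MinValue
            && d.Year == 1 && d.Month == 1 && d.Day == 1
            && d.TimeOfDay == TimeSpan.Zero;
    }
}
```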

If I understood you correctly, you need a data structure that holds DateTime objects in some order, and when you insert a new object, the adjacent DateTime objects should change so that the order is retained.

In this case I would focus more on the data structure than on the DateTime object.

Write a simple class that inherits from List<>, for example, and add the functionality you want to run on an insert or delete operation.

Something like:

using System;
using System.Collections.Generic;

public class DateTimeList : List<DateTime>
{
    public void InsertDateTime(int position, DateTime dateTime)
    {
        // Insert the new object (List<T> provides Insert, not InsertAt).
        Insert(position, dateTime);

        // Then take the adjacent objects, checking that each index exists.
        if (position > 0)
        {
            DateTime previous = this[position - 1];
            // Modify the previous DateTime according to your needs. DateTime is
            // a value type, so write the result back: this[position - 1] = ...;
        }

        if (position + 1 < Count)
        {
            DateTime next = this[position + 1];
            // Modify the next DateTime according to your needs and write it
            // back in the same way: this[position + 1] = ...;
        }
    }
}
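
If the goal is only to keep the list ordered and free of duplicates, one possible alternative is to let a binary search choose the position instead of passing it in. This is a sketch under that assumption, and SortedDateInsert is a hypothetical name:

```csharp
using System;
using System.Collections.Generic;

public static class SortedDateInsert
{
    // Inserts 'date' so that 'dates' stays sorted in ascending order.
    // Returns false (and changes nothing) when the date is already present.
    // Assumes 'dates' is already sorted when this method is called.
    public static bool InsertSorted(List<DateTime> dates, DateTime date)
    {
        int index = dates.BinarySearch(date);
        if (index >= 0)
            return false; // duplicate

        // For a missing item, BinarySearch returns the bitwise complement
        // of the index at which the item should be inserted.
        dates.Insert(~index, date);
        return true;
    }
}
```

The search itself is O(log n), but List<T>.Insert still shifts the elements after the insertion point, so inserting in the middle remains O(n).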
marc wellman
  • That is close to what I am looking for. Is there anything in the SortedList object we could use? Thanks – pam Jan 29 '12 at 12:28
  • A SortedList is managed using an array. Adding elements to the middle of an array takes time (O(N)). If you do a lot of element adding, this can be a problem. Is performance any kind of concern here? – zmbq Jan 29 '12 at 13:19
  • Yes I do. But what do you suggest? – pam Jan 29 '12 at 13:41
0

There is a time series library called TimeFlow, which allows smart creation and handling of time series.

The central TimeSeries class knows its time zone and is internally based on a sorted list of DateTimeOffset/decimal pairs with a specific frequency (minute, hour, day, month, or even custom periods). The frequency can be changed by resample operations (e.g. hours -> days). It is also possible to combine time series using the standard operators (+, -, *, /) or advanced join operations using custom methods.

Furthermore, the TimeFrame class combines multiple time series of the same time zone and frequency (similar to Python's DataFrame, but restricted to time series) for easier access.

Additionally, there is the great TimeFlow.Reporting library, which provides advanced reporting and visualization (currently Excel and WPF) of time frames.

Disclaimer: I am the creator of these libraries.

JanDotNet
0

As you mentioned in your comment to Marc's answer, I believe the SortedList is a more appropriate structure to hold your time series data.

UPDATE

As zmbq mentioned in his comment on Marc's answer, SortedList is implemented as an array, so if faster insertion/removal times are needed, SortedDictionary would be a better choice.

See Jon Skeet's answer to this question for an overview of the performance differences.
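
Both collections implement IDictionary<TKey, TValue>, so code written against that interface can switch between them. A minimal sketch with made-up dates and prices (SortedSeriesDemo and its methods are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

public static class SortedSeriesDemo
{
    // Works with SortedList<DateTime, decimal> and SortedDictionary<DateTime, decimal> alike.
    public static void Fill(IDictionary<DateTime, decimal> series)
    {
        // Inserted out of order on purpose; both collections sort by key.
        series[new DateTime(2012, 1, 3)] = 12.5m;
        series[new DateTime(2012, 1, 1)] = 10.0m;
        series[new DateTime(2012, 1, 2)] = 11.0m;
    }

    // For these two collection types, enumeration is in ascending key order,
    // so the first pair holds the earliest date.
    public static DateTime FirstDate(IDictionary<DateTime, decimal> series)
    {
        foreach (var pair in series)
            return pair.Key;
        throw new InvalidOperationException("The series is empty.");
    }
}
```

SortedList additionally allows access by position (for example list.Keys[0]), which SortedDictionary does not. If the data is mostly appended in chronological order, the O(n) cost of inserting into the middle of a SortedList rarely applies.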

Community
SimonC
  • 6,590
  • 1
  • 23
  • 40
  • But is it a good structure for handling time series with multiple columns? I don't know if you are familiar with the R language, but I am trying to recreate some kind of "zoo" object. – pam Jan 29 '12 at 13:19
  • How sparse is the data? If you expect a value for each column at each time point, then why not create a class to hold the data at each point? If not, you could use one list per column. Sorry, I'm not familiar with R. Which of its features do you need? – SimonC Jan 29 '12 at 14:19
  • Yes, I might use one list per column. Zoo objects are made to handle time series. The index of the table will be unified across all stocks, and missing values will be handled. – pam Jan 29 '12 at 17:07
  • I might use a sorted dictionary whose TValue is a class storing all the stock values for a given date, but that doesn't make a lot of sense, does it? – pam Jan 29 '12 at 17:14