1

I want to serialize the Date portion of a DateTime into 4 bytes (could be an Int32). What is the fastest way to do so?

Background: to serialize a full DateTime I have been using the ToBinary method so far. It returns an Int64 that I'm storing elsewhere. Now I have the requirement to store only the Date part of the DateTime, using only half the space. So I was wondering how to achieve this in the fastest way, as performance is crucial.

Options that come to mind are:

  • Encode Year, Month, Day into an int as YYYYMMDD by using some multiplications and property accesses, with the nice side effect that this encoding is human-readable.
  • Keep using ToBinary and keep only "the upper or lower half" of the returned long. I don't know if that is possible.
  • Check how DateTime values are stored internally. Maybe the date portion can be accessed in other ways.
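
For reference, the first option could look roughly like the following. This is only a minimal sketch: the class name `YyyyMmDd` and its method names are made up for illustration, and nothing here has been benchmarked.

```csharp
using System;

// Hypothetical helper illustrating the YYYYMMDD encoding (names are illustrative).
static class YyyyMmDd
{
    // Pack the date part as the decimal number YYYYMMDD, e.g. 20150514.
    public static int Encode(DateTime value) =>
        value.Year * 10000 + value.Month * 100 + value.Day;

    // Unpack with integer division and remainder; the time of day comes back as midnight.
    public static DateTime Decode(int packed) =>
        new DateTime(packed / 10000, packed / 100 % 100, packed % 100);
}
```

With this, `YyyyMmDd.Encode(new DateTime(2015, 5, 14, 17, 28, 0))` gives 20150514, and decoding that value gives back May 14, 2015 at 00:00.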

How would you do it?

Dejan
  • Not sure if this is a trick question as a date could be easily represented in an Int32 as YYYYMMDD, e.g. 20150514 etc. – user469104 May 14 '15 at 17:28
  • @user469104: thx. Of course. Still, is this the fastest way? It would require accessing three props and doing some calculation... – Dejan May 14 '15 at 17:33
  • You'd probably need to benchmark whichever alternatives you are looking at. Do each a million times in a loop and see which one comes out the fastest. – user469104 May 14 '15 at 17:35
  • @KenWhite What `DateTime` are you talking about? The `System.DateTime` type doesn't work like that at all... – Jason Watkins May 14 '15 at 17:35
  • Use `DateTime.ToOADate`, which converts it to a COM-compatible datetime (a double), convert to int32 (truncate), and you have 4 bytes (Int32). Coming back, assign the int32 to a double, and use `DateTime.FromOADate` to convert back to `DateTime`. – Ken White May 14 '15 at 17:36
  • @Jason: Yep, was thinking of COM `DateTime`, which is why I deleted that comment. See my new comment. :-) – Ken White May 14 '15 at 17:37
  • @KenWhite - OADates have some weird logic for compatibility issues. I'd avoid them if possible. See [the reference source](http://referencesource.microsoft.com/#mscorlib/system/datetime.cs,1117). – Matt Johnson-Pint May 14 '15 at 18:02
  • @Dejan - You may also be interested in [my pending additions of `System.Date` and `System.TimeOfDay`](https://github.com/mj1856/corefx-dateandtime) to the .NET framework. For this task, I basically use the approach Jason Watkins suggested in his answer. – Matt Johnson-Pint May 14 '15 at 18:05
  • @MattJohnson: oh, I would love to see System.Date! – Dejan May 14 '15 at 18:07
  • Also, if you're looking for something usable today, consider the `LocalDate` type in [Noda Time](http://nodatime.org). – Matt Johnson-Pint May 14 '15 at 18:08

2 Answers

8

DateTimes are stored as the number of ticks since 00:00:00, January 1, 0001. So if you take a DateTime's Ticks and divide it by TimeSpan.TicksPerDay, you end up with the number of days since January 1, 0001.

Reverse the operation (multiply by TimeSpan.TicksPerDay) when deserializing.

Since there are "only" 3,652,058 days in the range covered by DateTime, this will easily fit in an Int32.

To Serialize:

System.DateTime toSerialize = System.DateTime.Today; // example value; use your own DateTime here
long longDays = toSerialize.Ticks / System.TimeSpan.TicksPerDay;

// Safe since (DateTime.MaxValue - DateTime.MinValue).Days is far below Int32.MaxValue
int days = (int)longDays;
// Serialize `days` however you would serialize any other int

To Deserialize:

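int days = 0; // placeholder; use the value read back from storage
long ticks = days * System.TimeSpan.TicksPerDay; // int * long is computed as long, so no overflow
System.DateTime deserialized = new System.DateTime(ticks);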
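
The two snippets above can be wrapped into a pair of helpers, roughly as sketched below. The class name `DayNumber` and its method names are hypothetical. Note that the round trip drops both the time of day and the `Kind` of the original value: the result is always midnight with `DateTimeKind.Unspecified`.

```csharp
using System;

// Hypothetical helper wrapping the day-number approach (names are illustrative).
static class DayNumber
{
    // Whole days since 0001-01-01; Ticks does not include the Kind flag bits.
    public static int FromDateTime(DateTime value) =>
        (int)(value.Ticks / TimeSpan.TicksPerDay);

    // days * TicksPerDay is evaluated as long because TicksPerDay is a long constant.
    public static DateTime ToDateTime(int days) =>
        new DateTime(days * TimeSpan.TicksPerDay);
}
```

For example, `DayNumber.FromDateTime(new DateTime(2015, 5, 14, 17, 28, 0))` is 735731, and `DayNumber.FromDateTime(DateTime.MaxValue)` is 3652058, the largest value this encoding can produce.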
Jason Watkins
2

Any year supported by .NET's DateTime (0001-9999) fits in a short, which is 2 bytes. Month and day take one byte each.

So, you can simply create a custom struct with all three fields.

public struct Date
{
    public readonly short Year;//2 byte
    public readonly byte Month;//1 byte
    public readonly byte Day;  //1 byte
    ...
}

You can serialize this struct, which takes 4 bytes (assuming no padding). Since the struct is blittable, you can store its byte representation directly.
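
To make the "no padding" assumption explicit, the layout can be pinned with `StructLayout`. The following is only a sketch: `PackedDate` is a hypothetical stand-in for the struct above, and its constructor and `ToDateTime` method are illustrative additions, not part of the original definition.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical variant of the struct with an explicit packed layout.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedDate
{
    public readonly short Year;  // 2 bytes
    public readonly byte Month;  // 1 byte
    public readonly byte Day;    // 1 byte

    public PackedDate(DateTime value)
    {
        Year = (short)value.Year;
        Month = (byte)value.Month;
        Day = (byte)value.Day;
    }

    // Rebuilds a DateTime at midnight from the three stored fields.
    public DateTime ToDateTime() => new DateTime(Year, Month, Day);
}
```

`Marshal.SizeOf<PackedDate>()` should report 4 for this layout. Keep in mind that the raw bytes (and any int obtained by reinterpreting them) depend on the machine's byte order, so the stored form is only portable between machines with the same endianness.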

If you can use unsafe code, the conversion can be done very fast. Otherwise, use one of the more idiomatic .NET approaches.

private unsafe int DateToInt(Date date)
{
    int* d = (int*)&date;
    return *d;
}

private unsafe Date IntToDate(int date)
{
    Date* d = (Date*)(&date);
    return *d;
}

Or you can use BitVector32, which is designed specifically for this kind of thing.

private static BitVector32.Section yearSection = BitVector32.CreateSection(9999);
private static BitVector32.Section monthSection = BitVector32.CreateSection(12, yearSection);
private static BitVector32.Section daySection = BitVector32.CreateSection(31, monthSection);

private int DateToInt(DateTime date)
{
    BitVector32 bv = new BitVector32(0);
    bv[yearSection] = date.Year;
    bv[monthSection] = date.Month;
    bv[daySection] = date.Day;
    return bv.Data;
}

private DateTime IntToDate(int date)
{
    BitVector32 bv = new BitVector32(date);
    return new DateTime(bv[yearSection], bv[monthSection], bv[daySection]);
}

private void TestDateSerialization()
{
    DateTime date = new DateTime(2009, 6, 24);  
    int serialized = DateToInt(date);

    DateTime deserialized = IntToDate(serialized);
}
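
The same idea can also be written with plain shifts and masks, without BitVector32. This is a sketch of a different, assumed bit layout (year in the high bits, so that the resulting ints sort chronologically), not a reproduction of the layout BitVector32 produces. The class name `BitPackedDate` is hypothetical.

```csharp
using System;

// Hypothetical hand-rolled bit packing (names and layout are illustrative).
static class BitPackedDate
{
    // Assumed layout: bits 9 and up hold the year, bits 5-8 the month, bits 0-4 the day.
    public static int Encode(DateTime value) =>
        (value.Year << 9) | (value.Month << 5) | value.Day;

    // Shift each field back down and mask off the bits belonging to other fields.
    public static DateTime Decode(int packed) =>
        new DateTime(packed >> 9, (packed >> 5) & 0xF, packed & 0x1F);
}
```

Because the year occupies the most significant bits, comparing two encoded ints gives the same ordering as comparing the dates themselves.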
Sriram Sakthivel
  • Have you actually tried out the unsafe version? Not sure if it's correct. – Dejan May 14 '15 at 19:11
  • You could get the Struct down to one byte. Make it an unsigned byte and have it be an offset from the earliest domain date. – t3dodson May 14 '15 at 19:15
  • @SriramSakthivel: sorry. My confusion was that I thought you are passing the pointer to a `DateTime` but you were referring to the `Date` struct you've proposed. – Dejan May 15 '15 at 07:26
  • @SriramSakthivel: still, your proposition uses many more lines of code and proves to be a factor of 10 slower than dividing the Ticks property. That's unless I made a mistake when measuring the time. Here's the gist: https://gist.github.com/dradovic/b5578a4bcb1e9d182cc9 – Dejan May 15 '15 at 07:39
  • I'm proposing options. I didn't claim these would be the fastest ones. Measure it yourself and choose the right one for you. Maybe my answer could help someone in the future with their use case. Also, I don't have to compare other answers and time them to find which is better, because I'm not going to use it. – Sriram Sakthivel May 15 '15 at 08:03