1

Given a set of date/times in Excel (formatted to maintain and display milliseconds)

2020-01-01 12:00:00.500 AM
2020-01-01 12:00:01.500 AM
2020-01-01 12:00:02.500 AM
2020-01-01 12:00:03.500 AM
2020-01-01 12:00:04.500 AM
2020-01-01 12:00:05.500 AM
2020-01-01 12:00:06.500 AM
2020-01-01 12:00:07.500 AM
2020-01-01 12:00:08.500 AM
2020-01-01 12:00:09.500 AM
2020-01-01 12:00:10.500 AM

Excel rounds these when displayed (or collected by a Pivot Table) in this way

2020-01-01 12:00:00 AM    Down
2020-01-01 12:00:01 AM    Down
2020-01-01 12:00:03 AM    Up
2020-01-01 12:00:03 AM    Down
2020-01-01 12:00:04 AM    Down
2020-01-01 12:00:05 AM    Down
2020-01-01 12:00:06 AM    Down
2020-01-01 12:00:07 AM    Down
2020-01-01 12:00:08 AM    Down
2020-01-01 12:00:09 AM    Down
2020-01-01 12:00:11 AM    Up

How do I replicate this seemingly random midpoint rounding?

Note I know that Excel stores dates as doubles, and I actually have direct access to the values stored in Excel. I'm also aware of the floating point precision issues that floating point date-time values can have.

What I'm looking for is a concrete set of operations (preferably in C#) that provides parity with what Excel in obviously doing. I've tried a number of ways to calculate and round to the "nearest" second nothing I've tried can replicate the seemingly random half rounding.

I'm pretty sure there's something to do with precision at play as well, since the date value (whole number portion) can affect how the time (fractional) portion rounds to the nearest second.

Update Regular DateTime rounding isn't going to cut it. Converting the double date to a DateTime (which works in ticks) will always result in exactly 500ms and therefore a standard rounding will always round up. This is not what Excel is doing (see data above).

Neither is this an issue of simple accuracy or floating-point rounding... @Jeremy Lakeman provided some code to get a more accurate string representation of the values. So I tried it.

Take the following data. Values in brackets are the stringified double values that are stored in the XLSX file XML for the date-time values shown. Left are unrounded, right are rounded by Excel.

2020-01-01 12:00:16.500 AM [43831.00019097222]  --> 2020-01-01 12:00:16.000 AM [43831.000185185185]
2020-01-01 12:00:17.500 AM [43831.000202546296] --> 2020-01-01 12:00:17.000 AM [43831.000196759262]
2020-01-01 12:00:18.500 AM [43831.000214120373] --> 2020-01-01 12:00:19.000 AM [43831.000219907408]
2020-01-01 12:00:19.500 AM [43831.000225694443] --> 2020-01-01 12:00:19.000 AM [43831.000219907408]

The DoubleConverter gives the following outputs for the unrounded values multiplied by seconds: DoubleConverter.ToExactString(d * 86400)

2020-01-01 12:00:16.500 AM --> 3786998416.5
2020-01-01 12:00:17.500 AM --> 3786998417.5
2020-01-01 12:00:18.500 AM --> 3786998418.5
2020-01-01 12:00:19.500 AM --> 3786998419.5

Note that only the 3rd value rounds up, while the other 3 round down. How do I replicate that?

PleasantD
  • 722
  • 5
  • 8
  • https://stackoverflow.com/questions/4072490/is-there-a-way-to-programmatically-convert-vb6-formatting-strings-to-net-format – Hans Passant Feb 15 '23 at 23:59
  • @HansPassant Interesting thought. But when I ran a test `Vb6Format(unroundedTime.Value, "hh:mm:ss")` always seems to round up to the next second. So that's clearly not what Excel is doing. Unless Excel has a different version of `oleaut32.dll` that it's using. But this has two problems -- 1) its not going to be portable and 2) its not going to be efficient for rounding a large set of double-datetime values. – PleasantD Feb 16 '23 at 00:05
  • 1
    Excel dates are weird, https://stackoverflow.com/questions/981655/how-to-represent-a-datetime-in-excel https://superuser.com/q/1131371/1427832 – Jeremy Lakeman Feb 16 '23 at 00:05
  • @JeremyLakeman I'm aware of the weirdness. I'm even cognizant of the 1900-01-00 and false leap year in 1900 that Excel will let you display. But neither of those other links actually tell me how to replicate Excel's behaviour, which is what I'm trying to achieve. – PleasantD Feb 16 '23 at 00:08
  • I expect it's a rounding issue due to the underlying type being a floating point value and not an exact number of half seconds. – Jeremy Lakeman Feb 16 '23 at 00:11
  • 1
    I'd give you an answer, but the question is locked. Hint, use https://jonskeet.uk/csharp/DoubleConverter.cs to format each `(43834 + N / 172800d) * 86400` for odd values of `N` (as per your question). – Jeremy Lakeman Feb 16 '23 at 00:30
  • In other words `2020-01-01 12:00:02.500 == 3787257602.500000476837158203125 seconds past 1900`. – Jeremy Lakeman Feb 16 '23 at 00:34
  • So `( [DateTime value] - new DateTime(1900, 1, 1) ).TotalDays * 86400` should give you a double with the same precision / rounding error as excel. – Jeremy Lakeman Feb 16 '23 at 00:56
  • @D Stanley : Your link is on the topic, but it doesn't answer the question. – Gustav Feb 16 '23 at 14:06
  • As a note: https://stackoverflow.com/questions/1393696/rounding-datetime-objects?noredirect=1&lq=1 does not solve this problem since the conversion from double dates to DateTime (ticks) always gives exactly 500ms unrounded and therefore always rounds up, which is NOT the behaviour seen in Excel. – PleasantD Feb 16 '23 at 18:18
  • @JeremyLakeman This was interesting, but still not what Excel is doing. Details above – PleasantD Feb 16 '23 at 18:34
  • Yeah my example was off by three due to a typo, `2020-01-01 == 43831`. – Jeremy Lakeman Feb 17 '23 at 00:37
  • Interesting question. Here's a little experiment I tried: If you use .NET 7, which [no longer rounds doubles to a full millisecond when converting them with AddDays](https://stackoverflow.com/q/75402404/87698), you get results with look somewhat similar to what Excel does, see [this fiddle](https://dotnetfiddle.net/EiBGjO). Apparently, this is *closer* to what Excel does (but not exactly the same, because we get 17.500... for the second number). – Heinzi Feb 17 '23 at 21:52

1 Answers1

-1

I know I'm on the right track here, because the following (using DoubleConverter);

var epoch = new DateTime(1900, 1, 1);
var start = new DateTime(2020, 1, 1);
var offset = new TimeSpan(0, 0, 0, 0, 500);

var values = Enumerable.Range(0, 100)
    .Select(i =>
    {
        var v = start + offset * (i * 2 + 1);
        var seconds1 = (v - epoch).TotalSeconds;
        var seconds2 = (v - epoch).TotalDays * 86400;
        return $"{v:yyyy-MM-dd HH:mm:ss.f} => {DoubleConverter.ToExactString(seconds2)}, { (seconds2>seconds1 ? "Up" : "Down")}";
    })
    .ToList();

Produces;

2020-01-01 00:00:00.5 => 3786825600.5, Down
2020-01-01 00:00:01.5 => 3786825601.5, Down
2020-01-01 00:00:02.5 => 3786825602.500000476837158203125, Up
2020-01-01 00:00:03.5 => 3786825603.5, Down
2020-01-01 00:00:04.5 => 3786825604.5, Down
2020-01-01 00:00:05.5 => 3786825605.499999523162841796875, Down
2020-01-01 00:00:06.5 => 3786825606.5, Down
2020-01-01 00:00:07.5 => 3786825607.5, Down
2020-01-01 00:00:08.5 => 3786825608.5, Down
2020-01-01 00:00:09.5 => 3786825609.5, Down
2020-01-01 00:00:10.5 => 3786825610.500000476837158203125, Up
...

Each of the values that are slightly greater than 0.5 are the values that excel is rounding up, the others round down. Note that TimeSpan.TotalSeconds gives a double representing the exact number of milliseconds, while .TotalDays * 86400 appears to have the same accuracy as your example.

So the trick is to work out which results are slightly above .5 and round them up.

You can perform this same test in excel with =MOD(A2*86400,1). To see the same floating point accuracy problem and determine which values excel will round up.

private static readonly DateTime EPOCH = new DateTime(1900, 1, 1);
public DateTime ExcelRound(DateTime value)
{
    var m = value.Ticks % TimeSpan.TicksPerSecond;
    if (value.Millisecond > 500)
        m -= TimeSpan.TicksPerSecond;
    else if (value.Millisecond == 500)
    {
        var seconds1 = (value - EPOCH).TotalSeconds;
        var seconds2 = (value - EPOCH).TotalDays * 86400;
        if (seconds2 > seconds1)
            m -= TimeSpan.TicksPerSecond;
    }
    return value.Add(new TimeSpan(-m));
}

However, now that you know what excel is doing. I would personally take that evidence and complain about your requirement. Just because excel has a bug, doesn't mean you should be forced to replicated it.

Jeremy Lakeman
  • 9,515
  • 25
  • 29
  • Excel also rounds dates to the nearest millisecond, so you might need to explicitly deal with the case where `(value.Ticks % TimeSpan.TicksPerMillisecond) != 0`. – Jeremy Lakeman Feb 17 '23 at 02:44
  • See my extended example in the question. If you extrapolate the example data just a little bit further you easily find multiple examples of _exactly_ .5 second values that round up and down without sense. – PleasantD Feb 17 '23 at 21:17
  • **RE: the requirement to replicate Excel.** Under most circumstances I would agree with you. _However_, I am working on an application that is attempting to replicate some of the behaviours of Excel, so their bugs do become my problems. Especially since the primary reason I need date-time rounding is due to how Pivot Table cache's only ever store the rounded second values for date-time fields. This has edge-case implications when trying to replicate filtering behaviour on data that has (500) millisecond values. However, this is an edge case -- which is why I presented it to the hive-mind. – PleasantD Feb 17 '23 at 21:23
  • To get exact results, you would need to know the exact sequence of floating point computations used by excel so that you can replicate the exact rounding errors.... – Jeremy Lakeman Feb 18 '23 at 02:49