0

I have a value in a raw binary file (part of a database) and I want to convert to a Python format which can be interpreted by a human being. This is part of a forensic carving procedure. I can convert 8 byte values using this SQL sentence (you will see a date in GMT+2 and in GMT)

SELECT CAST(0x0000ae9401039c4a AS datetime), CAST(0x0000ae9400e2a6ca AS datetime)

which returns

2022-05-13 15:45:12.780 2022-05-13 13:45:12.780

I have tried to convert the binary value with DCODE v5.5 (https://www.digital-detective.net/dcode/) but can't find any format meeting the output of the previous SQL sentence (I have checked that it is right in the database I'm trying to carve).

Do anyone know how to perform the conversion in Python?

I imagine I just need the origin of this time representation and how much time is every bit. Comparing two timestamps separated exactly 2 hours you can see a "300" that I don't know how to interpret. Is a bit 1/300 seconds?

>>> t1=0xae9401039c4a
>>> t2=0xae9400e2a6ca
>>> t1-t2
2160000
>>> (t1-t2)/(2*3600)
300.0

Those ere the properties of the database I need to carve: datetime properties

  • 2
    Why are you not converting the value in SQL Server? SQL Server has 4 different date and time data types, and the value `2022-05-13 15:45:12.780` would be returned very differently for all 4 in a `varbinary` ([db<>fiddle](https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=416e90d57c89562e4e65a208cdbea6ee)). Python could easily use different binary values again. Converting the value to the appropriate date and time data type in your SQL is the best place. – Thom A Jun 14 '22 at 13:28
  • The problem is that I'm developing a carving procedure in Python to extract fragments of a database that would not be treated by SQLServer (there are chunks of a previous database in not asigned disk space). Thanks for your response anyway. – Chirrin Dul Jun 14 '22 at 14:11
  • This sounds like an [XY Problem](//xyproblem.info) at this stage. – Thom A Jun 14 '22 at 14:13
  • Yes, yes, of course. – Chirrin Dul Jun 15 '22 at 11:10
  • Why is the date stored as binary instead of a proper date type? That's the bug that needs to be urgently fixed. SQL understands dates. SQL Server understands dates. Python can read those dates and understand them – Panagiotis Kanavos Jun 17 '22 at 08:50
  • `I'm developing a carving procedure in Python to extract fragments of a database` there are tools for this. What you try to do would only work with a legacy type replaced by others since 2005 – Panagiotis Kanavos Jun 17 '22 at 08:56

2 Answers2

2

Short Version

The legacy datetime type stores dates as a 64-bit floating point offset from 1900-01-01

floatValue=struct.unpack('<d',bytes)[0]
OLE_TIME_ZERO = datetime.datetime(1900, 01, 01, 0, 0, 0)
date=OLE_TIME_ZERO + datetime.timedelta(days=floatValue)

Newer types don't use that format though.

Excel handling libraries like openpyxl offer functions that convert OA/Serial dates like openpyxl.utils.datetime.from_excel

Long Explanation

The legacy datetime type in SQL Server uses the OLE Automation Date storage format that was also used in Excel, Visual Basic and all desktop applications that supported OLE/COM Automation in the late 1990s and early 2000s, before macro viruses. This is a 64-bit floating point number (called a Serial date in Excel) whose integral part is an offset since 1899-12-30 and fractional part is the time of day. Except when it's 1899-12-31 or 1900-01-01.

Back when Excel was released, Lotus 1-2-3 was the most popular spreadsheet and a defacto standard, and incorrectly considered 1900 a leap year. To ensure compatibility, Excel adopted the same bug. VBA adopted tried to both fix the bug and ensure formulas produced the same results as Excel and Lotus, so use 1899-12-30 as a base.

The SQL Server team didn't care about the bug and used the logical 1900-01-01 instead.

Essentially, this value is a timedelta. In Python, you can convert this float to a timedelta by passing it as the days parameter to timedelta, and add it to the base 1900-01-01:

OLE_TIME_ZERO = datetime.datetime(1900, 01, 01, 0, 0, 0)
date=OLE_TIME_ZERO + datetime.timedelta(days=floatValue)

To get a 64-bit float from an array of bytes you can use struct.unpack with the appropriate format string. A 64-bit float is actually a double:

floatValue=struct.unpack('<d',bytes)[0]

WARNING

datetime is a legacy type. The types introduced in 2005, date, time, datetime2 and datetimeoffset have a different storage format. datetime2 and datetimeoffset have variable precision and variable size.

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
-1

For future reference, I was finally able to find the data I needed in this post: What is the internal representation of datetime in sql server?

This function will do the conversion:

def extr_datetime (bytes):
    days_off = int.from_bytes(bytes[4:8],byteorder='little', signed=True)
    ticks_off = int.from_bytes(bytes[0:4],byteorder='little', signed=True) / 300.0
    epoch = '1900/01/01 00:00:00'
    epoch_obj = datetime.strptime(epoch, '%Y/%m/%d %H:%M:%S')
    d = epoch_obj + timedelta(days=days_off) + timedelta(seconds=ticks_off)
    return d
General Grievance
  • 4,555
  • 31
  • 31
  • 45
  • That's not a solution at all. Why is the wrong type used in the database in the first place? That's the bug. Besides, are you sure it will even work? (it doesn't the base is wrong, so are the format assumptions). `datetime` is an ancient type that stores dates as a float (OA Dates). Since 2005 it was replaced by better types like `datetime2` (which doesn't use the OA Date format), `date`, `time` and `datetimeoffset`. – Panagiotis Kanavos Jun 17 '22 at 08:50
  • Yes, yes, of course. – Chirrin Dul Jun 17 '22 at 08:56
  • That's not an answer or an explanation why you do this. Or why you don't use one of the already available tools that *do* understand SQL Server types. The new types aren't even fixed-length – Panagiotis Kanavos Jun 17 '22 at 08:56
  • You have not understood the question I asked. I needed a way IN PYTHON to convert 8 bytes to a date. The 8 bytes were the binary representation of a SQL Server's "datetime" type. The 8 bytes are not in a file, but in not assigned space in a hard disk. They are part of deleted .mdf files. The solution I put is perfectly right, at least it makes the work I needed as expected. – Chirrin Dul Jun 23 '22 at 05:36
  • The reference you posted is simply wrong. I explained what the actual storage format is - an OADate: [OLE Automation date](https://docs.microsoft.com/en-us/dotnet/api/system.datetime.tooadate?view=net-6.0#remarks), also used to store dates in Excel. It's a float, whose integer part is the offset since 1900-01-01 and fractional part is the time of day. Python libraries that handle Excel have functions that convert floats to dates. There are several similar questions, like [this one](https://stackoverflow.com/questions/33849149/python-equivalent-of-matlab-function-to-convert-oadate-to-datetime) – Panagiotis Kanavos Jun 23 '22 at 06:55
  • In the possibly duplicate questions you'll see that all you really need is `OLE_TIME_ZERO + datetime.timedelta(days=float(ole))` where `OLE_TIME_ZERO` is the epoch and `ole` the value to convert. For some weird reason, Excel used 1899-12-30 as the base, and so did every application in the 1990s *except* SQL Server, which used 1900-01-01 – Panagiotis Kanavos Jun 23 '22 at 06:57
  • To convert those bytes to a float (actually a double) [you can use struct.unpack](https://stackoverflow.com/questions/5415/convert-bytes-to-floating-point-numbers): `oa=struct.unpack('d',bytes)` – Panagiotis Kanavos Jun 23 '22 at 07:04