
I have some relatively large F# structures made from primitive types / arrays / lists / records / DUs, but no classes. The structures are fluid, so creating tables to match them would carry a very high maintenance cost. Serializing to / deserializing from JSON works like magic.

Unfortunately, the resulting JSON strings can be as large as 25-250 MB, so I want to compress them before storing them in the SQL Server database. This is effectively an archive system: performance is not an issue, look-up is not an issue, and using the file system as storage would bring its own mess, which I want to avoid.

I am fine with the SQL Server database growing as large as it will. On the other hand, SQL Server's built-in compression does no good in this case due to some other issues.

What I need is as follows: take an in-memory string (about 25-250 MB of JSON) -> compress it into a binary format (not the GZipped Base64 format discussed in various answers on the web!) -> store it in a SQL Server varbinary(max) column, and then be able to go all the way back. I was expecting something like JsonConvert.SerializeObject / JsonConvert.DeserializeObject. Unfortunately, searching the web did not produce any one-liner results. I would appreciate any advice.
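To be concrete, this is the shape of the helper pair I would otherwise write by hand around `System.IO.Compression` (a minimal sketch; the `zip` / `unZip` names are mine, not from any library):

```fsharp
open System.IO
open System.IO.Compression
open System.Text

// Compress a string into a raw GZip byte array (no Base64 step anywhere).
let zip (s : string) : byte[] =
    let bytes = Encoding.UTF8.GetBytes s
    use output = new MemoryStream()
    // 'using' disposes the GZipStream, flushing its final block,
    // before we snapshot the underlying buffer
    using (new GZipStream(output, CompressionMode.Compress)) (fun gzip ->
        gzip.Write(bytes, 0, bytes.Length))
    // MemoryStream.ToArray is documented to work even after the stream is closed
    output.ToArray()

// Decompress a GZip byte array back into the original string.
let unZip (data : byte[]) : string =
    use input = new MemoryStream(data)
    use gzip = new GZipStream(input, CompressionMode.Decompress)
    use reader = new StreamReader(gzip, Encoding.UTF8)
    reader.ReadToEnd()
```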

  • See MSDN's `System.IO.Compression` namespace: https://learn.microsoft.com/en-us/dotnet/api/system.io.compression?view=netframework-4.7.2. You can use a `MemoryStream` to do the compression. – jdweng Jan 06 '19 at 23:22
  • If “creating tables to match them will bear a very high maintenance cost”, why are you storing them in MSSQL? Would not a document/JSON database be a better choice? – Dour High Arch Jan 06 '19 at 23:23
  • @jdweng Great! I am well aware of `System.IO.Compression` and can write anything that I want around it. I am just sure that someone has already done that! – Konstantin Konstantinov Jan 06 '19 at 23:24
  • @DourHighArch See above, please. Going with file-system storage brings a lot of mess; I started with that and it did not really work out well. The question is about string -> binary compressed representation of that string -> back, not about SQL :) – Konstantin Konstantinov Jan 06 '19 at 23:27
  • @jdweng Thanks for the comment, by the way. I ended up porting an old C# answer (https://stackoverflow.com/questions/7343465/compression-decompression-string-with-c-sharp) to F#. That answer, along with `JsonConvert`, works like a charm: just pipe `x |> JsonConvert.SerializeObject |> zip` on the way to the database and `r.x |> unZip |> JsonConvert.DeserializeObject` on the way back, where `x` is some F# object, `r` is the database row, `r.x` is the column, `TypeOfX` does not have any nasty surprises, and then the magic happens. I had a DB row shrink from about 250 MB to about 25 MB. (A sketch of the full pipeline appears after these comments.) – Konstantin Konstantinov Jan 07 '19 at 23:01
  • You may try to include JSON -> BSON (and the reverse) in your pipeline. It could save you some entropy. – OzrenTkalcecKrznaric Jan 07 '19 at 23:05
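For completeness, here is a sketch of the full round trip described in the comment above, built on the `zip` / `unZip` helpers from the question. The `dbo.Archive` table, its columns, and the `save` / `load` names are hypothetical, and `System.Data.SqlClient` plus `Newtonsoft.Json` are assumed:

```fsharp
open System.Data
open System.Data.SqlClient
open Newtonsoft.Json

// Hypothetical schema: dbo.Archive (Id int primary key, Payload varbinary(max)).
let save (conn : SqlConnection) (id : int) (x : 'T) =
    // F# value -> JSON string -> compressed bytes
    let payload = JsonConvert.SerializeObject(box x) |> zip
    use cmd = new SqlCommand("INSERT INTO dbo.Archive (Id, Payload) VALUES (@id, @p)", conn)
    cmd.Parameters.AddWithValue("@id", id) |> ignore
    // -1 is the SqlClient size convention for varbinary(max)
    cmd.Parameters.Add("@p", SqlDbType.VarBinary, -1).Value <- payload
    cmd.ExecuteNonQuery() |> ignore

let load<'T> (conn : SqlConnection) (id : int) : 'T =
    use cmd = new SqlCommand("SELECT Payload FROM dbo.Archive WHERE Id = @id", conn)
    cmd.Parameters.AddWithValue("@id", id) |> ignore
    // compressed bytes -> JSON string -> F# value
    (cmd.ExecuteScalar() :?> byte[]) |> unZip |> JsonConvert.DeserializeObject<'T>
```

Swapping the JSON text step for BSON (for example via `BsonDataWriter` in `Newtonsoft.Json.Bson`), as the last comment suggests, would slot into the same pipe.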

0 Answers