
I wrote an application: a custom console that allows execution of various commands. One of the commands serializes data. The input is a string containing a list of comma-separated values.

My question is: how can I make the serialized data as compact as possible? The serialization format is not important to me.

Here is the command's code:

using CustomConsole.Common;
using System.IO;
using System.Xml.Serialization;
using System;

namespace Shell_Commander.Commands
{
    public class SerializeCommand : ICommand
    {
        private string _serializeCommandName = "serialize";
        public string Name { get { return this._serializeCommandName; } set { _serializeCommandName = value; } }

        public string Execute(string parameters)
        {
            try
            {
                var splittedParameters = parameters.Split(" ");
                var dataToSerialize = splittedParameters[0].Split(",");
                var pathTofile = splittedParameters[1].Replace(@"\", @"\\");

                XmlSerializer serializer = new XmlSerializer(dataToSerialize.GetType());
                using (StreamWriter writer = new StreamWriter(pathTofile))
                {
                    serializer.Serialize(writer, dataToSerialize);
                }

                // Read the file size only after the writer has been disposed,
                // so that everything is guaranteed to be flushed to disk.
                var length = new FileInfo(pathTofile).Length;

                Console.WriteLine($"Wrote file to: {pathTofile}");
                return length.ToString();
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                return "0";
            }
        }
    }
}

The command accepts 2 parameters:

  1. Data to serialize
  2. File path (in order to save the serialized data).

Example - for the "1,2,4" input, the following file will be saved:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfString xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <string>1</string>
  <string>2</string>
  <string>4</string>
</ArrayOfString> 

EDIT:

  1. I want my command to be able to serialize complex objects as well in the future, so writing the string as-is to the file is not a solution.

  2. I want to use only standard serialization methods and formats.

Simcha Rif
  • 2
    When I had a similar requirement I used [Protobuf.Net](https://www.nuget.org/packages/protobuf-net) - not all that easy to set up, but amazing compression. – stuartd May 03 '19 at 10:08
  • 1
    If the original input is 1 string, why not just save that string? Serialization is the act of taking objects and creating a text or byte representation of it for persisting it or transporting it. If you started with a string, why not just save that string? Basically, it seems like you have already serialized something to get that string. – Lasse V. Karlsen May 03 '19 at 10:09
  • Only you know the nature of the data. If you really, really want it to be as compact as possible, you'd have to invent your own serialization. If, for example, the number and range of the parameters are known in advance to be numeric and limited, you could pack multiple parameters into a single byte. The current XML roundtrip you have of course ridiculously inflates your data size. You have 238 bytes to store three numbers. You could as well just save the "1,2,4" string to a file and be done with it, as @Lasse indicates. Do you really need to shave off each byte possible? – CodeCaster May 03 '19 at 10:09
  • Here's my version of this method: `File.WriteAllText(pathToFile, parameters, Encoding.UTF8);`. That's all, I still don't understand why you need anything in this method at all since your input is 1 string. – Lasse V. Karlsen May 03 '19 at 10:20
  • @stuartd if I was pedantic (and I am), I'd observe that this is simply *being efficient*, not "compression", but... yes, that's an option. I'm kinda with Lasse here, though - if you have a string input, *just write it* - perhaps if it is long enough, it might benefit from some gzip love, but... – Marc Gravell May 03 '19 at 10:44
  • @LasseVågsætherKarlsen, CodeCaster thanks, but I want to be able to serialize complex objects as well. I don't want to invent my own serialization. I'm looking for an efficient solution that uses standard serialization methods and formats. – Simcha Rif May 03 '19 at 10:47
  • @stuartd, see my comment above. Can you give a code example of protobuf? – Simcha Rif May 03 '19 at 10:49
  • I can't, sorry. This was for a project owned by a third party. – stuartd May 03 '19 at 10:54
  • So you're *not* going to persist a string? Meaning, the code in the question is nowhere near what you want? Wouldn't it be better to post a realistic question then? – Lasse V. Karlsen May 03 '19 at 11:05
  • @LasseVågsætherKarlsen This was only a simple example; I want the command to be able to serialize objects as well in the future. I'll edit the question. – Simcha Rif May 03 '19 at 11:24
  • Then as has been mentioned, protobuf is probably the most compact, json is slightly more compact than XML, and you could always compress the final result if you require space-savings. Question is if you still need to be able to edit the serialized data in a text editor or similar without proper deserialization first. – Lasse V. Karlsen May 03 '19 at 11:25
  • @LasseVågsætherKarlsen Can you write an example of JSON or protobuf? Both sound interesting... – Simcha Rif May 03 '19 at 11:37
  • 3
    To serialize with json, use Json.Net nuget package, obtain your object and just use `JsonConvert.SerializeObject(obj)`, this will give you json string. If you had done that on your `splittedParameters` array the result would've been `[ "1", "2", "4" ]`. As for protobuf, that's way more convoluted and unfortunately that's a bit more than I can do right now. – Lasse V. Karlsen May 03 '19 at 11:45
  • If you're worried about disk space, why not compress the serialized output? For JSON see e.g. [Can I decompress and deserialize a file using streams?](https://stackoverflow.com/q/32943899/3744182). – dbc May 03 '19 at 21:50
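
The options raised in the comments (a leaner text format, then optional compression) can be sketched with framework classes only. This is a minimal sketch, not a definitive implementation. It assumes `System.Text.Json` is available (built into .NET Core 3.0 and later) and uses it in place of the Json.NET package mentioned above. The class and method names (`CompactSerialization`, `ToCompactXml`, `ToJson`, `Gzip`, `Gunzip`) are hypothetical and chosen for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Text.Json;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical helper class; not part of the original command.
public static class CompactSerialization
{
    // XmlSerializer output without indentation, without the XML declaration,
    // and without the xsi/xsd namespace attributes on the root element.
    public static byte[] ToCompactXml<T>(T value)
    {
        var settings = new XmlWriterSettings
        {
            Indent = false,
            OmitXmlDeclaration = true,
            Encoding = new UTF8Encoding(false) // UTF-8 without a byte-order mark
        };
        var noNamespaces = new XmlSerializerNamespaces();
        noNamespaces.Add(string.Empty, string.Empty);

        using (var buffer = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(buffer, settings))
            {
                new XmlSerializer(typeof(T)).Serialize(writer, value, noNamespaces);
            }
            return buffer.ToArray();
        }
    }

    // JSON as UTF-8 bytes; System.Text.Json writes no extra whitespace by default.
    public static byte[] ToJson<T>(T value)
    {
        return JsonSerializer.SerializeToUtf8Bytes(value);
    }

    // Wraps any serialized payload in a gzip container.
    public static byte[] Gzip(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works after GZipStream has closed the MemoryStream.
            return buffer.ToArray();
        }
    }

    // Reverses Gzip, for reading the file back.
    public static byte[] Gunzip(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

With the "1,2,4" input from the question, `CompactSerialization.ToJson(new[] { "1", "2", "4" })` yields the 13 bytes of `["1","2","4"]`, and the result can be written with `File.WriteAllBytes`. For a payload this small, gzip is likely to make the output larger, because the gzip header and trailer alone take at least 18 bytes; compression only starts to pay off on larger or more repetitive data.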

0 Answers