The requirement is to migrate a WCF service that uses DataSets to a gRPC service, preserving the metadata when data is sent from the client to the service so that row modifications can be identified and the database updated accordingly.

I'm working on a PoC serializing the DataSet as shown next:

using (MemoryStream ms = new MemoryStream()) {
      dataSet.RemotingFormat = SerializationFormat.Binary;
      BinaryFormatter fmt = new BinaryFormatter();
      fmt.Binder = DataSetSerializationBinder.Default;
      fmt.Serialize(ms, dataSet);
      ms.Flush();
      return ms.ToArray();
}

The byte array is sent from the client to the service, where it is deserialized as follows:

using (MemoryStream ms = new MemoryStream(datasetBytes)) {
      ms.Position = 0;
      BinaryFormatter fmt = new BinaryFormatter();
      fmt.Binder = DataSetSerializationBinder.Default;                    
      var des = fmt.Deserialize(ms);
      DataSet ds = (DataSet)des;
      ms.Close();
}

The SerializationBinder is overridden like this:

public override Type BindToType(string assemblyName, string typeName){
    if (assemblyName.Equals("DSSerializer"))
        return typeof(System.Data.DataSet);
    else
        return defaultBinder.BindToType(assemblyName, typeName);
}
public override void BindToName(Type serializedType, out string assemblyName, out string typeName)
{
    assemblyName = "DSSerializer";
    typeName = serializedType.FullName;            
}

The Serialize and Deserialize methods are in the same library, which is referenced from both the client and the server, but deserialization throws the exception: Member 'XmlSchema' was not found. Before I added the SerializationBinder, the exception was: BinaryFormatter.Deserialize: specified cast is not valid.

Note that I need to transfer both schema information and DataRow.RowState information. When I tried to transfer my DataSet using XML as shown by this answer to Loading a datatable into a xml and xml back into a datatable:

dataSet.WriteXml(ms, XmlWriteMode.WriteSchema);

var dataSet = new DataSet();
dataSet.ReadXml(ms, XmlReadMode.ReadSchema);

I found that the DataRow.RowState information is not transferred. This is confirmed by Preserving DataRowState when serializing DataSet using DataContractSerializer.

I understand that BinaryFormatter is not the best option. What would be a better way to do this, or is there a way to make it work?

dbc
RMiZe
  • Honestly, sending DataSet (or even working with DataSet) sounds like the real problem here. There are very few problems where DataSet is a good solution. Is it even remotely possible to change them to use POCOs? – Marc Gravell Jan 16 '21 at 13:25
  • You can use XML instead of binary. Just write the `DataSet` to XML using `XmlWriteMode.WriteSchema` and read it using `XmlReadMode.ReadSchema` and your schema will be preserved, as shown in [this answer](https://stackoverflow.com/a/12461911/3744182) to [Loading a datatable into a xml and xml back into a datatable](https://stackoverflow.com/q/12455539/3744182) by [SpaceApple](https://stackoverflow.com/users/1140144/spaceapple). – dbc Jan 16 '21 at 15:24
  • In fact I think this may be a duplicate, agree? – dbc Jan 16 '21 at 15:27
  • @MarcGravell thank you for your answer. This is a legacy service and it is being used by a windows forms application which is tightly coupled to datasets, unfortunately. So it would take a lot of work to change it to POCOs – RMiZe Jan 18 '21 at 15:46
  • Thank you @dbc. I tested it, but although it keeps the schema, what I need to send to the service is the DataRow.RowState information, in order to identify any modifications. – RMiZe Jan 18 '21 at 15:51

1 Answer

I was able to reproduce your problem here: fiddle #1. You need to serialize the RowState information of a DataSet, which seems to be serialized only by BinaryFormatter and not by XML, even when writing with XmlWriteMode.WriteSchema and reading with XmlReadMode.ReadSchema.

Now, according to Cutting Edge: Binary Serialization of DataSets written by Dino Esposito in October 2004, what BinaryFormatter actually serializes is the schema of the data set plus the diffgram of the data set. Since DataSet has methods to read and write both its schema and its diffgram, it should be possible to convert a data set to and from a DTO that contains these two properties without loss of RowState information.

The following DTO does the job:

using System.Data;
using System.IO;
using System.Text;
using System.Xml;

public class DataSetDTO
{
    public string XmlSchema { get; set; }
    public string XmlDiffGram { get; set; }
    
    public static DataSetDTO FromDataSet(DataSet dataSet, bool indent = false)
    {
        var diffGram = new StringBuilder();
        using (var xmlWriter = XmlWriter.Create(new StringWriter(diffGram), new XmlWriterSettings { Indent = indent }))
            dataSet.WriteXml(xmlWriter, XmlWriteMode.DiffGram);
        
        var schema = new StringBuilder();
        using (var xmlWriter = XmlWriter.Create(new StringWriter(schema), new XmlWriterSettings { Indent = indent }))
            dataSet.WriteXmlSchema(xmlWriter);
        
        return new DataSetDTO
        {
            XmlSchema = schema.ToString(),
            XmlDiffGram = diffGram.ToString(),
        };
    }
    
    public DataSet ToDataSet()
    {
        var dataSet = new DataSet();
        if (!string.IsNullOrEmpty(XmlSchema))
        {
            // TODO: determine whether and how to deny resolving external references, as is done in the reference source
            // https://referencesource.microsoft.com/#system.data/system/data/DataSet.cs,388
            dataSet.ReadXmlSchema(new StringReader(XmlSchema));
        }
        if (!string.IsNullOrEmpty(XmlDiffGram))
        {
            using var reader = XmlReader.Create(new StringReader(XmlDiffGram));
            dataSet.ReadXml(reader, XmlReadMode.DiffGram);
        }
        return dataSet;
    }
}

If I round-trip a DataSet using the following method:

static DataSet RoundTripViaDTO(DataSet set) => DataSetDTO.FromDataSet(set).ToDataSet();

The schema, row data and RowState information are all deserialized successfully. Demo here: fiddle #2.
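To make the claim concrete, here is a minimal self-contained sketch of the same schema-plus-diffgram round trip, written without the DTO class so it can be run on its own. The class name `RowStateRoundTripDemo`, the `Person` table, and its columns are hypothetical and exist only for illustration; the `DataSet` calls are the same public read/write methods used in the DTO above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;

public static class RowStateRoundTripDemo
{
    // Builds a small DataSet with one row in each of the states
    // Unchanged, Modified, Deleted and Added. (Hypothetical sample data.)
    public static DataSet BuildSample()
    {
        var set = new DataSet("Demo");
        var table = set.Tables.Add("Person");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.PrimaryKey = new[] { table.Columns["Id"] };

        table.Rows.Add(1, "unchanged");
        table.Rows.Add(2, "to be modified");
        table.Rows.Add(3, "to be deleted");
        set.AcceptChanges();                 // all three rows become Unchanged

        table.Rows[1]["Name"] = "modified";  // row 2 becomes Modified
        table.Rows[2].Delete();              // row 3 becomes Deleted
        table.Rows.Add(4, "added");          // row 4 is Added
        return set;
    }

    // Writes the schema and the diffgram to strings, then rebuilds a new DataSet from them.
    public static DataSet RoundTrip(DataSet source)
    {
        var schema = new StringWriter();
        source.WriteXmlSchema(schema);
        var diffGram = new StringWriter();
        source.WriteXml(diffGram, XmlWriteMode.DiffGram);

        var copy = new DataSet();
        // The schema must be loaded before the diffgram is read.
        copy.ReadXmlSchema(new StringReader(schema.ToString()));
        copy.ReadXml(new StringReader(diffGram.ToString()), XmlReadMode.DiffGram);
        return copy;
    }

    // Maps each row's Id to its RowState. Deleted rows only expose their Original version.
    public static Dictionary<int, DataRowState> StatesById(DataSet set) =>
        set.Tables["Person"].Rows.Cast<DataRow>().ToDictionary(
            r => (int)r["Id", r.RowState == DataRowState.Deleted
                                ? DataRowVersion.Original
                                : DataRowVersion.Current],
            r => r.RowState);

    public static void Main()
    {
        var copy = RoundTrip(BuildSample());
        foreach (var pair in StatesById(copy).OrderBy(p => p.Key))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

If the round trip behaves as described above, the states printed for the copy should match those of the sample built by `BuildSample`, which is what the service needs in order to decide between insert, update and delete.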

Notes:

  • The DTO can be communicated between your client and server using any serializer you prefer. However, don't use BinaryFormatter, for the reasons explained in What are the deficiencies of the built-in BinaryFormatter based .Net serialization? as well as the documentation remarks, which state:

    BinaryFormatter is insecure and can't be made secure. For more information, see the BinaryFormatter security guide.

    You are currently using .NET Core 3.1, but, looking forward, BinaryFormatter is marked as obsolete in .NET 5 and BinaryFormatter serialization is prohibited by default for ASP.NET apps. See here for details.

  • Since the diffgram data can be verbose I disabled formatting when writing it to a string. If the resulting strings are very large, you might consider replacing the DTO with some sort of streaming solution.

  • In the reference source for DataSet, resolving of external references is disabled when the schema is loaded:

    this.ReadXmlSchema(new XmlTextReader(new StringReader(strSchema)), true);
    
    internal void ReadXmlSchema(XmlReader reader, bool denyResolving) { //...
    

    However, the API to load a schema from an XmlReader and deny resolving is internal. For security reasons you may want to investigate how to deny resolving using publicly available APIs. See XmlSchemaSet Class: Security Considerations for issues that may arise from reading schemas from untrusted sources.

  • This solution assumes the incoming data set is exactly of type DataSet and not some typed DataSet subclass. If you are working with typed data sets you will need to enhance the DTO to include the type information.
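As a starting point for the note on denying resolving, one option that uses only public APIs is to load the schema through an `XmlReader` created with restrictive `XmlReaderSettings`. This is a sketch under assumptions, not a confirmed equivalent of the internal `denyResolving` flag: it stops the reader itself from processing DTDs and fetching external resources, but I have not verified that it also prevents resolution of `xs:import` or `xs:include` during schema processing. The class name `HardenedSchemaLoader` is hypothetical.

```csharp
using System.Data;
using System.IO;
using System.Xml;

public static class HardenedSchemaLoader
{
    // Loads a DataSet schema from a string using a reader that refuses DTDs
    // and has no resolver for external resources.
    // Assumption: this hardens the reader only; it is not known to match
    // the internal ReadXmlSchema(reader, denyResolving: true) behavior.
    public static DataSet LoadSchema(string xmlSchema)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // reject DOCTYPE declarations
            XmlResolver = null,                     // reader will not fetch external resources
        };

        var dataSet = new DataSet();
        using (var reader = XmlReader.Create(new StringReader(xmlSchema), settings))
            dataSet.ReadXmlSchema(reader);
        return dataSet;
    }
}
```

In `DataSetDTO.ToDataSet()`, the call `dataSet.ReadXmlSchema(new StringReader(XmlSchema))` could be swapped for a reader configured this way; whether that is sufficient for untrusted input should be checked against the XmlSchemaSet security considerations mentioned above.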

dbc