11

We are currently consuming a web service (IBM Message Broker). As the service is still under development, in many cases it returns invalid XML (yes, this will be fixed, I am promised).

The problem comes in when calling this service from .NET, using a client generated by svcutil using ClientBase<T>. It seems the XmlSerializer used is not faulting on invalid XML elements.

Here is an example that fails to report a fault and just returns a partially initialized object:

using System;
using System.Diagnostics;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[Serializable]
public class Program
{
  [XmlElement(Order = 0)]
  public string One { get;set; }

  [XmlElement(Order = 1)]
  public string Two { get;set; }

  static void Main(string[] args)
  {
    var ser = new XmlSerializer(typeof(Program));
    ser.UnknownElement += (o, e) => { 
      Console.WriteLine("Unknown element: {0}", e.Element.Name); 
    };

    using (var input = new StringReader(
@"<?xml version=""1.0"" encoding=""utf-8"" ?>
<Program>
  <Two>Two</Two>
  <One>One</One>
</Program>"))
    {
      var p = (Program)ser.Deserialize(input);
      Debug.Assert(p.One != null);
    }
  }
}

When attaching to the UnknownElement event, it correctly reports the invalid XML (element order does not match), but when using ClientBase<T>, these (and some other cases) are simply ignored (as if not using the fault events of XmlSerializer).
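For the standalone case (where you own the XmlSerializer instance), the event-based check described above can be turned into a hard failure. The following is only a sketch: `StrictXml` and `Payload` are hypothetical names, and it does not by itself solve the ClientBase<T> problem, since WCF creates its serializers internally. Only `UnknownElement` is handled here; `UnknownAttribute` and `UnknownNode` can be wired up the same way.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical payload type mirroring the example above.
public class Payload
{
    [XmlElement(Order = 0)]
    public string One { get; set; }

    [XmlElement(Order = 1)]
    public string Two { get; set; }
}

public static class StrictXml
{
    // Deserializes T and throws if the serializer skipped any element
    // it did not expect at that position.
    public static T Deserialize<T>(TextReader input)
    {
        var problems = new List<string>();
        var serializer = new XmlSerializer(typeof(T));

        // Collect the problems instead of throwing inside the handler,
        // so the exception type seen by the caller is predictable.
        serializer.UnknownElement += (sender, e) =>
            problems.Add(string.Format(
                "Unknown element <{0}> at line {1}, position {2}",
                e.Element.Name, e.LineNumber, e.LinePosition));

        var result = (T)serializer.Deserialize(input);

        if (problems.Count > 0)
            throw new InvalidDataException(
                string.Join(Environment.NewLine, problems));

        return result;
    }
}
```

Calling `StrictXml.Deserialize<Payload>(reader)` on a document whose elements are out of order should then raise an `InvalidDataException` rather than hand back a half-populated object.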

My question is how can I make ClientBase<T> detect invalid XML? Is there a way to hook into the fault events of the XmlSerializer used by ClientBase<T>?

Currently we have to manually check responses using SoapUI if something does not make sense.

Thanks

leppie
  • You could try a [Message Inspector](http://msdn.microsoft.com/en-us/library/aa717047(v=vs.110).aspx) to validate it yourself? – CodeCaster Jul 22 '14 at 08:41
  • @CodeCaster: While it could work (need to see if I have everything I need), it seems it is a lot of work for something that should work out of the box... – leppie Jul 22 '14 at 08:48
  • @CodeCaster: I must be doing something stupid, but the interface methods on `IEndpointBehavior` are never being called... From the docs it would seem the following should work: `svc.Endpoint.Behaviors.Add(new ValidatingEndpointBehavior());` – leppie Jul 22 '14 at 09:21
  • @CodeCaster: Fixed: DO NOT refer to `InnerChannel` before applying behavior! – leppie Jul 22 '14 at 09:27
  • After a few hours of experimenting, I still have gotten no further... Y U NO WORK OUT OF BOX??? – leppie Jul 22 '14 at 11:11
  • Put together a simple test project which demonstrates the failing of code. – Erti-Chris Eelmaa Jul 24 '14 at 09:22
  • @Erti-ChrisEelmaa: The sample does... Remove the fault event handler. This is exactly how `ClientBase` uses it. I cant post server and client code here, too long (also .NET WCF service wont return bad XML). – leppie Jul 24 '14 at 09:42
  • http://msdn.microsoft.com/en-us/library/aa347733(v=vs.110).aspx Use the DataContractSerializer - [it hates things in the wrong order](http://stackoverflow.com/questions/1513525/ignore-field-order-in-datacontractserializer) `/serializer:DataContractSerializer` – ta.speot.is Jul 24 '14 at 11:19

4 Answers

5

So, out-of-the-box, WCF doesn't believe in XML validation. It treats the XML as a message format, reading the information out which appears correct and ignoring the rest. This has the advantage of being very liberal in what the service will accept.

The trouble comes when things like the ordering of elements start to matter. It could be argued that ordering of the structures shouldn't be important, that you can indicate ordering with information in the data itself (dates, times or index properties, for example). In your trivial case, the ordering doesn't actually matter, since you can read and comprehend the information regardless of the order it's presented in. I am sure your actual case is much more valid, so I won't labour this point further.

In order to validate the XML structure, you need access to the message in the WCF pipeline. The easiest way in is to use an IClientMessageInspector implementation which validates the message, and to attach it to your client using a behaviour.

Assuming you want to do this with XML schema validation against an XSD, you would create an inspector like this:

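using System.Collections.Generic;
using System.Linq;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;
using System.Xml;
using System.Xml.Schema;

class XsdValidationInspector : IClientMessageInspector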
{
    private readonly XmlSchemaSet _schemas;

    public XsdValidationInspector(XmlSchemaSet schemas)
    {
        this._schemas = schemas;
    }

    public void AfterReceiveReply(ref Message reply, object correlationState)
    {
        // Buffer the message so we can read multiple times.
        // CreateBufferedCopy requires a maximum buffer size.
        var buffer = reply.CreateBufferedCopy(int.MaxValue);

        // Validate the message content.
        var message = buffer.CreateMessage();

        using (var bodyReader
            = message.GetReaderAtBodyContents().ReadSubtree())
        {
            var settings = new XmlReaderSettings
            {
                Schemas = this._schemas,
                ValidationType = ValidationType.Schema,
            };

            var events = new List<ValidationEventArgs>();
            settings.ValidationEventHandler += (sender, e) => events.Add(e);

            using (var validatingReader
                = XmlReader.Create(bodyReader, settings))
            {
                // Read to the end of the body.
                while(validatingReader.Read()) {  }
            }

            if (events.Any())
            {
                // TODO: Examine events and decide whether to throw exception.
            }
        }

        // Assign a copy to be passed to the next component.
        reply = buffer.CreateMessage();
    }

    public object BeforeSendRequest(
        ref Message request,
        IClientChannel channel)
    {
        // No correlation state is needed for validation.
        return null;
    }
}
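One possible way to fill in the TODO in the inspector is to fail the call whenever at least one event has error severity. This is only a sketch: `ValidationEvents.ThrowIfErrors` is a hypothetical helper name, and throwing a `ProtocolException` (rather than some other exception type) is an assumption about how you want the failure to surface to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.ServiceModel;
using System.Xml.Schema;

static class ValidationEvents
{
    // Hypothetical helper: throws when any collected validation event
    // is an error. Warnings are ignored.
    public static void ThrowIfErrors(IEnumerable<ValidationEventArgs> events)
    {
        var errors = events
            .Where(e => e.Severity == XmlSeverityType.Error)
            .Select(e => e.Message)
            .ToList();

        if (errors.Count > 0)
        {
            throw new ProtocolException(
                "Reply failed schema validation:" + Environment.NewLine +
                string.Join(Environment.NewLine, errors));
        }
    }
}
```

In `AfterReceiveReply`, the `if (events.Any())` block would then become a single call to `ValidationEvents.ThrowIfErrors(events)`. Note that `XmlReaderSettings` does not report validation warnings unless `ValidationFlags` includes `XmlSchemaValidationFlags.ReportValidationWarnings`, so a reply whose root element is not covered by any schema in the set may pass without raising any event at all.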

The accompanying validation behaviour isn't especially complicated:

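using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;
using System.Xml.Schema;

class XsdValidationBehavior : IEndpointBehavior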
{
    private readonly XmlSchemaSet _schemas;

    public XsdValidationBehavior(XmlSchemaSet schemas)
    {
        this._schemas = schemas;
    }

    public void AddBindingParameters(
        ServiceEndpoint endpoint,
        BindingParameterCollection bindingParameters) {}

    public void ApplyClientBehavior(
        ServiceEndpoint endpoint,
        ClientRuntime clientRuntime)
    {
        clientRuntime.MessageInspectors.Add(
            new XsdValidationInspector(this._schemas));
    }

    public void ApplyDispatchBehavior(
        ServiceEndpoint endpoint,
        EndpointDispatcher endpointDispatcher) {}

    public void Validate(ServiceEndpoint endpoint){}
}

You can either create some configuration elements and apply the behaviour via config, or you can do so programmatically by adding it to the client's endpoint before you open the client connection. Here's the programmatic approach:

var schemaMarkup =  @"<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
       <xsd:element name='Program'>
        <xsd:complexType>
         <xsd:sequence>
          <xsd:element name='One' minOccurs='1' maxOccurs='1'/>
          <xsd:element name='Two' minOccurs='1' maxOccurs='1'/>
         </xsd:sequence>
        </xsd:complexType>
       </xsd:element>
      </xsd:schema>";

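XmlSchema schema;
using (var stringReader = new StringReader(schemaMarkup))
{
    var events = new List<ValidationEventArgs>();

    // XmlSchema.Read is static and returns the parsed schema.
    schema = XmlSchema.Read(stringReader, (sender, e) => events.Add(e));

    // TODO: Check events for any errors.
}

// XmlSchemaSet does not support collection initializers; use Add instead.
var schemas = new XmlSchemaSet();
schemas.Add(schema);

var validation = new XsdValidationBehavior(schemas);

// Add the behaviour before the client's channel is created or opened.
client.Endpoint.Behaviors.Add(validation);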
Paul Turner
  • I have tried this approach, but referencing outside schema's are not ideal as the service contract might be changing (more work for me, trying to do less ;p). If there is a way to get `XmlSchemaSet` from the generated code structure and getting it working (would not for me) then this would be the ideal solution. – leppie Jul 24 '14 at 14:16
  • If you're consuming a WSDL, it will contain a reference to the XML schema of the messages the service exposes. The issue you may encounter is that the service may not have the strong constraints on element ordering that you're trying to test for. – Paul Turner Jul 25 '14 at 08:03
  • Seems I cannot extend/split the bounty :( I appreciate your efforts, but based purely on rep points, I will go with the one with lower rep. Sorry. Will try upvote a few of your answers to make up for it :) – leppie Jul 31 '14 at 09:36
  • Let me know if 'vote spam' was detected, and I will redo if needed. – leppie Jul 31 '14 at 09:38
  • You needn't worry, leppie; I'm not here for the internet points, just glad you got something that worked. – Paul Turner Jul 31 '14 at 13:47
4

I would suggest the same implementation as Tragedian, i.e. create a client message inspector that is added to the service endpoint and performs schema validation on all incoming messages.

Dynamic validation with Local Service Schema

Below is an example for dynamically loading the schema fetched originally from the service that was used to generate the service reference. This way you can always update the service and never have to change this code to validate the xml with the schema.

This uses the service reference you already have to load the existing schema in your solution (you can see that schema information in the ServiceReference folder inside your project using a file explorer).

using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;
using System.Xml.Schema;

namespace ConsoleApplication1
{
    class Program
    {
        class XsdValidationInspector : IClientMessageInspector ... //omitted for clarity
        class XsdValidationBehavior : IEndpointBehavior ... //omitted for clarity

        static void Main(string[] args)
        {
            ContractDescription cd = ContractDescription.GetContract(typeof(ServiceReference1.IService1));

            WsdlExporter exporter = new WsdlExporter();

            exporter.ExportContract(cd);

            XmlSchemaSet set = exporter.GeneratedXmlSchemas;

            // Client implementation omitted for clarity sake.
            var client = <some client here>; //omitted for clarity

            // Pass the schema set generated by the exporter above.
            var validation = new XsdValidationBehavior(set);

            client.Endpoint.Behaviors.Add(validation);
        }
    }
}

Dynamic Check of Service End Point Schema

In line with your comment about not wanting to change a hard-coded schema and/or objects, I have added below a way to fetch the schema dynamically from the service endpoint. I would suggest that you cache it.

You can even use this to detect when the service endpoint has changed, e.g. save the schema to disk when you first generate the service reference, then fetch the schema from the endpoint every day, compare the two, and log or report any differences.

See below an example of how to do this.

using System;
using System.IO;
using System.Net;
using System.Web.Services.Description;
using System.Text;
using System.Xml.Schema;

namespace ConsoleApplication1
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            //Build the URL request string
            UriBuilder uriBuilder = new UriBuilder(@"http://myservice.local/xmlbooking.asmx");
            uriBuilder.Query = "WSDL";

            HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(uriBuilder.Uri);
            webRequest.ContentType = "text/xml;charset=\"utf-8\"";
            webRequest.Method = "GET";
            webRequest.Accept = "text/xml";

            //Submit a web request to get the web service's WSDL
            ServiceDescription serviceDescription;
            using (WebResponse response = webRequest.GetResponse())
            {
                using (Stream stream = response.GetResponseStream())
                {
                    serviceDescription = ServiceDescription.Read(stream);
                }
            }

            Types types = serviceDescription.Types;
            XmlSchema xmlSchema = types.Schemas[0];

            // Client implementation omitted for clarity sake.
            var client = some client here;

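            // XmlSchemaSet does not support collection initializers; use Add instead.
            var schemas = new XmlSchemaSet();
            schemas.Add(xmlSchema);

            var validation = new XsdValidationBehavior(schemas);

            client.Endpoint.Behaviors.Add(validation);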

        }
    }
}

This way you don't need to regenerate the schema every time as it will always pick up the latest schema.

dmportella
  • It's probably incorrect to dynamically reference the schema, unless the client is going to be dynamically updated as well. This leads to the situation where you can have multiple versions of the schemas involved, and it's better to be consistently wrong than occasionally correct. – Paul Turner Jul 29 '14 at 09:25
  • Just noted as he said he didnt want to make the changes manually. This way it is dynamic ... – dmportella Jul 29 '14 at 10:58
  • I like both this approach and @Tragedian's one, but is there not a way to get the schemas from the generated code's structure like `xsd.exe` does? (well, sure there is, but I am lacking on time to dig it out) – leppie Jul 30 '14 at 04:56
  • I have an example so as soon as i get into work i will post it. – dmportella Jul 30 '14 at 06:25
  • @leppie Hi I have updated my answer this should be exactly what you want now. – dmportella Jul 30 '14 at 07:53
  • Cool, will test it soon :) – leppie Jul 30 '14 at 09:45
  • Will only be able to test tomorrow, but will re-open bounty if it expires. – leppie Jul 30 '14 at 13:46
  • So close, yet so far! `exporter.GeneratedXmlSchemas` always contains 0 elements... :( Looking if I need to call something else perhaps. – leppie Jul 31 '14 at 05:45
  • Did you call the exporter.ExportContract(cd) – dmportella Jul 31 '14 at 05:54
  • @dmportella: Yes I did, it populates `GeneratedWsdlDocuments` though. – leppie Jul 31 '14 at 07:49
  • not sure what is different for you I have a working version here that shows the XmlSchemaSet being generated. there must be something strange about the service reference. what is the wsdl url I can try locally. – dmportella Jul 31 '14 at 08:24
  • @dmportella: service is internal... cant share :( I will re-open the bounty in 20 min when it expires, I feel we are very close! – leppie Jul 31 '14 at 08:48
  • Seems I could not renew bounty, so you get it. Thanks :) – leppie Jul 31 '14 at 09:37
3

You can configure svcutil to perform serialization with the DataContractSerializer:

/serializer:DataContractSerializer

Generates data types that use the Data Contract Serializer for serialization and deserialization.

Short Form: /ser:DataContractSerializer

DataContractSerializer will throw an exception if it encounters a mis-ordered element (it is sometimes painfully strict about element ordering) or other problems.
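A small sketch of that strictness follows, under one stated assumption: the members are marked `IsRequired = true`. To my understanding, without that flag the serializer quietly skips an out-of-place element and leaves the property at its default value instead of throwing. `Reply` and `OrderCheck` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical contract; the empty namespace matches the
// unqualified <Program> element used in the question.
[DataContract(Name = "Program", Namespace = "")]
public class Reply
{
    [DataMember(Order = 0, IsRequired = true)]
    public string One { get; set; }

    [DataMember(Order = 1, IsRequired = true)]
    public string Two { get; set; }
}

public static class OrderCheck
{
    // Returns true when the DataContractSerializer rejects the document.
    public static bool IsRejected(string xml)
    {
        var serializer = new DataContractSerializer(typeof(Reply));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            try
            {
                serializer.ReadObject(stream);
                return false;
            }
            catch (SerializationException)
            {
                return true;
            }
        }
    }
}
```

With this contract, `OrderCheck.IsRejected` should return true for a document that lists `<Two>` before `<One>`, because the required member `One` is missing at the point where `Two` is read.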

ta.speot.is
  • This approach is great, but only if your contracts are compatible with the `DataContractSerializer`. They can't contain XML attributes for document data, only elements. – Paul Turner Jul 24 '14 at 11:38
  • Thanks! I will try that and report back :) – leppie Jul 24 '14 at 11:38
  • Unfortunately this is a no go... It spits out warnings about header, body and faults, and nothing gets generated :( Example: `Warning: The optional WSDL extension element 'fault' from namespace 'http://schemas.xmlsoap.org/wsdl/soap/' was not handled.` Am I using it wrong? – leppie Jul 24 '14 at 11:43
  • [Looks like SvcUtil + DataContractSerializer is quite demanding about what it will accept](http://stackoverflow.com/a/928915/242520) - see the link to the schema reference. I don't suppose the IBM Message Broken can generate different WSDLs. E.g. SOAP 1.1 or SOAP 1.2 and you could try both of them? – ta.speot.is Jul 24 '14 at 12:11
  • I can try ask, but I doubt anyone actually knows how to use it properly ;p – leppie Jul 24 '14 at 14:17
2

Now, I am not 100% sure of this, but I believe the action happens in the file XmlSerializerOperationFormatter.cs (System.ServiceModel),

namely in DeserializeBody:

private object DeserializeBody(XmlDictionaryReader reader, MessageVersion version, XmlSerializer serializer, MessagePartDescription returnPart, MessagePartDescriptionCollection bodyParts, object[] parameters, bool isRequest)
{
  try
  {
    if (reader == null)
      throw DiagnosticUtility.ExceptionUtility.ThrowHelperError((Exception) new ArgumentNullException("reader"));
    if (parameters == null)
      throw DiagnosticUtility.ExceptionUtility.ThrowHelperError((Exception) new ArgumentNullException("parameters"));
    object obj = (object) null;
    if (serializer == null || reader.NodeType == XmlNodeType.EndElement)
      return (object) null;
    object[] objArray = (object[]) serializer.Deserialize((XmlReader) reader, this.isEncoded ? XmlSerializerOperationFormatter.GetEncoding(version.Envelope) : (string) null);
    int num = 0;
    if (OperationFormatter.IsValidReturnValue(returnPart))
      obj = objArray[num++];
    for (int index = 0; index < bodyParts.Count; ++index)
      parameters[((Collection<MessagePartDescription>) bodyParts)[index].Index] = objArray[num++];
    return obj;
  }
  catch (InvalidOperationException ex)
  {
    throw DiagnosticUtility.ExceptionUtility.ThrowHelperError((Exception) new CommunicationException(System.ServiceModel.SR.GetString(isRequest ? "SFxErrorDeserializingRequestBody" : "SFxErrorDeserializingReplyBody", new object[1]
    {
      (object) this.OperationName
    }), (Exception) ex));
  }
}

As you can see, nobody hooks into XmlSerializer.UnknownElement here. Then again, we can't really be sure, because the XmlSerializer is passed in as a parameter. Long story short: it comes from either the replyMessageInfo.BodySerializer or the requestMessageInfo.BodySerializer property of XmlSerializerOperationFormatter, and these are set up in the XmlSerializerOperationFormatter constructor.

A few steps further (well, 20983832972389 steps further, as the source code is madness), I do not see any handlers being attached to the XmlSerializer, which supports what you've described.

Possible solutions:

1) Use XmlSerializerOperationBehavior as a base and write your own "custom serializer". This is a perfectly nice example of how to write a custom serializer: http://code.google.com/p/protobuf-net/source/browse/trunk/protobuf-net/ServiceModel/

You might be able to reuse some of the parts in XmlSerializerOperationBehavior. Maybe add some kind of error reporting.

2) I have never been a fan of doing XML validation through XmlSerializer.

XmlSerializer is meant to serialize and deserialize objects, and that's it. A partially constructed object is a nightmare. What I strongly suggest (and what I follow myself when using XmlSerializer) is to validate the XML against a schema first and THEN deserialize.

All things aside, @CodeCaster's suggestion is nice.

Erti-Chris Eelmaa
  • I might be able to reflect into `ServiceEndpoint` and modify the `XmlSerializer` of `MessageInfo.BodySerializer`. A bit deep, but doable :D – leppie Jul 24 '14 at 12:03