19

I know it is possible to generate a skeleton XSD from an XML file. For example, this post has good answers.

The question is how to generate an XSD based on several XML files. Each XML file might contain different occurrences of optional elements, arrays, choices and the like. From all those examples, I would like to compose the most accurate XSD.

I know there might be collisions and the like, but assuming all the XML files came from an unknown XSD, it should be theoretically possible. Is there such a tool?

Thanks

Community
Ragoler
  • Possible duplicate of [Any tools to generate an XSD schema from an XML instance document?](https://stackoverflow.com/questions/74879/any-tools-to-generate-an-xsd-schema-from-an-xml-instance-document) – simon04 May 16 '18 at 14:04

5 Answers

18

Trang is just such a tool, written in Java by James Clark. It can translate between different schema languages: RELAX NG (both XML and compact syntax), old-school DTD, and XML Schema. It can also infer a schema from one or more XML files.

NOTE: The project has moved to GitHub. http://github.com/relaxng/jing-trang is the new location of the Trang repo.

If you run Ubuntu, Trang is packaged in the universe repository, but that version seems a bit broken, and a clean download from the link above is probably your best option. Assuming trang.jar is in the current directory:

java -jar trang.jar -I xml -O xsd file1.xml file2.xml definition.xsd

should do what you want.
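
If there are many input files, the command can be assembled programmatically. Below is a minimal Python sketch, assuming java is on the PATH and trang.jar sits in the current directory as above. The function names and the directory argument are hypothetical, and only the flags already shown in the command above are used.

```python
import glob
import os
import subprocess


def build_trang_command(xml_files, xsd_output, jar_path="trang.jar"):
    """Build the command line shown above for an arbitrary list of inputs."""
    if not xml_files:
        raise ValueError("at least one XML input file is required")
    # -I xml: the inputs are XML instances; -O xsd: the output is an XSD.
    return ["java", "-jar", jar_path, "-I", "xml", "-O", "xsd",
            *xml_files, xsd_output]


def run_trang_on_directory(directory, xsd_output):
    """Infer one XSD from every *.xml file in a directory (hypothetical helper)."""
    inputs = sorted(glob.glob(os.path.join(directory, "*.xml")))
    # check=True raises CalledProcessError if Trang exits with an error.
    subprocess.run(build_trang_command(inputs, xsd_output), check=True)
```

Calling run_trang_on_directory("samples", "definition.xsd") would then feed every XML file under samples/ to Trang in one invocation.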

Mauricio Gracia Gutierrez
Knut Haugen
  • The Trang homepage still links to Google Code, but the project has moved to GitHub. For anyone else who finds it in the future, https://github.com/relaxng/jing-trang is the new location of the Trang repo. – rmunn Aug 04 '16 at 03:40
  • @ryanStull: Can you give an example? I've just made an XSD using Trang from a carefully constructed set of four XML files, and another one using freeformatter.com from a single file. They are very different, and it'll be a while before I have time to work out which one to start adding my own validations to. – andrew lorien Sep 16 '16 at 08:00
8

.NET 4.5 has schema inference built in:

https://msdn.microsoft.com/en-us/library/xz2797k1(v=vs.110).aspx

It can accept multiple sources.

I needed this, so I wrote the code and might as well share it. It is a console application that takes multiple file paths: the first path is the XSD file to write, and the remaining paths are the input XML files.

using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

namespace SchemaInferrer
{
    class Program
    {
        // Usage: SchemaInferrer <output.xsd> <input1.xml> [<input2.xml> ...]
        static void Main(string[] args)
        {
            if (args.Length < 2)
            {
                Console.WriteLine("Usage: SchemaInferrer <output.xsd> <input1.xml> [<input2.xml> ...]");
                return;
            }

            // First argument is the output XSD; the rest are input XML files.
            string xsdFile = args[0];
            string[] xmlFiles = new string[args.Length - 1];
            Array.Copy(args, 1, xmlFiles, 0, xmlFiles.Length);

            if (!FilesExist(xmlFiles))
            {
                Console.WriteLine("One or more input files do not exist.");
                return;
            }

            Console.WriteLine("All files exist, good to infer...");
            XmlSchemaInference inference = new XmlSchemaInference();
            XmlSchemaSet schemaSet = null;

            foreach (string xmlFile in xmlFiles)
            {
                // Dispose each reader so the input file handle is released.
                using (XmlReader reader = XmlReader.Create(xmlFile))
                {
                    // The first call creates the schema set; later calls refine it.
                    schemaSet = (schemaSet == null)
                        ? inference.InferSchema(reader)
                        : inference.InferSchema(reader, schemaSet);
                }
            }

            XmlWriterSettings xmlWriterSettings = new XmlWriterSettings()
            {
                Indent = true,
                IndentChars = "\t"
            };

            // Dispose the writer so the output is flushed to disk.
            // Writing everything to one file assumes the inputs share a single
            // namespace; the set holds one schema per namespace otherwise.
            using (XmlWriter writer = XmlWriter.Create(xsdFile, xmlWriterSettings))
            {
                foreach (XmlSchema schema in schemaSet.Schemas())
                {
                    schema.Write(writer);
                }
            }
            Console.WriteLine("Finished, wrote file to {0}...", xsdFile);
        }

        // Returns true only if every listed file exists.
        static bool FilesExist(string[] files)
        {
            foreach (string file in files)
            {
                if (!File.Exists(file))
                    return false;
            }
            return true;
        }
    }
}
S Meaden
3

This was the link I was looking for. Just thought I would share in case it helps someone else: http://blog.altova.com/generating-a-schema-from-multiple-xml-instances/

Mike Murphy
0

This code infers a schema from one XML document. It assumes there is an “XmlSchemaSet set” class member that accumulates the results and refines them from call to call:

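        // "set" is the XmlSchemaSet class member that accumulates results across calls;
        // textBox1 and grid1 come from the author's application.
        var si = new XmlSchemaInference();
        var reader = XmlReader.Create(new StringReader(textBox1.Text));
        // Infer from the new XML, refining whatever "set" already holds,
        // then take the first schema from the result.
        var en = si.InferSchema(reader, set).Schemas().GetEnumerator();
        en.MoveNext();
        var schema = en.Current as XmlSchema;
        var stream = new MemoryStream();
        if (schema != null)
        {
            schema.Write(stream);
            set.Add(schema);
        }
        // Rewind the stream and read the serialized schema back as text.
        stream.Flush();
        stream.Position = 0;
        var streamReader = new StreamReader(stream);
        var str = streamReader.ReadToEnd();
        grid1.Model.LoadSchema(str);
        reader.Close();
        stream.Close();
        streamReader.Close();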

If you run it again and give XmlSchemaInference the generated schema together with another XML document, it will refine the schema.
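
To make the idea of refining a schema from several instances concrete, here is a toy sketch in Python. It is not how XmlSchemaInference works internally. It only illustrates the principle: for each parent/child pair, track the fewest and most times the child was seen, which corresponds roughly to minOccurs and maxOccurs. The function name and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def infer_occurrences(xml_documents):
    """Return {(parent_tag, child_tag): (min_count, max_count)} over all inputs."""
    # parent tag -> one Counter of child tags per parent element seen
    observed = defaultdict(list)
    for text in xml_documents:
        root = ET.fromstring(text)
        for parent in root.iter():
            observed[parent.tag].append(Counter(child.tag for child in parent))

    bounds = {}
    for parent_tag, counters in observed.items():
        # Every child tag seen under this parent in any document.
        child_tags = set().union(*counters)
        for child_tag in child_tags:
            # A Counter yields 0 for a child that is absent from one parent.
            counts = [counter[child_tag] for counter in counters]
            bounds[(parent_tag, child_tag)] = (min(counts), max(counts))
    return bounds


# Two hypothetical instances: the second adds a repeated <item> and a <note>.
doc1 = "<order><id>1</id><item>a</item></order>"
doc2 = "<order><id>2</id><item>a</item><item>b</item><note>x</note></order>"

for pair, (low, high) in sorted(infer_occurrences([doc1, doc2]).items()):
    print(pair, "min =", low, "max =", high)
```

With only doc1, every child would look mandatory and single. Adding doc2 reveals that note is optional (minimum 0) and that item can repeat (maximum 2), which is why feeding more instances gives a more accurate schema.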

Ragoler
0

I use this: https://xmlbeans.apache.org/docs/2.0.0/guide/tools.html#inst2xsd

It takes several XML instances and creates an XSD for you. There are 3 "schema design types" you can choose from. The default one works well for me.

It's a great tool and I have been using it for years. I'm not sure whether the project is still active, though. Give it a try.

J. Lu