3

This is a follow-up to the question "Dynamic classes/objects ML.net's PredictionMoadel<TInput, TOutput> Train()".

My system cannot use a class that is predefined at compile time, so I tried to feed a dynamic class into ML.NET, as shown below:

    // field data type
    public class Field
    {
        public string FieldName { get; set; }
        public Type FieldType { get; set; }
    }

    // dynamic class helper
    public class DynamicClass : DynamicObject
    {
        private readonly Dictionary<string, KeyValuePair<Type, object>> _fields;

        public DynamicClass(List<Field> fields)
        {
            _fields = new Dictionary<string, KeyValuePair<Type, object>>();
            fields.ForEach(x => _fields.Add(x.FieldName,
                new KeyValuePair<Type, object>(x.FieldType, null)));
        }

        public override bool TrySetMember(SetMemberBinder binder, object value)
        {
            if (_fields.ContainsKey(binder.Name))
            {
                var type = _fields[binder.Name].Key;
                if (value.GetType() == type)
                {
                    _fields[binder.Name] = new KeyValuePair<Type, object>(type, value);
                    return true;
                }
                else throw new Exception("Value " + value + " is not of type " + type.Name);
            }
            return false;
        }

        public override bool TryGetMember(GetMemberBinder binder, out object result)
        {
            result = _fields[binder.Name].Value;
            return true;
        }
    }

    private static void Main(string[] args)
    {
        var fields = new List<Field>
        {
            new Field {FieldName = "Name", FieldType = typeof(string)},
            new Field {FieldName = "Income", FieldType = typeof(float)}
        };

        dynamic obj1 = new DynamicClass(fields);
        obj1.Name = "John";
        obj1.Income = 100f;

        dynamic obj2 = new DynamicClass(fields);
        obj2.Name = "Alice";
        obj2.Income = 200f;

        var trainingData = new List<dynamic> {obj1, obj2};

        var env = new LocalEnvironment();
        var schemaDef = SchemaDefinition.Create(typeof(DynamicClass));
        schemaDef.Add(new SchemaDefinition.Column(null, "Name", TextType.Instance));
        schemaDef.Add(new SchemaDefinition.Column(null, "Income", NumberType.R4));
        var trainDataView = env.CreateStreamingDataView(trainingData, schemaDef);

        var pipeline = new CategoricalEstimator(env, "Name")
            .Append(new ConcatEstimator(env, "Features", "Name"))
            .Append(new FastTreeRegressionTrainer(env, "Income", "Features"));

        var model = pipeline.Fit(trainDataView);
    }

I got the error "No field or property with name 'Name' found in type 'System.Object'". I also tried generating the class using reflection, only to run into the same problem.

Is there a workaround? Thanks

HuyNA
  • Have you tried using ExpandoObject? https://stackoverflow.com/questions/20530134/is-it-possible-to-add-attributes-to-a-property-of-dynamic-object-runtime – CEO May 16 '20 at 06:05
  • I was not aware of ExpandoObject, good to know, thanks – HuyNA May 18 '20 at 18:58
  • 1
    I have a complete example of how to do this using a runtime generated class: https://stackoverflow.com/a/66913705/125406 – Michael Silver Apr 04 '21 at 01:07

3 Answers

7

For those attempting to do this, I have a working solution that creates the schema and can be used to train data dynamically.

First, grab the code for DynamicTypeProperty and DynamicType from my other answer here.

The following code will create a schema dynamically:

var properties = new List<DynamicTypeProperty>()
{
    new DynamicTypeProperty("SepalLength", typeof(float)),
    new DynamicTypeProperty("SepalWidth", typeof(float)),
    new DynamicTypeProperty("PetalLength", typeof(float)),
    new DynamicTypeProperty("PetalWidth", typeof(float)),
};

// create the new type
var dynamicType = DynamicType.CreateDynamicType(properties);
var schema = SchemaDefinition.Create(dynamicType);

You'll then need to create a list with the required data. This is done as follows:

var dynamicList = DynamicType.CreateDynamicList(dynamicType);

// get an action that will add to the list
var addAction = DynamicType.GetAddAction(dynamicList);

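// call the action with an object[] containing the values in the exact order the properties were added;
// the float literals (f suffix) match the typeof(float) properties declared above
addAction.Invoke(new object[] {1.1f, 2.2f, 3.3f, 4.4f});
// call the add action again for each additional row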

Then you'll need to create an IDataView with the data. This requires using reflection; otherwise the trainers won't infer the correct type.

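// requires: using System.Linq; using Microsoft.ML; using Microsoft.ML.Data;
var mlContext = new MLContext();
var dataType = mlContext.Data.GetType();
// find the generic LoadFromEnumerable<TRow> method and close it over the runtime type
var loadMethodGeneric = dataType.GetMethods().First(method => method.Name == "LoadFromEnumerable" && method.IsGenericMethod);
var loadMethod = loadMethodGeneric.MakeGenericMethod(dynamicType);
// an explicit object[] avoids relying on array type inference for the two arguments
var trainData = (IDataView) loadMethod.Invoke(mlContext.Data, new object[] {dynamicList, schema});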
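
To illustrate why reflection is needed here, the following is a minimal sketch that uses only the base class library. `DescribeRows` is a hypothetical stand-in for a generic API such as `LoadFromEnumerable<TRow>`, and `typeof(float)` stands in for a type that is only known at run time; neither comes from ML.NET. A normal call fixes the type argument at compile time, while `MakeGenericMethod` lets you supply it at run time.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class GenericInvokeDemo
{
    // Hypothetical stand-in for a generic API such as LoadFromEnumerable<TRow>.
    public static string DescribeRows<TRow>(IEnumerable<TRow> rows)
    {
        return typeof(TRow).Name + " rows: " + rows.Count();
    }

    public static void Main()
    {
        // Stands in for a type that is only known at run time.
        Type rowType = typeof(float);

        // Build a List<rowType> without naming the element type in source code.
        var rows = (IList)Activator.CreateInstance(typeof(List<>).MakeGenericType(rowType));
        rows.Add(1.1f);
        rows.Add(2.2f);

        // A normal call binds TRow at compile time, here to Object.
        Console.WriteLine(DescribeRows(rows.Cast<object>())); // prints "Object rows: 2"

        // Closing the generic method over rowType binds TRow to the runtime type.
        MethodInfo open = typeof(GenericInvokeDemo).GetMethod(nameof(DescribeRows));
        MethodInfo closed = open.MakeGenericMethod(rowType);
        Console.WriteLine(closed.Invoke(null, new object[] { rows })); // prints "Single rows: 2"
    }
}
```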

You should then be able to run the trainData through your pipeline.

Good luck.

Gary Holland
  • Thanks Gary! I was able to create a dynamic IDataView from your example. However, I have a follow-up question. Once you train a model, you have to create a Prediction Engine, which asks for the source class type to be declared. However, I'm unsure how to declare this class, since I'm using a static instance of a dynamic type. Am I misunderstanding how to use these classes? Or can you provide a brief example of how you might use the IDataView created in this example to instantiate a Prediction Engine? – andyopayne Mar 31 '21 at 15:10
  • @andyopayne, take a look at my answer to another similar question. I posted some sample code to do exactly this. https://stackoverflow.com/questions/66893993/ml-net-create-prediction-engine-using-dynamic-class/66913705#66913705 – Michael Silver Apr 02 '21 at 02:25
2

A dynamic class doesn't actually create a class definition; rather, it provides you with a dynamic object.
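
A quick way to see this is the minimal sketch below, which uses only the base class library. `ExpandoObject` stands in for the `DynamicClass` from the question (an assumption made for brevity). The members of a dynamic object are resolved at run time by the binder, so reflection finds nothing named `Name`, and a `List<dynamic>` is really a list of `System.Object`.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class DynamicVersusReflection
{
    public static void Main()
    {
        // ExpandoObject stands in for the question's DynamicClass (assumption).
        dynamic row = new ExpandoObject();
        row.Name = "John";

        var rows = new List<dynamic> { row };

        // At run time, the element type of List<dynamic> is System.Object.
        Type elementType = rows.GetType().GetGenericArguments()[0];
        Console.WriteLine(elementType); // prints "System.Object"

        // Reflection sees no "Name" member on the element type...
        Console.WriteLine(elementType.GetProperty("Name") == null); // prints "True"

        // ...nor on the runtime type of the dynamic object itself.
        Type runtimeType = ((object)row).GetType();
        Console.WriteLine(runtimeType.GetProperty("Name") == null); // prints "True"
    }
}
```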

I looked at the code for SchemaDefinition.Create(); it needs an actual class definition to build the schema. So your option is to create and load a class definition dynamically.

You can write your class as a string containing all the dynamic properties and compile it using the Microsoft compiler services, a.k.a. Roslyn. See here. This will generate an assembly (in memory, as a memory stream, or on the file system) containing your dynamic type.

Now you are only halfway there. To get your dynamic type out of the dynamic assembly, you need to load the assembly into the app domain. See this post. Once the assembly is loaded, use Assembly.GetType() to get the type. To create an object from the dynamically generated class, use Activator.CreateInstance() if the assembly is in the same domain, or yourDomain.CreateInstanceAndUnwrap() if it is in your custom domain.
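
The compile-and-load steps above can be sketched roughly as follows. This assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced and loads the assembly into the current domain; the `RuntimeClassCompiler`, `CompileType`, and `TrainingRow` names are hypothetical. Only the core library is referenced, which should be enough for a class with `string` and `float` fields, but other member types may need additional metadata references. I have not run the resulting type through ML.NET.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class RuntimeClassCompiler
{
    // Compiles C# source in memory and returns the named type from the new assembly.
    public static Type CompileType(string source, string typeName)
    {
        SyntaxTree syntaxTree = CSharpSyntaxTree.ParseText(source);

        // Reference only the assembly that defines System.Object (assumption: sufficient here).
        MetadataReference[] references =
        {
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
        };

        CSharpCompilation compilation = CSharpCompilation.Create(
            "DynamicRows_" + Guid.NewGuid().ToString("N"),
            new[] { syntaxTree },
            references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var stream = new MemoryStream())
        {
            var result = compilation.Emit(stream);
            if (!result.Success)
            {
                var messages = result.Diagnostics.Select(d => d.ToString());
                throw new InvalidOperationException(string.Join(Environment.NewLine, messages));
            }

            // Load the emitted bytes into the current domain and look up the type.
            Assembly assembly = Assembly.Load(stream.ToArray());
            return assembly.GetType(typeName, throwOnError: true);
        }
    }

    public static void Main()
    {
        // The source string would normally be generated from the field list.
        const string source =
            "public class TrainingRow { public string Name; public float Income; }";

        Type rowType = CompileType(source, "TrainingRow");
        object row = Activator.CreateInstance(rowType);
        rowType.GetField("Name").SetValue(row, "John");
        rowType.GetField("Income").SetValue(row, 100f);

        Console.WriteLine(rowType.GetField("Name").GetValue(row));
    }
}
```

The returned `Type` is a real class definition, so it is the kind of argument SchemaDefinition.Create() expects.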

A few samples are here. They are a little out of date, but they will get you on your feet if you are up for this. See CompilerEngine and CompilerService for compiling and loading the assembly.

Another option is System.Reflection.Emit, but it requires a great deal of IL-level coding. See this post.
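
For the simple case where the generated class only needs public fields, no IL has to be written by hand. The sketch below is one way this might look; the `RuntimeTypeFactory`, `CreateRowType`, and `TrainingRow` names are hypothetical, and the field list mirrors the one in the question. Properties with getters and setters would need emitted IL, which is where the extra work comes in. I have not tested this type against ML.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class RuntimeTypeFactory
{
    // Builds a public class with one public field per (name, type) pair.
    public static Type CreateRowType(string typeName, IEnumerable<KeyValuePair<string, Type>> fields)
    {
        AssemblyBuilder assemblyBuilder = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("DynamicRows"), AssemblyBuilderAccess.Run);
        ModuleBuilder moduleBuilder = assemblyBuilder.DefineDynamicModule("MainModule");
        TypeBuilder typeBuilder = moduleBuilder.DefineType(
            typeName, TypeAttributes.Public | TypeAttributes.Class);

        foreach (var field in fields)
        {
            typeBuilder.DefineField(field.Key, field.Value, FieldAttributes.Public);
        }

        // No constructor is defined, so a parameterless one is supplied automatically.
        return typeBuilder.CreateType();
    }

    public static void Main()
    {
        var fields = new Dictionary<string, Type>
        {
            ["Name"] = typeof(string),
            ["Income"] = typeof(float)
        };

        Type rowType = CreateRowType("TrainingRow", fields);
        object row = Activator.CreateInstance(rowType);
        rowType.GetField("Name").SetValue(row, "John");
        rowType.GetField("Income").SetValue(row, 100f);

        // Unlike a DynamicObject, the generated type exposes its members to reflection.
        Console.WriteLine(rowType.GetField("Name") != null); // prints "True"
    }
}
```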

vendettamit
0

Right now I'm using a dummy placeholder class like this as a workaround:

    public class TrainingSample
    {
        public string TextField1;
        public string TextField2;
        public string TextField3;
        public string TextField4;
        public string TextField5;

        public float FloatField1;
        public float FloatField2;
        public float FloatField3;
        public float FloatField4;
        public float FloatField5;
        public float FloatField6;
        public float FloatField7;
        public float FloatField8;
        public float FloatField9;
        public float FloatField10;
        public float FloatField11;
        public float FloatField12;
        public float FloatField13;
        public float FloatField14;
        public float FloatField15;
    }
HuyNA
  • 1
    You could instead have a vector of FloatFeatures and a vector of TextFeatures, and then just use encoding estimator steps that accept vector arguments. Most of the numeric ones accept vectors for the input column, and for text features there are a few (e.g. ApplyWordEmbedding). Ref: https://learn.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.text?view=ml-dotnet – cheft Jul 17 '19 at 22:23
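
As a rough sketch of that suggestion, the placeholder class could shrink to two fixed-size vector fields. This assumes the Microsoft.ML package is referenced; the field names are illustrative, and the sizes 5 and 15 simply mirror the number of text and float fields in the placeholder class above.

```csharp
using Microsoft.ML.Data;

public class TrainingSample
{
    // Fixed-size vector holding up to five text values.
    [VectorType(5)]
    public string[] TextFeatures;

    // Fixed-size vector holding up to fifteen numeric values.
    [VectorType(15)]
    public float[] FloatFeatures;
}
```

The vector sizes are still fixed at compile time, so this reduces the boilerplate but does not make the schema fully dynamic.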