
I have worked through all the Microsoft tutorials for ML.NET and now want to build my own models. I want to convert string[][] data to an IDataView object so that I can use it in an ML.NET model for binary classification.

So far I have always trained on data from external text or CSV files. Now I want to use data stored in a string[][] named data. The text values are in data[0] and the boolean values are in data[1].

I have not managed to convert the nested array into an IDataView object. I have already tried the following code:

public class BinaryData
{
    public string Text { get; set; }

    public bool Label { get; set; }
}

// The data is collected from an Excel table with some functions and saved in this nested array:
string[][] data = form.GetDataSelection().GetDataContainer().textCols;

BinaryData[] inMemoryCollection = new BinaryData[data[0].Length];
for (int i = 0; i < data[0].Length-1; i++)
{
    inMemoryCollection[i] = new BinaryData
    {
        Text = data[0][i],
        Label = Convert.ToBoolean(Convert.ToInt64(data[1][i]))
    };
}

IDataView dataView = mlContext.Data.LoadFromEnumerable<BinaryData>(inMemoryCollection);

My implementation is based on the tutorial from Microsoft.

Everything works until I call the Fit() method, which throws the following exception:

System.InvalidOperationException: 'Splitter/consolidator worker encountered exception while consuming source data'

I hope somebody can help me out here. Many thanks in advance!

Nick.exe
  • You're using a jagged array [][]. Perhaps try a two-dimensional array [,]. If that doesn't work and you don't get a good answer, reply to this comment and I'll see if I can dig in this evening. – Eric J. Sep 05 '19 at 17:54
  • Hey Eric, thank you very much for your help! It works fine. As I'm editing an Excel add-in for a university project, I first convert the given jagged array to a two-dimensional array with a method and then use the for loop to convert it to my inMemoryCollection. – Nick.exe Sep 05 '19 at 21:10
  • Why don't you post that as an answer to your own question, showing how you do the conversion? Future visitors to the question are more likely to see an answer than a comment. – Eric J. Sep 05 '19 at 21:13

1 Answer


It works with a two-dimensional array [,]. I used the method from this post to convert the jagged array [][] into a two-dimensional array and changed my code a little:

string[][] data_jagged = form.GetDataSelection().GetDataContainer().textCols;
string[,] data = To2D(data_jagged);

BinaryData[] inMemoryCollection = new BinaryData[data_jagged[0].Length];
for (int i = 0; i < data_jagged[0].Length; i++)
{
    inMemoryCollection[i] = new BinaryData
    {
        Text = data[0,i],
        Label = Convert.ToBoolean(Convert.ToInt64(data[1,i]))
    };
}

IDataView dataView = mlContext.Data.LoadFromEnumerable<BinaryData>(inMemoryCollection);
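
The To2D method itself comes from the linked post and is not reproduced here. For readers who need something comparable, the following is only a minimal sketch of what such a helper might look like. The class name JaggedArrayExtensions and the choice to throw on uneven rows are assumptions for illustration, not the code from that post.

```csharp
using System;

public static class JaggedArrayExtensions
{
    // Hypothetical helper (not the implementation from the linked post):
    // copies a rectangular jagged array into a two-dimensional array.
    // Assumes every inner array has the same length and throws otherwise.
    public static string[,] To2D(string[][] source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        if (source.Length == 0)
        {
            return new string[0, 0];
        }

        if (source[0] == null)
        {
            throw new ArgumentException("Inner arrays must not be null.", nameof(source));
        }

        int rows = source.Length;
        int cols = source[0].Length;
        var result = new string[rows, cols];

        for (int r = 0; r < rows; r++)
        {
            // Reject ragged input instead of silently padding or truncating.
            if (source[r] == null || source[r].Length != cols)
            {
                throw new ArgumentException(
                    "All inner arrays must have the same length.", nameof(source));
            }

            for (int c = 0; c < cols; c++)
            {
                result[r, c] = source[r][c];
            }
        }

        return result;
    }
}
```

With this sketch, the call in the code above would be written as JaggedArrayExtensions.To2D(data_jagged), and data[0,i] and data[1,i] then hold the same values as data_jagged[0][i] and data_jagged[1][i].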

Thanks to Eric for his help.
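
One detail worth noting, offered as an observation rather than a confirmed diagnosis: the loop in the question runs while i < data[0].Length-1, whereas the loop in this answer runs while i < data_jagged[0].Length. With the first bound, the last slot of inMemoryCollection is never assigned and stays null. A null element in the collection passed to LoadFromEnumerable is a plausible cause of the exception at Fit(), although the post does not verify this. The sketch below builds the collection directly from the jagged array with the full loop bound; the helper name BinaryDataBuilder.FromJagged is hypothetical, and it assumes the labels are stored as the strings "0" and "1".

```csharp
using System;

public class BinaryData
{
    public string Text { get; set; }

    public bool Label { get; set; }
}

public static class BinaryDataBuilder
{
    // Hypothetical helper: data[0] holds the texts, data[1] holds the labels
    // as integer strings such as "0" and "1" (an assumption about the input).
    public static BinaryData[] FromJagged(string[][] data)
    {
        int count = data[0].Length;
        var rows = new BinaryData[count];

        // Iterate over every index, so no element of rows is left null.
        for (int i = 0; i < count; i++)
        {
            rows[i] = new BinaryData
            {
                Text = data[0][i],
                Label = Convert.ToBoolean(Convert.ToInt64(data[1][i]))
            };
        }

        return rows;
    }
}
```

The resulting array could then be passed to mlContext.Data.LoadFromEnumerable<BinaryData>(...) in the same way as inMemoryCollection above.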

Nick.exe