
I want to use Mulan to classify some data, but I get an exception:

mulan.data.DataLoadException: Error creating Instances data from supplied Reader data source
at mulan.data.MultiLabelInstances.loadInstances(MultiLabelInstances.java:469)
at mulan.data.MultiLabelInstances.loadInstances(MultiLabelInstances.java:458)
at mulan.data.MultiLabelInstances.<init>(MultiLabelInstances.java:168)

The main function is taken from mulan.examples.TrainTestExperiment:

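import mulan.classifier.transformation.BinaryRelevance;
import mulan.data.MultiLabelInstances;
import mulan.evaluation.Evaluation;
import mulan.evaluation.Evaluator;
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.Utils;
import weka.filters.Filter;
import weka.filters.unsupervised.instance.RemovePercentage;

public class TrainTestExperiment {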

    public static void main(String[] args) {
        try {
            String path = Utils.getOption("path", args); // e.g. -path dataset/
            String filestem = Utils.getOption("filestem", args); // e.g. -filestem emotions
            String percentage = Utils.getOption("percentage", args); // e.g. -percentage 50 (for 50%)

            System.out.println("Loading the dataset");
            MultiLabelInstances mlDataSet = new MultiLabelInstances(path + filestem + ".arff", path + filestem + ".xml");

            // split the data set into train and test
            Instances dataSet = mlDataSet.getDataSet();
            RemovePercentage rmvp = new RemovePercentage();
            rmvp.setInvertSelection(true);
            rmvp.setPercentage(Double.parseDouble(percentage));
            rmvp.setInputFormat(dataSet);
            Instances trainDataSet = Filter.useFilter(dataSet, rmvp);

            rmvp = new RemovePercentage();
            rmvp.setPercentage(Double.parseDouble(percentage));
            rmvp.setInputFormat(dataSet);
            Instances testDataSet = Filter.useFilter(dataSet, rmvp);

            MultiLabelInstances train = new MultiLabelInstances(trainDataSet, path + filestem + ".xml");
            MultiLabelInstances test = new MultiLabelInstances(testDataSet, path + filestem + ".xml");

            Evaluator eval = new Evaluator();
            Evaluation results;

            Classifier brClassifier = new NaiveBayes();
            BinaryRelevance br = new BinaryRelevance(brClassifier);
            br.setDebug(true);
            br.build(train);
            results = eval.evaluate(br, test);
            System.out.println(results);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

As for the data format, I have one dimension called title and 160 categories.

The data file is formatted according to the ARFF format.

Some of the text is in Chinese.

Any help is appreciated.

Best regards

demongolem
zwang

1 Answer


This looks like a bug in Mulan.

Check out here for more details of the bug.
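Since the question mentions that some of the text is in Chinese, it may also be worth ruling out a file-encoding problem before digging further. This is only a guess, nothing in the stack trace confirms it. The sketch below uses only the JDK (no Mulan or Weka calls) and checks whether a file decodes as strict UTF-8. The class name ArffEncodingCheck, the helper isValidUtf8, and the example path are hypothetical names made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ArffEncodingCheck {

    /**
     * Returns true if the bytes decode as strict UTF-8.
     * Malformed input is reported as an error instead of being
     * silently replaced, which is the decoder's lenient alternative.
     */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java ArffEncodingCheck dataset/emotions.arff
        if (args.length != 1) {
            System.err.println("usage: java ArffEncodingCheck <file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes)
                ? "File decodes as UTF-8"
                : "File is NOT valid UTF-8 (possibly saved in another encoding)");
    }
}
```

If the file turns out not to be valid UTF-8, re-saving it as UTF-8 and running the JVM with -Dfile.encoding=UTF-8 is a cheap thing to try. If it is valid, the encoding is probably not the culprit, and the "Caused by" section of the full stack trace should say more about what actually failed.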

Jayamohan