
I'm using the Mulan library for multi-label classification. The learner I'm using is RAkEL. I followed Mulan's getting-started instructions: http://mulan.sourceforge.net/starting.html

My label xml file:

<labels xmlns="http://mulan.sourceforge.net/labels"> 
  <label name="1"/> 
  <label name="2"/> 
  <label name="3"/> 
  <label name="4"/> 
  <label name="5"/> 
</labels>

My training data file:

@relation predict_label
@attribute 12345 numeric
@attribute A numeric
@attribute B numeric
@attribute C numeric
@attribute D numeric
@attribute E numeric

@attribute 1 {0, 1}
@attribute 2 {0, 1}
@attribute 3 {0, 1}
@attribute 4 {0, 1}
@attribute 5 {0, 1}

@data
2,3,2,2,2,2,1,0,0,0,0

2,2,3,2,2,2,0,1,0,0,0

2,2,2,3,2,2,0,0,1,0,0

2,2,2,2,3,2,0,0,0,1,0

2,2,2,2,2,3,0,0,0,0,1

My testing data file:

@relation catalog_ml
@attribute 12345 numeric
@attribute A numeric
@attribute B numeric
@attribute C numeric
@attribute D numeric
@attribute E numeric

@attribute 1 {0, 1}
@attribute 2 {0, 1}
@attribute 3 {0, 1}
@attribute 4 {0, 1}
@attribute 5 {0, 1}

@data
2,2,2,2,2,3,0,0,0,0,0

The result I got after running the prediction:

Bipartion: [false, false, false, false, false] Confidences: [0.0, 0.0, 0.0, 0.0, 0.0] Ranking: [5, 4, 3, 2, 1]Predicted values: null

My questions are:
1. Can somebody help me verify what I did wrong?
2. As I understand it, the ranking [5, 4, 3, 2, 1] gives the positions of the labels in the XML label file. Is my understanding correct? Why is the ranking order not from 1 to 5?
3. Are the predicted values null because this is a multi-label classification test? If not, which learner returns predicted values that are not null?

Thank you very much. Any suggestions or comments are more than welcome.

Xitrum

1 Answer


I'm also pretty new to Mulan, but here is what I can say.

  1. Can somebody help me verify what I did wrong?

You didn't do anything wrong in particular. You just did not give the classifier enough information to classify your test sample. I added some random lines to your training set:

@relation predict_label
@attribute 12345 numeric
@attribute A numeric
@attribute B numeric
@attribute C numeric
@attribute D numeric
@attribute E numeric

@attribute 1 {0, 1}
@attribute 2 {0, 1}
@attribute 3 {0, 1}
@attribute 4 {0, 1}
@attribute 5 {0, 1}

@data
2,3,2,2,2,2,1,0,0,0,0
2,2,3,2,2,2,0,1,0,0,0
2,2,2,3,2,2,0,0,1,0,0
2,2,2,2,3,2,0,0,0,1,0
2,2,2,2,2,3,0,0,0,0,1
2,2,2,2,2,2,1,0,1,1,0
1,2,3,4,6,7,0,0,0,1,1
5,4,3,2,1,0,1,1,1,1,1
9,8,7,5,4,3,0,1,1,0,0
1,2,3,2,1,0,0,1,1,1,1
1,5,6,8,9,0,1,1,0,0,1

and got the following result:

Bipartion: [false, false, false, false, false] Confidences: [0.16666666666666666, 0.0, 0.0, 0.16666666666666666, 0.3333333333333333] Ranking: [3, 5, 4, 2, 1]Predicted values: null

Bipartition is the actual prediction here (one true/false value per label), and Confidences indicates how confident the classifier is in each of those labels. It is indeed not very confident, but that is because of the "bad" training data set.
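To make the relation between the two arrays concrete, here is a minimal sketch in plain Java. It does not call Mulan at all. It assumes that a label is predicted true once its confidence reaches a cut-off, and the 0.5 cut-off as well as the names `BipartitionSketch` and `toBipartition` are my own assumptions for illustration, not something taken from the Mulan documentation. Under that assumption, the confidences above stay below the cut-off, which would match the all-false bipartition.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mulan: derives a true/false value
// per label from per-label confidences using a fixed cut-off.
class BipartitionSketch {

    // A label counts as predicted when its confidence reaches the threshold.
    // Whether the real learner uses ">=" or ">" is an assumption here.
    static boolean[] toBipartition(double[] confidences, double threshold) {
        boolean[] bipartition = new boolean[confidences.length];
        for (int i = 0; i < confidences.length; i++) {
            bipartition[i] = confidences[i] >= threshold;
        }
        return bipartition;
    }

    public static void main(String[] args) {
        // Confidences copied from the result shown above.
        double[] confidences = {1.0 / 6, 0.0, 0.0, 1.0 / 6, 1.0 / 3};
        // 0.5 is an assumed cut-off, chosen only for this illustration.
        System.out.println(Arrays.toString(toBipartition(confidences, 0.5)));
    }
}
```

With a lower cut-off such as 0.3, the same sketch would mark only the fifth label as true, which shows why low confidences alone are enough to produce an all-false result.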

  2. As I understand it, the ranking [5, 4, 3, 2, 1] gives the positions of the labels in the XML label file. Is my understanding correct? Why is the ranking order not from 1 to 5?

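The ranking simply shows which of your labels the classifier was most confident about: each entry is the rank of the corresponding label, with 1 being the most confident. In your result all confidences are 0, so the labels are listed more or less at random, or in whatever order a sort function puts them when it has no information to go on. As you can see in my example, the ranking follows the confidences: label 5 has the highest confidence (0.33) and gets rank 1.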
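The following plain-Java sketch shows one way such a ranking can be derived from the confidences. It is not Mulan's implementation. The class and method names are hypothetical, and the tie-break (the label with the higher index wins) is an assumption I picked only because it reproduces both rankings shown in this thread.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mulan: turns per-label confidences
// into per-label ranks, where rank 1 is the most confident label.
class RankingSketch {

    static int[] rankFromConfidences(double[] confidences) {
        int n = confidences.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort label indices by confidence, highest first.
        // Assumed tie-break: the higher label index comes first.
        Arrays.sort(order, (a, b) -> {
            int byConfidence = Double.compare(confidences[b], confidences[a]);
            return byConfidence != 0 ? byConfidence : Integer.compare(b, a);
        });
        // ranks[i] is the 1-based position of label i in the sorted order.
        int[] ranks = new int[n];
        for (int position = 0; position < n; position++) {
            ranks[order[position]] = position + 1;
        }
        return ranks;
    }

    public static void main(String[] args) {
        double[] allZero = {0.0, 0.0, 0.0, 0.0, 0.0};
        double[] fromAnswer = {1.0 / 6, 0.0, 0.0, 1.0 / 6, 1.0 / 3};
        System.out.println(Arrays.toString(rankFromConfidences(allZero)));    // [5, 4, 3, 2, 1]
        System.out.println(Arrays.toString(rankFromConfidences(fromAnswer))); // [3, 5, 4, 2, 1]
    }
}
```

So [5, 4, 3, 2, 1] does not refer to positions in the XML file. With all confidences tied at 0, the order is decided purely by the tie-break.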

  3. Are the predicted values null because this is a multi-label classification test? If not, which learner returns predicted values that are not null?

I actually don't know what the predicted values are for. If someone has an answer to that question, I would be happy to hear it too.

Edit

If you copy one of the training set lines into the test data set, you get Bipartition values that are not all false.

Florian H