Calculating Confusion Matrix for Two Text Files

Question

I would like to calculate a confusion matrix for two text files. Does anyone know of a library or tool, in either Python or shell, that can do this?

For example, I have two files:

FILE A:

FILE B:

Where I would get a confusion matrix:

   1   2
--------
1| 0   2
2| 0   2

Update: I would like to point out that the desired output in the original post includes row and column labels.

I would appreciate if you could have a look at this dear: https://stackoverflow.com/questions/44215561/python-creating-confusion-matrix-from-multiple-csv-files — Mahsolid, May 27 '17 at 14:42

kgully · Accepted Answer · 2016-10-27T16:35:09.563


This is probably overkill, but scikit-learn will do that pretty easily:

from sklearn.metrics import confusion_matrix

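# Read the data: one integer label per line, skipping blank lines
with open('file1', 'r') as infile:
    true_values = [int(line) for line in infile if line.strip()]
with open('file2', 'r') as infile:
    predictions = [int(line) for line in infile if line.strip()]

# Make confusion matrix (rows are true values, columns are predictions)
confusion = confusion_matrix(true_values, predictions)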

print(confusion)

With output:

[[0 2]
 [0 2]]

http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
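If pulling in scikit-learn feels too heavy for this, the same counts can be tallied with the standard library alone. Here is a minimal sketch, not a drop-in replacement: the helper names `read_labels` and `confusion_counts` are made up for illustration, and it assumes one integer label per line with both files the same length. Rows are true values and columns are predictions, as in the scikit-learn output.

```python
from collections import Counter


def read_labels(path):
    """Read one integer label per line, skipping blank lines (hypothetical helper)."""
    with open(path) as infile:
        return [int(line) for line in infile if line.strip()]


def confusion_counts(true_values, predictions):
    """Return (labels, matrix); rows are true labels, columns are predicted labels."""
    if len(true_values) != len(predictions):
        raise ValueError("both inputs must have the same number of labels")
    # Every label seen in either input, in sorted order
    labels = sorted(set(true_values) | set(predictions))
    # Count how often each (true, predicted) pair occurs
    pairs = Counter(zip(true_values, predictions))
    matrix = [[pairs[(t, p)] for p in labels] for t in labels]
    return labels, matrix


# Hypothetical data chosen to reproduce the matrix in the question
labels, matrix = confusion_counts([1, 1, 2, 2], [2, 2, 2, 2])
print(labels)   # [1, 2]
print(matrix)   # [[0, 2], [0, 2]]
```

With real files you would call something like `confusion_counts(read_labels('file1'), read_labels('file2'))`.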

Update: To print with labels, you could either convert the matrix to a pandas DataFrame or use something like this:

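def print_confusion(confusion):
    # Header row: column indices 0..n-1 (positions, not the class values)
    print('   ' + '  '.join(str(n) for n in range(confusion.shape[1])))
    # One line per row: the row index followed by that row's counts
    for rownum in range(confusion.shape[0]):
        print(str(rownum) + '  ' + '  '.join(str(n) for n in confusion[rownum]))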

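which prints the following (rows and columns are labelled by index, starting at 0, not by the class values in the files):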

   0  1
0  0  2
1  0  2
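To show the actual class values (1 and 2 in the question) instead of positions, the labels have to be passed in alongside the counts. A rough sketch, assuming the matrix is any row-by-row sequence (a NumPy array or a list of lists) and that `labels` is in the same order as its rows and columns; `format_confusion` is a name invented here, not part of scikit-learn. With scikit-learn you can fix that order by passing the same list as the `labels=` argument of `confusion_matrix`.

```python
def format_confusion(matrix, labels):
    """Return the matrix as text with class labels on rows and columns."""
    cells = [[str(value) for value in row] for row in matrix]
    names = [str(label) for label in labels]
    # Pad every entry to the widest label or count so columns line up
    width = max(len(s) for s in names + [c for row in cells for c in row])
    lines = [" " * width + " | " + "  ".join(n.rjust(width) for n in names)]
    for name, row in zip(names, cells):
        lines.append(name.rjust(width) + " | " + "  ".join(c.rjust(width) for c in row))
    return "\n".join(lines)


# Hypothetical labels and counts matching the example in the question
print(format_confusion([[0, 2], [0, 2]], labels=[1, 2]))
```

For the example data this gives a header line `  | 1  2` followed by `1 | 0  2` and `2 | 0  2`.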

edited Oct 27 '16 at 16:35

answered Oct 25 '16 at 19:17


yeah that works well. I would still like to avoid having to install sklearn all the time because it comes with a load of dependencies. – badner Oct 25 '16 at 22:02
Do you know how to get the column and row labels to print as well? I used numpy.savetxt(outfile, confusion, delimiter=",", fmt='%s') – badner Oct 26 '16 at 21:57
I would appreciate if you could have a look at this dear: https://stackoverflow.com/questions/44215561/python-creating-confusion-matrix-from-multiple-csv-files – Mahsolid May 27 '17 at 14:43
