7

Does anyone know whether there is a way to output the classification report as a text file or a CSV file?

In Python, the line print(metrics.classification_report(y_test, y_pred)) prints the classification report. I would like to have this report in CSV format.

I tried to copy and paste it, but the columns get lumped together. Any help appreciated!

Ahmad
user8034918
  • https://stackoverflow.com/questions/28200786/how-to-plot-scikit-learn-classification-report – perigon Jul 10 '17 at 04:02
  • I just posted a full example where I save the metrics.classification_report(y_test, y_pred) to a csv file. – seralouk Jul 10 '17 at 07:15
  • Just use the function [precision_recall_fscore_support](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html). – TomDLT Jul 10 '17 at 10:54
  • (at)user8034918 Did my answer work? – seralouk Jul 10 '17 at 15:24

5 Answers

12

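classification_report has an output_dict parameter (available since scikit-learn 0.20) that solves this exact problem: with output_dict=True it returns the report as a dictionary instead of a string.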

import pandas as pd
from sklearn.metrics import classification_report

report_dict = classification_report(y_true, y_pred, output_dict=True)
pd.DataFrame(report_dict)

After converting the dictionary into a dataframe, you can write it to a csv, easily plot it, do operations on it or whatever.
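
As a minimal sketch of the full round trip, assuming small made-up labels and a hypothetical output file name classification_report.csv, the dictionary can be transposed so that each class becomes a row and then written with to_csv:

```python
import pandas as pd
from sklearn.metrics import classification_report

# Toy labels, made up for illustration only
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# output_dict=True returns a nested dict instead of a formatted string
report_dict = classification_report(y_true, y_pred, output_dict=True)

# Transpose so that each class (and each summary entry) becomes one row
report_df = pd.DataFrame(report_dict).transpose()

# index_label names the first CSV column, which holds the class names
report_df.to_csv("classification_report.csv", index_label="class")
```

Note that the scalar "accuracy" entry is broadcast across all four columns of its row, so that row repeats the same number; drop it with report_df.drop("accuracy") if that is confusing in the file.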

Rabeez Riaz
2

It is possible but you need to create a function.

Let's say that I want to write the report to a file named report.csv (pandas creates the file if it does not exist, so there is no need to create it beforehand).

Full Example:

from sklearn.metrics import classification_report
import pandas as pd

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

def classification_report_csv(report, path='report.csv'):
    report_data = []
    # Skip the header line and the blank line below it
    for line in report.split('\n')[2:]:
        # The per-class rows end at the first blank line; summary rows follow
        if not line.strip():
            break
        # Split from the right so that class names containing spaces stay intact
        name, precision, recall, f1_score, support = line.rsplit(None, 4)
        report_data.append({
            'class': name.strip(),
            'precision': float(precision),
            'recall': float(recall),
            'f1_score': float(f1_score),
            'support': float(support),
        })
    dataframe = pd.DataFrame(report_data)
    dataframe.to_csv(path, index=False)

# Call classification_report first, then pass its string output to the new function

report = classification_report(y_true, y_pred, target_names=target_names)
classification_report_csv(report)

Hope this helps. Opening report.csv shows one row per class with its precision, recall, F1 score and support.

seralouk
2

I found Rabeez Riaz's solution much easier. I would like to add that you can transpose the dataframe built from report_dict, so that each class becomes a row:

df = pd.DataFrame(report_dict).transpose()

From here on, you are free to use the standard pandas methods to generate your desired output formats (CSV, HTML, LaTeX, ...).

Source link: https://intellipaat.com/community/15701/scikit-learn-output-metrics-classificationreport-into-csv-tab-delimited-format
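
To illustrate two of those output formats, here is a short sketch that writes a tab-delimited file and builds an HTML table. The labels are made up, and the file name report.tsv and the two-decimal formatting are assumptions chosen for the example:

```python
import pandas as pd
from sklearn.metrics import classification_report

# Toy labels, made up for illustration only
y_true = ["cat", "dog", "dog", "bird", "cat"]
y_pred = ["cat", "dog", "cat", "bird", "cat"]

report_dict = classification_report(y_true, y_pred, output_dict=True)
df = pd.DataFrame(report_dict).transpose()

# Tab-delimited text: the same writer as CSV, with a different separator
df.to_csv("report.tsv", sep="\t", index_label="class")

# HTML table returned as a string, with floats shown to two decimals
html_table = df.to_html(float_format="{:.2f}".format)
```

Because to_csv and to_html both start from the same dataframe, the per-class rows and the summary rows stay identical across formats.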

Rabeez Riaz
Jcbyte
1

In addition to sera's answer, I find the following approach helpful: it uses precision_recall_fscore_support, so there is no need to parse the string returned by classification_report:

from sklearn.metrics import precision_recall_fscore_support
from sklearn.utils.multiclass import unique_labels


def classification_report_to_csv_pandas_way(ground_truth,
                                            predictions,
                                            full_path="test_pandas.csv"):
    """
    Saves the classification report to csv using the pandas module.
    :param ground_truth: list: the true labels
    :param predictions: list: the predicted labels
    :param full_path: string: the path to the file.csv where results will be saved
    :return: None
    """
    import pandas as pd

    # get unique labels / classes
    # - assuming all labels are in the sample at least once
    labels = unique_labels(ground_truth, predictions)

    # get results
    precision, recall, f_score, support = precision_recall_fscore_support(ground_truth,
                                                                          predictions,
                                                                          labels=labels,
                                                                          average=None)
    # a pandas way:
    results_pd = pd.DataFrame({"class": labels,
                               "precision": precision,
                               "recall": recall,
                               "f_score": f_score,
                               "support": support
                               })

    results_pd.to_csv(full_path, index=False)


def classification_report_to_csv(ground_truth,
                                 predictions,
                                 full_path="test_simple.csv"):
    """
    Saves the classification report to csv.
    :param ground_truth: list: the true labels
    :param predictions: list: the predicted labels
    :param full_path: string: the path to the file.csv where results will be saved
    :return: None
    """
    # get unique labels / classes
    # - assuming all labels are in the sample at least once
    labels = unique_labels(ground_truth, predictions)

    # get results
    precision, recall, f_score, support = precision_recall_fscore_support(ground_truth,
                                                                          predictions,
                                                                          labels=labels,
                                                                          average=None)

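    # or a non-pandas way:
    with open(full_path, "w") as fp:
        fp.write("class,precision,recall,f_score,support\n")
        for row in zip(labels, precision, recall, f_score, support):
            # the values are numbers, so convert them to strings before joining
            fp.write(",".join(str(value) for value in row) + "\n")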

if __name__ == '__main__':
    # dummy data
    ground_truth = [1, 1, 4, 1, 3, 1, 4]
    prediction = [1, 1, 3, 4, 3, 1, 1]

    # test
    classification_report_to_csv(ground_truth, prediction)
    classification_report_to_csv_pandas_way(ground_truth, prediction)

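Output (shown with the columns sorted alphabetically, as older pandas versions wrote them; the column order may differ in your setup):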

class,f_score,precision,recall,support
1,0.75,0.75,0.75,4
3,0.666666666667,0.5,1.0,1
4,0.0,0.0,0.0,2
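
For the non-pandas route, the standard library's csv module can take care of the header, the separators and the line endings. A minimal sketch, reusing the dummy data above; the function name write_report_with_csv_module and the file name test_csv_module.csv are hypothetical:

```python
import csv

from sklearn.metrics import precision_recall_fscore_support
from sklearn.utils.multiclass import unique_labels


def write_report_with_csv_module(ground_truth, predictions,
                                 full_path="test_csv_module.csv"):
    """Write per-class precision, recall, F-score and support to a CSV file."""
    labels = unique_labels(ground_truth, predictions)
    precision, recall, f_score, support = precision_recall_fscore_support(
        ground_truth, predictions, labels=labels, average=None)

    # newline="" lets the csv module control the line endings itself
    with open(full_path, "w", newline="") as fp:
        writer = csv.writer(fp)
        writer.writerow(["class", "precision", "recall", "f_score", "support"])
        # tolist() converts the NumPy arrays to plain Python values
        writer.writerows(zip(labels.tolist(), precision.tolist(),
                             recall.tolist(), f_score.tolist(),
                             support.tolist()))


# Same dummy data as in the example above
ground_truth = [1, 1, 4, 1, 3, 1, 4]
prediction = [1, 1, 3, 4, 3, 1, 1]
write_report_with_csv_module(ground_truth, prediction)
```

Using csv.writer rather than joining strings by hand also quotes class names that happen to contain commas.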
mkaran
0

To have a csv similar to the output of classification report, you can use this:

    report_dict = classification_report(targcol, predcol, output_dict=True)
    repdf = pd.DataFrame(report_dict).round(2).transpose()
    # The index already holds the class names plus "accuracy", "macro avg" and "weighted avg",
    # so write it out as the "class" column instead of rebuilding the labels from a set
    repdf.to_csv("results.csv", index_label="class")
Ahmad