
I have 5 classes in my validation set and I want to draw a graph of the top-1 accuracy per class in the validation loop using wandb. I have already logged a single accuracy graph based on the average over the 5 classes and it works fine, but now I want separate curves, i.e. top-1 accuracy for each class. I have been unable to achieve this. Is there any way to do it?

Validation Loader

val_nuisances = ['shape', 'pose', 'texture', 'context', 'weather']

val_loaders = []
for nuisance in val_nuisances:
    val_loaders.append((nuisance, torch.utils.data.DataLoader(
        datasets.ImageFolder(os.path.join(valdir, nuisance), transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            normalize,
        ])),
        batch_size=args.batch_size, shuffle=False,
        num_workers=args.workers, pin_memory=True,
    )))

Validation Loop

def validate(val_loaders, model, criterion, args):
    overall_top1 = 0
    for nuisance, val_loader in val_loaders:
        batch_time = AverageMeter('Time', ':6.3f', Summary.NONE)
        losses = AverageMeter('Loss', ':.4e', Summary.NONE)
        top1 = AverageMeter('Acc@1', ':6.2f', Summary.AVERAGE)
        top5 = AverageMeter('Acc@5', ':6.2f', Summary.AVERAGE)
        progress = ProgressMeter(
            len(val_loader),
            [batch_time, losses, top1, top5],
            prefix=f'Test {nuisance}: ')

        # switch to evaluate mode
        model.eval()

        with torch.no_grad():
            end = time.time()
            for i, (images, target) in enumerate(val_loader):
                if args.gpu is not None:
                    images = images.cuda(args.gpu, non_blocking=True)
                if torch.cuda.is_available():
                    target = target.cuda(args.gpu, non_blocking=True)

                # compute output
                output = model(images)
                loss = criterion(output, target)

                # measure accuracy and record loss
                acc1, acc5 = accuracy(output, target, topk=(1, 5))
                losses.update(loss.item(), images.size(0))
                top1.update(acc1[0], images.size(0))
                top5.update(acc5[0], images.size(0))

                # measure elapsed time
                batch_time.update(time.time() - end)
                end = time.time()

                if i % args.print_freq == 0:
                    progress.display(i)

            progress.display_summary()
        overall_top1 += top1.avg
    overall_top1 /= len(val_loaders)
    return overall_top1

1 Answer


I don't see any logging to W&B in your code, but logging the top-1 accuracy per class would just be:

import wandb

class_names = ['shape', 'pose', 'texture', 'context', 'weather']
top1_accuracies = [0.9, 0.8, 0.9, 0.9, 0.8]
wandb.log({name: acc for name, acc in zip(class_names, top1_accuracies)})

In your code above, it looks like you're not actually creating a variable for the top-1 accuracy of each class, so you'll want to do that first.
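For instance, here's a minimal sketch of one way to wire that up: a small helper (log_per_nuisance_top1 is a hypothetical name) that you call once per validation pass with each nuisance's top1.avg. It assumes wandb.init has already been called; the val/ metric names are just illustrative choices.

import wandb

def log_per_nuisance_top1(nuisance_top1):
    # nuisance_top1: dict mapping nuisance name -> top-1 accuracy, e.g. built
    # inside validate() with nuisance_top1[nuisance] = top1.avg after each loader
    metrics = {f'val/top1_{name}': acc for name, acc in nuisance_top1.items()}
    metrics['val/top1_overall'] = sum(nuisance_top1.values()) / len(nuisance_top1)
    wandb.log(metrics)
    return metrics['val/top1_overall']

Logging all the keys in a single wandb.log call (rather than one call per class) keeps them on the same step, so W&B draws one aligned curve per nuisance plus one for the overall average.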

You can use sklearn's confusion matrix to get the per-class accuracy (the following is taken from https://stackoverflow.com/a/50977153/3959708):

from sklearn.metrics import confusion_matrix
import numpy as np

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

#Get the confusion matrix
cm = confusion_matrix(y_true, y_pred)
#array([[1, 0, 0],
#   [1, 0, 0],
#   [0, 1, 2]])

#Now normalize each row by the number of true samples in that class
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
#array([[1.        , 0.        , 0.        ],
#      [1.        , 0.        , 0.        ],
#      [0.        , 0.33333333, 0.66666667]])

#The diagonal entries are the accuracies of each class
cm.diagonal()
#array([1.        , 0.        , 0.66666667])
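Putting the two together (a sketch continuing from the snippet above, assuming a wandb run is already initialized), you can log those per-class accuracies straight to W&B:

import wandb

#cm.diagonal() holds the per-class accuracies computed above
per_class_acc = cm.diagonal()
wandb.log({name: acc for name, acc in zip(target_names, per_class_acc)})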