
I am new to deep learning and have a university project on detecting traffic lights, where we are allowed to use open-source code.

So, I tried to run this code from Kaggle: https://www.kaggle.com/endoruk1234/trafficlightdetection-fasterrcnn-pytorch/log

However, at the stage of testing the saved model on a video, I got this error: 'In training mode, targets should be passed'.

I am not sure why I need to pass targets at the testing stage. I don't understand whether the problem is in the initial model or whether the video capture part is written with mistakes.

The model


    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Helper from the notebook (not actually called below; the same steps are repeated inline).
    def _get_instance_segmentation_model(num_classes):
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
        return model

    # Load a Faster R-CNN pre-trained on COCO and replace its box predictor head
    # with one sized for our dataset (3 traffic light classes + background).
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    N_CLASS = 4
    INP_FEATURES = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(INP_FEATURES, N_CLASS)
    model.to(device)

    # Optimize only the parameters that require gradients.
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(params)
    lr_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)

Training


    import datetime
    from time import time
    from tqdm import tqdm

    # LossAverager, EPOCHS and trainDataLoader are defined earlier in the notebook.
    lossHist = LossAverager()
    valLossHist = LossAverager()

    for epoch in range(EPOCHS):

        start_time = time()
        model.train()  # training mode: the model expects (images, targets)
        lossHist.reset()

        for images, targets, image_ids in tqdm(trainDataLoader):
            images = torch.stack(images).to(device)
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

            bs = images.shape[0]

            # In training mode the detection model returns a dict of losses.
            loss_dict = model(images, targets)

            totalLoss = sum(loss for loss in loss_dict.values())
            lossValue = totalLoss.item()

            lossHist.update(lossValue, bs)

            optimizer.zero_grad()
            totalLoss.backward()
            optimizer.step()

        if lr_scheduler is not None:
            # ReduceLROnPlateau steps on a monitored metric; here the last batch loss is used.
            lr_scheduler.step(totalLoss)

        print(f"[{str(datetime.timedelta(seconds = time() - start_time))[2:7]}]")
        print(f"Epoch {epoch}/{EPOCHS}")
        print(f"Train loss: {lossHist.avg}")

        if epoch == 10:
            torch.save(model.state_dict(), 'fasterrcnn_resnet{}_fpn.pth'.format(epoch))


    torch.save(model.state_dict(), 'fasterrcnn_resnet{}_fpn.pth'.format(epoch))

Testing on a video (this is where the error occurs)


    import cv2
    import numpy as np
    from torchvision.ops import nms

    # cap (a cv2.VideoCapture) and the preprocess transform are defined
    # earlier in the notebook.
    while True:
        ret, input = cap.read()
        image = input.copy()
        input = preprocess(input).float()
        input = input.unsqueeze_(0)
        input = input.type(torch.cuda.FloatTensor)

        print(input)

        result = model(input)  # <-- raises the ValueError below

        boxes = result[0]['boxes'].type(torch.cuda.FloatTensor)
        scores = result[0]['scores'].type(torch.cuda.FloatTensor)
        labels = result[0]['labels'].type(torch.cuda.FloatTensor)

        # non-maximum suppression with an IoU threshold of 0.3
        mask = nms(boxes, scores, 0.3)
        boxes = boxes[mask]
        scores = scores[mask]
        labels = labels[mask]

        boxes = boxes.data.cpu().numpy().astype(np.int32)
        scores = scores.data.cpu().numpy()
        labels = labels.data.cpu().numpy()

        # keep only detections with confidence >= 0.5
        mask = scores >= 0.5
        boxes = boxes[mask]
        scores = scores[mask]
        labels = labels[mask]

        colors = {1: (0, 255, 0), 2: (255, 255, 0), 3: (255, 0, 0)}

        for box, label in zip(boxes, labels):
            image = cv2.rectangle(image,
                                  (box[0], box[1]),
                                  (box[2], box[3]),
                                  (0, 0, 255), 1)

        cv2.imshow("image", image)

        if cv2.waitKey(0):
            break


    ValueError                                Traceback (most recent call last)
    <ipython-input-84-e32f9d25d942> in <module>()
          8     print(input)
          9 
    ---> 10     result = model(input)
         11 
         12     boxes = result[0]['boxes'].type(torch.cuda.FloatTensor)
    
    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
        725             result = self._slow_forward(*input, **kwargs)
        726         else:
    --> 727             result = self.forward(*input, **kwargs)
        728         for hook in itertools.chain(
        729                 _global_forward_hooks.values(),
    
    /usr/local/lib/python3.7/dist-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
         58         """
         59         if self.training and targets is None:
    ---> 60             raise ValueError("In training mode, targets should be passed")
         61         if self.training:
         62             assert targets is not None
    
    ValueError: In training mode, targets should be passed

Thank you in advance! If you can tell me how to correct the model or the video capture code, I will be very grateful.

1 Answer


In your example you don't show how you load your model, but I think you have forgotten model.eval(). This function is a kind of switch for certain layers/parts of the model that behave differently during training and inference (evaluation) time.

To make inferences, you can load your model like this:

    model.load_state_dict(torch.load("/content/gdrive/MyDrive/Models/model_Resnet.pth"))
    model.eval()
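
For the video loop itself, the fix is to call model.eval() once before the loop and, since no gradients are needed at inference time, to wrap the forward pass in torch.no_grad(). Below is a minimal sketch of the corrected loop; it assumes the cap (cv2.VideoCapture), preprocess transform, and device objects from your notebook:

    import cv2
    import torch

    model.eval()  # inference mode: the detection head no longer expects targets
    with torch.no_grad():  # disable gradient tracking during inference
        while True:
            ret, frame = cap.read()
            if not ret:  # stop when the video ends or a frame can't be read
                break

            image = frame.copy()
            inp = preprocess(frame).float().unsqueeze(0).to(device)

            # In eval mode the model returns one dict per image with
            # 'boxes', 'scores' and 'labels' instead of a loss dict.
            result = model(inp)
            boxes = result[0]['boxes']
            scores = result[0]['scores']
            labels = result[0]['labels']

            # ... apply NMS, score filtering and drawing as in your code ...

            cv2.imshow("image", image)
            if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to quit
                break

In eval mode, torchvision's detection models take only images and return post-processed predictions, which is exactly what the result[0]['boxes'] indexing in your code expects; the ValueError is raised only because the model is still in training mode when you call it.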