I'm trying to follow the solution from the top answer here to load an object detection model from the .pth file.

os.environ['TORCH_HOME'] = '../input/torchvision-fasterrcnn-resnet-50/' #setting the environment variable
model = detection.fasterrcnn_resnet50_fpn(pretrained=False).to(DEVICE)

I get the following error

NotADirectoryError: [Errno 20] Not a directory: '../input/torchvision-fasterrcnn-resnet-50/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth/hub'

Google did not turn up an answer to the error, and I don't know exactly what it means beyond the obvious (that the folder 'hub' is missing).

Do I have to unpack or create a folder? I have tried loading the weights but I get the same error message.

This is how I load the model:

model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
model.load_state_dict(checkpoint['state_dict'])

Thank you for your help!

Full Error Trace:

gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

URLError                                  Traceback (most recent call last)
/tmp/ipykernel_42/1218627017.py in <module>
      1 # to load
----> 2 model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
      3 checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
      4 model.load_state_dict(checkpoint['state_dict'])

/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/faster_rcnn.py in fasterrcnn_resnet50_fpn(pretrained, progress, num_classes, pretrained_backbone, trainable_backbone_layers, **kwargs)
    360     if pretrained:
    361         state_dict = load_state_dict_from_url(model_urls['fasterrcnn_resnet50_fpn_coco'],
--> 362                                               progress=progress)
    363         model.load_state_dict(state_dict)
    364     return model

/opt/conda/lib/python3.7/site-packages/torch/hub.py in load_state_dict_from_url(url, model_dir, map_location, progress, check_hash, file_name)
    553             r = HASH_REGEX.search(filename)  # r is Optional[Match[str]]
    554             hash_prefix = r.group(1) if r else None
--> 555         download_url_to_file(url, cached_file, hash_prefix, progress=progress)
    556 
    557     if _is_legacy_zip_format(cached_file):

/opt/conda/lib/python3.7/site-packages/torch/hub.py in download_url_to_file(url, dst, hash_prefix, progress)
    423     # certificates in older Python
    424     req = Request(url, headers={"User-Agent": "torch.hub"})
--> 425     u = urlopen(req)
    426     meta = u.info()
    427     if hasattr(meta, 'getheaders'):

/opt/conda/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220     else:
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 
    224 def install_opener(opener):

/opt/conda/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
    523             req = meth(req)
    524 
--> 525         response = self._open(req, data)
    526 
    527         # post-process response

/opt/conda/lib/python3.7/urllib/request.py in _open(self, req, data)
    541         protocol = req.type
    542         result = self._call_chain(self.handle_open, protocol, protocol +
--> 543                                   '_open', req)
    544         if result:
    545             return result

/opt/conda/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    501         for handler in handlers:
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:
    505                 return result

/opt/conda/lib/python3.7/urllib/request.py in https_open(self, req)
   1391         def https_open(self, req):
   1392             return self.do_open(http.client.HTTPSConnection, req,
-> 1393                 context=self._context, check_hostname=self._check_hostname)
   1394 
   1395         https_request = AbstractHTTPHandler.do_request_

/opt/conda/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1350                           encode_chunked=req.has_header('Transfer-encoding'))
   1351             except OSError as err: # timeout error
-> 1352                 raise URLError(err)
   1353             r = h.getresponse()
   1354         except:

URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
Olli
  • Why are you changing `TORCH_HOME` if you are not loading any pretrained model? – Ivan Oct 11 '21 at 09:54
  • @Ivan I'm loading a pretrained model; I saved the .pth file to the directory defined in TORCH_HOME – Olli Oct 11 '21 at 10:10
  • Can you show the line where you are loading the model? – Ivan Oct 11 '21 at 10:17
  • @Ivan Hi, I added a line. From the comment I thought that was all that was needed, but I also tried the above (now edited) after saving with torch.save, following the saving and loading tutorial. But I already get the same error on the first line – Olli Oct 11 '21 at 10:19
  • Ok, can you also provide the full error backtrace? *"But I already get the same error in the first line"* That's because you've set `pretrained=True` and it can't find a `/hub` sub-directory under `TORCH_HOME`. – Ivan Oct 11 '21 at 10:21
  • @Ivan, sure thank you for your help. I've tried several things, I wasn't clear enough. I also get the same error when I reset the kernel, without setting TORCH_HOME, just trying to load the model I previously saved. I will restart again and post full trace – Olli Oct 11 '21 at 10:25
  • @Ivan I have now updated the full error trace, btw I get the same error when pretrained = False – Olli Oct 11 '21 at 10:27

2 Answers

If you are loading your own saved weights, you don't need torchvision's pretrained weights (pretrained=True makes torchvision download the checkpoint it trained on COCO). You have two options:

  1. Either set pretrained=False and load your weights using:

    checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
    model.load_state_dict(checkpoint['state_dict'])
    
  2. Or, if you decide to change TORCH_HOME (which is not ideal), you need to keep the directory structure torchvision expects under it, which would be:

    <TORCH_HOME>/hub/checkpoints/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth
    

    In practice, you wouldn't change TORCH_HOME just to load one model.
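
As a rough illustration of option 2, the sketch below copies an already downloaded checkpoint into the `hub/checkpoints` layout under a chosen `TORCH_HOME`. The helper name `stage_checkpoint` and the writable directory in the usage comment are hypothetical, not part of torchvision, and this assumes the input directory itself is read-only.

```python
import shutil
from pathlib import Path


def stage_checkpoint(src, torch_home):
    """Copy a checkpoint file into <torch_home>/hub/checkpoints/ (hypothetical helper)."""
    dest_dir = Path(torch_home) / "hub" / "checkpoints"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(src).name  # keep the original file name, hash suffix included
    if not dest.exists():
        shutil.copy(src, dest)
    return dest


# Usage sketch (paths are assumptions; adjust to your environment):
# import os
# os.environ["TORCH_HOME"] = "/kaggle/working/torch_home"
# stage_checkpoint(
#     "../input/torchvision-fasterrcnn-resnet-50/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth",
#     os.environ["TORCH_HOME"],
# )
```

Set `TORCH_HOME` before calling the model constructor, and point it at a directory, not at the .pth file itself.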

Ivan
  • When I do pretrained = False I still get the same error, which I posted in the trace in my update. Thank you for your patience – Olli Oct 11 '21 at 10:28
  • Have you tried your solution without the internet on? – Olli Oct 11 '21 at 10:29
  • If you have your weights already saved locally, you won't need Internet access. – Ivan Oct 11 '21 at 10:31
  • Thanks ivan, but maybe I'm slow, I've seen other issues like this. You need to define the model class and structure no? Or how would you instantiate the model without internet? – Olli Oct 11 '21 at 10:44
  • I can't run this without internet: `model = detection.fasterrcnn_resnet50_fpn(pretrained=False) checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar') model.load_state_dict(checkpoint['state_dict'])` – Olli Oct 11 '21 at 10:57
I found the solution to the problem by digging deep into GitHub; it is a little hidden.

detection.fasterrcnn_resnet50_fpn() has another argument besides pretrained: pretrained_backbone, which defaults to True. When it is True, the backbone weights are downloaded from a dictionary of URLs, even if pretrained=False.

This will work:

detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=False, num_classes=91)

Then load the model as usual. num_classes is expected; in the docs it defaults to 91, but on GitHub I saw it as None, which is why I added it here for safety.
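
Putting the pieces together, a minimal offline-loading sketch might look like the following. The helper names `unwrap_state_dict` and `load_offline` are hypothetical; the `state_dict` key matches the checkpoint saved in the question, while the file downloaded from torchvision appears to store the weights directly, so the helper accepts both. Newer torchvision releases replace `pretrained`/`pretrained_backbone` with `weights`/`weights_backbone`, so treat the constructor call as version-dependent.

```python
def unwrap_state_dict(checkpoint):
    """Return the weights whether they are stored directly or under a 'state_dict' key."""
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    return checkpoint


def load_offline(checkpoint_path, num_classes=91, device="cpu"):
    """Build Faster R-CNN without any download, then load local weights (sketch)."""
    # Imported here so the helper above can be used without torch installed.
    import torch
    from torchvision.models import detection

    # Both flags are False, so neither the detector nor the backbone is downloaded.
    model = detection.fasterrcnn_resnet50_fpn(
        pretrained=False, pretrained_backbone=False, num_classes=num_classes
    )
    checkpoint = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(unwrap_state_dict(checkpoint))
    return model.to(device)


# Usage sketch (path taken from the question):
# model = load_offline("../input/torchvision-fasterrcnn-resnet-50/model.pth.tar")
```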

Olli