7

Is there a simple way (e.g. without modifying code) to load weights from multiple pretrained networks into one network? The network contains some layers with the same dimensions and names as layers in both pretrained networks.

I am trying to achieve this using NVIDIA DIGITS and Caffe.

EDIT: I thought it wouldn't be possible to do it directly from DIGITS, as confirmed by answers. Can anyone suggest a simple way to modify the DIGITS code to be able to select multiple pretrained networks? I checked the code a bit, and thought the training script would be a good place to start, but I don't have in-depth knowledge of Caffe, so I'm not sure what the best/quickest way to achieve this would be.

Igor Ševo

2 Answers

6

As Shai suggested, there was no way of doing this, so I decided to clone the official repository and make the appropriate changes. I changed the code so that multiple pretrained networks can be loaded by using a colon as separator.

I created a pull request on the official repository and my changes were then merged with the main branch of DIGITS, meaning it is now possible to use this functionality in DIGITS.
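For illustration only, here is a minimal sketch of the colon-separated idea. It is not the code that was merged into DIGITS: the helper names `split_pretrained_paths` and `caffe_train_args` are hypothetical, and the sketch assumes the `caffe train` command-line tool accepts a comma-separated list of files in its `--weights` flag, which is how its help text describes that flag.

```python
def split_pretrained_paths(spec, sep=":"):
    """Split a separator-delimited string of model paths into a list.

    Empty entries and surrounding whitespace are dropped.
    Assumption: ':' never appears inside a path (not true for Windows
    drive letters, so a different separator would be needed there).
    """
    return [part.strip() for part in spec.split(sep) if part.strip()]


def caffe_train_args(solver, spec):
    """Build an argument list for `caffe train` (hypothetical helper).

    The colon-separated paths are re-joined with commas, the separator
    the `--weights` flag is documented to use.
    """
    args = ["caffe", "train", "--solver=" + solver]
    paths = split_pretrained_paths(spec)
    if paths:
        args.append("--weights=" + ",".join(paths))
    return args


# File names below are placeholders.
print(caffe_train_args("solver.prototxt", "net_a.caffemodel:net_b.caffemodel"))
```

The resulting list could be passed to `subprocess.call`. As far as I know, Caffe then copies weights layer by layer, matching on layer name, so a layer that appears in more than one file ends up with the weights from the file listed last.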

Igor Ševo
2

AFAIK there is no straightforward way of doing so.
However, you can use net surgery to load the pretrained models and manually assign their weights to the target net. Once you have a single net with all the weights initialized according to the various pretrained models, you can save it and use it as a single pretrained model for the rest of your work.
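A rough sketch of that net surgery, under some assumptions: it relies only on the pycaffe convention that `net.params` maps a layer name to a list of blobs whose values live in `blob.data`, and it skips any layer whose name or blob shapes do not match. The function names and all file paths are hypothetical, and the `caffe.Net(...)` constructor arguments may differ between Caffe versions.

```python
def copy_matching_params(target, sources):
    """Copy weights into `target` from each net in `sources`, by layer name.

    Layers missing from the target, or whose blob shapes differ, are skipped.
    Later sources overwrite earlier ones for layers they share.
    Returns the names of the layers that were copied, in order.
    """
    copied = []
    for source in sources:
        for name, src_blobs in source.params.items():
            if name not in target.params:
                continue  # no layer with this name in the target net
            dst_blobs = target.params[name]
            if len(dst_blobs) != len(src_blobs):
                continue  # e.g. one layer has a bias blob, the other does not
            pairs = list(zip(dst_blobs, src_blobs))
            if any(d.data.shape != s.data.shape for d, s in pairs):
                continue  # same name but different dimensions
            for dst, src in pairs:
                dst.data[...] = src.data  # in-place copy into the target blob
            copied.append(name)
    return copied


def merge_caffemodels(target_prototxt, pretrained, output_path):
    """Hypothetical wrapper: needs a Caffe build with pycaffe installed.

    `pretrained` is a list of (prototxt, caffemodel) path pairs.
    """
    import caffe  # imported here so the helper above works without Caffe

    target = caffe.Net(target_prototxt, caffe.TEST)
    sources = [caffe.Net(proto, weights, caffe.TEST)
               for proto, weights in pretrained]
    copied = copy_matching_params(target, sources)
    target.save(output_path)  # single .caffemodel holding the merged weights
    return copied
```

The saved file can then be selected as the one pretrained model. If I remember correctly, pycaffe also exposes `Net.copy_from(path)`, which copies weights by layer name from a `.caffemodel` file, so calling it once per pretrained file on the target net may be enough when no shape checks or renaming are needed.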

Shai
  • 1
    This is how to do it. An NN model is just a bunch of weights, so all you need to do is copy them over and save the result. Good luck though: if you copy layers from different nets, they may not play well together. And DIGITS is pretty high level, so you'll need to drop into Python or C++ to get it done. – user1269942 Dec 27 '15 at 06:32
  • Is there an easy way to incorporate this into the DIGITS code (I edited the question)? – Igor Ševo Dec 27 '15 at 11:24
  • @IgorŠevo if you are going to try this trick many times, you may consider altering the training script of DIGITS. However, I think if you are going to do this only once or twice you'd better do it manually using "net surgery". – Shai Dec 27 '15 at 11:30
  • I'm probably going to do this multiple times with different architectures, so I'm going to need something a bit more high-level. – Igor Ševo Dec 27 '15 at 12:17
  • @IgorŠevo well, I think the first step toward modifying the training script is to do this modification programmatically once in Python. Can you make it work once? – Shai Dec 27 '15 at 12:27
  • 1
    I could probably use the "net surgery" you suggested and add modified code to the DIGITS training script. I was hoping Caffe would expose some API for loading weights from a different network. – Igor Ševo Dec 27 '15 at 12:54