  • I wanted to use the multilingual-codesearch model, but first, the code doesn't work and outputs the following error, which suggests that it cannot be loaded from the weights alone:
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("ncoop57/multilingual-codesearch")
    model = AutoModel.from_pretrained("ncoop57/multilingual-codesearch")
    ValueError: Unrecognized model in ncoop57/multilingual-codesearch. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: gpt_neo, big_bird, speech_to_text, vit, wav2vec2, m2m_100, convbert, led, blenderbot-small, retribert, ibert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta-v2, deberta, flaubert, fsmt, squeezebert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag, tapas
  • Then I downloaded the PyTorch bin file, but it contains only the weights dictionary (the state dictionary, as mentioned here), which means that if I want to use the model, I have to initialize the correct architecture and then load the weights.
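To make the second point concrete, here is a minimal sketch of what loading a bare state dictionary looks like; `TinyEncoder` is a hypothetical stand-in for the real architecture, which must come from the model's original implementation:

```python
import torch
import torch.nn as nn

# Hypothetical minimal architecture standing in for the real model, whose
# definition is NOT in the transformers library.
class TinyEncoder(nn.Module):
    def __init__(self, vocab_size=100, hidden=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, ids):
        return self.proj(self.embed(ids))

# Pretend this file is the downloaded pytorch_model.bin: it holds only a
# mapping from parameter names to tensors, not the architecture itself.
torch.save(TinyEncoder().state_dict(), "pytorch_model.bin")

# Loading requires instantiating the matching architecture first, then
# copying the weights in; this fails if names or shapes don't match.
model = TinyEncoder()
state_dict = torch.load("pytorch_model.bin")
model.load_state_dict(state_dict)
model.eval()
```

This is why `AutoModel.from_pretrained` fails here: without a recognized `model_type` in `config.json`, transformers has no class to instantiate before loading the weights.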

But how am I supposed to find the architecture matching the weights of a model that complex? I saw that some methods can recover the model from the state dictionary, but I didn't manage to make them work (I'm thinking of enter link description here).
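What a state dictionary can and cannot tell you is easy to see by inspecting one. A small sketch (using a throwaway `nn.Sequential` in place of the downloaded file): the parameter names and tensor shapes hint at layer types and sizes, but they do not encode the forward pass, activations, or how layers are wired together.

```python
import torch
import torch.nn as nn

# Build and save a small stand-in state dict; in practice you would
# torch.load the downloaded pytorch_model.bin instead.
layers = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
torch.save(layers.state_dict(), "weights.bin")

state_dict = torch.load("weights.bin")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
# "0.weight (4, 8)" reveals a Linear(8, 4) somewhere, but nothing about
# the ReLU between the layers or the order of computation.
```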

How can one recover the architecture from a state dictionary in order to make the model work? Is it even possible?

David Thery
  • Hi @AntonKamanda, I'm the author of that model. The reason it did not work is that I use a custom model architecture that is not in HuggingFace's transformers library, so the check for the model type failed. Here is the notebook I developed it in, which contains the model architecture I defined as well as how to train and load a pretrained model from HuggingFace's model hub: https://colab.research.google.com/drive/1kQjMtu3HqaDS3NrwNc1eUrYL1sEArWe8?usp=sharing. As for your question, I don't think it is possible to reverse-engineer the architecture; I think you need the original implementation. – NCoop May 01 '21 at 11:40

0 Answers