Questions tagged [resuming-training]

36 questions
146
votes
8 answers

Loading a trained Keras model and continue training

I was wondering if it was possible to save a partly trained Keras model and continue the training after loading the model again. The reason for this is that I will have more training data in the future and I do not want to retrain the whole model…
5
votes
3 answers

Keras - manage history

I am training Keras models, saving them with model.save(), and then later loading them and resuming training. After each training session I would like to plot the whole training history, but model.fit_generator() only returns the history of the last session…
Jsevillamol
  • 2,425
  • 2
  • 23
  • 46
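
One way to keep a full history across sessions is to concatenate the per-epoch metric lists yourself. The sketch below is a minimal illustration, not the asker's code: it assumes each session yields a dict of metric name to list of per-epoch values (the shape of the `history` attribute on the object Keras returns from training), and `merge_histories`, `session_1` and `session_2` are hypothetical names with made-up values.

```python
from collections import defaultdict


def merge_histories(*histories):
    """Concatenate per-epoch metric lists from several training sessions.

    Each argument is assumed to be a dict like {"loss": [...], "val_loss": [...]}.
    """
    merged = defaultdict(list)
    for history in histories:
        for metric, values in history.items():
            merged[metric].extend(values)
    return dict(merged)


# Hypothetical stand-ins for the history dicts of two training sessions.
session_1 = {"loss": [0.9, 0.7], "val_loss": [1.0, 0.8]}
session_2 = {"loss": [0.6, 0.5], "val_loss": [0.75, 0.7]}

full_history = merge_histories(session_1, session_2)
print(full_history["loss"])
```

Because the saved model file does not carry the history, the merged dict would have to be stored separately (for example as JSON next to the model) and reloaded before the next session.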
4
votes
2 answers

Tensorflow model restoration (resumed training seems to start from scratch)

I have a problem resuming training after saving my model. The problem is that my loss decreases from 6 to 3, for example. At this point I save the model. When I restore it and continue training, the loss restarts from 6. It seems that the restoration…
JimZer
  • 918
  • 2
  • 9
  • 19
3
votes
1 answer

How to resume PyTorch training of a deep learning model after training stopped due to a power outage or some other interruption

I am training a deep learning model and want to save checkpoints of it, but training stops when the power goes off, and I then have to restart from the point where it was interrupted, e.g. 10 epochs completed and want to resume/start again from…
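
The usual pattern for this is to write a checkpoint at the end of every epoch and, on startup, continue from the epoch after the last one saved. The sketch below shows only that control flow in plain Python and is an assumption-laden illustration: the JSON file, the `weights` counter and the helper names are hypothetical placeholders. In PyTorch the stored state would typically be the model and optimizer `state_dict()`s, written with `torch.save` and restored with `torch.load` and `load_state_dict`.

```python
import json
import os
import tempfile


def save_checkpoint(path, epoch, state):
    """Write the checkpoint to a temp file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"epoch": epoch, "state": state}, f)
    os.replace(tmp, path)  # avoids a half-written file if power is lost


def load_checkpoint(path):
    """Return (next epoch to run, state); start fresh if no checkpoint exists."""
    if not os.path.exists(path):
        return 0, {"weights": 0.0}
    with open(path) as f:
        ckpt = json.load(f)
    return ckpt["epoch"] + 1, ckpt["state"]


def train(path, total_epochs, stop_after=None):
    start_epoch, state = load_checkpoint(path)
    for epoch in range(start_epoch, total_epochs):
        state["weights"] += 1.0  # placeholder for one epoch of real training
        save_checkpoint(path, epoch, state)
        if stop_after is not None and epoch + 1 >= stop_after:
            break  # simulate an interruption such as a power cut
    return start_epoch, state


ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
first_start, _ = train(ckpt_path, total_epochs=20, stop_after=10)
resumed_start, final_state = train(ckpt_path, total_epochs=20)
print(first_start, resumed_start, final_state)
```

Storing the epoch number alongside the state is what lets the second call skip the epochs that were already completed.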
3
votes
2 answers

TF2 object detection API issue with resuming training from saved checkpoint

I'm facing an issue with TF2 object detection API that seems to have occurred overnight. I'm trying to resume training from a saved checkpoint and as usual I change the path in the config file to where the checkpoints are before resuming the…
3
votes
1 answer

How to increase training steps in Tensorflow?

I followed the following Tensorflow tutorial to retrain the Inception V3 on my own classes. https://www.tensorflow.org/hub/tutorials/image_retraining Everything worked well so far and I got an acceptable final test accuracy. However, I want to…
Sara0010
  • 31
  • 3
2
votes
1 answer

gensim doc2vec train more documents from pre-trained model

I am trying to continue training a pre-trained model on new labelled documents (TaggedDocument). The pre-trained model was trained on documents whose unique IDs have the form label1_index, for instance Good_0, Good_1 to Good_999. And the total size of…
Isaac Sim
  • 539
  • 1
  • 7
  • 23
2
votes
0 answers

How to modify a pretrained graph? Tensorflow

I want to modify a pretrained model and then fine-tune it. I am able to load the graph in TensorFlow, but when I write new layers, my graph's shape changes unexpectedly. The code is long, but here it is: with tf.Session() as…
Rafay Zia Mir
  • 2,116
  • 6
  • 23
  • 49
1
vote
1 answer

AttributeError: 'DataParallel' object has no attribute 'copy'

I am trying to resume training a Monk AI PyTorch RetinaNet. I loaded a .pt file instead of the actual model. The changes are made in Monk_Object_Detection/5_pytorch_retinanet/lib/train_detector.py; check for '# change' in the places where its…
van
  • 15
  • 9
1
vote
1 answer

How to Resume Yolov3 training?

I am new to deep learning. I have a YOLOv3 model that I have been training on my custom data. Every time I train, the training seems to start from scratch. How do I make the model continue its training from where it stopped last time? The setup I…
elbashmubarmeg
  • 330
  • 1
  • 9
1
vote
2 answers

Loaded keras model fails to continue training, dimensions mismatch

I'm using TensorFlow with Keras to train a char-RNN on Google Colab. I train my model for 10 epochs and save it using 'model.save()', as shown in the documentation for saving models. Immediately after, I load it again just to check, I try to…
1
vote
1 answer

Resuming pytorch model training raises error “CUDA out of memory”

My goal is to save the model at every epoch as I have to stop the training during the night and I don't want to lose progress. After I trained my model for 1 epoch I interrupted the process via terminal with CTRL+Z. When I tried to resume the…
1
vote
0 answers

Fasttext model loaded with gensim won't continue training with new sentences

I am trying to load a fastText .bin model in Spanish, downloaded from https://fasttext.cc/docs/en/crawl-vectors.html, and continue training it with new sentences from the specific domain I am interested in. System: Anaconda, Jupyter Notebook,…
1
vote
0 answers

Retrain a SavedModel in Tensorflow

Is there any example of retraining a SavedModel? Many places claim it is possible, instead of using checkpoints, but no examples are provided. When I tried to carry it out, the variables of the model remained fixed: ... model_save_path =…
MickeyMouse
  • 103
  • 1
  • 7
1
vote
1 answer

Unable to resume training from checkpoint model in keras

I am saving the model at each epoch in which it surpasses the previous epoch in terms of accuracy. But when I load the model, it does not resume from the saved point. The code is as below: filepath =…
user41986
  • 81
  • 1
  • 6