I have two laptops and want to use both for DL model training. I don't have any experience with distributed systems and want to know whether it is possible to combine the processing power of two laptops to train a single model. What about tf.distribute.experimental.ParameterServerStrategy? Will it be of any use?
- Check the docs here https://www.tensorflow.org/guide/distributed_training – ZWang Jul 23 '20 at 02:44
- It doesn't say how to combine two machines to use the processing power of both. – superduper Jul 23 '20 at 18:47
- I think Model Parallelism is what you are looking for. Please refer to this talk on **Mesh Tensorflow**, https://www.youtube.com/watch?v=HgGyWS40g-g&list=PL6LsUGheZdT8te2nsOnFpzDQ2Am19VCYd&index=263. Thanks! – Sep 06 '20 at 13:29
1 Answer
Yes, you can use multiple machines to train a single model. You need to define a cluster configuration on both machines, listing each one as a worker, like below.
```python
tf_config = {
    'cluster': {
        # For two physical laptops, replace 'localhost' with each
        # machine's reachable IP address, e.g. '192.168.1.10:12345'.
        'worker': ['localhost:12345', 'localhost:23456']
    },
    # 'index' must differ per machine: 0 on the first worker, 1 on the second.
    'task': {'type': 'worker', 'index': 0}
}
```
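To launch training, each machine exports this dict as the `TF_CONFIG` environment variable before the distribution strategy is created. Here is a minimal sketch assuming MultiWorkerMirroredStrategy (the strategy used in the tutorial linked below); the tiny model, random data, and ports are placeholders, not something from the original answer:

```python
import json
import os

import tensorflow as tf

# TF_CONFIG must be set before the strategy is created. Run the same
# script on both machines, changing only 'index' in tf_config above.
os.environ['TF_CONFIG'] = json.dumps(tf_config)

# On TF >= 2.4 this lives at tf.distribute.MultiWorkerMirroredStrategy.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

with strategy.scope():
    # Variables created inside the scope are replicated and kept in
    # sync across the workers.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')

# Placeholder data; tf.data auto-shards the dataset across workers.
x = tf.random.normal((256, 10))
y = tf.random.normal((256, 1))
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

model.fit(dataset, epochs=3)
```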
This TensorFlow tutorial on Multi-worker training with Keras (https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras) walks through the configuration and model training in detail.
Hope this answers your question.