
I am using Caffe, a framework for convolutional neural networks on GPUs (or CPUs). It relies mainly on CUDA 6.0. I'm training a CNN on a large image dataset (the ImageNet dataset, about 1.2 million images), which requires a great deal of memory. I'm also running smaller experiments on subsets of the original dataset, which still need significant memory, and I'm working on a GPU cluster. This is the output of the command $ nvidia-smi:

+------------------------------------------------------+                       
| NVIDIA-SMI 331.62     Driver Version: 331.62         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2050         Off  | 0000:08:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |   1585MiB /  2687MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M2050         Off  | 0000:09:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla M2050         Off  | 0000:0A:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla M2050         Off  | 0000:15:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Tesla M2050         Off  | 0000:16:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Tesla M2050         Off  | 0000:19:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Tesla M2050         Off  | 0000:1A:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Tesla M2050         Off  | 0000:1B:00.0     Off |                    0 |
| N/A   N/A    P1    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     10242  ../../../build/tools/train_net.bin                  1577MiB |
+-----------------------------------------------------------------------------+

But when I try to run multiple processes (for example, the same train_net.bin over a different dataset), they fail because they all run on the same GPU. How can I force a process to use another GPU? I would appreciate any help.

ssierral
  • NVIDIA documentation is [here](http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#cuda-visible-devices). Additional write-up [here](http://www.resultsovercoffee.com/2011/02/cudavisibledevices.html). – Robert Crovella Jul 01 '14 at 23:05
  • 1
    For caffe, put in the solver.prototxt this line: device_id: 1 then GPU with device id 1 will be selected. – Min Lin Jul 03 '14 at 05:07
  • Hey, thank you very much, you just saved my day... Hyperparameter optimization is a real pain with these GPUs... Unfortunately, they don't have the compute capability of the Tesla K20 that was used originally, so I have to use minibatches of 64 images instead of 256, and the gradient descent stays at the same value... anyway, thanks. – ssierral Jul 03 '14 at 05:37
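
The suggestions in the comments above can be sketched as follows. This is a minimal sketch, not tested on the asker's cluster: the solver filenames are hypothetical placeholders, and it assumes train_net.bin takes a solver file as its argument. CUDA_VISIBLE_DEVICES remaps which physical GPUs a process can see, so each process believes it is using "device 0" while actually running on a different card:

```shell
# Pin each Caffe training process to its own physical GPU.
# Inside each process, the visible GPU is renumbered as device 0,
# so the solver's device_id can stay at 0 in both cases.
# (solver_imagenet.prototxt / solver_subset.prototxt are placeholder names.)
CUDA_VISIBLE_DEVICES=0 ../../../build/tools/train_net.bin solver_imagenet.prototxt &
CUDA_VISIBLE_DEVICES=1 ../../../build/tools/train_net.bin solver_subset.prototxt &
wait  # block until both background training jobs finish
```

Alternatively, as Min Lin notes, setting device_id: 1 in solver.prototxt selects GPU 1 from within Caffe itself, with no environment variable needed; the environment-variable approach is just more convenient when launching the same binary several times with different GPUs.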

0 Answers