I am very new to server-side operations, and I want to write an API that starts model training on a GPU server and lets the training output be displayed to the client. Since more than one person uses the server, we work in Docker containers so that our operations do not interfere with each other. I want to use the code below in my future API. The code simply connects to the main server over SSH; I tried to use Paramiko for that purpose. Then I want to get into my container, activate my Anaconda environment in that container, and run a Python script to start training.

import paramiko


host = "xxx.xx.x.xxx"       # actual host IP
username = "support"
password = "password"

client = paramiko.client.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host, username=username, password=password)
_stdin, _stdout, _stderr = client.exec_command("pwd")
_stdin.close()
print(_stdout.read().decode())              # It prints the directory in the main server --> /home/support 
print(_stderr.read().decode())
_stdin, _stdout, _stderr = client.exec_command("docker exec -it <container id> /bin/bash; pwd")
_stdin.close()
print(_stdout.read().decode())              # It still prints the directory in the main server --> /home/support. I guess it does not get into container.
print(_stderr.read().decode())              
_stdin, _stdout, _stderr = client.exec_command("docker exec -it <container id> /bin/bash; conda activate <my conda env>; python train.py")       # The real command series that I want to do.
_stdin.close()
print(_stdout.read().decode())
print(_stderr.read().decode())              # It prints the  
                                            # input device is not a TTY
                                            # bash: conda: command not found

client.close()

But it does not work the way I imagined. I could not figure out how to get into my container. Since this approach did not work, I also tried to connect to the container over SSH, but that did not work either.

My question is: how can I perform the operation I described?

emretasar
  • Use (single) quotation marks around the command to run inside the container. – Klaus D. Jun 09 '22 at 07:16
  • Basically just a variant of [Execute (sub)commands in secondary shell/command on SSH server in Python Paramiko](https://stackoverflow.com/q/58452390/850848) and [Execute multiple commands in Paramiko so that commands are affected by their predecessors](https://stackoverflow.com/q/49492621/850848). – Martin Prikryl Jun 09 '22 at 07:28
  • I'd highly recommend taking `docker exec` out of this. Design a container that starts up with the virtual environment already loaded, run the training task, then exits. Then when you connect to the remote system you only need to run one `docker run` command and you don't need to deal with its stdin/stdout. You may even be able to use the Docker SDK with an ssh transport in that case. – David Maze Jun 09 '22 at 10:32
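
The `docker run` suggestion in the last comment could be sketched roughly as follows. This is only an illustration under assumptions: the image name `my-training-image` is hypothetical, the image is assumed to start training by itself with its environment already set up, and `--gpus all` assumes the host has GPU support configured for Docker. `stream_output` works on any line-iterable object, such as the `stdout` returned by Paramiko's `exec_command`.

```python
import io
import shlex


def docker_run_command(image, gpus="all"):
    """Build one host-side command that starts a self-contained training container.

    `image` is a hypothetical image whose default command runs the training.
    """
    return "docker run --rm --gpus {} {}".format(shlex.quote(gpus), shlex.quote(image))


def stream_output(stdout, handle_line=print):
    """Pass each output line to handle_line as it arrives; return the line count."""
    count = 0
    for line in stdout:
        handle_line(line.rstrip("\n"))
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in for the remote stdout, so the sketch runs without a server.
    fake_stdout = io.StringIO("epoch 1\nepoch 2\n")
    print(docker_run_command("my-training-image"))
    stream_output(fake_stdout)
```

With a connected client, the same pieces would be used as `_stdin, stdout, _stderr = client.exec_command(docker_run_command("my-training-image"))` followed by `stream_output(stdout)`.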

2 Answers

I do not know Paramiko very well. It could be that it has trouble switching to a terminal inside a terminal and handling that correctly.

I'd focus on logging into the containers directly with SSH. That is of course only possible if the containers are reachable on your network (but I assume so, since you mentioned multiple people working on the server).

You'd have to make sure that the Docker containers have their SSH port exposed to your network (possible security risk!) and that you know the credentials (or, more secure still, switch to certificate-based login).

Svenito

When you use ; as the separator between commands, they become separate commands on the host. So in your first docker exec command, the pwd after the ; is run on the host, not in the container.

Also, the -it option shouldn't be used here, since you're not interacting with the container from your tty.

Try these two commands:

_stdin, _stdout, _stderr = client.exec_command("docker exec <container id> /bin/bash -c pwd")

and

_stdin, _stdout, _stderr = client.exec_command("docker exec <container id> /bin/bash -c 'conda activate <my conda env> && python train.py'")       # The real command series that I want to do.

I've changed <image id> to <container id> in the commands since what you're interacting with is a container and not an image. It's probably what you meant, so I don't think that's where your issue comes from.
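
To make the quoting less error-prone, the two commands above can be generated by a small helper. This is a minimal sketch, not a definitive implementation: the helper names and the container name `trainer` are made up for illustration, and `run_remote` assumes `client` is an already connected `paramiko.SSHClient`.

```python
import shlex


def docker_exec_command(container, inner_command):
    """Build one host-side command that runs inner_command inside the container.

    shlex.quote wraps the inner command in single quotes, so the host shell
    passes ';' and '&&' through to bash in the container instead of acting on them.
    """
    return "docker exec {} /bin/bash -c {}".format(
        shlex.quote(container), shlex.quote(inner_command)
    )


def run_remote(client, command):
    """Run command over a connected paramiko.SSHClient (assumed, not created here)."""
    _stdin, stdout, stderr = client.exec_command(command)
    _stdin.close()
    out = stdout.read().decode()
    err = stderr.read().decode()
    # Exit status of the remote command, available once it has finished.
    exit_code = stdout.channel.recv_exit_status()
    return exit_code, out, err


if __name__ == "__main__":
    # "trainer" is a placeholder container name.
    print(docker_exec_command("trainer", "cd /workspace && pwd"))
```

Note that there is no `-it` in the generated command, for the reason given above.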

Hans Kilian
  • Thank you very much for your interest. I did what you said and it gets into the container now. It prints the directory inside the Docker container. But when I do the second part it gives an error message `/bin/bash: conda: command not found`. But I can activate the same environment inside the terminal with the same command. Why might this be happening? Also, you were right, I meant the container name and edited it in the code. – emretasar Jun 09 '22 at 07:37
  • That might be because `conda` isn't in the path. Try specifying the full path for it in your command. There are some programs that only modify the path when you're in an interactive shell, which might be why it's working interactively and not from a program. – Hans Kilian Jun 09 '22 at 07:38
  • Thank you very much. When I give the full path while activating, it works very well. You helped me a lot. – emretasar Jun 09 '22 at 07:57
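
For readers hitting the same `conda: command not found` error, the full-path idea from the comments might look like the sketch below. The thread does not say which path was used, so `/opt/conda/bin/conda`, the environment name `myenv`, and the container name `trainer` are all hypothetical placeholders. The sketch also swaps `conda activate` for `conda run -n`, which is an alternative that does not depend on shell initialization, not the exact command the asker ended up with.

```python
import shlex


def conda_train_command(container, conda_bin, env_name, script="train.py"):
    """Build a docker exec command that runs script inside a conda environment.

    conda_bin is the absolute path to the conda executable inside the container,
    so the command does not rely on PATH changes made only in interactive shells.
    """
    inner = "{} run -n {} python {}".format(
        shlex.quote(conda_bin), shlex.quote(env_name), shlex.quote(script)
    )
    return "docker exec {} /bin/bash -c {}".format(
        shlex.quote(container), shlex.quote(inner)
    )


if __name__ == "__main__":
    # All three values are placeholders; check the real ones inside your container,
    # for example by running `which conda` in an interactive shell there.
    print(conda_train_command("trainer", "/opt/conda/bin/conda", "myenv"))
```

The resulting string can be passed to `client.exec_command` in the same way as the commands in the answer.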