Wrapper for Remote Executor
BAMPI currently uses a remote executor to run various types of tasks. Unlike Ironic’s agent pattern, the remote executor is based on SSH. A high-availability design has to consider many failure scenarios, one of which is “BAMPI-1 fails during task execution”. In this scenario the remote executor is dead, but the task it started is still running on the target bare-metal machine. So how does BAMPI-2 take over?
Through some experiments, we found that although the SSH connection is broken,
the processes started by that SSH session keep running. The real question is
how to prevent BAMPI-2 from running new tasks on a bare-metal machine that
already has a running task. To tell whether a bare-metal machine still has a
previously started task, we can use ps. For example, when Bob logs in to a
server, executes sleep 100 &, and then logs out, that “sleep” process is still
running in the background. However, its parent process is no longer Bob’s
shell. Processes of this kind are called “orphan processes”: an orphan process
is a computer process whose parent process has finished or terminated, though
it remains running itself. Orphan processes are adopted by init, which has
PID 1.
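As a rough illustration of the ps check, the Python sketch below parses header-less output of the form produced by ps -e -o pid= -o ppid= -o comm= and lists the processes whose parent is PID 1. The helper names (parse_ps, adopted_by_init) and the sample output are assumptions made for this example; they are not part of BAMPI.

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    pid: int
    ppid: int
    comm: str


def parse_ps(output: str) -> List[Proc]:
    """Parse header-less `ps -e -o pid= -o ppid= -o comm=` output."""
    procs = []
    for line in output.splitlines():
        parts = line.split(None, 2)  # pid, ppid, rest of line (command name)
        if len(parts) != 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip blank or malformed lines
        procs.append(Proc(int(parts[0]), int(parts[1]), parts[2].strip()))
    return procs


def adopted_by_init(procs: List[Proc]) -> List[Proc]:
    """Return the processes whose parent process ID is 1."""
    return [p for p in procs if p.ppid == 1]


# Made-up sample: Bob's "sleep" (PID 4242) was re-parented to PID 1 after he
# logged out; sshd (PID 812) is a daemon that init started itself.
SAMPLE = """\
    1     0 systemd
  812     1 sshd
 4242     1 sleep
 4300   812 sshd
"""

if __name__ == "__main__":
    for proc in adopted_by_init(parse_ps(SAMPLE)):
        print(proc.pid, proc.comm)
```

In this sample both the sshd daemon and Bob's leftover sleep have PID 1 as their parent, so the parent PID by itself does not identify a leftover task.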
So the answer is clear: find the orphan processes and wipe them out! But other
processes may also have a parent process ID of 1, e.g. processes spawned by
init itself, or orphan processes that were not started by BAMPI in the first
place. BAMPI can also run various types of tasks, so there is no single task
name to look for. So one suggestion is to write, in addition to BAMPI’s remote
executor, a “wrapper” that runs on the bare-metal machine and starts the task.
We can then combine two conditions: the parent process ID is 1 and the process
name equals the name of the wrapper. Ta-da!
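A minimal sketch of how BAMPI-2 might apply the two conditions is shown below. The wrapper name bampi-wrapper, the function names, and the way ps is invoked over SSH are all hypothetical choices for illustration; the actual wrapper and executor may look different.

```python
import subprocess
from typing import List

# Hypothetical wrapper name. On Linux the "comm" field is truncated to
# 15 characters, so a short name keeps the comparison exact.
WRAPPER_NAME = "bampi-wrapper"

# Header-less ps output: one "pid ppid comm" triple per line.
PS_COMMAND = ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "comm="]


def orphaned_wrapper_pids(ps_output: str,
                          wrapper_name: str = WRAPPER_NAME) -> List[int]:
    """Return PIDs of processes with parent PID 1 and the wrapper's name."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip blank or malformed lines
        pid, ppid, comm = int(parts[0]), int(parts[1]), parts[2].strip()
        if ppid == 1 and comm == wrapper_name:
            pids.append(pid)
    return pids


def host_is_busy(host: str) -> bool:
    """Run ps on the target over SSH (assumes key-based, non-interactive login)."""
    result = subprocess.run(
        ["ssh", host] + PS_COMMAND,
        capture_output=True, text=True, check=True, timeout=30,
    )
    return bool(orphaned_wrapper_pids(result.stdout))
```

With this approach BAMPI-2 would call host_is_busy before dispatching a new task and either wait for or kill the returned PIDs. One caveat under these assumptions: if the wrapper is a script launched as an argument to an interpreter (for example sh wrapper.sh), the process name reported by ps is the interpreter's, so the wrapper would need to be executed directly for the name check to match.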