[Devel] Re: pid namespace bug ?

Sukadev Bhattiprolu sukadev at linux.vnet.ibm.com
Fri May 7 10:46:46 PDT 2010


Ferenc Wagner [wferi at niif.hu] wrote:
| Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com> writes:
| 
| > Daniel Lezcano [daniel.lezcano at free.fr] wrote:
| >
| >> Ferenc Wagner wrote:
| >>
| >>> That is, the jailed sleep process could be killed by SIGKILL only, even
| >>> though (according to strace) SIGTERM was delivered and it isn't handled
| >>> specially.  Why does this happen?
| >
| > Yes, SIGKILL is the only reliable way to terminate a container-init.
| > container-init needs to be immune to signals from within the container
| > but be open to receiving signals from parent container.  These requirements
| > complicate the implementation of allowing SIGINT/SIGTERM etc to
| > container-init from parent container.
| >
| > Besides a realistic container-init would block such signals, in which case
| > the complexity in the kernel could be viewed as unnecessary.
| 
| For full-system containers this is acceptable, but for running batch
| jobs this may prove problematic.  Is this behaviour documented somewhere?
| Is this specific to SIGINT/SIGTERM or are other signals affected as well?

Let me clarify: if the container-init (cinit) has a handler for the signal,
the signal will be delivered. _Unhandled_ signals whose default action is to
terminate or stop the process are ignored by cinit, unless the signal is
SIGKILL or SIGSTOP and the sender is in a parent container.

So to terminate a cinit from the parent namespace you need SIGKILL. Any other
signal will be delivered to cinit only if it has a handler for it.
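
To illustrate the handler case, here is a minimal sketch of what a cinit
could do at startup so that SIGINT/SIGTERM/SIGQUIT sent from the parent
container are actually delivered. It is written in Python only for brevity.
The names install_handlers, on_signal, received and wait_for_signal are made
up for this example, and the choice of signals is an assumption.

```python
import signal
import time

# Signal numbers seen so far (hypothetical bookkeeping for this sketch).
received = []

def on_signal(signum, frame):
    # Just record the signal; the main loop decides how to react.
    received.append(signum)

def install_handlers(signals=(signal.SIGINT, signal.SIGTERM, signal.SIGQUIT)):
    # Once a handler is installed the signal is no longer "unhandled",
    # so a cinit receives it instead of having it dropped.
    for sig in signals:
        signal.signal(sig, on_signal)

def wait_for_signal(poll_interval=0.1):
    # Poll rather than pause() so that a signal arriving between the
    # check and the sleep cannot be missed.
    while not received:
        time.sleep(poll_interval)
    return received[0]
```

A cinit would call install_handlers() first thing, then either do its real
work or sit in wait_for_signal() until the parent container asks it to stop.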

| They are used for communication (job control) with the container running
| the job.  Such batch jobs are typically run under the supervision of
| some kind of "shepherd" process, which acts as "init" for the job
| environment; in my case it's the container-init.  It's the reaper of
| possibly orphaned processes and at the same time it communicates with
| the job scheduler (outside of the container) via signals.

So can this shepherd (the container-init) install handlers for
SIGINT/SIGTERM/SIGQUIT?

| So I'd consider
| at least some kernel complexity necessary for Linux containers becoming
| a viable tool for batch job segregation.

Yes, it is annoying that we can't CTRL-C a cinit running /bin/sleep, but
this behavior should not be too limiting to a more functional cinit.
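
As a rough sketch of such a "more functional" cinit, the following shepherd
starts the job as a child, relays signals to it, and reaps whatever exits.
It is only an illustration under assumptions: the function name shepherd,
the FORWARDED set and the exit-code convention are invented here, and it
needs Python 3.9 or later for os.waitstatus_to_exitcode().

```python
import os
import signal

# Signals relayed to the job; this particular set is an assumption.
FORWARDED = (signal.SIGINT, signal.SIGTERM, signal.SIGQUIT)

def shepherd(argv, forwarded=FORWARDED):
    """Run argv as the job, relay signals to it, and return its exit code."""
    job = {"pid": None}

    def relay(signum, frame):
        # Pass the signal on to the job, which is an ordinary process
        # and so still gets the default action.
        if job["pid"] is not None:
            try:
                os.kill(job["pid"], signum)
            except ProcessLookupError:
                pass  # the job has already exited

    for sig in forwarded:
        signal.signal(sig, relay)

    # Hold the signals back until the job's pid is recorded.
    signal.pthread_sigmask(signal.SIG_BLOCK, forwarded)
    pid = os.fork()
    if pid == 0:
        # Child: restore default handling and undo the inherited mask,
        # which would otherwise survive the exec.
        for sig in forwarded:
            signal.signal(sig, signal.SIG_DFL)
        signal.pthread_sigmask(signal.SIG_UNBLOCK, forwarded)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    job["pid"] = pid
    signal.pthread_sigmask(signal.SIG_UNBLOCK, forwarded)

    # Reap every child that exits, including orphans reparented to us,
    # until the job itself is done.
    while True:
        reaped, status = os.wait()
        if reaped == pid:
            job["pid"] = None
            return os.waitstatus_to_exitcode(status)
```

For example, shepherd(["sh", "-c", "exit 3"]) returns 3. A real cinit
would also have to decide what to do with processes still running after
the job exits; this sketch simply returns.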

I had submitted a verbose man page patch for kill(2) to describe these
semantics, but the following paragraph in the NOTES section of kill(2)
does allude to this behavior:

       The only signals that can be sent to process ID 1, the init
       process, are those for which init has explicitly installed signal
       handlers.  This is done to assure the system is not brought down
       accidentally.

See: 
	http://www.kernel.org/doc/man-pages/online/pages/man2/kill.2.html


Thanks,

Sukadev
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers



