[Devel] Re: pid namespace bug ?

Ferenc Wagner wferi at niif.hu
Fri May 7 13:55:17 PDT 2010


Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com> writes:

> Ferenc Wagner [wferi at niif.hu] wrote:
>
>| Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com> writes:
>| 
>|> Daniel Lezcano [daniel.lezcano at free.fr] wrote:
>|>
>|>> Ferenc Wagner wrote:
>|>>
>|>>> That is, the jailed sleep process could be killed by SIGKILL only, even
>|>>> though (according to strace) SIGTERM was delivered and it isn't handled
>|>>> specially.  Why does this happen?
>|>
>|> Yes, SIGKILL is the only reliable way to terminate a container-init.
>|> container-init needs to be immune to signals from within the container
>|> but be open to receiving signals from parent container.  These requirements
>|> complicate the implementation of allowing SIGINT/SIGTERM etc to
>|> container-init from parent container.
>|>
>|> Besides a realistic container-init would block such signals, in which case
>|> the complexity in the kernel could be viewed as unnecessary.
>| 
>| For full-system containers this is acceptable, but for running batch
>| jobs this may prove problematic.  Is this behaviour documented somewhere?
>| Is this specific to SIGINT/SIGTERM or are other signals affected as well?
>
> Let me clarify - if the container-init has a handler for the signal, the
> signal will be delivered. _Unhandled_ signals whose default action is to
> terminate or stop the process will be ignored by cinit, unless the signal
> is SIGKILL/SIGSTOP and the sender is in the parent container.
>
> So to terminate a cinit from parent namespace you need SIGKILL. But other
> signals will be delivered to cinit only if it has a handler.
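
The rule as stated here can be modelled as a small decision function.
This is only a toy sketch of the semantics described in the quoted text,
not the kernel's actual logic. The function name cinit_receives and its
parameters are made up for illustration, and it covers only signals whose
default action is to terminate or stop the process.

```python
import signal

# Signals that an ordinary process can never catch, block or ignore.
UNCATCHABLE = {signal.SIGKILL, signal.SIGSTOP}


def cinit_receives(sig, has_handler, sender_in_parent_ns):
    """Toy model of the quoted rule; hypothetical helper, not kernel code.

    Returns True if a container-init would act on `sig`.
    """
    if sig in UNCATCHABLE:
        # SIGKILL/SIGSTOP take effect only when sent from the parent container.
        return sender_in_parent_ns
    # Any other signal reaches cinit only if cinit installed a handler for it.
    return has_handler
```

Under this model, cinit_receives(signal.SIGTERM, False, True) is False,
which matches the observed behaviour of a cinit running a plain sleep.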

Thanks for clarifying.  How does the above apply to signalfds?  Will
those deliver the signals which would otherwise have been ignored by
cinit, having no handler installed?

>| They are used for communication (job control) with the container running
>| the job.  Such batch jobs are typically run under the supervision of
>| some kind of "shepherd" process, which acts as "init" for the job
>| environment; in my case it's the container-init.  It's the reaper of
>| possibly orphaned processes and at the same time it communicates with
>| the job scheduler (outside of the container) via signals.
>
> So can this job scheduler install handlers for SIGINT/SIGTERM/SIGQUIT ?

The scheduler is outside of the container, so I suppose you mean the
shepherd process, which is the container init.  Yes, it already has
handlers for each signal it's interested in, so according to the above,
everything should work as expected (once we get the signals forwarded to
it).
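
For illustration, the skeleton of such a shepherd could look roughly like
the following.  This is a minimal sketch, not the actual shepherd code:
the names received, note_signal, install_handlers and reap_children are
hypothetical, and the choice of SIGINT/SIGTERM/SIGQUIT simply follows the
signals mentioned in this thread.

```python
import os
import signal

received = []  # signal numbers seen so far, in arrival order


def note_signal(signum, frame):
    # Keep the handler minimal: just record the signal for the main loop.
    received.append(signum)


def install_handlers(signums=(signal.SIGINT, signal.SIGTERM, signal.SIGQUIT)):
    # With a handler installed, these signals are no longer "unhandled".
    for signum in signums:
        signal.signal(signum, note_signal)


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A main loop would call install_handlers() once at startup, then
alternate between inspecting received and calling reap_children().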

>| So I'd consider at least some kernel complexity necessary for Linux
>| containers becoming a viable tool for batch job segregation.
>
> Yes, it is annoying that we can't CTRL-C a cinit running /bin/sleep, but
> this behavior should not be too limiting to a more functional cinit.

Indeed.  I misunderstood you on first read.

> I had submitted a verbose man page patch for kill(2) to describe these
> semantics, but the following paragraph in the NOTES section of kill(2)
> already alludes to this behavior:
>
>        The only signals that can be sent to process ID 1, the init
>        process, are those for which init has explicitly installed signal
>        handlers.  This is done to assure the system is not brought down
>        accidentally.

I even read that paragraph recently.  I didn't think it would apply,
though, as I was trying to kill cinit from the outer namespace, where it
had a generic PID, not 1.  Your effort to expand the man page of kill(2)
is most appreciated; I hope it will land soon!
-- 
Thanks,
Feri.