[CRIU] [PATCH 1/3] seccomp: add initial support for SECCOMP_MODE_STRICT
Pavel Emelyanov
xemul at parallels.com
Tue Jun 2 06:46:32 PDT 2015
On 06/02/2015 04:38 PM, Tycho Andersen wrote:
> On Tue, Jun 02, 2015 at 03:40:24PM +0300, Pavel Emelyanov wrote:
>> Cool :) And can you elaborate on why exactly the filtered mode cannot be
>> supported?
>
> Yes, simply because we can't read the filters out of the kernel
> (they're not exposed to userland at all). I've mailed the seccomp
> maintainer about how to expose these, but he just had a baby and so is
> not very responsive right now. Hopefully I'll hear back about how to
> do this soon :)
Ah, I see.
>>> @@ -756,6 +758,36 @@ static int collect_helper_pids()
>>> return 0;
>>> }
>>>
>>> +/*
>>> + * Prepare the seccomp state for the process. Note that seccomp is restored in
>>> + * suspended mode, and then unsuspended after the restorer blob completes in
>>> + * finalize_restore().
>>> + */
>>
>> I'm not sure I get this. The prctl below will just turn the unsuspended strict mode
>> on. The ptrace-s in fork_with_pid() will try to suspend one, but this looks racy.
>> What prevents the prctl from being called _before_ the mentioned ptrace-s?
>
> The ptraces in fork_with_pid happen during CR_STATE_FORKING (this
> state is finished in restore_task_with_children, but after
> create_children_and_session is called, which itself does more forking
> and ptracing in fork_with_pid), but the prctl here happen in
> CR_STATE_RESTORE. I think it should be ok.
Ah, so the ptrace suspend of seccomp works even if seccomp itself is not
turned on. OK :)
>>> @@ -1135,13 +1170,57 @@ static inline int fork_with_pid(struct pstree_item *item)
>>> goto err_unlock;
>>> }
>>>
>>> -
>>> if (item == root_item) {
>>> item->pid.real = ret;
>>> pr_debug("PID: real %d virt %d\n",
>>> item->pid.real, item->pid.virt);
>>> }
>>>
>>> + if (ca.core->tc->seccomp_mode != SECCOMP_MODE_DISABLED) {
>>> +#ifdef CONFIG_HAS_SUSPEND_SECCOMP
>>
>> Can we avoid $ifdef-s in the middle of the code, please?
>
> Sure, what's the usual pattern for doing this without them? I could
> look for EIO from PTRACE_SUSPEND_SECCOMP (indicating a bad ptrace
> command), unless you have some better way in mind?
Typically this is done like this
#ifdef FOO
static void foo(void)
{
/* useful code here */
}
#else
static void foo(void)
{
/* relevant stub */
}
#endif
...
static void bar(void)
{
...
foo();
...
}
This way is also nice because foo() can be put into separate .c
file with some descriptive name.
>>> + int status;
>>> +
>>> + /*
>>> + * Since the process is going to restore seccomp state, we
>>> + * need to temporarily suspend seccomp before we restore this
>>> + * state, otherwise subsequent restore syscalls might be
>>> + * blocked. seccomp is resumed at the end in finalize_restore()
>>> + */
>>> + if (ptrace(PTRACE_ATTACH, ret, NULL, NULL) < 0) {
>>> + pr_perror("ptrace attach failed");
>>> + kill(ret, SIGKILL);
>>> + goto err_unlock;
>>> + }
>>> +
>>> + if (waitpid(ret, &status, 0) < 0) {
>>> + pr_perror("waidpid failed");
>>> + kill(ret, SIGKILL);
>>> + goto err_unlock;
>>> + }
>>> +
>>> + if (!WIFSTOPPED(status)) {
>>> + pr_err("not stopped after ptrace attach?\n");
>>> + kill(ret, SIGKILL);
>>> + goto err_unlock;
>>> + }
>>> +
>>> + if (ptrace(PTRACE_SUSPEND_SECCOMP, ret, NULL, 1) < 0) {
>>> + pr_perror("seccomp suspend failed %d", ret);
>>> + kill(ret, SIGKILL);
>>> + goto err_unlock;
>>> + }
>>> +
>>> + if (ptrace(PTRACE_DETACH, ret, NULL, NULL) < 0) {
>>> + pr_perror("ptrace detach failed");
>>> + kill(ret, SIGKILL);
>>> + goto err_unlock;
>>> + }
Oh, btw, once you implement suspend-off on ptrace detach this place will
stop working.
-- Pavel
More information about the CRIU
mailing list