[Devel] questions on capabilities (PSBM-40837)

Konstantin Khorenko khorenko at virtuozzo.com
Wed Nov 11 04:30:44 PST 2015


https://jira.sw.ru/browse/PSBM-40837

Evgenii Shatokhin added a comment - 09/Nov/15 5:13 PM
 > Here are the results for the capabilities in the containers.
 > ...
 > The following ones do not work (or work only partially) for the users in the container, including root:
 > * sys_module: loading/unloading of a kernel module fails.
We had use cases for that (loading iptables modules), but currently I don't know whether anybody uses this functionality
(because we've implemented an autoload feature for that particular use case).

======================
 > * setfcap: setting file capabilities fails.
Let's not allow this, to make the system harder to crack:
otherwise, if a process manages to get out of its ns, it can execute a pre-created CT-owned file with the maximum security.capability set
and gain control over the system.

======================
 > * fsetid: root can set/remove setuid flag for any executables but changing a setuid binary results in the dropped setuid flag.
It makes sense to switch the check to ve_capable() in cap_bprm_set_creds(), but only after we enable setfcap, which we are not going to do now.
=> leave as is.
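
For reference, the reported symptom can be reproduced from user space with something like the sketch below. It is only an illustration, not the test that was actually run: the helper name and the /tmp path are made up, and it assumes /tmp is writable and that the filesystem strips the setuid bit on write for a writer without CAP_FSETID.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a temporary file, mark it setuid, write to it
 * and check whether the setuid bit is still there afterwards.
 *
 * Returns 1 if the bit survived the write, 0 if it was dropped,
 * -1 if the temporary file could not be set up.
 */
int setuid_survives_write(void)
{
    char path[] = "/tmp/fsetid-XXXXXX";   /* assumed scratch location */
    struct stat st;
    int fd, result = -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;

    /* Set the setuid bit, modify the file, then re-read its mode. */
    if (fchmod(fd, 04755) == 0 &&
        write(fd, "x", 1) == 1 &&
        fstat(fd, &st) == 0)
        result = (st.st_mode & S_ISUID) ? 1 : 0;

    close(fd);
    unlink(path);
    return result;
}
```

A result of 0 for root inside a CT corresponds to the behaviour described in the comment above; on the host, root is expected to get 1.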

======================
 > * linux_immutable: "chattr +i {file}" does not work in the container but works on the host.
Well, it seems safe to enable this in a Container running on ploop,
but it is inconvenient when the fs is shared between Containers, which is going to be the case for simfs in vz7
(because if someone sets the immutable attr inside a CT, prctl won't be able to destroy the CT when needed;
the immutable attr has to be removed first).

=> I'll add a dev task for the future: if someone needs support for file attrs inside a CT,
we need to enable it for the ploop case only, and prohibit it when simfs (or any shared fs) is used (some flag on the superblock?).
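
A rough user-space equivalent of "chattr +i", in case someone wants to re-check this later. This is a sketch under assumptions: the helper name and the /tmp path are hypothetical, and the filesystem under /tmp may not support inode flags at all, in which case the ioctl fails with something other than EPERM.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/fs.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to set the immutable flag on a scratch file.
 *
 * Returns 0 if the flag could be set (it is cleared again before returning),
 * or a negative errno value, e.g. -EPERM without CAP_LINUX_IMMUTABLE.
 */
int try_set_immutable(void)
{
    char path[] = "/tmp/immutable-XXXXXX";   /* assumed scratch location */
    int fd, flags, rc = 0;

    fd = mkstemp(path);
    if (fd < 0)
        return -errno;

    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) != 0) {
        rc = -errno;                  /* fs does not support inode flags */
    } else {
        flags |= FS_IMMUTABLE_FL;
        if (ioctl(fd, FS_IOC_SETFLAGS, &flags) != 0) {
            rc = -errno;              /* typically -EPERM inside a CT */
        } else {
            /* Clear the flag again, otherwise unlink() below would fail. */
            flags &= ~FS_IMMUTABLE_FL;
            ioctl(fd, FS_IOC_SETFLAGS, &flags);
        }
    }

    close(fd);
    unlink(path);
    return rc;
}
```

Note that the cleanup step is exactly the inconvenience mentioned above: a file left immutable cannot be removed until the attr is cleared.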

======================
 > * sys_nice (decreasing niceness does not work).
Let's not allow it.
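
For completeness, a minimal sketch of how "decreasing niceness" can be probed from inside a CT. The helper name is hypothetical; note also that on Linux a non-default RLIMIT_NICE can permit this without CAP_SYS_NICE, so a success does not necessarily mean the capability is present.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: try to lower the nice value of the calling process
 * by one step.
 *
 * Returns 0 on success (the old value is restored afterwards), 1 if the
 * process is already at the minimum nice value, or a negative errno value.
 * On Linux setpriority(2) reports EACCES when the caller may not lower
 * its nice value.
 */
int try_decrease_nice(void)
{
    int cur;

    /* getpriority() may legitimately return -1, so errno must be checked. */
    errno = 0;
    cur = getpriority(PRIO_PROCESS, 0);
    if (cur == -1 && errno != 0)
        return -errno;
    if (cur <= -20)
        return 1;

    if (setpriority(PRIO_PROCESS, 0, cur - 1) != 0)
        return -errno;

    /* Raising niceness back does not need any capability. */
    setpriority(PRIO_PROCESS, 0, cur);
    return 0;
}
```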

======================
 > * sys_resource: ulimit allows to increase some of the limits (core file size, stack size, ...) but not all.
 > For example, it cannot increase "max user processes" limit.
do_prlimit():
...
                 /* Keep the capable check against init_user_ns until
                    cgroups can contain all limits */
                 if (new_rlim->rlim_max > rlim->rlim_max &&
                                 !capable(CAP_SYS_RESOURCE))
                         retval = -EPERM;

Let it be as is.
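
The check quoted above can be hit from user space roughly as follows. This is only a sketch: the helper name is made up, and it simply tries to raise the hard limit of a given resource by one.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: try to raise the hard limit of `resource` by one.
 *
 * Returns 0 on success (the old limits are restored afterwards), 1 if the
 * hard limit is already unlimited, or a negative errno value; -EPERM is
 * what the capable(CAP_SYS_RESOURCE) check in do_prlimit() produces.
 */
int try_raise_hard_limit(int resource)
{
    struct rlimit old_lim, want;

    if (getrlimit(resource, &old_lim) != 0)
        return -errno;
    if (old_lim.rlim_max == RLIM_INFINITY)
        return 1;                       /* nothing left to raise */

    want = old_lim;
    want.rlim_max = old_lim.rlim_max + 1;
    if (setrlimit(resource, &want) != 0)
        return -errno;

    /* Lowering the hard limit again is always allowed. */
    setrlimit(resource, &old_lim);
    return 0;
}
```

Calling try_raise_hard_limit(RLIMIT_NPROC) corresponds to the "max user processes" case from the comment.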

======================
 > * sys_pacct - I sent a simple patch to make it work (https://jira.sw.ru/browse/PSBM-40587).
Good, will apply.

======================
 > * audit_write, but it makes little sense without CAP_AUDIT_CONTROL (turn audit on/off and adjust its rules).
It's OK, we will enable it if there is demand.


Evgenii, thank you.

--
Konstantin

On 11/09/2015 05:37 PM, Evgenii Shatokhin wrote:
> Good day once again, Konstantin!
>
> Regarding PSBM-40837: I have reviewed the capabilities and noted in the
> bug which ones work and which do not.
>
> I have doubts about 4 of the capabilities.
>
> Should processes in a container be allowed to:
>
> 1) decrease their 'nice number' (CAP_SYS_NICE capability)?
>
> 2) increase the "max user processes" limit for a process (CAP_SYS_RESOURCE
> capability)?
>
> 3) prohibit changes to particular files using "chattr +i {file}"
> or something similar (CAP_LINUX_IMMUTABLE capability)?
>
> 4) set and remove capabilities for executable files (CAP_SETFCAP
> capability)?
>
> None of these currently work in containers.
>
> In my opinion, containers do not need (3) and (4), but you never know.
> As for (1) and (2), it would be safer not to grant them, but perhaps there
> are situations where they are needed?
>
> The remaining capabilities, in my opinion, can be left as they are and
> not "passed through" to containers.
>
> Evgenii

