[Devel] [PATCH rh7] fs: make overlayfs disabled in CT by default

Wed Jul 6 10:33:07 PDT 2016

On 07/06/2016 02:26 AM, Vladimir Davydov wrote:

> On Tue, Jul 05, 2016 at 04:45:10PM -0700, Maxim Patlasov wrote:
>> Vova,
>>
>>
>> On 07/04/2016 11:03 AM, Maxim Patlasov wrote:
>>> On 07/04/2016 08:53 AM, Vladimir Davydov wrote:
>>>
>>>> On Tue, Jun 28, 2016 at 03:48:54PM -0700, Maxim Patlasov wrote:
>>>> ...
>>>>> @@ -643,6 +643,7 @@ static struct cgroup_subsys_state
>>>>> *ve_create(struct cgroup *cg)
>>>>>         ve->odirect_enable = 2;
>>>>>       ve->fsync_enable = 2;
>>>>> +    ve->experimental_fs_enable = 2;
>>>> For odirect_enable and fsync_enable, 2 means follow the host's config, 1
>>>> means enable unconditionally, and 0 means disable unconditionally. But
>>>> we don't want to allow a user inside a CT to enable this feature, right?
>>> I thought it's OK to allow user inside CT to enable it if host sysadmin is
>>> OK about it. The same logic as for odirect: by default
>>> ve0->experimental_fs_enable = 0, so whatever user inside CT writes to this
>>> knob, the feature is disabled. If sysadmin writes '1' to ve0->..., the
>>> feature becomes enabled. If an user wants to voluntarily disable it inside
>>> CT, that's OK too.
>>>
>>>> This is confusing. May be, we'd better add a new VE_FEATURE for the
>>>> purpose?
>>> Not sure right now. I'll look at it and let you know later.
>> Technically, it is very easy to implement new VE_FEATURE for overlayfs. But
>> this approach is less flexible because we return EPERM from ve_write_64 if
>> CT is running, and we'll need to involve userspace team to make the feature
>> configurable and (possibly) persistent. Do you think it's worthy for
>> something we'll get rid of soon anyway (I mean as soon as PSBM-47981
>> resolved)?
> Fair enough, not much point in introducing yet another feature for the
> purpose, at least right now, sysctl should do for the beginning.
>
> Come to think of it, do we really need this sysctl inside containers? I
> mean, by enabling this sysctl on the host we open a possible system-wide
> security hole, which a CT admin won't be able to mitigate by disabling
> overlayfs inside her CT. So why would she need it for? To prevent
> non-privileged CT users from mounting overlayfs inside a user ns? But
> overlayfs is not permitted to be mounted by a userns root anyway AFAICS.
> May be, just drop in-CT sysctl then?

Currently, anyone who can login into CT as root may mount overlayfs, 
then try to exploit its weak sides. This is a problem.

Until we ensure that overlayfs is production-ready (at least does not 
have obvious breaches), let's disable it by default (of course, if ve != 
ve0). Those who want to play with overlayfs at their own risk will 
enable it by turning on some knob on host system (ve == ve0).

I don't think that mixing trusted (overlayfs-enabled) CTs and not 
trusted (overlayfs-disabled) CTs on the same physical node is important 
use-case for now. So, any simple system-wide knob must work. 
Essentially, the same scheme with odirect: by default it is '0' in ve0 
and the root inside CT cannot turn it on; and if it is manually set to 
'1' in ve0, the behavior will depend on per-CT root willing.

Thanks,
Maxim