[Devel] [PATCH VZ10 2/2] ve: Add bpf_prog_max_nr/bpf_prog_avail_nr cgroup files

Fri May 29 17:43:06 MSK 2026

On 5/29/26 16:32, Pavel Tikhomirov wrote:
> 
> 
> On 5/29/26 15:14, Vladimir Riabchun wrote:
>>
>> On 5/29/26 14:20, Pavel Tikhomirov wrote:
>>> Expose the per-VE BPF program load limit via two ve cgroup files:
>>>
>>>     bpf_prog_max_nr   - rw, writable only from ve0, restricts loads
>>>     bpf_prog_avail_nr - ro, remaining quota
>>>
>>> Writes adjust the avail counter by the delta so that already-loaded
>>> programs are not retroactively rejected when the cap is lowered.
>>>
>>> https://virtuozzo.atlassian.net/browse/VSTOR-131947
>>> Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
>>> Feature: ve: allow BPF in Containers
>>> ---
>>>    kernel/ve/ve.c | 39 +++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 39 insertions(+)
>>>
>>> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
>>> index 48da546117bb7..9c3be61a4366a 100644
>>> --- a/kernel/ve/ve.c
>>> +++ b/kernel/ve/ve.c
>>> @@ -1315,6 +1315,35 @@ static s64 ve_netif_avail_nr_read(struct cgroup_subsys_state *css, struct cftype
>>>        return atomic_read(&css_to_ve(css)->netif_avail_nr);
>>>    }
>>>    +static u64 ve_bpf_prog_max_nr_read(struct cgroup_subsys_state *css, struct cftype *cft)
>>> +{
>>> +    return css_to_ve(css)->bpf_prog_max_nr;
>>
>> This read is not protected by ve->op_sem, so a race is possible.
> 
> We only protect writes against other writes, to keep the
> bpf_prog_max_nr and bpf_prog_avail_nr pair fully consistent. Reads are
> only eventually consistent, which we accept to avoid excess locking.

It's fine, unless someone relies on this value being exact, e.g. in tests.

>>
>>> +}
>>> +
>>> +static int ve_bpf_prog_max_nr_write(struct cgroup_subsys_state *css, struct cftype *cft, u64 val)
>>> +{
>>> +    struct ve_struct *ve = css_to_ve(css);
>>> +    int delta;
>>> +
>>> +    if (!ve_is_super(get_exec_env()))
>>> +        return -EPERM;
>>> +
>>> +    if (val > INT_MAX)
>>> +        return -EOVERFLOW;
>>> +
>>> +    down_write(&ve->op_sem);
>>> +    delta = val - ve->bpf_prog_max_nr;
>>> +    ve->bpf_prog_max_nr = val;
>>> +    atomic_add(delta, &ve->bpf_prog_avail_nr);
>>
>> We should check ve->bpf_prog_avail_nr + delta >= 0, otherwise we can have more
>> programs than allowed.
> 
> That is OK: existing programs may exceed the new limit, since they are
> already loaded. We only reject new loads until avail_nr becomes positive again.

Fair point.
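
For the record, the delta arithmetic discussed above can be sketched in
userspace. This is only a rough model under assumptions: struct ve_model,
model_set_max(), model_prog_load(), model_prog_unload() and
demo_lowered_cap() are hypothetical names, the load/unload accounting
(charge one unit, back out and return -ENOSPC when the quota is exhausted)
is a guess at what patch 1/2 does, and the ve0 check and op_sem locking
are left out.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdint.h>

/* Userspace stand-in for the two fields touched by the patch. */
struct ve_model {
    int bpf_prog_max_nr;
    atomic_int bpf_prog_avail_nr;
};

/* Mirrors the write handler's arithmetic (no ve0 check, no op_sem). */
static int model_set_max(struct ve_model *ve, uint64_t val)
{
    int delta;

    if (val > INT_MAX)
        return -EOVERFLOW;

    delta = (int)val - ve->bpf_prog_max_nr;
    ve->bpf_prog_max_nr = (int)val;
    atomic_fetch_add(&ve->bpf_prog_avail_nr, delta);
    return 0;
}

/* ASSUMPTION: a load charges one unit and backs out if none is left. */
static int model_prog_load(struct ve_model *ve)
{
    if (atomic_fetch_sub(&ve->bpf_prog_avail_nr, 1) <= 0) {
        atomic_fetch_add(&ve->bpf_prog_avail_nr, 1);
        return -ENOSPC; /* error code chosen for illustration only */
    }
    return 0;
}

/* ASSUMPTION: an unload returns one unit to the quota. */
static void model_prog_unload(struct ve_model *ve)
{
    atomic_fetch_add(&ve->bpf_prog_avail_nr, 1);
}

/* Walks through lowering the cap below the number of loaded programs. */
int demo_lowered_cap(void)
{
    struct ve_model ve = { .bpf_prog_max_nr = 10 };
    int i;

    atomic_init(&ve.bpf_prog_avail_nr, 10);

    for (i = 0; i < 7; i++)              /* 7 loaded, avail = 3 */
        assert(model_prog_load(&ve) == 0);

    assert(model_set_max(&ve, 5) == 0);  /* delta = -5, avail = -2 */
    assert(atomic_load(&ve.bpf_prog_avail_nr) == -2);

    /* The 7 existing programs stay; only new loads are refused. */
    assert(model_prog_load(&ve) == -ENOSPC);
    assert(atomic_load(&ve.bpf_prog_avail_nr) == -2);

    for (i = 0; i < 3; i++)              /* 4 loaded, avail = 1 */
        model_prog_unload(&ve);

    assert(model_prog_load(&ve) == 0);   /* 5 loaded, avail = 0 */
    return atomic_load(&ve.bpf_prog_avail_nr);
}
```

Calling demo_lowered_cap() from a main() exercises the whole sequence. In
this model avail_nr dips to -2 after the cap is lowered, which is also why
the avail file has to be read as a signed value.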

>>
>>> +    up_write(&ve->op_sem);
>>> +    return 0;
>>> +}
>>> +
>>> +static s64 ve_bpf_prog_avail_nr_read(struct cgroup_subsys_state *css, struct cftype *cft)
>>> +{
>>> +    return atomic_read(&css_to_ve(css)->bpf_prog_avail_nr);
>>> +}
>>> +
>>>    static int ve_os_release_read(struct seq_file *sf, void *v)
>>>    {
>>>        struct cgroup_subsys_state *css = seq_css(sf);
>>> @@ -1786,6 +1815,16 @@ static struct cftype ve_cftypes[] = {
>>>            .name            = "netif_avail_nr",
>>>            .read_s64        = ve_netif_avail_nr_read,
>>>        },
>>> +    {
>>> +        .name            = "bpf_prog_max_nr",
>>> +        .flags            = CFTYPE_NOT_ON_ROOT,
>>> +        .read_u64        = ve_bpf_prog_max_nr_read,
>>> +        .write_u64        = ve_bpf_prog_max_nr_write,
>>> +    },
>>> +    {
>>> +        .name            = "bpf_prog_avail_nr",
>>> +        .read_s64        = ve_bpf_prog_avail_nr_read,
>>
>> Why signed value?
> 
> It can be negative if the limit is set below the number of programs
> already loaded.

Agree.

>>
>>> +    },
>>>        {
>>>            .name            = "os_release",
>>>            .max_write_len        = __NEW_UTS_LEN + 1,
>>

-- 
Best regards, Riabchun Vladimir
Linux Kernel Developer, Virtuozzo