[Devel] [PATCH VZ10 2/2] ve: Add bpf_prog_max_nr/bpf_prog_avail_nr cgroup files
Vladimir Riabchun
vladimir.riabchun at virtuozzo.com
Fri May 29 17:43:06 MSK 2026
On 5/29/26 16:32, Pavel Tikhomirov wrote:
>
>
> On 5/29/26 15:14, Vladimir Riabchun wrote:
>>
>> On 5/29/26 14:20, Pavel Tikhomirov wrote:
>>> Expose the per-VE BPF program load limit via two ve cgroup files:
>>>
>>> bpf_prog_max_nr - rw, writable only from ve0, restricts loads
>>> bpf_prog_avail_nr - ro, remaining quota
>>>
>>> Writes adjust the avail counter by the delta so that already-loaded
>>> programs are not retroactively rejected when the cap is lowered.
>>>
>>> https://virtuozzo.atlassian.net/browse/VSTOR-131947
>>> Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
>>> Feature: ve: allow BPF in Containers
>>> ---
>>> kernel/ve/ve.c | 39 +++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 39 insertions(+)
>>>
>>> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
>>> index 48da546117bb7..9c3be61a4366a 100644
>>> --- a/kernel/ve/ve.c
>>> +++ b/kernel/ve/ve.c
>>> @@ -1315,6 +1315,35 @@ static s64 ve_netif_avail_nr_read(struct cgroup_subsys_state *css, struct cftype
>>> return atomic_read(&css_to_ve(css)->netif_avail_nr);
>>> }
>>> +static u64 ve_bpf_prog_max_nr_read(struct cgroup_subsys_state *css, struct cftype *cft)
>>> +{
>>> + return css_to_ve(css)->bpf_prog_max_nr;
>>
>> This read is not protected by ve->op_sem, so it can race with a concurrent write.
>
> We only serialize writes against each other, to keep the bpf_prog_max_nr and
> bpf_prog_avail_nr pair fully consistent. Eventual consistency on read is
> acceptable and avoids extra locking.
That's fine, as long as nobody relies on reading a consistent max_nr/avail_nr pair, e.g. in tests.
>>
>>> +}
>>> +
>>> +static int ve_bpf_prog_max_nr_write(struct cgroup_subsys_state *css, struct cftype *cft, u64 val)
>>> +{
>>> + struct ve_struct *ve = css_to_ve(css);
>>> + int delta;
>>> +
>>> + if (!ve_is_super(get_exec_env()))
>>> + return -EPERM;
>>> +
>>> + if (val > INT_MAX)
>>> + return -EOVERFLOW;
>>> +
>>> + down_write(&ve->op_sem);
>>> + delta = val - ve->bpf_prog_max_nr;
>>> + ve->bpf_prog_max_nr = val;
>>> + atomic_add(delta, &ve->bpf_prog_avail_nr);
>>
>> We should check ve->bpf_prog_avail_nr + delta >= 0, otherwise more programs
>> can stay loaded than the limit allows.
>
> That is OK: programs that are already loaded may exceed the new limit, since
> they already hold their slots. We only reject new loads until avail_nr becomes
> positive again.
Fair point.
>>
>>> + up_write(&ve->op_sem);
>>> + return 0;
>>> +}
>>> +
>>> +static s64 ve_bpf_prog_avail_nr_read(struct cgroup_subsys_state *css, struct cftype *cft)
>>> +{
>>> + return atomic_read(&css_to_ve(css)->bpf_prog_avail_nr);
>>> +}
>>> +
>>> static int ve_os_release_read(struct seq_file *sf, void *v)
>>> {
>>> struct cgroup_subsys_state *css = seq_css(sf);
>>> @@ -1786,6 +1815,16 @@ static struct cftype ve_cftypes[] = {
>>> .name = "netif_avail_nr",
>>> .read_s64 = ve_netif_avail_nr_read,
>>> },
>>> + {
>>> + .name = "bpf_prog_max_nr",
>>> + .flags = CFTYPE_NOT_ON_ROOT,
>>> + .read_u64 = ve_bpf_prog_max_nr_read,
>>> + .write_u64 = ve_bpf_prog_max_nr_write,
>>> + },
>>> + {
>>> + .name = "bpf_prog_avail_nr",
>>> + .read_s64 = ve_bpf_prog_avail_nr_read,
>>
>> Why signed value?
>
> It can be negative if the limit is set below the number of programs already loaded.
Agree.
>>
>>> + },
>>> {
>>> .name = "os_release",
>>> .max_write_len = __NEW_UTS_LEN + 1,
>>
>> --
>> Best regards, Riabchun Vladimir
>> Linux Kernel Developer, Virtuozzo
>>
>
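To make the delta accounting discussed above concrete, here is a minimal user-space sketch of the semantics. It is not the kernel code: struct ve_model and the ve_model_*() helpers are hypothetical names, and the load-side check (reject when avail_nr <= 0) is an assumption based on the description in this thread, not something shown in this patch. Locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model, not the kernel's struct ve_struct. */
struct ve_model {
	int max_nr;   /* configured cap (bpf_prog_max_nr) */
	int avail_nr; /* remaining quota (bpf_prog_avail_nr), may go negative */
};

static void ve_model_init(struct ve_model *ve, int max_nr)
{
	assert(max_nr >= 0);
	ve->max_nr = max_nr;
	ve->avail_nr = max_nr;
}

/* Mirrors the write handler: shift avail_nr by the change in the cap. */
static void ve_model_set_max(struct ve_model *ve, int new_max)
{
	int delta;

	assert(new_max >= 0);
	delta = new_max - ve->max_nr;
	ve->max_nr = new_max;
	ve->avail_nr += delta;
}

/* Assumed load-side check: succeed only while the quota is positive. */
static bool ve_model_load(struct ve_model *ve)
{
	if (ve->avail_nr <= 0)
		return false;
	ve->avail_nr--;
	return true;
}

/* Unloading a program returns its slot to the quota. */
static void ve_model_unload(struct ve_model *ve)
{
	ve->avail_nr++;
}
```

With a cap of 4 and three programs loaded, lowering the cap to 1 leaves avail_nr at -2. In this model nothing already loaded is rejected, and new loads fail until three programs have been unloaded and avail_nr is positive again. This is also why avail_nr has to be reported as a signed value.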
--
Best regards, Riabchun Vladimir
Linux Kernel Developer, Virtuozzo