[CRIU] [PATCH 3/7] seccomp: add support for SECCOMP_MODE_FILTER

Wed Nov 11 23:58:22 PST 2015

>>> +			continue;
>>> +		}
>>> +
>>> +		info = xmalloc(sizeof(*info));
>>> +		if (!info)
>>> +			goto out;
>>> +		seccomp_filter__init(&info->filter);
>>> +
>>> +		info->filter.filter.len = len * sizeof(struct sock_filter);
>>> +		info->filter.filter.data = xmalloc(info->filter.filter.len);
>>> +		if (!info->filter.filter.data)
>>> +			goto out;
>>> +
>>> +		memcpy(info->filter.filter.data, buf, info->filter.filter.len);
>>> +
>>> +		info->prev = infos;
>>> +		infos = info;
>>> +	}
>>> +
>>> +save_infos:
>>> +	info_count = i;
>>> +
>>> +	m = xrealloc(filters, sizeof(*filters) * (next_filter_id + info_count));
>>
>> Why do we allocate that many memory? Part of the filters where "inherited".
>> Can you describe what kind of data structure is built behind the filters
>> pointer? It's some tree, but I don't get how it exactly looks and how pstree
>> items point inside it.
> 
> Yeah, this is a little strange. There are two sets of pointers to the
> filters. The first is the "tree" style from pstree, used for resolving
> inheritance relationships. The second set is this one, which is just a
> big array with everything. This is used for dumping (and freeing) the
> actual filters.
> 
> We could ditch the tree style ones from pstree and just use the big
> array (and keep array ids in pstree), but it seemed a little nicer to
> me this way. But I could be convinced otherwise :)

No no, keeping the tree relations inside the filters image and data structures
is nice. But as far as the "big array" thing is concerned --

> --- /dev/null
> +++ b/protobuf/seccomp.proto
> @@ -0,0 +1,8 @@
> +message seccomp_filter {
> +	required bytes		filter			= 1;
> +	optional uint32		prev			= 2;
> +}
> +
> +message seccomp_entry {
> +	repeated seccomp_filter		seccomp_filters	= 1;
> +}

why do we need the seccomp_entry? Why not just put seccomp_filter entries
into the image file one by one like it's done with pstree_entry-s?

>>> +int seccomp_filters_get_rst_pos(CoreEntry *core, int *count, unsigned long *pos)
>>> +{
>>> +	SeccompFilter *sf = NULL;
>>> +	struct sock_fprog *arr = NULL;
>>> +	int ret = -1, i;
>>> +
>>> +	if (!core->tc->has_seccomp_filter) {
>>> +		*count = 0;
>>> +		return 0;
>>> +	}
>>> +
>>> +	*count = 0;
>>> +	*pos = rst_mem_cpos(RM_PRIVATE);
>>> +
>>> +	BUG_ON(core->tc->seccomp_filter > se->n_seccomp_filters);
>>> +	sf = se->seccomp_filters[core->tc->seccomp_filter];
>>> +
>>> +	while (1) {
>>> +		(*count)++;
>>> +
>>> +		if (!sf->has_prev)
>>> +			break;
>>> +
>>> +		sf = se->seccomp_filters[sf->prev];
>>> +	}
>>> +
>>> +	arr = rst_mem_alloc(sizeof(*arr) * (*count), RM_PRIVATE);
>>> +	if (!arr)
>>> +		goto out;
>>> +
>>> +	sf = se->seccomp_filters[core->tc->seccomp_filter];
>>> +	for (i = 0; i < *count; i++) {
>>> +		struct sock_fprog *fprog = &arr[i];
>>> +		// XXX: this is a bit fugly; is there a better way?
>>
>> I didn't get a better way for what you were looking for :)
> 
> The problem here is that we want to send a struct sock_fprog to the
> restorer blob, which itself has a pointer to the actual instruction
> array. So we want to pass a double pointer to the restorer blob. I
> could do something like pack the second array onto the end of the
> struct sock_fprog,

But you do it already, don't you? After you've allocated the arr thing
in which sock_fprog-s are stored, you allocate more memory to the
fprog->filter and put the fprog there.

> but it is nice to be able to refer to the first
> array without having to do math by hand.
> 
> The solution here is to just save the rst_mem_alloc value into the
> struct where the second level pointer goes, and then remap the result
> after we remap the struct. That feels sort of broken, though :). But
> maybe not.

Ah I see :) I can only one thing that comes to my mind, as you told above,
in the seccomp_filters_get_rst_pos in fprog->filter you can save offset
to fprog, omit the loop with remap_ptr-s in sigreturn_restore() and do
the "math by hand" in pie's restore_seccomp. This would save us one for()
loop and would keep all the logic in less code.

-- Pavel