[CRIU] [PATCH 3/7] seccomp: add support for SECCOMP_MODE_FILTER

Fri Nov 13 13:58:18 PST 2015

On Thu, Nov 12, 2015 at 10:58:22AM +0300, Pavel Emelyanov wrote:
> 
> >>> +			continue;
> >>> +		}
> >>> +
> >>> +		info = xmalloc(sizeof(*info));
> >>> +		if (!info)
> >>> +			goto out;
> >>> +		seccomp_filter__init(&info->filter);
> >>> +
> >>> +		info->filter.filter.len = len * sizeof(struct sock_filter);
> >>> +		info->filter.filter.data = xmalloc(info->filter.filter.len);
> >>> +		if (!info->filter.filter.data)
> >>> +			goto out;
> >>> +
> >>> +		memcpy(info->filter.filter.data, buf, info->filter.filter.len);
> >>> +
> >>> +		info->prev = infos;
> >>> +		infos = info;
> >>> +	}
> >>> +
> >>> +save_infos:
> >>> +	info_count = i;
> >>> +
> >>> +	m = xrealloc(filters, sizeof(*filters) * (next_filter_id + info_count));
> >>
> >> Why do we allocate that many memory? Part of the filters where "inherited".
> >> Can you describe what kind of data structure is built behind the filters
> >> pointer? It's some tree, but I don't get how it exactly looks and how pstree
> >> items point inside it.
> > 
> > Yeah, this is a little strange. There are two sets of pointers to the
> > filters. The first is the "tree" style from pstree, used for resolving
> > inheritance relationships. The second set is this one, which is just a
> > big array with everything. This is used for dumping (and freeing) the
> > actual filters.
> > 
> > We could ditch the tree style ones from pstree and just use the big
> > array (and keep array ids in pstree), but it seemed a little nicer to
> > me this way. But I could be convinced otherwise :)
> 
> No no, keeping the tree relations inside the filters image and data structures
> is nice. But as far as the "big array" thing is concerned --
> 
> > --- /dev/null
> > +++ b/protobuf/seccomp.proto
> > @@ -0,0 +1,8 @@
> > +message seccomp_filter {
> > +	required bytes		filter			= 1;
> > +	optional uint32		prev			= 2;
> > +}
> > +
> > +message seccomp_entry {
> > +	repeated seccomp_filter		seccomp_filters	= 1;
> > +}
> 
> why do we need the seccomp_entry? Why not just put seccomp_filter entries
> into the image file one by one like it's done with pstree_entry-s?

You're right that we don't, I can rework it to get rid of that. We
still need the "big array" just to be able to free things (since
multiple pstrees can point to the same seccomp_filter and there is no
way of marking them "freed").

> >>> +int seccomp_filters_get_rst_pos(CoreEntry *core, int *count, unsigned long *pos)
> >>> +{
> >>> +	SeccompFilter *sf = NULL;
> >>> +	struct sock_fprog *arr = NULL;
> >>> +	int ret = -1, i;
> >>> +
> >>> +	if (!core->tc->has_seccomp_filter) {
> >>> +		*count = 0;
> >>> +		return 0;
> >>> +	}
> >>> +
> >>> +	*count = 0;
> >>> +	*pos = rst_mem_cpos(RM_PRIVATE);
> >>> +
> >>> +	BUG_ON(core->tc->seccomp_filter > se->n_seccomp_filters);
> >>> +	sf = se->seccomp_filters[core->tc->seccomp_filter];
> >>> +
> >>> +	while (1) {
> >>> +		(*count)++;
> >>> +
> >>> +		if (!sf->has_prev)
> >>> +			break;
> >>> +
> >>> +		sf = se->seccomp_filters[sf->prev];
> >>> +	}
> >>> +
> >>> +	arr = rst_mem_alloc(sizeof(*arr) * (*count), RM_PRIVATE);
> >>> +	if (!arr)
> >>> +		goto out;
> >>> +
> >>> +	sf = se->seccomp_filters[core->tc->seccomp_filter];
> >>> +	for (i = 0; i < *count; i++) {
> >>> +		struct sock_fprog *fprog = &arr[i];
> >>> +		// XXX: this is a bit fugly; is there a better way?
> >>
> >> I didn't get a better way for what you were looking for :)
> > 
> > The problem here is that we want to send a struct sock_fprog to the
> > restorer blob, which itself has a pointer to the actual instruction
> > array. So we want to pass a double pointer to the restorer blob. I
> > could do something like pack the second array onto the end of the
> > struct sock_fprog,
> 
> But you do it already, don't you? After you've allocated the arr thing
> in which sock_fprog-s are stored, you allocate more memory to the
> fprog->filter and put the fprog there.
> 
> > but it is nice to be able to refer to the first
> > array without having to do math by hand.
> > 
> > The solution here is to just save the rst_mem_alloc value into the
> > struct where the second level pointer goes, and then remap the result
> > after we remap the struct. That feels sort of broken, though :). But
> > maybe not.
> 
> Ah I see :) I can only one thing that comes to my mind, as you told above,
> in the seccomp_filters_get_rst_pos in fprog->filter you can save offset
> to fprog, omit the loop with remap_ptr-s in sigreturn_restore() and do
> the "math by hand" in pie's restore_seccomp. This would save us one for()
> loop and would keep all the logic in less code.

Less code here, but more code in the restorer blob where we have to do
a bunch of pointer math to figure out where things live. I could be
convinced that the extra pointer math was worth it, but I like this a
little better. Anyway, up to you :)

Tycho