[CRIU] [PATCH] cr-super: Initial commit

Wed Sep 16 07:22:07 PDT 2015

On Wed, Sep 16, 2015 at 04:43:21PM +0300, Cyrill Gorcunov wrote:
> On Wed, Sep 16, 2015 at 02:55:09PM +0300, Andrew Vagin wrote:
> > 
> > In addition, we need to check that we are able to attache to a process
> > by ptrace.
> > 
> > I think for that we need to drop CAP_SYS_PTRACE from the effective set,
> > try to call PTRACE_SEIZE and if this operation was success, we can read
> > map_files.
> 
> What for? It's already checked by criu itself. Seizing process is not
> that cheap operation, which actually changes internal state of a task
> we're poking while this wrapper is supposed to be very fast and
> "client" filtering should be based on groups (say group criu)
> or any other auth mechanism. Moreover, in the kernel there
> is a check for PTRACE_MODE_READ not for PTRACE_MODE_ATTACH
> (as happens for ptrace-seizing) so it's not 1:1 map of
> what map-files reading does. Am I missing something obvious?

Someout else (except criu) can use this tool. Pavel suggested a good
solution for this case.

> 
> > > +++ b/super/cr-super.h
> > > @@ -0,0 +1,59 @@
> > > +#ifndef __CR_SUPER_H__
> > > +#define __CR_SUPER_H__
> > > +
> > > +#include <limits.h>
> > > +
> > > +enum {
> > > +	SUPER_RSP_OK				= 1,
> > > +	SUPER_RSP_ERR				= 2,
> > > +
> > > +	SUPER_REQ_EXIT				= 3,
> > > +
> > > +	SUPER_REQ_MFD_OPEN_DIR			= 4,
> > 
> > Why do we need to have a separate command to open a directory?
> > 
> > We can cache a directory desriptor and open a directory only
> > if prev_pid isn't equal to req->pid.
> > 
> > > +	SUPER_REQ_MFD_FILL_STAT			= 5,
> > > +	SUPER_REQ_MFD_FILL_LINK			= 6,
> > > +	SUPER_REQ_MFD_FILL_MNT_ID		= 7,
> > 
> > Why we can't have one only one command?
> 
> Look, there is one problem -- aufs, it might ask for
> reading link, so we have to pass the complete PATH_MAX
> size via net. But I wanted to minimize net usage impact
> so the packets are small in size (ie there usually gonna
> be the following sequence
> 
> 	-> open-dir
> 	<- ok
> 	-> get-stat
> 	<- ok
> 
> and that's all. Probably merging open-dir with get-stat
> is a good idea. Thanks for the note.
> 
> Same time for auif it might endup in
> 
> 	-> opendir
> 	<- ok
> 	-> get-stat
> 	<- ok
> 	-> read-link (huge size)
> 	<- ok
> 	-> parse mnt-id
> 	<- ok

It may be very slow to request informatiosn about each mappings
separatly. Maybe we can open /proc/PID/map_files/XXXX and send the file
descriptor back?

> 
> > > +
> > > +#define __check_opened(__r)					\
> > > +	do {							\
> > > +		if (mfd_dir_fd < 0) {				\
> > > +			pr_err("dir is not yet opened\n");	\
> > > +			__RSP_ERR(__r, -EINVAL);		\
> > > +			return -EINVAL;				\
> > > +		}						\
> > > +	} while (0)
> > 
> > I don't like macroses which calls "return"
> 
> ok, i'll make it inline
> 
> > > +
> > > +	pr_debug("Listening\n");
> > > +	for (;;) {
> > > +		recv_size = recv(socket_fd, NULL, 0, MSG_TRUNC | MSG_PEEK);
> > > +		if (recv_size < sizeof(super_req_hdr_t)) {
> > > +			pr_perror("Can't read request");
> > > +			goto out;
> > > +		}
> > > +
> > > +		reqcv_buf = xmalloc(recv_size);
> > > +		if (!reqcv_buf)
> > > +			goto out;
> > > +
> > > +		recv_size = recv(socket_fd, reqcv_buf, recv_size, MSG_TRUNC);
> > > +		if (recv_size < 0) {
> > > +			pr_perror("Can't read request");
> > > +			goto out;
> > > +		} else if (recv_size == 0)
> > > +			goto out;
> > > +
> > > +		__TRACE_RCV_PKT(reqcv_buf);
> > > +
> > > +		ret = handle_request(socket_fd, &ids, reqcv_buf, recv_size);
> > > +		if (ret < 0)
> > > +			goto out;
> > > +
> > > +		__TRACE_SND_PKT(reqcv_buf);
> > > +
> > > +		send_size = send(socket_fd, reqcv_buf, recv_size, 0);
> > > +		if (send_size < recv_size) {
> > > +			pr_perror("Can't send request");
> > 
> > Don't mix ret and the function exit code.
> > Here ret is zero, so you return zero code for the error case
> 
> yes, thanks, need to setup @ret here, missed.

Could you use additional variables to avoid such bugs? We do this in
other places.

> 
> > > +
> > > +usage:
> > > +	printf("%s: --fd <num>\n", argv[0][1]);
> > 
> > This should be printed on stderr
> 
> ok