[CRIU] [PATCH] Make restored processes inherit file descriptors from criu
Andrew Vagin
avagin at parallels.com
Tue Nov 25 23:39:56 PST 2014
On Tue, Nov 25, 2014 at 02:49:13PM -0800, Saied Kazemi wrote:
> On Mon, Nov 24, 2014 at 11:12 PM, Andrew Vagin <avagin at parallels.com> wrote:
> >
> > On Mon, Nov 24, 2014 at 02:18:23PM -0800, Saied Kazemi wrote:
> > > There are cases where a process's file descriptor cannot be restored
> > > from the checkpointed image. For example, a pipe file descriptor with
> > > one end in the checkpointed process and the other end in a separate
> > > process (that was not part of the checkpointed process tree) cannot be
> > > restored because after checkpoint the pipe would be broken and removed.
> > > In these cases, criu's caller should set up a new file descriptor to be
> > > inherited by the restored process.
> > >
> > > When checkpointing, the argument of --inherit-fd is a string that
> > > identifies the file. It can be obtained from the target of the symbolic
> > > link in /proc/<pid>/fd (for example, /path/to/log/file or pipe:[234956]).
> > >
> > > When restoring, the argument of --inherit-fd has the form fd[%d]:%s,
> > > where %d tells criu which file descriptor to use for restoring file
> > > identified by %s.
> >
> > How do you handle conflicts between inherit-fd sets and restored file
> > descriptors.
> >
> > For example I have a process with the following set of file descriptors:
> > 0 -> tty
> > 1 -> tty
> > 2 -> tty
> > 3 -> /test
> > 4 -> pipe:[234956]
> >
> > On restore I create a pipe and use --inherit-fd fd3:pipe:[234956]. In
> > this case the 3 fd will be restored before the 4 fd. Why does this
> > operation not overwrite the inherit-fd descriptor?
> >
> > Why does criu not return an error?
> > fd 3 already in use (called at files.c:867
>
> Thanks for your quick review and feedback.
You are welcome:)
>
> I have a patch that resolves clashing fds. In an earlier
> implementation, I moved inherit fds to service fd area but, in order
> to avoid increasing the number of service fds, the patch that I am
> going to send you moves the clashing inherit fd to a different
> descriptor.
>
> Below is an actual test program that I wrote for the case you've
> mentioned above. There is a pipe (pipe:[823441]) between parent
> (29823) and child (29824). Child keeps writing hello %d to the pipe
> which parent reads and prints. After checkpointing the child (criu
> dump -D /tmp/criu_img -o dump.log -v4 -j -t 29824), the parent creates
> a new pipe (pipe:[823469]) and calls criu to restore the child (criu
> restore -D /tmp/criu_img -o restore.log v4 -j --inherit-fd
> fd[3]:pipe:[823441]). Child resumes execution as if nothing happened.
>
> ...
> [29824] total 0
> lrwx------ 1 root root 64 Nov 25 13:25 0 -> /dev/pts/3
> lrwx------ 1 root root 64 Nov 25 13:25 1 -> /dev/pts/3
> lrwx------ 1 root root 64 Nov 25 13:25 2 -> /dev/pts/3
> lr-x------ 1 root root 64 Nov 25 13:25 3 -> /tmp/test
> l-wx------ 1 root root 64 Nov 25 13:25 4 -> pipe:[823441]
> [29824] writing hello 3
>
> [29823] lr-x------ 1 root root 64 Nov 25 13:25 /proc/29823/fd/3 -> pipe:[823441]
> [29823] read hello 3
>
> [29823] child 29824 exited with status 9
> [29823] creating a new pipe
> [29869] criu restore -D /tmp/criu_img -o restore.log v4 -j
> --inherit-fd fd[3]:pipe:[823441]
>
> [29824] total 0
> lrwx------ 1 root root 64 Nov 25 13:25 0 -> /dev/pts/3
> lrwx------ 1 root root 64 Nov 25 13:25 1 -> /dev/pts/3
> lrwx------ 1 root root 64 Nov 25 13:25 2 -> /dev/pts/3
> lr-x------ 1 root root 64 Nov 25 13:25 3 -> /tmp/test
> l-wx------ 1 root root 64 Nov 25 13:25 4 -> pipe:[823469]
> [29824] writing hello 4
>
> [29823] lr-x------ 1 root root 64 Nov 25 13:25 /proc/29823/fd/3 -> pipe:[823469]
> [29823] read hello 4
> ...
>
>
> > >
> > > Note that while it's OK to omit --inherit-fd for checkpoint and specify
> > > it only for restore, it's better to always specify it because checkpoint
> > > code will not attempt to dump an inherit fd file.
> > >
> > > Signed-off-by: Saied Kazemi <saied at google.com>
> > > ---
...
> > > diff --git a/util.c b/util.c
> > > index dd76863..3b468f2 100644
> > > --- a/util.c
> > > +++ b/util.c
> > > @@ -41,6 +41,7 @@
> > > #include "cr_options.h"
> > > #include "servicefd.h"
> > > #include "cr-service.h"
> > > +#include "files.h"
> > >
> > > #define VMA_OPT_LEN 128
> > >
> > > @@ -93,6 +94,14 @@ void pr_vma(unsigned int loglevel, const struct vma_area *vma_area)
> > > int close_safe(int *fd)
> > > {
> > > int ret = 0;
> > > +
> > > + /*
> > > + * Don't close files descriptors that criu's caller
> > > + * set up to be inherited by the restored process.
> > > + */
> > > + if (inherit_fd_lookup_fd(*fd))
> > > + return 0;
> >
> > I think it's a bad place for this check. close_safe() is a generic
> > function for closing file descriptors.
> >
> > For example one of patterns how it's used
> >
> > int fd = -1;
> > if (smth)
> > goto err;
> >
> > fd = open();
> > if (fd < 0)
> > goto err;
> >
> > ...
> >
> > err:
> > close_safe(&fd)
> > return -1;
>
> If open succeeds, fd will be a valid descriptor different from any
> inherit fds. Therefore, close_safe() would close it as expected. Did
> I miss something here?
I want to say that this function is so generic, that we don't expect
extra logic from it.
I think the right place for this logic in close_old_fds().
I can't find where you close inherit_fd-s after restoring files. Did I
skip smth?
>
> --Saied
More information about the CRIU
mailing list