[CRIU] [PATCH] uffd.c: create lazy-pages socket in image directory

Mike Rapoport rppt at linux.vnet.ibm.com
Mon Jan 16 08:24:08 PST 2017


On Mon, Jan 16, 2017 at 01:32:40PM +0100, Adrian Reber wrote:
> On Mon, Jan 16, 2017 at 01:42:14PM +0200, Mike Rapoport wrote:
> > On Mon, Jan 16, 2017 at 09:45:22AM +0100, Adrian Reber wrote:
> > > On Sun, Jan 15, 2017 at 09:22:54AM +0200, Mike Rapoport wrote:
> > > > On Fri, Jan 13, 2017 at 06:31:15PM +0300, Pavel Emelyanov wrote:
> > > > > On 01/13/2017 01:47 PM, Adrian Reber wrote:
> > > > > > From: Adrian Reber <areber at redhat.com>
> > > > > > 
> > > > > > runc uses the work-dir and image-dir option to control where
> > > > > > criu is running. This means that criu chdir()-s to work-dir
> > > > > > and the lazy-pages socket cannot be found as it is in the image
> > > > > > directory.
> > > > > 
> > > > > But the socket __should__ be in the work directory, is it really there? If not,
> > > > > then that's the bug, we should bind() the socket in the proper place.
> > > > 
> > > > The socket __is__ created in the working directory. The socket may not be
> > > > found if 'criu restore' and 'criu lazy-pages' use different --work-dir...
> > > 
> > > Trying to run the following I get:
> > > 
> > > $ criu lazy-pages  --page-server --address 192.168.122.3  --port 27 -vvvv -D /run/runc/container/criu.work/
> > > (00.000003) Version: 2.9 (gitid v2.9-606-g57d3d73)
> > > (00.000029) No inventory.img image
> > > (00.000041) Error (criu/util.c:552): Can't read link of fd -404: No such file or directory
> > > (00.000044) Error (criu/protobuf.c:75): Unexpected EOF on (null)
> > > 
> > > This is in runc's working directory. There is no inventory.img as it is
> > > the working directory and not the image directory. So the current code
> > > expects the working and image directory to be the same.
> > 
> > The default for all CRIU tools is to presume the same working and image
> > directory. If you'd like to use different directories for lazy-pages,
> > you'll need to pass -W and -D options to the lazy-pages daemon, just as for
> > the other CRIU commands, e.g.
> > 
> > $ criu lazy-pages --page-server --address 192.168.122.3 --port 27 -vvvv \
> >        -D /path/to/runc/criu.img -W /run/runc/container/criu.work/
> 
> Ah, thanks. That makes sense. Unfortunately that breaks lazy migration
> with runc. runc restore fails if it believes the container is still
> running. Without looking at the code it seems as soon as the directory
> /run/runc/container/ exists I cannot restore my container any more as
> 'runc restore' exits with "container with id exists". And I have to
> create the directory before doing the restore to start the lazy-pages
> daemon.
> 
> This could probably be solved if the lazy-pages daemon were started
> automatically by 'criu restore', or if there were an option to manually
> specify the location of the socket.

I'm not familiar with runc, but I have run into problems with lazy migration
and zdtm on the same host. Even if the lazy-pages daemon were forked
from restore, there could still be a conflict between the existing processes
held by dump and the restored processes.
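
To illustrate the working-directory point from earlier in the thread, here is
a minimal Python sketch (not CRIU code) of why a UNIX socket bound under a
relative name is only reachable from a process that chdir()-s to the same
directory. The socket name and the helper names bind_in_dir(),
can_connect_from() and demo() are assumptions made up for the illustration,
and the two temporary directories merely stand in for a work dir and an
image dir.

```python
import os
import socket
import tempfile

# Assumed socket name, used here for illustration only.
SOCK_NAME = "lazy-pages.socket"


def bind_in_dir(directory, name=SOCK_NAME):
    """chdir() into `directory` and bind a UNIX socket under a relative name."""
    os.chdir(directory)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(name)  # relative path: resolved against the current directory
    srv.listen(1)
    return srv


def can_connect_from(directory, name=SOCK_NAME):
    """chdir() into `directory` and try to connect to the relative name."""
    os.chdir(directory)
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        cli.connect(name)
        return True
    except OSError:
        # Typically "No such file or directory" when the directories differ.
        return False
    finally:
        cli.close()


def demo():
    """Return (reachable from same dir, reachable from a different dir)."""
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as work_dir, \
            tempfile.TemporaryDirectory() as image_dir:
        srv = bind_in_dir(work_dir)
        try:
            same = can_connect_from(work_dir)
            other = can_connect_from(image_dir)
        finally:
            srv.close()
            os.chdir(start)  # leave the temp dirs before they are removed
    return same, other


if __name__ == "__main__":
    print(demo())
```

The same reasoning is why 'criu restore' and 'criu lazy-pages' need to agree
on the working directory, whichever directory that ends up being.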
 
> 		Adrian
> 

--
Sincerely yours,
Mike.
