[CRIU] [PATCH v3] Add docker phaul driver

Pavel Emelyanov xemul at parallels.com
Thu Oct 22 02:31:08 PDT 2015


On 10/21/2015 07:59 PM, Hui Kang wrote:
> On Wed, Oct 21, 2015 at 11:32 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>> On 10/21/2015 05:14 PM, Hui Kang wrote:
>>> On Wed, Oct 21, 2015 at 8:23 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>>>>> diff --git a/phaul/p_haul_iters.py b/phaul/p_haul_iters.py
>>>>> index b2c76e3..f8b97db 100644
>>>>> --- a/phaul/p_haul_iters.py
>>>>> +++ b/phaul/p_haul_iters.py
>>>>> @@ -62,6 +62,7 @@ class phaul_iter_worker:
>>>>>               self.fs.set_options(opts)
>>>>>               self.__force = opts["force"]
>>>>>               self.pre_dump = opts["pre_dump"]
>>>>> +             self.target_host_ip = opts["to"]
>>>>>
>>>>>       def validate_cpu(self):
>>>>>               logging.info("Checking CPU compatibility")
>>>>> @@ -110,6 +111,12 @@ class phaul_iter_worker:
>>>>>               self.fs.set_work_dir(self.img.work_dir())
>>>>>               self.fs.start_migration()
>>>>>
>>>>> +             # TODO: Do not do predump for docker right now. Add page-server
>>>>> +             #       to docker C/R API, then we can enable the pre-dump
>>>>> +             if self.htype.get_driver_name() == "docker" :
>>>>> +                     logging.info("Disable pre-dump for docker")
>>>>> +                     self.pre_dump = False
>>>>> +
>>>>
>>>> No if self.htype.get_driver_name() == "docker" checks should remain in
>>>> generic code, please.
>>>
>>> What does the "No" refer to in your reply? Do you think the above code
>>> is correct or not? Thanks.
>>
>> There should have been a "," after "no" :) IOW I think the code is correct,
>> but the explicit check for the driver being docker should not be here.
> 
> If we remove the 'if self.htype.get_driver_name() == "docker"' check here,
> where do you think is the better place to set pre_dump = False?

I would add a can_pre_dump() method to the htype and make it report False
for Docker and True for all the others. Then call this method in
p_haul_iters.start_migration() (but take care not to kill the auto-detection
path).
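
Roughly like this (untested sketch; the method name and the exact hook
point are just a suggestion, not an existing API):

    # in the docker htype driver: no page-server in the docker C/R API yet
    def can_pre_dump(self):
        return False

    # in the pid/lxc/openvz htype drivers: keep pre-dump available
    def can_pre_dump(self):
        return True

    # in p_haul_iters.start_migration(), instead of checking the driver name
    if not self.htype.can_pre_dump():
        logging.info("Pre-dump not supported by the driver, disabling it")
        self.pre_dump = False
    elif self.pre_dump == PRE_DUMP_AUTO_DETECT:
        # the existing auto-detection path stays exactly as it is
        ...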

> I am proposing that we temporarily rely on the p.haul command line
> until pre-dump support is merged into docker-py and docker upstream.
> 
> That means we should enforce --no-pre-dump on the p.haul command line:
> 
> ./p.haul-wrap 10.0.0.2 --no-pre-dump docker 123
> 
> The change would be made to p.haul: if type == docker, --no-pre-dump
> must be given; otherwise, print the help message and exit.
> 
> By doing so, we can remove the explicit check in p_haul_iters.py. What do you think?

Yes, this is also an option.
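
If you go that way, the check can live next to the option parsing,
something like this (rough sketch assuming an argparse-based parser;
the option and attribute names are guesses, not the actual p.haul ones):

    import argparse

    parser = argparse.ArgumentParser("p.haul")
    parser.add_argument("to")
    parser.add_argument("--no-pre-dump", action="store_true")
    parser.add_argument("htype")
    parser.add_argument("id")
    args = parser.parse_args()

    # reject docker migration unless pre-dump is explicitly disabled
    if args.htype == "docker" and not args.no_pre_dump:
        parser.error("the docker htype currently requires --no-pre-dump")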

>>
>>>>
>>>>>               logging.info("Checking for Dirty Tracking")
>>>>>               if self.pre_dump == PRE_DUMP_AUTO_DETECT:
>>>>>                       # pre-dump auto-detection
>>>>> @@ -194,36 +201,41 @@ class phaul_iter_worker:
>>>>>
>>>>>               logging.info("Final dump and restore")
>>>>>
>>>>> -             self.target_host.start_iter()
>>>>> -             self.img.new_image_dir()
>>>>> +             if self.htype.get_driver_name() == "docker" :
>>>>> +                     # call docker dump API
>>>>> +                     self.htype.dump()
>>>>> +                     logging.info("Dump complete")
>>>>
>>>> Why do you need such a rough fix? Why doesn't just setting pre_dump to False help?
>>>
>>> Two reasons. First, this part is about the final dump, so setting
>>> pre_dump=False has no effect here. Second, the docker driver calls
>>> dump(), which invokes the docker C/R API directly. This is different
>>> from the pid or openvz drivers.
>>
>> OK. Then we should make the dump part look similar to how the restore
>> one does -- we should use the caller's API for everyone, not a direct
>> CRIU connection.
> 
> To make this happen, shall we also change the pid and openvz drivers?
> That seems beyond my control :)

It is :) We haven't made any releases of P.Haul yet, so we're still free
to make any changes we need. But OK, I will try to prepare such a patch 
shortly.
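
Something in this direction (very rough, just to show the intent;
final_dump() is not an existing method name):

    # every htype grows a final_dump() entry point, so p_haul_iters stops
    # talking to CRIU directly

    # generic drivers (pid/lxc/openvz) keep the current CRIU request loop
    def final_dump(self, pid, img, ccon, fs):
        req = criu_req.make_dump_req(pid, self, img, ccon, fs)
        resp = ccon.send_req(req)
        # ... the existing NOTIFY / network-lock / network-unlock loop ...

    # the docker driver just calls its own C/R API
    def final_dump(self, pid, img, ccon, fs):
        self.dump()

    # and start_migration() calls the same thing for every driver
    self.htype.final_dump(self.pid, self.img, self.criu_connection, self.fs)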

>>
>>>>
>>>>> +             else:
>>>>> +                     self.target_host.start_iter()
>>>>> +                     self.img.new_image_dir()
>>>>>
>>>>> -             logging.info("\tIssuing dump command to service")
>>>>> +                     logging.info("\tIssuing dump command to service")
>>>>>
>>>>> -             req = criu_req.make_dump_req(
>>>>> -                     self.pid, self.htype, self.img, self.criu_connection, self.fs)
>>>>> -             resp = self.criu_connection.send_req(req)
>>>>> -             while True:
>>>>> -                     if resp.type != pycriu.rpc.NOTIFY:
>>>>> -                             raise Exception("Dump failed")
>>>>> -
>>>>> -                     if resp.notify.script == "post-dump":
>>>>> -                             #
>>>>> -                             # Dump is effectively over. Now CRIU
>>>>> -                             # waits for us to do whatever we want
>>>>> -                             # and keeps the tasks frozen.
>>>>> -                             #
>>>>> -                             break
>>>>> +                     req = criu_req.make_dump_req(
>>>>> +                             self.pid, self.htype, self.img, self.criu_connection, self.fs)
>>>>> +                     resp = self.criu_connection.send_req(req)
>>>>> +                     while True:
>>>>> +                             if resp.type != pycriu.rpc.NOTIFY:
>>>>> +                                     raise Exception("Dump failed")
>>>>> +
>>>>> +                             if resp.notify.script == "post-dump":
>>>>> +                                     #
>>>>> +                                     # Dump is effectively over. Now CRIU
>>>>> +                                     # waits for us to do whatever we want
>>>>> +                                     # and keeps the tasks frozen.
>>>>> +                                     #
>>>>> +                                     break
>>>>>
>>>>> -                     elif resp.notify.script == "network-lock":
>>>>> -                             self.htype.net_lock()
>>>>> -                     elif resp.notify.script == "network-unlock":
>>>>> -                             self.htype.net_unlock()
>>>>> +                             elif resp.notify.script == "network-lock":
>>>>> +                                     self.htype.net_lock()
>>>>> +                             elif resp.notify.script == "network-unlock":
>>>>> +                                     self.htype.net_unlock()
>>>>>
>>>>> -                     logging.info("\t\tNotify (%s)", resp.notify.script)
>>>>> -                     resp = self.criu_connection.ack_notify()
>>>>> +                             logging.info("\t\tNotify (%s)", resp.notify.script)
>>>>> +                             resp = self.criu_connection.ack_notify()
>>>>>
>>>>> -             logging.info("Dump complete")
>>>>> -             self.target_host.end_iter()
>>>>> +                     logging.info("Dump complete")
>>>>> +                     self.target_host.end_iter()
>>>>>
>>>>>               #
>>>>>               # Dump is complete -- go to target node,
>>>>> @@ -233,8 +245,12 @@ class phaul_iter_worker:
>>>>>
>>>>>               logging.info("Final FS and images sync")
>>>>>               self.fs.stop_migration()
>>>>> -             self.img.sync_imgs_to_target(self.target_host, self.htype,
>>>>> -                     self.connection.mem_sk)
>>>>> +
>>>>> +             if self.htype.get_driver_name() == "docker" :
>>>>> +                     self.htype.send_criu_images(self.target_host_ip)
>>>>
>>>> The fs_haul_docker.send_criu_images() does almost the same as phaul_images.send_criu_images
>>>> do, please, reuse the phaul_images class functionality for this.
>>>
>>> Do you mean sync_imgs_to_target() in images.py? If so, yes, I will
>>> change the code to reuse sync_imgs_to_target().
>>
>> Yes, please.
> 
> OK, I will make the change.

Thanks!
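
Then the docker branch can simply go through the common path, roughly:

    # sketch: reuse the common images sync instead of a docker-only sender;
    # the call is the same one the non-docker branch already makes
    self.fs.stop_migration()
    self.img.sync_imgs_to_target(self.target_host, self.htype,
            self.connection.mem_sk)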

-- Pavel


