[CRIU] [PATCH v2] Add docker phaul driver
Hui Kang
hkang.sunysb at gmail.com
Wed Oct 21 07:58:56 PDT 2015
On Wed, Oct 21, 2015 at 10:42 AM, Hui Kang <hkang.sunysb at gmail.com> wrote:
> On Wed, Oct 21, 2015 at 8:07 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>> On 10/20/2015 07:53 PM, Hui Kang wrote:
>>> Hi, Pavel,
>>> I was able to remove more of the 'if htype.name() == "docker"' checks
>>> in patch set 3:
>>>
>>> https://lists.openvz.org/pipermail/criu/2015-October/022921.html
>>>
>>> Please see my inline replies. Another problem I ran into: after I
>>> enabled criu_conn to remove some of the 'if htype.name() == "docker"'
>>> checks, docker migration and CPU validation succeed. However, on the
>>> client side I get this error at the end:
>>>
>>> 16:17:39.250: Asking target host to restore
>>> Error (cr-service.c:103): RPC error: Invalid req: Success
>>> 16:17:41.555: CRIU RPC error (0/7)
>>> Traceback (most recent call last):
>>>   File "./p.haul", line 79, in <module>
>>>     worker.start_migration()
>>>   File "/root/development/p.haul/phaul/p_haul_iters.py", line 264, in start_migration
>>>     resp = self.criu_connection.ack_notify()
>>>   File "/root/development/p.haul/phaul/criu_api.py", line 74, in ack_notify
>>>     return self._recv_resp()
>>>   File "/root/development/p.haul/phaul/criu_api.py", line 53, in _recv_resp
>>>     raise Exception("CRIU RPC error (%d/%d)" % (resp.type, self._last_req))
>>> Exception: CRIU RPC error (0/7)
>>>
>>> Could you help with this? Thanks.
>>
>> What's the gitid of the p.haul sources you use? I see no line 264
>> in mine :) And -- what's in CRIU logs, it looks like some errors
>> in notification handling.
>
> This is because the p.haul I am using is patched with the docker
> changes, so it has a line 264.
>
> I checked the criu daemon logs on the source and destination. They are
> the same:
>
> # cat /log
> (00.000170) The service socket is bound to /var/run/criu_service.socket
> (00.000740) Waiting for connection...
>
> At first, I thought the log did not show any connection because the
> docker driver does not send a criu_req to the criu daemon. That might
> also explain why the notification handling fails.
>
> However, I later realized that validate_cpu makes a request to the
> criu daemon, so I do not understand why the criu log does not show any
> connection. Can you explain this? Thanks.
I found self._swrk = subprocess.Popen([criu_binary, "swrk", "%d" %
css[0].fileno()]) in criu_api.py. So p.haul launches criu itself, which
would explain why the daemon log shows no connection.
But then the question is: where is the log file? Thanks.
- Hui
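
A minimal sketch of the mechanism behind that Popen line, for readers
following the thread: the parent creates a socketpair, passes one end's
fd number to the child on the command line, and keeps the other end for
its requests. Everything here is illustrative. launch_worker, demo and
CHILD_SOURCE are hypothetical names, the child is a small Python
stand-in rather than criu, and a default stream socketpair is used
because the thread does not show which socket type criu_api.py asks for.

```python
import socket
import subprocess
import sys

# Stand-in for the real "criu swrk <fd>" child (an assumption for this
# sketch): it gets the inherited socket fd as argv[1], reads one message
# and echoes it back with a prefix.
CHILD_SOURCE = r"""
import socket, sys
sk = socket.socket(fileno=int(sys.argv[1]))
msg = sk.recv(1024)
sk.send(b"echo:" + msg)
sk.close()
"""


def launch_worker(argv_prefix):
    """Start a child that talks over one end of a socketpair.

    The fd number of the child's end is appended to argv_prefix.
    Returns (process, parent_side_socket).
    """
    css = socket.socketpair()
    child_fd = css[0].fileno()
    # Python 3 closes unlisted fds in the child, so list it explicitly.
    proc = subprocess.Popen(argv_prefix + ["%d" % child_fd],
                            pass_fds=(child_fd,))
    css[0].close()  # the parent keeps only its own end
    return proc, css[1]


def demo():
    """Send one message to the stand-in child and return its reply."""
    proc, sk = launch_worker([sys.executable, "-c", CHILD_SOURCE])
    sk.send(b"ping")
    reply = sk.recv(1024)
    sk.close()
    proc.wait()
    return reply
```

With the real binary the call would presumably look like
launch_worker([criu_binary, "swrk"]), matching the argv quoted above.
Because the child is spawned per connection, it is a separate process
from any criu service that was started by hand.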
>
> - Hui
>
>>
>> -- Pavel
>>
>>> - Hui
>>>
>>>
>>> On Tue, Oct 20, 2015 at 5:29 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>>>> Hi, Hui, find my comments inline.
>>>>
>>>> Overall this is much better, but still I'd like to have even less
>>>> "if htype.name() == "docker"" stubs over the code. Let's discuss (inline)
>>>> what issues you're trying to solve and rework the generic p.haul code
>>>> respectively.
>>>>
>>>>> index 11b3dbb..fd2baa3 100644
>>>>> --- a/phaul/images.py
>>>>> +++ b/phaul/images.py
>>>>> @@ -140,6 +140,11 @@ class phaul_images:
>>>>> logging.info("Sending images to target")
>>>>>
>>>>> start = time.time()
>>>>> +
>>>>> + if htype.get_driver_name() == "docker" :
>>>>> + htype.send_criu_images()
>>>>> + return
>>>>
>>>> In docker driver the send_criu_images() anyway starts the fs_haul_subtree
>>>> and rsync-s them. Can you teach the docker driver to always return the
>>>> subtree fs hauler to avoid this "if"?
>>>
>>> In v3, I use send_criu_images for docker and sync_imgs_to_target() for
>>> other types. I agree that it would be better to merge them, so I have
>>> listed it as a TODO.
>>>
>>>>
>>>>> +
>>>>> cdir = self.image_dir()
>>>>>
>>>>> target_host.start_accept_images(phaul_images.IMGDIR)
>>>>
>>>>> diff --git a/phaul/p_haul_iters.py b/phaul/p_haul_iters.py
>>>>> index b2c76e3..8f8fb73 100644
>>>>> --- a/phaul/p_haul_iters.py
>>>>> +++ b/phaul/p_haul_iters.py
>>>>> @@ -36,6 +36,13 @@ class phaul_iter_worker:
>>>>> self.img = images.phaul_images("dmp")
>>>>>
>>>>> self.htype = p_haul_type.get_src(p_type)
>>>>> +
>>>>> + if self.htype.get_driver_name() != "docker" :
>>>>> + # docker will talk to swrk in runc
>>>>> + self.criu_connection = criu_api.criu_conn(self.data_socket)
>>>>> + else:
>>>>> + self.criu_connection = ""
>>>>> +
>>>>
>>>> You want to make the "dump" and "pre_dump" stages via Docker API too, am I right?
>>>
>>> Yes. I will do this separately for runC and docker.
>>>
>>>>
>>>>> if not self.htype:
>>>>> raise Exception("No htype driver found")
>>>>>
>>>>> @@ -55,13 +62,16 @@ class phaul_iter_worker:
>>>>>
>>>>> def set_options(self, opts):
>>>>> self.target_host.set_options(opts)
>>>>> - self.criu_connection.verbose(opts["verbose"])
>>>>> - self.criu_connection.shell_job(opts["shell_job"])
>>>>> + if self.htype.get_driver_name() != "docker" :
>>>>> + self.criu_connection.verbose(opts["verbose"])
>>>>> + self.criu_connection.shell_job(opts["shell_job"])
>>>>> +
>>>>> self.img.set_options(opts)
>>>>> self.htype.set_options(opts)
>>>>> self.fs.set_options(opts)
>>>>> self.__force = opts["force"]
>>>>> self.pre_dump = opts["pre_dump"]
>>>>> + self.target_host_ip = opts["to"]
>>>>>
>>>>> def validate_cpu(self):
>>>>> logging.info("Checking CPU compatibility")
>>>>> @@ -103,13 +113,39 @@ class phaul_iter_worker:
>>>>>
>>>>> migration_stats.start()
>>>>>
>>>>> - if not self.__force:
>>>>> - self.validate_cpu()
>>>>> + # TODO fix it
>>>>> + if self.htype.get_driver_name() != "docker" :
>>>>> + if not self.__force:
>>>>> + self.validate_cpu()
>>>>
>>>> P.haul does CPU validation itself.
>>>
>>> Applied in v3.
>>>
>>>>
>>>>>
>>>>> logging.info("Preliminary FS migration")
>>>>> self.fs.set_work_dir(self.img.work_dir())
>>>>> self.fs.start_migration()
>>>>>
>>>>> + logging.info("Starting iterations")
>>>>> +
>>>>> + # For Docker, we take a different path
>>>>> + if self.htype.get_driver_name() == "docker" :
>>>>> + logging.info("Take a special path for Docker")
>>>>> +
>>>>> + self.htype.dump()
>>>>> + logging.info("\tDocker dump succeeded")
>>>>> + logging.info("FS and images sync")
>>>>> + # sync the aufs filesystem again
>>>>> + self.fs.stop_migration()
>>>>> +
>>>>> + # send the docker criu image to host
>>>>> + self.htype.send_criu_images(self.target_host_ip)
>>>>> +
>>>>> + logging.info("Asking target host to restore")
>>>>> + self.target_host.restore_from_images()
>>>>> +
>>>>> + return
>>>>
>>>> Would setting the self.pre_dump to "NO" help to avoid the hand-made
>>>> code above?
>>>
>>> Applied in v3.
>>>
>>>>
>>>>> +
>>>>> + # TODO: Do not do predump for docker right now. Add page-server
>>>>> + # to docker C/R API, then we can enable
>>>>> + # the pre-dump
>>>>> +
>>>>> logging.info("Checking for Dirty Tracking")
>>>>> if self.pre_dump == PRE_DUMP_AUTO_DETECT:
>>>>> # pre-dump auto-detection
>>>>
>>>>> diff --git a/phaul/p_haul_service.py b/phaul/p_haul_service.py
>>>>> index 11883a6..f0667fc 100644
>>>>> --- a/phaul/p_haul_service.py
>>>>> +++ b/phaul/p_haul_service.py
>>>>> @@ -45,17 +45,19 @@ class phaul_service:
>>>>> logging.info("Setting up service side %s", htype_id)
>>>>> self.img = images.phaul_images("rst")
>>>>>
>>>>> - self.criu_connection = criu_api.criu_conn(self._mem_sk)
>>>>> self.htype = p_haul_type.get_dst(htype_id)
>>>>>
>>>>> - # Create and start fs receiver if current p.haul module provide it
>>>>> - self.__fs_receiver = self.htype.get_fs_receiver(self._fs_sk)
>>>>> - if self.__fs_receiver:
>>>>> - self.__fs_receiver.start()
>>>>> + if self.htype.get_driver_name() != "docker" :
>>>>> + self.criu_connection = criu_api.criu_conn(self._mem_sk)
>>>>> + # Create and start fs receiver if current p.haul module provide it
>>>>> + self.__fs_receiver = self.htype.get_fs_receiver(self._fs_sk)
>>>>> + if self.__fs_receiver:
>>>>> + self.__fs_receiver.start()
>>>>
>>>> Make docker driver return empty receiver to avoid this "if".
>>>
>>> Applied in v3.
>>>
>>>>
>>>>>
>>>>> def rpc_set_options(self, opts):
>>>>> - self.criu_connection.verbose(opts["verbose"])
>>>>> - self.criu_connection.shell_job(opts["shell_job"])
>>>>> + if self.htype.get_driver_name() != "docker" :
>>>>> + self.criu_connection.verbose(opts["verbose"])
>>>>> + self.criu_connection.shell_job(opts["shell_job"])
>>>>> self.img.set_options(opts)
>>>>> self.htype.set_options(opts)
>>>>>
>>>>
>>>> -- Pavel
>>>>
>>> .
>>>
>>
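
The review comments above mostly ask for driver-specific behaviour to
live in the driver, so the generic code needs no name checks. A rough
sketch of that idea for the fs receiver case, under stated assumptions:
FsReceiver, GenericDriver, DockerDriver and setup_service are
hypothetical stand-ins written for this note, not the actual p.haul
classes, and only get_fs_receiver() is a name taken from the quoted
diff.

```python
class FsReceiver:
    """Hypothetical receiver; records whether start() was called."""

    def __init__(self, sk):
        self.sk = sk
        self.started = False

    def start(self):
        self.started = True


class GenericDriver:
    """Hypothetical driver that needs a filesystem receiver."""

    def get_fs_receiver(self, fs_sk):
        return FsReceiver(fs_sk)


class DockerDriver(GenericDriver):
    """Hypothetical docker driver: it has no receiver to offer."""

    def get_fs_receiver(self, fs_sk):
        return None


def setup_service(htype, fs_sk):
    """Generic service-side setup with no driver-name check.

    The existing 'if receiver:' test already handles a driver that
    returns nothing, so docker needs no special branch here.
    """
    receiver = htype.get_fs_receiver(fs_sk)
    if receiver:
        receiver.start()
    return receiver
```

The same pattern could cover the other stubs discussed in the thread:
each place that compares get_driver_name() with "docker" becomes a
method the driver overrides.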