[CRIU] Patch for the migration feature to change ip

孙亚 sunya888888 at 163.com
Mon Jul 14 06:19:02 PDT 2014

At 2014-07-14 05:18:58, "Pavel Emelyanov" <xemul at parallels.com> wrote:
>On 07/13/2014 12:41 PM, 孙亚 wrote:
>>
>> At 2014-07-11 07:09:52, "Pavel Emelyanov" <xemul at parallels.com> wrote:
>>>On 07/11/2014 10:00 AM, 孙亚 wrote:
>>>> Hi there,
>>>>  To support migration in which the restored program takes the IP of the target machine it migrates
>>>>  to, I added a '-m' option to the main function in crtools.c, along with code that makes -m valid
>>>>  only for the restore operation. I also added a data member to the opts struct defined in cr_option.h.
>>>> 
>>>> When the user runs a command like this:
>>>> 
>>>> '''criu restore -D targetFiles -m 192.168.0.1 --tcp-established '''
>>>> 
>>>> the IP is changed to 192.168.0.1 in the restore_sockaddr function in sk-inet.c.
>>>
>>>But there can be many sockets, which of them will have the ip changed into 192.168.0.1?
>> It depends on whether all the sockets to be restored are handled by the restore_sockaddr function. If they
>> are, then all the sockets will be changed to 192.168.0.1. And according to my test, they are.
>
>Well, this is not good as there can be more than one socket in the image.
Sure.
>
>>>
>>>> Of course, the program will be restored, but the TCP connection will be broken because the IP has
>>>> changed. The program should have error handling code for this scenario.
>>>
>>>Maybe it's just better to close the connection while restoring instead of fixing the ip address?
>> 
>> Yes, that is what the Berkeley C/R system does; it does not support restoring sockets. I guess if we
>> close the connection during dumping, restoring the connections would need a complex strategy, including:
>> 1) allowing the user to indicate the IP they want to migrate to; 2) allowing the user to decide which IP
>> should be assigned to which socket; 3) dropping the packets in the queue of the current connection if there
>> is no way to redirect them into the new connection and keep them consistent with the other end.
>> I guess it is difficult for CRIU to know how to reconnect to the server program with new IPs. In fact,
>> the complex step for the user is step 2), which also cannot be avoided if we want to achieve dumping and
>> restoring TCP connections in CRIU.
>> At the same time, we cannot avoid step 3) either. So what we do now is let the developer of the client
>> program handle this situation. But before they can use their error handling code to deal with it, we
>> first need to have the program restored. So we change the IP and restore the program, which will detect
>> the broken connection and run its error handling code, such as reconnecting or exiting, after restoration.
>> 
>> In short, what we do is just give the program a trigger to handle this situation; what needs to be done
>> is decided by the client program.
>> By the way, after discussing this among ourselves, we don't think TCP connection restoration is reasonable
>> in all situations, especially migration. Even if you can restore one end of the connection, if you cannot
>> restore the other end, the two ends become inconsistent, which goes against TCP/IP principles.
>
>That's interesting. CRIU indeed doesn't know many details about processes and their connections,
>but on the other hand CRIU can call custom hooks so that external code could help. What if we
>put more callbacks into CRIU's TCP/IP socket dumping and restoring code, so that you could
>write a plugin with any logic you need to handle that case? The plugins API is in the include/plugin.h
>and plugin.c files. Currently, there are no hooks for TCP/IP sockets, but if you could propose
>where to put those (with an example of a plugin) I would gladly merge these changes.

That's great. I will do it as soon as possible.
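
As a starting point for discussing such a hook, here is a minimal sketch of per-socket address rewriting. It assumes the hook would be handed the socket's IPv4 address together with a user-supplied map of old to new IPs, so that only matching sockets are changed rather than every socket in the image. The names ip_remap and remap_inet_addr are hypothetical and are not part of CRIU's plugin API; 10.0.0.5 in the usage note is just an example address.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical map entry: sockets bound to old_ip are moved to new_ip. */
struct ip_remap {
	const char *old_ip;
	const char *new_ip;
};

/*
 * Hypothetical restore-time hook (NOT an existing CRIU callback).
 * Rewrites addr in place if its IPv4 address matches an entry in the map.
 * Returns 1 if rewritten, 0 if left unchanged, -1 on a malformed map entry.
 */
int remap_inet_addr(struct sockaddr_in *addr,
		    const struct ip_remap *map, size_t n)
{
	size_t i;

	assert(addr != NULL && addr->sin_family == AF_INET);

	for (i = 0; i < n; i++) {
		struct in_addr old_addr, new_addr;

		/* inet_pton() returns 1 only for a valid dotted-quad string */
		if (inet_pton(AF_INET, map[i].old_ip, &old_addr) != 1 ||
		    inet_pton(AF_INET, map[i].new_ip, &new_addr) != 1)
			return -1;

		if (addr->sin_addr.s_addr == old_addr.s_addr) {
			addr->sin_addr = new_addr; /* the port is left untouched */
			return 1;
		}
	}

	return 0; /* no entry matched, keep the original address */
}
```

With a map of { "10.0.0.5", "192.168.0.1" }, a socket bound to 10.0.0.5 would be moved to 192.168.0.1, while a socket bound to 127.0.0.1 would be left alone. A single -m value corresponds to the special case where every socket is remapped to the same address.
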
I also have another question. I find that if the pid of the process to be restored is already occupied by another process (which can happen during migration or when restoring locally), the process being restored is assigned a new pid. Because the new pid differs from the original one, the restore operation fails, like this:
'''
Error (cr-restore.c:1227): Pid 17047 do not match expected 17046
Error (cr-restore.c:1036): 17047 exited, status=255
Error (cr-restore.c:1590): Restoring FAILED.

'''
So would it be better to give the user an option to restore the process with a new pid?
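
To show the kind of conflict I mean, here is a minimal sketch of a pre-check that asks whether a pid is currently taken, using kill() with signal 0. The helper name pid_in_use is hypothetical and is not a CRIU function. The check is only a rough one: it is racy (the pid can be taken right after the check), and a pid number can also be held in ways this probe does not see.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of CRIU).
 * Returns 1 if a process with this pid exists in the current pid namespace,
 * 0 if no such process exists, -1 on an unexpected error.
 */
int pid_in_use(pid_t pid)
{
	assert(pid > 0);

	/* Signal 0 performs the existence and permission checks only. */
	if (kill(pid, 0) == 0)
		return 1;	/* process exists and we may signal it */
	if (errno == EPERM)
		return 1;	/* process exists but we may not signal it */
	if (errno == ESRCH)
		return 0;	/* no such process */
	return -1;
}
```

For the log above, pid_in_use(17046) returning 1 before the restore would explain why the restored process ended up with a different pid.
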


>>>> By the way, if the -m option is not given on the command line, the program is restored in the original way.
>>>> Finally, I patched the three files. Note the command used to create the patches:
>>>> 
>>>> '''diff -uN fromFile toFile > file.patch'''
>>>
>>>Git makes this much simpler :) Below is the link on a page describing how to do it.
>>>
>>>> fromFile is from criu-1.3-rc1, and toFile is from criu-1.3-rc2.
>>>> The whole story can be seen at: https://bugzilla.openvz.org/show_bug.cgi?id=2988
>>>> If there is any problem, please let me know.
>>>
>>>Can you send the patch in the format described at http://criu.org/How_to_submit_patches
>>>Briefly -- the patch should be inline (viewable in the mailer) and with signed-off-by line.
>> Sure, I will do it as soon as possible.
>
>Cool!
>
>Thanks,
>Pavel
>