[CRIU] [RFC] net: netlink -- Add waiting for RX fillup

Andrew Vagin avagin at parallels.com
Wed Aug 6 13:59:10 PDT 2014


On Wed, Aug 06, 2014 at 11:57:45PM +0400, Cyrill Gorcunov wrote:
> If the RX buffer of a netlink socket is receiving data
> when we try to dump it, CRIU refuses to proceed, simply
> because we can't fetch data that sits that deep inside
> kernel code without modifying the kernel itself. Let's
> try to wait until receiving is complete; if it takes
> too long, exit with an error.

I don't think this patch works. All processes are frozen at this
point, so the socket state will not change however long we wait.
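
To illustrate the concern, here is a minimal standalone sketch. It is not
CRIU code: poll_unread_socket() is a hypothetical helper, and it uses an
AF_UNIX socketpair as a stand-in for a netlink socket, which is an
assumption made only to keep the example self-contained. It queues data
that nobody ever reads, which is roughly the situation when the socket's
owner is frozen, and then retries poll() with the same growing sleep the
patch uses.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of CRIU: queue one byte on a socketpair,
 * never read it, and poll the receiving end up to `retries` times with a
 * short sleep in between. Returns the last poll() result (1 while data
 * is still queued, 0 if the queue drained), or -1 on setup failure.
 */
static int poll_unread_socket(int retries)
{
	int sv[2], ret = -1, i;

	assert(retries > 0);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	if (write(sv[0], "x", 1) != 1)
		goto out;

	for (i = 1; i <= retries; i++) {
		struct pollfd pfd = { .fd = sv[1], .events = POLLIN };
		struct timespec ts = { .tv_nsec = 50000L * i };

		ret = poll(&pfd, 1, 0);
		if (ret <= 0)
			break;
		/* Nobody reads sv[1], so sleeping changes nothing. */
		nanosleep(&ts, NULL);
	}
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Calling poll_unread_socket(10) should still report the socket as readable
after the last retry, since no running task consumes the data in between.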
> 
> CC: Andrew Vagin <avagin at parallels.com>
> CC: Igor <igor at parallels.com>
> CC: Pavel Emelyanov <xemul at parallels.com>
> Signed-off-by: Cyrill Gorcunov <gorcunov at openvz.org>
> ---
>  sk-netlink.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/sk-netlink.c b/sk-netlink.c
> index 83a20905400c..b0e1465d6146 100644
> --- a/sk-netlink.c
> +++ b/sk-netlink.c
> @@ -2,6 +2,7 @@
>  #include <linux/netlink.h>
>  #include <linux/rtnetlink.h>
>  #include <poll.h>
> +#include <time.h>
>  
>  #include "fdset.h"
>  #include "files.h"
> @@ -68,12 +69,36 @@ int netlink_receive_one(struct nlmsghdr *hdr, void *arg)
>  static bool can_dump_netlink_sk(int lfd)
>  {
>  	struct pollfd pfd = {lfd, POLLIN, 0};
> -	int ret;
> +	int ret = 1, num;
> +
> +	for (num = 1; ret && num < 11; num++) {
> +		ret = poll(&pfd, 1, 0);
> +		if (ret < 0) {
> +			pr_perror("poll() failed");
> +			break;
> +		} else if (ret == 1) {
> +			struct timespec ts = {
> +				.tv_nsec = 50000 * num,
> +			};
> +
> +			pr_debug("netlink RX buffer is busy, "
> +				 "waiting for %lu nanoseconds\n",
> +				 (unsigned long)ts.tv_nsec);
> +			/*
> +			 * The netlink socket has some data in
> +			 * the RX ring which is not yet complete.
> +			 * Let's wait some time until the sending
> +			 * side finishes transmission, but don't
> +			 * wait here forever.
> +			 */
> +			if (nanosleep(&ts, NULL)) {
> +				pr_perror("netlink nanosleep failed");
> +				break;
> +			}
> +		}
> +	}
>  
> -	ret = poll(&pfd, 1, 0);
> -	if (ret < 0) {
> -		pr_perror("poll() failed");
> -	} else if (ret == 1)
> +	if (ret > 0)
>  		pr_err("The socket has data to read\n");
>  
>  	return ret == 0;
> -- 
> 1.9.3
> 

