[CRIU] problem restoring unix queues?
Tycho Andersen
tycho.andersen at canonical.com
Wed Jul 22 15:44:37 PDT 2015
On Wed, Jul 22, 2015 at 09:41:29AM -0600, Tycho Andersen wrote:
> On Mon, Jul 20, 2015 at 11:45:19AM -0600, Tycho Andersen wrote:
> > > But sockets we're having here are SOCK_DGRAM, aren't they?
> > >
> > > > and that's what is
> > > > causing the problem? The backlog is the only thing that I can see that
> > > > can cause EAGAIN from unix_dgram_sendmsg().
> > > >
> > > > For the example above, it looks like it's the socket that did a listen().
> > > > The peer that is failing to send is:
> > > >
> > > > # 0x1005327 == 16798503, the peer above
> > > > # 0x1005ffe == 16801790, the peer below
> > > >
> > > > (00.095929) 77: Connect 0x1005ffe to 0x1005327
> > > > (00.095937) 77: Trying to restore recv queue for 14
> > > > (00.095949) 77: Restoring 357-bytes skb for 14
> > > > (00.095976) 77: Error (sk-queue.c:238): Failed to send packet: Resource temporarily unavailable
> > >
> > > Ah it looks like I finally got what you mean :)
> > >
> > > There are two places in unix_dgram_sendmsg that result in EAGAIN.
> > > The first is hitting the sending socket's sndbuf; we (seem to) address
> > > that by raising it with setsockopt. But there's another one: the check
> > > for unix_recvq_full(other). It checks that the number of packets in
> > > the peer's queue doesn't exceed its max_ack_backlog :(
> > >
> > > Are you talking about it?
> >
> > Yes :)
> >
> > > If yes, how did it happen that the datagram socket has more data
> > > packets (as we see in the dump) than this limit?
> >
> > I'm not sure. I wonder if something is getting screwed up because both
> > ends of the socket are closed (it looks like parse_rtattr zeros the
> > buffer it writes to, so maybe that's where the zero is coming from?).
> > It should be easy enough to test, I'll try to play with it this
> > afternoon.
>
> So I played around with this yesterday and found some interesting
> things, although I still have no idea what the cause is:
Ok, a bit more fiddling and I think I've got a test case for it, so
hopefully a fix will come shortly :)
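
For reference, the EAGAIN behaviour discussed in the quoted text can be seen outside of CRIU with a small sketch like the one below. This is not the test case mentioned above, only an illustration under assumptions: it assumes Linux, the helper name fill_dgram_queue and the socket path are made up for the example, and which of the two limits (the sender's sndbuf or the receiver's queue length, which defaults to net.unix.max_dgram_qlen) trips first depends on the system's settings.

```python
import errno
import os
import socket
import tempfile


def fill_dgram_queue(payload=b"x", limit=100000):
    """Send datagrams to a bound AF_UNIX SOCK_DGRAM socket that nobody
    reads from, until the kernel refuses to queue more.

    Returns (packets_queued, errno_of_failed_send). Hypothetical helper,
    for illustration only.
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "rx.sock")
    rx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        rx.bind(path)
        # Ask for a bigger send buffer so the sndbuf limit is less likely
        # to be the first one hit (the kernel caps this at wmem_max).
        tx.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1 << 20)
        # Non-blocking, so a full queue shows up as EAGAIN instead of
        # putting the sender to sleep.
        tx.setblocking(False)
        tx.connect(path)

        sent = 0
        err = None
        while sent < limit:
            try:
                tx.send(payload)
            except OSError as e:
                err = e.errno
                break
            sent += 1
        return sent, err
    finally:
        tx.close()
        rx.close()
        os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    sent, err = fill_dgram_queue()
    name = errno.errorcode.get(err, err)
    print("queued %d packets before send failed with %s" % (sent, name))
```

The receiver is deliberately left unconnected to the sender: as far as I can tell, the queue-length check is skipped when the two sockets are connected to each other (as with socketpair()), so a one-way connect() is needed to exercise it.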
Tycho