[CRIU] CRIU and Weston

Mon Feb 9 23:59:13 PST 2015

On Mon, 09 Feb 2015 12:16:52 +0200
Ruslan Kuprieiev <kupruser at gmail.com> wrote:

> Hi Pekka,
> 
> Thank you for your response.
> 
> On 02/06/2015 03:05 PM, Pekka Paalanen wrote:
> > On Fri, 30 Jan 2015 21:31:24 +0200
> > Ruslan Kuprieiev <kupruser at gmail.com> wrote:
> >
> >> Hi!
> >>
> >> I would like to add checkpoint/restore support of a graphical
> >> app(no gl) that uses wayland.
> >> I'll appreciate any thoughts and ideas on how to accomplish that as
> >> well as what simplest
> >> app should I start with.
> >>
> >> After discussing it on #wayland, it looks like I should start with
> >> creating patches for weston
> >> to implement needed getters and setters to obtain client's state on
> >> dump and set it on restore.
> >>
> >> I was wondering if you could help me by providing some info about
> >> amount of data, that
> >> describes the client's state.
> >>
> >> I'll also appreciate any hints on where to start digging into
> >> weston.
> > Structs weston_surface, weston_view, weston_subsurface,
> > shell_surface... everything contained by those is a candidate, plus
> > everything that ever references those or those reference. Shared
> > mmapped files, file descriptors...
> >
> > I have a very hard time seeing what could work for you while not
> > rewriting the world.
> >
> > Maybe you could start by looking at software pixel buffers. A client
> > creates a file on tmpfs, mmaps and writes pixels into it, sends the
> > file descriptor to the compositor, and the compositor mmaps that
> > file. So for save/restore, you need to somehow reinstate this
> > mapping in both the client and the compositor. This is the
> > fundamental, basic mechanism how clients send content to the
> > compositor, and the compositor decides how long it needs it. Can
> > you do that?
> 
> Yes, CRIU is able to c/r mmap-ed regions of a client.
> Could we just pass fd to the compositor when restoring a client?
> I mean, like tell weston that that buffer is what we're starting from?

That obviously requires changes in Weston to be able to receive it. If
a client is supposed to send it, then you need to extend the Wayland
protocol. Then there is the fundamental problem of re-creating the
Wayland protocol state and being able to associate this recreated fd with
the right object... which means you have to patch libwayland in
suspicious ways.

Except that in Wayland the fd is originally sent when creating a
wl_shm_pool object, but the pool object is often destroyed immediately
after creating a wl_buffer from it. So you need to preserve not only
the memory area, but all the metadata of a wl_buffer (offset, length).

Btw, if you want to keep Weston running, then you cannot destroy the
shared memory (file). If Weston holds a reference to it in Wayland
protocol sense (not sending wl_buffer.release), it will need it for
repainting.

> > Clients send the file descriptor only once per wl_shm_pool. After
> > that they assume they can re-use the wl_buffers indefinitely. We
> > have no mechanism to revoke any.
> 
> Do you mean, that this is initial procedure?

What does "initial procedure" mean here? Creating a wl_shm_pool can
happen any time, and the wl_buffers created from a wl_shm_pool can be
re-used indefinitely, even after the wl_shm_pool protocol object is
destroyed.

> > Can you save and restore unix socket connections? Note, that
> > (re)connecting means starting from a clean slate in Wayland protocol
> > state wise.
> 
> In general - we can't, unless client and server are both in the
> process tree that we're dumping.
> But, maybe we could implement proper reconnecting to server socket, 
> doing all needed
> negotiations with server, and then passing saved fd of that pixel
> buffer you mentioned
> before.

I very VERY much doubt that is feasible. I think you'd end up
essentially writing a new middle-man Wayland server-client. Also,
libwayland does not support dumping or restoring protocol state, so you
cannot do a hand-off, and while object ids are allocated deterministically
given a certain protocol message order, the ids are allocated
dynamically and must be without gaps. Even if you managed to get the
client-sent replayed requests in perfect order to recreate the exact
same protocol state, you need to get the compositor to do the same,
because certain events also create new protocol objects, and those
events may not be triggered by your client but by something else, and async.
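
A toy model of the id problem, to make it concrete. It is not how
libwayland is implemented: the allocator is reduced to one counter per
side, the object names are made up, and the start of the server-side
range is an assumption about libwayland internals.

```python
CLIENT_ID_START = 2            # id 1 is taken by wl_display
SERVER_ID_START = 0xFF000000   # assumed start of the server-allocated range


def replay(log):
    """Hand out ids in message order, per side, with no gaps."""
    next_id = {"client": CLIENT_ID_START, "server": SERVER_ID_START}
    ids = {}
    for side, name in log:
        ids[name] = next_id[side]
        next_id[side] += 1
    return ids


# Message order seen in the session that was checkpointed.
original = [
    ("client", "registry"),
    ("client", "surface"),
    ("server", "data_offer"),
]

# On restore, an unrelated async event creates a server-side object first.
restored = [
    ("client", "registry"),
    ("server", "other_offer"),
    ("client", "surface"),
    ("server", "data_offer"),
]

same_order_same_ids = replay(original) == replay(list(original))
client_ids_match = replay(original)["surface"] == replay(restored)["surface"]
offer_id_shifted = replay(original)["data_offer"] != replay(restored)["data_offer"]
```

Replaying the client's requests in the same order reproduces the client
ids, but the id the restored client remembers for data_offer now names a
different object on the compositor side.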

Then you would also need to make massive invasive changes to Weston to
make it deal with a client connection disappearing for a while, if you
really wanted to save server-side state. I do not see upstream being too
receptive to such changes; it goes way out of scope for Weston IMO.

This is all just a huge hand-wave... I can't even imagine all the
things you'd need to take into account.

A remotely possible approach might be to run every app through a
middle-man daemon, that acts as a Wayland compositor to the app, and a
Wayland client to the real compositor. Then you'd make the middle-man
disconnect from the compositor when you suspend, and suspend it with
the app. So... essentially something related to how RDP and friends
work, I think. On reconnect the middle-man has a chance to negotiate
everything again instead of trying to replicate old state.
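
A rough sketch of that middle-man idea, with all names hypothetical and
no real Wayland protocol involved: the state the app sees lives in the
middle-man and would be checkpointed with it, while everything that
refers to the real compositor is thrown away on suspend and created anew
on reconnect.

```python
class FakeCompositor:
    """Hypothetical stand-in for a connection to the real compositor."""

    def __init__(self):
        self.surfaces = {}
        self._next_handle = 1

    def create_surface(self, title):
        handle = self._next_handle
        self._next_handle += 1
        self.surfaces[handle] = title
        return handle


class Middleman:
    def __init__(self):
        self.app_surfaces = {}      # app-facing state, saved with the app
        self.upstream = None        # connection to the real compositor
        self.upstream_handles = {}  # compositor-side handles, never saved

    def connect(self, compositor):
        # Negotiate from a clean slate: recreate one upstream object per
        # app-facing object instead of replaying old protocol state.
        self.upstream = compositor
        self.upstream_handles = {
            app_id: compositor.create_surface(title)
            for app_id, title in self.app_surfaces.items()
        }

    def app_create_surface(self, app_id, title):
        self.app_surfaces[app_id] = title
        if self.upstream is not None:
            self.upstream_handles[app_id] = self.upstream.create_surface(title)

    def suspend(self):
        # Disconnect before the checkpoint; nothing upstream survives.
        self.upstream = None
        self.upstream_handles = {}


def suspend_and_resume():
    first = FakeCompositor()
    first.create_surface("some other client")
    middleman = Middleman()
    middleman.connect(first)
    middleman.app_create_surface("win", "simple-shm")
    before = dict(middleman.upstream_handles)

    middleman.suspend()
    second = FakeCompositor()
    middleman.connect(second)
    return middleman.app_surfaces, before, middleman.upstream_handles, second.surfaces
```

The app keeps talking about "win" throughout; only the middle-man knows
that the compositor-side handle changed across the reconnect.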

> Btw, what simplest program you would suggest to start with? Maybe 
> something that
> simply draws some static picture?

simple-shm in Weston's demo clients is the canonical "simplest example"
client.

Thanks,
pq