[CRIU] [PATCH 07/11] lazy-pages: make handle_requests also poll for listening socket
Pavel Emelyanov
xemul at virtuozzo.com
Tue Apr 12 06:12:25 PDT 2016
On 04/12/2016 04:07 PM, Mike Rapoport wrote:
> On Tue, Apr 12, 2016 at 03:48:25PM +0300, Pavel Emelyanov wrote:
>> On 04/12/2016 03:21 PM, Mike Rapoport wrote:
>>> On Tue, Apr 12, 2016 at 03:08:14PM +0300, Pavel Emelyanov wrote:
>>>> On 04/12/2016 02:27 PM, Mike Rapoport wrote:
>>>>> On Tue, Apr 12, 2016 at 02:06:22PM +0300, Pavel Emelyanov wrote:
>>>>>> On 04/11/2016 09:19 AM, Mike Rapoport wrote:
>>>>>>> Signed-off-by: Mike Rapoport <rppt at linux.vnet.ibm.com>
>>>>>>> ---
>>>>>>> criu/uffd.c | 119 +++++++++++++++++++++++++++++++++++++++---------------------
>>>>>>> 1 file changed, 77 insertions(+), 42 deletions(-)
>>>>>>>
>>>>>>> diff --git a/criu/uffd.c b/criu/uffd.c
>>>>>>> index 87c2602..fece56e 100644
>>>>>>> --- a/criu/uffd.c
>>>>>>> +++ b/criu/uffd.c
>>>>>>> @@ -545,32 +545,56 @@ static int handle_user_fault(int fd, struct list_head *uffd_list, void *dest)
>>>>>>> return 0;
>>>>>>> }
>>>>>>>
>>>>>>> +static int handle_socket_conn(int listen, struct sockaddr_un *saddr,
>>>>>>> + struct list_head *uffd_list)
>>>>>>> +{
>>>>>>> + int uffd;
>>>>>>> + int uffd_flags;
>>>>>>> +
>>>>>>> + uffd = ud_open(listen, saddr);
>>>>>>> + if (uffd < 0) {
>>>>>>> + pr_perror("uffd open error");
>>>>>>> + return -1;
>>>>>>> + }
>>>>>>> +
>>>>>>> + pr_debug("uffd is 0x%d\n", uffd);
>>>>>>> + uffd_flags = fcntl(uffd, F_GETFD, NULL);
>>>>>>> + pr_debug("uffd_flags are 0x%x\n", uffd_flags);
>>>>>>> + if (fcntl(uffd, F_SETFD, uffd_flags | O_NONBLOCK))
>>>>>>> + return -1;
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Find the memory pages belonging to the restored process so
>>>>>>> + * that it is trackable when all pages have been transferred.
>>>>>>> + */
>>>>>>> + if ((total_pages = find_vmas(uffd_list)) == -1)
>>>>>>> + return -1;
>>>>>>> +
>>>>>>> + return uffd;
>>>>>>> +}
>>>>>>> +
>>>>>>> #define POLL_TIMEOUT 5000
>>>>>>>
>>>>>>> -static int handle_requests(int fd)
>>>>>>> +static int handle_requests(int listen, struct sockaddr_un *saddr)
>>>>>>> {
>>>>>>> struct pollfd *pollfd;
>>>>>>> - int nr_pollfd;
>>>>>>> + int nr_pollfd = 1;
>>>>>>> + int max_pollfd;
>>>>>>> + int timeout = -1;
>>>>>>> int ret = -1;
>>>>>>> unsigned long ps;
>>>>>>> void *dest;
>>>>>>> + int i;
>>>>>>>
>>>>>>> LIST_HEAD(uffd_list);
>>>>>>>
>>>>>>> - /*
>>>>>>> - * Find the memory pages belonging to the restored process
>>>>>>> - * so that it is trackable when all pages have been transferred.
>>>>>>> - */
>>>>>>> - if ((total_pages = find_vmas(&uffd_list)) == -1)
>>>>>>> - return -1;
>>>>>>> -
>>>>>>> /* All operations will be done on page size */
>>>>>>> ps = page_size();
>>>>>>> dest = malloc(ps);
>>>>>>>
>>>>>>> /* allocate pollfd per task + pollfd for listen */
>>>>>>> - nr_pollfd = task_entries->nr_tasks + 1;
>>>>>>> - pollfd = malloc(sizeof(*pollfd) * nr_pollfd);
>>>>>>> + max_pollfd = task_entries->nr_tasks + 1;
>>>>>>> + pollfd = malloc(sizeof(*pollfd) * max_pollfd);
>>>>>>> if (!pollfd || !dest) {
>>>>>>> pr_err("memory allocation failed\n");
>>>>>>> ret = -1;
>>>>>>> @@ -578,12 +602,10 @@ static int handle_requests(int fd)
>>>>>>> }
>>>>>>>
>>>>>>> /*
>>>>>>> - * use nr_pollfd to set number of pollfd's being polled
>>>>>>> - * FIXME: eventually nr_pollfd should be nr_tasks + 1, and the
>>>>>>> - * first fd to poll on should that of listen
>>>>>>> + * use nr_pollfd to set number of pollfd's being polled, the
>>>>>>> + * first pollfd is always for the listening socket
>>>>>>> */
>>>>>>> - nr_pollfd = 1;
>>>>>>> - pollfd[0].fd = fd;
>>>>>>> + pollfd[0].fd = listen;
>>>>>>> pollfd[0].events = POLLIN;
>>>>>>>
>>>>>>> while (1) {
>>>>>>> @@ -592,7 +614,7 @@ static int handle_requests(int fd)
>>>>>>> * no uffd pages are requested the code switches to
>>>>>>> * copying the remaining pages.
>>>>>>> */
>>>>>>> - ret = poll(pollfd, nr_pollfd, POLL_TIMEOUT);
>>>>>>> + ret = poll(pollfd, nr_pollfd, timeout);
>>>>>>
>>>>>> But there should be a strict split between the time we expect new connections
>>>>>> on this socket and the time page faults start arriving. First we restore
>>>>>> the tree and send the uffd sockets to this daemon, and _then_ we resume the tasks;
>>>>>> from that point on no new connections may appear, but PF-s start popping up from the uffds.
>>>>>>
>>>>>> Since we know this, why do we need to complicate things and poll all the
>>>>>> sockets together?
>>>>>
>>>>> I was under the impression that the uffd is sent to the daemon at a stage when
>>>>> the tasks are already running concurrently. If I was wrong here and the tasks
>>>>> are resumed only after all the uffds are sent to the daemon, then we indeed
>>>>> don't need to poll the uffds and the listening socket together.
>>>>
>>>> Yes, we first fork() all the stuff and send the uffds out, then the tasks are resumed
>>>> and start generating PF-s. The problem here is that this "resume" point is not
>>>> reported to the uffd daemon :) so it can only guess when that moment comes.
>>>
>>> And what if it guesses wrong? ;-)
>>
>> Then it should know it for sure...
>>
>>> Let's keep (e)polling for everything? :)
>>
>> I don't have strong arguments against polling the listening socket together
>> with the uffds, but -- you don't have to poll the listening socket :) Just call accept
>> in a loop and pick up the newcomers. The only problem is how to know when we have
>> to stop doing this.
>
> what about
>
> int accepted = 0;
> do {
> 	client = accept();
> 	...
> 	accepted++;
> } while (accepted < nr_tasks);
I think this would work, yes.
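
Something along these lines -- just a rough sketch on top of the helpers from
criu/uffd.c (handle_socket_conn(), handle_user_fault(), POLL_TIMEOUT), with error
handling simplified and the "copy the remaining pages" step after the timeout left
out, so don't take it literally:

/* Rough sketch, not the actual patch: accept everything first, then poll uffds only */
static int handle_requests(int listen, struct sockaddr_un *saddr)
{
	struct pollfd *pollfd;
	int nr_tasks = task_entries->nr_tasks;
	int i, accepted = 0, ret = -1;
	void *dest;

	LIST_HEAD(uffd_list);

	dest = malloc(page_size());
	pollfd = malloc(sizeof(*pollfd) * nr_tasks);
	if (!pollfd || !dest)
		goto out;

	/*
	 * Phase 1: every task connects and sends its uffd before any of
	 * them is resumed, so accept()-ing nr_tasks connections is enough
	 * and the listening socket never has to be polled.
	 */
	do {
		int uffd = handle_socket_conn(listen, saddr, &uffd_list);

		if (uffd < 0)
			goto out;

		pollfd[accepted].fd = uffd;
		pollfd[accepted].events = POLLIN;
		accepted++;
	} while (accepted < nr_tasks);

	/*
	 * Phase 2: the tasks are running now, only page faults can arrive,
	 * so poll the uffds alone. A timeout means no more faults are
	 * coming and we should switch to copying the remaining pages.
	 */
	while (1) {
		ret = poll(pollfd, accepted, POLL_TIMEOUT);
		if (ret == 0)
			break;
		if (ret < 0)
			goto out;

		for (i = 0; i < accepted; i++) {
			if (!(pollfd[i].revents & POLLIN))
				continue;
			if (handle_user_fault(pollfd[i].fd, &uffd_list, dest) < 0)
				goto out;
		}
	}
	ret = 0;
out:
	free(pollfd);
	free(dest);
	return ret;
}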
-- Pavel