[CRIU] [PATCH 1/1] Use choose_service_fd_base in the root process.

Radoslaw Burny rburny at google.com
Fri Aug 3 16:48:15 MSK 2018


Replying to Kirill's comment here, since I didn't get it in my inbox.

> This chooses service fd base for main criu process, usernsd process
(possibly there are other helpers).
> I'm not sure, these guys never create many temporary fds before they
install a service fd:
>




> criu main task:
>    cached_fd1 = open()
>    cached_fd2 = open()
>    ...
>    cached_fdn = open()
>
>    install_service_fd() -> rewrites cached_fdn.
>
> It's need to review criu main task & friends code and check there are no
such the situations.
> Also, if we are going to do this way, we will have to implement some
things to detect such
> the problems in the future code.
>
> Really restored tasks (init and its children) do not create many
temporary fd by design,
> and we check their fd leak by tests (in zdtm.py), so there is no such the
problem.

Thanks, I wasn't aware of that. So the main CRIU process is effectively
relying on a high
service_fd_base to avoid collisions with temporary fds, right?

In that case, we could still try to select a base by using a large enough
pad.
In install_service_fd, we could then do:
  BUG_ON(close(new_fd) == 0);
(such a safety check may be generally useful, regardless of my changes).

What do you think? Even with RLIMIT_NOFILE of few billions, the fork delay
is just few hundred ms
- so if it turns out that my change gets very complex, it might not be
worth it.

On Fri, Aug 3, 2018 at 3:12 AM Andrei Vagin <avagin at virtuozzo.com> wrote:

> Kirill, could you review this patch?
>
> On Thu, Aug 02, 2018 at 02:53:10PM +0200, Radoslaw Burny wrote:
> > choose_service_fd_base selects lowest possible numbers for service fds
> > to keep fd table of the criu process small, making fork() faster.
> > However, this method was only used in child processes, not the root CRIU
> > process - thus forks from root were still slow. This fixes the issue.
> >
> > Sorry for slightly unreadable patch - the code move was necessary due to
> > changes in the inter-function dependencies. The key part of the diff is
> > this line:
> >   service_fd_base = choose_service_fd_base(NULL);
> >
> > Signed-off-by: Radoslaw Burny <rburny at google.com>
> > ---
> >  criu/util.c | 66 ++++++++++++++++++++++++++---------------------------
> >  1 file changed, 33 insertions(+), 33 deletions(-)
> >
> > diff --git a/criu/util.c b/criu/util.c
> > index ab4d89aa..114756a6 100644
> > --- a/criu/util.c
> > +++ b/criu/util.c
> > @@ -441,27 +441,6 @@ static int service_fd_base;
> >  /* Id of current process in shared fdt */
> >  static int service_fd_id = 0;
> >
> > -int init_service_fd(void)
> > -{
> > -     struct rlimit64 rlimit;
> > -
> > -     /*
> > -      * Service FDs are those that most likely won't
> > -      * conflict with any 'real-life' ones
> > -      */
> > -
> > -     if (syscall(__NR_prlimit64, getpid(), RLIMIT_NOFILE, NULL,
> &rlimit)) {
> > -             pr_perror("Can't get rlimit");
> > -             return -1;
> > -     }
> > -
> > -     service_fd_rlim_cur = (int)rlimit.rlim_cur;
> > -     service_fd_base = service_fd_rlim_cur;
> > -     BUG_ON(service_fd_base < SERVICE_FD_MAX);
> > -
> > -     return 0;
> > -}
> > -
> >  static int __get_service_fd(enum sfd_type type, int service_fd_id)
> >  {
> >       return service_fd_base - type - SERVICE_FD_MAX * service_fd_id;
> > @@ -560,20 +539,20 @@ static void move_service_fd(struct pstree_item
> *me, int type, int new_id, int ne
> >
> >  static int choose_service_fd_base(struct pstree_item *me)
> >  {
> > -     int nr, real_nr, fdt_nr = 1, id = rsti(me)->service_fd_id;
> > +     int nr = -1, real_nr, fdt_nr = 1, id = rsti(me)->service_fd_id;
> >
> > -     if (rsti(me)->fdt) {
> > -             /* The base is set by owner of fdt (id 0) */
> > -             if (id != 0)
> > -                     return service_fd_base;
> > -             fdt_nr = rsti(me)->fdt->nr;
> > +     if (me != NULL) {
> > +             if (rsti(me)->fdt) {
> > +                     /* The base is set by owner of fdt (id 0) */
> > +                     if (id != 0)
> > +                             return service_fd_base;
> > +                     fdt_nr = rsti(me)->fdt->nr;
> > +             }
> > +             /* Now find process's max used fd number */
> > +             if (!list_empty(&rsti(me)->fds))
> > +                     nr = list_entry(rsti(me)->fds.prev,
> > +                                     struct fdinfo_list_entry,
> ps_list)->fe->fd;
> >       }
> > -     /* Now find process's max used fd number */
> > -     if (!list_empty(&rsti(me)->fds))
> > -             nr = list_entry(rsti(me)->fds.prev,
> > -                             struct fdinfo_list_entry, ps_list)->fe->fd;
> > -     else
> > -             nr = -1;
> >
> >       nr = max(nr, inh_fd_max);
> >       /*
> > @@ -607,6 +586,27 @@ static int choose_service_fd_base(struct
> pstree_item *me)
> >       return nr;
> >  }
> >
> > +int init_service_fd(void)
> > +{
> > +     struct rlimit64 rlimit;
> > +
> > +     /*
> > +      * Service FDs are those that most likely won't
> > +      * conflict with any 'real-life' ones
> > +      */
> > +
> > +     if (syscall(__NR_prlimit64, getpid(), RLIMIT_NOFILE, NULL,
> &rlimit)) {
> > +             pr_perror("Can't get rlimit");
> > +             return -1;
> > +     }
> > +
> > +     service_fd_rlim_cur = (int)rlimit.rlim_cur;
> > +     service_fd_base = choose_service_fd_base(NULL);
> > +     BUG_ON(service_fd_base < SERVICE_FD_MAX);
> > +
> > +     return 0;
> > +}
> > +
> >  int clone_service_fd(struct pstree_item *me)
> >  {
> >       int id, new_base, i, ret = -1;
> > --
> > 2.18.0.597.ga71716f1ad-goog
> >
> > _______________________________________________
> > CRIU mailing list
> > CRIU at openvz.org
> > https://lists.openvz.org/mailman/listinfo/criu
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20180803/4673cff8/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4843 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.openvz.org/pipermail/criu/attachments/20180803/4673cff8/attachment-0001.p7s>


More information about the CRIU mailing list