[CRIU] The progress of Time namespace

Andrei Vagin avagin at virtuozzo.com
Sat Jun 2 00:34:30 MSK 2018


On Fri, Jun 01, 2018 at 01:20:33PM -0500, Eric W. Biederman wrote:
> Adrian Reber <adrian at lisas.de> writes:
> 
> > On Fri, Jun 01, 2018 at 11:04:26AM +0800, yukon wrote:
> >> I found that the CRIU community intends to resolve the timer issue[1].
> >> Is there an issue that tracks the progress?
> >
> > I have heard of other people experimenting with it and I also had a few
> > patches to try it out. The point where I stopped is when I found out
> > that most time calls are actually coming from the VDSO and not from the
> > kernel and it is still unclear to me how to handle namespaces and VDSO
> > correctly.
> >
> > I have also talked with Christian (on CC) about it and I also contacted
> > Eric at some point (also on CC). Maybe they have more information about
> > the current status.
> 
> Adrian.  My apologies for not getting back to you earlier (I was
> swamped) but that is not a good excuse.  I was very impressed by what
> you did.
> 
> For me personally I have been looking for a real world case where the
> timers matter.  Having that would increase the priority of this work
> from where I stand.
> 
> To date all I have done is recognize that a time namespace is almost
> certainly something that we need, and read the code enough to have a
> general sense of how the time infrastructure in the kernel works.
> 
> I think the VDSO has per cpu if not per process constants so we should
> be able to affect this in a namespace.  If the VDSO does not we
> certainly can make that happen.
> 
> I would be very happy to merge a time namespace.   I would probably even
> start looking at implementation details if I had a compelling test case
> in my hand.
> 
> Yukon.  I don't have the beginning of this thread.  So if you know of a
> practical case that does not work because of timers I would love to hear
> about it.

Hi Eric,

We have a practical case. A few CRIU users have reported situations where
applications stop working after being migrated to another host.

Usually this means that they use clock_gettime or timer_settime. The
problem is that we can't set the clocks on the destination host to the
values they had on the source host. For example, an application uses
CLOCK_MONOTONIC to measure time intervals, but after migration to another
host, clock_gettime(CLOCK_MONOTONIC) may return a value smaller than the
one it got on the source host. The application doesn't expect such
behaviour from CLOCK_MONOTONIC, and it will probably misbehave (hang,
crash, etc.).
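
To make the failure mode concrete, here is a minimal sketch of how an
application might measure an interval with CLOCK_MONOTONIC. The helper
names (ts_to_ns, elapsed_ns, measure_slice_ns) are hypothetical and only
for illustration; they are not taken from CRIU or from any reported
application. If the process is checkpointed after the first reading and
restored on a host whose monotonic clock is behind, the signed difference
becomes negative, which code like this usually never checks for.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a signed nanosecond count. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    assert(ts != NULL);
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Signed elapsed time between two readings of the same clock.
 * A negative result means the clock appears to have gone backwards,
 * which CLOCK_MONOTONIC promises not to do on a single host.
 */
static int64_t elapsed_ns(const struct timespec *start,
                          const struct timespec *end)
{
    return ts_to_ns(end) - ts_to_ns(start);
}

/*
 * Hypothetical application code: time a piece of work.
 * Returns the elapsed nanoseconds, or -1 if clock_gettime() fails.
 */
static int64_t measure_slice_ns(void)
{
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    /* ... work happens here; a migration could land in this window ... */

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return elapsed_ns(&start, &end);
}
```

Without migration, measure_slice_ns() returns a non-negative value. If
"start" is taken on the source host and "end" on a destination host with
a smaller monotonic value, elapsed_ns() is negative, and an application
that stores the result in an unsigned type sees a huge interval instead.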

Here is one quote from the CRIU mailing list:

  Is there a timeline on when the time namespace might be implemented? Or
  else is there anyone, even outside CRIU, working on it that you guys
  know of? It seems like this might be one of the last major obstacles
  keeping migration from being used in production systems, given that not
  all containers and connections can be migrated as long as a time
  dependency is capable of messing it up.
  https://github.com/checkpoint-restore/criu/issues/451#issuecomment-386073812

Thanks,
Andrei
> 
> Eric
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu