[Users] New Kernel Patch

Michael H. Warfield mhw at WittsEnd.com
Sat Jan 16 11:24:09 EST 2010


On Sat, 2010-01-16 at 09:19 -0500, Scott Dowdle wrote: 
> Suno,
> 
> ----- "Suno Ano" <suno.ano at sunoano.org> wrote:
> > Currently (January 2010) mainline is in development for the .33
> > release, .32 is stable and used by most Linux Distributions like for example
> > Debian, Ubuntu, Suse, etc.
> > 
> > From what it looks now Debian and Ubuntu are going into freeze for
> > their next stable release in March 2010. Will there be an up-to-date OpenVZ
> > kernel patch available by then? Debian is targeting to ship .32 with
> > their next stable release called squeeze.
> > 
> > In case OpenVZ will not be available on at least one of the major
> > Linux distributions and its offsprings, no need to mention how horrid that
> > would be ...
> > 
> > I would love to see OpenVZ be available in Debian's next stable
> > release since I am with no doubt an OpenVZ fanboy
> > http://sunoano.name/ws/public_xhtml/openvz.html
> 
: - Snip

> Of course I could be wrong with my assessment but if I'm not and you
> are stuck with what you see as an unfortunate situation... I wouldn't
> fault you for switching away from OpenVZ. The only thing that even
> comes close to a suitable candidate that I'm aware of is
> Linux-VServer. I haven't really been keeping up with Linux-VServer
> development but I believe that while they aren't interested in working
> with the mainline (they always want to be an out of tree patch), they
> do seem to be adapting Linux-VServer to each mainline kernel that
> comes out. The only problem is a new mainline kernel comes out every 3
> months and I'm not sure exactly how they balance that nor how
> successful they are at that. If anyone would like to provide a summary
> that explains exactly where Linux-VServer development is (or a link
> to such), I don't think a few emails about Linux-VServer on the OpenVZ
> Users mailing list is going to hurt anything... but of course if
> anyone complains, feel free to email me
> the info directly (dowdle at montanalinux.org).

I used to use Linux-vserver years and years ago, but when they broke IPv6
support moving from 1.x to 2.x I was forced to abandon Linux-vserver and
switch a number of VMs over to OpenVZ.  To this day IPv6 remains an
"experimental patch" for Linux-vserver and I see that question come up
on their list periodically, so I couldn't migrate back there, even if I
wanted to.  That being said, IPv6 support in the OpenVZ venet device is
nothing to brag about either and I have had to strictly use the veth
devices.

However...  There is a new kid on the block, depending on your
requirements.  Linux Containers or LXC.  It still has a few rough edges
and some differences with OpenVZ but has the big advantage that it's all
in the mainline kernel (2.6.29 and above), so no more patches (yeah!),
it is supported under libvirt, and the utilities are in the major
cutting edge distros like Fedora and Ubuntu.  I found that with a couple
of scripts, I could directly convert OpenVZ config files to LXC config
files and start my old OpenVZ containers as a container under LXC with
no further modification inside the container.  Other than a couple of
initial test containers I was experimenting with, once I got my scripts
settled down and tested, I migrated over 3 dozen VMs from OpenVZ to LXC
in a single day with none of the containers experiencing more than a
minute or so of down time (transfer time between hosts).  Because there
were no changes in the containers themselves, I could migrate them back,
if I needed to, just as fast.
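
A minimal sketch of what such a config conversion could look like.  This
is not the script described above, only an illustration.  It assumes the
OpenVZ container config is a shell-style KEY="value" file containing
HOSTNAME, VE_PRIVATE and a veth-style NETIF entry, and that the target
is the lxc.utsname / lxc.rootfs / lxc.network.* key style used by the
0.6.x utilities.  The function names, the "br0" bridge name and the
sample values are hypothetical.

```python
import shlex


def parse_openvz_conf(text, ctid=None):
    """Parse shell-style KEY="value" lines into a dict, skipping comments.

    If ctid is given, a literal $VEID in a value is replaced with it
    (assumption: that is the only variable the config relies on).
    """
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # strips surrounding quotes like a shell would
        value = parts[0] if parts else ""
        if ctid is not None:
            value = value.replace("$VEID", str(ctid))
        conf[key.strip()] = value
    return conf


def parse_netif(value):
    """Split a NETIF value ("k=v,k=v;k=v,...") into one dict per interface."""
    interfaces = []
    for entry in value.split(";"):
        if entry.strip():
            pairs = (item.split("=", 1) for item in entry.split(",") if "=" in item)
            interfaces.append(dict(pairs))
    return interfaces


def to_lxc_conf(conf, bridge="br0"):
    """Render the subset of settings this sketch understands as LXC config text."""
    lines = []
    if "HOSTNAME" in conf:
        lines.append("lxc.utsname = " + conf["HOSTNAME"])
    if "VE_PRIVATE" in conf:
        lines.append("lxc.rootfs = " + conf["VE_PRIVATE"])
    for nic in parse_netif(conf.get("NETIF", "")):
        lines.append("lxc.network.type = veth")
        lines.append("lxc.network.flags = up")
        lines.append("lxc.network.link = " + bridge)  # hypothetical host bridge
        if "ifname" in nic:
            lines.append("lxc.network.name = " + nic["ifname"])
        if "mac" in nic:
            lines.append("lxc.network.hwaddr = " + nic["mac"])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        'HOSTNAME="web1.example.org"\n'
        'VE_PRIVATE="/vz/private/$VEID"\n'
        'NETIF="ifname=eth0,mac=00:18:51:AA:BB:CC,host_ifname=veth101.0"\n'
    )
    print(to_lxc_conf(parse_openvz_conf(sample, ctid=101)))
```

Anything the sketch does not recognise (resource limits, venet
addresses, capabilities) is silently dropped, so a real conversion would
need to be checked against each container by hand.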

Because LXC requires 2.6.29 and OpenVZ is only available on 2.6.27 or
earlier, obviously you can't run them on the same machine and kernel.
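
As a rough illustration of that version cut-off, a host could be checked
before attempting a migration.  This sketch only compares the kernel
release string against 2.6.29; it does not verify that the kernel was
actually built with namespace and cgroup support.  The function names
are hypothetical.

```python
import platform
import re

LXC_MIN_KERNEL = (2, 6, 29)  # minimum mainline version named above


def kernel_version(release):
    """Return the leading numeric part of a release string as a 3-tuple.

    '2.6.32-5-amd64' -> (2, 6, 32); a missing third component counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) if part else 0 for part in match.groups())


def supports_lxc(release):
    """True if the release string is at or above the 2.6.29 cut-off."""
    return kernel_version(release) >= LXC_MIN_KERNEL


if __name__ == "__main__":
    running = platform.release()
    print(running, "meets the 2.6.29 cut-off:", supports_lxc(running))
```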

A lot of the OpenVZ developers and Linux-vserver developers have been
contributing to the containers effort in the kernel.


Some of the rough edges:

1) /proc/mounts shows mounts outside of the container (ugly but not
fatal).  Fixed in git.

2) Possible to break out of a container file system (related to #1
above).  It's possible to break out of chrooted jails.  Fixed in git by
using pivot_root.  This is serious and if you have potential hostiles in
a container, I would either hold off on LXC for now or use the utilities
from git.

3) Halt and Reboot of a container not working.  You have to manually
shut down and restart the container from the host.  Being worked on
right now.  I use a script that detects when there's only one process
running (init) in the container and the container runlevel is 0 or 6 to
decide to shut it down or restart it.  Ugly but works.

4) There still seems to be a lot of development work going on in the
kernel wrt checkpoint and restore.  Since I don't use that much, I
haven't paid that much attention but LXC does support freezing and
unfreezing containers.
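
The workaround in item 3 boils down to a small decision rule.  A minimal
sketch of that rule, not the actual script: it assumes the host can list
the container's processes from a cgroup "tasks" file (one PID per line)
and that the container's runlevel has already been obtained somehow, for
example from the container's utmp.  The tasks path and function names
are hypothetical.

```python
def count_tasks(tasks_path):
    """Count the PIDs listed in a cgroup 'tasks' file (one PID per line)."""
    with open(tasks_path) as handle:
        return sum(1 for line in handle if line.strip())


def decide_action(task_count, runlevel):
    """Return 'stop', 'restart', or None for a container in the given state.

    Only acts when a single process (assumed to be init) is left:
    runlevel 0 means the container halted, runlevel 6 means it asked
    to reboot.  Anything else is left alone.
    """
    if task_count != 1:
        return None
    if runlevel == "0":
        return "stop"
    if runlevel == "6":
        return "restart"
    return None


if __name__ == "__main__":
    # Hypothetical states a polling loop on the host might observe.
    for tasks, level in [(12, "3"), (1, "0"), (1, "6"), (1, "3")]:
        print(tasks, level, decide_action(tasks, level))
```

A polling loop on the host would call decide_action() every few seconds
and then shut down or restart the container from the host, as described
above.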


Differences:

* LXC supports virtual consoles you can connect to and log into
(lxc-console).

* LXC does not (yet) support the equivalent of "vzctl enter" (under
discussion - some possible patches).

* Does not have the same level of fine-grained resource control
available with OpenVZ and its user_beancounters (something that is not a
requirement for me).  There are some controls in the cgroups resources
but not as many.

* Handles the bridge management for the eth interfaces automatically, so
no need for extra config files in the host.

* You can not (yet) run a command in an already running container via
lxc-execute.  Where "vzctl exec" requires a running container,
lxc-execute starts a container and runs a single command in it
(reference back to the vzctl enter remarks).

* It looks like, if you wanted to really experiment, you could combine
LXC with unionfs / funionfs to do something similar to the Linux-vserver
"unify" to combine common binaries between containers into a common RO
layer.  I haven't tried this just yet, but soon now.


Primary disadvantage to LXC is that the utilities are at 0.6.4 from
SourceForge and 0.6.3 from Fedora and are still under very active
development and change.  Features and facilities are still subject to
discussion and change and it's not fully mature on that level.  I don't
know how long that will take but I wouldn't use anything less than
what's in their git repo right now or 0.6.5 or higher, when it comes
out.

Sooo...  If you WANT to run a newer leading edge distro like Ubuntu or
Fedora for your host system and you can deal with those differences, LXC
MIGHT be an option.  If you want to stick with an LTS distro like RHEL
or CentOS on the host or you need something that works just like OpenVZ,
then probably not.  There might be some complex configurations of
options and devices that will not migrate properly.  All you can do is
test and report any problems.  I am running some CentOS guests in
containers on Fedora 11 and Fedora 12 hosts.

> Now having said all of that I have to clarify with a few more points:
> 
: - Snip

> 3) Outside (of Parallels) developers could probably help the situation
> but such is the life of a large, complex out-of-tree kernel patch. It
> is hard for outsiders to get past the steep learning curve... and it
> is questionable how co-operative Parallels would be.

Exactly.  Which is why the burning need to get to a mainstream kernel.
Containers, namespaces, and cgroups in the kernel have matured to the
point where they are eminently usable.  OpenVZ should be able to start
taking advantage of them directly and begin to shrink the kernel
patch, if not eliminate it entirely.  Linux-vserver seems to already be
doing some of this and taking advantage of native namespaces and
cgroups.

I can't speak for the developers here but I would not be surprised if
this were a real reason for some of the lack of recent progress on newer
kernels.  Why invest the effort at all if you are going to be able to
take advantage of mainline features in the near future?  Skip the
transition period and get ready for the big jump.  Better to organize
and prepare for when it reaches that level of maturity.  I would like to
see OpenVZ running on a recent Linux kernel just using the whole cgroup
and namespaces facility, even if not all of the granular
user_beancounters are fully supported (and may never be fully supported
to that degree of granularity).

If I had the maturity and stability of the OpenVZ utilities running on
the mainline kernel using namespaces and cgroups and no custom patch,
that would be my ideal combination right now.

> Suno, I do appreciate you raising your concern and asking the
> question. It certainly is a valid one. I wonder if Kir will have a
> good answer for you or not. I hope he does rather than avoiding the
> question which is what I'd be tempted to do if I was him. :) I
> recognise your contributions to our (the OpenVZ) community and would
> prefer not to lose you.
> 
> TYL,

Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!