[Users] Re: New Kernel Patch

Michael H. Warfield mhw at WittsEnd.com
Mon Jan 18 10:26:59 EST 2010


Hey Suno et al...

On Mon, 2010-01-18 at 11:00 +0100, Suno Ano wrote:

:

> Michael> I found that with a couple of scripts, I could directly convert
> Michael> OpenVZ config files to LXC config files and start my old OpenVZ
> Michael> containers as a container under LXC with no further
> Michael> modification inside the container.

> Please provide your scripts to the public. I would love to see them,
> help improve things and maybe others will join in so nobody needs to be
> alone by switching to LXC.

I'm already working on this.  I don't want to wear out my welcome on
this mailing list by going too far off topic or extending this out too
far.  As soon as I clean some things up so they're not too embarrassing
to me, I'll put them up somewhere and make them available.  I think I'll
also suggest to the lxc maintainers that the time may have come for an
lxc-users mailing list.  :-)

:

> Here is what I found so far http://sysadmin-cookbook.rot13.org/#lxc , go
> down to ve2lxc. I have already started a very rough/ugly collection of
> bits and pieces of information for my personal matters which can be
> found at http://sunoano.name/ws/public_xhtml/linux_containers.html

I saw that site as well.  I'm a Fedora user so some of my stuff is
Fedora centric.  I have some Ubuntu installations and probably need to
do more work with that.

> Michael> Other than a couple of initial test containers I was
> Michael> experimenting with, once I got my scripts settled down and
> Michael> tested, I migrated over 3 dozen VMs from OpenVZ to LXC in a
> Michael> single day with none of the containers experiencing more than a
> Michael> minute or so of down time (transfer time between hosts).
> Michael> Because there were no changes in the containers themselves, I
> Michael> could migrate them back, if I needed to, just as fast.

> I want this! Tell us more please. Details sir ;-)

Not much to tell.  Copied the configuration files over to the target
host and then converted them into lxc configurations using the script I
mentioned.  Then, for each machine: rsync to copy the bulk of the data
and files that won't change, shut it down, do a final rsync to pick up
any last changes and get rid of the run files and locks, then start the
VM on the new host under LXC.  Rinse.  Repeat.
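
A minimal sketch of the per-container sequence described above, not the
actual script used.  It only builds the command lines; nothing is
executed.  The container ID, container name, host name and both
directory roots are hypothetical, and it assumes the copy is pushed from
the OpenVZ host to the LXC host over ssh.

```python
import shlex


def migration_commands(ctid, name, target_host,
                       src_root="/vz/private", dst_root="/var/lib/lxc"):
    """Return the ordered commands for moving one container.

    All paths and names are illustrative assumptions, not taken from
    the original setup.
    """
    src = f"{src_root}/{ctid}/"
    dst = f"{target_host}:{dst_root}/{name}/rootfs/"
    # --delete removes files on the target that no longer exist on the
    # source, which is what clears stale run files and locks on the
    # final pass after shutdown.
    rsync = ["rsync", "-aH", "--numeric-ids", "--delete", src, dst]
    return [
        rsync,                                    # 1. bulk copy, container still up
        ["vzctl", "stop", str(ctid)],             # 2. shut the container down
        rsync,                                    # 3. final pass, only the deltas
        ["ssh", target_host, "lxc-start", "-n", name, "-d"],  # 4. start under LXC
    ]


if __name__ == "__main__":
    for cmd in migration_commands(101, "web1", "newhost"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The downtime is only steps 2 through 4, since the first rsync has
already moved nearly everything while the container was running.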

> Michael> 1) /proc/mounts shows mounts outside of the container (ugly but
> Michael>    not fatal). Fixed in git.

> Is this true for kernels >= .32 ?

Yes.  It's fixed in lxc-start in git.  It's not dependent on the kernel.
Now, once lxc-start clones into the new namespace, it unmounts
everything that is outside of the new container chroot.
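
To illustrate the idea (this is not the lxc-start code, which is written
in C), here is a small sketch that takes the text of /proc/mounts and a
container root and works out which mount points lie outside that root,
deepest first so nested mounts would go before their parents.  The
function name and the example paths are hypothetical.

```python
def mounts_outside_root(proc_mounts_text, new_root):
    """List mount points not under new_root, deepest first.

    Octal escapes such as \\040 in mount points are not decoded here.
    """
    root = new_root.rstrip("/") or "/"
    prefix = root if root.endswith("/") else root + "/"
    outside = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        if mount_point == root or mount_point.startswith(prefix):
            continue  # belongs to the container, keep it
        outside.append(mount_point)

    def depth(path):
        return len([part for part in path.split("/") if part])

    # Deepest paths first, so children are handled before parents.
    return sorted(outside, key=depth, reverse=True)


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for mp in mounts_outside_root(fh.read(), "/var/lib/lxc/web1/rootfs"):
            print(mp)
```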

> Michael> 2) Possible to break out of a container file system (related to #1
> Michael> above). It's possible to break out of chrooted jails. Fixed in
> Michael> git by using pivot root. This is serious and if you have
> Michael> potential hostiles in a container, I wouldn't use LXC yet or
> Michael> use the utilities from git.

> Also, is this true for kernels >= .32 ?

Again, yes.  They fixed it by adding a pivot root after the chroot into
lxc-start to avoid the public chroot breakout exploits.

> Michael> 3) Halt and Reboot of a container not working. You have to
> Michael>    manually shut down and restart the container from the host.
> Michael>    Being worked on right now. I use a script that detects when
> Michael>    there's only one process running (init) in the container and
> Michael>    the container runlevel is 0 or 6 to decide to shut it down
> Michael>    or restart it. Ugly but works.

> Can you please provide the script/resolution you are using. This is still
> true for >= .32 yes? Hm, my containers started automatically when
> rebooting the host. I am on .32, Debian standard kernel in unstable:

I posted part of that over on lxc-devel.  I'll make it available as
well.  Starting containers when the host reboots is not the problem.
The problem is if someone does a reboot, halt, init 6, or init 0 in the
container.  The init process doesn't exit so the container ends up
sitting there running with a single process running in it.  Lot of
discussion on how to properly fix that.
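
The decision rule behind that workaround can be sketched as follows.
This is not the script posted to lxc-devel, just the logic as described:
only one process left in the container and a runlevel of 0 or 6.  How
the inputs are gathered is an assumption; the process list might come
from the container's cgroup tasks file and the runlevel from the
container's utmp, and the host would then act with lxc-stop and
lxc-start.

```python
def container_action(task_pids, runlevel):
    """Decide what the host should do with a container.

    task_pids: PIDs currently in the container (assumed to be read
               from its cgroup tasks file).
    runlevel:  the container's current runlevel as a string.
    Returns "stop", "restart", or None to leave it alone.
    """
    if len(task_pids) != 1:
        return None  # still shutting down, or running normally
    if runlevel == "0":
        return "stop"      # halt requested: only init is left
    if runlevel == "6":
        return "restart"   # reboot requested: stop, then start again
    return None


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(container_action([4242], "6"))
```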

> ,----[ uname -a ]
> | Linux wks 2.6.32-trunk-amd64 #1 SMP Sun Jan 10 22:40:40 UTC 2010 x86_64 GNU/Linux
> `----

:

> Michael> * Handles the bridge management for the eth interfaces
> Michael>   automatically, so no need for extra config files in the host.

> I hate bridges therefore I use lxc.network.type = macvlan which is the
> equivalent to OpenVZ's venet device ... basically a pipe-like connection
> between container and host. No bridge involved. Imho a bridge just
> complicates a setup and introduces an additional layer of indirection.

I tried the macvlan route.  It almost worked.  I found I could ping and
connect to the VM from another machine but not from the host machine
itself.  Weird.  May have been something peculiar in my particular setup
or configuration but bridges worked fine and I was already using bridges
and veth with OpenVZ and that made my conversion process easier as well.

> Michael> Primary disadvantage to LXC is that the utilities are at 0.6.4
> Michael> from SourceForge and 0.6.3 from Fedora and really really under
> Michael> active development and change. Features and facilities are
> Michael> still subject to discussion and change and it's not fully
> Michael> mature on that level. I don't know how long that will take but
> Michael> I wouldn't use anything less than what's in their git repo
> Michael> right now or 0.6.5 or higher, when it comes out.

> I had no problems on Debian testing/sid; stuff is quite recent here

> ,----[ dpkg -l lxc* | grep ii ]
> | ii  lxc   0.6.4-2       Linux containers userspace tools
> `----

See what you get if you cat /proc/mounts inside of one of those
containers.  My guess is that you'll see a lot of stuff you don't want
to see.  You'll also have the problem that you can break out of one of
those containers into the host file system using some of the public
exploits.

:

> Michael> I can't speak for the developers here but, I would not be
> Michael> surprised if this were a real reason for some of the lack of
> Michael> recent progress on newer kernels. Why invest the effort at all
> Michael> if you are going to be able to take advantage of mainline
> Michael> features in the near future? Skip the transition period and get
> Michael> ready for the big jump. Better to organize and prepare for when
> Michael> it reaches that level of maturity. I would like to see OpenVZ
> Michael> running on a recent linux kernel just using the whole cgroup
> Michael> and namespaces facility, even if not all of the granular
> Michael> user_beancounters are fully supported (and may never be fully
> Michael> supported to that degree of granularity).

> Yes, it is a paradox. Kir and the rest of the core developers around
> OpenVZ have contributed a ton of good stuff to cgroup, namespaces and
> the networking part, all of which is in mainline now, but then OpenVZ
> itself is horribly outdated (.27 where .32 is available).

Kir and Pavel et al have done great work over the years and I've been an
admirer and supporter of OpenVZ for a long time and appreciate
everything they accomplished.  I'm still using it where I can (like on
CentOS hosts) and probably will for a long time to come (RHEL 6 is a
long ways off as yet).

> On Debian I am on .26 on most servers around here and yesterday updating
> the host failed because the current udev version does not work with .26
> but only newer kernels. It now really is the time to act, even if it is
> a bit melancholic because OpenVZ is great, great but too old :-/

Concur

> Michael> If I had the maturity and stability of the OpenVZ utilities
> Michael> running on the mainline kernel using namespaces and cgroups and
> Michael> no custom patch, that would be my ideal combination right now.

> I totally agree even though my experience was that the utilities work
> fine. Mainline is mainline is mainline ... :-)

They work.  They just still need work.  They still have several rough
edges to them.

Currently, the lxc-fedora and lxc-debian examples are not working for me
on Fedora.  The lxc-debian script fails to build a container at all
(still looking at that) and the lxc-fedora one doesn't build a
functional container because it still has udev enabled in it.  I've been
building containers using the OpenVZ precreated templates with a few
modifications for udev and adding back the tty devices.  Still need to
play more with the Debian / Ubuntu containers on Fedora hosts both with
OpenVZ and LXC.

Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!