[Users] Re: I'm testing out vzpkg2!
Robert Nelson
robertn at the-nelsons.org
Sat Sep 27 15:55:45 EDT 2008
Scott Dowdle wrote:
> Robert,
>
> I've finished up the interview questions and you can find them here:
> http://www.montanalinux.org/files/robert-nelson-interview.html
>
*Who is Robert Nelson?*
/Please answer whatever questions you feel comfortable answering./
*Q:* Please tell me a little bit about yourself. Where are you from
(originally and now)? What is your educational background? What are your
hobbies? What is your family status (married, kids?)? What do you do for
a living?
I was born in Burnaby, British Columbia, Canada. Burnaby is a suburb of
Vancouver where Expo '86 was held. I spent most of my life living in
various cities all across Canada. In 1992 I moved to Seattle,
Washington in the United States to work at Microsoft. In 2004 I retired
from Microsoft and currently live in Bellevue, Washington a few miles
from the Microsoft Campus. My partner of 8 years and I don't have any
children but we have two miniature Dachshunds that think they are our
children.
Since retiring I've occupied my time managing my real estate investments
and contributing to open-source projects. Programming is probably the
closest thing I have to a hobby :-). Most of the open-source projects
I've been involved with have been directly or indirectly related to my
business.
*Q:* How long have you been programming? What programming languages do
you use / prefer? Are there any other software projects you are / have
been involved with that you would like to mention?
I started programming in 1973 when I used to skip high school to sneak
off to Simon Fraser University to play with the IBM 370/155 mainframe.
I was around so much that they offered me a summer job developing
courses using a CAI language they developed as an extension to APL. So
my first languages were APL/CAI and CourseWriter III. From there I
branched into PL/1 and System 360 Assembler.
Over the years I've learned and programmed in pretty much every
programming language ever developed, including some oldies but goodies
like Fortran and Cobol (Somehow I missed out on Algol). Most of my
professional life has been spent programming in C/C++ and various
machine languages.
Lately I've been working a lot in Perl, PHP, Python and Shell scripts
because they are the primary languages used in open-source projects.
I don't really have any preferences regarding specific programming
languages; I believe that they are all just tools to get the job done.
Some are better suited for certain jobs than others but they all have
strengths and weaknesses.
Ever since I started playing with Actor (a defunct language, like
Smalltalk but with a syntax similar to C++ rather than Pascal), I found
I prefer object-oriented ones. I find that object-oriented programming
is the best way to organize my thoughts and make large projects more
manageable. Even when I'm using straight C code I still organize the
code as if I were writing C++.
For most of my career I've worked on "system code": operating systems,
compilers and device drivers. When I was with Motorola I worked
primarily on proprietary and Unix SVR4 minicomputers. At Microsoft I
worked on the Interactive TV project, Windows CE and in the Windows NT
kernel group on the I/O subsystem and Plug and Play. Since leaving
Microsoft I've worked mainly on Linux. I avoid a religious attachment
to any platform; I feel that, like programming languages, each has its
own strengths and weaknesses. Sometimes I think that the acolytes on
all sides must be compensating for some physical shortcoming of their
own. :-)
The main open-source projects I've contributed to include Bacula, mtx,
FreePBX, and GForge. I've also contributed fixes to countless others.
I've contributed a few of the tools I've written to the open-source
community. Usually my involvement starts out with fixes for bugs that
hinder my use of the software. If there is some area that could be
improved to make the software much more useful for me then my
contribution might be larger. It really depends on my interest and how
easy it is to work with the other developers involved in the project.
But my involvement is usually selfish.
*Q:* How long have you been using OpenVZ? What other virtualization
products have you tried and do you use? What do you use OpenVZ for?
I've been using OpenVZ for over a year. My interest in virtualization
products sprang from my desire to get more use out of a dedicated server
I leased and my work on mtx (which I took over about a year or so ago)
which in turn was an offshoot of my involvement in Bacula. Both Bacula
and mtx required building and testing on a wide variety of operating
systems and versions. I found the process of installing and booting all
those operating systems tedious and was looking for a better solution
than filling my house with dedicated machines.
I started out with VMware. While it mostly met my needs, I prefer an
open-source solution where I can change the things I don't like and
there is more likelihood of someone contributing other useful tools.
I then switched my focus to Xen. It provided most of the functionality
of VMware albeit with reduced performance for Windows guests due to a
lack of PV drivers for video and disk.
That would have probably been the end of my virtualization quest were it
not for another requirement coming from my business. Since a number of
my investments are in Canada I've been using VoIP for a few years. I
had been running an Asterisk server in my house but found Comcast's
network somewhat unreliable in terms of latency. So after shopping
around I found a good deal on a dedicated server with great connectivity
and moved my Asterisk server there.
Based on the success of running the server there I decided it would be
nice to move other servers like my email server out of my house and on
to the server. The only drawback was that the system was somewhat
resource limited and the cost of increasing memory was a significant
monthly increase. So I went looking for a more resource efficient way
of running multiple virtual servers.
I looked at VServer and OpenVZ. VServer seemed unreliable and without
much of a community backing it up. OpenVZ fit the bill and I settled on
it. It has worked well on my dedicated server. I run three virtual
machines: a DNS server, an Asterisk server and a Zimbra server. I've
since replaced the rented dedicated server with a co-located one of my
own and added two additional virtual servers: Funambol for mobile sync
and EJBCA for a certificate authority.
As a result of my experience with OpenVZ I set up a virtual build
machine that runs about 16 variations of operating systems and versions
using a combination of Xen and OpenVZ. This allows me to release a
new version of mtx for two versions of Debian, three of Fedora, two
of CentOS/RHEL, two of FreeBSD, three of OpenSUSE, three of Ubuntu and
Windows for both 32 and 64 bits in an hour or so. That's a total of 32
different builds all using one machine with no reinstalls, rebooting or
other manual steps.
*The stock vzpkg*
*Q:* So, vzpkg used to work fairly well but over time, in certain
situations, it started to fail. What is wrong with the current version?
The main limitation of the current version is it was developed to
support RedHat distributions and is dependent on Yum/RPM. Another
limitation is that, due to the structure of the template meta data,
there was a lot of duplication of information resulting in extra
maintenance.
*vzpkg2 and pkg-cacher*
*Q:* You have added a number of features / capabilities to vzpkg. Could
you give us an overview of what's new?
I think the most significant change over the stock version of vzpkg is
the separation of the packager specific code from the higher level
code. This allows scripts to be written to support other package
managers like apt which is used on Debian and Ubuntu.
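As a rough illustration of that separation (a minimal Python sketch, not vzpkg2's actual shell code; the registry and function names are hypothetical), the higher level code can ask for an install command without knowing which packager sits behind it:

```python
# Hypothetical registry mapping a packager name to its backend function.
PACKAGERS = {}


def packager(name):
    """Decorator that registers a packager-specific backend under a name."""
    def register(func):
        PACKAGERS[name] = func
        return func
    return register


@packager("yum")
def yum_install(packages):
    # Packager-specific detail: how yum spells a non-interactive install.
    return ["yum", "-y", "install", *packages]


@packager("apt")
def apt_install(packages):
    # Packager-specific detail: how apt-get spells a non-interactive install.
    return ["apt-get", "-y", "install", *packages]


def install_command(packager_name, packages):
    """Higher level code: look up the backend and let it build the command."""
    try:
        build = PACKAGERS[packager_name]
    except KeyError:
        raise ValueError(f"no backend for packager {packager_name!r}") from None
    return build(list(packages))
```

Supporting another packager then means registering one more backend; the higher level function stays untouched.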
The other slightly less significant change is the introduction of the
concept of a hierarchical structure to the template meta data.
Information which is the same for all versions and platforms of a
distribution need only be specified once. If there is a need for
separate settings for a specific version it can be overridden by a file
lower in the template meta data tree.
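The override rule can be sketched roughly as follows. This is a minimal Python sketch under assumptions: the directory layout (distribution/version/platform), the file name settings.conf and the KEY=VALUE format are all hypothetical and are not taken from vzpkg2 itself.

```python
import tempfile
from pathlib import Path


def read_settings(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    settings = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def resolve_settings(root, leaf, filename="settings.conf"):
    """Merge settings from root down to leaf; deeper files override shallower ones."""
    levels = [root]
    current = root
    for part in leaf.relative_to(root).parts:
        current = current / part
        levels.append(current)
    merged = {}
    for level in levels:
        candidate = level / filename
        if candidate.is_file():
            merged.update(read_settings(candidate))
    return merged


def demo():
    """Build a throwaway tree where the version level overrides one setting."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "centos"
        leaf = root / "5" / "x86_64"
        leaf.mkdir(parents=True)
        (root / "settings.conf").write_text(
            "PACKAGER=yum\nMIRROR=http://mirror.example/centos\n"
        )
        (root / "5" / "settings.conf").write_text(
            "MIRROR=http://mirror.example/centos5\n"
        )
        return resolve_settings(root, leaf)
```

In the demo, PACKAGER is specified once at the distribution level, while the version directory overrides only MIRROR.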
Also, new packager-independent commands have been added for managing
packages in installed containers.
*Q:* You added a new package named pkg-cacher. Where did pkg-cacher come
from and what does it do exactly? Can pkg-cacher be used independently
of vzpkg2?
Most people managing multiple machines (physical or virtual) end up
installing a local mirror of some sort. This ranges from a subset of a
distribution like only the updates for a distribution on a single
platform to multiple mirrors of multiple distributions, versions and
platforms. These mirrors are generally maintained using rsync. They
are used to reduce bandwidth usage and installation time.
However the amount of disk space and bandwidth used to maintain these
mirrors can be quite significant, particularly when most of the
packages are never actually used in the target environment.
While looking for a solution to these drawbacks I came across apt-cacher
available with Debian. It is a server written in Perl that acts as a
transparent caching HTTP proxy. It processes requests from apt just
like an HTTP server but forwards them to a distribution server and keeps
the results in a cache, then uses it to respond to subsequent requests
for the same file. It knows about the different types of files:
packages versus packager meta data. Since a package is static once
released there is no need to check the server for updated versions
whereas meta data changes over time and the distribution server must be
checked for updates. It also understands that the data may be present
on multiple mirrors but the content will be the same regardless of which
mirror is used.
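The caching policy described above can be sketched like this. It is a minimal Python sketch, not apt-cacher's actual Perl code: the class, the two callables and the example URLs are hypothetical, and keying packages by file name alone assumes that names are unique across the repository.

```python
import posixpath
from urllib.parse import urlparse

PACKAGE_SUFFIXES = (".deb", ".rpm")


def is_package(url):
    """Packages are static once released; everything else counts as meta data."""
    return urlparse(url).path.endswith(PACKAGE_SUFFIXES)


def cache_key(url):
    """Packages share one entry across mirrors; meta data is keyed by full URL."""
    if is_package(url):
        return posixpath.basename(urlparse(url).path)
    return url


class PackageCache:
    def __init__(self, fetch, is_current):
        self.fetch = fetch            # callable(url) -> bytes from the upstream server
        self.is_current = is_current  # callable(url, cached_bytes) -> bool
        self.store = {}

    def get(self, url):
        key = cache_key(url)
        cached = self.store.get(key)
        if cached is not None:
            # A cached package is never rechecked; meta data must be revalidated.
            if is_package(url) or self.is_current(url, cached):
                return cached
        data = self.fetch(url)
        self.store[key] = data
        return data


def demo():
    """Return the list of URLs that actually reached the upstream server."""
    calls = []

    def fetch(url):
        calls.append(url)
        return b"payload"

    # Pretend the upstream meta data has always changed since the last fetch.
    cache = PackageCache(fetch, is_current=lambda url, data: False)
    cache.get("http://mirror-a.example/debian/pool/main/e/example_1.0-1_i386.deb")
    cache.get("http://mirror-b.example/debian/pool/main/e/example_1.0-1_i386.deb")
    cache.get("http://mirror-a.example/debian/dists/stable/Release")
    cache.get("http://mirror-a.example/debian/dists/stable/Release")
    return calls
```

In the demo the package is fetched once even though it is requested from two mirrors, while the meta data file goes back to the server on every request because the freshness check reports it as stale.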
General purpose caching proxies such as Squid may be used but they do
not understand the unique attributes of distribution repositories and
will duplicate files retrieved from different mirrors. They also rely
on the HTTP headers to decide retention policy rather than using the
packager meta data.
I used apt-cacher to handle my Debian distributions but wanted the same
functionality for my other distributions such as RedHat derived ones.
So I rewrote it as pkg-cacher.
Apt-cacher takes advantage of a key property of the Debian
distributions. The version and platform specific meta data is stored
separately from the packages. The packages for all versions are stored
in a single consolidated set of directories so there is no chance of two
packages having the same file name but different content.
However the same is not true of RedHat derived distributions. Each
version and platform has its own copy of the packages built using that
release and there are a number of packages with identical names but
different content. There are also packages which are unchanged from
version to version within a distribution as well as across distributions.
In order to deal with these differences pkg-cacher uses a different
directory structure for its cache.
Other significant differences from Debian are that the RedHat packager
uses the Range HTTP header to retrieve partial information from the
packages and that some distributions use HTTP redirects to transfer to a
mirror closest to the client. I have added support for both in
pkg-cacher.
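To make the Range handling concrete, here is a minimal Python sketch of answering a single-range request from an already cached file. It is an illustration only, not pkg-cacher's code; the function names are hypothetical, and as a simplification malformed and unsatisfiable ranges are both answered with 416.

```python
import re

# Matches the single-range forms "bytes=0-99", "bytes=100-" and "bytes=-50".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def slice_for_range(header, size):
    """Return inclusive (start, end) offsets for a single-range header, or None."""
    match = _RANGE_RE.match(header.strip())
    if not match:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the final N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        start, end = max(size - length, 0), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end:
        return None
    return start, end


def serve_range(cached, header):
    """Build (status, headers, body) for a ranged request against cached bytes."""
    span = slice_for_range(header, len(cached))
    if span is None:
        return 416, {"Content-Range": f"bytes */{len(cached)}"}, b""
    start, end = span
    body = cached[start:end + 1]
    return 206, {"Content-Range": f"bytes {start}-{end}/{len(cached)}"}, body
```

A caching proxy in this position has to have the file (or at least the requested part of it) available before it can answer, which is one reason partial requests need explicit support rather than being passed through blindly.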
Pkg-cacher is designed to be a standalone tool separate from the new
vzpkg2. However its use complements vzpkg2 and the default installation
of vzpkg2 depends on it.
*Q:* How does pkg-cacher enhance vzpkg2?
The original vzpkg reduces the downloads by pointing yum's cache at a
directory within the template meta data tree. While this was a step in
the right direction, it still meant duplication across platforms. It
also provided no benefit to installed containers.
Pkg-cacher provides the benefits described in the previous question for
producing cached templates as well as installed containers.
*Q:* Does pkg-cacher come into play from the perspective of the containers?
The default templates included as part of vzpkg2 configure the template
meta data so that it uses the pkg-cacher server configured in vzpkg.conf
as VZPKG_CACHE_HOST. The operating system installed configuration files
are disabled by renaming them with a .disabled suffix and a new
configuration file is installed pointing to the pkg-cacher server.
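For a yum-based container, that step might look roughly like the following Python sketch. Everything specific here is an assumption made for illustration: the file name pkg-cacher.repo, the example host, and the URL path under the cache server are hypothetical, and vzpkg2 itself does this with shell scripts.

```python
import tempfile
from pathlib import Path


def point_at_cache(repo_dir, cache_host, new_name="pkg-cacher.repo"):
    """Rename existing *.repo files to *.repo.disabled and write one for the cache."""
    disabled = []
    for conf in sorted(repo_dir.glob("*.repo")):
        target = conf.with_name(conf.name + ".disabled")
        conf.rename(target)
        disabled.append(target.name)
    # Hypothetical URL layout; the path the cache server expects may differ.
    (repo_dir / new_name).write_text(
        "[base]\n"
        "name=Base packages via local cache\n"
        f"baseurl=http://{cache_host}/centos/$releasever/os/$basearch/\n"
        "enabled=1\n"
    )
    return disabled


def demo():
    """Run the switch-over against a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = Path(tmp)
        (repo_dir / "CentOS-Base.repo").write_text("[base]\nenabled=1\n")
        disabled = point_at_cache(repo_dir, "cache.example:3142")
        remaining = sorted(p.name for p in repo_dir.iterdir())
        return disabled, remaining
```

Renaming rather than deleting keeps the distribution's original files around, so the change can be undone by reversing the rename.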
*Q:* What container configuration changes have to be made in order for a
container to use the services provided by pkg-cacher?
Generally all that needs to be done is change the name of the server in
the packager configuration files. This is done automatically for
containers installed from cached templates generated by vzpkg2.
*Q:* Does one have to use pkg-cacher in order to use vzpkg2?
No, all that is required is modifying the vzpkg.conf files located in
the template meta data to use another proxy server, the original
distribution servers or even a copy of the distribution server in the
local filesystem.
*Q:* Are there any features in vzpkg2 that you can't use without
pkg-cacher?
No, pkg-cacher supplements vzpkg2, providing more efficient use of resources.
*A vision of the future*
*Q:* What is the next step? After vzpkg2 has had a bit more community
testing and you have gotten feedback and made any additional changes to
it that are needed, is the plan for it to replace the official vzpkg or
would you prefer it to stay an independent / separate app?
I would like to see the vzpkg2 changes incorporated into OpenVZ and
replace the current outdated vzpkg. I anticipate that pkg-cacher will
always remain a separate tool because of its general usefulness.
*Q:* Are there any features you haven't added to vzpkg2 (or pkg-cacher)
yet that you hope to implement in the not too distant future?
There are still a number of features of apt-cacher which I haven't
rewritten to work with pkg-cacher. These are primarily in the area of
maintenance of the cache, such as removal of packages which are no
longer referenced by the packager meta data. I also plan on eliminating
the dependency on the lockfile utility included as part of procmail.
This was a dependency that I didn't realize was there until it was
brought to my attention recently. The final planned change is
conversion from a multiple process to a multithreaded application for
improved efficiency. A minor configuration change is the port that
pkg-cacher uses. It currently uses port 3142 which apt-cacher used.
However that port is actually registered to something else but near as I
can tell isn't actually used for its intended purpose. In the interest
of being a good netizen, I currently have a registration request pending
with the IANA for a port specifically for pkg-cacher's use.
For vzpkg2 there are two significant remaining work items. First is the
completion of the manual pages for all the commands. The second is
support for a way of specifying the included packages incrementally.
Currently package lists further down in the template meta data tree
replace those above. Ideally it would be useful to say for example
that, in a specific version, package X shouldn't be included but package
Y should. Also it would be nice to be able to include the processed
list of packages from another list. For example you would be able to
say that the "web server" configuration is all the packages included in
the "small" configuration with packages X, Y and Z added.
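Since this feature is only planned, the following is purely a sketch of how such incremental lists could be resolved. The "@name" include syntax and the "-package" removal syntax are invented for this Python illustration and are not vzpkg2 syntax.

```python
def resolve_packages(name, configs, _seen=()):
    """Expand a package list, following '@other' includes and '-pkg' removals."""
    if name in _seen:
        raise ValueError(f"circular include: {name}")
    packages = []
    for entry in configs[name]:
        if entry.startswith("@"):
            # Pull in the fully processed list of another configuration.
            for pkg in resolve_packages(entry[1:], configs, _seen + (name,)):
                if pkg not in packages:
                    packages.append(pkg)
        elif entry.startswith("-"):
            # Drop a package that an earlier entry or include brought in.
            if entry[1:] in packages:
                packages.remove(entry[1:])
        elif entry not in packages:
            packages.append(entry)
    return packages


# Hypothetical configurations: "web server" is "small" minus one package plus two.
EXAMPLE_CONFIGS = {
    "small": ["bash", "coreutils", "sendmail"],
    "web server": ["@small", "-sendmail", "httpd", "php"],
}
```

With these example lists, resolving "web server" starts from everything in "small", removes sendmail and then adds httpd and php, instead of replacing the inherited list outright.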
*Q:* What limitations, if any, do you see with vzpkg2 and what would
your perfect OS Template manager be like?
I think the only significant limitation of vzpkg2 is its implementation
using shell scripts. Some functions would be much easier to implement
in Perl or Python. As far as functionality I believe that, with the
additions described in the previous answer, it addresses all my needs
for a package manager. I'm happy to entertain any suggestions others
might have.
I suppose the obvious area that others might take exception to is the
lack of support for either OpenSUSE or Gentoo. For OpenSUSE I've
created templates for version 10.x; however, in 11.x the OpenSUSE folks
modified rpm in a way completely incompatible with the upstream version as
well as the version supplied on every other distribution. They did this
for an increase in compression ratios whose benefit is far outweighed by
problems caused by the incompatibility with the rest of the world and
even previous versions of their own distribution. They created this
incompatible version of rpm without renaming the tool. Because of this
incredible lack of judgment on their part (IMHO) I haven't bothered to
support it. That doesn't stop anyone else, with a need, from porting
their version of rpm to other distributions, renaming it something like
rpmSUSE and creating the appropriate packager specific scripts for
vzpkg2. One of the main benefits of vzpkg2 over vzpkg is that it is
extensible merely by adding additional scripts specific to the packager
used on the new distribution.
There is no technical reason why Gentoo couldn't be supported by the
current vzpkg2 other than I just haven't gotten around to writing the
interface scripts for emerge. I concentrated on yum/rpm and apt/dpkg to
ensure that I had the right level of abstraction to deal with two very
different packaging solutions. I also figured that supporting RedHat
and Debian based distributions covered the vast majority of users.
*Q:* Given the vast number of changes from vzpkg to vzpkg2, and the
addition of pkg-cacher... I see a need for a bit of updated / additional
documentation. How is that going? All done? Need some help?
Documentation is one area where there is always need for more, better,
more concise, ... (fill in your favorite adjective here). I am working
to create additional manual pages for all the commands. But if someone
wants to volunteer, particularly on creating higher level, user
friendlier docs I think that would be great.
*In conclusion*
Robert, thank you... for your work on vzpkg2 and pkg-cacher... and for
the time you put into answering my questions. One last one for you
though...
*Q:* Are there any topics I've overlooked that you'd like to mention or
any additional comments you'd like to make?
None that I can think of.