[Users] Re: I'm testing out vzpkg2!

Robert Nelson robertn at the-nelsons.org
Sat Sep 27 15:55:45 EDT 2008


Scott Dowdle wrote:
> Robert,
>
> I've finished up the interview questions and you can find them here:
> http://www.montanalinux.org/files/robert-nelson-interview.html
>   

*Who is Robert Nelson?*

/Please answer whatever questions you feel comfortable answering./

*Q:* Please tell me a little bit about yourself. Where are you from 
(originally and now)? What is your educational background? What are your 
hobbies? What is your family status (married, kids?)? What do you do for 
a living?

I was born in Burnaby, British Columbia, Canada.  Burnaby is a suburb of 
Vancouver where Expo '86 was held.  I spent most of my life living in 
various cities all across Canada.  In 1992 I moved to Seattle, 
Washington in the United States to work at Microsoft.  In 2004 I retired 
from Microsoft and currently live in Bellevue, Washington a few miles 
from the Microsoft Campus.  My partner of 8 years and I don't have any 
children but we have two miniature Dachshunds that think they are our 
children.

Since retiring I've occupied my time managing my real estate investments 
and contributing to open-source projects.  Programming is probably the 
closest thing I have to a hobby :-).  Most of the open-source projects 
I've been involved with have been directly or indirectly related to my 
business.

*Q:* How long have you been programming? What programming languages do 
you use / prefer? Are there any other software projects you are / have 
been involved with that you would like to mention?

I started programming in 1973 when I used to skip high school to sneak 
off to Simon Fraser University to play with the IBM 370/155 mainframe.  
I was around so much that they offered me a summer job developing 
courses using a CAI language they developed as an extension to APL.  So 
my first languages were APL/CAI and CourseWriter III.  From there I 
branched into PL/1 and System 360 Assembler.

Over the years I've learned and programmed in pretty much every 
programming language ever developed, including some oldies but goodies 
like Fortran and Cobol (Somehow I missed out on Algol).  Most of my 
professional life has been spent programming in C/C++ and various 
machine languages.

Lately I've been working a lot in Perl, PHP, Python and Shell scripts 
because they are the primary languages used in open-source projects.

I don't really have any preferences regarding specific programming 
languages.  I believe that they are all just tools to get the job done.  
Some are better suited for certain jobs than others but they all have 
strengths and weaknesses.

Ever since I started playing with Actor (a defunct language, like 
Smalltalk but with a syntax similar to C++ rather than Pascal), I found 
I prefer object-oriented ones.  I find that object-oriented programming 
is the best way to organize my thoughts and make large projects more 
manageable.  Even when I'm using straight C code I still organize the 
code as if I was writing C++.

For most of my career I've worked on "system code": operating systems, 
compilers and device drivers.  When I was with Motorola I worked 
primarily on proprietary and Unix SVR4 minicomputers.  At Microsoft I 
worked on the Interactive TV project, Windows CE and in the Windows NT 
kernel group on the I/O subsystem and Plug and Play.  Since leaving 
Microsoft I've worked mainly on Linux.  I avoid a religious attachment 
to any platform; I feel that, like programming languages, each has its 
own strengths and weaknesses.  Sometimes I think that the acolytes on 
all sides must be compensating for some physical shortcoming of their 
own. :-)

The main open-source projects I've contributed to include Bacula, mtx, 
FreePBX, and GForge.  I've also contributed fixes to countless others.  
I've contributed a few of the tools I've written to the open-source 
community.  Usually my involvement starts out with fixes for bugs that 
hinder my use of the software.  If there is some area that could be 
improved to make the software much more useful for me then my 
contribution might be larger.  It really depends on my interest and how 
easy it is to work with the other developers involved in the project.  
But my involvement is usually selfish.

*Q:* How long have you been using OpenVZ? What other virtualization 
products have you tried and do you use? What do you use OpenVZ for?

I've been using OpenVZ for over a year.  My interest in virtualization 
products sprang from my desire to get more use out of a dedicated server 
I leased and my work on mtx (which I took over about a year or so ago) 
which in turn was an offshoot of my involvement in Bacula.  Both Bacula 
and mtx required building and testing on a wide variety of operating 
systems and versions.  I found the process of installing and booting all 
those operating systems tedious and was looking for a better solution 
than filling my house with dedicated machines.

I started out with VMware.  While it mostly met my needs, I prefer an 
open-source solution where I can change the things I don't like and 
there is a greater likelihood of someone contributing other useful tools.  
I then switched my focus to Xen.  It provided most of the functionality 
of VMware albeit with reduced performance for Windows guests due to a 
lack of PV drivers for video and disk. 

That would have probably been the end of my virtualization quest were it 
not for another requirement coming from my business.  Since a number of 
my investments are in Canada I've been using VoIP for a few years.  I 
had been running an Asterisk server in my house but found Comcast's 
network somewhat unreliable in terms of latency.  So after shopping 
around I found a good deal on a dedicated server with great connectivity 
and moved my Asterisk server there.

Based on the success of running the server there I decided it would be 
nice to move other servers like my email server out of my house and on 
to the server.  The only drawback was that the system was somewhat 
resource limited and the cost of increasing memory was a significant 
monthly increase.  So I went looking for a more resource efficient way 
of running multiple virtual servers.

I looked at VServer and OpenVZ.  VServer seemed unreliable and without 
much of a community backing it up.  OpenVZ fit the bill and I settled on 
it.  It has worked well on my dedicated server.  I run three virtual 
machines: a DNS server, an Asterisk server and a Zimbra server.  I've 
since replaced the rented dedicated server with a co-located one of my 
own and added two additional virtual servers: Funambol for mobile sync 
and EJBCA for a certificate authority.

As a result of my experience with OpenVZ I set up a virtual build 
machine that runs about 16 variations of operating systems and versions 
using a combination of Xen and OpenVZ.  This allows me to release a 
new version of mtx for two versions of Debian, three of Fedora, two 
of CentOS/RHEL, two of FreeBSD, three of OpenSUSE, three of Ubuntu and 
Windows for both 32 and 64 bits in an hour or so.  That's a total of 32 
different builds all using one machine with no reinstalls, rebooting or 
other manual steps.

*The stock vzpkg*

*Q:* So, vzpkg used to work fairly well but over time, in certain 
situations, it started to fail. What is wrong with the current version?

The main limitation of the current version is that it was developed to 
support RedHat distributions and is dependent on Yum/RPM.  Another 
limitation is that, due to the structure of the template meta data, 
there was a lot of duplication of information resulting in extra 
maintenance.

*vzpkg2 and pkg-cacher*

*Q:* You have added a number of features / capabilities to vzpkg. Could 
you give us an overview of what's new?

I think the most significant change over the stock version of vzpkg is 
the separation of the packager specific code from the higher level 
code.  This allows scripts to be written to support other package 
managers like apt, which is used on Debian and Ubuntu.  

The other slightly less significant change is the introduction of the 
concept of a hierarchical structure to the template meta data.  
Information which is the same for all versions and platforms of a 
distribution need only be specified once.  If there is a need for 
separate settings for a specific version it can be overridden by a file 
lower in the template meta data tree.
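
The override rule described above can be illustrated with a minimal 
Python sketch.  It is not taken from vzpkg2 (which is implemented as 
shell scripts); the directory layout, the file name and the function 
name resolve_setting are all assumptions made for the example.  The idea 
shown is only that the most specific file found between the top of the 
tree and a given version/platform directory wins.

```python
from pathlib import Path
from typing import Optional


def resolve_setting(root: Path, leaf: Path, filename: str) -> Optional[str]:
    """Return the contents of the deepest `filename` between root and leaf.

    Hypothetical helper: a file lower in the tree overrides the same
    file higher up; if no level provides the file, None is returned.
    """
    # Build the chain of directories from the root down to the leaf.
    chain = [root]
    current = root
    for part in leaf.relative_to(root).parts:
        current = current / part
        chain.append(current)

    # Search from most specific to most general; the first hit wins.
    for directory in reversed(chain):
        candidate = directory / filename
        if candidate.is_file():
            return candidate.read_text().strip()
    return None
```

With a tree such as fedora/9/x86_64, a setting stored once under fedora 
applies to every version and platform until a file of the same name is 
placed under fedora/9.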

Also new packager-independent commands have been added for managing 
packages in installed containers.

*Q:* You added a new package named pkg-cacher. Where did pkg-cacher come 
from and what does it do exactly? Can pkg-cacher be used independently 
of vzpkg2?

Most people managing multiple machines (physical or virtual) end up 
installing a local mirror of some sort.  This ranges from a subset of a 
distribution like only the updates for a distribution on a single 
platform to multiple mirrors of multiple distributions, versions and 
platforms.  These mirrors are generally maintained using rsync.  They 
are used to reduce bandwidth usage and installation time.

However, the amount of disk space and bandwidth used to maintain these 
mirrors can be quite significant, particularly when most of the 
packages are never actually used in the target environment.

While looking for a solution to these drawbacks I came across apt-cacher 
available with Debian.  It is a server written in Perl that acts as a 
transparent caching HTTP proxy.  It processes requests from apt just 
like an HTTP server but forwards them to a distribution server and keeps 
the results in a cache, then uses it to respond to subsequent requests 
for the same file.  It knows about the different types of files: 
packages versus packager meta data.  Since a package is static once 
released there is no need to check the server for updated versions 
whereas meta data changes over time and the distribution server must be 
checked for updates.  It also understands that the data may be present 
on multiple mirrors but the content will be the same regardless of which 
mirror is used.
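
The distinction between static packages and changing meta data can be 
sketched roughly as follows.  This is an illustration in Python, not 
apt-cacher's Perl code: the suffix list, the one-hour age limit and the 
function names are assumptions, and a simple age check stands in for 
whatever revalidation the real tool performs against the distribution 
server.

```python
from typing import Optional

# Assumed for illustration: files with these suffixes never change once released.
PACKAGE_SUFFIXES = (".deb", ".rpm")
# Assumed value: how long cached meta data is trusted before re-checking upstream.
METADATA_MAX_AGE = 3600  # seconds


def is_package(path: str) -> bool:
    """True for released package files, False for everything else (meta data)."""
    return path.endswith(PACKAGE_SUFFIXES)


def needs_upstream_check(path: str, cached_at: Optional[float], now: float) -> bool:
    """Decide whether the proxy must contact the distribution server.

    cached_at is the time the file entered the cache, or None if absent.
    """
    if cached_at is None:
        return True   # nothing cached yet, must fetch
    if is_package(path):
        return False  # a released package is static, serve it from the cache
    # Meta data changes over time, so it is re-checked once it is too old.
    return now - cached_at > METADATA_MAX_AGE
```

A cached .deb is served without contacting the server no matter how old 
it is, while an index file such as dists/etch/Release is re-checked once 
it exceeds the age limit.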

General purpose caching proxies such as Squid may be used but they do 
not understand the unique attributes of distribution repositories and 
will duplicate files retrieved from different mirrors.  They also rely 
on the HTTP headers to decide retention policy rather than using the 
packager meta data.

I used apt-cacher to handle my Debian distributions but wanted the same 
functionality for my other distributions such as RedHat derived ones.  
So I rewrote it as pkg-cacher.

Apt-cacher takes advantage of a key property of the Debian 
distributions.  The version and platform specific meta data is stored 
separately from the packages.  The packages for all versions are stored 
in a single consolidated set of directories so there is no chance of two 
packages having the same file name but different content.

However the same is not true of RedHat derived distributions.  Each 
version and platform has its own copy of the packages built using that 
release and there are a number of packages with identical names but 
different content.  There are also packages which are unchanged from 
version to version within a distribution as well as across distributions.

In order to deal with these differences pkg-cacher uses a different 
directory structure for its cache.
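
The actual cache layout is not described here, so the following Python 
sketch only illustrates why the layout has to differ.  Both functions 
and both layouts are hypothetical.  In each case the mirror host is 
dropped, so the same file fetched from two mirrors is stored once; the 
second function additionally scopes the entry by distribution, version 
and platform so that identically named packages with different content 
cannot collide.

```python
import posixpath
from urllib.parse import urlsplit


def flat_cache_path(url: str) -> str:
    """Debian-style assumption: the file name alone identifies the content."""
    name = posixpath.basename(urlsplit(url).path)
    return posixpath.join("packages", name)


def scoped_cache_path(url: str, distro: str, version: str, arch: str) -> str:
    """RedHat-style assumption: the same file name may differ per release,
    so the cache entry is scoped by distribution, version and platform."""
    name = posixpath.basename(urlsplit(url).path)
    return posixpath.join(distro, version, arch, name)
```

Note that this naive scoping would store a package twice when it is 
unchanged from one version to the next; how pkg-cacher deals with that 
case is not covered by this sketch.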

Other significant differences from Debian are that the RedHat packager 
uses the Range HTTP header to retrieve partial information from the 
packages and that some distributions use HTTP redirects to transfer to 
a mirror closest to the client.  I have added support for both in 
pkg-cacher.
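
To show what honoring a Range request involves, here is a small Python 
sketch that turns a single-range header such as "bytes=0-99" into byte 
offsets within a cached file.  It is an assumption-laden illustration, 
not pkg-cacher's code: the function name is made up, and multi-range 
requests are simply treated as unsupported.

```python
import re
from typing import Optional, Tuple

# Matches the single-range forms "bytes=A-B", "bytes=A-" and "bytes=-N".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) offsets, or None if the range is
    malformed, unsupported, or cannot be satisfied for a file of `size` bytes."""
    match = _RANGE_RE.match(header.strip())
    if not match:
        return None
    first, last = match.groups()
    if not first and not last:
        return None
    if not first:
        # Suffix form: the final `last` bytes of the file.
        length = int(last)
        if length == 0:
            return None
        start, end = max(size - length, 0), size - 1
    else:
        start = int(first)
        # An open-ended or overlong range is clamped to the end of the file.
        end = min(int(last), size - 1) if last else size - 1
    if start > end or start >= size:
        return None
    return start, end
```

A proxy would then answer with the slice data[start:end + 1] and a 206 
Partial Content status, or fall back to an error or the full file when 
the function returns None.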

Pkg-cacher is designed to be a standalone tool separate from the new 
vzpkg2.  However its use complements vzpkg2 and the default installation 
of vzpkg2 depends on it.

*Q:* How does pkg-cacher enhance vzpkg2?

The original vzpkg reduces the downloads by pointing yum's cache at a 
directory within the template meta data tree.  While this was a step in 
the right direction, it still meant duplication across platforms.  It 
also provided no benefit to installed containers.

Pkg-cacher provides the benefits described in the previous question for 
producing cached templates as well as installed containers.

*Q:* Does pkg-cacher come into play from the perspective of the containers?

The default templates included as part of vzpkg2 configure the template 
meta data so that it uses the pkg-cacher server configured in vzpkg.conf 
as VZPKG_CACHE_HOST.  The operating system installed configuration files 
are disabled by renaming them with a .disabled suffix and a new 
configuration file is installed pointing to the pkg-cacher server.
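
The rename-and-replace step might look roughly like the Python sketch 
below for a yum-style configuration directory.  It is not the vzpkg2 
implementation (which is shell): the function name, the repository id 
and the URL layout on the cache server are assumptions chosen for the 
example.

```python
from pathlib import Path


def point_at_cache(config_dir: Path, cache_host: str,
                   repo_id: str, repo_path: str) -> Path:
    """Disable the stock *.repo files and install one that uses the cache.

    cache_host stands for the value of VZPKG_CACHE_HOST; repo_id and
    repo_path are hypothetical and depend on the distribution.
    """
    # Rename each stock file so the packager no longer reads it.
    for repo_file in sorted(config_dir.glob("*.repo")):
        repo_file.rename(repo_file.with_name(repo_file.name + ".disabled"))

    # Install a single new file that points at the caching server.
    new_file = config_dir / f"{repo_id}.repo"
    new_file.write_text(
        f"[{repo_id}]\n"
        f"name={repo_id} via package cache\n"
        f"baseurl=http://{cache_host}/{repo_path}\n"
        "enabled=1\n"
    )
    return new_file
```

Because the stock files are renamed rather than deleted, the change can 
be undone by removing the new file and stripping the .disabled suffix.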

*Q:* What container configuration changes have to be made in order for a 
container to use the services provided by pkg-cacher?

Generally all that needs to be done is to change the name of the server 
in the packager configuration files.  This is done automatically for 
containers installed from cached templates generated by vzpkg2.

*Q:* Does one have to use pkg-cacher in order to use vzpkg2?

No, all that is required is modifying the vzpkg.conf files located in 
the template meta data to use another proxy server, the original 
distribution servers or even a copy of the distribution server in the 
local filesystem.

*Q:* Are there any features in vzpkg2 that you can't use without 
pkg-cacher?

No, pkg-cacher supplements vzpkg2 providing more efficient use of resources.

*A vision of the future*

*Q:* What is the next step? After vzpkg2 has had a bit more community 
testing and you have gotten feedback and made any additional changes to 
it that are needed, is the plan for it to replace the official vzpkg or 
would you prefer it to stay an independent / separate app?

I would like to see the vzpkg2 changes incorporated into OpenVZ and 
replace the current outdated vzpkg.  I anticipate that pkg-cacher will 
always remain a separate tool because of its general usefulness.

*Q:* Are there any features you haven't added to vzpkg2 (or pkg-cacher) 
yet that you hope to implement in the not too distant future?

There are still a number of features of apt-cacher which I haven't 
rewritten to work with pkg-cacher.  These are primarily in the area of 
maintenance of the cache, such as removal of packages which are no 
longer referenced by the packager meta data.  I also plan on eliminating 
the dependency on the lockfile utility included as part of procmail.  
This was a dependency that I didn't realize was there until it was 
brought to my attention recently.  The final planned change is 
conversion from a multiple process to a multithreaded application for 
improved efficiency.  A minor configuration change is the port that 
pkg-cacher uses.  It currently uses port 3142 which apt-cacher used.  
However that port is actually registered to something else but, as near 
as I can tell, isn't actually used for its intended purpose.  In the 
interest of being a good netizen, I currently have a registration 
request pending with the IANA for a port specifically for pkg-cacher's 
use.
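
The cache maintenance mentioned above, removing packages that the 
packager meta data no longer references, could be approached along the 
lines of this Python sketch.  It assumes a Debian-style Packages index, 
where each entry carries a "Filename:" field, and a cache keyed by bare 
file name; the function names are made up and nothing here is taken 
from apt-cacher or pkg-cacher.

```python
from typing import Iterable, List, Set


def referenced_filenames(packages_index: str) -> Set[str]:
    """Collect the file names listed in a Debian-style Packages index."""
    names = set()
    for line in packages_index.splitlines():
        if line.startswith("Filename:"):
            path = line.split(":", 1)[1].strip()
            names.add(path.rsplit("/", 1)[-1])  # keep only the file name
    return names


def prune_candidates(cached_files: Iterable[str], packages_index: str) -> List[str]:
    """Return cached package files that the index no longer mentions."""
    return sorted(set(cached_files) - referenced_filenames(packages_index))
```

In practice the referenced set would have to be the union over every 
index the cache serves before anything is deleted.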

For vzpkg2 there are two significant remaining work items.  First is the 
completion of the manual pages for all the commands.  The second is 
support for a way of specifying the included packages incrementally.

Currently package lists further down in the template meta data tree 
replace those above.  Ideally it would be useful to say for example 
that, in a specific version, package X shouldn't be included but package 
Y should.  Also it would be nice to be able to include the processed 
list of packages from another list.  For example you would be able to 
say that the "web server" configuration is all the packages included in 
the "small" configuration with packages X, Y and Z added.
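
One way the incremental lists described above could behave is sketched 
below in Python.  The feature does not exist yet, so everything here is 
hypothetical: the "@name" include marker, the "+" and "-" prefixes and 
the function name are invented for the illustration.

```python
from typing import Dict, FrozenSet, List


def resolve_packages(name: str, lists: Dict[str, List[str]],
                     _seen: FrozenSet[str] = frozenset()) -> List[str]:
    """Expand a package list written with a hypothetical incremental syntax.

    "@other" includes the processed list `other`, "-pkg" removes a
    package, and "+pkg" or a bare "pkg" adds one.  Order is preserved.
    """
    if name in _seen:
        raise ValueError(f"circular include: {name}")
    seen = _seen | {name}

    result: List[str] = []
    for entry in lists[name]:
        if entry.startswith("@"):
            # Pull in the fully processed contents of another list.
            for pkg in resolve_packages(entry[1:], lists, seen):
                if pkg not in result:
                    result.append(pkg)
        elif entry.startswith("-"):
            if entry[1:] in result:
                result.remove(entry[1:])
        else:
            pkg = entry[1:] if entry.startswith("+") else entry
            if pkg not in result:
                result.append(pkg)
    return result
```

For example, a "web server" list written as ["@small", "+httpd", "+php"] 
would expand to everything in "small" followed by httpd and php.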

*Q:* What limitations, if any, do you see with vzpkg2 and what would 
your perfect OS Template manager be like?

I think the only significant limitation of vzpkg2 is its implementation 
using shell scripts.  Some functions would be much easier to implement 
in Perl or Python.  As far as functionality I believe that, with the 
additions described in the previous answer, it addresses all my needs 
for a package manager.  I'm happy to entertain any suggestions others 
might have. 

I suppose the obvious area that others might take exception to is the 
lack of support for either OpenSUSE or Gentoo.  For OpenSUSE I've 
created templates for version 10.x; however, in 11.x the OpenSUSE folks 
modified rpm in a way completely incompatible with the upstream version as 
well as the version supplied on every other distribution.  They did this 
for an increase in compression ratios whose benefit is far outweighed by 
problems caused by the incompatibility with the rest of the world and 
even previous versions of their own distribution.  They created this 
incompatible version of rpm without renaming the tool.  Because of this 
incredible lack of judgment on their part (IMHO) I haven't bothered to 
support it.  That doesn't stop anyone else, with a need, from porting 
their version of rpm to other distributions, renaming it something like 
rpmSUSE and creating the appropriate packager specific scripts for 
vzpkg2.  One of the main benefits of vzpkg2 over vzpkg is that it is 
extensible merely by adding additional scripts specific to the packager 
used on the new distribution.

There is no technical reason why Gentoo couldn't be supported by the 
current vzpkg2 other than I just haven't gotten around to writing the 
interface scripts for emerge.  I concentrated on yum/rpm and apt/dpkg to 
ensure that I had the right level of abstraction to deal with two very 
different packaging solutions.  I also figured that supporting RedHat 
and Debian based distributions covered the vast majority of users.

*Q:* Given the vast amount of changes from vzpkg to vzpkg2, and the 
addition of pkg-cacher... I see a need for a bit of updated / additional 
documentation. How is that going? All done? Need some help?

Documentation is one area where there is always need for more, better, 
more concise, ... (fill in your favorite adjective here).  I am working 
to create additional manual pages for all the commands.  But if someone 
wants to volunteer, particularly on creating higher level, user 
friendlier docs I think that would be great.

*In conclusion*

Robert, thank you... for your work on vzpkg2 and pkg-cacher... and for 
the time you put into answering my questions. One last one for you 
though...

*Q:* Are there any topics I've overlooked that you'd like to mention or 
any additional comments you'd like to make?

None that I can think of.
