<pre id="line74">So, this is just to confirm the final details about the container mini-summit, which will be held tomorrow.<br><br>Time: starting at 9am, 3rd Sept.<br>Place: Cambridge's University Arms Hotel, room Churchill D.
<br><br>Let's meet in the hotel lobby close to 9am and then go to the room.<br><br>Eric, Paul,<br>Can you please clarify whether you will be able to present or not?<br><br>PS sorry if you got this message a few times -- some DNS problems on my
<br>mail server.</pre><br><br><div><span class="gmail_quote">On 30/08/07, <b class="gmail_sendername">Cedric Le Goater</b> <<a href="mailto:clg@fr.ibm.com">clg@fr.ibm.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hello All,<br><br>Some of us will meet next week for the first mini-summit on containers.<br>Many thanks to Alasdair Kergon and LCE for the help they provided in<br>making this mini-summit happen!<br><br>It will be held on Monday the 3rd of September from 9:00 to 12:45 at LCE
<br>in room D. We might also get a phone line for external participants and,<br>if not, we should be able to set up a Skype phone.<br><br>Here's a first draft of the agenda.<br><br>Global items<br><br> [ let's try to defer discussion until after each presentation ]
<br><br>* Pavel Emelianov status update<br>* Serge E. Hallyn Container Roadmap including<br> . task containers (Paul Menage)<br> . resource management (Srivatsa Vaddagiri)<br><br>Special items<br><br> [ brainstorm sessions which we would like to focus on ]
<br><br>* building the global container object ('a la' openvz or vserver)<br>* container user space tools<br>* container checkpoint/restart<br><br><br>Thanks,<br><br>C.<br><br><br><br>====================== Section 1 ======================
<br>=Introduction<br>====================== Section 1 ======================<br><br>We are trying to create a roadmap for the next year of<br>'container' development, to be reported to the upcoming kernel<br>summit. Containers here is a bit of an ambiguous term, so we are
<br>taking it to mean all of:<br><br> 1. namespaces<br> kernel resource namespaces to support resource isolation<br> and virtualization for virtual servers and application<br> checkpoint/restart.
<br> 2. task containers framework<br> task containers provide a framework for subsystems which associate<br> state with arbitrary groups of processes, for purposes such as<br> resource control/monitoring.
<br> 3. checkpoint/restart<br><br>====================== Section 2 ======================<br>=Detailed development plans<br>====================== Section 2 ======================<br><br>A (still under construction) list of features we expect to be worked on
<br>next year looks like this:<br><br> 1. completion of ongoing namespaces<br> pid namespace<br> push merged patchset upstream<br> kthread cleanup<br> especially nfs
<br> autofs<br> af_unix credentials (stores pid_t?)<br> net namespace<br> ro bind mounts<br> 2. continuation with new namespaces<br>
devpts, console, and ttydrivers<br> user<br> time<br> namespace management tools<br> namespace entering (using one of:)<br> bind_ns()
<br> ns container subsystem<br> (vs refuse this functionality)<br> multiple /sys mounts<br> break /sys into smaller chunks?<br> shadow dirs vs namespaces
<br> multiple proc mounts<br> likely need to extend on the work done for pid namespaces<br> i.e. other /proc files will need some care<br> virtualization of statistics for 'top', etc
<br> 3. any additional work needed for virtual servers?<br> i.e. in-kernel keyring usage for cross-usernamespace permissions, etc<br> nfs and rpc updates needed?<br> general security fixes
<br> per-container capabilities?<br> device access controls<br> (e.g. root in container should not have access to /dev/sda by default)<br>
filesystem access controls<br> 'container object'?<br> implementation (perhaps largely userspace abstraction)<br> container enter
<br> container list<br> container shutdown notification<br><br> 4. task containers functionality<br> base features<br> hierarchical/virtualized containers
<br> support vserver mgmt of sub-containers<br> locking cleanup<br> control file API simplification<br> userspace RBCE to provide controls for
<br> users<br> groups<br> pgrp<br> executable<br> specific containers targeted:<br> split cpusets into
<br> cpuset<br> memset<br> network<br> connect/bind/accept controller using iptables<br> memory controller (see detail below)
<br> cpu controller (see detail below)<br> io controller (see detail below)<br> network flow id control<br> per-container OOM handler (userspace)
<br> per-container swap<br> per-container disk I/O scheduling<br> per container memory reclaim<br> per container dirty page (write throttling) limit.
<br> network rate limiting (outbound) based on container<br> misc<br> User-level APIs to identify the resource limits allowed to a<br> job, for example, how much physical memory a
<br> process can use. This should be seamlessly<br> integrated with the non-container environment as<br> well (maybe with ulimit).<br>
Per-container stats, like pages on active list, cpu usage, etc<br> memory controller<br> users and requirements:<br> 1. The containers solution would need resource
<br> management (including memory control and per-container swap files).<br> Paul Menage, YAMAMOTO Takashi, Peter Zijlstra, and Pavel Emelianov have all shown<br> interest in the memory controller patches.
<br> 2. The memory controller can account for page<br> cache as well; people interested in limiting page cache usage can<br> theoretically move all page-cache-hungry applications under the same
<br> container.<br> Planned enhancements to the memory controller<br> 1. Improved shared page accounting<br> 2. Improved statistics
<br> 3. Soft-limit memory usage<br> generic infrastructure work:<br> 1. Enhancing containerstats<br> a. Working on per controller statistics
<br> b. Integrating taskstats with containerstats<br> 2. CPU accounting framework<br> a. Migrate the accounting to be more precise
<br> cpu controller<br> users and requirements:<br> 1. Virtualization solutions like containers and<br> KVM need CPU control. KVM for example would
<br> like to have both limits and guarantees<br> supported by a CPU controller, to control CPU<br> allocation to a particular instance.
<br> 2. Workload management products would like to exploit this for providing<br> guaranteed cpu bandwidth and also (hard/soft) limits on cpu usage.<br> work items
<br> 1. Fine-grained proportional-share fair-group scheduling.<br> 2. More accurate SMP fairness<br> 3. Hard limit<br> 4. SCHED_FIFO type policy for groups
<br> 5. Improved statistics and debug facility for group scheduler<br> io controller<br> users and requirements:<br> 1. At a talk presented to the Linux Foundation
<br> (OSDL), the attendees showed interest in an IO<br> controller to control IO bandwidth of various<br> filesystem operations (backup, journalling,
<br> etc)<br> work items:<br> 1. Proof of concept IO controller and community discussion/feedback<br> 2. Development and Integration of the IO controller with containers
<br> open issues<br> 1. Automatic tagging/resource classification engine<br><br><br> 5. checkpoint/restart<br> memory c/r<br> (there are a few designs and prototypes)
<br> (though this may be ironed out by then)<br> per-container swapfile?<br> overall checkpoint strategy (one of:)<br> in-kernel<br> userspace-driven
<br> hybrid<br> overall restart strategy<br> use freezer API<br> use suspend-to-disk?<br> sysvipc<br> "set identifier" syscall
<br> pid namespace<br> clone_with_pid()<br> live migration<br><br><br>====================== Section 3 ======================<br>=Use cases<br>====================== Section 3 ======================
<br><br> 1. Namespaces:<br><br> The most commonly listed uses for namespaces are virtual<br> servers and checkpoint/restart. Other uses are debugging<br> (running tests in not-quite-virtual-servers) and resource
<br> isolation, such as the use of mount namespaces to simulate<br> multi-level directories for LSPP.<br><br> 2. Task Containers:<br><br> 3. Checkpoint/restart
<br><br> load balancing:<br> applications can be migrated from high-load systems to ones<br> with a lower load. Long-running applications can be checkpointed<br> (or migrated) to start a short-running high-load job, then
<br> restarted.<br><br> kernel upgrades:<br> A long-running application - or whole virtual server - can<br> be migrated or checkpointed so that the system can be<br> rebooted, and the application can continue to run.
<br><br><br>====================== Section 4 ======================<br>=Involved parties<br>====================== Section 4 ======================<br><br>In the list of stakeholders, I try to guess based on past comments and
<br>contributions what *general* area they are most likely to contribute in.<br>I may try to narrow those down later, but am just trying to get something<br>out the door right now before my next computer breaks.<br><br>Stakeholders:
<br> Eric Biederman<br> everything<br> google<br> task containers<br> ibm (serge, dave, cedric, daniel)<br> namespaces<br> checkpoint/restart
<br> bull (benjamin, pierre)<br> namespaces<br> checkpoint/restart<br> ibm (balbir, vatsa)<br> task containers<br> kerlabs<br> checkpoint/restart
<br> openvz<br> everything<br> NEC Japan (Masahiko Takahashi)<br> checkpoint/restart<br> Linux-VServer<br> namespaces+containers<br> zap project<br>
checkpoint/restart<br> planetlab<br> everything<br> hp<br> network namespaces, virtual servers?<br> XtreemOS<br> checkpoint/restart<br> Fujitsu/VA Linux Japan
<br> resource control<br> BLCR (Paul H. Hargrove)<br> checkpoint/restart<br><br>Is anyone else still missing from the list?<br><br>thanks,<br>-serge<br><br></blockquote></div><br>