<pre id="line74">So, this is just to confirm the final details about the container mini-summit, which will be held tomorrow.<br><br>Time: starting at 9am, 3rd Sept.<br>Place: Cambridge's University Arms Hotel, room Churchill D.
<br><br>Let's meet in the hotel lobby close to 9am and then go to the room.<br><br>Eric, Paul,<br>Can you please clarify whether you will be able to present or not?<br><br>PS: sorry if you got this message a few times -- some DNS problems on my
<br>mail server.</pre><br><br><div><span class="gmail_quote">On 30/08/07, <b class="gmail_sendername">Cedric Le Goater</b> <<a href="mailto:clg@fr.ibm.com">clg@fr.ibm.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hello All,<br><br>Some of us will meet next week for the first mini-summit on containers.<br>Many thanks to Alasdair Kergon and LCE for the help they provided in<br>making this mini-summit happen!<br><br>It will be held on Monday the 3rd of September from 9:00 to 12:45 at LCE
<br>in room D. We may also get a phone line for external participants and,<br>if not, we should be able to set up a Skype phone.<br><br>Here's a first try at the agenda.<br><br>Global items<br><br>         [ let's try to defer discussion until after the presentations ]
<br><br>* Pavel Emelianov: status update<br>* Serge E. Hallyn: container roadmap, including<br>        . task containers (Paul Menage)<br>        . resource management (Srivatsa Vaddagiri)<br><br>Special items<br><br>        [ brainstorm sessions we would like to focus on ]
<br><br>* building the global container object ('a la' openvz or vserver)<br>* container user space tools<br>* container checkpoint/restart<br><br><br>Thanks,<br><br>C.<br><br><br><br>======================  Section 1  ======================
<br>=Introduction<br>======================  Section 1  ======================<br><br>We are trying to create a roadmap for the next year of<br>'container' development, to be reported to the upcoming kernel<br>summit.  'Containers' is a bit of an ambiguous term here, so we are
<br>taking it to mean all of:<br><br>        1. namespaces<br>                kernel resource namespaces to support resource isolation<br>                and virtualization for virtual servers and application<br>                checkpoint/restart.
<br>        2. task containers framework<br>                 task containers provide a framework for subsystems which associate<br>                 state with arbitrary groups of processes, for purposes such as<br>                 resource control/monitoring.
<br>        3. checkpoint/restart<br><br>======================  Section 2  ======================<br>=Detailed development plans<br>======================  Section 2  ======================<br><br>A (still under construction) list of features we expect to be worked on
<br>next year looks like this:<br><br>        1. completion of ongoing namespaces<br>                pid namespace<br>                        push merged patchset upstream<br>                        kthread cleanup<br>                                especially nfs
<br>                                autofs<br>                        af_unix credentials (stores pid_t?)<br>                net namespace<br>                ro bind mounts<br>        2. continuation with new namespaces<br>
                devpts, console, and ttydrivers<br>                user<br>                time<br>                namespace management tools<br>                namespace entering  (using one of:)<br>                        bind_ns()
<br>                        ns container subsystem<br>                        (vs refuse this functionality)<br>                multiple /sys mounts<br>                        break /sys into smaller chunks?<br>                        shadow dirs vs namespaces
<br>                multiple proc mounts<br>                        likely need to extend the work done for pid namespaces<br>                        i.e. other /proc files will need some care<br>                                virtualization of statistics for 'top', etc
<br>        3. any additional work needed for virtual servers?<br>                i.e. in-kernel keyring usage for cross-usernamespace permissions, etc<br>                        nfs and rpc updates needed?<br>                        general security fixes
<br>                                per-container capabilities?<br>                        device access controls<br>                                e.g. root in a container should not have access to /dev/sda by default<br>
                        filesystem access controls<br>                'container object'?<br>                        implementation (perhaps largely userspace abstraction)<br>                        container enter
<br>                        container list<br>                        container shutdown notification<br><br>        4. task containers functionality<br>                base features<br>                        hierarchical/virtualized containers
<br>                                support vserver management of sub-containers<br>                        locking cleanup<br>                        control file API simplification<br>                userspace RBCE to provide controls for
<br>                        users<br>                        groups<br>                        pgrp<br>                        executable<br>                specific containers targeted:<br>                        split cpusets into
<br>                                cpuset<br>                                memset<br>                        network<br>                                connect/bind/accept controller using iptables<br>                        memory controller (see detail below)
<br>                        cpu controller (see detail below)<br>                        io controller (see detail below)<br>                        network flow id control<br>                        per-container OOM handler (userspace)
<br>                        per-container swap<br>                        per-container disk I/O scheduling<br>                        per-container memory reclaim<br>                        per-container dirty page (write-throttling) limit
<br>                        network rate limiting (outbound) based on container<br>                misc<br>                        User-level APIs to identify the resource limits allowed to a<br>                                job, for example, how much physical memory a
<br>                                process can use.  This should be seamlessly<br>                                integrated with non-container environments as<br>                                well (maybe with ulimit).<br>
                        Per-container stats, like pages on the active list, cpu usage, etc<br>                memory controller<br>                        users and requirements:<br>                                1. The containers solution would need resource
<br>                                management (including memory control and per-container swap files).<br>                                Paul Menage, YAMAMOTO Takashi, Peter Zijlstra, and Pavel Emelianov have all shown<br>                                interest in the memory controller patches.
<br>                                2. The memory controller can account for page<br>                                cache as well; people interested in limiting page cache can<br>                                theoretically move all page-cache-hungry applications under the same
<br>                                container.<br>                        Planned enhancements to the memory controller<br>                                1. Improved shared page accounting<br>                                2. Improved statistics
<br>                                3. Soft-limit memory usage<br>                        generic infrastructure work:<br>                                1. Enhancing containerstats<br>                                        a. Working on per controller statistics
<br>                                        b. Integrating taskstats with containerstats<br>                                2. CPU accounting framework<br>                                        a. Migrate the accounting to be more precise
<br>                cpu controller<br>                        users and requirements:<br>                                1. Virtualization solutions like containers and<br>                                   KVM need CPU control. KVM for example would
<br>                                   like to have both limits and guarantees<br>                                   supported by a CPU controller, to control CPU<br>                                   allocation to a particular instance.
<br>                                2. Workload management products would like to exploit this for providing<br>                                   guaranteed cpu bandwidth and also (hard/soft) limiting cpu usage.<br>                        work items
<br>                                1. Fine-grained proportional-share fair-group scheduling.<br>                                2. More accurate SMP fairness<br>                                3. Hard limit<br>                                4. SCHED_FIFO type policy for groups
<br>                                5. Improved statistics and debug facility for group scheduler<br>                io controller<br>                        users and requirements:<br>                                1. At a talk presented to the Linux Foundation
<br>                                (OSDL), the attendees showed interest in an IO<br>                                controller to control IO bandwidth of various<br>                                filesystem operations (backup, journalling,
<br>                                etc)<br>                        work items:<br>                                1. Proof of concept IO controller and community discussion/feedback<br>                                2. Development and Integration of the IO controller with containers
<br>                        open issues<br>                                1. Automatic tagging/resource classification engine<br><br><br>        5. checkpoint/restart<br>                memory c/r<br>                        (there are a few designs and prototypes)
<br>                        (though this may be ironed out by then)<br>                        per-container swapfile?<br>                overall checkpoint strategy  (one of:)<br>                        in-kernel<br>                        userspace-driven
<br>                        hybrid<br>                overall restart strategy<br>                use freezer API<br>                use suspend-to-disk?<br>                sysvipc<br>                        "set identifier" syscall
<br>                pid namespace<br>                        clone_with_pid()<br>                live migration<br><br><br>======================  Section 3  ======================<br>=Use cases<br>======================  Section 3  ======================
<br><br>        1. Namespaces:<br><br>        The most commonly listed uses for namespaces are virtual<br>        servers and checkpoint/restart.  Other uses are debugging<br>        (running tests in not-quite-virtual-servers) and resource
<br>        isolation, such as the use of mount namespaces to simulate<br>        multi-level directories for LSPP.<br><br>        2. Task Containers:<br><br>        (Vatsa to fill in)<br><br>        3. Checkpoint/restart
<br><br>        load balancing:<br>        applications can be migrated from high-load systems to ones<br>        with a lower load.  Long-running applications can be checkpointed<br>        (or migrated) to start a short-running high-load job, then
<br>        restarted.<br><br>        kernel upgrades:<br>        A long-running application - or whole virtual server - can<br>        be migrated or checkpointed so that the system can be<br>        rebooted, and the application can continue to run.
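<br><br>        As a side note for the migration use cases above: a tool that<br>        checkpoints or migrates a task needs to know which namespaces the<br>        task belongs to.  On kernels that populate /proc/&lt;pid&gt;/ns, this<br>        can be read from userspace; the sketch below is purely<br>        illustrative (the helper name is ours, and the exact set of<br>        entries depends on the kernel version):<br><pre>
```python
# Sketch: a task's namespace membership is visible as symlinks under
# /proc/<pid>/ns on kernels that expose it.  Two tasks sharing a
# namespace see the same inode in the link target, which is one way a
# migration tool could group processes by namespace.
import os

def namespace_ids(pid="self"):
    """Map namespace name -> link target, e.g. 'uts' -> 'uts:[4026531838]'."""
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}

print(namespace_ids())
```
</pre>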
<br><br><br>======================  Section 4  ======================<br>=Involved parties<br>======================  Section 4  ======================<br><br>In the list of stakeholders, I try to guess based on past comments and
<br>contributions what *general* area they are most likely to contribute in.<br>I may try to narrow those down later, but am just trying to get something<br>out the door right now before my next computer breaks.<br><br>Stakeholders:
<br>        Eric Biederman<br>                everything<br>        google<br>                task containers<br>        ibm (serge, dave, cedric, daniel)<br>                namespaces<br>                checkpoint/restart
<br>        bull (benjamin, pierre)<br>                namespaces<br>                checkpoint/restart<br>        ibm (balbir, vatsa)<br>                task containers<br>        kerlabs<br>                checkpoint/restart
<br>        openvz<br>                everything<br>        NEC Japan (Masahiko Takahashi)<br>                checkpoint/restart<br>        Linux-VServer<br>                namespaces+containers<br>        zap project<br>
                checkpoint/restart<br>        planetlab<br>                everything<br>        hp<br>                network namespaces, virtual servers?<br>        XtreemOS<br>                checkpoint/restart<br>        Fujitsu/VA Linux Japan
<br>                resource control<br>        BLCR (Paul H. Hargrove)<br>                checkpoint/restart<br><br>Is anyone still missing from the list?<br><br>thanks,<br>-serge<br><br></blockquote></div><br>