[Devel] Re: [RFC] [PATCH] Cgroup based OOM killer controller
Evgeniy Polyakov
zbr at ioremap.net
Tue Jan 27 13:51:18 PST 2009
On Tue, Jan 27, 2009 at 12:37:21PM -0800, David Rientjes (rientjes at google.com) wrote:
> > Well, the oom-killer can, since it drops the unkillable state from the
> > process mask. That may not be enough, but it tries harder than userspace.
> >
>
> The only thing it does is send a SIGKILL and give the thread access to
> memory reserves with TIF_MEMDIE, it doesn't drop any unkillable state. If
There is a small difference between force_sig_info() and the usual
send_signal() used by kill.
> its victim is hung in D state and the memory reserves do not allow it to
> return to being runnable, this task will not die and the oom killer would
> livelock unless given another target.
D-states are different. In the current tree we even have
lock_page_killable(), so it depends.
> > My main point was to have a way to monitor memory usage so that any
> > process could tune its own behaviour according to that information. That
> > is not related to the system oom-killer at all. Thus /dev/mem_notify is
> > of interest first (and only) as a memory usage notification interface,
> > not as a way to invoke any kind of 'soft' oom-killer.
>
> It's a way to prevent invoking the kernel oom killer by allowing userspace
> notification of events where methods such as dropping caches, elevating
> limits, adding nodes, sending signals, etc, can prevent such a problem.
> When the system (or cgroup) is completely oom, it can also issue SIGKILLs
> that will free some memory and preempt the oom killer from acting.
>
> I think there might be some confusion about my proposal for extending
> /dev/mem_notify. Not only should it notify of certain low memory events,
> but it should also allow userspace notification of oom events, just like
> the cgroup oom notifier patch allowed. Instead of attaching a task to a
> cgroup file in that case, however, this would simply be the responsibility
> of a task that has set up a poll() on the cgroup's mem_notify file. A
> configurable delay could be imposed so page allocation attempts simply
> loop while the userspace handler responds and then only invoke the oom
> killer when absolutely necessary.
I really have no objection to this, or to extending the oom-killer so
that the allocation path waits a bit while userspace makes some
progress. But do not drop the existing oom-killer (i.e. its ability to
kill processes) in favour of this new feature. Let's have both, so that
if the extension fails for some reason, the old oom-killer does the job.
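
To make the userspace side concrete, here is a minimal sketch of a
handler waiting on a notification descriptor with poll() and a bounded
delay. It assumes the mem_notify file reports events as POLLIN, which is
only the proposed behaviour; wait_for_mem_event() and its timeout
parameter are hypothetical names used for illustration.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait up to timeout_ms for a notification on fd (assumed to be an open
 * descriptor for the proposed mem_notify file).
 * Returns 1 if the descriptor became readable, 0 on timeout, -1 on error.
 * Note: a retry after EINTR restarts the full timeout.
 */
int wait_for_mem_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret <= 0)
        return ret;                 /* 0: timeout, -1: poll() failed */

    /* Treat anything other than readable data (e.g. POLLERR) as an error. */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A handler would call this in a loop and, on a return value of 1, try to
free memory (drop caches, raise limits, signal a task). On 0 or -1 it
does nothing, which is the case where the kernel oom-killer has to
remain as the fallback.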
--
Evgeniy Polyakov