[Devel] [PATCH rh7] mm: memcontrol: add memory.numa_migrate file
Vladimir Davydov
vdavydov at virtuozzo.com
Tue Aug 23 03:27:23 PDT 2016
On Tue, Aug 23, 2016 at 12:57:53PM +0300, Andrey Ryabinin wrote:
...
> echo "0 100" > /sys/fs/cgroup/memory/machine.slice/100/memory.numa_migrate
>
> [ 296.073002] BUG: soft lockup - CPU#1 stuck for 22s! [bash:4028]
Thanks for catching, will fix in v2.
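Most likely the write just keeps looping in memcg_numa_migrate_pages() without
ever rescheduling. Something along these lines should help (only a sketch, the
actual v2 fix may look different), plus making sure the again: loop bails out
once a pass scans nothing:

 again:
 	for_each_mem_cgroup_tree(mi, memcg) {
 		struct zone *zone;

+		/* long scans must not hog the cpu */
+		cond_resched();
+
 		for_each_populated_zone(zone) {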
> > +static struct page *memcg_numa_migrate_new_page(struct page *page,
> > + unsigned long private, int **result)
> > +{
> > + struct memcg_numa_migrate_struct *ms = (void *)private;
> > + gfp_t gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_NORETRY | __GFP_NOWARN;
> > +
> > + ms->current_node = next_node(ms->current_node, *ms->target_nodes);
> > + if (ms->current_node >= MAX_NUMNODES) {
> > + ms->current_node = first_node(*ms->target_nodes);
> > + BUG_ON(ms->current_node >= MAX_NUMNODES);
>
> Maybe WARN_ON() or VM_BUG_ON() ?
Will replace with VM_BUG_ON.
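I.e. the hunk will look like this in v2:

		ms->current_node = first_node(*ms->target_nodes);
		VM_BUG_ON(ms->current_node >= MAX_NUMNODES);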
> > + }
> > +
> > + return __alloc_pages_nodemask(gfp_mask, 0,
> > + node_zonelist(ms->current_node, gfp_mask),
> > + ms->target_nodes);
> > +}
> > +
> > +/*
> > + * Isolate at most @nr_to_scan pages from @lruvec for further migration and
> > + * store them in @dst. Returns the number of pages scanned. Return value of 0
> > + * means that @lruvec is empty.
> > + */
> > +static long memcg_numa_isolate_pages(struct lruvec *lruvec, enum lru_list lru,
> > + long nr_to_scan, struct list_head *dst)
> > +{
> > + struct list_head *src = &lruvec->lists[lru];
> > + struct zone *zone = lruvec_zone(lruvec);
> > + long scanned = 0, taken = 0;
> > +
> > + spin_lock_irq(&zone->lru_lock);
> > + while (!list_empty(src) && scanned < nr_to_scan && taken < nr_to_scan) {
> > + struct page *page = list_last_entry(src, struct page, lru);
> > + int nr_pages;
> > +
> > + VM_BUG_ON_PAGE(!PageLRU(page), page);
> > +
>
> __isolate_lru_page() will return -EINVAL for !PageLRU, so either this or the BUG() below is unnecessary.
OK, will remove the VM_BUG_ON_PAGE.
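The loop will then rely on __isolate_lru_page() alone to reject such pages,
along the lines of vmscan's isolate_lru_pages() (just a sketch - the isolate
mode and the lru size bookkeeping aren't in the quoted hunk, so the details
may differ):

		switch (__isolate_lru_page(page, ISOLATE_ASYNC_MIGRATE)) {
		case 0:
			nr_pages = hpage_nr_pages(page);
			mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
			list_move(&page->lru, dst);
			taken += nr_pages;
			break;
		case -EBUSY:
			/* the page is being freed elsewhere, skip it */
			list_move(&page->lru, src);
			break;
		default:
			/* -EINVAL, i.e. !PageLRU, ends up here */
			BUG();
		}
		scanned++;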
...
> > +static int memcg_numa_migrate_pages(struct mem_cgroup *memcg,
> > + nodemask_t *target_nodes, long nr_to_scan)
> > +{
> > + struct mem_cgroup *mi;
> > + long total_scanned = 0;
> > +
> > +again:
> > + for_each_mem_cgroup_tree(mi, memcg) {
> > + struct zone *zone;
> > +
> > + for_each_populated_zone(zone) {
> > + struct lruvec *lruvec;
> > + enum lru_list lru;
> > + long scanned;
> > +
> > + if (node_isset(zone_to_nid(zone), *target_nodes))
> > + continue;
> > +
> > + lruvec = mem_cgroup_zone_lruvec(zone, mi);
> > + /*
> > + * For the sake of simplicity, do not attempt to migrate
> > + * unevictable pages. It should be fine as long as there
> > + * aren't too many of them, which is usually true.
> > + */
> > + for_each_evictable_lru(lru) {
> > + scanned = __memcg_numa_migrate_pages(lruvec,
> > + lru, target_nodes,
> > + nr_to_scan > 0 ?
> > + SWAP_CLUSTER_MAX : -1);
>
> Shouldn't we just pass nr_to_scan here?
No, I want to migrate memory evenly from all source nodes. E.g. if you have
2 source nodes and nr_to_scan=100, ~50 pages should be migrated from each of
them, not 100 from one and 0 from the other.
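That's what the SWAP_CLUSTER_MAX batches are for: each pass of the again:
loop takes at most SWAP_CLUSTER_MAX (32) pages per lruvec from every source
node, and the loop repeats until total_scanned reaches nr_to_scan, so with
2 source nodes and nr_to_scan=100 the scan stops with roughly half the pages
taken from each of them. Passing nr_to_scan straight down would let the first
lruvec satisfy the whole request on its own.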