[Devel] [PATCH rh7] mm: memcontrol: add memory.numa_migrate file

Vladimir Davydov vdavydov at virtuozzo.com
Tue Aug 23 03:27:23 PDT 2016


On Tue, Aug 23, 2016 at 12:57:53PM +0300, Andrey Ryabinin wrote:
...
> echo "0 100" > /sys/fs/cgroup/memory/machine.slice/100/memory.numa_migrate
> 
> [  296.073002] BUG: soft lockup - CPU#1 stuck for 22s! [bash:4028]

Thanks for catching, will fix in v2.

> > +static struct page *memcg_numa_migrate_new_page(struct page *page,
> > +				unsigned long private, int **result)
> > +{
> > +	struct memcg_numa_migrate_struct *ms = (void *)private;
> > +	gfp_t gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_NORETRY | __GFP_NOWARN;
> > +
> > +	ms->current_node = next_node(ms->current_node, *ms->target_nodes);
> > +	if (ms->current_node >= MAX_NUMNODES) {
> > +		ms->current_node = first_node(*ms->target_nodes);
> > +		BUG_ON(ms->current_node >= MAX_NUMNODES);
> 
> Maybe WARN_ON() or VM_BUG_ON() ?

Will replace with VM_BUG_ON.
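I.e. the hunk would then look something like this (untested sketch, only the
assertion changes):

	ms->current_node = next_node(ms->current_node, *ms->target_nodes);
	if (ms->current_node >= MAX_NUMNODES) {
		ms->current_node = first_node(*ms->target_nodes);
		/* target_nodes is expected to contain at least one node */
		VM_BUG_ON(ms->current_node >= MAX_NUMNODES);
	}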

> > +	}
> > +
> > +	return __alloc_pages_nodemask(gfp_mask, 0,
> > +			node_zonelist(ms->current_node, gfp_mask),
> > +			ms->target_nodes);
> > +}
> > +
> > +/*
> > + * Isolate at most @nr_to_scan pages from @lruvec for further migration and
> > + * store them in @dst. Returns the number of pages scanned. Return value of 0
> > + * means that @lruvec is empty.
> > + */
> > +static long memcg_numa_isolate_pages(struct lruvec *lruvec, enum lru_list lru,
> > +				     long nr_to_scan, struct list_head *dst)
> > +{
> > +	struct list_head *src = &lruvec->lists[lru];
> > +	struct zone *zone = lruvec_zone(lruvec);
> > +	long scanned = 0, taken = 0;
> > +
> > +	spin_lock_irq(&zone->lru_lock);
> > +	while (!list_empty(src) && scanned < nr_to_scan && taken < nr_to_scan) {
> > +		struct page *page = list_last_entry(src, struct page, lru);
> > +		int nr_pages;
> > +
> > +		VM_BUG_ON_PAGE(!PageLRU(page), page);
> > +
> 
> __isolate_lru_page() will return -EINVAL for !PageLRU, so either this or the BUG() below is unnecessary.

OK, will remove the VM_BUG_ON_PAGE.
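__isolate_lru_page() already bails out on such pages, so the loop body can
simply switch on its return value. Roughly like this (sketch from memory --
the isolate mode and accounting helpers may not match the actual patch):

	/* illustrative only; the real patch may pass a different isolate mode */
	switch (__isolate_lru_page(page, 0)) {
	case 0:
		nr_pages = hpage_nr_pages(page);
		mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
		list_move(&page->lru, dst);
		taken += nr_pages;
		break;
	case -EBUSY:
		/* the page is being freed elsewhere, skip it */
		list_move(&page->lru, src);
		break;
	default:
		BUG();
	}
	scanned++;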

...
> > +static int memcg_numa_migrate_pages(struct mem_cgroup *memcg,
> > +				    nodemask_t *target_nodes, long nr_to_scan)
> > +{
> > +	struct mem_cgroup *mi;
> > +	long total_scanned = 0;
> > +
> > +again:
> > +	for_each_mem_cgroup_tree(mi, memcg) {
> > +		struct zone *zone;
> > +
> > +		for_each_populated_zone(zone) {
> > +			struct lruvec *lruvec;
> > +			enum lru_list lru;
> > +			long scanned;
> > +
> > +			if (node_isset(zone_to_nid(zone), *target_nodes))
> > +				continue;
> > +
> > +			lruvec = mem_cgroup_zone_lruvec(zone, mi);
> > +			/*
> > +			 * For the sake of simplicity, do not attempt to migrate
> > +			 * unevictable pages. It should be fine as long as there
> > +			 * aren't too many of them, which is usually true.
> > +			 */
> > +			for_each_evictable_lru(lru) {
> > +				scanned = __memcg_numa_migrate_pages(lruvec,
> > +						lru, target_nodes,
> > +						nr_to_scan > 0 ?
> > +						SWAP_CLUSTER_MAX : -1);
> 
> 					Shouldn't we just pass nr_to_scan here?

No, I want to migrate memory evenly from all source nodes. E.g. if you have 2
source nodes and nr_to_scan=100, there should be ~50 pages migrated from one
node and ~50 from the other, not 100-vs-0.
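
The SWAP_CLUSTER_MAX batch together with the outer "again:" loop is what
spreads the scan: each pass takes at most one small batch from every source
lruvec before starting over. Roughly (simplified model of the scan order, not
the actual patch code; for_each_source_lruvec() and scan_and_migrate() are
made-up helpers, and the nr_to_scan < 0 "scan everything" case is left out):

	long total_scanned = 0;

	while (total_scanned < nr_to_scan) {
		long scanned_this_pass = 0;

		/* take a small batch from every source node in turn */
		for_each_source_lruvec(lruvec)
			scanned_this_pass += scan_and_migrate(lruvec,
							SWAP_CLUSTER_MAX);
		if (!scanned_this_pass)
			break;	/* nothing left to migrate */
		total_scanned += scanned_this_pass;
	}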

