Skip to content
Commit e47483bc authored by Vlastimil Babka's avatar Vlastimil Babka Committed by Linus Torvalds
Browse files

mm, page_alloc: fix premature OOM when racing with cpuset mems update

Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
triggers OOM killer in few seconds, despite lots of free memory.  The
test attempts to repeatedly fault in memory in one process in a cpuset,
while changing allowed nodes of the cpuset between 0 and 1 in another
process.

The problem comes from insufficient protection against cpuset changes,
which can cause get_page_from_freelist() to consider all zones as
non-eligible due to nodemask and/or current->mems_allowed.  This was
masked in the past by sufficient retries, but since commit 682a3385
("mm, page_alloc: inline the fast path of the zonelist iterator") we fix
the preferred_zoneref once, and don't iterate over the whole zonelist in
further attempts, thus the only eligible zones might be placed in the
zonelist before our starting point and we always miss them.

A previous patch fixed this problem for current->mems_allowed.  However,
cpuset changes also update the task's mempolicy nodemask.  The fix has
two parts.  We have to repeat the preferred_zoneref search when we
detect cpuset update by way of seqcount, and we have to check the
seqcount before considering OOM.

[akpm@linux-foundation.org: fix typo in comment]
Link: http://lkml.kernel.org/r/20170120103843.24587-5-vbabka@suse.cz
Fixes: c33d6c06

 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
Reported-by: default avatarGanapatrao Kulkarni <gpkulkarni@gmail.com>
Acked-by: default avatarMel Gorman <mgorman@techsingularity.net>
Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent 5ce9bfef
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment