Commit 1d21ce27 authored by Zi Yan, committed by Wen Zhiwei

mm/numa: no task_numa_fault() call if PMD is changed

stable inclusion
from stable-v6.6.48
commit c789a78151c113d06d84a3e8db93e156a222d6db
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IAWEBV

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c789a78151c113d06d84a3e8db93e156a222d6db

--------------------------------

commit fd8c35a92910f4829b7c99841f39b1b952c259d5 upstream.

When handling a NUMA page fault, task_numa_fault() should be called only by
the process that restores the page table of the faulted folio, to avoid
duplicated stats counting.  Commit c5b5a3dd ("mm: thp: refactor NUMA
fault handling") restructured do_huge_pmd_numa_page() and did not avoid the
task_numa_fault() call in the second page table check after a NUMA
migration failure.  Fix it by making all !pmd_same() checks return immediately.
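
The control flow after the fix can be modeled in userspace. The sketch below is
not kernel code: struct model_state, model_task_numa_fault() and
model_numa_fault() are hypothetical names made up for illustration, and the
model omits the paths that jump straight to out_map. It only shows the rule
stated above: the stats hook runs when this handler migrated the folio or
restored the PMD itself, and never when a !pmd_same() check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_state {
	bool pmd_changed_first;  /* PMD changed before the first locked check */
	bool migrated;           /* migration to the target node succeeded */
	bool pmd_changed_second; /* PMD changed before the post-failure recheck */
	int stat_calls;          /* how many times the stats hook ran */
	bool pmd_restored;       /* did this handler restore the PMD? */
};

/* Stand-in for task_numa_fault(): just counts invocations. */
static void model_task_numa_fault(struct model_state *s)
{
	s->stat_calls++;
}

/* Simplified shape of the fixed handler's control flow. */
static int model_numa_fault(struct model_state *s)
{
	assert(s != NULL);

	/* First !pmd_same() check: another thread already handled the fault. */
	if (s->pmd_changed_first)
		return 0;

	/* Successful migration: count the fault once and return. */
	if (s->migrated) {
		model_task_numa_fault(s);
		return 0;
	}

	/* Second !pmd_same() check after a failed migration: do not count. */
	if (s->pmd_changed_second)
		return 0;

	/* This handler restores the PMD, so it is the one that counts. */
	s->pmd_restored = true;
	model_task_numa_fault(s);
	return 0;
}
```

Before the fix, the case corresponding to pmd_changed_second still reached
task_numa_fault() through the out label; in this model that would have shown up
as stat_calls == 1 with pmd_restored == false.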

This issue can cause task_numa_fault() to be called more often than necessary
and lead to unexpected NUMA balancing results.  (It is hard to tell whether
the duplicated NUMA fault counting has a positive or negative performance
impact.)

Link: https://lkml.kernel.org/r/20240809145906.1513458-3-ziy@nvidia.com


Fixes: c5b5a3dd ("mm: thp: refactor NUMA fault handling")
Reported-by: "Huang, Ying" <ying.huang@intel.com>
Closes: https://lore.kernel.org/linux-mm/87zfqfw0yw.fsf@yhuang6-desk2.ccr.corp.intel.com/


Signed-off-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
 mm/huge_memory.c
Signed-off-by: Wen Zhiwei <wenzhiwei@kylinos.cn>
parent 29a39809
+13 −16
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2006,7 +2006,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
 	if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
 		spin_unlock(vmf->ptl);
-		goto out;
+		return 0;
 	}
 
 	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
@@ -2048,22 +2048,16 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	if (migrated) {
 		flags |= TNF_MIGRATED;
 		nid = target_nid;
-	} else {
-		flags |= TNF_MIGRATE_FAIL;
-		vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-		if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
-			spin_unlock(vmf->ptl);
-			goto out;
-		}
-		goto out_map;
-	}
-
-out:
-	if (nid != NUMA_NO_NODE)
 		task_numa_fault(last_cpupid, nid, HPAGE_PMD_NR, flags);
+		return 0;
+	}
 
-	return 0;
-
+	flags |= TNF_MIGRATE_FAIL;
+	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
+	if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
+		spin_unlock(vmf->ptl);
+		return 0;
+	}
 out_map:
 	/* Restore the PMD */
 	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
@@ -2073,7 +2067,10 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	set_pmd_at(vma->vm_mm, haddr, vmf->pmd, pmd);
 	update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 	spin_unlock(vmf->ptl);
-	goto out;
+
+	if (nid != NUMA_NO_NODE)
+		task_numa_fault(last_cpupid, nid, HPAGE_PMD_NR, flags);
+	return 0;
 }
 
 /*