Commit a13f58a0 authored by Jann Horn, committed by Ingo Molnar

locking/refcount: Document interaction with PID_MAX_LIMIT



Document the circumstances under which refcount_t's saturation mechanism
works deterministically.

Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200303105427.260620-1-jannh@google.com
parent d22cc7f6
+18 −5
@@ -38,11 +38,24 @@
 * atomic operations, then the count will continue to edge closer to 0. If it
 * reaches a value of 1 before /any/ of the threads reset it to the saturated
 * value, then a concurrent refcount_dec_and_test() may erroneously free the
- * underlying object. Given the precise timing details involved with the
- * round-robin scheduling of each thread manipulating the refcount and the need
- * to hit the race multiple times in succession, there doesn't appear to be a
- * practical avenue of attack even if using refcount_add() operations with
- * larger increments.
+ * underlying object.
+ * Linux limits the maximum number of tasks to PID_MAX_LIMIT, which is currently
+ * 0x400000 (and can't easily be raised in the future beyond FUTEX_TID_MASK).
+ * With the current PID limit, if no batched refcounting operations are used and
+ * the attacker can't repeatedly trigger kernel oopses in the middle of refcount
+ * operations, this makes it impossible for a saturated refcount to leave the
+ * saturation range, even if it is possible for multiple uses of the same
+ * refcount to nest in the context of a single task:
+ *
+ *     (UINT_MAX+1-REFCOUNT_SATURATED) / PID_MAX_LIMIT =
+ *     0x40000000 / 0x400000 = 0x100 = 256
+ *
+ * If hundreds of references are added/removed with a single refcounting
+ * operation, it may potentially be possible to leave the saturation range; but
+ * given the precise timing details involved with the round-robin scheduling of
+ * each thread manipulating the refcount and the need to hit the race multiple
+ * times in succession, there doesn't appear to be a practical avenue of attack
+ * even if using refcount_add() operations with larger increments.
 *
 * Memory ordering
 * ===============
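
The bound quoted in the added comment can be checked with a small user-space sketch. It is not kernel code: the `sketch_` names are hypothetical, and the constants are assumptions taken from the figures in the comment itself (PID_MAX_LIMIT of 0x400000, and a REFCOUNT_SATURATED whose unsigned 32-bit value is 0xC0000000, which is what makes UINT_MAX+1-REFCOUNT_SATURATED equal the quoted 0x40000000).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed values, derived from the figures quoted in the comment above.
 * They are hard-coded here for illustration and are not read from any
 * kernel header.
 */
#define SKETCH_PID_MAX_LIMIT      UINT64_C(0x400000)
#define SKETCH_REFCOUNT_SATURATED UINT64_C(0xC0000000) /* as unsigned 32-bit */
#define SKETCH_UINT_MAX           UINT64_C(0xFFFFFFFF) /* 32-bit unsigned int */

/* Distance from the saturated value up to the point where a 32-bit
 * counter would wrap around to zero. */
uint64_t sketch_saturation_headroom(void)
{
	return SKETCH_UINT_MAX + 1 - SKETCH_REFCOUNT_SATURATED;
}

/* Headroom divided evenly across the maximum number of tasks: the number
 * of unit increments each task could have in flight at once before the
 * counter could be pushed past the wrap point. */
uint64_t sketch_refs_per_task(void)
{
	return sketch_saturation_headroom() / SKETCH_PID_MAX_LIMIT;
}
```

Under these assumptions `sketch_saturation_headroom()` evaluates to 0x40000000 and `sketch_refs_per_task()` to 256, matching the figures in the comment. The 64-bit arithmetic is only there so that UINT_MAX+1 does not itself wrap in the calculation.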