Unverified Commit 0f8821da authored by Sebastian Andrzej Siewior's avatar Sebastian Andrzej Siewior Committed by Christian Brauner
Browse files

fs/namespace: Boost the mount_lock.lock owner instead of spinning on PREEMPT_RT.

The MNT_WRITE_HOLD flag is used to hold back any new writers while the
mount point is about to be made read-only. __mnt_want_write() then loops
with disabled preemption until this flag disappears. Callers of
mnt_hold_writers() (which sets the flag) hold the spinlock_t of
mount_lock (seqlock_t) which disables preemption on !PREEMPT_RT and
ensures the task is not scheduled away so that the spinning side spins
for a long time.

On PREEMPT_RT the spinlock_t does not disable preemption and so it is
possible that the task setting MNT_WRITE_HOLD is preempted by task with
higher priority which then spins infinitely waiting for MNT_WRITE_HOLD
to get removed.

Acquire mount_lock::lock which is held by setter of MNT_WRITE_HOLD. This
will PI-boost the owner and wait until the lock is dropped and which
means that MNT_WRITE_HOLD is cleared again.

Link: https://lore.kernel.org/r/20211025152218.opvcqfku2lhqvp4o@linutronix.de
Link: https://lore.kernel.org/r/20211125120711.dgbsienyrsxfzpoi@linutronix.de


Acked-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
parent 13605725
Loading
Loading
Loading
Loading
+18 −2
Original line number Diff line number Diff line
@@ -343,8 +343,24 @@ int __mnt_want_write(struct vfsmount *m)
	 * incremented count after it has set MNT_WRITE_HOLD.
	 */
	smp_mb();
	while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD)
	might_lock(&mount_lock.lock);
	while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) {
		if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
			cpu_relax();
		} else {
			/*
			 * This prevents priority inversion, if the task
			 * setting MNT_WRITE_HOLD got preempted on a remote
			 * CPU, and it prevents life lock if the task setting
			 * MNT_WRITE_HOLD has a lower priority and is bound to
			 * the same CPU as the task that is spinning here.
			 */
			preempt_enable();
			lock_mount_hash();
			unlock_mount_hash();
			preempt_disable();
		}
	}
	/*
	 * After the slowpath clears MNT_WRITE_HOLD, mnt_is_readonly will
	 * be set to match its requirements. So we must not load that until