Commit b97efd5e authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull KCSAN updates from Paul McKenney.

* 'kcsan.2021.05.18a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  kcsan: Use URL link for pointing access-marking.txt
  kcsan: Document "value changed" line
  kcsan: Report observed value changes
  kcsan: Remove kcsan_report_type
  kcsan: Remove reporting indirection
  kcsan: Refactor access_info initialization
  kcsan: Fold panic() call into print_report()
  kcsan: Refactor passing watchpoint/other_info
  kcsan: Distinguish kcsan_report() calls
  kcsan: Simplify value change detection
  kcsan: Add pointer to access-marking.txt to data_race() bullet
parents a412897f 117232c0
Loading
Loading
Loading
Loading
+39 −54
Original line number Diff line number Diff line
@@ -27,75 +27,57 @@ Error reports
A typical data race report looks like this::

    ==================================================================
    BUG: KCSAN: data-race in generic_permission / kernfs_refresh_inode

    write to 0xffff8fee4c40700c of 4 bytes by task 175 on cpu 4:
     kernfs_refresh_inode+0x70/0x170
     kernfs_iop_permission+0x4f/0x90
     inode_permission+0x190/0x200
     link_path_walk.part.0+0x503/0x8e0
     path_lookupat.isra.0+0x69/0x4d0
     filename_lookup+0x136/0x280
     user_path_at_empty+0x47/0x60
     vfs_statx+0x9b/0x130
     __do_sys_newlstat+0x50/0xb0
     __x64_sys_newlstat+0x37/0x50
     do_syscall_64+0x85/0x260
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

    read to 0xffff8fee4c40700c of 4 bytes by task 166 on cpu 6:
     generic_permission+0x5b/0x2a0
     kernfs_iop_permission+0x66/0x90
     inode_permission+0x190/0x200
     link_path_walk.part.0+0x503/0x8e0
     path_lookupat.isra.0+0x69/0x4d0
     filename_lookup+0x136/0x280
     user_path_at_empty+0x47/0x60
     do_faccessat+0x11a/0x390
     __x64_sys_access+0x3c/0x50
     do_syscall_64+0x85/0x260
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    BUG: KCSAN: data-race in test_kernel_read / test_kernel_write

    write to 0xffffffffc009a628 of 8 bytes by task 487 on cpu 0:
     test_kernel_write+0x1d/0x30
     access_thread+0x89/0xd0
     kthread+0x23e/0x260
     ret_from_fork+0x22/0x30

    read to 0xffffffffc009a628 of 8 bytes by task 488 on cpu 6:
     test_kernel_read+0x10/0x20
     access_thread+0x89/0xd0
     kthread+0x23e/0x260
     ret_from_fork+0x22/0x30

    value changed: 0x00000000000009a6 -> 0x00000000000009b2

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 6 PID: 166 Comm: systemd-journal Not tainted 5.3.0-rc7+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    CPU: 6 PID: 488 Comm: access_thread Not tainted 5.12.0-rc2+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    ==================================================================

The header of the report provides a short summary of the functions involved in
the race. It is followed by the access types and stack traces of the 2 threads
involved in the data race.
involved in the data race. If KCSAN also observed a value change, the observed
old value and new value are shown on the "value changed" line respectively.

The other less common type of data race report looks like this::

    ==================================================================
    BUG: KCSAN: data-race in e1000_clean_rx_irq+0x551/0xb10

    race at unknown origin, with read to 0xffff933db8a2ae6c of 1 bytes by interrupt on cpu 0:
     e1000_clean_rx_irq+0x551/0xb10
     e1000_clean+0x533/0xda0
     net_rx_action+0x329/0x900
     __do_softirq+0xdb/0x2db
     irq_exit+0x9b/0xa0
     do_IRQ+0x9c/0xf0
     ret_from_intr+0x0/0x18
     default_idle+0x3f/0x220
     arch_cpu_idle+0x21/0x30
     do_idle+0x1df/0x230
     cpu_startup_entry+0x14/0x20
     rest_init+0xc5/0xcb
     arch_call_rest_init+0x13/0x2b
     start_kernel+0x6db/0x700
    BUG: KCSAN: data-race in test_kernel_rmw_array+0x71/0xd0

    race at unknown origin, with read to 0xffffffffc009bdb0 of 8 bytes by task 515 on cpu 2:
     test_kernel_rmw_array+0x71/0xd0
     access_thread+0x89/0xd0
     kthread+0x23e/0x260
     ret_from_fork+0x22/0x30

    value changed: 0x0000000000002328 -> 0x0000000000002329

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc7+ #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    CPU: 2 PID: 515 Comm: access_thread Not tainted 5.12.0-rc2+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    ==================================================================

This report is generated where it was not possible to determine the other
racing thread, but a race was inferred due to the data value of the watched
memory location having changed. These can occur either due to missing
instrumentation or e.g. DMA accesses. These reports will only be generated if
``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y`` (selected by default).
memory location having changed. These reports always show a "value changed"
line. A common reason for reports of this type are missing instrumentation in
the racing thread, but could also occur due to e.g. DMA accesses. Such reports
are shown only if ``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y``, which is
enabled by default.

Selective analysis
~~~~~~~~~~~~~~~~~~
@@ -106,7 +88,8 @@ the below options are available:

* KCSAN understands the ``data_race(expr)`` annotation, which tells KCSAN that
  any data races due to accesses in ``expr`` should be ignored and resulting
  behaviour when encountering a data race is deemed safe.
  behaviour when encountering a data race is deemed safe.  Please see
  `"Marking Shared-Memory Accesses" in the LKMM`_ for more information.

* Disabling data race detection for entire functions can be accomplished by
  using the function attribute ``__no_kcsan``::
@@ -128,6 +111,8 @@ the below options are available:

    KCSAN_SANITIZE := n

.. _"Marking Shared-Memory Accesses" in the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt

Furthermore, it is possible to tell KCSAN to show or hide entire classes of
data races, depending on preferences. These can be changed via the following
Kconfig options:
+21 −32
Original line number Diff line number Diff line
@@ -380,9 +380,7 @@ static noinline void kcsan_found_watchpoint(const volatile void *ptr,

	if (consumed) {
		kcsan_save_irqtrace(current);
		kcsan_report(ptr, size, type, KCSAN_VALUE_CHANGE_MAYBE,
			     KCSAN_REPORT_CONSUMED_WATCHPOINT,
			     watchpoint - watchpoints);
		kcsan_report_set_info(ptr, size, type, watchpoint - watchpoints);
		kcsan_restore_irqtrace(current);
	} else {
		/*
@@ -407,12 +405,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
	const bool is_write = (type & KCSAN_ACCESS_WRITE) != 0;
	const bool is_assert = (type & KCSAN_ACCESS_ASSERT) != 0;
	atomic_long_t *watchpoint;
	union {
		u8 _1;
		u16 _2;
		u32 _4;
		u64 _8;
	} expect_value;
	u64 old, new, diff;
	unsigned long access_mask;
	enum kcsan_value_change value_change = KCSAN_VALUE_CHANGE_MAYBE;
	unsigned long ua_flags = user_access_save();
@@ -468,19 +461,19 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
	 * Read the current value, to later check and infer a race if the data
	 * was modified via a non-instrumented access, e.g. from a device.
	 */
	expect_value._8 = 0;
	old = 0;
	switch (size) {
	case 1:
		expect_value._1 = READ_ONCE(*(const u8 *)ptr);
		old = READ_ONCE(*(const u8 *)ptr);
		break;
	case 2:
		expect_value._2 = READ_ONCE(*(const u16 *)ptr);
		old = READ_ONCE(*(const u16 *)ptr);
		break;
	case 4:
		expect_value._4 = READ_ONCE(*(const u32 *)ptr);
		old = READ_ONCE(*(const u32 *)ptr);
		break;
	case 8:
		expect_value._8 = READ_ONCE(*(const u64 *)ptr);
		old = READ_ONCE(*(const u64 *)ptr);
		break;
	default:
		break; /* ignore; we do not diff the values */
@@ -506,33 +499,30 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
	 * racy access.
	 */
	access_mask = get_ctx()->access_mask;
	new = 0;
	switch (size) {
	case 1:
		expect_value._1 ^= READ_ONCE(*(const u8 *)ptr);
		if (access_mask)
			expect_value._1 &= (u8)access_mask;
		new = READ_ONCE(*(const u8 *)ptr);
		break;
	case 2:
		expect_value._2 ^= READ_ONCE(*(const u16 *)ptr);
		if (access_mask)
			expect_value._2 &= (u16)access_mask;
		new = READ_ONCE(*(const u16 *)ptr);
		break;
	case 4:
		expect_value._4 ^= READ_ONCE(*(const u32 *)ptr);
		if (access_mask)
			expect_value._4 &= (u32)access_mask;
		new = READ_ONCE(*(const u32 *)ptr);
		break;
	case 8:
		expect_value._8 ^= READ_ONCE(*(const u64 *)ptr);
		if (access_mask)
			expect_value._8 &= (u64)access_mask;
		new = READ_ONCE(*(const u64 *)ptr);
		break;
	default:
		break; /* ignore; we do not diff the values */
	}

	diff = old ^ new;
	if (access_mask)
		diff &= access_mask;

	/* Were we able to observe a value-change? */
	if (expect_value._8 != 0)
	if (diff != 0)
		value_change = KCSAN_VALUE_CHANGE_TRUE;

	/* Check if this access raced with another. */
@@ -566,8 +556,9 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
		if (is_assert && value_change == KCSAN_VALUE_CHANGE_TRUE)
			atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_ASSERT_FAILURES]);

		kcsan_report(ptr, size, type, value_change, KCSAN_REPORT_RACE_SIGNAL,
			     watchpoint - watchpoints);
		kcsan_report_known_origin(ptr, size, type, value_change,
					  watchpoint - watchpoints,
					  old, new, access_mask);
	} else if (value_change == KCSAN_VALUE_CHANGE_TRUE) {
		/* Inferring a race, since the value should not have changed. */

@@ -576,9 +567,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
			atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_ASSERT_FAILURES]);

		if (IS_ENABLED(CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN) || is_assert)
			kcsan_report(ptr, size, type, KCSAN_VALUE_CHANGE_TRUE,
				     KCSAN_REPORT_RACE_UNKNOWN_ORIGIN,
				     watchpoint - watchpoints);
			kcsan_report_unknown_origin(ptr, size, type, old, new, access_mask);
	}

	/*
+18 −21
Original line number Diff line number Diff line
@@ -116,30 +116,27 @@ enum kcsan_value_change {
	KCSAN_VALUE_CHANGE_TRUE,
};

enum kcsan_report_type {
/*
	 * The thread that set up the watchpoint and briefly stalled was
	 * signalled that another thread triggered the watchpoint.
 * The calling thread hit and consumed a watchpoint: set the access information
 * to be consumed by the reporting thread. No report is printed yet.
 */
	KCSAN_REPORT_RACE_SIGNAL,
void kcsan_report_set_info(const volatile void *ptr, size_t size, int access_type,
			   int watchpoint_idx);

/*
	 * A thread found and consumed a matching watchpoint.
 * The calling thread observed that the watchpoint it set up was hit and
 * consumed: print the full report based on information set by the racing
 * thread.
 */
	KCSAN_REPORT_CONSUMED_WATCHPOINT,
void kcsan_report_known_origin(const volatile void *ptr, size_t size, int access_type,
			       enum kcsan_value_change value_change, int watchpoint_idx,
			       u64 old, u64 new, u64 mask);

/*
	 * No other thread was observed to race with the access, but the data
	 * value before and after the stall differs.
 * No other thread was observed to race with the access, but the data value
 * before and after the stall differs. Reports a race of "unknown origin".
 */
	KCSAN_REPORT_RACE_UNKNOWN_ORIGIN,
};

/*
 * Print a race report from thread that encountered the race.
 */
extern void kcsan_report(const volatile void *ptr, size_t size, int access_type,
			 enum kcsan_value_change value_change,
			 enum kcsan_report_type type, int watchpoint_idx);
void kcsan_report_unknown_origin(const volatile void *ptr, size_t size, int access_type,
				 u64 old, u64 new, u64 mask);

#endif /* _KERNEL_KCSAN_KCSAN_H */
+88 −81
Original line number Diff line number Diff line
@@ -325,13 +325,10 @@ static void print_verbose_info(struct task_struct *task)
	print_irqtrace_events(task);
}

/*
 * Returns true if a report was generated, false otherwise.
 */
static bool print_report(enum kcsan_value_change value_change,
			 enum kcsan_report_type type,
static void print_report(enum kcsan_value_change value_change,
			 const struct access_info *ai,
			 const struct other_info *other_info)
			 const struct other_info *other_info,
			 u64 old, u64 new, u64 mask)
{
	unsigned long stack_entries[NUM_STACK_ENTRIES] = { 0 };
	int num_stack_entries = stack_trace_save(stack_entries, NUM_STACK_ENTRIES, 1);
@@ -344,25 +341,24 @@ static bool print_report(enum kcsan_value_change value_change,
	 * Must check report filter rules before starting to print.
	 */
	if (skip_report(KCSAN_VALUE_CHANGE_TRUE, stack_entries[skipnr]))
		return false;
		return;

	if (type == KCSAN_REPORT_RACE_SIGNAL) {
	if (other_info) {
		other_skipnr = get_stack_skipnr(other_info->stack_entries,
						other_info->num_stack_entries);
		other_frame = other_info->stack_entries[other_skipnr];

		/* @value_change is only known for the other thread */
		if (skip_report(value_change, other_frame))
			return false;
			return;
	}

	if (rate_limit_report(this_frame, other_frame))
		return false;
		return;

	/* Print report header. */
	pr_err("==================================================================\n");
	switch (type) {
	case KCSAN_REPORT_RACE_SIGNAL: {
	if (other_info) {
		int cmp;

		/*
@@ -374,22 +370,15 @@ static bool print_report(enum kcsan_value_change value_change,
		       get_bug_type(ai->access_type | other_info->ai.access_type),
		       (void *)(cmp < 0 ? other_frame : this_frame),
		       (void *)(cmp < 0 ? this_frame : other_frame));
	} break;

	case KCSAN_REPORT_RACE_UNKNOWN_ORIGIN:
	} else {
		pr_err("BUG: KCSAN: %s in %pS\n", get_bug_type(ai->access_type),
		       (void *)this_frame);
		break;

	default:
		BUG();
	}

	pr_err("\n");

	/* Print information about the racing accesses. */
	switch (type) {
	case KCSAN_REPORT_RACE_SIGNAL:
	if (other_info) {
		pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n",
		       get_access_type(other_info->ai.access_type), other_info->ai.ptr,
		       other_info->ai.size, get_thread_desc(other_info->ai.task_pid),
@@ -407,16 +396,10 @@ static bool print_report(enum kcsan_value_change value_change,
		pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n",
		       get_access_type(ai->access_type), ai->ptr, ai->size,
		       get_thread_desc(ai->task_pid), ai->cpu_id);
		break;

	case KCSAN_REPORT_RACE_UNKNOWN_ORIGIN:
	} else {
		pr_err("race at unknown origin, with %s to 0x%px of %zu bytes by %s on cpu %i:\n",
		       get_access_type(ai->access_type), ai->ptr, ai->size,
		       get_thread_desc(ai->task_pid), ai->cpu_id);
		break;

	default:
		BUG();
	}
	/* Print stack trace of this thread. */
	stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr,
@@ -425,24 +408,41 @@ static bool print_report(enum kcsan_value_change value_change,
	if (IS_ENABLED(CONFIG_KCSAN_VERBOSE))
		print_verbose_info(current);

	/* Print observed value change. */
	if (ai->size <= 8) {
		int hex_len = ai->size * 2;
		u64 diff = old ^ new;

		if (mask)
			diff &= mask;
		if (diff) {
			pr_err("\n");
			pr_err("value changed: 0x%0*llx -> 0x%0*llx\n",
			       hex_len, old, hex_len, new);
			if (mask) {
				pr_err(" bits changed: 0x%0*llx with mask 0x%0*llx\n",
				       hex_len, diff, hex_len, mask);
			}
		}
	}

	/* Print report footer. */
	pr_err("\n");
	pr_err("Reported by Kernel Concurrency Sanitizer on:\n");
	dump_stack_print_info(KERN_DEFAULT);
	pr_err("==================================================================\n");

	return true;
	if (panic_on_warn)
		panic("panic_on_warn set ...\n");
}

static void release_report(unsigned long *flags, struct other_info *other_info)
{
	if (other_info)
	/*
		 * Use size to denote valid/invalid, since KCSAN entirely
		 * ignores 0-sized accesses.
	 * Use size to denote valid/invalid, since KCSAN entirely ignores
	 * 0-sized accesses.
	 */
	other_info->ai.size = 0;

	raw_spin_unlock_irqrestore(&report_lock, *flags);
}

@@ -575,48 +575,42 @@ static bool prepare_report_consumer(unsigned long *flags,
	return false;
}

/*
 * Depending on the report type either sets @other_info and returns false, or
 * awaits @other_info and returns true. If @other_info is not required for the
 * report type, simply acquires @report_lock and returns true.
 */
static noinline bool prepare_report(unsigned long *flags,
				    enum kcsan_report_type type,
				    const struct access_info *ai,
				    struct other_info *other_info)
static struct access_info prepare_access_info(const volatile void *ptr, size_t size,
					      int access_type)
{
	switch (type) {
	case KCSAN_REPORT_CONSUMED_WATCHPOINT:
		prepare_report_producer(flags, ai, other_info);
		return false;
	case KCSAN_REPORT_RACE_SIGNAL:
		return prepare_report_consumer(flags, ai, other_info);
	default:
		/* @other_info not required; just acquire @report_lock. */
		raw_spin_lock_irqsave(&report_lock, *flags);
		return true;
	}
}

void kcsan_report(const volatile void *ptr, size_t size, int access_type,
		  enum kcsan_value_change value_change,
		  enum kcsan_report_type type, int watchpoint_idx)
{
	unsigned long flags = 0;
	const struct access_info ai = {
	return (struct access_info) {
		.ptr		= ptr,
		.size		= size,
		.access_type	= access_type,
		.task_pid	= in_task() ? task_pid_nr(current) : -1,
		.cpu_id		= raw_smp_processor_id()
	};
	struct other_info *other_info = type == KCSAN_REPORT_RACE_UNKNOWN_ORIGIN
					? NULL : &other_infos[watchpoint_idx];
}

void kcsan_report_set_info(const volatile void *ptr, size_t size, int access_type,
			   int watchpoint_idx)
{
	const struct access_info ai = prepare_access_info(ptr, size, access_type);
	unsigned long flags;

	kcsan_disable_current();
	if (WARN_ON(watchpoint_idx < 0 || watchpoint_idx >= ARRAY_SIZE(other_infos)))
		goto out;
	lockdep_off(); /* See kcsan_report_known_origin(). */

	prepare_report_producer(&flags, &ai, &other_infos[watchpoint_idx]);

	lockdep_on();
	kcsan_enable_current();
}

void kcsan_report_known_origin(const volatile void *ptr, size_t size, int access_type,
			       enum kcsan_value_change value_change, int watchpoint_idx,
			       u64 old, u64 new, u64 mask)
{
	const struct access_info ai = prepare_access_info(ptr, size, access_type);
	struct other_info *other_info = &other_infos[watchpoint_idx];
	unsigned long flags = 0;

	kcsan_disable_current();
	/*
	 * Because we may generate reports when we're in scheduler code, the use
	 * of printk() could deadlock. Until such time that all printing code
@@ -626,22 +620,35 @@ void kcsan_report(const volatile void *ptr, size_t size, int access_type,
	 */
	lockdep_off();

	if (prepare_report(&flags, type, &ai, other_info)) {
	if (!prepare_report_consumer(&flags, &ai, other_info))
		goto out;
	/*
		 * Never report if value_change is FALSE, only if we it is
	 * Never report if value_change is FALSE, only when it is
	 * either TRUE or MAYBE. In case of MAYBE, further filtering may
	 * be done once we know the full stack trace in print_report().
	 */
		bool reported = value_change != KCSAN_VALUE_CHANGE_FALSE &&
				print_report(value_change, type, &ai, other_info);

		if (reported && panic_on_warn)
			panic("panic_on_warn set ...\n");
	if (value_change != KCSAN_VALUE_CHANGE_FALSE)
		print_report(value_change, &ai, other_info, old, new, mask);

	release_report(&flags, other_info);
out:
	lockdep_on();
	kcsan_enable_current();
}

void kcsan_report_unknown_origin(const volatile void *ptr, size_t size, int access_type,
				 u64 old, u64 new, u64 mask)
{
	const struct access_info ai = prepare_access_info(ptr, size, access_type);
	unsigned long flags;

	kcsan_disable_current();
	lockdep_off(); /* See kcsan_report_known_origin(). */

	raw_spin_lock_irqsave(&report_lock, flags);
	print_report(KCSAN_VALUE_CHANGE_TRUE, &ai, NULL, old, new, mask);
	raw_spin_unlock_irqrestore(&report_lock, flags);

	lockdep_on();
out:
	kcsan_enable_current();
}