workqueue: Introduce struct wq_node_nr_active
[ Upstream commit 91ccc6e7 ] Currently, for both percpu and unbound workqueues, max_active applies per-cpu, which is a recent change for unbound workqueues. The change for unbound workqueues was a significant departure from the previous behavior of per-node application. It made some use cases create undesirable number of concurrent work items and left no good way of fixing them. To address the problem, workqueue is implementing a NUMA node segmented global nr_active mechanism, which will be explained further in the next patch. As a preparation, this patch introduces struct wq_node_nr_active. It's a data structured allocated for each workqueue and NUMA node pair and currently only tracks the workqueue's number of active work items on the node. This is split out from the next patch to make it easier to understand and review. Note that there is an extra wq_node_nr_active allocated for the invalid node nr_node_ids which is used to track nr_active for pools which don't have NUMA node associated such as the default fallback system-wide pool. This doesn't cause any behavior changes visible to userland yet. The next patch will expand to implement the control mechanism on top. v4: - Fixed out-of-bound access when freeing per-cpu workqueues. v3: - Use flexible array for wq->node_nr_active as suggested by Lai. v2: - wq->max_active now uses WRITE/READ_ONCE() as suggested by Lai. - Lai pointed out that pwq_tryinc_nr_active() incorrectly dropped pwq->max_active check. Restored. As the next patch replaces the max_active enforcement mechanism, this doesn't change the end result. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Stable-dep-of: 5797b1c1 ("workqueue: Implement system-wide nr_active enforcement for unbound workqueues") Signed-off-by: Sasha Levin <sashal@kernel.org>