Commit 288e21b6 authored by Will Deacon

Merge branch 'for-next/perf' into for-next/core

* for-next/perf:
  drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
  perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
  docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
  drivers/perf: hisi: add driver for HNS3 PMU
  drivers/perf: hisi: Add description for HNS3 PMU driver
  drivers/perf: riscv_pmu_sbi: perf format
  perf/arm-cci: Use the bitmap API to allocate bitmaps
  drivers/perf: riscv_pmu: Add riscv pmu pm notifier
  perf: hisi: Extract hisi_pmu_init
  perf/marvell_cn10k: Fix TAD PMU register offset
  perf/marvell_cn10k: Remove useless license text when SPDX-License-Identifier is already used
  arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
  perf/arm-cci: fix typo in comment
  drivers/perf:Directly use ida_alloc()/free()
  drivers/perf: Directly use ida_alloc()/free()
parents c436500d 92f2b8ba
+136 −0
======================================
HNS3 Performance Monitoring Unit (PMU)
======================================

The HNS3 (HiSilicon Network System 3) Performance Monitoring Unit (PMU) is
an End Point device that collects performance statistics of the HiSilicon
SoC NIC. On Hip09, each SICL (Super I/O Cluster) has one PMU device.

The HNS3 PMU collects performance statistics such as bandwidth, latency,
packet rate and interrupt rate.

Each HNS3 PMU supports 8 hardware events.

HNS3 PMU driver
===============

The HNS3 PMU driver registers a perf PMU named after its SICL id::

  /sys/devices/hns3_pmu_sicl_<sicl_id>

The PMU driver describes the available events, filter modes, formats,
identifier and cpumask in sysfs.

The "events" directory describes the event code of all supported events
shown in perf list.

The "filtermode" directory describes the supported filter modes of each
event.

The "format" directory describes all formats of the config (events) and
config1 (filter options) fields of the perf_event_attr structure.

The "identifier" file shows the version of the PMU hardware device.

The "bdf_min" and "bdf_max" files show the supported BDF range of each
PMU device.

The "hw_clk_freq" file shows the hardware clock frequency of each PMU
device.

Example usage of checking event code and subevent code::

  $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_time
  config=0x00204
  $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_packet_num
  config=0x10204

Each performance statistic is exposed as a pair of events. Userspace reads
both values and combines them to calculate the real performance data.

Bits 0~15 of config (here 0x0204) are the true hardware event code. Two
events that share the same value in bits 0~15 of config form an event
pair. Bit 16 of config selects whether counter 0 or counter 1 of the
hardware event is read.
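
The config layout described above can be sketched in C as follows. This is an illustrative userspace helper, not part of the driver: every name in it (``HNS3_EVENT_CODE_MASK``, ``hns3_event_code()``, ``hns3_parse_config()`` and so on) is hypothetical, and only the bit positions are taken from the text.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names; only the bit layout comes from the text above. */
#define HNS3_EVENT_CODE_MASK  0xFFFFu  /* bits 0~15: hardware event code */
#define HNS3_COUNTER_SEL_BIT  16       /* bit 16: counter 0 or counter 1 */

/* Extract the hardware event code (bits 0~15) from a config value. */
static inline uint32_t hns3_event_code(uint32_t config)
{
	return config & HNS3_EVENT_CODE_MASK;
}

/* Extract the counter selector (bit 16): 0 or 1. */
static inline unsigned int hns3_counter_index(uint32_t config)
{
	return (config >> HNS3_COUNTER_SEL_BIT) & 1u;
}

/* Two events form a pair when they share bits 0~15 and differ in bit 16. */
static inline int hns3_is_event_pair(uint32_t a, uint32_t b)
{
	return hns3_event_code(a) == hns3_event_code(b) &&
	       hns3_counter_index(a) != hns3_counter_index(b);
}

/* Parse a sysfs line such as "config=0x00204"; returns 0 on success. */
static int hns3_parse_config(const char *line, uint32_t *config)
{
	unsigned int value;

	if (sscanf(line, "config=%x", &value) != 1)
		return -1;
	*config = value;
	return 0;
}
```

With the two sysfs values shown earlier, ``0x00204`` and ``0x10204`` decode to the same event code ``0x0204`` with counter index 0 and 1 respectively, so they are reported as a pair.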

After getting the two values of an event pair in userspace, the real
performance data is calculated as::

  counter 0 / counter 1
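
A minimal sketch of that division in C is shown here. The function name ``hns3_pair_ratio()`` is hypothetical, and returning 0.0 when counter 1 is zero is a choice made for this example only; the source text does not say how an empty interval should be reported.

```c
#include <stdint.h>

/*
 * Divide counter 0 by counter 1 of an event pair.
 * Returns 0.0 when counter 1 is zero (an assumption of this sketch,
 * chosen to avoid a division by zero on an idle interval).
 */
static double hns3_pair_ratio(uint64_t counter0, uint64_t counter1)
{
	if (counter1 == 0)
		return 0.0;
	return (double)counter0 / (double)counter1;
}
```

For example, the values read for ``dly_tx_normal_to_mac_time`` (counter 0) and ``dly_tx_normal_to_mac_packet_num`` (counter 1) would be passed as the first and second argument respectively.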

Example usage of checking supported filter mode::

  $# cat /sys/devices/hns3_pmu_sicl_0/filtermode/bw_ssu_rpu_byte_num
  filter mode supported: global/port/port-tc/func/func-queue/

Example usage of perf::

  $# perf list
  hns3_pmu_sicl_0/bw_ssu_rpu_byte_num/ [kernel PMU event]
  hns3_pmu_sicl_0/bw_ssu_rpu_time/     [kernel PMU event]
  ------------------------------------------

  $# perf stat -g -e hns3_pmu_sicl_0/bw_ssu_rpu_byte_num,global=1/ -e hns3_pmu_sicl_0/bw_ssu_rpu_time,global=1/ -I 1000
  or
  $# perf stat -g -e hns3_pmu_sicl_0/config=0x00002,global=1/ -e hns3_pmu_sicl_0/config=0x10002,global=1/ -I 1000


Filter modes
--------------

1. global mode
The PMU collects performance statistics for all HNS3 PCIe functions of the
IO die. Setting the "global" filter option to 1 enables this mode.
Example usage of perf::

  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,global=1/ -I 1000

2. port mode
The PMU collects performance statistics of one whole physical port. The
port id is the same as the mac id. The "tc" filter option must be set to
0xF in this mode; here tc stands for traffic class.

Example usage of perf::

  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0xF/ -I 1000

3. port-tc mode
The PMU collects performance statistics of one tc of a physical port. The
port id is the same as the mac id. The "tc" filter option must be set to a
value in the range 0 ~ 7 in this mode.
Example usage of perf::

  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0/ -I 1000

4. func mode
The PMU collects performance statistics of one PF/VF. The function id is
derived from the BDF of the PF/VF with the following conversion formula::

  func = (bus << 8) + (device << 3) + (function)

For example::

  BDF         func
  35:00.0    0x3500
  35:00.1    0x3501
  35:01.0    0x3508

In this mode, the "queue" filter option must be set to 0xFFFF.
Example usage of perf::

  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0xFFFF/ -I 1000
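
The BDF conversion formula given for func mode can be sketched in C as below. The function name ``hns3_bdf_to_func()`` is hypothetical. Masking the device number to 5 bits and the function number to 3 bits is an assumption based on the usual PCI BDF field widths; the formula in the text does not state it. The result is the value passed as the ``bdf`` filter option in the perf examples.

```c
#include <stdint.h>

/*
 * func = (bus << 8) + (device << 3) + function
 *
 * The masks assume standard PCI field widths (5-bit device, 3-bit
 * function); they are not spelled out in the formula itself.
 */
static uint16_t hns3_bdf_to_func(uint8_t bus, uint8_t device, uint8_t function)
{
	return (uint16_t)(((uint16_t)bus << 8) +
			  ((device & 0x1Fu) << 3) +
			  (function & 0x7u));
}
```

Calling it with the three BDFs from the table (35:00.0, 35:00.1 and 35:01.0) reproduces the listed values.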

5. func-queue mode
The PMU collects performance statistics of one queue of a PF/VF. The
function id is the BDF of the PF/VF, and the "queue" filter option must be
set to the exact queue id of the function.
Example usage of perf::

  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0/ -I 1000

6. func-intr mode
The PMU collects performance statistics of one interrupt of a PF/VF. The
function id is the BDF of the PF/VF, and the "intr" filter option must be
set to the exact interrupt id of the function.
Example usage of perf::

  $# perf stat -a -e hns3_pmu_sicl_0/config=0x00301,bdf=0x3500,intr=0/ -I 1000
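
The filter modes above differ only in which filter options accompany ``config``. The following C sketch builds the event string passed to ``perf stat -e`` for each mode. All helper and macro names are hypothetical; the option names (``global``, ``port``, ``tc``, ``bdf``, ``queue``, ``intr``) and the reserved values 0xF and 0xFFFF are taken from the text. Numeric filter values other than ``port`` and ``intr`` are printed in hexadecimal here, which is a formatting choice of the sketch.

```c
#include <stdint.h>
#include <stdio.h>

#define HNS3_TC_ALL    0xFu     /* port mode: "tc" must be 0xF */
#define HNS3_QUEUE_ALL 0xFFFFu  /* func mode: "queue" must be 0xFFFF */

/* global mode: all HNS3 PCIe functions of the IO die. */
static int hns3_spec_global(char *buf, size_t len, unsigned int sicl,
			    uint32_t config)
{
	return snprintf(buf, len, "hns3_pmu_sicl_%u/config=0x%05X,global=1/",
			sicl, (unsigned int)config);
}

/* port mode (tc == HNS3_TC_ALL) or port-tc mode (tc in 0..7). */
static int hns3_spec_port(char *buf, size_t len, unsigned int sicl,
			  uint32_t config, unsigned int port, unsigned int tc)
{
	return snprintf(buf, len,
			"hns3_pmu_sicl_%u/config=0x%05X,port=%u,tc=0x%X/",
			sicl, (unsigned int)config, port, tc);
}

/* func mode (queue == HNS3_QUEUE_ALL) or func-queue mode (exact queue id). */
static int hns3_spec_func(char *buf, size_t len, unsigned int sicl,
			  uint32_t config, unsigned int bdf, unsigned int queue)
{
	return snprintf(buf, len,
			"hns3_pmu_sicl_%u/config=0x%05X,bdf=0x%04X,queue=0x%X/",
			sicl, (unsigned int)config, bdf, queue);
}

/* func-intr mode: one interrupt of a PF/VF. */
static int hns3_spec_intr(char *buf, size_t len, unsigned int sicl,
			  uint32_t config, unsigned int bdf, unsigned int intr)
{
	return snprintf(buf, len,
			"hns3_pmu_sicl_%u/config=0x%05X,bdf=0x%04X,intr=%u/",
			sicl, (unsigned int)config, bdf, intr);
}
```

Keeping the two reserved values as named constants makes the difference between port and port-tc mode, and between func and func-queue mode, visible at the call site.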
+1 −0
@@ -9,6 +9,7 @@ Performance monitor support

   hisi-pmu
   hisi-pcie-pmu
   hns3-pmu
   imx-ddr
   qcom_l2_pmu
   qcom_l3_pmu
+6 −0
@@ -8944,6 +8944,12 @@ F: Documentation/admin-guide/perf/hisi-pcie-pmu.rst
F:	Documentation/admin-guide/perf/hisi-pmu.rst
F:	drivers/perf/hisilicon
HISILICON HNS3 PMU DRIVER
M:	Guangbin Huang <huangguangbin2@huawei.com>
S:	Supported
F:	Documentation/admin-guide/perf/hns3-pmu.rst
F:	drivers/perf/hisilicon/hns3_pmu.c
HISILICON QM AND ZIP Controller DRIVER
M:	Zhou Wang <wangzhou1@hisilicon.com>
L:	linux-crypto@vger.kernel.org
+1 −1
@@ -562,7 +562,7 @@ static const struct arm64_ftr_bits ftr_id_pfr2[] = {

static const struct arm64_ftr_bits ftr_id_dfr0[] = {
	/* [31:28] TraceFilt */
-	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_PERFMON_SHIFT, 4, 0xf),
+	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_DFR0_PERFMON_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MPROFDBG_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MMAPTRC_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_COPTRC_SHIFT, 4, 0),
+5 −6
@@ -1139,7 +1139,7 @@ static void cci_pmu_start(struct perf_event *event, int pmu_flags)

	/*
	 * To handle interrupt latency, we always reprogram the period
-	 * regardlesss of PERF_EF_RELOAD.
+	 * regardless of PERF_EF_RELOAD.
	 */
	if (pmu_flags & PERF_EF_RELOAD)
		WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
@@ -1261,7 +1261,7 @@ static int validate_group(struct perf_event *event)
		 */
		.used_mask = mask,
	};
-	memset(mask, 0, BITS_TO_LONGS(cci_pmu->num_cntrs) * sizeof(unsigned long));
+	bitmap_zero(mask, cci_pmu->num_cntrs);

	if (!validate_event(event->pmu, &fake_pmu, leader))
		return -EINVAL;
@@ -1629,9 +1629,8 @@ static struct cci_pmu *cci_pmu_alloc(struct device *dev)
					     GFP_KERNEL);
	if (!cci_pmu->hw_events.events)
		return ERR_PTR(-ENOMEM);
-	cci_pmu->hw_events.used_mask = devm_kcalloc(dev,
-						BITS_TO_LONGS(CCI_PMU_MAX_HW_CNTRS(model)),
-						sizeof(*cci_pmu->hw_events.used_mask),
-						GFP_KERNEL);
+	cci_pmu->hw_events.used_mask = devm_bitmap_zalloc(dev,
+							  CCI_PMU_MAX_HW_CNTRS(model),
+							  GFP_KERNEL);
	if (!cci_pmu->hw_events.used_mask)
		return ERR_PTR(-ENOMEM);
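
The bitmap conversions in this hunk replace open-coded size arithmetic with helpers that take a bit count. The userspace sketch below shows why the two forms are equivalent. The definitions here are stand-ins written for illustration that reuse the kernel names (``BITS_TO_LONGS``, ``bitmap_zero``, ``bitmap_zalloc``); they are not the kernel implementations, and the ``devm_`` managed-allocation aspect is not modelled.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins for the kernel macros and helpers of the same name. */
#define BITS_PER_LONG     (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n)  (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Clear every long that backs an nbits-wide bitmap. */
static void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
	memset(dst, 0, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
}

/* Allocate a zeroed bitmap large enough for nbits bits. */
static unsigned long *bitmap_zalloc(unsigned int nbits)
{
	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
}
```

The caller now states only how many bits it needs, and the rounding up to whole ``unsigned long`` words lives in one place.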