Commit 287ff289 authored by openeuler-ci-bot, committed by Gitee

!11628 [OLK-5.10] Add description for HiSilicon PCIe PMU driver; some updates for HiSilicon PCIe PMU

Merge Pull Request from: @zhangqizhi3 
 
HiSilicon's HIP09 platform supports PCIe PMU RCiEP devices. Document the driver to provide guidance on how to use it.
This series also includes several updates to HiSilicon's PCIe PMU driver:
- PATCH 1/3 fixes the incorrect count reported when the user specifies an unsupported event.
- PATCH 2/3 fixes bandwidth counting of TLP headers, which previously did not take effect.
- PATCH 3/3 exports the Root Port BDF range supported by the PMU.
 
Link:https://gitee.com/openeuler/kernel/pulls/11628

 

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
parents 634b8845 cf6e8fcf
+148 −0
================================================
HiSilicon PCIe Performance Monitoring Unit (PMU)
================================================

On Hip09, the HiSilicon PCIe Performance Monitoring Unit (PMU) can monitor
the bandwidth, latency, bus utilization and buffer occupancy of PCIe.

Each PCIe Core has a PMU that monitors the Root Ports of that PCIe Core and
all Endpoints downstream of these Root Ports.


HiSilicon PCIe PMU driver
=========================

The PCIe PMU driver registers a perf PMU with the name of its sicl-id and PCIe
Core id.::

  /sys/bus/event_source/devices/hisi_pcie<sicl>_core<core>

The PMU driver describes the available events and filter options in sysfs,
see /sys/bus/event_source/devices/hisi_pcie<sicl>_core<core>.

The "format" directory describes all formats of the config (events) and config1
(filter options) fields of the perf_event_attr structure. The "events" directory
describes all documented events shown in perf list.

The "identifier" sysfs file allows users to identify the version of the
PMU hardware device.

The "bus" sysfs file allows users to get the bus number of Root Ports
monitored by PMU. Furthermore users can get the Root Ports range in
[bdf_min, bdf_max] from "bdf_min" and "bdf_max" sysfs attributes
respectively.
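
As a rough illustration of how a tool might consume these attributes, the
sketch below parses the text of a "bdf_min" or "bdf_max" attribute and checks
whether a BDF falls inside the reported range. The helper names are
hypothetical, and hexadecimal input is assumed because the driver prints the
values with "%#04x".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse sysfs attribute text such as "0x3900\n" into a 16-bit value.
 * Hypothetical helper: base 16 is assumed because the driver emits the
 * attribute with "%#04x". Returns true on success.
 */
bool parse_hex_attr(const char *text, uint16_t *out)
{
    char *end;
    unsigned long val;

    assert(text != NULL && out != NULL);
    errno = 0;
    val = strtoul(text, &end, 16);
    if (errno != 0 || end == text || val > 0xffff)
        return false;
    /* Accept only an optional trailing newline after the number. */
    if (*end != '\0' && *end != '\n')
        return false;
    *out = (uint16_t)val;
    return true;
}

/* Check whether a BDF lies in the closed range [bdf_min, bdf_max]. */
bool bdf_in_range(uint16_t bdf, uint16_t bdf_min, uint16_t bdf_max)
{
    return bdf >= bdf_min && bdf <= bdf_max;
}
```

A caller would read the two attribute files into buffers, pass each buffer to
parse_hex_attr(), and then use bdf_in_range() to decide whether a Root Port
belongs to this PMU.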

Example usage of perf::

  $# perf list
  hisi_pcie0_core0/rx_mwr_latency/ [kernel PMU event]
  hisi_pcie0_core0/rx_mwr_cnt/ [kernel PMU event]
  ------------------------------------------

  $# perf stat -e hisi_pcie0_core0/rx_mwr_latency,port=0xffff/
  $# perf stat -e hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/

Related events are usually used together to calculate bandwidth, latency or
other metrics. They need to start and stop counting at the same time, so
related events are best placed in the same event group to get the expected
value. There are two ways to tell whether events are related:

a) By event name, such as the latency events "xxx_latency, xxx_cnt" or
   bandwidth events "xxx_flux, xxx_time".
b) By event type, such as "event=0xXXXX, event=0x1XXXX".

Example usage of perf group::

  $# perf stat -e "{hisi_pcie0_core0/rx_mwr_latency,port=0xffff/,hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/}"
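
Once both counters of a group have been read, a derived value can be computed
from the pair. The sketch below assumes, without confirmation from this
document, that the first event of a pair accumulates a total (for example
rx_mwr_latency) and the second the number of samples (rx_mwr_cnt), so that
their ratio is an average. The function name is hypothetical and the unit of
the result is not specified here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average derived from a pair of related counters, e.g.
 * rx_mwr_latency / rx_mwr_cnt.
 * Assumption: "total" is an accumulated sum and "count" is the number of
 * samples. Returns 0.0 when nothing was counted.
 */
double paired_average(uint64_t total, uint64_t count)
{
    if (count == 0)
        return 0.0;
    return (double)total / (double)count;
}
```

Reading both counters from the same group matters because the ratio is only
meaningful when the two events covered the same time window.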

The current driver does not support sampling, so "perf record" is unsupported.
Attaching to a task is also unsupported for the PCIe PMU.

Filter options
--------------

1. Target filter

   The PMU can only monitor the performance of traffic downstream of the
   target Root Ports or the target Endpoint. The PCIe PMU driver supports the
   "port" and "bdf" interfaces for this purpose.
   Note that one of these two interfaces must be set, and they cannot be used
   at the same time. If both are set, only the "port" filter takes effect.
   If the "port" filter is not set or is explicitly set to zero (the default),
   the "bdf" filter takes effect, because "bdf=0" means 0000:00:00.0.

   - port

     The "port" filter can be used with all PCIe PMU events. The target Root
     Port is selected by configuring the 16-bit bitmap "port". Multiple ports
     can be selected for AP-layer events, but only one port can be selected
     for TL/DL-layer events.

     For example, if target Root Port is 0000:00:00.0 (x8 lanes), bit0 of
     bitmap should be set, port=0x1; if target Root Port is 0000:00:04.0 (x4
     lanes), bit8 is set, port=0x100; if these two Root Ports are both
     monitored, port=0x101.

     Example usage of perf::

       $# perf stat -e hisi_pcie0_core0/rx_mwr_latency,port=0x1/ sleep 5

   - bdf

     The "bdf" filter can only be used with bandwidth events. The target
     Endpoint is selected by writing its BDF to "bdf". The counter only counts
     the bandwidth of messages requested by the target Endpoint.

     For example, "bdf=0x3900" means BDF of target Endpoint is 0000:39:00.0.

     Example usage of perf::

       $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,bdf=0x3900/ sleep 5
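
   The two target filter values above can be computed rather than typed by
   hand. The following is a minimal sketch with hypothetical helper names: it
   builds the "port" bitmap from bit positions, checks the one-port rule for
   TL/DL-layer events, and packs a bus/device/function triple into "bdf"
   using the usual PCI encoding (bus in bits 15:8, device in bits 7:3,
   function in bits 2:0). Which bit corresponds to which Root Port is
   platform specific and is not derived here.

   ```c
   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Build the 16-bit "port" bitmap from a list of bit positions (0..15). */
   uint16_t port_bitmap(const unsigned int *bits, size_t n)
   {
       uint16_t mask = 0;

       for (size_t i = 0; i < n; i++) {
           assert(bits[i] < 16);
           mask |= (uint16_t)(1u << bits[i]);
       }
       return mask;
   }

   /* TL/DL-layer events take one port only, i.e. exactly one bit set. */
   bool is_single_port(uint16_t mask)
   {
       return mask != 0 && (mask & (mask - 1)) == 0;
   }

   /* Pack bus/device/function into the 16-bit value passed as "bdf". */
   uint16_t bdf_encode(unsigned int bus, unsigned int dev, unsigned int fn)
   {
       assert(bus < 256 && dev < 32 && fn < 8);
       return (uint16_t)((bus << 8) | (dev << 3) | fn);
   }
   ```

   With bit positions 0 and 8 this yields port=0x101, and bus 0x39, device 0,
   function 0 yields bdf=0x3900, matching the examples above.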

2. Trigger filter

   Event statistics start the first time the TLP length is greater or smaller
   than the trigger condition. You can set the trigger condition by writing
   "trig_len", and set the trigger mode by writing "trig_mode". This filter can
   only be used with bandwidth events.

   For example, "trig_len=4" means the trigger condition is 2^4 DW,
   "trig_mode=0" means statistics start when TLP length > trigger condition,
   and "trig_mode=1" means they start when TLP length < trigger condition.

   Example usage of perf::

     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,trig_len=0x4,trig_mode=1/ sleep 5
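
   To make the trigger semantics concrete, the sketch below models them in
   software: given a sequence of TLP lengths in DW, it reports the index of
   the first TLP that satisfies the trigger. This is only an illustration
   with hypothetical names. Whether the triggering TLP itself is counted is
   not stated above, so the sketch does not decide that, and the bound on
   "trig_len" merely keeps the shift well defined rather than reflecting the
   hardware field width.

   ```c
   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Trigger condition in DW for a given "trig_len": 2^trig_len. */
   uint32_t trig_condition_dw(unsigned int trig_len)
   {
       assert(trig_len < 32);
       return UINT32_C(1) << trig_len;
   }

   /*
    * Index of the first TLP that satisfies the trigger, or -1 if none does.
    * trig_mode 0: length > condition; trig_mode 1: length < condition.
    */
   long first_trigger_index(const uint32_t *tlp_len_dw, size_t n,
                            unsigned int trig_len, unsigned int trig_mode)
   {
       uint32_t cond = trig_condition_dw(trig_len);

       for (size_t i = 0; i < n; i++) {
           if (trig_mode == 0 ? tlp_len_dw[i] > cond : tlp_len_dw[i] < cond)
               return (long)i;
       }
       return -1;
   }
   ```

   Note that a TLP whose length equals the condition fires neither mode, since
   both comparisons are strict.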

3. Threshold filter

   The counter counts when the TLP length is within the specified range. You
   can set the threshold by writing "thr_len", and set the threshold mode by
   writing "thr_mode". This filter can only be used with bandwidth events.

   For example, "thr_len=4" means the threshold is 2^4 DW, "thr_mode=0" means
   the counter counts when TLP length >= threshold, and "thr_mode=1" means it
   counts when TLP length < threshold.

   Example usage of perf::

     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,thr_len=0x4,thr_mode=1/ sleep 5
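
   Unlike the trigger, the threshold is applied to each TLP. A minimal
   software model of the rule, with a hypothetical function name, might look
   like this. Since one DW is 4 bytes, "thr_len=4" corresponds to a
   threshold of 16 DW or 64 bytes. The bound on "thr_len" only keeps the
   shift well defined.

   ```c
   #include <assert.h>
   #include <stdbool.h>
   #include <stdint.h>

   /*
    * Would a TLP of tlp_len_dw DW be counted under the threshold filter?
    * thr_mode 0: length >= 2^thr_len; thr_mode 1: length < 2^thr_len.
    */
   bool thr_counts(uint32_t tlp_len_dw, unsigned int thr_len,
                   unsigned int thr_mode)
   {
       uint32_t thr;

       assert(thr_len < 32);
       thr = UINT32_C(1) << thr_len;
       return thr_mode == 0 ? tlp_len_dw >= thr : tlp_len_dw < thr;
   }
   ```

   The two modes are complementary: every TLP length is counted by exactly
   one of them for a given "thr_len".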

4. TLP Length filter

   When counting bandwidth, the data can be composed of certain parts of TLP
   packets. You can specify it through "len_mode":

   - 2'b00: Reserved (Do not use this since the behaviour is undefined)
   - 2'b01: Bandwidth of TLP payloads
   - 2'b10: Bandwidth of TLP headers
   - 2'b11: Bandwidth of both TLP payloads and headers

   For example, "len_mode=2" means only counting the bandwidth of TLP headers
   and "len_mode=3" means the final bandwidth data is composed of both TLP
   headers and payloads. Default value if not specified is 2'b11.

   Example usage of perf::

     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,len_mode=0x1/ sleep 5
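
The "len_mode" values can be read as a two-bit mask, with bit 0 selecting
payloads and bit 1 selecting headers. The sketch below encodes that reading
with hypothetical names. It also assumes that an unspecified "len_mode"
arrives as 0 and is replaced by the documented default 2'b11. That mapping is
an assumption of this sketch, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>

enum len_mode {
    LEN_MODE_RESERVED = 0, /* 2'b00: behaviour undefined, do not use */
    LEN_MODE_PAYLOAD  = 1, /* 2'b01: TLP payloads only */
    LEN_MODE_HEADER   = 2, /* 2'b10: TLP headers only */
    LEN_MODE_BOTH     = 3, /* 2'b11: payloads and headers (default) */
};

/* Bit 0 selects payload bytes. */
bool len_mode_counts_payload(enum len_mode m)
{
    return (m & LEN_MODE_PAYLOAD) != 0;
}

/* Bit 1 selects header bytes. */
bool len_mode_counts_header(enum len_mode m)
{
    return (m & LEN_MODE_HEADER) != 0;
}

/* Assumption: an unset field reads as 0 and falls back to the default. */
enum len_mode len_mode_or_default(unsigned int user)
{
    assert(user <= 3);
    return user == 0 ? LEN_MODE_BOTH : (enum len_mode)user;
}
```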
+1 −0
@@ -8,6 +8,7 @@ Performance monitor support
   :maxdepth: 1

   hisi-pmu
   hisi-pcie-pmu
   hns3-pmu
   imx-ddr
   qcom_l2_pmu
+33 −1
@@ -158,6 +158,22 @@ static ssize_t bus_show(struct device *dev, struct device_attribute *attr, char
}
static DEVICE_ATTR_RO(bus);

static ssize_t bdf_min_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev));

	return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_min);
}
static DEVICE_ATTR_RO(bdf_min);

static ssize_t bdf_max_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev));

	return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_max);
}
static DEVICE_ATTR_RO(bdf_max);

static struct hisi_pcie_reg_pair
hisi_pcie_parse_reg_value(struct hisi_pcie_pmu *pcie_pmu, u32 reg_off)
{
@@ -225,7 +241,7 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
static u64 hisi_pcie_pmu_get_event_ctrl_val(struct perf_event *event)
{
	u64 port, trig_len, thr_len, len_mode;
-	u64 reg = HISI_PCIE_INIT_SET;
+	u64 reg = 0;

	/* Config HISI_PCIE_EVENT_CTRL according to event. */
	reg |= FIELD_PREP(HISI_PCIE_EVENT_M, hisi_pcie_get_real_event(event));
@@ -469,10 +485,24 @@ static void hisi_pcie_pmu_set_period(struct perf_event *event)
	struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
	struct hw_perf_event *hwc = &event->hw;
	int idx = hwc->idx;
	u64 orig_cnt, cnt;

	orig_cnt = hisi_pcie_pmu_read_counter(event);

	local64_set(&hwc->prev_count, HISI_PCIE_INIT_VAL);
	hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_CNT, idx, HISI_PCIE_INIT_VAL);
	hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EXT_CNT, idx, HISI_PCIE_INIT_VAL);

	/*
	 * The counter maybe unwritable if the target event is unsupported.
	 * Check this by comparing the counts after setting the period. If
	 * the counts stay unchanged after setting the period then update
	 * the hwc->prev_count correctly. Otherwise the final counts user
	 * get maybe totally wrong.
	 */
	cnt = hisi_pcie_pmu_read_counter(event);
	if (orig_cnt == cnt)
		local64_set(&hwc->prev_count, cnt);
}

static void hisi_pcie_pmu_enable_counter(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc)
@@ -758,6 +788,8 @@ static const struct attribute_group hisi_pcie_pmu_format_group = {

static struct attribute *hisi_pcie_pmu_bus_attrs[] = {
	&dev_attr_bus.attr,
	&dev_attr_bdf_max.attr,
	&dev_attr_bdf_min.attr,
	NULL
};
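
The effect of the read-back check added to hisi_pcie_pmu_set_period() can be
illustrated with a small user-space model. This is a sketch under stated
assumptions, not driver code: the counter struct and the sketch_* functions
are hypothetical, the initial value of 1 << 63 is assumed rather than taken
from the driver, and the delta is modelled as a plain unsigned difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; the real HISI_PCIE_INIT_VAL is defined in the driver. */
#define SKETCH_INIT_VAL (UINT64_C(1) << 63)

struct sketch_counter {
    uint64_t hw;   /* simulated hardware counter */
    bool writable; /* false models an unsupported event */
    uint64_t prev; /* stands in for hwc->prev_count */
};

/* A write is silently ignored when the counter is not writable. */
void sketch_write(struct sketch_counter *c, uint64_t val)
{
    if (c->writable)
        c->hw = val;
}

/* Mirrors the patched flow: write, read back, fix prev if unchanged. */
void sketch_set_period(struct sketch_counter *c)
{
    uint64_t orig = c->hw;

    c->prev = SKETCH_INIT_VAL;
    sketch_write(c, SKETCH_INIT_VAL);
    if (c->hw == orig)
        c->prev = c->hw;
}

/* Events counted since the last set_period (unsigned wrap-around). */
uint64_t sketch_delta(const struct sketch_counter *c)
{
    return c->hw - c->prev;
}
```

In this model a non-writable counter stuck at 0 reports a delta of 0. Without
the read-back, prev would stay at the assumed initial value and the same
subtraction would yield 2^63, which is the kind of wrong count the patch
comment describes.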