Commit f02bc975, authored by Daejun Park and committed by Martin K. Petersen

scsi: ufs: ufshpb: Introduce Host Performance Buffer feature

Implement Host Performance Buffer (HPB) initialization and add function
calls to UFS core driver.

NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of I/O requests to the corresponding physical
addresses of the flash storage.  In UFS, the logical-to-physical (L2P)
map data, which is required to identify the physical address for a
requested I/O, can only be partially loaded from NAND flash into SRAM. Due
to this partial loading, accessing a flash address whose L2P information
is not loaded in SRAM can result in serious performance degradation.

The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both the physical block address (PBA) and the logical block
address (LBA) can be delivered in an HPB read command. An HPB read command
can read data faster than a regular UFS read command because it provides
the physical address (HPB Entry) of the desired logical block in addition
to its logical address. The UFS device can then access the physical block
in NAND directly, without searching for and loading the L2P mapping table.
This improves read performance because the NAND read operation for loading
the L2P mapping table is removed.
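
The caching idea described above can be sketched in a few lines of C. This
is only an illustration under simplifying assumptions, not the driver's
implementation: the flat one-entry-per-LBA table and every name prefixed
with hpb_sketch_ or SKETCH_ are hypothetical and do not appear in this
commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, tiny host-side cache: one slot per logical block. */
#define HPB_SKETCH_ENTRIES 8

struct hpb_sketch_cache {
	uint64_t ppn[HPB_SKETCH_ENTRIES];   /* cached physical addresses (HPB entries) */
	bool     valid[HPB_SKETCH_ENTRIES]; /* true if the L2P entry is cached on the host */
};

enum hpb_sketch_cmd { SKETCH_READ_REGULAR, SKETCH_READ_HPB };

/* Store an L2P entry on the host; out-of-range LBAs are ignored. */
void hpb_sketch_insert(struct hpb_sketch_cache *c, uint64_t lba, uint64_t ppn)
{
	assert(c != NULL);
	if (lba < HPB_SKETCH_ENTRIES) {
		c->ppn[lba] = ppn;
		c->valid[lba] = true;
	}
}

/*
 * Decide which read command to send. On a cache hit the physical address
 * is returned through ppn_out so it can travel with the LBA; on a miss
 * the caller falls back to a regular read and the device translates.
 */
enum hpb_sketch_cmd hpb_sketch_prepare_read(const struct hpb_sketch_cache *c,
					    uint64_t lba, uint64_t *ppn_out)
{
	assert(c != NULL && ppn_out != NULL);
	if (lba < HPB_SKETCH_ENTRIES && c->valid[lba]) {
		*ppn_out = c->ppn[lba];
		return SKETCH_READ_HPB;
	}
	return SKETCH_READ_REGULAR;
}
```

A caller would zero-initialize the cache, insert entries as map data
becomes available, and call hpb_sketch_prepare_read() for each read. A
flat array is used here only to keep the sketch short.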

In HPB initialization, the host checks whether the UFS device supports the
HPB feature and retrieves the related device capabilities. Then, the HPB
parameters are configured in the device.

We measured the total start-up time of popular applications and observed
the difference between HPB being enabled and disabled. The popular
applications are 12 game apps and 24 non-game apps. Each test cycle
consists of running the 36 applications in sequence. We repeated the cycle
to observe the performance improvement from L2P mapping cache hits in HPB.

The following is the test environment:

 - kernel version: 4.4.0
 - RAM: 8GB
 - UFS 2.1 (64GB)

Results:

   +-------+----------+----------+-------+
   | cycle | baseline | with HPB | diff  |
   +-------+----------+----------+-------+
   | 1     | 272.4    | 264.9    | -7.5  |
   | 2     | 250.4    | 248.2    | -2.2  |
   | 3     | 226.2    | 215.6    | -10.6 |
   | 4     | 230.6    | 214.8    | -15.8 |
   | 5     | 232.0    | 218.1    | -13.9 |
   | 6     | 231.9    | 212.6    | -19.3 |
   +-------+----------+----------+-------+

We also measured HPB performance using iozone:

   $ iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16 -s $IO_RANGE/16 -F \
   mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5 mnt/tmp_6 mnt/tmp_7 \
   mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12 mnt/tmp_13 \
   mnt/tmp_14 mnt/tmp_15 mnt/tmp_16

Results:

   +----------+--------+---------+
   | IO range | HPB on | HPB off |
   +----------+--------+---------+
   |   1 GB   | 294.8  | 300.87  |
   |   4 GB   | 293.51 | 179.35  |
   |   8 GB   | 294.85 | 162.52  |
   |  16 GB   | 293.45 | 156.26  |
   |  32 GB   | 277.4  | 153.25  |
   +----------+--------+---------+

Link: https://lore.kernel.org/r/20210712085830epcms2p8c1288b7f7a81b044158a18232617b572@epcms2p8


Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
parent 33529018
+127 −0
@@ -1298,3 +1298,130 @@ Description: This node is used to set or display whether UFS WriteBooster is
		(if the platform supports UFSHCD_CAP_CLK_SCALING). For a
		platform that doesn't support UFSHCD_CAP_CLK_SCALING, we can
		disable/enable WriteBooster through this sysfs node.

What:		/sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the HPB specification version.
		The full information about the descriptor can be found in the UFS
		HPB (Host Performance Booster) Extension specifications.
		Example: version 1.2.3 = 0123h

		The file is read only.

What:		/sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_control
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows an indication of the HPB control mode.
		00h: Host control mode
		01h: Device control mode

		The file is read only.

What:		/sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/hpb_region_size
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the bHPBRegionSize which can be calculated
		as in the following (in bytes):
		HPB Region size = 512B * 2^bHPBRegionSize

		The file is read only.

What:		/sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/hpb_number_lu
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the maximum number of HPB LUs supported by
		the device.
		00h: HPB is not supported by the device.
		01h ~ 20h: Maximum number of HPB LU supported by the device

		The file is read only.

What:		/sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/hpb_subregion_size
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the bHPBSubRegionSize, which can be
		calculated as in the following (in bytes) and shall be a multiple of
		logical block size:
		HPB Sub-Region size = 512B x 2^bHPBSubRegionSize
		bHPBSubRegionSize shall not exceed bHPBRegionSize.

		The file is read only.

What:		/sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/hpb_max_active_regions
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the maximum number of active HPB regions that
		is supported by the device.

		The file is read only.

What:		/sys/class/scsi_device/*/device/unit_descriptor/hpb_lu_max_active_regions
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the maximum number of HPB regions assigned to
		the HPB logical unit.

		The file is read only.

What:		/sys/class/scsi_device/*/device/unit_descriptor/hpb_pinned_region_start_offset
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the start offset of HPB pinned region.

		The file is read only.

What:		/sys/class/scsi_device/*/device/unit_descriptor/hpb_number_pinned_regions
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of HPB pinned regions assigned to
		the HPB logical unit.

		The file is read only.

What:		/sys/class/scsi_device/*/device/hpb_stats/hit_cnt
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of reads that were changed to HPB reads.

		The file is read only.

What:		/sys/class/scsi_device/*/device/hpb_stats/miss_cnt
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of reads that cannot be changed to
		HPB read.

		The file is read only.

What:		/sys/class/scsi_device/*/device/hpb_stats/rb_noti_cnt
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of response UPIUs that have
		recommendations for activating sub-regions and/or inactivating regions.

		The file is read only.

What:		/sys/class/scsi_device/*/device/hpb_stats/rb_active_cnt
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of active sub-regions recommended by
		response UPIUs.

		The file is read only.

What:		/sys/class/scsi_device/*/device/hpb_stats/rb_inactive_cnt
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of inactive regions recommended by
		response UPIUs.

		The file is read only.

What:		/sys/class/scsi_device/*/device/hpb_stats/map_req_cnt
Date:		June 2021
Contact:	Daejun Park <daejun7.park@samsung.com>
Description:	This entry shows the number of read buffer commands for
		activating sub-regions recommended by response UPIUs.

		The file is read only.
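
The size formulas and counters documented above lend themselves to a
short worked example. The C sketch below is illustrative only: the helper
names are hypothetical, the version field layout is inferred solely from
the "version 1.2.3 = 0123h" example, and the hit ratio is a derived
metric that the driver does not export itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size in bytes for a bHPBRegionSize or bHPBSubRegionSize value,
 * following the documented formula: size = 512B * 2^exponent.
 */
uint64_t hpb_sketch_size_bytes(uint8_t exponent)
{
	assert(exponent < 55); /* keep 512 << exponent within 64 bits */
	return 512ULL << exponent;
}

struct hpb_sketch_version {
	unsigned int major;
	unsigned int minor;
	unsigned int suffix;
};

/*
 * Split hpb_version into its parts. Assumed layout (from the example
 * 0123h = version 1.2.3): upper byte is major, then one nibble each
 * for minor and suffix.
 */
struct hpb_sketch_version hpb_sketch_decode_version(uint16_t raw)
{
	struct hpb_sketch_version v = {
		.major  = (raw >> 8) & 0xff,
		.minor  = (raw >> 4) & 0x0f,
		.suffix = raw & 0x0f,
	};
	return v;
}

/* Fraction of reads issued as HPB reads, from hit_cnt and miss_cnt. */
double hpb_sketch_hit_ratio(uint64_t hit_cnt, uint64_t miss_cnt)
{
	uint64_t total = hit_cnt + miss_cnt;

	return total ? (double)hit_cnt / (double)total : 0.0;
}
```

With a hypothetical bHPBRegionSize of 15, for instance, the region size
works out to 512 B * 2^15 = 16 MiB.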
+9 −0
@@ -183,3 +183,12 @@ config SCSI_UFS_CRYPTO
	  Enabling this makes it possible for the kernel to use the crypto
	  capabilities of the UFS device (if present) to perform crypto
	  operations on data being transferred to/from the device.

config SCSI_UFS_HPB
	bool "Support UFS Host Performance Booster"
	depends on SCSI_UFSHCD
	help
	  The UFS HPB feature improves random read performance. It caches the
	  L2P (logical to physical) map of the UFS device in host DRAM. The driver
	  issues HPB read commands that piggyback the physical page number to bypass
	  the L2P address translation of the FTL (flash translation layer).
+1 −0
@@ -8,6 +8,7 @@ ufshcd-core-y += ufshcd.o ufs-sysfs.o
ufshcd-core-$(CONFIG_DEBUG_FS)		+= ufs-debugfs.o
ufshcd-core-$(CONFIG_SCSI_UFS_BSG)	+= ufs_bsg.o
ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO)	+= ufshcd-crypto.o
ufshcd-core-$(CONFIG_SCSI_UFS_HPB)	+= ufshpb.o

obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o tc-dwc-g210.o
obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o ufshcd-dwc.o tc-dwc-g210.o
+18 −0
@@ -604,6 +604,8 @@ UFS_DEVICE_DESC_PARAM(device_version, _DEV_VER, 2);
UFS_DEVICE_DESC_PARAM(number_of_secure_wpa, _NUM_SEC_WPA, 1);
UFS_DEVICE_DESC_PARAM(psa_max_data_size, _PSA_MAX_DATA, 4);
UFS_DEVICE_DESC_PARAM(psa_state_timeout, _PSA_TMT, 1);
UFS_DEVICE_DESC_PARAM(hpb_version, _HPB_VER, 2);
UFS_DEVICE_DESC_PARAM(hpb_control, _HPB_CONTROL, 1);
UFS_DEVICE_DESC_PARAM(ext_feature_sup, _EXT_UFS_FEATURE_SUP, 4);
UFS_DEVICE_DESC_PARAM(wb_presv_us_en, _WB_PRESRV_USRSPC_EN, 1);
UFS_DEVICE_DESC_PARAM(wb_type, _WB_TYPE, 1);
@@ -636,6 +638,8 @@ static struct attribute *ufs_sysfs_device_descriptor[] = {
	&dev_attr_number_of_secure_wpa.attr,
	&dev_attr_psa_max_data_size.attr,
	&dev_attr_psa_state_timeout.attr,
	&dev_attr_hpb_version.attr,
	&dev_attr_hpb_control.attr,
	&dev_attr_ext_feature_sup.attr,
	&dev_attr_wb_presv_us_en.attr,
	&dev_attr_wb_type.attr,
@@ -709,6 +713,10 @@ UFS_GEOMETRY_DESC_PARAM(enh4_memory_max_alloc_units,
	_ENM4_MAX_NUM_UNITS, 4);
UFS_GEOMETRY_DESC_PARAM(enh4_memory_capacity_adjustment_factor,
	_ENM4_CAP_ADJ_FCTR, 2);
UFS_GEOMETRY_DESC_PARAM(hpb_region_size, _HPB_REGION_SIZE, 1);
UFS_GEOMETRY_DESC_PARAM(hpb_number_lu, _HPB_NUMBER_LU, 1);
UFS_GEOMETRY_DESC_PARAM(hpb_subregion_size, _HPB_SUBREGION_SIZE, 1);
UFS_GEOMETRY_DESC_PARAM(hpb_max_active_regions, _HPB_MAX_ACTIVE_REGS, 2);
UFS_GEOMETRY_DESC_PARAM(wb_max_alloc_units, _WB_MAX_ALLOC_UNITS, 4);
UFS_GEOMETRY_DESC_PARAM(wb_max_wb_luns, _WB_MAX_WB_LUNS, 1);
UFS_GEOMETRY_DESC_PARAM(wb_buff_cap_adj, _WB_BUFF_CAP_ADJ, 1);
@@ -746,6 +754,10 @@ static struct attribute *ufs_sysfs_geometry_descriptor[] = {
	&dev_attr_enh3_memory_capacity_adjustment_factor.attr,
	&dev_attr_enh4_memory_max_alloc_units.attr,
	&dev_attr_enh4_memory_capacity_adjustment_factor.attr,
	&dev_attr_hpb_region_size.attr,
	&dev_attr_hpb_number_lu.attr,
	&dev_attr_hpb_subregion_size.attr,
	&dev_attr_hpb_max_active_regions.attr,
	&dev_attr_wb_max_alloc_units.attr,
	&dev_attr_wb_max_wb_luns.attr,
	&dev_attr_wb_buff_cap_adj.attr,
@@ -1160,6 +1172,9 @@ UFS_UNIT_DESC_PARAM(provisioning_type, _PROVISIONING_TYPE, 1);
UFS_UNIT_DESC_PARAM(physical_memory_resourse_count, _PHY_MEM_RSRC_CNT, 8);
UFS_UNIT_DESC_PARAM(context_capabilities, _CTX_CAPABILITIES, 2);
UFS_UNIT_DESC_PARAM(large_unit_granularity, _LARGE_UNIT_SIZE_M1, 1);
UFS_UNIT_DESC_PARAM(hpb_lu_max_active_regions, _HPB_LU_MAX_ACTIVE_RGNS, 2);
UFS_UNIT_DESC_PARAM(hpb_pinned_region_start_offset, _HPB_PIN_RGN_START_OFF, 2);
UFS_UNIT_DESC_PARAM(hpb_number_pinned_regions, _HPB_NUM_PIN_RGNS, 2);
UFS_UNIT_DESC_PARAM(wb_buf_alloc_units, _WB_BUF_ALLOC_UNITS, 4);


@@ -1177,6 +1192,9 @@ static struct attribute *ufs_sysfs_unit_descriptor[] = {
	&dev_attr_physical_memory_resourse_count.attr,
	&dev_attr_context_capabilities.attr,
	&dev_attr_large_unit_granularity.attr,
	&dev_attr_hpb_lu_max_active_regions.attr,
	&dev_attr_hpb_pinned_region_start_offset.attr,
	&dev_attr_hpb_number_pinned_regions.attr,
	&dev_attr_wb_buf_alloc_units.attr,
	NULL,
};
+15 −0
@@ -122,6 +122,7 @@ enum flag_idn {
	QUERY_FLAG_IDN_WB_EN                            = 0x0E,
	QUERY_FLAG_IDN_WB_BUFF_FLUSH_EN                 = 0x0F,
	QUERY_FLAG_IDN_WB_BUFF_FLUSH_DURING_HIBERN8     = 0x10,
	QUERY_FLAG_IDN_HPB_RESET                        = 0x11,
};

/* Attribute idn for Query requests */
@@ -195,6 +196,9 @@ enum unit_desc_param {
	UNIT_DESC_PARAM_PHY_MEM_RSRC_CNT	= 0x18,
	UNIT_DESC_PARAM_CTX_CAPABILITIES	= 0x20,
	UNIT_DESC_PARAM_LARGE_UNIT_SIZE_M1	= 0x22,
	UNIT_DESC_PARAM_HPB_LU_MAX_ACTIVE_RGNS	= 0x23,
	UNIT_DESC_PARAM_HPB_PIN_RGN_START_OFF	= 0x25,
	UNIT_DESC_PARAM_HPB_NUM_PIN_RGNS	= 0x27,
	UNIT_DESC_PARAM_WB_BUF_ALLOC_UNITS	= 0x29,
};

@@ -235,6 +239,8 @@ enum device_desc_param {
	DEVICE_DESC_PARAM_PSA_MAX_DATA		= 0x25,
	DEVICE_DESC_PARAM_PSA_TMT		= 0x29,
	DEVICE_DESC_PARAM_PRDCT_REV		= 0x2A,
	DEVICE_DESC_PARAM_HPB_VER		= 0x40,
	DEVICE_DESC_PARAM_HPB_CONTROL		= 0x42,
	DEVICE_DESC_PARAM_EXT_UFS_FEATURE_SUP	= 0x4F,
	DEVICE_DESC_PARAM_WB_PRESRV_USRSPC_EN	= 0x53,
	DEVICE_DESC_PARAM_WB_TYPE		= 0x54,
@@ -283,6 +289,10 @@ enum geometry_desc_param {
	GEOMETRY_DESC_PARAM_ENM4_MAX_NUM_UNITS	= 0x3E,
	GEOMETRY_DESC_PARAM_ENM4_CAP_ADJ_FCTR	= 0x42,
	GEOMETRY_DESC_PARAM_OPT_LOG_BLK_SIZE	= 0x44,
	GEOMETRY_DESC_PARAM_HPB_REGION_SIZE	= 0x48,
	GEOMETRY_DESC_PARAM_HPB_NUMBER_LU	= 0x49,
	GEOMETRY_DESC_PARAM_HPB_SUBREGION_SIZE	= 0x4A,
	GEOMETRY_DESC_PARAM_HPB_MAX_ACTIVE_REGS	= 0x4B,
	GEOMETRY_DESC_PARAM_WB_MAX_ALLOC_UNITS	= 0x4F,
	GEOMETRY_DESC_PARAM_WB_MAX_WB_LUNS	= 0x53,
	GEOMETRY_DESC_PARAM_WB_BUFF_CAP_ADJ	= 0x54,
@@ -327,8 +337,10 @@ enum {

/* Possible values for dExtendedUFSFeaturesSupport */
enum {
	UFS_DEV_HPB_SUPPORT		= BIT(7),
	UFS_DEV_WRITE_BOOSTER_SUP	= BIT(8),
};
#define UFS_DEV_HPB_SUPPORT_VERSION		0x310

#define POWER_DESC_MAX_ACTV_ICC_LVLS		16

@@ -544,6 +556,9 @@ struct ufs_dev_info {
	u16	wspecversion;
	u32	clk_gating_wait_us;

	/* UFS HPB related flag */
	bool	hpb_enabled;

	/* UFS WB related flags */
	bool    wb_enabled;
	bool    wb_buf_flush_enabled;