Commit 0227a749 authored by Zhang Zekun's avatar Zhang Zekun

iommu/iova: increase the iova_rcache depot max size to 128

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ASVH


CVE: NA

---------------------------------------

In a fio test with iodepth=256 and allowed CPUs 0-255, we observe a
severe performance decrease. The cache hit rate statistics are
relatively low. Here are some statistics about the iova_cpu_rcache of
all cpus:

iova alloc order		0	1	2	3	4	5
----------------------------------------------------------------------
average cpu_rcache hit rate	0.9941	0.7408	0.8109	0.8854	0.9082	0.8887

Jobs: 12 (f=12): [R(12)][20.0%][r=1091MiB/s][r=279k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][22.2%][r=1426MiB/s][r=365k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][25.0%][r=1607MiB/s][r=411k IOPS][eta 00m:27s]
Jobs: 12 (f=12): [R(12)][27.8%][r=1501MiB/s][r=384k IOPS][eta 00m:26s]
Jobs: 12 (f=12): [R(12)][30.6%][r=1486MiB/s][r=380k IOPS][eta 00m:25s]
Jobs: 12 (f=12): [R(12)][33.3%][r=1393MiB/s][r=357k IOPS][eta 00m:24s]
Jobs: 12 (f=12): [R(12)][36.1%][r=1550MiB/s][r=397k IOPS][eta 00m:23s]
Jobs: 12 (f=12): [R(12)][38.9%][r=1485MiB/s][r=380k IOPS][eta 00m:22s]

The underlying hisi_sas driver has 16 threaded IRQs to free iovas, but
these IRQ callbacks will only free iovas on 16 specific cpus (cpu{0,
16,32,...,240}). For example, the threaded IRQ whose SMP affinity is 0-15
will only free iovas on cpu 0. However, the driver allocates iovas on all
cpus (cpu{0-255}), so cpus without free iovas in their local cpu_rcache
need to get free iovas from iova_rcache->depot. The current max size of
iova_rcache->depot is 32, which seems to be too small for 256
users (16 cpus put iovas into iova_rcache->depot and 240 cpus
try to get iovas from it). Increasing the iova_rcache->depot max size to
128 fixes the performance issue, and performance returns to normal.

iova alloc order		0	1	2	3	4	5
----------------------------------------------------------------------
average cpu_rcache hit rate	0.9925	0.9736	0.9789	0.9867	0.9889	0.9906

Jobs: 12 (f=12): [R(12)][12.9%][r=7526MiB/s][r=1927k IOPS][eta 04m:30s]
Jobs: 12 (f=12): [R(12)][13.2%][r=7527MiB/s][r=1927k IOPS][eta 04m:29s]
Jobs: 12 (f=12): [R(12)][13.5%][r=7529MiB/s][r=1927k IOPS][eta 04m:28s]
Jobs: 12 (f=12): [R(12)][13.9%][r=7531MiB/s][r=1928k IOPS][eta 04m:27s]
Jobs: 12 (f=12): [R(12)][14.2%][r=7529MiB/s][r=1928k IOPS][eta 04m:26s]
Jobs: 12 (f=12): [R(12)][14.5%][r=7528MiB/s][r=1927k IOPS][eta 04m:25s]
Jobs: 12 (f=12): [R(12)][14.8%][r=7527MiB/s][r=1927k IOPS][eta 04m:24s]
Jobs: 12 (f=12): [R(12)][15.2%][r=7525MiB/s][r=1926k IOPS][eta 04m:23s]

Signed-off-by: default avatarZhang Zekun <zhangzekun11@huawei.com>
parent 7ac36644