  1. Aug 30, 2019
  2. Aug 29, 2019
    • blkcg: fix missing free on error path of blk_iocost_init() · 3532e722
      Tejun Heo authored
      blk_iocost_init() forgot to free its percpu stat on the error path.
      Fix it.
      
      Fixes: 7caa4715 ("blkcg: implement blk-iocost")
      Reported-by: Hillf Danton <hdanton@sina.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: blk-iocost: predeclare used structs · 8d1c1560
      Stephen Rothwell authored
      Fixes: 7caa4715 ("blkcg: implement blk-iocost")
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: add tools/cgroup/iocost_coef_gen.py · 8504dea7
      Tejun Heo authored
      
      
      Add a script which can be used to generate device-specific iocost
      linear model coefficients.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: add tools/cgroup/iocost_monitor.py · 6954ff18
      Tejun Heo authored
      
      
      Instead of mucking with debugfs and ->pd_stat(), add a drgn based
      monitoring script.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: implement blk-iocost · 7caa4715
      Tejun Heo authored
      
      
      This patchset implements a work-conserving proportional controller
      based on an IO cost model.
      
      While io.latency provides the capability to comprehensively prioritize
      and protect IOs depending on the cgroups, its protection is binary -
      the lowest latency target cgroup which is suffering is protected at
      the cost of all others.  In many use cases including stacking multiple
      workload containers in a single system, it's necessary to distribute
      IO capacity with better granularity.
      
      One challenge of controlling IO resources is the lack of a trivially
      observable cost metric.  The most common metrics - bandwidth and iops
      - can be off by orders of magnitude depending on the device type and
      IO pattern.  However, the cost isn't a complete mystery.  Given
      several key attributes, we can make fairly reliable predictions on how
      expensive a given stream of IOs would be, at least compared to other
      IO patterns.
      
      The function which determines the cost of a given IO is the IO cost
      model for the device.  This controller distributes IO capacity based
      on the costs estimated by such a model.  The more accurate the cost
      model the better, but the controller adapts based on IO completion
      latency and, as long as the relative costs across different IO
      patterns are consistent and sensible, it'll adapt to the actual
      performance of the device.
      
      Currently, the only implemented cost model is a simple linear one with
      a few sets of default parameters for different classes of device.
      This covers most common devices reasonably well.  All the
      infrastructure to tune and add different cost models is already in
      place and a later patch will also allow using bpf progs for cost
      models.
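
As a rough illustration of what a simple linear cost model can look like, the sketch below charges each IO a fixed base cost that depends on whether it is sequential or random, plus a cost proportional to its size in pages. The class name, parameter names, page size, and numbers are hypothetical and chosen for illustration only; the actual model and its default parameters are described in blk-iocost.c.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass(frozen=True)
class LinearCostModel:
    """Hypothetical linear IO cost model (illustrative, not the kernel's)."""
    seq_io_cost: int   # assumed base cost of one sequential IO
    rand_io_cost: int  # assumed base cost of one random IO
    page_cost: int     # assumed additional cost per page transferred

    def cost(self, nbytes: int, sequential: bool) -> int:
        """Return the abstract cost of a single IO of nbytes bytes."""
        base = self.seq_io_cost if sequential else self.rand_io_cost
        pages = -(-nbytes // PAGE_SIZE)  # ceiling division
        return base + pages * self.page_cost


def stream_cost(model, ios):
    """Sum the cost of a stream of (nbytes, sequential) IOs."""
    return sum(model.cost(nbytes, seq) for nbytes, seq in ios)
```

With parameters such as `LinearCostModel(10, 100, 2)`, four random 4 KiB IOs cost far more than one sequential 16 KiB IO moving the same number of bytes. Relative differences of this kind between IO patterns are what such a model is meant to capture.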
      
      Please see the top comment in blk-iocost.c and documentation for
      more details.
      
      v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix
          for a divide-by-zero bug in current_hweight() triggered by zero
          inuse_sum.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andy Newell <newella@fb.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blk-mq: add optional request->alloc_time_ns · 6f816b4b
      Tejun Heo authored
      
      
      There are currently two start time timestamps - start_time_ns and
      io_start_time_ns.  The former marks the request allocation and the
      latter the issue-to-device time.  The planned io.weight controller
      needs to measure the total time bios take to execute after they leave
      rq_qos, including the time spent waiting for a request to become
      available, which can easily dominate on saturated devices.
      
      This patch adds request->alloc_time_ns which records when the request
      allocation attempt started.  As it isn't used for the usual stats,
      make it optional behind CONFIG_BLK_RQ_ALLOC_TIME and
      QUEUE_FLAG_RQ_ALLOC_TIME so that it can be compiled out when there are
      no users and it's active only on queues which need it even when
      compiled in.
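
A minimal sketch of how the three timestamps relate, with alloc_time_ns modelled as optional. Only the three timestamp names come from the text above; the `RequestTimes` class, the helper functions, the `complete_ns` argument, and the fallback to start_time_ns are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestTimes:
    start_time_ns: int                   # request allocated
    io_start_time_ns: int                # request issued to the device
    alloc_time_ns: Optional[int] = None  # allocation attempt started;
                                         # None models "compiled out or
                                         # not enabled on this queue"


def alloc_wait_ns(t: RequestTimes) -> Optional[int]:
    """Time spent waiting for a request to become available, if recorded."""
    if t.alloc_time_ns is None:
        return None
    return t.start_time_ns - t.alloc_time_ns


def total_time_ns(t: RequestTimes, complete_ns: int) -> int:
    """Total time up to completion, counted from the earliest known stamp.

    Falling back to start_time_ns is an assumption of this sketch.
    """
    begin = t.alloc_time_ns if t.alloc_time_ns is not None else t.start_time_ns
    return complete_ns - begin
```

On a saturated device the value returned by `alloc_wait_ns()` can be the largest part of the total, and it is the part that the two existing timestamps alone cannot show.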
      
      v2: s/pre_start_time/alloc_time/ and add CONFIG_BLK_RQ_ALLOC_TIME
          gating as suggested by Jens.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ · beab17fc
      Tejun Heo authored
      
      
      io.weight is gonna be another rq_qos cgroup mechanism.  Let's rename
      RQ_QOS_CGROUP which is being used by io.latency to RQ_QOS_LATENCY in
      preparation.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block/rq_qos: implement rq_qos_ops->queue_depth_changed() · 9677a3e0
      Tejun Heo authored
      
      
      wbt already gets queue depth changed notification through
      wbt_set_queue_depth().  Generalize it into
      rq_qos_ops->queue_depth_changed() so that other rq_qos policies can
      easily hook into the events too.
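
The generalization can be pictured as an ops table with an optional callback that is invoked for every registered policy. The Python sketch below only mirrors the shape of that idea; the `RqQosOps` and `Queue` classes and the `set_queue_depth()` function are hypothetical stand-ins, not the kernel's rq_qos code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RqQosOps:
    """Hypothetical ops table; hooks a policy does not implement stay None."""
    name: str
    queue_depth_changed: Optional[Callable[[int], None]] = None


@dataclass
class Queue:
    depth: int
    policies: List[RqQosOps] = field(default_factory=list)


def set_queue_depth(q: Queue, depth: int) -> None:
    """Update the depth, then notify each policy that implements the hook."""
    q.depth = depth
    for ops in q.policies:
        if ops.queue_depth_changed is not None:
            ops.queue_depth_changed(depth)
```

Compared with calling one policy's function directly, the caller here does not need to know which policies care about depth changes; a policy opts in by filling in the hook.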
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block/rq_qos: add rq_qos_merge() · d3e65fff
      Tejun Heo authored
      
      
      Add a merge hook for rq_qos.  This will be used by io.weight.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: separate blkcg_conf_get_disk() out of blkg_conf_prep() · 015d254c
      Tejun Heo authored
      
      
      Separate out blkcg_conf_get_disk() so that it can be used by blkcg
      policy interface file input parsers before the policy is actually
      enabled.  This doesn't introduce any functional changes.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: make ->cpd_init_fn() optional · 86a5bba5
      Tejun Heo authored
      
      
      For policies which can do enough initialization from ->cpd_alloc_fn(),
      make ->cpd_init_fn() optional.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>