  1. Sep 29, 2023
      Merge tag 'md-next-20230927' of... · 03f7b57a
      Jens Axboe authored
      Merge tag 'md-next-20230927' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.7/block
      
      Pull MD updates from Song:
      
      "1. Make rdev add/remove independent from daemon thread, by Yu Kuai;
       2. Refactor code around quiesce() and mddev_suspend(), by Yu Kuai."
      
      * tag 'md-next-20230927' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md: replace deprecated strncpy with memcpy
        md/md-linear: Annotate struct linear_conf with __counted_by
        md: don't check 'mddev->pers' and 'pers->quiesce' from suspend_lo_store()
        md: don't check 'mddev->pers' from suspend_hi_store()
        md-bitmap: suspend array earlier in location_store()
        md-bitmap: remove the checking of 'pers->quiesce' from location_store()
        md: don't rely on 'mddev->pers' to be set in mddev_suspend()
        md: initialize 'writes_pending' while allocating mddev
        md: initialize 'active_io' while allocating mddev
        md: delay remove_and_add_spares() for read only array to md_start_sync()
        md: factor out a helper rdev_addable() from remove_and_add_spares()
        md: factor out a helper rdev_is_spare() from remove_and_add_spares()
        md: factor out a helper rdev_removeable() from remove_and_add_spares()
        md: delay choosing sync action to md_start_sync()
        md: factor out a helper to choose sync action from md_check_recovery()
        md: use separate work_struct for md_start_sync()
  2. Sep 26, 2023
      badblocks: switch to the improved badblock handling code · aa511ff8
      Coly Li authored
      
      
       This patch removes the old code of badblocks_set(), badblocks_clear()
       and badblocks_check(), and makes them wrappers that call
       _badblocks_set(), _badblocks_clear() and _badblocks_check().
       
       With this change the badblocks handling switches to the improved
       algorithms in _badblocks_set(), _badblocks_clear() and
       _badblocks_check().
       
       This patch only contains the deletion of the old code; the newly
       added code for the improved algorithms is in the previous patches.
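       
       For illustration, a minimal sketch of the wrapper shape this change
       leaves behind (hedged: the signature follows include/linux/badblocks.h
       at the time of this series, but the body is illustrative rather than
       the literal diff):
       
               int badblocks_set(struct badblocks *bb, sector_t s, int sectors,
                                 int acknowledged)
               {
                       /* all the real work now happens in _badblocks_set() */
                       return _badblocks_set(bb, s, sectors, acknowledged);
               }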
      
       Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Geliang Tang <geliang.tang@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
      Cc: Xiao Ni <xni@redhat.com>
       Reviewed-by: Xiao Ni <xni@redhat.com>
       Acked-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/20230811170513.2300-7-colyli@suse.de
      
      
       Signed-off-by: Jens Axboe <axboe@kernel.dk>
      badblocks: improve badblocks_check() for multiple ranges handling · 3ea3354c
      Coly Li authored
      
      
       This patch rewrites badblocks_check() in a similar coding style to
       _badblocks_set() and _badblocks_clear(). The difference is that bad
       blocks checking may now handle multiple ranges in the bad block table.
      
       If a checking range covers multiple bad block ranges in the bad block
       table, as in the following condition (C is the checking range; E1, E2
       and E3 are three bad block ranges in the bad block table),
        +------------------------------------+
        |                C                   |
        +------------------------------------+
          +----+      +----+      +----+
          | E1 |      | E2 |      | E3 |
          +----+      +----+      +----+
       The improved badblocks_check() algorithm divides checking range C
       into multiple parts and handles them in 7 runs of a while-loop,
        +--+ +----+ +----+ +----+ +----+ +----+ +----+
        |C1| | C2 | | C3 | | C4 | | C5 | | C6 | | C7 |
        +--+ +----+ +----+ +----+ +----+ +----+ +----+
             +----+        +----+        +----+
             | E1 |        | E2 |        | E3 |
             +----+        +----+        +----+
      And the start LBA and length of range E1 will be set as first_bad and
      bad_sectors for the caller.
      
       The return value rule is consistent across multiple ranges. For
       example, if the bad block table contains the following ranges,
         Index No.     Start        Len         Ack
             0          400          20          1
             1          500          50          1
             2          650          20          0
       the return value, first_bad and bad_sectors from calling
       badblocks_check() with different checking ranges can be the following
       values,
          Checking Start, Len     Return Value   first_bad    bad_sectors
                     100, 100          0           N/A           N/A
                     100, 310          1           400           10
                     100, 440          1           400           10
                     100, 540          1           400           10
                     100, 600         -1           400           10
                     100, 800         -1           400           10
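       
       As a concrete illustration of the second row above (hedged: bb is
       assumed to already hold the three ranges from the table),
       
               sector_t first_bad;
               int bad_sectors, ret;
       
               /* table holds (400, 20, ack), (500, 50, ack), (650, 20, unack);
                * check 310 sectors starting at LBA 100 */
               ret = badblocks_check(bb, 100, 310, &first_bad, &bad_sectors);
               /* ret == 1: only acknowledged bad blocks are hit;
                * first_bad == 400, bad_sectors == 10 (overlapped head of E1) */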
      
       In order to make code review easier, this patch names the improved
       bad block range checking routine _badblocks_check() and does not
       change the existing badblocks_check() code yet. A later patch will
       delete the old code of badblocks_check() and make it a wrapper that
       calls _badblocks_check(). This keeps the newly added code separate
       from the old deleted code, making the review clearer and easier.
      
       Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Geliang Tang <geliang.tang@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
      Cc: Xiao Ni <xni@redhat.com>
       Reviewed-by: Xiao Ni <xni@redhat.com>
       Acked-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/20230811170513.2300-6-colyli@suse.de
      
      
       Signed-off-by: Jens Axboe <axboe@kernel.dk>
      badblocks: improve badblocks_clear() for multiple ranges handling · db448eb6
      Coly Li authored
      
      
       With the fundamental ideas and helper routines from the
       badblocks_set() improvement, clearing bad blocks for multiple ranges
       is much simpler.
       
       Following a similar idea from the badblocks_set() improvement, this
       patch simplifies bad block range clearing into 5 situations. No
       matter how complicated the clearing condition is, we just look at the
       head part of the clearing range together with the overlapping
       already-set bad block range from the bad block table. The rest is
       handled in the next run of the while-loop.
      
       Based on the existing helpers added for badblocks_set(), this patch
       adds two more helpers,
       - front_clear()
         Clear the bad block range from the bad block table which is front
         overlapped with the clearing range.
       - front_splitting_clear()
         Handle the condition where the clearing range hits the middle of
         an already-set bad block range in the bad block table.
      
       As with badblocks_set(), the first part of the clearing range is
       handled together with the overlapping bad block range, which is found
       by prev_badblocks(). In most cases a valid hint is provided to
       prev_badblocks() to avoid unnecessary bad block table iteration.
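       
       As an illustration of the splitting case handled by
       front_splitting_clear() (hedged: hypothetical values, with bb assumed
       to hold a single bad range),
       
               /* table holds one bad range (100, 64), i.e. LBAs 100..163;
                * clearing the middle splits it into two records */
               badblocks_clear(bb, 116, 16);
               /* table now holds (100, 16) and (132, 32) */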
      
       This patch also explains the detailed algorithm in the code comments
       at the beginning of badblocks.c, including which five simplified
       situations are categorized and how all the bad block range clearing
       conditions are handled by these five situations.
      
       Again, in order to make the code review easier and avoid mixing the
       code changes together, this patch does not modify badblocks_clear()
       and implements another routine called _badblocks_clear() for the
       improvement. A later patch will delete the current code of
       badblocks_clear() and make it a wrapper around _badblocks_clear(), so
       the code change is much clearer to review.
      
       Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Geliang Tang <geliang.tang@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
      Cc: Xiao Ni <xni@redhat.com>
       Reviewed-by: Xiao Ni <xni@redhat.com>
       Acked-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/20230811170513.2300-5-colyli@suse.de
      
      
       Signed-off-by: Jens Axboe <axboe@kernel.dk>
      badblocks: improve badblocks_set() for multiple ranges handling · 1726c774
      Coly Li authored
      
      
       Recently I received a bug report that the current badblocks code does
       not properly handle multiple ranges. For example,
              badblocks_set(bb, 32, 1, true);
              badblocks_set(bb, 34, 1, true);
              badblocks_set(bb, 36, 1, true);
              badblocks_set(bb, 32, 12, true);
      Then indeed badblocks_show() reports,
              32 3
              36 1
      But the expected bad blocks table should be,
              32 12
       Obviously only the first 2 ranges are merged, and badblocks_set()
       returns and ignores the rest of the setting range.
      
       This behavior is improper: if the caller of badblocks_set() wants to
       set a range of blocks in the bad blocks table, all of the blocks in
       the range should be handled, even if a previous part encounters
       failure.
      
       The desired way for badblocks_set() to set a bad block range is,
       - Set as many blocks from the setting range as possible into the bad
         blocks table.
       - Merge the bad block ranges to occupy as few slots as possible in
         the bad blocks table.
       - Be fast.
      
       Indeed the above proposal is complicated, especially with the
       following restrictions,
       - The setting bad blocks range can be acknowledged or unacknowledged.
       - The bad blocks table size is limited.
       - Memory allocation should be avoided.
      
       The basic idea of the patch is to categorize all possible bad block
       range setting combinations into a much smaller set of simplified
       conditions. Inside badblocks_set() there is an implicit loop formed
       by jumping between the labels 're_insert' and 'update_sectors'. No
       matter how large the setting bad blocks range is, each iteration
       handles just a minimal range from the head with a pre-defined
       behavior from one of the categorized conditions. The logic is simple
       and the code flow is manageable.
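       
       A rough sketch of that implicit loop (illustrative pseudo-C: the
       labels match the patch, but handle_head() is a hypothetical
       stand-in for the per-condition handling, not a real function),
       
               re_insert:
                       /* classify the head of the remaining setting range
                        * against the nearest existing bad range, then merge,
                        * combine, overwrite or insert as the matching
                        * condition dictates */
                       sectors_handled = handle_head(bb, s, sectors, acknowledged);
               update_sectors:
                       s += sectors_handled;
                       sectors -= sectors_handled;
                       if (sectors > 0)
                               goto re_insert;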
      
       The different relative layouts between the setting range and the
       existing bad block ranges are checked and handled (merge, combine,
       overwrite, insert) by the helpers from the previous patch. This patch
       makes all the helpers work together following the above idea.
      
       This patch only contains the algorithm improvement for
       badblocks_set(). The following patches contain the improvements for
       badblocks_clear() and badblocks_check(). The algorithm in
       badblocks_set() is fundamental and typical; the improvements in the
       clear and check routines are based on the helpers and ideas of this
       patch.
      
       In order to make the change clearer for code review, this patch does
       not directly modify the existing badblocks_set(), and just adds a new
       routine named _badblocks_set(). A later patch will remove the current
       badblocks_set() code and make it a wrapper around _badblocks_set(),
       so the newly added changes are not mixed with deleted code and the
       code review can be easier.
      
       Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Geliang Tang <geliang.tang@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
      Cc: Wols Lists <antlists@youngman.org.uk>
      Cc: Xiao Ni <xni@redhat.com>
       Reviewed-by: Xiao Ni <xni@redhat.com>
       Acked-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/20230811170513.2300-4-colyli@suse.de
      
      
       Signed-off-by: Jens Axboe <axboe@kernel.dk>
      badblocks: add helper routines for badblock ranges handling · c3c6a86e
      Coly Li authored
      
      
       This patch adds several helper routines to improve badblock range
       handling. These helper routines will be used later in the improved
       versions of badblocks_set()/badblocks_clear()/badblocks_check().
       
       - Helpers prev_by_hint() and prev_badblocks() are used to find the
         bad range in the bad table which the searching range starts at or
         after.
      
       - The following helpers decide the relative layout between the
         manipulating range and an existing bad block range from the bad
         table.
        - can_merge_behind()
          Return 'true' if the manipulating range can backward merge with the
          bad block range.
        - can_merge_front()
          Return 'true' if the manipulating range can forward merge with the
          bad block range.
        - can_combine_front()
          Return 'true' if two adjacent bad block ranges before the
          manipulating range can be merged.
        - overlap_front()
          Return 'true' if the manipulating range exactly overlaps with the
          bad block range in front of its range.
        - overlap_behind()
          Return 'true' if the manipulating range exactly overlaps with the
          bad block range behind its range.
        - can_front_overwrite()
          Return 'true' if the manipulating range can forward overwrite the
          bad block range in front of its range.
      
       - The following helpers add the manipulating range into the bad
         block table. A different routine is called for each specific
         relative layout between the manipulating range and the other bad
         block ranges in the bad block table.
         - behind_merge()
           Merge the manipulating range with the bad block range behind it,
           and return the merged length in sectors.
         - front_merge()
           Merge the manipulating range with the bad block range in front
           of it, and return the merged length in sectors.
        - front_combine()
          Combine the two adjacent bad block ranges before the manipulating
          range into a larger one.
         - front_overwrite()
           Overwrite part or all of the bad block range in front of the
           manipulating range. The overwrite may split the existing bad
           block range and generate more bad block ranges in the bad block
           table.
        - insert_at()
          Insert the manipulating range at a specific location in the bad
          block table.
      
       All the above helpers are used in later patches to improve bad block
       range handling for badblocks_set()/badblocks_clear()/badblocks_check().
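       
       For orientation, a hedged sketch of the kind of layout predicate
       these helpers implement, using the existing BB_OFFSET()/BB_LEN()
       record encoding (the body is illustrative; see the patch itself for
       the real code),
       
               static bool overlap_front(struct badblocks *bb, int front,
                                         struct badblocks_context *bad)
               {
                       u64 *p = bb->page;
       
                       /* does the manipulating range start inside the bad
                        * range at index 'front'? */
                       return bad->start >= BB_OFFSET(p[front]) &&
                              bad->start < BB_OFFSET(p[front]) + BB_LEN(p[front]);
               }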
      
       Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Geliang Tang <geliang.tang@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
      Cc: Xiao Ni <xni@redhat.com>
       Reviewed-by: Xiao Ni <xni@redhat.com>
       Acked-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/20230811170513.2300-3-colyli@suse.de
      
      
       Signed-off-by: Jens Axboe <axboe@kernel.dk>
      badblocks: add more helper structure and routines in badblocks.h · e850d9a5
      Coly Li authored
      
      
       This patch adds the following helper structure and routines to
       badblocks.h,
       - struct badblocks_context
         This structure is used in the improved badblocks code for bad
         table iteration.
       - BB_END()
         The macro to calculate the end LBA of a bad range record from the
         bad table.
       - badblocks_full() and badblocks_empty()
         Inline routines to check whether the bad table is full or empty.
       - set_changed() and clear_changed()
         Inline routines to set and clear the 'changed' tag in struct
         badblocks.
      
       These new helpers make the code clearer; they will be used by the
       improved badblocks code in the following patches.
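       
       Hedged illustrative forms of two of the additions named above (close
       to, but not necessarily verbatim, the actual header change;
       MAX_BADBLOCKS and BB_OFFSET()/BB_LEN() already exist in badblocks.h),
       
               /* end LBA (exclusive) of a bad range record */
               #define BB_END(x)       (BB_OFFSET(x) + BB_LEN(x))
       
               static inline bool badblocks_full(struct badblocks *bb)
               {
                       return (bb->count >= MAX_BADBLOCKS);
               }
       
               static inline bool badblocks_empty(struct badblocks *bb)
               {
                       return (bb->count == 0);
               }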
      
       Signed-off-by: Coly Li <colyli@suse.de>
       Reviewed-by: Xiao Ni <xni@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Geliang Tang <geliang.tang@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
       Acked-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/20230811170513.2300-2-colyli@suse.de
      
      
       Signed-off-by: Jens Axboe <axboe@kernel.dk>
      md: replace deprecated strncpy with memcpy · ceb04163
      Justin Stitt authored
      `strncpy` is deprecated for use on NUL-terminated destination strings
      [1] and as such we should prefer more robust and less ambiguous string
      interfaces.
      
      There are three such strncpy uses that this patch addresses:
      
      The respective destination buffers are:
      1) mddev->clevel
      2) clevel
      3) mddev->metadata_type
      
      We expect mddev->clevel to be NUL-terminated due to its use with format
      strings:
      |       ret = sprintf(page, "%s\n", mddev->clevel);
      
       Furthermore, we can see that mddev->clevel is not expected to be
       NUL-padded, as `md_clean()` merely sets its first byte to NUL -- not
       the entire buffer:
      |       static void md_clean(struct mddev *mddev)
      |       {
      |       	mddev->array_sectors = 0;
      |       	mddev->external_size = 0;
      |               ...
      |       	mddev->level = LEVEL_NONE;
      |       	mddev->clevel[0] = 0;
      |               ...
      
       A suitable replacement for this instance is `memcpy`, as we know the
       number of bytes to copy and perform manual NUL-termination at a
       specified offset. This really decays to just a byte copy from one
       buffer to another. `strscpy` is also a viable replacement, but using
       `slen` as the length argument would truncate the last byte unless
       something like `slen + 1` were provided, which isn't the most
       idiomatic strscpy usage.
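       
       A hedged sketch of the replacement pattern described here (slen is
       assumed to be the precomputed source length, already bounded by the
       destination size):
       |       memcpy(mddev->clevel, buf, slen);
       |       mddev->clevel[slen] = '\0'; /* manual NUL-termination */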
      
       For the next case, the destination buffer `clevel` is expected to be
       NUL-terminated based on its use with kstrtol(), which expects
       NUL-terminated strings. Note that, in context, this code removes a
       trailing newline, which is seemingly not required as kstrtol() can
       handle trailing newlines implicitly. However, there is further usage
       of clevel (or buf) that also wants the newline removed. All in all,
       with similar reasoning to the first case, let's just use memcpy, as
       this is just a byte copy and NUL-termination is handled manually.
      
       The third and final case, concerning `mddev->metadata_type`, is more
       or less the same as the other two. We expect it to be NUL-terminated
       based on its use with seq_printf:
      |       seq_printf(seq, " super external:%s",
      |       	   mddev->metadata_type);
       ... and we can surmise that NUL-padding isn't required either, due to
       how it is handled in md_clean():
      |       static void md_clean(struct mddev *mddev)
      |       {
      |       ...
      |       mddev->metadata_type[0] = 0;
      |       ...
      
       So really, all these instances have precisely calculated lengths and
       purposeful NUL-termination, so we can just use memcpy to remove the
       ambiguity surrounding strncpy.
      
      Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
      Link: https://github.com/KSPP/linux/issues/90
      
      
      Cc: linux-hardening@vger.kernel.org
       Signed-off-by: Justin Stitt <justinstitt@google.com>
       Reviewed-by: Kees Cook <keescook@chromium.org>
       Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20230925-strncpy-drivers-md-md-c-v1-1-2b0093b89c2b@google.com
  3. Sep 23, 2023