Skip to content
  1. May 21, 2020
  2. May 17, 2020
  3. May 14, 2020
    • Jens Axboe's avatar
      Merge branch 'md-next' of... · 8fd2b980
      Jens Axboe authored
      Merge branch 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.8/drivers
      
      Pull MD changes from Song.
      
      * 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid1: Replace zero-length array with flexible-array
        md: add a newline when printing parameter 'start_ro' by sysfs
        md: stop using ->queuedata
        md/raid1: release pending accounting for an I/O only after write-behind is also finished
        md: remove redundant memalloc scope API usage
        raid5: update code comment of scribble_alloc()
        raid5: remove gfp flags from scribble_alloc()
        md: use memalloc scope APIs in mddev_suspend()/mddev_resume()
        md: remove the extra line for ->hot_add_disk
        md: flush md_rdev_misc_wq for HOT_ADD_DISK case
        md: don't flush workqueue unconditionally in md_open
        md: add new workqueue for delete rdev
        md: add checkings before flush md_misc_wq
      8fd2b980
    • Gustavo A. R. Silva's avatar
      md/raid1: Replace zero-length array with flexible-array · 358369f0
      Gustavo A. R. Silva authored
      The current codebase makes use of the zero-length array language
      extension to the C90 standard, but the preferred mechanism to declare
      variable-length types such as these ones is a flexible array member[1][2],
      introduced in C99:
      
      struct foo {
              int stuff;
              struct boo array[];
      };
      
      By making use of the mechanism above, we will get a compiler warning
      in case the flexible array does not occur last in the structure, which
      will help us prevent some kind of undefined behavior bugs from being
      inadvertently introduced[3] to the codebase from now on.
      
      Also, notice that, dynamic memory allocations won't be affected by
      this change:
      
      "Flexible array members have incomplete type, and so the sizeof operator
      may not be applied. As a quirk of the original implementation of
      zero-length arrays, sizeof evaluates to zero."[1]
      
      sizeof(flexible-array-member) triggers a warning because flexible array
      members have incomplete type[1]. There are some instances of code in
      which the sizeof operator is being incorrectly/erroneously applied to
      zero-length arrays and the result is zero. Such instances may be hiding
      some bugs. So, this work (flexible-array member conversions) will also
      help to get completely rid of those sorts of issues.
      
      This issue was found with the help of Coccinelle.
      
      [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
      [2] https://github.com/KSPP/linux/issues/21
      [3] commit 76497732
      
       ("cxgb3/l2t: Fix undefined behaviour")
      
      Signed-off-by: default avatarGustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      358369f0
    • Xiongfeng Wang's avatar
      md: add a newline when printing parameter 'start_ro' by sysfs · 3f99980c
      Xiongfeng Wang authored
      
      
      Add a missing newline when printing module parameter 'start_ro' by
      sysfs.
      
      Signed-off-by: default avatarXiongfeng Wang <wangxiongfeng2@huawei.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      3f99980c
    • Christoph Hellwig's avatar
      md: stop using ->queuedata · e4fc5a74
      Christoph Hellwig authored
      
      
      Pointer to mddev is already available in private_data.
      
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      e4fc5a74
    • David Jeffery's avatar
      md/raid1: release pending accounting for an I/O only after write-behind is also finished · c91114c2
      David Jeffery authored
      
      
      When using RAID1 and write-behind, md can deadlock when errors occur. With
      write-behind, r1bio structs can be accounted by raid1 as queued but not
      counted as pending. The pending count is dropped when the original bio is
      returned complete but write-behind for the r1bio may still be active.
      
      This breaks the accounting used in some conditions to know when the raid1
      md device has reached an idle state. It can result in calls to
      freeze_array deadlocking. freeze_array will never complete from a negative
      "unqueued" value being calculated due to a queued count larger than the
      pending count.
      
      To properly account for write-behind, move the call to allow_barrier from
      call_bio_endio to raid_end_bio_io. When using write-behind, md can call
      call_bio_endio before all write-behind I/O is complete. Using
      raid_end_bio_io for the point to call allow_barrier will release the
      pending count at a point where all I/O for an r1bio, even write-behind, is
      done.
      
      Signed-off-by: default avatarDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      c91114c2
    • Coly Li's avatar
      md: remove redundant memalloc scope API usage · 3024ba2d
      Coly Li authored
      
      
      In mddev_create_serial_pool(), memalloc scope APIs memalloc_noio_save()
      and memalloc_noio_restore() are used when allocating memory by calling
      mempool_create_kmalloc_pool(). After adding the memalloc scope APIs in
      raid array suspend context, it is unncessary to explicitly call them
      around mempool_create_kmalloc_pool() any longer.
      
      This patch removes the redundant memalloc scope APIs in
      mddev_create_serial_pool().
      
      Signed-off-by: default avatarColy Li <colyli@suse.de>
      Cc: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      3024ba2d
    • Coly Li's avatar
      raid5: update code comment of scribble_alloc() · 7f8a30e5
      Coly Li authored
      
      
      Code comments of scribble_alloc() is outdated for a while. This patch
      update the comments in function header for the new parameter list.
      
      Suggested-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarColy Li <colyli@suse.de>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      7f8a30e5
    • Coly Li's avatar
      raid5: remove gfp flags from scribble_alloc() · ba54d4d4
      Coly Li authored
      Using GFP_NOIO flag to call scribble_alloc() from resize_chunk() does
      not have the expected behavior. kvmalloc_array() inside scribble_alloc()
      which receives the GFP_NOIO flag will eventually call kmalloc_node() to
      allocate physically continuous pages.
      
      Now we have memalloc scope APIs in mddev_suspend()/mddev_resume() to
      prevent memory reclaim I/Os during raid array suspend context, calling
      to kvmalloc_array() with GFP_KERNEL flag may avoid deadlock of recursive
      I/O as expected.
      
      This patch removes the useless gfp flags from parameters list of
      scribble_alloc(), and call kvmalloc_array() with GFP_KERNEL flag. The
      incorrect GFP_NOIO flag does not exist anymore.
      
      Fixes: b330e6a4
      
       ("md: convert to kvmalloc")
      Suggested-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarColy Li <colyli@suse.de>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      ba54d4d4
    • Coly Li's avatar
      md: use memalloc scope APIs in mddev_suspend()/mddev_resume() · 78f57ef9
      Coly Li authored
      In raid5.c:resize_chunk(), scribble_alloc() is called with GFP_NOIO
      flag, then it is sent into kvmalloc_array() inside scribble_alloc().
      
      The problem is kvmalloc_array() eventually calls kvmalloc_node() which
      does not accept non GFP_KERNEL compatible flag like GFP_NOIO, then
      kmalloc_node() is called indeed to allocate physically continuous
      pages. When system memory is under heavy pressure, and the requesting
      size is large, there is high probability that allocating continueous
      pages will fail.
      
      But simply using GFP_KERNEL flag to call kvmalloc_array() is also
      progblematic. In the code path where scribble_alloc() is called, the
      raid array is suspended, if kvmalloc_node() triggers memory reclaim I/Os
      and such I/Os go back to the suspend raid array, deadlock will happen.
      
      What is desired here is to allocate non-physically (a.k.a virtually)
      continuous pages and avoid memory reclaim I/Os. Michal Hocko suggests
      to use the mmealloc sceope APIs to restrict memory reclaim I/O in
      allocating context, specifically to call memalloc_noio_save() when
      suspend the raid array and to call memalloc_noio_restore() when
      resume the raid array.
      
      This patch adds the memalloc scope APIs in mddev_suspend() and
      mddev_resume(), to restrict memory reclaim I/Os during the raid array
      is suspended. The benifit of adding the memalloc scope API in the
      unified entry point mddev_suspend()/mddev_resume() is, no matter which
      md raid array type (personality), we are sure the deadlock by recursive
      memory reclaim I/O won't happen on the suspending context.
      
      Please notice that the memalloc scope APIs only take effect on the raid
      array suspending context, if the memory allocation is from another new
      created kthread after raid array suspended, the recursive memory reclaim
      I/Os won't be restricted. The mddev_suspend()/mddev_resume() entries are
      used for the critical section where the raid metadata is modifying,
      creating a kthread to allocate memory inside the critical section is
      queer and very probably being buggy.
      
      Fixes: b330e6a4
      
       ("md: convert to kvmalloc")
      Suggested-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarColy Li <colyli@suse.de>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      78f57ef9
    • Guoqing Jiang's avatar
      md: remove the extra line for ->hot_add_disk · 3f79cc22
      Guoqing Jiang authored
      
      
      It is not not necessary to add a newline for them since they don't exceed
      80 characters, and it is not intutive to distinguish ->hot_add_disk() from
      hot_add_disk() too.
      
      Signed-off-by: default avatarGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      3f79cc22
    • Guoqing Jiang's avatar
      md: flush md_rdev_misc_wq for HOT_ADD_DISK case · 78b990cf
      Guoqing Jiang authored
      
      
      Since rdev->kobj is removed asynchronously, it is possible that the
      rdev->kobj still exists when try to add the rdev again after rdev
      is removed. But this path md_ioctl (HOT_ADD_DISK) -> hot_add_disk
      -> bind_rdev_to_array missed it.
      
      Signed-off-by: default avatarGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      78b990cf
    • Guoqing Jiang's avatar
      md: don't flush workqueue unconditionally in md_open · f6766ff6
      Guoqing Jiang authored
      
      
      We need to check mddev->del_work before flush workqueu since the purpose
      of flush is to ensure the previous md is disappeared. Otherwise the similar
      deadlock appeared if LOCKDEP is enabled, it is due to md_open holds the
      bdev->bd_mutex before flush workqueue.
      
      kernel: [  154.522645] ======================================================
      kernel: [  154.522647] WARNING: possible circular locking dependency detected
      kernel: [  154.522650] 5.6.0-rc7-lp151.27-default #25 Tainted: G           O
      kernel: [  154.522651] ------------------------------------------------------
      kernel: [  154.522653] mdadm/2482 is trying to acquire lock:
      kernel: [  154.522655] ffff888078529128 ((wq_completion)md_misc){+.+.}, at: flush_workqueue+0x84/0x4b0
      kernel: [  154.522673]
      kernel: [  154.522673] but task is already holding lock:
      kernel: [  154.522675] ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590
      kernel: [  154.522691]
      kernel: [  154.522691] which lock already depends on the new lock.
      kernel: [  154.522691]
      kernel: [  154.522694]
      kernel: [  154.522694] the existing dependency chain (in reverse order) is:
      kernel: [  154.522696]
      kernel: [  154.522696] -> #4 (&bdev->bd_mutex){+.+.}:
      kernel: [  154.522704]        __mutex_lock+0x87/0x950
      kernel: [  154.522706]        __blkdev_get+0x79/0x590
      kernel: [  154.522708]        blkdev_get+0x65/0x140
      kernel: [  154.522709]        blkdev_get_by_dev+0x2f/0x40
      kernel: [  154.522716]        lock_rdev+0x3d/0x90 [md_mod]
      kernel: [  154.522719]        md_import_device+0xd6/0x1b0 [md_mod]
      kernel: [  154.522723]        new_dev_store+0x15e/0x210 [md_mod]
      kernel: [  154.522728]        md_attr_store+0x7a/0xc0 [md_mod]
      kernel: [  154.522732]        kernfs_fop_write+0x117/0x1b0
      kernel: [  154.522735]        vfs_write+0xad/0x1a0
      kernel: [  154.522737]        ksys_write+0xa4/0xe0
      kernel: [  154.522745]        do_syscall_64+0x64/0x2b0
      kernel: [  154.522748]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      kernel: [  154.522749]
      kernel: [  154.522749] -> #3 (&mddev->reconfig_mutex){+.+.}:
      kernel: [  154.522752]        __mutex_lock+0x87/0x950
      kernel: [  154.522756]        new_dev_store+0xc9/0x210 [md_mod]
      kernel: [  154.522759]        md_attr_store+0x7a/0xc0 [md_mod]
      kernel: [  154.522761]        kernfs_fop_write+0x117/0x1b0
      kernel: [  154.522763]        vfs_write+0xad/0x1a0
      kernel: [  154.522765]        ksys_write+0xa4/0xe0
      kernel: [  154.522767]        do_syscall_64+0x64/0x2b0
      kernel: [  154.522769]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      kernel: [  154.522770]
      kernel: [  154.522770] -> #2 (kn->count#253){++++}:
      kernel: [  154.522775]        __kernfs_remove+0x253/0x2c0
      kernel: [  154.522778]        kernfs_remove+0x1f/0x30
      kernel: [  154.522780]        kobject_del+0x28/0x60
      kernel: [  154.522783]        mddev_delayed_delete+0x24/0x30 [md_mod]
      kernel: [  154.522786]        process_one_work+0x2a7/0x5f0
      kernel: [  154.522788]        worker_thread+0x2d/0x3d0
      kernel: [  154.522793]        kthread+0x117/0x130
      kernel: [  154.522795]        ret_from_fork+0x3a/0x50
      kernel: [  154.522796]
      kernel: [  154.522796] -> #1 ((work_completion)(&mddev->del_work)){+.+.}:
      kernel: [  154.522800]        process_one_work+0x27e/0x5f0
      kernel: [  154.522802]        worker_thread+0x2d/0x3d0
      kernel: [  154.522804]        kthread+0x117/0x130
      kernel: [  154.522806]        ret_from_fork+0x3a/0x50
      kernel: [  154.522807]
      kernel: [  154.522807] -> #0 ((wq_completion)md_misc){+.+.}:
      kernel: [  154.522813]        __lock_acquire+0x1392/0x1690
      kernel: [  154.522816]        lock_acquire+0xb4/0x1a0
      kernel: [  154.522818]        flush_workqueue+0xab/0x4b0
      kernel: [  154.522821]        md_open+0xb6/0xc0 [md_mod]
      kernel: [  154.522823]        __blkdev_get+0xea/0x590
      kernel: [  154.522825]        blkdev_get+0x65/0x140
      kernel: [  154.522828]        do_dentry_open+0x1d1/0x380
      kernel: [  154.522831]        path_openat+0x567/0xcc0
      kernel: [  154.522834]        do_filp_open+0x9b/0x110
      kernel: [  154.522836]        do_sys_openat2+0x201/0x2a0
      kernel: [  154.522838]        do_sys_open+0x57/0x80
      kernel: [  154.522840]        do_syscall_64+0x64/0x2b0
      kernel: [  154.522842]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      kernel: [  154.522844]
      kernel: [  154.522844] other info that might help us debug this:
      kernel: [  154.522844]
      kernel: [  154.522846] Chain exists of:
      kernel: [  154.522846]   (wq_completion)md_misc --> &mddev->reconfig_mutex --> &bdev->bd_mutex
      kernel: [  154.522846]
      kernel: [  154.522850]  Possible unsafe locking scenario:
      kernel: [  154.522850]
      kernel: [  154.522852]        CPU0                    CPU1
      kernel: [  154.522853]        ----                    ----
      kernel: [  154.522854]   lock(&bdev->bd_mutex);
      kernel: [  154.522856]                                lock(&mddev->reconfig_mutex);
      kernel: [  154.522858]                                lock(&bdev->bd_mutex);
      kernel: [  154.522860]   lock((wq_completion)md_misc);
      kernel: [  154.522861]
      kernel: [  154.522861]  *** DEADLOCK ***
      kernel: [  154.522861]
      kernel: [  154.522864] 1 lock held by mdadm/2482:
      kernel: [  154.522865]  #0: ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590
      kernel: [  154.522868]
      kernel: [  154.522868] stack backtrace:
      kernel: [  154.522873] CPU: 1 PID: 2482 Comm: mdadm Tainted: G           O      5.6.0-rc7-lp151.27-default #25
      kernel: [  154.522875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      kernel: [  154.522878] Call Trace:
      kernel: [  154.522881]  dump_stack+0x8f/0xcb
      kernel: [  154.522884]  check_noncircular+0x194/0x1b0
      kernel: [  154.522888]  ? __lock_acquire+0x1392/0x1690
      kernel: [  154.522890]  __lock_acquire+0x1392/0x1690
      kernel: [  154.522893]  lock_acquire+0xb4/0x1a0
      kernel: [  154.522895]  ? flush_workqueue+0x84/0x4b0
      kernel: [  154.522898]  flush_workqueue+0xab/0x4b0
      kernel: [  154.522900]  ? flush_workqueue+0x84/0x4b0
      kernel: [  154.522905]  ? md_open+0xb6/0xc0 [md_mod]
      kernel: [  154.522908]  md_open+0xb6/0xc0 [md_mod]
      kernel: [  154.522910]  __blkdev_get+0xea/0x590
      kernel: [  154.522912]  ? bd_acquire+0xc0/0xc0
      kernel: [  154.522914]  blkdev_get+0x65/0x140
      kernel: [  154.522916]  ? bd_acquire+0xc0/0xc0
      kernel: [  154.522918]  do_dentry_open+0x1d1/0x380
      kernel: [  154.522921]  path_openat+0x567/0xcc0
      kernel: [  154.522923]  ? __lock_acquire+0x380/0x1690
      kernel: [  154.522926]  do_filp_open+0x9b/0x110
      kernel: [  154.522929]  ? __alloc_fd+0xe5/0x1f0
      kernel: [  154.522935]  ? kmem_cache_alloc+0x28c/0x630
      kernel: [  154.522939]  ? do_sys_openat2+0x201/0x2a0
      kernel: [  154.522941]  do_sys_openat2+0x201/0x2a0
      kernel: [  154.522944]  do_sys_open+0x57/0x80
      kernel: [  154.522946]  do_syscall_64+0x64/0x2b0
      kernel: [  154.522948]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      kernel: [  154.522951] RIP: 0033:0x7f98d279d9ae
      
      And md_alloc also flushed the same workqueue, but the thing is different
      here. Because all the paths call md_alloc don't hold bdev->bd_mutex, and
      the flush is necessary to avoid race condition, so leave it as it is.
      
      Signed-off-by: default avatarGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      f6766ff6
    • Guoqing Jiang's avatar
      md: add new workqueue for delete rdev · cc1ffe61
      Guoqing Jiang authored
      
      
      Since the purpose of call flush_workqueue in new_dev_store is to ensure
      md_delayed_delete() has completed, so we should check rdev->del_work is
      pending or not.
      
      To suppress lockdep warning, we have to check mddev->del_work while
      md_delayed_delete is attached to rdev->del_work, so it is not aligned
      to the purpose of flush workquee. So a new workqueue is needed to avoid
      the awkward situation, and introduce a new func flush_rdev_wq to flush
      the new workqueue after check if there was pending work.
      
      Also like new_dev_store, ADD_NEW_DISK ioctl has the same purpose to flush
      workqueue while it holds bdev->bd_mutex, so make the same change applies
      to the ioctl to avoid similar lock issue.
      
      And md_delayed_delete actually wants to delete rdev, so rename the function
      to rdev_delayed_delete.
      
      Signed-off-by: default avatarGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      cc1ffe61
    • Guoqing Jiang's avatar
      md: add checkings before flush md_misc_wq · 21e0958e
      Guoqing Jiang authored
      
      
      Coly reported possible circular locking dependencyi with LOCKDEP enabled,
      quote the below info from the detailed report [1].
      
      [ 1607.673903] Chain exists of:
      [ 1607.673903]   kn->count#256 --> (wq_completion)md_misc -->
      (work_completion)(&rdev->del_work)
      [ 1607.673903]
      [ 1607.827946]  Possible unsafe locking scenario:
      [ 1607.827946]
      [ 1607.898780]        CPU0                    CPU1
      [ 1607.952980]        ----                    ----
      [ 1608.007173]   lock((work_completion)(&rdev->del_work));
      [ 1608.069690]                                lock((wq_completion)md_misc);
      [ 1608.149887]                                lock((work_completion)(&rdev->del_work));
      [ 1608.242563]   lock(kn->count#256);
      [ 1608.283238]
      [ 1608.283238]  *** DEADLOCK ***
      [ 1608.283238]
      [ 1608.354078] 2 locks held by kworker/5:0/843:
      [ 1608.405152]  #0: ffff8889eecc9948 ((wq_completion)md_misc){+.+.}, at:
      process_one_work+0x42b/0xb30
      [ 1608.512399]  #1: ffff888a1d3b7e10
      ((work_completion)(&rdev->del_work)){+.+.}, at: process_one_work+0x42b/0xb30
      [ 1608.632130]
      
      Since works (rdev->del_work and mddev->del_work) are queued in md_misc_wq,
      then lockdep_map lock is held if either of them are running, then both of
      them try to hold kernfs lock by call kobject_del. Then if new_dev_store
      or array_state_store are triggered by write to the related sysfs node, so
      the write operation gets kernfs lock, but need the lockdep_map because all
      of them would trigger flush_workqueue(md_misc_wq) finally, then the same
      lockdep_map lock is needed.
      
      To suppress the lockdep warnning, we should flush the workqueue in case the
      related work is pending. And several works are attached to md_misc_wq, so
      we need to check which work should be checked:
      
      1. for __md_stop_writes, the purpose of call flush workqueue is ensure sync
      thread is started if it was starting, so check mddev->del_work is pending
      or not since md_start_sync is attached to mddev->del_work.
      
      2. __md_stop flushes md_misc_wq to ensure event_work is done, check the
      event_work is enough. Assume raid_{ctr,dtr} -> md_stop -> __md_stop doesn't
      need the kernfs lock.
      
      3. both new_dev_store (holds kernfs lock) and ADD_NEW_DISK ioctl (holds the
      bdev->bd_mutex) call flush_workqueue to ensure md_delayed_delete has
      completed, this case will be handled in next patch.
      
      4. md_open flushes workqueue to ensure the previous md is disappeared, but
      it holds bdev->bd_mutex then try to flush workqueue, so it is better to
      check mddev->del_work as well to avoid potential lock issue, this will be
      done in another patch.
      
      [1]: https://marc.info/?l=linux-raid&m=158518958031584&w=2
      
      Cc: Coly Li <colyli@suse.de>
      Reported-by: default avatarColy Li <colyli@suse.de>
      Signed-off-by: default avatarGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      21e0958e
  4. May 13, 2020
    • Jens Axboe's avatar
      Merge tag 'floppy-for-5.8' of https://github.com/evdenis/linux-floppy into for-5.8/drivers · 91bf5ec3
      Jens Axboe authored
      
      
      Floppy patches for 5.8
      
      Cleanups:
        - symbolic register names for x86,sparc64,sparc32,powerpc,parisc,m68k
        - split of local/global variables for drive,fdc
        - UBSAN warning suppress in setup_rw_floppy()
      
      Changes were compile tested on arm, sparc64, powerpc, m68k. Many patches
      introduce no binary changes by using defines instead of magic numbers.
      The patches were also tested with syzkaller and simple write/read/format
      tests on real hardware.
      
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      
      * tag 'floppy-for-5.8' of https://github.com/evdenis/linux-floppy: (31 commits)
        floppy: suppress UBSAN warning in setup_rw_floppy()
        floppy: add defines for sizes of cmd & reply buffers of floppy_raw_cmd
        floppy: add FD_AUTODETECT_SIZE define for struct floppy_drive_params
        floppy: use print_hex_dump() in setup_DMA()
        floppy: cleanup: make set_fdc() always set current_drive and current_fd
        floppy: cleanup: get rid of current_reqD in favor of current_drive
        floppy: make sure to reset all FDCs upon resume()
        floppy: cleanup: do not iterate on current_fdc in do_floppy_init()
        floppy: cleanup: add a few comments about expectations in certain functions
        floppy: cleanup: do not iterate on current_fdc in DMA grab/release functions
        floppy: cleanup: make get_fdc_version() not rely on current_fdc anymore
        floppy: cleanup: make next_valid_format() not rely on current_drive anymore
        floppy: cleanup: make check_wp() not rely on current_{fdc,drive} anymore
        floppy: cleanup: make fdc_specify() not rely on current_{fdc,drive} anymore
        floppy: cleanup: make fdc_configure() not rely on current_fdc anymore
        floppy: cleanup: make perpendicular_mode() not rely on current_fdc anymore
        floppy: cleanup: make need_more_output() not rely on current_fdc anymore
        floppy: cleanup: make result() not rely on current_fdc anymore
        floppy: cleanup: make output_byte() not rely on current_fdc anymore
        floppy: cleanup: make wait_til_ready() not rely on current_fdc anymore
        ...
      91bf5ec3
    • Denis Efremov's avatar
      floppy: suppress UBSAN warning in setup_rw_floppy() · 0836275d
      Denis Efremov authored
      
      
      UBSAN: array-index-out-of-bounds in drivers/block/floppy.c:1521:45
      index 16 is out of range for type 'unsigned char [16]'
      Call Trace:
      ...
       setup_rw_floppy+0x5c3/0x7f0
       floppy_ready+0x2be/0x13b0
       process_one_work+0x2c1/0x5d0
       worker_thread+0x56/0x5e0
       kthread+0x122/0x170
       ret_from_fork+0x35/0x40
      
      From include/uapi/linux/fd.h:
      struct floppy_raw_cmd {
      	...
      	unsigned char cmd_count;
      	unsigned char cmd[16];
      	unsigned char reply_count;
      	unsigned char reply[16];
      	...
      }
      
      This out-of-bounds access is intentional. The command in struct
      floppy_raw_cmd may take up the space initially intended for the reply
      and the reply count. It is needed for long 82078 commands such as
      RESTORE, which takes 17 command bytes. Initial cmd size is not enough
      and since struct setup_rw_floppy is a part of uapi we check that
      cmd_count is in [0:16+1+16] in raw_cmd_copyin().
      
      The patch adds union with original cmd,reply_count,reply fields and
      fullcmd field of equivalent size. The cmd accesses are turned to
      fullcmd where appropriate to suppress UBSAN warning.
      
      Link: https://lore.kernel.org/r/20200501134416.72248-5-efremov@linux.com
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      0836275d
    • Denis Efremov's avatar
      floppy: add defines for sizes of cmd & reply buffers of floppy_raw_cmd · bd10a5f3
      Denis Efremov authored
      
      
      Use FD_RAW_CMD_SIZE, FD_RAW_REPLY_SIZE defines instead of magic numbers
      for cmd & reply buffers of struct floppy_raw_cmd. Remove local to
      floppy.c MAX_REPLIES define, as it is now FD_RAW_REPLY_SIZE.
      FD_RAW_CMD_FULLSIZE added as we allow command to also fill reply_count
      and reply fields.
      
      Link: https://lore.kernel.org/r/20200501134416.72248-4-efremov@linux.com
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      bd10a5f3
    • Denis Efremov's avatar
      floppy: add FD_AUTODETECT_SIZE define for struct floppy_drive_params · 9c4c5a24
      Denis Efremov authored
      
      
      Use FD_AUTODETECT_SIZE for autodetect buffer size in struct
      floppy_drive_params instead of a magic number.
      
      Link: https://lore.kernel.org/r/20200501134416.72248-3-efremov@linux.com
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      9c4c5a24
    • Denis Efremov's avatar
      floppy: use print_hex_dump() in setup_DMA() · 29ac6763
      Denis Efremov authored
      
      
      Remove pr_cont() and use print_hex_dump() in setup_DMA() to print the
      contents of the cmd buffer.
      
      Link: https://lore.kernel.org/r/20200501134416.72248-2-efremov@linux.com
      Suggested-by: default avatarJoe Perches <joe@perches.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      29ac6763
    • Willy Tarreau's avatar
      floppy: cleanup: make set_fdc() always set current_drive and current_fd · ca1b409a
      Willy Tarreau authored
      
      
      When called with a negative drive value, set_fdc() would stick to the
      current fdc (which was assumed to reflect the current_drive's FDC). We
      do not need this anymore as the last call place with a negative value
      was just addressed. Let's make this function always set both current_fdc
      and current_drive so that there's no more ambiguity. A few comments
      stating this were added to a few non-obvious places.
      
      Link: https://lore.kernel.org/r/20200410101904.14652-3-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      ca1b409a
    • Willy Tarreau's avatar
      floppy: cleanup: get rid of current_reqD in favor of current_drive · 99ba6ccc
      Willy Tarreau authored
      
      
      This macro equals -1 and is used as an alternative for current_drive when
      calling reschedule_timeout(), which in turn needs to remap it. This only
      adds obfuscation, let's simply use current_drive.
      
      Link: https://lore.kernel.org/r/20200410101904.14652-2-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      99ba6ccc
    • Willy Tarreau's avatar
      floppy: make sure to reset all FDCs upon resume() · 6111a4f9
      Willy Tarreau authored
      
      
      In floppy_resume() we don't properly reinitialize all FDCs, instead
      we reinitialize the current FDC once per available FDC because value
      -1 is passed to user_reset_fdc(). Let's simply save the current drive
      and properly reinitialize each FDC.
      
      Link: https://lore.kernel.org/r/20200410101904.14652-1-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      6111a4f9
    • Willy Tarreau's avatar
      floppy: cleanup: do not iterate on current_fdc in do_floppy_init() · 05f5e319
      Willy Tarreau authored
      
      
      There's no need to iterate on current_fdc in do_floppy_init() anymore,
      in the first case it's only used as an array index to access fdc_state[],
      so let's get rid of this confusing assignment. The second case is a bit
      trickier because user_reset_fdc() needs to already know current_fdc when
      called with drive==-1 due to this call chain:
      
          user_reset_fdc()
            lock_fdc()
              set_fdc()
                 drive<0 ==> new_fdc = current_fdc
      
      Note that current_drive is not used in this code part and may even not
      match a unit belonging to current_fdc. Instead of passing -1 we can
      simply pass the first drive of the FDC being initialized, which is even
      cleaner as it will allow the function chain above to consistently assign
      both variables.
      
      Link: https://lore.kernel.org/r/20200410093023.14499-1-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      05f5e319
    • Willy Tarreau's avatar
      floppy: cleanup: add a few comments about expectations in certain functions · 12aebfac
      Willy Tarreau authored
      
      
      The locking in the driver is far from being obvious, with unlocking
      automatically happening at end of operations scheduled by interrupt,
      especially for the error paths where one does not necessarily expect
      that such an interrupt may be triggered. Let's add a few comments
      about what to expect at certain places to avoid misdetecting bugs
      which are not.
      
      Link: https://lore.kernel.org/r/20200331094054.24441-24-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      12aebfac
    • Willy Tarreau's avatar
      floppy: cleanup: do not iterate on current_fdc in DMA grab/release functions · 82a63010
      Willy Tarreau authored
      
      
      Both floppy_grab_irq_and_dma() and floppy_release_irq_and_dma() used to
      iterate on the global variable while setting up or freeing resources.
      Now that they exclusively rely on functions which take the fdc as an
      argument, so let's not touch the global one anymore.
      
      Link: https://lore.kernel.org/r/20200331094054.24441-23-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      82a63010
    • Willy Tarreau's avatar
      floppy: cleanup: make get_fdc_version() not rely on current_fdc anymore · e5a9c95f
      Willy Tarreau authored
      
      
      Now the fdc is passed in argument so that the function does not
      use current_fdc anymore.
      
      Link: https://lore.kernel.org/r/20200331094054.24441-22-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      e5a9c95f
    • Willy Tarreau's avatar
      floppy: cleanup: make next_valid_format() not rely on current_drive anymore · 43d81bb6
      Willy Tarreau authored
      
      
      Now the drive is passed in argument so that the function does not
      use current_drive anymore.
      
      Link: https://lore.kernel.org/r/20200331094054.24441-21-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      43d81bb6
    • Willy Tarreau's avatar
      floppy: cleanup: make check_wp() not rely on current_{fdc,drive} anymore · c7af70b0
      Willy Tarreau authored
      
      
      Now the fdc and drive are passed in argument so that the function does
      not use current_fdc nor current_drive anymore.
      
      Link: https://lore.kernel.org/r/20200331094054.24441-20-w@1wt.eu
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDenis Efremov <efremov@linux.com>
      c7af70b0