  1. Dec 09, 2018
    • Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask" · 356ff8a9
      David Rientjes authored
      This reverts commit 89c83fb5.
      
      This should have been done as part of 2f0799a0 ("mm, thp: restore
      node-local hugepage allocations").  The movement of the thp allocation
      policy from alloc_pages_vma() to alloc_hugepage_direct_gfpmask() was
      intended to only set __GFP_THISNODE for mempolicies that are not
      MPOL_BIND whereas the revert could set this regardless of mempolicy.
      
      While the check for MPOL_BIND between alloc_hugepage_direct_gfpmask()
      and alloc_pages_vma() was racy, it has been removed since the revert.
      What is left is the possibility to use __GFP_THISNODE in policy_node()
      when it is unexpected, because the special handling for hugepages in
      alloc_pages_vma() was removed as part of the consolidation.
      
      Secondly, prior to 89c83fb5, alloc_pages_vma() implemented a somewhat
      different policy for hugepage allocations, which were allocated through
      alloc_hugepage_vma().  For hugepage allocations, if the allocating
      process's node is in the set of allowed nodes, allocate with
      __GFP_THISNODE for that node (for MPOL_PREFERRED, use that node with
      __GFP_THISNODE instead).  This was changed for shmem_alloc_hugepage() to
      allow fallback to other nodes in 89c83fb5 as it did for new_page() in
      mm/mempolicy.c which is functionally different behavior and removes the
      requirement to only allocate hugepages locally.
      
      So this commit does a full revert of 89c83fb5 instead of the partial
      revert that was done in 2f0799a0.  The result is the same thp
      allocation policy for 4.20 that was in 4.19.
      
      Fixes: 89c83fb5 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
      Fixes: 2f0799a0 ("mm, thp: restore node-local hugepage allocations")
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. Dec 08, 2018
  3. Dec 07, 2018
    • arm64: hibernate: Avoid sending cross-calling with interrupts disabled · b4aecf78
      Will Deacon authored
      Since commit 3b8c9f1c ("arm64: IPI each CPU after invalidating the
      I-cache for kernel mappings"), a call to flush_icache_range() will use
      an IPI to cross-call other online CPUs so that any stale instructions
      are flushed from their pipelines. This triggers a WARN during the
      hibernation resume path, where flush_icache_range() is called with
      interrupts disabled and is therefore prone to deadlock:
      
        | Disabling non-boot CPUs ...
        | CPU1: shutdown
        | psci: CPU1 killed.
        | CPU2: shutdown
        | psci: CPU2 killed.
        | CPU3: shutdown
        | psci: CPU3 killed.
        | WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
        | Modules linked in:
        | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
      
      Since all secondary CPUs have been taken offline prior to invalidating
      the I-cache, there's actually no need for an IPI and we can simply call
      __flush_icache_range() instead.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 3b8c9f1c ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
      Reported-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Tested-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linus · 8b878ee2
      Jens Axboe authored
      Pull NVMe fixes from Christoph.
      
      * 'nvme-4.20' of git://git.infradead.org/nvme:
        nvmet-rdma: fix response use after free
        nvme: validate controller state before rescheduling keep alive
    • blk-mq: punt failed direct issue to dispatch list · c616cbee
      Jens Axboe authored
      After the direct dispatch corruption fix, we permanently disallow direct
      dispatch of non read/write requests. This works fine off the normal IO
      path, as they will be retried like any other failed direct dispatch
      request. But for the blk_insert_cloned_request() that only DM uses to
      bypass the bottom level scheduler, we always first attempt direct
      dispatch. For some types of requests, that's now a permanent failure,
      and no amount of retrying will make that succeed. This results in a
      livelock.
      
      Instead of making special cases for what we can direct issue, and now
      having to deal with DM solving the livelock while still retaining a BUSY
      condition feedback loop, always just add a request that has been through
      ->queue_rq() to the hardware queue dispatch list. These are safe to use
      as no merging can take place there. Additionally, if requests do have
      prepped data from drivers, we aren't dependent on them not sharing space
      in the request structure to safely add them to the IO scheduler lists.
      
      This basically reverts ffe81d45 and is based on a patch from Ming,
      but with the list insert case covered as well.
      
      Fixes: ffe81d45 ("blk-mq: fix corruption with direct issue")
      Cc: stable@vger.kernel.org
      Suggested-by: Ming Lei <ming.lei@redhat.com>
      Reported-by: Bart Van Assche <bvanassche@acm.org>
      Tested-by: Ming Lei <ming.lei@redhat.com>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvmet-rdma: fix response use after free · d7dcdf9d
      Israel Rukshin authored
      nvmet_rdma_release_rsp() may free the response before it is used in the
      error flow.
      
      Fixes: 8407879c ("nvmet-rdma: fix possible bogus dereference under heavy load")
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: validate controller state before rescheduling keep alive · 86880d64
      James Smart authored
      
      
      Delete operations are seeing NULL pointer references in call_timer_fn.
      Tracking these back, the timer appears to be the keep alive timer.
      
      nvme_keep_alive_work(), which is tied to the timer that is cancelled
      by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't
      wait for its completion. So nvme_stop_keep_alive() only stops a timer
      when it's pending. When a keep alive is in flight, there is no timer
      running and the nvme_stop_keep_alive() will have no effect on the keep
      alive io. Thus, if the io completes successfully, the keep alive timer
      will be rescheduled. In the failure case, delete is called, the
      controller state is changed, the nvme_stop_keep_alive() is called while
      the io is outstanding, and the delete path continues on. The keep
      alive happens to successfully complete before the delete paths mark it
      as aborted as part of the queue termination, so the timer is restarted.
      The delete paths then tear down the controller, and later on the timer
      code fires and the timer entry is now corrupt.
      
      Fix by validating the controller state before rescheduling the keep
      alive. Testing with the fix has confirmed the condition above was hit.
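
The sequence described above can be replayed with a small model. This is only an illustrative sketch, not the driver code: the class, the state names and the boolean flags are hypothetical stand-ins for the controller, its keep alive timer and the in-flight keep alive io.

```python
from enum import Enum, auto


class CtrlState(Enum):
    # Hypothetical state names, for illustration only.
    LIVE = auto()
    DELETING = auto()


class Controller:
    def __init__(self):
        self.state = CtrlState.LIVE
        self.timer_armed = False   # stands in for the keep alive timer
        self.io_in_flight = False  # stands in for the keep alive io

    def keep_alive_work(self):
        # Timer fired: it is no longer pending, and the io is started.
        self.timer_armed = False
        self.io_in_flight = True

    def stop_keep_alive(self):
        # Cancels only a pending timer; an in-flight io is untouched.
        self.timer_armed = False

    def keep_alive_done(self, check_state):
        # Completion handler: re-arm the timer, optionally guarded by state.
        self.io_in_flight = False
        if check_state and self.state is not CtrlState.LIVE:
            return
        self.timer_armed = True


def timer_armed_after_delete(check_state):
    """Replay the failing sequence and report whether the timer is re-armed."""
    ctrl = Controller()
    ctrl.timer_armed = True
    ctrl.keep_alive_work()             # keep alive goes in flight
    ctrl.state = CtrlState.DELETING    # delete starts
    ctrl.stop_keep_alive()             # nothing pending to cancel
    ctrl.keep_alive_done(check_state)  # io completes successfully
    return ctrl.timer_armed


if __name__ == "__main__":
    print("without state check:", timer_armed_after_delete(False))
    print("with state check:   ", timer_armed_after_delete(True))
```

Without the state check the model leaves a timer armed on a controller that is being deleted; with the check the completion handler declines to re-arm it.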
      
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • block, bfq: fix decrement of num_active_groups · ba7aeae5
      Paolo Valente authored
      Since commit '2d29c9f8 ("block, bfq: improve asymmetric scenarios
      detection")', if there are process groups with I/O requests waiting for
      completion, then BFQ tags the scenario as 'asymmetric'. This detection
      is needed for preserving service guarantees (for details, see comments
      on the computation of the variable asymmetric_scenario in the
      function bfq_better_to_idle).
      
      Unfortunately, commit '2d29c9f8 ("block, bfq: improve asymmetric
      scenarios detection")' contains an error exactly in the updating of
      the number of groups with I/O requests waiting for completion: if a
      group has more than one descendant process, then the above number of
      groups, which is renamed from num_active_groups to a more appropriate
      num_groups_with_pending_reqs by this commit, may happen to be wrongly
      decremented multiple times, namely every time one of the descendant
      processes gets all its pending I/O requests completed.
      
      A correct, complete solution should work as follows. Consider a group
      that is inactive, i.e., that has no descendant process with pending
      I/O inside BFQ queues. Then suppose that num_groups_with_pending_reqs
      is still accounting for this group, because the group still has some
      descendant process with some I/O request still in
      flight. num_groups_with_pending_reqs should be decremented when the
      in-flight request of the last descendant process is finally completed
      (assuming that nothing else has changed for the group in the meantime,
      in terms of composition of the group and active/inactive state of
      child groups and processes). To accomplish this, an additional
      pending-request counter must be added to entities, and must be
      updated correctly.
      
      To avoid this additional field and operations, this commit resorts to
      the following tradeoff between simplicity and accuracy: for an
      inactive group that is still counted in num_groups_with_pending_reqs,
      this commit decrements num_groups_with_pending_reqs when the first
      descendant process of the group remains with no request waiting for
      completion.
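
The unbalanced decrement and the simplified scheme can be compared with a toy counter. This is a sketch under stated assumptions: a single group, a plain Python integer in place of the scheduler's counter, and a hypothetical still_counted flag recording whether the group is still accounted for.

```python
def drain_group(num_procs, decrement_once):
    """Return the counter after every process of one group has drained."""
    num_groups_with_pending_reqs = 1  # the single group is counted once
    still_counted = True              # hypothetical per-group flag

    for _ in range(num_procs):
        # One descendant process has just had all its requests completed.
        if decrement_once:
            # Simplified scheme: decrement only for the first such process.
            if still_counted:
                num_groups_with_pending_reqs -= 1
                still_counted = False
        else:
            # Erroneous behaviour: decrement for every such process.
            num_groups_with_pending_reqs -= 1

    return num_groups_with_pending_reqs


if __name__ == "__main__":
    print("decrement per process:", drain_group(3, decrement_once=False))
    print("decrement once:       ", drain_group(3, decrement_once=True))
```

With three descendant processes the per-process variant drives the counter below zero, while the decrement-once variant returns it to zero.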
      
      This simplified scheme provides a fix to the unbalanced decrements
      introduced by 2d29c9f8. Since this error was also caused by lack
      of comments on this non-trivial issue, this commit also adds related
      comments.
      
      Fixes: 2d29c9f8 ("block, bfq: improve asymmetric scenarios detection")
      Reported-by: Steven Barrett <steven@liquorix.net>
      Tested-by: Steven Barrett <steven@liquorix.net>
      Tested-by: Lucjan Lucjanov <lucjan.lucjanov@gmail.com>
      Reviewed-by: Federico Motta <federico@willer.it>
      Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • crypto: user - Disable statistics interface · e61efff4
      Herbert Xu authored
      
      
      Since this user-space API is still undergoing significant changes,
      this patch disables it for the current merge window.
      
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • Merge tag 'drm-fixes-2018-12-07' of git://anongit.freedesktop.org/drm/drm · d387ac13
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "There's a bit more in here than I'd like, and I'm hoping things calm
        down when I'm out.
      
        msm:
         - a bunch of display fixes for the new DPU
         - a couple of command submission fixes
      
        omap:
         - some DSI fixes
      
        ast:
         - driver unload crash fix
      
        core:
         - fix the lease uevent so userspace can distinguish it
      
        amd:
         - fix a bpc regression
         - fix lru handling regression
         - fixed firmware support for new GPUs
         - power management fixes for vega20"
      
      * tag 'drm-fixes-2018-12-07' of git://anongit.freedesktop.org/drm/drm: (37 commits)
        drm/ast: Fix connector leak during driver unload
        drm/amdgpu/vcn: Update vcn.cur_state during suspend
        drm/amd/display: Fix overflow/truncation from strncpy.
        drm/amd/powerplay: improve OD code robustness
        drm/amdgpu: enlarge maximum waiting time of KIQ
        drm/fb-helper: Fix typo in parameter description
        drm/amd/powerplay: support SoftMin/Max setting for some specific DPM
        drm/amd/powerplay: issue pre-display settings for display change event
        drm/amd/powerplay: support new pptable upload on Vega20
        drm/amdgpu/gmc8: always load MC firmware in the driver
        drm/amdgpu/gmc8: update MC firmware for polaris
        drm/amdgpu: update mc firmware image for polaris12 variants
        drm/msm: Fix error return checking
        drm/msm/dpu: Ignore alpha for XBGR8888 format
        drm/msm: dpu: Fix "WARNING: invalid free of devm_ allocated data"
        drm/msm/hdmi: Drop pointless static qualifier in msm_hdmi_bind()
        drm/msm: Move fence put to where failure occurs
        drm/msm: dpu: Don't set legacy plane->crtc pointer
        drm/msm/gpu: Don't map command buffers with nr_relocs equal to 0
        drm/msm/hdmi: Enable HPD after HDMI IRQ is set up
        ...
    • Merge tag 'nfs-for-4.20-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 7f80c732
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "This is mainly fallout from the updates to the SUNRPC code that is
        being triggered from less common combinations of NFS mount options.
      
        Highlights include:
      
        Stable fixes:
         - Fix a page leak when using RPCSEC_GSS/krb5p to encrypt data.
      
        Bugfixes:
         - Fix a regression that causes the RPC receive code to hang
         - Fix call_connect_status() so that it handles tasks that got
           transmitted while queued waiting for the socket lock.
         - Fix a memory leak in call_encode()
         - Fix several other connect races.
         - Fix receive code error handling.
         - Use the discard iterator rather than MSG_TRUNC for compatibility
           with AF_UNIX/AF_LOCAL sockets.
         - nfs: don't dirty kernel pages read by direct-io
         - pnfs/Flexfiles fix to enforce per-mirror stateid only for NFSv4
           data servers"
      
      * tag 'nfs-for-4.20-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: Don't force a redundant disconnection in xs_read_stream()
        SUNRPC: Fix up socket polling
        SUNRPC: Use the discard iterator rather than MSG_TRUNC
        SUNRPC: Treat EFAULT as a truncated message in xs_read_stream_request()
        SUNRPC: Fix up handling of the XDRBUF_SPARSE_PAGES flag
        SUNRPC: Fix RPC receive hangs
        SUNRPC: Fix a potential race in xprt_connect()
        SUNRPC: Fix a memory leak in call_encode()
        SUNRPC: Fix leak of krb5p encode pages
        SUNRPC: call_connect_status() must handle tasks that got transmitted
        nfs: don't dirty kernel pages read by direct-io
        flexfiles: enforce per-mirror stateid only for v4 DSes
    • Merge branch 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm · b72f711a
      Linus Torvalds authored
      Pull ARM spectre fix from Russell King:
       "Exynos folk noticed that CPU hotplug wasn't working with their kernel
        configuration, and have tested this as fixing the problem"
      
      * 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: ensure that processor vtables is not lost after boot
    • Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · 7e40b56c
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "Some small fixes that have been accumulated:
      
         - Chris Cole noticed that in a SMP environment, the DMA cache
           coherence handling can produce undesirable results in a corner
           case
      
         - Propagate that fix for ARMv7M as well
      
         - Fix a false positive with source fortification
      
         - Fix an uninitialised return that Nathan Jones spotted"
      
      * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 8816/1: dma-mapping: fix potential uninitialized return
        ARM: 8815/1: V7M: align v7m_dma_inv_range() with v7 counterpart
        ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling
        ARM: 8806/1: kprobes: Fix false positive with FORTIFY_SOURCE
    • i2c: uniphier-f: fix violation of tLOW requirement for Fast-mode · ece27a33
      Masahiro Yamada authored
      
      
      Currently, the clock duty is set as tLOW/tHIGH = 1/1. For Fast-mode,
      tLOW is set to 1.25 us while the I2C spec requires tLOW >= 1.3 us.
      
      tLOW/tHIGH = 5/4 would meet both Standard-mode and Fast-mode:
        Standard-mode: tLOW = 5.56 us, tHIGH = 4.44 us
        Fast-mode:     tLOW = 1.39 us, tHIGH = 1.11 us
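
The figures above follow from splitting one SCL period in the given ratio. The sketch below reproduces the arithmetic, assuming bus rates of 100 kHz for Standard-mode and 400 kHz for Fast-mode (consistent with the 10 us and 2.5 us periods implied by the numbers); the function name is made up for the example.

```python
def scl_timing_us(bus_hz, low_ratio, high_ratio):
    """Split one SCL period (in microseconds) as low_ratio : high_ratio."""
    period_us = 1e6 / bus_hz
    total = low_ratio + high_ratio
    t_low = period_us * low_ratio / total
    t_high = period_us * high_ratio / total
    return t_low, t_high


FAST_MODE_TLOW_MIN_US = 1.3  # limit quoted in the commit message

if __name__ == "__main__":
    for name, hz in (("Standard-mode", 100_000), ("Fast-mode", 400_000)):
        for ratio in ((1, 1), (5, 4)):
            t_low, t_high = scl_timing_us(hz, *ratio)
            print(f"{name} {ratio[0]}/{ratio[1]}: "
                  f"tLOW = {t_low:.2f} us, tHIGH = {t_high:.2f} us")
```

With a 1/1 duty, Fast-mode gives tLOW = 1.25 us, below the 1.3 us limit; with 5/4 it gives 1.39 us.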
      
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    • i2c: uniphier: fix violation of tLOW requirement for Fast-mode · 8469636a
      Masahiro Yamada authored
      
      
      Currently, the clock duty is set as tLOW/tHIGH = 1/1. For Fast-mode,
      tLOW is set to 1.25 us while the I2C spec requires tLOW >= 1.3 us.
      
      tLOW/tHIGH = 5/4 would meet both Standard-mode and Fast-mode:
        Standard-mode: tLOW = 5.56 us, tHIGH = 4.44 us
        Fast-mode:     tLOW = 1.39 us, tHIGH = 1.11 us
      
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    • i2c: uniphier-f: fill TX-FIFO only in IRQ handler for repeated START · cd8843f5
      Masahiro Yamada authored
      
      
       - For a repeated START condition, this controller starts data transfer
         immediately after the slave address is written to the TX-FIFO.
      
       - Once the TX-FIFO empty interrupt is asserted, the controller makes
         a pause even if additional data are written to the TX-FIFO.
      
      Given those circumstances, the data after a repeated START may not be
      transferred if the interrupt is asserted while the TX-FIFO is being
      filled up. A more reliable way is to append TX data only in the
      interrupt handler.
      
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    • i2c: uniphier-f: fix timeout error after reading 8 bytes · c2a653de
      Masahiro Yamada authored
      I was totally screwed up in commit eaba6878 ("i2c: uniphier-f:
      fix race condition when IRQ is cleared"). Since that commit, if the
      number of read bytes is a multiple of the FIFO size (8, 16, 24... bytes),
      the STOP condition could be issued twice, depending on the timing.
      If this happens, the controller will go wrong, resulting in the timeout
      error.
      
      It was more than 3 years ago when I wrote this driver, so my memory
      about this hardware was vague. Please let me correct the description
      in the commit log of eaba6878.
      
      Clearing the IRQ status on exiting the IRQ handler is absolutely
      fine. This controller makes a pause while any IRQ status is asserted.
      If the IRQ status is cleared first, the hardware may start the next
      transaction before the IRQ handler finishes what it is supposed to do.
      
      This partially reverts the bad commit with clear comments so that I
      will never repeat this mistake.
      
      I also investigated what is happening at the last moment of the read
      mode. The UNIPHIER_FI2C_INT_RF interrupt is asserted a bit earlier
      (by half a period of the clock cycle) than UNIPHIER_FI2C_INT_RB.
      
      I consulted a hardware engineer, and I got the following information:
      
      UNIPHIER_FI2C_INT_RF
          asserted at the falling edge of SCL at the 8th bit.
      
      UNIPHIER_FI2C_INT_RB
          asserted at the rising edge of SCL at the 9th (ACK) bit.
      
      In order to avoid calling uniphier_fi2c_stop() twice, check the latter
      interrupt. I also commented this because it is obscure hardware internal.
      
      Fixes: eaba6878 ("i2c: uniphier-f: fix race condition when IRQ is cleared")
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    • i2c: scmi: Fix probe error on devices with an empty SMB0001 ACPI device node · 0544ee4b
      Hans de Goede authored
      
      
      Some AMD based HP laptops have a SMB0001 ACPI device node which does not
      define any methods.
      
      This leads to the following error in dmesg:
      
      [    5.222731] cmi: probe of SMB0001:00 failed with error -5
      
      This commit makes acpi_smbus_cmi_add() return -ENODEV instead in this case
      silencing the error. In case of a failure of the i2c_add_adapter() call
      this commit now propagates the error from that call instead of -EIO.
      
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    • i2c: axxia: properly handle master timeout · 6c7f25ca
      
      
      According to the Intel (R) Axxia (TM) Lionfish Communication Processor
      Peripheral Subsystem Hardware Reference Manual, the AXXIA I2C module
      has a programmable Master Wait Timer which, among other things, checks the
      time between commands sent in manual mode. When a timeout (25ms) passes,
      TSS bit is set in Master Interrupt Status register and a Stop command is
      issued by the hardware.
      
      The axxia_i2c_xfer(), does not properly handle this situation, however.
      For each message a separate axxia_i2c_xfer_msg() is called and this
      function incorrectly assumes that any interrupt might happen only when
      waiting for completion. This is mostly correct but there is one
      exception - a master timeout can trigger if enough time has passed
      between individual transfers. It will, by definition, happen between
      transfers when the interrupts are disabled by the code. If that happens,
      the hardware issues Stop command.
      
      The interrupt indicating timeout will not be triggered as soon as we
      enable them since the Master Interrupt Status is cleared when master
      mode is entered again (which happens before enabling irqs) meaning this
      error is lost and the transfer is continued even though the Stop was
      issued on the bus. The subsequent operations complete without error but
      a bogus value (0xFF in case of read) is read as the client device is
      confused by the aborted transfer. No error is returned from
      master_xfer(), making the caller believe that a valid value was read.
      
      To fix the problem, the TSS bit (indicating timeout) in Master Interrupt
      Status register is checked before each transfer. If it is set, there was
      a timeout before this transfer and (as described above) the hardware
      already issued a Stop command, so the transaction should be aborted and
      -ETIMEDOUT is returned from the master_xfer() callback. In order to be
      sure no timeout was issued we can't just read the status just before
      starting new transaction as there will always be a small window of time
      (few CPU cycles at best) where this might still happen. For this reason
      we have to temporarily disable the timer before checking for TSS bit.
      Disabling it will, however, clear the TSS bit so in order to preserve
      that information, we have to read it in ISR so we have to ensure that
      the TSS interrupt is not masked between transfers of one transaction.
      There is no need to call bus recovery or controller reinitialization if
      that happens so it's skipped.
      
      Signed-off-by: Krzysztof Adamski <krzysztof.adamski@nokia.com>
      Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    • vhost/vsock: fix use-after-free in network stack callers · 834e772c
      Stefan Hajnoczi authored
      
      
      If the network stack calls .send_pkt()/.cancel_pkt() during .release(),
      a struct vhost_vsock use-after-free is possible.  This occurs because
      .release() does not wait for other CPUs to stop using struct
      vhost_vsock.
      
      Switch to an RCU-enabled hashtable (indexed by guest CID) so that
      .release() can wait for other CPUs by calling synchronize_rcu().  This
      also eliminates vhost_vsock_lock acquisition in the data path so it
      could have a positive effect on performance.
      
      This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt".
      
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: <syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com>
      Reported-by: <syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com>
      Reported-by: <syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
    • virtio/s390: fix race in ccw_io_helper() · 78b1a52e
      Halil Pasic authored
      While ccw_io_helper() seems intended to be exclusive, in the sense that
      it is supposed to facilitate I/O for at most one thread at any given
      time, there is actually nothing ensuring that threads won't pile up at
      vcdev->wait_q. If they do, all threads get woken up and see the status
      that belongs to some other request than their own. This can lead to bugs.
      For an example see:
      https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788432
      
      
      
      This race normally does not cause any problems. The operations provided
      by struct virtio_config_ops are usually invoked in a well defined
      sequence, normally don't fail, and are normally used quite infrequently
      too.
      
      Yet, if some of these operations are directly triggered via sysfs
      attributes, like in the case described by the referenced bug, userspace
      is given an opportunity to force races by increasing the frequency of the
      given operations.
      
      Let us fix the problem by ensuring that for each device, we finish
      processing the previous request before starting with a new one.
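
The idea of finishing one request per device before starting the next can be illustrated with ordinary threads. This is only a sketch: the Device class, its io_lock and its status slot are hypothetical and model the shared per-device status described above, not the actual driver structures.

```python
import threading


class Device:
    def __init__(self):
        self.io_lock = threading.Lock()  # hypothetical per-device lock
        self.status = None               # shared status slot, one per device
        self.active = 0                  # requests currently being processed
        self.max_active = 0              # highest concurrency ever observed

    def io_helper(self, request_id):
        # Hold the lock for the whole request so the status read at the
        # end belongs to this request and not to another thread's.
        with self.io_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            self.status = request_id  # the "device" reports completion
            result = self.status      # the caller reads back the status
            self.active -= 1
            return result


def run(n_threads=8):
    """Issue one request per thread against a single device."""
    dev = Device()
    results = {}

    def worker(i):
        results[i] = dev.io_helper(i)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dev, results


if __name__ == "__main__":
    dev, results = run()
    print("max concurrent requests:", dev.max_active)
```

Because every request holds the per-device lock from start to finish, each caller sees only the status of its own request.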
      
      Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Cc: stable@vger.kernel.org
      Message-Id: <20180925121309.58524-3-pasic@linux.ibm.com>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio/s390: avoid race on vcdev->config · 2448a299
      Halil Pasic authored
      
      
      Currently we have a race on vcdev->config in virtio_ccw_get_config() and
      in virtio_ccw_set_config().
      
      This normally does not cause problems, as these are usually infrequent
      operations. However, for some devices writing to/reading from the config
      space can be triggered through sysfs attributes. For these, userspace can
      force the race by increasing the frequency.
      
      Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
      Cc: stable@vger.kernel.org
      Message-Id: <20180925121309.58524-2-pasic@linux.ibm.com>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost/vsock: fix reset orphans race with close timeout · c38f57da
      Stefan Hajnoczi authored
      
      
      If a local process has closed a connected socket and hasn't received a
      RST packet yet, then the socket remains in the table until a timeout
      expires.
      
      When a vhost_vsock instance is released with the timeout still pending,
      the socket is never freed because vhost_vsock has already set the
      SOCK_DONE flag.
      
      Check if the close timer is pending and let it close the socket.  This
      prevents the race which can leak sockets.
      
      Reported-by: Maximilian Riemensberger <riemensberger@cadami.net>
      Cc: Graham Whaley <graham.whaley@gmail.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • Merge tag 'trace-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · abb8d6ec
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "This is a single commit that fixes a bug in uprobes SDT code due to a
        missing mutex protection"
      
      * tag 'trace-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        Uprobes: Fix kernel oops with delayed_uprobe_remove()
    • Merge tag 'sound-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2acee31c
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Still more incoming fixes than wished at this stage, but all look like
        small and reasonable fixes.
      
        In addition to the usual HD-audio and USB-audio quirks for various
        devices, two notable changes are included:
      
         - a fix for USB-audio UAF at probing a malformed descriptor
      
         - workarounds for PCM rwsem mutex starvation"
      
      * tag 'sound-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4860G/Z6860G
        ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4660G
        ALSA: hda/realtek - Add support for Acer Aspire C24-860 headset mic
        ALSA: hda/realtek: ALC286 mic and headset-mode fixups for Acer Aspire U27-880
        ALSA: usb-audio: Fix UAF decrement if card has no live interfaces in card.c
        ALSA: hda/realtek - Fix speaker output regression on Thinkpad T570
        ALSA: pcm: Fix interval evaluation with openmin/max
        ALSA: hda: Add support for AMD Stoney Ridge
        ALSA: usb-audio: Add SMSL D1 to quirks for native DSD support
        ALSA: pcm: Fix starvation on down_write_nonblock()
        ALSA: pcm: Call snd_pcm_unlink() conditionally at closing
    • Merge tag 'csky-4.20-rc6' of github.com:c-sky/csky-linux · 002f421a
      Linus Torvalds authored
      Pull C-SKY fixes from Guo Ren:
      
       - bugfix for tlb_get_pgd() error
      
       - update MAINTAINERS file for C-SKY drivers
      
      * tag 'csky-4.20-rc6' of github.com:c-sky/csky-linux:
        csky: bugfix tlb_get_pgd error.
        MAINTAINERS: add maintainer for C-SKY drivers
    • dmaengine: dw: Fix FIFO size for Intel Merrifield · ffe843b1
      Andy Shevchenko authored
      Intel Merrifield has a reduced size of FIFO used in iDMA 32-bit controller,
      i.e. 512 bytes instead of 1024.
      
      Fix this by partitioning it as 64 bytes per channel.
      
      Note, in the future we might switch to 'fifo-size' property instead of
      hard coded value.
      
      Fixes: 199244d6 ("dmaengine: dw: add support of iDMA 32-bit hardware")
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
    • stackleak: Register the 'stackleak_cleanup' pass before the '*free_cfg' pass · 8fb2dfb2
      Alexander Popov authored
      Currently the 'stackleak_cleanup' pass deleting a CALL insn is executed
      after the 'reload' pass. That allows gcc to do some weird optimization in
      function prologues and epilogues, which are generated later [1].
      
      Let's avoid that by registering the 'stackleak_cleanup' pass before
      the '*free_cfg' pass. It's the moment when the stack frame size is
      already final, function prologues and epilogues are generated, and the
      machine-dependent code transformations are not done.
      
      [1] https://www.openwall.com/lists/kernel-hardening/2018/11/23/2
      
      
      
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Alexander Popov <alex.popov@linux.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • ARM: ensure that processor vtables is not lost after boot · 3a4d0c21
      Russell King authored
      
      
      Marek Szyprowski reported problems with CPU hotplug in current kernels.
      This was tracked down to the processor vtables being located in an
      init section, and therefore discarded after kernel boot, despite being
      required after boot to properly initialise the non-boot CPUs.
      
      Arrange for these tables to end up in .rodata when required.
      
      Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
      Fixes: 383fb3ee ("ARM: spectre-v2: per-CPU vtables to work around big.Little systems")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
  4. Dec 06, 2018