  1. Jun 10, 2021
    • btrfs: fix deadlock when cloning inline extents and low on available space · baa67631
      Filipe Manana authored
      commit 76a6d5cd upstream.
      
      There are a few cases where cloning an inline extent requires copying data
      into a page of the destination inode. For these cases we are allocating
      the required data and metadata space while holding a leaf locked. This can
      result in a deadlock when we are low on available space because allocating
      the space may flush delalloc and two deadlock scenarios can happen:
      
      1) When starting writeback for an inode with a very small dirty range that
         fits in an inline extent, we deadlock during the writeback when trying
         to insert the inline extent, at cow_file_range_inline(), if the extent
         is going to be located in the leaf for which we are already holding a
         read lock;
      
      2) After successfully starting writeback, for non-inline extent cases,
         the async reclaim thread will hang waiting for an ordered extent to
         complete if the ordered extent completion needs to modify the leaf
         for which the clone task is holding a read lock (for adding or
         replacing file extent items). So the cloning task will wait forever
         on the async reclaim thread to make progress, which in turn is
         waiting for the ordered extent completion which in turn is waiting
         to acquire a write lock on the same leaf.
      
      So fix this by making sure we release the path (and therefore the leaf)
      every time we need to copy the inline extent's data into a page of the
      destination inode, as by that time we do not need to have the leaf locked.
      
      Fixes: 05a5a762 ("Btrfs: implement full reflink support for inline extents")
      CC: stable@vger.kernel.org # 5.10+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      baa67631
    • btrfs: abort in rename_exchange if we fail to insert the second ref · 0df50d47
      Josef Bacik authored
      commit dc09ef35 upstream.
      
      Error injection stress uncovered a problem where we'd leave a dangling
      inode ref if we failed during a rename_exchange.  This happens because
      we insert the inode ref for one side of the rename, and then for the
      other side.  If this second inode ref insert fails we'll leave the first
      one dangling and leave a corrupt file system behind.  Fix this by
      aborting if we did the insert for the first inode ref.
      
      CC: stable@vger.kernel.org # 4.9+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0df50d47
    • btrfs: fixup error handling in fixup_inode_link_counts · 48568f39
      Josef Bacik authored
      commit 011b28ac upstream.
      
      This function has the following pattern
      
      	while (1) {
      		ret = whatever();
      		if (ret)
      			goto out;
      	}
      	ret = 0;
      out:
      	return ret;
      
      However, in several places in this while loop we simply break; when
      there's a problem, thus clearing the return value, and in one case we do
      a return -EIO, and leak the memory for the path.
      
      Fix this by re-arranging the loop to deal with ret == 1 coming from
      btrfs_search_slot, and then simply delete the
      
      	ret = 0;
      out:
      
      bit so everybody can break if there is an error, which will allow for
      proper error handling to occur.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      48568f39
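The loop pattern described in the commit message above can be reproduced in a small userspace sketch. This is not the btrfs code: `step()`, `walk_buggy()` and `walk_fixed()` are hypothetical names, and `step()` only imitates a call such as btrfs_search_slot() that returns 0 to continue, 1 when nothing is left, and a negative errno on failure.

```c
#include <errno.h>

/* Hypothetical per-iteration step: 0 = keep going, 1 = nothing left,
 * negative errno = failure. */
static int step(int i, int fail_at, int end_at)
{
	if (i == fail_at)
		return -EIO;
	if (i == end_at)
		return 1;
	return 0;
}

/* Buggy shape: the loop breaks on error, then "ret = 0" hides it. */
static int walk_buggy(int fail_at, int end_at)
{
	int ret;
	int i = 0;

	while (1) {
		ret = step(i++, fail_at, end_at);
		if (ret)
			break;
	}
	ret = 0;	/* overwrites -EIO from the failed step */
	return ret;
}

/* Fixed shape: handle ret == 1 inside the loop and drop the trailing
 * "ret = 0", so a break on error keeps the error code. */
static int walk_fixed(int fail_at, int end_at)
{
	int ret;
	int i = 0;

	while (1) {
		ret = step(i++, fail_at, end_at);
		if (ret > 0) {
			ret = 0;	/* normal end of iteration */
			break;
		}
		if (ret < 0)
			break;		/* error is preserved */
	}
	return ret;
}
```

With a failure injected at the third step, `walk_buggy(2, 5)` reports success while `walk_fixed(2, 5)` returns -EIO, which is the difference the fix is after.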
    • btrfs: return errors from btrfs_del_csums in cleanup_ref_head · 466d83fd
      Josef Bacik authored
      commit 856bd270 upstream.
      
      We are unconditionally returning 0 in cleanup_ref_head, despite the fact
      that btrfs_del_csums could fail.  We need to return the error so the
      transaction gets aborted properly.  Fix this by returning ret from
      btrfs_del_csums in cleanup_ref_head.
      
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      466d83fd
    • btrfs: fix error handling in btrfs_del_csums · 5a89982f
      Josef Bacik authored
      commit b86652be upstream.
      
      Error injection stress would sometimes fail with checksums on disk that
      did not have a corresponding extent.  This occurred because the pattern
      in btrfs_del_csums was
      
      	while (1) {
      		ret = btrfs_search_slot();
      		if (ret < 0)
      			break;
      	}
      	ret = 0;
      out:
      	btrfs_free_path(path);
      	return ret;
      
      If we got an error from btrfs_search_slot we'd clear the error because
      we were breaking instead of goto out.  Instead of using goto out, simply
      handle the cases where we may leave a random value in ret, and get rid
      of the
      
      	ret = 0;
      out:
      
      pattern and simply allow break to have the proper error reporting.  With
      this fix we properly abort the transaction and do not commit thinking we
      successfully deleted the csum.
      
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a89982f
    • btrfs: mark ordered extent and inode with error if we fail to finish · b547a16b
      Josef Bacik authored
      commit d61bec08 upstream.
      
      While doing error injection testing I saw that sometimes we'd get an
      abort that wouldn't stop the current transaction commit from completing.
      This abort was coming from finish ordered IO, but at this point in the
      transaction commit we should have gotten an error and stopped.
      
      It turns out the abort came from finish ordered io while trying to write
      out the free space cache.  It occurred to me that any failure inside of
      finish_ordered_io isn't actually raised to the person doing the writing,
      so we could have any number of failures in this path and think the
      ordered extent completed successfully and the inode was fine.
      
      Fix this by marking the ordered extent with BTRFS_ORDERED_IOERR, and
      marking the mapping of the inode with mapping_set_error, so any callers
      that simply call fdatawait will also get the error.
      
      With this we're seeing the IO error on the free space inode when we fail
      to do the finish_ordered_io.
      
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b547a16b
    • powerpc/kprobes: Fix validation of prefixed instructions across page boundary · 5e5e63ba
      Naveen N Rao authored
      commit 82123a3d upstream.
      
      When checking if the probed instruction is the suffix of a prefixed
      instruction, we access the instruction at the previous word. If the
      probed instruction is the very first word of a module, we can end up
      trying to access an invalid page.
      
      Fix this by skipping the check for all instructions at the beginning of
      a page. Prefixed instructions cannot cross a 64-byte boundary and as
      such, we don't expect to encounter a suffix as the very first word in a
      page for kernel text. Even if there are prefixed instructions crossing
      a page boundary (from a module, for instance), the instruction will be
      illegal, so preventing probing on the suffix of such prefix instructions
      isn't worthwhile.
      
      Fixes: b4657f76 ("powerpc/kprobes: Don't allow breakpoints on suffixes")
      Cc: stable@vger.kernel.org # v5.8+
      Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/0df9a032a05576a2fa8e97d1b769af2ff0eafbd6.1621416666.git.naveen.n.rao@linux.vnet.ibm.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e5e63ba
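The idea in the commit message above, skipping the look at the previous word whenever the probed address is the first word of a page, can be sketched in userspace C. Everything here is illustrative rather than the kernel's code: `SKETCH_PAGE_SIZE` is an assumed 4 KiB page, `word_is_prefix()` assumes that a prefix word is recognised by primary opcode 1 in its top six bits, and `probe_hits_suffix()` is a hypothetical helper that expects `base` to be page aligned.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* True when addr is the very first byte of a page. */
static bool at_page_start(uintptr_t addr)
{
	return (addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Assumption: a prefix word carries primary opcode 1 in bits 26..31. */
static bool word_is_prefix(uint32_t word)
{
	return (word >> 26) == 1;
}

/*
 * Returns true when text[idx] looks like the suffix of a prefixed
 * instruction and a probe there should be rejected. base is the
 * (page-aligned) address of text[0].
 */
static bool probe_hits_suffix(const uint32_t *text, uintptr_t base, size_t idx)
{
	uintptr_t addr = base + idx * sizeof(uint32_t);

	/* Never read text[idx - 1] here: it may live on an unmapped page. */
	if (at_page_start(addr))
		return false;

	return word_is_prefix(text[idx - 1]);
}
```

The page-start test comes first on purpose, so the read of the previous word is only reached when that word is known to be on the same page.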
    • x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing · 42f75a43
      Thomas Gleixner authored
      commit 7d65f9e8 upstream.
      
      PIC interrupts do not support affinity setting and they can end up on
      any online CPU. Therefore, it's required to mark the associated vectors
      as system-wide reserved. Otherwise, the corresponding irq descriptors
      are copied to the secondary CPUs but the vectors are not marked as
      assigned or reserved. This works correctly for the IO/APIC case.
      
      When the IO/APIC is disabled via config, kernel command line or lack of
      enumeration then all legacy interrupts are routed through the PIC, but
      nothing marks them as system-wide reserved vectors.
      
      As a consequence, a subsequent allocation on a secondary CPU can result in
      allocating one of these vectors, which triggers the BUG() in
      apic_update_vector() because the interrupt descriptor slot is not empty.
      
      Imran tried to work around that by marking those interrupts as allocated
      when a CPU comes online. But that's wrong in case that the IO/APIC is
      available and one of the legacy interrupts, e.g. IRQ0, has been switched to
      PIC mode because then marking them as allocated will fail as they are
      already marked as system vectors.
      
      Stay consistent and update the legacy vectors after attempting IO/APIC
      initialization and mark them as system vectors in case that no IO/APIC is
      available.
      
      Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
      Reported-by: Imran Khan <imran.f.khan@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20210519233928.2157496-1-imran.f.khan@oracle.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      42f75a43
    • drm/amdgpu: make sure we unpin the UVD BO · 3a6b6922
      Nirmoy Das authored
      commit 07438603 upstream.
      
      Releasing pinned BOs is illegal now. UVD 6 was missing from:
      commit 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")
      
      Fixes: 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")
      Cc: stable@vger.kernel.org
      Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a6b6922
    • drm/amdgpu: Don't query CE and UE errors · 58da0b50
      Luben Tuikov authored
      commit dce3d8e1 upstream.
      
      On QUERY2 IOCTL don't query counts of correctable
      and uncorrectable errors, since when RAS is
      enabled and supported on Vega20 server boards,
      this takes an insurmountably long time, in O(n^3),
      which slows the system down to the point of it
      being unusable when we have GUI up.
      
      Fixes: ae363a21 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
      Cc: Alexander Deucher <Alexander.Deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
      Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      58da0b50
    • nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect · 48ee0db6
      Krzysztof Kozlowski authored
      commit 4ac06a1e upstream.
      
      It's possible for a local unprivileged user to trigger a NULL pointer
      dereference when calling getsockname() after a failed bind() (e.g. the
      bind fails because LLCP_SAP_MAX is used as SAP):
      
        BUG: kernel NULL pointer dereference, address: 0000000000000000
        CPU: 1 PID: 426 Comm: llcp_sock_getna Not tainted 5.13.0-rc2-next-20210521+ #9
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1 04/01/2014
        Call Trace:
         llcp_sock_getname+0xb1/0xe0
         __sys_getpeername+0x95/0xc0
         ? lockdep_hardirqs_on_prepare+0xd5/0x180
         ? syscall_enter_from_user_mode+0x1c/0x40
         __x64_sys_getpeername+0x11/0x20
         do_syscall_64+0x36/0x70
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      This can be reproduced with Syzkaller C repro (bind followed by
      getpeername):
      https://syzkaller.appspot.com/x/repro.c?x=14def446e00000
      
      Cc: <stable@vger.kernel.org>
      Fixes: d646960f ("NFC: Initial LLCP support")
      Reported-by: <syzbot+80fb126e7f7d8b1a5914@syzkaller.appspotmail.com>
      Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
      Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
      Link: https://lore.kernel.org/r/20210531072138.5219-1-krzysztof.kozlowski@canonical.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      48ee0db6
    • x86/sev: Check SME/SEV support in CPUID first · 445477e9
      Pu Wen authored
      commit 009767db upstream.
      
      The first two bits of the CPUID leaf 0x8000001F EAX indicate whether SME
      or SEV is supported, respectively. It's better to check whether SEV or
      SME is actually supported before accessing the MSR_AMD64_SEV to check
      whether SEV or SME is enabled.
      
      This is both a bare-metal issue and a guest/VM issue. Since the first
      generation Hygon Dhyana CPU doesn't support the MSR_AMD64_SEV, reading that
      MSR results in a #GP - either directly from hardware in the bare-metal
      case or via the hypervisor (because the RDMSR is actually intercepted)
      in the guest/VM case, resulting in a failed boot. And since this is very
      early in the boot phase, rdmsrl_safe()/native_read_msr_safe() can't be
      used.
      
      So check the CPUID bits first, before accessing the MSR.
      
       [ tlendacky: Expand and improve commit message. ]
       [ bp: Massage commit message. ]
      
      Fixes: eab696d8 ("x86/sev: Do not require Hypervisor CPUID bit for SEV guests")
      Signed-off-by: Pu Wen <puwen@hygon.cn>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: <stable@vger.kernel.org> # v5.10+
      Link: https://lkml.kernel.org/r/20210602070207.2480-1-puwen@hygon.cn
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      445477e9
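The ordering described in the commit message above (consult the CPUID bits first, touch the MSR only if one of them is set) can be sketched as a plain C function. This is an illustration under assumptions, not the kernel's early-boot code: the `SKETCH_*` macros, `mem_encrypt_enabled()` and the stub `sketch_read_msr()` are hypothetical names, with bit 0 of EAX taken as SME and bit 1 as SEV, and the MSR read is passed in as a callback so the sketch runs anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SME_BIT (1u << 0)	/* CPUID 0x8000001F EAX bit 0 */
#define SKETCH_SEV_BIT (1u << 1)	/* CPUID 0x8000001F EAX bit 1 */

/* Hypothetical callback standing in for a read of MSR_AMD64_SEV. */
typedef uint64_t (*sketch_read_msr_fn)(void);

/* Stub used for demonstration: counts reads and reports "enabled". */
static int sketch_msr_reads;

static uint64_t sketch_read_msr(void)
{
	sketch_msr_reads++;
	return 1;
}

/*
 * Returns true when memory encryption is reported as enabled. The MSR
 * callback is invoked only if CPUID advertises SME or SEV, so a CPU
 * without the MSR never sees the read that would fault.
 */
static bool mem_encrypt_enabled(uint32_t cpuid_eax, sketch_read_msr_fn read_msr)
{
	if (!(cpuid_eax & (SKETCH_SME_BIT | SKETCH_SEV_BIT)))
		return false;	/* feature absent: skip the MSR entirely */

	return read_msr() != 0;
}
```

Calling `mem_encrypt_enabled(0, sketch_read_msr)` returns false without ever invoking the callback, which mirrors the case of a CPU that does not implement the MSR.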
    • x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() · 942c5864
      Thomas Gleixner authored
      commit 9bfecd05 upstream.
      
      While digesting the XSAVE-related horrors which got introduced with
      the supervisor/user split, the recent addition of ENQCMD-related
      functionality got on the radar and turned out to be similarly broken.
      
      update_pasid(), which is only required when X86_FEATURE_ENQCMD is
      available, is invoked from two places:
      
       1) From switch_to() for the incoming task
      
       2) Via a SMP function call from the IOMMU/SVM code
      
      #1 is half-ways correct as it hacks around the brokenness of get_xsave_addr()
         by enforcing the state to be 'present', but all the conditionals in that
         code are completely pointless for that.
      
         Also the invocation is just useless overhead because at that point
         it's guaranteed that TIF_NEED_FPU_LOAD is set on the incoming task
         and all of this can be handled at return to user space.
      
      #2 is broken beyond repair. The comment in the code claims that it is safe
         to invoke this in an IPI, but that's just wishful thinking.
      
         FPU state of a running task is protected by fpregs_lock() which is
         nothing else than a local_bh_disable(). As BH-disabled regions run
         usually with interrupts enabled the IPI can hit a code section which
         modifies FPU state and there is absolutely no guarantee that any of the
         assumptions which are made for the IPI case is true.
      
         Also the IPI is sent to all CPUs in mm_cpumask(mm), but the IPI is
         invoked with a NULL pointer argument, so it can hit a completely
         unrelated task and unconditionally force an update for nothing.
         Worse, it can hit a kernel thread which operates on a user space
         address space and set a random PASID for it.
      
      The offending commit does not cleanly revert, but it's sufficient to
      force disable X86_FEATURE_ENQCMD and to remove the broken update_pasid()
      code to make this dysfunctional all over the place. Anything more
      complex would require more surgery and none of the related functions
      outside of the x86 core code are blatantly wrong, so removing those
      would be overkill.
      
      As nothing enables the PASID bit in the IA32_XSS MSR yet, which is
      required to make this actually work, this cannot result in a regression
      except for related out of tree train-wrecks, but they are broken already
      today.
      
      Fixes: 20f0afd1 ("x86/mmu: Allocate/free a PASID")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/87mtsd6gr9.ffs@nanos.tec.linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      942c5864
    • mm/page_alloc: fix counting of free pages after take off from buddy · 68dcd32b
      Ding Hui authored
      commit bac9c6fa upstream.
      
      Recently we found that a lot of MemFree is left in /proc/meminfo after
      soft offlining a lot of pages, which is not quite correct.
      
      Before Oscar's rework of soft offline for free pages [1], if we soft
      offlined free pages, these pages were left in buddy with the HWPoison
      flag, and NR_FREE_PAGES was not updated immediately.  So the difference
      between NR_FREE_PAGES and the real number of available free pages was
      also large at the beginning.
      
      However, with the workload running, whenever we subsequently caught a
      HWPoison page in any alloc function, we would remove it from buddy,
      update NR_FREE_PAGES and try again, so NR_FREE_PAGES would get closer
      and closer to the real number of available free pages (regardless of
      unpoison_memory()).
      
      Now, for offline free pages, after a successful call to
      take_page_off_buddy(), the page no longer belongs to the buddy allocator
      and will not be used any more, but we missed accounting NR_FREE_PAGES in
      this situation, and there is no chance for it to be updated later.
      
      Do the update in take_page_off_buddy() like rmqueue() does, but avoid
      double counting if someone has already called set_migratetype_isolate()
      on the page.
      
      [1]: commit 06be6ff3 ("mm,hwpoison: rework soft offline for free pages")
      
      Link: https://lkml.kernel.org/r/20210526075247.11130-1-dinghui@sangfor.com.cn
      Fixes: 06be6ff3 ("mm,hwpoison: rework soft offline for free pages")
      Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
      Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      68dcd32b
    • mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests() · 5f2e1e81
      Gerald Schaefer authored
      commit 04f7ce3f upstream.
      
      In pmd/pud_advanced_tests(), the vaddr is aligned up to the next pmd/pud
      entry, and so it does not match the given pmdp/pudp and (aligned down)
      pfn any more.
      
      For s390, this results in memory corruption, because the IDTE
      instruction used e.g.  in xxx_get_and_clear() will take the vaddr for
      some calculations, in combination with the given pmdp.  It will then end
      up with a wrong table origin, ending on ...ff8, and some of those
      wrongly set low-order bits will also select a wrong pagetable level for
      the index addition.  IDTE could therefore invalidate (or 0x20) something
      outside of the page tables, depending on the wrongly picked index, which
      in turn depends on the random vaddr.
      
      As a result, we sometimes see "BUG task_struct (Not tainted): Padding
      overwritten" on s390, where one 0x5a padding value got overwritten with
      0x7a.
      
      Fix this by aligning down, similar to how the pmd/pud_aligned pfns are
      calculated.
      
      Link: https://lkml.kernel.org/r/20210525130043.186290-2-gerald.schaefer@linux.ibm.com
      Fixes: a5c3b9ff ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
      Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: <stable@vger.kernel.org>	[5.9+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5f2e1e81
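The difference between rounding the test address up to the next entry and aligning it down, as discussed in the commit message above, can be shown with plain integer arithmetic. This is a sketch under assumptions: `SKETCH_PMD_SIZE` is an assumed 2 MiB entry size, and `next_entry()` and `align_down()` are hypothetical helpers, not the macros used by the test module.

```c
#include <stdint.h>

#define SKETCH_PMD_SIZE (1ull << 21)			/* assumed 2 MiB per entry */
#define SKETCH_PMD_MASK (~(SKETCH_PMD_SIZE - 1))

/* Old behaviour: jump to the start of the NEXT entry, which no longer
 * corresponds to the entry that covers vaddr. */
static uint64_t next_entry(uint64_t vaddr)
{
	return (vaddr & SKETCH_PMD_MASK) + SKETCH_PMD_SIZE;
}

/* Fixed behaviour: start of the entry that actually covers vaddr, in
 * line with how the aligned-down pfn is derived. */
static uint64_t align_down(uint64_t vaddr)
{
	return vaddr & SKETCH_PMD_MASK;
}
```

For a sample address such as 0x40212345, `align_down()` stays inside the entry that contains it, while `next_entry()` lands one full entry further on, so an address computed that way no longer matches the entry pointer it is paired with.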
    • ocfs2: fix data corruption by fallocate · c8d5faee
      Junxiao Bi authored
      commit 6bba4471 upstream.
      
      When fallocate punches holes beyond the inode size, if the original isize
      is in the middle of the last cluster, then the part from isize to the end
      of the cluster will be zeroed with a buffer write. At that time isize is
      not yet updated to match the new size, so if writeback kicks in, it will
      invoke ocfs2_writepage()->block_write_full_page(), where the pages beyond
      the inode size will be dropped.  That will cause file corruption.  Fix
      this by zeroing out eof blocks when extending the inode size.
      
      Running the following command with qemu-img 4.2.1 can easily produce a
      corrupted converted image file.
      
          qemu-img convert -p -t none -T none -f qcow2 $qcow_image \
                   -O qcow2 -o compat=1.1 $qcow_image.conv
      
      qemu uses fallocate like this: it first punches holes beyond the inode
      size, then extends the inode size.
      
          fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2276196352, 65536) = 0
          fallocate(11, 0, 2276196352, 65536) = 0
      
      v1: https://www.spinics.net/lists/linux-fsdevel/msg193999.html
      v2: https://lore.kernel.org/linux-fsdevel/20210525093034.GB4112@quack2.suse.cz/T/
      
      Link: https://lkml.kernel.org/r/20210528210648.9124-1-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8d5faee
    • pid: take a reference when initializing `cad_pid` · 7178be00
      Mark Rutland authored
      commit 0711f0d7 upstream.
      
      During boot, kernel_init_freeable() initializes `cad_pid` to the init
      task's struct pid.  Later on, we may change `cad_pid` via a sysctl, and
      when this happens proc_do_cad_pid() will increment the refcount on the
      new pid via get_pid(), and will decrement the refcount on the old pid
      via put_pid().  As we never called get_pid() when we initialized
      `cad_pid`, we decrement a reference we never incremented, and can
      therefore free the init task's struct pid early.  As there can be dangling
      references to the struct pid, we can later encounter a use-after-free
      (e.g.  when delivering signals).
      
      This was spotted when fuzzing v5.13-rc3 with Syzkaller, but seems to
      have been around since the conversion of `cad_pid` to struct pid in
      commit 9ec52099 ("[PATCH] replace cad_pid by a struct pid") from the
      pre-KASAN stone age of v2.6.19.
      
      Fix this by getting a reference to the init task's struct pid when we
      assign it to `cad_pid`.
      
      Full KASAN splat below.
      
         ==================================================================
         BUG: KASAN: use-after-free in ns_of_pid include/linux/pid.h:153 [inline]
         BUG: KASAN: use-after-free in task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509
         Read of size 4 at addr ffff23794dda0004 by task syz-executor.0/273
      
         CPU: 1 PID: 273 Comm: syz-executor.0 Not tainted 5.12.0-00001-g9aef892b2d15 #1
         Hardware name: linux,dummy-virt (DT)
         Call trace:
          ns_of_pid include/linux/pid.h:153 [inline]
          task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509
          do_notify_parent+0x308/0xe60 kernel/signal.c:1950
          exit_notify kernel/exit.c:682 [inline]
          do_exit+0x2334/0x2bd0 kernel/exit.c:845
          do_group_exit+0x108/0x2c8 kernel/exit.c:922
          get_signal+0x4e4/0x2a88 kernel/signal.c:2781
          do_signal arch/arm64/kernel/signal.c:882 [inline]
          do_notify_resume+0x300/0x970 arch/arm64/kernel/signal.c:936
          work_pending+0xc/0x2dc
      
         Allocated by task 0:
          slab_post_alloc_hook+0x50/0x5c0 mm/slab.h:516
          slab_alloc_node mm/slub.c:2907 [inline]
          slab_alloc mm/slub.c:2915 [inline]
          kmem_cache_alloc+0x1f4/0x4c0 mm/slub.c:2920
          alloc_pid+0xdc/0xc00 kernel/pid.c:180
          copy_process+0x2794/0x5e18 kernel/fork.c:2129
          kernel_clone+0x194/0x13c8 kernel/fork.c:2500
          kernel_thread+0xd4/0x110 kernel/fork.c:2552
          rest_init+0x44/0x4a0 init/main.c:687
          arch_call_rest_init+0x1c/0x28
          start_kernel+0x520/0x554 init/main.c:1064
          0x0
      
         Freed by task 270:
          slab_free_hook mm/slub.c:1562 [inline]
          slab_free_freelist_hook+0x98/0x260 mm/slub.c:1600
          slab_free mm/slub.c:3161 [inline]
          kmem_cache_free+0x224/0x8e0 mm/slub.c:3177
          put_pid.part.4+0xe0/0x1a8 kernel/pid.c:114
          put_pid+0x30/0x48 kernel/pid.c:109
          proc_do_cad_pid+0x190/0x1b0 kernel/sysctl.c:1401
          proc_sys_call_handler+0x338/0x4b0 fs/proc/proc_sysctl.c:591
          proc_sys_write+0x34/0x48 fs/proc/proc_sysctl.c:617
          call_write_iter include/linux/fs.h:1977 [inline]
          new_sync_write+0x3ac/0x510 fs/read_write.c:518
          vfs_write fs/read_write.c:605 [inline]
          vfs_write+0x9c4/0x1018 fs/read_write.c:585
          ksys_write+0x124/0x240 fs/read_write.c:658
          __do_sys_write fs/read_write.c:670 [inline]
          __se_sys_write fs/read_write.c:667 [inline]
          __arm64_sys_write+0x78/0xb0 fs/read_write.c:667
          __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
          invoke_syscall arch/arm64/kernel/syscall.c:49 [inline]
          el0_svc_common.constprop.1+0x16c/0x388 arch/arm64/kernel/syscall.c:129
          do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:168
          el0_svc+0x28/0x38 arch/arm64/kernel/entry-common.c:416
          el0_sync_handler+0x134/0x180 arch/arm64/kernel/entry-common.c:432
          el0_sync+0x154/0x180 arch/arm64/kernel/entry.S:701
      
         The buggy address belongs to the object at ffff23794dda0000
          which belongs to the cache pid of size 224
         The buggy address is located 4 bytes inside of
          224-byte region [ffff23794dda0000, ffff23794dda00e0)
         The buggy address belongs to the page:
         page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4dda0
         head:(____ptrval____) order:1 compound_mapcount:0
         flags: 0x3fffc0000010200(slab|head)
         raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff23794d40d080
         raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
         page dumped because: kasan: bad access detected
      
         Memory state around the buggy address:
          ffff23794dd9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
          ffff23794dd9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
         >ffff23794dda0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
          ffff23794dda0080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
          ffff23794dda0100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
         ==================================================================
      
      Link: https://lkml.kernel.org/r/20210524172230.38715-1-mark.rutland@arm.com
      Fixes: 9ec52099 ("[PATCH] replace cad_pid by a struct pid")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Christian Brauner <christian@brauner.io>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7178be00
    • Phil Elwell's avatar
      usb: dwc2: Fix build in periphal-only mode · a4ed12f5
      Phil Elwell authored
      In branches to which 24d209db ("usb: dwc2: Fix hibernation between
      host and device modes.") has been back-ported, the bus_suspended member
      of struct dwc2_hsotg is only present in builds that support host-mode.
      To avoid having to pull in several more non-Fix commits in order to
      get it to compile, wrap the usage of the member in a macro conditional.
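The conditional wrapping described above can be sketched in plain C as follows. This is only an illustration: `DEMO_HAS_HOST_MODE`, `struct demo_hsotg` and `demo_bus_suspended()` are hypothetical names standing in for the real configuration option and driver structure, which are not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a build-time "host mode supported" option. */
#define DEMO_HAS_HOST_MODE 1

/* Hypothetical, heavily reduced stand-in for the driver state structure. */
struct demo_hsotg {
    int lx_state;
#if DEMO_HAS_HOST_MODE
    bool bus_suspended;   /* member exists only in host-capable builds */
#endif
};

/*
 * Every use of the optional member sits behind the same conditional,
 * so the code still compiles when the member is configured out.
 */
static bool demo_bus_suspended(const struct demo_hsotg *hsotg)
{
#if DEMO_HAS_HOST_MODE
    return hsotg->bus_suspended;
#else
    (void)hsotg;
    return false;
#endif
}
```

Setting `DEMO_HAS_HOST_MODE` to 0 removes both the member and its only use, which is the property the commit relies on.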
      
      Fixes: 24d209db ("usb: dwc2: Fix hibernation between host and device modes.")
      Signed-off-by: Phil Elwell <phil@raspberrypi.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4ed12f5
    • Ritesh Harjani's avatar
      ext4: fix accessing uninit percpu counter variable with fast_commit · 3b713aaf
      Ritesh Harjani authored
      commit b45f189a upstream.
      
      When running generic/527 with the fast_commit configuration, the
      following issue is seen on Power.  With fast_commit, if inode eviction
      happens during ext4_fc_replay() (which can be called from
      ext4_fill_super()), it can access an uninitialized percpu counter
      variable.

      This patch adds a check before accessing the counters in the
      ext4_free_inode() path.
      
      [  321.165371] run fstests generic/527 at 2021-04-29 08:38:43
      [  323.027786] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: block_validity. Quota mode: none.
      [  323.618772] BUG: Unable to handle kernel data access on read at 0x1fbd80000
      [  323.619767] Faulting instruction address: 0xc000000000bae78c
      cpu 0x1: Vector: 300 (Data Access) at [c000000010706ef0]
          pc: c000000000bae78c: percpu_counter_add_batch+0x3c/0x100
          lr: c0000000006d0bb0: ext4_free_inode+0x780/0xb90
          pid   = 5593, comm = mount
      	ext4_free_inode+0x780/0xb90
      	ext4_evict_inode+0xa8c/0xc60
      	evict+0xfc/0x1e0
      	ext4_fc_replay+0xc50/0x20f0
      	do_one_pass+0xfe0/0x1350
      	jbd2_journal_recover+0x184/0x2e0
      	jbd2_journal_load+0x1c0/0x4a0
      	ext4_fill_super+0x2458/0x4200
      	mount_bdev+0x1dc/0x290
      	ext4_mount+0x28/0x40
      	legacy_get_tree+0x4c/0xa0
      	vfs_get_tree+0x4c/0x120
      	path_mount+0xcf8/0xd70
      	do_mount+0x80/0xd0
      	sys_mount+0x3fc/0x490
      	system_call_exception+0x384/0x3d0
      	system_call_common+0xec/0x278
      
      Cc: stable@kernel.org
      Fixes: 8016e29f ("ext4: fast commit recovery path")
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
      Link: https://lore.kernel.org/r/6cceb9a75c54bef8fa9696c1b08c8df5ff6169e2.1619692410.git.riteshh@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3b713aaf
    • Phillip Potter's avatar
      ext4: fix memory leak in ext4_mb_init_backend on error path. · 2050c6e5
      Phillip Potter authored
      commit a8867f4e upstream.
      
      Fix a memory leak discovered by syzbot when a file system is corrupted
      with an illegally large s_log_groups_per_flex.
      
      Reported-by: <syzbot+aa12d6106ea4ca1b6aae@syzkaller.appspotmail.com>
      Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
      Cc: stable@kernel.org
      Link: https://lore.kernel.org/r/20210412073837.1686-1-phil@philpotter.co.uk
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2050c6e5
    • Harshad Shirwadkar's avatar
      ext4: fix fast commit alignment issues · fb86acc6
      Harshad Shirwadkar authored
      commit a7ba36bc upstream.
      
      Fast commit recovery data on disk may not be aligned. So, when the
      recovery code reads it, this patch makes sure that fast commit info
      found on-disk is first memcpy-ed into an aligned variable before
      accessing it. As a consequence, we also remove some macros that
      could result in unaligned accesses.
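The memcpy-into-an-aligned-variable pattern described above can be sketched in userspace C as follows. The record layout `struct demo_tl` and the helper `demo_read_tl()` are hypothetical and do not reproduce the real ext4 fast-commit on-disk format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag/length header; not the real ext4 fast-commit layout. */
struct demo_tl {
    uint16_t tag;
    uint16_t len;
};

/*
 * Read a header from a byte buffer that may start at any offset.
 * Casting buf to (struct demo_tl *) and dereferencing it could be an
 * unaligned access; copying into a properly aligned local avoids that.
 */
static struct demo_tl demo_read_tl(const uint8_t *buf)
{
    struct demo_tl tl;

    memcpy(&tl, buf, sizeof(tl));
    return tl;
}
```

The caller then works only with the aligned copy, so no accessor macro ever needs to dereference the raw buffer directly.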
      
      Cc: stable@kernel.org
      Fixes: 8016e29f ("ext4: fast commit recovery path")
      Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
      Link: https://lore.kernel.org/r/20210519215920.2037527-1-harshads@google.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb86acc6
    • Ye Bin's avatar
      ext4: fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed · d3b668b9
      Ye Bin authored
      commit 082cd4ec upstream.
      
      We hit the following BUG_ON when running fsstress with IO fault injection:
      [130747.323114] kernel BUG at fs/ext4/extents_status.c:762!
      [130747.323117] Internal error: Oops - BUG: 0 [#1] SMP
      ......
      [130747.334329] Call trace:
      [130747.334553]  ext4_es_cache_extent+0x150/0x168 [ext4]
      [130747.334975]  ext4_cache_extents+0x64/0xe8 [ext4]
      [130747.335368]  ext4_find_extent+0x300/0x330 [ext4]
      [130747.335759]  ext4_ext_map_blocks+0x74/0x1178 [ext4]
      [130747.336179]  ext4_map_blocks+0x2f4/0x5f0 [ext4]
      [130747.336567]  ext4_mpage_readpages+0x4a8/0x7a8 [ext4]
      [130747.336995]  ext4_readpage+0x54/0x100 [ext4]
      [130747.337359]  generic_file_buffered_read+0x410/0xae8
      [130747.337767]  generic_file_read_iter+0x114/0x190
      [130747.338152]  ext4_file_read_iter+0x5c/0x140 [ext4]
      [130747.338556]  __vfs_read+0x11c/0x188
      [130747.338851]  vfs_read+0x94/0x150
      [130747.339110]  ksys_read+0x74/0xf0
      
      This patch follows Jan Kara's suggestion in:
      https://patchwork.ozlabs.org/project/linux-ext4/patch/20210428085158.3728201-1-yebin10@huawei.com/
      "I see. Now I understand your patch. Honestly, seeing how fragile is trying
      to fix extent tree after split has failed in the middle, I would probably
      go even further and make sure we fix the tree properly in case of ENOSPC
      and EDQUOT (those are easily user triggerable).  Anything else indicates a
      HW problem or fs corruption so I'd rather leave the extent tree as is and
      don't try to fix it (which also means we will not create overlapping
      extents)."
      
      Cc: stable@kernel.org
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20210506141042.3298679-1-yebin10@huawei.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3b668b9
    • Alexey Makhalov's avatar
      ext4: fix memory leak in ext4_fill_super · 01d349a4
      Alexey Makhalov authored
      commit afd09b61 upstream.
      
      Buffer head references must be released before calling kill_bdev();
      otherwise the buffer head (and its page referenced by b_data) will not
      be freed by kill_bdev, and subsequently that bh will be leaked.
      
      If blocksizes differ, sb_set_blocksize() will kill the current buffers
      and page cache by using kill_bdev(). The superblock is then reread
      using the correct blocksize. sb_set_blocksize() didn't fully free the
      superblock page and buffer head; being busy, they were not freed and
      instead leaked.
      
      This can easily be reproduced by calling an infinite loop of:
      
        systemctl start <ext4_on_lvm>.mount, and
        systemctl stop <ext4_on_lvm>.mount
      
      ... since systemd creates a cgroup for each slice which it mounts, and
      the bh leak gets amplified by a dying memory cgroup that also never
      gets freed, and memory consumption is much more easily noticed.
      
      Fixes: ce40733c ("ext4: Check for return value from sb_set_blocksize")
      Fixes: ac27a0ec ("ext4: initial copy of files from ext3")
      Link: https://lore.kernel.org/r/20210521075533.95732-1-amakhalov@vmware.com
      Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      01d349a4
    • Marek Vasut's avatar
      ARM: dts: imx6q-dhcom: Add PU,VDD1P1,VDD2P5 regulators · b2057d13
      Marek Vasut authored
      commit 8967b27a upstream.
      
      Per schematic, both PU and SOC regulators are supplied from LTC3676 SW1
      via the VDDSOC_IN rail, so add the PU input. Both VDD1P1 and VDD2P5 are
      supplied from LTC3676 SW2 via the VDDHIGH_IN rail, so add both inputs.

      While no instability or problems are currently observed, the regulators
      should be fully described in DT and that description should fully match
      the hardware, else this might lead to unforeseen issues later. Fix this.
      
      Fixes: 52c7a088 ("ARM: dts: imx6q: Add support for the DHCOM iMX6 SoM and PDK2")
      Reviewed-by: Fabio Estevam <festevam@gmail.com>
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Christoph Niedermaier <cniedermaier@dh-electronics.com>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: Ludwig Zenz <lzenz@dh-electronics.com>
      Cc: NXP Linux Team <linux-imx@nxp.com>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
      Signed-off-by: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2057d13
    • Michal Vokáč's avatar
      ARM: dts: imx6dl-yapp4: Fix RGMII connection to QCA8334 switch · 623603e2
      Michal Vokáč authored
      commit 0e4a4a08 upstream.
      
      The FEC does not have a PHY so it should not have a phy-handle. It is
      connected to the switch at RGMII level so we need a fixed-link sub-node
      on both ends.
      
      This was not a problem until the qca8k.c driver was converted to PHYLINK
      by commit b3591c2a ("net: dsa: qca8k: Switch to PHYLINK instead of
      PHYLIB"). That commit revealed the FEC configuration was not correct.
      
      Fixes: 87489ec3 ("ARM: dts: imx: Add Y Soft IOTA Draco, Hydra and Ursa boards")
      Cc: stable@vger.kernel.org
      Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      623603e2
    • Hui Wang's avatar
      ALSA: hda: update the power_state during the direct-complete · 846848c0
      Hui Wang authored
      commit b8b90c17 upstream.
      
      patch_realtek.c needs to check whether power_state.event equals
      PM_EVENT_SUSPEND. With direct-complete, suspend() and resume() are
      skipped if the codec is already runtime-suspended, so patch_realtek.c
      always sees PM_EVENT_ON even when the system has really resumed
      from S3.

      Set power_state to PMSG_SUSPEND in prepare(). If other PM functions
      are called before complete(), they override power_state. If no other
      PM functions are called before complete(), we know that suspend() and
      resume() were skipped, since only the S3 PM functions can be skipped
      by direct-complete; in that case, set power_state to PMSG_RESUME in
      complete(). This guarantees that the first call to
      hda_codec_runtime_resume() after complete() sees the correct
      power_state.
      
      Fixes: 215a22ed ("ALSA: hda: Refactor codec PM to use direct-complete optimization")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Hui Wang <hui.wang@canonical.com>
      Link: https://lore.kernel.org/r/20210602145424.3132-1-hui.wang@canonical.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      846848c0
    • Carlos M's avatar
      ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx · cfbb57fc
      Carlos M authored
      commit 901be145 upstream.
      
      For the HP Pavilion 15-CK0xx, with audio subsystem ID 0x103c:0x841c,
      adding a line in patch_realtek.c to apply the ALC269_FIXUP_HP_MUTE_LED_MIC3
      fix activates the mute key LED.
      
      Signed-off-by: Carlos M <carlos.marr.pz@gmail.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20210531202026.35427-1-carlos.marr.pz@gmail.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cfbb57fc
    • Takashi Iwai's avatar
      ALSA: timer: Fix master timer notification · 029c0610
      Takashi Iwai authored
      commit 9c1fe96b upstream.
      
      snd_timer_notify1() calls the notification to each slave for a master
      event, but it passes a wrong event number.  It should use a +10 offset,
      corresponding to SNDRV_TIMER_EVENT_MXXX, but it incorrectly uses a
      +100 offset.  This was spotted by chance by a UBSAN check via syzkaller.
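The offset relationship described above can be illustrated with a small sketch. The enum values and names below (`DEMO_EVENT_*`, `demo_master_event()`) are hypothetical; only the "+10 maps a base event to its master variant" relationship is taken from the commit message.

```c
#include <assert.h>

/* Hypothetical event numbering; only the +10 relationship matters here. */
enum demo_event {
    DEMO_EVENT_START  = 2,
    DEMO_EVENT_STOP   = 3,
    DEMO_EVENT_MSTART = DEMO_EVENT_START + 10,
    DEMO_EVENT_MSTOP  = DEMO_EVENT_STOP + 10,
};

#define DEMO_MASTER_OFFSET 10

/* Map a base event to the master event reported to slave instances. */
static int demo_master_event(int event)
{
    return event + DEMO_MASTER_OFFSET;
}
```

With an offset of 100 instead of 10, the result would fall outside the defined event range, which is the kind of out-of-range value a sanitizer can flag.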
      
      Reported-by: <syzbot+d102fa5b35335a7e544e@syzkaller.appspotmail.com>
      Reviewed-by: Jaroslav Kysela <perex@perex.cz>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/000000000000e5560e05c3bd1d63@google.com
      Link: https://lore.kernel.org/r/20210602113823.23777-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      029c0610
    • Bob Peterson's avatar
      gfs2: fix scheduling while atomic bug in glocks · d11e5b96
      Bob Peterson authored
      commit 20265d9a upstream.
      
      Before this patch, in the unlikely event that gfs2_glock_dq encountered
      a withdraw, it would do a wait_on_bit to wait for its journal to be
      recovered, but it never released the glock's spin_lock, which caused a
      scheduling-while-atomic error.
      
      This patch unlocks the lockref spin_lock before waiting for recovery.
      
      Fixes: 601ef0d5 ("gfs2: Force withdraw to replay journals and wait for it to finish")
      Cc: stable@vger.kernel.org # v5.7+
      Reported-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d11e5b96
    • Ahelenia Ziemiańska's avatar
      HID: multitouch: require Finger field to mark Win8 reports as MT · 127f25be
      Ahelenia Ziemiańska authored
      commit a2353e3b upstream.
      
      This effectively changes collection_is_mt from
        contact ID in report->field
      to
        (device is Win8 => collection is finger) && contact ID in report->field
      
      Some devices erroneously report Pen for fingers, and Win8 stylus-on-touchscreen
      devices report contact ID, but mark the accompanying touchscreen device's
      collection correctly.
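The changed predicate can be written out as a boolean function. This is a sketch of the logic only: `demo_collection_is_mt()` and its three flags are hypothetical and do not mirror the real hid-multitouch data structures.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * New rule: (device is Win8 => collection is finger) && contact ID present.
 * The implication (A => B) is expressed as (!A || B).
 */
static bool demo_collection_is_mt(bool is_win8, bool is_finger,
                                  bool has_contact_id)
{
    return (!is_win8 || is_finger) && has_contact_id;
}
```

For non-Win8 devices the result is unchanged from the old rule (contact ID alone decides); for Win8 devices a non-finger collection with a contact ID is no longer treated as multitouch.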
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
      Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      127f25be
    • Johan Hovold's avatar
      HID: magicmouse: fix NULL-deref on disconnect · b5d013c4
      Johan Hovold authored
      commit 4b4f6cec upstream.
      
      Commit 9d7b1866 ("HID: magicmouse: add support for Apple Magic
      Trackpad 2") added a sanity check for an Apple trackpad but returned
      success instead of -ENODEV when the check failed. This means that the
      remove callback will dereference the never-initialised driver data
      pointer when the driver is later unbound (e.g. on USB disconnect).
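Why the probe return value matters can be shown with a toy model. Everything below is hypothetical (`struct demo_dev`, `demo_bind()`, `demo_unbind()`); it only assumes, as the message implies, that a remove callback runs for any device whose probe reported success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device model; not the real HID or driver-core structures. */
struct demo_dev {
    bool is_supported;  /* outcome of the sanity check */
    bool bound;         /* set by the demo "core" when probe succeeds */
    int *drvdata;       /* driver data, allocated only by a real probe */
};

static int demo_probe(struct demo_dev *dev)
{
    if (!dev->is_supported)
        return -ENODEV;  /* returning 0 here would bind with drvdata == NULL */

    dev->drvdata = calloc(1, sizeof(*dev->drvdata));
    return dev->drvdata ? 0 : -ENOMEM;
}

/* Demo "core": the device counts as bound only if probe returned 0. */
static int demo_bind(struct demo_dev *dev)
{
    int ret = demo_probe(dev);

    dev->bound = (ret == 0);
    return ret;
}

/* Demo "core": remove runs only for bound devices, so drvdata is valid. */
static void demo_unbind(struct demo_dev *dev)
{
    if (!dev->bound)
        return;
    assert(dev->drvdata != NULL);
    free(dev->drvdata);
    dev->drvdata = NULL;
    dev->bound = false;
}
```

If `demo_probe()` returned 0 on the failed check, `demo_unbind()` would reach the `drvdata` access with a NULL pointer, which corresponds to the NULL dereference described above.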
      
      Reported-by: <syzbot+ee6f6e2e68886ca256a8@syzkaller.appspotmail.com>
      Fixes: 9d7b1866 ("HID: magicmouse: add support for Apple Magic Trackpad 2")
      Cc: stable@vger.kernel.org      # 4.20
      Cc: Claudio Mettler <claudio@ponyfleisch.ch>
      Cc: Marek Wyborski <marek.wyborski@emwesoft.com>
      Cc: Sean O'Brien <seobrien@chromium.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5d013c4
    • Johnny Chuang's avatar
      HID: i2c-hid: Skip ELAN power-on command after reset · a5e554f7
      Johnny Chuang authored
      commit ca66a677 upstream.
      
      For ELAN touchscreens, we found that the IC's boot code was not
      flexible enough to receive and handle this command.
      Once the main firmware code of our controller has crashed for some
      reason, the controller cannot be enumerated successfully and recognized
      by the host system. Therefore, it loses touch functionality.

      Add a quirk to skip sending the power-on command after reset.
      It affects ELAN touchscreens and touchpads on HID over I2C projects.
      
      Fixes: 43b7029f ("HID: i2c-hid: Send power-on command after reset")
      Cc: stable@vger.kernel.org
      Signed-off-by: Johnny Chuang <johnny.chuang.emc@gmail.com>
      Reviewed-by: Harry Cutts <hcutts@chromium.org>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Tested-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a5e554f7
    • Pavel Skripkin's avatar
      net: caif: fix memory leak in cfusbl_device_notify · 46403c1f
      Pavel Skripkin authored
      commit 7f5d8666 upstream.
      
      If caif_enroll_dev() fails, the allocated link_support is never
      assigned to the corresponding structure. So simply free the
      allocated pointer in case of error.
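The free-on-failure pattern described above looks roughly like this in userspace C. The names `demo_enroll()`, `demo_notify()` and `demo_owner` are hypothetical; the real CAIF functions and their signatures are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical link object and its would-be owner. */
struct demo_link {
    int id;
};

static struct demo_link *demo_owner;  /* set only by a successful enroll */

/* Hypothetical enroll step: takes ownership of the link only on success. */
static int demo_enroll(struct demo_link *link, bool fail)
{
    if (fail)
        return -EINVAL;
    demo_owner = link;
    return 0;
}

static int demo_notify(bool fail)
{
    struct demo_link *link = malloc(sizeof(*link));
    int res;

    if (!link)
        return -ENOMEM;

    res = demo_enroll(link, fail);
    if (res)
        free(link);  /* nobody else holds the pointer, so free it here */
    return res;
}
```

Without the `free()` on the error path, the only reference to the allocation would be lost when `demo_notify()` returns.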
      
      Fixes: 7ad65bf6 ("caif: Add support for CAIF over CDC NCM USB interface")
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      46403c1f
    • Pavel Skripkin's avatar
      net: caif: fix memory leak in caif_device_notify · af280634
      Pavel Skripkin authored
      commit b53558a9 upstream.
      
      If caif_enroll_dev() fails, the allocated link_support is never
      assigned to the corresponding structure. So simply free the
      allocated pointer in case of error.
      
      Fixes: 7c18d220 ("caif: Restructure how link caif link layer enroll")
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: <syzbot+7ec324747ce876a29db6@syzkaller.appspotmail.com>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      af280634
    • Pavel Skripkin's avatar
      net: caif: add proper error handling · d6db7274
      Pavel Skripkin authored
      commit a2805dca upstream.
      
      caif_enroll_dev() can fail in some cases. Ignoring
      these cases can lead to a memory leak, because the
      link_support pointer is never assigned anywhere.
      
      Fixes: 7c18d220 ("caif: Restructure how link caif link layer enroll")
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6db7274
    • Pavel Skripkin's avatar
      net: caif: added cfserl_release function · dac53568
      Pavel Skripkin authored
      commit bce130e7 upstream.
      
      Added cfserl_release() function.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dac53568
    • Jason A. Donenfeld's avatar
      wireguard: allowedips: free empty intermediate nodes when removing single node · df3b45f6
      Jason A. Donenfeld authored
      commit bf7b042d upstream.
      
      When removing single nodes, it's possible that the node's parent is an
      empty intermediate node, in which case it too should be removed.
      Otherwise the trie fills up and is never fully emptied, leading to
      gradual memory leaks over time for tries that are modified often. There
      was originally code to do this, but it was removed during refactoring in
      2016 and never reworked. Now that we have proper parent pointers from
      the previous commits, we can implement this properly.
      
      In order to reduce branching and expensive comparisons, we want to keep
      the double pointer for parent assignment (which lets us easily chain up
      to the root), but we still need to actually get the parent's base
      address. So encode the bit number into the last two bits of the pointer,
      and pack and unpack it as needed. This is a little bit clumsy but is the
      fastest and least memory-wasteful of the compromises. Note that we align
      the root struct here to a minimum of 4, because it's embedded into a
      larger struct and would otherwise only be 16-bit aligned on m68k, while
      we're relying on having the bottom two bits free for our flag.
      
      The existing macro-based helpers were a bit unwieldy for adding the bit
      packing to, so this commit replaces them with safer and clearer ordinary
      functions.
      
      We add a test to the randomized/fuzzer part of the selftests, to free
      the randomized tries by-peer, refuzz it, and repeat, until it's supposed
      to be empty, and then see if that actually resulted in the whole
      thing being emptied. That combined with kmemcheck should hopefully make
      sure this commit is doing what it should. Along the way this resulted in
      various other cleanups of the tests and fixes for recent graphviz.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      df3b45f6
    • Jason A. Donenfeld's avatar
      wireguard: allowedips: allocate nodes in kmem_cache · c5155c74
      Jason A. Donenfeld authored
      commit dc680de2 upstream.
      
      The previous commit moved from O(n) to O(1) for removal, but in the
      process introduced an additional pointer member to a struct that
      increased the size from 60 to 68 bytes, putting nodes in the 128-byte
      slab. With deployed systems having as many as 2 million nodes, this
      represents a significant doubling in memory usage (128 MiB -> 256 MiB).
      Fix this by using our own kmem_cache, that's sized exactly right. This
      also makes wireguard's memory usage more transparent in tools like
      slabtop and /proc/slabinfo.
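The arithmetic behind the doubling can be checked with a short sketch. It assumes pure power-of-two size classes, as the commit message describes; a real allocator may offer additional size classes, and the helper names `demo_round_up_pow2()` and `demo_total_bytes()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two (assumed size-class model). */
static size_t demo_round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Total bytes used by `count` objects of `obj_size` bytes each. */
static size_t demo_total_bytes(size_t count, size_t obj_size)
{
    return count * demo_round_up_pow2(obj_size);
}
```

Under this model a 60-byte node occupies 64 bytes and a 68-byte node occupies 128 bytes, so 2 million nodes go from 128,000,000 to 256,000,000 bytes, in line with the approximate figures quoted above. A dedicated cache sized for the object avoids the rounding step entirely.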
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Suggested-by: Matthew Wilcox <willy@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5155c74
    • Jason A. Donenfeld's avatar
      wireguard: allowedips: remove nodes in O(1) · 70a9a71a
      Jason A. Donenfeld authored
      commit f634f418 upstream.
      
      Previously, deleting peers would require traversing the entire trie in
      order to rebalance nodes and safely free them. This meant that removing
      1000 peers from a trie with a half million nodes would take an extremely
      long time, during which we're holding the rtnl lock. Large-scale users
      were reporting 200ms latencies added to the networking stack as a whole
      every time their userspace software would queue up significant removals.
      That's a serious situation.
      
      This commit fixes that by maintaining a double pointer to the parent's
      bit pointer for each node, and then using the already existing node list
      belonging to each peer to go directly to the node, fix up its pointers,
      and free it with RCU. This means removal is O(1) instead of O(n), and we
      don't use gobs of stack.
      
      The removal algorithm has the same downside as the code that it fixes:
      it won't collapse needlessly long runs of fillers.  We can enhance that
      in the future if it ever becomes a problem. This commit documents that
      limitation with a TODO comment in code, a small but meaningful
      improvement over the prior situation.
      
      Currently the biggest flaw, which the next commit addresses, is that
      this increases the node size on 64-bit machines from 60 bytes to
      68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up
      using twice as much memory per node, because of power-of-two
      allocations, which is a big bummer. We'll need to figure something out
      there.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      70a9a71a
    • Jason A. Donenfeld's avatar
      wireguard: allowedips: initialize list head in selftest · 42a66771
      Jason A. Donenfeld authored
      commit 46cfe8ee upstream.
      
      The randomized trie tests weren't initializing the dummy peer list head,
      resulting in a NULL pointer dereference when used. Fix this by
      initializing it in the randomized trie test, just like we do for the
      static unit test.
      
      While we're at it, all of the other strings like this have the word
      "self-test", so add it to the missing place here.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      42a66771