Skip to content
  1. Sep 18, 2020
  2. Sep 11, 2020
  3. Sep 04, 2020
    • Nicolin Chen's avatar
      dma-mapping: set default segment_boundary_mask to ULONG_MAX · 135ba11a
      Nicolin Chen authored
      
      
      The default segment_boundary_mask was set to DMA_BIT_MAKS(32)
      a decade ago by referencing SCSI/block subsystem, as a 32-bit
      mask was good enough for most of the devices.
      
      Now more and more drivers set dma_masks above DMA_BIT_MAKS(32)
      while only a handful of them call dma_set_seg_boundary(). This
      means that most drivers have a 4GB segmention boundary because
      DMA API returns a 32-bit default value, though they might not
      really have such a limit.
      
      The default segment_boundary_mask should mean "no limit" since
      the device doesn't explicitly set the mask. But a 32-bit mask
      certainly limits those devices capable of 32+ bits addressing.
      
      So this patch sets default segment_boundary_mask to ULONG_MAX.
      
      Signed-off-by: default avatarNicolin Chen <nicoleotsuka@gmail.com>
      Acked-by: default avatarNiklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      135ba11a
    • Nicolin Chen's avatar
      dma-mapping: introduce dma_get_seg_boundary_nr_pages() · 1e9d90db
      Nicolin Chen authored
      
      
      We found that callers of dma_get_seg_boundary mostly do an ALIGN
      with page mask and then do a page shift to get number of pages:
          ALIGN(boundary + 1, 1 << shift) >> shift
      
      However, the boundary might be as large as ULONG_MAX, which means
      that a device has no specific boundary limit. So either "+ 1" or
      passing it to ALIGN() would potentially overflow.
      
      According to kernel defines:
          #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
          #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)
      
      We can simplify the logic here into a helper function doing:
        ALIGN(boundary + 1, 1 << shift) >> shift
      = ALIGN_MASK(b + 1, (1 << s) - 1) >> s
      = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
      = [b + 1 + (1 << s) - 1] >> s
      = [b + (1 << s)] >> s
      = (b >> s) + 1
      
      This patch introduces and applies dma_get_seg_boundary_nr_pages()
      as an overflow-free helper for the dma_get_seg_boundary() callers
      to get numbers of pages. It also takes care of the NULL dev case
      for non-DMA API callers.
      
      Suggested-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarNicolin Chen <nicoleotsuka@gmail.com>
      Acked-by: default avatarNiklas Schnelle <schnelle@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      1e9d90db
  4. Sep 01, 2020
    • Barry Song's avatar
      mm: cma: use CMA_MAX_NAME to define the length of cma name array · 2281f797
      Barry Song authored
      
      
      CMA_MAX_NAME should be visible to CMA's users as they might need it to set
      the name of CMA areas and avoid hardcoding the size locally.
      So this patch moves CMA_MAX_NAME from local header file to include/linux
      header file and removes the hardcode in both hugetlb.c and contiguous.c.
      
      Signed-off-by: default avatarBarry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      2281f797
    • Barry Song's avatar
      arm64: mm: reserve per-numa CMA to localize coherent dma buffers · c6303ab9
      Barry Song authored
      
      
      Right now, smmu is using dma_alloc_coherent() to get memory to save queues
      and tables. Typically, on ARM64 server, there is a default CMA located at
      node0, which could be far away from node2, node3 etc.
      with this patch, smmu will get memory from local numa node to save command
      queues and page tables. that means dma_unmap latency will be shrunk much.
      Meanwhile, when iommu.passthrough is on, device drivers which call dma_
      alloc_coherent() will also get local memory and avoid the travel between
      numa nodes.
      
      Acked-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarBarry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      c6303ab9
    • Barry Song's avatar
      dma-contiguous: provide the ability to reserve per-numa CMA · b7176c26
      Barry Song authored
      
      
      Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get
      coherent DMA buffers to save their command queues and page tables. As
      there is only one default CMA in the whole system, SMMUs on nodes other
      than node0 will get remote memory. This leads to significant latency.
      
      This patch provides per-numa CMA so that drivers like SMMU can get local
      memory. Tests show localizing CMA can decrease dma_unmap latency much.
      For instance, before this patch, SMMU on node2  has to wait for more than
      560ns for the completion of CMD_SYNC in an empty command queue; with this
      patch, it needs 240ns only.
      
      A positive side effect of this patch would be improving performance even
      further for those users who are worried about performance more than DMA
      security and use iommu.passthrough=1 to skip IOMMU. With local CMA, all
      drivers can get local coherent DMA buffers.
      
      Also, this patch changes the default CONFIG_CMA_AREAS to 19 in NUMA. As
      1+CONFIG_CMA_AREAS should be quite enough for most servers on the market
      even they enable both hugetlb_cma and pernuma_cma.
      2 numa nodes: 2(hugetlb) + 2(pernuma) + 1(default global cma) = 5
      4 numa nodes: 4(hugetlb) + 4(pernuma) + 1(default global cma) = 9
      8 numa nodes: 8(hugetlb) + 8(pernuma) + 1(default global cma) = 17
      
      Signed-off-by: default avatarBarry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      b7176c26
  5. Aug 31, 2020
    • Linus Torvalds's avatar
      Linux 5.9-rc3 · f75aef39
      Linus Torvalds authored
      v5.9-rc3
      f75aef39
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · e43327c7
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - fix regression in af_alg that affects iwd
      
       - restore polling delay in qat
      
       - fix double free in ingenic on error path
      
       - fix potential build failure in sa2ul due to missing Kconfig dependency
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: af_alg - Work around empty control messages without MSG_MORE
        crypto: sa2ul - add Kconfig selects to fix build error
        crypto: ingenic - Drop kfree for memory allocated with devm_kzalloc
        crypto: qat - add delay before polling mailbox
      e43327c7
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dcc5c6f0
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Three interrupt related fixes for X86:
      
         - Move disabling of the local APIC after invoking fixup_irqs() to
           ensure that interrupts which are incoming are noted in the IRR and
           not ignored.
      
         - Unbreak affinity setting.
      
           The rework of the entry code reused the regular exception entry
           code for device interrupts. The vector number is pushed into the
           errorcode slot on the stack which is then lifted into an argument
           and set to -1 because that's regs->orig_ax which is used in quite
           some places to check whether the entry came from a syscall.
      
           But it was overlooked that orig_ax is used in the affinity cleanup
           code to validate whether the interrupt has arrived on the new
           target. It turned out that this vector check is pointless because
           interrupts are never moved from one vector to another on the same
           CPU. That check is a historical leftover from the time where x86
           supported multi-CPU affinities, but not longer needed with the now
           strict single CPU affinity. Famous last words ...
      
         - Add a missing check for an empty cpumask into the matrix allocator.
      
           The affinity change added a warning to catch the case where an
           interrupt is moved on the same CPU to a different vector. This
           triggers because a condition with an empty cpumask returns an
           assignment from the allocator as the allocator uses for_each_cpu()
           without checking the cpumask for being empty. The historical
           inconsistent for_each_cpu() behaviour of ignoring the cpumask and
           unconditionally claiming that CPU0 is in the mask struck again.
           Sigh.
      
        plus a new entry into the MAINTAINER file for the HPE/UV platform"
      
      * tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/matrix: Deal with the sillyness of for_each_cpu() on UP
        x86/irq: Unbreak interrupt affinity setting
        x86/hotplug: Silence APIC only after all interrupts are migrated
        MAINTAINERS: Add entry for HPE Superdome Flex (UV) maintainers
      dcc5c6f0
    • Linus Torvalds's avatar
      Merge tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d2283cdc
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A set of fixes for interrupt chip drivers:
      
         - Revert the platform driver conversion of interrupt chip drivers as
           it turned out to create more problems than it solves.
      
         - Fix a trivial typo in the new module helpers which made probing
           reliably fail.
      
         - Small fixes in the STM32 and MIPS Ingenic drivers
      
         - The TI firmware rework which had badly managed dependencies and had
           to wait post rc1"
      
      * tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/ingenic: Leave parent IRQ unmasked on suspend
        irqchip/stm32-exti: Avoid losing interrupts due to clearing pending bits by mistake
        irqchip: Revert modular support for drivers using IRQCHIP_PLATFORM_DRIVER helperse
        irqchip: Fix probing deferal when using IRQCHIP_PLATFORM_DRIVER helpers
        arm64: dts: k3-am65: Update the RM resource types
        arm64: dts: k3-am65: ti-sci-inta/intr: Update to latest bindings
        arm64: dts: k3-j721e: ti-sci-inta/intr: Update to latest bindings
        irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC
        irqchip/ti-sci-inta: Do not store TISCI device id in platform device id field
        dt-bindings: irqchip: Convert ti, sci-inta bindings to yaml
        dt-bindings: irqchip: ti, sci-inta: Update docs to support different parent.
        irqchip/ti-sci-intr: Add support for INTR being a parent to INTR
        dt-bindings: irqchip: Convert ti, sci-intr bindings to yaml
        dt-bindings: irqchip: ti, sci-intr: Update bindings to drop the usage of gic as parent
        firmware: ti_sci: Add support for getting resource with subtype
        firmware: ti_sci: Drop unused structure ti_sci_rm_type_map
        firmware: ti_sci: Drop the device id to resource type translation
      d2283cdc
    • Linus Torvalds's avatar
      Merge tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0063a82d
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "A single fix for the scheduler:
      
         - Make is_idle_task() __always_inline to prevent the compiler from
           putting it out of line into the wrong section because it's used
           inside noinstr sections"
      
      * tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Use __always_inline on is_idle_task()
      0063a82d
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b69bea8a
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "A set of fixes for lockdep, tracing and RCU:
      
         - Prevent recursion by using raw_cpu_* operations
      
         - Fixup the interrupt state in the cpu idle code to be consistent
      
         - Push rcu_idle_enter/exit() invocations deeper into the idle path so
           that the lock operations are inside the RCU watching sections
      
         - Move trace_cpu_idle() into generic code so it's called before RCU
           goes idle.
      
         - Handle raw_local_irq* vs. local_irq* operations correctly
      
         - Move the tracepoints out from under the lockdep recursion handling
           which turned out to be fragile and inconsistent"
      
      * tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        lockdep,trace: Expose tracepoints
        lockdep: Only trace IRQ edges
        mips: Implement arch_irqs_disabled()
        arm64: Implement arch_irqs_disabled()
        nds32: Implement arch_irqs_disabled()
        locking/lockdep: Cleanup
        x86/entry: Remove unused THUNKs
        cpuidle: Move trace_cpu_idle() into generic code
        cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
        sched,idle,rcu: Push rcu_idle deeper into the idle path
        cpuidle: Fixup IRQ state
        lockdep: Use raw_cpu_*() for per-cpu variables
      b69bea8a
    • Linus Torvalds's avatar
      Merge tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6 · 3edd8db2
      Linus Torvalds authored
      Pull cfis fix from Steve French:
       "DFS fix for referral problem when using SMB1"
      
      * tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix check of tcon dfs in smb1
      3edd8db2
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 8bb5021c
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Revert our removal of PROT_SAO, at least one user expressed an
         interest in using it on Power9. Instead don't allow it to be used in
         guests unless enabled explicitly at compile time.
      
       - A fix for a crash introduced by a recent change to FP handling.
      
       - Revert a change to our idle code that left Power10 with no idle
         support.
      
       - One minor fix for the new scv system call path to set PPR.
      
       - Fix a crash in our "generic" PMU if branch stack events were enabled.
      
       - A fix for the IMC PMU, to correctly identify host kernel samples.
      
       - The ADB_PMU powermac code was found to be incompatible with
         VMAP_STACK, so make them incompatible in Kconfig until the code can
         be fixed.
      
       - A build fix in drivers/video/fbdev/controlfb.c, and a documentation
         fix.
      
      Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy,
      Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin,
      Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan
      Srinivasan.
      
      * tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU
        Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"
        powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc
        powerpc/perf: Fix crashes with generic_compat_pmu & BHRB
        powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode
        powerpc/64s: scv entry should set PPR
        Documentation/powerpc: fix malformed table in syscall64-abi
        video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n
        selftests/powerpc: Update PROT_SAO test to skip ISA 3.1
        powerpc/64s: Disallow PROT_SAO in LPARs by default
        Revert "powerpc/64s: Remove PROT_SAO support"
      8bb5021c
    • Linus Torvalds's avatar
      Merge tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 6f0306d1
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Let's try this again...  Here are some USB fixes for 5.9-rc3.
      
        This differs from the previous pull request for this release in that
        the usb gadget patch now does not break some systems, and actually
        does what it was intended to do. Many thanks to Marek Szyprowski for
        quickly noticing and testing the patch from Andy Shevchenko to resolve
        this issue.
      
        Additionally, some more new USB quirks have been added to get some new
        devices to work properly based on user reports.
      
        Other than that, the patches are all here, and they contain:
      
         - usb gadget driver fixes
      
         - xhci driver fixes
      
         - typec fixes
      
         - new quirks and ids
      
         - fixes for USB patches that went into 5.9-rc1.
      
        All of these have been tested in linux-next with no reported issues"
      
      * tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
        usb: storage: Add unusual_uas entry for Sony PSZ drives
        USB: Ignore UAS for JMicron JMS567 ATA/ATAPI Bridge
        usb: host: ohci-exynos: Fix error handling in exynos_ohci_probe()
        USB: gadget: u_f: Unbreak offset calculation in VLAs
        USB: quirks: Ignore duplicate endpoint on Sound Devices MixPre-D
        usb: typec: tcpm: Fix Fix source hard reset response for TDA 2.3.1.1 and TDA 2.3.1.2 failures
        USB: PHY: JZ4770: Fix static checker warning.
        USB: gadget: f_ncm: add bounds checks to ncm_unwrap_ntb()
        USB: gadget: u_f: add overflow checks to VLA macros
        xhci: Always restore EP_SOFT_CLEAR_TOGGLE even if ep reset failed
        xhci: Do warm-reset when both CAS and XDEV_RESUME are set
        usb: host: xhci: fix ep context print mismatch in debugfs
        usb: uas: Add quirk for PNY Pro Elite
        tools: usb: move to tools buildsystem
        USB: Fix device driver race
        USB: Also match device drivers using the ->match vfunc
        usb: host: xhci-tegra: fix tegra_xusb_get_phy()
        usb: host: xhci-tegra: otg usb2/usb3 port init
        usb: hcd: Fix use after free in usb_hcd_pci_remove()
        usb: typec: ucsi: Hold con->lock for the entire duration of ucsi_register_port()
        ...
      6f0306d1
    • Linus Torvalds's avatar
      Merge tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 42df60fc
      Linus Torvalds authored
      Pull EDAC fix from Borislav Petkov:
       "A fix to properly clear ghes_edac driver state on driver remove so
        that a subsequent load can probe the system properly (Shiju Jose)"
      
      * tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()
      42df60fc
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping · c4011283
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
       "Fix a possibly uninitialized variable (Dan Carpenter)"
      
      * tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping:
        dma-pool: Fix an uninitialized variable bug in atomic_pool_expand()
      c4011283
    • Thomas Gleixner's avatar
      genirq/matrix: Deal with the sillyness of for_each_cpu() on UP · 784a0830
      Thomas Gleixner authored
      Most of the CPU mask operations behave the same way, but for_each_cpu() and
      it's variants ignore the cpumask argument and claim that CPU0 is always in
      the mask. This is historical, inconsistent and annoying behaviour.
      
      The matrix allocator uses for_each_cpu() and can be called on UP with an
      empty cpumask. The calling code does not expect that this succeeds but
      until commit e027ffff ("x86/irq: Unbreak interrupt affinity setting")
      this went unnoticed. That commit added a WARN_ON() to catch cases which
      move an interrupt from one vector to another on the same CPU. The warning
      triggers on UP.
      
      Add a check for the cpumask being empty to prevent this.
      
      Fixes: 2f75d9e1
      
       ("genirq: Implement bitmap matrix allocator")
      Reported-by: default avatarkernel test robot <rong.a.chen@intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      784a0830
  6. Aug 30, 2020
    • Linus Torvalds's avatar
      Merge tag 'fallthrough-fixes-5.9-rc3' of... · 1127b219
      Linus Torvalds authored
      Merge tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull fallthrough fixes from Gustavo A. R. Silva:
       "Fix some minor issues introduced by the recent treewide fallthrough
        conversions:
      
         - Fix identation issue
      
         - Fix erroneous fallthrough annotation
      
         - Remove unnecessary fallthrough annotation
      
         - Fix code comment changed by fallthrough conversion"
      
      * tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        arm64/cpuinfo: Remove unnecessary fallthrough annotation
        media: dib0700: Fix identation issue in dib8096_set_param_override()
        afs: Remove erroneous fallthough annotation
        iio: dpot-dac: fix code comment in dpot_dac_read_raw()
      1127b219
    • Linus Torvalds's avatar
      fsldma: fix very broken 32-bit ppc ioread64 functionality · 0a4c56c8
      Linus Torvalds authored
      Commit ef91bb19
      
       ("kernel.h: Silence sparse warning in
      lower_32_bits") caused new warnings to show in the fsldma driver, but
      that commit was not to blame: it only exposed some very incorrect code
      that tried to take the low 32 bits of an address.
      
      That made no sense for multiple reasons, the most notable one being that
      that code was intentionally limited to only 32-bit ppc builds, so "only
      low 32 bits of an address" was completely nonsensical.  There were no
      high bits to mask off to begin with.
      
      But even more importantly fropm a correctness standpoint, turning the
      address into an integer then caused the subsequent address arithmetic to
      be completely wrong too, and the "+1" actually incremented the address
      by one, rather than by four.
      
      Which again was incorrect, since the code was reading two 32-bit values
      and trying to make a 64-bit end result of it all.  Surprisingly, the
      iowrite64() did not suffer from the same odd and incorrect model.
      
      This code has never worked, but it's questionable whether anybody cared:
      of the two users that actually read the 64-bit value (by way of some C
      preprocessor hackery and eventually the 'get_cdar()' inline function),
      one of them explicitly ignored the value, and the other one might just
      happen to work despite the incorrect value being read.
      
      This patch at least makes it not fail the build any more, and makes the
      logic superficially sane.  Whether it makes any difference to the code
      _working_ or not shall remain a mystery.
      
      Compile-tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0a4c56c8
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · e77aee13
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "A core fix for ACPI matching and two driver bugfixes"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: iproc: Fix shifting 31 bits
        i2c: rcar: in slave mode, clear NACK earlier
        i2c: acpi: Remove dead code, i.e. i2c_acpi_match_device()
        i2c: core: Don't fail PRP0001 enumeration when no ID table exist
      e77aee13