  1. Sep 13, 2023
    • mm/vmalloc: add a safer version of find_vm_area() for debug · 1f55e766
      Joel Fernandes (Google) authored
      commit 0818e739 upstream.
      
      Dumping vmalloc area information is unsafe from some contexts.  Add a
      safer trylock version of find_vm_area() that does a best-effort VMA
      lookup, and use it from vmalloc_dump_obj().
      
      [applied test robot feedback on unused function fix.]
      [applied Uladzislau feedback on locking.]
      Link: https://lkml.kernel.org/r/20230904180806.1002832-1-joel@joelfernandes.org
      Fixes: 98f18083 ("mm: Make mem_dump_obj() handle vmalloc() memory")
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Zqiang <qiang.zhang1211@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f55e766
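The best-effort trylock lookup described in the vmalloc commit can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the `areas` table, `area_trylock()`, `area_unlock()` and `find_area_trylock()` are hypothetical names, and a C11 `atomic_flag` stands in for the kernel's vmap_area_lock spinlock. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct area {
    unsigned long start;
    unsigned long size;
};

/* Hypothetical table standing in for the kernel's vmap area tree. */
static const struct area areas[] = {
    { 0x1000, 0x2000 },
    { 0x8000, 0x1000 },
};

/* One-bit lock: a set flag means "held". */
static atomic_flag area_lock = ATOMIC_FLAG_INIT;

static bool area_trylock(void)
{
    /* test_and_set returns the previous value; false means we took the lock. */
    return !atomic_flag_test_and_set_explicit(&area_lock, memory_order_acquire);
}

static void area_unlock(void)
{
    atomic_flag_clear_explicit(&area_lock, memory_order_release);
}

/*
 * Best-effort lookup. Returns NULL both when addr is in no area and when
 * the lock is contended; *contended tells the two cases apart so the
 * caller can still print "this was a vmalloc-style pointer".
 */
static const struct area *find_area_trylock(unsigned long addr, bool *contended)
{
    const struct area *found = NULL;

    assert(contended != NULL);
    *contended = !area_trylock();
    if (*contended)
        return NULL; /* never spin or sleep in a debug path */

    for (size_t i = 0; i < sizeof(areas) / sizeof(areas[0]); i++) {
        if (addr >= areas[i].start && addr - areas[i].start < areas[i].size) {
            found = &areas[i];
            break;
        }
    }
    area_unlock();
    return found;
}
```

The design point is that a debug helper gives up immediately instead of waiting, so it cannot deadlock when called from a context that may already hold the lock.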
    • scsi: core: Fix the scsi_set_resid() documentation · d41edcef
      Bart Van Assche authored
      commit f669b8a6 upstream.
      
      Because scsi_finish_command() subtracts the residual from the buffer
      length, residual overflows must not be reported. Reflect this in the SCSI
      documentation. See also commit 9237f04e ("scsi: core: Fix
      scsi_get/set_resid() interface").
      
      Cc: Damien Le Moal <dlemoal@kernel.org>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Link: https://lore.kernel.org/r/20230721160154.874010-2-bvanassche@acm.org
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d41edcef
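Why a residual larger than the buffer must not be reported can be seen from the arithmetic alone. The sketch below assumes unsigned lengths; `bytes_transferred()` and `clamp_resid()` are hypothetical helpers for illustration, not SCSI core functions.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the "buffer length minus residual" step. */
static unsigned int bytes_transferred(unsigned int bufflen, unsigned int resid)
{
    /* Unsigned subtraction wraps around if resid > bufflen. */
    return bufflen - resid;
}

/* Hypothetical driver-side guard: never report more residual than the buffer holds. */
static unsigned int clamp_resid(unsigned int bufflen, unsigned int resid)
{
    assert(bufflen <= UINT_MAX);
    return resid > bufflen ? bufflen : resid;
}
```

With a 512-byte buffer and a reported residual of 4096, the subtraction yields a huge bogus byte count; clamping the residual first yields 0 transferred bytes instead.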
    • printk: ringbuffer: Fix truncating buffer size min_t cast · 4329b63c
      Kees Cook authored
      commit 53e9e33e upstream.
      
      If an output buffer size exceeded U16_MAX, the min_t(u16, ...) cast in
      copy_data() was causing writes to truncate. This manifested as output
      bytes being skipped, seen as %NUL bytes in pstore dumps when the available
      record size was larger than 65536. Fix the cast to no longer truncate
      the calculation.
      
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: John Ogness <john.ogness@linutronix.de>
      Reported-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
      Link: https://lore.kernel.org/lkml/d8bb1ec7-a4c5-43a2-9de0-9643a70b899f@linux.microsoft.com/
      Fixes: b6cf8b3f ("printk: add lockless ringbuffer")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Tested-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
      Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
      Reviewed-by: Tyler Hicks (Microsoft) <code@tyhicks.com>
      Tested-by: Tyler Hicks (Microsoft) <code@tyhicks.com>
      Reviewed-by: John Ogness <john.ogness@linutronix.de>
      Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Link: https://lore.kernel.org/r/20230811054528.never.165-kees@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4329b63c
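The truncation described in the printk commit is easy to reproduce outside the kernel. The sketch below mimics what a min_t(u16, ...) style comparison does to operands larger than U16_MAX; `min_cast_u16()` and `min_uint()` are hypothetical helpers written for this illustration, not the kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* Mimics min_t(u16, a, b): both operands are narrowed to 16 bits first. */
static unsigned int min_cast_u16(unsigned int a, unsigned int b)
{
    uint16_t x = (uint16_t)a;
    uint16_t y = (uint16_t)b;

    return x < y ? x : y;
}

/* Same comparison in a type wide enough for both operands. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
    assert(sizeof(unsigned int) >= 4);
    return a < b ? a : b;
}
```

For sizes 70000 and 100000, the 16-bit version compares 4464 with 34464 and returns 4464, so most of the data would be skipped; the wide version returns 70000 as intended.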
    • rcu: dump vmalloc memory info safely · dddca4c4
      Zqiang authored
      commit c83ad36a upstream.
      
      Currently, a double invocation of call_rcu() dumps memory information
      for the rcu_head object.  If the object was not allocated from the slab
      allocator, vmalloc_dump_obj() is invoked and needs to hold the
      vmap_area_lock spinlock.  Because call_rcu() can be invoked in interrupt
      context, this can lead to a spinlock deadlock.

      On a Preempt-RT kernel, the rcutorture test also triggers the following
      lockdep warning:
      
      BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
      in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
      preempt_count: 1, expected: 0
      RCU nest depth: 1, expected: 1
      3 locks held by swapper/0/1:
       #0: ffffffffb534ee80 (fullstop_mutex){+.+.}-{4:4}, at: torture_init_begin+0x24/0xa0
       #1: ffffffffb5307940 (rcu_read_lock){....}-{1:3}, at: rcu_torture_init+0x1ec7/0x2370
       #2: ffffffffb536af40 (vmap_area_lock){+.+.}-{3:3}, at: find_vmap_area+0x1f/0x70
      irq event stamp: 565512
      hardirqs last  enabled at (565511): [<ffffffffb379b138>] __call_rcu_common+0x218/0x940
      hardirqs last disabled at (565512): [<ffffffffb5804262>] rcu_torture_init+0x20b2/0x2370
      softirqs last  enabled at (399112): [<ffffffffb36b2586>] __local_bh_enable_ip+0x126/0x170
      softirqs last disabled at (399106): [<ffffffffb43fef59>] inet_register_protosw+0x9/0x1d0
      Preemption disabled at:
      [<ffffffffb58040c3>] rcu_torture_init+0x1f13/0x2370
      CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W          6.5.0-rc4-rt2-yocto-preempt-rt+ #15
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x68/0xb0
       dump_stack+0x14/0x20
       __might_resched+0x1aa/0x280
       ? __pfx_rcu_torture_err_cb+0x10/0x10
       rt_spin_lock+0x53/0x130
       ? find_vmap_area+0x1f/0x70
       find_vmap_area+0x1f/0x70
       vmalloc_dump_obj+0x20/0x60
       mem_dump_obj+0x22/0x90
       __call_rcu_common+0x5bf/0x940
       ? debug_smp_processor_id+0x1b/0x30
       call_rcu_hurry+0x14/0x20
       rcu_torture_init+0x1f82/0x2370
       ? __pfx_rcu_torture_leak_cb+0x10/0x10
       ? __pfx_rcu_torture_leak_cb+0x10/0x10
       ? __pfx_rcu_torture_init+0x10/0x10
       do_one_initcall+0x6c/0x300
       ? debug_smp_processor_id+0x1b/0x30
       kernel_init_freeable+0x2b9/0x540
       ? __pfx_kernel_init+0x10/0x10
       kernel_init+0x1f/0x150
       ret_from_fork+0x40/0x50
       ? __pfx_kernel_init+0x10/0x10
       ret_from_fork_asm+0x1b/0x30
       </TASK>
      
      The previous patch fixes this by using the deadlock-safe best-effort
      version of find_vm_area.  However, in case of failure print the fact that
      the pointer was a vmalloc pointer so that we print at least something.
      
      Link: https://lkml.kernel.org/r/20230904180806.1002832-2-joel@joelfernandes.org
      Fixes: 98f18083 ("mm: Make mem_dump_obj() handle vmalloc() memory")
      Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dddca4c4
    • virtio_pmem: add the missing REQ_OP_WRITE for flush bio · e39e870e
      Hou Tao authored
      commit c1dbd8a8 upstream.
      
      When doing mkfs.xfs on a pmem device, the following warning was
      reported:
      
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 submit_bio_noacct
       Modules linked in:
       CPU: 2 PID: 384 Comm: mkfs.xfs Not tainted 6.4.0-rc7+ #154
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
       RIP: 0010:submit_bio_noacct+0x340/0x520
       ......
       Call Trace:
        <TASK>
        ? submit_bio_noacct+0xd5/0x520
        submit_bio+0x37/0x60
        async_pmem_flush+0x79/0xa0
        nvdimm_flush+0x17/0x40
        pmem_submit_bio+0x370/0x390
        __submit_bio+0xbc/0x190
        submit_bio_noacct_nocheck+0x14d/0x370
        submit_bio_noacct+0x1ef/0x520
        submit_bio+0x55/0x60
        submit_bio_wait+0x5a/0xc0
        blkdev_issue_flush+0x44/0x60
      
      The root cause is that submit_bio_noacct() requires bio_op() to be either
      WRITE or ZONE_APPEND for a flush bio, but async_pmem_flush() doesn't
      assign REQ_OP_WRITE when allocating the flush bio, so submit_bio_noacct()
      simply fails the flush bio.

      Fix it by adding the missing REQ_OP_WRITE for the flush bio. The flush
      order issue and flush optimization can be addressed later.
      
      Cc: stable@vger.kernel.org # 6.3+
      Fixes: b4a6bb3a ("block: add a sanity check for non-write flush/fua bios")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
      Tested-by: Pankaj Gupta <pankaj.gupta@amd.com>
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e39e870e
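The sanity check that rejected the flush bio in the virtio_pmem commit can be modelled as a predicate over an "operation plus flags" word. The constants below (`OP_WRITE`, `F_PREFLUSH`, the bit positions, and so on) are hypothetical encodings chosen for this sketch; they only mirror the shape of the check described in the commit message and are not the block layer's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding: low 8 bits hold the operation, higher bits hold flags. */
enum { OP_READ = 0, OP_WRITE = 1, OP_ZONE_APPEND = 13 };
#define OP_MASK    0xffu
#define F_PREFLUSH (1u << 8)
#define F_FUA      (1u << 9)

static unsigned int op_of(unsigned int opf)
{
    return opf & OP_MASK;
}

/* Returns true if a request with these flags would pass the flush sanity check. */
static bool flush_bio_ok(unsigned int opf)
{
    assert(op_of(opf) <= OP_MASK);
    if (!(opf & (F_PREFLUSH | F_FUA)))
        return true; /* not a flush/FUA request, the check does not apply */
    return op_of(opf) == OP_WRITE || op_of(opf) == OP_ZONE_APPEND;
}
```

A request allocated with only the preflush flag has operation 0, which is a read in this encoding, so it fails the check; OR-ing in the write operation makes it pass.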
    • ALSA: pcm: Fix missing fixup call in compat hw_refine ioctl · 4f973bc0
      Takashi Iwai authored
      commit 358040e3 upstream.
      
      The update of rate_num/den and msbits was factored out to the
      fixup_unreferenced_params() function, to be called explicitly after the
      hw_refine or hw_params procedure.  It's called from
      snd_pcm_hw_refine_user(), but the call was forgotten in the PCM compat
      ioctl.  This resulted in incomplete rate_num/den and msbits parameters
      when the 32-bit compat ioctl is used.
      
      This patch adds the missing call in snd_pcm_ioctl_hw_params_compat().
      
      Reported-by: <Meng_Cai@novatek.com.cn>
      Fixes: f9a076bf ("ALSA: pcm: calculate non-mask/non-interval parameters always when possible")
      Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Reviewed-by: Jaroslav Kysela <perex@perex.cz>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20230829134344.31588-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f973bc0
    • Multi-gen LRU: fix per-zone reclaim · 21a67da7
      Kalesh Singh authored
      commit 669281ee upstream.
      
      MGLRU has an LRU list for each zone for each type (anon/file) in each
      generation:
      
      	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
      
      The min_seq (oldest generation) can progress independently for each
      type but the max_seq (youngest generation) is shared for both anon and
      file. This is to maintain a common frame of reference.
      
      In order for eviction to advance the min_seq of a type, all the per-zone
      lists in the oldest generation of that type must be empty.
      
      The eviction logic only considers pages from eligible zones for
      eviction or promotion.
      
          scan_folios() {
      	...
      	for (zone = sc->reclaim_idx; zone >= 0; zone--)  {
      	    ...
      	    sort_folio(); 	// Promote
      	    ...
      	    isolate_folio(); 	// Evict
      	}
      	...
          }
      
      Consider a system with the movable zone configured and the default 4
      generations. The current state of the system is as shown below
      (only illustrating one type for simplicity):
      
      Type: ANON
      
      	Zone    DMA32     Normal    Movable    Device
      
      	Gen 0       0          0        4GB         0
      
      	Gen 1       0        1GB        1MB         0
      
      	Gen 2     1MB        4GB        1MB         0
      
      	Gen 3     1MB        1MB        1MB         0
      
      Now consider a GFP_KERNEL allocation request (eligible zone
      index <= Normal): evict_folios() will return without doing any work
      since there are no pages to scan in the eligible zones of the oldest
      generation. Reclaim won't make progress until triggered from a ZONE_MOVABLE
      allocation request, which may not happen soon if there is a lot of free
      memory in the movable zone. This can lead to OOM kills, although there
      is 1GB of pages in the Normal zone of Gen 1 that we have not yet tried to
      reclaim.
      
      This issue is not seen in the conventional active/inactive LRU since
      there are no per-zone lists.
      
      If there are no (or not enough) folios to scan in the eligible zones,
      move folios from ineligible zones (zone_index > reclaim_index) to the
      next generation. This allows min_seq to progress and reclaim to proceed
      from the next generation (Gen 1).
      
      Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.
      
      [1] https://github.com/raspberrypi/linux/issues/5395
      
      Link: https://lkml.kernel.org/r/20230802025606.346758-1-kaleshsingh@google.com
      Fixes: ac35a490 ("mm: multi-gen LRU: minimal implementation")
      Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
      Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
      Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
      Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
      Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Cc: Barry Song <baohua@kernel.org>
      Cc: Brian Geffon <bgeffon@google.com>
      Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Steven Barrett <steven@liquorix.net>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      21a67da7
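The stall in the MGLRU commit, and how moving folios out of ineligible zones unblocks it, can be replayed on the example table from the commit message. This is a toy model under stated assumptions: page counts are kept in MB for a single type, and `eligible_pages()`, `gen_is_empty()` and `promote_ineligible()` are hypothetical helpers, not functions from mm/vmscan.c.

```c
#include <assert.h>
#include <stdbool.h>

enum { ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE, ZONE_DEVICE, NR_ZONES };
enum { NR_GENS = 4 };

/* Page counts in MB for one type, copied from the table in the commit message. */
static long nr_pages[NR_GENS][NR_ZONES] = {
    /* DMA32  Normal  Movable  Device */
    {     0,      0,    4096,      0 }, /* Gen 0 (oldest) */
    {     0,   1024,       1,      0 }, /* Gen 1 */
    {     1,   4096,       1,      0 }, /* Gen 2 */
    {     1,      1,       1,      0 }, /* Gen 3 (youngest) */
};

/* Pages a request limited to zones <= reclaim_idx may scan in this generation. */
static long eligible_pages(int gen, int reclaim_idx)
{
    long total = 0;

    assert(gen >= 0 && gen < NR_GENS);
    for (int zone = reclaim_idx; zone >= 0; zone--)
        total += nr_pages[gen][zone];
    return total;
}

/* The oldest generation can only be retired once every zone list is empty. */
static bool gen_is_empty(int gen)
{
    for (int zone = 0; zone < NR_ZONES; zone++)
        if (nr_pages[gen][zone])
            return false;
    return true;
}

/* Move pages from zones above reclaim_idx into the next generation. */
static void promote_ineligible(int gen, int reclaim_idx)
{
    assert(gen >= 0 && gen + 1 < NR_GENS);
    for (int zone = reclaim_idx + 1; zone < NR_ZONES; zone++) {
        nr_pages[gen + 1][zone] += nr_pages[gen][zone];
        nr_pages[gen][zone] = 0;
    }
}
```

Before the move, a Normal-limited request sees 0 eligible pages in Gen 0 while Gen 0 is still non-empty, so nothing can advance. After the move, Gen 0 is empty and the 1GB in the Normal zone of Gen 1 becomes reachable.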
    • PM / devfreq: Fix leak in devfreq_dev_release() · 1640e9c7
      Boris Brezillon authored
      commit 5693d077 upstream.
      
      srcu_init_notifier_head() allocates resources that need to be released
      with a srcu_cleanup_notifier_head() call.
      
      Reported by kmemleak.
      
      Fixes: 0fe3a664 ("PM / devfreq: Add new DEVFREQ_TRANSITION_NOTIFIER notifier")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Reviewed-by: Dhruva Gole <d-gole@ti.com>
      Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1640e9c7
    • igb: set max size RX buffer when store bad packet is enabled · 6a9abbcc
      Radoslaw Tyl authored
      commit bb5ed01c upstream.
      
      Increase the RX buffer size to 3K when the SBP bit is on. The size of
      the RX buffer determines the number of pages allocated which may not
      be sufficient for receive frames larger than the set MTU size.
      
      Cc: stable@vger.kernel.org
      Fixes: 89eaefb6 ("igb: Support RX-ALL feature flag.")
      Reported-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com>
      Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
      Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a9abbcc
    • skbuff: skb_segment, Call zero copy functions before using skbuff frags · f99006e8
      Mohamed Khalfella authored
      commit 2ea35288 upstream.
      
      Commit bf5c25d6 ("skbuff: in skb_segment, call zerocopy functions
      once per nskb") added the call to zero copy functions in skb_segment().
      The change introduced a bug in skb_segment() because skb_orphan_frags()
      may possibly change the number of fragments or allocate new fragments
      altogether leaving nrfrags and frag to point to the old values. This can
      cause a panic with stacktrace like the one below.
      
      [  193.894380] BUG: kernel NULL pointer dereference, address: 00000000000000bc
      [  193.895273] CPU: 13 PID: 18164 Comm: vh-net-17428 Kdump: loaded Tainted: G           O      5.15.123+ #26
      [  193.903919] RIP: 0010:skb_segment+0xb0e/0x12f0
      [  194.021892] Call Trace:
      [  194.027422]  <TASK>
      [  194.072861]  tcp_gso_segment+0x107/0x540
      [  194.082031]  inet_gso_segment+0x15c/0x3d0
      [  194.090783]  skb_mac_gso_segment+0x9f/0x110
      [  194.095016]  __skb_gso_segment+0xc1/0x190
      [  194.103131]  netem_enqueue+0x290/0xb10 [sch_netem]
      [  194.107071]  dev_qdisc_enqueue+0x16/0x70
      [  194.110884]  __dev_queue_xmit+0x63b/0xb30
      [  194.121670]  bond_start_xmit+0x159/0x380 [bonding]
      [  194.128506]  dev_hard_start_xmit+0xc3/0x1e0
      [  194.131787]  __dev_queue_xmit+0x8a0/0xb30
      [  194.138225]  macvlan_start_xmit+0x4f/0x100 [macvlan]
      [  194.141477]  dev_hard_start_xmit+0xc3/0x1e0
      [  194.144622]  sch_direct_xmit+0xe3/0x280
      [  194.147748]  __dev_queue_xmit+0x54a/0xb30
      [  194.154131]  tap_get_user+0x2a8/0x9c0 [tap]
      [  194.157358]  tap_sendmsg+0x52/0x8e0 [tap]
      [  194.167049]  handle_tx_zerocopy+0x14e/0x4c0 [vhost_net]
      [  194.173631]  handle_tx+0xcd/0xe0 [vhost_net]
      [  194.176959]  vhost_worker+0x76/0xb0 [vhost]
      [  194.183667]  kthread+0x118/0x140
      [  194.190358]  ret_from_fork+0x1f/0x30
      [  194.193670]  </TASK>
      
      In this case calling skb_orphan_frags() updated nr_frags leaving nrfrags
      local variable in skb_segment() stale. This resulted in the code hitting
      i >= nrfrags prematurely and trying to move to next frag_skb using
      list_skb pointer, which was NULL, and caused kernel panic. Move the call
      to zero copy functions before using frags and nr_frags.
      
      Fixes: bf5c25d6 ("skbuff: in skb_segment, call zerocopy functions once per nskb")
      Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Reported-by: Amit Goyal <agoyal@purestorage.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f99006e8
    • netfilter: xt_sctp: validate the flag_info count · b63b4e11
      Wander Lairson Costa authored
      commit e9947649 upstream.
      
      sctp_mt_check doesn't validate the flag_count field. An attacker can
      take advantage of that to trigger an OOB read and leak memory
      information.
      
      Add the field validation in the checkentry function.
      
      Fixes: 2e4e6a17 ("[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables")
      Cc: stable@vger.kernel.org
      Reported-by: Lucas Leong <wmliang@infosec.exchange>
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b63b4e11
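The kind of checkentry validation described in the xt_sctp commit amounts to rejecting any user-supplied count that exceeds the array it indexes. The sketch below uses hypothetical types (`struct match_info`, `struct flag_info`, `MAX_FLAGS`) and a hypothetical `check_entry()`; the real structure layout and function live in the netfilter sources and may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define MAX_FLAGS 4 /* hypothetical capacity of the flag array */

struct flag_info {
    unsigned char chunktype;
    unsigned char flag;
    unsigned char flag_mask;
};

struct match_info {
    struct flag_info flag_info[MAX_FLAGS];
    int flag_count; /* supplied by user space, so untrusted */
};

/* Reject rule data whose count would index past the end of flag_info[]. */
static int check_entry(const struct match_info *info)
{
    assert(info != NULL);
    if (info->flag_count < 0 ||
        (size_t)info->flag_count > ARRAY_SIZE(info->flag_info))
        return -EINVAL;
    return 0;
}
```

Validating once at rule insertion time means the per-packet match path can loop up to flag_count without further bounds checks.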
    • netfilter: xt_u32: validate user space input · 83b99532
      Wander Lairson Costa authored
      commit 69c5d284 upstream.
      
      The xt_u32 module doesn't validate the fields in the xt_u32 structure.
      An attacker may take advantage of this to trigger an OOB read by setting
      the size fields to a value beyond the array boundaries.
      
      Add a checkentry function to validate the structure.
      
      This was originally reported by the ZDI project (ZDI-CAN-18408).
      
      Fixes: 1b50b8a3 ("[NETFILTER]: Add u32 match")
      Cc: stable@vger.kernel.org
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      83b99532
    • netfilter: nft_exthdr: Fix non-linear header modification · 93450ea5
      Xiao Liang authored
      commit 28427f36 upstream.
      
      Fix skb_ensure_writable() size. Don't use nft_tcp_header_pointer() to
      make it explicit that pointers point to the packet (not local buffer).
      
      Fixes: 99d1712b ("netfilter: exthdr: tcp option set support")
      Fixes: 7890cbea ("netfilter: exthdr: add support for tcp option removal")
      Cc: stable@vger.kernel.org
      Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      93450ea5
    • netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c · d59b6fc4
      Kyle Zeng authored
      commit 050d91c0 upstream.
      
      The missing IP_SET_HASH_WITH_NET0 macro in ip_set_hash_netportnet can
      lead to the use of the wrong `CIDR_POS(c)` for calculating array offsets,
      which can lead to integer underflow and, as a result, slab
      out-of-bounds access.
      This patch adds back the IP_SET_HASH_WITH_NET0 macro to
      ip_set_hash_netportnet to address the issue.
      
      Fixes: 886503f3 ("netfilter: ipset: actually allow allowable CIDR 0 in hash:net,port,net")
      Suggested-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
      Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d59b6fc4
    • igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU · 87f07ec5
      Eric Dumazet authored
      commit c3b704d4 upstream.
      
      This is a follow-up to commit 915d975b ("net: deal with integer
      overflows in kmalloc_reserve()") based on David Laight's feedback.

      Back in 2010, I failed to realize malicious users could set dev->mtu
      to arbitrary values. This mtu has since been limited to 0x7fffffff, but
      regardless of how big dev->mtu is, it makes no sense for igmpv3_newpack()
      to allocate more than IP_MAX_MTU and risk overflowing various skb fields.
      
      Fixes: 57e1ab6e ("igmp: refine skb allocations")
      Link: https://lore.kernel.org/netdev/d273628df80f45428e739274ab9ecb72@AcuMS.aculab.com/
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: David Laight <David.Laight@ACULAB.COM>
      Cc: Kyle Zeng <zengyhkyle@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      87f07ec5
    • net: deal with integer overflows in kmalloc_reserve() · e4ffc47a
      Eric Dumazet authored
      commit 915d975b upstream.
      
      Blamed commit changed:
          ptr = kmalloc(size);
          if (ptr)
            size = ksize(ptr);
      
      to:
          size = kmalloc_size_roundup(size);
          ptr = kmalloc(size);
      
      This allowed various crashes, as reported by syzbot [1]
      and Kyle Zeng.

      The problem is that if @size is bigger than 0x80000001,
      kmalloc_size_roundup(size) returns 2^32.
      
      kmalloc_reserve() uses a 32bit variable (obj_size),
      so 2^32 is truncated to 0.
      
      kmalloc(0) returns ZERO_SIZE_PTR which is not handled by
      skb allocations.
      
      The following trace can be triggered if netdev->mtu is set
      close to 0x7fffffff.

      We might in the future limit netdev->mtu to a more sensible
      limit (like KMALLOC_MAX_SIZE).
      
      This patch is based on a syzbot report, and also a report
      and tentative fix from Kyle Zeng.
      
      [1]
      BUG: KASAN: user-memory-access in __build_skb_around net/core/skbuff.c:294 [inline]
      BUG: KASAN: user-memory-access in __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527
      Write of size 32 at addr 00000000fffffd10 by task syz-executor.4/22554
      
      CPU: 1 PID: 22554 Comm: syz-executor.4 Not tainted 6.1.39-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
      Call trace:
      dump_backtrace+0x1c8/0x1f4 arch/arm64/kernel/stacktrace.c:279
      show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:286
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0x120/0x1a0 lib/dump_stack.c:106
      print_report+0xe4/0x4b4 mm/kasan/report.c:398
      kasan_report+0x150/0x1ac mm/kasan/report.c:495
      kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189
      memset+0x40/0x70 mm/kasan/shadow.c:44
      __build_skb_around net/core/skbuff.c:294 [inline]
      __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527
      alloc_skb include/linux/skbuff.h:1316 [inline]
      igmpv3_newpack+0x104/0x1088 net/ipv4/igmp.c:359
      add_grec+0x81c/0x1124 net/ipv4/igmp.c:534
      igmpv3_send_cr net/ipv4/igmp.c:667 [inline]
      igmp_ifc_timer_expire+0x1b0/0x1008 net/ipv4/igmp.c:810
      call_timer_fn+0x1c0/0x9f0 kernel/time/timer.c:1474
      expire_timers kernel/time/timer.c:1519 [inline]
      __run_timers+0x54c/0x710 kernel/time/timer.c:1790
      run_timer_softirq+0x28/0x4c kernel/time/timer.c:1803
      _stext+0x380/0xfbc
      ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:79
      call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891
      do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:84
      invoke_softirq kernel/softirq.c:437 [inline]
      __irq_exit_rcu+0x1c0/0x4cc kernel/softirq.c:683
      irq_exit_rcu+0x14/0x78 kernel/softirq.c:695
      el0_interrupt+0x7c/0x2e0 arch/arm64/kernel/entry-common.c:717
      __el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:724
      el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:729
      el0t_64_irq+0x1a0/0x1a4 arch/arm64/kernel/entry.S:584
      
      Fixes: 12d6c1d3 ("skbuff: Proactively round up to kmalloc bucket size")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4ffc47a
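The truncation chain in the kmalloc_reserve() commit (round up to 2^32, then store in a 32-bit variable, yielding 0) can be checked with plain integer arithmetic. In this sketch `roundup_pow2()` is a simplified, hypothetical stand-in for the allocator's size rounding; it only assumes that very large requests are rounded up to the next power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the allocator's rounding: next power of two. */
static uint64_t roundup_pow2(uint64_t size)
{
    uint64_t r = 1;

    assert(size <= (UINT64_C(1) << 62)); /* keep the shift below from overflowing */
    while (r < size)
        r <<= 1;
    return r;
}

/* Buggy pattern: the rounded size is stored in a 32-bit variable. */
static uint32_t rounded_size_u32(uint64_t size)
{
    return (uint32_t)roundup_pow2(size); /* 2^32 becomes 0 here */
}
```

A request of 0x80000001 bytes rounds up to 0x100000000, which no longer fits in 32 bits and silently turns into a zero-byte allocation request. Keeping the rounded size in a 64-bit type preserves the value so the oversized request can fail cleanly.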
    • virtio_ring: fix avail_wrap_counter in virtqueue_add_packed · bf11b89b
      Yuan Yao authored
      [ Upstream commit 1acfe2c1 ]
      
      In the current packed virtqueue implementation, the avail_wrap_counter
      won't flip when the driver supplies a descriptor chain with a length
      equal to the queue size: total_sg == vq->packed.vring.num.

      Let’s assume the following situation:
      vq->packed.vring.num: 4
      vq->packed.next_avail_idx: 1
      vq->packed.avail_wrap_counter: 0
      
      Then the driver adds a descriptor chain containing 4 descriptors.
      
      We expect the following result with avail_wrap_counter flipped:
      vq->packed.next_avail_idx: 1
      vq->packed.avail_wrap_counter: 1
      
      But, the current implementation gives the following result:
      vq->packed.next_avail_idx: 1
      vq->packed.avail_wrap_counter: 0
      
      To reproduce the bug, set the packed queue size as small as
      possible, so that the driver is more likely to provide a descriptor
      chain with a length equal to the packed queue size. For example, run
      QEMU with the following command:
      sudo qemu-system-x86_64 \
      -enable-kvm \
      -nographic \
      -kernel "path/to/kernel_image" \
      -m 1G \
      -drive file="path/to/rootfs",if=none,id=disk \
      -device virtio-blk,drive=disk \
      -drive file="path/to/disk_image",if=none,id=rwdisk \
      -device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\
      indirect_desc=off \
      -append "console=ttyS0 root=/dev/vda rw init=/bin/bash"
      
      Inside the VM, create a directory and mount the rwdisk device on it. The
      rwdisk will hang and the mount operation will not complete.

      This commit fixes the wrap counter error by flipping
      packed.avail_wrap_counter when the start of the descriptor chain equals
      the end of the descriptor chain (head == i).
      
      Fixes: 1ce9e605 ("virtio_ring: introduce packed ring support")
      Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
      Message-Id: <20230808051110.3492693-1-yuanyaogoog@chromium.org>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bf11b89b
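The wrap counter scenario from the virtio_ring commit can be replayed with a small model of the ring index. `struct packed_vq` and `add_chain()` are hypothetical and heavily simplified; the `fixed` parameter switches between a strict comparison (i < head), which misses the full-lap case, and an inclusive one (i <= head), which matches the "head == i" condition the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

struct packed_vq {
    unsigned int num;            /* ring size */
    unsigned int next_avail_idx; /* where the next chain starts */
    bool avail_wrap_counter;
};

/* Advance the ring by total_sg descriptors and update the wrap counter. */
static void add_chain(struct packed_vq *vq, unsigned int total_sg, bool fixed)
{
    unsigned int head = vq->next_avail_idx;
    unsigned int i = head;

    assert(total_sg >= 1 && total_sg <= vq->num);
    for (unsigned int n = 0; n < total_sg; n++) {
        if (++i >= vq->num)
            i = 0; /* walked past the end of the ring */
    }

    /*
     * Ending before head means the chain wrapped. Ending exactly at head
     * means it took a full lap, which only the inclusive test detects.
     */
    if (fixed ? (i <= head) : (i < head))
        vq->avail_wrap_counter = !vq->avail_wrap_counter;
    vq->next_avail_idx = i;
}
```

Starting from num = 4, next_avail_idx = 1 and a wrap counter of 0, a 4-descriptor chain ends back at index 1. The strict comparison leaves the counter at 0, while the inclusive comparison flips it to 1 as expected.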
    • virtio_vdpa: build affinity masks conditionally · 5f259224
      Jason Wang authored
      [ Upstream commit ae15acea ]
      
      We try to build the affinity mask via create_affinity_masks()
      unconditionally, which may lead to several issues:

      - the affinity mask is not used for a parent without affinity support
        (only VDUSE supports affinity now)
      - the logic of create_affinity_masks() might not work for devices
        other than block. For example, it's not rare for a networking device
        to have more queues than CPUs. Such a case breaks the current
        affinity logic, which is based on group_cpus_evenly(), which assumes
        the number of CPUs is not less than the number of groups. This can
        trigger a warning[1]:
      
      	if (ret >= 0)
      		WARN_ON(nr_present + nr_others < numgrps);
      
      Fix this by building the affinity masks only when

      - the driver passes an affinity descriptor; a driver like virtio-blk can
        make sure to limit the number of queues when it exceeds the number of
        CPUs
      - the parent supports the affinity setting config ops

      This helps to avoid the warning. More optimizations could be done on
      top.
      
      [1]
      [  682.146655] WARNING: CPU: 6 PID: 1550 at lib/group_cpus.c:400 group_cpus_evenly+0x1aa/0x1c0
      [  682.146668] CPU: 6 PID: 1550 Comm: vdpa Not tainted 6.5.0-rc5jason+ #79
      [  682.146671] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
      [  682.146673] RIP: 0010:group_cpus_evenly+0x1aa/0x1c0
      [  682.146676] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc e8 1b c4 74 ff 48 89 ef e8 13 ac 98 ff 4c 89 e7 45 31 e4 e8 08 ac 98 ff eb c2 <0f> 0b eb b6 e8 fd 05 c3 00 45 31 e4 eb e5 cc cc cc cc cc cc cc cc
      [  682.146679] RSP: 0018:ffffc9000215f498 EFLAGS: 00010293
      [  682.146682] RAX: 000000000001f1e0 RBX: 0000000000000041 RCX: 0000000000000000
      [  682.146684] RDX: ffff888109922058 RSI: 0000000000000041 RDI: 0000000000000030
      [  682.146686] RBP: ffff888109922058 R08: ffffc9000215f498 R09: ffffc9000215f4a0
      [  682.146687] R10: 00000000000198d0 R11: 0000000000000030 R12: ffff888107e02800
      [  682.146689] R13: 0000000000000030 R14: 0000000000000030 R15: 0000000000000041
      [  682.146692] FS:  00007fef52315740(0000) GS:ffff888237380000(0000) knlGS:0000000000000000
      [  682.146695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  682.146696] CR2: 00007fef52509000 CR3: 0000000110dbc004 CR4: 0000000000370ee0
      [  682.146698] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  682.146700] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  682.146701] Call Trace:
      [  682.146703]  <TASK>
      [  682.146705]  ? __warn+0x7b/0x130
      [  682.146709]  ? group_cpus_evenly+0x1aa/0x1c0
      [  682.146712]  ? report_bug+0x1c8/0x1e0
      [  682.146717]  ? handle_bug+0x3c/0x70
      [  682.146721]  ? exc_invalid_op+0x14/0x70
      [  682.146723]  ? asm_exc_invalid_op+0x16/0x20
      [  682.146727]  ? group_cpus_evenly+0x1aa/0x1c0
      [  682.146729]  ? group_cpus_evenly+0x15c/0x1c0
      [  682.146731]  create_affinity_masks+0xaf/0x1a0
      [  682.146735]  virtio_vdpa_find_vqs+0x83/0x1d0
      [  682.146738]  ? __pfx_default_calc_sets+0x10/0x10
      [  682.146742]  virtnet_find_vqs+0x1f0/0x370
      [  682.146747]  virtnet_probe+0x501/0xcd0
      [  682.146749]  ? vp_modern_get_status+0x12/0x20
      [  682.146751]  ? get_cap_addr.isra.0+0x10/0xc0
      [  682.146754]  virtio_dev_probe+0x1af/0x260
      [  682.146759]  really_probe+0x1a5/0x410
      
      Fixes: 3dad5682 ("virtio-vdpa: Support interrupt affinity spreading mechanism")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Message-Id: <20230811091539.1359865-1-jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cpufreq: Fix the race condition while updating the transition_task of policy · 7d85dfdf
      Liao Chang authored
      [ Upstream commit 61bfbf79 ]
      
      The field 'transition_task' of the policy structure is used to track the
      task which is performing the frequency transition. This field is used to
      print a warning when the same task calls _begin() again before
      completing the previous frequency transition via _end().
      
      However, there is a potential race condition between the _end() and
      _begin() APIs while updating the 'transition_task' field of the policy.
      The scenario is depicted below:
      
                   Task A                            Task B
      
              /* 1st freq transition */
              Invoke _begin() {
                      ...
                      ...
              }
                                              /* 2nd freq transition */
                                              Invoke _begin() {
                                                      ... //waiting for A to
                                                      ... //clear
                                                      ... //transition_ongoing
                                                      ... //in _end() for
                                                      ... //the 1st transition
                                                              |
              Change the frequency                            |
                                                              |
              Invoke _end() {                                 |
                      ...                                     |
                      ...                                     |
                      transition_ongoing = false;             V
                                                      transition_ongoing = true;
                                                      transition_task = current;
                      transition_task = NULL;
                      ... //A overwrites the task
                      ... //performing the transition
                      ... //result in error warning.
              }
      
      To fix this race condition, the transition_lock of the policy structure
      is now acquired before updating the policy structure in the _end() API.
      This ensures that only one task can update the 'transition_task' field
      at a time.
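
      The locking rule can be sketched in user space as follows. This is an
      illustration under stated assumptions, not the cpufreq code: a pthread
      mutex stands in for the policy's transition_lock, and struct
      policy_model, transition_try_begin() and transition_end() are
      hypothetical names. The point is that both fields are cleared inside
      the same critical section that a new transition uses to claim them.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of the policy structure. */
struct policy_model {
	pthread_mutex_t transition_lock;
	bool transition_ongoing;
	const void *transition_task; /* identifies the task doing the transition */
};

/* Claim the transition if none is in progress; returns true on success. */
static bool transition_try_begin(struct policy_model *p, const void *task)
{
	bool started = false;

	pthread_mutex_lock(&p->transition_lock);
	if (!p->transition_ongoing) {
		p->transition_ongoing = true;
		p->transition_task = task;
		started = true;
	}
	pthread_mutex_unlock(&p->transition_lock);
	return started;
}

/* Clear both fields under the lock so a new owner is never overwritten. */
static void transition_end(struct policy_model *p)
{
	pthread_mutex_lock(&p->transition_lock);
	p->transition_ongoing = false;
	p->transition_task = NULL;
	pthread_mutex_unlock(&p->transition_lock);
}
```

      Without the lock in transition_end(), the NULL store could land after
      another task had already recorded itself, which is the interleaving
      shown in the diagram.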
      
      Link: https://lore.kernel.org/all/b3c61d8a-d52d-3136-fbf0-d1de9f1ba411@huawei.com/
      Fixes: ca654dc3 ("cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end")
      Signed-off-by: Liao Chang <liaochang1@huawei.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • um: virt-pci: fix missing declaration warning · bb31371c
      Vincent Whitchurch authored
      [ Upstream commit 974b808d ]
      
      Fix this warning which appears with W=1 and without CONFIG_OF:
      
       warning: no previous declaration for 'pcibios_get_phb_of_node'
      
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202308230949.PphIIlhq-lkp@intel.com/
      Fixes: 314a1408 ("um: virt-pci: implement pcibios_get_phb_of_node()")
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Drivers: hv: vmbus: Don't dereference ACPI root object handle · 9fc162c5
      Maciej S. Szmigiero authored
      [ Upstream commit 78e04bbf ]
      
      Since the commit referenced in the Fixes: tag below, the VMBus client
      driver walks the ACPI namespace up from the VMBus ACPI device to the ACPI
      namespace root object, trying to find Hyper-V MMIO ranges.
      
      However, if it is not able to find them, it ends up trying to walk the
      resources of the ACPI namespace root object itself.
      This object has an all-ones handle, which causes a NULL pointer dereference
      in the ACPI code (from dereferencing this pointer with an offset).
      
      This in turn causes an oops on boot with VMBus host implementations that do
      not provide Hyper-V MMIO ranges in their VMBus ACPI device or its
      ancestors.
      The QEMU VMBus implementation is an example of such an implementation.
      
      I guess providing these ranges is optional, since all tested Windows
      versions seem to be able to use VMBus devices without them.
      
      Fix this by explicitly terminating the lookup at the ACPI namespace root
      object.
      
      Note that Linux guests under KVM/QEMU do not use the Hyper-V PV interface
      by default - they only do so if the KVM PV interface is missing or
      disabled.
      
      Example stack trace of such oops:
      [ 3.710827] ? __die+0x1f/0x60
      [ 3.715030] ? page_fault_oops+0x159/0x460
      [ 3.716008] ? exc_page_fault+0x73/0x170
      [ 3.716959] ? asm_exc_page_fault+0x22/0x30
      [ 3.717957] ? acpi_ns_lookup+0x7a/0x4b0
      [ 3.718898] ? acpi_ns_internalize_name+0x79/0xc0
      [ 3.720018] acpi_ns_get_node_unlocked+0xb5/0xe0
      [ 3.721120] ? acpi_ns_check_object_type+0xfe/0x200
      [ 3.722285] ? acpi_rs_convert_aml_to_resource+0x37/0x6e0
      [ 3.723559] ? down_timeout+0x3a/0x60
      [ 3.724455] ? acpi_ns_get_node+0x3a/0x60
      [ 3.725412] acpi_ns_get_node+0x3a/0x60
      [ 3.726335] acpi_ns_evaluate+0x1c3/0x2c0
      [ 3.727295] acpi_ut_evaluate_object+0x64/0x1b0
      [ 3.728400] acpi_rs_get_method_data+0x2b/0x70
      [ 3.729476] ? vmbus_platform_driver_probe+0x1d0/0x1d0 [hv_vmbus]
      [ 3.730940] ? vmbus_platform_driver_probe+0x1d0/0x1d0 [hv_vmbus]
      [ 3.732411] acpi_walk_resources+0x78/0xd0
      [ 3.733398] vmbus_platform_driver_probe+0x9f/0x1d0 [hv_vmbus]
      [ 3.734802] platform_probe+0x3d/0x90
      [ 3.735684] really_probe+0x19b/0x400
      [ 3.736570] ? __device_attach_driver+0x100/0x100
      [ 3.737697] __driver_probe_device+0x78/0x160
      [ 3.738746] driver_probe_device+0x1f/0x90
      [ 3.739743] __driver_attach+0xc2/0x1b0
      [ 3.740671] bus_for_each_dev+0x70/0xc0
      [ 3.741601] bus_add_driver+0x10e/0x210
      [ 3.742527] driver_register+0x55/0xf0
      [ 3.744412] ? 0xffffffffc039a000
      [ 3.745207] hv_acpi_init+0x3c/0x1000 [hv_vmbus]
      
      Fixes: 7f163a6f ("drivers:hv: Modify hv_vmbus to search for all MMIO ranges available.")
      Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Link: https://lore.kernel.org/r/fd8e64ceeecfd1d95ff49021080cf699e88dbbde.1691606267.git.maciej.szmigiero@oracle.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: idxd: Fix issues with PRS disable sysfs knob · bfcf2805
      Fenghua Yu authored
      [ Upstream commit 8cae6657 ]
      
      There are two issues in the current PRS disable sysfs store function
      wq_prs_disable_store():
      
      1. Since the PRS disable knob is invisible if PRS disable is not
         supported in the WQ, it's redundant to check PRS support again in
         the store function. Remove the redundant PRS support check.
      2. Since PRS disable is read-only when the device is not configurable,
         PRS disable cannot be changed on the device. Add device configurable
         check in the store function.
      
      Fixes: f2dc3271 ("dmaengine: idxd: add per wq PRS disable")
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Link: https://lore.kernel.org/r/20230811012635.535413-2-fenghua.yu@intel.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: idxd: Allow ATS disable update only for configurable devices · df212ed7
      Fenghua Yu authored
      [ Upstream commit 0056a7f0 ]
      
      ATS disable status in a WQ is read-only if the device is not configurable.
      This change ensures that the ATS disable attribute can be modified via
      sysfs only on configurable devices.
      
      Fixes: 92de5fa2 ("dmaengine: idxd: add ATS disable knob for work queues")
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Link: https://lore.kernel.org/r/20230811012635.535413-1-fenghua.yu@intel.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: idxd: Expose ATS disable knob only when WQ ATS is supported · e5d80eb3
      Fenghua Yu authored
      [ Upstream commit 62b41b65 ]
      
      WQ Address Translation Services (ATS) can be controlled only when
      WQ ATS is supported. The sysfs ATS disable knob should be visible only
      when the feature is supported.
      
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Link: https://lore.kernel.org/r/20230712174436.3435088-2-fenghua.yu@intel.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Stable-dep-of: 0056a7f0 ("dmaengine: idxd: Allow ATS disable update only for configurable devices")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: idxd: Simplify WQ attribute visibility checks · 8f513756
      Fenghua Yu authored
      [ Upstream commit 97b1185f ]
      
      The functions that check whether WQ attributes are invisible are nearly
      duplicates of one another. Define a helper to simplify these functions
      and future WQ attribute visibility checks as well.
      
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Link: https://lore.kernel.org/r/20230712174436.3435088-1-fenghua.yu@intel.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Stable-dep-of: 0056a7f0 ("dmaengine: idxd: Allow ATS disable update only for configurable devices")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: ste_dma40: Add missing IRQ check in d40_probe · fa2e3c48
      ruanjinjie authored
      [ Upstream commit c05ce690 ]
      
      Check for the return value of platform_get_irq(): if no interrupt
      is specified, it wouldn't make sense to call request_irq().
      
      Fixes: 8d318a50 ("DMAENGINE: Support for ST-Ericssons DMA40 block v3")
      Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Link: https://lore.kernel.org/r/20230724144108.2582917-1-ruanjinjie@huawei.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • um: Fix hostaudio build errors · eaa8d097
      Randy Dunlap authored
      [ Upstream commit db4bfcba ]
      
      Use "select" to ensure that the required kconfig symbols are set
      as expected.
      Drop HOSTAUDIO since it is now equivalent to UML_SOUND.
      
      Set CONFIG_SOUND=m in ARCH=um defconfig files to maintain the
      status quo of the default configs.
      
      Allow SOUND with UML regardless of HAS_IOMEM. Otherwise there is a
      kconfig warning for unmet dependencies. (This was not an issue when
      SOUND was defined in arch/um/drivers/Kconfig. I have done 50 randconfig
      builds and didn't find any issues.)
      
      This fixes build errors when CONFIG_SOUND is not set:
      
      ld: arch/um/drivers/hostaudio_kern.o: in function `hostaudio_cleanup_module':
      hostaudio_kern.c:(.exit.text+0xa): undefined reference to `unregister_sound_mixer'
      ld: hostaudio_kern.c:(.exit.text+0x15): undefined reference to `unregister_sound_dsp'
      ld: arch/um/drivers/hostaudio_kern.o: in function `hostaudio_init_module':
      hostaudio_kern.c:(.init.text+0x19): undefined reference to `register_sound_dsp'
      ld: hostaudio_kern.c:(.init.text+0x31): undefined reference to `register_sound_mixer'
      ld: hostaudio_kern.c:(.init.text+0x49): undefined reference to `unregister_sound_dsp'
      
      and this kconfig warning:
      WARNING: unmet direct dependencies detected for SOUND
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Fixes: d886e87c ("sound: make OSS sound core optional")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: lore.kernel.org/r/202307141416.vxuRVpFv-lkp@intel.com
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: linux-um@lists.infradead.org
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: linux-kbuild@vger.kernel.org
      Cc: alsa-devel@alsa-project.org
      Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume() · 9a61f2c0
      Yi Yang authored
      [ Upstream commit a5a88125 ]
      
      In fsmc_nand_resume(), the return value of clk_prepare_enable() should be
      checked since it might fail.
      
      Fixes: e25da1c0 ("mtd: fsmc_nand: Add clk_{un}prepare() support")
      Signed-off-by: Yi Yang <yiyang13@huawei.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20230817115839.10192-1-yiyang13@huawei.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mtd: spi-nor: Check bus width while setting QE bit · c4b5d365
      Hsin-Yi Wang authored
      [ Upstream commit f01d8155 ]
      
      spi_nor_write_16bit_sr_and_check() should also check if bus width is
      4 before setting QE bit.
      
      Fixes: 39d1e334 ("mtd: spi-nor: Fix clearing of QE bit on lock()/unlock()")
      Suggested-by: Michael Walle <michael@walle.cc>
      Suggested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
      Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
      Reviewed-by: Michael Walle <michael@walle.cc>
      Link: https://lore.kernel.org/r/20230818064524.1229100-2-hsinyi@chromium.org
      Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • leds: trigger: tty: Do not use LED_ON/OFF constants, use led_blink_set_oneshot instead · 0301718d
      Marek Behún authored
      [ Upstream commit 73009457 ]
      
      The tty LED trigger uses the obsolete LED_ON & LED_OFF constants when
      setting LED brightness. This is bad because the LED_ON constant is equal
      to 1, and so when activating the tty LED trigger on an LED class device
      with max_brightness greater than 1, the LED is dimmer than it can be
      (when max_brightness is 255, the LED is very dim indeed; some devices
      translate 1/255 to 0, so the LED is OFF all the time).
      
      Instead of directly setting brightness to a specific value, use the
      led_blink_set_oneshot() function from LED core to configure the blink.
      This function takes the current configured brightness as blink
      brightness if not zero, and max brightness otherwise.
      
      This also changes the behavior of the TTY LED trigger. Previously if
      rx/tx stats kept changing, the LED was ON all the time they kept
      changing. With this patch the LED will blink on TTY activity.
      
      Fixes: fd4a641a ("leds: trigger: implement a tty trigger")
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Link: https://lore.kernel.org/r/20230802090753.13611-1-kabel@kernel.org
      Signed-off-by: Lee Jones <lee@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false · e2a579c5
      Marek Behún authored
      [ Upstream commit c3f85318 ]
      
      At the time we call
          BUG_ON(props.color == LED_COLOR_ID_MULTI);
      the props variable is still initialized to zero.
      
      Call the BUG_ON only after we parse fwnode into props.
      
      Fixes: 77dce3a2 ("leds: disallow /sys/class/leds/*:multi:* for now")
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Link: https://lore.kernel.org/r/20230801151623.30387-1-kabel@kernel.org
      Signed-off-by: Lee Jones <lee@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • leds: multicolor: Use rounded division when calculating color components · 92e1279d
      Marek Behún authored
      [ Upstream commit 065d099f ]
      
      Given channel intensity, LED brightness and max LED brightness, the
      multicolor LED framework helper led_mc_calc_color_components() computes
      the color channel brightness as
      
          chan_brightness = brightness * chan_intensity / max_brightness
      
      Consider the situation when (brightness, intensity, max_brightness) is,
      for example, (16, 15, 255): chan_brightness is then computed as 0,
      although the fractional division would give 0.94, which should be
      rounded to 1.
      
      Use DIV_ROUND_CLOSEST here for the division to give more realistic
      component computation:
      
          chan_brightness = DIV_ROUND_CLOSEST(brightness * chan_intensity,
                                              max_brightness)
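
      The difference between the two divisions can be checked with a short
      user-space sketch. DIV_ROUND_CLOSEST_U below is a simplified stand-in for
      the kernel macro, valid only for non-negative dividends and positive
      divisors; chan_truncated() and chan_rounded() are hypothetical helper
      names used for illustration.

```c
/* Simplified stand-in for DIV_ROUND_CLOSEST: x >= 0 and d > 0 only. */
#define DIV_ROUND_CLOSEST_U(x, d) (((x) + (d) / 2) / (d))

/* Channel brightness with plain (truncating) integer division. */
static unsigned int chan_truncated(unsigned int brightness,
				   unsigned int intensity,
				   unsigned int max_brightness)
{
	return brightness * intensity / max_brightness;
}

/* Channel brightness rounded to the nearest integer. */
static unsigned int chan_rounded(unsigned int brightness,
				 unsigned int intensity,
				 unsigned int max_brightness)
{
	return DIV_ROUND_CLOSEST_U(brightness * intensity, max_brightness);
}
```

      For (16, 15, 255) the product is 240, so truncation yields 0 while the
      rounded form yields (240 + 127) / 255 = 1.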
      
      Fixes: 55d5d3b4 ("leds: multicolor: Introduce a multicolor class definition")
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Link: https://lore.kernel.org/r/20230801124931.8661-1-kabel@kernel.org
      Signed-off-by: Lee Jones <lee@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • thermal/drivers/imx8mm: Suppress log message on probe deferral · e3aa6884
      Ahmad Fatoum authored
      [ Upstream commit 4afcb58e ]
      
      nvmem_cell_read_u32() may return -EPROBE_DEFER if the NVMEM supplier has
      not yet been probed. A future reprobe may succeed, so printing:
      
        i.mx8mm_thermal 30260000.tmu: Failed to read OCOTP nvmem cell (-517).
      
      to the log is confusing. Fix this by using dev_err_probe(). This also
      elevates the message from warning to error, which is more correct: the
      log message is only ever printed in the probe error path and probe aborts
      afterwards, so it really warrants an error-level message.
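
      The behaviour relied on here can be modelled in user space. This is a
      sketch, not the kernel helper: report_probe_error() and messages_logged
      are hypothetical names, EPROBE_DEFER is defined locally because it is a
      kernel-internal error code, and the real dev_err_probe() additionally
      records the deferral reason instead of simply staying silent.

```c
#include <stdio.h>

#define EPROBE_DEFER 517 /* kernel-internal errno, defined here for the sketch */

static int messages_logged; /* counts error-level messages emitted */

/*
 * Hypothetical stand-in for the dev_err_probe() contract used above:
 * always return err, but emit an error message only when the failure
 * is not a probe deferral.
 */
static int report_probe_error(int err, const char *msg)
{
	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "error %d: %s\n", err, msg);
		messages_logged++;
	}
	return err;
}
```

      A caller can then write "return report_probe_error(ret, ...)" in its
      error path and get both the propagated error and the quiet deferral.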
      
      Fixes: 40329164 ("thermal/drivers/imx: Add support for loading calibration data from OCOTP")
      Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
      Reviewed-by: Marek Vasut <marex@denx.de>
      Reviewed-by: Peng Fan <peng.fan@nxp.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230708112647.2897294-1-a.fatoum@pengutronix.de
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • thermal/drivers/mediatek/lvts_thermal: Manage threshold between sensors · db478bcb
      Nícolas F. R. A. Prado authored
      [ Upstream commit 2bba1acf ]
      
      Each LVTS thermal controller can have up to four sensors, each capable
      of triggering its own interrupt when its measured temperature crosses
      the configured threshold. The threshold for each sensor is handled
      separately by the thermal framework, since each one is registered with
      its own thermal zone and trips. However, the temperature thresholds are
      configured on the controller, and therefore are shared between all
      sensors on that controller.
      
      When the temperature measured by the sensors is different enough to
      cause the thermal framework to configure different thresholds for each
      one, interrupts start triggering on sensors outside the last threshold
      configured.
      
      To address the issue, track the thresholds required by each sensor and
      only actually set the highest one in the hardware, and disable
      interrupts for all sensors outside the current configured range.
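
      The "track per sensor, program only the highest" step can be sketched as
      below. This is an illustrative user-space model, not the driver code:
      highest_threshold() is a hypothetical helper, thresholds are assumed to
      be plain integers in millidegrees Celsius, and the interrupt masking
      part of the fix is not modelled.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return the highest of the per-sensor high thresholds, i.e. the single
 * value that would be written to the shared controller register.
 * Returns INT_MIN when no sensor has requested a threshold (n == 0).
 */
static int highest_threshold(const int *high, size_t n)
{
	int max = INT_MIN;

	for (size_t i = 0; i < n; i++) {
		if (high[i] > max)
			max = high[i];
	}
	return max;
}
```

      Because the register is shared, each sensor's requested value has to be
      remembered separately so the maximum can be recomputed whenever any one
      of them changes.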
      
      Fixes: f5f633b1 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
      Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
      Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230706153823.201943-7-nfraprado@collabora.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • thermal/drivers/mediatek/lvts_thermal: Don't leave threshold zeroed · 8e382e94
      Nícolas F. R. A. Prado authored
      [ Upstream commit 77354eae ]
      
      The thermal framework might leave the low threshold unset if there
      aren't any lower trip points. This leaves the register zeroed, which
      translates to a very high temperature for the low threshold. The
      interrupt for this threshold is then immediately triggered, and the
      state machine gets stuck, preventing any other temperature monitoring
      interrupts from ever triggering.
      
      (The same happens by not setting the Cold or Hot to Normal thresholds
      when using those)
      
      Set the unused threshold to a valid low value. This value was chosen so
      that for any valid golden temperature read from the efuse, when the
      value is converted to raw and back again to milliCelsius, the result
      doesn't underflow.
      
      Fixes: f5f633b1 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
      Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
      Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230706153823.201943-6-nfraprado@collabora.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • thermal/drivers/mediatek/lvts_thermal: Disable undesired interrupts · 71b7ccc4
      Nícolas F. R. A. Prado authored
      [ Upstream commit 487bf099 ]
      
      Out of the many interrupts supported by the hardware, the only ones of
      interest to the driver currently are:
      * The temperature went over the high offset threshold, for any of the
        sensors
      * The temperature went below the low offset threshold, for any of the
        sensors
      * The temperature went over the stage3 threshold
      
      These are the only thresholds configured by the driver through the
      OFFSETH, OFFSETL, and PROTTC registers, respectively.
      
      The current interrupt mask in LVTS_MONINT_CONF enables many more
      interrupts, including data ready on sensors for both filtered and
      immediate mode. These are not only not handled by the driver, but they
      are also triggered too often, causing unneeded overhead. Disable these
      unnecessary interrupts.
      
      The meaning of each bit can be seen in the comment describing
      LVTS_MONINTST in the IRQ handler.
      
      Fixes: f5f633b1 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
      Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230706153823.201943-5-nfraprado@collabora.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • thermal/drivers/mediatek/lvts_thermal: Use offset threshold for IRQ · 30a642a6
      Nícolas F. R. A. Prado authored
      [ Upstream commit f79e996c ]
      
      There are two kinds of temperature monitoring interrupts available:
      * High Offset, Low Offset
      * Hot, Hot to normal, Cold
      
      The code currently uses the hot/h2n/cold interrupts, however in a way
      that doesn't work: the cold threshold is left uninitialized, which
      prevents the other thresholds from ever triggering, and the h2n
      interrupt is used as the lower threshold, which prevents the hot
      interrupt from triggering again after the thresholds are updated by the
      thermal framework, since a hot interrupt can only trigger again after
      the hot to normal interrupt has been triggered.
      
      Rather than addressing those issues individually, it is better to use
      the high/low offset interrupts instead. This way only two thresholds
      need to be managed, and they have a simpler state machine, making them
      a better match to the thermal framework's high and low thresholds.
      
      Fixes: f5f633b1 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
      Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
      Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230706153823.201943-4-nfraprado@collabora.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • thermal/drivers/mediatek/lvts_thermal: Honor sensors in immediate mode · ba01e461
      Nícolas F. R. A. Prado authored
      [ Upstream commit 64de162e ]
      
      Each controller can be configured to operate in immediate or filtered
      mode. In filtered mode, the sensors are enabled by setting the
      corresponding bits in MONCTL0, while in immediate mode, by setting
      MSRCTL1.
      
      Previously, the code would set MSRCTL1 for all four sensors when
      configured to immediate mode, but given that the controller might not
      have all four sensors connected, this would cause interrupts to trigger
      for non-existent sensors. Fix this by handling the MSRCTL1 register
      analogously to the MONCTL0: only enable the sensors that were declared.
      
      Fixes: f5f633b1
      
       ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Tested-by: Chen-Yu Tsai <wenst@chromium.org>
      Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
      Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230706153823.201943-3-nfraprado@collabora.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
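The "only enable the sensors that were declared" idea from the immediate-mode fix can be sketched in plain C as follows. This is an illustrative sketch, not the driver's code: the bit layout in `SENSOR_BIT`, the limit of four sensors per controller as a constant, and the helper name `sensor_enable_mask` are assumptions made for the example, since the commit message does not give the MSRCTL1 register layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one enable bit per sensor, sensor n at bit n. */
#define SENSOR_BIT(n)   (UINT32_C(1) << (n))
#define MAX_SENSORS     4u

/*
 * Build an enable mask that covers only the declared sensors.
 * 'declared' holds the indices of the sensors actually wired to the
 * controller; indices outside 0..MAX_SENSORS-1 are ignored.
 */
static uint32_t sensor_enable_mask(const unsigned int *declared, size_t count)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < count; i++) {
		if (declared[i] < MAX_SENSORS)
			mask |= SENSOR_BIT(declared[i]);
	}
	return mask;
}
```

A controller with only sensors 0 and 2 declared would then get a mask with two bits set, instead of an unconditional all-four mask that also enables sensors that do not exist.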
    • thermal/drivers/mediatek/lvts_thermal: Handle IRQ on all controllers · 436b4b33
      Nícolas F. R. A. Prado authored
      [ Upstream commit cbd8c5aa ]
      
      There is a single IRQ handler for each LVTS thermal domain, and it is
      supposed to check each of its underlying controllers for the origin of
      the interrupt and clear its status. However, due to a typo, only the
      first controller was ever being handled, which resulted in the interrupt
      never being cleared when it happened on the other controllers. Add the
      missing index so interrupts are handled for all controllers.
      
      Fixes: f5f633b1 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
      Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Tested-by: Chen-Yu Tsai <wenst@chromium.org>
      Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
      Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20230706153823.201943-2-nfraprado@collabora.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
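The class of bug described in the IRQ fix, a loop that always services element 0 because the index is missing, can be illustrated with a small standalone sketch. The `struct ctrl` type and both function names are hypothetical stand-ins for this example and do not reproduce the driver's structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one controller's interrupt state. */
struct ctrl {
	bool irq_pending;
};

/* Check one controller and clear its status; true if it raised the IRQ. */
static bool ctrl_handle_irq(struct ctrl *c)
{
	if (!c->irq_pending)
		return false;
	c->irq_pending = false;
	return true;
}

/* Service every controller in the domain; returns how many were handled. */
static size_t domain_handle_irq(struct ctrl *ctrls, size_t num)
{
	size_t handled = 0;

	for (size_t i = 0; i < num; i++) {
		/*
		 * Index with i. Passing 'ctrls' (that is, &ctrls[0]) here
		 * would check only the first controller on every iteration
		 * and leave the others' status uncleared.
		 */
		if (ctrl_handle_irq(&ctrls[i]))
			handled++;
	}
	return handled;
}
```

With the index in place, a pending interrupt on the second or third controller is cleared on the first pass, so a second call finds nothing left to handle.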
    • leds: pwm: Fix error code in led_pwm_create_fwnode() · bab77f96
      Dan Carpenter authored
      [ Upstream commit cadb2de2 ]
      
      Negative -EINVAL was intended, not positive EINVAL.  Fix it.
      
      Fixes: 95138e01 ("leds: pwm: Make error handling more robust")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Link: https://lore.kernel.org/r/a33b981a-b2c4-4dc2-b00a-626a090d2f11@moroto.mountain
      Signed-off-by: Sasha Levin <sashal@kernel.org>
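Why the sign matters in the leds fix: kernel functions conventionally return 0 on success and a negative errno value on failure, and callers usually test for `ret < 0`. The sketch below shows how a positive `EINVAL` would slip past such a check. `check_channel_count` and `is_error` are hypothetical helpers written for this example, not functions from the leds-pwm driver.

```c
#include <errno.h>

/* Hypothetical validator following the negative-errno convention. */
static int check_channel_count(int count)
{
	if (count <= 0)
		return -EINVAL;	/* negative value signals failure */
	return 0;		/* zero signals success */
}

/* The usual caller-side test: only negative values count as errors. */
static int is_error(int ret)
{
	return ret < 0;
}
```

Here `is_error(check_channel_count(0))` is true, whereas a function that mistakenly returned positive `EINVAL` would make `is_error()` report success and let the caller continue with invalid state.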