Skip to content
  1. Dec 23, 2021
  2. Dec 22, 2021
  3. Dec 21, 2021
  4. Dec 20, 2021
  5. Dec 18, 2021
    • 蒋家盛's avatar
      qlcnic: potential dereference null pointer of rx_queue->page_ring · 60ec7fcf
      蒋家盛 authored
      The return value of kcalloc() needs to be checked.
      To avoid dereference of null pointer in case of the failure of alloc.
      Therefore, it might be better to change the return type of
      qlcnic_sriov_alloc_vlans() and return -ENOMEM when alloc fails and
      return 0 the others.
      Also, qlcnic_sriov_set_guest_vlan_mode() and __qlcnic_pci_sriov_enable()
      should deal with the return value of qlcnic_sriov_alloc_vlans().
      
      Fixes: 154d0c81
      
       ("qlcnic: VLAN enhancement for 84XX adapters")
      Signed-off-by: default avatarJiasheng Jiang <jiasheng@iscas.ac.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      60ec7fcf
    • Lin Ma's avatar
      ax25: NPD bug when detaching AX25 device · 1ade48d0
      Lin Ma authored
      
      
      The existing cleanup routine implementation is not well synchronized
      with the syscall routine. When a device is detaching, below race could
      occur.
      
      static int ax25_sendmsg(...) {
        ...
        lock_sock()
        ax25 = sk_to_ax25(sk);
        if (ax25->ax25_dev == NULL) // CHECK
        ...
        ax25_queue_xmit(skb, ax25->ax25_dev->dev); // USE
        ...
      }
      
      static void ax25_kill_by_device(...) {
        ...
        if (s->ax25_dev == ax25_dev) {
          s->ax25_dev = NULL;
          ...
      }
      
      Other syscall functions like ax25_getsockopt, ax25_getname,
      ax25_info_show also suffer from similar races. To fix them, this patch
      introduce lock_sock() into ax25_kill_by_device in order to guarantee
      that the nullify action in cleanup routine cannot proceed when another
      socket request is pending.
      
      Signed-off-by: default avatarHanjie Wu <nagi@zju.edu.cn>
      Signed-off-by: default avatarLin Ma <linma@zju.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1ade48d0
    • Lin Ma's avatar
      hamradio: improve the incomplete fix to avoid NPD · b2f37aea
      Lin Ma authored
      The previous commit 3e0588c2
      
       ("hamradio: defer ax25 kfree after
      unregister_netdev") reorder the kfree operations and unregister_netdev
      operation to prevent UAF.
      
      This commit improves the previous one by also deferring the nullify of
      the ax->tty pointer. Otherwise, a NULL pointer dereference bug occurs.
      Partial of the stack trace is shown below.
      
      BUG: kernel NULL pointer dereference, address: 0000000000000538
      RIP: 0010:ax_xmit+0x1f9/0x400
      ...
      Call Trace:
       dev_hard_start_xmit+0xec/0x320
       sch_direct_xmit+0xea/0x240
       __qdisc_run+0x166/0x5c0
       __dev_queue_xmit+0x2c7/0xaf0
       ax25_std_establish_data_link+0x59/0x60
       ax25_connect+0x3a0/0x500
       ? security_socket_connect+0x2b/0x40
       __sys_connect+0x96/0xc0
       ? __hrtimer_init+0xc0/0xc0
       ? common_nsleep+0x2e/0x50
       ? switch_fpu_return+0x139/0x1a0
       __x64_sys_connect+0x11/0x20
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The crash point is shown as below
      
      static void ax_encaps(...) {
        ...
        set_bit(TTY_DO_WRITE_WAKEUP, &ax->tty->flags); // ax->tty = NULL!
        ...
      }
      
      By placing the nullify action after the unregister_netdev, the ax->tty
      pointer won't be assigned as NULL net_device framework layer is well
      synchronized.
      
      Signed-off-by: default avatarLin Ma <linma@zju.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b2f37aea
    • David S. Miller's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · aa3cc8a9
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-12-17
      
      Maciej Fijalkowski says:
      
      It seems that previous [0] Rx fix was not enough and there are still
      issues with AF_XDP Rx ZC support in ice driver. Elza reported that for
      multiple XSK sockets configured on a single netdev, some of them were
      becoming dead after a while. We have spotted more things that needed to
      be addressed this time. More of information can be found in particular
      commit messages.
      
      It also carries Alexandr's patch that was sent previously which was
      overlapping with this set.
      
      [0]: https://lore.kernel.org/bpf/20211129231746.2767739-1-anthony.l.nguyen@intel.com/
      
      
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aa3cc8a9
    • George Kennedy's avatar
      tun: avoid double free in tun_free_netdev · 158b515f
      George Kennedy authored
      
      
      Avoid double free in tun_free_netdev() by moving the
      dev->tstats and tun->security allocs to a new ndo_init routine
      (tun_net_init()) that will be called by register_netdevice().
      ndo_init is paired with the desctructor (tun_free_netdev()),
      so if there's an error in register_netdevice() the destructor
      will handle the frees.
      
      BUG: KASAN: double-free or invalid-free in selinux_tun_dev_free_security+0x1a/0x20 security/selinux/hooks.c:5605
      
      CPU: 0 PID: 25750 Comm: syz-executor416 Not tainted 5.16.0-rc2-syzk #1
      Hardware name: Red Hat KVM, BIOS
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0x89/0xb5 lib/dump_stack.c:106
      print_address_description.constprop.9+0x28/0x160 mm/kasan/report.c:247
      kasan_report_invalid_free+0x55/0x80 mm/kasan/report.c:372
      ____kasan_slab_free mm/kasan/common.c:346 [inline]
      __kasan_slab_free+0x107/0x120 mm/kasan/common.c:374
      kasan_slab_free include/linux/kasan.h:235 [inline]
      slab_free_hook mm/slub.c:1723 [inline]
      slab_free_freelist_hook mm/slub.c:1749 [inline]
      slab_free mm/slub.c:3513 [inline]
      kfree+0xac/0x2d0 mm/slub.c:4561
      selinux_tun_dev_free_security+0x1a/0x20 security/selinux/hooks.c:5605
      security_tun_dev_free_security+0x4f/0x90 security/security.c:2342
      tun_free_netdev+0xe6/0x150 drivers/net/tun.c:2215
      netdev_run_todo+0x4df/0x840 net/core/dev.c:10627
      rtnl_unlock+0x13/0x20 net/core/rtnetlink.c:112
      __tun_chr_ioctl+0x80c/0x2870 drivers/net/tun.c:3302
      tun_chr_ioctl+0x2f/0x40 drivers/net/tun.c:3311
      vfs_ioctl fs/ioctl.c:51 [inline]
      __do_sys_ioctl fs/ioctl.c:874 [inline]
      __se_sys_ioctl fs/ioctl.c:860 [inline]
      __x64_sys_ioctl+0x19d/0x220 fs/ioctl.c:860
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Signed-off-by: default avatarGeorge Kennedy <george.kennedy@oracle.com>
      Suggested-by: default avatarJakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/1639679132-19884-1-git-send-email-george.kennedy@oracle.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      158b515f
    • Yevhen Orlov's avatar
      net: marvell: prestera: fix incorrect structure access · 2efc2256
      Yevhen Orlov authored
      In line:
      	upper = info->upper_dev;
      We access upper_dev field, which is related only for particular events
      (e.g. event == NETDEV_CHANGEUPPER). So, this line cause invalid memory
      access for another events,
      when ptr is not netdev_notifier_changeupper_info.
      
      The KASAN logs are as follows:
      
      [   30.123165] BUG: KASAN: stack-out-of-bounds in prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera]
      [   30.133336] Read of size 8 at addr ffff80000cf772b0 by task udevd/778
      [   30.139866]
      [   30.141398] CPU: 0 PID: 778 Comm: udevd Not tainted 5.16.0-rc3 #6
      [   30.147588] Hardware name: DNI AmazonGo1 A7040 board (DT)
      [   30.153056] Call trace:
      [   30.155547]  dump_backtrace+0x0/0x2c0
      [   30.159320]  show_stack+0x18/0x30
      [   30.162729]  dump_stack_lvl+0x68/0x84
      [   30.166491]  print_address_description.constprop.0+0x74/0x2b8
      [   30.172346]  kasan_report+0x1e8/0x250
      [   30.176102]  __asan_load8+0x98/0xe0
      [   30.179682]  prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera]
      [   30.186847]  prestera_netdev_event_handler+0x1b4/0x1c0 [prestera]
      [   30.193313]  raw_notifier_call_chain+0x74/0xa0
      [   30.197860]  call_netdevice_notifiers_info+0x68/0xc0
      [   30.202924]  register_netdevice+0x3cc/0x760
      [   30.207190]  register_netdev+0x24/0x50
      [   30.211015]  prestera_device_register+0x8a0/0xba0 [prestera]
      
      Fixes: 3d5048cc
      
       ("net: marvell: prestera: move netdev topology validation to prestera_main")
      Signed-off-by: default avatarYevhen Orlov <yevhen.orlov@plvision.eu>
      Link: https://lore.kernel.org/r/20211216171714.11341-1-yevhen.orlov@plvision.eu
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2efc2256
    • Yevhen Orlov's avatar
      net: marvell: prestera: fix incorrect return of port_find · 8b681bd7
      Yevhen Orlov authored
      In case, when some ports is in list and we don't find requested - we
      return last iterator state and not return NULL as expected.
      
      Fixes: 501ef306
      
       ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
      Signed-off-by: default avatarYevhen Orlov <yevhen.orlov@plvision.eu>
      Link: https://lore.kernel.org/r/20211216170736.8851-1-yevhen.orlov@plvision.eu
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8b681bd7
    • Hoang Le's avatar
      Revert "tipc: use consistent GFP flags" · f845fe58
      Hoang Le authored
      This reverts commit 86c3a3e9.
      
      The tipc_aead_init() function can be calling from an interrupt routine.
      This allocation might sleep with GFP_KERNEL flag, hence the following BUG
      is reported.
      
      [   17.657509] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:230
      [   17.660916] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/3
      [   17.664093] preempt_count: 302, expected: 0
      [   17.665619] RCU nest depth: 2, expected: 0
      [   17.667163] Preemption disabled at:
      [   17.667165] [<0000000000000000>] 0x0
      [   17.669753] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G        W         5.16.0-rc4+ #1
      [   17.673006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      [   17.675540] Call Trace:
      [   17.676285]  <IRQ>
      [   17.676913]  dump_stack_lvl+0x34/0x44
      [   17.678033]  __might_resched.cold+0xd6/0x10f
      [   17.679311]  kmem_cache_alloc_trace+0x14d/0x220
      [   17.680663]  tipc_crypto_start+0x4a/0x2b0 [tipc]
      [   17.682146]  ? kmem_cache_alloc_trace+0xd3/0x220
      [   17.683545]  tipc_node_create+0x2f0/0x790 [tipc]
      [   17.684956]  tipc_node_check_dest+0x72/0x680 [tipc]
      [   17.686706]  ? ___cache_free+0x31/0x350
      [   17.688008]  ? skb_release_data+0x128/0x140
      [   17.689431]  tipc_disc_rcv+0x479/0x510 [tipc]
      [   17.690904]  tipc_rcv+0x71c/0x730 [tipc]
      [   17.692219]  ? __netif_receive_skb_core+0xb7/0xf60
      [   17.693856]  tipc_l2_rcv_msg+0x5e/0x90 [tipc]
      [   17.695333]  __netif_receive_skb_list_core+0x20b/0x260
      [   17.697072]  netif_receive_skb_list_internal+0x1bf/0x2e0
      [   17.698870]  ? dev_gro_receive+0x4c2/0x680
      [   17.700255]  napi_complete_done+0x6f/0x180
      [   17.701657]  virtnet_poll+0x29c/0x42e [virtio_net]
      [   17.703262]  __napi_poll+0x2c/0x170
      [   17.704429]  net_rx_action+0x22f/0x280
      [   17.705706]  __do_softirq+0xfd/0x30a
      [   17.706921]  common_interrupt+0xa4/0xc0
      [   17.708206]  </IRQ>
      [   17.708922]  <TASK>
      [   17.709651]  asm_common_interrupt+0x1e/0x40
      [   17.711078] RIP: 0010:default_idle+0x18/0x20
      
      Fixes: 86c3a3e9
      
       ("tipc: use consistent GFP flags")
      Acked-by: default avatarJon Maloy <jmaloy@redhat.com>
      Signed-off-by: default avatarHoang Le <hoang.h.le@dektech.com.au>
      Link: https://lore.kernel.org/r/20211217030059.5947-1-hoang.h.le@dektech.com.au
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f845fe58
    • Aleksander Jan Bajkowski's avatar
      net: lantiq_xrx200: increase buffer reservation · 1488fc20
      Aleksander Jan Bajkowski authored
      If the user sets a lower mtu on the CPU port than on the switch,
      then DMA inserts a few more bytes into the buffer than expected.
      In the worst case, it may exceed the size of the buffer. The
      experiments showed that the buffer should be a multiple of the
      burst length value. This patch rounds the length of the rx buffer
      upwards and fixes this bug. The reservation of FCS space in the
      buffer has been removed as PMAC strips the FCS.
      
      Fixes: 998ac358
      
       ("net: lantiq: add support for jumbo frames")
      Reported-by: default avatarThomas Nixon <tom@tomn.co.uk>
      Signed-off-by: default avatarAleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      1488fc20
    • Jakub Kicinski's avatar
      Merge branch 'net-sched-fix-ct-zone-matching-for-invalid-conntrack-state' · 14193d57
      Jakub Kicinski authored
      Paul Blakey says:
      
      ====================
      net/sched: Fix ct zone matching for invalid conntrack state
      
      Currently, when a packet is marked as invalid conntrack_in in act_ct,
      post_ct will be set, and connection info (nf_conn) will be removed
      from the skb. Later openvswitch and flower matching will parse this
      as ct_state=+trk+inv. But because the connection info is missing,
      there is also no zone info to match against even though the packet
      is tracked.
      
      This series fixes that, by passing the last executed zone by act_ct.
      The zone info is passed along from act_ct to the ct flow dissector
      (used by flower to extract zone info) and to ovs, the same way as post_ct
      is passed, via qdisc layer skb cb to dissector, and via skb extension
      to OVS.
      
      Since adding any more data to qdisc skb cb, there will be no room
      for BPF skb cb to extend it and stay under skb->cb size, this series
      moves the tc related info from within qdisc skb cb to a tc specific cb
      that also extends it.
      ====================
      
      Link: https://lore.kernel.org/r/20211214172435.24207-1-paulb@nvidia.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      14193d57
    • Paul Blakey's avatar
      net: openvswitch: Fix matching zone id for invalid conns arriving from tc · 635d448a
      Paul Blakey authored
      Zone id is not restored if we passed ct and ct rejected the connection,
      as there is no ct info on the skb.
      
      Save the zone from tc skb cb to tc skb extension and pass it on to
      ovs, use that info to restore the zone id for invalid connections.
      
      Fixes: d29334c1
      
       ("net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct")
      Signed-off-by: default avatarPaul Blakey <paulb@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      635d448a
    • Paul Blakey's avatar
      net/sched: flow_dissector: Fix matching on zone id for invalid conns · 38495958
      Paul Blakey authored
      If ct rejects a flow, it removes the conntrack info from the skb.
      act_ct sets the post_ct variable so the dissector will see this case
      as an +tracked +invalid state, but the zone id is lost with the
      conntrack info.
      
      To restore the zone id on such cases, set the last executed zone,
      via the tc control block, when passing ct, and read it back in the
      dissector if there is no ct info on the skb (invalid connection).
      
      Fixes: 7baf2429
      
       ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
      Signed-off-by: default avatarPaul Blakey <paulb@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      38495958
    • Paul Blakey's avatar
      net/sched: Extend qdisc control block with tc control block · ec624fe7
      Paul Blakey authored
      
      
      BPF layer extends the qdisc control block via struct bpf_skb_data_end
      and because of that there is no more room to add variables to the
      qdisc layer control block without going over the skb->cb size.
      
      Extend the qdisc control block with a tc control block,
      and move all tc related variables to there as a pre-step for
      extending the tc control block with additional members.
      
      Signed-off-by: default avatarPaul Blakey <paulb@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ec624fe7
    • Maciej Fijalkowski's avatar
      ice: xsk: fix cleaned_count setting · dcbaf72a
      Maciej Fijalkowski authored
      Currently cleaned_count is initialized to ICE_DESC_UNUSED(rx_ring) and
      later on during the Rx processing it is incremented per each frame that
      driver consumed. This can result in excessive buffers requested from xsk
      pool based on that value.
      
      To address this, just drop cleaned_count and pass
      ICE_DESC_UNUSED(rx_ring) directly as a function argument to
      ice_alloc_rx_bufs_zc(). Idea is to ask for buffers as many as consumed.
      
      Let us also call ice_alloc_rx_bufs_zc unconditionally at the end of
      ice_clean_rx_irq_zc. This has been changed in that way for corresponding
      ice_clean_rx_irq, but not here.
      
      Fixes: 2d4238f5
      
       ("ice: Add support for AF_XDP")
      Signed-off-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: default avatarKiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      dcbaf72a
    • Maciej Fijalkowski's avatar
      ice: xsk: allow empty Rx descriptors on XSK ZC data path · 8bea15ab
      Maciej Fijalkowski authored
      Commit ac6f733a ("ice: allow empty Rx descriptors") stated that ice
      HW can produce empty descriptors that are valid and they should be
      processed.
      
      Add this support to xsk ZC path to avoid potential processing problems.
      
      Fixes: 2d4238f5
      
       ("ice: Add support for AF_XDP")
      Signed-off-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: default avatarKiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      8bea15ab
    • Maciej Fijalkowski's avatar
      ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor · 8b51a13c
      Maciej Fijalkowski authored
      The descriptor that ntu is pointing at when we exit
      ice_alloc_rx_bufs_zc() should not have its corresponding DD bit cleared
      as descriptor is not allocated in there and it is not valid for HW
      usage.
      
      The allocation routine at the entry will fill the descriptor that ntu
      points to after it was set to ntu + nb_buffs on previous call.
      
      Even the spec says:
      "The tail pointer should be set to one descriptor beyond the last empty
      descriptor in host descriptor ring."
      
      Therefore, step away from clearing the status_error0 on ntu + nb_buffs
      descriptor.
      
      Fixes: db804cfc
      
       ("ice: Use the xsk batched rx allocation interface")
      Reported-by: default avatarElza Mathew <elza.mathew@intel.com>
      Signed-off-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: default avatarKiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      8b51a13c
    • Alexander Lobakin's avatar
      ice: remove dead store on XSK hotpath · 0708b6fa
      Alexander Lobakin authored
      The 'if (ntu == rx_ring->count)' block in ice_alloc_rx_buffers_zc()
      was previously residing in the loop, but after introducing the
      batched interface it is used only to wrap-around the NTU descriptor,
      thus no more need to assign 'xdp'.
      
      Fixes: db804cfc
      
       ("ice: Use the xsk batched rx allocation interface")
      Signed-off-by: default avatarAlexander Lobakin <alexandr.lobakin@intel.com>
      Acked-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: default avatarKiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      0708b6fa