  1. Jan 09, 2022
  2. Jan 08, 2022
  3. Jan 07, 2022
    • netrom: fix api breakage in nr_setsockopt() · dc35616e
      Dan Carpenter authored
      This needs to copy an unsigned int from user space instead of a long to
      avoid breaking user space with an API change.
      
      I have updated all the integer overflow checks from ULONG to UINT as
      well.  This is a slight API change but I do not expect it to affect
      anything in real life.
      
      Fixes: 3087a6f3 ("netrom: fix copying in user data in nr_setsockopt")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
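As a rough illustration of the kind of range check the commit describes, the userspace sketch below validates an unsigned int option against UINT_MAX before scaling it. The function name, the `HZ_SKETCH` constant and the error value are assumptions made for this example; they are not taken from nr_setsockopt().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical tick rate; the kernel's HZ is configuration dependent. */
#define HZ_SKETCH 250U

/*
 * Validate a timer option given in seconds and convert it to ticks.
 * The upper bound uses UINT_MAX because the option is an unsigned int,
 * so opt * HZ_SKETCH cannot wrap once the check has passed.
 */
int set_timer_sketch(unsigned int opt, unsigned int *ticks_out)
{
    assert(ticks_out);
    if (opt < 1 || opt > UINT_MAX / HZ_SKETCH)
        return -EINVAL;
    *ticks_out = opt * HZ_SKETCH;
    return 0;
}
```

A bound of ULONG_MAX / HZ_SKETCH would be too loose for a 32-bit value on 64-bit systems, which is presumably why the checks moved to UINT together with the type.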
    • ax25: uninitialized variable in ax25_setsockopt() · 93719370
      Dan Carpenter authored
      The "opt" variable is unsigned long but we only copy 4 bytes from
      the user so the lower 4 bytes are uninitialized.
      
      I have changed the integer overflow checks from ULONG to UINT as well.
      This is a slight API change but I don't expect it to break anything.
      
      Fixes: a7b75c5a ("net: pass a sockptr_t into ->setsockopt")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
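The partial-copy problem can be reproduced in userspace. In the sketch below, memcpy() stands in for the kernel's copy from user space and a 0xAA fill stands in for stale stack contents; both function names are hypothetical and the code is not taken from ax25_setsockopt().

```c
#include <assert.h>
#include <string.h>

/*
 * Buggy shape: the destination is an unsigned long, but only
 * sizeof(unsigned int) bytes are copied into it. The 0xAA fill
 * simulates whatever happened to be on the stack beforehand.
 */
unsigned long read_opt_buggy_sketch(const unsigned int *user_val)
{
    unsigned long opt;

    assert(user_val);
    memset(&opt, 0xAA, sizeof(opt));
    memcpy(&opt, user_val, sizeof(unsigned int));
    return opt;
}

/* Fixed shape: the destination has exactly the size that is copied. */
unsigned int read_opt_fixed_sketch(const unsigned int *user_val)
{
    unsigned int opt;

    assert(user_val);
    memcpy(&opt, user_val, sizeof(opt));
    return opt;
}
```

On platforms where unsigned long is wider than unsigned int, the buggy variant returns a value that still contains the fill pattern, while the fixed variant returns exactly what the caller passed.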
    • Merge branch 'octeontx2-ptp-bugs' · b69c5b58
      David S. Miller authored
      
      
      Subbaraya Sundeep says:
      
      ====================
      octeontx2: Fix PTP bugs
      
      This patchset addresses two problems found when using
      PTP.
      Patch 1 - Increases the refcount of the ptp device before use.
      This was missing and led to a refcount increment-after-use
      bug when the module was loaded and unloaded a couple of times.
      Patch 2 - PTP resources allocated by a VF were not being freed
      during VF teardown. This patch fixes that.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-nicvf: Free VF PTP resources. · eabd0f88
      Rakesh Babu Saladi authored
      When a VF is removed, its PTP resources are currently not
      being freed. This patch fixes it.
      
      Fixes: 43510ef4 ("octeontx2-nicvf: Add PTP hardware clock support to NIX VF")
      Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Increment ptp refcount before use · 93440f48
      Subbaraya Sundeep authored
      Increment the reference count of the ptp pci device
      before the AF driver uses it.
      
      Fixes: a8b90c9d ("octeontx2-af: Add PTP device id for CN10K and 95O silcons")
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mptcp-fixes' · fff63521
      David S. Miller authored
      
      
      Mat Martineau says:
      
      ====================
      mptcp: Fixes for buffer reclaim and option writing
      
      Here are three fixes dealing with a syzkaller crash MPTCP triggers in
      the memory manager in 5.16-rc8, and some option writing problems.
      
      Patches 1 and 2 fix some corner cases in MPTCP option writing.
      
      Patch 3 addresses a crash that syzkaller found a way to trigger in the mm
      subsystem by passing an invalid value to __sk_mem_reduce_allocated().
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: Check reclaim amount before reducing allocation · 269bda9e
      Mat Martineau authored
      syzbot found a page counter underflow that was triggered by MPTCP's
      reclaim code:
      
      page_counter underflow: -4294964789 nr_pages=4294967295
      WARNING: CPU: 2 PID: 3785 at mm/page_counter.c:56 page_counter_cancel+0xcf/0xe0 mm/page_counter.c:56
      Modules linked in:
      CPU: 2 PID: 3785 Comm: kworker/2:6 Not tainted 5.16.0-rc1-syzkaller #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      Workqueue: events mptcp_worker
      
      RIP: 0010:page_counter_cancel+0xcf/0xe0 mm/page_counter.c:56
      Code: c7 04 24 00 00 00 00 45 31 f6 eb 97 e8 2a 2b b5 ff 4c 89 ea 48 89 ee 48 c7 c7 00 9e b8 89 c6 05 a0 c1 ba 0b 01 e8 95 e4 4b 07 <0f> 0b eb a8 4c 89 e7 e8 25 5a fb ff eb c7 0f 1f 00 41 56 41 55 49
      RSP: 0018:ffffc90002d4f918 EFLAGS: 00010082
      
      RAX: 0000000000000000 RBX: ffff88806a494120 RCX: 0000000000000000
      RDX: ffff8880688c41c0 RSI: ffffffff815e8f28 RDI: fffff520005a9f15
      RBP: ffffffff000009cb R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff815e2cfe R11: 0000000000000000 R12: ffff88806a494120
      R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2de21000 CR3: 000000005ad59000 CR4: 0000000000150ee0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       page_counter_uncharge+0x2e/0x60 mm/page_counter.c:160
       drain_stock+0xc1/0x180 mm/memcontrol.c:2219
       refill_stock+0x139/0x2f0 mm/memcontrol.c:2271
       __sk_mem_reduce_allocated+0x24d/0x550 net/core/sock.c:2945
       __mptcp_rmem_reclaim net/mptcp/protocol.c:167 [inline]
       __mptcp_mem_reclaim_partial+0x124/0x410 net/mptcp/protocol.c:975
       mptcp_mem_reclaim_partial net/mptcp/protocol.c:982 [inline]
       mptcp_alloc_tx_skb net/mptcp/protocol.c:1212 [inline]
       mptcp_sendmsg_frag+0x18c6/0x2190 net/mptcp/protocol.c:1279
       __mptcp_push_pending+0x232/0x720 net/mptcp/protocol.c:1545
       mptcp_release_cb+0xfe/0x200 net/mptcp/protocol.c:2975
       release_sock+0xb4/0x1b0 net/core/sock.c:3306
       mptcp_worker+0x51e/0xc10 net/mptcp/protocol.c:2443
       process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
       worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
       kthread+0x405/0x4f0 kernel/kthread.c:327
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      
      __mptcp_mem_reclaim_partial() could call __mptcp_rmem_reclaim() with a
      negative value, which passed that negative value to
      __sk_mem_reduce_allocated() and triggered the splat above.
      
      Check for a reclaim amount that is positive and large enough for
      __mptcp_rmem_reclaim() to actually adjust rmem_fwd_alloc (much like
      the sk_mem_reclaim_partial() code the function is based on).
      
      v2: Use '>' instead of '>=', since SK_MEM_QUANTUM - 1 would get
      right-shifted into nothing by __mptcp_rmem_reclaim.
      
      Fixes: 6511882c ("mptcp: allocate fwd memory separately on the rx and tx path")
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/252
      
      
      Reported-and-tested-by: <syzbot+bc9e2d2dbcb347dd215a@syzkaller.appspotmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
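The guard described above, including the '>' versus '>=' point from the v2 note, can be sketched in a few lines of userspace C. The 4096-byte quantum, the function name and the "reclaimable - 1" amount are assumptions for this sketch (the last one inferred from the v2 note); this is not the code in net/mptcp/protocol.c.

```c
#include <assert.h>

/* Assumed quantum of 4096 bytes; the real SK_MEM_QUANTUM may differ. */
#define QUANTUM_SHIFT_SKETCH 12
#define QUANTUM_SKETCH (1 << QUANTUM_SHIFT_SKETCH)

/*
 * Return how many whole quanta would be handed back to the memory
 * accounting layer for a given reclaimable byte count. Negative or
 * too-small amounts are filtered out before any shifting happens.
 */
int reclaim_quanta_sketch(int reclaimable)
{
    int quanta;

    if (reclaimable <= QUANTUM_SKETCH)
        return 0; /* nothing to hand back, so skip the call entirely */

    quanta = (reclaimable - 1) >> QUANTUM_SHIFT_SKETCH;
    assert(quanta >= 1); /* guaranteed by the strict '>' comparison */
    return quanta;
}
```

With '>=' in place of '>', a reclaimable amount of exactly one quantum would pass the check, yet (QUANTUM_SKETCH - 1) shifted right by 12 is 0, so the call would do nothing useful.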
    • mptcp: fix a DSS option writing error · 110b6d1f
      Geliang Tang authored
      'ptr += 1;' was omitted in the original code.
      
      If the DSS is the last option -- which is what we have most of the
      time -- that's not an issue. But it is if we need to send something else
      after it, like an RM_ADDR or an MP_PRIO.
      
      Fixes: 1bff1e43 ("mptcp: optimize out option generation")
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Geliang Tang <geliang.tang@suse.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
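The effect of a missing pointer advance can be sketched in plain C. This is not the kernel's option-writing code: `put_word_sketch()`, `write_two_options_sketch()` and the word values are hypothetical, and real MPTCP options have more structure than a single 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Store one 32-bit option word and return the position for the next one. */
uint32_t *put_word_sketch(uint32_t *ptr, uint32_t word)
{
    assert(ptr);
    *ptr = word;
    ptr += 1; /* without this step the next option overwrites this word */
    return ptr;
}

/* Write a first option word followed by a second option word. */
void write_two_options_sketch(uint32_t *buf, uint32_t first, uint32_t second)
{
    uint32_t *ptr = buf;

    ptr = put_word_sketch(ptr, first);
    put_word_sketch(ptr, second);
}
```

If only one option is written, the stale pointer is never used again, which matches the commit's remark that the bug is harmless when the DSS is the last option.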
    • mptcp: fix opt size when sending DSS + MP_FAIL · 04fac2ca
      Matthieu Baerts authored
      When these two options had to be sent -- which is not common -- the DSS
      size was not being taken into account in the remaining size.
      
      Additionally, in this situation the reported size was only that of
      the MP_FAIL, which can cause an issue if, in the end, we need to write
      more in the TCP options than previously reported.
      
      Here we use a dedicated variable for MP_FAIL size to keep the
      WARN_ON_ONCE() just after.
      
      Fixes: c25aeb4e ("mptcp: MP_FAIL suboption sending")
      Acked-and-tested-by: Geliang Tang <geliang.tang@suse.com>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2022-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 14676c04
      David S. Miller authored
      
      
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2022-01-06
      
      This series provides bug fixes to mlx5 driver.
      Please pull and let me know if there is any problem.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 29507144
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Refcount leak in ipt_CLUSTERIP rule loading path, from Xin Xiong.
      
      2) Use socat in netfilter selftests, from Hangbin Liu.
      
      3) Skip layer 4 checksum update for IP fragments.
      
      4) Missing allocation of pcpu scratch maps on clone in
         nft_set_pipapo, from Florian Westphal.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
        netfilter: nft_set_pipapo: allocate pcpu scratch maps on clone
        netfilter: nft_payload: do not update layer 4 checksum when mangling fragments
        selftests: netfilter: switch to socat for tests using -q option
        netfilter: ipt_CLUSTERIP: fix refcount leak in clusterip_tg_check()
      ====================
      
      Link: https://lore.kernel.org/r/20220106215139.170824-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "net/mlx5: Add retry mechanism to the command entry index allocation" · 4f6626b0
      Moshe Shemesh authored
      This reverts commit 410bd754.
      
      The reverted commit had added a retry mechanism to the command entry
      index allocation. The previous patch ensures that there is a free
      command entry index once the command work handler holds the command
      semaphore. Thus the retry mechanism is not needed.
      
      Fixes: 410bd754 ("net/mlx5: Add retry mechanism to the command entry index allocation")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Set command entry semaphore up once got index free · 8e715cd6
      Moshe Shemesh authored
      Avoid a race where the command work handler may fail to allocate a command
      entry index, by holding the command semaphore down until the command entry
      index is freed.
      
      Fixes: 410bd754 ("net/mlx5: Add retry mechanism to the command entry index allocation")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Sync VXLAN udp ports during uplink representor profile change · 07f6dc40
      Maor Dickman authored
      Currently, when the NIC profile is disabled, all VXLAN UDP ports offloaded to
      the HW are flushed, and when it is enabled the driver notifies the stack
      that the entire UDP tunnel port state has been lost. The uplink representor
      does not behave the same way, which can leave the VXLAN UDP port offload in
      a bad state when moving between modes while a VXLAN interface exists.
      
      Fixed by aligning the uplink representor profile behavior to the NIC behavior.
      
      Fixes: 84db6612 ("net/mlx5e: Move set vxlan nic info to profile init")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix access to sf_dev_table on allocation failure · a1c7c49c
      Shay Drory authored
      Even when SF devices are supported, the SF device table allocation
      can still fail.
      In such case mlx5_sf_dev_supported still reports true, but SF device
      table is invalid. This can result in NULL table access.
      
      Hence, fix it by adding NULL table check.
      
      Fixes: 1958fc2f ("net/mlx5: SF, Add auxiliary device driver")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix matching on modified inner ip_ecn bits · b6dfff21
      Paul Blakey authored
      Tunnel device follows RFC 6040, and during decapsulation inner
      ip_ecn might change depending on inner and outer ip_ecn as follows:
      
       +---------+----------------------------------------+
       |Arriving |         Arriving Outer Header          |
       |   Inner +---------+---------+---------+----------+
       |  Header | Not-ECT | ECT(0)  | ECT(1)  |   CE     |
       +---------+---------+---------+---------+----------+
       | Not-ECT | Not-ECT | Not-ECT | Not-ECT | <drop>   |
       |  ECT(0) |  ECT(0) | ECT(0)  | ECT(1)  |   CE*    |
       |  ECT(1) |  ECT(1) | ECT(1)  | ECT(1)* |   CE*    |
       |    CE   |   CE    |  CE     | CE      |   CE     |
       +---------+---------+---------+---------+----------+
      
      Cells marked above are changed from original inner packet ip_ecn value.
      
      Tc then matches on the modified inner ip_ecn, but hw offload, which
      matches the inner ip_ecn value before decap, will fail.
      
      Fix that by mapping all the cases of outer and inner ip_ecn matching,
      and only supporting cases where we know the inner value wouldn't be changed
      by decap, or, in the outer ip_ecn=CE case, where inner ip_ecn doesn't matter.
      
      Fixes: bcef735c ("net/mlx5e: Offload TC matching on tos/ttl for ip tunnels")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
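The decapsulation table quoted in this commit message can be written as a small lookup, which makes it easy to see which inner/outer combinations leave the inner value untouched. This is only a sketch: the enum, table and function names are hypothetical, the enum order follows the table's rows and columns rather than the on-wire codepoints, and "changed" is computed by comparing values rather than read off the asterisks.

```c
#include <assert.h>

/* Order follows the table above, not the on-wire ECN codepoints. */
enum ecn_sketch { NOT_ECT, ECT0, ECT1, CE, ECN_DROP = -1 };

/* Rows: arriving inner header. Columns: arriving outer header. */
static const int decap_table_sketch[4][4] = {
    /* outer:       Not-ECT  ECT(0)   ECT(1)   CE        */
    /* Not-ECT */ { NOT_ECT, NOT_ECT, NOT_ECT, ECN_DROP },
    /* ECT(0)  */ { ECT0,    ECT0,    ECT1,    CE       },
    /* ECT(1)  */ { ECT1,    ECT1,    ECT1,    CE       },
    /* CE      */ { CE,      CE,      CE,      CE       },
};

/* Resulting inner ECN after decapsulation, or ECN_DROP. */
int ecn_decap_sketch(int inner, int outer)
{
    assert(inner >= NOT_ECT && inner <= CE);
    assert(outer >= NOT_ECT && outer <= CE);
    return decap_table_sketch[inner][outer];
}

/* Non-zero when decap would alter (or drop) the inner value. */
int inner_ecn_changed_sketch(int inner, int outer)
{
    return ecn_decap_sketch(inner, outer) != inner;
}
```

A match on the pre-decap inner value is only equivalent to a match on the post-decap value for combinations where `inner_ecn_changed_sketch()` returns 0.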
    • Revert "net/mlx5e: Block offload of outer header csum for GRE tunnel" · 01c3fd11
      Aya Levin authored
      This reverts commit 54e1217b.
      
      Although the NIC doesn't support offload of outer header CSUM, using
      gso_partial_features allows offloading the tunnel's segmentation. The
      driver relies on the stack CSUM calculation of the outer header. For
      this, NETIF_F_GSO_GRE_CSUM must be a member of the device's features.
      
      Fixes: 54e1217b ("net/mlx5e: Block offload of outer header csum for GRE tunnel")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Revert "net/mlx5e: Block offload of outer header csum for UDP tunnels" · 64050cda
      Aya Levin authored
      This reverts commit 6d6727dd.
      
      Although the NIC doesn't support offload of outer header CSUM, using
      gso_partial_features allows offloading the tunnel's segmentation. The
      driver relies on the stack CSUM calculation of the outer header. For
      this, NETIF_F_GSO_UDP_TUNNEL_CSUM must be a member of the device's
      features.
      
      Fixes: 6d6727dd ("net/mlx5e: Block offload of outer header csum for UDP tunnels")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Don't block routes with nexthop objects in SW · 9e72a55a
      Maor Dickman authored
      Routes with nexthop objects are currently not supported by multipath offload
      and any attempt to use them is blocked. However, this also blocks adding SW
      routes with nexthop objects.
      
      Resolve this by returning NOTIFY_DONE instead of an error, which allows such
      a route to be created in SW but not offloaded.
      
      This fix also solves an issue that blocked adding such routes on different devices
      because of a missing check that the route's FIB device is one of the multipath devices.
      
      Fixes: 6a87afc0 ("mlx5: Fail attempts to use routes with nexthop objects")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix wrong usage of fib_info_nh when routes with nexthop objects are used · 885751eb
      Maor Dickman authored
      Creating routes with nexthop objects while in switchdev mode leads to access to
      unallocated memory and triggers the call trace below by hitting a WARN_ON.
      This is caused by illegal usage of fib_info_nh in TC tunnel FIB event handling to
      resolve the FIB device while the fib_info is built with a nexthop object.
      
      Fixed by ignoring attempts to use nexthop objects with routes until support can be
      properly added.
      
      WARNING: CPU: 1 PID: 1724 at include/net/nexthop.h:468 mlx5e_tc_tun_fib_event+0x448/0x570 [mlx5_core]
      CPU: 1 PID: 1724 Comm: ip Not tainted 5.15.0_for_upstream_min_debug_2021_11_09_02_04 #1
      RIP: 0010:mlx5e_tc_tun_fib_event+0x448/0x570 [mlx5_core]
      RSP: 0018:ffff8881349f7910 EFLAGS: 00010202
      RAX: ffff8881492f1980 RBX: ffff8881349f79e8 RCX: 0000000000000000
      RDX: ffff8881349f79e8 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: ffff8881349f7950 R08: 00000000000000fe R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff88811e9d0000
      R13: ffff88810eb62000 R14: ffff888106710268 R15: 0000000000000018
      FS:  00007f1d5ca6e800(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffedba44ff8 CR3: 0000000129808004 CR4: 0000000000370ea0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       atomic_notifier_call_chain+0x42/0x60
       call_fib_notifiers+0x21/0x40
       fib_table_insert+0x479/0x6d0
       ? try_charge_memcg+0x480/0x6d0
       inet_rtm_newroute+0x65/0xb0
       rtnetlink_rcv_msg+0x2af/0x360
       ? page_add_file_rmap+0x13/0x130
       ? do_set_pte+0xcd/0x120
       ? rtnl_calcit.isra.0+0x120/0x120
       netlink_rcv_skb+0x4e/0xf0
       netlink_unicast+0x1ee/0x2b0
       netlink_sendmsg+0x22e/0x460
       sock_sendmsg+0x33/0x40
       ____sys_sendmsg+0x1d1/0x1f0
       ___sys_sendmsg+0xab/0xf0
       ? __mod_memcg_lruvec_state+0x40/0x60
       ? __mod_lruvec_page_state+0x95/0xd0
       ? page_add_new_anon_rmap+0x4e/0xf0
       ? __handle_mm_fault+0xec6/0x1470
       __sys_sendmsg+0x51/0x90
       ? internal_get_user_pages_fast+0x480/0xa10
       do_syscall_64+0x3d/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 8914add2 ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix nullptr on deleting mirroring rule · de31854e
      Dima Chumak authored
      Deleting a Tc rule with multiple outputs, one of which is an internal port,
      like this one:
      
        tc filter del dev enp8s0f0_0 ingress protocol ip pref 5 flower \
            dst_mac 0c:42:a1:d1:d0:88 \
            src_mac e4:ea:09:08:00:02 \
            action tunnel_key  set \
                src_ip 0.0.0.0 \
                dst_ip 7.7.7.8 \
                id 8 \
                dst_port 4789 \
            action mirred egress mirror dev vxlan_sys_4789 pipe \
            action mirred egress redirect dev enp8s0f0_1
      
      Triggers a call trace:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000230
        RIP: 0010:del_sw_hw_rule+0x2b/0x1f0 [mlx5_core]
        Call Trace:
         tree_remove_node+0x16/0x30 [mlx5_core]
         mlx5_del_flow_rules+0x51/0x160 [mlx5_core]
         __mlx5_eswitch_del_rule+0x4b/0x170 [mlx5_core]
         mlx5e_tc_del_fdb_flow+0x295/0x550 [mlx5_core]
         mlx5e_flow_put+0x1f/0x70 [mlx5_core]
         mlx5e_delete_flower+0x286/0x390 [mlx5_core]
         tc_setup_cb_destroy+0xac/0x170
         fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
         __fl_delete+0x15e/0x170 [cls_flower]
         fl_delete+0x36/0x80 [cls_flower]
         tc_del_tfilter+0x3a6/0x6e0
         rtnetlink_rcv_msg+0xe5/0x360
         ? rtnl_calcit.isra.0+0x110/0x110
         netlink_rcv_skb+0x46/0x110
         netlink_unicast+0x16b/0x200
         netlink_sendmsg+0x202/0x3d0
         sock_sendmsg+0x33/0x40
         ____sys_sendmsg+0x1c3/0x200
         ? copy_msghdr_from_user+0xd6/0x150
         ___sys_sendmsg+0x88/0xd0
         ? ___sys_recvmsg+0x88/0xc0
         ? do_futex+0x10c/0x460
         __sys_sendmsg+0x59/0xa0
         do_syscall_64+0x48/0x140
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fix by disabling offloading for flows matching
      esw_is_chain_src_port_rewrite() which have more than one output.
      
      Fixes: 10742efc ("net/mlx5e: VF tunnel TX traffic offloading")
      Signed-off-by: Dima Chumak <dchumak@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix page DMA map/unmap attributes · 0b7cfa40
      Aya Levin authored
      The driver initiates DMA sync itself, hence it may skip CPU sync. Add
      DMA_ATTR_SKIP_CPU_SYNC as an input attribute to both dma_map_page and
      dma_unmap_page to avoid a redundant sync with the CPU.
      When forcing the device to work with SWIOTLB, the extra sync might cause
      data corruption. The driver unmaps the whole page while the hardware
      used just a part of the bounce buffer. So syncing overwrites the entire
      page with a bounce buffer that only partially contains real data.
      
      Fixes: bc77b240 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
      Fixes: db05815b ("net/mlx5e: Add XSK zero-copy support")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  4. Jan 06, 2022
    • net/smc: Reset conn->lgr when link group registration fails · 36595d8a
      Wen Gu authored
      SMC connections might fail to be registered in a link group because
      no usable link can be found during its creation. As a result,
      smc_conn_create() will return a failure and most resources related
      to the connection won't be applied or initialized, such as
      conn->abort_work or conn->lnk.
      
      If smc_conn_free() is invoked later, it will try to access the
      uninitialized resources related to the connection, thus causing
      a warning or crash.
      
      This patch tries to fix this by resetting conn->lgr to NULL if an
      abnormal exit occurs in smc_lgr_register_conn(), thus avoiding the
      access to uninitialized resources in smc_conn_free().
      
      Meanwhile, the newly created link group should be terminated if SMC
      connections can't be registered in it. So smc_lgr_cleanup_early() is
      modified to take care of the link group only and is invoked by
      smc_conn_create() to terminate an unusable link group. The call to
      smc_conn_free() is moved out of smc_lgr_cleanup_early() to smc_conn_abort().
      
      Fixes: 56bc3b20 ("net/smc: assign link to a new connection")
      Suggested-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
      Acked-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman: Check for null pointer after calling devm_ioremap · d5a73ec9
      蒋家盛 authored
      Because the allocation can fail, devm_ioremap() may return a NULL
      pointer.
      Take tgec_initialization() as an example.
      If the allocation fails, params->base_addr will be a NULL pointer and will
      be assigned to tgec->regs in tgec_config().
      That then causes a NULL pointer dereference in set_mac_address(),
      which is called by tgec_init().
      Therefore, it is better to add a sanity check after the call
      to devm_ioremap().
      
      Fixes: 39339616 ("fsl/fman: Add FMan MAC driver")
      Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rocker: fix a sleeping in atomic bug · 43d01212
      Dan Carpenter authored
      This code is holding the &ofdpa->flow_tbl_lock spinlock so it is not
      allowed to sleep.  That means we have to pass the OFDPA_OP_FLAG_NOWAIT
      flag to ofdpa_flow_tbl_del().
      
      Fixes: 936bd486 ("rocker: use FIB notifications instead of switchdev calls")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ppp: ensure minimum packet size in ppp_write() · 44073187
      Eric Dumazet authored
      It seems pretty clear ppp layer assumed user space
      would always be kind to provide enough data
      in their write() to a ppp device.
      
      This patch makes sure user provides at least
      2 bytes.
      
      It adds PPP_PROTO_LEN macro that could replace
      in net-next many occurrences of hard-coded 2 value.
      
      I replaced only one occurrence to ease backports
      to stable kernels.
      
      The bug manifests in the following report:
      
      BUG: KMSAN: uninit-value in ppp_send_frame+0x28d/0x27c0 drivers/net/ppp/ppp_generic.c:1740
       ppp_send_frame+0x28d/0x27c0 drivers/net/ppp/ppp_generic.c:1740
       __ppp_xmit_process+0x23e/0x4b0 drivers/net/ppp/ppp_generic.c:1640
       ppp_xmit_process+0x1fe/0x480 drivers/net/ppp/ppp_generic.c:1661
       ppp_write+0x5cb/0x5e0 drivers/net/ppp/ppp_generic.c:513
       do_iter_write+0xb0c/0x1500 fs/read_write.c:853
       vfs_writev fs/read_write.c:924 [inline]
       do_writev+0x645/0xe00 fs/read_write.c:967
       __do_sys_writev fs/read_write.c:1040 [inline]
       __se_sys_writev fs/read_write.c:1037 [inline]
       __x64_sys_writev+0xe5/0x120 fs/read_write.c:1037
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Uninit was created at:
       slab_post_alloc_hook mm/slab.h:524 [inline]
       slab_alloc_node mm/slub.c:3251 [inline]
       __kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4974
       kmalloc_reserve net/core/skbuff.c:354 [inline]
       __alloc_skb+0x545/0xf90 net/core/skbuff.c:426
       alloc_skb include/linux/skbuff.h:1126 [inline]
       ppp_write+0x11d/0x5e0 drivers/net/ppp/ppp_generic.c:501
       do_iter_write+0xb0c/0x1500 fs/read_write.c:853
       vfs_writev fs/read_write.c:924 [inline]
       do_writev+0x645/0xe00 fs/read_write.c:967
       __do_sys_writev fs/read_write.c:1040 [inline]
       __se_sys_writev fs/read_write.c:1037 [inline]
       __x64_sys_writev+0xe5/0x120 fs/read_write.c:1037
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linux-ppp@vger.kernel.org
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
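The length check this commit describes can be sketched in userspace as follows. PPP_PROTO_LEN is the macro named in the commit message, with the value 2 taken from the text; the function name, the error value and the big-endian read of the protocol field are assumptions made for this sketch, not a copy of ppp_write().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PPP_PROTO_LEN 2

/*
 * Return the 16-bit protocol number at the start of a written frame,
 * or a negative error when the caller supplied fewer than
 * PPP_PROTO_LEN bytes. The error code used here is an assumption.
 */
int ppp_proto_sketch(const unsigned char *buf, size_t count)
{
    assert(buf);
    if (count < PPP_PROTO_LEN)
        return -EINVAL;
    /* Both bytes are known to be present once the check has passed. */
    return (buf[0] << 8) | buf[1];
}
```

Without the check, a one-byte write would leave the second protocol byte unread from the caller and uninitialized in the buffer, which is the kind of access the KMSAN report above flags.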
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · c4251db3
      David S. Miller authored
      
      
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2022-01-06
      
      1) Fix xfrm policy lookups for ipv6 gre packets by initializing
         fl6_gre_key properly. From Ghalem Boudour.
      
      2) Fix the dflt policy check on forwarding when there is no
         policy configured. The check was done for the wrong direction.
         From Nicolas Dichtel.
      
      3) Use the correct 'struct xfrm_user_offload' when calculating
         netlink message lengths in xfrm_sa_len(). From Eric Dumazet.
      
      4) Treat inserting xfrm interface id 0 as an error.
         From Antony Antony.
      
      5) Fail if xfrm state or policy is inserted with XFRMA_IF_ID 0,
         xfrm interfaces with id 0 are not allowed.
         From Antony Antony.
      
      6) Fix inner_ipproto setting in the sec_path for tunnel mode.
         From Raed Salem.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4251db3
    • Florian Westphal's avatar
      netfilter: nft_set_pipapo: allocate pcpu scratch maps on clone · 23c54263
      Florian Westphal authored
      This is needed in case a new transaction is made that doesn't insert any
      new elements into an already existing set.
      
      Otherwise, after a second 'nft -f ruleset.txt', lookups in such a set will fail
      because ->lookup() encounters raw_cpu_ptr(m->scratch) == NULL.
      
      For the initial rule load, insertion of elements takes care of the
      allocation, but for rule reloads this isn't guaranteed: we might not
      have additions to the set.
      
      Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges")
      Reported-by: etkaar <lists.netfilter.org@prvy.eu>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      23c54263
    • Pablo Neira Ayuso's avatar
      netfilter: nft_payload: do not update layer 4 checksum when mangling fragments · 4e1860a3
      Pablo Neira Ayuso authored
      IP fragments do not come with the transport header, hence skip bogus
      layer 4 checksum updates.
      
      Fixes: 18140969 ("netfilter: nft_payload: layer 4 checksum adjustment for pseudoheader fields")
      Reported-and-tested-by: Steffen Weinreich <steve@weinreich.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      4e1860a3
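
The rule in the nft_payload commit above can be sketched in plain C. This is a minimal illustration, not the kernel's code: it assumes an IPv4 packet whose flags/fragment-offset field has already been converted to host byte order, and the names `SKETCH_IP_OFFSET_MASK` and `sketch_has_l4_header()` are hypothetical. A non-zero fragment offset means the packet is a later fragment, so no transport header is present and a layer 4 checksum update would touch payload bytes instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low 13 bits of the IPv4 flags/fragment-offset field hold the offset.
 * Hypothetical name for this sketch. */
#define SKETCH_IP_OFFSET_MASK 0x1FFFu

/* Returns true when the packet can carry a transport header, i.e. when
 * its fragment offset is zero. Later fragments (offset != 0) start in
 * the middle of the layer 4 payload, so a checksum fixup must be skipped. */
bool sketch_has_l4_header(uint16_t frag_off_host)
{
	return (frag_off_host & SKETCH_IP_OFFSET_MASK) == 0;
}
```

A caller would test `sketch_has_l4_header()` before adjusting the layer 4 checksum and leave the checksum alone when it returns false.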
    • Hangbin Liu's avatar
      selftests: netfilter: switch to socat for tests using -q option · 1585f590
      Hangbin Liu authored
      
      
      The nc command (nmap-ncat) that is distributed with Fedora/Red Hat does not
      have the -q option. This makes some tests fail with:
      
      	nc: invalid option -- 'q'
      
      Let's switch to socat which is far more dependable.
      
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      1585f590
    • Jakub Kicinski's avatar
      Merge tag 'linux-can-fixes-for-5.16-20220105' of... · 502a2ce9
      Jakub Kicinski authored
      Merge tag 'linux-can-fixes-for-5.16-20220105' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2022-01-05
      
      It consists of 2 patches, both by me. The first one fixes the use of
      an uninitialized variable in the gs_usb driver; the other one fixes a
      skb_over_panic in the ISOTP stack in case of reception of too large
      ISOTP messages.
      
      * tag 'linux-can-fixes-for-5.16-20220105' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: isotp: convert struct tpcon::{idx,len} to unsigned int
        can: gs_usb: fix use of uninitialized variable, detach device on reception of invalid USB data
      ====================
      
      Link: https://lore.kernel.org/r/20220105205443.1274709-1-mkl@pengutronix.de
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      502a2ce9
    • Linus Torvalds's avatar
      Merge tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 75acfdb6
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes, including fixes from bpf, and WiFi. One last pull
        request, turns out some of the recent fixes did more harm than good.
      
        Current release - regressions:
      
         - Revert "xsk: Do not sleep in poll() when need_wakeup set", made the
           problem worse
      
         - Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in
           __fixed_phy_register", broke EPROBE_DEFER handling
      
         - Revert "net: usb: r8152: Add MAC pass-through support for more
           Lenovo Docks", broke setups without a Lenovo dock
      
        Current release - new code bugs:
      
         - selftests: set amt.sh executable
      
        Previous releases - regressions:
      
         - batman-adv: mcast: don't send link-local multicast to mcast routers
      
        Previous releases - always broken:
      
         - ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY
      
         - sctp: hold endpoint before calling cb in
           sctp_transport_lookup_process
      
         - mac80211: mesh: embed mesh_paths and mpp_paths into
           ieee80211_if_mesh to avoid complicated handling of sub-object
           allocation failures
      
         - seg6: fix traceroute in the presence of SRv6
      
         - tipc: fix a kernel-infoleak in __tipc_sendmsg()"
      
      * tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
        selftests: set amt.sh executable
        Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks"
        sfc: The RX page_ring is optional
        iavf: Fix limit of total number of queues to active queues of VF
        i40e: Fix incorrect netdev's real number of RX/TX queues
        i40e: Fix for displaying message regarding NVM version
        i40e: fix use-after-free in i40e_sync_filters_subtask()
        i40e: Fix to not show opcode msg on unsuccessful VF MAC change
        ieee802154: atusb: fix uninit value in atusb_set_extended_addr
        mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
        mac80211: initialize variable have_higher_than_11mbit
        sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
        netrom: fix copying in user data in nr_setsockopt
        udp6: Use Segment Routing Header for dest address if present
        icmp: ICMPV6: Examine invoking packet for Segment Route Headers.
        seg6: export get_srh() for ICMP handling
        Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register"
        ipv6: Do cleanup if attribute validation fails in multipath route
        ipv6: Continue processing multipath route even if gateway attribute is invalid
        net/fsl: Remove leftover definition in xgmac_mdio
        ...
      75acfdb6
    • Marc Kleine-Budde's avatar
      can: isotp: convert struct tpcon::{idx,len} to unsigned int · 5f33a09e
      Marc Kleine-Budde authored
      In isotp_rcv_ff() 32 bits of data received over the network are assigned
      to struct tpcon::len. Later in that function the length is checked for
      the maximal supported length against MAX_MSG_LENGTH.
      
      As struct tpcon::len is an "int" this check does not work if the
      provided length overflows the "int".
      
      Later on struct tpcon::idx is compared against struct tpcon::len.
      
      To fix this problem this patch converts both struct tpcon::{idx,len}
      to unsigned int.
      
      Fixes: e057dd3f ("can: add ISO 15765-2:2016 transport protocol")
      Link: https://lore.kernel.org/all/20220105132429.1170627-1-mkl@pengutronix.de
      
      
      Cc: stable@vger.kernel.org
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Reported-by: <syzbot+4c63f36709a642f801c5@syzkaller.appspotmail.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      5f33a09e
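
The failure mode described in the isotp commit above can be reproduced in a few lines of userspace C. This is only a sketch under stated assumptions: `SKETCH_MAX_MSG_LENGTH` and both function names are hypothetical, the limit value is chosen for illustration, and the signed variant relies on the usual two's-complement wraparound that GCC and Clang apply when an out-of-range value is converted to `int`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for this sketch; not taken from the kernel source. */
#define SKETCH_MAX_MSG_LENGTH 8200u

/* Signed variant: a 32-bit wire value above INT_MAX is stored as a
 * negative int on typical ABIs, so the upper-bound check passes even
 * though the real length is far too large. */
bool sketch_len_ok_signed(uint32_t wire_len)
{
	int len = (int)wire_len;

	return len <= (int)SKETCH_MAX_MSG_LENGTH;
}

/* Unsigned variant: the value keeps its magnitude, so the same
 * comparison rejects oversized lengths. */
bool sketch_len_ok_unsigned(uint32_t wire_len)
{
	unsigned int len = wire_len;

	return len <= SKETCH_MAX_MSG_LENGTH;
}
```

With a wire value such as 0xFFFFFFFF the signed variant accepts the length while the unsigned variant rejects it, which is the difference the type change is meant to achieve.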
    • Marc Kleine-Budde's avatar
      can: gs_usb: fix use of uninitialized variable, detach device on reception of invalid USB data · 4a8737ff
      Marc Kleine-Budde authored
      The received data contains the channel the received data is associated
      with. If the channel number is bigger than the actual number of
      channels, assume a broken or malicious USB device and shut it down.
      
      This fixes the error found by clang:
      
      | drivers/net/can/usb/gs_usb.c:386:6: error: variable 'dev' is used
      |                                     uninitialized whenever 'if' condition is true
      |         if (hf->channel >= GS_MAX_INTF)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~
      | drivers/net/can/usb/gs_usb.c:474:10: note: uninitialized use occurs here
      |                           hf, dev->gs_hf_size, gs_usb_receive_bulk_callback,
      |                               ^~~
      
      Link: https://lore.kernel.org/all/20211210091158.408326-1-mkl@pengutronix.de
      Fixes: d08e973a ("can: gs_usb: Added support for the GS_USB CAN devices")
      Cc: stable@vger.kernel.org
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      4a8737ff
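
The validation pattern from the gs_usb commit above, checking a device-supplied channel index before using it to look anything up, can be sketched as follows. The types and names (`sketch_chan`, `sketch_parent`, `sketch_lookup_chan()`, `SKETCH_MAX_INTF`) are hypothetical stand-ins and do not mirror the driver's actual structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical channel count for this sketch. */
#define SKETCH_MAX_INTF 2

struct sketch_chan {
	int id;
};

struct sketch_parent {
	struct sketch_chan *chans[SKETCH_MAX_INTF];
};

/* Returns the channel a received frame belongs to, or NULL when the
 * device-supplied index is out of range. The bounds check happens
 * before any dereference, so no pointer is left uninitialized on the
 * error path; the caller is expected to treat NULL as a broken or
 * malicious device and detach it. */
struct sketch_chan *sketch_lookup_chan(struct sketch_parent *parent,
				       uint8_t channel)
{
	if (channel >= SKETCH_MAX_INTF)
		return NULL;

	return parent->chans[channel];
}
```

Returning early on the invalid index keeps the error path from touching per-channel state, which is the kind of use the clang diagnostic quoted above points at.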