  1. Jan 20, 2021
    • tcp: do not mess with cloned skbs in tcp_add_backlog() · b160c285
      Eric Dumazet authored
      Heiner Kallweit reported that some skbs were sent with
      the following invalid GSO properties :
      - gso_size > 0
      - gso_type == 0
      
      This was triggering a WARN_ON_ONCE() in rtl8169_tso_csum_v2.
      
      Juerg Haefliger was able to reproduce a similar issue using
      a lan78xx NIC and a workload mixing TCP incoming traffic
      and forwarded packets.
      
      The problem is that tcp_add_backlog() is writing
      over gso_segs and gso_size even if the incoming packet will not
      be coalesced to the backlog tail packet.
      
      While skb_try_coalesce() would bail out if tail packet is cloned,
      this overwriting would lead to corruptions of other packets
      cooked by lan78xx, sharing a common super-packet.
      
      The strategy used by lan78xx is to use a big skb, and split
      it into all received packets using skb_clone() to avoid copies.
      The drawback of this strategy is that all the small skb share a common
      struct skb_shared_info.
      
      This patch rewrites TCP gso_size/gso_segs handling to only
      happen on the tail skb, since skb_try_coalesce() made sure
      it was not cloned.
      
      Fixes: 4f693b55 ("tcp: implement coalescing on backlog queue")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Bisected-by: Juerg Haefliger <juergh@canonical.com>
      Tested-by: Juerg Haefliger <juergh@canonical.com>
      Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=209423
      Link: https://lore.kernel.org/r/20210119164900.766957-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b160c285
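A toy model of the sharing problem described in the tcp_add_backlog() commit above, written as a minimal Python sketch rather than kernel code. The names `SharedInfo`, `Skb` and `clone()` are hypothetical stand-ins for struct skb_shared_info, struct sk_buff and skb_clone(), and the value 1448 is an arbitrary example segment size. The sketch only shows why GSO fields written through one clone become visible to every other clone of the same super-packet.

```python
class SharedInfo:
    """Stands in for struct skb_shared_info (GSO fields only)."""

    def __init__(self):
        self.gso_size = 0
        self.gso_segs = 0
        self.gso_type = 0


class Skb:
    """Stands in for struct sk_buff; clones share one SharedInfo."""

    def __init__(self, shinfo=None):
        self.shinfo = shinfo if shinfo is not None else SharedInfo()
        self.cloned = False

    def clone(self):
        # Like skb_clone(): a new header that points at the same shared info.
        self.cloned = True
        other = Skb(self.shinfo)
        other.cloned = True
        return other


# A driver splits one big receive buffer into per-packet clones.
super_packet = Skb()
tcp_skb = super_packet.clone()        # delivered to the local TCP stack
forwarded_skb = super_packet.clone()  # forwarded to another interface

# Simplified old behaviour: GSO fields are written on the incoming skb
# before knowing whether it will be coalesced (1448 is an example value).
tcp_skb.shinfo.gso_size = 1448
tcp_skb.shinfo.gso_segs = 1

# The forwarded packet now shows gso_size > 0 with gso_type == 0.
corrupted = (forwarded_skb.shinfo.gso_size, forwarded_skb.shinfo.gso_type)
```

In this model, restricting the writes to an skb whose `cloned` flag is false (the backlog tail after a successful coalesce) avoids touching shared state.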
    • selftests: net: fib_tests: remove duplicate log test · fd23d2dc
      Hangbin Liu authored
      
      
      The previous test added an address with a specified metric and checked
      that the corresponding route was created. I somehow added two logs for
      the same test. Remove the duplicated one.
      
      Reported-by: Antoine Tenart <atenart@redhat.com>
      Fixes: 0d29169a ("selftests/net/fib_tests: update addr_metric_test for peer route testing")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20210119025930.2810532-1-liuhangbin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fd23d2dc
    • net: nfc: nci: fix the wrong NCI_CORE_INIT parameters · 4964e5a1
      Bongsu Jeon authored
      Fix the code because NCI_CORE_INIT_CMD includes two parameters in NCI2.0
      but there are no parameters in NCI1.x.
      
      Fixes: bcd684aa ("net/nfc/nci: Support NCI 2.x initial sequence")
      Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
      Link: https://lore.kernel.org/r/20210118205522.317087-1-bongsu.jeon@samsung.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4964e5a1
    • sh_eth: Fix power down vs. is_opened flag ordering · f6a2e94b
      Geert Uytterhoeven authored
      sh_eth_close() does a synchronous power down of the device before
      marking it closed.  Reverse the order, to make sure the device is never
      marked opened while suspended.
      
      While at it, use pm_runtime_put() instead of pm_runtime_put_sync(), as
      there is no reason to do a synchronous power down.
      
      Fixes: 7fa2955f ("sh_eth: Fix sleeping function called from invalid context")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
      Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Link: https://lore.kernel.org/r/20210118150812.796791-1-geert+renesas@glider.be
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f6a2e94b
    • net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled · a3eb4e9d
      Tariq Toukan authored
      With NETIF_F_HW_TLS_RX, packets are decrypted in HW. Logically, this
      cannot be done when RXCSUM offload is off.
      
      Fixes: 14136564 ("net: Add TLS RX offload feature")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Boris Pismenny <borisp@nvidia.com>
      Link: https://lore.kernel.org/r/20210117151538.9411-1-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a3eb4e9d
    • Merge branch 'ipv4-ensure-ecn-bits-don-t-influence-source-address-validation' · 2565ff4e
      Jakub Kicinski authored
      Guillaume Nault says:
      
      ====================
      ipv4: Ensure ECN bits don't influence source address validation
      
      Functions that end up calling fib_table_lookup() should clear the ECN
      bits from the TOS, otherwise ECT(0) and ECT(1) packets can be treated
      differently.
      
      Most functions already clear the ECN bits, but there are a few cases
      where this is not done. This series only fixes the ones related to
      source address validation.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1610790904.git.gnault@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2565ff4e
    • netfilter: rpfilter: mask ecn bits before fib lookup · 2e5a6266
      Guillaume Nault authored
      RT_TOS() only masks one of the two ECN bits. Therefore rpfilter_mt()
      treats Not-ECT or ECT(1) packets in a different way than those with
      ECT(0) or CE.
      
      Reproducer:
      
        Create two netns, connected with a veth:
        $ ip netns add ns0
        $ ip netns add ns1
        $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
        $ ip -netns ns0 link set dev veth01 up
        $ ip -netns ns1 link set dev veth10 up
        $ ip -netns ns0 address add 192.0.2.10/32 dev veth01
        $ ip -netns ns1 address add 192.0.2.11/32 dev veth10
      
        Add a route to ns1 in ns0:
        $ ip -netns ns0 route add 192.0.2.11/32 dev veth01
      
        In ns1, only packets with TOS 4 can be routed to ns0:
        $ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10
      
        Ping from ns0 to ns1 works regardless of the ECN bits, as long as TOS
        is 4:
        $ ip netns exec ns0 ping -Q 4 192.0.2.11   # TOS 4, Not-ECT
          ... 0% packet loss ...
        $ ip netns exec ns0 ping -Q 5 192.0.2.11   # TOS 4, ECT(1)
          ... 0% packet loss ...
        $ ip netns exec ns0 ping -Q 6 192.0.2.11   # TOS 4, ECT(0)
          ... 0% packet loss ...
        $ ip netns exec ns0 ping -Q 7 192.0.2.11   # TOS 4, CE
          ... 0% packet loss ...
      
        Now use the iptables rpfilter module in ns1:
        $ ip netns exec ns1 iptables-legacy -t raw -A PREROUTING -m rpfilter --invert -j DROP
      
        Not-ECT and ECT(1) packets still pass:
        $ ip netns exec ns0 ping -Q 4 192.0.2.11   # TOS 4, Not-ECT
          ... 0% packet loss ...
        $ ip netns exec ns0 ping -Q 5 192.0.2.11   # TOS 4, ECT(1)
          ... 0% packet loss ...
      
        But ECT(0) and CE packets are dropped:
        $ ip netns exec ns0 ping -Q 6 192.0.2.11   # TOS 4, ECT(0)
          ... 100% packet loss ...
        $ ip netns exec ns0 ping -Q 7 192.0.2.11   # TOS 4, CE
          ... 100% packet loss ...
      
      After this patch, rpfilter doesn't drop ECT(0) and CE packets anymore.
      
      Fixes: 8f97339d ("netfilter: add ipv4 reverse path filter match")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2e5a6266
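The rpfilter results above can be reproduced with plain bit arithmetic. This sketch assumes the mask values found in the kernel headers (IPTOS_TOS_MASK = 0x1E behind RT_TOS(), and IPTOS_RT_MASK = IPTOS_TOS_MASK & ~3). The exact-match `lookup_matches()` helper is a hypothetical simplification of the FIB lookup against the `tos 4` route from the reproducer.

```python
IPTOS_TOS_MASK = 0x1E                    # mask applied by RT_TOS()
IPTOS_RT_MASK = IPTOS_TOS_MASK & ~0x03   # additionally clears both ECN bits

# ECN codepoints live in the two low-order bits of the TOS byte.
ECN_NAMES = {0: "Not-ECT", 1: "ECT(1)", 2: "ECT(0)", 3: "CE"}

ROUTE_TOS = 4  # the "tos 4" route added in ns1 by the reproducer


def rt_tos(tos):
    """Key built with RT_TOS(): bit 1 (one of the ECN bits) survives."""
    return tos & IPTOS_TOS_MASK


def masked_tos(tos):
    """Key built with IPTOS_RT_MASK: both ECN bits are cleared."""
    return tos & IPTOS_RT_MASK


def lookup_matches(tos_key):
    # Hypothetical stand-in for the FIB lookup: exact match on the route TOS.
    return tos_key == ROUTE_TOS


# For each ping -Q value: (matches with RT_TOS, matches with IPTOS_RT_MASK)
results = {
    ECN_NAMES[tos & 0x03]: (lookup_matches(rt_tos(tos)),
                            lookup_matches(masked_tos(tos)))
    for tos in (4, 5, 6, 7)
}
```

With RT_TOS(), TOS 6 and 7 keep bit 1 set and miss the `tos 4` route, which matches the 100% packet loss in the reproducer. With IPTOS_RT_MASK all four values collapse to 4.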
    • udp: mask TOS bits in udp_v4_early_demux() · 8d2b51b0
      Guillaume Nault authored
      udp_v4_early_demux() is the only function that calls
      ip_mc_validate_source() with a TOS that hasn't been masked with
      IPTOS_RT_MASK.
      
      This results in different behaviours for incoming multicast UDPv4
      packets, depending on if ip_mc_validate_source() is called from the
      early-demux path (udp_v4_early_demux) or from the regular input path
      (ip_route_input_noref).
      
      ECN would normally not be used with UDP multicast packets, so the
      practical consequences should be limited on that side. However,
      IPTOS_RT_MASK is used to also mask the TOS' high order bits, to align
      with the non-early-demux path behaviour.
      
      Reproducer:
      
        Set up two netns, connected with veth:
        $ ip netns add ns0
        $ ip netns add ns1
        $ ip -netns ns0 link set dev lo up
        $ ip -netns ns1 link set dev lo up
        $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
        $ ip -netns ns0 link set dev veth01 up
        $ ip -netns ns1 link set dev veth10 up
        $ ip -netns ns0 address add 192.0.2.10 peer 192.0.2.11/32 dev veth01
        $ ip -netns ns1 address add 192.0.2.11 peer 192.0.2.10/32 dev veth10
      
        In ns0, add route to multicast address 224.0.2.0/24 using source
        address 198.51.100.10:
        $ ip -netns ns0 address add 198.51.100.10/32 dev lo
        $ ip -netns ns0 route add 224.0.2.0/24 dev veth01 src 198.51.100.10
      
        In ns1, define route to 198.51.100.10, only for packets with TOS 4:
        $ ip -netns ns1 route add 198.51.100.10/32 tos 4 dev veth10
      
        Also activate rp_filter in ns1, so that incoming packets not matching
        the above route get dropped:
        $ ip netns exec ns1 sysctl -wq net.ipv4.conf.veth10.rp_filter=1
      
        Now try to receive packets on 224.0.2.11:
        $ ip netns exec ns1 socat UDP-RECVFROM:1111,ip-add-membership=224.0.2.11:veth10,ignoreeof -
      
        In ns0, send packet to 224.0.2.11 with TOS 4 and ECT(0) (that is,
        tos 6 for socat):
        $ echo test0 | ip netns exec ns0 socat - UDP-DATAGRAM:224.0.2.11:1111,bind=:1111,tos=6
      
        The "test0" message is properly received by socat in ns1, because
        early-demux has no cached dst to use, so source address validation
        is done by ip_route_input_mc(), which receives a TOS that has the
        ECN bits masked.
      
        Now send another packet to 224.0.2.11, still with TOS 4 and ECT(0):
        $ echo test1 | ip netns exec ns0 socat - UDP-DATAGRAM:224.0.2.11:1111,bind=:1111,tos=6
      
        The "test1" message isn't received by socat in ns1, because, now,
        early-demux has a cached dst to use and calls ip_mc_validate_source()
        immediately, without masking the ECN bits.
      
      Fixes: bc044e8d ("udp: perform source validation for mcast early demux")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8d2b51b0
    • Merge branch 'sh_eth-fix-reboot-crash' · f7b9820d
      Jakub Kicinski authored
      Geert Uytterhoeven says:
      
      ====================
      sh_eth: Fix reboot crash
      
      This patch fixes a regression in v5.11-rc1, where rebooting while a
      sh_eth device is not opened will cause a crash.
      
      Changes compared to v1:
        - Export mdiobb_{read,write}(),
        - Call mdiobb_{read,write}() now they are exported,
        - Use mii_bus.parent to avoid bb_info.dev copy,
        - Drop RFC state.
      
      Alternatively, mdio-bitbang could provide Runtime PM-aware wrappers
      itself, and use them either manually (through a new parameter to
      alloc_mdio_bitbang(), or a new alloc_mdio_bitbang_*() function), or
      automatically (e.g. if pm_runtime_enabled() returns true).  Note that
      the latter requires a "struct device *" parameter to operate on.
      Currently there are only two drivers that call alloc_mdio_bitbang() and
      use Runtime PM: the Renesas sh_eth and ravb drivers.  This series fixes
      the former, while the latter is not affected (it keeps the device
      powered all the time between driver probe and driver unbind, and
      changing that seems to be non-trivial).
      ====================
      
      Link: https://lore.kernel.org/r/20210118150656.796584-1-geert+renesas@glider.be
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f7b9820d
    • sh_eth: Make PHY access aware of Runtime PM to fix reboot crash · 02cae02a
      Geert Uytterhoeven authored
      Wolfram reports that his R-Car H2-based Lager board can no longer be
      rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
      The issue can be reproduced on other boards (e.g. Koelsch with R-Car
      M2-W) too, if CONFIG_IP_PNP is disabled, and the Ethernet interface is
      down at reboot time:
      
          Unhandled fault: imprecise external abort (0x1406) at 0x00000000
          pgd = (ptrval)
          [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
          Internal error: : 1406 [#1] ARM
          Modules linked in:
          CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
          Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
          PC is at sh_mdio_ctrl+0x44/0x60
          LR is at sh_mmd_ctrl+0x20/0x24
          ...
          Backtrace:
          [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
           r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
          [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
          [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
           r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
          [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
           r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
          [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
           r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
          [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
           r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
          [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
          [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
           r5:c229f800 r4:c229f800
          [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
           r5:c229f800 r4:c229f804
          [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
          [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
           r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
          [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
          [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
           r5:4321fedc r4:01234567
          [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
           r7:00000058 r6:00000000 r5:00000000 r4:00000000
          [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)
      
      As of commit e2f016cf ("net: phy: add a shutdown procedure"),
      system reboot calls phy_disable_interrupts() during shutdown.  As this
      happens unconditionally, the PHY registers may be accessed while the
      device is suspended, causing undefined behavior, which may crash the
      system.
      
      Fix this by wrapping the PHY bitbang accessors in the sh_eth driver with
      helpers that take care of Runtime PM, to resume the device when needed.
      
      Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      02cae02a
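The Runtime PM wrapping described in the sh_eth commit above, sketched as a toy model. `Device`, `BitbangBus`, `PmAwareBus` and their methods are hypothetical. `pm_get_sync()` and `pm_put()` merely stand in for the kernel's runtime PM get/put calls and do not model their real semantics.

```python
class Device:
    """Toy device with a runtime PM usage counter."""

    def __init__(self):
        self.usage = 0

    @property
    def powered(self):
        return self.usage > 0

    def pm_get_sync(self):
        # Stand-in for resuming the device and taking a usage reference.
        self.usage += 1

    def pm_put(self):
        # Stand-in for dropping the reference so the device may suspend.
        self.usage -= 1


class BitbangBus:
    """Raw accessor: fails if the device is suspended."""

    def __init__(self, dev):
        self.dev = dev

    def read(self, phy, reg):
        if not self.dev.powered:
            raise RuntimeError("PHY register access while suspended")
        return 0xFFFF  # placeholder register value


class PmAwareBus:
    """Wrapper that resumes the device around each raw access."""

    def __init__(self, bus):
        self.bus = bus

    def read(self, phy, reg):
        self.bus.dev.pm_get_sync()
        try:
            return self.bus.read(phy, reg)
        finally:
            self.bus.dev.pm_put()


dev = Device()
raw_bus = BitbangBus(dev)
safe_bus = PmAwareBus(raw_bus)
value = safe_bus.read(1, 0)  # succeeds even though the device was idle
```

The raw accessor models the shutdown-time PHY read that hit a suspended device. The wrapper keeps the reference only for the duration of the access.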
    • mdio-bitbang: Export mdiobb_{read,write}() · 8eed01b5
      Geert Uytterhoeven authored
      
      
      Export mdiobb_read() and mdiobb_write(), so Ethernet controller drivers
      can call them from their MDIO read/write wrappers.
      
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8eed01b5
    • net: core: devlink: use right genl user_ptr when handling port param get/set · 7e238de8
      Oleksandr Mazur authored
      Fix incorrect user_ptr dereferencing when handling port param get/set:
      
          idx [0] stores the 'struct devlink' pointer;
          idx [1] stores the 'struct devlink_port' pointer;
      
      Fixes: 637989b5 ("devlink: Always use user_ptr[0] for devlink and simplify post_doit")
      CC: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
      Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
      Link: https://lore.kernel.org/r/20210119085333.16833-1-vadym.kochan@plvision.eu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7e238de8
  2. Jan 19, 2021
  3. Jan 17, 2021
  4. Jan 16, 2021
    • net_sched: avoid shift-out-of-bounds in tcindex_set_parms() · bcd0cf19
      Eric Dumazet authored
      tc_index being 16bit wide, we need to check that TCA_TCINDEX_SHIFT
      attribute is not silly.
      
      UBSAN: shift-out-of-bounds in net/sched/cls_tcindex.c:260:29
      shift exponent 255 is too large for 32-bit type 'int'
      CPU: 0 PID: 8516 Comm: syz-executor228 Not tainted 5.10.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
       __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
       valid_perfect_hash net/sched/cls_tcindex.c:260 [inline]
       tcindex_set_parms.cold+0x1b/0x215 net/sched/cls_tcindex.c:425
       tcindex_change+0x232/0x340 net/sched/cls_tcindex.c:546
       tc_new_tfilter+0x13fb/0x21b0 net/sched/cls_api.c:2127
       rtnetlink_rcv_msg+0x8b6/0xb80 net/core/rtnetlink.c:5555
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2336
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2390
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2423
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20210114185229.1742255-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bcd0cf19
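A small sketch of the kind of bound check the tcindex commit above describes. The limit of 16 is an assumption derived from tc_index being 16 bits wide, and `validate_shift()` and `hash_range()` are hypothetical names, not functions from cls_tcindex.c. Python integers do not misbehave on large shifts, so this only models the validation, not the undefined behaviour UBSAN reported.

```python
TC_INDEX_BITS = 16  # tc_index is a 16-bit field


def validate_shift(shift):
    """Reject shift values that cannot be meaningful for a 16-bit index."""
    if not 0 <= shift <= TC_INDEX_BITS:
        raise ValueError("shift %d out of range 0..%d" % (shift, TC_INDEX_BITS))
    return shift


def hash_range(mask, shift):
    """Largest key produced by (tc_index & mask) >> shift, after validation."""
    return (mask & 0xFFFF) >> validate_shift(shift)


largest_key = hash_range(0xFFFF, 4)

try:
    hash_range(0xFFFF, 255)  # the exponent seen in the UBSAN report
    rejected = False
except ValueError:
    rejected = True
```

Validating the attribute once, when the filter is configured, keeps every later shift within the width of the operand.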
    • net_sched: gen_estimator: support large ewma log · dd5e0733
      Eric Dumazet authored
      A syzbot report reminded us that very big ewma_log values were supported
      in the past, even if they made little sense.
      
      tc qdisc replace dev xxx root est 1sec 131072sec ...
      
      While fixing the bug, also add boundary checks for ewma_log, in line
      with range supported by iproute2.
      
      UBSAN: shift-out-of-bounds in net/core/gen_estimator.c:83:38
      shift exponent -1 is negative
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
       __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
       est_timer.cold+0xbb/0x12d net/core/gen_estimator.c:83
       call_timer_fn+0x1a5/0x710 kernel/time/timer.c:1417
       expire_timers kernel/time/timer.c:1462 [inline]
       __run_timers.part.0+0x692/0xa80 kernel/time/timer.c:1731
       __run_timers kernel/time/timer.c:1712 [inline]
       run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1744
       __do_softirq+0x2bc/0xa77 kernel/softirq.c:343
       asm_call_irq_on_stack+0xf/0x20
       </IRQ>
       __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
       run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
       do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
       invoke_softirq kernel/softirq.c:226 [inline]
       __irq_exit_rcu+0x17f/0x200 kernel/softirq.c:420
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:432
       sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1096
       asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628
      RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
      RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
      RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:169 [inline]
      RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline]
      RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516
      
      Fixes: 1c0d32fd ("net_sched: gen_estimator: complete rewrite of rate estimators")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20210114181929.1717985-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      dd5e0733
    • net_sched: reject silly cell_log in qdisc_get_rtab() · e4bedf48
      Eric Dumazet authored
      iproute2 probably never goes beyond 8 for the cell exponent,
      but stick to the max shift exponent for signed 32bit.
      
      UBSAN reported:
      UBSAN: shift-out-of-bounds in net/sched/sch_api.c:389:22
      shift exponent 130 is too large for 32-bit type 'int'
      CPU: 1 PID: 8450 Comm: syz-executor586 Not tainted 5.11.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x183/0x22e lib/dump_stack.c:120
       ubsan_epilogue lib/ubsan.c:148 [inline]
       __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395
       __detect_linklayer+0x2a9/0x330 net/sched/sch_api.c:389
       qdisc_get_rtab+0x2b5/0x410 net/sched/sch_api.c:435
       cbq_init+0x28f/0x12c0 net/sched/sch_cbq.c:1180
       qdisc_create+0x801/0x1470 net/sched/sch_api.c:1246
       tc_modify_qdisc+0x9e3/0x1fc0 net/sched/sch_api.c:1662
       rtnetlink_rcv_msg+0xb1d/0xe60 net/core/rtnetlink.c:5564
       netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2494
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg net/socket.c:672 [inline]
       ____sys_sendmsg+0x5a2/0x900 net/socket.c:2345
       ___sys_sendmsg net/socket.c:2399 [inline]
       __sys_sendmsg+0x319/0x400 net/socket.c:2432
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20210114160637.1660597-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e4bedf48
    • Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e23a8d00
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-01-16
      
      1) Fix a double bpf_prog_put() for BPF_PROG_{TYPE_EXT,TYPE_TRACING} types in
         link creation's error path causing a refcount underflow, from Jiri Olsa.
      
      2) Fix BTF validation errors for the case where kernel modules don't declare
         any new types and end up with an empty BTF, from Andrii Nakryiko.
      
      3) Fix BPF local storage helpers to first check their {task,inode} owners for
         being NULL before access, from KP Singh.
      
      4) Fix a memory leak in BPF setsockopt handling for the case where optlen is
         zero and thus temporary optval buffer should be freed, from Stanislav Fomichev.
      
      5) Fix a syzbot memory allocation splat in BPF_PROG_TEST_RUN infra for
         raw_tracepoint caused by too big ctx_size_in, from Song Liu.
      
      6) Fix LLVM code generation issues with verifier where PTR_TO_MEM{,_OR_NULL}
         registers were spilled to stack but not recognized, from Gilad Reti.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        MAINTAINERS: Update my email address
        selftests/bpf: Add verifier test for PTR_TO_MEM spill
        bpf: Support PTR_TO_MEM{,_OR_NULL} register spilling
        bpf: Reject too big ctx_size_in for raw_tp test run
        libbpf: Allow loading empty BTFs
        bpf: Allow empty module BTFs
        bpf: Don't leak memory in bpf getsockopt when optlen == 0
        bpf: Update local storage test to check handling of null ptrs
        bpf: Fix typo in bpf_inode_storage.c
        bpf: Local storage helpers should check nullness of owner ptr passed
        bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attach
      ====================
      
      Link: https://lore.kernel.org/r/20210116002025.15706-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e23a8d00
    • cls_flower: call nla_ok() before nla_next() · c96adff9
      Cong Wang authored
      
      
      fl_set_enc_opt() simply checks if there are still bytes left to parse,
      but this is not sufficient as syzbot seems to be able to generate
      malformed netlink messages. nla_ok() is more strict so should be
      used to validate the next nlattr here.

      nla_validate_nested_deprecated() has a less strict check too. It is
      probably too late to switch to the strict version, but we can just
      call nla_ok() after it as well.
      
      Reported-and-tested-by: <syzbot+2624e3778b18fc497c92@syzkaller.appspotmail.com>
      Fixes: 0a6e7778 ("net/sched: allow flower to match tunnel options")
      Fixes: 79b1011c ("net: sched: allow flower to match erspan options")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20210115185024.72298-1-xiyou.wangcong@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c96adff9
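For the cls_flower commit above, a sketch of why "bytes left" is a weaker test than nla_ok(). The three conditions in `nla_ok()` below are meant to mirror the kernel helper (room for a header, a length that covers the header, and a length that fits in the remaining bytes). `walk()` and the sample buffers are hypothetical and only exercise those conditions on raw bytes.

```python
import struct

NLA_HDRLEN = 4  # struct nlattr: u16 nla_len followed by u16 nla_type


def nla_align(length):
    """Attributes are padded to a 4-byte boundary."""
    return (length + 3) & ~3


def nla_ok(buf, offset):
    """True if a complete, self-consistent attribute starts at offset."""
    remaining = len(buf) - offset
    if remaining < NLA_HDRLEN:
        return False
    nla_len, _nla_type = struct.unpack_from("=HH", buf, offset)
    return NLA_HDRLEN <= nla_len <= remaining


def walk(buf):
    """Collect attribute types, validating each header before using it."""
    types = []
    offset = 0
    while offset < len(buf):  # "bytes left" alone would stop checking here
        if not nla_ok(buf, offset):
            raise ValueError("malformed attribute at offset %d" % offset)
        nla_len, nla_type = struct.unpack_from("=HH", buf, offset)
        types.append(nla_type)
        offset += nla_align(nla_len)
    return types


# Two well-formed attributes: type 1 with 4 payload bytes, type 2 empty.
good = struct.pack("=HH4s", 8, 1, b"abcd") + struct.pack("=HH", 4, 2)
# An attribute that claims 64 bytes but only 8 are present.
bad = struct.pack("=HH", 64, 1) + b"\x00" * 4
# A trailing fragment too short to hold a header.
truncated = good + b"\x01\x00"
```

Both `bad` and `truncated` still have bytes left to parse, so a check on the remaining length alone would accept them.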
    • MAINTAINERS: Update my email address · 235ecd36
      Björn Töpel authored
      
      
      My Intel email will stop working in a not too distant future. Move my
      MAINTAINERS entries to my kernel.org address.
      
      Signed-off-by: Björn Töpel <bjorn@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210115104337.7751-1-bjorn.topel@gmail.com
      235ecd36
    • octeontx2-af: Fix missing check bugs in rvu_cgx.c · b7ba6cfa
      Yingjie Wang authored
      In rvu_mbox_handler_cgx_mac_addr_get() and
      rvu_mbox_handler_cgx_mac_addr_set(), the msg is expected only from PFs
      that are mapped to CGX LMACs. This should be checked before mapping,
      so add an is_cgx_config_permitted() check to both functions.
      
      Fixes: 96be2e0d ("octeontx2-af: Support for MAC address filters in CGX")
      Signed-off-by: Yingjie Wang <wangyingjie55@126.com>
      Reviewed-by: Geetha sowjanya <gakula@marvell.com>
      Link: https://lore.kernel.org/r/1610719804-35230-1-git-send-email-wangyingjie55@126.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b7ba6cfa
  5. Jan 15, 2021