Skip to content
  1. Sep 10, 2019
    • Pablo Neira Ayuso's avatar
      netfilter: nf_flow_table: teardown flow timeout race · ea3c243c
      Pablo Neira Ayuso authored
      [ Upstream commit 1e5b2471 ]
      
      Flows that are in teardown state (due to RST / FIN TCP packet) still
      have their offload flag set on. Hence, the conntrack garbage collector
      may race to undo the timeout adjustment that the fixup routine performs,
      leaving the conntrack entry in place with the internal offload timeout
      (one day).
      
      Update teardown flow state to ESTABLISHED and set tracking to liberal,
      then once the offload bit is cleared, adjust timeout if it is more than
      the default fixup timeout (conntrack might already have set a lower
      timeout from the packet path).
      
      Fixes: da5984e5
      
       ("netfilter: nf_flow_table: add support for sending flows back to the slow path")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ea3c243c
    • Pablo Neira Ayuso's avatar
      netfilter: nf_flow_table: conntrack picks up expired flows · 96a517d0
      Pablo Neira Ayuso authored
      [ Upstream commit 3e68db2f ]
      
      Update conntrack entry to pick up expired flows, otherwise the conntrack
      entry gets stuck with the internal offload timeout (one day). The TCP
      state also needs to be adjusted to ESTABLISHED state and tracking is set
      to liberal mode in order to give conntrack a chance to pick up the
      expired flow.
      
      Fixes: ac2a6666
      
       ("netfilter: add generic flow table infrastructure")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      96a517d0
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: use-after-free in failing rule with bound set · 586f0014
      Pablo Neira Ayuso authored
      [ Upstream commit 6a0a8d10 ]
      
      If a rule that has already a bound anonymous set fails to be added, the
      preparation phase releases the rule and the bound set. However, the
      transaction object from the abort path still has a reference to the set
      object that is stale, leading to a use-after-free when checking for the
      set->bound field. Add a new field to the transaction that specifies if
      the set is bound, so the abort path can skip releasing it since the rule
      command owns it and it takes care of releasing it. After this update,
      the set->bound field is removed.
      
      [   24.649883] Unable to handle kernel paging request at virtual address 0000000000040434
      [   24.657858] Mem abort info:
      [   24.660686]   ESR = 0x96000004
      [   24.663769]   Exception class = DABT (current EL), IL = 32 bits
      [   24.669725]   SET = 0, FnV = 0
      [   24.672804]   EA = 0, S1PTW = 0
      [   24.675975] Data abort info:
      [   24.678880]   ISV = 0, ISS = 0x00000004
      [   24.682743]   CM = 0, WnR = 0
      [   24.685723] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000428952000
      [   24.692207] [0000000000040434] pgd=0000000000000000
      [   24.697119] Internal error: Oops: 96000004 [#1] SMP
      [...]
      [   24.889414] Call trace:
      [   24.891870]  __nf_tables_abort+0x3f0/0x7a0
      [   24.895984]  nf_tables_abort+0x20/0x40
      [   24.899750]  nfnetlink_rcv_batch+0x17c/0x588
      [   24.904037]  nfnetlink_rcv+0x13c/0x190
      [   24.907803]  netlink_unicast+0x18c/0x208
      [   24.911742]  netlink_sendmsg+0x1b0/0x350
      [   24.915682]  sock_sendmsg+0x4c/0x68
      [   24.919185]  ___sys_sendmsg+0x288/0x2c8
      [   24.923037]  __sys_sendmsg+0x7c/0xd0
      [   24.926628]  __arm64_sys_sendmsg+0x2c/0x38
      [   24.930744]  el0_svc_common.constprop.0+0x94/0x158
      [   24.935556]  el0_svc_handler+0x34/0x90
      [   24.939322]  el0_svc+0x8/0xc
      [   24.942216] Code: 37280300 f9404023 91014262 aa1703e0 (f9401863)
      [   24.948336] ---[ end trace cebbb9dcbed3b56f ]---
      
      Fixes: f6ac8585
      
       ("netfilter: nf_tables: unbind set in rule from commit path")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      586f0014
    • Fuqian Huang's avatar
      net: tundra: tsi108: use spin_lock_irqsave instead of spin_lock_irq in IRQ context · 830b5c37
      Fuqian Huang authored
      [ Upstream commit 8c25d088
      
       ]
      
      As spin_unlock_irq will enable interrupts.
      Function tsi108_stat_carry is called from interrupt handler tsi108_irq.
      Interrupts are enabled in interrupt handler.
      Use spin_lock_irqsave/spin_unlock_irqrestore instead of spin_(un)lock_irq
      in IRQ context to avoid this.
      
      Signed-off-by: default avatarFuqian Huang <huangfq.daxian@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      830b5c37
    • Marek Szyprowski's avatar
      clk: samsung: exynos542x: Move MSCL subsystem clocks to its sub-CMU · 60a4f2b2
      Marek Szyprowski authored
      [ Upstream commit baf7b79e ]
      
      M2M scaler clocks require special handling of their parent bus clock during
      power domain on/off sequences. MSCL clocks were not initially added to the
      sub-CMU handler, because that time there was no driver for the M2M scaler
      device and it was not possible to test it.
      
      This patch fixes this issue. Parent clock for M2M scaler devices is now
      properly preserved during MSC power domain on/off sequence. This gives M2M
      scaler devices proper performance: fullHD XRGB32 image 1000 rotations test
      takes 3.17s instead of 45.08s.
      
      Fixes: b06a532b
      
       ("clk: samsung: Add Exynos5 sub-CMU clock driver")
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Link: https://lkml.kernel.org/r/20190808121839.23892-1-m.szyprowski@samsung.com
      Acked-by: default avatarSylwester Nawrocki <s.nawrocki@samsung.com>
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      60a4f2b2
    • Sylwester Nawrocki's avatar
      clk: samsung: exynos5800: Move MAU subsystem clocks to MAU sub-CMU · c65a2b20
      Sylwester Nawrocki authored
      [ Upstream commit b6adeb6b ]
      
      This patch fixes broken sound on Exynos5422/5800 platforms after
      system/suspend resume cycle in cases where the audio root clock
      is derived from MAU_EPLL_CLK.
      
      In order to preserve state of the USER_MUX_MAU_EPLL_CLK clock mux
      during system suspend/resume cycle for Exynos5800 we group the MAU
      block input clocks in "MAU" sub-CMU and add the clock mux control
      bit to .suspend_regs.  This ensures that user configuration of the mux
      is not lost after the PMU block changes the mux setting to OSC_DIV
      when switching off the MAU power domain.
      
      Adding the SRC_TOP9 register to exynos5800_clk_regs[] array is not
      sufficient as at the time of the syscore_ops suspend call MAU power
      domain is already turned off and we already save and subsequently
      restore an incorrect register's value.
      
      Fixes: b06a532b
      
       ("clk: samsung: Add Exynos5 sub-CMU clock driver")
      Reported-by: default avatarJaafar Ali <jaafarkhalaf@gmail.com>
      Suggested-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: default avatarJaafar Ali <jaafarkhalaf@gmail.com>
      Signed-off-by: default avatarSylwester Nawrocki <s.nawrocki@samsung.com>
      Link: https://lkml.kernel.org/r/20190808144929.18685-2-s.nawrocki@samsung.com
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c65a2b20
    • Sylwester Nawrocki's avatar
      clk: samsung: Change signature of exynos5_subcmus_init() function · f7bd5e9f
      Sylwester Nawrocki authored
      [ Upstream commit bf32e7db ]
      
      In order to make it easier in subsequent patch to create different subcmu
      lists for exynos5420 and exynos5800 SoCs the code is rewritten so we pass
      an array of pointers to the subcmus initialization function.
      
      Fixes: b06a532b
      
       ("clk: samsung: Add Exynos5 sub-CMU clock driver")
      Tested-by: default avatarJaafar Ali <jaafarkhalaf@gmail.com>
      Signed-off-by: default avatarSylwester Nawrocki <s.nawrocki@samsung.com>
      Link: https://lkml.kernel.org/r/20190808144929.18685-1-s.nawrocki@samsung.com
      Reviewed-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f7bd5e9f
    • Aya Levin's avatar
      net/mlx5e: Fix error flow of CQE recovery on tx reporter · 8f374779
      Aya Levin authored
      [ Upstream commit 276d197e ]
      
      CQE recovery function begins with test and set of recovery bit. Add an
      error flow which ensures clearing of this bit when leaving the recovery
      function, to allow further recoveries to take place. This allows removal
      of clearing recovery bit on sq activate.
      
      Fixes: de8650a8
      
       ("net/mlx5e: Add tx reporter support")
      Signed-off-by: default avatarAya Levin <ayal@mellanox.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8f374779
    • Florian Westphal's avatar
      netfilter: nf_flow_table: fix offload for flows that are subject to xfrm · 701b8990
      Florian Westphal authored
      [ Upstream commit 589b474a
      
       ]
      
      This makes the previously added 'encap test' pass.
      Because its possible that the xfrm dst entry becomes stale while such
      a flow is offloaded, we need to call dst_check() -- the notifier that
      handles this for non-tunneled traffic isn't sufficient, because SA or
      or policies might have changed.
      
      If dst becomes stale the flow offload entry will be tagged for teardown
      and packets will be passed to 'classic' forwarding path.
      
      Removing the entry right away is problematic, as this would
      introduce a race condition with the gc worker.
      
      In case flow is long-lived, it could eventually be offloaded again
      once the gc worker removes the entry from the flow table.
      
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      701b8990
    • Andrii Nakryiko's avatar
      libbpf: set BTF FD for prog only when there is supported .BTF.ext data · b8a132a4
      Andrii Nakryiko authored
      [ Upstream commit 3415ec64 ]
      
      5d01ab7b ("libbpf: fix erroneous multi-closing of BTF FD")
      introduced backwards-compatibility issue, manifesting itself as -E2BIG
      error returned on program load due to unknown non-zero btf_fd attribute
      value for BPF_PROG_LOAD sys_bpf() sub-command.
      
      This patch fixes bug by ensuring that we only ever associate BTF FD with
      program if there is a BTF.ext data that was successfully loaded into
      kernel, which automatically means kernel supports func_info/line_info
      and associated BTF FD for progs (checked and ensured also by BTF
      sanitization code).
      
      Fixes: 5d01ab7b
      
       ("libbpf: fix erroneous multi-closing of BTF FD")
      Reported-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b8a132a4
    • Andrii Nakryiko's avatar
      libbpf: fix erroneous multi-closing of BTF FD · a65fb289
      Andrii Nakryiko authored
      [ Upstream commit 5d01ab7b ]
      
      Libbpf stores associated BTF FD per each instance of bpf_program. When
      program is unloaded, that FD is closed. This is wrong, because leads to
      a race and possibly closing of unrelated files, if application
      simultaneously opens new files while bpf_programs are unloaded.
      
      It's also unnecessary, because struct btf "owns" that FD, and
      btf__free(), called from bpf_object__close() will close it. Thus the fix
      is to never have per-program BTF FD and fetch it from obj->btf, when
      necessary.
      
      Fixes: 2993e051
      
       ("tools/bpf: add support to read .BTF.ext sections")
      Reported-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a65fb289
    • Sven Eckelmann's avatar
      batman-adv: Fix netlink dumping of all mcast_flags buckets · fa689968
      Sven Eckelmann authored
      [ Upstream commit fa3a03da ]
      
      The bucket variable is only updated outside the loop over the mcast_flags
      buckets. It will only be updated during a dumping run when the dumping has
      to be interrupted and a new message has to be started.
      
      This could result in repeated or missing entries when the multicast flags
      are dumped to userspace.
      
      Fixes: d2d489b7
      
       ("batman-adv: Add inconsistent multicast netlink dump detection")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarSimon Wunderlich <sw@simonwunderlich.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fa689968
    • Ka-Cheong Poon's avatar
      net/rds: Fix info leak in rds6_inc_info_copy() · a4c88340
      Ka-Cheong Poon authored
      [ Upstream commit 7d0a0658 ]
      
      The rds6_inc_info_copy() function has a couple struct members which
      are leaking stack information.  The ->tos field should hold actual
      information and the ->flags field needs to be zeroed out.
      
      Fixes: 3eb45036 ("rds: add type of service(tos) infrastructure")
      Fixes: b7ff8b10
      
       ("rds: Extend RDS API for IPv6 support")
      Reported-by: default avatar黄ID蝴蝶 <butterflyhuangxx@gmail.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarKa-Cheong Poon <ka-cheong.poon@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4c88340
    • Davide Caratti's avatar
      net/sched: pfifo_fast: fix wrong dereference when qdisc is reset · fdd2bc36
      Davide Caratti authored
      [ Upstream commit 04d37cf4 ]
      
      Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
      'TCQ_F_NOLOCK' bit in the parent qdisc, we need to be sure that per-cpu
      counters are present when 'reset()' is called for pfifo_fast qdiscs.
      Otherwise, the following script:
      
       # tc q a dev lo handle 1: root htb default 100
       # tc c a dev lo parent 1: classid 1:100 htb \
       > rate 95Mbit ceil 100Mbit burst 64k
       [...]
       # tc f a dev lo parent 1: protocol arp basic classid 1:100
       [...]
       # tc q a dev lo parent 1:100 handle 100: pfifo_fast
       [...]
       # tc q d dev lo root
      
      can generate the following splat:
      
       Unable to handle kernel paging request at virtual address dfff2c01bd148000
       Mem abort info:
         ESR = 0x96000004
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000004
         CM = 0, WnR = 0
       [dfff2c01bd148000] address between user and kernel address ranges
       Internal error: Oops: 96000004 [#1] SMP
       [...]
       pstate: 80000005 (Nzcv daif -PAN -UAO)
       pc : pfifo_fast_reset+0x280/0x4d8
       lr : pfifo_fast_reset+0x21c/0x4d8
       sp : ffff800d09676fa0
       x29: ffff800d09676fa0 x28: ffff200012ee22e4
       x27: dfff200000000000 x26: 0000000000000000
       x25: ffff800ca0799958 x24: ffff1001940f332b
       x23: 0000000000000007 x22: ffff200012ee1ab8
       x21: 0000600de8a40000 x20: 0000000000000000
       x19: ffff800ca0799900 x18: 0000000000000000
       x17: 0000000000000002 x16: 0000000000000000
       x15: 0000000000000000 x14: 0000000000000000
       x13: 0000000000000000 x12: ffff1001b922e6e2
       x11: 1ffff001b922e6e1 x10: 0000000000000000
       x9 : 1ffff001b922e6e1 x8 : dfff200000000000
       x7 : 0000000000000000 x6 : 0000000000000000
       x5 : 1fffe400025dc45c x4 : 1fffe400025dc357
       x3 : 00000c01bd148000 x2 : 0000600de8a40000
       x1 : 0000000000000007 x0 : 0000600de8a40004
       Call trace:
        pfifo_fast_reset+0x280/0x4d8
        qdisc_reset+0x6c/0x370
        htb_reset+0x150/0x3b8 [sch_htb]
        qdisc_reset+0x6c/0x370
        dev_deactivate_queue.constprop.5+0xe0/0x1a8
        dev_deactivate_many+0xd8/0x908
        dev_deactivate+0xe4/0x190
        qdisc_graft+0x88c/0xbd0
        tc_get_qdisc+0x418/0x8a8
        rtnetlink_rcv_msg+0x3a8/0xa78
        netlink_rcv_skb+0x18c/0x328
        rtnetlink_rcv+0x28/0x38
        netlink_unicast+0x3c4/0x538
        netlink_sendmsg+0x538/0x9a0
        sock_sendmsg+0xac/0xf8
        ___sys_sendmsg+0x53c/0x658
        __sys_sendmsg+0xc8/0x140
        __arm64_sys_sendmsg+0x74/0xa8
        el0_svc_handler+0x164/0x468
        el0_svc+0x10/0x14
       Code: 910012a0 92400801 d343fc03 11000c21 (38fb6863)
      
      Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
      before dereferencing 'qdisc->cpu_qstats'.
      
      Changes since v1:
       - coding style improvements, thanks to Stefano Brivio
      
      Fixes: 8a53e616
      
       ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
      CC: Paolo Abeni <pabeni@redhat.com>
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdd2bc36
    • Davide Caratti's avatar
      net/sched: pfifo_fast: fix wrong dereference in pfifo_fast_enqueue · e9cc0513
      Davide Caratti authored
      [ Upstream commit 092e22e5 ]
      
      Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
      'TCQ_F_NOLOCK' bit in the parent qdisc, we can't assume anymore that
      per-cpu counters are there in the error path of skb_array_produce().
      Otherwise, the following splat can be seen:
      
       Unable to handle kernel paging request at virtual address 0000600dea430008
       Mem abort info:
         ESR = 0x96000005
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000005
         CM = 0, WnR = 0
       user pgtable: 64k pages, 48-bit VAs, pgdp = 000000007b97530e
       [0000600dea430008] pgd=0000000000000000, pud=0000000000000000
       Internal error: Oops: 96000005 [#1] SMP
      [...]
       pstate: 10000005 (nzcV daif -PAN -UAO)
       pc : pfifo_fast_enqueue+0x524/0x6e8
       lr : pfifo_fast_enqueue+0x46c/0x6e8
       sp : ffff800d39376fe0
       x29: ffff800d39376fe0 x28: 1ffff001a07d1e40
       x27: ffff800d03e8f188 x26: ffff800d03e8f200
       x25: 0000000000000062 x24: ffff800d393772f0
       x23: 0000000000000000 x22: 0000000000000403
       x21: ffff800cca569a00 x20: ffff800d03e8ee00
       x19: ffff800cca569a10 x18: 00000000000000bf
       x17: 0000000000000000 x16: 0000000000000000
       x15: 0000000000000000 x14: ffff1001a726edd0
       x13: 1fffe4000276a9a4 x12: 0000000000000000
       x11: dfff200000000000 x10: ffff800d03e8f1a0
       x9 : 0000000000000003 x8 : 0000000000000000
       x7 : 00000000f1f1f1f1 x6 : ffff1001a726edea
       x5 : ffff800cca56a53c x4 : 1ffff001bf9a8003
       x3 : 1ffff001bf9a8003 x2 : 1ffff001a07d1dcb
       x1 : 0000600dea430000 x0 : 0000600dea430008
       Process ping (pid: 6067, stack limit = 0x00000000dc0aa557)
       Call trace:
        pfifo_fast_enqueue+0x524/0x6e8
        htb_enqueue+0x660/0x10e0 [sch_htb]
        __dev_queue_xmit+0x123c/0x2de0
        dev_queue_xmit+0x24/0x30
        ip_finish_output2+0xc48/0x1720
        ip_finish_output+0x548/0x9d8
        ip_output+0x334/0x788
        ip_local_out+0x90/0x138
        ip_send_skb+0x44/0x1d0
        ip_push_pending_frames+0x5c/0x78
        raw_sendmsg+0xed8/0x28d0
        inet_sendmsg+0xc4/0x5c0
        sock_sendmsg+0xac/0x108
        __sys_sendto+0x1ac/0x2a0
        __arm64_sys_sendto+0xc4/0x138
        el0_svc_handler+0x13c/0x298
        el0_svc+0x8/0xc
       Code: f9402e80 d538d081 91002000 8b010000 (885f7c03)
      
      Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
      before dereferencing 'qdisc->cpu_qstats'.
      
      Fixes: 8a53e616
      
       ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
      CC: Paolo Abeni <pabeni@redhat.com>
      CC: Stefano Brivio <sbrivio@redhat.com>
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9cc0513
    • Vladimir Oltean's avatar
      net: dsa: tag_8021q: Future-proof the reserved fields in the custom VID · 0b003eda
      Vladimir Oltean authored
      [ Upstream commit bcccb0a5 ]
      
      After witnessing the discussion in https://lkml.org/lkml/2019/8/14/151
      w.r.t. ioctl extensibility, it became clear that such an issue might
      prevent that the 3 RSV bits inside the DSA 802.1Q tag might also suffer
      the same fate and be useless for further extension.
      
      So clearly specify that the reserved bits should currently be
      transmitted as zero and ignored on receive. The DSA tagger already does
      this (and has always did), and is the only known user so far (no
      Wireshark dissection plugin, etc). So there should be no incompatibility
      to speak of.
      
      Fixes: 0471dd42
      
       ("net: dsa: tag_8021q: Create a stable binary format")
      Signed-off-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b003eda
    • Marco Hartmann's avatar
      Add genphy_c45_config_aneg() function to phy-c45.c · c7f05c1d
      Marco Hartmann authored
      [ Upstream commit 2ebb991641d3f64b70fec0156e2b6933810177e9 ]
      
      Commit 34786005 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
      genphy_config_aneg") introduced a check that aborts phy_config_aneg()
      if the phy is a C45 phy.
      This causes phy_state_machine() to call phy_error() so that the phy
      ends up in PHY_HALTED state.
      
      Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
      (analogous to the C22 case) so that the state machine can run
      correctly.
      
      genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
      in drivers/net/phy/marvell10g.c, excluding vendor specific
      configurations for 1000BaseT.
      
      Fixes: 22b56e82
      
       ("net: phy: replace genphy_10g_driver with genphy_c45_driver")
      
      Signed-off-by: default avatarMarco Hartmann <marco.hartmann@nxp.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7f05c1d
    • Vladimir Oltean's avatar
      net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate · 98ded313
      Vladimir Oltean authored
      The discussion to be made is absolutely the same as in the case of
      previous patch ("taprio: Set default link speed to 10 Mbps in
      taprio_set_picos_per_byte"). Nothing is lost when setting a default.
      
      Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
      Fixes: e0a7683d
      
       ("net/sched: cbs: fix port_rate miscalculation")
      Acked-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98ded313
    • Vladimir Oltean's avatar
      taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte · 622f16b3
      Vladimir Oltean authored
      The taprio budget needs to be adapted at runtime according to interface
      link speed. But that handling is problematic.
      
      For one thing, installing a qdisc on an interface that doesn't have
      carrier is not illegal. But taprio prints the following stack trace:
      
      [   31.851373] ------------[ cut here ]------------
      [   31.856024] WARNING: CPU: 1 PID: 207 at net/sched/sch_taprio.c:481 taprio_dequeue+0x1a8/0x2d4
      [   31.864566] taprio: dequeue() called with unknown picos per byte.
      [   31.864570] Modules linked in:
      [   31.873701] CPU: 1 PID: 207 Comm: tc Not tainted 5.3.0-rc5-01199-g8838fe023cd6 #1689
      [   31.881398] Hardware name: Freescale LS1021A
      [   31.885661] [<c03133a4>] (unwind_backtrace) from [<c030d8cc>] (show_stack+0x10/0x14)
      [   31.893368] [<c030d8cc>] (show_stack) from [<c10ac958>] (dump_stack+0xb4/0xc8)
      [   31.900555] [<c10ac958>] (dump_stack) from [<c0349d04>] (__warn+0xe0/0xf8)
      [   31.907395] [<c0349d04>] (__warn) from [<c0349d64>] (warn_slowpath_fmt+0x48/0x6c)
      [   31.914841] [<c0349d64>] (warn_slowpath_fmt) from [<c0f38db4>] (taprio_dequeue+0x1a8/0x2d4)
      [   31.923150] [<c0f38db4>] (taprio_dequeue) from [<c0f227b0>] (__qdisc_run+0x90/0x61c)
      [   31.930856] [<c0f227b0>] (__qdisc_run) from [<c0ec82ac>] (net_tx_action+0x12c/0x2bc)
      [   31.938560] [<c0ec82ac>] (net_tx_action) from [<c0302298>] (__do_softirq+0x130/0x3c8)
      [   31.946350] [<c0302298>] (__do_softirq) from [<c03502a0>] (irq_exit+0xbc/0xd8)
      [   31.953536] [<c03502a0>] (irq_exit) from [<c03a4808>] (__handle_domain_irq+0x60/0xb4)
      [   31.961328] [<c03a4808>] (__handle_domain_irq) from [<c0754478>] (gic_handle_irq+0x58/0x9c)
      [   31.969638] [<c0754478>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
      [   31.977076] Exception stack(0xe8167b20 to 0xe8167b68)
      [   31.982100] 7b20: e9d4bd80 00000cc0 000000cf 00000000 e9d4bd80 c1f38958 00000cc0 c1f38960
      [   31.990234] 7b40: 00000001 000000cf 00000004 e9dc0800 00000000 e8167b70 c0f478ec c0f46d94
      [   31.998363] 7b60: 60070013 ffffffff
      [   32.001833] [<c0301a8c>] (__irq_svc) from [<c0f46d94>] (netlink_trim+0x18/0xd8)
      [   32.009104] [<c0f46d94>] (netlink_trim) from [<c0f478ec>] (netlink_broadcast_filtered+0x34/0x414)
      [   32.017930] [<c0f478ec>] (netlink_broadcast_filtered) from [<c0f47cec>] (netlink_broadcast+0x20/0x28)
      [   32.027102] [<c0f47cec>] (netlink_broadcast) from [<c0eea378>] (rtnetlink_send+0x34/0x88)
      [   32.035238] [<c0eea378>] (rtnetlink_send) from [<c0f25890>] (notify_and_destroy+0x2c/0x44)
      [   32.043461] [<c0f25890>] (notify_and_destroy) from [<c0f25e08>] (qdisc_graft+0x398/0x470)
      [   32.051595] [<c0f25e08>] (qdisc_graft) from [<c0f27a00>] (tc_modify_qdisc+0x3a4/0x724)
      [   32.059470] [<c0f27a00>] (tc_modify_qdisc) from [<c0ee4c84>] (rtnetlink_rcv_msg+0x260/0x2ec)
      [   32.067864] [<c0ee4c84>] (rtnetlink_rcv_msg) from [<c0f4a988>] (netlink_rcv_skb+0xb8/0x110)
      [   32.076172] [<c0f4a988>] (netlink_rcv_skb) from [<c0f4a170>] (netlink_unicast+0x1b4/0x22c)
      [   32.084392] [<c0f4a170>] (netlink_unicast) from [<c0f4a5e4>] (netlink_sendmsg+0x33c/0x380)
      [   32.092614] [<c0f4a5e4>] (netlink_sendmsg) from [<c0ea9f40>] (sock_sendmsg+0x14/0x24)
      [   32.100403] [<c0ea9f40>] (sock_sendmsg) from [<c0eaa780>] (___sys_sendmsg+0x214/0x228)
      [   32.108279] [<c0eaa780>] (___sys_sendmsg) from [<c0eabad0>] (__sys_sendmsg+0x50/0x8c)
      [   32.116068] [<c0eabad0>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
      [   32.123938] Exception stack(0xe8167fa8 to 0xe8167ff0)
      [   32.128960] 7fa0:                   b6fa68c8 000000f8 00000003 bea142d0 00000000 00000000
      [   32.137093] 7fc0: b6fa68c8 000000f8 0052154c 00000128 5d6468a2 00000000 00000028 00558c9c
      [   32.145224] 7fe0: 00000070 bea14278 00530d64 b6e17e64
      [   32.150659] ---[ end trace 2139c9827c3e5177 ]---
      
      This happens because the qdisc ->dequeue callback gets called. Which
      again is not illegal, the qdisc will dequeue even when the interface is
      up but doesn't have carrier (and hence SPEED_UNKNOWN), and the frames
      will be dropped further down the stack in dev_direct_xmit().
      
      And, at the end of the day, for what? For calculating the initial budget
      of an interface which is non-operational at the moment and where frames
      will get dropped anyway.
      
      So if we can't figure out the link speed, default to SPEED_10 and move
      along. We can also remove the runtime check now.
      
      Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
      Fixes: 7b9eba7b
      
       ("net/sched: taprio: fix picos_per_byte miscalculation")
      Acked-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      622f16b3
    • Vladimir Oltean's avatar
      taprio: Fix kernel panic in taprio_destroy · 4f15d0e5
      Vladimir Oltean authored
      taprio_init may fail earlier than this line:
      
      	list_add(&q->taprio_list, &taprio_list);
      
      i.e. due to the net device not being multi queue.
      
      Attempting to remove q from the global taprio_list when it is not part
      of it will result in a kernel panic.
      
      Fix it by matching list_add and list_del better to one another in the
      order of operations. This way we can keep the deletion unconditional
      and with lower complexity - O(1).
      
      Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
      Fixes: 7b9eba7b
      
       ("net/sched: taprio: fix picos_per_byte miscalculation")
      Signed-off-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Acked-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f15d0e5
    • Hayes Wang's avatar
      r8152: remove calling netif_napi_del · 61f10b1b
      Hayes Wang authored
      [ Upstream commit 973dc6cf
      
       ]
      
      Remove unnecessary use of netif_napi_del. This also avoids to call
      napi_disable() after netif_napi_del().
      
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61f10b1b
    • Hayes Wang's avatar
      Revert "r8152: napi hangup fix after disconnect" · 5a9ceccb
      Hayes Wang authored
      [ Upstream commit 49d4b141 ]
      
      This reverts commit 0ee1f473.
      
      The commit 0ee1f473 ("r8152: napi hangup fix after
      disconnect") adds a check about RTL8152_UNPLUG to determine
      if calling napi_disable() is invalid in rtl8152_close(),
      when rtl8152_disconnect() is called. This avoids to use
      napi_disable() after calling netif_napi_del().
      
      Howver, commit ffa9fec3 ("r8152: set RTL8152_UNPLUG
      only for real disconnection") causes that RTL8152_UNPLUG
      is not always set when calling rtl8152_disconnect().
      Therefore, I have to revert commit 0ee1f473
      
       ("r8152:
      napi hangup fix after disconnect"), first. And submit
      another patch to fix it.
      
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a9ceccb
    • John Hurley's avatar
      nfp: flower: handle neighbour events on internal ports · 7e215364
      John Hurley authored
      [ Upstream commit e8024cb4 ]
      
      Recent code changes to NFP allowed the offload of neighbour entries to FW
      when the next hop device was an internal port. This allows for offload of
      tunnel encap when the end-point IP address is applied to such a port.
      
      Unfortunately, the neighbour event handler still rejects events that are
      not associated with a repr dev and so the firmware neighbour table may get
      out of sync for internal ports.
      
      Fix this by allowing internal port neighbour events to be correctly
      processed.
      
      Fixes: 45756dfe
      
       ("nfp: flower: allow tunnels to output to internal port")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e215364
    • John Hurley's avatar
      nfp: flower: prevent ingress block binds on internal ports · f7ec32a4
      John Hurley authored
      [ Upstream commit 739d7c57 ]
      
      Internal port TC offload is implemented through user-space applications
      (such as OvS) by adding filters at egress via TC clsact qdiscs. Indirect
      block offload support in the NFP driver accepts both ingress qdisc binds
      and egress binds if the device is an internal port. However, clsact sends
      bind notification for both ingress and egress block binds which can lead
      to the driver registering multiple callbacks and receiving multiple
      notifications of new filters.
      
      Fix this by rejecting ingress block bind callbacks when the port is
      internal and only adding filter callbacks for egress binds.
      
      Fixes: 4d12ba42
      
       ("nfp: flower: allow offloading of matches on 'internal' ports")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f7ec32a4
    • Eric Dumazet's avatar
      tcp: remove empty skb from write queue in error cases · 64a2a93b
      Eric Dumazet authored
      [ Upstream commit fdfc5c85 ]
      
      Vladimir Rutsky reported stuck TCP sessions after memory pressure
      events. Edge Trigger epoll() user would never receive an EPOLLOUT
      notification allowing them to retry a sendmsg().
      
      Jason tested the case of sk_stream_alloc_skb() returning NULL,
      but there are other paths that could lead both sendmsg() and sendpage()
      to return -1 (EAGAIN), with an empty skb queued on the write queue.
      
      This patch makes sure we remove this empty skb so that
      Jason code can detect that the queue is empty, and
      call sk->sk_write_space(sk) accordingly.
      
      Fixes: ce5ec440
      
       ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Reported-by: default avatarVladimir Rutsky <rutsky@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64a2a93b
    • Willem de Bruijn's avatar
      tcp: inherit timestamp on mtu probe · 5cef2bfc
      Willem de Bruijn authored
      [ Upstream commit 888a5c53 ]
      
      TCP associates tx timestamp requests with a byte in the bytestream.
      If merging skbs in tcp_mtu_probe, migrate the tstamp request.
      
      Similar to MSG_EOR, do not allow moving a timestamp from any segment
      in the probe but the last. This to avoid merging multiple timestamps.
      
      Tested with the packetdrill script at
      https://github.com/wdebruij/packetdrill/commits/mtu_probe-1
      
      Link: http://patchwork.ozlabs.org/patch/1143278/#2232897
      Fixes: 4ed2d765
      
       ("net-timestamp: TCP timestamping")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5cef2bfc
    • Chen-Yu Tsai's avatar
      net: stmmac: dwmac-rk: Don't fail if phy regulator is absent · 939cc35d
      Chen-Yu Tsai authored
      [ Upstream commit 3b25528e ]
      
      The devicetree binding lists the phy phy as optional. As such, the
      driver should not bail out if it can't find a regulator. Instead it
      should just skip the remaining regulator related code and continue
      on normally.
      
      Skip the remainder of phy_power_on() if a regulator supply isn't
      available. This also gets rid of the bogus return code.
      
      Fixes: 2e12f536
      
       ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator")
      Signed-off-by: default avatarChen-Yu Tsai <wens@csie.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      939cc35d
    • Cong Wang's avatar
      net_sched: fix a NULL pointer deref in ipt action · 505aac7f
      Cong Wang authored
      [ Upstream commit 981471bd ]
      
      The net pointer in struct xt_tgdtor_param is not explicitly
      initialized therefore is still NULL when dereferencing it.
      So we have to find a way to pass the correct net pointer to
      ipt_destroy_target().
      
      The best way I find is just saving the net pointer inside the per
      netns struct tcf_idrinfo, which could make this patch smaller.
      
      Fixes: 0c66dc1e
      
       ("netfilter: conntrack: register hooks in netns when needed by ruleset")
      Reported-and-tested-by: default avatar <itugrok@yahoo.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      505aac7f
    • Vlad Buslov's avatar
      net: sched: act_sample: fix psample group handling on overwrite · 3c6dfd2a
      Vlad Buslov authored
      [ Upstream commit dbf47a2a ]
      
      Action sample doesn't properly handle psample_group pointer in overwrite
      case. Following issues need to be fixed:
      
      - In tcf_sample_init() function RCU_INIT_POINTER() is used to set
        s->psample_group, even though we neither setting the pointer to NULL, nor
        preventing concurrent readers from accessing the pointer in some way.
        Use rcu_swap_protected() instead to safely reset the pointer.
      
      - Old value of s->psample_group is not released or deallocated in any way,
        which results resource leak. Use psample_group_put() on non-NULL value
        obtained with rcu_swap_protected().
      
      - The function psample_group_put() that released reference to struct
        psample_group pointed by rcu-pointer s->psample_group doesn't respect rcu
        grace period when deallocating it. Extend struct psample_group with rcu
        head and use kfree_rcu when freeing it.
      
      Fixes: 5c5670fa
      
       ("net/sched: Introduce sample tc action")
      Signed-off-by: default avatarVlad Buslov <vladbu@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c6dfd2a
    • Feng Sun's avatar
      net: fix skb use after free in netpoll · 5038bd02
      Feng Sun authored
      [ Upstream commit 2c1644cf ]
      
      After commit baeababb
      
      
      ("tun: return NET_XMIT_DROP for dropped packets"),
      when tun_net_xmit drop packets, it will free skb and return NET_XMIT_DROP,
      netpoll_send_skb_on_dev will run into following use after free cases:
      1. retry netpoll_start_xmit with freed skb;
      2. queue freed skb in npinfo->txq.
      queue_process will also run into use after free case.
      
      hit netpoll_send_skb_on_dev first case with following kernel log:
      
      [  117.864773] kernel BUG at mm/slub.c:306!
      [  117.864773] invalid opcode: 0000 [#1] SMP PTI
      [  117.864774] CPU: 3 PID: 2627 Comm: loop_printmsg Kdump: loaded Tainted: P           OE     5.3.0-050300rc5-generic #201908182231
      [  117.864775] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  117.864775] RIP: 0010:kmem_cache_free+0x28d/0x2b0
      [  117.864781] Call Trace:
      [  117.864781]  ? tun_net_xmit+0x21c/0x460
      [  117.864781]  kfree_skbmem+0x4e/0x60
      [  117.864782]  kfree_skb+0x3a/0xa0
      [  117.864782]  tun_net_xmit+0x21c/0x460
      [  117.864782]  netpoll_start_xmit+0x11d/0x1b0
      [  117.864788]  netpoll_send_skb_on_dev+0x1b8/0x200
      [  117.864789]  __br_forward+0x1b9/0x1e0 [bridge]
      [  117.864789]  ? skb_clone+0x53/0xd0
      [  117.864790]  ? __skb_clone+0x2e/0x120
      [  117.864790]  deliver_clone+0x37/0x50 [bridge]
      [  117.864790]  maybe_deliver+0x89/0xc0 [bridge]
      [  117.864791]  br_flood+0x6c/0x130 [bridge]
      [  117.864791]  br_dev_xmit+0x315/0x3c0 [bridge]
      [  117.864792]  netpoll_start_xmit+0x11d/0x1b0
      [  117.864792]  netpoll_send_skb_on_dev+0x1b8/0x200
      [  117.864792]  netpoll_send_udp+0x2c6/0x3e8
      [  117.864793]  write_msg+0xd9/0xf0 [netconsole]
      [  117.864793]  console_unlock+0x386/0x4e0
      [  117.864793]  vprintk_emit+0x17e/0x280
      [  117.864794]  vprintk_default+0x29/0x50
      [  117.864794]  vprintk_func+0x4c/0xbc
      [  117.864794]  printk+0x58/0x6f
      [  117.864795]  loop_fun+0x24/0x41 [printmsg_loop]
      [  117.864795]  kthread+0x104/0x140
      [  117.864795]  ? 0xffffffffc05b1000
      [  117.864796]  ? kthread_park+0x80/0x80
      [  117.864796]  ret_from_fork+0x35/0x40
      
      Signed-off-by: default avatarFeng Sun <loyou85@gmail.com>
      Signed-off-by: default avatarXiaojun Zhao <xiaojunzhao141@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5038bd02
    • Eric Dumazet's avatar
      mld: fix memory leak in mld_del_delrec() · baa51358
      Eric Dumazet authored
      [ Upstream commit a84d0164 ]
      
      Similar to the fix done for IPv4 in commit e5b1c6c6
      ("igmp: fix memory leak in igmpv3_del_delrec()"), we need to
      make sure mca_tomb and mca_sources are not blindly overwritten.
      
      Using swap() then a call to ip6_mc_clear_src() will take care
      of the missing free.
      
      BUG: memory leak
      unreferenced object 0xffff888117d9db00 (size 64):
        comm "syz-executor247", pid 6918, jiffies 4294943989 (age 25.350s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 fe 88 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000005b463030>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<000000005b463030>] slab_post_alloc_hook mm/slab.h:522 [inline]
          [<000000005b463030>] slab_alloc mm/slab.c:3319 [inline]
          [<000000005b463030>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
          [<00000000939cbf94>] kmalloc include/linux/slab.h:552 [inline]
          [<00000000939cbf94>] kzalloc include/linux/slab.h:748 [inline]
          [<00000000939cbf94>] ip6_mc_add1_src net/ipv6/mcast.c:2236 [inline]
          [<00000000939cbf94>] ip6_mc_add_src+0x31f/0x420 net/ipv6/mcast.c:2356
          [<00000000d8972221>] ip6_mc_source+0x4a8/0x600 net/ipv6/mcast.c:449
          [<000000002b203d0d>] do_ipv6_setsockopt.isra.0+0x1b92/0x1dd0 net/ipv6/ipv6_sockglue.c:748
          [<000000001f1e2d54>] ipv6_setsockopt+0x89/0xd0 net/ipv6/ipv6_sockglue.c:944
          [<00000000c8f7bdf9>] udpv6_setsockopt+0x4e/0x90 net/ipv6/udp.c:1558
          [<000000005a9a0c5e>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3139
          [<00000000910b37b2>] __sys_setsockopt+0x10f/0x220 net/socket.c:2084
          [<00000000e9108023>] __do_sys_setsockopt net/socket.c:2100 [inline]
          [<00000000e9108023>] __se_sys_setsockopt net/socket.c:2097 [inline]
          [<00000000e9108023>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2097
          [<00000000f4818160>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:296
          [<000000008d367e8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1666d49e ("mld: do not remove mld souce list info when set link down")
      Fixes: 9c8bb163
      
       ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      baa51358
  2. Sep 06, 2019
    • Greg Kroah-Hartman's avatar
      Linux 5.2.13 · 218ca2e5
      Greg Kroah-Hartman authored
      v5.2.13
      218ca2e5
    • Benjamin Tissoires's avatar
      Revert "Input: elantech - enable SMBus on new (2018+) systems" · 4c634717
      Benjamin Tissoires authored
      This reverts commit 60956b01 which is
      commit 883a2a80
      
       upstream.
      
      This patch depends on an other series:
      https://patchwork.kernel.org/project/linux-input/list/?series=122327&state=%2A&archive=both
      
      It was a mistake to backport it in the v5.2 branch, as there
      is a high chance we encounter a touchpad that needs the series
      above.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=204733
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=204771
      Signed-off-by: default avatarBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c634717
    • Greg Kroah-Hartman's avatar
      Linux 5.2.12 · 140839fe
      Greg Kroah-Hartman authored
      v5.2.12
      140839fe
    • Greg Kroah-Hartman's avatar
      Revert "ASoC: Fail card instantiation if DAI format setup fails" · 5566d1c6
      Greg Kroah-Hartman authored
      This reverts commit ab4f4d33 which is
      commit 40aa5383
      
       upstream.
      
      Mark Brown writes:
      	I nacked this patch when Sasha posted it - it only improves
      	diagnostics and might make systems that worked by accident break
      	since it turns things into a hard failure, it won't make
      	anything that didn't work previously work.
      
      Reported-by: default avatarMark Brown <broonie@kernel.org>
      Cc: Ricard Wanderlof <ricardw@axis.com>
      Cc: Sasha Levin <sashal@kernel.org>
      Link: https://lore.kernel.org/lkml/20190904181027.GG4348@sirena.co.uk
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5566d1c6
    • Cong Wang's avatar
      hsr: switch ->dellink() to ->ndo_uninit() · 4d896602
      Cong Wang authored
      [ Upstream commit 311633b6 ]
      
      Switching from ->priv_destructor to dellink() has an unexpected
      consequence: existing RCU readers, that is, hsr_port_get_hsr()
      callers, may still be able to read the port list.
      
      Instead of checking the return value of each hsr_port_get_hsr(),
      we can just move it to ->ndo_uninit() which is called after
      device unregister and synchronize_net(), and we still have RTNL
      lock there.
      
      Fixes: b9a1e627 ("hsr: implement dellink to clean up resources")
      Fixes: edf070a0
      
       ("hsr: fix a NULL pointer deref in hsr_dev_xmit()")
      Reported-by: default avatar <syzbot+097ef84cdc95843fbaa8@syzkaller.appspotmail.com>
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4d896602
    • Cong Wang's avatar
      hsr: fix a NULL pointer deref in hsr_dev_xmit() · 072c9337
      Cong Wang authored
      [ Upstream commit edf070a0
      
       ]
      
      hsr_port_get_hsr() could return NULL and kernel
      could crash:
      
       BUG: kernel NULL pointer dereference, address: 0000000000000010
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 8000000074b84067 P4D 8000000074b84067 PUD 7057d067 PMD 0
       Oops: 0000 [#1] SMP PTI
       CPU: 0 PID: 754 Comm: a.out Not tainted 5.2.0-rc6+ #718
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
       RIP: 0010:hsr_dev_xmit+0x20/0x31
       Code: 48 8b 1b eb e0 5b 5d 41 5c c3 66 66 66 66 90 55 48 89 fd 48 8d be 40 0b 00 00 be 04 00 00 00 e8 ee f2 ff ff 48 89 ef 48 89 c6 <48> 8b 40 10 48 89 45 10 e8 6c 1b 00 00 31 c0 5d c3 66 66 66 66 90
       RSP: 0018:ffffb5b400003c48 EFLAGS: 00010246
       RAX: 0000000000000000 RBX: ffff9821b4509a88 RCX: 0000000000000000
       RDX: ffff9821b4509a88 RSI: 0000000000000000 RDI: ffff9821bc3fc7c0
       RBP: ffff9821bc3fc7c0 R08: 0000000000000000 R09: 00000000000c2019
       R10: 0000000000000000 R11: 0000000000000002 R12: ffff9821bc3fc7c0
       R13: ffff9821b4509a88 R14: 0000000000000000 R15: 000000000000006e
       FS:  00007fee112a1800(0000) GS:ffff9821bd800000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000010 CR3: 000000006e9ce000 CR4: 00000000000406f0
       Call Trace:
        <IRQ>
        netdev_start_xmit+0x1b/0x38
        dev_hard_start_xmit+0x121/0x21e
        ? validate_xmit_skb.isra.0+0x19/0x1e3
        __dev_queue_xmit+0x74c/0x823
        ? lockdep_hardirqs_on+0x12b/0x17d
        ip6_finish_output2+0x3d3/0x42c
        ? ip6_mtu+0x55/0x5c
        ? mld_sendpack+0x191/0x229
        mld_sendpack+0x191/0x229
        mld_ifc_timer_expire+0x1f7/0x230
        ? mld_dad_timer_expire+0x58/0x58
        call_timer_fn+0x12e/0x273
        __run_timers.part.0+0x174/0x1b5
        ? mld_dad_timer_expire+0x58/0x58
        ? sched_clock_cpu+0x10/0xad
        ? mark_lock+0x26/0x1f2
        ? __lock_is_held+0x40/0x71
        run_timer_softirq+0x26/0x48
        __do_softirq+0x1af/0x392
        irq_exit+0x53/0xa2
        smp_apic_timer_interrupt+0x1c4/0x1d9
        apic_timer_interrupt+0xf/0x20
        </IRQ>
      
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      072c9337
    • Cong Wang's avatar
      hsr: implement dellink to clean up resources · 08523d5a
      Cong Wang authored
      commit b9a1e627
      
       upstream.
      
      hsr_link_ops implements ->newlink() but not ->dellink(),
      which leads that resources not released after removing the device,
      particularly the entries in self_node_db and node_db.
      
      So add ->dellink() implementation to replace the priv_destructor.
      This also makes the code slightly easier to understand.
      
      Reported-by: default avatar <syzbot+c6167ec3de7def23d1e8@syzkaller.appspotmail.com>
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08523d5a
    • Daniel Borkmann's avatar
      bpf: fix use after free in prog symbol exposure · a282179b
      Daniel Borkmann authored
      commit c751798a upstream.
      
      syzkaller managed to trigger the warning in bpf_jit_free() which checks via
      bpf_prog_kallsyms_verify_off() for potentially unlinked JITed BPF progs
      in kallsyms, and subsequently trips over GPF when walking kallsyms entries:
      
        [...]
        8021q: adding VLAN 0 to HW filter on device batadv0
        8021q: adding VLAN 0 to HW filter on device batadv0
        WARNING: CPU: 0 PID: 9869 at kernel/bpf/core.c:810 bpf_jit_free+0x1e8/0x2a0
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Workqueue: events bpf_prog_free_deferred
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x113/0x167 lib/dump_stack.c:113
         panic+0x212/0x40b kernel/panic.c:214
         __warn.cold.8+0x1b/0x38 kernel/panic.c:571
         report_bug+0x1a4/0x200 lib/bug.c:186
         fixup_bug arch/x86/kernel/traps.c:178 [inline]
         do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
         do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
         invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
        RIP: 0010:bpf_jit_free+0x1e8/0x2a0
        Code: 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 86 00 00 00 48 ba 00 02 00 00 00 00 ad de 0f b6 43 02 49 39 d6 0f 84 5f fe ff ff <0f> 0b e9 58 fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1
        RSP: 0018:ffff888092f67cd8 EFLAGS: 00010202
        RAX: 0000000000000007 RBX: ffffc90001947000 RCX: ffffffff816e9d88
        RDX: dead000000000200 RSI: 0000000000000008 RDI: ffff88808769f7f0
        RBP: ffff888092f67d00 R08: fffffbfff1394059 R09: fffffbfff1394058
        R10: fffffbfff1394058 R11: ffffffff89ca02c7 R12: ffffc90001947002
        R13: ffffc90001947020 R14: ffffffff881eca80 R15: ffff88808769f7e8
        BUG: unable to handle kernel paging request at fffffbfff400d000
        #PF error: [normal kernel read fault]
        PGD 21ffee067 P4D 21ffee067 PUD 21ffed067 PMD 9f942067 PTE 0
        Oops: 0000 [#1] PREEMPT SMP KASAN
        CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Workqueue: events bpf_prog_free_deferred
        RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:495 [inline]
        RIP: 0010:bpf_tree_comp kernel/bpf/core.c:558 [inline]
        RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
        RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
        RIP: 0010:bpf_prog_kallsyms_find+0x107/0x2e0 kernel/bpf/core.c:632
        Code: 00 f0 ff ff 44 38 c8 7f 08 84 c0 0f 85 fa 00 00 00 41 f6 45 02 01 75 02 0f 0b 48 39 da 0f 82 92 00 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 03 0f 8e 45 01 00 00 8b 03 48 c1 e0
        [...]
      
      Upon further debugging, it turns out that whenever we trigger this
      issue, the kallsyms removal in bpf_prog_ksym_node_del() was /skipped/
      but yet bpf_jit_free() reported that the entry is /in use/.
      
      Problem is that symbol exposure via bpf_prog_kallsyms_add() but also
      perf_event_bpf_event() were done /after/ bpf_prog_new_fd(). Once the
      fd is exposed to the public, a parallel close request came in right
      before we attempted to do the bpf_prog_kallsyms_add().
      
      Given at this time the prog reference count is one, we start to rip
      everything underneath us via bpf_prog_release() -> bpf_prog_put().
      The memory is eventually released via deferred free, so we're seeing
      that bpf_jit_free() has a kallsym entry because we added it from
      bpf_prog_load() but /after/ bpf_prog_put() from the remote CPU.
      
      Therefore, move both notifications /before/ we install the fd. The
      issue was never seen between bpf_prog_alloc_id() and bpf_prog_new_fd()
      because upon bpf_prog_get_fd_by_id() we'll take another reference to
      the BPF prog, so we're still holding the original reference from the
      bpf_prog_load().
      
      Fixes: 6ee52e2a ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
      Fixes: 74451e66
      
       ("bpf: make jited programs visible in traces")
      Reported-by: default avatar <syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Song Liu <songliubraving@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a282179b
    • Greg Kroah-Hartman's avatar
      x86/ptrace: fix up botched merge of spectrev1 fix · 0d5014b8
      Greg Kroah-Hartman authored
      I incorrectly merged commit 31a2fbb3
      
       ("x86/ptrace: Fix possible
      spectre-v1 in ptrace_get_debugreg()") when backporting it, as was
      graciously pointed out at
      https://grsecurity.net/teardown_of_a_failed_linux_lts_spectre_fix.php
      
      Resolve the upstream difference with the stable kernel merge to properly
      protect things.
      
      Reported-by: default avatarBrad Spengler <spender@grsecurity.net>
      Cc: Dianzhang Chen <dianzhangchen0@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <bp@alien8.de>
      Cc: <hpa@zytor.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d5014b8