  1. Oct 02, 2019
    • Tuong Lien's avatar
      tipc: fix unlimited bundling of small messages · e95584a8
      Tuong Lien authored
      
      
      We have identified a problem with the "oversubscription" policy in the
      link transmission code.
      
      When small messages are transmitted, and the sending link has reached
      the transmit window limit, those messages will be bundled and put into
      the link backlog queue. However, bundles of data messages are counted
      at the 'CRITICAL' level, so that the counter for that level, instead of
      the counter for the real, bundled message's level is the one being
      increased.
      Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
      to be tested against the unchanged counter for their own level, while
      contributing to an unrestrained increase at the CRITICAL backlog level.
      
      This leaves a gap in the congestion control algorithm for small messages
      that can result in starvation for other users or a "real" CRITICAL
      user. Eventually, it can even lead to buffer exhaustion and a link reset.
      
      We fix this by keeping a 'target_bskb' buffer pointer at each level,
      so that, when bundling, we only bundle messages at the same importance
      level. This way, we know exactly how many slots a certain level
      has occupied in the queue, so we can manage level congestion accurately.
      
      By bundling messages at the same level, we gain further benefits.
      Consider this scenario:
      - One socket sends 64-byte messages at the 'CRITICAL' level;
      - Another sends 4096-byte messages at the 'LOW' level;
      
      When a 64-byte message comes and is bundled the first time, it incurs the
      overhead of a message bundle (+ 40-byte header, data copy, etc.)
      for later use, but the next message can be a 4096-byte one that cannot
      be bundled with the previous one. This means the last bundle carries only
      one payload message, which is totally inefficient, for the receiver
      as well! Later on, another 64-byte message comes, we make a new bundle,
      and the same story repeats...
      
      With the new bundling algorithm, this will not happen: the 64-byte
      messages will be bundled together even when 4096-byte messages
      come in between. However, if the 4096-byte messages are sent at the
      same level, i.e. 'CRITICAL', the bundling algorithm will again cause the
      same overhead.
      
      Also, the same will happen even with only one socket sending small
      messages at a rate close to the link transmit rate, so that, when one
      message is bundled, it is transmitted shortly afterwards. Then another
      message comes, a new bundle is created, and so on...
      
      We will solve this issue radically by another patch.
      
      Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
      Reported-by: Hoang Le <hoang.h.le@dektech.com.au>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e95584a8
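The per-level bundling described in the tipc commit above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the TIPC link code: `struct backlog`, `backlog_add()`, the level enum, `BUNDLE_CAPACITY` and `MAX_BUNDLES` are hypothetical names and values chosen for illustration; only the 40-byte header figure and the idea of one target bundle per level come from the commit text.

```c
#include <assert.h>
#include <stddef.h>

/* Importance levels; names follow the commit text, values are illustrative. */
enum { LVL_LOW, LVL_MEDIUM, LVL_HIGH, LVL_CRITICAL, LVL_COUNT };

#define BUNDLE_CAPACITY 1500   /* hypothetical per-bundle byte budget */
#define BUNDLE_HDR        40   /* header overhead quoted in the commit text */
#define MAX_BUNDLES       64   /* hypothetical queue size for this sketch */

struct bundle {
	int level;     /* importance level of every message inside */
	int used;      /* bytes consumed, header included */
	int msgs;      /* number of payload messages carried */
};

struct backlog {
	struct bundle queue[MAX_BUNDLES];
	int queued;                          /* slots in use */
	int len[LVL_COUNT];                  /* queue slots occupied per level */
	struct bundle *target[LVL_COUNT];    /* open bundle per level, or NULL */
};

/*
 * Queue one message. A message joins the open bundle of its own level when
 * it fits; otherwise it takes a fresh slot that is counted at its own level.
 * Returns 0 on success, -1 when the queue is full.
 */
static int backlog_add(struct backlog *bl, int level, int size)
{
	struct bundle *b;

	assert(level >= 0 && level < LVL_COUNT);
	b = bl->target[level];
	if (b && b->used + size <= BUNDLE_CAPACITY) {
		b->used += size;
		b->msgs++;
		return 0;
	}
	if (bl->queued == MAX_BUNDLES)
		return -1;
	b = &bl->queue[bl->queued++];
	b->level = level;
	b->used = BUNDLE_HDR + size;
	b->msgs = 1;
	bl->len[level]++;
	/* Only a slot that still has room stays open as a bundling target. */
	bl->target[level] = (b->used < BUNDLE_CAPACITY) ? b : NULL;
	return 0;
}
```

Interleaving 64-byte 'CRITICAL' messages with 4096-byte 'LOW' messages in this model leaves all small messages in one slot counted at the CRITICAL level, while each large message occupies its own slot at the LOW level. A large message sent at the CRITICAL level would close the open bundle, which is the remaining overhead case the commit message mentions.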
    • Dongli Zhang's avatar
      xen-netfront: do not use ~0U as error return value for xennet_fill_frags() · a761129e
      Dongli Zhang authored
      
      
      xennet_fill_frags() uses ~0U as return value when the sk_buff is not able
      to cache extra fragments. This is incorrect because the return type of
      xennet_fill_frags() is RING_IDX and 0xffffffff is a valid value for a
      ring buffer index.
      
      When rsp_cons is approaching 0xffffffff, the return value of
      xennet_fill_frags() may become 0xffffffff, which xennet_poll() (the
      caller) would regard as an error. As a result, queue->rx.rsp_cons is set
      incorrectly because it is updated only when there is an error. If there
      is no error, xennet_poll() is responsible for updating queue->rx.rsp_cons.
      Finally, queue->rx.rsp_cons would point to rx ring buffer entries whose
      queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL.
      This leads to a NULL pointer access in the next iteration that processes
      rx ring buffer entries.
      
      The symptom is similar to the one fixed in
      commit 00b36850 ("xen-netfront: do not assume sk_buff_head list is
      empty in error handling").
      
      This patch changes the return type of xennet_fill_frags() so that it
      indicates success or failure. queue->rx.rsp_cons is now always
      updated inside this function.
      
      Fixes: ad4f15dc ("xen/netfront: don't bug in case of too many frags")
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a761129e
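The sentinel collision described in the xen-netfront commit above can be reproduced in a few lines of userspace C. This is a simplified sketch, not the driver code: `ring_idx_t`, `struct rx_ring`, `fill_frags_sentinel()` and `fill_frags_status()` are hypothetical stand-ins, and the error path (skipping the whole batch) is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ring_idx_t;   /* stand-in for Xen's RING_IDX */

struct rx_ring {
	ring_idx_t rsp_cons;       /* consumer index; wraps around freely */
};

/* Old style: the new index is returned and ~0U doubles as the error marker. */
static ring_idx_t fill_frags_sentinel(ring_idx_t cons, unsigned int nr_frags,
				      unsigned int max_frags)
{
	if (nr_frags > max_frags)
		return ~0U;                /* error */
	return cons + nr_frags;        /* may legitimately equal ~0U */
}

/* New style: status in the return value, index updated inside the function. */
static int fill_frags_status(struct rx_ring *ring, unsigned int nr_frags,
			     unsigned int max_frags)
{
	assert(ring);
	/* Simplification: the batch is consumed on both paths. */
	ring->rsp_cons += nr_frags;
	return nr_frags > max_frags ? -1 : 0;
}
```

With a consumer index of 0xfffffffd and two fragments, the first variant returns 0xffffffff for a successful call, which a caller comparing against ~0U cannot tell apart from a failure. The second variant has no such overlap because the index and the status travel separately.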
    • David Ahern's avatar
      ipv6: Handle race in addrconf_dad_work · a3ce2a21
      David Ahern authored
      
      
      Rajendra reported a kernel panic when a link was taken down:
      
      [ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
      [ 6870.271856] IP: [<ffffffff8efc5764>] __ipv6_ifa_notify+0x154/0x290
      
      <snip>
      
      [ 6870.570501] Call Trace:
      [ 6870.573238] [<ffffffff8efc58c6>] ? ipv6_ifa_notify+0x26/0x40
      [ 6870.579665] [<ffffffff8efc98ec>] ? addrconf_dad_completed+0x4c/0x2c0
      [ 6870.586869] [<ffffffff8efe70c6>] ? ipv6_dev_mc_inc+0x196/0x260
      [ 6870.593491] [<ffffffff8efc9c6a>] ? addrconf_dad_work+0x10a/0x430
      [ 6870.600305] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
      [ 6870.606732] [<ffffffff8ea93a7a>] ? process_one_work+0x18a/0x430
      [ 6870.613449] [<ffffffff8ea93d6d>] ? worker_thread+0x4d/0x490
      [ 6870.619778] [<ffffffff8ea93d20>] ? process_one_work+0x430/0x430
      [ 6870.626495] [<ffffffff8ea99dd9>] ? kthread+0xd9/0xf0
      [ 6870.632145] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
      [ 6870.638573] [<ffffffff8ea99d00>] ? kthread_park+0x60/0x60
      [ 6870.644707] [<ffffffff8f01ae77>] ? ret_from_fork+0x57/0x70
      [ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0
      
      addrconf_dad_work is kicked to be scheduled when a device is brought
      up. There is a race between addrconf_dad_work getting scheduled and
      taking the rtnl lock and a process taking the link down (under rtnl).
      The latter removes the host route from the inet6_addr as part of
      addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
      to use the host route in ipv6_ifa_notify. If the down event removes
      the host route due to the race to the rtnl, then the BUG listed above
      occurs.
      
      This scenario does not occur when the ipv6 address is not kept
      (net.ipv6.conf.all.keep_addr_on_down = 0) as addrconf_ifdown sets the
      state of the ifp to DEAD. Handle when the addresses are kept by checking
      IF_READY which is reset by addrconf_ifdown.
      
      The 'dead' flag for an inet6_addr is set only under rtnl, in
      addrconf_ifdown and it means the device is getting removed (or IPv6 is
      disabled). The interesting cases for changing the idev flag are
      addrconf_notify (NETDEV_UP and NETDEV_CHANGE) and addrconf_ifdown
      (reset the flag). The former does not have the idev lock - only rtnl;
      the latter has both. Based on that the existing dead + IF_READY check
      can be moved to right after the rtnl_lock in addrconf_dad_work.
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Reported-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a3ce2a21
    • Eric Dumazet's avatar
      tcp: adjust rto_base in retransmits_timed_out() · 3256a2d6
      Eric Dumazet authored
      
      
      The cited commit exposed an old retransmits_timed_out() bug
      which assumed it could call tcp_model_timeout() with
      TCP_RTO_MIN as rto_base for all states.
      
      But flows in SYN_SENT or SYN_RECV state use a different
      RTO base (1 sec instead of 200 ms, unless BPF chooses
      another value).
      
      This caused a reduction of SYN retransmits from 6 to 4 with
      the default /proc/sys/net/ipv4/tcp_syn_retries value.
      
      Fixes: a41e8a88 ("tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3256a2d6
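To see why the RTO base matters in the tcp commit above, the timeout model can be worked through in userspace. The sketch below is an assumption-laden approximation of an exponential-backoff model in milliseconds, not the kernel's tcp_model_timeout(); `model_timeout_ms()` and `ilog2_u32()` are hypothetical helper names, and the constants restate the 200 ms and 1 s bases from the commit text plus a 120 s cap.

```c
#include <assert.h>

#define RTO_MAX_MS        120000u  /* cap on a single RTO: 120 s */
#define RTO_MIN_MS           200u  /* base for established flows: 200 ms */
#define TIMEOUT_INIT_MS     1000u  /* base for SYN_SENT/SYN_RECV: 1 s */

/* floor(log2(v)) for v > 0 */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	assert(v > 0);
	while (v >>= 1)
		r++;
	return r;
}

/*
 * Total time, in ms, covered by the initial transmission plus `boundary`
 * retransmits whose timers start at rto_base_ms and double until they are
 * capped at RTO_MAX_MS.
 */
static unsigned int model_timeout_ms(unsigned int boundary,
				     unsigned int rto_base_ms)
{
	unsigned int thresh = ilog2_u32(RTO_MAX_MS / rto_base_ms);

	if (boundary <= thresh)
		return ((2u << boundary) - 1) * rto_base_ms;
	return ((2u << thresh) - 1) * rto_base_ms +
	       (boundary - thresh) * RTO_MAX_MS;
}
```

For six retransmits the model yields 25.4 s with the 200 ms base and 127 s with the 1 s base. SYN retransmits that actually back off from 1 s fire at roughly 1, 3, 7, 15 and 31 s, so a 25.4 s budget expires after the fourth one, which is consistent with the reduction from 6 to 4 reported in the commit.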
    • Dexuan Cui's avatar
      vsock: Fix a lockdep warning in __vsock_release() · 0d9138ff
      Dexuan Cui authored
      
      
      Lockdep is unhappy if two locks from the same class are held.
      
      Fix the below warning for hyperv and virtio sockets (vmci socket code
      doesn't have the issue) by using lock_sock_nested() when __vsock_release()
      is called recursively:
      
      ============================================
      WARNING: possible recursive locking detected
      5.3.0+ #1 Not tainted
      --------------------------------------------
      server/1795 is trying to acquire lock:
      ffff8880c5158990 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 [hv_sock]
      
      but task is already holding lock:
      ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(sk_lock-AF_VSOCK);
        lock(sk_lock-AF_VSOCK);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by server/1795:
       #0: ffff8880c5d05ff8 (&sb->s_type->i_mutex_key#10){+.+.}, at: __sock_release+0x2d/0xa0
       #1: ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
      
      stack backtrace:
      CPU: 5 PID: 1795 Comm: server Not tainted 5.3.0+ #1
      Call Trace:
       dump_stack+0x67/0x90
       __lock_acquire.cold.67+0xd2/0x20b
       lock_acquire+0xb5/0x1c0
       lock_sock_nested+0x6d/0x90
       hvs_release+0x10/0x120 [hv_sock]
       __vsock_release+0x24/0xf0 [vsock]
       __vsock_release+0xa0/0xf0 [vsock]
       vsock_release+0x12/0x30 [vsock]
       __sock_release+0x37/0xa0
       sock_close+0x14/0x20
       __fput+0xc1/0x250
       task_work_run+0x98/0xc0
       do_exit+0x344/0xc60
       do_group_exit+0x47/0xb0
       get_signal+0x15c/0xc50
       do_signal+0x30/0x720
       exit_to_usermode_loop+0x50/0xa0
       do_syscall_64+0x24e/0x270
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7f4184e85f31
      
      Tested-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d9138ff
    • Johan Hovold's avatar
      hso: fix NULL-deref on tty open · 8353da9f
      Johan Hovold authored
      
      
      Fix NULL-pointer dereference on tty open due to a failure to handle a
      missing interrupt-in endpoint when probing modem ports:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000006
      	...
      	RIP: 0010:tiocmget_submit_urb+0x1c/0xe0 [hso]
      	...
      	Call Trace:
      	hso_start_serial_device+0xdc/0x140 [hso]
      	hso_serial_open+0x118/0x1b0 [hso]
      	tty_open+0xf1/0x490
      
      Fixes: 542f5482 ("tty: Modem functions for the HSO driver")
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8353da9f
    • Oleksij Rempel's avatar
      net: ag71xx: fix mdio subnode support · 569aad4f
      Oleksij Rempel authored
      
      
      This patch syncs the driver with the actual devicetree documentation:
      Documentation/devicetree/bindings/net/qca,ar71xx.txt
      |Optional subnodes:
      |- mdio : specifies the mdio bus, used as a container for phy nodes
      |  according to phy.txt in the same directory
      
      The driver was working with fixed phy without any noticeable issues. This bug
      was uncovered by introducing the dsa ar9331-switch driver.
      Since no one reported this bug until now, I assume nobody is using it
      and this patch should not break existing systems.
      
      Fixes: d51b6ce4 ("net: ethernet: add ag71xx driver")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      569aad4f
    • David S. Miller's avatar
      Merge branch 'stmmac-fixes' · b33210e3
      David S. Miller authored
      
      
      Jose Abreu says:
      
      ====================
      net: stmmac: Fixes for -net
      
      Misc fixes for -net tree. More info in commit logs.
      
      v2 is just a rebase of v1 against -net and we added a new patch (09/09) to
      fix RSS feature.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b33210e3
    • Jose Abreu's avatar
      net: stmmac: xgmac: Fix RSS writing wrong keys · 56627336
      Jose Abreu authored
      
      
      Commit b6b6cc9a changed the call to dwxgmac2_rss_write_reg(),
      passing it the variable cfg->key[i].
      
      As key is a u8 but we write 32 bits at a time, we need to cast it into
      a u32 so that the correct key values are written. Notice that the for
      loop already takes this into account so we don't try to write past the
      key's size.
      
      Fixes: b6b6cc9a ("net: stmmac: selftest: avoid large stack usage")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56627336
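The difference between writing one key byte per register and writing four key bytes per register, as described in the RSS key commit above, can be sketched as follows. Everything here is illustrative: `struct rss_cfg`, `struct fake_regs`, the two writer functions and the 40-byte key length are assumptions of the sketch, and `memcpy` is used in place of the pointer cast mentioned in the commit so that the example stays well defined in plain userspace C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RSS_KEY_BYTES 40   /* hypothetical key length for this sketch */

struct rss_cfg {
	uint8_t key[RSS_KEY_BYTES];
};

/* Stand-in for a 32-bit register file; slot i models the i-th key register. */
struct fake_regs {
	uint32_t key_reg[RSS_KEY_BYTES / sizeof(uint32_t)];
};

/* Buggy variant: each write carries a single key byte widened to 32 bits. */
static void write_key_bytewise(struct fake_regs *regs, const struct rss_cfg *cfg)
{
	size_t i;

	for (i = 0; i < sizeof(cfg->key) / sizeof(uint32_t); i++)
		regs->key_reg[i] = cfg->key[i];
}

/* Fixed variant: each write carries four consecutive key bytes. */
static void write_key_wordwise(struct fake_regs *regs, const struct rss_cfg *cfg)
{
	size_t i;
	uint32_t word;

	assert(regs && cfg);
	for (i = 0; i < sizeof(cfg->key) / sizeof(uint32_t); i++) {
		/* memcpy avoids the alignment and aliasing pitfalls of a cast */
		memcpy(&word, &cfg->key[i * sizeof(uint32_t)], sizeof(word));
		regs->key_reg[i] = word;
	}
}
```

In both variants the loop runs once per 32-bit register rather than once per key byte, so neither reads past the end of the key; only the word-wise variant transfers the complete key.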
    • Jose Abreu's avatar
      net: stmmac: xgmac: Fix RSS not writing all Keys to HW · 3c72d4d3
      Jose Abreu authored
      
      
      The sizeof(cfg->key) is != ARRAY_SIZE(cfg->key). Fix it. This warning is
      triggered when running with cc flag -Wsizeof-array-div.
      
      Reported-by: kbuild test robot <lkp@intel.com>
      Reported-by: Nick Desaulniers <ndesaulniers@google.com>
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Fixes: 76067459 ("net: stmmac: Implement RSS and enable it in XGMAC core")
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c72d4d3
    • Jose Abreu's avatar
      net: stmmac: xgmac: Disable the Timestamp interrupt by default · 30300d9f
      Jose Abreu authored
      
      
      We don't use it anyway as XGMAC only supports polling for timestamp (in
      current SW implementation). This greatly reduces the system load by
      reducing the number of interrupts.
      
      Fixes: 2142754f ("net: stmmac: Add MAC related callbacks for XGMAC2")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      30300d9f
    • Jose Abreu's avatar
      net: stmmac: Do not stop PHY if WoL is enabled · 3e2bf04f
      Jose Abreu authored
      
      
      If WoL is enabled we can't really stop the PHY, otherwise we will not
      receive the WoL packet. Fix this by telling phylink that only the MAC is
      down and only stop the PHY if WoL is not enabled.
      
      Fixes: 74371272 ("net: stmmac: Convert to phylink and remove phylib logic")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3e2bf04f
    • Jose Abreu's avatar
      net: stmmac: Correctly take timestamp for PTPv2 · 14f34733
      Jose Abreu authored
      
      
      The case for PTPV2_EVENT requires event packets to be captured so add
      this setting to the list of enabled captures.
      
      Fixes: 891434b1 ("stmmac: add IEEE PTPv1 and PTPv2 support.")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      14f34733
    • Jose Abreu's avatar
      net: stmmac: dwmac4: Always update the MAC Hash Filter · f79bfda3
      Jose Abreu authored
      
      
      We need to always update the MAC Hash Filter so that previous entries
      are invalidated.
      
      Found out while running stmmac selftests.
      
      Fixes: b8ef7020 ("net: stmmac: add support for hash table size 128/256 in dwmac4")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f79bfda3
    • Jose Abreu's avatar
      net: stmmac: selftests: Always use max DMA size in Jumbo Test · 432439fe
      Jose Abreu authored
      
      
      Although some XGMAC setups support frames larger than DMA size, some of
      them may not. As we can't know beforehand which ones support it, let's use
      the maximum DMA buffer size in the Jumbo Tests.
      
      User can always reconfigure the MTU to achieve larger frames.
      
      Fixes: 427849e8 ("net: stmmac: selftests: Add Jumbo Frame tests")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      432439fe
    • Jose Abreu's avatar
      net: stmmac: xgmac: Detect Hash Table size dinamically · c11986b9
      Jose Abreu authored
      
      
      Since commit b8ef7020 ("net: stmmac: add support for hash table size
      128/256 in dwmac4"), we can detect the Hash Table size dynamically.
      
      Let's implement this feature in XGMAC cores and fix possible setups that
      don't support the maximum size for Hash Table.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c11986b9
    • Jose Abreu's avatar
      net: stmmac: xgmac: Not all Unicast addresses may be available · 9a2ae7b3
      Jose Abreu authored
      
      
      Some setups may not have all Unicast address filters available. Let's
      check this before trying to set up filters.
      
      Fixes: 0efedbf1 ("net: stmmac: xgmac: Fix XGMAC selftests")
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a2ae7b3
    • Vasundhara Volam's avatar
      devlink: Fix error handling in param and info_get dumpit cb · 93c2fcb0
      Vasundhara Volam authored
      
      
      If any of the param or info_get ops returns an error, the dumpit cb
      skips dumping the remaining params or info_get ops for all the
      drivers.
      
      Fix this by not returning when a param/info_get op reports that it is
      not supported, and continue to dump the remaining information.
      
      v2: Modify the patch to return the error, except for params/info_get
      ops that return -EOPNOTSUPP, as suggested by Andrew Lunn. Also, modify
      the commit message to reflect the same.
      
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93c2fcb0
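The error policy described in the devlink commit above (skip ops that are not supported, stop on any other error) can be sketched as a plain loop. This is a generic userspace illustration, not the devlink dumpit code: `dump_op_t`, `dump_all()` and the three sample ops are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*dump_op_t)(void);   /* hypothetical per-driver dump callback */

/*
 * Run every op in turn. -EOPNOTSUPP from one op is skipped so that the
 * remaining ops still get dumped; any other error stops the dump and is
 * returned. *dumped receives the number of ops that succeeded.
 */
static int dump_all(const dump_op_t *ops, size_t n, size_t *dumped)
{
	size_t i;

	*dumped = 0;
	for (i = 0; i < n; i++) {
		int err = ops[i]();

		if (err == -EOPNOTSUPP)
			continue;
		if (err)
			return err;
		(*dumped)++;
	}
	return 0;
}

/* Sample ops used to exercise the three outcomes. */
static int op_ok(void)          { return 0; }
static int op_unsupported(void) { return -EOPNOTSUPP; }
static int op_broken(void)      { return -EIO; }
```

Calling `dump_all()` on the sequence ok, unsupported, ok dumps both supported entries and returns 0, whereas a sequence containing `op_broken` stops at that op and propagates its error.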
    • Wen Yang's avatar
      net: dsa: rtl8366rb: add missing of_node_put after calling of_get_child_by_name · f32eb9d8
      Wen Yang authored
      
      
      of_node_put() needs to be called once the device node obtained
      from of_get_child_by_name() is no longer in use.
      irq_domain_add_linear() also calls of_node_get() to increase refcount,
      so irq_domain will not be affected when it is released.
      
      Fixes: d8652956 ("net: dsa: realtek-smi: Add Realtek SMI driver")
      Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f32eb9d8
    • Wen Yang's avatar
      net: mscc: ocelot: add missing of_node_put after calling of_get_child_by_name · d2c50b1c
      Wen Yang authored
      
      
      of_node_put() needs to be called once the device node obtained
      from of_get_child_by_name() is no longer in use.
      In both the success and the failure case, we need to release 'ports',
      so clean up the code using goto.
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2c50b1c
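The goto-based cleanup mentioned in the ocelot commit above follows a common pattern: take a reference, and route every later exit through one label that drops it. The sketch below models that with a mock refcount in userspace; `struct node`, `get_child()`, `put_node()` and `probe_ports()` are hypothetical stand-ins, not the device tree API or the driver's probe function.

```c
#include <assert.h>
#include <stddef.h>

struct node { int refcount; };   /* stand-in for a refcounted device node */

/* Mock lookup: a successful lookup takes a reference on the child. */
static struct node *get_child(struct node *child)
{
	if (child)
		child->refcount++;
	return child;
}

/* Mock release: drops the reference taken by get_child(). */
static void put_node(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Probe-style function: every exit after the lookup goes through `out`. */
static int probe_ports(struct node *ports_node, int fail_midway)
{
	struct node *ports = get_child(ports_node);
	int err = 0;

	if (!ports)
		return -1;        /* no reference was taken, nothing to release */
	if (fail_midway) {
		err = -2;         /* simulated setup failure */
		goto out;
	}
	/* ... per-port setup would go here ... */
out:
	put_node(ports);
	return err;
}
```

Whether `probe_ports()` succeeds or fails midway, the node's reference count ends where it started, which is the property the commit restores.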
    • Vladimir Oltean's avatar
      net: sched: cbs: Avoid division by zero when calculating the port rate · 83c8c3cf
      Vladimir Oltean authored
      
      
      As explained in the "net: sched: taprio: Avoid division by zero on
      invalid link speed" commit, it is legal for the ethtool API to return
      zero as a link speed. So guard against it to ensure we don't perform a
      division by zero in kernel.
      
      Fixes: e0a7683d ("net/sched: cbs: fix port_rate miscalculation")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83c8c3cf
    • Vladimir Oltean's avatar
      net: sched: taprio: Avoid division by zero on invalid link speed · 9a9251a3
      Vladimir Oltean authored
      
      
      The check in taprio_set_picos_per_byte is currently not robust enough
      and will trigger this division by zero, due to e.g. PHYLINK not setting
      kset->base.speed when there is no PHY connected:
      
      [   27.109992] Division by zero in kernel.
      [   27.113842] CPU: 1 PID: 198 Comm: tc Not tainted 5.3.0-rc5-01246-gc4006b8c2637-dirty #212
      [   27.121974] Hardware name: Freescale LS1021A
      [   27.126234] [<c03132e0>] (unwind_backtrace) from [<c030d8b8>] (show_stack+0x10/0x14)
      [   27.133938] [<c030d8b8>] (show_stack) from [<c10b21b0>] (dump_stack+0xb0/0xc4)
      [   27.141124] [<c10b21b0>] (dump_stack) from [<c10af97c>] (Ldiv0_64+0x8/0x18)
      [   27.148052] [<c10af97c>] (Ldiv0_64) from [<c0700260>] (div64_u64+0xcc/0xf0)
      [   27.154978] [<c0700260>] (div64_u64) from [<c07002d0>] (div64_s64+0x4c/0x68)
      [   27.161993] [<c07002d0>] (div64_s64) from [<c0f3d890>] (taprio_set_picos_per_byte+0xe8/0xf4)
      [   27.170388] [<c0f3d890>] (taprio_set_picos_per_byte) from [<c0f3f614>] (taprio_change+0x668/0xcec)
      [   27.179302] [<c0f3f614>] (taprio_change) from [<c0f2bc24>] (qdisc_create+0x1fc/0x4f4)
      [   27.187091] [<c0f2bc24>] (qdisc_create) from [<c0f2c0c8>] (tc_modify_qdisc+0x1ac/0x6f8)
      [   27.195055] [<c0f2c0c8>] (tc_modify_qdisc) from [<c0ee9604>] (rtnetlink_rcv_msg+0x268/0x2dc)
      [   27.203449] [<c0ee9604>] (rtnetlink_rcv_msg) from [<c0f4fef0>] (netlink_rcv_skb+0xe0/0x114)
      [   27.211756] [<c0f4fef0>] (netlink_rcv_skb) from [<c0f4f6cc>] (netlink_unicast+0x1b4/0x22c)
      [   27.219977] [<c0f4f6cc>] (netlink_unicast) from [<c0f4fa84>] (netlink_sendmsg+0x284/0x340)
      [   27.228198] [<c0f4fa84>] (netlink_sendmsg) from [<c0eae5fc>] (sock_sendmsg+0x14/0x24)
      [   27.235988] [<c0eae5fc>] (sock_sendmsg) from [<c0eaedf8>] (___sys_sendmsg+0x214/0x228)
      [   27.243863] [<c0eaedf8>] (___sys_sendmsg) from [<c0eb015c>] (__sys_sendmsg+0x50/0x8c)
      [   27.251652] [<c0eb015c>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
      [   27.259524] Exception stack(0xe8045fa8 to 0xe8045ff0)
      [   27.264546] 5fa0:                   b6f608c8 000000f8 00000003 bed7e2f0 00000000 00000000
      [   27.272681] 5fc0: b6f608c8 000000f8 004ce54c 00000128 5d3ce8c7 00000000 00000026 00505c9c
      [   27.280812] 5fe0: 00000070 bed7e298 004ddd64 b6dd1e64
      
      Russell King points out that the ethtool API says zero is a valid return
      value of __ethtool_get_link_ksettings:
      
         * If it is enabled then they are read-only; if the link
         * is up they represent the negotiated link mode; if the link is down,
         * the speed is 0, %SPEED_UNKNOWN or the highest enabled speed and
         * @duplex is %DUPLEX_UNKNOWN or the best enabled duplex mode.
      
        So, it seems that taprio is not following the API... I'd suggest either
        fixing taprio, or getting agreement to change the ethtool API.
      
      The chosen path was to fix taprio.
      
      Fixes: 7b9eba7b ("net/sched: taprio: fix picos_per_byte miscalculation")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a9251a3
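The guard described in the taprio and cbs commits above boils down to validating the reported link speed before dividing by it. The following is a minimal userspace sketch of that idea, not the qdisc code: `picos_per_byte()` and the 10 Mb/s fallback are assumptions made for illustration, and the value -1 for an unknown speed mirrors ethtool's SPEED_UNKNOWN.

```c
#include <assert.h>
#include <stdint.h>

#define SPEED_UNKNOWN_MBPS  (-1)   /* mirrors ethtool's SPEED_UNKNOWN */
#define FALLBACK_MBPS        10    /* assumption of this sketch */

/*
 * Picoseconds needed to put one byte on the wire at speed_mbps:
 * 8 bits * 10^12 ps/s / (speed_mbps * 10^6 bit/s) = 8 * 10^6 / speed_mbps.
 */
static int64_t picos_per_byte(int speed_mbps)
{
	/* 0 and SPEED_UNKNOWN are both legal reports for a link that is down. */
	if (speed_mbps == 0 || speed_mbps == SPEED_UNKNOWN_MBPS)
		speed_mbps = FALLBACK_MBPS;
	return (int64_t)8 * 1000 * 1000 / speed_mbps;
}
```

A 1000 Mb/s link yields 8000 ps per byte, while a reported speed of 0 or -1 falls back to the assumed 10 Mb/s figure instead of reaching the division with an invalid divisor.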
    • David S. Miller's avatar
      Merge tag 'mac80211-for-davem-2019-10-01' of... · 9cfc3702
      David S. Miller authored
      Merge tag 'mac80211-for-davem-2019-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      
      
      Johannes Berg says:
      
      ====================
      A small list of fixes this time:
       * two null pointer dereference fixes
       * a fix for preempt-enabled/BHs-enabled (lockdep) splats
         (that correctly pointed out a bug)
       * a fix for multi-BSSID ordering assumptions
       * a fix for the EDMG support, on-stack chandefs need to
         be initialized properly (now that they're bigger)
       * beacon (head) data from userspace should be validated
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9cfc3702
    • Arnd Bergmann's avatar
      ionic: select CONFIG_NET_DEVLINK · 6de6d185
      Arnd Bergmann authored
      
      
      When no other driver selects the devlink library code, ionic
      produces a link failure:
      
      drivers/net/ethernet/pensando/ionic/ionic_devlink.o: In function `ionic_devlink_alloc':
      ionic_devlink.c:(.text+0xd): undefined reference to `devlink_alloc'
      drivers/net/ethernet/pensando/ionic/ionic_devlink.o: In function `ionic_devlink_register':
      ionic_devlink.c:(.text+0x71): undefined reference to `devlink_register'
      
      Add the same 'select' statement that the other drivers use here.
      
      Fixes: fbfb8031 ("ionic: Add hardware init and device commands")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6de6d185
    • Adam Zerella's avatar
      docs: networking: Add title caret and missing doc · c5f75a14
      Adam Zerella authored
      
      
      Resolving a couple of Sphinx documentation warnings
      that are generated in the networking section.
      
      - WARNING: document isn't included in any toctree
      - WARNING: Title underline too short.
      
      Signed-off-by: Adam Zerella <adam.zerella@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5f75a14
    • Lorenzo Bianconi's avatar
      net: socionext: netsec: always grab descriptor lock · 55131dec
      Lorenzo Bianconi authored
      
      
      Always acquire the tx descriptor spinlock even if an XDP program is not
      loaded on the netsec device, since ndo_xdp_xmit can run concurrently with
      netsec_netdev_start_xmit and netsec_clean_tx_dring. This can happen
      when loading an XDP program on a different device (e.g. virtio-net), as
      xdp_do_redirect_map/xdp_do_redirect_slow can redirect to netsec even if
      we do not have an XDP program on it.
      
      Fixes: ba2b2321 ("net: netsec: add XDP support")
      Tested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      55131dec
  2. Oct 01, 2019
    • Johannes Berg's avatar
      mac80211: keep BHs disabled while calling drv_tx_wake_queue() · d8dec42b
      Johannes Berg authored
      Drivers typically expect this, as it's the case for almost all cases
      where this is called (i.e. from the TX path). Also, the code in mac80211
      itself (if the driver calls ieee80211_tx_dequeue()) expects this as it
      uses this_cpu_ptr() without additional protection.
      
      This should fix various reports of the problem:
      https://bugzilla.kernel.org/show_bug.cgi?id=204127
      https://lore.kernel.org/linux-wireless/CAN5HydrWb3o_FE6A1XDnP1E+xS66d5kiEuhHfiGKkLNQokx13Q@mail.gmail.com/
      https://lore.kernel.org/lkml/nycvar.YFH.7.76.1909111238470.473@cbobk.fhfr.pm/
      
      
      
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
      Reported-by: Aaron Hill <aa1ronham@gmail.com>
      Reported-by: Lukas Redlinger <rel+kernel@agilox.net>
      Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
      Fixes: 21a5d4c3 ("mac80211: add stop/start logic for software TXQs")
      Link: https://lore.kernel.org/r/1569928763-I3e8838c5ecad878e59d4a94eb069a90f6641461a@changeid
      
      
      Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      d8dec42b
    • Miaoqing Pan's avatar
      mac80211: fix txq null pointer dereference · 8ed31a26
      Miaoqing Pan authored
      
      
      If the interface type is P2P_DEVICE or NAN, reading the file
      '/sys/kernel/debug/ieee80211/phyx/netdev:wlanx/aqm' triggers a
      NULL pointer dereference, because for those interface types the
      pointer sdata->vif.txq is NULL.
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000011
      CPU: 1 PID: 30936 Comm: cat Not tainted 4.14.104 #1
      task: ffffffc0337e4880 task.stack: ffffff800cd20000
      PC is at ieee80211_if_fmt_aqm+0x34/0xa0 [mac80211]
      LR is at ieee80211_if_fmt_aqm+0x34/0xa0 [mac80211]
      [...]
      Process cat (pid: 30936, stack limit = 0xffffff800cd20000)
      [...]
      [<ffffff8000b7cd00>] ieee80211_if_fmt_aqm+0x34/0xa0 [mac80211]
      [<ffffff8000b7c414>] ieee80211_if_read+0x60/0xbc [mac80211]
      [<ffffff8000b7ccc4>] ieee80211_if_read_aqm+0x28/0x30 [mac80211]
      [<ffffff80082eff94>] full_proxy_read+0x2c/0x48
      [<ffffff80081eef00>] __vfs_read+0x2c/0xd4
      [<ffffff80081ef084>] vfs_read+0x8c/0x108
      [<ffffff80081ef494>] SyS_read+0x40/0x7c
      
      Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/1569549796-8223-1-git-send-email-miaoqing@codeaurora.org
      
      
      [trim useless data from commit message]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      8ed31a26
    • Miaoqing Pan's avatar
      nl80211: fix null pointer dereference · b501426c
      Miaoqing Pan authored
      
      
      If the interface is not in MESH mode, the command 'iw wlanx mpath del'
      causes a kernel panic.

      The root cause is a NULL pointer access in mpp_flush_by_proxy(), as the
      pointer 'sdata->u.mesh.mpp_paths' is NULL for non-MESH interfaces.
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000068
      [...]
      PC is at _raw_spin_lock_bh+0x20/0x5c
      LR is at mesh_path_del+0x1c/0x17c [mac80211]
      [...]
      Process iw (pid: 4537, stack limit = 0xd83e0238)
      [...]
      [<c021211c>] (_raw_spin_lock_bh) from [<bf8c7648>] (mesh_path_del+0x1c/0x17c [mac80211])
      [<bf8c7648>] (mesh_path_del [mac80211]) from [<bf6cdb7c>] (extack_doit+0x20/0x68 [compat])
      [<bf6cdb7c>] (extack_doit [compat]) from [<c05c309c>] (genl_rcv_msg+0x274/0x30c)
      [<c05c309c>] (genl_rcv_msg) from [<c05c25d8>] (netlink_rcv_skb+0x58/0xac)
      [<c05c25d8>] (netlink_rcv_skb) from [<c05c2e14>] (genl_rcv+0x20/0x34)
      [<c05c2e14>] (genl_rcv) from [<c05c1f90>] (netlink_unicast+0x11c/0x204)
      [<c05c1f90>] (netlink_unicast) from [<c05c2420>] (netlink_sendmsg+0x30c/0x370)
      [<c05c2420>] (netlink_sendmsg) from [<c05886d0>] (sock_sendmsg+0x70/0x84)
      [<c05886d0>] (sock_sendmsg) from [<c0589f4c>] (___sys_sendmsg.part.3+0x188/0x228)
      [<c0589f4c>] (___sys_sendmsg.part.3) from [<c058add4>] (__sys_sendmsg+0x4c/0x70)
      [<c058add4>] (__sys_sendmsg) from [<c0208c80>] (ret_fast_syscall+0x0/0x44)
      Code: e2822c02 e2822001 e5832004 f590f000 (e1902f9f)
      ---[ end trace bbd717600f8f884d ]---
      
      Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
      Link: https://lore.kernel.org/r/1569485810-761-1-git-send-email-miaoqing@codeaurora.org
      
      
      [trim useless data from commit message]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      b501426c
    • Johannes Berg's avatar
      cfg80211: initialize on-stack chandefs · f43e5210
      Johannes Berg authored
      
      
      In a few places we don't properly initialize on-stack chandefs,
      leaving the EDMG data non-zero, which broke things.

      Additionally, in a few places we rely on the driver to initialize the
      data completely, but perhaps we shouldn't, as non-EDMG drivers
      may not initialize the EDMG data. Initialize it there as well.
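
      As a minimal sketch of the pattern described above (the struct and
      function names here are hypothetical stand-ins, not the cfg80211
      definitions), zero-initializing the on-stack structure before handing
      it to a callback that fills only some fields keeps the remaining
      fields at 0:

```c
#include <stdint.h>

/* Hypothetical stand-in for a channel definition; not the kernel struct. */
struct chan_def_sketch {
    uint32_t center_freq;
    uint32_t width;
    uint8_t  edmg_channels;   /* extension field an older driver never touches */
    uint8_t  edmg_bw_config;  /* likewise */
};

/* Models a non-EDMG driver callback that fills in only the fields it knows. */
static void legacy_driver_fill(struct chan_def_sketch *def)
{
    def->center_freq = 5180;
    def->width = 20;
}

static struct chan_def_sketch get_channel(void)
{
    /* Zero-initialize first so untouched fields read as 0, not stack garbage. */
    struct chan_def_sketch def = {0};

    legacy_driver_fill(&def);
    return def;
}
```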
      
      Cc: stable@vger.kernel.org
      Fixes: 2a38075c ("nl80211: Add support for EDMG channels")
      Reported-by: Dmitry Osipenko <digetx@gmail.com>
      Tested-by: Dmitry Osipenko <digetx@gmail.com>
      Link: https://lore.kernel.org/r/1569239475-I2dcce394ecf873376c386a78f31c2ec8b538fa25@changeid
      
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      f43e5210
    • Johannes Berg's avatar
      cfg80211: validate SSID/MBSSID element ordering assumption · 242b0931
      Johannes Berg authored
      The code copying the data assumes that the SSID element is
      before the MBSSID element, but since the data is untrusted
      from the AP, this cannot be guaranteed.
      
      Validate that this is indeed the case and ignore the MBSSID
      otherwise, to avoid having to deal with both cases for the
      copy of data that should be between them.
      
      Cc: stable@vger.kernel.org
      Fixes: 0b8fb823 ("cfg80211: Parsing of Multiple BSSID information in scanning")
      Link: https://lore.kernel.org/r/1569009255-I1673911f5eae02964e21bdc11b2bf58e5e207e59@changeid
      
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      242b0931
    • Johannes Berg's avatar
      nl80211: validate beacon head · f88eb7c0
      Johannes Berg authored
      We currently don't validate the beacon head, i.e. the header,
      fixed part and elements that are to go in front of the TIM
      element. This means that the variable elements there can be
      malformed, e.g. have a length exceeding the buffer size, but
      most downstream code from this assumes that this has already
      been checked.
      
      Add the necessary checks to the netlink policy.
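
      The kind of check meant here can be sketched as follows. This is an
      illustration only, not the nl80211 policy code: it assumes the usual
      element layout of a one-byte ID, a one-byte length and then the
      payload, and it covers only the variable elements, not the header or
      the fixed part. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if every (id, len, payload) element fits inside the buffer. */
static bool elements_well_formed(const uint8_t *data, size_t len)
{
    size_t pos = 0;

    while (pos < len) {
        size_t elen;

        /* Need at least the ID byte and the length byte. */
        if (len - pos < 2)
            return false;

        elen = data[pos + 1];

        /* The declared payload must not run past the end of the buffer. */
        if (elen > len - pos - 2)
            return false;

        pos += 2 + elen;
    }
    return true;
}
```

      Rejecting a malformed buffer once, at the point where it enters the
      kernel, lets downstream code walk the elements without repeating the
      bounds checks.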
      
      Cc: stable@vger.kernel.org
      Fixes: ed1b6cc7 ("cfg80211/nl80211: add beacon settings")
      Link: https://lore.kernel.org/r/1569009255-I7ac7fbe9436e9d8733439eab8acbbd35e55c74ef@changeid
      
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      f88eb7c0
    • Vladimir Oltean's avatar
      net: sched: taprio: Fix potential integer overflow in taprio_set_picos_per_byte · 68ce6688
      Vladimir Oltean authored
      
      
      The speed divisor is used in a context expecting an s64, but it is
      evaluated using 32-bit arithmetic.
      
      To avoid that happening, instead of multiplying by 1,000,000 in the
      first place, simplify the fraction and do a standard 32-bit division
      instead.
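
      The arithmetic can be sketched as below. This is an illustration under
      assumptions, not the taprio code: the function names are hypothetical,
      the speed is taken to be in Mbps, and unsigned arithmetic is used for
      the 32-bit divisor so that the wraparound is well defined (signed
      overflow would be undefined behaviour).

```c
#include <stdint.h>

/* Divisor (speed in bit/s) as 32-bit arithmetic would compute it. */
static uint32_t divisor_32bit(uint32_t speed_mbps)
{
    return speed_mbps * 1000u * 1000u;   /* wraps modulo 2^32 */
}

/*
 * Simplified fraction: (10^12 ps/s * 8 bit/byte) / (speed * 10^6 bit/s)
 * reduces to (8 * 10^6) / speed, so every operand fits in 32 bits.
 */
static int64_t picos_per_byte(int32_t speed_mbps)
{
    return (1000000 * 8) / speed_mbps;
}
```

      At 10000 Mbps the 32-bit divisor wraps to 1410065408 instead of
      10^10, while the simplified form yields 800 ps per byte.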
      
      Fixes: f04b514c ("taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte")
      Reported-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68ce6688
    • Navid Emamdoost's avatar
      net: dsa: sja1105: Prevent leaking memory · 68501df9
      Navid Emamdoost authored
      
      
      In sja1105_static_config_upload, in two cases memory is leaked: when
      static_config_buf_prepare_for_upload fails and when sja1105_inhibit_tx
      fails. In both cases config_buf should be released.
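
      A common way to release a buffer on every exit path is a single
      cleanup label. The sketch below uses hypothetical names and stubbed
      failure flags rather than the sja1105 functions, and a counter that
      exists only to make the leak observable:

```c
#include <errno.h>
#include <stdlib.h>

static int live_buffers;   /* allocation counter, for illustration only */

static void *buf_alloc(size_t n)
{
    void *p = calloc(1, n);

    if (p)
        live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p)
        live_buffers--;
    free(p);
}

/* fail_prepare and fail_inhibit stand in for the two failing steps. */
static int config_upload(int fail_prepare, int fail_inhibit)
{
    int rc = 0;
    char *config_buf = buf_alloc(64);

    if (!config_buf)
        return -ENOMEM;

    if (fail_prepare) {
        rc = -EINVAL;
        goto out;          /* previously a bare return would leak here */
    }
    if (fail_inhibit) {
        rc = -EIO;
        goto out;
    }
    /* ... the actual upload would happen here ... */
out:
    buf_free(config_buf);
    return rc;
}
```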
      
      Fixes: 8aa9ebcc ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
      Fixes: 1a4c6940 ("net: dsa: sja1105: Prevent PHY jabbering during switch reset")
      Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68501df9
    • Vladimir Oltean's avatar
      net: dsa: sja1105: Ensure PTP time for rxtstamp reconstruction is not in the past · b6f2494d
      Vladimir Oltean authored
      
      
      Sometimes the PTP synchronization on the switch 'jumps':
      
        ptp4l[11241.155]: rms    8 max   16 freq -21732 +/-  11 delay   742 +/-   0
        ptp4l[11243.157]: rms    7 max   17 freq -21731 +/-  10 delay   744 +/-   0
        ptp4l[11245.160]: rms 33592410 max 134217731 freq +192422 +/- 8530253 delay   743 +/-   0
        ptp4l[11247.163]: rms 811631 max 964131 freq +10326 +/- 557785 delay   743 +/-   0
        ptp4l[11249.166]: rms 261936 max 533876 freq -304323 +/- 126371 delay   744 +/-   0
        ptp4l[11251.169]: rms 48700 max 57740 freq -20218 +/- 30532 delay   744 +/-   0
        ptp4l[11253.171]: rms 14570 max 30163 freq  -5568 +/- 7563 delay   742 +/-   0
        ptp4l[11255.174]: rms 2914 max 3440 freq -22001 +/- 1667 delay   744 +/-   1
        ptp4l[11257.177]: rms  811 max 1710 freq -22653 +/- 451 delay   744 +/-   1
        ptp4l[11259.180]: rms  177 max  218 freq -21695 +/-  89 delay   741 +/-   0
        ptp4l[11261.182]: rms   45 max   92 freq -21677 +/-  32 delay   742 +/-   0
        ptp4l[11263.186]: rms   14 max   32 freq -21733 +/-  11 delay   742 +/-   0
        ptp4l[11265.188]: rms    9 max   14 freq -21725 +/-  12 delay   742 +/-   0
        ptp4l[11267.191]: rms    9 max   16 freq -21727 +/-  13 delay   742 +/-   0
        ptp4l[11269.194]: rms    6 max   15 freq -21726 +/-   9 delay   743 +/-   0
        ptp4l[11271.197]: rms    8 max   15 freq -21728 +/-  11 delay   743 +/-   0
        ptp4l[11273.200]: rms    6 max   12 freq -21727 +/-   8 delay   743 +/-   0
        ptp4l[11275.202]: rms    9 max   17 freq -21720 +/-  11 delay   742 +/-   0
        ptp4l[11277.205]: rms    9 max   18 freq -21725 +/-  12 delay   742 +/-   0
      
      Background: the switch only offers partial RX timestamps (24 bits) and
      it is up to the driver to read the PTP clock to fill those timestamps up
      to 64 bits. But the PTP clock readout needs to happen quickly enough
      (within 0.135 seconds, in fact), otherwise the PTP clock will wrap
      around 24 bits, a condition which cannot be detected.
      
      Looking at the 'max 134217731' value on output line 3, one can see that
      in hex it is 0x8000003. Because the PTP clock resolution is 8 ns,
      that means 0x1000000 in ticks, which is exactly 2^24. So indeed this is
      a PTP clock wraparound, but the reason might be surprising.
      
      What is going on is that sja1105_tstamp_reconstruct(priv, now, ts)
      expects a "now" time that is later than the time at which "ts" was
      snapshotted.
      This, of course, is obvious: we read the PTP time _after_ the partial RX
      timestamp was received. However, the workqueue is processing frames from
      a skb queue and reuses the same PTP time, read once at the beginning.
      Normally the skb queue only contains one frame and all goes well. But
      when the skb queue contains two frames, the second frame that gets
      dequeued might have been partially timestamped by the RX MAC _after_ we
      had read our PTP time initially.
      
      The code was originally like that due to concerns that SPI access for
      PTP time readout is a slow process, and we are time-constrained anyway
      (aka: premature optimization). But some timing analysis reveals that the
      time spent until the RX timestamp is completely reconstructed is 1 order
      of magnitude lower than the 0.135 s deadline even under worst-case
      conditions. So we can afford to read the PTP time for each frame in the
      RX timestamping queue, which of course ensures that the full PTP time is
      in the partial timestamp's future.
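
      The failure mode can be reproduced with a small model of the
      reconstruction. This is a sketch under assumptions, not the driver
      function: it takes only the 24-bit width from the text above, works in
      clock ticks, and uses a hypothetical function name.

```c
#include <stdint.h>

#define TS_BITS 24
#define TS_MASK ((UINT64_C(1) << TS_BITS) - 1)

/* Extend a 24-bit partial timestamp using a full-width "now" readout. */
static uint64_t tstamp_reconstruct(uint64_t now, uint64_t ts_partial)
{
    uint64_t ts = (now & ~TS_MASK) | ts_partial;

    /*
     * If the low bits of "now" are not ahead of the partial timestamp,
     * assume the counter wrapped once between the snapshot and the readout.
     */
    if ((now & TS_MASK) <= ts_partial)
        ts -= TS_MASK + 1;

    return ts;
}
```

      With now = 0x5000010 and a partial timestamp of 0x000020 taken after
      that readout, the model returns 0x4000020 rather than 0x5000020. That
      is 2^24 ticks in the past, or 134217728 ns at 8 ns per tick, which
      matches the size of the jump in the log above.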
      
      Fixes: f3097be2 ("net: dsa: sja1105: Add a state machine for RX timestamping")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6f2494d
    • David S. Miller's avatar
      Merge tag 'ieee802154-for-davem-2019-09-28' of... · 3755ee22
      David S. Miller authored
      Merge tag 'ieee802154-for-davem-2019-09-28' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
      
      
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154 for net 2019-09-28
      
      An update from ieee802154 for your *net* tree.
      
      Three driver fixes. Navid Emamdoost fixed a memory leak on an error
      path in the ca8210 driver, Johan Hovold fixed a use-after-free found
      by syzbot in the atusb driver and Christophe JAILLET makes sure
      __skb_put_data is used instead of memcpy in the mcr20a driver.

      I switched from branches to tags here to be pulled from. So far they
      are not annotated and not signed. Once I have fixed my scripts, the
      tags should contain these messages as annotations. If you want them
      signed as well, just tell me. If there are any problems, let me know.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3755ee22
    • Martin KaFai Lau's avatar
      net: Unpublish sk from sk_reuseport_cb before call_rcu · 8c7138b3
      Martin KaFai Lau authored
      
      
      The "reuse->sock[]" array is shared by multiple sockets.  The going away
      sk must unpublish itself from "reuse->sock[]" before making call_rcu()
      call.  However, this unpublish-action is currently done after a grace
      period and it may cause use-after-free.
      
      The fix is to move reuseport_detach_sock() to sk_destruct().
      Due to the above reason, any socket with sk_reuseport_cb has
      to go through the rcu grace period before freeing it.
      
      It is a rather old bug (~3 yrs).  The Fixes tag is not necessarily
      the right commit but it is the one that introduced the SOCK_RCU_FREE
      logic and this fix depends on it.
      
      Fixes: a4298e45 ("net: add SOCK_RCU_FREE socket flag")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c7138b3
    • Haishuang Yan's avatar
      erspan: remove the incorrect mtu limit for erspan · 0e141f75
      Haishuang Yan authored
      
      
      The erspan driver calls ether_setup(). Since commit 61e84623
      ("net: centralize net_device min/max MTU checking"), this limits the
      MTU to the range [min_mtu, max_mtu], which is [68, 1500] by default.

      As a result, the MTU of an erspan device cannot be set above 1500,
      a limit that is not correct for an ipgre tap device.
      
      Tested:
      Before patch:
      # ip link set erspan0 mtu 1600
      Error: mtu greater than device maximum.
      After patch:
      # ip link set erspan0 mtu 1600
      # ip -d link show erspan0
      21: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1600 qdisc noop state DOWN
      mode DEFAULT group default qlen 1000
          link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 0
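
      The range check behind the error message can be modelled as below.
      This is a sketch, not the core networking code: the function name is
      hypothetical, and it assumes that a max_mtu of 0 means "no upper
      bound", which is what the "maxmtu 0" in the output above suggests.

```c
#include <stdbool.h>

/* Returns true if new_mtu is acceptable for the given device limits. */
static bool mtu_allowed(int new_mtu, unsigned int min_mtu, unsigned int max_mtu)
{
    if (new_mtu < 0 || (unsigned int)new_mtu < min_mtu)
        return false;

    /* Assumption: max_mtu == 0 disables the upper-bound check. */
    if (max_mtu > 0 && (unsigned int)new_mtu > max_mtu)
        return false;

    return true;
}
```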
      
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e141f75
    • Eric Dumazet's avatar
      sch_cbq: validate TCA_CBQ_WRROPT to avoid crash · e9789c7c
      Eric Dumazet authored
      
      
      syzbot reported a crash in cbq_normalize_quanta() caused
      by an out of range cl->priority.
      
      iproute2 enforces this check, but malicious users do not.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN PTI
      Modules linked in:
      CPU: 1 PID: 26447 Comm: syz-executor.1 Not tainted 5.3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:cbq_normalize_quanta.part.0+0x1fd/0x430 net/sched/sch_cbq.c:902
      RSP: 0018:ffff8801a5c333b0 EFLAGS: 00010206
      RAX: 0000000020000003 RBX: 00000000fffffff8 RCX: ffffc9000712f000
      RDX: 00000000000043bf RSI: ffffffff83be8962 RDI: 0000000100000018
      RBP: ffff8801a5c33420 R08: 000000000000003a R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000002ef
      R13: ffff88018da95188 R14: dffffc0000000000 R15: 0000000000000015
      FS:  00007f37d26b1700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004c7cec CR3: 00000001bcd0a006 CR4: 00000000001626f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       [<ffffffff83be9d57>] cbq_normalize_quanta include/net/pkt_sched.h:27 [inline]
       [<ffffffff83be9d57>] cbq_addprio net/sched/sch_cbq.c:1097 [inline]
       [<ffffffff83be9d57>] cbq_set_wrr+0x2d7/0x450 net/sched/sch_cbq.c:1115
       [<ffffffff83bee8a7>] cbq_change_class+0x987/0x225b net/sched/sch_cbq.c:1537
       [<ffffffff83b96985>] tc_ctl_tclass+0x555/0xcd0 net/sched/sch_api.c:2329
       [<ffffffff83a84655>] rtnetlink_rcv_msg+0x485/0xc10 net/core/rtnetlink.c:5248
       [<ffffffff83cadf0a>] netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2510
       [<ffffffff83a7db6d>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5266
       [<ffffffff83cac2c6>] netlink_unicast_kernel net/netlink/af_netlink.c:1324 [inline]
       [<ffffffff83cac2c6>] netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1350
       [<ffffffff83cacd4a>] netlink_sendmsg+0x89a/0xd50 net/netlink/af_netlink.c:1939
       [<ffffffff8399d46e>] sock_sendmsg_nosec net/socket.c:673 [inline]
       [<ffffffff8399d46e>] sock_sendmsg+0x12e/0x170 net/socket.c:684
       [<ffffffff8399f1fd>] ___sys_sendmsg+0x81d/0x960 net/socket.c:2359
       [<ffffffff839a2d05>] __sys_sendmsg+0x105/0x1d0 net/socket.c:2397
       [<ffffffff839a2df9>] SYSC_sendmsg net/socket.c:2406 [inline]
       [<ffffffff839a2df9>] SyS_sendmsg+0x29/0x30 net/socket.c:2404
       [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305
       [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
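
      A range check of this kind, done in the kernel rather than left to the
      userspace tool, can be sketched as follows. The struct, the function
      and the limit of 8 are assumptions made for illustration; this is not
      the sch_cbq code.

```c
#include <errno.h>

#define SKETCH_MAXPRIO 8   /* assumed upper limit, for illustration only */

/* Hypothetical stand-in for the option block supplied by userspace. */
struct wrr_opt_sketch {
    unsigned char priority;
};

static long quanta[SKETCH_MAXPRIO + 1];

static int set_wrr(const struct wrr_opt_sketch *opt, long weight)
{
    /* Reject out-of-range values before they are used as an array index. */
    if (opt->priority > SKETCH_MAXPRIO)
        return -EINVAL;

    quanta[opt->priority] += weight;
    return 0;
}
```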
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9789c7c
    • Michal Vokáč's avatar
      net: dsa: qca8k: Use up to 7 ports for all operations · 7ae6d93c
      Michal Vokáč authored
      
      
      The QCA8K family supports up to 7 ports. So use the existing
      QCA8K_NUM_PORTS define to allocate the switch structure and limit all
      operations with the switch ports.
      
      This was not an issue until commit 0394a63a ("net: dsa: enable and
      disable all ports") disabled all unused ports. Since the unused ports 7-11
      are outside of the correct register range on this switch, some registers
      were overwritten with invalid content.
      
      Fixes: 6b93fb46 ("net-next: dsa: add new driver for qca8xxx family")
      Fixes: a0c02161 ("net: dsa: variable number of ports")
      Fixes: 0394a63a ("net: dsa: enable and disable all ports")
      Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ae6d93c