Commits · 7eab14de73a8028f770e703962c5437a2b0dda82 · Mirrors / github.com / sifive_riscv-linux

Jan 20, 2021

mdio, phy: fix -Wshadow warnings triggered by nested container_of() · 7eab14de

Alexander Lobakin authored Jan 16, 2021



The container_of() macro declares a local variable '__mptr' internally.
This becomes a problem when several container_of() calls are nested
within a single line or inside plain macros.
As the C preprocessor doesn't support generating random variable names,
the sole solution is to avoid defining macros that consist only of
container_of() calls, or they will self-shadow '__mptr' each time:

In file included from ./include/linux/bitmap.h:10,
                 from drivers/net/phy/phy_device.c:12:
drivers/net/phy/phy_device.c: In function ‘phy_device_release’:
./include/linux/kernel.h:693:8: warning: declaration of ‘__mptr’ shadows a previous local [-Wshadow]
  693 |  void *__mptr = (void *)(ptr);     \
      |        ^~~~~~
./include/linux/phy.h:647:26: note: in expansion of macro ‘container_of’
  647 | #define to_phy_device(d) container_of(to_mdio_device(d), \
      |                          ^~~~~~~~~~~~
./include/linux/mdio.h:52:27: note: in expansion of macro ‘container_of’
   52 | #define to_mdio_device(d) container_of(d, struct mdio_device, dev)
      |                           ^~~~~~~~~~~~
./include/linux/phy.h:647:39: note: in expansion of macro ‘to_mdio_device’
  647 | #define to_phy_device(d) container_of(to_mdio_device(d), \
      |                                       ^~~~~~~~~~~~~~
drivers/net/phy/phy_device.c:217:8: note: in expansion of macro ‘to_phy_device’
  217 |  kfree(to_phy_device(dev));
      |        ^~~~~~~~~~~~~
./include/linux/kernel.h:693:8: note: shadowed declaration is here
  693 |  void *__mptr = (void *)(ptr);     \
      |        ^~~~~~
./include/linux/phy.h:647:26: note: in expansion of macro ‘container_of’
  647 | #define to_phy_device(d) container_of(to_mdio_device(d), \
      |                          ^~~~~~~~~~~~
drivers/net/phy/phy_device.c:217:8: note: in expansion of macro ‘to_phy_device’
  217 |  kfree(to_phy_device(dev));
      |        ^~~~~~~~~~~~~

As they are declared in header files, these warnings are highly
repetitive and very annoying (along with the one from linux/pci.h).

Convert the related macros from linux/{mdio,phy}.h to static inlines
to avoid self-shadowing and potentially improve bug-catching.
No functional changes implied.
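
A minimal userspace sketch of the conversion described above, assuming a simplified container_of() (GNU statement expression, needs GCC or Clang) and cut-down stand-ins for struct device, struct mdio_device and struct phy_device. The struct layouts and the phy_roundtrip_ok() helper are illustrative assumptions, not the kernel definitions. Because each static inline has its own scope, the two '__mptr' declarations no longer nest:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel macro; it declares a local '__mptr'. */
#define container_of(ptr, type, member) ({                      \
	void *__mptr = (void *)(ptr);                           \
	((type *)((char *)__mptr - offsetof(type, member))); })

/* Cut-down, hypothetical versions of the real structures. */
struct device { int id; };
struct mdio_device { int addr; struct device dev; };
struct phy_device { int phy_id; struct mdio_device mdio; };

/* Each helper expands container_of() exactly once in its own scope. */
static inline struct mdio_device *to_mdio_device(const struct device *dev)
{
	return container_of(dev, struct mdio_device, dev);
}

static inline struct phy_device *to_phy_device(const struct device *dev)
{
	return container_of(to_mdio_device(dev), struct phy_device, mdio);
}

/* Hypothetical check: going down to the device and back up is a no-op. */
static inline int phy_roundtrip_ok(struct phy_device *phy)
{
	assert(phy != NULL);
	return to_phy_device(&phy->mdio.dev) == phy;
}
```

Unlike the macro form, the inline functions also type-check their argument, which is presumably the "improve bug-catching" part mentioned above.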

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210116161246.67075-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vhost_net: avoid tx queue stuck when sendmsg fails · dc9c9e72

Yunjian Wang authored Jan 15, 2021



Currently the driver doesn't drop a packet which can't be sent by tun
(e.g. a bad packet). In this case, the driver keeps processing the
same packet, which leaves the tx queue stuck.

To fix this issue:
1. in the case of persistent failure (e.g. bad packet), the driver
   can skip this descriptor by ignoring the error.
2. in the case of transient failure (e.g. -ENOBUFS, -EAGAIN and -ENOMEM),
   the driver schedules the worker to try again.
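
The two cases can be sketched in userspace C as follows. This is only an illustration of the classification, not the vhost_net code: the enum, tx_classify() and tx_drain() are hypothetical names, and the array of result codes stands in for the return values of sendmsg().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum tx_action { TX_SENT, TX_RETRY_LATER, TX_SKIP_DESCRIPTOR };

/* Transient failures listed in the commit message: worth retrying. */
static inline bool tx_error_is_transient(int err)
{
	return err == -EAGAIN || err == -ENOMEM || err == -ENOBUFS;
}

static inline enum tx_action tx_classify(int err)
{
	if (err >= 0)
		return TX_SENT;
	return tx_error_is_transient(err) ? TX_RETRY_LATER : TX_SKIP_DESCRIPTOR;
}

/*
 * Simulate draining a queue: returns how many descriptors were consumed
 * before a transient error made the worker stop and try again later.
 */
static inline int tx_drain(const int *results, int n)
{
	int consumed = 0;

	assert(results != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		if (tx_classify(results[i]) == TX_RETRY_LATER)
			break;		/* keep the descriptor, reschedule */
		consumed++;		/* sent, or skipped as a bad packet */
	}
	return consumed;
}
```

With this split, a persistent error consumes the descriptor and the queue moves on, while a transient one leaves it in place for the next run.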

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1610685980-38608-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Jan 19, 2021

net: hns: fix variable used when DEBUG is defined · 99d51897

Tom Rix authored Jan 17, 2021



When DEBUG is defined, this error occurs:

drivers/net/ethernet/hisilicon/hns/hns_enet.c:1505:36: error:
  ‘struct net_device’ has no member named ‘ae_handle’;
  did you mean ‘rx_handler’?
  assert(skb->queue_mapping < ndev->ae_handle->q_num);
                                    ^~~~~~~~~

ae_handle is an element of struct hns_nic_priv, so change
ndev to priv.

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20210117191044.533725-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

arcnet: fix macro name when DEBUG is defined · 7cfabe4f

Tom Rix authored Jan 17, 2021



When DEBUG is defined, this error occurs:

drivers/net/arcnet/com20020_cs.c:70:15: error: ‘com20020_REG_W_ADDR_HI’
  undeclared (first use in this function);
  did you mean ‘COM20020_REG_W_ADDR_HI’?
       ioaddr, com20020_REG_W_ADDR_HI);
               ^~~~~~~~~~~~~~~~~~~~~~

From the surrounding context, the compiler's suggestion is what was intended.

Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/20210117181519.527625-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'tls-device-offload-for-bond' · be7f4578

Jakub Kicinski authored Jan 18, 2021



Tariq Toukan says:

====================
TLS device offload for Bond

This series opens TX and RX TLS device offload for bond interfaces.
This allows bond interfaces to benefit from capable lower devices.

We add a new ndo_sk_get_lower_dev() to be used to get the lower dev that
corresponds to a given socket.
The TLS module uses it to interact directly with the lowest device in the
chain, and invoke the control operations in tlsdev_ops. This means that the
bond interface doesn't have its own struct tlsdev_ops instance and
derived logic/callbacks.

To keep simple track of the HW and SW TLS contexts, we bind each socket to
a specific lower device for the socket's whole lifetime. This is logically
valid (and similar to the SW kTLS behavior) in the following bond configuration,
so we restrict the offload support to it:

((mode == balance-xor) or (mode == 802.3ad))
and xmit_hash_policy == layer3+4.

In this design, TLS TX/RX offload feature flags of the bond device are
independent from the lower devices. They reflect the current features state,
but are not directly controllable.
This is because the bond driver is bypassed by the call to
ndo_sk_get_lower_dev(), without it knowing who the caller is.
The bond TLS feature flags are set/cleared only according to the configuration
of the mode and xmit_hash_policy.

Bypass is true only for the control flow. Packets in fast path still go through
the bond logic.

The design here differs from the xfrm/ipsec offload, where the bond driver
has its own copy of struct xfrmdev_ops and callbacks.
====================

Link: https://lore.kernel.org/r/20210117145949.8632-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/tls: Except bond interface from some TLS checks · 4e5a7332

Tariq Toukan authored Jan 17, 2021



In the tls_dev_event handler, ignore the tlsdev_ops requirement for bond
interfaces: they have no tlsdev_ops, as the interaction is done directly
with the lower device.

Also, make the validate function pass when it's called with the upper
bond interface.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/tls: Device offload to use lowest netdevice in chain · 153cbd13

Tariq Toukan authored Jan 17, 2021



Do not call the tls_dev_ops of upper devices. Instead, ask them
for the proper lowest device and communicate with it directly.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/bonding: Declare TLS RX device offload support · dc5809f9

Tariq Toukan authored Jan 17, 2021



Following the description in the previous patch (for TX):
As the bond interface is being bypassed by the TLS module, interacting
directly against the lower devs, there is no way for the bond interface
to disable its device offload capabilities, as long as the mode/policy
config allows it.
Hence, the feature flag is not directly controllable, but just reflects
the offload status based on the logic under bond_sk_check().

Here we just declare RX device offload support, and expose it via the
NETIF_F_HW_TLS_RX flag.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/bonding: Implement TLS TX device offload · 89df6a81

Tariq Toukan authored Jan 17, 2021



Implement TLS TX device offload for bonding interfaces.
This allows kTLS sockets running on a bond to benefit from the
device offload on capable lower devices.

To allow simple and fast maintenance of the TLS context in SW and
lower devices, we bind the TLS socket to a specific lower dev.
To achieve a behavior similar to SW kTLS, we support only balance-xor
and 802.3ad modes, with xmit_hash_policy=layer3+4. This is enforced
in bond_sk_check(), done in a previous patch.

For the above configuration, the SW implementation keeps picking the
same exact lower dev for all the socket's SKBs. The device offload
behaves similarly, making the decision once at the connection creation.

Per socket, the TLS module should work directly with the lowest netdev
in chain, to call the tls_dev_ops operations.

As the bond interface is being bypassed by the TLS module, interacting
directly against the lower devs, there is no way for the bond interface
to disable its device offload capabilities, as long as the mode/policy
config allows it.
Hence, the feature flag is not directly controllable, but just reflects
the current offload status based on the logic under bond_sk_check().

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/bonding: Take update_features call out of XFRM function · f45583de

Tariq Toukan authored Jan 17, 2021



In preparation for more cases that call netdev_update_features().

While here, move the features logic to the stage where struct bond
is already updated, and pass it as the only parameter to function
bond_set_xfrm_features().

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/bonding: Implement ndo_sk_get_lower_dev · 007feb87

Tariq Toukan authored Jan 17, 2021



Add ndo_sk_get_lower_dev() implementation for bond interfaces.

Support only the cases where the socket's and SKBs' hashes
yield an identical value for the whole connection lifetime.

Here we restrict it to L3+4 sockets only, with
xmit_hash_policy==LAYER34 and bond modes xor/802.3ad.
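
A minimal sketch of that restriction as a predicate, in the spirit of the bond_sk_check() mentioned elsewhere in this series. All sketch_* types and enumerators below are hypothetical stand-ins for the bonding driver's own definitions, which are not shown here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the bonding driver's mode and policy values. */
enum sketch_bond_mode {
	SKETCH_MODE_ROUNDROBIN,
	SKETCH_MODE_ACTIVEBACKUP,
	SKETCH_MODE_XOR,
	SKETCH_MODE_8023AD,
};

enum sketch_xmit_policy {
	SKETCH_POLICY_LAYER2,
	SKETCH_POLICY_LAYER34,
	SKETCH_POLICY_LAYER23,
};

struct sketch_bond {
	enum sketch_bond_mode mode;
	enum sketch_xmit_policy xmit_policy;
};

/* True only for (balance-xor or 802.3ad) combined with layer3+4 hashing. */
static inline bool sketch_bond_sk_check(const struct sketch_bond *bond)
{
	assert(bond != NULL);
	switch (bond->mode) {
	case SKETCH_MODE_XOR:
	case SKETCH_MODE_8023AD:
		return bond->xmit_policy == SKETCH_POLICY_LAYER34;
	default:
		return false;
	}
}
```

Only in these configurations does every SKB of a connection hash to the same lower device, which is what lets a socket be bound to one device for its lifetime.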

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/bonding: Take IP hash logic into a helper · 5b998545

Tariq Toukan authored Jan 17, 2021



Hash logic on L3 will be used in a downstream patch for one more use
case.
Move it into a function for better code reuse.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: netdevice: Add operation ndo_sk_get_lower_dev · 719a402c

Tariq Toukan authored Jan 17, 2021



ndo_sk_get_lower_dev returns the lower netdev that corresponds to
a given socket.
Additionally, we implement a helper netdev_sk_get_lowest_dev() to get
the lowest one in chain.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/qla3xxx: switch from 'pci_' to 'dma_' API · 41fb4c1b

Christophe JAILLET authored Jan 17, 2021



The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'ql_alloc_net_req_rsp_queues()' GFP_KERNEL can
be used because it is only called from 'ql_alloc_mem_resources()' which
already calls 'ql_alloc_buffer_queues()' which uses GFP_KERNEL. (see below)

When memory is allocated in 'ql_alloc_buffer_queues()' GFP_KERNEL can be
used because this flag is already used just a few lines above.

When memory is allocated in 'ql_alloc_small_buffers()' GFP_KERNEL can
be used because it is only called from 'ql_alloc_mem_resources()' which
already calls 'ql_alloc_buffer_queues()' which uses GFP_KERNEL. (see above)

When memory is allocated in 'ql_alloc_mem_resources()' GFP_KERNEL can be
used because this function already calls 'ql_alloc_buffer_queues()' which
uses GFP_KERNEL. (see above)

While at it, use 'dma_set_mask_and_coherent()' instead of 'dma_set_mask()/
dma_set_coherent_mask()' in order to slightly simplify code.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20210117081542.560021-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net_sched: fix RTNL deadlock again caused by request_module() · d349f997

Cong Wang authored Jan 16, 2021

tcf_action_init_1() loads tc action modules automatically with
request_module() after parsing the tc action names, and it drops the RTNL
lock before request_module() and re-acquires it afterwards. This causes a
lot of trouble, as discovered by syzbot, because we can be in the
middle of batch initializations when we create an array of tc actions.

One of the problems is a deadlock:

CPU 0					CPU 1
rtnl_lock();
for (...) {
  tcf_action_init_1();
    -> rtnl_unlock();
    -> request_module();
				rtnl_lock();
				for (...) {
				  tcf_action_init_1();
				    -> tcf_idr_check_alloc();
				   // Insert one action into idr,
				   // but it is not committed until
				   // tcf_idr_insert_many(), then drop
				   // the RTNL lock in the _next_
				   // iteration
				   -> rtnl_unlock();
    -> rtnl_lock();
    -> a_o->init();
      -> tcf_idr_check_alloc();
      // Now waiting for the same index
      // to be committed
				    -> request_module();
				    -> rtnl_lock()
				    // Now waiting for RTNL lock
				}
				rtnl_unlock();
}
rtnl_unlock();

This is not easy to solve. We can move the request_module() before
this loop and pre-load all the modules we need for this netlink
message and then do the rest of the initialization. So the loop is
now split into two:

        for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                struct tc_action_ops *a_o;

                a_o = tc_action_load_ops(name, tb[i]...);
                ops[i - 1] = a_o;
        }

        for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                act = tcf_action_init_1(ops[i - 1]...);
        }
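
The same two-phase shape can be sketched as a self-contained userspace program. Everything here is a hypothetical stand-in: sketch_load_ops() plays the role of the step that may load a module, the static registry replaces the real action ops lookup, and no locking is modelled. The point is only that all lookups finish before any action is initialized:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_ACTIONS 4

struct sketch_ops { const char *kind; };

/* Hypothetical registry standing in for the registered action modules. */
static const struct sketch_ops sketch_registry[] = {
	{ "gact" }, { "mirred" }, { "police" },
};

/* Phase 1 helper: resolve a name to its ops (the step that may block). */
static inline const struct sketch_ops *sketch_load_ops(const char *name)
{
	size_t n = sizeof(sketch_registry) / sizeof(sketch_registry[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(sketch_registry[i].kind, name) == 0)
			return &sketch_registry[i];
	return NULL;
}

/*
 * Returns the number of actions initialized, or -1 if any name is
 * unknown, in which case nothing has been initialized yet.
 */
static inline int sketch_init_actions(const char *const names[], int n,
				      const struct sketch_ops *out[])
{
	const struct sketch_ops *ops[SKETCH_MAX_ACTIONS];

	assert(n >= 0 && n <= SKETCH_MAX_ACTIONS);

	/* Loop 1: pre-load every ops entry needed by this request. */
	for (int i = 0; i < n; i++) {
		ops[i] = sketch_load_ops(names[i]);
		if (!ops[i])
			return -1;
	}

	/* Loop 2: initialize, with no further loading in between. */
	for (int i = 0; i < n; i++)
		out[i] = ops[i];
	return n;
}
```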

Although this looks serious, it has only been reported by syzbot, so it
seems hard for humans to trigger. And given the size of this patch,
I'd suggest applying it to net-next and not backporting it to stable.

This patch has been tested by syzbot and tested with tdc.py by me.

Fixes: 0fedc63f ("net_sched: commit action insertions together")
Reported-and-tested-by: <syzbot+82752bc5331601cf4899@syzkaller.appspotmail.com>
Reported-and-tested-by: <syzbot+b3b63b6bff456bd95294@syzkaller.appspotmail.com>
Reported-by: <syzbot+ba67b12b1ca729912834@syzkaller.appspotmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20210117005657.14810-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: national: remove definition of DEBUG · 6ea9309a

Tom Rix authored Jan 15, 2021



Defining DEBUG should only be done in development.
So remove DEBUG.

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20210115235346.289611-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-make-udp-tunnel-devices-support-fraglist' · c080559a

Jakub Kicinski authored Jan 18, 2021



Xin Long says:

====================
net: make udp tunnel devices support fraglist

Like GRE devices, UDP tunnel devices should also support fraglist, so
that HW GSO for protocols (like SCTP) that require NETIF_F_FRAGLIST
in the dev can work, especially when the lower device supports both
NETIF_F_GSO_UDP_TUNNEL and NETIF_F_GSO_SCTP.
====================

Link: https://lore.kernel.org/r/cover.1610704037.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bareudp: add NETIF_F_FRAGLIST flag for dev features · 3224dcfd

Xin Long authored Jan 15, 2021



Like vxlan and geneve, bareudp also needs this dev feature
to support some protocol's HW GSO.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

geneve: add NETIF_F_FRAGLIST flag for dev features · 18423e1a

Xin Long authored Jan 15, 2021



HW GSO for some protocols, like SCTP, requires fraglist support in the
device. Without NETIF_F_FRAGLIST set in the dev features of geneve, SW
GSO would have to be done before the packets enter the driver, even
when the geneve dev and lower dev (like veth) both have the
NETIF_F_GSO_SCTP feature.

So this patch adds it for geneve.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vxlan: add NETIF_F_FRAGLIST flag for dev features · cb2c5711

Xin Long authored Jan 15, 2021



HW GSO for some protocols, like SCTP, requires fraglist support in the
device. Without NETIF_F_FRAGLIST set in the dev features of vxlan, SW
GSO would have to be done before the packets enter the driver, even
when the vxlan dev and lower dev (like veth) both have the
NETIF_F_GSO_SCTP feature.

So this patch adds it for vxlan.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

hv_netvsc: Add (more) validation for untrusted Hyper-V values · 505e3f00

Andrea Parri (Microsoft) authored Jan 14, 2021

For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest. Ensure that invalid values cannot cause indexing
off the end of an array, or subvert an existing validation via integer
overflow. Ensure that outgoing packets do not have any leftover guest
memory that has not been zeroed out.
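
The kinds of checks described above can be illustrated with a small userspace sketch. The sketch_* helpers are hypothetical and are not taken from the hv_netvsc driver. They show an index check, a range check written so that the sum offset + len is never computed (and so cannot wrap), and zeroing of the unused tail of an outgoing buffer:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* An untrusted index must be strictly below the array size. */
static inline bool sketch_index_ok(uint32_t index, uint32_t count)
{
	return index < count;
}

/*
 * An untrusted (offset, len) pair must lie inside a buffer of buf_len
 * bytes. Two comparisons are used so that no addition can overflow.
 */
static inline bool sketch_range_ok(uint32_t offset, uint32_t len,
				   uint32_t buf_len)
{
	return offset <= buf_len && len <= buf_len - offset;
}

/* Clear everything after the bytes actually used in an outgoing buffer. */
static inline void sketch_zero_tail(unsigned char *buf, size_t used,
				    size_t cap)
{
	assert(buf != NULL && used <= cap);
	memset(buf + used, 0, cap - used);
}
```

A naive check of the form `offset + len <= buf_len` would accept offset = 0xFFFFFFF0 with len = 0x20 in 32-bit arithmetic, since the sum wraps to 0x10; the form above rejects it.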

Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20210114202628.119541-1-parri.andrea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: check vlan with eth_type_vlan() method · a98c0c47

Menglong Dong authored Jan 17, 2021



Replace some checks for ETH_P_8021Q and ETH_P_8021AD with
eth_type_vlan().
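
For illustration, a userspace sketch of what such a helper checks, assuming the standard EtherType values 0x8100 (802.1Q) and 0x88A8 (802.1ad). The sketch_* names are hypothetical and this is not the kernel's definition; like the kernel helper, it takes the EtherType in network byte order:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_P_8021Q	0x8100	/* 802.1Q VLAN tag */
#define SKETCH_ETH_P_8021AD	0x88A8	/* 802.1ad service VLAN tag */

/* One helper instead of two open-coded comparisons at every call site. */
static inline bool sketch_eth_type_vlan(uint16_t ethertype_be)
{
	return ethertype_be == htons(SKETCH_ETH_P_8021Q) ||
	       ethertype_be == htons(SKETCH_ETH_P_8021AD);
}

/* Example caller: the EtherType sits at bytes 12..13 of an Ethernet frame. */
static inline bool sketch_frame_is_vlan(const unsigned char *frame, size_t len)
{
	uint16_t ethertype_be;

	assert(frame != NULL);
	if (len < 14)
		return false;
	memcpy(&ethertype_be, frame + 12, sizeof(ethertype_be));
	return sketch_eth_type_vlan(ethertype_be);
}
```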

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210117080950.122761-1-dong.menglong@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-ipa-interconnect-improvements' · 220723dc

Jakub Kicinski authored Jan 18, 2021



Alex Elder says:

====================
net: ipa: interconnect improvements

The main outcome of this series is to allow the number of
interconnects used by the IPA to differ from the three that
are implemented now.  With this series in place, any number
of interconnects can now be used, all specified in the
configuration data for a specific platform.

A few minor interconnect-related cleanups are implemented as well.
====================

Link: https://lore.kernel.org/r/20210115125050.20555-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ipa: allow arbitrary number of interconnects · ea151e19