  1. May 04, 2021
  2. May 03, 2021
    • bpf: Fix leakage of uninitialized bpf stack under speculation · 801c6058
      Daniel Borkmann authored
      The currently implemented mechanisms to mitigate data disclosure under
      speculation mainly address stack and map value oob access from the
      speculative domain. However, Piotr discovered that uninitialized BPF
      stack is not protected yet, and thus old data from the kernel stack,
      potentially including addresses of kernel structures, could still be
      extracted from that 512-byte window. The BPF stack is special
      compared to map values since it's not zero initialized for every
      program invocation, whereas map values /are/ zero initialized upon
      their initial allocation and thus cannot leak any prior data in either
      domain. In the non-speculative domain, the verifier ensures that every
      stack slot read must have a prior stack slot write by the BPF program
      to avoid such data leaks.
      
      However, this is not enough: for example, when the pointer arithmetic
      operation moves the stack pointer from the last valid stack offset to
      the first valid offset, the sanitation logic allows for any intermediate
      offsets during speculative execution, which could then be used to
      extract any restricted stack content via side-channel.
      
      Given that for unprivileged stack pointer arithmetic the use of unknown
      but bounded scalars is generally forbidden, we can simply turn the
      register-based arithmetic operation into an immediate-based arithmetic
      operation without the need for masking. This also has the benefit
      of reducing the number of instructions needed for the operation. After
      the work in 7fedb63a ("bpf: Tighten speculative pointer arithmetic
      mask"), aux->alu_limit already holds the final immediate value for
      the offset register with the known scalar. Thus, a simple mov of the
      immediate into the AX register, using AX as the source for the original
      instruction, is now sufficient in this case.
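
A user-space C sketch of the difference, for illustration only. It is not the verifier code: the masking sequence is modeled on the sub/or/neg/shift/and pattern of the existing sanitation, offsets are assumed to be below 2^63, and mask_offset(), ptr_add_masked() and ptr_add_imm() are hypothetical names.

```c
#include <stdint.h>

/* Branch-free mask: yields off when off <= limit, otherwise 0.
 * Assumes off < 2^63; a model of the idea, not kernel code. */
static uint64_t mask_offset(uint64_t off, uint32_t limit)
{
    uint64_t ax = limit;

    ax -= off;            /* wraps (top bit set) when off > limit */
    ax |= off;            /* keeps the top bit if off itself has it */
    ax = 0 - ax;          /* in-range non-zero values get the top bit set */
    ax = 0 - (ax >> 63);  /* all ones if the top bit is set, else zero */
    return ax & off;
}

/* Register-based form: the offset register is masked before it is used. */
static uint64_t ptr_add_masked(uint64_t ptr, uint64_t off_reg, uint32_t limit)
{
    return ptr + mask_offset(off_reg, limit);
}

/* Immediate-based form: the offset register is not read at all, only the
 * constant known at verification time, so there is nothing to mask. */
static uint64_t ptr_add_imm(uint64_t ptr, uint64_t off_reg, uint32_t known_off)
{
    (void)off_reg;
    return ptr + known_off;
}
```

In the immediate form the result no longer depends on the run-time content of the offset register, which is the property the text above relies on.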
      
      Reported-by: Piotr Krysiuk <piotras@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Fix masking negation logic upon negative dst register · b9b34ddb
      Daniel Borkmann authored
      The negation logic for the case where the off_reg is sitting in the
      dst register is not correct given that we cannot just invert the add
      to a sub or vice versa. As a fix, perform the final bitwise and-op
      unconditionally into AX from the off_reg, then move the pointer from
      the src to dst and finally use AX as the source for the original
      pointer arithmetic operation such that the inversion yields a correct
      result. The single non-AX mov in between is possible given constant
      blinding is retaining it as it's not an immediate based operation.
      
      Fixes: 979d63d5 ("bpf: prevent out of bounds speculation on pointer arithmetic")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
  3. May 01, 2021
  4. Apr 30, 2021
    • net: dsa: ksz: ksz8863_smi_probe: set proper return value for ksz_switch_alloc() · d4eecfb2
      Oleksij Rempel authored
      ksz_switch_alloc() returns NULL only if the allocation fails. So,
      the proper return value is -ENOMEM.
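
The pattern, as a minimal user-space sketch (not the driver code; switch_alloc_sketch() and probe_sketch() are hypothetical stand-ins, and calloc() stands in for the kernel allocator):

```c
#include <errno.h>
#include <stdlib.h>

struct switch_sketch {
    int id;
};

/* Hypothetical allocator: returns NULL only when the allocation fails. */
static struct switch_sketch *switch_alloc_sketch(int simulate_failure)
{
    if (simulate_failure)
        return NULL;
    return calloc(1, sizeof(struct switch_sketch));
}

static int probe_sketch(int simulate_failure)
{
    struct switch_sketch *sw = switch_alloc_sketch(simulate_failure);

    if (!sw)
        return -ENOMEM; /* NULL can only mean out of memory here */

    free(sw);
    return 0;
}
```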
      
      Fixes: 60a36476 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ksz: ksz8795_spi_probe: fix possible NULL pointer dereference · ba46b576
      Oleksij Rempel authored
      Fix possible NULL pointer dereference in case devm_kzalloc() failed to
      allocate memory.
      
      Fixes: cc13e52c ("net: dsa: microchip: Add Microchip KSZ8863 SPI based driver support")
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ksz: ksz8863_smi_probe: fix possible NULL pointer dereference · d27f0201
      Oleksij Rempel authored
      Fix possible NULL pointer dereference in case devm_kzalloc() failed to
      allocate memory.
      
      Fixes: 60a36476 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support")
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Remove redundant assignment to err · 8343b1f8
      Yang Li authored
      
      
      Variable 'err' is set to -EIO but this value is never read as it is
      overwritten with a new value later on, hence it is a redundant
      assignment and can be removed.
      
      Clean up the following clang-analyzer warning:
      drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1195:2: warning: Value
      stored to 'err' is never read [clang-analyzer-deadcode.DeadStores]
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: Remove redundant assignment to queue · bbf6acea
      Jiapeng Chong authored
      
      
      Variable queue is set to bp->queues but this value is not used as it
      is overwritten later on, hence the redundant assignment can be removed.
      
      Cleans up the following clang-analyzer warning:
      
      drivers/net/ethernet/cadence/macb_main.c:4919:21: warning: Value stored
      to 'queue' during its initialization is never read
      [clang-analyzer-deadcode.DeadStores].
      
      drivers/net/ethernet/cadence/macb_main.c:4832:21: warning: Value stored
      to 'queue' during its initialization is never read
      [clang-analyzer-deadcode.DeadStores].
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: move Murali Karicheri to credits · 57e1d820
      Michael Walle authored
      
      
      His email bounces with permanent error "550 Invalid recipient". His last
      email was from 2020-09-09 on the LKML and he seems to have left TI.
      
      Signed-off-by: Michael Walle <michael@walle.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: remove Wingman Kwok · 1c7600b7
      Michael Walle authored
      
      
      His email bounces with permanent error "550 Invalid recipient". His last
      email on the LKML was from 2015-10-22.
      
      Signed-off-by: Michael Walle <michael@walle.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hns3-fixes' · 2ce960f8
      David S. Miller authored
      
      
      Huazhong Tan says:
      
      ====================
      net: hns3: add some fixes for -net
      
      This series adds some fixes for the HNS3 ethernet driver.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add check for HNS3_NIC_STATE_INITED in hns3_reset_notify_up_enet() · b4047aac
      Jian Shen authored
      In some cases, the device is not initialized because reset failed.
      If another task calls hns3_reset_notify_up_enet() before the reset
      is retried, it causes an error due to an uninitialized pointer access.
      So add a check for HNS3_NIC_STATE_INITED before calling
      hns3_nic_net_open() in hns3_reset_notify_up_enet().
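
A minimal user-space sketch of such a guard, for illustration only. The struct, the flag constant and the function names are hypothetical, and the -EFAULT return code is an assumption rather than something stated above:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "initialized" state bit. */
#define NIC_STATE_INITED_SKETCH (1u << 0)

struct nic_sketch {
    unsigned int state;
    bool opened;
};

/* Stand-in for the open routine that must not run on an
 * uninitialized device. */
static int nic_open_sketch(struct nic_sketch *nic)
{
    nic->opened = true;
    return 0;
}

static int reset_notify_up_sketch(struct nic_sketch *nic)
{
    /* Bail out early if a failed reset left the device uninitialized. */
    if (!(nic->state & NIC_STATE_INITED_SKETCH))
        return -EFAULT; /* assumed error code */

    return nic_open_sketch(nic);
}
```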
      
      Fixes: bb6b94a8 ("net: hns3: Add reset interface implementation in client")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: initialize the message content in hclge_get_link_mode() · 568a54bd
      Yufeng Mo authored
      The message sent to the VF should be initialized, otherwise random
      values in some of its contents may cause improper processing by the
      target. So add an initialization of the message in hclge_get_link_mode().
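
A short user-space sketch of the idea, under the assumption that the message is a small on-stack byte buffer. The buffer size, field layout and function name are hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define MSG_DATA_LEN_SKETCH 8

/* Build a message in a local buffer and copy it out. Without the
 * "= {0}" initializer, bytes 2..7 would carry stale stack content. */
static void build_link_mode_msg_sketch(uint8_t out[MSG_DATA_LEN_SKETCH],
                                       uint8_t code, uint8_t link_mode)
{
    uint8_t msg[MSG_DATA_LEN_SKETCH] = {0}; /* zero every byte first */

    msg[0] = code;
    msg[1] = link_mode;
    memcpy(out, msg, sizeof(msg));
}
```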
      
      Fixes: 9194d18b ("net: hns3: fix the problem that the supported port is empty")
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix incorrect configuration for igu_egu_hw_err · 2867298d
      Yufeng Mo authored
      According to the UM, the type and enable status of igu_egu_hw_err
      should be configured separately. Currently, the type field is
      incorrect when disabling this error. So fix it by configuring these
      two fields separately.
      
      Fixes: bf1faf94 ("net: hns3: Add enable and process hw errors from IGU, EGU and NCSI")
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Remove redundant assignment to err · 1a70f659
      Yang Li authored
      
      
      Variable 'err' is set to -ENOMEM but this value is never read as it is
      overwritten with a new value later on, hence the 'if' statements and
      assignments are redundant and can be removed.
      
      Cleans up the following clang-analyzer warning:
      
      net/ipv6/seg6.c:126:4: warning: Value stored to 'err' is never read
      [clang-analyzer-deadcode.DeadStores]
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Fix possible races between assigning rx_handler_data and setting IFF_BRIDGE_PORT bit · 59259ff7
      Zhang Zhengming authored
      
      
      There is a crash in the function br_get_link_af_size_filtered,
      as port_exists(dev) is true while the rx_handler_data of dev is NULL.
      However, the rx_handler_data of dev is correctly saved in the vmcore.
      
      The oops looks something like:
       ...
       pc : br_get_link_af_size_filtered+0x28/0x1c8 [bridge]
       ...
       Call trace:
        br_get_link_af_size_filtered+0x28/0x1c8 [bridge]
        if_nlmsg_size+0x180/0x1b0
        rtnl_calcit.isra.12+0xf8/0x148
        rtnetlink_rcv_msg+0x334/0x370
        netlink_rcv_skb+0x64/0x130
        rtnetlink_rcv+0x28/0x38
        netlink_unicast+0x1f0/0x250
        netlink_sendmsg+0x310/0x378
        sock_sendmsg+0x4c/0x70
        __sys_sendto+0x120/0x150
        __arm64_sys_sendto+0x30/0x40
        el0_svc_common+0x78/0x130
        el0_svc_handler+0x38/0x78
        el0_svc+0x8/0xc
      
      In br_add_if(), we found there is no guarantee that
      assigning rx_handler_data to dev->rx_handler_data
      happens before setting the IFF_BRIDGE_PORT bit of priv_flags.
      So there is a possible data race:
      
      CPU 0:                                                        CPU 1:
      (RCU read lock)                                               (RTNL lock)
      rtnl_calcit()                                                 br_add_slave()
        if_nlmsg_size()                                               br_add_if()
          br_get_link_af_size_filtered()                              -> netdev_rx_handler_register
                                                                          ...
                                                                          // The order is not guaranteed
            ...                                                           -> dev->priv_flags |= IFF_BRIDGE_PORT;
            // The IFF_BRIDGE_PORT bit of priv_flags has been set
            -> if (br_port_exists(dev)) {
              // The dev->rx_handler_data has NOT been assigned
              -> p = br_port_get_rcu(dev);
              ....
                                                                          -> rcu_assign_pointer(dev->rx_handler_data, rx_handler_data);
                                                                           ...
      
      Fix it in br_get_link_af_size_filtered, using br_port_get_check_rcu() and checking the return value.
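
The reader-side check can be sketched in plain user-space C as follows. This is only an illustration of "fetch the pointer through a checking helper, then test it for NULL": RCU primitives are left out, and every name ending in _sketch is hypothetical.

```c
#include <stddef.h>

#define IFF_BRIDGE_PORT_SKETCH 0x1u

struct port_sketch {
    int vlan_count;
};

struct dev_sketch {
    unsigned int priv_flags;
    struct port_sketch *rx_handler_data;
};

/* Hand out the port pointer only when the flag is set. The result can
 * still be NULL if the flag became visible before the pointer did. */
static struct port_sketch *port_get_check_sketch(const struct dev_sketch *dev)
{
    return (dev->priv_flags & IFF_BRIDGE_PORT_SKETCH) ?
           dev->rx_handler_data : NULL;
}

static size_t link_af_size_sketch(const struct dev_sketch *dev)
{
    struct port_sketch *p = port_get_check_sketch(dev);

    if (!p)
        return 0; /* not a fully set up port yet: nothing to size */

    return (size_t)p->vlan_count * sizeof(int);
}
```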
      
      Signed-off-by: Zhang Zhengming <zhangzhengming@huawei.com>
      Reviewed-by: Zhao Lei <zhaolei69@huawei.com>
      Reviewed-by: Wang Xiaogang <wangxiaogang3@huawei.com>
      Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'fragment-stack-oob-read' · 0ab1fa1c
      David S. Miller authored
      Davide Caratti says:
      
      ====================
      fix stack OOB read while fragmenting IPv4 packets
      
      - patch 1/2 fixes openvswitch IPv4 fragmentation, that does a stack OOB
      read after commit d52e5a7e ("ipv4: lock mtu in fnhe when received
      PMTU < net.ipv4.route.min_pmt")
      - patch 2/2 fixes the same issue in TC 'sch_frag' code
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: sch_frag: fix stack OOB read while fragmenting IPv4 packets · 31fe34a0
      Davide Caratti authored
      when 'act_mirred' tries to fragment IPv4 packets that had been previously
      re-assembled using 'act_ct', splats like the following can be observed on
      kernels built with KASAN:
      
       BUG: KASAN: stack-out-of-bounds in ip_do_fragment+0x1b03/0x1f60
       Read of size 1 at addr ffff888147009574 by task ping/947
      
       CPU: 0 PID: 947 Comm: ping Not tainted 5.12.0-rc6+ #418
       Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
       Call Trace:
        <IRQ>
        dump_stack+0x92/0xc1
        print_address_description.constprop.7+0x1a/0x150
        kasan_report.cold.13+0x7f/0x111
        ip_do_fragment+0x1b03/0x1f60
        sch_fragment+0x4bf/0xe40
        tcf_mirred_act+0xc3d/0x11a0 [act_mirred]
        tcf_action_exec+0x104/0x3e0
        fl_classify+0x49a/0x5e0 [cls_flower]
        tcf_classify_ingress+0x18a/0x820
        __netif_receive_skb_core+0xae7/0x3340
        __netif_receive_skb_one_core+0xb6/0x1b0
        process_backlog+0x1ef/0x6c0
        __napi_poll+0xaa/0x500
        net_rx_action+0x702/0xac0
        __do_softirq+0x1e4/0x97f
        do_softirq+0x71/0x90
        </IRQ>
        __local_bh_enable_ip+0xdb/0xf0
        ip_finish_output2+0x760/0x2120
        ip_do_fragment+0x15a5/0x1f60
        __ip_finish_output+0x4c2/0xea0
        ip_output+0x1ca/0x4d0
        ip_send_skb+0x37/0xa0
        raw_sendmsg+0x1c4b/0x2d00
        sock_sendmsg+0xdb/0x110
        __sys_sendto+0x1d7/0x2b0
        __x64_sys_sendto+0xdd/0x1b0
        do_syscall_64+0x33/0x40
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7f82e13853eb
       Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 75 42 2c 00 41 89 ca 8b 00 85 c0 75 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 41 57 4d 89 c7 41 56 41 89
       RSP: 002b:00007ffe01fad888 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
       RAX: ffffffffffffffda RBX: 00005571aac13700 RCX: 00007f82e13853eb
       RDX: 0000000000002330 RSI: 00005571aac13700 RDI: 0000000000000003
       RBP: 0000000000002330 R08: 00005571aac10500 R09: 0000000000000010
       R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe01faefb0
       R13: 00007ffe01fad890 R14: 00007ffe01fad980 R15: 00005571aac0f0a0
      
       The buggy address belongs to the page:
       page:000000001dff2e03 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147009
       flags: 0x17ffffc0001000(reserved)
       raw: 0017ffffc0001000 ffffea00051c0248 ffffea00051c0248 0000000000000000
       raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        ffff888147009400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff888147009480: f1 f1 f1 f1 04 f2 f2 f2 f2 f2 f2 f2 00 00 00 00
       >ffff888147009500: 00 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2 f2
                                                                    ^
        ffff888147009580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff888147009600: 00 00 00 00 00 00 00 00 00 00 00 00 00 f2 f2 f2
      
      for IPv4 packets, sch_fragment() uses a temporary struct dst_entry. Then,
      in the following call graph:
      
        ip_do_fragment()
          ip_skb_dst_mtu()
            ip_dst_mtu_maybe_forward()
              ip_mtu_locked()
      
      the pointer to struct dst_entry is used as pointer to struct rtable: this
      turns the access to struct members like rt_mtu_locked into an OOB read in
      the stack. Fix this by changing the temporary variable used for IPv4
      packets in sch_fragment(), similarly to what is done for IPv6 a few lines
      below.
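
The layout problem can be sketched with simplified user-space structs. These are not the kernel definitions: the field sets are reduced, rt_mtu_locked is a plain integer here, and all _sketch names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct dst_entry_sketch {
    void *dev;
    unsigned long flags;
};

struct rtable_sketch {
    struct dst_entry_sketch dst; /* first member: dst* is cast to rtable* */
    unsigned int rt_mtu_locked;  /* lives beyond the end of dst */
};

/* Callee that treats any dst pointer as part of a full rtable. */
static unsigned int mtu_locked_sketch(const struct dst_entry_sketch *dst)
{
    const struct rtable_sketch *rt = (const struct rtable_sketch *)dst;

    return rt->rt_mtu_locked; /* valid only if dst sits inside an rtable */
}

/* Fixed caller: the temporary on the stack is a whole rtable, so the
 * read above stays inside the object. */
static unsigned int fragment_fixed_sketch(void)
{
    struct rtable_sketch rt;

    memset(&rt, 0, sizeof(rt));
    return mtu_locked_sketch(&rt.dst);
}
```

With a bare struct dst_entry_sketch as the temporary, the same read would land past the end of the variable, which is the out-of-bounds access KASAN reports.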
      
      Fixes: c129412f ("net/sched: sch_frag: add generic packet fragment support.")
      Cc: <stable@vger.kernel.org> # 5.11
      Reported-by: Shuang Li <shuali@redhat.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: fix stack OOB read while fragmenting IPv4 packets · 7c0ea593
      Davide Caratti authored
      running openvswitch on kernels built with KASAN, it's possible to see the
      following splat while testing fragmentation of IPv4 packets:
      
       BUG: KASAN: stack-out-of-bounds in ip_do_fragment+0x1b03/0x1f60
       Read of size 1 at addr ffff888112fc713c by task handler2/1367
      
       CPU: 0 PID: 1367 Comm: handler2 Not tainted 5.12.0-rc6+ #418
       Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
       Call Trace:
        dump_stack+0x92/0xc1
        print_address_description.constprop.7+0x1a/0x150
        kasan_report.cold.13+0x7f/0x111
        ip_do_fragment+0x1b03/0x1f60
        ovs_fragment+0x5bf/0x840 [openvswitch]
        do_execute_actions+0x1bd5/0x2400 [openvswitch]
        ovs_execute_actions+0xc8/0x3d0 [openvswitch]
        ovs_packet_cmd_execute+0xa39/0x1150 [openvswitch]
        genl_family_rcv_msg_doit.isra.15+0x227/0x2d0
        genl_rcv_msg+0x287/0x490
        netlink_rcv_skb+0x120/0x380
        genl_rcv+0x24/0x40
        netlink_unicast+0x439/0x630
        netlink_sendmsg+0x719/0xbf0
        sock_sendmsg+0xe2/0x110
        ____sys_sendmsg+0x5ba/0x890
        ___sys_sendmsg+0xe9/0x160
        __sys_sendmsg+0xd3/0x170
        do_syscall_64+0x33/0x40
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7f957079db07
       Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 eb ec ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 24 ed ff ff 48
       RSP: 002b:00007f956ce35a50 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 0000000000000019 RCX: 00007f957079db07
       RDX: 0000000000000000 RSI: 00007f956ce35ae0 RDI: 0000000000000019
       RBP: 00007f956ce35ae0 R08: 0000000000000000 R09: 00007f9558006730
       R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
       R13: 00007f956ce37308 R14: 00007f956ce35f80 R15: 00007f956ce35ae0
      
       The buggy address belongs to the page:
       page:00000000af2a1d93 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x112fc7
       flags: 0x17ffffc0000000()
       raw: 0017ffffc0000000 0000000000000000 dead000000000122 0000000000000000
       raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
       page dumped because: kasan: bad access detected
      
       addr ffff888112fc713c is located in stack of task handler2/1367 at offset 180 in frame:
        ovs_fragment+0x0/0x840 [openvswitch]
      
       this frame has 2 objects:
        [32, 144) 'ovs_dst'
        [192, 424) 'ovs_rt'
      
       Memory state around the buggy address:
        ffff888112fc7000: f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff888112fc7080: 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00
       >ffff888112fc7100: 00 00 00 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00
                                               ^
        ffff888112fc7180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff888112fc7200: 00 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00
      
      for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. Then,
      in the following call graph:
      
        ip_do_fragment()
          ip_skb_dst_mtu()
            ip_dst_mtu_maybe_forward()
              ip_mtu_locked()
      
      the pointer to struct dst_entry is used as pointer to struct rtable: this
      turns the access to struct members like rt_mtu_locked into an OOB read in
      the stack. Fix this by changing the temporary variable used for IPv4
      packets in ovs_fragment(), similarly to what is done for IPv6 a few lines
      below.
      
      Fixes: d52e5a7e ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmt")
      Cc: <stable@vger.kernel.org>
      Acked-by: Eelco Chaudron <echaudro@redhat.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • seg6: add counters support for SRv6 Behaviors · 94604548
      Andrea Mayer authored
      This patch provides counters for SRv6 Behaviors as defined in [1],
      section 6. For each SRv6 Behavior instance, counters defined in [1] are:
      
       - the total number of packets that have been correctly processed;
       - the total amount of traffic in bytes of all packets that have been
         correctly processed;
      
      In addition, this patch introduces a new counter that counts the number of
      packets that have NOT been properly processed (i.e. errors) by an SRv6
      Behavior instance.
      
      Counters are not only interesting for network monitoring purposes (e.g.
      counting the number of packets processed by a given behavior) but they also
      provide a simple tool for checking whether a behavior instance is working
      as we expect or not.
      Counters can be useful for troubleshooting misconfigured SRv6 networks.
      Indeed, an SRv6 Behavior can silently drop packets for very different
      reasons (e.g. wrong SID configuration, interfaces set with SID addresses,
      etc.) without any notification/message to the user.
      
      Due to the nature of SRv6 networks, diagnostic tools such as ping and
      traceroute may be ineffective: paths used for reaching a given router can
      be totally different from the ones followed by probe packets. In addition,
      paths are often asymmetrical and this makes it even more difficult to keep
      up with the journey of the packets and to understand which behaviors are
      actually processing our traffic.
      
      When counters are enabled on an SRv6 Behavior instance, it is possible to
      verify if packets are actually processed by such behavior and what is the
      outcome of the processing. Therefore, the counters for SRv6 Behaviors offer
      a non-invasive observability point which can be leveraged for both traffic
      monitoring and troubleshooting purposes.
      
      [1] https://www.rfc-editor.org/rfc/rfc8986.html#name-counters
      
      Troubleshooting using SRv6 Behavior counters
      --------------------------------------------
      
      Let's make a brief example to see how helpful counters can be for SRv6
      networks. Let's consider a node where an SRv6 End Behavior receives an SRv6
      packet whose Segment Left (SL) is equal to 0. In this case, the End
      Behavior (which accepts only packets with SL >= 1) discards the packet and
      increases the error counter.
      This information can be leveraged by the network operator for
      troubleshooting. Indeed, the error counter is telling the user that the
      packet:
      
        (i) arrived at the node;
       (ii) the packet has been taken into account by the SRv6 End behavior;
      (iii) but an error has occurred during the processing.
      
      The error (iii) could have different causes, such as wrong route
      settings on the node or an invalid SID List carried by the SRv6
      packet. In any case, the error counter rules out the possibility that
      the packet did not arrive at the node or was not processed by the
      behavior at all.
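
The counting logic of this example can be sketched in user-space C as below. This is an illustration only, not the seg6local implementation: the struct, the function name and the return convention are hypothetical, and the real End Behavior does far more than the Segments Left check.

```c
#include <stdint.h>

/* The three per-instance counters described above. */
struct behavior_counters_sketch {
    uint64_t packets; /* correctly processed packets */
    uint64_t bytes;   /* bytes of correctly processed packets */
    uint64_t errors;  /* packets that were NOT properly processed */
};

/* Toy End Behavior: accepts only packets with Segments Left >= 1. */
static int end_behavior_sketch(struct behavior_counters_sketch *c,
                               uint8_t segments_left, uint32_t len)
{
    if (segments_left == 0) {
        c->errors++; /* packet arrived and was considered, but failed */
        return -1;
    }

    c->packets++;
    c->bytes += len;
    return 0;
}
```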
      
      Turning on/off counters for SRv6 Behaviors
      ------------------------------------------
      
      Each SRv6 Behavior instance can be configured, at the time of its creation,
      to make use of counters.
      This is done through iproute2 which allows the user to create an SRv6
      Behavior instance specifying the optional "count" attribute as shown in the
      following example:
      
       $ ip -6 route add 2001:db8::1 encap seg6local action End count dev eth0
      
      Per-behavior counters can be shown by adding "-s" to the iproute2 command
      line, e.g.:
      
       $ ip -s -6 route show 2001:db8::1
       2001:db8::1 encap seg6local action End packets 0 bytes 0 errors 0 dev eth0
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Impact of counters for SRv6 Behaviors on performance
      ====================================================
      
      To determine the performance impact due to the introduction of counters in
      the SRv6 Behavior subsystem, we have carried out extensive tests.
      
      We chose to test the throughput achieved by the SRv6 End.DX2 Behavior
      because, among all the other behaviors implemented so far, it reaches the
      highest throughput which is around 1.5 Mpps (per core at 2.4 GHz on a
      Xeon(R) CPU E5-2630 v3) on kernel 5.12-rc2 using packets of size ~ 100
      bytes.
      
      Three different tests were conducted in order to evaluate the overall
      throughput of the SRv6 End.DX2 Behavior in the following scenarios:
      
       1) vanilla kernel (without the SRv6 Behavior counters patch) and a single
          instance of an SRv6 End.DX2 Behavior;
       2) patched kernel with SRv6 Behavior counters and a single instance of
          an SRv6 End.DX2 Behavior with counters turned off;
       3) patched kernel with SRv6 Behavior counters and a single instance of
          SRv6 End.DX2 Behavior with counters turned on.
      
      All tests were performed on a testbed deployed on the CloudLab facilities
      [2], a flexible infrastructure dedicated to scientific research on the
      future of Cloud Computing.
      
      Results of the tests are shown in the following table:

      Scenario (1): average 1504764.81 pps (~1504.76 kpps); std. dev 3956.82 pps
      Scenario (2): average 1501469.78 pps (~1501.47 kpps); std. dev 2979.85 pps
      Scenario (3): average 1501315.13 pps (~1501.32 kpps); std. dev 2956.00 pps
      
      As can be observed, throughputs achieved in scenarios (2),(3) did not
      suffer any observable degradation compared to scenario (1).
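
For reference, the relative difference between two averages can be computed as in the small C helper below (the function name is hypothetical). Applied to the averages above, scenarios (2) and (3) are both roughly 0.2% below scenario (1), and the absolute differences are smaller than the standard deviation reported for scenario (1).

```c
/* Relative throughput drop of a measurement against a baseline,
 * expressed in percent. */
static double relative_drop_percent(double baseline_pps, double measured_pps)
{
    return (baseline_pps - measured_pps) / baseline_pps * 100.0;
}
```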
      
      Thanks to Jakub Kicinski and David Ahern for their valuable suggestions
      and comments provided during the discussion of the proposed RFCs.
      
      [2] https://www.cloudlab.us
      
      
      
      Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 9d31d233
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "Core:
      
         - bpf:
              - allow bpf programs calling kernel functions (initially to
                reuse TCP congestion control implementations)
              - enable task local storage for tracing programs - remove the
                need to store per-task state in hash maps, and allow tracing
                programs access to task local storage previously added for
                BPF_LSM
              - add bpf_for_each_map_elem() helper, allowing programs to walk
                all map elements in a more robust and easier to verify fashion
              - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
                redirection
              - lpm: add support for batched ops in LPM trie
              - add BTF_KIND_FLOAT support - mostly to allow use of BTF on
                s390 which has floats in its header files
              - improve BPF syscall documentation and extend the use of kdoc
                parsing scripts we already employ for bpf-helpers
              - libbpf, bpftool: support static linking of BPF ELF files
              - improve support for encapsulation of L2 packets
      
         - xdp: restructure redirect actions to avoid a runtime lookup,
           improving performance by 4-8% in microbenchmarks
      
         - xsk: build skb by page (aka generic zerocopy xmit) - improve
           performance of software AF_XDP path by 33% for devices which don't
           need headers in the linear skb part (e.g. virtio)
      
         - nexthop: resilient next-hop groups - improve path stability on
           next-hops group changes (incl. offload for mlxsw)
      
         - ipv6: segment routing: add support for IPv4 decapsulation
      
         - icmp: add support for RFC 8335 extended PROBE messages
      
         - inet: use bigger hash table for IP ID generation
      
         - tcp: deal better with delayed TX completions - make sure we don't
           give up on fast TCP retransmissions only because driver is slow in
           reporting that it completed transmitting the original
      
         - tcp: reorder tcp_congestion_ops for better cache locality
      
         - mptcp:
              - add sockopt support for common TCP options
              - add support for common TCP msg flags
              - include multiple address ids in RM_ADDR
              - add reset option support for resetting one subflow
      
         - udp: GRO L4 improvements - improve 'forward' / 'frag_list'
           co-existence with UDP tunnel GRO, allowing the first to take place
           correctly even for encapsulated UDP traffic
      
         - micro-optimize dev_gro_receive() and flow dissection, avoid
           retpoline overhead on VLAN and TEB GRO
      
         - use less memory for sysctls, add a new sysctl type, to allow using
           u8 instead of "int" and "long" and shrink networking sysctls
      
         - veth: allow GRO without XDP - this allows aggregating UDP packets
           before handing them off to routing, bridge, OvS, etc.
      
         - allow specifying ifindex when device is moved to another namespace
      
         - netfilter:
              - nft_socket: add support for cgroupsv2
              - nftables: add catch-all set element - special element used to
                define a default action in case normal lookup missed
              - use net_generic infra in many modules to avoid allocating
                per-ns memory unnecessarily
      
         - xps: improve the xps handling to avoid potential out-of-bound
           accesses and use-after-free when XPS changes race with other
           re-configuration under traffic
      
         - add a config knob to turn off per-cpu netdev refcnt to catch
           underflows in testing
      
        Device APIs:
      
         - add WWAN subsystem to organize the WWAN interfaces better and
           hopefully start driving towards more unified and vendor-
           independent APIs
      
         - ethtool:
              - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
                support)
              - allow network drivers to dump arbitrary SFP EEPROM data,
                current offset+length API was a poor fit for modern SFP which
                define EEPROM in terms of pages (incl. mlx5 support)
      
         - act_police, flow_offload: add support for packet-per-second
           policing (incl. offload for nfp)
      
         - psample: add additional metadata attributes like transit delay for
           packets sampled from switch HW (and corresponding egress and
           policy-based sampling in the mlxsw driver)
      
         - dsa: improve support for sandwiched LAGs with bridge and DSA
      
         - netfilter:
              - flowtable: use direct xmit in topologies with IP forwarding,
                bridging, vlans etc.
              - nftables: counter hardware offload support
      
         - Bluetooth:
              - improvements for firmware download w/ Intel devices
              - add support for reading AOSP vendor capabilities
              - add support for virtio transport driver
      
         - mac80211:
              - allow concurrent monitor iface and ethernet rx decap
              - set priority and queue mapping for injected frames
      
         - phy: add support for Clause-45 PHY Loopback
      
         - pci/iov: add sysfs MSI-X vector assignment interface to distribute
           MSI-X resources to VFs (incl. mlx5 support)
      
        New hardware/drivers:
      
         - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
           Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
           interfaces.
      
         - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
           BCM63xx switches
      
         - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
      
         - ath11k: support for QCN9074, an 802.11ax device
      
         - Bluetooth: Broadcom BCM4330 and BCM4334
      
         - phy: Marvell 88X2222 transceiver support
      
         - mdio: add BCM6368 MDIO mux bus controller
      
         - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
      
         - mana: driver for Microsoft Azure Network Adapter (MANA)
      
         - Actions Semi Owl Ethernet MAC
      
         - can: driver for ETAS ES58X CAN/USB interfaces
      
        Pure driver changes:
      
         - add XDP support to: enetc, igc, stmmac
      
         - add AF_XDP support to: stmmac
      
         - virtio:
              - page_to_skb() use build_skb when there's sufficient tailroom
                (21% improvement for 1000B UDP frames)
              - support XDP even without dedicated Tx queues - share the Tx
                queues with the stack when necessary
      
         - mlx5:
              - flow rules: add support for mirroring with conntrack, matching
                on ICMP, GTP, flex filters and more
              - support packet sampling with flow offloads
              - persist uplink representor netdev across eswitch mode changes
              - allow coexistence of CQE compression and HW time-stamping
              - add ethtool extended link error state reporting
      
         - ice, iavf: support flow filters, UDP Segmentation Offload
      
         - dpaa2-switch:
              - move the driver out of staging
              - add spanning tree (STP) support
              - add rx copybreak support
              - add tc flower hardware offload on ingress traffic
      
         - ionic:
              - implement Rx page reuse
              - support HW PTP time-stamping
      
         - octeon: support TC hardware offloads - flower matching on ingress
           and egress rate limiting.
      
         - stmmac:
              - add RX frame steering based on VLAN priority in tc flower
              - support frame preemption (FPE)
              - intel: add cross time-stamping freq difference adjustment
      
         - ocelot:
              - support forwarding of MRP frames in HW
              - support multiple bridges
              - support PTP Sync one-step timestamping
      
         - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
           learning, flooding etc.
      
         - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
           SC7280 SoCs)
      
         - mt7601u: enable TDLS support
      
         - mt76:
              - add support for 802.3 rx frames (mt7915/mt7615)
              - mt7915 flash pre-calibration support
              - mt7921/mt7663 runtime power management fixes"
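      The nftables catch-all set element listed above is described as a
      default action for when the normal lookup misses. A minimal user-space
      sketch of that idea follows. It is not the nf_tables data model: the
      types `sketch_set` and `set_elem`, the `verdict` enum and the function
      `sketch_set_lookup()` are hypothetical names chosen for illustration,
      and a linear scan stands in for whatever set backend is in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the nf_tables data model. */
enum verdict { VERDICT_NONE = 0, VERDICT_ACCEPT, VERDICT_DROP };

struct set_elem {
    uint32_t key;
    enum verdict verdict;
};

struct sketch_set {
    const struct set_elem *elems;
    size_t nelems;
    int has_catchall;        /* non-zero if a catch-all element is present */
    enum verdict catchall;   /* verdict used when no key matches */
};

/* Normal lookup first; fall back to the catch-all element only on a miss. */
static enum verdict sketch_set_lookup(const struct sketch_set *set, uint32_t key)
{
    for (size_t i = 0; i < set->nelems; i++) {
        if (set->elems[i].key == key)
            return set->elems[i].verdict;
    }
    return set->has_catchall ? set->catchall : VERDICT_NONE;
}
```

      In this sketch a set without a catch-all element reports
      VERDICT_NONE on a miss, so the caller can tell "no match" apart from
      a default action.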
      
      * tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
        net: selftest: fix build issue if INET is disabled
        net: netrom: nr_in: Remove redundant assignment to ns
        net: tun: Remove redundant assignment to ret
        net: phy: marvell: add downshift support for M88E1240
        net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
        net/sched: act_ct: Remove redundant ct get and check
        icmp: standardize naming of RFC 8335 PROBE constants
        bpf, selftests: Update array map tests for per-cpu batched ops
        bpf: Add batched ops support for percpu array
        bpf: Implement formatted output helpers with bstr_printf
        seq_file: Add a seq_bprintf function
        sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
        net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
        net: fix a concurrency bug in l2tp_tunnel_register()
        net/smc: Remove redundant assignment to rc
        mpls: Remove redundant assignment to err
        llc2: Remove redundant assignment to rc
        net/tls: Remove redundant initialization of record
        rds: Remove redundant assignment to nr_sig
        dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
        ...
    • Merge tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 635de956
      Linus Torvalds authored
      Pull x86 tlb updates from Ingo Molnar:
       "The x86 MM changes in this cycle were:
      
         - Implement concurrent TLB flushes, which overlaps the local TLB
           flush with the remote TLB flush.
      
           In testing this improved sysbench performance measurably by a
           couple of percentage points, especially if TLB-heavy security
           mitigations are active.
      
         - Further micro-optimizations to improve the performance of TLB
           flushes"
      
      * tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        smp: Micro-optimize smp_call_function_many_cond()
        smp: Inline on_each_cpu_cond() and on_each_cpu()
        x86/mm/tlb: Remove unnecessary uses of the inline keyword
        cpumask: Mark functions as pure
        x86/mm/tlb: Do not make is_lazy dirty for no reason
        x86/mm/tlb: Privatize cpu_tlbstate
        x86/mm/tlb: Flush remote and local TLBs concurrently
        x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()
        x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()
        smp: Run functions concurrently in smp_call_function_many_cond()
    • Merge tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze · d0cc7eca
      Linus Torvalds authored
      Pull Microblaze updates from Michal Simek:
       "No new features, just cleaning up some code and moving to the
        generic syscall solution used by other architectures:
      
         - Switch to generic syscall scripts
      
         - Some small fixes"
      
      * tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: add 'fallthrough' to memcpy/memset/memmove
        microblaze: Fix a typo
        microblaze: tag highmem_setup() with __meminit
        microblaze: syscalls: switch to generic syscallhdr.sh
        microblaze: syscalls: switch to generic syscalltbl.sh
    • Merge tag 'mips_5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 77d51337
      Linus Torvalds authored
      Pull MIPS updates from Thomas Bogendoerfer:
      
       - removed get_fs/set_fs
      
       - removed broken/unmaintained MIPS KVM trap and emulate support
      
       - added support for Loongson-2K1000
      
       - fixes and cleanups
      
      * tag 'mips_5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits)
        MIPS: BCM63XX: Use BUG_ON instead of condition followed by BUG.
        MIPS: select ARCH_KEEP_MEMBLOCK unconditionally
        mips: Do not include hi and lo in clobber list for R6
        MIPS:DTS:Correct the license for Loongson-2K
        MIPS:DTS:Fix label name and interrupt number of ohci for Loongson-2K
        MIPS: Avoid handcoded DIVU in `__div64_32' altogether
        lib/math/test_div64: Correct the spelling of "dividend"
        lib/math/test_div64: Fix error message formatting
        mips/bootinfo:correct some comments of fw_arg
        MIPS: Avoid DIVU in `__div64_32' is result would be zero
        MIPS: Reinstate platform `__div64_32' handler
        div64: Correct inline documentation for `do_div'
        lib/math: Add a `do_div' test module
        MIPS: Makefile: Replace -pg with CC_FLAGS_FTRACE
        MIPS: pci-legacy: revert "use generic pci_enable_resources"
        MIPS: Loongson64: Add kexec/kdump support
        MIPS: pci-legacy: use generic pci_enable_resources
        MIPS: pci-legacy: remove busn_resource field
        MIPS: pci-legacy: remove redundant info messages
        MIPS: pci-legacy: stop using of_pci_range_to_resource
        ...
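      Several of the entries above touch `do_div' and its documentation.
      As a reminder of the calling convention, here is a user-space
      stand-in written as a plain function. The name `sketch_do_div()` is
      hypothetical, and the kernel's `do_div` is a macro that modifies its
      first argument directly rather than through a pointer. The 64-bit
      dividend is replaced by the quotient and the 32-bit remainder is
      returned.

```c
#include <stdint.h>

/*
 * Stand-in for the semantics of do_div(n, base): divide the 64-bit value
 * in place by a 32-bit base and return the remainder.
 */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
    uint32_t rem = (uint32_t)(*n % base);

    *n /= base;
    return rem;
}
```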
    • Merge tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 3644286f
      Linus Torvalds authored
      Pull fsnotify updates from Jan Kara:
      
       - support for limited fanotify functionality for unprivileged users
      
       - faster merging of fanotify events
      
       - a few smaller fsnotify improvements
      
      * tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        shmem: allow reporting fanotify events with file handles on tmpfs
        fs: introduce a wrapper uuid_to_fsid()
        fanotify_user: use upper_32_bits() to verify mask
        fanotify: support limited functionality for unprivileged users
        fanotify: configurable limits via sysfs
        fanotify: limit number of event merge attempts
        fsnotify: use hash table for faster events merge
        fanotify: mix event info and pid into merge key hash
        fanotify: reduce event objectid to 29-bit hash
        fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
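      The entries above on hash-table merging and on limiting merge
      attempts can be pictured with a small user-space sketch. It assumes
      that a new event is only compared against queued events in the same
      hash bucket, and that the comparison stops after a fixed number of
      attempts. The structure names, the bucket count and the attempt cap
      are hypothetical and are not taken from the kernel sources.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKETS      8    /* hypothetical table size */
#define SKETCH_MAX_ATTEMPTS 4    /* hypothetical cap on comparisons per add */
#define SKETCH_MAX_EVENTS   64   /* hypothetical queue capacity */

struct sketch_event {
    uint32_t key;    /* merge key, e.g. a hash of object id, info and pid */
    uint32_t mask;   /* event bits accumulated so far */
    int next;        /* index of the next event in the same bucket, or -1 */
};

struct sketch_queue {
    struct sketch_event events[SKETCH_MAX_EVENTS];
    int nevents;
    int bucket_head[SKETCH_BUCKETS];
};

static void sketch_queue_init(struct sketch_queue *q)
{
    q->nevents = 0;
    for (int i = 0; i < SKETCH_BUCKETS; i++)
        q->bucket_head[i] = -1;
}

/* Returns 1 if merged into a queued event, 0 if queued as new, -1 if full. */
static int sketch_queue_add(struct sketch_queue *q, uint32_t key, uint32_t mask)
{
    int b = (int)(key % SKETCH_BUCKETS);
    int attempts = 0;

    /* Walk only this bucket, newest event first, and give up after the cap. */
    for (int i = q->bucket_head[b]; i >= 0; i = q->events[i].next) {
        if (++attempts > SKETCH_MAX_ATTEMPTS)
            break;
        if (q->events[i].key == key) {
            q->events[i].mask |= mask;
            return 1;
        }
    }
    if (q->nevents == SKETCH_MAX_EVENTS)
        return -1;
    q->events[q->nevents] = (struct sketch_event){ key, mask, q->bucket_head[b] };
    q->bucket_head[b] = q->nevents++;
    return 0;
}
```

      With the cap in place, a matching event that sits deeper in its
      bucket than SKETCH_MAX_ATTEMPTS is simply not found, and the new
      event is queued separately. The trade is bounded work per event
      against an occasional missed merge.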
    • Merge tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 767fcbc8
      Linus Torvalds authored
      Pull quota, ext2, reiserfs updates from Jan Kara:
      
       - support for path (instead of device) based quotactl syscall
         (quotactl_path(2))
      
       - ext2 conversion to kmap_local()
      
       - other minor cleanups & fixes
      
      * tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fs/reiserfs/journal.c: delete useless variables
        fs/ext2: Replace kmap() with kmap_local_page()
        ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry()
        fs/ext2/: fix misspellings using codespell tool
        quota: report warning limits for realtime space quotas
        quota: wire up quotactl_path
        quota: Add mountpath based quota support
    • Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · d2b6f8a1
      Linus Torvalds authored
      Pull xfs updates from Darrick Wong:
       "The notable user-visible addition this cycle is the ability to remove
        space from the last AG in a filesystem. This is the first of many
        changes needed for full-fledged support for shrinking a filesystem.
        Still needed are (a) the ability to reorganize files and metadata away
        from the end of the fs; (b) the ability to remove entire allocation
        groups; (c) shrink support for realtime volumes; and (d) thorough
        testing of (a-c).
      
        There are a number of performance improvements in this code drop: Dave
        streamlined various parts of the buffer logging code and reduced the
        cost of various debugging checks, and added the ability to pre-create
        the xattr structures while creating files. Brian eliminated
        transaction reservations that were being held across writeback (thus
        reducing livelock potential).
      
        Other random pieces: Pavel fixed the repetitive warnings about
        deprecated mount options, I fixed online fsck to behave itself when a
        readonly remount comes in during scrub, and refactored various other
        parts of that code. Christoph contributed a lot of refactoring this
        cycle. The xfs_icdinode structure has been absorbed into the (incore)
        xfs_inode structure, and the format and flags handling around
        xfs_inode_fork structures has been simplified. Chandan provided a
        number of fixes for extent count overflow related problems that have
        been shaken out by debugging knobs added during 5.12.
      
        Summary:
      
         - Various minor fixes in online scrub.
      
         - Prevent metadata files from being automatically inactivated.
      
         - Validate btree heights by the computed per-btree limits.
      
         - Don't warn about remounting with deprecated mount options.
      
         - Initialize attr forks at create time if we suspect we're going to
           need to store them.
      
         - Reduce memory reallocation workouts in the logging code.
      
         - Fix some theoretical math calculation errors in logged buffers that
           span multiple discontig memory ranges but contiguous ondisk
           regions.
      
         - Speedups in dirty buffer bitmap handling.
      
         - Make type verifier functions more inline-happy to reduce overhead.
      
         - Reduce debug overhead in directory checking code.
      
         - Many many typo fixes.
      
         - Begin to handle the permanent loss of the very end of a filesystem.
      
         - Fold struct xfs_icdinode into xfs_inode.
      
         - Deprecate the long defunct BMV_IF_NO_DMAPI_READ from the bmapx
           ioctl.
      
         - Remove a broken directory block format check from online scrub.
      
         - Fix a bug where we could produce an unnecessarily tall data fork
           btree when creating an attr fork.
      
         - Fix scrub and readonly remounts racing.
      
         - Fix a writeback ioend log deadlock problem by dropping the behavior
           where we could preallocate a setfilesize transaction.
      
         - Fix some bugs in the new extent count checking code.
      
         - Fix some bugs in the attr fork preallocation code.
      
         - Refactor if_flags out of the incore inode fork data structure"
      
      * tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (77 commits)
        xfs: remove xfs_quiesce_attr declaration
        xfs: remove XFS_IFEXTENTS
        xfs: remove XFS_IFINLINE
        xfs: remove XFS_IFBROOT
        xfs: only look at the fork format in xfs_idestroy_fork
        xfs: simplify xfs_attr_remove_args
        xfs: rename and simplify xfs_bmap_one_block
        xfs: move the XFS_IFEXTENTS check into xfs_iread_extents
        xfs: drop unnecessary setfilesize helper
        xfs: drop unused ioend private merge and setfilesize code
        xfs: open code ioend needs workqueue helper
        xfs: drop submit side trans alloc for append ioends
        xfs: fix return of uninitialized value in variable error
        xfs: get rid of the ip parameter to xchk_setup_*
        xfs: fix scrub and remount-ro protection when running scrub
        xfs: move the check for post-EOF mappings into xfs_can_free_eofblocks
        xfs: move the xfs_can_free_eofblocks call under the IOLOCK
        xfs: precalculate default inode attribute offset
        xfs: default attr fork size does not handle device inodes
        xfs: inode fork allocation depends on XFS_IFEXTENT flag
        ...
    • Merge tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · f2c80837
      Linus Torvalds authored
      Pull gfs2 updates from Andreas Gruenbacher:
      
       - Fix some compiler and kernel-doc warnings
      
       - Various minor cleanups and optimizations
      
       - Add a new sysfs gfs2 status file with some filesystem wide
         information
      
      * tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Fix fall-through warnings for Clang
        gfs2: Fix a number of kernel-doc warnings
        gfs2: Make gfs2_setattr_simple static
        gfs2: Add new sysfs file for gfs2 status
        gfs2: Silence possible null pointer dereference warning
        gfs2: Turn gfs2_meta_indirect_buffer into gfs2_meta_buffer
        gfs2: Replace gfs2_lblk_to_dblk with gfs2_get_extent
        gfs2: Turn gfs2_extent_map into gfs2_{get,alloc}_extent
        gfs2: Add new gfs2_iomap_get helper
        gfs2: Remove unused variable sb_format
        gfs2: Fix dir.c function parameter descriptions
        gfs2: Eliminate gh parameter from go_xmote_bh func
        gfs2: don't create empty buffers for NO_CREATE
    • Merge tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat · 8ae8932c
      Linus Torvalds authored
      Pull exfat updates from Namjae Jeon:
      
       - Improve write performance with dirsync mount option
      
       - Improve lookup performance
      
       - Add support for FITRIM ioctl
      
       - Fix a bug with discard option
      
      * tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
        exfat: speed up iterate/lookup by fixing start point of traversing cluster chain
        exfat: improve write performance when dirsync enabled
        exfat: add support ioctl and FITRIM function
        exfat: introduce bitmap_lock for cluster bitmap access
        exfat: fix erroneous discard when clear cluster bit
  5. Apr 29, 2021
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · d72cd4ad
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "This consists of the usual driver updates (ufs, target, tcmu,
        smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).
      
        The major core change is using a sbitmap instead of an atomic for
        queue tracking"
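
      The note about using a bitmap instead of an atomic for queue
      tracking can be pictured roughly as follows. An atomic counter
      records only how many commands are outstanding, while a bitmap also
      records which slots are taken. This is a minimal single-threaded
      user-space sketch, not the kernel's sbitmap API: `sketch_bitmap`,
      `sketch_get()`, `sketch_put()` and `SKETCH_DEPTH` are hypothetical
      names, and a real implementation needs atomic bit operations.

```c
#include <stdint.h>

#define SKETCH_DEPTH 8   /* hypothetical queue depth, at most 32 */

struct sketch_bitmap {
    uint32_t word;   /* bit i set means slot i is in use */
};

/* Claim the lowest free slot; returns its index, or -1 if the queue is full. */
static int sketch_get(struct sketch_bitmap *sb)
{
    for (int i = 0; i < SKETCH_DEPTH; i++) {
        if (!(sb->word & (UINT32_C(1) << i))) {
            sb->word |= UINT32_C(1) << i;
            return i;
        }
    }
    return -1;
}

/* Release a slot previously returned by sketch_get(). */
static void sketch_put(struct sketch_bitmap *sb, int slot)
{
    sb->word &= ~(UINT32_C(1) << slot);
}
```

      A caller would take a slot before dispatching a command and put it
      back on completion. A return value of -1 plays the role that
      "counter has reached the queue depth" plays with a plain counter.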
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits)
        scsi: target: tcm_fc: Fix a kernel-doc header
        scsi: target: Shorten ALUA error messages
        scsi: target: Fix two format specifiers
        scsi: target: Compare explicitly with SAM_STAT_GOOD
        scsi: sd: Introduce a new local variable in sd_check_events()
        scsi: dc395x: Open-code status_byte(u8) calls
        scsi: 53c700: Open-code status_byte(u8) calls
        scsi: smartpqi: Remove unused functions
        scsi: qla4xxx: Remove an unused function
        scsi: myrs: Remove unused functions
        scsi: myrb: Remove unused functions
        scsi: mpt3sas: Fix two kernel-doc headers
        scsi: fcoe: Suppress a compiler warning
        scsi: libfc: Fix a format specifier
        scsi: aacraid: Remove an unused function
        scsi: core: Introduce enum scsi_disposition
        scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
        scsi: core: Rename scsi_softirq_done() into scsi_complete()
        scsi: core: Remove an incorrect comment
        scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
        ...
    • Merge tag 'vfio-v5.13-rc1' of git://github.com/awilliam/linux-vfio · 238da4d0
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Embed struct vfio_device into vfio driver structures (Jason
         Gunthorpe)
      
       - Make vfio_mdev type safe (Jason Gunthorpe)
      
       - Remove vfio-pci NVLink2 extensions for POWER9 (Christoph Hellwig)
      
       - Update vfio-pci IGD extensions for OpRegion 2.1+ (Fred Gao)
      
       - Various spelling/blank line fixes (Zhen Lei, Zhou Wang, Bhaskar
         Chowdhury)
      
       - Simplify unpin_pages error handling (Shenming Lu)
      
       - Fix i915 mdev Kconfig dependency (Arnd Bergmann)
      
       - Remove unused structure member (Keqian Zhu)
      
      * tag 'vfio-v5.13-rc1' of git://github.com/awilliam/linux-vfio: (43 commits)
        vfio/gvt: fix DRM_I915_GVT dependency on VFIO_MDEV
        vfio/iommu_type1: Remove unused pinned_page_dirty_scope in vfio_iommu
        vfio/mdev: Correct the function signatures for the mdev_type_attributes
        vfio/mdev: Remove kobj from mdev_parent_ops->create()
        vfio/gvt: Use mdev_get_type_group_id()
        vfio/gvt: Make DRM_I915_GVT depend on VFIO_MDEV
        vfio/mbochs: Use mdev_get_type_group_id()
        vfio/mdpy: Use mdev_get_type_group_id()
        vfio/mtty: Use mdev_get_type_group_id()
        vfio/mdev: Add mdev/mtype_get_type_group_id()
        vfio/mdev: Remove duplicate storage of parent in mdev_device
        vfio/mdev: Add missing error handling to dev_set_name()
        vfio/mdev: Reorganize mdev_device_create()
        vfio/mdev: Add missing reference counting to mdev_type
        vfio/mdev: Expose mdev_get/put_parent to mdev_private.h
        vfio/mdev: Use struct mdev_type in struct mdev_device
        vfio/mdev: Simplify driver registration
        vfio/mdev: Add missing typesafety around mdev_device
        vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
        vfio/mdev: Fix missing static's on MDEV_TYPE_ATTR's
        ...
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 35655ceb
      Linus Torvalds authored
      Pull clk updates from Stephen Boyd:
       "Here's a collection of largely clk driver updates. The usual suspects
        are here: i.MX, Qualcomm, Renesas, Allwinner, Samsung, and Rockchip,
        but it feels pretty light on commits.
      
        There's only one real commit to the framework core and that's to
        consolidate code. Otherwise the diffstat is dominated by many Qualcomm
        clk driver patches that modernize the driver for the proper way of
        specifying clk parents. That's shifting data around, which could subtly
        break things so I'll be on the lookout for fixes.
      
        New Drivers:
         - Proper clk driver for Mediatek MT7621 SoCs
         - Support for the clock controller on the new Rockchip rk3568
      
        Updates:
         - Simplify Zynq Kconfig dependencies
         - Use clk_hw pointers in socfpga driver
         - Cleanup parent data in qcom clk drivers
         - Some cleanups for rk3399 modularization
         - Fix reparenting of i.MX UART clocks by initializing only the ones
           associated to stdout
         - Correct the PCIE clocks for i.MX8MP and i.MX8MQ
         - Make i.MX LPCG and SCU clocks return on registering failure
         - Kernel doc fixes
         - Add DAB hardware accelerator clocks on Renesas R-Car E3 and M3-N
         - Add timer (TMU) clocks on Renesas R-Car H3 ES1.0
         - Add Timer (TMU & CMT) and thermal sensor (TSC) clocks on
           Renesas R-Car V3U
         - Sigma-delta modulation on Allwinner V3s audio PLL"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (82 commits)
        MAINTAINERS: add MT7621 CLOCK maintainer
        staging: mt7621-dts: use valid vendor 'mediatek' instead of invalid 'mtk'
        staging: mt7621-dts: make use of new 'mt7621-clk'
        clk: ralink: add clock driver for mt7621 SoC
        clk: uniphier: Fix potential infinite loop
        clk: qcom: rpmh: add support for SDX55 rpmh IPA clock
        clk: qcom: gcc-sdm845: get rid of the test clock
        clk: qcom: convert SDM845 Global Clock Controller to parent_data
        dt-bindings: clock: separate SDM845 GCC clock bindings
        clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE
        clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
        clk: qcom: a7-pll: Add missing MODULE_DEVICE_TABLE
        dt: bindings: add mt7621-sysc device tree binding documentation
        dt-bindings: clock: add dt binding header for mt7621 clocks
        clk: samsung: Remove redundant dev_err calls
        clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable
        clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback
        clk: zynqmp: Drop dependency on ARCH_ZYNQMP
        clk: zynqmp: Enable the driver if ZYNQMP_FIRMWARE is selected
        clk: qcom: gcc-sm8350: use ARRAY_SIZE instead of specifying num_parents
        ...
    • Merge tag 'mailbox-v5.13' of git://git.linaro.org/landing-teams/working/fujitsu/integration · d8201efe
      Linus Torvalds authored
      Pull mailbox updates from Jassi Brar:
       "qcom:
         - enable support for SM8350 and SC7280
      
        sprd:
         - refcount channel usage
         - specify interrupt names in dt
         - support sc9863a
      
        arm:
         - drop redundant print
      
        ti:
         - convert dt-bindings to json schema
      
        and misc spelling fixes"
      
      * tag 'mailbox-v5.13' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        dt-bindings: mailbox: qcom-ipcc: Add compatible for SC7280
        dt-bindings: mailbox: ti,secure-proxy: Convert to json schema
        mailbox: arm_mhu_db: Remove redundant dev_err call in mhu_db_probe()
        mailbox: sprd: Add supplementary inbox support
        dt-bindings: mailbox: Add interrupt-names to SPRD mailbox
        mailbox: sprd: Introduce refcnt when clients requests/free channels
        MAINTAINERS: Add DT bindings directory to mailbox
        mailbox: fix various typos in comments
        mailbox: pcc: fix platform_no_drv_owner.cocci warnings
        dt-bindings: mailbox: Add compatible for SM8350 IPCC
    • Merge tag 'backlight-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · c969f245
      Linus Torvalds authored
      Pull backlight updates from Lee Jones:
       "New Device Support:
         - Add support for PMI8994 to Qualcomm WLED
         - Add support for KTD259 to Kinetic KTD253
      
        Fix-ups:
         - Device Tree related fix-ups; kinetic,ktd253
         - Use proper sequence during sync_toggle; qcom-wled
         - Fix Wmisleading-indentation warnings; jornada720_bl
      
        Bug Fixes:
         - Fix sync toggle on WLED4; qcom-wled
         - Fix FSC update on WLED5; qcom-wled"
      
      * tag 'backlight-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
        backlight: journada720: Fix Wmisleading-indentation warning
        backlight: qcom-wled: Correct the sync_toggle sequence
        backlight: qcom-wled: Fix FSC update issue for WLED5
        dt-bindings: backlight: Add Kinetic KTD259 bindings
        backlight: ktd253: Support KTD259
        backlight: qcom-wled: Use sink_addr for sync toggle
        dt-bindings: backlight: qcom-wled: Add PMI8994 compatible
    • Merge tag 'mfd-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 71a5cc28
      Linus Torvalds authored
      Pull MFD updates from Lee Jones:
       "Core Framework:
         - Add support for Software Nodes to MFD Core
         - Remove support for Device Properties from MFD Core
         - Use standard APIs in MFD Core
      
        New Drivers:
         - Add support for ROHM BD9576MUF and BD9573MUF PMICs
         - Add support for Netronix Embedded Controller, PWM and RTC
         - Add support for Actions Semi ATC260x PMICs and OnKey
      
        New Device Support:
         - Add support for DG1 PCIe Graphics Card to Intel PMT
         - Add support for ROHM BD71815 PMIC to ROHM BD71828
         - Add support for Tolino Shine 2 HD to Netronix Embedded Controller
         - Add support for MAX10 BMC Secure Updates to Intel M10 BMC
      
        Removed Device Support:
         - Remove Arizona Extcon support from MFD
         - Remove ST-E AB8500 Power Supply code from MFD
         - Remove AB3100 altogether
      
        New Functionality:
         - Add support for SMBus and I2C modes to Dialog DA9063
         - Switch to using Software Nodes in Intel (various)
      
        New/converted Device Tree bindings:
         - rohm bd71815-pmic, rohm bd9576-pmic, netronix ntxec, actions
           atc260x, ricoh rn5t618, qcom pm8xxx
      
        Fix-ups:
         - Fix error handling/path; intel_pmt
         - Simplify code; rohm-bd718x7, ab8500-core, intel-m10-bmc
         - Trivial clean-ups (reordering, spelling); rohm-generic, rn5t618,
           max8997
         - Use correct data-type; db8500-prcmu
         - Remove superfluous code; lp87565, intel_quark_i2c_gpio, lpc_sch, twl
         - Use generic APIs/defines; lm3533-core, intel_quark_i2c_gpio
         - Regmap related fix-ups; intel-m10-bmc, sec-core
         - Reorder resource freeing during remove; intel_quark_i2c_gpio
         - Make table indexing more robust; intel_quark_i2c_gpio
         - Fix reference imbalances; arizona-irq
         - Staticify and (un)constify things; arizona-spi, stmpe, ene-kb3930,
           intel-lpss-acpi, intel-lpss-pci, atc260x-i2c, intel_quark_i2c_gpio
      
        Bug Fixes:
         - Fix incorrect (register) values; intel-m10-bmc
         - Kconfig related fixes; ABX500_CORE
         - Do not clear the Auto Reload Register; stm32-timers"
      
      * tag 'mfd-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (84 commits)
        mfd: intel-m10-bmc: Add support for MAX10 BMC Secure Updates
        Revert "mfd: max8997: Add of_compatible to Extcon and Charger mfd_cell"
        mfd: twl: Remove unused inline function twl4030charger_usb_en()
        dt-bindings: mfd: Convert pm8xxx bindings to yaml
        dt-bindings: mfd: Add compatible for pmk8350 rtc
        i2c: designware: Get rid of legacy platform data
        mfd: intel_quark_i2c_gpio: Convert I²C to use software nodes
        mfd: lpc_sch: Partially revert "Add support for Intel Quark X1000"
        mfd: arizona: Fix rumtime PM imbalance on error
        mfd: max8997: Replace 8998 with 8997
        mfd: core: Use acpi_find_child_device() for child devices lookup
        mfd: intel_quark_i2c_gpio: Don't play dirty trick with const
        mfd: intel_quark_i2c_gpio: Enable MSI interrupt
        mfd: intel_quark_i2c_gpio: Reuse BAR definitions for MFD cell indexing
        mfd: ntxec: Support for EC in Tolino Shine 2 HD
        mfd: stm32-timers: Avoid clearing auto reload register
        mfd: intel_quark_i2c_gpio: Replace I²C speeds with descriptive definitions
        mfd: intel_quark_i2c_gpio: Remove unused struct device member
        mfd: intel_quark_i2c_gpio: Unregister resources in reversed order
        mfd: Kconfig: ABX500_CORE should depend on ARCH_U8500
        ...
    • Merge tag 'mmc-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · be18cd1f
      Linus Torvalds authored
      Pull MMC and MEMSTICK updates from Ulf Hansson:
       "MMC core:
         - Fix hanging on I/O during system suspend for removable cards
         - Set read only for SD cards with permanent write protect bit
         - Power cycle the SD/SDIO card if CMD11 fails for UHS voltage
         - Issue a cache flush for eMMC only when it's enabled
         - Adapt to updated cache ctrl settings for eMMC from MMC ioctls
         - Use device property API when parsing voltages
         - Don't retry eMMC sanitize cmds
         - Use the timeout from the MMC ioctl for eMMC sanitize cmds
      
        MMC host:
         - mmc_spi: Make of_mmc_spi.c resource provider agnostic
         - mmc_spi: Use polling for card detect even without voltage-ranges
         - sdhci: Check for reset prior to DMA address unmap
         - sdhci-acpi: Add support for the AMDI0041 eMMC controller variant
         - sdhci-esdhc-imx: Depend on OF Kconfig and clean up code
         - sdhci-pci: Add PCI IDs for Intel LKF
         - sdhci-pci: Fix initialization of some SD cards for Intel BYT
         - sdhci-pci-gli: Various improvements for GL97xx variants
         - sdhci-of-dwcmshc: Enable support for MMC_CAP_WAIT_WHILE_BUSY
         - sdhci-of-dwcmshc: Add ACPI support for BlueField-3 SoC
         - sdhci-of-dwcmshc: Add Rockchip platform support
         - tmio/renesas_sdhi: Extend support for reset and use a reset controller
         - tmio/renesas_sdhi: Enable support for MMC_CAP_WAIT_WHILE_BUSY
         - tmio/renesas_sdhi: Various improvements
      
        MEMSTICK:
         - Minor improvements/cleanups"
      
      * tag 'mmc-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (79 commits)
        mmc: block: Issue a cache flush only when it's enabled
        memstick: r592: ignore kfifo_out() return code again
        mmc: block: Update ext_csd.cache_ctrl if it was written
        mmc: mmc_spi: Make of_mmc_spi.c resource provider agnostic
        mmc: mmc_spi: Use already parsed IRQ
        mmc: mmc_spi: Drop unused NO_IRQ definition
        mmc: mmc_spi: Set up polling even if voltage-ranges is not present
        mmc: core: Convert mmc_of_parse_voltage() to use device property API
        mmc: core: Correct descriptions in mmc_of_parse()
        mmc: dw_mmc-rockchip: Just set default sample value for legacy mode
        mmc: sdhci-s3c: constify uses of driver/match data
        mmc: sdhci-s3c: correct kerneldoc of sdhci_s3c_drv_data
        mmc: sdhci-s3c: simplify getting of_device_id match data
        mmc: tmio: always restore irq register
        mmc: sdhci-pci-gli: Enlarge ASPM L1 entry delay of GL975x
        mmc: core: Let eMMC sanitize not retry in case of timeout/failure
        mmc: core: Add a retries parameter to __mmc_switch function
        memstick: r592: remove unused variable
        mmc: sdhci-st: Remove unnecessary error log
        mmc: sdhci-msm: Remove unnecessary error log
        ...
    • Merge tag 'for-linus-5.13-1' of git://github.com/cminyard/linux-ipmi · 6fa09d31
      Linus Torvalds authored
      Pull IPMI updates from Corey Minyard:
       "A bunch of little cleanups
      
        Nothing major, no functional changes"
      
      * tag 'for-linus-5.13-1' of git://github.com/cminyard/linux-ipmi:
        ipmi_si: Join string literals back
        ipmi_si: Drop redundant check before calling put_device()
        ipmi_si: Use strstrip() to remove surrounding spaces
        ipmi_si: Get rid of ->addr_source_cleanup()
        ipmi_si: Reuse si_to_str[] array in ipmi_hardcode_init_one()
        ipmi_si: Introduce ipmi_panic_event_str[] array
        ipmi_si: Use proper ACPI macros to check error code for failures
        ipmi_si: Utilize temporary variable to hold device pointer
        ipmi_si: Remove bogus err_free label
        ipmi_si: Switch to use platform_get_mem_or_io()
        ipmi: Handle device properties with software node API
        ipmi:ssif: make ssif_i2c_send() void
        ipmi: Refine retry conditions for getting device id
    • Merge tag 'devicetree-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 0080665f
      Linus Torvalds authored
      Pull devicetree updates from Rob Herring:
      
       - Refactor powerpc and arm64 kexec DT handling to common code. This
         enables IMA on arm64.
      
       - Add kbuild support for applying DT overlays at build time. The first
         users are the DT unittests.
      
       - Fix kerneldoc formatting and W=1 warnings in drivers/of/
      
       - Fix handling 64-bit flag on PCI resources
      
       - Bump dtschema version required to v2021.2.1
      
       - Enable undocumented compatible checks for dtbs_check. This allows
         tracking of missing binding schemas.
      
       - DT docs improvements. Regroup the DT docs and add the example schema
         and DT kernel ABI docs to the doc build.
      
       - Convert Broadcom Bluetooth and video-mux bindings to schema
      
       - Add QCom sm8250 Venus video codec binding schema
      
       - Add vendor prefixes for AESOP, YIC System Co., Ltd, and Siliconfile
         Technologies Inc.
      
       - Cleanup of DT schema type references on common properties and
         standard unit properties
      
      * tag 'devicetree-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (64 commits)
        powerpc: If kexec_build_elf_info() fails return immediately from elf64_load()
        powerpc: Free fdt on error in elf64_load()
        of: overlay: Fix kerneldoc warning in of_overlay_remove()
        of: linux/of.h: fix kernel-doc warnings
        of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses
        dt-bindings: bcm4329-fmac: add optional brcm,ccode-map
        docs: dt: update writing-schema.rst references
        dt-bindings: media: venus: Add sm8250 dt schema
        of: base: Fix spelling issue with function param 'prop'
        docs: dt: Add DT API documentation
        of: Add missing 'Return' section in kerneldoc comments
        of: Fix kerneldoc output formatting
        docs: dt: Group DT docs into relevant sub-sections
        docs: dt: Make 'Devicetree' wording more consistent
        docs: dt: writing-schema: Include the example schema in the doc build
        docs: dt: writing-schema: Remove spurious indentation
        dt-bindings: Fix reference in submitting-patches.rst to the DT ABI doc
        dt-bindings: ddr: Add optional manufacturer and revision ID to LPDDR3
        dt-bindings: media: video-interfaces: Drop the example
        devicetree: bindings: clock: Minor typo fix in the file armada3700-tbg-clock.txt
        ...