  1. Oct 30, 2020
  2. Oct 28, 2020
  3. Oct 27, 2020
    • net: hns3: Clear the CMDQ registers before unmapping BAR region · e3364c5f
      Zenghui Yu authored
      When unbinding the hns3 driver with the HNS3 VF, I got the following
      kernel panic:
      
      [  265.709989] Unable to handle kernel paging request at virtual address ffff800054627000
      [  265.717928] Mem abort info:
      [  265.720740]   ESR = 0x96000047
      [  265.723810]   EC = 0x25: DABT (current EL), IL = 32 bits
      [  265.729126]   SET = 0, FnV = 0
      [  265.732195]   EA = 0, S1PTW = 0
      [  265.735351] Data abort info:
      [  265.738227]   ISV = 0, ISS = 0x00000047
      [  265.742071]   CM = 0, WnR = 1
      [  265.745055] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000009b54000
      [  265.751753] [ffff800054627000] pgd=0000202ffffff003, p4d=0000202ffffff003, pud=00002020020eb003, pmd=00000020a0dfc003, pte=0000000000000000
      [  265.764314] Internal error: Oops: 96000047 [#1] SMP
      [  265.830357] CPU: 61 PID: 20319 Comm: bash Not tainted 5.9.0+ #206
      [  265.836423] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.05 09/18/2019
      [  265.843873] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
      [  265.843890] pc : hclgevf_cmd_uninit+0xbc/0x300
      [  265.861988] lr : hclgevf_cmd_uninit+0xb0/0x300
      [  265.861992] sp : ffff80004c983b50
      [  265.881411] pmr_save: 000000e0
      [  265.884453] x29: ffff80004c983b50 x28: ffff20280bbce500
      [  265.889744] x27: 0000000000000000 x26: 0000000000000000
      [  265.895034] x25: ffff800011a1f000 x24: ffff800011a1fe90
      [  265.900325] x23: ffff0020ce9b00d8 x22: ffff0020ce9b0150
      [  265.905616] x21: ffff800010d70e90 x20: ffff800010d70e90
      [  265.910906] x19: ffff0020ce9b0080 x18: 0000000000000004
      [  265.916198] x17: 0000000000000000 x16: ffff800011ae32e8
      [  265.916201] x15: 0000000000000028 x14: 0000000000000002
      [  265.916204] x13: ffff800011ae32e8 x12: 0000000000012ad8
      [  265.946619] x11: ffff80004c983b50 x10: 0000000000000000
      [  265.951911] x9 : ffff8000115d0888 x8 : 0000000000000000
      [  265.951914] x7 : ffff800011890b20 x6 : c0000000ffff7fff
      [  265.951917] x5 : ffff80004c983930 x4 : 0000000000000001
      [  265.951919] x3 : ffffa027eec1b000 x2 : 2b78ccbbff369100
      [  265.964487] x1 : 0000000000000000 x0 : ffff800054627000
      [  265.964491] Call trace:
      [  265.964494]  hclgevf_cmd_uninit+0xbc/0x300
      [  265.964496]  hclgevf_uninit_ae_dev+0x9c/0xe8
      [  265.964501]  hnae3_unregister_ae_dev+0xb0/0x130
      [  265.964516]  hns3_remove+0x34/0x88 [hns3]
      [  266.009683]  pci_device_remove+0x48/0xf0
      [  266.009692]  device_release_driver_internal+0x114/0x1e8
      [  266.030058]  device_driver_detach+0x28/0x38
      [  266.034224]  unbind_store+0xd4/0x108
      [  266.037784]  drv_attr_store+0x40/0x58
      [  266.041435]  sysfs_kf_write+0x54/0x80
      [  266.045081]  kernfs_fop_write+0x12c/0x250
      [  266.049076]  vfs_write+0xc4/0x248
      [  266.052378]  ksys_write+0x74/0xf8
      [  266.055677]  __arm64_sys_write+0x24/0x30
      [  266.059584]  el0_svc_common.constprop.3+0x84/0x270
      [  266.064354]  do_el0_svc+0x34/0xa0
      [  266.067658]  el0_svc+0x38/0x40
      [  266.070700]  el0_sync_handler+0x8c/0xb0
      [  266.074519]  el0_sync+0x140/0x180
      
      It looks like the BAR memory region had already been unmapped before we
      started clearing CMDQ registers in it, which is pretty bad and the kernel
      happily kills itself because of a Current EL Data Abort (on arm64).
      
      Moving the CMDQ uninitialization a bit earlier fixes the issue for me.
      
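      The ordering constraint can be modelled in user space. This is a minimal
      sketch, not the hns3 code: `struct fake_dev` and every `fake_*` function
      are hypothetical, and a heap buffer stands in for the mapped BAR region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FAKE_CMDQ_REGS 4	/* illustrative register count */

/* Hypothetical stand-in for a device with one memory-mapped BAR. */
struct fake_dev {
	uint32_t *bar;		/* NULL once the region is "unmapped" */
	bool cmdq_cleared;
};

bool fake_map_bar(struct fake_dev *dev)
{
	dev->bar = calloc(FAKE_CMDQ_REGS, sizeof(*dev->bar));
	dev->cmdq_cleared = false;
	return dev->bar != NULL;
}

/* Clears the command-queue registers; only valid while the BAR is mapped. */
bool fake_cmdq_uninit(struct fake_dev *dev)
{
	if (!dev->bar)
		return false;	/* in the kernel this access would fault */
	for (size_t i = 0; i < FAKE_CMDQ_REGS; i++)
		dev->bar[i] = 0;
	dev->cmdq_cleared = true;
	return true;
}

void fake_unmap_bar(struct fake_dev *dev)
{
	free(dev->bar);
	dev->bar = NULL;
}

/* Teardown in the fixed order: registers first, then the mapping. */
bool fake_teardown(struct fake_dev *dev)
{
	bool ok;

	assert(dev != NULL);
	ok = fake_cmdq_uninit(dev);
	fake_unmap_bar(dev);
	return ok;
}
```

      Calling fake_cmdq_uninit() after fake_unmap_bar() returns false in this
      model; in the driver the equivalent write is what triggered the abort.
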
      Fixes: 862d969a ("net: hns3: do VF's pci re-initialization while PF doing FLR")
      Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
      Link: https://lore.kernel.org/r/20201023051550.793-1-yuzenghui@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'bnxt_en-bug-fixes' · 10067b50
      Jakub Kicinski authored
      Michael Chan says:
      
      ====================
      bnxt_en: Bug fixes.
      
      These 5 bug fixes are all related to the firmware reset or AER recovery.
      2 patches fix the cleanup logic for the workqueue used to handle firmware
      reset and recovery. 1 patch ensures that the chip will have the proper
      BAR addresses latched after fatal AER recovery.  1 patch fixes the
      open path to check for firmware reset abort error.  The last one
      sends the fw reset command unconditionally to fix the AER reset logic.
      ====================
      
      Link: https://lore.kernel.org/r/1603685901-17917-1-git-send-email-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally. · 825741b0
      Vasundhara Volam authored
      In the AER or firmware reset flow, if we are in fatal error state or
      if pci_channel_offline() is true, we don't send any commands to the
      firmware because the commands will likely not reach the firmware and
      most commands don't matter much because the firmware is likely to be
      reset imminently.
      
      However, the HWRM_FUNC_RESET command is different and we should always
      attempt to send it.  In the AER flow for example, the .slot_reset()
      call will trigger this fw command and we need to try to send it to
      effect the proper reset.
      
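      The rule described above amounts to one exception in the "should we send
      this command?" check. A minimal sketch, assuming hypothetical names
      (`enum fake_cmd`, `struct fake_state`, `fake_should_send()`) that do not
      come from the bnxt_en driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command IDs; the real HWRM request types live in the driver. */
enum fake_cmd { FAKE_CMD_QUERY, FAKE_CMD_FUNC_RESET };

struct fake_state {
	bool fatal_error;	/* driver is in fatal error state */
	bool channel_offline;	/* models pci_channel_offline() being true */
};

/* Returns true if the command should be attempted. */
bool fake_should_send(const struct fake_state *st, enum fake_cmd cmd)
{
	assert(st != NULL);

	/* The function reset is always attempted, whatever the error state. */
	if (cmd == FAKE_CMD_FUNC_RESET)
		return true;

	/* Everything else is skipped while the firmware is unreachable. */
	return !(st->fatal_error || st->channel_offline);
}
```
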
      Fixes: b340dc68 ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
      Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Check abort error state in bnxt_open_nic(). · a1301f08
      Michael Chan authored
      bnxt_open_nic() is called during configuration changes that require
      the NIC to be closed and then opened.  This call is protected by
      rtnl_lock.  Firmware reset can be happening at the same time.  Only
      critical portions of the entire firmware reset sequence are protected
      by the rtnl_lock.  It is possible that bnxt_open_nic() can be called
      when the firmware reset sequence is aborting.  In that case,
      bnxt_open_nic() needs to check if the ABORT_ERR flag is set and
      abort if it is.  The configuration change that resulted in the
      bnxt_open_nic() call will fail but the NIC will be brought to a
      consistent IF_DOWN state.
      
      Without this patch, if bnxt_open_nic() were to continue in this error
      state, it may crash like this:
      
      [ 1648.659736] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 1648.659768] IP: [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
      [ 1648.659796] PGD 101e1b3067 PUD 101e1b2067 PMD 0
      [ 1648.659813] Oops: 0000 [#1] SMP
      [ 1648.659825] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dell_smbios dell_wmi_descriptor dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper vfat cryptd fat pcspkr ipmi_ssif sg k10temp i2c_piix4 wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_power_meter sch_fq_codel ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci megaraid_sas crct10dif_pclmul crct10dif_common
      [ 1648.660063]  tg3 libata crc32c_intel bnxt_en(OE) drm_panel_orientation_quirks devlink ptp pps_core dm_mirror dm_region_hash dm_log dm_mod fuse
      [ 1648.660105] CPU: 13 PID: 3867 Comm: ethtool Kdump: loaded Tainted: G           OE  ------------   3.10.0-1152.el7.x86_64 #1
      [ 1648.660911] Hardware name: Dell Inc. PowerEdge R7515/0R4CNN, BIOS 1.2.14 01/28/2020
      [ 1648.661662] task: ffff94e64cbc9080 ti: ffff94f55df1c000 task.ti: ffff94f55df1c000
      [ 1648.662409] RIP: 0010:[<ffffffffc01e9b3a>]  [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
      [ 1648.663171] RSP: 0018:ffff94f55df1fba8  EFLAGS: 00010202
      [ 1648.663927] RAX: 0000000000000000 RBX: ffff94e6827e0000 RCX: 0000000000000000
      [ 1648.664684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94e6827e08c0
      [ 1648.665433] RBP: ffff94f55df1fc20 R08: 00000000000001ff R09: 0000000000000008
      [ 1648.666184] R10: 0000000000000d53 R11: ffff94f55df1f7ce R12: ffff94e6827e08c0
      [ 1648.666940] R13: ffff94e6827e08c0 R14: ffff94e6827e08c0 R15: ffffffffb9115e40
      [ 1648.667695] FS:  00007f8aadba5740(0000) GS:ffff94f57eb40000(0000) knlGS:0000000000000000
      [ 1648.668447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1648.669202] CR2: 0000000000000000 CR3: 0000001022772000 CR4: 0000000000340fe0
      [ 1648.669966] Call Trace:
      [ 1648.670730]  [<ffffffffc01f1d5d>] ? bnxt_need_reserve_rings+0x9d/0x170 [bnxt_en]
      [ 1648.671496]  [<ffffffffc01fa7ea>] __bnxt_open_nic+0x8a/0x9a0 [bnxt_en]
      [ 1648.672263]  [<ffffffffc01f7479>] ? bnxt_close_nic+0x59/0x1b0 [bnxt_en]
      [ 1648.673031]  [<ffffffffc01fb11b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
      [ 1648.673793]  [<ffffffffc020037c>] bnxt_set_ringparam+0x6c/0xa0 [bnxt_en]
      [ 1648.674550]  [<ffffffffb8a5f564>] dev_ethtool+0x1334/0x21a0
      [ 1648.675306]  [<ffffffffb8a719ff>] dev_ioctl+0x1ef/0x5f0
      [ 1648.676061]  [<ffffffffb8a324bd>] sock_do_ioctl+0x4d/0x60
      [ 1648.676810]  [<ffffffffb8a326bb>] sock_ioctl+0x1eb/0x2d0
      [ 1648.677548]  [<ffffffffb8663230>] do_vfs_ioctl+0x3a0/0x5b0
      [ 1648.678282]  [<ffffffffb8b8e678>] ? __do_page_fault+0x238/0x500
      [ 1648.679016]  [<ffffffffb86634e1>] SyS_ioctl+0xa1/0xc0
      [ 1648.679745]  [<ffffffffb8b93f92>] system_call_fastpath+0x25/0x2a
      [ 1648.680461] Code: 9e 60 01 00 00 0f 1f 40 00 45 8b 8e 48 01 00 00 31 c9 45 85 c9 0f 8e 73 01 00 00 66 0f 1f 44 00 00 49 8b 86 a8 00 00 00 48 63 d1 <48> 8b 14 d0 48 85 d2 0f 84 46 01 00 00 41 8b 86 44 01 00 00 c7
      [ 1648.681986] RIP  [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
      [ 1648.682724]  RSP <ffff94f55df1fba8>
      [ 1648.683451] CR2: 0000000000000000
      
      Fixes: ec5d31e3 ("bnxt_en: Handle firmware reset status during IF_UP.")
      Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Re-write PCI BARs after PCI fatal error. · f75d9a0a
      Vasundhara Volam authored
      When a PCIe fatal error occurs, the internal latched BAR addresses
      in the chip get reset even though the BAR register values in config
      space are retained.
      
      pci_restore_state() will not rewrite the BAR addresses if the
      BAR address values are valid, causing the chip's internal BAR addresses
      to stay invalid.  So we need to zero the BAR registers during PCIe fatal
      error to force pci_restore_state() to restore the BAR addresses.  These
      write cycles to the BAR registers will cause the proper BAR addresses to
      latch internally.
      
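      Why zeroing the BAR helps can be seen in a small user-space model. This is
      a sketch under stated assumptions: `struct fake_pci` and the `fake_*`
      functions are hypothetical, and "the BAR value is valid" is modelled as
      "the config-space value already equals the saved value".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one BAR as seen from three places. */
struct fake_pci {
	uint32_t bar_cfg;	/* BAR value visible in config space */
	uint32_t bar_latched;	/* address latched inside the chip */
	uint32_t bar_saved;	/* value captured when state was saved */
};

/* In this model every config write to the BAR also latches it internally. */
void fake_cfg_write_bar(struct fake_pci *p, uint32_t val)
{
	p->bar_cfg = val;
	p->bar_latched = val;
}

/* Fatal error: the chip loses its latched address, config space keeps it. */
void fake_fatal_error(struct fake_pci *p)
{
	p->bar_latched = 0;
}

/* Models a restore that skips the write when the value already matches. */
void fake_restore_state(struct fake_pci *p)
{
	assert(p != NULL);
	if (p->bar_cfg == p->bar_saved)
		return;
	fake_cfg_write_bar(p, p->bar_saved);
}
```

      In the model, calling fake_restore_state() right after fake_fatal_error()
      leaves bar_latched at 0, whereas writing 0 to the BAR first forces the
      restore to issue a write that latches the saved address again.
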
      Fixes: 6316ea6d ("bnxt_en: Enable AER support.")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Invoke cancel_delayed_work_sync() for PFs also. · 631ce27a
      Vasundhara Volam authored
      As part of the commit b148bb23
      ("bnxt_en: Fix possible crash in bnxt_fw_reset_task()."),
      cancel_delayed_work_sync() is called only for VFs to fix a possible
      crash by cancelling any pending delayed work items. It was assumed
      by mistake that the flush_workqueue() call on the PF would flush
      delayed work items as well.
      
      As flush_workqueue() does not cancel delayed work items, extend
      the fix to PFs. This avoids the system crash if there are
      any pending delayed work items in fw_reset_task() during the driver's
      .remove() call.
      
      Unify the workqueue cleanup logic for both PF and VF by calling
      cancel_work_sync() and cancel_delayed_work_sync() directly in
      bnxt_remove_one().
      
      Fixes: b148bb23 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one(). · 21d6a11e
      Vasundhara Volam authored
      A recent patch has moved the workqueue cleanup logic before
      calling unregister_netdev() in bnxt_remove_one().  This caused a
      regression because the workqueue can be restarted if the device is
      still open.  Workqueue cleanup must be done after unregister_netdev().
      The workqueue will not restart itself after the device is closed.
      
      Call bnxt_cancel_sp_work() after unregister_netdev() and
      call bnxt_dl_fw_reporters_destroy() after that.  This fixes the
      regression and the original NULL ptr dereference issue.
      
      Fixes: b16939b5 ("bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'mlxsw-various-fixes' · 19c176eb
      Jakub Kicinski authored
      Ido Schimmel says:
      
      ====================
      mlxsw: Various fixes
      
      This patch set contains various fixes for mlxsw.
      
      Patch #1 ensures that only link modes that are supported by both the
      device and the driver are advertised. When a link mode that is not
      supported by the driver is negotiated by the device, it will be
      presented as an unknown speed by ethtool, causing the bond driver to
      wrongly assume that the link is down.
      
      Patch #2 fixes a trivial memory leak upon module removal.
      
      Patch #3 fixes a use-after-free that syzkaller was able to trigger once
      on a slow emulator after a few months of fuzzing.
      ====================
      
      Link: https://lore.kernel.org/r/20201024133733.2107509-1-idosch@idosch.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mlxsw: core: Fix use-after-free in mlxsw_emad_trans_finish() · 0daf2bf5
      Amit Cohen authored
      Each EMAD transaction stores the skb used to issue the EMAD request
      ('trans->tx_skb') so that the request could be retried in case of a
      timeout. The skb can be freed when a corresponding response is received
      or as part of the retry logic (e.g., failed retransmit, exceeded maximum
      number of retries).
      
      The two tasks (i.e., response processing and retransmits) are
      synchronized by the atomic 'trans->active' field which ensures that
      responses to inactive transactions are ignored.
      
      In case of a failed retransmit the transaction is finished and all of
      its resources are freed. However, the current code does not mark it as
      inactive. Syzkaller was able to hit a race condition in which a
      concurrent response is processed while the transaction's resources are
      being freed, resulting in a use-after-free [1].
      
      Fix the issue by making sure to mark the transaction as inactive after a
      failed retransmit and free its resources only if a concurrent task did
      not already do that.
      
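      The "only one task frees the resources" idea can be sketched with C11
      atomics in user space. This is an illustration, not the driver's code:
      `struct fake_trans` and the `fake_*` functions are hypothetical, and
      atomic_exchange() is used here in place of whatever kernel atomic
      primitive the actual fix relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical transaction; 'active' plays the role of 'trans->active'. */
struct fake_trans {
	atomic_int active;	/* 1 while the transaction is in flight */
	void *tx_buf;		/* stands in for 'trans->tx_skb' */
	int frees;		/* counts releases, for illustration only */
};

void fake_trans_start(struct fake_trans *t)
{
	t->tx_buf = malloc(64);
	t->frees = 0;
	atomic_init(&t->active, 1);
}

/*
 * Called from both the response path and the failed-retransmit path.
 * atomic_exchange() returns the previous value, so exactly one caller
 * observes 1 and becomes responsible for releasing the resources.
 */
bool fake_trans_finish(struct fake_trans *t)
{
	assert(t != NULL);
	if (atomic_exchange(&t->active, 0) != 1)
		return false;	/* another task already finished it */
	free(t->tx_buf);
	t->tx_buf = NULL;
	t->frees++;
	return true;
}
```

      A second call to fake_trans_finish() returns false without touching the
      buffer, which is the property the missing "mark inactive" step was
      supposed to provide.
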
      [1]
      BUG: KASAN: use-after-free in consume_skb+0x30/0x370
      net/core/skbuff.c:833
      Read of size 4 at addr ffff88804f570494 by task syz-executor.0/1004
      
      CPU: 0 PID: 1004 Comm: syz-executor.0 Not tainted 5.8.0-rc7+ #68
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xf6/0x16e lib/dump_stack.c:118
       print_address_description.constprop.0+0x1c/0x250
      mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
       check_memory_region_inline mm/kasan/generic.c:186 [inline]
       check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
       instrument_atomic_read include/linux/instrumented.h:56 [inline]
       atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
       refcount_read include/linux/refcount.h:147 [inline]
       skb_unref include/linux/skbuff.h:1044 [inline]
       consume_skb+0x30/0x370 net/core/skbuff.c:833
       mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
       mlxsw_emad_process_response drivers/net/ethernet/mellanox/mlxsw/core.c:651 [inline]
       mlxsw_emad_rx_listener_func+0x5c9/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:672
       mlxsw_core_skb_receive+0x4df/0x770 drivers/net/ethernet/mellanox/mlxsw/core.c:2063
       mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
       mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
       tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
       __do_softirq+0x223/0x964 kernel/softirq.c:292
       asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
      
      Allocated by task 1006:
       save_stack+0x1b/0x40 mm/kasan/common.c:48
       set_track mm/kasan/common.c:56 [inline]
       __kasan_kmalloc mm/kasan/common.c:494 [inline]
       __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
       slab_post_alloc_hook mm/slab.h:586 [inline]
       slab_alloc_node mm/slub.c:2824 [inline]
       slab_alloc mm/slub.c:2832 [inline]
       kmem_cache_alloc+0xcd/0x2e0 mm/slub.c:2837
       __build_skb+0x21/0x60 net/core/skbuff.c:311
       __netdev_alloc_skb+0x1e2/0x360 net/core/skbuff.c:464
       netdev_alloc_skb include/linux/skbuff.h:2810 [inline]
       mlxsw_emad_alloc drivers/net/ethernet/mellanox/mlxsw/core.c:756 [inline]
       mlxsw_emad_reg_access drivers/net/ethernet/mellanox/mlxsw/core.c:787 [inline]
       mlxsw_core_reg_access_emad+0x1ab/0x1420 drivers/net/ethernet/mellanox/mlxsw/core.c:1817
       mlxsw_reg_trans_query+0x39/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:1831
       mlxsw_sp_sb_pm_occ_clear drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:260 [inline]
       mlxsw_sp_sb_occ_max_clear+0xbff/0x10a0 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:1365
       mlxsw_devlink_sb_occ_max_clear+0x76/0xb0 drivers/net/ethernet/mellanox/mlxsw/core.c:1037
       devlink_nl_cmd_sb_occ_max_clear_doit+0x1ec/0x280 net/core/devlink.c:1765
       genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:714 [inline]
       genl_rcv_msg+0x617/0x980 net/netlink/genetlink.c:731
       netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2470
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:742
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg+0x150/0x190 net/socket.c:671
       ____sys_sendmsg+0x6d8/0x840 net/socket.c:2359
       ___sys_sendmsg+0xff/0x170 net/socket.c:2413
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2446
       do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 73:
       save_stack+0x1b/0x40 mm/kasan/common.c:48
       set_track mm/kasan/common.c:56 [inline]
       kasan_set_free_info mm/kasan/common.c:316 [inline]
       __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
       slab_free_hook mm/slub.c:1474 [inline]
       slab_free_freelist_hook mm/slub.c:1507 [inline]
       slab_free mm/slub.c:3072 [inline]
       kmem_cache_free+0xbe/0x380 mm/slub.c:3088
       kfree_skbmem net/core/skbuff.c:622 [inline]
       kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:616
       __kfree_skb net/core/skbuff.c:679 [inline]
       consume_skb net/core/skbuff.c:837 [inline]
       consume_skb+0xe1/0x370 net/core/skbuff.c:831
       mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
       mlxsw_emad_transmit_retry.isra.0+0x9d/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:613
       mlxsw_emad_trans_timeout_work+0x43/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:625
       process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
       worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
       kthread+0x355/0x470 kernel/kthread.c:291
       ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293
      
      The buggy address belongs to the object at ffff88804f5703c0
       which belongs to the cache skbuff_head_cache of size 224
      The buggy address is located 212 bytes inside of
       224-byte region [ffff88804f5703c0, ffff88804f5704a0)
      The buggy address belongs to the page:
      page:ffffea00013d5c00 refcount:1 mapcount:0 mapping:0000000000000000
      index:0x0
      flags: 0x100000000000200(slab)
      raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806c625400
      raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88804f570380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
       ffff88804f570400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff88804f570480: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
                               ^
       ffff88804f570500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88804f570580: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
      
      Fixes: caf7297e ("mlxsw: core: Introduce support for asynchronous EMAD register access")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mlxsw: core: Fix memory leak on module removal · adc80b6c
      Ido Schimmel authored
      Free the devlink instance during the teardown sequence in the non-reload
      case to avoid the following memory leak.
      
      unreferenced object 0xffff888232895000 (size 2048):
        comm "modprobe", pid 1073, jiffies 4295568857 (age 164.871s)
        hex dump (first 32 bytes):
          00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de  ........".......
          10 50 89 32 82 88 ff ff 10 50 89 32 82 88 ff ff  .P.2.....P.2....
        backtrace:
          [<00000000c704e9a6>] __kmalloc+0x13a/0x2a0
          [<00000000ee30129d>] devlink_alloc+0xff/0x760
          [<0000000092ab3e5d>] 0xffffffffa042e5b0
          [<000000004f3f8a31>] 0xffffffffa042f6ad
          [<0000000092800b4b>] 0xffffffffa0491df3
          [<00000000c4843903>] local_pci_probe+0xcb/0x170
          [<000000006993ded7>] pci_device_probe+0x2c2/0x4e0
          [<00000000a8e0de75>] really_probe+0x2c5/0xf90
          [<00000000d42ba75d>] driver_probe_device+0x1eb/0x340
          [<00000000bcc95e05>] device_driver_attach+0x294/0x300
          [<000000000e2bc177>] __driver_attach+0x167/0x2f0
          [<000000007d44cd6e>] bus_for_each_dev+0x148/0x1f0
          [<000000003cd5a91e>] driver_attach+0x45/0x60
          [<000000000041ce51>] bus_add_driver+0x3b8/0x720
          [<00000000f5215476>] driver_register+0x230/0x4e0
          [<00000000d79356f5>] __pci_register_driver+0x190/0x200
      
      Fixes: a22712a9 ("mlxsw: core: Fix devlink unregister flow")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reported-by: Vadim Pasternak <vadimp@nvidia.com>
      Tested-by: Oleksandr Shamray <oleksandrs@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mlxsw: Only advertise link modes supported by both driver and device · 1601559b
      Amit Cohen authored
      During port creation the driver instructs the device to advertise all
      the supported link modes queried from the device.
      
      Since the cited commit, not all the link modes supported by the device
      are supported by the driver. This can result in the device negotiating a
      link mode that is not recognized by the driver, causing ethtool to show
      an unsupported speed:
      
      $ ethtool swp1
      ...
      Speed: Unknown!
      
      This is especially problematic when the netdev is enslaved to a bond, as
      the bond driver uses unknown speed as an indication that the link is
      down:
      
      [13048.900895] net_ratelimit: 86 callbacks suppressed
      [13048.900902] t_bond0: (slave swp52): failed to get link speed/duplex
      [13048.912160] t_bond0: (slave swp49): failed to get link speed/duplex
      
      Fix this by making sure that only link modes that are supported by both
      the device and the driver are advertised.
      
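      Restricting the advertisement to modes known to both sides is a bitmask
      intersection. A minimal sketch, assuming made-up `FAKE_MODE_*` bits and a
      hypothetical fake_modes_to_advertise() helper; the driver's real masks and
      register layout are not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative link-mode bits; the real driver uses its own masks. */
#define FAKE_MODE_10G	(1u << 0)
#define FAKE_MODE_40G	(1u << 1)
#define FAKE_MODE_56G	(1u << 2)
#define FAKE_MODE_100G	(1u << 3)

/* Advertise only what both the device and the driver understand. */
uint32_t fake_modes_to_advertise(uint32_t device_caps, uint32_t driver_caps)
{
	uint32_t adv = device_caps & driver_caps;

	/* The result can never contain a mode the driver does not know. */
	assert((adv & ~driver_caps) == 0);
	return adv;
}
```

      With a device that reports 10G, 56G and 100G and a driver that knows 10G,
      40G and 100G, only 10G and 100G are advertised, so the link cannot come up
      at a speed the driver would report as unknown.
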
      Fixes: b97cd891 ("mlxsw: Remove 56G speed support")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'net-smc-fixes-2020-10-23' · 522ee51e
      Jakub Kicinski authored
      Karsten Graul says:
      
      ====================
      net/smc: fixes 2020-10-23
      
      Patch 1 fixes a potential null pointer dereference. Patch 2 takes care
      of a suppressed return code and patch 3 corrects the system EID in the
      ISM driver.
      ====================
      
      Link: https://lore.kernel.org/r/20201023184830.59548-1-kgraul@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • s390/ism: fix incorrect system EID · 1dc0d1cf
      Karsten Graul authored
      The system EID that is defined by the ISM driver is not correct. Using
      an incorrect system EID still allows communication with remote Linux
      systems that use the same incorrect system EID, but when it comes to
      interoperability with other operating systems the system EIDs never
      match, which prevents SMC-Dv2 communication.
      Using the correct system EID fixes this problem.
      
      Fixes: 201091eb ("net/smc: introduce System Enterprise ID (SEID)")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: fix suppressed return code · 96d6fded
      Karsten Graul authored
      The patch that repaired the invalid return code in smcd_new_buf_create()
      failed to handle errno ENOSPC, which has the special meaning that no
      more DMBEs can be registered on the device. Fix that by keeping this
      errno value during the translation of the return code.
      
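      Keeping one errno value while translating the rest can be sketched as
      follows. fake_translate_rc() is a hypothetical helper, and the -EIO
      fallback for all other failures is an assumption of this sketch; only the
      pass-through of ENOSPC is taken from the description above.

```c
#include <assert.h>
#include <errno.h>

/*
 * Translate a low-level registration result into the code reported upward.
 * -ENOSPC is passed through unchanged; the -EIO fallback for every other
 * failure is an assumption of this sketch.
 */
int fake_translate_rc(int rc)
{
	assert(rc <= 0);	/* 0 or a negative errno, kernel style */
	if (rc == 0)
		return 0;
	if (rc == -ENOSPC)
		return -ENOSPC;	/* no more DMBEs can be registered */
	return -EIO;
}
```
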
      Fixes: 6b1bbf94 ("net/smc: fix invalid return code in smcd_new_buf_create()")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: fix null pointer dereference in smc_listen_decline() · 4a9baf45
      Karsten Graul authored
      smc_listen_work() calls smc_listen_decline() on label out_decl,
      providing the ini pointer variable. But this pointer can still be null
      when the label out_decl is reached.
      Fix this by checking the ini variable in smc_listen_work() and calling
      smc_listen_decline() with the result directly.
      
      Fixes: a7c9c5f4 ("net/smc: CLC accept / confirm V2")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • vsock: use ns_capable_noaudit() on socket create · af545bb5
      Jeff Vander Stoep authored
      During __vsock_create() CAP_NET_ADMIN is used to determine if the
      vsock_sock->trusted should be set to true. This value is used later
      for determining if a remote connection should be allowed to connect
      to a restricted VM. Unfortunately, if the caller doesn't have
      CAP_NET_ADMIN, an audit message such as an SELinux denial is
      generated even if the caller does not want a trusted socket.
      
      Logging errors on success is confusing. To avoid this, switch the
      capable(CAP_NET_ADMIN) check to the noaudit version.
      
      Reported-by: Roman Kiryanov <rkir@google.com>
      https://android-review.googlesource.com/c/device/generic/goldfish/+/1468545/
      Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
      Reviewed-by: James Morris <jamorris@linux.microsoft.com>
      Link: https://lore.kernel.org/r/20201023143757.377574-1-jeffv@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • cxgb4: set up filter action after rewrites · 937d8420
      Raju Rangoju authored
      The current code sets up the filter action field before
      rewrites are set up. When the action 'switch' is used
      with rewrites, this may result in the first few packets
      being switched out without the rewrites applied
      to them.
      
      So, make sure the filter action is set up along with rewrites
      or only after everything else is set up for rewrites.
      
      Fixes: 12b276fb ("cxgb4: add support to create hash filters")
      Signed-off-by: Raju Rangoju <rajur@chelsio.com>
      Link: https://lore.kernel.org/r/20201023115852.18262-1-rajur@chelsio.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: hns3: clean up a return in hclge_tm_bp_setup() · ee7a3764
      Dan Carpenter authored
      Smatch complains that "ret" might be uninitialized if we don't enter
      the loop.  We do always enter the loop so it's a false positive, but
      it's cleaner to just return a literal zero and that silences the
      warning as well.
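The pattern can be sketched as follows. This is a minimal illustration, not the driver's code: `MODEL_GROUPS`, `model_setup_one()` and `model_bp_setup()` are hypothetical names, and the error value is arbitrary.

```c
#include <assert.h>

#define MODEL_GROUPS 4	/* hypothetical loop bound, always greater than zero */

/* Hypothetical per-group step: returns 0 on success, negative on error. */
static int model_setup_one(int group, int fail_at)
{
	return group == fail_at ? -5 : 0;
}

static int model_bp_setup(int fail_at)
{
	int ret;	/* only ever read right after it is assigned */
	int i;

	for (i = 0; i < MODEL_GROUPS; i++) {
		ret = model_setup_one(i, fail_at);
		if (ret)
			return ret;
	}

	/*
	 * Returning "ret" here would be correct too, but a static checker
	 * cannot always prove the loop body ran. A literal zero leaves
	 * nothing to prove.
	 */
	return 0;
}
```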
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/20201023112212.GA282278@mwanda
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ee7a3764
  4. Oct 24, 2020
    • Arjun Roy's avatar
      tcp: Prevent low rmem stalls with SO_RCVLOWAT. · 435ccfa8
      Arjun Roy authored
      With SO_RCVLOWAT, under memory pressure,
      it is possible to enter a state where:
      
      1. We have not received enough bytes to satisfy SO_RCVLOWAT.
      2. We have not entered buffer pressure (see tcp_rmem_pressure()).
      3. But, we do not have enough buffer space to accept more packets.
      
      In this case, we advertise 0 rwnd (due to #3) but the application does
      not drain the receive queue (no wakeup because of #1 and #2) so the
      flow stalls.
      
      Modify the heuristic for SO_RCVLOWAT so that, if we are advertising
      rwnd<=rcv_mss, force a wakeup to prevent a stall.
      
      Without this patch, setting tcp_rmem to 6143 and disabling TCP
      autotune causes a stalled flow. With this patch, no stall occurs. This
      is with RPC-style traffic with large messages.
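The change to the wakeup decision can be modelled as a pure function. This is a sketch under stated assumptions, not the kernel's implementation: `rcv_model` and its fields, `wake_old()` and `wake_new()` are hypothetical names, and any other conditions the real code checks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of receiver state; field names are illustrative. */
struct rcv_model {
	unsigned int avail;	/* bytes queued and not yet read */
	unsigned int rcvlowat;	/* SO_RCVLOWAT threshold */
	bool rmem_pressure;	/* outcome of the buffer-pressure test */
	unsigned int rwnd;	/* receive window being advertised */
	unsigned int rcv_mss;	/* estimated size of the peer's segments */
};

/* Before: suppress the wakeup whenever conditions #1 and #2 hold. */
static bool wake_old(const struct rcv_model *s)
{
	if (s->avail < s->rcvlowat && !s->rmem_pressure)
		return false;
	return true;
}

/* After: suppress it only while the advertised window exceeds rcv_mss. */
static bool wake_new(const struct rcv_model *s)
{
	if (s->avail < s->rcvlowat && !s->rmem_pressure &&
	    s->rwnd > s->rcv_mss)
		return false;
	return true;
}
```

With a zero advertised window and too few queued bytes, `wake_old()` leaves the reader asleep while `wake_new()` wakes it so the queue can be drained and the window reopened.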
      
      Fixes: 03f45c88 ("tcp: avoid extra wakeups for SO_RCVLOWAT users")
      Signed-off-by: Arjun Roy <arjunroy@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20201023184709.217614-1-arjunroy.kdev@gmail.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      435ccfa8
    • Michael Ellerman's avatar
      net: ucc_geth: Drop extraneous parentheses in comparison · dab23422
      Michael Ellerman authored
      
      
      Clang warns about the extra parentheses in this comparison:
      
        drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
        warning: equality comparison with extraneous parentheses
          if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
               ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      It seems clear the intent here is to do a comparison not an
      assignment, so drop the extra parentheses to avoid any confusion.
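For illustration, the cleaned-up form looks like the sketch below. The enum and function are hypothetical stand-ins, not the driver's types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PHY interface mode enumeration. */
enum model_phy_mode { MODEL_MODE_MII, MODEL_MODE_SGMII };

static bool model_is_sgmii(enum model_phy_mode mode)
{
	/*
	 * Single parentheses: plainly a comparison. Writing
	 * "if ((mode == MODEL_MODE_SGMII))" behaves identically, but Clang
	 * flags it (-Wparentheses-equality) because doubled parentheses are
	 * the conventional way to mark an intentional assignment, as in
	 * "if ((a = b))".
	 */
	if (mode == MODEL_MODE_SGMII)
		return true;
	return false;
}
```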
      
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20201023033236.3296988-1-mpe@ellerman.id.au
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      dab23422
    • Jakub Kicinski's avatar
      Merge branch 'ionic-memory-usage-fixes' · 0c3b7f4b
      Jakub Kicinski authored
      Shannon Nelson says:
      
      ====================
      ionic: memory usage fixes
      
      This patchset addresses some memory leaks and incorrect
      io reads.
      ====================
      
      Link: https://lore.kernel.org/r/20201022235531.65956-1-snelson@pensando.io
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0c3b7f4b
    • Shannon Nelson's avatar
      ionic: fix mem leak in rx_empty · 0c32a28e
      Shannon Nelson authored
      The sentinel descriptor entry was getting missed in the
      traversal of the ring from head to tail, so change to a
      loop from 0 to the end.
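A toy ring shows why a walk bounded by the head and tail indices can leak. This is a sketch under stated assumptions, not the driver's data structure: `ring_model`, `MODEL_RING_SIZE` and both functions are hypothetical, and how a buffer ends up attached to the sentinel slot is not modelled.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_RING_SIZE 8	/* hypothetical ring size */

struct ring_model {
	void *buf[MODEL_RING_SIZE];	/* NULL means no buffer attached */
	unsigned int head;		/* next slot to post */
	unsigned int tail;		/* oldest posted slot */
};

/* Walk only the slots between tail and head; the head slot is skipped. */
static unsigned int empty_between_indices(struct ring_model *r)
{
	unsigned int freed = 0;
	unsigned int i;

	for (i = r->tail; i != r->head; i = (i + 1) % MODEL_RING_SIZE) {
		if (r->buf[i]) {
			free(r->buf[i]);
			r->buf[i] = NULL;
			freed++;
		}
	}
	return freed;
}

/* Walk every slot from 0 to the end, regardless of head and tail. */
static unsigned int empty_all(struct ring_model *r)
{
	unsigned int freed = 0;
	unsigned int i;

	for (i = 0; i < MODEL_RING_SIZE; i++) {
		if (r->buf[i]) {
			free(r->buf[i]);
			r->buf[i] = NULL;
			freed++;
		}
	}
	return freed;
}
```

The full walk costs a few extra iterations over empty slots, but it cannot miss a buffer whatever the index state is.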
      
      Fixes: f1d2e894 ("ionic: use index not pointer for queue tracking")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0c32a28e
    • Shannon Nelson's avatar
      ionic: no rx flush in deinit · 43ecf7b4
      Shannon Nelson authored
      Kmemleak pointed out to us that ionic_rx_flush() is sending
      skbs into napi_gro_XXX with a disabled napi context, and these
      end up getting lost and leaked.  We can safely remove the flush.
      
      Fixes: 0f3154e6 ("ionic: Add Tx and Rx handling")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      43ecf7b4
    • Shannon Nelson's avatar
      ionic: clean up sparse complaints · d701ec32
      Shannon Nelson authored
      
      
      The sparse complaints around the static_asserts were obscuring
      more useful complaints.  So, don't check the static_asserts,
      and fix the remaining sparse complaints.
      
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d701ec32
    • Vinay Kumar Yadav's avatar
      chelsio/chtls: fix tls record info to user · 4f3391ce
      Vinay Kumar Yadav authored
      chtls_pt_recvmsg() receives an skb with a TLS header and a subsequent
      skb with data, and needs to finalize the data copy whenever the next
      skb with a TLS header is available. But here the current TLS header is
      overwritten by the next available TLS header, which ends up corrupting
      the user buffer data. Fix it by finalizing the current record whenever
      the next skb contains a TLS header.
      
      v1->v2:
      - Improved commit message.
      
      Fixes: 17a7d24a ("crypto: chtls - generic handling of data and hdr")
      Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
      Link: https://lore.kernel.org/r/20201022190556.21308-1-vinay.yadav@chelsio.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4f3391ce
    • Alex Elder's avatar
      net: ipa: command payloads already mapped · df833050
      Alex Elder authored
      IPA transactions describe actions to be performed by the IPA
      hardware.  Three cases use IPA transactions:  transmitting a socket
      buffer; providing a page to receive packet data; and issuing an IPA
      immediate command.  An IPA transaction contains a scatter/gather
      list (SGL) to hold the set of actions to be performed.
      
      We map buffers in the SGL for DMA at the time they are added to the
      transaction.  For skb TX transactions, we fill the SGL with a call
      to skb_to_sgvec().  Page RX transactions involve a single page
      pointer, and that is recorded in the SGL with sg_set_page().  In
      both of these cases we then map the SGL for DMA with a call to
      dma_map_sg().
      
      Immediate commands are different.  The payload for an immediate
      command comes from a region of coherent DMA memory, which must
      *not* be mapped for DMA.  For that reason, gsi_trans_cmd_add()
      sort of hand-crafts each SGL entry added to a command transaction.
      
      This patch fixes a problem with the code that crafts the SGL entry
      for an immediate command.  Previously a portion of the SGL entry was
      updated using sg_set_buf().  However this is not valid because it
      includes a call to virt_to_page() on the buffer, but the command
      buffer pointer is not a linear address.
      
      Since we never actually map the SGL for command transactions, there
      are very few fields in the SGL we need to fill.  Specifically, we
      only need to record the DMA address and the length, so they can be
      used by __gsi_trans_commit() to fill a TRE.  We additionally need to
      preserve the SGL flags so for_each_sg() still works.  For that we
      can simply assign a null page pointer for command SGL entries.
      
      Fixes: 9dd441e4 ("soc: qcom: ipa: GSI transactions")
      Reported-by: Stephen Boyd <swboyd@chromium.org>
      Tested-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: Alex Elder <elder@linaro.org>
      Link: https://lore.kernel.org/r/20201022010029.11877-1-elder@linaro.org
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      df833050
    • Linus Torvalds's avatar
      Merge tag 'net-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 3cb12d27
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Cross-tree/merge window issues:
      
         - rtl8150: don't incorrectly assign random MAC addresses; a fix late
           in the 5.9 cycle started depending on a return code from a function
           which changed with the 5.10 PR from the usb subsystem
      
        Current release regressions:
      
         - Revert "virtio-net: ethtool configurable RXCSUM", it was causing
           crashes at probe when control vq was not negotiated/available
      
        Previous release regressions:
      
         - ixgbe: fix probing of multi-port 10 Gigabit Intel NICs with an MDIO
           bus, only first device would be probed correctly
      
         - nexthop: Fix performance regression in nexthop deletion by
           effectively switching from recently added synchronize_rcu() to
           synchronize_rcu_expedited()
      
         - netsec: ignore 'phy-mode' device property on ACPI systems; the
           property is not populated correctly by the firmware, but firmware
           configures the PHY so just keep boot settings
      
        Previous releases - always broken:
      
         - tcp: fix to update snd_wl1 in bulk receiver fast path, addressing
           bulk transfers getting "stuck"
      
         - icmp: randomize the global rate limiter to prevent attackers from
           getting useful signal
      
         - r8169: fix operation under forced interrupt threading, make the
           driver always use hard irqs, even on RT, given the handler is light
           and only wants to schedule napi (and do so through a _irqoff()
           variant, preferably)
      
         - bpf: Enforce pointer id generation for all may-be-null register
           type to avoid pointers erroneously getting marked as null-checked
      
         - tipc: re-configure queue limit for broadcast link
      
         - net/sched: act_tunnel_key: fix OOB write in case of IPv6 ERSPAN
           tunnels
      
         - fix various issues in chelsio inline tls driver
      
        Misc:
      
         - bpf: improve just-added bpf_redirect_neigh() helper api to support
           supplying nexthop by the caller - in case BPF program has already
           done a lookup we can avoid doing another one
      
         - remove unnecessary break statements
      
         - make MPTCP not select IPV6, but rather depend on it"
      
      * tag 'net-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
        tcp: fix to update snd_wl1 in bulk receiver fast path
        net: Properly typecast int values to set sk_max_pacing_rate
        netfilter: nf_fwd_netdev: clear timestamp in forwarding path
        ibmvnic: save changed mac address to adapter->mac_addr
        selftests: mptcp: depends on built-in IPv6
        Revert "virtio-net: ethtool configurable RXCSUM"
        rtnetlink: fix data overflow in rtnl_calcit()
        net: ethernet: mtk-star-emac: select REGMAP_MMIO
        net: hdlc_raw_eth: Clear the IFF_TX_SKB_SHARING flag after calling ether_setup
        net: hdlc: In hdlc_rcv, check to make sure dev is an HDLC device
        bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static
        bpf, selftests: Extend test_tc_redirect to use modified bpf_redirect_neigh()
        bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop
        mptcp: depends on IPV6 but not as a module
        sfc: move initialisation of efx->filter_sem to efx_init_struct()
        mpls: load mpls_gso after mpls_iptunnel
        net/sched: act_tunnel_key: fix OOB write in case of IPv6 ERSPAN tunnels
        net/sched: act_gate: Unlock ->tcfa_lock in tc_setup_flow_action()
        net: dsa: bcm_sf2: make const array static, makes object smaller
        mptcp: MPTCP_IPV6 should depend on IPV6 instead of selecting it
        ...
      3cb12d27
    • Linus Torvalds's avatar
      Merge tag 'gfs2-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 0adc313c
      Linus Torvalds authored
      Pull gfs2 updates from Andreas Gruenbacher:
      
       - Use iomap for non-journaled buffered I/O. This largely eliminates
         buffer heads on filesystems where the block size matches the page
         size. Many thanks to Christoph Hellwig for this patch!
      
       - Fixes for some more journaled data filesystem bugs, found by running
         xfstests with data journaling on for all files (chattr +j $MNT) (Bob
         Peterson)
      
       - gfs2_evict_inode refactoring (Bob Peterson)
      
       - Use the statfs data in the journal during recovery instead of reading
         it in from the local statfs inodes (Abhi Das)
      
       - Several other minor fixes by various people
      
      * tag 'gfs2-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (30 commits)
        gfs2: Recover statfs info in journal head
        gfs2: lookup local statfs inodes prior to journal recovery
        gfs2: Add fields for statfs info in struct gfs2_log_header_host
        gfs2: Ignore subsequent errors after withdraw in rgrp_go_sync
        gfs2: Eliminate gl_vm
        gfs2: Only access gl_delete for iopen glocks
        gfs2: Fix comments to glock_hash_walk
        gfs2: eliminate GLF_QUEUED flag in favor of list_empty(gl_holders)
        gfs2: Ignore journal log writes for jdata holes
        gfs2: simplify gfs2_block_map
        gfs2: Only set PageChecked if we have a transaction
        gfs2: don't lock sd_ail_lock in gfs2_releasepage
        gfs2: make gfs2_ail1_empty_one return the count of active items
        gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe
        gfs2: enhance log_blocks trace point to show log blocks free
        gfs2: add missing log_blocks trace points in gfs2_write_revokes
        gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove parm
        gfs2: add validation checks for size of superblock
        gfs2: use-after-free in sysfs deregistration
        gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump
        ...
      0adc313c
    • Linus Torvalds's avatar
      Merge tag '5.10-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6 · 0613ed91
      Linus Torvalds authored
      Pull cifs updates from Steve French:
      
       - add support for recognizing special file types (char/block/fifo/
         symlink) for files created by Linux on WSL (a format we plan to move
         to as the default for creating special files on Linux, as it has
         advantages over the other current option, the SFU format) in readdir.
      
       - fix double queries to root directory when directory leases not
         supported (e.g. Samba)
      
       - fix querying mode bits (modefromsid mount option) for special file
         types
      
       - stronger encryption (gcm256), disabled by default until tested more
         broadly
      
       - allow querying owner when server reports 'well known SID' on query
         dir with SMB3.1.1 POSIX extensions
      
      * tag '5.10-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (30 commits)
        SMB3: add support for recognizing WSL reparse tags
        cifs: remove bogus debug code
        smb3.1.1: fix typo in compression flag
        cifs: move smb version mount options into fs_context.c
        cifs: move cache mount options to fs_context.ch
        cifs: move security mount options into fs_context.ch
        cifs: add files to host new mount api
        smb3: do not try to cache root directory if dir leases not supported
        smb3: fix stat when special device file and mounted with modefromsid
        cifs: Print the address and port we are connecting to in generic_ip_connect()
        SMB3: Resolve data corruption of TCP server info fields
        cifs: make const array static, makes object smaller
        SMB3.1.1: Fix ids returned in POSIX query dir
        smb3: add dynamic trace point to trace when credits obtained
        smb3.1.1: do not fail if no encryption required but server doesn't support it
        cifs: Return the error from crypt_message when enc/dec key not found.
        smb3.1.1: set gcm256 when requested
        smb3.1.1: rename nonces used for GCM and CCM encryption
        smb3.1.1: print warning if server does not support requested encryption type
        smb3.1.1: add new module load parm enable_gcm_256
        ...
      0613ed91
    • Linus Torvalds's avatar
      Merge tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · c4728cfb
      Linus Torvalds authored
      Pull clone/dedupe/remap code refactoring from Darrick Wong:
       "Move the generic file range remap (aka reflink and dedupe) functions
        out of mm/filemap.c and fs/read_write.c and into fs/remap_range.c to
        reduce clutter in the first two files"
      
      * tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        vfs: move the generic write and copy checks out of mm
        vfs: move the remap range helpers to remap_range.c
        vfs: move generic_remap_checks out of mm
      c4728cfb
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f9a705ad
      Linus Torvalds authored
      Pull KVM updates from Paolo Bonzini:
       "For x86, there is a new alternative and (in the future) more scalable
        implementation of extended page tables that does not need a reverse
        map from guest physical addresses to host physical addresses.
      
        For now it is disabled by default because it is still lacking a few of
        the existing MMU's bells and whistles. However it is a very solid
        piece of work and it is already available for people to hammer on it.
      
        Other updates:
      
        ARM:
         - New page table code for both hypervisor and guest stage-2
         - Introduction of a new EL2-private host context
         - Allow EL2 to have its own private per-CPU variables
         - Support of PMU event filtering
         - Complete rework of the Spectre mitigation
      
        PPC:
         - Fix for running nested guests with in-kernel IRQ chip
         - Fix race condition causing occasional host hard lockup
         - Minor cleanups and bugfixes
      
        x86:
         - allow trapping unknown MSRs to userspace
         - allow userspace to force #GP on specific MSRs
         - INVPCID support on AMD
         - nested AMD cleanup, on demand allocation of nested SVM state
         - hide PV MSRs and hypercalls for features not enabled in CPUID
         - new test for MSR_IA32_TSC writes from host and guest
         - cleanups: MMU, CPUID, shared MSRs
         - LAPIC latency optimizations and bugfixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
        kvm: x86/mmu: NX largepage recovery for TDP MMU
        kvm: x86/mmu: Don't clear write flooding count for direct roots
        kvm: x86/mmu: Support MMIO in the TDP MMU
        kvm: x86/mmu: Support write protection for nesting in tdp MMU
        kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
        kvm: x86/mmu: Support dirty logging for the TDP MMU
        kvm: x86/mmu: Support changed pte notifier in tdp MMU
        kvm: x86/mmu: Add access tracking for tdp_mmu
        kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
        kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
        kvm: x86/mmu: Add TDP MMU PF handler
        kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
        kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
        KVM: Cache as_id in kvm_memory_slot
        kvm: x86/mmu: Add functions to handle changed TDP SPTEs
        kvm: x86/mmu: Allocate and free TDP MMU roots
        kvm: x86/mmu: Init / Uninit the TDP MMU
        kvm: x86/mmu: Introduce tdp_iter
        KVM: mmu: extract spte.h and spte.c
        KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
        ...
      f9a705ad