  1. Jun 04, 2021
  2. Jun 03, 2021
    • ice: track AF_XDP ZC enabled queues in bitmap · e102db78
      Maciej Fijalkowski authored
      Commit c7a21904 ("ice: Remove xsk_buff_pool from VSI structure")
      silently introduced a regression and broke the Tx side of AF_XDP in copy
      mode. xsk_pool on ice_ring is set only based on the existence of the XDP
      prog on the VSI which in turn picks ice_clean_tx_irq_zc to be executed.
      That is not something that should happen for copy mode as it should use
      the regular data path ice_clean_tx_irq.
      
      This results in the following splat when xdpsock is run in txonly or l2fwd
      scenarios in copy mode:
      
      <snip>
      [  106.050195] BUG: kernel NULL pointer dereference, address: 0000000000000030
      [  106.057269] #PF: supervisor read access in kernel mode
      [  106.062493] #PF: error_code(0x0000) - not-present page
      [  106.067709] PGD 0 P4D 0
      [  106.070293] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [  106.074721] CPU: 61 PID: 0 Comm: swapper/61 Not tainted 5.12.0-rc2+ #45
      [  106.081436] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
      [  106.092027] RIP: 0010:xp_raw_get_dma+0x36/0x50
      [  106.096551] Code: 74 14 48 b8 ff ff ff ff ff ff 00 00 48 21 f0 48 c1 ee 30 48 01 c6 48 8b 87 90 00 00 00 48 89 f2 81 e6 ff 0f 00 00 48 c1 ea 0c <48> 8b 04 d0 48 83 e0 fe 48 01 f0 c3 66 66 2e 0f 1f 84 00 00 00 00
      [  106.115588] RSP: 0018:ffffc9000d694e50 EFLAGS: 00010206
      [  106.120893] RAX: 0000000000000000 RBX: ffff88984b8c8a00 RCX: ffff889852581800
      [  106.128137] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88984cd8b800
      [  106.135383] RBP: ffff888123b50001 R08: ffff889896800000 R09: 0000000000000800
      [  106.142628] R10: 0000000000000000 R11: ffffffff826060c0 R12: 00000000000000ff
      [  106.149872] R13: 0000000000000000 R14: 0000000000000040 R15: ffff888123b50018
      [  106.157117] FS:  0000000000000000(0000) GS:ffff8897e0f40000(0000) knlGS:0000000000000000
      [  106.165332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  106.171163] CR2: 0000000000000030 CR3: 000000000560a004 CR4: 00000000007706e0
      [  106.178408] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  106.185653] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  106.192898] PKRU: 55555554
      [  106.195653] Call Trace:
      [  106.198143]  <IRQ>
      [  106.200196]  ice_clean_tx_irq_zc+0x183/0x2a0 [ice]
      [  106.205087]  ice_napi_poll+0x3e/0x590 [ice]
      [  106.209356]  __napi_poll+0x2a/0x160
      [  106.212911]  net_rx_action+0xd6/0x200
      [  106.216634]  __do_softirq+0xbf/0x29b
      [  106.220274]  irq_exit_rcu+0x88/0xc0
      [  106.223819]  common_interrupt+0x7b/0xa0
      [  106.227719]  </IRQ>
      [  106.229857]  asm_common_interrupt+0x1e/0x40
      </snip>
      
      Fix this by introducing a bitmap of queues that are zero-copy enabled.
      Each bit corresponds to a queue id that an xsk pool is being configured
      on; it is set/cleared within ice_xsk_pool_{en,dis}able and checked within
      ice_xsk_pool(). The latter is the function used for deciding which napi
      poll routine is executed.
      The idea is taken from our other drivers such as i40e and ixgbe.
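
A minimal userspace sketch of the bitmap idea, not the driver code. All names here (vsi_sketch, zc_enable, zc_disable, zc_is_enabled, pick_tx_clean, MAX_QUEUES) are hypothetical stand-ins; the real driver uses the kernel bitmap helpers and its own structures. The point is only that the zero-copy Tx clean routine is chosen when the queue's bit is set, not merely because an XDP prog exists:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_QUEUES    256 /* hypothetical upper bound on queue ids */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((MAX_QUEUES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the per-VSI state; not the real struct ice_vsi. */
struct vsi_sketch {
	bool xdp_prog_attached;
	unsigned long zc_queues[BITMAP_WORDS]; /* one bit per queue id */
};

enum tx_clean { TX_CLEAN_REGULAR, TX_CLEAN_ZC };

/* Called when an xsk pool is enabled in zero-copy mode on queue qid. */
void zc_enable(struct vsi_sketch *vsi, unsigned int qid)
{
	assert(qid < MAX_QUEUES);
	vsi->zc_queues[qid / BITS_PER_WORD] |= 1UL << (qid % BITS_PER_WORD);
}

/* Called when the xsk pool on queue qid is disabled. */
void zc_disable(struct vsi_sketch *vsi, unsigned int qid)
{
	assert(qid < MAX_QUEUES);
	vsi->zc_queues[qid / BITS_PER_WORD] &= ~(1UL << (qid % BITS_PER_WORD));
}

bool zc_is_enabled(const struct vsi_sketch *vsi, unsigned int qid)
{
	assert(qid < MAX_QUEUES);
	return (vsi->zc_queues[qid / BITS_PER_WORD] >> (qid % BITS_PER_WORD)) & 1UL;
}

/* An XDP prog alone is not enough: copy mode keeps the regular path. */
enum tx_clean pick_tx_clean(const struct vsi_sketch *vsi, unsigned int qid)
{
	if (vsi->xdp_prog_attached && zc_is_enabled(vsi, qid))
		return TX_CLEAN_ZC;
	return TX_CLEAN_REGULAR;
}
```

With an XDP prog attached but no bit set for the queue (the copy-mode case), pick_tx_clean() in this sketch returns TX_CLEAN_REGULAR.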
      
      Fixes: c7a21904 ("ice: Remove xsk_buff_pool from VSI structure")
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • igc: add correct exception tracing for XDP · 45ce0859
      Magnus Karlsson authored
      Add missing exception tracing to XDP for the various errors that can
      occur. The support was only partial: several errors were not logged,
      which would leave the user confused about where and why the packets
      disappeared.

      Fixes: 73f1071c ("igc: Add support for XDP_TX action")
      Fixes: 4ff32036 ("igc: Add support for XDP_REDIRECT action")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ixgbevf: add correct exception tracing for XDP · faae8142
      Magnus Karlsson authored
      Add missing exception tracing to XDP for the various errors that can
      occur. The support was only partial: several errors were not logged,
      which would leave the user confused about where and why the packets
      disappeared.

      Fixes: 21092e9c ("ixgbevf: Add support for XDP_TX action")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • igb: add correct exception tracing for XDP · 74431c40
      Magnus Karlsson authored
      Add missing exception tracing to XDP for the various errors that can
      occur. The support was only partial: several errors were not logged,
      which would leave the user confused about where and why the packets
      disappeared.

      Fixes: 9cbc948b ("igb: add XDP support")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ixgbe: add correct exception tracing for XDP · 8281356b
      Magnus Karlsson authored
      Add missing exception tracing to XDP for the various errors that can
      occur. The support was only partial: several errors were not logged,
      which would leave the user confused about where and why the packets
      disappeared.

      Fixes: 33fdc82f ("ixgbe: add support for XDP_TX action")
      Fixes: d0bcacd0 ("ixgbe: add AF_XDP zero-copy Rx support")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: add correct exception tracing for XDP · 89d65df0
      Magnus Karlsson authored
      Add missing exception tracing to XDP for the various errors that can
      occur. The support was only partial: several errors were not logged,
      which would leave the user confused about where and why the packets
      disappeared.

      Fixes: efc2214b ("ice: Add support for XDP")
      Fixes: 2d4238f5 ("ice: Add support for AF_XDP")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: add correct exception tracing for XDP · f6c10b48
      Magnus Karlsson authored
      Add missing exception tracing to XDP for the various errors that can
      occur. The support was only partial: several errors were not logged,
      which would leave the user confused about where and why the packets
      disappeared.

      Fixes: 74608d17 ("i40e: add support for XDP_TX action")
      Fixes: 0a714186 ("i40e: add AF_XDP zero-copy Rx support")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • igb: Fix XDP with PTP enabled · 53792608
      Kurt Kanzenbach authored
      When using native XDP with the igb driver, the XDP frame data doesn't point to
      the beginning of the packet. It's off by 16 bytes. Everything works as expected
      with XDP skb mode.
      
      Actually these 16 bytes are used to store the packet timestamps. Therefore, pull
      the timestamp before executing any XDP operations and adjust all other code
      accordingly. The igc driver does it like that as well.
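
A minimal userspace sketch of "pull the timestamp first", under the assumption that the timestamp occupies the first 16 bytes of the received buffer (the offset comes from the text above). The names frame_view, pull_timestamp and TS_HDR_LEN are hypothetical and do not come from the igb driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TS_HDR_LEN 16 /* from the text above: frame data was off by 16 bytes */

/* Hypothetical view of a received buffer, as later handed to an XDP program. */
struct frame_view {
	const uint8_t *data; /* start of what the XDP program will see */
	size_t len;          /* bytes available from data onwards */
};

/*
 * Strip the timestamp header before any XDP processing so that data points
 * at the real start of the packet. Returns 0 on success, -1 if the buffer is
 * too short to hold the header.
 */
int pull_timestamp(struct frame_view *frame, int has_ts_hdr)
{
	if (!has_ts_hdr)
		return 0; /* nothing prepended, frame already starts at the packet */
	if (frame->len < TS_HDR_LEN)
		return -1;
	frame->data += TS_HDR_LEN;
	frame->len -= TS_HDR_LEN;
	return 0;
}
```

Doing the pull once, before the XDP program runs, means every later consumer sees the same adjusted start and length.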
      
      Tested with Intel i210 card and AF_XDP sockets.
      
      Fixes: 9cbc948b ("igb: add XDP support")
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • net: stmmac: fix issue where clk is being unprepared twice · ab00f3e0
      Wong Vee Khee authored
      If MDIO bus registration fails because no external PHY device is
      connected to the MAC, clk_disable_unprepare() is called both in
      stmmac_bus_clks_config() and in intel_eth_pci_probe().

      The second call, in intel_eth_pci_probe(), causes the following:
      
      [   16.578605] intel-eth-pci 0000:00:1e.5: No PHY found
      [   16.583778] intel-eth-pci 0000:00:1e.5: stmmac_dvr_probe: MDIO bus (id: 2) registration failed
      [   16.680181] ------------[ cut here ]------------
      [   16.684861] stmmac-0000:00:1e.5 already disabled
      [   16.689547] WARNING: CPU: 13 PID: 2053 at drivers/clk/clk.c:952 clk_core_disable+0x96/0x1b0
      [   16.697963] Modules linked in: dwc3 iTCO_wdt mei_hdcp iTCO_vendor_support udc_core x86_pkg_temp_thermal kvm_intel marvell10g kvm sch_fq_codel nfsd irqbypass dwmac_intel(+) stmmac uio ax88179_178a pcs_xpcs phylink uhid spi_pxa2xx_platform usbnet mei_me pcspkr tpm_crb mii i2c_i801 dw_dmac dwc3_pci thermal dw_dmac_core intel_rapl_msr libphy i2c_smbus mei tpm_tis intel_th_gth tpm_tis_core tpm intel_th_acpi intel_pmc_core intel_th i915 fuse configfs snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_pcm snd_timer snd soundcore
      [   16.746785] CPU: 13 PID: 2053 Comm: systemd-udevd Tainted: G     U            5.13.0-rc3-intel-lts #76
      [   16.756134] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DRR4 CRB, BIOS ADLIFSI1.R00.1494.B00.2012031421 12/03/2020
      [   16.769465] RIP: 0010:clk_core_disable+0x96/0x1b0
      [   16.774222] Code: 00 8b 05 45 96 17 01 85 c0 7f 24 48 8b 5b 30 48 85 db 74 a5 8b 43 7c 85 c0 75 93 48 8b 33 48 c7 c7 6e 32 cc b7 e8 b2 5d 52 00 <0f> 0b 5b 5d c3 65 8b 05 76 31 18 49 89 c0 48 0f a3 05 bc 92 1a 01
      [   16.793016] RSP: 0018:ffffa44580523aa0 EFLAGS: 00010086
      [   16.798287] RAX: 0000000000000000 RBX: ffff8d7d0eb70a00 RCX: 0000000000000000
      [   16.805435] RDX: 0000000000000002 RSI: ffffffffb7c62d5f RDI: 00000000ffffffff
      [   16.812610] RBP: 0000000000000287 R08: 0000000000000000 R09: ffffa445805238d0
      [   16.819759] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8d7d0eb70a00
      [   16.826904] R13: ffff8d7d027370c8 R14: 0000000000000006 R15: ffffa44580523ad0
      [   16.834047] FS:  00007f9882fa2600(0000) GS:ffff8d80a0940000(0000) knlGS:0000000000000000
      [   16.842177] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   16.847966] CR2: 00007f9882bea3d8 CR3: 000000010b126001 CR4: 0000000000370ee0
      [   16.855144] Call Trace:
      [   16.857614]  clk_core_disable_lock+0x1b/0x30
      [   16.861941]  intel_eth_pci_probe.cold+0x11d/0x136 [dwmac_intel]
      [   16.867913]  pci_device_probe+0xcf/0x150
      [   16.871890]  really_probe+0xf5/0x3e0
      [   16.875526]  driver_probe_device+0x64/0x150
      [   16.879763]  device_driver_attach+0x53/0x60
      [   16.883998]  __driver_attach+0x9f/0x150
      [   16.887883]  ? device_driver_attach+0x60/0x60
      [   16.892288]  ? device_driver_attach+0x60/0x60
      [   16.896698]  bus_for_each_dev+0x77/0xc0
      [   16.900583]  bus_add_driver+0x184/0x1f0
      [   16.904469]  driver_register+0x6c/0xc0
      [   16.908268]  ? 0xffffffffc07ae000
      [   16.911598]  do_one_initcall+0x4a/0x210
      [   16.915489]  ? kmem_cache_alloc_trace+0x305/0x4e0
      [   16.920247]  do_init_module+0x5c/0x230
      [   16.924057]  load_module+0x2894/0x2b70
      [   16.927857]  ? __do_sys_finit_module+0xb5/0x120
      [   16.932441]  __do_sys_finit_module+0xb5/0x120
      [   16.936845]  do_syscall_64+0x42/0x80
      [   16.940476]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   16.945586] RIP: 0033:0x7f98830e5ccd
      [   16.949177] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 31 0c 00 f7 d8 64 89 01 48
      [   16.967970] RSP: 002b:00007ffc66b60168 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      [   16.975583] RAX: ffffffffffffffda RBX: 000055885de35ef0 RCX: 00007f98830e5ccd
      [   16.982725] RDX: 0000000000000000 RSI: 00007f98832541e3 RDI: 0000000000000012
      [   16.989868] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000
      [   16.997042] R10: 0000000000000012 R11: 0000000000000246 R12: 00007f98832541e3
      [   17.004222] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffc66b60328
      [   17.011369] ---[ end trace df06a3dab26b988c ]---
      [   17.016062] ------------[ cut here ]------------
      [   17.020701] stmmac-0000:00:1e.5 already unprepared
      
      Remove the stmmac_bus_clks_config() call in stmmac_dvr_probe() and let
      dwmac-intel handle unpreparing and disabling the clk device.
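
A minimal userspace sketch of why the double release warns and why a single owner avoids it. This models the clock as two counters and is not the kernel clk framework; clk_sketch, inner_probe_sketch and outer_probe_sketch are hypothetical names, and the release_in_inner flag only exists to contrast the old and new behaviour:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a clock: counters plus a tally of "already ..." warnings. */
struct clk_sketch {
	int prepare_count;
	int enable_count;
	int warnings;
};

void clk_prepare_enable_sketch(struct clk_sketch *clk)
{
	clk->prepare_count++;
	clk->enable_count++;
}

void clk_disable_unprepare_sketch(struct clk_sketch *clk)
{
	assert(clk->enable_count >= 0 && clk->prepare_count >= 0);
	if (clk->enable_count == 0)
		clk->warnings++; /* models "already disabled" */
	else
		clk->enable_count--;
	if (clk->prepare_count == 0)
		clk->warnings++; /* models "already unprepared" */
	else
		clk->prepare_count--;
}

/* Inner probe step that fails; the flag selects the old (buggy) behaviour. */
int inner_probe_sketch(struct clk_sketch *clk, bool release_in_inner)
{
	if (release_in_inner)
		clk_disable_unprepare_sketch(clk);
	return -1; /* stands in for the MDIO bus registration failure */
}

/* Outer probe: acquires the clock and is its single owner on the error path. */
int outer_probe_sketch(struct clk_sketch *clk, bool release_in_inner)
{
	int ret;

	clk_prepare_enable_sketch(clk);
	ret = inner_probe_sketch(clk, release_in_inner);
	if (ret)
		clk_disable_unprepare_sketch(clk);
	return ret;
}
```

In this model, releasing in both places yields two warnings (one for disable, one for unprepare), while releasing only in the outer probe yields none.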
      
      Fixes: 5ec55823 ("net: stmmac: add clocks management for gmac driver")
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
      Reviewed-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipconfig: Don't override command-line hostnames or domains · b508d5fb
      Josh Triplett authored
      
      
      If the user specifies a hostname or domain name as part of the ip=
      command-line option, preserve it and don't overwrite it with one
      supplied by DHCP/BOOTP.
      
      For instance, ip=::::myhostname::dhcp will use "myhostname" rather than
      ignoring and overwriting it.
      
      Fix the comment on ic_bootp_string that suggests it only copies a string
      "if not already set"; it doesn't have any such logic.
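
A minimal userspace sketch of the "command line wins" rule described above. It is not the ipconfig code: host_cfg, set_from_cmdline and apply_dhcp_hostname are hypothetical names, and only the hostname (not the domain) is modelled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical configuration state; not the kernel's ipconfig variables. */
struct host_cfg {
	char hostname[64];
	bool hostname_set; /* true once any source has provided a hostname */
};

/* Hostname given on the command line, e.g. the "myhostname" field of ip=. */
void set_from_cmdline(struct host_cfg *cfg, const char *name)
{
	snprintf(cfg->hostname, sizeof(cfg->hostname), "%s", name);
	cfg->hostname_set = true;
}

/* Hostname offered by DHCP/BOOTP: used only if nothing was set earlier. */
void apply_dhcp_hostname(struct host_cfg *cfg, const char *offered)
{
	if (cfg->hostname_set)
		return; /* preserve the earlier value */
	snprintf(cfg->hostname, sizeof(cfg->hostname), "%s", offered);
	cfg->hostname_set = true;
}
```

The "if not already set" check lives in the caller-facing function rather than in a generic string copy, which matches the comment fix described above.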
      
      Signed-off-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2021-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · dd627662
      David S. Miller authored
      
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2021-06-01
      
      This series introduces some fixes to mlx5 driver.
      Please pull and let me know if there is any problem.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks · ff40e510
      Daniel Borkmann authored
      Commit 59438b46 ("security,lockdown,selinux: implement SELinux lockdown")
      added an implementation of the locked_down LSM hook to SELinux, with the aim
      of restricting which domains are allowed to perform operations that would breach
      lockdown. This indirectly also gets the audit subsystem involved to report
      events. The latter is problematic, as reported by Ondrej and Serhei, since it
      can bring down the whole system via audit:

        1) The audit events that are triggered due to calls to security_locked_down()
           can OOM kill a machine, see details below [0].
      
        2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit()
           when trying to wake up kauditd, for example, when using trace_sched_switch()
           tracepoint, see details in [1]. Triggering this was not via some hypothetical
           corner case, but with existing tools like runqlat & runqslower from bcc, for
           example, which make use of this tracepoint. Rough call sequence goes like:
      
           rq_lock(rq) -> -------------------------+
             trace_sched_switch() ->               |
               bpf_prog_xyz() ->                   +-> deadlock
                 selinux_lockdown() ->             |
                   audit_log_end() ->              |
                     wake_up_interruptible() ->    |
                       try_to_wake_up() ->         |
                         rq_lock(rq) --------------+
      
      What's worse is that the intention of 59438b46 to further restrict lockdown
      settings for specific applications in respect to the global lockdown policy is
      completely broken for BPF. The SELinux policy rule for the current lockdown check
      looks something like this:
      
        allow <who> <who> : lockdown { <reason> };
      
      However, this doesn't match the 'current' task in which security_locked_down()
      is executed. For example, httpd does a syscall. There is a tracing program attached
      to the syscall which triggers a BPF program to run, which ends up doing a
      bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does
      the permission check against 'current', that is, httpd in this example. httpd
      has literally zero relation to this tracing program, and it would be nonsensical
      having to write an SELinux policy rule against httpd to let the tracing helper
      pass. The policy in this case needs to be against the entity that is installing
      the BPF program. For example, if bpftrace would generate a histogram of syscall
      counts by user space application:
      
        bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
      
      bpftrace would then go and generate a BPF program from this internally. One way
      of doing it [for the sake of the example] could be to call bpf_get_current_task()
      helper and then access current->comm via one of bpf_probe_read_kernel{,_str}()
      helpers. So the program itself has nothing to do with httpd or any other random
      app doing a syscall here. The BPF program _explicitly initiated_ the lockdown
      check. The allow/deny policy belongs in the context of bpftrace: meaning, you
      want to grant bpftrace access to use these helpers, but other tracers on the
      system like my_random_tracer _not_.
      
      Therefore fix all three issues at the same time by taking a completely different
      approach for the security_locked_down() hook, that is, move the check into the
      program verification phase where we actually retrieve the BPF func proto. This
      also reliably gets the task (current) that is trying to install the BPF tracing
      program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since
      we're moving this out of the BPF helper's fast-path which can be called several
      millions of times per second.
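
A minimal userspace sketch of moving the check from the helper's fast path to the point where the helper is looked up at program load. It is not the kernel code: loader_sketch, locked_down_sketch, get_func_proto_sketch and probe_read_sketch are hypothetical names, and the counter exists only to show how often the check runs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the task that installs the program (e.g. a tracer). */
struct loader_sketch {
	bool may_read_kernel; /* what the policy grants to this loader */
};

unsigned long lockdown_checks; /* counts how often the policy is consulted */

/* Stand-in for the lockdown hook: true means the operation is denied. */
bool locked_down_sketch(const struct loader_sketch *loader)
{
	lockdown_checks++;
	return !loader->may_read_kernel;
}

typedef long (*helper_fn)(void);

/* Fast path: may run millions of times per second, performs no policy check. */
long probe_read_sketch(void)
{
	return 0;
}

/*
 * Load/verification time: the check runs once, against the loader, and a
 * denied loader simply does not get the helper.
 */
helper_fn get_func_proto_sketch(const struct loader_sketch *loader)
{
	return locked_down_sketch(loader) ? NULL : probe_read_sketch;
}
```

In this model the check is attributed to whoever loads the program, and calling the helper afterwards never touches the policy or audit path.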
      
      The check is then also in line with other security_locked_down() hooks in the
      system where the enforcement is performed at open/load time, for example,
      open_kcore() for /proc/kcore access or module_sig_check() for module signatures
      just to pick a few random ones. What's out of scope in the fix as well as in
      other security_locked_down() hook locations /outside/ of BPF subsystem is that
      if the lockdown policy changes on the fly there is no retrospective action.
      This requires a different discussion, potentially complex infrastructure, and
      it's also not clear whether this can be solved generically. Either way, it is
      out of scope for a suitable stable fix which this one is targeting. Note that
      the breakage is specifically on 59438b46 where it started to rely on 'current'
      as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5 ("bpf:
      Restrict bpf when kernel lockdown is in confidentiality mode").
      
      [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says:
      
        I starting seeing this with F-34. When I run a container that is traced with
        BPF to record the syscalls it is doing, auditd is flooded with messages like:
      
        type=AVC msg=audit(1619784520.593:282387): avc:  denied  { confidentiality }
          for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM"
            scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0
              tclass=lockdown permissive=0
      
        This seems to be leading to auditd running out of space in the backlog buffer
        and eventually OOMs the machine.
      
        [...]
        auditd running at 99% CPU presumably processing all the messages, eventually I get:
        Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
        Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64
        Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
        Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1
        Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
        [...]
      
      [1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/,
          Serhei Makarov says:
      
        Upstream kernel 5.11.0-rc7 and later was found to deadlock during a
        bpf_probe_read_compat() call within a sched_switch tracepoint. The problem
        is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend
        testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on
        ppc64le. Example stack trace:
      
        [...]
        [  730.868702] stack backtrace:
        [  730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1
        [  730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
        [  730.873278] Call Trace:
        [  730.873770]  dump_stack+0x7f/0xa1
        [  730.874433]  check_noncircular+0xdf/0x100
        [  730.875232]  __lock_acquire+0x1202/0x1e10
        [  730.876031]  ? __lock_acquire+0xfc0/0x1e10
        [  730.876844]  lock_acquire+0xc2/0x3a0
        [  730.877551]  ? __wake_up_common_lock+0x52/0x90
        [  730.878434]  ? lock_acquire+0xc2/0x3a0
        [  730.879186]  ? lock_is_held_type+0xa7/0x120
        [  730.880044]  ? skb_queue_tail+0x1b/0x50
        [  730.880800]  _raw_spin_lock_irqsave+0x4d/0x90
        [  730.881656]  ? __wake_up_common_lock+0x52/0x90
        [  730.882532]  __wake_up_common_lock+0x52/0x90
        [  730.883375]  audit_log_end+0x5b/0x100
        [  730.884104]  slow_avc_audit+0x69/0x90
        [  730.884836]  avc_has_perm+0x8b/0xb0
        [  730.885532]  selinux_lockdown+0xa5/0xd0
        [  730.886297]  security_locked_down+0x20/0x40
        [  730.887133]  bpf_probe_read_compat+0x66/0xd0
        [  730.887983]  bpf_prog_250599c5469ac7b5+0x10f/0x820
        [  730.888917]  trace_call_bpf+0xe9/0x240
        [  730.889672]  perf_trace_run_bpf_submit+0x4d/0xc0
        [  730.890579]  perf_trace_sched_switch+0x142/0x180
        [  730.891485]  ? __schedule+0x6d8/0xb20
        [  730.892209]  __schedule+0x6d8/0xb20
        [  730.892899]  schedule+0x5b/0xc0
        [  730.893522]  exit_to_user_mode_prepare+0x11d/0x240
        [  730.894457]  syscall_exit_to_user_mode+0x27/0x70
        [  730.895361]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [...]
      
      Fixes: 59438b46 ("security,lockdown,selinux: implement SELinux lockdown")
      Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
      Reported-by: Jakub Hrozek <jhrozek@redhat.com>
      Reported-by: Serhei Makarov <smakarov@redhat.com>
      Reported-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: James Morris <jamorris@linux.microsoft.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Frank Eigler <fche@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net
  3. Jun 02, 2021