  1. Jun 19, 2023
  2. Jun 15, 2023
    • selftests: tty: add selftest for tty timestamp updates · e8cc3348
      Michal Sekletar authored
      
      
      Add a new test case which checks that timestamp updates on the actual
      terminal character device (e.g. /dev/pts/0) happen even if the terminal
      is accessed via the magic /dev/tty file.
      
      Signed-off-by: Michal Sekletar <msekleta@redhat.com>
      Message-ID: <20230613172107.78138-2-msekleta@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8cc3348
    • tty: tty_io: update timestamps on all device nodes · 360c11e2
      Michal Sekletar authored
      
      
      User space applications watch for timestamp changes on character device
      files in order to determine the idle time of a given terminal session. For
      example, the "w" program uses this information to populate the IDLE column
      of its output [1]. Similarly, systemd-logind has an optional feature where
      it uses the atime of the tty character device to determine if there was
      activity on the terminal associated with logind's session object. If
      there was no activity for a configured period of time, logind will
      terminate the session [2].
      
      Normally (e.g. with bash running on the terminal), use of the terminal
      updates the timestamps (atime and mtime) on the corresponding terminal
      character device. However, if the terminal, e.g. /dev/pts/0, is accessed
      through the magic character device /dev/tty, the access changes the
      state of the terminal but the timestamps on the device that corresponds
      to the terminal (/dev/pts/0) are not updated.

      This patch makes sure that we update timestamps on *all* character
      devices that correspond to the given tty, because outside observers (w,
      systemd-logind) may be checking these timestamps. They cannot check
      timestamps on /dev/tty, as that has per-process meaning.
      
      [1] https://gitlab.com/procps-ng/procps/-/blob/v4.0.0/w.c#L286
      [2] https://github.com/systemd/systemd/blob/v252/NEWS#L477
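
As an illustration of how an outside observer might derive idle time from these timestamps, here is a minimal user-space sketch. The helper name `idle_seconds` and its `now` parameter are hypothetical, and the actual logic in w or systemd-logind may differ; the sketch only assumes that idle time is the difference between the current time and the atime of the terminal's device node.

```python
import os
import time


def idle_seconds(tty_path, now=None):
    """Return seconds since the last access (atime) of a tty device node.

    Hypothetical helper for illustration; not taken from w or systemd-logind.
    """
    if now is None:
        now = time.time()
    st = os.stat(tty_path)
    # Clamp at zero in case the clock and the timestamp disagree slightly.
    return max(0.0, now - st.st_atime)


if __name__ == "__main__":
    # Only meaningful when stdin is a terminal, e.g. /dev/pts/0.
    if os.isatty(0):
        path = os.ttyname(0)
        print(f"{path}: idle for {idle_seconds(path):.0f}s")
```

An observer like this has to stat the real device node (such as /dev/pts/0), which is why accesses made through /dev/tty need to update that node's timestamps as well.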
      
      Signed-off-by: Michal Sekletar <msekleta@redhat.com>
      Message-ID: <20230613172107.78138-1-msekleta@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      360c11e2
    • tty: fix hang on tty device with no_room set · 4903fde8
      Hui Li authored
      It is possible to hang pty devices in the following case: the reader was
      blocking at epoll on the master side, the writer was sleeping at
      wait_woken inside n_tty_write on the slave side, and the write buffer
      on tty_port was full. We found that the reader and writer would
      never be woken again and blocked forever.
      
      The problem was caused by a race between reader and kworker:
      n_tty_read(reader):  n_tty_receive_buf_common(kworker):
      copy_from_read_buf()|
                          |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
                          |room <= 0
      n_tty_kick_worker() |
                          |ldata->no_room = true
      
      After writing to the slave device, the writer wakes up the kworker to
      flush data on tty_port to the reader, and the kworker finds that the
      reader has no room to store data, so room <= 0 is met. At this moment,
      the reader consumes all the data in the read buffer and calls
      n_tty_kick_worker, which checks ldata->no_room; it is still false, so
      the reader quits reading. Then the kworker sets ldata->no_room = true
      and quits too.
      
      If the write buffer is not full, the writer will wake the kworker to
      flush data again after subsequent writes, but if the write buffer is
      full and the writer goes to sleep, the kworker will never be woken again
      and the tty device is blocked.

      This problem can be solved with a check of the read buffer size inside
      n_tty_receive_buf_common: if the read buffer is empty and ldata->no_room
      is true, a call to n_tty_kick_worker is necessary to keep flushing
      data to the reader.
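
The interleaving described above can be replayed deterministically in a toy model. This is only a sketch under simplifying assumptions: `Ldata`, `kick_worker`, `run_interleaving` and `BUF_SIZE` are hypothetical stand-ins for the kernel's structures, and real locking, barriers and the tty_port buffer are left out.

```python
from dataclasses import dataclass

BUF_SIZE = 4  # small stand-in for N_TTY_BUF_SIZE


@dataclass
class Ldata:
    read_head: int = 0
    read_tail: int = 0
    no_room: bool = False
    worker_kicks: int = 0  # how many times the worker was restarted


def kick_worker(ld):
    """Restart the worker only if it previously reported no room."""
    if ld.no_room:
        ld.no_room = False
        ld.worker_kicks += 1


def run_interleaving(with_fix):
    ld = Ldata(read_head=BUF_SIZE, read_tail=0)  # read buffer is full

    # kworker: computes the free room and finds none.
    room = BUF_SIZE - (ld.read_head - ld.read_tail)

    # reader: drains the buffer, then kicks; no_room is still False,
    # so nothing happens.
    ld.read_tail = ld.read_head
    kick_worker(ld)

    # kworker: only now records that there was no room.
    if room <= 0:
        ld.no_room = True
        # Modelled fix: if the read buffer turned out to be empty,
        # kick the worker from here instead of relying on the reader.
        if with_fix and ld.read_head == ld.read_tail:
            kick_worker(ld)
    return ld


if __name__ == "__main__":
    print("without fix:", run_interleaving(False))
    print("with fix:   ", run_interleaving(True))
```

Without the modelled fix, the run ends with an empty read buffer, no_room set and no worker restart, which corresponds to the hang; with it, the worker is restarted once.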
      
      Cc: <stable@vger.kernel.org>
      Fixes: 42458f41 ("n_tty: Ensure reader restarts worker for next reader")
      Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Signed-off-by: Hui Li <caelli@tencent.com>
      Message-ID: <1680749090-14106-1-git-send-email-caelli@tencent.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4903fde8
    • serial: core: fix -EPROBE_DEFER handling in init · cef09673
      Dan Carpenter authored
      The -EPROBE_DEFER error path in serial_base_device_init() is a bit
      awkward.  Before the call to device_initialize(dev), we need to
      manually release all the device resources.  After the call, we
      need to call put_device() to release the resources.  Doing either one
      wrong will result in a leak or a use after free.
      
      So let's wait to return -EPROBE_DEFER until after the call to
      device_initialize(dev) so that way callers do not have to handle
      -EPROBE_DEFER as a special case.  Now callers can just use put_device()
      for clean up.
      
      The second issue with the -EPROBE_DEFER path is that deferring is not
      supposed to be a fatal error; instead it's a normal part of the
      init process and the kernel recovers from it automatically.  That means
      we should not print an error message but just a debug message on this
      path.
      
      Fixes: 53991424 ("serial: core: Fix probing serial_base_bus devices")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Tony Lindgren <tony@atomide.com>
      Message-ID: <18318adb-ab2c-4dcc-9f96-498a13d16b80@moroto.mountain>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cef09673
    • serial: 8250_omap: Use force_suspend and resume for system suspend · 20a41a62
      Tony Lindgren authored
      We should not rely on autosuspend timeout for system suspend. Instead,
      let's use force_suspend and force_resume functions. Otherwise the serial
      port controller device may not be idled on suspend.
      
      As we are doing a register write on suspend to configure the serial port,
      we still need to runtime PM resume the port on suspend.
      
      While at it, let's switch to pm_runtime_resume_and_get() and check for
      errors returned. Let's also add the missing line break before return to
      the suspend function.
      
      Fixes: 09d8b2bd ("serial: 8250: omap: Provide ability to enable/disable UART as wakeup source")
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Tested-by: Dhruva Gole <d-gole@ti.com>
      Message-ID: <20230614045922.4798-1-tony@atomide.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      20a41a62
  3. Jun 13, 2023
  4. Jun 06, 2023
  5. Jun 05, 2023
  6. Jun 04, 2023
    • Merge tag 'irq_urgent_for_v6.4_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6f64a5eb
      Linus Torvalds authored
      Pull irq fix from Borislav Petkov:
      
       - Fix open firmware quirks validation so that they don't get applied
         wrongly
      
      * tag 'irq_urgent_for_v6.4_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic: Correctly validate OF quirk descriptors
      6f64a5eb
    • Merge tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 5e89d62e
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Some driver fixes:
         - a regression fix for the verisilicon driver
         - uvcvideo: don't expose unsupported video formats to userspace
         - camss-video: don't zero subdev format after init
         - mediatek: some fixes for 4K decoder formats
         - fix a Sphinx build warning (missing doc for client_caps)
         - some fixes for imx and atomisp staging drivers
      
        And two CEC core fixes:
         - don't set last_initiator if TX in progress
         - disable adapter in cec_devnode_unregister"
      
      * tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: uvcvideo: Don't expose unsupported formats to userspace
        media: v4l2-subdev: Fix missing kerneldoc for client_caps
        media: staging: media: imx: initialize hs_settle to avoid warning
        media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad()
        media: staging: media: atomisp: init high & low vars
        media: cec: core: don't set last_initiator if tx in progress
        media: cec: core: disable adapter in cec_devnode_unregister
        media: mediatek: vcodec: Only apply 4K frame sizes on decoder formats
        media: camss: camss-video: Don't zero subdev format again after initialization
        media: verisilicon: Additional fix for the crash when opening the driver
      5e89d62e
    • Merge tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 209835e8
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
        resolve a number of reported issues. Included in here are:
      
         - iio driver fixes
      
         - fpga driver fixes
      
         - test_firmware bugfixes
      
         - fastrpc driver tiny bugfixes
      
         - MAINTAINERS file updates for some subsystems
      
        All of these have been in linux-next this past week with no reported
        issues"
      
      * tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits)
        test_firmware: fix the memory leak of the allocated firmware buffer
        test_firmware: fix a memory leak with reqs buffer
        test_firmware: prevent race conditions by a correct implementation of locking
        firmware_loader: Fix a NULL vs IS_ERR() check
        MAINTAINERS: Vaibhav Gupta is the new ipack maintainer
        dt-bindings: fpga: replace Ivan Bornyakov maintainership
        MAINTAINERS: update Microchip MPF FPGA reviewers
        misc: fastrpc: reject new invocations during device removal
        misc: fastrpc: return -EPIPE to invocations on device removal
        misc: fastrpc: Reassign memory ownership only for remote heap
        misc: fastrpc: Pass proper scm arguments for secure map request
        iio: imu: inv_icm42600: fix timestamp reset
        iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag
        dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value
        iio: dac: mcp4725: Fix i2c_master_send() return value handling
        iio: accel: kx022a fix irq getting
        iio: bu27034: Ensure reset is written
        iio: dac: build ad5758 driver when AD5758 is selected
        iio: addac: ad74413: fix resistance input processing
        iio: light: vcnl4035: fixed chip ID check
        ...
      209835e8
    • Merge tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 41f3ab2d
      Linus Torvalds authored
      Pull driver core fixes from Greg KH:
       "Here are two small driver core cacheinfo fixes for 6.4-rc5 that
        resolve a number of reported issues with that file. These changes have
        been in linux-next this past week with no reported problems"
      
      * tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug
        drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
      41f3ab2d
    • Merge tag 'tty-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 12c2f77b
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small tty/serial driver fixes for 6.4-rc5 that have all
        been in linux-next this past week with no reported problems. Included
        in here are:
      
         - 8250_tegra driver bugfix
      
         - fsl uart driver bugfixes
      
         - Kconfig fix for dependency issue
      
         - dt-bindings fix for the 8250_omap driver"
      
      * tag 'tty-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        dt-bindings: serial: 8250_omap: add rs485-rts-active-high
        serial: cpm_uart: Fix a COMPILE_TEST dependency
        soc: fsl: cpm1: Fix TSA and QMC dependencies in case of COMPILE_TEST
        tty: serial: fsl_lpuart: use UARTCTRL_TXINV to send break instead of UARTCTRL_SBK
        serial: 8250_tegra: Fix an error handling path in tegra_uart_probe()
      12c2f77b
    • Merge tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 8b435e40
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some USB driver and core fixes for 6.4-rc5. Most of these are
        tiny driver fixes, including:
      
         - udc driver bugfix
      
         - f_fs gadget driver bugfix
      
         - cdns3 driver bugfix
      
         - typec bugfixes
      
        But the "big" thing in here is a fix yet-again for how the USB buffers
        are handled from userspace when dealing with DMA issues. The changes
        were discussed a lot, and tested a lot, on the list, and acked by the
        relevant mm maintainers and have been in linux-next all this past week
        with no reported problems"
      
      * tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: tps6598x: Fix broken polling mode after system suspend/resume
        mm: page_table_check: Ensure user pages are not slab pages
        mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM
        usb: usbfs: Use consistent mmap functions
        usb: usbfs: Enforce page requirements for mmap
        dt-bindings: usb: snps,dwc3: Fix "snps,hsphy_interface" type
        usb: gadget: udc: fix NULL dereference in remove()
        usb: gadget: f_fs: Add unbind event before functionfs_unbind
        usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM
      8b435e40
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b066935b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Address some fallout of the locking rework, this time affecting the
           way the vgic is configured
      
         - Fix an issue where the page table walker frees a subtree and then
           proceeds with walking what it has just freed...
      
         - Check that a given PA donated to the guest is actually memory (only
           affecting pKVM)
      
         - Correctly handle MTE CMOs by Set/Way
      
         - Fix the reported address of a watchpoint forwarded to userspace
      
         - Fix the freeing of the root of stage-2 page tables
      
         - Stop creating spurious PMU events to perform detection of the
           default PMU and use the existing PMU list instead
      
        x86:
      
         - Fix a memslot lookup bug in the NX recovery thread that could
           theoretically let userspace bypass the NX hugepage mitigation
      
         - Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
      
         - Account exit stats for fastpath VM-Exits that never leave the super
           tight run-loop
      
         - Fix an out-of-bounds bug in the optimized APIC map code, and add a
           regression test for the race"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: selftests: Add test for race in kvm_recalculate_apic_map()
        KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
        KVM: x86: Account fastpath-only VM-Exits in vCPU stats
        KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
        KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
        KVM: arm64: Document default vPMU behavior on heterogeneous systems
        KVM: arm64: Iterate arm_pmus list to probe for default PMU
        KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
        KVM: arm64: Populate fault info for watchpoint
        KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
        KVM: arm64: Handle trap of tagged Set/Way CMOs
        arm64: Add missing Set/Way CMO encodings
        KVM: arm64: Prevent unconditional donation of unmapped regions from the host
        KVM: arm64: vgic: Fix a comment
        KVM: arm64: vgic: Fix locking comment
        KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
        KVM: arm64: vgic: Fix a circular locking issue
      b066935b
    • Merge tag 'powerpc-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 9455b4b6
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix link errors in new aes-gcm-p10 code when built-in with other
         drivers
      
       - Limit number of TCEs passed to H_STUFF_TCE hcall as per spec
      
       - Use KSYM_NAME_LEN in xmon array size to avoid possible OOB write
      
      Thanks to Gaurav Batra, Maninder Singh, and Vishal Chourasia.
      
      * tag 'powerpc-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/xmon: Use KSYM_NAME_LEN in array size
        powerpc/iommu: Limit number of TCEs to 512 for H_STUFF_TCE hcall
        powerpc/crypto: Fix aes-gcm-p10 link errors
      9455b4b6
    • Merge tag 'kvm-x86-fixes-6.4' of https://github.com/kvm-x86/linux into HEAD · f211b450
      Paolo Bonzini authored
      KVM x86 fixes for 6.4
      
       - Fix a memslot lookup bug in the NX recovery thread that could
         theoretically let userspace bypass the NX hugepage mitigation
      
       - Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
      
       - Account exit stats for fastpath VM-Exits that never leave the super
         tight run-loop
      
       - Fix an out-of-bounds bug in the optimized APIC map code, and add a
         regression test for the race.
      f211b450
    • Merge tag 'kvmarm-fixes-6.4-3' of... · 49661a52
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.4, take #3
      
      - Fix the reported address of a watchpoint forwarded to userspace
      
      - Fix the freeing of the root of stage-2 page tables
      
      - Stop creating spurious PMU events to perform detection of the
        default PMU and use the existing PMU list instead.
      49661a52
    • Merge tag 'kvmarm-fixes-6.4-2' of... · 26f31498
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.4, take #2
      
      - Address some fallout of the locking rework, this time affecting
        the way the vgic is configured
      
      - Fix an issue where the page table walker frees a subtree and
        then proceeds with walking what it has just freed...
      
      - Check that a given PA donated to the guest is actually memory
        (only affecting pKVM)
      
      - Correctly handle MTE CMOs by Set/Way
      26f31498
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · e5282a7d
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Five fixes, all in drivers.
      
        The most extensive is the target change to fix the hang in the login
        code, which involves changing timers from per login to per connection"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: stex: Fix gcc 13 warnings
        scsi: qla2xxx: Fix NULL pointer dereference in target mode
        scsi: target: iscsi: Prevent login threads from racing between each other
        scsi: target: iscsi: Remove unused transport_timer
        scsi: target: iscsi: Fix hang in the iSCSI login code
      e5282a7d
    • Merge tag 'leds-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/linux · d1b65edf
      Linus Torvalds authored
      
      
      Pull LED fix from Johan Hovold:
       "Here's a fix for a regression in 6.4-rc1 which broke the backlight on
        machines such as the Lenovo ThinkPad X13s"
      
      Acked-by: Lee Jones <lee@kernel.org>
      Link: https://lore.kernel.org/lkml/20230602091928.GR449117@google.com/
      
      * tag 'leds-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/linux:
        leds: qcom-lpg: Fix PWM period limits
      d1b65edf
  7. Jun 03, 2023
    • leds: qcom-lpg: Fix PWM period limits · b05d3946
      Bjorn Andersson authored
      The introduction of high resolution PWM support changed the order of the
      operations in the calculation of min and max period. The result in both
      divisions is in most cases a truncation to 0, which limits the period to
      the range of [0, 0].
      
      Both numerators (and denominators) are within 64 bits, so the whole
      expression can be put directly into the div64_u64, instead of doing it
      partially.
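
The truncation can be reproduced with plain integer arithmetic. The sketch below is a model rather than the driver code: the function names are hypothetical, and the 9-bit resolution and 19.2 MHz clock rate are illustrative assumptions chosen so that the quotient is below 1.

```python
NSEC_PER_SEC = 1_000_000_000


def min_period_divide_first(resolution_bits, clk_rate_hz):
    # Integer division happens before scaling to nanoseconds, so any
    # quotient below 1 is truncated to 0 and stays 0 after multiplying.
    return NSEC_PER_SEC * ((1 << resolution_bits) // clk_rate_hz)


def min_period_multiply_first(resolution_bits, clk_rate_hz):
    # Scale first, then divide once; the numerator must fit in 64 bits
    # for the equivalent div64_u64() call to be valid.
    numerator = NSEC_PER_SEC * (1 << resolution_bits)
    assert numerator < 2**64
    return numerator // clk_rate_hz


if __name__ == "__main__":
    bits, clk = 9, 19_200_000  # illustrative values only
    print("divide first:  ", min_period_divide_first(bits, clk))
    print("multiply first:", min_period_multiply_first(bits, clk))
```

With these example inputs the divide-first form yields 0 while the multiply-first form yields 26666 ns, which shows how the reordered calculation collapsed the allowed range.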
      
      Fixes: b00d2ed3 ("leds: rgb: leds-qcom-lpg: Add support for high resolution PWM")
      Reviewed-by: Caleb Connolly <caleb.connolly@linaro.org>
      Tested-by: Steev Klimaszewski <steev@kali.org>
      Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
      Acked-by: Lee Jones <lee@kernel.org>
      Tested-by: Johan Hovold <johan+linaro@kernel.org>
      Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
      Link: https://lore.kernel.org/r/20230515162604.649203-1-quic_bjorande@quicinc.com
      Signed-off-by: Johan Hovold <johan@kernel.org>
      b05d3946
    • Merge tag 'probes-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 51f269a6
      Linus Torvalds authored
      Pull probes fixes from Masami Hiramatsu:
      
       - Return NULL if the trace_probe list on trace_probe_event is empty
      
       - selftests/ftrace: Choose testing symbol name for filtering feature
         from sample data instead of fixed symbol
      
      * tag 'probes-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        selftests/ftrace: Choose target function for filter test from samples
        tracing/probe: trace_probe_primary_from_call(): checked list_first_entry
      51f269a6
    • selftests/ftrace: Choose target function for filter test from samples · eb50d0f2
      Masami Hiramatsu (Google) authored
      
      
      Since event-filter-function.tc expects 'exit_mmap()' to call
      'kmem_cache_free()' directly, the test is vulnerable to code
      modifications.

      Choose the target function for the filter test from the sample
      event data so that the test keeps running correctly even if the caller
      function name changes.
      
      Link: https://lore.kernel.org/linux-trace-kernel/167919441260.1922645.18355804179347364057.stgit@mhiramat.roam.corp.google.com/
      
      Link: https://lore.kernel.org/all/CA+G9fYtF-XEKi9YNGgR=Kf==7iRb2FrmEC7qtwAeQbfyah-UhA@mail.gmail.com/
      Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
      Fixes: 7f09d639 ("tracing/selftests: Add test for event filtering on function name")
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      eb50d0f2
    • KVM: selftests: Add test for race in kvm_recalculate_apic_map() · 47d2804b
      Michal Luczaj authored
      
      
      Keep switching between LAPIC_MODE_X2APIC and LAPIC_MODE_DISABLED during
      APIC map construction to hunt for TOCTOU bugs in KVM.  KVM's optimized map
      recalc makes multiple passes over the list of vCPUs, and the calculations
      ignore vCPUs whose APIC is hardware-disabled, i.e. there's a window where
      toggling LAPIC_MODE_DISABLED is quite interesting.
      
      Signed-off-by: Michal Luczaj <mhal@rbox.co>
      Co-developed-by: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/r/20230602233250.1014316-4-seanjc@google.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      47d2804b
    • KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds · 4364b287
      Sean Christopherson authored
      
      
      Bail from kvm_recalculate_phys_map() and disable the optimized map if the
      target vCPU's x2APIC ID is out-of-bounds, i.e. if the vCPU was added
      and/or enabled its local APIC after the map was allocated.  This fixes an
      out-of-bounds access bug in the !x2apic_format path where KVM would write
      beyond the end of phys_map.
      
      Check the x2APIC ID regardless of whether or not x2APIC is enabled,
      as KVM hardcodes x2APIC ID to be the vCPU ID, i.e. it can't change, and
      the map allocation in kvm_recalculate_apic_map() doesn't check for x2APIC
      being enabled, i.e. the check won't get false positives.
      
      Note, this also affects the x2apic_format path, which previously just
      ignored the "x2apic_id > new->max_apic_id" case.  That too is arguably a
      bug fix, as ignoring the vCPU meant that KVM would not send interrupts to
      the vCPU until the next map recalculation.  In practice, that "bug" is
      likely benign as a newly present vCPU/APIC would immediately trigger a
      recalc.  But, there's no functional downside to disabling the map, and
      a future patch will gracefully handle the -E2BIG case by retrying instead
      of simply disabling the optimized map.
      
      Opportunistically add a sanity check on the xAPIC ID size, along with a
      comment explaining why the xAPIC ID is guaranteed to be "good".
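
The shape of the bounds check can be sketched in a toy model. This is not KVM's code: `recalculate_phys_map` here is a hypothetical Python function, the list stands in for phys_map, and the only behaviour modelled is "return -E2BIG instead of writing past the end of the map".

```python
import errno


def recalculate_phys_map(phys_map, x2apic_id, vcpu):
    """Toy model: record vcpu in phys_map, or bail if the ID does not fit."""
    max_apic_id = len(phys_map) - 1
    # The map was sized before this vCPU appeared, so its ID may be larger
    # than anything the allocation accounted for.
    if x2apic_id > max_apic_id:
        return -errno.E2BIG
    phys_map[x2apic_id] = vcpu
    return 0


if __name__ == "__main__":
    phys_map = [None] * 4  # map allocated for APIC IDs 0..3
    print(recalculate_phys_map(phys_map, 2, "vcpu2"))
    print(recalculate_phys_map(phys_map, 7, "vcpu7"))
```

In this model the caller sees a negative errno for the late-added vCPU and can fall back to the slow path, rather than the map being written out of bounds.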
      
      Reported-by: Michal Luczaj <mhal@rbox.co>
      Fixes: 5b84b029 ("KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs")
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230602233250.1014316-2-seanjc@google.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      4364b287
    • KVM: x86: Account fastpath-only VM-Exits in vCPU stats · 8b703a49
      Sean Christopherson authored
      Increment vcpu->stat.exits when handling a fastpath VM-Exit without
      going through any part of the "slow" path.  Not bumping the exits stat
      can result in wildly misleading exit counts, e.g. if the primary reason
      the guest is exiting is to program the TSC deadline timer.
      
      Fixes: 404d5d7b ("KVM: X86: Introduce more exit_fastpath_completion enum values")
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230602011920.787844-2-seanjc@google.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      8b703a49
    • KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK · b2ce8997
      Maciej S. Szmigiero authored
      While testing Hyper-V enabled Windows Server 2019 guests on Zen4 hardware
      I noticed that with vCPU count large enough (> 16) they sometimes froze at
      boot.
      With vCPU count of 64 they never booted successfully - suggesting some kind
      of a race condition.
      
      Since adding "vnmi=0" module parameter made these guests boot successfully
      it was clear that the problem is most likely (v)NMI-related.
      
      Running kvm-unit-tests quickly showed failing NMI-related test cases, like
      "multiple nmi" and "pending nmi" from apic-split, x2apic and xapic tests
      and the NMI parts of eventinj test.
      
      The issue was that once one NMI was being serviced no other NMI was allowed
      to be set pending (NMI limit = 0), which was traced to
      svm_is_vnmi_pending() wrongly testing for the "NMI blocked" flag rather
      than for the "NMI pending" flag.
      
      Fix this by testing for the right flag in svm_is_vnmi_pending().
      Once this is done, the NMI-related kvm-unit-tests pass successfully and
      the Windows guest no longer freezes at boot.
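
The effect of testing the wrong flag can be shown with two bit masks. This is a sketch, not the SVM code: the Python function names are hypothetical and the bit positions (11 for pending, 12 for blocking) are assumptions made for illustration.

```python
V_NMI_PENDING_MASK = 1 << 11   # assumed bit position, for illustration
V_NMI_BLOCKING_MASK = 1 << 12  # assumed bit position, for illustration


def is_vnmi_pending_buggy(int_ctl):
    # Wrong flag: reports "pending" whenever an NMI is being serviced.
    return bool(int_ctl & V_NMI_BLOCKING_MASK)


def is_vnmi_pending_fixed(int_ctl):
    return bool(int_ctl & V_NMI_PENDING_MASK)


if __name__ == "__main__":
    servicing = V_NMI_BLOCKING_MASK  # one NMI in service, none pending
    print("buggy:", is_vnmi_pending_buggy(servicing))
    print("fixed:", is_vnmi_pending_fixed(servicing))
```

While an NMI is in service and nothing is pending, the buggy check already answers "pending", so in this model a second NMI could never be queued, matching the observed NMI limit of 0.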
      
      Fixes: fa4c027a ("KVM: x86: Add support for SVM's Virtual NMI")
      Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/r/be4ca192eb0c1e69a210db3009ca984e6a54ae69.1684495380.git.maciej.szmigiero@oracle.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      b2ce8997
    • KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker · 817fa998
      Sean Christopherson authored
      Factor in the address space (non-SMM vs. SMM) of the target shadow page
      when recovering potential NX huge pages, otherwise KVM will retrieve the
      wrong memslot when zapping shadow pages that were created for SMM.  The
      bug most visibly manifests as a WARN on the memslot being non-NULL, but
      the worst case scenario is that KVM could unaccount the shadow page
      without ensuring KVM won't install a huge page, i.e. if the non-SMM slot
      is being dirty logged, but the SMM slot is not.
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 3911 at arch/x86/kvm/mmu/mmu.c:7015
       kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
       CPU: 1 PID: 3911 Comm: kvm-nx-lpage-re
       RIP: 0010:kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
       RSP: 0018:ffff99b284f0be68 EFLAGS: 00010246
       RAX: 0000000000000000 RBX: ffff99b284edd000 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
       RBP: ffff9271397024e0 R08: 0000000000000000 R09: ffff927139702450
       R10: 0000000000000000 R11: 0000000000000001 R12: ffff99b284f0be98
       R13: 0000000000000000 R14: ffff9270991fcd80 R15: 0000000000000003
       FS:  0000000000000000(0000) GS:ffff927f9f640000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f0aacad3ae0 CR3: 000000088fc2c005 CR4: 00000000003726e0
       Call Trace:
        <TASK>
        __pfx_kvm_nx_huge_page_recovery_worker+0x10/0x10 [kvm]
        kvm_vm_worker_thread+0x106/0x1c0 [kvm]
        kthread+0xd9/0x100
        ret_from_fork+0x2c/0x50
        </TASK>
       ---[ end trace 0000000000000000 ]---
      
      This bug was exposed by commit edbdb43f ("KVM: x86: Preserve TDP MMU
      roots until they are explicitly invalidated"), which allowed KVM to retain
      SMM TDP MMU roots effectively indefinitely.  Before commit edbdb43f,
      KVM would zap all SMM TDP MMU roots and thus all SMM TDP MMU shadow pages
      once all vCPUs exited SMM, which made the window where this bug (recovering
      an SMM NX huge page) could be encountered quite tiny.  To hit the bug, the
      NX recovery thread would have to run while at least one vCPU was in SMM.
      Most VMs typically only use SMM during boot, and so the problematic shadow
      pages were gone by the time the NX recovery thread ran.
      
      Now that KVM preserves TDP MMU roots until they are explicitly invalidated
      (e.g. by a memslot deletion), the window to trigger the bug is effectively
      never closed because most VMMs don't delete memslots after boot (except
      for a handful of special scenarios).
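
The wrong-address-space lookup can be modelled with a small table. Everything here is hypothetical scaffolding (the `MEMSLOTS` layout, the dictionary fields and the function names); it only illustrates the case described above where the non-SMM slot is dirty logged and the SMM slot is not.

```python
# Address space 0 models non-SMM, address space 1 models SMM.
MEMSLOTS = {
    0: [{"base_gfn": 0x100, "npages": 0x100, "dirty_logging": True}],
    1: [{"base_gfn": 0x100, "npages": 0x100, "dirty_logging": False}],
}


def memslot_for(memslots, as_id, gfn):
    """Find the slot covering gfn in the given address space, if any."""
    for slot in memslots[as_id]:
        if slot["base_gfn"] <= gfn < slot["base_gfn"] + slot["npages"]:
            return slot
    return None


def lookup_buggy(memslots, shadow_page):
    # Ignores the shadow page's address space and always uses non-SMM.
    return memslot_for(memslots, 0, shadow_page["gfn"])


def lookup_fixed(memslots, shadow_page):
    return memslot_for(memslots, shadow_page["as_id"], shadow_page["gfn"])


if __name__ == "__main__":
    smm_page = {"gfn": 0x180, "as_id": 1}
    print("buggy sees dirty logging:", lookup_buggy(MEMSLOTS, smm_page)["dirty_logging"])
    print("fixed sees dirty logging:", lookup_fixed(MEMSLOTS, smm_page)["dirty_logging"])
```

For the SMM shadow page, the buggy lookup consults the non-SMM slot and sees dirty logging enabled, while the fixed lookup sees the SMM slot where it is disabled, so the two would reach different decisions about recovering the page.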
      
      Fixes: eb298605 ("KVM: x86/mmu: Do not recover dirty-tracked NX Huge Pages")
      Reported-by: Fabio Coatti <fabio.coatti@gmail.com>
      Closes: https://lore.kernel.org/all/CADpTngX9LESCdHVu_2mQkNGena_Ng2CphWNwsRGSMxzDsTjU2A@mail.gmail.com
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230602010137.784664-1-seanjc@google.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      817fa998