  1. Jan 05, 2024
    • Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe" · 74c4c7d5
      Shin'ichiro Kawasaki authored
      commit b20712e8 upstream.
      
      This reverts commit b28ff7a7.
      
      The commit introduced P2SB device scan and resource cache during the
      boot process to avoid deadlock. But it caused detection failure of
      IDE controllers on old systems [1]. The IDE controllers on old systems
      and P2SB devices on newer systems have the same PCI DEVFN. Confusion
      between the two is suspected to be the cause of the failure. Revert the
      change for now until a proper solution is ready.
      
      Link: https://lore.kernel.org/platform-driver-x86/CABq1_vjfyp_B-f4LAL6pg394bP6nDFyvg110TOLHHb0x4aCPeg@mail.gmail.com/T/#m07b30468d9676fc5e3bb2122371121e4559bb383 [1]
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Link: https://lore.kernel.org/r/20240104114050.3142690-1-shinichiro.kawasaki@wdc.com
      Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing/kprobes: Fix symbol counting logic by looking at modules as well · 7709b16b
      Andrii Nakryiko authored
      commit 926fe783 upstream.
      
      Recent changes to count the number of matching symbols when creating
      a kprobe event failed to take kernel modules into account. As a result,
      kprobes on kernel module symbols break, because the code assumes there
      is no match.
      
      Fix this by calling module_kallsyms_on_each_symbol() in addition to
      kallsyms_on_each_match_symbol() to perform a proper counting.
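      
      The counting fix can be sketched in user space as follows. This is a
      simplified model, not the kernel implementation: the symbol tables are
      plain string arrays, and count_in() and count_matching_symbols() are
      hypothetical helpers standing in for the two kallsyms iterators.

```c
#include <stddef.h>
#include <string.h>

/* Count how many entries of one symbol table match the given name. */
static size_t count_in(const char *const *syms, size_t n, const char *name)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (strcmp(syms[i], name) == 0)
			count++;
	return count;
}

/*
 * Count matches in the core kernel table AND in the module table.
 * Looking only at the core table would report zero matches for a
 * symbol that exists only in a module.
 */
static size_t count_matching_symbols(const char *const *core, size_t ncore,
				     const char *const *mods, size_t nmods,
				     const char *name)
{
	return count_in(core, ncore, name) + count_in(mods, nmods, name);
}
```

      With this model, a name that appears only in the module table is
      counted once instead of being treated as missing.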
      
      Link: https://lore.kernel.org/all/20231027233126.2073148-1-andrii@kernel.org/
      
      Cc: Francis Laniel <flaniel@linux.microsoft.com>
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Fixes: b022f0c7 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Song Liu <song@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Hao Wei Tee <angelsl@in04.sg>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kallsyms: Make module_kallsyms_on_each_symbol generally available · 9dd29534
      Jiri Olsa authored
      commit 73feb8d5 upstream.
      
      Make module_kallsyms_on_each_symbol generally available, so that it
      can be used outside the CONFIG_LIVEPATCH option in following changes.
      
      Rather than adding another ifdef option let's make the function
      generally available (when CONFIG_KALLSYMS and CONFIG_MODULES
      options are defined).
      
      Cc: Christoph Hellwig <hch@lst.de>
      Acked-by: Song Liu <song@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20221025134148.3300700-2-jolsa@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • device property: Allow const parameter to dev_fwnode() · 29cb1657
      Andy Shevchenko authored
      commit b295d484 upstream.
      
      It's not fully correct to take a const parameter pointer to a struct
      and return a non-const pointer to a member of that struct.
      
      Instead, introduce a const version of the dev_fwnode() API which takes
      and returns const pointers and use it where it's applicable.
      
      With this, convert dev_fwnode() to be a macro wrapper on top of const
      and non-const APIs that chooses one based on the type.
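      
      A type-selecting macro of this kind can be sketched with C11 _Generic.
      The structs below are minimal stand-ins, and the helper names
      dev_fwnode_mut() and dev_fwnode_const() are hypothetical; only the
      overall shape (a macro that dispatches on the pointer's constness) is
      taken from the description above.

```c
#include <stddef.h>

/* Minimal stand-ins for the real kernel structures. */
struct fwnode_handle { int id; };
struct device { struct fwnode_handle *fwnode; };

/* Non-const variant: non-const device in, non-const member pointer out. */
static inline struct fwnode_handle *dev_fwnode_mut(struct device *dev)
{
	return dev ? dev->fwnode : NULL;
}

/* Const variant: const device in, const member pointer out. */
static inline const struct fwnode_handle *
dev_fwnode_const(const struct device *dev)
{
	return dev ? dev->fwnode : NULL;
}

/* Pick the variant from the static type of the argument. */
#define dev_fwnode(dev)						\
	_Generic((dev),						\
		 const struct device *: dev_fwnode_const,	\
		 struct device *: dev_fwnode_mut)(dev)
```

      A caller holding a const pointer now gets a const result, so constness
      is no longer silently dropped.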
      
      Suggested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Fixes: aade55c8 ("device property: Add const qualifier to device_get_match_data() parameter")
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Link: https://lore.kernel.org/r/20221004092129.19412-2-andriy.shevchenko@linux.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • spi: Constify spi parameters of chip select APIs · e7b04372
      Geert Uytterhoeven authored
      commit d2f19eec upstream.
      
      The "spi" parameters of spi_get_chipselect() and spi_get_csgpiod() can
      be const.
      
      Fixes: 303feb3c ("spi: Add APIs in spi core to set/get spi->chip_select and spi->cs_gpiod")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Link: https://lore.kernel.org/r/b112de79e7a1e9095a3b6ff22b639f39e39d7748.1678704562.git.geert+renesas@glider.be
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFSD: fix possible oops when nfsd/pool_stats is closed. · f9a01938
      NeilBrown authored
      commit 88956eab upstream.
      
      If /proc/fs/nfsd/pool_stats is open when the last nfsd thread exits, then
      when the file is closed a NULL pointer is dereferenced.
      This is because nfsd_pool_stats_release() assumes that the
      pointer to the svc_serv cannot become NULL while a reference is held.
      
      This used to be the case but a recent patch split nfsd_last_thread() out
      from nfsd_put(), and clearing the pointer is done in nfsd_last_thread().
      
      This is easily reproduced by running
         rpc.nfsd 8 ; ( rpc.nfsd 0;true) < /proc/fs/nfsd/pool_stats
      
      Fortunately nfsd_pool_stats_release() has easy access to the svc_serv
      pointer, and so can call svc_put() on it directly.
      
      Fixes: 9f28a971 ("nfsd: separate nfsd_last_thread() from nfsd_put()")
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ring-buffer: Fix slowpath of interrupted event · 899ac418
      Steven Rostedt (Google) authored
      commit b803d7c6 upstream.
      
      To synchronize the timestamps with the ring buffer reservation, there are
      two timestamps that are saved in the buffer meta data.
      
      1. before_stamp
      2. write_stamp
      
      When the two are equal, the write_stamp is considered valid, as in, it may
      be used to calculate the delta of the next event as the write_stamp is the
      timestamp of the previous reserved event on the buffer.
      
      This is done by the following:
      
       /*A*/	w = current position on the ring buffer
      	before = before_stamp
      	after = write_stamp
      	ts = read current timestamp
      
      	if (before != after) {
      		write_stamp is not valid, force adding an absolute
      		timestamp.
      	}
      
       /*B*/	before_stamp = ts
      
       /*C*/	write = local_add_return(event length, position on ring buffer)
      
      	if (w == write - event length) {
      		/* Nothing interrupted between A and C */
       /*E*/		write_stamp = ts;
      		delta = ts - after
      		/*
      		 * If nothing interrupted again,
      		 * before_stamp == write_stamp and write_stamp
      		 * can be used to calculate the delta for
      		 * events that come in after this one.
      		 */
      	} else {
      
      		/*
      		 * The slow path!
      		 * Was interrupted between A and C.
      		 */
      
      This is where the bug is. We currently have:
      
      		after = write_stamp
      		ts = read current timestamp
      
       /*F*/		if (write == current position on the ring buffer &&
      		    after < ts && cmpxchg(write_stamp, after, ts)) {
      
      			delta = ts - after;
      
      		} else {
      			delta = 0;
      		}
      
      The assumption is that if the current position on the ring buffer hasn't
      moved between C and F, then it also was not interrupted, and that the last
      event written has a timestamp that matches the write_stamp. That is the
      write_stamp is valid.
      
      But this may not be the case:
      
      Suppose a task context event was interrupted by a softirq between B and
      C, the softirq wrote an event that got interrupted by a hard irq between
      C and E, and the hard irq wrote an event (which does not need to be
      interrupted).
      
      We have:
      
       /*B*/ before_stamp = ts of normal context
      
         ---> interrupted by softirq
      
      	/*B*/ before_stamp = ts of softirq context
      
      	  ---> interrupted by hardirq
      
      		/*B*/ before_stamp = ts of hard irq context
      		/*E*/ write_stamp = ts of hard irq context
      
      		/* matches and write_stamp valid */
      	  <----
      
      	/*E*/ write_stamp = ts of softirq context
      
      	/* No longer matches before_stamp, write_stamp is not valid! */
      
         <---
      
       w != write - length, go to slow path
      
      // Right now the order of events in the ring buffer is:
      //
      // |-- softirq event --|-- hard irq event --|-- normal context event --|
      //
      
       after = write_stamp (this is the ts of softirq)
       ts = read current timestamp
      
       if (write == current position on the ring buffer [true] &&
           after < ts [true] && cmpxchg(write_stamp, after, ts) [true]) {
      
      	delta = ts - after  [Wrong!]
      
      The delta should be between the hard irq event and the normal context
      event, but the above logic made the delta between the softirq event and
      the normal context event, where the hard irq event is between the two. This
      will shift all the remaining event timestamps on the sub-buffer
      incorrectly.
      
      The write_stamp is only valid if it matches the before_stamp. The cmpxchg
      does nothing to help this.
      
      Instead, the following logic can be done to fix this:
      
      	before = before_stamp
      	ts = read current timestamp
      	before_stamp = ts
      
      	after = write_stamp
      
      	if (write == current position on the ring buffer &&
      	    after == before && after < ts) {
      
      		delta = ts - after
      
      	} else {
      		delta = 0;
      	}
      
      The above will only use the write_stamp if it still matches before_stamp
      and was tested to not have changed since C.
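      
      The corrected slow-path decision can be modeled in plain C as below.
      This is a single-threaded sketch under stated assumptions, not the
      ring buffer code: struct rb_stamps and slow_path_delta() are
      hypothetical names, and the tail_unchanged argument stands in for the
      "write == current position on the ring buffer" test.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two timestamps kept in the buffer meta data (model only). */
struct rb_stamps {
	uint64_t before_stamp;
	uint64_t write_stamp;
};

/*
 * Slow path: decide which delta an interrupted event may use.
 * write_stamp is trusted only if it still equals the before_stamp
 * that was sampled before this context overwrote it.
 */
static uint64_t slow_path_delta(struct rb_stamps *s, uint64_t ts,
				bool tail_unchanged)
{
	uint64_t before = s->before_stamp;
	uint64_t after;

	s->before_stamp = ts;
	after = s->write_stamp;

	if (tail_unchanged && after == before && after < ts)
		return ts - after;

	return 0;	/* write_stamp is not valid for this event */
}
```

      In the failing scenario above, before_stamp and write_stamp differ
      when the normal context reaches the slow path, so the model returns a
      delta of 0 instead of a delta relative to the softirq event.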
      
      As a bonus, with this logic we do not need any 64-bit cmpxchg() at all!
      
      This means the 32-bit rb_time_t workaround can finally be removed. But
      that's for a later time.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20231218175229.58ec3daf@gandalf.local.home/
      Link: https://lore.kernel.org/linux-trace-kernel/20231218230712.3a76b081@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: dd939425 ("ring-buffer: Do not try to put back write_stamp")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nf_tables: skip set commit for deleted/destroyed sets · 0105571f
      Pablo Neira Ayuso authored
      commit 7315dc1e upstream.
      
      NFT_MSG_DELSET deactivates all elements in the set. Skip
      set->ops->commit() to avoid the unnecessary clone (for the pipapo case)
      as well as the sync GC cycle, which could deactivate expired elements
      in such a set again.
      
      Fixes: 5f68718b ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
      Reported-by: Kevin Rich <kevinrich1337@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ring-buffer: Remove useless update to write_stamp in rb_try_to_discard() · 4768430d
      Steven Rostedt (Google) authored
      commit 083e9f65 upstream.
      
      When filtering is enabled, a temporary buffer is created to place the
      content of the trace event output so that the filter logic can decide
      from the trace event output if the trace event should be filtered out or
      not. If it is to be filtered out, the content in the temporary buffer is
      simply discarded, otherwise it is written into the trace buffer.
      
      But if an interrupt were to come in while a previous event was using that
      temporary buffer, the event written by the interrupt would actually go
      into the ring buffer itself to prevent corrupting the data on the
      temporary buffer. If the event is to be filtered out, the event in the
      ring buffer is discarded, or if it fails to discard because another event
      has already come in, it is turned into padding.
      
      The update to the write_stamp in the rb_try_to_discard() happens after a
      fix was made to force the next event after the discard to use an absolute
      timestamp by setting the before_stamp to zero so it does not match the
      write_stamp (which causes an event to use the absolute timestamp).
      
      But there's an effort in rb_try_to_discard() to put back the write_stamp
      to what it was before the event was added. But this is useless and
      wasteful because nothing is going to be using that write_stamp for
      calculations as it still will not match the before_stamp.
      
      Remove this useless update, and in doing so, we remove another
      cmpxchg64()!
      
      Also update the comments to reflect this change as well as remove some
      extra white space in another comment.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20231215081810.1f4f38fe@rorschach.local.home
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: Vincent Donnefort <vdonnefort@google.com>
      Fixes: b2dd7975 ("ring-buffer: Force absolute timestamp on discard of event")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Fix blocked reader of snapshot buffer · f33c4e4c
      Steven Rostedt (Google) authored
      commit 39a7dc23 upstream.
      
      If an application blocks on the snapshot or snapshot_raw files, expecting
      to be woken up when a snapshot occurs, it will not happen. Or it may
      happen with an unexpected result.
      
      That result is that the application will be reading the main buffer
      instead of the snapshot buffer. That is because when the snapshot occurs,
      the main and snapshot buffers are swapped. But the reader has a descriptor
      still pointing to the buffer that it originally connected to.
      
      This is fine for the main buffer readers, as they may be blocked waiting
      for a watermark to be hit, and when a snapshot occurs, the data that the
      main readers want is now on the snapshot buffer.
      
      But for waiters of the snapshot buffer, they are waiting for an event to
      occur that will trigger the snapshot and they can then consume it quickly
      to save the snapshot before the next snapshot occurs. But to do this, they
      need to read the new snapshot buffer, not the old one that is now
      receiving new data.
      
      Also, it does not make sense to have a watermark "buffer_percent" on the
      snapshot buffer, as the snapshot buffer is static and does not receive new
      data except all at once.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20231228095149.77f5b45d@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Fixes: debdd57f ("tracing: Make a snapshot feature available from userspace")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ring-buffer: Fix wake ups when buffer_percent is set to 100 · 09640899
      Steven Rostedt (Google) authored
      commit 623b1f89 upstream.
      
      The tracefs file "buffer_percent" is to allow user space to set a
      water-mark on how much of the tracing ring buffer needs to be filled in
      order to wake up a blocked reader.
      
       0 - is to wait until any data is in the buffer
       1 - is to wait for 1% of the sub buffers to be filled
       50 - would be half of the sub buffers are filled with data
       100 - is not to wake the waiter until the ring buffer is completely full
      
      Unfortunately the test for being full was:
      
      	dirty = ring_buffer_nr_dirty_pages(buffer, cpu);
      	return (dirty * 100) > (full * nr_pages);
      
      Where "full" is the value for "buffer_percent".
      
      There are two issues with the above when full == 100.
      
      1. dirty * 100 > 100 * nr_pages will never be true
         That is, the above is basically saying that if the user sets
         buffer_percent to 100, more pages need to be dirty than exist in the
         ring buffer!
      
      2. The page that the writer is on is never considered dirty, as dirty
         pages are only those that are full. When the writer goes to a new
         sub-buffer, it clears the contents of that sub-buffer.
      
      That is, even if the check was ">=" it would still not be equal as the
      most pages that can be considered "dirty" is nr_pages - 1.
      
      To fix this, add one to dirty and use ">=" in the compare.
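      
      The arithmetic can be checked with a small stand-alone model. The
      function names full_hit_old() and full_hit_new() are hypothetical; the
      expressions follow the test quoted above and the described fix (add
      one to dirty and compare with ">=").

```c
#include <stdbool.h>
#include <stddef.h>

/* Original test: can never be true when full == 100. */
static bool full_hit_old(size_t dirty, size_t nr_pages, size_t full)
{
	return (dirty * 100) > (full * nr_pages);
}

/*
 * Fixed test: count the sub-buffer the writer is on as dirty,
 * and treat reaching the watermark exactly as a hit.
 */
static bool full_hit_new(size_t dirty, size_t nr_pages, size_t full)
{
	dirty += 1;
	return (dirty * 100) >= (full * nr_pages);
}
```

      For example, with 8 sub-buffers at most 7 can be reported dirty, so
      the original test fails for full == 100 even when the buffer is as
      full as it can get, while the fixed test succeeds.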
      
      Link: https://lore.kernel.org/linux-trace-kernel/20231226125902.4a057f1d@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Fixes: 03329f99 ("tracing: Add tracefs file buffer_percentage")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/memory-failure: check the mapcount of the precise page · 4ee9d929
      Matthew Wilcox (Oracle) authored
      commit c79c5a0a upstream.
      
      A process may map only some of the pages in a folio, and might be missed
      if it maps the poisoned page but not the head page.  Or it might be
      unnecessarily hit if it maps the head page, but not the poisoned page.
      
      Link: https://lkml.kernel.org/r/20231218135837.3310403-3-willy@infradead.org
      Fixes: 7af446a8 ("HWPOISON, hugetlb: enable error handling path for hugepage")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/memory-failure: cast index to loff_t before shifting it · fb21c978
      Matthew Wilcox (Oracle) authored
      commit 39ebd6dc upstream.
      
      On 32-bit systems, we'll lose the top bits of index because arithmetic
      will be performed in unsigned long instead of unsigned long long.  This
      affects files over 4GB in size.
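      
      The truncation is easy to reproduce in user space. In this sketch
      uint32_t models a 32-bit unsigned long and int64_t models loff_t;
      PAGE_SHIFT_MODEL and both function names are assumptions made for the
      illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT_MODEL 12	/* assumes 4 KiB pages */

/* Buggy order: shift in 32 bits, then widen. The top bits are lost. */
static int64_t offset_shift_then_cast(uint32_t index)
{
	return (int64_t)(index << PAGE_SHIFT_MODEL);
}

/* Fixed order: widen first, then shift in 64 bits. */
static int64_t offset_cast_then_shift(uint32_t index)
{
	return (int64_t)index << PAGE_SHIFT_MODEL;
}
```

      Page index 0x100000 corresponds to byte offset 4 GiB. The first
      function returns 0 for it, the second returns 4294967296.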
      
      Link: https://lkml.kernel.org/r/20231218135837.3310403-4-willy@infradead.org
      Fixes: 6100e34b ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: migrate high-order folios in swap cache correctly · be72d197
      Charan Teja Kalla authored
      commit fc346d0a upstream.
      
      Large folios occupy N consecutive entries in the swap cache instead of
      using multi-index entries like the page cache.  However, if a large folio
      is re-added to the LRU list, it can be migrated.  The migration code was
      not aware of the difference between the swap cache and the page cache and
      assumed that a single xas_store() would be sufficient.
      
      This leaves potentially many stale pointers to the now-migrated folio in
      the swap cache, which can lead to almost arbitrary data corruption in the
      future.  This can also manifest as infinite loops with the RCU read lock
      held.
      
      [willy@infradead.org: modifications to the changelog & tweaked the fix]
      Fixes: 3417013e ("mm/migrate: Add folio_migrate_mapping()")
      Link: https://lkml.kernel.org/r/20231214045841.961776-1-willy@infradead.org
      Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
      Closes: https://lkml.kernel.org/r/1700569840-17327-1-git-send-email-quic_charante@quicinc.com
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/filemap: avoid buffered read/write race to read inconsistent data · a8df7914
      Baokun Li authored
      commit e2c27b80 upstream.
      
      The following concurrency may cause the data read to be inconsistent with
      the data on disk:
      
                   cpu1                           cpu2
      ------------------------------|------------------------------
                                     // Buffered write 2048 from 0
                                     ext4_buffered_write_iter
                                      generic_perform_write
                                       copy_page_from_iter_atomic
                                       ext4_da_write_end
                                        ext4_da_do_write_end
                                         block_write_end
                                          __block_commit_write
                                           folio_mark_uptodate
      // Buffered read 4096 from 0          smp_wmb()
      ext4_file_read_iter                   set_bit(PG_uptodate, folio_flags)
       generic_file_read_iter            i_size_write // 2048
        filemap_read                     unlock_page(page)
         filemap_get_pages
          filemap_get_read_batch
          folio_test_uptodate(folio)
           ret = test_bit(PG_uptodate, folio_flags)
           if (ret)
            smp_rmb();
            // Ensure that the data in page 0-2048 is up-to-date.
      
                                     // New buffered write 2048 from 2048
                                     ext4_buffered_write_iter
                                      generic_perform_write
                                       copy_page_from_iter_atomic
                                       ext4_da_write_end
                                        ext4_da_do_write_end
                                         block_write_end
                                          __block_commit_write
                                           folio_mark_uptodate
                                            smp_wmb()
                                            set_bit(PG_uptodate, folio_flags)
                                         i_size_write // 4096
                                         unlock_page(page)
      
         isize = i_size_read(inode) // 4096
         // Read the latest isize 4096, but without smp_rmb(), there may be
         // Load-Load disorder resulting in the data in the 2048-4096 range
         // in the page is not up-to-date.
         copy_page_to_iter
         // copyout 4096
      
      In the concurrency above, we read the updated i_size, but there is no read
      barrier to ensure that the data in the page is the same as the i_size at
      this point, so we may copy the unsynchronized page out.  Hence add the
      missing read memory barrier to fix this.
      
      This is a Load-Load reordering issue, which only occurs on some weak
      mem-ordering architectures (e.g.  ARM64, ALPHA), but not on strong
      mem-ordering architectures (e.g.  X86).  And theoretically the problem
      doesn't only happen on ext4, filesystems that call filemap_read() but
      don't hold inode lock (e.g.  btrfs, f2fs, ubifs ...) will have this
      problem, while filesystems with inode lock (e.g.  xfs, nfs) won't have
      this problem.
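      
      The ordering requirement can be illustrated with C11 atomics. This is
      a user-space analogy under stated assumptions, not the filemap code:
      struct page_model, model_write() and model_read() are hypothetical,
      the release fence stands in for smp_wmb(), and the acquire fence
      stands in for the read barrier added after reading i_size.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAGE_MODEL_SIZE 4096

struct page_model {
	char data[PAGE_MODEL_SIZE];
	_Atomic size_t isize;		/* models i_size */
};

/* Writer: publish the data first, then the new size. */
static int model_write(struct page_model *p, size_t pos,
		       const void *src, size_t len)
{
	if (pos > PAGE_MODEL_SIZE || len > PAGE_MODEL_SIZE - pos)
		return -1;

	memcpy(p->data + pos, src, len);
	atomic_thread_fence(memory_order_release);
	if (pos + len > atomic_load_explicit(&p->isize, memory_order_relaxed))
		atomic_store_explicit(&p->isize, pos + len,
				      memory_order_relaxed);
	return 0;
}

/* Reader: read the size, then fence, then copy at most that much data. */
static size_t model_read(struct page_model *p, void *dst, size_t len)
{
	size_t isize = atomic_load_explicit(&p->isize, memory_order_relaxed);

	/* Without this fence the data loads could be ordered before the size load. */
	atomic_thread_fence(memory_order_acquire);

	if (len > isize)
		len = isize;
	memcpy(dst, p->data, len);
	return len;
}
```

      The acquire fence pairs with the writer's release fence, so a reader
      that observes the larger size also observes the bytes written before
      that size was published.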
      
      Link: https://lkml.kernel.org/r/20231213062324.739009-1-libaokun1@huawei.com
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: yangerkun <yangerkun@huawei.com>
      Cc: Yu Kuai <yukuai3@huawei.com>
      Cc: Zhang Yi <yi.zhang@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe · b954b92e
      Shin'ichiro Kawasaki authored
      commit b28ff7a7 upstream.
      
      p2sb_bar() unhides the P2SB device to get resources from the device. It
      guards the operation by locking pci_rescan_remove_lock so that parallel
      rescans do not find the P2SB device. However, this lock causes a deadlock
      when a PCI bus rescan is triggered by /sys/bus/pci/rescan. The rescan
      locks pci_rescan_remove_lock and probes PCI devices. When PCI devices
      call p2sb_bar() during probe, it locks pci_rescan_remove_lock again.
      Hence the deadlock.
      
      To avoid the deadlock, do not lock pci_rescan_remove_lock in p2sb_bar().
      Instead, take the lock at fs_initcall. Introduce p2sb_cache_resources()
      for fs_initcall, which gets and caches the P2SB resources. In p2sb_bar(),
      refer to the cache and return to the caller.
      
      Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Fixes: 9745fb07 ("platform/x86/intel: Add Primary to Sideband (P2SB) bridge support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Link: https://lore.kernel.org/linux-pci/6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j/
      Link: https://lore.kernel.org/r/20231229063912.2517922-2-shinichiro.kawasaki@wdc.com
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16() · 7a3bbbad
      Namjae Jeon authored
      commit d10c7787 upstream.
      
      If ->NameOffset/Length is bigger than ->CreateContextsOffset/Length,
      ksmbd_check_message doesn't validate the request buffer correctly.
      So a slab-out-of-bounds warning from calling smb_strndup_from_utf16()
      in smb2_open() could happen. If ->NameLength is non-zero, set the larger
      of the two sums (Name and CreateContext size) as the offset and length of
      the data area.
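      
      The selection rule can be sketched as below. This is an illustration
      of "pick whichever region ends later", not the ksmbd code:
      pick_data_area() and its parameters are hypothetical, and the handling
      of edge cases (for example equal sums) is an assumption.

```c
#include <stdint.h>

/*
 * Choose the data area that the length validation should cover.
 * Default to the create contexts region; switch to the name region
 * when a name is present and its end lies beyond the contexts' end.
 */
static void pick_data_area(uint32_t name_off, uint32_t name_len,
			   uint32_t ctx_off, uint32_t ctx_len,
			   uint32_t *off, uint32_t *len)
{
	/* 64-bit sums so that offset + length cannot wrap around. */
	uint64_t name_end = (uint64_t)name_off + name_len;
	uint64_t ctx_end = (uint64_t)ctx_off + ctx_len;

	*off = ctx_off;
	*len = ctx_len;

	if (name_len != 0 && name_end > ctx_end) {
		*off = name_off;
		*len = name_len;
	}
}
```

      Validating against the region that ends later ensures that both the
      name and the create contexts lie inside the checked part of the
      request buffer.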
      
      Reported-by: Yang Chaoming <lometsj@live.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: renumber QUEUE_FLAG_HW_WC · b9c5f0fd
      Christoph Hellwig authored
      [ Upstream commit 02d374f3 ]
      
      For the QUEUE_FLAG_HW_WC to actually work, it needs to have a separate
      number from QUEUE_FLAG_FUA, doh.
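      
      Why a shared bit number breaks a flag can be seen in a tiny model.
      The bit numbers and names below are placeholders chosen for the
      illustration, not the values from the block layer headers.

```c
#include <stdbool.h>

/* Placeholder bit numbers (assumptions, not the kernel's values). */
enum {
	FLAG_FUA_MODEL = 18,
	FLAG_HW_WC_CLASH = 18,	/* same number as FUA: the bug */
	FLAG_HW_WC_FIXED = 13,	/* distinct number: the fix */
};

static void flag_set(unsigned long *flags, unsigned int bit)
{
	*flags |= 1UL << bit;
}

static bool flag_test(unsigned long flags, unsigned int bit)
{
	return (flags & (1UL << bit)) != 0;
}
```

      Setting only the FUA flag makes a test of the clashing HW_WC bit
      return true, while a test of the renumbered bit correctly returns
      false.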
      
      Fixes: 43c9835b ("block: don't allow enabling a cache on devices that don't support it")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20231226081524.180289-1-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • spi: atmel: Fix clock issue when using devices with different polarities · e21b5fc5
      Louis Chauvet authored
      [ Upstream commit fc70d643 ]
      
      The current Atmel SPI controller driver (v2) behaves incorrectly when
      using two SPI devices with different clock polarities and GPIO CS.
      
      When switching from one device to another, the controller driver first
      enables the CS and then applies whatever configuration suits the targeted
      device (typically, the polarities). The side effect of this order is the
      appearance of a spurious clock edge after enabling the CS when the clock
      polarity needs to be inverted with respect to the previous configuration
      of the controller.
      
      This parasitic clock edge is problematic when the SPI device uses that edge
      for internal processing, which is perfectly legitimate given that its CS
      was asserted. Indeed, devices such as HVS8080 driven by driver gpio-sr in
      the kernel are shift registers and will process this first clock edge to
      perform a first register shift. In this case, the first bit gets lost and
      the whole data block that will later be read by the kernel is all shifted
      by one.
      
          Current behavior:
            The actual switching of the clock polarity only occurs after the CS
            when the controller sends the first message:
      
          CLK ------------\   /-\ /-\
                          |   | | | |    . . .
                          \---/ \-/ \
          CS  -----\
                   |
                   \------------------
      
                   ^      ^   ^
                   |      |   |
                   |      |   Actual clock of the message sent
                   |      |
                   |      Change of clock polarity, which occurs with the first
                   |      write to the bus. This edge occurs when the CS is
                   |      already asserted, and can be interpreted as
                   |      the first clock edge by the receiver.
                   |
                   GPIO CS toggle
      
      This issue is specific to this controller because while the SPI core
      performs the operations in the right order, the controller however does
      not. In practice, the controller only applies the clock configuration right
      before the first transmission.
      
      So this is not a problem when using the controller's dedicated CS, as the
      controller does things correctly, but it becomes a problem when you need to
      change the clock polarity and use an external GPIO for the CS.
      
      One possible approach to solve this problem is to send a dummy message
      before actually activating the CS, so that the controller applies the clock
      polarity beforehand.
      
      New behavior:
      
      CLK     ------\      /-\     /-\      /-\     /-\
                    |      | | ... | |      | | ... | |
                    \------/ \-   -/ \------/ \-   -/ \------
      
      CS      -\/-----------------------\
               ||                       |
               \/                       \---------------------
               ^    ^       ^           ^    ^
               |    |       |           |    |
               |    |       |           |    Expected clock cycles when
               |    |       |           |    sending the message
               |    |       |           |
               |    |       |           Actual GPIO CS activation, occurs inside
               |    |       |           the driver
               |    |       |
               |    |       Dummy message, to trigger clock polarity
               |    |       reconfiguration. This message is not received and
               |    |       processed by the device because CS is low.
               |    |
               |    Change of clock polarity, forced by the dummy message. This
               |    time, the edge is not detected by the receiver.
               |
               This small spike in CS activation is due to the fact that the
               spi-core activates the CS gpio before calling the driver's
               set_cs callback, which deactivates this gpio again until the
               clock polarity is correct.
      
      To avoid having to systematically send a dummy packet, the driver keeps
      track of the clock's current polarity. In this way, it only sends the dummy
      packet when necessary, ensuring that the clock will have the correct
      polarity when the CS is toggled.
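
The polarity tracking described above might look roughly like the sketch below. It is a simplified model, not the driver code: `struct demo_ctrl` and `demo_prepare_cs()` are hypothetical, and the dummy message is represented by a counter rather than a real transfer.

```c
#include <stdbool.h>

/* Hypothetical controller state: last applied polarity and a dummy counter. */
struct demo_ctrl {
    bool last_cpol;   /* clock polarity currently applied by the controller */
    int dummy_msgs;   /* number of dummy messages sent so far */
};

/* Called before the GPIO CS is asserted for a device that wants wanted_cpol. */
static void demo_prepare_cs(struct demo_ctrl *ctrl, bool wanted_cpol)
{
    if (ctrl->last_cpol != wanted_cpol) {
        /* Stand-in for the dummy message that forces the polarity change. */
        ctrl->dummy_msgs++;
        ctrl->last_cpol = wanted_cpol;
    }
    /* The GPIO CS would be asserted here, with the clock already idle
     * at the correct level. */
}
```

Consecutive transfers to devices with the same polarity skip the dummy message, so the extra cost is only paid when the polarity actually changes.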
      
      There could be two hardware problems with this patch:
      1- Maybe the small CS activation peak can confuse SPI devices
      2- If on a design, a single wire is used to select two devices depending
      on its state, the dummy message may disturb them.
      
      Fixes: 5ee36c98 ("spi: atmel_spi update chipselect handling")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
      Link: https://msgid.link/r/20231204154903.11607-1-louis.chauvet@bootlin.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e21b5fc5
    • Amit Kumar Mahapatra's avatar
      spi: Add APIs in spi core to set/get spi->chip_select and spi->cs_gpiod · 025cf65f
      Amit Kumar Mahapatra authored
      [ Upstream commit 303feb3c ]
      
      Supporting multi-cs in the spi core and spi controller drivers would
      require the chip_select & cs_gpiod members of struct spi_device to be
      arrays. But changing the type of these members to arrays would break the
      spi driver functionality. To make the transition smoother, introduce four
      new APIs to get/set spi->chip_select & spi->cs_gpiod, and replace all
      spi->chip_select and spi->cs_gpiod references in the spi core with the
      API calls.
      While adding multi-cs support in further patches, the chip_select &
      cs_gpiod members of the spi_device structure would be converted to arrays
      & the "idx" parameter of the APIs would be used as the array index, i.e.,
      spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.
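
The accessor pattern described above can be sketched with a mock structure. The `mock_*` names are assumptions made for this illustration and only mirror the shape of the get/set helpers; the `idx` parameter is accepted but unused while the members are still scalars.

```c
#include <stddef.h>
#include <stdint.h>

struct gpio_desc;   /* opaque here; only pointers to it are used */

/* Mock of the two members discussed in the commit message. */
struct mock_spi_device {
    uint8_t chip_select;
    struct gpio_desc *cs_gpiod;
};

/* idx is ignored for now; it becomes the array index once the
 * members are converted to arrays. */
static inline uint8_t mock_spi_get_chipselect(const struct mock_spi_device *spi,
                                              uint8_t idx)
{
    (void)idx;
    return spi->chip_select;
}

static inline void mock_spi_set_chipselect(struct mock_spi_device *spi,
                                           uint8_t idx, uint8_t chipselect)
{
    (void)idx;
    spi->chip_select = chipselect;
}

static inline struct gpio_desc *mock_spi_get_csgpiod(const struct mock_spi_device *spi,
                                                     uint8_t idx)
{
    (void)idx;
    return spi->cs_gpiod;
}

static inline void mock_spi_set_csgpiod(struct mock_spi_device *spi,
                                        uint8_t idx, struct gpio_desc *csgpiod)
{
    (void)idx;
    spi->cs_gpiod = csgpiod;
}
```

Callers that go through the helpers do not need to change when the member types change, which is what makes the later conversion to arrays less disruptive.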
      
      Suggested-by: Lars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com>
      Reviewed-by: Michal Simek <michal.simek@amd.com>
      Link: https://lore.kernel.org/r/20230119185342.2093323-2-amit.kumar-mahapatra@amd.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Stable-dep-of: fc70d643 ("spi: atmel: Fix clock issue when using devices with different polarities")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      025cf65f
    • Tudor Ambarus's avatar
      spi: Reintroduce spi_set_cs_timing() · 64a4eb29
      Tudor Ambarus authored
      [ Upstream commit 684a4784 ]
      
      commit 4ccf3598 ("spi: remove spi_set_cs_timing()") removed the
      method as nobody used it. Nobody used it probably because some SPI
      controllers use some default large cs-setup time that covers the usual
      cs-setup time required by the spi devices. There are though SPI controllers
      that have a smaller granularity for the cs-setup time and their default
      value can't fulfill the spi device requirements. That's the case for the
      at91 QSPI IPs where the default cs-setup time is half of the QSPI clock
      period. This was observed when using an sst26vf064b SPI NOR flash which
      needs a spi-cs-setup-ns = <7>; in order to be operated close to its maximum
      104 MHz frequency.
      
      Call spi_set_cs_timing() in spi_setup() just before calling spi_set_cs(),
      as the latter needs the CS timings already set.
      If spi->controller->set_cs_timing is not set, the method will return 0.
      There's no functional impact expected for the existing drivers. Even if the
      spi-mt65xx.c and spi-tegra114.c drivers set the set_cs_timing method,
      there's no user for them as of now. The only tested user of this support
      will be a SPI NOR flash that communicates with the Atmel QSPI controller for
      which the support follows in the next patches.
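
The "optional callback, called before the CS is toggled" ordering can be sketched as below. This is a mock, not the spi core: all `mock_*` and `example_*` names are hypothetical, and the actual CS toggling is reduced to a comment.

```c
#include <stddef.h>

struct mock_spi_device;

/* Mock controller with an optional set_cs_timing callback. */
struct mock_controller {
    int (*set_cs_timing)(struct mock_spi_device *spi);
};

struct mock_spi_device {
    struct mock_controller *controller;
    int timing_applied;   /* set by the example callback below */
};

/* Return 0 when the controller does not provide the callback. */
static int mock_set_cs_timing(struct mock_spi_device *spi)
{
    if (!spi->controller->set_cs_timing)
        return 0;
    return spi->controller->set_cs_timing(spi);
}

/* Apply the CS timings first, then continue with the CS handling. */
static int mock_spi_setup(struct mock_spi_device *spi)
{
    int status = mock_set_cs_timing(spi);

    if (status)
        return status;
    /* The CS would be configured here, with the timings already set. */
    return 0;
}

/* Example callback a controller driver might install. */
static int example_apply_timing(struct mock_spi_device *spi)
{
    spi->timing_applied = 1;
    return 0;
}
```

Controllers that leave the callback unset take the early return, which matches the statement that no functional impact is expected for existing drivers.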
      
      One will notice that this support is a bit different from the one that was
      removed in commit 4ccf3598 ("spi: remove spi_set_cs_timing()"),
      because this patch adapts to the changes done after the removal: the move
      of the cs delays to the spi device, the retirement of the legacy GPIO
      handling. The mutex handling was removed from spi_set_cs_timing() because
      we now always call spi_set_cs_timing() in spi_setup(), which already
      handles the spi->controller->io_mutex, so use the mutex handling from
      spi_setup().
      
      Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
      Link: https://lore.kernel.org/r/20221117105249.115649-4-tudor.ambarus@microchip.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Stable-dep-of: fc70d643 ("spi: atmel: Fix clock issue when using devices with different polarities")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      64a4eb29
    • Helge Deller's avatar
      linux/export: Ensure natural alignment of kcrctab array · 95e21657
      Helge Deller authored
      [ Upstream commit 753547de ]
      
      The ___kcrctab section holds an array of 32-bit CRC values.
      Add a .balign 4 to tell the linker the correct memory alignment.
      
      Fixes: f3304ecd ("linux/export: use inline assembler to populate symbol CRCs")
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      95e21657
    • NeilBrown's avatar
      nfsd: call nfsd_last_thread() before final nfsd_put() · bb4f791c
      NeilBrown authored
      [ Upstream commit 2a501f55 ]
      
      If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put()
      without calling nfsd_last_thread().  This leaves nn->nfsd_serv pointing
      to a structure that has been freed.
      
      So remove 'static' from nfsd_last_thread() and call it when the
      nfsd_serv is about to be destroyed.
      
      Fixes: ec52361d ("SUNRPC: stop using ->sv_nrthreads as a refcount")
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bb4f791c
    • NeilBrown's avatar
      nfsd: separate nfsd_last_thread() from nfsd_put() · 03d68ffc
      NeilBrown authored
      [ Upstream commit 9f28a971 ]
      
      Now that the last nfsd thread is stopped by an explicit act of calling
      svc_set_num_threads() with a count of zero, we only have a limited
      number of places that can happen, and don't need to call
      nfsd_last_thread() in nfsd_put().
      
      So separate that out and call it at the two places where the number of
      threads is set to zero.
      
      Move the clearing of ->nfsd_serv and the call to svc_xprt_destroy_all()
      into nfsd_last_thread(), as they are really part of the same action.
      
      nfsd_put() is now a thin wrapper around svc_put(), so make it a static
      inline.
      
      nfsd_put() cannot be called after nfsd_last_thread(), so in a couple of
      places we have to use svc_put() instead.
      
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Stable-dep-of: 2a501f55 ("nfsd: call nfsd_last_thread() before final nfsd_put()")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      03d68ffc
    • Nuno Sa's avatar
      iio: imu: adis16475: add spi_device_id table · 481561a4
      Nuno Sa authored
      [ Upstream commit ee4d7905 ]
      
      This prevents the warning message "SPI driver has no spi_device_id for..."
      when registering the driver. More importantly, it makes sure that
      module autoloading works as spi relies on spi: modaliases and not of.
      
      While at it, move the of_device_id table to its natural place.
      
      Fixes: fff7352b ("iio: imu: Add support for adis16475")
      Signed-off-by: Nuno Sa <nuno.sa@analog.com>
      Link: https://lore.kernel.org/r/20231102125258.3284830-1-nuno.sa@analog.com
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      481561a4
    • Andy Shevchenko's avatar
      spi: Introduce spi_get_device_match_data() helper · 027eaeaf
      Andy Shevchenko authored
      [ Upstream commit aea672d0 ]
      
      The proposed spi_get_device_match_data() helper is for retrieving
      the driver data associated with the ID in an ID table. First, it tries
      to get the driver data of the device enumerated by the firmware interface
      (usually Device Tree or ACPI). If none is found, it falls back to
      the SPI ID table matching.
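
The two-step lookup described above can be sketched with mock types. Everything named `mock_*` is an assumption for illustration: the firmware match is modeled as a plain pointer on the device, and the SPI ID table as a name-terminated array.

```c
#include <stddef.h>
#include <string.h>

/* Mock ID table entry: the table ends with an entry whose name is NULL. */
struct mock_spi_id {
    const char *name;
    const void *driver_data;
};

/* Mock device: fw_match_data stands in for data found via DT or ACPI. */
struct mock_spi_dev {
    const void *fw_match_data;
    const char *modalias;
};

/* Prefer firmware-provided match data, else fall back to the ID table. */
static const void *mock_get_match_data(const struct mock_spi_dev *dev,
                                       const struct mock_spi_id *table)
{
    if (dev->fw_match_data)
        return dev->fw_match_data;

    for (; table->name; table++) {
        if (strcmp(table->name, dev->modalias) == 0)
            return table->driver_data;
    }
    return NULL;
}
```

A driver using such a helper gets the same per-chip data whether the device was enumerated by firmware or matched by name.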
      
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Link: https://lore.kernel.org/r/20221020195421.10482-1-andriy.shevchenko@linux.intel.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Stable-dep-of: ee4d7905 ("iio: imu: adis16475: add spi_device_id table")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      027eaeaf
    • Dan Carpenter's avatar
      usb: fotg210-hcd: delete an incorrect bounds test · 457a219c
      Dan Carpenter authored
      [ Upstream commit 7fbcd195 ]
      
      Here "temp" is the number of characters that we have written and "size"
      is the size of the buffer.  The intent was clearly to say that if we have
      written to the end of the buffer then stop.
      
      However, for that to work the comparison should have been done on the
      original "size" value instead of the "size -= temp" value.  Not only
      will that not trigger when we want to, but there is a small chance that
      it will trigger incorrectly before we want it to and we break from the
      loop slightly earlier than intended.
      
      This code was recently changed from using snprintf() to scnprintf().  With
      snprintf() we likely would have continued looping and passed a negative
      size parameter to snprintf().  This would have triggered an annoying
      WARN().  Now that we have converted to scnprintf() "size" will never
      drop below 1 and there is no real need for this test.  We could change
      the condition to "if (temp <= 1) goto done;" but just deleting the test
      is cleanest.
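
The difference between the two return conventions can be sketched in userspace. `sketch_scnprintf()` below is a hypothetical stand-in built on the standard `vsnprintf()`; it only mimics the documented behavior of returning the number of characters actually written, and `fill_lines()` is an illustrative loop, not the driver's.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in: return the characters actually stored in buf,
 * excluding the trailing NUL, rather than the would-be length. */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;
    if ((size_t)n >= size)
        return (int)(size - 1);   /* output was truncated */
    return n;
}

/* Append count lines to buf; returns the space left, which stays >= 1
 * as long as the initial size is at least 1. */
static size_t fill_lines(char *buf, size_t size, int count)
{
    char *next = buf;

    for (int i = 0; i < count; i++) {
        int temp = sketch_scnprintf(next, size, "line %d\n", i);

        size -= temp;
        next += temp;
    }
    return size;
}
```

Because the returned count never exceeds `size - 1`, the running `size` cannot reach zero or go negative, which is why a bounds test inside the loop is not needed.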
      
      Fixes: 7d50195f ("usb: host: Faraday fotg210-hcd driver")
      Cc: stable <stable@kernel.org>
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Lee Jones <lee@kernel.org>
      Link: https://lore.kernel.org/r/ZXmwIwHe35wGfgzu@suswa
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      457a219c
    • Tony Lindgren's avatar
      ARM: dts: Fix occasional boot hang for am3 usb · 11912727
      Tony Lindgren authored
      [ Upstream commit 9b6a51aa ]
      
      With subtle timing changes, we can now sometimes get an external abort on
      non-linefetch error booting am3 devices at sysc_reset(). This is because
      of a missing reset delay needed for the usb target module.
      
      Looks like we never enabled the delay earlier for am3, although a similar
      issue was seen earlier with a similar usb setup for dm814x as described in
      commit ebf24414 ("ARM: OMAP2+: Use srst_udelay for USB on dm814x").
      
      Cc: stable@vger.kernel.org
      Fixes: 0782e857 ("ARM: dts: Probe am335x musb with ti-sysc")
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      11912727
    • Namjae Jeon's avatar
      ksmbd: fix wrong allocation size update in smb2_open() · 98235bc1
      Namjae Jeon authored
      [ Upstream commit a9f106c7 ]

      When a client sends the SMB2_CREATE_ALLOCATION_SIZE create context, ksmbd
      puts the old size in ->AllocationSize of the smb2 create response.
      ksmbd_vfs_getattr() should be called after the allocation size is applied
      to get the updated stat result.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      98235bc1
    • Namjae Jeon's avatar
      ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack() · 04b8e04f
      Namjae Jeon authored
      [ Upstream commit 658609d9 ]

      opinfo_put() could be called twice on error of smb21_lease_break_ack().
      This causes a use-after-free (UAF) if opinfo is still referenced elsewhere.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      04b8e04f
    • Namjae Jeon's avatar
      ksmbd: lazy v2 lease break on smb2_write() · 34f7d5b5
      Namjae Jeon authored
      [ Upstream commit c2a721ee ]

      Don't immediately send a directory lease break notification on smb2_write().
      Instead, postpone it until smb2_close().

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      34f7d5b5
    • Namjae Jeon's avatar
      ksmbd: send v2 lease break notification for directory · 500c7a5e
      Namjae Jeon authored
      [ Upstream commit d47d9886 ]

      If the client sends a different parent key or a different client GUID, or
      if the parent lease key flag is not set in the v2 lease create context,
      ksmbd sends a lease break to the client.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      500c7a5e
    • Namjae Jeon's avatar
      ksmbd: downgrade RWH lease caching state to RH for directory · 19939594
      Namjae Jeon authored
      [ Upstream commit eb547407 ]

      The RWH (Read + Write + Handle) caching state is not supported for
      directories. ksmbd downgrades it to RH for a directory if the client sends
      an RWH caching lease state.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      19939594
    • Namjae Jeon's avatar
      ksmbd: set v2 lease capability · 2fcb46df
      Namjae Jeon authored
      [ Upstream commit 18dd1c36 ]

      Set SMB2_GLOBAL_CAP_DIRECTORY_LEASING in ->capabilities to inform the
      client that the server supports directory leases.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2fcb46df
    • Namjae Jeon's avatar
      ksmbd: set epoch in create context v2 lease · 3eddc811
      Namjae Jeon authored
      [ Upstream commit d045850b ]

      To support v2 leases (directory leases), ksmbd sets the epoch in the
      v2 lease create context response.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3eddc811
    • Namjae Jeon's avatar
      ksmbd: don't update ->op_state as OPLOCK_STATE_NONE on error · 52a32eaf
      Namjae Jeon authored
      [ Upstream commit cd80ce7e ]

      ksmbd sets ->op_state to OPLOCK_STATE_NONE on a lease break ack error.
      The op_state of the lease should not be updated, because the client can
      send the lease break ack again. This patch fixes the smb2.lease.breaking2
      test failure.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      52a32eaf
    • Namjae Jeon's avatar
      ksmbd: move setting SMB2_FLAGS_ASYNC_COMMAND and AsyncId · 0bc46c23
      Namjae Jeon authored
      [ Upstream commit 9ac45ac7 ]

      Directly set the SMB2_FLAGS_ASYNC_COMMAND flag and AsyncId in the smb2
      header of the interim response instead of the current response header.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0bc46c23
    • Namjae Jeon's avatar
      ksmbd: release interim response after sending status pending response · d9aa5c19
      Namjae Jeon authored
      [ Upstream commit 2a3f7857 ]

      Add the missing release of the async id and delete the interim response
      entry after sending the status pending response. This only happens when
      smb2 leases are enabled.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d9aa5c19
    • Namjae Jeon's avatar
      ksmbd: move oplock handling after unlock parent dir · 013bf453
      Namjae Jeon authored
      [ Upstream commit 2e450920 ]

      ksmbd should process a second parallel smb2 create request while waiting
      for the oplock break ack. A parent lock range that is too large in
      smb2_open() causes smb2_open() to be serialized. Move the oplock handling
      to the bottom of smb2_open() so that it is called after the parent is
      unlocked. This fixes the failure of the smb2.lease.breaking1 test case.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      013bf453
    • Namjae Jeon's avatar
      ksmbd: separately allocate ci per dentry · 20dd92c2
      Namjae Jeon authored
      [ Upstream commit 4274a9dc ]

      The xfstests generic/002 test fails when the smb2 leases feature is
      enabled. This test creates a hard link file, but removal fails.
      ci has a file open count that tracks opens through the smb client,
      but for hard link files, allocating ci per inode yields an incorrect
      open count at file deletion. This patch allocates ci per dentry so that
      open counts are tracked per hard link.

      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      20dd92c2