  1. Aug 08, 2021
    • ACPI: fix NULL pointer dereference · 38f54217
      Linus Torvalds authored
      [ Upstream commit fc68f42a ]
      
      Commit 71f64283 ("ACPI: utils: Fix reference counting in
      for_each_acpi_dev_match()") started doing "acpi_dev_put()" on a pointer
      that was possibly NULL.  That fails miserably, because that helper
      inline function is not set up to handle that case.
      
      Just make acpi_dev_put() silently accept a NULL pointer, rather than
      calling down to put_device() with an invalid offset off that NULL
      pointer.
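The pattern can be illustrated with a small userspace sketch. The names below (`toy_device`, `toy_acpi_device`, `toy_put_device`, `toy_acpi_dev_put`) are hypothetical stand-ins, not the kernel's definitions; the sketch only shows why the wrapper has to check for NULL before taking the address of an embedded member.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted base object. */
struct toy_device {
    int refcount;
};

/* Hypothetical wrapper; the embedded member sits at a nonzero offset. */
struct toy_acpi_device {
    int handle;
    struct toy_device dev;
};

/* Drops one reference; it expects a valid pointer. */
static void toy_put_device(struct toy_device *dev)
{
    dev->refcount--;
}

/*
 * NULL-tolerant wrapper: without the check, &adev->dev on a NULL adev
 * would be a small invalid address (the member offset), not NULL.
 */
static void toy_acpi_dev_put(struct toy_acpi_device *adev)
{
    if (adev)
        toy_put_device(&adev->dev);
}
```

With this shape, callers that may hold a NULL pointer at the end of an iteration can call the put helper unconditionally.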
      
      Link: https://lore.kernel.org/lkml/a607c149-6bf6-0fd0-0e31-100378504da2@kernel.dk/
      Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
      Tested-by: Daniel Scally <djrscally@gmail.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvme: fix nvme_setup_command metadata trace event · 0ea2f55b
      Keith Busch authored
      [ Upstream commit 234211b8 ]
      
      The metadata address is set after the trace event, so the trace is not
      capturing anything useful. Rather than logging the memory address, it's
      useful to know if the command carries a metadata payload, so change the
      trace event to log that true/false state instead.
      
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: Fix zero-copy head len calculation. · b508b652
      Pravin B Shelar authored
      [ Upstream commit a17ad096 ]
      
      In some cases the skb head can be locked and the entire header
      data is pulled from the skb. When skb_zerocopy() is called in such
      cases, the following BUG is triggered. This patch fixes it by copying
      the entire skb in such cases.
      This could be optimized in case it becomes a performance bottleneck.
      
      ---8<---
      kernel BUG at net/core/skbuff.c:2961!
      invalid opcode: 0000 [#1] SMP PTI
      CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           OE     5.4.0-77-generic #86-Ubuntu
      Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014
      RIP: 0010:skb_zerocopy+0x37a/0x3a0
      RSP: 0018:ffffbcc70013ca38 EFLAGS: 00010246
      Call Trace:
       <IRQ>
       queue_userspace_packet+0x2af/0x5e0 [openvswitch]
       ovs_dp_upcall+0x3d/0x60 [openvswitch]
       ovs_dp_process_packet+0x125/0x150 [openvswitch]
       ovs_vport_receive+0x77/0xd0 [openvswitch]
       netdev_port_receive+0x87/0x130 [openvswitch]
       netdev_frame_hook+0x4b/0x60 [openvswitch]
       __netif_receive_skb_core+0x2b4/0xc90
       __netif_receive_skb_one_core+0x3f/0xa0
       __netif_receive_skb+0x18/0x60
       process_backlog+0xa9/0x160
       net_rx_action+0x142/0x390
       __do_softirq+0xe1/0x2d6
       irq_exit+0xae/0xb0
       do_IRQ+0x5a/0xf0
       common_interrupt+0xf/0xf
      
      Code that triggered BUG:
      int
      skb_zerocopy(struct sk_buff *to, struct sk_buff *from, int len, int hlen)
      {
              int i, j = 0;
              int plen = 0; /* length of skb->head fragment */
              int ret;
              struct page *page;
              unsigned int offset;
      
              BUG_ON(!from->head_frag && !hlen);
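The BUG_ON above fires when the head cannot be attached as a fragment and the caller passed a linear copy length of zero. A minimal userspace sketch of the fallback the message describes follows. `toy_skb`, `toy_headlen` and `toy_zerocopy_headlen` are hypothetical names, and the model ignores the other conditions the real head-length helper may check.

```c
#include <stdbool.h>

/* Toy model of an skb: only the fields this sketch needs (hypothetical). */
struct toy_skb {
    unsigned int len;       /* total bytes: linear head plus paged data */
    unsigned int data_len;  /* bytes held in paged fragments */
    bool head_frag;         /* true if the head is backed by a page fragment */
};

/* Bytes in the linear head area. */
static unsigned int toy_headlen(const struct toy_skb *skb)
{
    return skb->len - skb->data_len;
}

/*
 * Number of bytes to copy linearly before attaching fragments.
 * If the head cannot be shared as a fragment and the linear area is
 * empty, fall back to copying the whole packet so the result is never 0
 * in that case.
 */
static unsigned int toy_zerocopy_headlen(const struct toy_skb *from)
{
    unsigned int hlen = 0;

    if (!from->head_frag) {
        hlen = toy_headlen(from);
        if (!hlen)
            hlen = from->len;
    }
    return hlen;
}
```

In this model, a packet with `head_frag` false never yields a zero head length, so the condition checked by the BUG_ON cannot hold.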
      
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union() · bf692e7e
      Jia He authored
      [ Upstream commit 6206b798 ]
      
      Lijian reported a BUG_ON hit on a ThunderX2 arm64 server with a FastLinQ
      QL41000 Ethernet controller:
       BUG: scheduling while atomic: kworker/0:4/531/0x00000200
        [qed_probe:488()]hw prepare failed
        kernel BUG at mm/vmalloc.c:2355!
        Internal error: Oops - BUG: 0 [#1] SMP
        CPU: 0 PID: 531 Comm: kworker/0:4 Tainted: G W 5.4.0-77-generic #86-Ubuntu
        pstate: 00400009 (nzcv daif +PAN -UAO)
       Call trace:
        vunmap+0x4c/0x50
        iounmap+0x48/0x58
        qed_free_pci+0x60/0x80 [qed]
        qed_probe+0x35c/0x688 [qed]
        __qede_probe+0x88/0x5c8 [qede]
        qede_probe+0x60/0xe0 [qede]
        local_pci_probe+0x48/0xa0
        work_for_cpu_fn+0x24/0x38
        process_one_work+0x1d0/0x468
        worker_thread+0x238/0x4e0
        kthread+0xf0/0x118
        ret_from_fork+0x10/0x18
      
      In this case, qed_hw_prepare() returns an error due to a hw/fw error, but
      in theory the work queue should run in process context, not interrupt
      context.
      
      The root cause might be the unpaired spin_{un}lock_bh() in
      _qed_mcp_cmd_and_union(), which leaves the bottom half disabled incorrectly.
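To make the idea of lock pairing concrete, here is a minimal userspace sketch of a polling loop in which every exit path releases the lock exactly once. All names (`toy_lock`, `toy_lock_bh`, `toy_unlock_bh`, `toy_wait_for_slot`) are hypothetical, and the loop is not a copy of the driver's logic; the counter only makes an imbalance observable.

```c
#include <stdbool.h>

/* Hypothetical lock that only counts nesting depth, to make imbalance visible. */
struct toy_lock {
    int depth;
};

static void toy_lock_bh(struct toy_lock *l)
{
    l->depth++;
}

static void toy_unlock_bh(struct toy_lock *l)
{
    l->depth--;
}

/*
 * Polls up to max_tries times for a free slot. busy_until is a
 * hypothetical stand-in for hardware state: the slot is free from that
 * attempt onward. Every exit path leaves the lock released.
 */
static bool toy_wait_for_slot(struct toy_lock *l, int busy_until, int max_tries)
{
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        bool is_free;

        toy_lock_bh(l);
        is_free = attempt >= busy_until;
        toy_unlock_bh(l);   /* paired with the lock above on every iteration */

        if (is_free)
            return true;
    }
    return false;           /* timeout: the lock is not held here */
}
```

A depth that is nonzero after the function returns would correspond to the "scheduling while atomic" symptom in the report.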
      
      Reported-by: Lijian Zhang <Lijian.Zhang@arm.com>
      Signed-off-by: Jia He <justin.he@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • r8152: Fix potential PM refcount imbalance · 6bc48348
      Takashi Iwai authored
      [ Upstream commit 9c23aa51 ]
      
      rtl8152_close() takes the refcount via usb_autopm_get_interface() but
      does not release it when the RTL8152_UNPLUG test hits.  This may lead
      to an imbalance of the PM refcount.  This patch addresses it.
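The general shape of such a fix is a single exit label that drops the reference on every path that took it. The sketch below is a userspace model with hypothetical names (`toy_intf`, `toy_autopm_get`, `toy_autopm_put`, `toy_close`); it does not reproduce the driver's close routine.

```c
#include <stdbool.h>

/* Hypothetical interface with a power-management usage counter. */
struct toy_intf {
    int pm_usage;
    bool unplugged;
    bool hw_stopped;
};

/* Takes one PM reference; always succeeds in this toy model. */
static int toy_autopm_get(struct toy_intf *intf)
{
    intf->pm_usage++;
    return 0;
}

/* Drops one PM reference. */
static void toy_autopm_put(struct toy_intf *intf)
{
    intf->pm_usage--;
}

/*
 * Close path: the reference taken at the top is dropped on every path
 * that took it, including the early exit for an unplugged device.
 */
static int toy_close(struct toy_intf *intf)
{
    int res = toy_autopm_get(intf);

    if (res < 0)
        return res;         /* nothing was taken, nothing to drop */

    if (intf->unplugged)
        goto out;           /* skip hardware access, but still drop the ref */

    intf->hw_stopped = true;
out:
    toy_autopm_put(intf);
    return 0;
}
```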
      
      Link: https://bugzilla.suse.com/show_bug.cgi?id=1186194
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: tlv320aic31xx: fix reversed bclk/wclk master bits · a57c75ff
      Kyle Russell authored
      [ Upstream commit 9cf76a72 ]
      
      These are backwards from Table 7-71 of the TLV320AIC3100 spec [1].
      
      This was broken in 12eb4d66 when BCLK_MASTER and WCLK_MASTER
      were converted from 0x08 and 0x04 to BIT(2) and BIT(3), respectively.
      
      -#define AIC31XX_BCLK_MASTER		0x08
      -#define AIC31XX_WCLK_MASTER		0x04
      +#define AIC31XX_BCLK_MASTER		BIT(2)
      +#define AIC31XX_WCLK_MASTER		BIT(3)
      
      Probably just a typo since the defines were not listed in bit order.
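The mismatch is plain bit arithmetic: 0x08 is bit 3 and 0x04 is bit 2. The following sketch restates a BIT() macro for userspace and checks a value-preserving conversion at compile time. The `OLD_` and `FIXED_` prefixes are hypothetical names used only for this comparison.

```c
/* BIT() restated for userspace: a single set bit at position n. */
#define BIT(n) (1UL << (n))

/* Values before the conversion, as shown in the removed lines. */
#define OLD_BCLK_MASTER 0x08
#define OLD_WCLK_MASTER 0x04

/* A conversion that preserves the old values: 0x08 is bit 3, 0x04 is bit 2. */
#define FIXED_BCLK_MASTER BIT(3)
#define FIXED_WCLK_MASTER BIT(2)

/* Compile-time checks that the conversion did not change either value. */
_Static_assert(FIXED_BCLK_MASTER == OLD_BCLK_MASTER, "BCLK master must stay 0x08");
_Static_assert(FIXED_WCLK_MASTER == OLD_WCLK_MASTER, "WCLK master must stay 0x04");
```

A compile-time assertion of this kind would have rejected the swapped BIT(2)/BIT(3) pair.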
      
      [1] https://www.ti.com/lit/gpn/tlv320aic3100
      
      Signed-off-by: Kyle Russell <bkylerussell@gmail.com>
      Link: https://lore.kernel.org/r/20210622010941.241386-1-bkylerussell@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • spi: stm32h7: fix full duplex irq handler handling · e2cccb83
      Alain Volmat authored
      [ Upstream commit e4a5c198 ]
      
      In Full-Duplex mode, the DXP flag is set when both the RXP and TXP flags
      are set. To avoid two different handling paths, add the TXP and RXP
      flags to the mask instead of DXP, and keep the initial handling of TXP
      and RXP events.
      Also rephrase the comment about EOTIE, which is one of the interrupt
      enable bits and is not triggered by any event.
      
      Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
      Signed-off-by: Alain Volmat <alain.volmat@foss.st.com>
      Reviewed-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
      Link: https://lore.kernel.org/r/1625042723-661-3-git-send-email-alain.volmat@foss.st.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • regulator: rt5033: Fix n_voltages settings for BUCK and LDO · b72f2d9e
      Axel Lin authored
      [ Upstream commit 6549c46a ]
      
      For linear regulators, n_voltages should be (max - min) / step + 1.
      
      The buck voltage ranges from 1 V to 3 V in 100 mV steps, and the vout
      mask is 0x1f. Selector values from 20 to 31 all give a fixed 3 V.
      The LDO is similar: only the vout range differs (1.2 V to 3 V), with the
      same step. Selector values from 18 to 31 all give a fixed 3 V.
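The formula can be checked with a few lines of C. The helper names (`toy_n_voltages`, `toy_sel_to_uV`) are hypothetical, and the clamping helper only models the "fixed to 3 V" behaviour described above; it is not taken from the driver.

```c
/* Number of selectable voltages for a linear range; all values in microvolts. */
static unsigned int toy_n_voltages(unsigned int min_uV, unsigned int max_uV,
                                   unsigned int step_uV)
{
    return (max_uV - min_uV) / step_uV + 1;
}

/* Voltage for a selector, clamped to the top of the range. */
static unsigned int toy_sel_to_uV(unsigned int sel, unsigned int min_uV,
                                  unsigned int max_uV, unsigned int step_uV)
{
    unsigned int uV = min_uV + sel * step_uV;

    return uV > max_uV ? max_uV : uV;
}
```

For the buck range (1 V to 3 V, 100 mV step) this gives 21 distinct voltages, selectors 0 through 20. For the LDO range (1.2 V to 3 V, same step) it gives 19, selectors 0 through 18. Higher selectors within the 0x1f mask add no new voltage.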
      
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Reviewed-by: ChiYuan Huang <cy_huang@richtek.com>
      Link: https://lore.kernel.org/r/20210627080418.1718127-1-axel.lin@ingics.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction · 86f2a3e9
      Filipe Manana authored
      [ Upstream commit ecc64fab ]
      
      When checking if we need to log the new name of a renamed inode, we are
      checking if the inode and its parent inode have been logged before, and if
      not we don't log the new name. The check however is buggy, as it directly
      compares the logged_trans field of the inodes versus the ID of the current
      transaction. The problem is that logged_trans is a transient field, only
      stored in memory and never persisted in the inode item, so if an inode
      was logged before, evicted and reloaded, its logged_trans field is set to
      a value of 0, meaning the check will return false and the new name of the
      renamed inode is not logged. If the old parent directory was previously
      fsynced and we deleted the logged directory entries corresponding to the
      old name, we end up with a log that when replayed will delete the renamed
      inode.
      
      The following example triggers the problem:
      
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt
      
        $ mkdir /mnt/A
        $ mkdir /mnt/B
        $ echo -n "hello world" > /mnt/A/foo
      
        $ sync
      
        # Add some new file to A and fsync directory A.
        $ touch /mnt/A/bar
        $ xfs_io -c "fsync" /mnt/A
      
        # Now trigger inode eviction. We are only interested in triggering
        # eviction for the inode of directory A.
        $ echo 2 > /proc/sys/vm/drop_caches
      
        # Move foo from directory A to directory B.
        # This deletes the directory entries for foo in A from the log, and
        # does not add the new name for foo in directory B to the log, because
        # logged_trans of A is 0, which is less than the current transaction ID.
        $ mv /mnt/A/foo /mnt/B/foo
      
        # Now make an fsync to anything except A, B or any file inside them,
        # like for example create a file at the root directory and fsync this
        # new file. This syncs the log that contains all the changes done by
        # previous rename operation.
        $ touch /mnt/baz
        $ xfs_io -c "fsync" /mnt/baz
      
        <power fail>
      
        # Mount the filesystem and replay the log.
        $ mount /dev/sdc /mnt
      
        # Check the filesystem content.
        $ ls -1R /mnt
        /mnt/:
        A
        B
        baz
      
        /mnt/A:
        bar
      
        /mnt/B:
        $
      
        # File foo is gone, it's neither in A/ nor in B/.
      
      Fix this by using the inode_logged() helper at btrfs_log_new_name(), which
      safely checks if an inode was logged before in the current transaction.
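Why a direct comparison fails after eviction can be shown with a toy model. Everything below is a hypothetical illustration: `toy_inode`, `toy_naive_logged` and `toy_maybe_logged` are invented names, and the conservative rule is an assumption of this sketch, not the logic of the inode_logged() helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy in-memory inode; field names echo the commit text but the struct is hypothetical. */
struct toy_inode {
    uint64_t logged_trans;  /* in-memory only: reset to 0 when the inode is reloaded */
    uint64_t last_trans;    /* last modifying transaction; assumed to survive reload here */
};

/* Naive check: a reloaded inode (logged_trans == 0) always looks "not logged". */
static bool toy_naive_logged(const struct toy_inode *inode, uint64_t transid)
{
    return inode->logged_trans == transid;
}

/*
 * Conservative check (an assumption of this sketch): when logged_trans
 * was lost, treat an inode modified in the current transaction as
 * possibly logged, so the caller errs on the side of logging.
 */
static bool toy_maybe_logged(const struct toy_inode *inode, uint64_t transid)
{
    if (inode->logged_trans == transid)
        return true;
    return inode->logged_trans == 0 && inode->last_trans == transid;
}
```

In the reproducer, directory A corresponds to an inode whose `logged_trans` was reset to 0 by eviction, so the naive check reports it as never logged and the new name is skipped.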
      
      A test case for fstests will follow soon.
      
      CC: stable@vger.kernel.org # 4.14+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix race causing unnecessary inode logging during link and rename · b7f0fa21
      Filipe Manana authored
      [ Upstream commit de53d892 ]
      
      When we are doing a rename or a link operation for an inode that was logged
      in the previous transaction and that transaction is still committing, we
      have a time window where we incorrectly consider that the inode was logged
      previously in the current transaction and therefore decide to log it to
      update it in the log. The following steps give an example on how this
      happens during a link operation:
      
      1) Inode X is logged in transaction 1000, so its logged_trans field is set
         to 1000;
      
      2) Task A starts to commit transaction 1000;
      
      3) The state of transaction 1000 is changed to TRANS_STATE_UNBLOCKED;
      
      4) Task B starts a link operation for inode X, and as a consequence it
         starts transaction 1001;
      
      5) Task A is still committing transaction 1000, therefore the value stored
         at fs_info->last_trans_committed is still 999;
      
      6) Task B calls btrfs_log_new_name(), it reads a value of 999 from
         fs_info->last_trans_committed and because the logged_trans field of
         inode X has a value of 1000, the function does not return immediately,
         instead it proceeds to logging the inode, which should not happen
         because the inode was logged in the previous transaction (1000) and
         not in the current one (1001).
      
      This is not a functional problem, just wasted time and space logging an
      inode that does not need to be logged, contributing to higher latency
      for link and rename operations.
      
      So fix this by comparing the inodes' logged_trans field with the
      generation of the current transaction instead of comparing with the value
      stored in fs_info->last_trans_committed.
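The numbers from the steps above make the difference easy to check: logged_trans is 1000, last_trans_committed is still 999, and the current transaction is 1001. The two predicates below are a sketch with hypothetical names (`toy_skip_old`, `toy_skip_new`); the exact comparison operators are an assumption based on the description, not a quote of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old early-return test: compares against the last fully committed transaction. */
static bool toy_skip_old(uint64_t logged_trans, uint64_t last_trans_committed)
{
    return logged_trans <= last_trans_committed;
}

/* New early-return test: compares against the current transaction's generation. */
static bool toy_skip_new(uint64_t logged_trans, uint64_t cur_transid)
{
    return logged_trans < cur_transid;
}
```

With the old test, 1000 <= 999 is false, so the function goes on to log the inode. With the new test, 1000 < 1001 is true, so it returns early. The outcome no longer depends on how far the commit of transaction 1000 has progressed.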
      
      This case is often hit when running dbench for a long enough duration, as
      it does lots of rename operations.
      
      This patch belongs to a patch set that is comprised of the following
      patches:
      
        btrfs: fix race causing unnecessary inode logging during link and rename
        btrfs: fix race that results in logging old extents during a fast fsync
        btrfs: fix race that causes unnecessary logging of ancestor inodes
        btrfs: fix race that makes inode logging fallback to transaction commit
        btrfs: fix race leading to unnecessary transaction commit when logging inode
        btrfs: do not block inode logging for so long during transaction commit
      
      Performance results are mentioned in the change log of the last patch.
      
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: do not commit logs and transactions during link and rename operations · cb006da6
      Filipe Manana authored
      [ Upstream commit 75b463d2 ]
      
      Since commit d4682ba0 ("Btrfs: sync log after logging new name") we
      started to commit logs, and fallback to transaction commits when we failed
      to log the new names or commit the logs, after link and rename operations
      when the target inodes (or their parents) were previously logged in the
      current transaction. This was to avoid losing directories despite an
      explicit fsync on them when they are ancestors of some inode that got a
      new name logged, due to a link or rename operation. However that adds the
      cost of starting IO and waiting for it to complete, which can cause higher
      latencies for applications.
      
      Instead of doing that, just make sure that when we log a new name for an
      inode we don't mark any of its ancestors as logged, so that if any one
      does an fsync against any of them, without doing any other change on them,
      the fsync commits the log. This way we only pay the cost of a log commit
      (or a transaction commit if something goes wrong or a new block group was
      created) if the application explicitly asks to fsync any of the parent
      directories.
      
      Using dbench, which mixes several filesystem operations including renames,
      revealed some significant latency gains. The following script that uses
      dbench was used to test this:
      
        #!/bin/bash
      
        DEV=/dev/nvme0n1
        MNT=/mnt/btrfs
        MOUNT_OPTIONS="-o ssd -o space_cache=v2"
        MKFS_OPTIONS="-m single -d single"
        THREADS=16
      
        echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
        mkfs.btrfs -f $MKFS_OPTIONS $DEV
        mount $MOUNT_OPTIONS $DEV $MNT
      
        dbench -t 300 -D $MNT $THREADS
      
        umount $MNT
      
      The test was run on bare metal, no virtualization, on a box with 12 cores
      (Intel i7-8700), 64GB of RAM and using an NVMe device, with a kernel
      configuration that is the default of typical distributions (Debian in this
      case), without debug options enabled (kasan, kmemleak, slub debug, debug
      of page allocations, lock debugging, etc).
      
      Results before this patch:
      
       Operation      Count    AvgLat    MaxLat
       ----------------------------------------
       NTCreateX    10750455     0.011   155.088
       Close         7896674     0.001     0.243
       Rename         455222     2.158  1101.947
       Unlink        2171189     0.067   121.638
       Deltree           256     2.425     7.816
       Mkdir             128     0.002     0.003
       Qpathinfo     9744323     0.006    21.370
       Qfileinfo     1707092     0.001     0.146
       Qfsinfo       1786756     0.001    11.228
       Sfileinfo      875612     0.003    21.263
       Find          3767281     0.025     9.617
       WriteX        5356924     0.011   211.390
       ReadX        16852694     0.003     9.442
       LockX           35008     0.002     0.119
       UnlockX         35008     0.001     0.138
       Flush          753458     4.252  1102.249
      
      Throughput 1128.35 MB/sec  16 clients  16 procs  max_latency=1102.255 ms
      
      Results after this patch:
      
      16 clients, after
      
       Operation      Count    AvgLat    MaxLat
       ----------------------------------------
       NTCreateX    11471098     0.012   448.281
       Close         8426396     0.001     0.925
       Rename         485746     0.123   267.183
       Unlink        2316477     0.080    63.433
       Deltree           288     2.830    11.144
       Mkdir             144     0.003     0.010
       Qpathinfo    10397420     0.006    10.288
       Qfileinfo     1822039     0.001     0.169
       Qfsinfo       1906497     0.002    14.039
       Sfileinfo      934433     0.004     2.438
       Find          4019879     0.026    10.200
       WriteX        5718932     0.011   200.985
       ReadX        17981671     0.003    10.036
       LockX           37352     0.002     0.076
       UnlockX         37352     0.001     0.109
       Flush          804018     5.015   778.033
      
      Throughput 1201.98 MB/sec  16 clients  16 procs  max_latency=778.036 ms
      (+6.5% throughput, -29.4% max latency, -75.8% max rename latency)
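The summary percentages can be reproduced from the two result tables with a one-line relative-change helper. `toy_pct_change` is a hypothetical name. The rename figure matches the MaxLat column of the Rename row (1101.947 before, 267.183 after), not the AvgLat column.

```c
/* Relative change from before to after, in percent. */
static double toy_pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Calling it with (1128.35, 1201.98) for throughput, (1102.255, 778.036) for overall max latency and (1101.947, 267.183) for rename max latency gives values that round to the three percentages quoted above.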
      
      Test case generic/498 from fstests tests the scenario that the previously
      mentioned commit fixed.
      
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: delete duplicated words + other fixes in comments · 174c27d0
      Randy Dunlap authored
      [ Upstream commit 260db43c ]
      
      Delete repeated words in fs/btrfs/.
      {to, the, a, and old}
      and change "into 2 part" to "into 2 parts".
      
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. Aug 04, 2021