  1. Mar 21, 2013
    • Merge tag 'dm-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm · 85ab3c46
      Linus Torvalds authored
      Pull device-mapper fixes from Alasdair G Kergon:
       "Fix reported data loss with discards and thin snapshots; avoid a
        deadlock observed in dm verity; fix a race in the new dm cache code
        along with some other minor bugs; store the cache policy version on
        disk to make the stored hints format future-proof."
      
      * tag 'dm-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
        dm cache: policy ignore hints if generated by different version
        dm cache: policy change version from string to integer set
        dm cache: fix race in writethrough implementation
        dm cache: metadata clear dirty bits on clean shutdown
        dm cache: avoid calling policy destructor twice on error
        dm cache: detect cache_create failure
        dm cache: avoid 64 bit division on 32 bit
        dm verity: avoid deadlock
        dm thin: fix non power of two discard granularity calc
        dm thin: fix discard corruption
      85ab3c46
    • dm cache: policy ignore hints if generated by different version · ea2dd8c1
      Mike Snitzer authored
      
      
      When reading the dm cache metadata from disk, ignore the policy hints
      unless they were generated by the same major version number of the same
      policy module.
      
      The hints are considered to be private data belonging to the specific
      module that generated them and there is no requirement for them to make
      sense to different versions of the policy that generated them.
      Policy modules are all required to work fine if no previous hints are
      supplied (or if existing hints are lost).
      
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      ea2dd8c1
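
The version check described above can be sketched as follows. This is a minimal illustration, not the kernel code: `struct policy_version` and `hints_usable()` are hypothetical names, and the three-field layout is assumed from the major/minor/patchlevel split described in the next commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical layout: three unsigned numbers (major, minor, patchlevel). */
struct policy_version {
    unsigned major;
    unsigned minor;
    unsigned patchlevel;
};

/*
 * Hints read from disk are trusted only if they were written by the same
 * policy module (compared by name here) with the same major version.
 * Minor and patchlevel differences are tolerated in this sketch.
 */
static bool hints_usable(const char *disk_name,
                         const struct policy_version *disk,
                         const char *cur_name,
                         const struct policy_version *cur)
{
    return strcmp(disk_name, cur_name) == 0 && disk->major == cur->major;
}
```

When the check fails, the policy simply starts without hints, which the commit message says every policy module must handle.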
    • dm cache: policy change version from string to integer set · 4e7f506f
      Mike Snitzer authored
      
      
      Separate dm cache policy version string into 3 unsigned numbers
      corresponding to major, minor and patchlevel and store them at the end
      of the on-disk metadata so we know which version of the policy generated
      the hints in case a future version wants to use them differently.
      
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      4e7f506f
    • dm cache: fix race in writethrough implementation · e2e74d61
      Joe Thornber authored
      
      
      We have found a race in the optimisation used in the dm cache
      writethrough implementation.  Currently, dm core sends the cache target
      two bios, one for the origin device and one for the cache device and
      these are processed in parallel.  This patch avoids the race by
      changing the code back to a simpler (slower) implementation which
      processes the two writes in series, one after the other, until we can
      develop a complete fix for the problem.
      
      When the cache is in writethrough mode it needs to send WRITE bios to
      both the origin and cache devices.
      
      Previously we've been implementing this by having dm core query the
      cache target on every write to find out how many copies of the bio it
      wants.  The cache will ask for two bios if the block is in the cache,
      and one otherwise.
      
      The main problem with this is that it's racy.  At the time this check is
      made the bio hasn't yet been submitted and so isn't being taken into
      account when quiescing a block for migration (promotion or demotion).
      This means a single bio may be submitted when two were needed because
      the block has since been promoted to the cache (catastrophic), or two
      bios where only one is needed (harmless).
      
      I really don't want to start entering bios into the quiescing system
      (deferred_set) in the get_num_write_bios callback.  Instead this patch
      simplifies things: only one bio is submitted by the core, and it is
      written first to the origin and then to the cache device, in series.
      Obviously this will have a latency impact.
      
      deferred_writethrough_bios is introduced to record bios that must be
      later issued to the cache device from the worker thread.  This deferred
      submission, after the origin bio completes, is required given that we're
      in interrupt context (writethrough_endio).
      
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      e2e74d61
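
The serial write path described above can be modelled in userspace as follows. This is a rough sketch under stated assumptions: only the names `writethrough_endio` and `deferred_writethrough_bios` come from the commit message; the struct layouts, the fixed-size array, and `process_deferred_writethrough_bios()` are hypothetical simplifications with no locking or real I/O.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 16  /* hypothetical bound, for the sketch only */

/* Toy bio: just records which of the two writes has completed. */
struct bio {
    int origin_done;
    int cache_done;
};

struct cache {
    struct bio *deferred_writethrough_bios[MAX_DEFERRED];
    size_t nr_deferred;
};

/*
 * Completion handler for the origin write. It models interrupt context,
 * so it must not issue the cache write itself; it only records the bio
 * for the worker. Returns 0 on success, -1 if the toy queue is full.
 */
static int writethrough_endio(struct cache *c, struct bio *b)
{
    b->origin_done = 1;
    if (c->nr_deferred == MAX_DEFERRED)
        return -1;
    c->deferred_writethrough_bios[c->nr_deferred++] = b;
    return 0;
}

/* Worker: issue the second (cache) write for every deferred bio. */
static size_t process_deferred_writethrough_bios(struct cache *c)
{
    size_t n = c->nr_deferred;

    for (size_t i = 0; i < n; i++)
        c->deferred_writethrough_bios[i]->cache_done = 1;
    c->nr_deferred = 0;
    return n;
}
```

Because the cache write starts only after the origin write has completed, the two writes can no longer race with a migration decision, at the cost of the extra latency the commit message mentions.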
    • dm cache: metadata clear dirty bits on clean shutdown · 79ed9caf
      Joe Thornber authored
      
      
      When writing the dirty bitset to the metadata device on a clean
      shutdown, clear the dirty bits.  Previously they were left indicating
      the cache was dirty. This led to confusion about whether there really
      was dirty data in the cache or not.  (This was a harmless bug.)
      
      Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      79ed9caf
    • dm cache: avoid calling policy destructor twice on error · b978440b
      Heinz Mauelshagen authored
      
      
      If the cache policy's config values cannot be set, we must
      set the policy to NULL after destroying it in create_cache_policy()
      so we don't attempt to destroy it a second time later.
      
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      b978440b
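
The double-destroy problem above follows a common pattern, sketched here in userspace C. Only the name `create_cache_policy()` comes from the commit message; the struct layouts, `policy_destroy()`, `cache_destroy()`, the `set_config_ok` flag, and the call counter are hypothetical stand-ins used to make the behaviour observable.

```c
#include <assert.h>
#include <stdlib.h>

struct policy { int unused; };
struct cache { struct policy *policy; };

static int destroy_calls;  /* counts destructor invocations for the demo */

static void policy_destroy(struct policy *p)
{
    destroy_calls++;
    free(p);
}

/*
 * Returns 0 on success, -1 on failure. `set_config_ok` simulates whether
 * the policy's config values could be applied.
 */
static int create_cache_policy(struct cache *c, int set_config_ok)
{
    c->policy = malloc(sizeof(*c->policy));
    if (!c->policy)
        return -1;

    if (!set_config_ok) {
        policy_destroy(c->policy);
        c->policy = NULL;  /* the fix: later teardown must not see it */
        return -1;
    }
    return 0;
}

/* Generic teardown, run on any constructor error path. */
static void cache_destroy(struct cache *c)
{
    if (c->policy) {
        policy_destroy(c->policy);
        c->policy = NULL;
    }
}
```

Without the `c->policy = NULL` assignment, `cache_destroy()` would call the destructor on an already freed pointer.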
    • dm cache: detect cache_create failure · 617a0b89
      Heinz Mauelshagen authored
      
      
      Return error if cache_create() fails.
      
      A missing return check made cache_ctr continue even after an error in
      cache_create() resulting in the cache object being destroyed.  So a
      simple failure like an odd number of cache policy config value arguments
      would result in an oops.
      
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      617a0b89
    • dm cache: avoid 64 bit division on 32 bit · 414dd67d
      Joe Thornber authored
      
      
      Squash various 32bit link errors.
      
        >> on i386:
        >> drivers/built-in.o: In function `is_discarded_oblock':
        >> dm-cache-target.c:(.text+0x1ea28e): undefined reference to `__udivdi3'
        ...
      
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      414dd67d
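
The link error above appears because a plain `/` on 64-bit operands makes a 32-bit compiler emit a call to `__udivdi3`, which the kernel does not link against. The log does not show how the patch avoids this; the kernel offers helpers such as `do_div()` for the purpose. As an illustration only, here is a userspace sketch of a 64-by-32 division that never uses the 64-bit division operator. The function name `div64_32()` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide a 64-bit value by a 32-bit divisor using shift-and-subtract
 * long division. Returns the quotient; the remainder is stored through
 * `rem` when it is non-NULL.
 */
static uint64_t div64_32(uint64_t n, uint32_t base, uint32_t *rem)
{
    uint64_t q = 0;
    uint64_t r = 0;  /* running remainder, always < base after each step */

    assert(base != 0);
    for (int i = 63; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);  /* bring down the next bit */
        if (r >= base) {
            r -= base;
            q |= (uint64_t)1 << i;
        }
    }
    if (rem)
        *rem = (uint32_t)r;
    return q;
}
```

Real kernel code would use the architecture-optimised helpers rather than a bit-by-bit loop; the sketch only shows that the operation can be expressed without the operator that triggers the libgcc call.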
    • dm verity: avoid deadlock · 3b6b7813
      Mikulas Patocka authored
      
      
      A deadlock was found in the prefetch code in the dm verity map
      function.  This patch fixes this by transferring the prefetch
      to a worker thread and skipping it completely if kmalloc fails.
      
      If generic_make_request is called recursively, it queues the I/O
      request on the current->bio_list without making the I/O request
      and returns. The routine making the recursive call cannot wait
      for the I/O to complete.
      
      The deadlock occurs when one thread grabs the bufio_client
      mutex and waits for an I/O to complete but the I/O is queued
      on another thread's current->bio_list and is waiting to get
      the mutex held by the first thread.
      
      The fix recognises that prefetching is not essential.  If memory
      can be allocated, it queues the prefetch request to the worker thread,
      but if not, it does nothing.
      
      Signed-off-by: Paul Taysom <taysom@chromium.org>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Cc: stable@kernel.org
      3b6b7813
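
The "prefetch is optional" idea above can be sketched as follows. This is a userspace illustration, not the dm verity code: `struct prefetch_work`, `submit_prefetch()`, and the injectable allocator are hypothetical, and the step that hands the work item to a worker thread is left as a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical work item describing the range to prefetch. */
struct prefetch_work {
    unsigned long long block;
    unsigned n_blocks;
};

/* Allocator hook so that allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t);

/*
 * Build a prefetch request for a worker thread. If the allocation fails,
 * return NULL and do nothing: prefetching is only an optimisation, so
 * the map path carries on without it instead of waiting or retrying.
 */
static struct prefetch_work *submit_prefetch(alloc_fn alloc,
                                             unsigned long long block,
                                             unsigned n_blocks)
{
    struct prefetch_work *pw = alloc(sizeof(*pw));

    if (!pw)
        return NULL;
    pw->block = block;
    pw->n_blocks = n_blocks;
    /* A real implementation would queue pw to the worker thread here. */
    return pw;
}

/* Allocator that always fails, for exercising the skip path. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Moving the prefetch I/O out of the map function means it is no longer issued while a recursive `generic_make_request` call is deferring requests on `current->bio_list`, which is the situation the commit message identifies as the source of the deadlock.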
    • dm thin: fix non power of two discard granularity calc · 58051b94
      Joe Thornber authored
      
      
      Fix a discard granularity calculation to work for non power of 2 block sizes.
      
      In order for thinp to pass down discard bios to the underlying data
      device, the data device must have a discard granularity that is a
      factor of the thinp block size.  Originally this check was done by
      using bitops since the block_size was known to be a power of two.
      
      Introduced by commit f13945d7
      ("dm thin: support a non power of 2 discard_granularity").
      
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      58051b94
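
To see why a bit-operation check breaks for block sizes that are not a power of two, compare the two tests below. This is a sketch: the log does not show the original code, so `is_factor_bitops()` is one plausible form of a mask-based check, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mask-based test: correct only when n is a power of two, which is
 * guaranteed for every factor of a power-of-two block size.
 */
static bool is_factor_bitops(uint32_t block_size, uint32_t n)
{
    return n != 0 && (block_size & (n - 1)) == 0;
}

/* General test: n is a factor if the division leaves no remainder. */
static bool is_factor(uint32_t block_size, uint32_t n)
{
    return n != 0 && block_size % n == 0;
}
```

For a block size of 384 sectors and a granularity of 192, the remainder test accepts the pair (384 = 2 × 192) while the mask test rejects it, so discard passdown would be wrongly disabled.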
    • dm thin: fix discard corruption · f046f89a
      Joe Thornber authored
      
      
      Fix a bug in dm_btree_remove that could leave leaf values with incorrect
      reference counts.  The effect of this was that removal of a shared block
      could result in the space maps thinking the block was no longer used.
      More concretely, if you have a thin device and a snapshot of it, sending
      a discard to a shared region of the thin could corrupt the snapshot.
      
      Thinp uses a 2-level nested btree to store its mappings.  The first
      level is indexed by thin device, and the second level by logical
      block.
      
      Often when we're removing an entry in this mapping tree we need to
      rebalance nodes, which can involve shadowing them, possibly creating a
      copy if the block is shared.  If we do create a copy then children of
      that node need to have their reference counts incremented.  In this
      way reference counts percolate down the tree as shared trees diverge.
      
      The rebalance functions were incrementing the children at the
      appropriate time, but they were always assuming the children were
      internal nodes.  This meant the leaf values (in our case packed
      block/flags entries) were not being incremented.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      f046f89a
  2. Mar 20, 2013
    • Merge tag 'vfio-v3.9-rc4' of git://github.com/awilliam/linux-vfio · 2ffdd7e2
      Linus Torvalds authored
      Pull vfio fix from Alex Williamson.
      
      * tag 'vfio-v3.9-rc4' of git://github.com/awilliam/linux-vfio:
        vfio: include <linux/slab.h> for kmalloc
      2ffdd7e2
    • Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · ea4a0ce1
      Linus Torvalds authored
      Pull kvm fixes from Marcelo Tosatti.
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)
        KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
        KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796)
        KVM: x86: fix deadlock in clock-in-progress request handling
        KVM: allow host header to be included even for !CONFIG_KVM
      ea4a0ce1
    • Merge tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs · 10b38669
      Linus Torvalds authored
      Pull XFS fixes from Ben Myers:
      
       - Fix for a potential infinite loop which was introduced in commit
         4d559a3b ("xfs: limit speculative prealloc near ENOSPC
         thresholds")
      
       - Fix for the return type of xfs_iomap_eof_prealloc_initial_size from
         commit a1e16c26 ("xfs: limit speculative prealloc size on sparse
         files")
      
       - Fix for a failed buffer readahead causing subsequent callers to fail
         incorrectly
      
      * tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs:
        xfs: ensure we capture IO errors correctly
        xfs: fix xfs_iomap_eof_prealloc_initial_size type
        xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
      10b38669
    • PCI: Use ROM images from firmware only if no other ROM source available · 547b5246
      Matthew Garrett authored
      
      
      Mantas Mikulėnas reported that his graphics hardware failed to
      initialise after commit f9a37be0 ("x86: Use PCI setup data").
      
      The aim of this commit was to ensure that ROM images were available on
      some Apple systems that don't expose the GPU ROM via any other source.
      In this case, UEFI appears to have provided a broken ROM image that we
      were using even though there was a perfectly valid ROM available via
      other sources.  The simplest way to handle this seems to be to just
      re-order pci_map_rom() and leave any firmware-supplied ROM to last.
      
      Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
      Tested-by: Mantas Mikulėnas <grawity@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      547b5246
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 5c7c3361
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "Just some minor fixups, a sunsu console setup panic cure, and
        recognition of a Fujitsu sun4v cpu."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: remove unused "config BITS"
        sparc: delete "if !ULTRA_HAS_POPULATION_COUNT"
        sparc64: correctly recognize SPARC64-X chips
        sparc,leon: fix GRPCI2 device0 PCI config space access
        sunsu: Fix panic in case of nonexistent port at "console=ttySY" cmdline option
      5c7c3361
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 · e7489622
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Fix !SMP build error.
      
       - Fix padding computation in struct ucontext (no ABI change).
      
       - Minor clean-up after the signal patches (unused var).
      
       - Two old Kconfig options clean-up.
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
        arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS
        arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED
        arm64: fix padding computation in struct ucontext
        arm64: Fix build error with !SMP
        arm64: Removed unused variable in compat_setup_rt_frame()
      e7489622
    • sparc: remove unused "config BITS" · f58b20bd
      Paul Bolle authored
      
      
      sparc's asm/module.h got removed in commit
      786d35d4 ("Make most arch asm/module.h
      files use asm-generic/module.h"). That removed the only two uses of this
      Kconfig symbol. So we can remove its entry too.
      
      > From arch/sparc/Makefile:
      >     ifeq ($(CONFIG_SPARC32),y)
      >     [...]
      >
      >     [...]
      >     export BITS    := 32
      >     [...]
      >
      >     else
      >     [...]
      >
      >     [...]
      >     export BITS   := 64
      >     [...]
      >
      > So $(BITS) is set depending on whether CONFIG_SPARC32 is set or not.
      > Using $(BITS) in sparc's Makefiles is not using CONFIG_BITS. That
      > doesn't count as usage of "config BITS".
      
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f58b20bd
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7b1b3fd7
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang.
      
       2) Insufficient space reserved for bridge netlink values, fix from
          Stephen Hemminger.
      
       3) Some dst_neigh_lookup*() callers don't interpret error pointer
          correctly, fix from Zhouyi Zhou.
      
       4) Fix transport match in SCTP active_path loops, from Xugeng Zhang.
      
       5) Fix qeth driver handling of multi-order SKB frags, from Frank
          Blaschka.
      
       6) fec driver is missing napi_disable() call, resulting in crashes on
          unload, from Georg Hofmann.
      
       7) Don't try to handle PMTU events on a listening socket, fix from Eric
          Dumazet.
      
       8) Fix timestamp location calculations in IP option processing, from
          David Ward.
      
       9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig
          tests, from Denis V Lunev.
      
      10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings.
      
      11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd
          Bergmann.
      
      12) bnx2x statistics don't handle 4GB rollover correctly, fix from
          Maciej Żenczykowski.
      
      13) Openvswitch bug fixes for vport del/new error reporting, missing
          genlmsg_end() call in netlink processing, and mis-parsing of
          LLC/SNAP ethernet types.  From Rich Lane.
      
      14) SKB pfmemalloc state should only be propagated from the head page of
          a compound page, fix from Pavel Emelyanov.
      
      15) Fix link handling in tg3 driver for 5715 chips when autonegotiation
          is disabled.  From Nithin Sujir.
      
      16) Fix inverted test of cpdma_check_free_tx_desc return value in
          davinci_emac driver, from Mugunthan V N.
      
      17) vlan_depth is incorrectly calculated in skb_network_protocol(), from
          Li RongQing.
      
      18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM
          device mode backwards compat in cdc_ncm driver.  From Bjørn Mork.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
        inet: limit length of fragment queue hash table bucket lists
        qeth: Fix scatter-gather regression
        qeth: Fix invalid router settings handling
        qeth: delay feature trace
        tcp: dont handle MTU reduction on LISTEN socket
        bnx2x: fix occasional statistics off-by-4GB error
        vhost/net: fix heads usage of ubuf_info
        bridge: Add support for setting BR_ROOT_BLOCK flag.
        bnx2x: add missing napi deletion in error path
        drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc()
        ethernet/tulip: DE4x5 needs VIRT_TO_BUS
        isdn: hisax: netjet requires VIRT_TO_BUS
        net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility
        rtnetlink: Mask the rta_type when range checking
        Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally"
        Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
        smsc75xx: configuration help incorrectly mentions smsc95xx
        net: fec: fix missing napi_disable call
        net: fec: restart the FEC when PHY speed changes
        skb: Propagate pfmemalloc on skb from head page only
        ...
      7b1b3fd7
    • sparc: delete "if !ULTRA_HAS_POPULATION_COUNT" · e0b20296
      Paul Bolle authored
      
      
      Commit 2d78d4be ("[PATCH] bitops:
      sparc64: use generic bitops") made the default of GENERIC_HWEIGHT depend
      on !ULTRA_HAS_POPULATION_COUNT. But since there's no Kconfig symbol with
      that name, this always evaluates to true. Delete this dependency.
      
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0b20296
    • KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) · a2c118bf
      Andy Honig authored
      
      
      If the guest specifies an IOAPIC_REG_SELECT with an invalid value and follows
      that with a read of the IOAPIC_REG_WINDOW, KVM does not properly validate
      that request.  ioapic_read_indirect contains an
      ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in
      non-debug builds.  In recent kernels this allows a guest to cause a kernel
      oops by reading invalid memory.  In older kernels (pre-3.3) this allows a
      guest to read from large ranges of host memory.
      
      Tested: tested against apic unit tests.
      
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      a2c118bf
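
The problem above is that an `ASSERT` compiled out of non-debug builds provides no protection, so the index needs an explicit runtime check. The sketch below illustrates that idea in userspace. The struct layout, the function name `read_redir()`, the value 24 for `IOAPIC_NUM_PINS`, and the choice to return all ones for an out-of-range index are assumptions made for the example, not details taken from the log.

```c
#include <assert.h>
#include <stdint.h>

#define IOAPIC_NUM_PINS 24  /* assumed value, for the sketch only */

/* Toy IOAPIC: one 64-bit redirection entry per pin. */
struct ioapic {
    uint64_t redirtbl[IOAPIC_NUM_PINS];
};

/*
 * Read one 32-bit half of the redirection entry selected by the guest.
 * The index is guest-controlled, so it is validated on every call
 * instead of being guarded by a debug-only assertion.
 */
static uint32_t read_redir(const struct ioapic *io, uint32_t redir_index,
                           int high_half)
{
    uint64_t entry;

    if (redir_index < IOAPIC_NUM_PINS)
        entry = io->redirtbl[redir_index];
    else
        entry = ~0ULL;  /* assumed placeholder for an invalid register */

    return high_half ? (uint32_t)(entry >> 32) : (uint32_t)entry;
}
```

With the check in place, an invalid selector never turns into an out-of-bounds array access, whatever the build configuration.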
    • KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) · 0b79459b
      Andy Honig authored
      
      
      There is a potential use after free issue with the handling of
      MSR_KVM_SYSTEM_TIME.  If the guest specifies a GPA in movable or removable
      memory such as frame buffers then KVM might continue to write to that
      address even after it's removed via KVM_SET_USER_MEMORY_REGION.  KVM pins
      the page in memory so it's unlikely to cause an issue, but if the user
      space component re-purposes the memory previously used for the guest, then
      the guest will be able to corrupt that memory.
      
      Tested: Tested against kvmclock unit test
      
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      0b79459b
    • KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) · c300aa64
      Andy Honig authored
      
      
      If the guest sets the GPA of the time_page so that the request to update the
      time straddles a page then KVM will write onto an incorrect page.  The
      write is done by using kmap atomic to get a pointer to the page for the time
      structure and then performing a memcpy to that page starting at an offset
      that the guest controls.  Well-behaved guests always provide a 32-byte aligned
      address; however, a malicious guest could use this to corrupt host kernel
      memory.
      
      Tested: Tested against kvmclock unit test.
      
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      c300aa64
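
The page-straddling condition above comes down to simple offset arithmetic, sketched here in userspace. The 4096-byte page size, the constant names, and both function names are assumptions for illustration; the 32-byte figure is taken from the alignment mentioned in the commit message, and the log does not say which check the patch actually adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define TIME_INFO_SIZE   32u    /* assumed size of the time structure */

/* True if writing `len` bytes at guest physical address `gpa` would run
 * past the end of the page that contains `gpa`. */
static bool straddles_page(uint64_t gpa, size_t len)
{
    uint64_t offset = gpa & (SKETCH_PAGE_SIZE - 1);

    return offset + len > SKETCH_PAGE_SIZE;
}

/* One way to rule out straddling: accept only addresses aligned to the
 * structure size, so a 32-byte structure always fits within one page. */
static bool time_gpa_acceptable(uint64_t gpa)
{
    return (gpa & (TIME_INFO_SIZE - 1)) == 0;
}
```

Because 4096 is a multiple of 32, any 32-byte aligned address leaves room for the whole structure inside its page, which is why well-behaved guests never trigger the overflow.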
    • arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS · 79207206
      Paul Bolle authored
      
      
      The Kconfig entry for DEBUG_ERRORS is a verbatim copy of the former arm
      entry for that symbol. It got removed in v2.6.39 because it wasn't
      actually used anywhere. There are still no users of DEBUG_ERRORS so
      remove this entry too.
      
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      [catalin.marinas@arm.com: removed option from defconfig]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      79207206
  3. Mar 19, 2013