  1. Dec 18, 2010
    • arch/tile: handle rt_sigreturn() more cleanly · 81711cee
      Chris Metcalf authored

      The current tile rt_sigreturn() syscall pattern uses the common idiom
      of loading up pt_regs with all the saved registers from the time of
      the signal, then anticipating the fact that we will clobber the ABI
      "return value" register (r0) as we return from the syscall by setting
      the rt_sigreturn return value to whatever random value was in the pt_regs
      for r0.
      
      However, this breaks in our 64-bit kernel when running "compat" tasks,
      since we always sign-extend the "return value" register to properly
      handle returned pointers that are in the upper 2GB of the 32-bit compat
      address space.  Doing this to the sigreturn path then causes occasional
      random corruption of the 64-bit r0 register.
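
To illustrate the failure mode described above, here is a minimal sketch in C. The helper name `sketch_compat_return` is hypothetical and is not taken from the tile sources; it only models what happens when a 64-bit register value is passed through a 32-bit sign-extension step.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the tile sources: models a compat
 * syscall-return path that sign-extends the low 32 bits of the value
 * about to be placed in the 64-bit r0 register.
 */
static int64_t sketch_compat_return(uint64_t r0)
{
    /* Keep only the low 32 bits, then sign-extend them to 64 bits. */
    return (int64_t)(int32_t)(uint32_t)r0;
}
```

For a genuine compat pointer such as 0x80000000 this is the intended behavior, but if the value is really a saved r0 being restored by sigreturn, any bits above bit 31 are replaced by copies of bit 31, so the restored register can differ from the one that was saved.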
      
      Instead, we stop doing the crazy "load the return-value register"
      hack in sigreturn.  We already have some sigreturn-specific assembly
      code that we use to pass the pt_regs pointer to C code.  We extend that
      code to also set the link register to point to a spot a few instructions
      after the usual syscall return address so we don't clobber the saved r0.
      Now it no longer matters what the rt_sigreturn syscall returns, and the
      pt_regs structure can be cleanly and completely reloaded.
      
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • arch/tile: handle CLONE_SETTLS in copy_thread(), not user space · bc4cf2bb
      Chris Metcalf authored

      Previously we were just setting up the "tp" register in the
      new task as started by clone() in libc.  However, this is not
      quite right, since in principle a signal might be delivered to
      the new task before it had its TLS set up.  (Of course, this race
      window still exists for resetting the libc getpid() cached value
      in the new task, in principle.  But in any case, we are now doing
      this exactly the way all other architectures do it.)
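
The idea of setting TLS in the kernel at thread-creation time can be sketched roughly as follows. This is an illustration only: the struct, the function, and the way the TLS value is passed in are all hypothetical, and the real copy_thread() has a different signature. Only the numeric value of the CLONE_SETTLS flag mirrors the Linux definition.

```c
/* Same numeric value as Linux's CLONE_SETTLS; renamed to avoid clashes. */
#define SKETCH_CLONE_SETTLS 0x00080000UL

/* Hypothetical stand-in for the saved register state of a task. */
struct sketch_regs {
    unsigned long tp;   /* thread pointer (TLS base) */
};

/*
 * Hypothetical model of the kernel-side setup: the child starts with a
 * copy of the parent's registers, and its thread pointer is set before
 * the child can run if the caller asked for CLONE_SETTLS.
 */
static void sketch_copy_thread(unsigned long clone_flags,
                               unsigned long tls,
                               const struct sketch_regs *parent,
                               struct sketch_regs *child)
{
    *child = *parent;
    if (clone_flags & SKETCH_CLONE_SETTLS)
        child->tp = tls;
}
```

Because the thread pointer is already in place when the child first runs, a signal handler that runs early in the child sees valid TLS.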
      
      This change is important for 2.6.37 since the tile glibc we will
      be submitting upstream will not set TLS in user space any more,
      so it will only work on a kernel that has this fix.  It should
      also be taken for 2.6.36.x in the stable tree if possible.
      
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: stable <stable@kernel.org>
  2. Dec 15, 2010
  3. Dec 14, 2010
    • Btrfs: prevent RAID level downgrades when space is low · 83a50de9
      Chris Mason authored

      The extent allocator has code that allows us to fill
      allocations from any available block group, even if it doesn't
      match the raid level we've requested.
      
      This was put in because adding a new drive to a filesystem
      made with the default mkfs options actually upgrades the metadata from
      single spindle dup to full RAID1.
      
      But, the code also allows us to allocate from a raid0 chunk when we
      really want a raid1 or raid10 chunk.  This can cause big trouble because
      mkfs creates a small (4MB) raid0 chunk for data and metadata which then
      goes unused for raid1/raid10 installs.
      
      The allocator will happily wander in and allocate from that chunk when
      things get tight, which is not correct.
      
      The fix here is to make sure that we provide duplication when the
      caller has asked for it.  It does allow the dups to be any raid level,
      which preserves the dup->raid1 upgrade abilities.
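
The rule described above can be sketched as a small predicate. The flag names and bit values below are hypothetical and are not the real Btrfs definitions; the sketch also assumes that mismatches which do not lose redundancy stay allowed, as the message implies.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the real Btrfs values. */
#define SK_RAID0   (1u << 0)
#define SK_RAID1   (1u << 1)
#define SK_DUP     (1u << 2)
#define SK_RAID10  (1u << 3)

/* Profiles that keep more than one copy of the data. */
#define SK_REDUNDANT (SK_RAID1 | SK_DUP | SK_RAID10)

/* Returns 1 if an allocation wanting `wanted` may use this block group. */
static int sketch_group_usable(uint32_t group_flags, uint32_t wanted)
{
    /* An exact match is always fine. */
    if ((group_flags & wanted) == wanted)
        return 1;

    /* Caller asked for duplication: the group must provide some form of it. */
    if ((wanted & SK_REDUNDANT) && !(group_flags & SK_REDUNDANT))
        return 0;

    /* Any other mismatch (for example dup served from raid1) is allowed. */
    return 1;
}
```

With this check a raid1 request is refused by a raid0 group, while a dup request can still be served from a raid1 group.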
      
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: account for missing devices in RAID allocation profiles · cd02dca5
      Chris Mason authored

      When we mount in RAID degraded mode without adding a new device to
      replace the failed one, we can end up using the wrong RAID flags for
      allocations.
      
      This results in strange combinations of block groups (raid1 in a raid10
      filesystem) and corruptions when we try to allocate blocks from single
      spindle chunks on drives that are actually missing.
      
      The first device has two small 4MB chunks in it that mkfs creates and
      these are usually unused in a raid1 or raid10 setup.  But, in -o degraded,
      the allocator will fall back to these because the mask of desired raid groups
      isn't correct.
      
      The fix here is to count the missing devices as we build up the list
      of devices in the system.  This count is used when picking the
      raid level to make sure we continue using the same levels that were
      in place before we lost a drive.
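
A rough sketch of why the missing-device count matters when picking a RAID level. All names and bit values are hypothetical, and the device-count thresholds (one device rules out raid0/raid1, fewer than four rules out raid10) are assumptions made for illustration, not taken from this commit.

```c
#include <stdint.h>

/* Hypothetical profile bits for illustration; not the real Btrfs values. */
#define PROF_RAID0   (1u << 0)
#define PROF_RAID1   (1u << 1)
#define PROF_RAID10  (1u << 2)

/*
 * Drop profiles that the device count cannot support. Counting missing
 * devices keeps the profile that was in use before a drive was lost.
 */
static uint32_t sketch_reduce_profile(uint32_t flags,
                                      unsigned int rw_devices,
                                      unsigned int missing_devices)
{
    unsigned int num_devices = rw_devices + missing_devices;

    if (num_devices == 1)
        flags &= ~(PROF_RAID0 | PROF_RAID1);   /* assumed threshold */
    if (num_devices < 4)
        flags &= ~PROF_RAID10;                 /* assumed threshold */
    return flags;
}
```

In this model a two-device raid1 filesystem mounted degraded (one present, one missing) keeps PROF_RAID1, whereas ignoring the missing device would strip it and fall back to a single-spindle profile.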
      
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: EIO when we fail to read tree roots · 68433b73
      Chris Mason authored

      If we just get a plain IO error when we read tree roots, the code
      wasn't properly sending that error up the chain.  This allowed mounts to
      continue when they should have failed, and allowed operations
      on partially set up root structs.  The end result was usually oopsen
      on spinlocks that hadn't been spun up correctly.
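
The error-propagation pattern described above can be sketched as follows. Every name here is hypothetical, and the `io_ok` parameter merely simulates whether the read succeeded; this is not the Btrfs code path itself.

```c
#include <errno.h>

/* Hypothetical stand-in for a root structure that needs initialization. */
struct sketch_root {
    int initialized;
};

/* Hypothetical reader: returns 0 on success, -EIO on a failed read. */
static int sketch_read_tree_root(int io_ok, struct sketch_root *root)
{
    if (!io_ok)
        return -EIO;
    root->initialized = 1;
    return 0;
}

/* The caller passes the error up instead of continuing the mount. */
static int sketch_mount(int io_ok, struct sketch_root *root)
{
    int ret = sketch_read_tree_root(io_ok, root);

    if (ret)
        return ret;   /* fail the mount; root is left untouched */
    return 0;
}
```

Returning the error immediately means no later code operates on a root whose fields were never set up.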
      
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  4. Dec 11, 2010
  5. Dec 10, 2010
  6. Dec 09, 2010
  7. Dec 08, 2010