  1. Apr 06, 2009
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param · cab4e4c4
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param:
        module: use strstarts()
        strstarts: helper function for !strncmp(str, prefix, strlen(prefix))
        arm: allow usage of string functions in linux/string.h
        module: don't use stop_machine on module load
        module: create a request_module_nowait()
        module: include other structures in module version check
        module: remove the SHF_ALLOC flag on the __versions section.
        module: clarify the force-loading taint message.
        module: Export symbols needed for Ksplice
        Ksplice: Add functions for walking kallsyms symbols
        module: remove module_text_address()
        module: __module_address
        module: Make find_symbol return a struct kernel_symbol
        kernel/module.c: fix an unused goto label
        param: fix charp parameters set via sysfs
      
      Fix trivial conflicts in kernel/extable.c manually.
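
      The strstarts() entry in the list above describes a wrapper for the
      !strncmp(str, prefix, strlen(prefix)) idiom. A minimal user-space sketch of
      that idiom follows; the name strstarts_sketch is hypothetical and this is
      not claimed to be the kernel's exact source.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when str begins with prefix (an empty prefix always matches). */
static inline bool strstarts_sketch(const char *str, const char *prefix)
{
	assert(str != NULL && prefix != NULL);
	return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

      A caller can then write strstarts_sketch(name, "module.") instead of
      repeating the prefix and its length by hand.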
    • Merge branch 'core/debugobjects' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 5412b539
      Linus Torvalds authored
      * 'core/debugobjects' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        debugobjects: delay free of internal objects
        debugobjects: replace static objects when slab cache becomes available
        debug_objects: add boot-parameter toggle to turn object debugging off again
    • Merge branch 'printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · e4c393fd
      Linus Torvalds authored
      * 'printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        printk: correct the behavior of printk_timed_ratelimit()
        vsprintf: unify the format decoding layer for its 3 users, cleanup
        fix regression from "vsprintf: unify the format decoding layer for its 3 users"
        vsprintf: fix bug in negative value printing
        vsprintf: unify the format decoding layer for its 3 users
        vsprintf: add binary printf
        printk: introduce printk_once()
      
      Fix trivial conflicts (printk_once vs log_buf_kexec_setup() added near
      each other) in include/linux/kernel.h.
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc · 0a053e8c
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (42 commits)
        atmel-mci: fix sdc_reg typo
        tmio_mmc: add maintainer
        mmc: Add OpenFirmware bindings for SDHCI driver
        sdhci: Add quirk for forcing maximum block size to 2048 bytes
        sdhci: Add quirk for controllers that need IRQ re-init after reset
        sdhci: Add quirk for controllers that need small delays for PIO
        sdhci: Add set_clock callback and a quirk for nonstandard clocks
        sdhci: Add get_{max,timeout}_clock callbacks
        sdhci: Add support for hosts reporting inverted write-protect state
        sdhci: Add support for card-detection polling
        sdhci: Enable only relevant (DMA/PIO) interrupts during transfers
        sdhci: Split card-detection IRQs management from sdhci_init()
        sdhci: Add support for bus-specific IO memory accessors
        mmc_spi: adjust for delayed data token response
        omap_hsmmc: Wait for SDBP
        omap_hsmmc: Fix MMC3 dma
        omap_hsmmc: Disable SDBP at suspend
        omap_hsmmc: Do not prefix slot name
        omap_hsmmc: Allow cover switch to cause rescan
        omap_hsmmc: Add 8-bit bus width mode support
        ...
  2. Apr 05, 2009
    • Make non-compat preadv/pwritev use native register size · 601cc11d
      Linus Torvalds authored
      
      
      Instead of always splitting the file offset into 32-bit 'high' and 'low'
      parts, just split them into the largest natural word-size - which in C
      terms is 'unsigned long'.
      
      This allows 64-bit architectures to avoid the unnecessary 32-bit
      shifting and masking for native format (while the compat interfaces will
      obviously always have to do it).
      
      This also changes the order of 'high' and 'low' to be "low first".  Why?
      Because when we have it like this, the 64-bit system calls now don't use
      the "pos_high" argument at all, and it makes more sense for the native
      system call to simply match the user-mode prototype.
      
      This results in a much more natural calling convention, and allows the
      compiler to generate much more straightforward code.  On x86-64, we now
      generate
      
              testq   %rcx, %rcx      # pos_l
              js      .L122   #,
              movq    %rcx, -48(%rbp) # pos_l, pos
      
      from the C source
      
              loff_t pos = pos_from_hilo(pos_h, pos_l);
      	...
              if (pos < 0)
                      return -EINVAL;
      
      and the 'pos_h' register isn't even touched.  It used to generate code
      like
      
              mov     %r8d, %r8d      # pos_low, pos_low
              salq    $32, %rcx       #, tmp71
              movq    %r8, %rax       # pos_low, pos.386
              orq     %rcx, %rax      # tmp71, pos.386
              js      .L122   #,
              movq    %rax, -48(%rbp) # pos.386, pos
      
      which isn't _that_ horrible, but it does show how the natural word size
      is just a more sensible interface (same arguments will hold in the user
      level glibc wrapper function, of course, so the kernel side is just half
      of the equation!)
      
      Note: in all cases the user code wrapper can again be the same. You can
      just do
      
      	#define HALF_BITS (sizeof(unsigned long)*4)
      	__syscall(PWRITEV, fd, iov, count, offset, (offset >> HALF_BITS) >> HALF_BITS);
      
      or something like that.  That way the user mode wrapper will also be
      nicely passing in a zero (it won't actually have to do the shifts, the
      compiler will understand what is going on) for the last argument.
      
      And that is a good idea, even if nobody will necessarily ever care: if
      we ever do move to a 128-bit lloff_t, this particular system call might
      be left alone.  Of course, that will be the least of our worries if we
      really ever need to care, so this may not be worth really caring about.
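
      The split and rebuild described above can be sketched in user space as
      follows. This is an illustration under stated assumptions, not the kernel
      source: int64_t stands in for loff_t, and the names pos_from_hilo_sketch,
      offset_low_word, offset_high_word and check_roundtrip are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Half the width of unsigned long, in bits: 16 on 32-bit, 32 on 64-bit. */
#define HALF_LONG_BITS (sizeof(unsigned long) * CHAR_BIT / 2)

/*
 * Kernel side: rebuild a 64-bit offset from two native words.
 * Shifting twice by half the word size avoids an undefined full-width
 * shift; on 64-bit the high word is shifted out and only 'low' remains.
 */
static int64_t pos_from_hilo_sketch(unsigned long high, unsigned long low)
{
	uint64_t pos = ((uint64_t)high << HALF_LONG_BITS) << HALF_LONG_BITS;

	return (int64_t)(pos | low);
}

/* User side: the low word a wrapper would pass for the offset. */
static unsigned long offset_low_word(int64_t offset)
{
	return (unsigned long)offset;
}

/* User side: the high word; this is always zero when long is 64 bits. */
static unsigned long offset_high_word(int64_t offset)
{
	return (unsigned long)(((uint64_t)offset >> HALF_LONG_BITS) >> HALF_LONG_BITS);
}

/* Round-trip check: what the wrapper splits, the kernel side must rebuild. */
static void check_roundtrip(int64_t offset)
{
	assert(pos_from_hilo_sketch(offset_high_word(offset),
				    offset_low_word(offset)) == offset);
}
```

      The double shift is the point of the design: the same source works for
      both word sizes, and on 64-bit the compiler can drop the high word.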
      
      [ Fixed for lost 'loff_t' cast noticed by Andrew Morton ]
      
      Acked-by: Gerd Hoffmann <kraxel@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-api@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. Apr 04, 2009
    • Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 6bb59750
      Linus Torvalds authored
      * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86, mtrr: remove debug message
        x86: disable stack-protector for __restore_processor_state()
        x86: fix is_io_mapping_possible() build warning on i386 allnoconfig
        x86, setup: compile with -DDISABLE_BRANCH_PROFILING
        x86/dma: unify definition of pci_unmap_addr* and pci_unmap_len macros
        x86, mm: fix misuse of debug_kmap_atomic
        x86: remove duplicated code with pcpu_need_numa()
        x86,percpu: fix inverted NUMA test in setup_pcpu_remap()
        x86: signal: check sas_ss_size instead of sas_ss_flags()
    • Merge branch 'core-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 09f38dc1
      Linus Torvalds authored
      * 'core-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        ptrace: remove a useless goto
    • Merge branch 'stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 30a39e0e
      Linus Torvalds authored
      * 'stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        symbols, stacktrace: look up init symbols after module symbols
    • Merge branch 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · c7edad5f
      Linus Torvalds authored
      * 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
    • Merge branch 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · b1dbb679
      Linus Torvalds authored
      * 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        s390: remove arch specific smp_send_stop()
        panic: clean up kernel/panic.c
        panic, smp: provide smp_send_stop() wrapper on UP too
        panic: decrease oops_in_progress only after having done the panic
        generic-ipi: eliminate WARN_ON()s during oops/panic
        generic-ipi: cleanups
        generic-ipi: remove CSD_FLAG_WAIT
        generic-ipi: remove kmalloc()
        generic IPI: simplify barriers and locking
    • Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 492f59f5
      Linus Torvalds authored
      * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]
        lockdep: remove duplicate CONFIG_DEBUG_LOCKDEP definitions
        lockdep: require framepointers for x86
        lockdep: remove extra "irq" string
        lockdep: fix incorrect state name
    • x86, mtrr: remove debug message · c5c67c7c
      Ingo Molnar authored
      
      
      The MTRR code grew a new debug message which triggers commonly:
      
      [   40.142276]   get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
      [   40.142280]   get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
      [   40.142284]   get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back
      [   40.142311]   get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
      [   40.142314]   get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
      [   40.142317]   get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back
      
      Remove this annoyance.
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · f945b7ab
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: allow private mappings of "direct_io" files
        fuse: allow kernel to access "direct_io" files
    • LANANA: Change of management and resync · 04c860c1
      Alan Cox authored
      
      
      Bring the devices.txt back into some relationship with reality. Update the
      documentation a bit.
      
      Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 5fba0925
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: remove compat stuff
        HID: constify arrays of struct apple_key_translation
        HID: add support for Kye/Genius Ergo 525V
        HID: Support Apple mini aluminum keyboard
        HID: support for Kensington slimblade device
        HID: DragonRise game controller force feedback driver
        HID: add support for another version of 0e8f:0003 device in hid-pl
        HID: fix race between usb_register_dev() and hiddev_open()
        HID: bring back possibility to specify vid/pid ignore on module load
        HID: make HID_DEBUG defaults consistent
        HID: autosuspend -- fix lockup of hid on reset
        HID: hid_reset_resume() needs to be defined only when CONFIG_PM is set
        HID: fix USB HID devices after STD with autosuspend
        HID: do not try to compile PM code with CONFIG_PM unset
        HID: autosuspend support for USB HID
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial · 811158b1
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
        trivial: Update my email address
        trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
        trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
        trivial: Fix misspelling of "Celsius".
        trivial: remove unused variable 'path' in alloc_file()
        trivial: fix a pdlfush -> pdflush typo in comment
        trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
        trivial: wusb: Storage class should be before const qualifier
        trivial: drivers/char/bsr.c: Storage class should be before const qualifier
        trivial: h8300: Storage class should be before const qualifier
        trivial: fix where cgroup documentation is not correctly referred to
        trivial: Give the right path in Documentation example
        trivial: MTD: remove EOL from MODULE_DESCRIPTION
        trivial: Fix typo in bio_split()'s documentation
        trivial: PWM: fix of #endif comment
        trivial: fix typos/grammar errors in Kconfig texts
        trivial: Fix misspelling of firmware
        trivial: cgroups: documentation typo and spelling corrections
        trivial: Update contact info for Jochen Hein
        trivial: fix typo "resgister" -> "register"
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6 · 4e76c5cc
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6: (21 commits)
        xtensa: we don't need to include asm/io.h
        xtensa: only build platform or variant if they contain a Makefile
        xtensa: make startup code discardable
        xtensa: ccount clocksource
        xtensa: remove platform rtc hooks
        xtensa: use generic sched_clock()
        xtensa: platform: s6105
        xtensa: let platform override KERNELOFFSET
        xtensa: s6000 variant
        xtensa: s6000 variant core definitions
        xtensa: variant irq set callbacks
        xtensa: variant-specific code
        xtensa: nommu support
        xtensa: add flat support
        xtensa: enforce slab alignment to maximum register width
        xtensa: cope with ram beginning at higher addresses
        xtensa: don't make bootmem bitmap larger than required
        xtensa: fix init_bootmem_node() argument order
        xtensa: use correct stack pointer for stack traces
        xtensa: beat Kconfig into shape
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable · b9834717
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
        Btrfs: BUG to BUG_ON changes
        Btrfs: remove dead code
        Btrfs: remove dead code
        Btrfs: fix typos in comments
        Btrfs: remove unused ftrace include
        Btrfs: fix __ucmpdi2 compile bug on 32 bit builds
        Btrfs: free inode struct when btrfs_new_inode fails
        Btrfs: fix race in worker_loop
        Btrfs: add flushoncommit mount option
        Btrfs: notreelog mount option
        Btrfs: introduce btrfs_show_options
        Btrfs: rework allocation clustering
        Btrfs: Optimize locking in btrfs_next_leaf()
        Btrfs: break up btrfs_search_slot into smaller pieces
        Btrfs: kill the pinned_mutex
        Btrfs: kill the block group alloc mutex
        Btrfs: clean up find_free_extent
        Btrfs: free space cache cleanups
        Btrfs: unplug in the async bio submission threads
        Btrfs: keep processing bios for a given bdev if our proc is batching
    • x86, PAT: Remove duplicate memtype reserve in pci mmap · 5a3ae276
      Suresh Siddha authored
      
      
      The pci mmap code has been doing a memtype reserve for a while now. Recently we
      added memtype tracking in remap_pfn_range, and the pci code indirectly calls
      remap_pfn_range. So we don't need separate tracking in the pci code
      anymore, which means a patch that removes ~50 lines of code :-).
      
      Also, recently we found out that the pci tracking is not working as we expect
      it to work in some cases. Specifically, userlevel X mmap of pci, with some
      recent version of X, is having a problem with vm_page_prot getting reset.
      The pci tracking uses vm_page_prot to pass on the protection type from parent
      to child during fork.
      a) Parent does a pci mmap
      b) We look at PAT and get either UC_MINUS or WC mapping for parent
      c) Store that mapping type in vma vm_page_prot for future use
      d) This thread does a fork
      e) Fork results in mmap_ops ->open for the child process
      f) We get the vm_page_prot from vma and reserve that type for the child process
      
      But between c) and e) above, the vma vm_page_prot is getting reset to zero.
      This results in the PAT reserve failing at the time of fork, as reported here:
      http://marc.info/?l=linux-kernel&m=123858163103240&w=2
      
      This cleanup makes the above problem go away as we do not depend on
      vm_page_prot in our PAT code anymore.
      
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 · 78609a81
      Linus Torvalds authored
      * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
        ocfs2: recover orphans in offline slots during recovery and mount
        ocfs2: Pagecache usage optimization on ocfs2
        ocfs2: fix rare stale inode errors when exporting via nfs
        ocfs2/dlm: Tweak mle_state output
        ocfs2/dlm: Do not purge lockres that is being migrated dlm_purge_lockres()
        ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entry
        ocfs2/dlm: Show the number of lockres/mles in dlm_state
        ocfs2/dlm: dlm_set_lockres_owner() and dlm_change_lockres_owner() inlined
        ocfs2/dlm: Improve lockres counts
        ocfs2/dlm: Track number of mles
        ocfs2/dlm: Indent dlm_cleanup_master_list()
        ocfs2/dlm: Activate dlm->master_hash for master list entries
        ocfs2/dlm: Create and destroy the dlm->master_hash
        ocfs2/dlm: Refactor dlm_clean_master_list()
        ocfs2/dlm: Clean up struct dlm_lock_name
        ocfs2/dlm: Encapsulate adding and removing of mle from dlm->master_list
        ocfs2: Optimize inode group allocation by recording last used group.
        ocfs2: Allocate inode groups from global_bitmap.
        ocfs2: Optimize inode allocation by remembering last group
        ocfs2: fix leaf start calculation in ocfs2_dx_dir_rebalance()
        ...
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx · 133e2a31
      Linus Torvalds authored
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
        dma: Add SoF and EoF debugging to ipu_idmac.c, minor cleanup
        dw_dmac: add cyclic API to DW DMA driver
        dmaengine: Add privatecnt to revert DMA_PRIVATE property
        dmatest: add dma interrupts and callbacks
        dmatest: add xor test
        dmaengine: allow dma support for async_tx to be toggled
        async_tx: provide __async_inline for HAS_DMA=n archs
        dmaengine: kill some unused headers
        dmaengine: initialize tx_list in dma_async_tx_descriptor_init
        dma: i.MX31 IPU DMA robustness improvements
        dma: improve section assignment in i.MX31 IPU DMA driver
        dma: ipu_idmac driver cosmetic clean-up
        dmaengine: fail device registration if channel registration fails
    • ocfs2: recover orphans in offline slots during recovery and mount · 9140db04
      Srinivas Eeda authored
      
      
      During recovery, a node recovers orphans in its slot and the dead node(s). But
      if the dead nodes were holding orphans in offline slots, they will be left
      unrecovered.
      
      If the dead node is the last one to die and is holding orphans in other slots
      and is the first one to mount, then it only recovers its own slot, which
      leaves orphans in offline slots.
      
      This patch queues complete_recovery to clean orphans for all offline slots
      during mount and node recovery.
      
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Acked-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Pagecache usage optimization on ocfs2 · 1fca3a05
      Hisashi Hifumi authored
      
      
      A page can have multiple buffers, and even if a page is not uptodate, some buffers
      can be uptodate in a pagesize != blocksize environment.
      This aop checks that all buffers which correspond to the part of a file
      that we want to read are uptodate. If so, we do not have to issue actual
      read IO to the HDD even if the page is not uptodate, because the portion we
      want to read is uptodate.
      The "block_is_partially_uptodate" function is already used by ext2/3/4.
      With the following patch, random read/write mixed workloads or random read after
      random write workloads can be optimized and we can get a performance improvement.
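
      The check described above can be sketched as a walk over the blocks that
      overlap the requested byte range. This is a simplified user-space model
      under assumptions, not the kernel's block_is_partially_uptodate(): the
      struct page_sketch type, the fixed 4096-byte page and the function name
      range_is_uptodate are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for a page whose blocks each track an uptodate bit. */
struct page_sketch {
	size_t blocksize;                      /* must divide SKETCH_PAGE_SIZE */
	bool uptodate[SKETCH_PAGE_SIZE / 512]; /* one flag per block, 512-byte minimum */
};

/*
 * Return true when every block overlapping [from, from + count) is uptodate,
 * so the read can be served without IO even if the page as a whole is not.
 */
static bool range_is_uptodate(const struct page_sketch *pg, size_t from, size_t count)
{
	assert(pg->blocksize >= 512 && SKETCH_PAGE_SIZE % pg->blocksize == 0);
	if (count == 0 || from >= SKETCH_PAGE_SIZE)
		return false;

	/* Clamp the end of the range to the page boundary. */
	size_t to = (count > SKETCH_PAGE_SIZE - from) ? SKETCH_PAGE_SIZE : from + count;
	size_t first = from / pg->blocksize;
	size_t last = (to - 1) / pg->blocksize;

	for (size_t i = first; i <= last; i++)
		if (!pg->uptodate[i])
			return false;
	return true;
}
```

      With 1024-byte blocks and only blocks 1 and 2 uptodate, a read of bytes
      1024..3071 passes the check while a read starting at byte 1000 does not.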
      
      Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: fix rare stale inode errors when exporting via nfs · 6ca497a8
      wengang wang authored
      
      
      For nfs exporting, ocfs2_get_dentry() returns the dentry for fh.
      ocfs2_get_dentry() may read from disk when the inode is not in memory,
      without any cross cluster lock. This leads to the file system loading a
      stale inode.
      
      This patch fixes the above problem.
      
      The solution is that, when the inode is not in memory, we get the cluster
      lock (PR) of the alloc inode from which the inode in question is allocated (this
      causes the node on which the deletion is done to sync the alloc inode) before reading
      out the inode itself. Then we check the bitmap in the group (from which the inode in
      question is allocated) to see if the bit is clear. If it's clear then it's
      stale. If the bit is set, we then check the generation as the existing code
      does.
      
      We have to read out the inode in question from disk first to know its alloc
      slot and alloc bit. And if it's not stale we read it out using ocfs2_iget().
      The second read should then be from cache.
      
      We also have to add a per superblock nfs_sync_lock to cover the lock for the
      alloc inode and that for the inode in question. This is because ocfs2_get_dentry()
      and ocfs2_delete_inode() lock them in reverse order. nfs_sync_lock is locked
      in EX mode in ocfs2_get_dentry() and in PR mode in ocfs2_delete_inode(), so
      that multiple ocfs2_delete_inode() calls can run concurrently in the normal case.
      
      [mfasheh@suse.com: build warning fixes and comment cleanups]
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Acked-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Tweak mle_state output · 9405dccf
      Sunil Mushran authored
      
      
      The debugfs file, mle_state, now prints the largest number of mles
      in one hash link.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Do not purge lockres that is being migrated dlm_purge_lockres() · 516b7e52
      Sunil Mushran authored
      
      
      This patch attempts to fix a fine race between purging and migration.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entry · 7141514b
      Sunil Mushran authored
      
      
      This patch removes struct dlm_lock_name and adds the entries directly
      to struct dlm_master_list_entry. Under the new scheme, all mles, whether
      or not they are backed by a lockres, will have the name populated in mle->mname.
      This allows us to get rid of code that was figuring out the location of
      the mle name.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Show the number of lockres/mles in dlm_state · e64ff146
      Sunil Mushran authored
      
      
      This patch shows the number of lockres' and mles in the debugfs file, dlm_state.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: dlm_set_lockres_owner() and dlm_change_lockres_owner() inlined · 7d62a978
      Sunil Mushran authored
      
      
      This patch inlines dlm_set_lockres_owner() and dlm_change_lockres_owner().
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Improve lockres counts · 6800791a
      Sunil Mushran authored
      
      
      This patch replaces the lockres counts that tracked the number of
      locally and remotely mastered lockres' with a current and total count. The
      total count is the number of lockres' that have been created since the dlm
      domain was created.
      
      The number of locally and remotely mastered counts can be computed using
      the locking_state output.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Track number of mles · 2041d8fd
      Sunil Mushran authored
      
      
      The lifetime of a mle is limited to the duration of the lockres mastery
      process. While typically this lifetime is fairly short, we have noticed
      the number of mles explode under certain circumstances. This patch tracks
      the number of each type of mle and should help us determine
      how best to speed up the mastery process.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Indent dlm_cleanup_master_list() · 67ae1f06
      Sunil Mushran authored
      
      
      The previous patch explicitly did not indent dlm_cleanup_master_list()
      so as to make the patch readable. This patch properly indents the
      function.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Activate dlm->master_hash for master list entries · 2ed6c750
      Sunil Mushran authored
      
      
      With this patch, the mles are stored in a hash and not a simple list.
      This should improve the mle lookup time when the number of outstanding
      masteries is large.
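
      The gain from moving mles off a single list can be sketched with a small
      chained hash keyed by lock name. Everything here is a hypothetical
      illustration: the bucket count, the djb2-style hash and the names
      mle_sketch, master_hash_sketch, mle_insert and mle_find are assumptions,
      and the dlm's actual hash function and structures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MLE_HASH_BUCKETS 64u            /* hypothetical size, not the dlm's */

struct mle_sketch {
	const char *name;               /* lock name used as the key */
	struct mle_sketch *next;        /* chain within one bucket */
};

struct master_hash_sketch {
	struct mle_sketch *bucket[MLE_HASH_BUCKETS];
};

/* Simple djb2-style string hash reduced to a bucket index. */
static unsigned int mle_hash_name(const char *name)
{
	unsigned int h = 5381;

	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h % MLE_HASH_BUCKETS;
}

/* Push the mle onto the front of its bucket's chain. */
static void mle_insert(struct master_hash_sketch *mh, struct mle_sketch *mle)
{
	unsigned int b = mle_hash_name(mle->name);

	mle->next = mh->bucket[b];
	mh->bucket[b] = mle;
}

/* Lookup walks one bucket's chain instead of every outstanding mle. */
static struct mle_sketch *mle_find(const struct master_hash_sketch *mh, const char *name)
{
	assert(name != NULL);
	for (struct mle_sketch *m = mh->bucket[mle_hash_name(name)]; m; m = m->next)
		if (strcmp(m->name, name) == 0)
			return m;
	return NULL;
}
```

      With many outstanding masteries, a lookup touches only the entries that
      share a bucket, which is the improvement the message expects.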
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Create and destroy the dlm->master_hash · e2b66ddc
      Sunil Mushran authored
      
      
      This patch adds code to create and destroy the dlm->master_hash.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Refactor dlm_clean_master_list() · c2cd4a44
      Sunil Mushran authored
      
      
      This patch refactors dlm_clean_master_list() so as to make it
      easier to convert the mle list to a hash.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Clean up struct dlm_lock_name · f77a9a78
      Sunil Mushran authored
      
      
      For master mle, the name is stored in the attached lockres in struct qstr.
      For block and migration mle, the name is stored inline in struct dlm_lock_name.
      This patch attempts to make struct dlm_lock_name look like a struct qstr. While
      we could use struct qstr, we don't because we want to avoid having to malloc
      and free the lockname string as the mle's lifetime is fairly short.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2/dlm: Encapsulate adding and removing of mle from dlm->master_list · 1c084577
      Sunil Mushran authored
      
      
      This patch encapsulates adding and removing of the mle from the
      dlm->master_list. This patch is part of the series of patches that
      converts the mle list to a mle hash.
      
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Optimize inode group allocation by recording last used group. · feb473a6
      Tao Ma authored
      
      
      In ocfs2, the block group search looks for the "emptiest" group
      to allocate from. So if the allocator has many equally (or almost
      equally) empty groups, new block groups will tend to get spread
      out amongst them.
      
      So we add osb_inode_alloc_group in ocfs2_super to record the last
      used inode allocation group.
      For more details, please see
      http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.
      
      I have done some basic tests and the results show a tenfold improvement on
      some cold-cache stat workloads.
      
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Allocate inode groups from global_bitmap. · 60ca81e8
      Tao Ma authored
      
      
      Inode groups used to be allocated from the local alloc file,
      but since we want all inodes to be contiguous enough, we
      will try to allocate them directly from global_bitmap.
      
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Optimize inode allocation by remembering last group · 13821151
      Tao Ma authored
      
      
      In ocfs2, the inode block search looks for the "emptiest" inode
      group to allocate from. So if an inode alloc file has many equally
      (or almost equally) empty groups, new inodes will tend to get
      spread out amongst them, which in turn can put them all over the
      disk. This is undesirable because directory operations on conceptually
      "nearby" inodes force a large number of seeks.
      
      So we add ip_last_used_group in core directory inodes, which records
      the last used allocation group. Another field named ip_last_used_slot
      is also added in case inode stealing happens. When claiming a new inode,
      we pass in the directory's inode so that the allocation can use this
      information.
      For more details, please see
      http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.
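
      The policy described above can be sketched as "reuse the directory's last
      group while it has room, otherwise fall back to the emptiest group". This
      is a simplified model under assumptions, not the ocfs2 allocator: the
      types group_sketch and dir_sketch and the functions emptiest_group and
      claim_inode are hypothetical, and slot stealing is left out.

```c
#include <assert.h>
#include <stddef.h>

struct group_sketch {
	unsigned int free_inodes;
};

/* Hypothetical per-directory state; -1 means no group has been used yet. */
struct dir_sketch {
	int last_used_group;
};

/* Baseline policy from the message: pick the group with the most free inodes. */
static int emptiest_group(const struct group_sketch *groups, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++)
		if (groups[i].free_inodes > 0 &&
		    (best < 0 || groups[i].free_inodes > groups[best].free_inodes))
			best = (int)i;
	return best;
}

/*
 * Prefer the directory's last used group while it still has room, so inodes
 * created in the same directory stay close together; otherwise fall back.
 * Returns the chosen group index, or -1 when every group is full.
 */
static int claim_inode(struct group_sketch *groups, size_t n, struct dir_sketch *dir)
{
	int g = dir->last_used_group;

	assert(g < (int)n);
	if (g < 0 || groups[g].free_inodes == 0)
		g = emptiest_group(groups, n);
	if (g < 0)
		return -1;
	groups[g].free_inodes--;
	dir->last_used_group = g;
	return g;
}
```

      With free counts {2, 5, 5}, three claims from one directory all land in
      group 1, whereas the pure "emptiest" policy would alternate between
      groups 1 and 2.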
      
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>