Skip to content
  1. Feb 20, 2014
    • Thomas Gleixner's avatar
      x86, tsc: Fallback to normal calibration if fast MSR calibration fails · 5f0e0309
      Thomas Gleixner authored
      
      
      If we cannot calibrate TSC via MSR based calibration
      try_msr_calibrate_tsc() stores zero to fast_calibrate and returns that
      to the caller. This value gets then propagated further to clockevents
      code resulting division by zero oops like the one below:
      
       divide error: 0000 [#1] PREEMPT SMP
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W    3.13.0+ #47
       task: ffff880075508000 ti: ffff880075506000 task.ti: ffff880075506000
       RIP: 0010:[<ffffffff810aec14>]  [<ffffffff810aec14>] clockevents_config.part.3+0x24/0xa0
       RSP: 0000:ffff880075507e58  EFLAGS: 00010246
       RAX: ffffffffffffffff RBX: ffff880079c0cd80 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff
       RBP: ffff880075507e70 R08: 0000000000000001 R09: 00000000000000be
       R10: 00000000000000bd R11: 0000000000000003 R12: 000000000000b008
       R13: 0000000000000008 R14: 000000000000b010 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff880079c00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff880079fff000 CR3: 0000000001c0b000 CR4: 00000000001006f0
       Stack:
        ffff880079c0cd80 000000000000b008 0000000000000008 ffff880075507e88
        ffffffff810aecb0 ffff880079c0cd80 ffff880075507e98 ffffffff81030168
        ffff880075507ed8 ffffffff81d1104f 00000000000000c3 0000000000000000
       Call Trace:
        [<ffffffff810aecb0>] clockevents_config_and_register+0x20/0x30
        [<ffffffff81030168>] setup_APIC_timer+0xc8/0xd0
        [<ffffffff81d1104f>] setup_boot_APIC_clock+0x4cc/0x4d8
        [<ffffffff81d0f5de>] native_smp_prepare_cpus+0x3dd/0x3f0
        [<ffffffff81d02ee9>] kernel_init_freeable+0xc3/0x205
        [<ffffffff8177c910>] ? rest_init+0x90/0x90
        [<ffffffff8177c91e>] kernel_init+0xe/0x120
        [<ffffffff8178deec>] ret_from_fork+0x7c/0xb0
        [<ffffffff8177c910>] ? rest_init+0x90/0x90
      
      Prevent this from happening by:
       1) Modifying try_msr_calibrate_tsc() to return calibration value or zero
          if it fails.
       2) Check this return value in native_calibrate_tsc() and in case of zero
          fallback to use normal non-MSR based calibration.
      
      [mw: Added subject and changelog]
      
      Reported-and-tested-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Bin Gao <bin.gao@linux.intel.com>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/1392810750-18660-1-git-send-email-mika.westerberg@linux.intel.com
      
      
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      5f0e0309
  2. Feb 19, 2014
  3. Feb 18, 2014
  4. Feb 17, 2014
    • Chen Gang's avatar
      avr32: add generic vga.h to Kbuild · d7668f9d
      Chen Gang authored
      
      
      Need add generic "vga.h", or can not pass building for allmodconfig,
      the related error:
      
          CC [M]  drivers/gpu/drm/drm_irq.o
        In file included from include/linux/vgaarb.h:34,
                         from drivers/gpu/drm/drm_irq.c:42:
        include/video/vga.h:22:21: error: asm/vga.h: No such file or directory
      
      Signed-off-by: default avatarChen Gang <gang.chen.5i5j@gmail.com>
      Acked-by: default avatarHans-Christian Egtvedt <hegtvedt@cisco.com>
      d7668f9d
    • Chen Gang's avatar
      avr32: add generic ioremap_wc() definition in io.h · 1bbce4f3
      Chen Gang authored
      
      
      Need generic ioremap_wc(), or can not pass compiling with allmodconfig,
      the related error:
      
          CC [M]  drivers/gpu/drm/drm_bufs.o
        drivers/gpu/drm/drm_bufs.c: In function 'drm_addmap_core':
        drivers/gpu/drm/drm_bufs.c:217: error: implicit declaration of function 'ioremap_wc'
        drivers/gpu/drm/drm_bufs.c:218: warning: assignment makes pointer from integer without a cast
      
      Signed-off-by: default avatarChen Gang <gang.chen.5i5j@gmail.com>
      Acked-by: default avatarHans-Christian Egtvedt <hegtvedt@cisco.com>
      1bbce4f3
    • Chen Gang's avatar
      avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 use · 8d80390c
      Chen Gang authored
      For avr32 cross compiler, do not define '__linux__' internally, so it
      will cause issue with allmodconfig.
      
      The related error:
      
          CC [M]  fs/coda/psdev.o
        In file included from include/linux/coda.h:64,
                         from fs/coda/psdev.c:45:
        include/uapi/linux/coda.h:221: error: expected specifier-qualifier-list before 'u_quad_t'
      
      The related toolchain version (which only download, not re-compile):
      
        [root@gchen linux-next]# /upstream/toolchain/download/avr32-gnu-toolchain-linux_x86/bin/avr32-gcc -v
        Using built-in specs.
        Target: avr32
        Configured with: /data2/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/src/gcc/configure --target=avr32 --host=i686-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-languages=c,c++ --disable-nls --disable-libssp --disable-libstdcxx-pch --with-dwarf2 --enable-version-specific-runtime-libs --disable-shared --enable-doc --with-mpfr-lib=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/lib --with-mpfr-include=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/include --with-gmp=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --with-mpc=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-__cxa_atexit --disable-shared --with-newlib --with-pkgversion=AVR_32_bit_GNU_Toolchain_3.4.2_435 --with-bugurl=http://www
      
      
      .atmel.com/avr
        Thread model: single
        gcc version 4.4.7 (AVR_32_bit_GNU_Toolchain_3.4.2_435)
      
      Signed-off-by: default avatarChen Gang <gang.chen.5i5j@gmail.com>
      Acked-by: default avatarHans-Christian Egtvedt <hegtvedt@cisco.com>
      Cc: stable@vger.kernel.org
      8d80390c
    • Paul Gortmaker's avatar
      avr32: fix missing module.h causing build failure in mimc200/fram.c · 5745d6a4
      Paul Gortmaker authored
      
      
      Causing this:
      
      In file included from arch/avr32/boards/mimc200/fram.c:13:
      include/linux/miscdevice.h:51: error: field 'list' has incomplete type
      include/linux/miscdevice.h:55: error: expected specifier-qualifier-list before 'mode_t'
      arch/avr32/boards/mimc200/fram.c:42: error: 'THIS_MODULE' undeclared here (not in a function)
      
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: default avatarSergei Trofimovich <slyfox@gentoo.org>
      Acked-by: default avatarHans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: stable@vger.kernel.org
      5745d6a4
    • Theodore Ts'o's avatar
      ext4: don't leave i_crtime.tv_sec uninitialized · 19ea8060
      Theodore Ts'o authored
      If the i_crtime field is not present in the inode, don't leave the
      field uninitialized.
      
      Fixes: ef7f3835
      
       ("ext4: Add nanosecond timestamps")
      Reported-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Tested-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      19ea8060
    • Gavin Shan's avatar
      powerpc/eeh: Disable EEH on reboot · 66f9af83
      Gavin Shan authored
      
      
      We possiblly detect EEH errors during reboot, particularly in kexec
      path, but it's impossible for device drivers and EEH core to handle
      or recover them properly.
      
      The patch registers one reboot notifier for EEH and disable EEH
      subsystem during reboot. That means the EEH errors is going to be
      cleared by hardware reset or second kernel during early stage of
      PCI probe.
      
      Signed-off-by: default avatarGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      66f9af83
    • Gavin Shan's avatar
      powerpc/eeh: Cleanup on eeh_subsystem_enabled · 2ec5a0ad
      Gavin Shan authored
      
      
      The patch cleans up variable eeh_subsystem_enabled so that we needn't
      refer the variable directly from external. Instead, we will use
      function eeh_enabled() and eeh_set_enable() to operate the variable.
      
      Signed-off-by: default avatarGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      2ec5a0ad
    • Gavin Shan's avatar
      powerpc/powernv: Rework EEH reset · 5b2e198e
      Gavin Shan authored
      
      
      When doing reset in order to recover the affected PE, we issue
      hot reset on PE primary bus if it's not root bus. Otherwise, we
      issue hot or fundamental reset on root port or PHB accordingly.
      For the later case, we didn't cover the situation where PE only
      includes root port and it potentially causes kernel crash upon
      EEH error to the PE.
      
      The patch reworks the logic of EEH reset to improve the code
      readability and also avoid the kernel crash.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: default avatarGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5b2e198e
    • Anton Blanchard's avatar
      powerpc: Use unstripped VDSO image for more accurate profiling data · 24b659a1
      Anton Blanchard authored
      
      
      We are seeing a lot of hits in the VDSO that are not resolved by perf.
      A while(1) gettimeofday() loop shows the issue:
      
      27.64%  [vdso]  [.] 0x000000000000060c
      22.57%  [vdso]  [.] 0x0000000000000628
      16.88%  [vdso]  [.] 0x0000000000000610
      12.39%  [vdso]  [.] __kernel_gettimeofday
       6.09%  [vdso]  [.] 0x00000000000005f8
       3.58%  test    [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18
       2.94%  [vdso]  [.] __kernel_datapage_offset
       2.90%  test    [.] main
      
      We are using a stripped VDSO image which means only symbols with
      relocation info can be resolved. There isn't a lot of point to
      stripping the VDSO, the debug info is only about 1kB:
      
      4680 arch/powerpc/kernel/vdso64/vdso64.so
      5815 arch/powerpc/kernel/vdso64/vdso64.so.dbg
      
      By using the unstripped image, we can resolve all the symbols in the
      VDSO and the perf profile data looks much better:
      
      76.53%  [vdso]  [.] __do_get_tspec
      12.20%  [vdso]  [.] __kernel_gettimeofday
       5.05%  [vdso]  [.] __get_datapage
       3.20%  test    [.] main
       2.92%  test    [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18
      
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      24b659a1
    • Anton Blanchard's avatar
      powerpc: Link VDSOs at 0x0 · a0a4419e
      Anton Blanchard authored
      
      
      perf is failing to resolve symbols in the VDSO. A while (1)
      gettimeofday() loop shows:
      
      93.99%  [vdso]  [.] 0x00000000000005e0
       3.12%  test    [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18
       2.81%  test    [.] main
      
      The reason for this is that we are linking our VDSO shared libraries
      at 1MB, which is a little weird. Even though this is uncommon, Alan
      points out that it is valid and we should probably fix perf userspace.
      
      Regardless, I can't see a reason why we are doing this. The code
      is all position independent and we never rely on the VDSO ending
      up at 1M (and we never place it there on 64bit tasks).
      
      Changing our link address to 0x0 fixes perf VDSO symbol resolution:
      
      73.18%  [vdso]  [.] 0x000000000000060c
      12.39%  [vdso]  [.] __kernel_gettimeofday
       3.58%  test    [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18
       2.94%  [vdso]  [.] __kernel_datapage_offset
       2.90%  test    [.] main
      
      We still have some local symbol resolution issues that will be
      fixed in a subsequent patch.
      
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a0a4419e
    • Aneesh Kumar K.V's avatar
      mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit · 56eecdb9
      Aneesh Kumar K.V authored
      
      
      Archs like ppc64 doesn't do tlb flush in set_pte/pmd functions when using
      a hash table MMU for various reasons (the flush is handled as part of
      the PTE modification when necessary).
      
      ppc64 thus doesn't implement flush_tlb_range for hash based MMUs.
      
      Additionally ppc64 require the tlb flushing to be batched within ptl locks.
      
      The reason to do that is to ensure that the hash page table is in sync with
      linux page table.
      
      We track the hpte index in linux pte and if we clear them without flushing
      hash and drop the ptl lock, we can have another cpu update the pte and can
      end up with duplicate entry in the hash table, which is fatal.
      
      We also want to keep set_pte_at simpler by not requiring them to do hash
      flush for performance reason. We do that by assuming that set_pte_at() is
      never *ever* called on a PTE that is already valid.
      
      This was the case until the NUMA code went in which broke that assumption.
      
      Fix that by introducing a new pair of helpers to set _PAGE_NUMA in a
      way similar to ptep/pmdp_set_wrprotect(), with a generic implementation
      using set_pte_at() and a powerpc specific one using the appropriate
      mechanism needed to keep the hash table in sync.
      
      Acked-by: default avatarMel Gorman <mgorman@suse.de>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Signed-off-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      56eecdb9
    • Aneesh Kumar K.V's avatar
    • Aneesh Kumar K.V's avatar
      powerpc/mm: Add new "set" flag argument to pte/pmd update function · 88247e8d
      Aneesh Kumar K.V authored
      
      
      pte_update() is a powerpc-ism used to change the bits of a PTE
      when the access permission is being restricted (a flush is
      potentially needed).
      
      It uses atomic operations on when needed and handles the hash
      synchronization on hash based processors.
      
      It is currently only used to clear PTE bits and so the current
      implementation doesn't provide a way to also set PTE bits.
      
      The new _PAGE_NUMA bit, when set, is actually restricting access
      so it must use that function too, so this change adds the ability
      for pte_update() to also set bits.
      
      We will use this later to set the _PAGE_NUMA bit.
      
      Acked-by: default avatarMel Gorman <mgorman@suse.de>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Signed-off-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      88247e8d
    • Kleber Sacilotto de Souza's avatar
      powerpc/pseries: Add Gen3 definitions for PCIE link speed · 49d9684a
      Kleber Sacilotto de Souza authored
      
      
      Rev3 of the PCI Express Base Specification defines a Supported Link
      Speeds Vector where the bit definitions within this field are:
      
      Bit 0 - 2.5 GT/s
      Bit 1 - 5.0 GT/s
      Bit 2 - 8.0 GT/s
      
      This vector definition is used by the platform firmware to export the
      maximum and current link speeds of the PCI bus via the
      "ibm,pcie-link-speed-stats" device-tree property.
      
      This patch updates pseries_root_bridge_prepare() to detect Gen3
      speed buses (defined by 0x04).
      
      Signed-off-by: default avatarKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      49d9684a
    • Kleber Sacilotto de Souza's avatar
      powerpc/pseries: Fix regression on PCI link speed · b020cc6c
      Kleber Sacilotto de Souza authored
      Commit 5091f0c9
      
       (powerpc/pseries: Fix PCIE link speed endian issue)
      introduced a regression on the PCI link speed detection using the
      device-tree property. The ibm,pcie-link-speed-stats property is composed
      of two 32-bit integers, the first one being the maxinum link speed and
      the second the current link speed. The changes introduced by the
      aforementioned commit are considering just the first integer.
      
      Fix this issue by changing how the property is accessed, using the
      helper functions to properly access the array of values. The explicit
      byte swapping is not needed anymore here, since it's done by the helper
      functions.
      
      Signed-off-by: default avatarKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b020cc6c
    • Kevin Hao's avatar
      powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack · 1a18a664
      Kevin Hao authored
      Guenter Roeck has got the following call trace on a p2020 board:
        Kernel stack overflow in process eb3e5a00, r1=eb79df90
        CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
        task: eb3e5a00 ti: c0616000 task.ti: ef440000
        NIP: c003a420 LR: c003a410 CTR: c0017518
        REGS: eb79dee0 TRAP: 0901   Not tainted (3.13.0-rc8-juniper-00146-g19eca00)
        MSR: 00029000 <CE,EE,ME>  CR: 24008444  XER: 00000000
        GPR00: c003a410 eb79df90 eb3e5a00 00000000 eb05d900 00000001 65d87646 00000000
        GPR08: 00000000 020b8000 00000000 00000000 44008442
        NIP [c003a420] __do_softirq+0x94/0x1ec
        LR [c003a410] __do_softirq+0x84/0x1ec
        Call Trace:
        [eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable)
        [eb79dfe0] [c003a970] irq_exit+0xbc/0xc8
        [eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c
        [ef441f20] [c00046a8] do_IRQ+0x8c/0xf8
        [ef441f40] [c000e7f4] ret_from_except+0x0/0x18
        --- Exception: 501 at 0xfcda524
            LR = 0x10024900
        Instruction dump:
        7c781b78 3b40000a 3a73b040 543c0024 3a800000 3b3913a0 7ef5bb78 48201bf9
        5463103a 7d3b182e 7e89b92e 7c008146 <3ba00000> 7e7e9b78 48000014 57fff87f
        Kernel panic - not syncing: kernel stack overflow
        CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
        Call Trace:
      
      The reason is that we have used the wrong register to calculate the
      ksp_limit in commit cbc9565e
      
       (powerpc: Remove ksp_limit on ppc64).
      Just fix it.
      
      As suggested by Benjamin Herrenschmidt, also add the C prototype of the
      function in the comment in order to avoid such kind of errors in the
      future.
      
      Cc: stable@vger.kernel.org # 3.12
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarKevin Hao <haokexin@gmail.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1a18a664
    • Linus Torvalds's avatar
      Linux 3.14-rc3 · 6d0abeca
      Linus Torvalds authored
      6d0abeca
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 3962dfbe
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "We have a small collection of fixes in my for-linus branch.
      
        The big thing that stands out is a revert of a new ioctl.  Users
        haven't shipped yet in btrfs-progs, and Dave Sterba found a better way
        to export the information"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: use right clone root offset for compressed extents
        btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105
        Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol
        Btrfs: fix max_inline mount option
        Btrfs: fix a lockdep warning when cleaning up aborted transaction
        Revert "btrfs: add ioctl to export size of global metadata reservation"
      3962dfbe
    • Linus Torvalds's avatar
      Merge tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 4302a875
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
       "Fix booting on PPC boards.  Changes to of_match_node matching caused
        the serial port on some PPC boards to stop working.  Reverted the
        change and reimplement to split matching between new style compatible
        only matching and fallback to old matching algorithm"
      
      * tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        of: search the best compatible match first in __of_match_node()
        Revert "OF: base: match each node compatible against all given matches first"
      4302a875
  5. Feb 16, 2014
    • Theodore Ts'o's avatar
      ext4: fix online resize with a non-standard blocks per group setting · 3d2660d0
      Theodore Ts'o authored
      
      
      The set_flexbg_block_bitmap() function assumed that the number of
      blocks in a blockgroup was sb->blocksize * 8, which is normally true,
      but not always!  Use EXT4_BLOCKS_PER_GROUP(sb) instead, to fix block
      bitmap corruption after:
      
      mke2fs -t ext4 -g 3072 -i 4096 /dev/vdd 1G
      mount -t ext4 /dev/vdd /vdd
      resize2fs /dev/vdd 8G
      
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Reported-by: default avatarJon Bernard <jbernard@tuxion.com>
      Cc: stable@vger.kernel.org
      3d2660d0
    • Theodore Ts'o's avatar
      ext4: fix online resize with very large inode tables · b93c9535
      Theodore Ts'o authored
      
      
      If a file system has a large number of inodes per block group, all of
      the metadata blocks in a flex_bg may be larger than what can fit in a
      single block group.  Unfortunately, ext4_alloc_group_tables() in
      resize.c was never tested to see if it would handle this case
      correctly, and there were a large number of bugs which caused the
      following sequence to result in a BUG_ON:
      
      kernel bug at fs/ext4/resize.c:409!
         ...
      call trace:
       [<ffffffff81256768>] ext4_flex_group_add+0x1448/0x1830
       [<ffffffff81257de2>] ext4_resize_fs+0x7b2/0xe80
       [<ffffffff8123ac50>] ext4_ioctl+0xbf0/0xf00
       [<ffffffff811c111d>] do_vfs_ioctl+0x2dd/0x4b0
       [<ffffffff811b9df2>] ? final_putname+0x22/0x50
       [<ffffffff811c1371>] sys_ioctl+0x81/0xa0
       [<ffffffff81676aa9>] system_call_fastpath+0x16/0x1b
      code: c8 4c 89 df e8 41 96 f8 ff 44 89 e8 49 01 c4 44 29 6d d4 0
      rip  [<ffffffff81254fa1>] set_flexbg_block_bitmap+0x171/0x180
      
      
      This can be reproduced with the following command sequence:
      
         mke2fs -t ext4 -i 4096 /dev/vdd 1G
         mount -t ext4 /dev/vdd /vdd
         resize2fs /dev/vdd 8G
      
      To fix this, we need to make sure the right thing happens when a block
      group's inode table straddles two block groups, which means the
      following bugs had to be fixed:
      
      1) Not clearing the BLOCK_UNINIT flag in the second block group in
         ext4_alloc_group_tables --- the was proximate cause of the BUG_ON.
      
      2) Incorrectly determining how many block groups contained contiguous
         free blocks in ext4_alloc_group_tables().
      
      3) Incorrectly setting the start of the next block range to be marked
         in use after a discontinuity in setup_new_flex_group_blocks().
      
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      b93c9535