Skip to content
  1. May 19, 2012
    • H. Peter Anvin's avatar
      x86, realmode: 16-bit real-mode code support for relocs tool · 6520fe55
      H. Peter Anvin authored
      
      
      A new option is added to the relocs tool called '--realmode'.
      This option causes the generation of 16-bit segment relocations
      and 32-bit linear relocations for the real-mode code. When
      the real-mode code is moved to the low-memory during kernel
      initialization, these relocation entries can be used to
      relocate the code properly.
      
      In the assembly code 16-bit segment relocations must be relative
      to the 'real_mode_seg' absolute symbol. Linear relocations must be
      relative to a symbol prefixed with 'pa_'.
      
      16-bit segment relocation is used to load cs:ip in 16-bit code.
      Linear relocations are used in the 32-bit code for relocatable
      data references. They are declared in the linker script of the
      real-mode code.
      
      The relocs tool is moved to arch/x86/tools/relocs.c, and added new
      target archscripts that can be used to build scripts needed building
      an architecture.  be compiled before building the arch/x86 tree.
      
      [ hpa: accelerating this because it detects invalid absolute
        relocations, a serious bug in binutils 2.22.52.0.x which currently
        produces bad kernels. ]
      
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1336501366-28617-2-git-send-email-jarkko.sakkinen@intel.com
      
      
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@intel.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      6520fe55
  2. May 09, 2012
    • Tejun Heo's avatar
      percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit · d5e28005
      Tejun Heo authored
      
      
      With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE
      or PMD_SIZE for atom_size.  PMD_SIZE is used when CPU supports PSE so
      that percpu areas are aligned to PMD mappings and possibly allow using
      PMD mappings in vmalloc areas in the future.  Using larger atom_size
      doesn't waste actual memory; however, it does require larger vmalloc
      space allocation later on for !first chunks.
      
      With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem
      but x86_32 at this point is anything but reasonable in terms of
      address space and using larger atom_size reportedly leads to frequent
      percpu allocation failures on certain setups.
      
      As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space
      is aplenty and most x86_64 configurations support PSE, fix the issue
      by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32.
      
      v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and
          x86_32 PAGE_SIZE as suggested by hpa.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarYanmin Zhang <yanmin.zhang@intel.com>
      Reported-by: default avatarShuoX Liu <shuox.liu@intel.com>
      Acked-by: default avatarH. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4F97BA98.6010001@intel.com>
      Cc: stable@vger.kernel.org
      d5e28005
  3. May 08, 2012
  4. May 07, 2012
    • Konrad Rzeszutek Wilk's avatar
      xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEs · b7e5ffe5
      Konrad Rzeszutek Wilk authored
      
      
      If I try to do "cat /sys/kernel/debug/kernel_page_tables"
      I end up with:
      
      BUG: unable to handle kernel paging request at ffffc7fffffff000
      IP: [<ffffffff8106aa51>] ptdump_show+0x221/0x480
      PGD 0
      Oops: 0000 [#1] SMP
      CPU 0
      .. snip..
      RAX: 0000000000000000 RBX: ffffc00000000fff RCX: 0000000000000000
      RDX: 0000800000000000 RSI: 0000000000000000 RDI: ffffc7fffffff000
      
      which is due to the fact we are trying to access a PFN that is not
      accessible to us. The reason (at least in this case) was that
      PGD[256] is set to __HYPERVISOR_VIRT_START which was setup (by the
      hypervisor) to point to a read-only linear map of the MFN->PFN array.
      During our parsing we would get the MFN (a valid one), try to look
      it up in the MFN->PFN tree and find it invalid and return ~0 as PFN.
      Then pte_mfn_to_pfn would happilly feed that in, attach the flags
      and return it back to the caller. 'ptdump_show' bitshifts it and
      gets and invalid value that it tries to dereference.
      
      Instead of doing all of that, we detect the ~0 case and just
      return !_PAGE_PRESENT.
      
      This bug has been in existence .. at least until 2.6.37 (yikes!)
      
      CC: stable@kernel.org
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      b7e5ffe5
    • Konrad Rzeszutek Wilk's avatar
      xen/apic: Return the APIC ID (and version) for CPU 0. · 558daa28
      Konrad Rzeszutek Wilk authored
      
      
      On x86_64 on AMD machines where the first APIC_ID is not zero, we get:
      
      ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
      BIOS bug: APIC version is 0 for CPU 1/0x10, fixing up to 0x10
      BIOS bug: APIC version mismatch, boot CPU: 0, CPU 1: version 10
      
      which means that when the ACPI processor driver loads and
      tries to parse the _Pxx states it fails to do as, as it
      ends up calling acpi_get_cpuid which does this:
      
      for_each_possible_cpu(i) {
              if (cpu_physical_id(i) == apic_id)
                      return i;
      }
      
      And the bootup CPU, has not been found so it fails and returns -1
      for the first CPU - which then subsequently in the loop that
      "acpi_processor_get_info" does results in returning an error, which
      means that "acpi_processor_add" failing and per_cpu(processor)
      is never set (and is NULL).
      
      That means that when xen-acpi-processor tries to load (much much
      later on) and parse the P-states it gets -ENODEV from
      acpi_processor_register_performance() (which tries to read
      the per_cpu(processor)) and fails to parse the data.
      
      Reported-by-and-Tested-by: default avatarStefan Bader <stefan.bader@canonical.com>
      Suggested-by: default avatarBoris Ostrovsky <boris.ostrovsky@amd.com>
      [v2: Bit-shift APIC ID by 24 bits]
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      558daa28
    • Larry Finger's avatar
      IA32 emulation: Fix build problem for modular ia32 a.out support · febb72a6
      Larry Finger authored
      
      
      Commit ce7e5d2d ("x86: fix broken TASK_SIZE for ia32_aout") breaks
      kernel builds when "CONFIG_IA32_AOUT=m" with
      
        ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined!
        make[1]: *** [__modpost] Error 1
      
      The entry point needs to be exported.
      
      Signed-off-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      febb72a6
    • Al Viro's avatar
      x86: fix broken TASK_SIZE for ia32_aout · ce7e5d2d
      Al Viro authored
      
      
      Setting TIF_IA32 in load_aout_binary() used to be enough; these days
      TASK_SIZE is controlled by TIF_ADDR32 and that one doesn't get set
      there.  Switch to use of set_personality_ia32()...
      
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ce7e5d2d
  5. May 06, 2012
  6. May 05, 2012
  7. May 04, 2012
    • Linus Torvalds's avatar
      vfs: make word-at-a-time accesses handle a non-existing page · e419b4cc
      Linus Torvalds authored
      
      
      It turns out that there are more cases than CONFIG_DEBUG_PAGEALLOC that
      can have holes in the kernel address space: it seems to happen easily
      with Xen, and it looks like the AMD gart64 code will also punch holes
      dynamically.
      
      Actually hitting that case is still very unlikely, so just do the
      access, and take an exception and fix it up for the very unlikely case
      of it being a page-crosser with no next page.
      
      And hey, this abstraction might even help other architectures that have
      other issues with unaligned word accesses than the possible missing next
      page.  IOW, this could do the byte order magic too.
      
      Peter Anvin fixed a thinko in the shifting for the exception case.
      
      Reported-and-tested-by: default avatarJana Saout <jana@saout.de>
      Cc:  Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e419b4cc
  8. May 01, 2012
  9. Apr 28, 2012
  10. Apr 27, 2012
    • Andreas Herrmann's avatar
      x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it · f7f286a9
      Andreas Herrmann authored
      
      
      BIOS will switch off the corresponding feature flag on family
      15h models 10h-1fh non-desktop CPUs.
      
      The topology extension CPUID leafs are required to detect which
      cores belong to the same compute unit. (thread siblings mask is
      set accordingly and also correct information about L1i and L2
      cache sharing depends on this).
      
      W/o this patch we wouldn't see which cores belong to the same
      compute unit and also cache sharing information for L1i and L2
      would be incorrect on such systems.
      
      Signed-off-by: default avatarAndreas Herrmann <andreas.herrmann3@amd.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f7f286a9
    • Konrad Rzeszutek Wilk's avatar
      xen/smp: Fix crash when booting with ACPI hotplug CPUs. · cf405ae6
      Konrad Rzeszutek Wilk authored
      
      
      When we boot on a machine that can hotplug CPUs and we
      are using 'dom0_max_vcpus=X' on the Xen hypervisor line
      to clip the amount of CPUs available to the initial domain,
      we get this:
      
      (XEN) Command line: com1=115200,8n1 dom0_mem=8G noreboot dom0_max_vcpus=8 sync_console mce_verbosity=verbose console=com1,vga loglvl=all guest_loglvl=all
      .. snip..
      DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x032.072520111118 07/25/2011
      .. snip.
      SMP: Allowing 64 CPUs, 32 hotplug CPUs
      installing Xen timer for CPU 7
      cpu 7 spinlock event irq 361
      NMI watchdog: disabled (cpu7): hardware events not enabled
      Brought up 8 CPUs
      .. snip..
      	[acpi processor finds the CPUs are not initialized and starts calling
      	arch_register_cpu, which creates /sys/devices/system/cpu/cpu8/online]
      CPU 8 got hotplugged
      CPU 9 got hotplugged
      CPU 10 got hotplugged
      .. snip..
      initcall 1_acpi_battery_init_async+0x0/0x1b returned 0 after 406 usecs
      calling  erst_init+0x0/0x2bb @ 1
      
      	[and the scheduler sticks newly started tasks on the new CPUs, but
      	said CPUs cannot be initialized b/c the hypervisor has limited the
      	amount of vCPUS to 8 - as per the dom0_max_vcpus=8 flag.
      	The spinlock tries to kick the other CPU, but the structure for that
      	is not initialized and we crash.]
      BUG: unable to handle kernel paging request at fffffffffffffed8
      IP: [<ffffffff81035289>] xen_spin_lock+0x29/0x60
      PGD 180d067 PUD 180e067 PMD 0
      Oops: 0002 [#1] SMP
      CPU 7
      Modules linked in:
      
      Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc2upstream-00001-gf5154e8 #1 Intel Corporation S2600CP/S2600CP
      RIP: e030:[<ffffffff81035289>]  [<ffffffff81035289>] xen_spin_lock+0x29/0x60
      RSP: e02b:ffff8801fb9b3a70  EFLAGS: 00010282
      
      With this patch, we cap the amount of vCPUS that the initial domain
      can run, to exactly what dom0_max_vcpus=X has specified.
      
      In the future, if there is a hypercall that will allow a running
      domain to expand past its initial set of vCPUS, this patch should
      be re-evaluated.
      
      CC: stable@kernel.org
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      cf405ae6
    • Konrad Rzeszutek Wilk's avatar
      xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded. · df88b2d9
      Konrad Rzeszutek Wilk authored
      
      
      There are exactly four users of __monitor and __mwait:
      
       - cstate.c (which allows acpi_processor_ffh_cstate_enter to be called
         when the cpuidle API drivers are used. However patch
         "cpuidle: replace xen access to x86 pm_idle and default_idle"
         provides a mechanism to disable the cpuidle and use safe_halt.
       - smpboot (which allows mwait_play_dead to be called). However
         safe_halt is always used so we skip that.
       - intel_idle (same deal as above).
       - acpi_pad.c. This the one that we do not want to run as we
         will hit the below crash.
      
      Why do we want to expose MWAIT_LEAF in the first place?
      We want it for the xen-acpi-processor driver - which uploads
      C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c
      sets the proper address in the C-states so that the hypervisor
      can benefit from using the MWAIT functionality. And that is
      the sole reason for using it.
      
      Without this patch, if a module performs mwait or monitor we
      get this:
      
      invalid opcode: 0000 [#1] SMP
      CPU 2
      .. snip..
      Pid: 5036, comm: insmod Tainted: G           O 3.4.0-rc2upstream-dirty #2 Intel Corporation S2600CP/S2600CP
      RIP: e030:[<ffffffffa000a017>]  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
      RSP: e02b:ffff8801c298bf18  EFLAGS: 00010282
      RAX: ffff8801c298a010 RBX: ffffffffa03b2000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff8801c29800d8 RDI: ffff8801ff097200
      RBP: ffff8801c298bf18 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: ffffffffa000a000 R14: 0000005148db7294 R15: 0000000000000003
      FS:  00007fbb364f2700(0000) GS:ffff8801ff08c000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 000000000179f038 CR3: 00000001c9469000 CR4: 0000000000002660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process insmod (pid: 5036, threadinfo ffff8801c298a000, task ffff8801c29cd7e0)
      Stack:
       ffff8801c298bf48 ffffffff81002124 ffffffffa03b2000 00000000000081fd
       000000000178f010 000000000178f030 ffff8801c298bf78 ffffffff810c41e6
       00007fff3fb30db9 00007fff3fb30db9 00000000000081fd 0000000000010000
      Call Trace:
       [<ffffffff81002124>] do_one_initcall+0x124/0x170
       [<ffffffff810c41e6>] sys_init_module+0xc6/0x220
       [<ffffffff815b15b9>] system_call_fastpath+0x16/0x1b
      Code: <0f> 01 c8 31 c0 0f 01 c9 c9 c3 00 00 00 00 00 00 00 00 00 00 00 00
      RIP  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
       RSP <ffff8801c298bf18>
      ---[ end trace 16582fc8a3d1e29a ]---
      Kernel panic - not syncing: Fatal exception
      
      With this module (which is what acpi_pad.c would hit):
      
      MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
      MODULE_DESCRIPTION("mwait_check_and_back");
      MODULE_LICENSE("GPL");
      MODULE_VERSION();
      
      static int __init mwait_check_init(void)
      {
      	__monitor((void *)&current_thread_info()->flags, 0, 0);
      	__mwait(0, 0);
      	return 0;
      }
      static void __exit mwait_check_exit(void)
      {
      }
      module_init(mwait_check_init);
      module_exit(mwait_check_exit);
      
      Reported-by: default avatarLiu, Jinsong <jinsong.liu@intel.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      df88b2d9
  11. Apr 25, 2012
  12. Apr 24, 2012
  13. Apr 21, 2012
  14. Apr 20, 2012
  15. Apr 19, 2012
  16. Apr 18, 2012
  17. Apr 17, 2012
  18. Apr 16, 2012
  19. Apr 13, 2012
  20. Apr 12, 2012
    • Linus Torvalds's avatar
      x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up · 92ae03f2
      Linus Torvalds authored
      
      
      This merges the 32- and 64-bit versions of the x86 strncpy_from_user()
      by just rewriting it in C rather than the ancient inline asm versions
      that used lodsb/stosb and had been duplicated for (trivial) differences
      between the 32-bit and 64-bit versions.
      
      While doing that, it also speeds them up by doing the accesses a word at
      a time.  Finally, the new routines also properly handle the case of
      hitting the end of the address space, which we have never done correctly
      before (fs/namei.c has a hack around it for that reason).
      
      Despite all these improvements, it actually removes more lines than it
      adds, due to the de-duplication.  Also, we no longer export (or define)
      the legacy __strncpy_from_user() function (that was defined to not do
      the user permission checks), since it's not actually used anywhere, and
      the user address space checks are built in to the new code.
      
      Other architecture maintainers have been notified that the old hack in
      fs/namei.c will be going away in the 3.5 merge window, in case they
      copied the x86 approach of being a bit cavalier about the end of the
      address space.
      
      Cc: linux-arch@vger.kernel.org
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Anvin" <hpa@zytor.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      92ae03f2