  1. Jul 09, 2008
  2. Jul 08, 2008
    • Bernhard Walle's avatar
      x86: use FIRMWARE_MEMMAP on x86/E820 · 5dfcf14d
      Bernhard Walle authored
      
      
      This patch uses the /sys/firmware/memmap interface provided in the last patch
      on the x86 architecture when E820 is used. The patch copies the E820
      memory map very early, and registers the E820 map afterwards via
      firmware_map_add_early().
      
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Acked-by: Greg KH <gregkh@suse.de>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: yhlu.kernel@gmail.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Bernhard Walle's avatar
      sysfs: add /sys/firmware/memmap · 69ac9cd6
      Bernhard Walle authored
      
      
      This patch adds /sys/firmware/memmap interface that represents the BIOS
      (or Firmware) provided memory map. The tree looks like:
      
          /sys/firmware/memmap/0/start   (hex number)
                                 end     (hex number)
                                 type    (string)
          ...                 /1/start
                                 end
                                 type
      
      With the following shell snippet one can print the memory map in the same form
      the kernel prints itself when booting on x86 (the E820 map).
      
        --------- 8< --------------------------
          #!/bin/sh
          cd /sys/firmware/memmap
          for dir in * ; do
              start=$(cat "$dir/start")
              end=$(cat "$dir/end")
              type=$(cat "$dir/type")
              # "end" is inclusive, so add 1 to print the exclusive end address
              printf "%016x-%016x (%s)\n" "$start" "$(( $end + 1 ))" "$type"
          done
        --------- >8 --------------------------
      
      This patch only provides the needed interface:
      
       1. The sysfs interface.
       2. The structure and enumeration definition.
       3. The functions firmware_map_add() and firmware_map_add_early()
          that should be called from architecture code (E820/EFI, for
          example) to add the contents to the interface.
      
      If the kernel is compiled without CONFIG_FIRMWARE_MEMMAP, the interface does
      nothing, so the architecture-specific code does not need to be cluttered
      with #ifdefs.
      
      The purpose of the new interface is kexec: while /proc/iomem represents
      the *used* memory map (e.g. as modified by kernel parameters such as 'memmap'
      and 'mem'), the /sys/firmware/memmap tree represents the unmodified memory
      map provided by the firmware. So kexec can:
      
       - use the original memory map for rebooting,
       - use /proc/iomem for setting up the ELF core headers in the kdump
         case, which should only represent the memory of the system.
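      
      As a rough illustration of how a userspace tool might consume the tree
      described above, the following sketch sums the size of all entries of one
      type. The function name firmware_bytes_of_type is hypothetical, the default
      type string "System RAM" is an assumption about what the architecture code
      registers, and "end" is treated as inclusive, as in the snippet above.
      
      ```shell
      # firmware_bytes_of_type [ROOT] [TYPE]
      # ROOT: root of the memmap tree (assumed default: /sys/firmware/memmap)
      # TYPE: type string to match (assumed default: "System RAM")
      firmware_bytes_of_type() {
          root=${1:-/sys/firmware/memmap}
          want=${2:-System RAM}
          total=0
          for dir in "$root"/*/; do
              # Skip anything that is not a readable map entry.
              [ -r "${dir}type" ] || continue
              type=$(cat "${dir}type")
              [ "$type" = "$want" ] || continue
              start=$(cat "${dir}start")
              end=$(cat "${dir}end")
              # Inclusive end address: the entry covers end - start + 1 bytes.
              total=$(( $total + $end - $start + 1 ))
          done
          echo "$total"
      }
      ```
      
      Taking the root directory as an argument means the function can be tried
      against a copy of the tree as well as the live sysfs directory.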
      
      The patch has been tested on i386 and x86_64.
      
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Acked-by: Greg KH <gregkh@suse.de>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: yhlu.kernel@gmail.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: remove acpi_srat config v2 · 6247943d
      Yinghai Lu authored
      
      
      use ACPI_NUMA directly
      
      and move srat_32.c to mm/
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: remove have_arch_parse_srat -v2 · 698839fe
      Yinghai Lu authored
      
      
      we already have the same srat handling interface for 32bit.
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Ingo Molnar's avatar
      printk: export console_drivers · a29d1cfe
      Ingo Molnar authored
      
      
      this symbol is needed by drivers/video/xen-fbfront.ko.
      
      [ cherry-picked from tip/core/printk ]
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Jeremy Fitzhardinge's avatar
      x86/cpa: use an undefined PTE bit for testing CPA · 5a654ba7
      Jeremy Fitzhardinge authored
      
      
      Rather than using _PAGE_GLOBAL - which not all CPUs support - to test
      CPA, use one of the reserved-for-software-use PTE flags instead.  This
      allows CPA testing to work on CPUs which don't support PGE.
      
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Jeremy Fitzhardinge's avatar
      x86_32: remove __PAGE_KERNEL(_EXEC) · ef5e94af
      Jeremy Fitzhardinge authored
      
      From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      
      Older x86-32 processors do not support global mappings (PGE), so the
      kernel must only use them if the processor supports them.
      
      The _PAGE_KERNEL* flags always have _PAGE_GLOBAL set, since logically
      we always want it set.
      
      This is OK even on processors which do not support PGE, since all
      _PAGE flags are masked with __supported_pte_mask before being turned
      into a real in-pagetable pte.  On 32-bit systems, __supported_pte_mask
      is initialized to not contain _PAGE_GLOBAL, and it is then added if
      the CPU is found to support it.
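      
      The masking step described above can be sketched with shell arithmetic.
      The bit values are the architectural x86 PTE bit positions; the function
      name apply_supported_mask and the two mask variables are illustrative
      assumptions, and this is a userspace sketch rather than kernel code.
      
      ```shell
      # x86 PTE flag bits (architectural bit positions).
      _PAGE_PRESENT=0x001
      _PAGE_RW=0x002
      _PAGE_ACCESSED=0x020
      _PAGE_DIRTY=0x040
      _PAGE_GLOBAL=0x100
      
      # Kernel mapping flags with _PAGE_GLOBAL always set, as the text describes.
      PAGE_KERNEL_EXEC=$(( $_PAGE_PRESENT | $_PAGE_RW | $_PAGE_ACCESSED | $_PAGE_DIRTY | $_PAGE_GLOBAL ))
      
      # Hypothetical masks: the initial 32-bit mask lacks _PAGE_GLOBAL, and the
      # bit is added once the CPU is found to support global pages.
      MASK_NO_GLOBAL=$(( ~$_PAGE_GLOBAL ))
      MASK_WITH_GLOBAL=$(( $MASK_NO_GLOBAL | $_PAGE_GLOBAL ))
      
      # apply_supported_mask FLAGS MASK: print the flags that reach the pagetable.
      apply_supported_mask() {
          printf '0x%x\n' "$(( $1 & $2 ))"
      }
      ```
      
      With these definitions, apply_supported_mask "$PAGE_KERNEL_EXEC" "$MASK_NO_GLOBAL"
      prints 0x63, while the same call with "$MASK_WITH_GLOBAL" prints 0x163.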
      
      The x86-32 code used to use __PAGE_KERNEL/__PAGE_KERNEL_EXEC for this
      purpose, but they're now redundant and can be removed.
      
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Jeremy Fitzhardinge's avatar
      x86: always set _PAGE_GLOBAL in _PAGE_KERNEL* flags · 8490638c
      Jeremy Fitzhardinge authored
      
      
      Consistently set _PAGE_GLOBAL in _PAGE_KERNEL flags.  This makes 32-
      and 64-bit code consistent, and removes some special cases where
      __PAGE_KERNEL* did not have _PAGE_GLOBAL set, causing confusion as a
      result of the inconsistencies.
      
      This patch only affects x86-64, which generally supports PGE.
      The x86-32 patch is next.
      
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Jeremy Fitzhardinge's avatar
      x86_64/setup: unconditionally populate the pgd · 574977a2
      Jeremy Fitzhardinge authored
      
      
      When allocating a new pud, unconditionally populate the pgd (why did
      we bother to create a new pud if we weren't going to populate it?).
      
      This will only happen if the pgd slot was empty, since any existing
      pud will be reused.
      
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Ingo Molnar's avatar
      x86: fix "x86: let setup_arch call init_apic_mappings for 32bit" · aea5f9f8
      Ingo Molnar authored
      
      
      add back this line lost from trap_init():
      
              set_trap_gate(0,  &divide_error);
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Ingo Molnar's avatar
      x86: move prefill_possible_map calling early, fix · 4a701737
      Ingo Molnar authored
      
      
      fix:
      
      arch/x86/kernel/built-in.o: In function `setup_arch':
      : undefined reference to `prefill_possible_map'
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: move prefill_possible_map calling early · 329513a3
      Yinghai Lu authored
      
      
      Call it right after we are done with MADT/mptable handling, instead of
      doing that later on in setup_per_cpu_areas().
      
      This way for_each_possible_cpu() can be used early.
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: move init_cpu_to_node after get_smp_config · 5f4765f9
      Yinghai Lu authored
      
      
      When acpi=off, cpu_to_apicid is only ready after get_smp_config,
      so init_cpu_to_node needs to be moved after it.
      
      Otherwise we get the wrong cpu->node mapping and rely on
      amd_detect_cmp() to correct it, but that is too late:
      setup_per_cpu_data has already been called by then, so
      per_cpu_data ends up on the wrong node.
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: merge zones_sizes_init for numa and non numa on 32-bit · cb95a13a
      Yinghai Lu authored
      
      
      Move e820_register_active_regions out of the non-NUMA zones_sizes_init()
      and remove the NUMA version of zones_sizes_init().
      
      Also let 32-bit call remove_all_active_ranges() directly in setup_arch(),
      like 64-bit does.
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      d9a81b44
    • Yinghai Lu's avatar
      x86: change copy_e820_map to append_e820_map · dc8e8120
      Yinghai Lu authored
      
      
      so it has a more meaningful name.
      also change it to static.
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Bernhard Walle's avatar
      x86: fix documentation bug about relocatability · 068b4538
      Bernhard Walle authored
      
      
      This patch fixes a small bug in the documentation: x86_64 now also has
      the ability to build a relocatable kernel.
      
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Cc: vgoyal@redhat.com
      Cc: kexec@lists.infradead.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Bernhard Walle's avatar
      x86: find offset for crashkernel reservation automatically · 32105f7f
      Bernhard Walle authored
      
      
      This patch removes the need for the crashkernel=...@offset parameter to define
      a fixed offset for crashkernel reservation. That feature can be used together
      with a relocatable kernel where the kexec-tools relocate the kernel and
      get the actual offset from /proc/iomem.
      
      The use case is a kernel where the .text+.data+.bss is after 16M physical
      memory (a debug kernel with lockdep on x86_64 can cause that), which caused
      major pain in autoconfiguration in our distribution.
      
      Also, this patch unifies the crashdump architectures a bit, since IA64 has
      had these semantics from the very beginning of the kdump port.
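      
      A minimal sketch of how a tool could look up the reserved region in
      /proc/iomem once no fixed offset is given. The function name
      crash_kernel_range is hypothetical, and the sketch assumes the
      reservation appears as a line labelled "Crash kernel" in the usual
      "start-end : name" format; it is not how kexec-tools implements it.
      
      ```shell
      # crash_kernel_range [FILE]: print "start end" (hex, end as listed in the
      # file) for the line labelled "Crash kernel". FILE defaults to /proc/iomem.
      crash_kernel_range() {
          sed -n 's/^ *\([0-9a-fA-F]\{1,\}\)-\([0-9a-fA-F]\{1,\}\) : Crash kernel$/0x\1 0x\2/p' \
              "${1:-/proc/iomem}"
      }
      ```
      
      The function prints nothing if no such line exists, which a caller can
      treat as "no crashkernel memory reserved".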
      
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Cc: vgoyal@redhat.com
      Cc: Bernhard Walle <bwalle@suse.de>
      Cc: kexec@lists.infradead.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Alok Kataria's avatar
      x86: cleanup e820_setup_gap(), v2 · fd6493e1
      Alok Kataria authored
      
      
      e820_search_gap now also takes an end_addr parameter, which limits the
      search to the range from start_addr to end_addr.
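      
      The idea of a gap search bounded by a start and an end address can be
      sketched as follows. This is a simplified forward scan over a sorted list
      of ranges, not the kernel's e820_search_gap() implementation; the function
      name find_largest_gap and the "start end" input format (end exclusive)
      are assumptions made for illustration.
      
      ```shell
      # find_largest_gap START END: read "start end" pairs (end exclusive, sorted
      # by start, non-overlapping) from stdin and print "gap_start gap_size" in
      # hex for the largest hole inside [START, END). Prints "0x0 0x0" if none.
      find_largest_gap() {
          cursor=$(( $1 ))
          limit=$(( $2 ))
          best_start=0
          best_size=0
          while read -r s e; do
              [ -n "$s" ] && [ -n "$e" ] || continue
              s=$(( $s ))
              e=$(( $e ))
              # Entries at or beyond the limit cannot bound a gap in the window.
              [ "$s" -lt "$limit" ] || break
              if [ "$s" -gt "$cursor" ] && [ $(( $s - $cursor )) -gt "$best_size" ]; then
                  best_start=$cursor
                  best_size=$(( $s - $cursor ))
              fi
              if [ "$e" -gt "$cursor" ]; then
                  cursor=$e
              fi
          done
          # Account for the hole between the last entry and the limit.
          if [ "$limit" -gt "$cursor" ] && [ $(( $limit - $cursor )) -gt "$best_size" ]; then
              best_start=$cursor
              best_size=$(( $limit - $cursor ))
          fi
          printf '0x%x 0x%x\n' "$best_start" "$best_size"
      }
      ```
      
      Lowering the END argument shrinks the window, so a hole that extends past
      END is reported only up to END.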
      
      Signed-off-by: AloK N Kataria <akataria@vmware.com>
      Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: "lenb@kernel.org" <lenb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Mike Travis's avatar
      x86: add check for node passed to node_to_cpumask, v3 · 6a2f47ca
      Mike Travis authored
      
      
        * When CONFIG_DEBUG_PER_CPU_MAPS is set, the node passed to
          node_to_cpumask and node_to_cpumask_ptr should be validated.
          If invalid, then a dump_stack is performed and a zero cpumask
          is returned.
      
      v2: Slightly different version to remove a compiler warning.
      v3: Redone to reflect moving setup.c -> setup_percpu.c
      
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Jeremy Fitzhardinge's avatar
      x86: fix CPA self-test for "x86/paravirt: groundwork for 64-bit Xen support" · cd5dce2f
      Jeremy Fitzhardinge authored
      
      
      Ingo Molnar wrote:
      > -tip auto-testing found pagetable corruption (CPA self-test failure):
      >
      > [   32.956015] CPA self-test:
      > [   32.958822]  4k 2048 large 508 gb 0 x 2556[ffff880000000000-ffff88003fe00000] miss 0
      > [   32.964000] CPA ffff88001d54e000: bad pte 1d4000e3
      > [   32.968000] CPA ffff88001d54e000: unexpected level 2
      > [   32.972000] CPA ffff880022c5d000: bad pte 22c000e3
      > [   32.976000] CPA ffff880022c5d000: unexpected level 2
      > [   32.980000] CPA ffff8800200ce000: bad pte 200000e3
      > [   32.984000] CPA ffff8800200ce000: unexpected level 2
      > [   32.988000] CPA ffff8800210f0000: bad pte 210000e3
      >
      > config and full log can be found at:
      >
      >  http://redhat.com/~mingo/misc/config-Mon_Jun_30_11_11_51_CEST_2008.bad
      >  http://redhat.com/~mingo/misc/log-Mon_Jun_30_11_11_51_CEST_2008.bad
      
      Phew.  OK, I've worked this out.  Short version is that it's a false
      alarm, and there was no real failure here.  Long version:
      
          * I changed the code to create the physical mapping pagetables to
            reuse any existing mapping rather than replace it.   Specifically,
            reusing a pud pointed to by the pgd caused this symptom to appear.
          * The specific PUD being reused is the one created statically in
            head_64.S, which creates an initial 1GB mapping.
          * That mapping doesn't have _PAGE_GLOBAL set on it, due to the
            inconsistency between __PAGE_* and PAGE_*.
          * The CPA test attempts to clear _PAGE_GLOBAL, and then checks to
            see that the resulting range is 1) shattered into 4k pages, and 2)
            has no _PAGE_GLOBAL.
          * However, since it didn't have _PAGE_GLOBAL on that range to start
            with, change_page_attr_clear() had nothing to do, and didn't
            bother shattering the range,
          * resulting in the reported messages
      
      The simple fix is to set _PAGE_GLOBAL in level2_ident_pgt.
      
      An additional fix is to make CPA testing more robust by using some other
      pagetable bit (one of the unused available-to-software ones).  This
      would solve spurious CPA test warnings under Xen, which uses _PAGE_GLOBAL
      for its own purposes (i.e., not under guest control).
      
      Also, we should revisit the use of _PAGE_GLOBAL in asm-x86/pgtable.h,
      and use it consistently, and drop MAKE_GLOBAL.  The first time I
      proposed it, it caused breakages in the very early CPA code; with luck
      that's all fixed now.
      
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Cc: xen-devel <xen-devel@lists.xensource.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: don't reallocate pgt for node0 · 996cf443
      Yinghai Lu authored
      
      
      The kva RAM is already mapped right away, so there is no need to take it
      from low RAM. This avoids wasting one copy of pgdat.
      
      Also add the node id to the early_res name in case we get it from find_e820_area.
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Yinghai Lu's avatar
      x86: move reserve_setup_data to setup.c · 28bb2237
      Yinghai Lu authored
      
      
      Ying Huang would like setup_data to be reserved, but not included in the
      no-save range.
      
      Here we modify the e820 table to reserve that range early, and also add
      it to early_res in case the bootloader mixes it up with the ramdisk.
      
      Other solutions would be:
      1. add early_res_to_highmem...
      2. early_res_to_e820...
      but they could wrongly reserve memory of another type if early_res holds a
      resource that was reserved early and is no longer needed, but has not been
      removed from early_res in time, like the RAMDISK (already handled).
      
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: andi@firstfloor.org
      Tested-by: Huang, Ying <ying.huang@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>