  1. Apr 26, 2013
    • powerpc/pseries: Re-enable Virtual Processor Home Node updating · b7abef04
      Jesse Larrew authored
      
      
      The new PRRN firmware feature provides a more convenient and event-driven
      interface than VPHN for notifying Linux of changes to the NUMA affinity of
      platform resources. However, for practical reasons, it may not be feasible
      for some customers to update to the latest firmware. For these customers,
      the VPHN feature supported on previous firmware versions may still be the
      best option.
      
      The VPHN feature was previously disabled due to races with the load
      balancing code when accessing the NUMA cpu maps, but the new stop_machine()
      approach protects the NUMA cpu maps from these concurrent accesses. It
      should be safe to re-enable this feature now.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Update NUMA VDSO information when updating CPU maps · 176bbf14
      Jesse Larrew authored
      Commit 18ad51dd ("powerpc: Add VDSO version of getcpu") adds
      vdso_getcpu_init(), which stores the NUMA node for a cpu in SPRG3.
      
      This patch ensures that this information is also updated when the NUMA
      affinity of a cpu changes.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Use stop machine to update cpu maps · 30c05350
      Nathan Fontenot authored
      
      
      The new PRRN firmware feature allows CPU and memory resources to be
      transparently reassigned across NUMA boundaries. When this happens, the
      kernel must update the node maps to reflect the new affinity information.
      
      Although the NUMA maps can be protected by locking primitives during the
      update itself, this is insufficient to prevent concurrent accesses to these
      structures. Since cpumask_of_node() hands out a pointer to these
      structures, they can still be modified outside of the lock. Furthermore,
      tracking down each usage of these pointers and adding locks would be quite
      invasive and difficult to maintain.
      
      The approach used is to make a list of affected cpus and call
      stop_machine() to have the update routine run on each of the affected
      cpus, allowing them to update themselves. Each cpu finds itself in the
      list and makes the appropriate updates. Each cpu must do this for
      itself in order to handle the calls to vdso_getcpu_init() added in a
      subsequent patch.
      
      Situations like these are best handled using stop_machine(). Since the NUMA
      affinity updates are exceptionally rare events, this approach has the
      benefit of not adding any overhead while accessing the NUMA maps during
      normal operation.
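
      The list-walking idea described above can be sketched in user space.
      This is only a simulation under stated assumptions: NR_CPUS,
      cpu_to_node_map, struct topology_update and both functions are
      hypothetical names, and run_on_all_cpus() merely stands in for
      stop_machine() by calling the routine once per simulated cpu.

```c
#include <stddef.h>

#define NR_CPUS 8 /* assumed size of the simulated machine */

/* Hypothetical per-cpu node map standing in for the NUMA cpu maps. */
static int cpu_to_node_map[NR_CPUS];

/* One pending affinity change; field names are illustrative. */
struct topology_update {
    unsigned int cpu;
    int old_nid;
    int new_nid;
};

/*
 * Update routine as run by a single cpu: find our own entry in the list
 * and apply it. Returns 1 if this cpu had an update, 0 otherwise.
 */
static int update_own_topology(const struct topology_update *list,
                               size_t count, unsigned int this_cpu)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i].cpu != this_cpu)
            continue;
        cpu_to_node_map[this_cpu] = list[i].new_nid;
        return 1;
    }
    return 0;
}

/*
 * Stand-in for stop_machine(): here the routine is simply called once for
 * every simulated cpu in sequence; cpus without an entry do nothing.
 * Returns the number of cpus that applied an update.
 */
static int run_on_all_cpus(const struct topology_update *list, size_t count)
{
    int updated = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        updated += update_own_topology(list, count, cpu);
    return updated;
}
```

      Because every cpu only touches its own slot, readers of the map need no
      lock in this model; the real protection in the patch comes from
      stop_machine() itself, which the simulation does not reproduce.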
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Update CPU maps when device tree is updated · 5d88aa85
      Jesse Larrew authored
      
      
      Platform events such as partition migration or the new PRRN firmware
      feature can cause the NUMA characteristics of a CPU to change, and these
      changes will be reflected in the device tree nodes for the affected
      CPUs.
      
      This patch registers a handler for Open Firmware device tree updates
      and reconfigures the CPU and node maps whenever the associativity
      changes. Currently, this is accomplished by marking the affected CPUs in
      the cpu_associativity_changes_mask and allowing
      arch_update_cpu_topology() to retrieve the new associativity information
      using hcall_vphn().
      
      Protecting the NUMA cpu maps from concurrent access during an update
      operation will be addressed in a subsequent patch in this series.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Update numa.c to use updated firmware_has_feature() · 8002b0c5
      Nathan Fontenot authored
      
      
      Update the numa code to use the updated firmware_has_feature() when checking
      for type 1 affinity.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Update firmware_has_feature() to check architecture vector 5 bits · f0ff7eb4
      Nathan Fontenot authored
      
      
      The firmware_has_feature() function makes it easy to check for supported
      features of the hypervisor. This patch extends the capability of
      firmware_has_feature() to include checking for specified bits
      in vector 5 of the architecture vector as reported in the device tree.
      
      As part of this, the #defines used for the architecture vector are
      redefined such that each option has the index into vector 5 and the
      feature bit encoded into it. This makes checking for architecture bits
      when initializing data for firmware_has_feature() much easier.
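
      One way to picture such an encoding is to pack the byte index into the
      high bits and the bit mask into the low byte of a single constant. The
      sketch below assumes that layout; the macro names, the EXAMPLE_OPT_*
      values and vec5_has_option() are illustrative and are not taken from
      the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack a byte index into vector 5 and a bit mask within that byte. */
#define OV5_ENCODE(index, mask) ((uint16_t)(((index) << 8) | (mask)))
#define OV5_INDX(x) ((x) >> 8)   /* recover the byte index */
#define OV5_FEAT(x) ((x) & 0xff) /* recover the bit mask */

/* Hypothetical options for illustration, not real bit assignments. */
#define EXAMPLE_OPT_A OV5_ENCODE(2, 0x80)
#define EXAMPLE_OPT_B OV5_ENCODE(5, 0x20)

/*
 * Returns 1 if the encoded option is set in the given copy of vector 5,
 * 0 if it is clear or the index lies beyond the end of the vector.
 */
static int vec5_has_option(const uint8_t *vec5, size_t len, uint16_t option)
{
    size_t index = OV5_INDX(option);

    if (index >= len)
        return 0;
    return (vec5[index] & OV5_FEAT(option)) != 0;
}
```

      With the index and mask carried in one value, a feature table can hold
      a single constant per option and the lookup needs no per-option code.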
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Use ARRAY_SIZE to iterate over firmware_features_table array · 43c0ea60
      Nathan Fontenot authored
      
      
      When iterating over the entries in firmware_features_table we only need
      to go over the actual number of entries in the array instead of declaring
      it to be bigger and checking to make sure there is a valid entry in every
      slot.
      
      This patch removes the FIRMWARE_MAX_FEATURES #define and replaces the
      array looping with the use of ARRAY_SIZE().
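
      The pattern can be sketched in plain C as follows. The table contents,
      struct feature_entry and lookup_features() are made up for the example;
      only the ARRAY_SIZE() idiom itself is the point.

```c
#include <stddef.h>
#include <string.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct feature_entry {
    unsigned long mask;
    const char *name;
};

/* Hypothetical table: sized by its initializer, with no padding slots. */
static const struct feature_entry feature_table[] = {
    { 0x1, "hcall-a" },
    { 0x2, "hcall-b" },
    { 0x4, "hcall-c" },
};

/* OR together the masks of every name that appears in the table. */
static unsigned long lookup_features(const char *const *names, size_t count)
{
    unsigned long found = 0;

    for (size_t n = 0; n < count; n++)
        for (size_t i = 0; i < ARRAY_SIZE(feature_table); i++)
            if (strcmp(names[n], feature_table[i].name) == 0)
                found |= feature_table[i].mask;
    return found;
}
```

      Since the loop bound follows the initializer, adding an entry needs no
      separate size constant and no check for empty slots.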
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Move architecture vector definitions to prom.h · 530b5e14
      Nathan Fontenot authored
      
      
      As part of handling PRRN events we need to check vector 5 of the
      architecture vector bits reported in the device tree to ensure PRRN event
      handling is enabled. To do this, firmware_has_feature() is updated (in a
      subsequent patch) to also check vector 5 bits. To avoid having to
      redefine bits in the architecture vector, the bit definitions are moved
      to prom.h.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Add PRRN RTAS event handler · 49c68a85
      Jesse Larrew authored
      
      
      A PRRN event is signaled via the RTAS event-scan mechanism, which
      returns a Hot Plug Event message "fixed part" indicating "Platform
      Resource Reassignment". In response to the Hot Plug Event message,
      we must call ibm,update-nodes to determine which resources were
      reassigned and then ibm,update-properties to obtain the new affinity
      information about those resources.
      
      The PRRN event-scan RTAS message contains only the "fixed part" with
      the "Type" field set to the value 160 and no Extended Event Log. The
      four-byte Extended Event Log Length field is re-purposed (since no
      Extended Event Log message is included) to pass the "scope" parameter
      that causes the ibm,update-nodes to return the nodes affected by the
      specific resource reassignment.
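
      The re-purposed length field can be illustrated with a deliberately
      simplified model. struct event_fixed_part below is not the real RTAS
      error log layout, and prrn_scope() is a hypothetical helper; only the
      Type value 160 comes from the description above.

```c
#include <stdint.h>

#define EVENT_TYPE_PRRN 160 /* "Type" value for a PRRN event */

/* Simplified fixed part; the real log packs these fields differently. */
struct event_fixed_part {
    uint8_t type;
    uint32_t extended_log_length; /* carries the scope for PRRN events */
};

/*
 * If the event is a PRRN event, store the scope taken from the length
 * field and return 1. Otherwise leave *scope untouched and return 0.
 */
static int prrn_scope(const struct event_fixed_part *ev, uint32_t *scope)
{
    if (ev->type != EVENT_TYPE_PRRN)
        return 0;
    *scope = ev->extended_log_length;
    return 1;
}
```

      A handler built this way must test the type before reading the field,
      because for every other event the same field is an ordinary length.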
      
      This patch adds a handler for RTAS events. The function
      pseries_devicetree_update() (from mobility.c) is used to make the
      ibm,update-nodes/ibm,update-properties RTAS calls. Updating the NUMA maps
      (handled by a subsequent patch) will require significant processing,
      so pseries_devicetree_update() is called from an asynchronous workqueue
      to allow event processing to continue.
      
      PRRN RTAS events on pseries systems are rare events that have to be
      initiated from the HMC console for the system by an IBM tech. This allows
      us to assume that these events are widely spaced. Additionally, all work
      on the queue is flushed before handling any new work to ensure we only have
      one event in flight being handled at a time.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Correct buffer parsing in update_dt_node() · 2e9b7b02
      Nathan Fontenot authored
      
      
      Correct parsing of the buffer returned from ibm,update-properties. The
      first element is a length and the path to the property, which is laid
      out slightly differently from the list of properties in the buffer, so
      we need to handle it specifically.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Expose pseries devicetree_update() · 762ec157
      Nathan Fontenot authored
      
      
      Newer firmware on Power systems can transparently reassign platform resources
      (CPU and Memory) in use. For instance, if a processor or memory unit is
      predicted to fail, the platform may transparently move the processing to an
      equivalent unused processor or the memory state to an equivalent unused
      memory unit. However, reassigning resources across NUMA boundaries may alter
      the performance of the partition. When such reassignment is necessary, the
      Platform Resource Reassignment Notification (PRRN) option provides a
      mechanism to inform the Linux kernel of changes to the NUMA affinity of
      its platform resources.
      
      When rtasd receives a PRRN event, it needs to make a series of RTAS
      calls (ibm,update-nodes and ibm,update-properties) to retrieve the
      updated device tree information. These calls are already handled in the
      pseries_devicetree_update() routine used in partition migration.
      
      This patch exposes pseries_devicetree_update() to make it accessible
      to other pseries routines. It also updates pseries_devicetree_update()
      to take a 32-bit scope parameter. The scope value, which was previously
      hard coded to 1 for partition migration, is used for the RTAS calls
      ibm,update-nodes/properties to update the device tree.
      
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Fix hardware IRQs with MMU on exceptions when HV=0 · 3e96ca7f
      Michael Neuling authored
      
      
      POWER8 allows us to take interrupts with the MMU on.  This gives us a
      second set of vectors offset at 0x4000.
      
      Unfortunately when copying these vectors we missed checking for MSR HV
      for hardware interrupts (0x500).  This results in us trying to use
      HSRR0/1 when HV=0, rather than SRR0/1 on HW IRQs.
      
      The below fixes this to check CPU_FTR_HVMODE when patching the code at
      0x4500.
      
      Also we remove the check for CPU_FTR_ARCH_206 since relocation on IRQs
      are only available in arch 2.07 and beyond.
      
      Thanks to benh for helping find this.
      
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/power8: Fix secondary CPUs hanging on boot for HV=0 · 8c2a3817
      Michael Neuling authored
      
      
      In __restore_cpu_power8 we determine if we are HV and if not, we return
      before setting HV only resources.
      
      Unfortunately we forgot to restore the link register from r11 before
      returning.
      
      This happens on boot and results in secondary CPUs not coming online.
      
      This adds the missing link register restore.
      
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Add isync to copy_and_flush · 29ce3c50
      Michael Neuling authored
      
      
      In __after_prom_start we copy the kernel down to zero in two calls to
      copy_and_flush.  After the first call (copy from 0 to copy_to_here:)
      we jump to the newly copied code soon after.
      
      Unfortunately there's no isync between the copy of this code and the
      jump to it.  Hence it's possible that stale instructions could still be
      in the icache or pipeline before we branch to it.
      
      We've seen this on real machines and it results in no console output
      after:
        calling quiesce...
        returning from prom_init
      
      The below adds an isync to ensure that the copy and flushing has
      completed before any branching to the new instructions occurs.
      
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Add HWCAP2 aux entry · 2171364d
      Michael Neuling authored
      
      
      We are currently out of free bits in AT_HWCAP. With POWER8, we have
      several hardware features that we need to advertise.
      
      Tested on POWER and x86.
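
      From user space, a second hardware capability word of this kind can be
      read through the auxiliary vector. The following is a minimal sketch
      that assumes a Linux system with glibc's getauxval(); hwcap2_has() is a
      hypothetical helper, and the fallback value 26 for AT_HWCAP2 is the one
      used by the Linux headers.

```c
#include <sys/auxv.h>

#ifndef AT_HWCAP2
#define AT_HWCAP2 26 /* fallback for older libc headers */
#endif

/*
 * Returns nonzero if any bit of `mask` is set in the AT_HWCAP2 word.
 * getauxval() yields 0 when the kernel supplies no such entry, so the
 * result is simply 0 on kernels without AT_HWCAP2.
 */
static int hwcap2_has(unsigned long mask)
{
    return (getauxval(AT_HWCAP2) & mask) != 0;
}
```

      Which bits are meaningful depends on the architecture, so a caller
      would pass a mask taken from that architecture's own headers.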
      
      Signed-off-by: Michael Neuling <michael@neuling.org>
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  2. Apr 24, 2013
  3. Apr 23, 2013
  4. Apr 22, 2013
  5. Apr 21, 2013
    • Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 830ac852
      Linus Torvalds authored
      Pull kdump fixes from Peter Anvin:
       "The kexec/kdump people have found several problems with the support
        for loading over 4 GiB that was introduced in this merge cycle.  This
        is partly due to a number of design problems inherent in the way the
        various pieces of kdump fit together (it is pretty horrifically manual
        in many places.)
      
        After a *lot* of iterations this is the patchset that was agreed upon,
        but of course it is now very late in the cycle.  However, because it
        changes both the syntax and semantics of the crashkernel option, it
        would be desirable to avoid a stable release with the broken
        interfaces."
      
      I'm not happy with the timing, since originally the plan was to release
      the final 3.9 tomorrow.  But apparently I'm doing an -rc8 instead...
      
      * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        kexec: use Crash kernel for Crash kernel low
        x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low
        x86, kdump: Retore crashkernel= to allocate under 896M
        x86, kdump: Set crashkernel_low automatically
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · db93f8b4
      Linus Torvalds authored
      Pull x86 fixes from Peter Anvin:
       "Three groups of fixes:
      
         1. Make sure we don't execute the early microcode patching if family
            < 6, since it would touch MSRs which don't exist on those
            families, causing crashes.
      
         2. The Xen partial emulation of HyperV can be dealt with more
            gracefully than just disabling the driver.
      
         3. More EFI variable space magic.  In particular, variables hidden
            from runtime code need to be taken into account too."
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, microcode: Verify the family before dispatching microcode patching
        x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
        x86,efi: Implement efi_no_storage_paranoia parameter
        efi: Export efi_query_variable_store() for efivars.ko
        x86/Kconfig: Make EFI select UCS2_STRING
        efi: Distinguish between "remaining space" and actually used space
        efi: Pass boot services variable info to runtime code
        Move utf16 functions to kernel core and rename
        x86,efi: Check max_size only if it is non-zero.
        x86, efivars: firmware bug workarounds should be in platform code
    • Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm · 8c3a13c8
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A set of fixes from various people - Will Deacon gets a prize for
        removing code this time around.  The biggest fix in this lot is
        sorting out the ARM740T mess.  The rest are relatively small fixes."
      
      * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: 7699/1: sched_clock: Add more notrace to prevent recursion
        ARM: 7698/1: perf: fix group validation when using enable_on_exec
        ARM: 7697/1: hw_breakpoint: do not use __cpuinitdata for dbg_cpu_pm_nb
        ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon
        ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch()
        ARM: 7692/1: iop3xx: move IOP3XX_PERIPHERAL_VIRT_BASE
        ARM: modules: don't export cpu_set_pte_ext when !MMU
        ARM: mm: remove broken condition check for v4 flushing
        ARM: mm: fix numerous hideous errors in proc-arm740.S
        ARM: cache: remove ARMv3 support code
        ARM: tlbflush: remove ARMv3 support
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 851b3f32
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
      
       1) Fix race in sparc64 TLB shootdowns, we have to synchronize with the
          sibling cpus completing if we are passing them a reference via
          pointer to a data structure.
      
       2) Fix cleaning of bitmaps in sparc32, from Akinobu Mita.
      
       3) Fix various sparc header mistakes, some of which resulted in
          userland build breakage.  From Sam Ravnborg.
      
       4) Kill ghost declarations and defines missed when several bits of code
          got deleted recently.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix race in TLB batch processing.
        sparc: use asm-generic version of types.h
        bbc_i2c: fix section mismatch warning
        sparc: use generic headers
        sparc:cleanup unused code in smp_32.h
        sparc/iommu: fix typo s/265KB/256KB/
        sparc/srmmu: clear trailing edge of bitmap properly
        sparc:remove unused declaration smp_boot_cpus()
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c437d888
      Linus Torvalds authored
      Pull networking updates from David Miller:
      
       1) ax88796 does 64-bit divides which causes link errors on ARM, fix
          from Arnd Bergmann.
      
       2) Once an improper offload setting is detected on an SKB we don't rate
          limit the log message so we can very easily live lock.  From Ben
          Greear.
      
       3) Openvswitch cannot report vport configuration changes reliably
          because it didn't preallocate the netlink notification message
          before changing state.  From Jesse Gross.
      
       4) The effective UID/GID SCM credentials fix, from Linus.
      
       5) When a user explicitly asks for wireless authentication, cfg80211
          isn't told about the AP detachment leaving inconsistent state.  Fix
          from Johannes Berg.
      
       6) Fix self-MAC checks in batman-adv on multi-mesh nodes, from Antonio
          Quartulli.
      
       7) Revert build_skb() changes in IGB driver, can result in memory
          corruption.  From Alexander Duyck.
      
       8) Fix setting VLANs on virtual functions in IXGBE, from Greg Rose.
      
       9) Fix TSO races in qlcnic driver, from Sritej Velaga.
      
      10) In bnx2x the kernel driver and UNDI firmware can try to program the
          chip at the same time, resulting in corruption.  Add proper
          synchronization.  From Dmitry Kravkov.
      
      11) Fix corruption of status block in firmware ram in bnx2x, from Ariel
          Elior.
      
      12) Fix load balancing hash regression of bonding driver in forwarding
          configurations, from Eric Dumazet.
      
      13) Fix TS ECR regression in TCP by calling tcp_replace_ts_recent() in
          all the right spots, from Eric Dumazet.
      
      14) Fix several bonding bugs having to do with address maintenance,
          including not removing address when configuration operations
          encounter errors, missed locking on the address lists, missing
          refcounting on VLAN objects, etc.  All from Nikolay Aleksandrov.
      
      15) Add workarounds for firmware bugs in LTE qmi_wwan devices, wherein
          the devices fail to add a proper ethernet header while on LTE
          networks but otherwise properly do so on 2G and 3G ones.  From Bjørn
          Mork.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
        net: fix incorrect credentials passing
        net: rate-limit warn-bad-offload splats.
        net: ax88796: avoid 64 bit arithmetic
        qlge: Update version to 1.00.00.32.
        qlge: Fix ethtool autoneg advertising.
        qlge: Fix receive path to drop error frames
        net: qmi_wwan: prevent duplicate mac address on link (firmware bug workaround)
        net: qmi_wwan: fixup destination address (firmware bug workaround)
        net: qmi_wwan: fixup missing ethernet header (firmware bug workaround)
        bonding: in bond_mc_swap() bond's mc addr list is walked without lock
        bonding: disable netpoll on enslave failure
        bonding: primary_slave & curr_active_slave are not cleaned on enslave failure
        bonding: vlans don't get deleted on enslave failure
        bonding: mc addresses don't get deleted on enslave failure
        pkt_sched: fix error return code in fw_change_attrs()
        irda: small read past the end of array in debug code
        tcp: call tcp_replace_ts_recent() from tcp_ack()
        netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too
        netfilter: ipset: bitmap:ip,mac: fix listing with timeout
        bonding: fix l23 and l34 load balancing in forwarding path
        ...
    • net: fix incorrect credentials passing · 83f1b4ba
      Linus Torvalds authored
      Commit 257b5358 ("scm: Capture the full credentials of the scm
      sender") changed the credentials passing code to pass in the effective
      uid/gid instead of the real uid/gid.
      
      Obviously this doesn't matter most of the time (since normally they are
      the same), but it results in differences for suid binaries when the wrong
      uid/gid ends up being used.
      
      This just undoes that (presumably unintentional) part of the commit.
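
      The distinction between the two identities can be observed from any
      process. The sketch below is a user-space illustration only and does
      not touch the SCM code path; struct cred_pair and both helpers are
      hypothetical names, while getuid() and geteuid() are the standard POSIX
      calls.

```c
#include <sys/types.h>
#include <unistd.h>

/* The two user identities a process carries. */
struct cred_pair {
    uid_t real;      /* who started the process */
    uid_t effective; /* whose privileges it currently runs with */
};

/* Snapshot the calling process's real and effective uid. */
static struct cred_pair current_uids(void)
{
    struct cred_pair c = { getuid(), geteuid() };
    return c;
}

/* Nonzero when the two ids differ, as they do in a suid binary. */
static int ids_differ(const struct cred_pair *c)
{
    return c->real != c->effective;
}
```

      For an ordinary process both fields are equal, which is why passing the
      wrong one usually goes unnoticed until a suid binary is involved.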
      
      Reported-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. Apr 20, 2013