  1. Jun 17, 2014
  2. Jun 14, 2014
  3. Jun 13, 2014
  4. Jun 12, 2014
  5. Jun 11, 2014
    • powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest. · 74845bc2
      Mahesh Salgaonkar authored
      
      
      Currently we forward MCEs to the guest which have been recovered by guest,
      and for unhandled errors we do not deliver the MCE to the guest. It looks like,
      with no support for FWNMI in qemu, the guest just panics whenever we deliver
      recovered MCEs to it. Also, the existing code used to return to the host for
      unhandled errors, which was causing the guest to hang with soft lockups and
      made it difficult to recover the guest instance.
      
      This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
      And, for recovered errors we just go back to normal functioning of guest
      instead of returning to host. This fixes soft lockup issues in guest.
      This patch also fixes an issue where guest MCE events were not logged to
      host console.
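
The delivery policy described above can be sketched as a small decision function. This is only an illustration: the enum and the function `mce_guest_action()` are hypothetical names that do not come from the patch, and the real change lives in the kernel's KVM code rather than in this form.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum mce_guest_action {
    MCE_RESUME_GUEST,     /* error was recovered: keep running the guest */
    MCE_DELIVER_TO_GUEST  /* fatal error: the guest takes the MCE and panics */
};

/* Decide what to do with a machine check that hit while a guest was running. */
static enum mce_guest_action mce_guest_action(bool recovered)
{
    return recovered ? MCE_RESUME_GUEST : MCE_DELIVER_TO_GUEST;
}
```

In neither branch does control return to the host with the error still pending, which is the behavior the commit message blames for the soft lockups.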
      
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/book3s: Increment the mce counter during machine_check_early call. · e6654d5b
      Mahesh Salgaonkar authored
      
      
      We don't see the MCE counter increase in /proc/interrupts, which gives the
      false impression that no MCE occurred even when there were MCE events.
      The early machine check handling was added for PowerKVM and we forgot to
      increment the MCE count in the early handler.
      
      We also increment the MCE counter in the machine_check_exception call, but
      in most cases where we handle the error the hypervisor never reaches there
      unless it's fatal and we want to crash. Only in a fatal situation may we
      see a double increment of the MCE count. We need to fix that, but for
      now it is always good to have some count increased instead of zero.
      
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/book3s: Add stack overflow check in machine check handler. · e75ad93a
      Mahesh Salgaonkar authored
      
      
      Currently the machine check handler does not check for stack overflow on
      nested machine checks. If we repeatedly hit another MCE from the same
      address while inside the machine check handler, we risk a stack
      overflow, which can cause huge memory corruption. This patch limits the
      nested MCE level to 4 and panics when we cross level 4.
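
A minimal C sketch of the nesting limit follows. The struct, field, and function names are hypothetical and chosen for illustration; only the limit of 4 comes from the commit message, and the real check need not be written in C at all.

```c
#include <stdbool.h>

#define MAX_MCE_DEPTH 4  /* nesting limit stated in the commit message */

/* Hypothetical per-CPU state for this sketch. */
struct cpu_mce_state {
    int in_mce;  /* current machine check nesting level */
};

/*
 * Called on entry to the machine check handler.
 * Returns true if handling may proceed, false if the caller should panic
 * because the nesting level has crossed MAX_MCE_DEPTH.
 */
static bool mce_enter(struct cpu_mce_state *s)
{
    s->in_mce++;
    return s->in_mce <= MAX_MCE_DEPTH;
}

/* Called when the handler returns normally. */
static void mce_exit(struct cpu_mce_state *s)
{
    s->in_mce--;
}
```

With this shape, levels 1 through 4 are handled and the fifth nested entry is the one that triggers the panic path.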
      
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/book3s: Fix machine check handling for unhandled errors · 2749a2f2
      Mahesh Salgaonkar authored
      
      
      The current code does not check for unhandled/unrecovered errors and returns
      from the interrupt if it is a recoverable exception, which in turn triggers
      the same machine check exception in a loop, making the hypervisor unresponsive.
      
      This patch fixes this situation and forces the hypervisor to panic for
      unhandled/unrecovered errors.
      
      This patch also fixes another issue where the unrecoverable_exception routine
      was called in real mode in case of an unrecoverable exception (MSR_RI = 0).
      This causes another exception, vector 0x300 (data access), during the system
      crash, leading to confusion while debugging the cause of the crash.
      
      Also turn the ME bit off while going down, so that when another MCE is hit
      during the panic path, the system will checkstop and the hypervisor will get
      restarted cleanly by the SP.
      
      With the above fixes we now throw correct console messages (see below) while
      crashing the system in case of unhandled/unrecoverable machine checks.
      
      --------------
      Severe Machine check interrupt [Not recovered]
        Initiator: CPU
        Error type: UE [Instruction fetch]
          Effective address: 0000000030002864
      Oops: Machine check, sig: 7 [#1]
      SMP NR_CPUS=2048 NUMA PowerNV
      Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork]
      CPU: 36 PID: 55162 Comm: bash Tainted: G           O 3.14.0mce #1
      task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000
      NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c
      REGS: c000000007ec3d80 TRAP: 0200   Tainted: G           O  (3.14.0mce)
      MSR: 9000000000041002 <SF,HV,ME,RI>  CR: 28222848  XER: 20000000
      CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1
      GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864
      GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9
      GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728
      GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000
      GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc
      GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000
      GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250
      GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000
      NIP [0000000030002864] 0x30002864
      LR [00000000300151a4] 0x300151a4
      Call Trace:
      Instruction dump:
      XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      ---[ end trace 7285f0beac1e29d3 ]---
      
      Sending IPI to other CPUs
      IPI complete
      OPAL V3 detected !
      --------------
      
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Dump PE location code · 357b2f3d
      Gavin Shan authored
      
      
      As Ben suggested, it's meaningful to dump the PE's location code
      for site engineers when hitting EEH errors. The patch introduces
      the function eeh_pe_loc_get() to retrieve the location code from
      the device tree so that we can output it when hitting EEH errors.
      
      If the primary PE bus is the root bus, the PHB's dev-node is tried
      prior to the root port's dev-node. Otherwise, the dev-node of the
      upstream bridge of the primary PE bus is checked for the location
      code directly.
      
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • net: filter: cleanup A/X name usage · e430f34e
      Alexei Starovoitov authored
      
      
      The macro 'A' used in internal BPF interpreter:
       #define A regs[insn->a_reg]
      was easily confused with the name of classic BPF register 'A', since
      'A' would mean two different things depending on context.
      
      This patch is trying to clean up the naming and clarify its usage in the
      following way:
      
      - A and X are names of two classic BPF registers
      
      - BPF_REG_A denotes internal BPF register R0 used to map classic register A
        in internal BPF programs generated from classic
      
      - BPF_REG_X denotes internal BPF register R7 used to map classic register X
        in internal BPF programs generated from classic
      
      - internal BPF instruction format:
      struct sock_filter_int {
              __u8    code;           /* opcode */
              __u8    dst_reg:4;      /* dest register */
              __u8    src_reg:4;      /* source register */
              __s16   off;            /* signed offset */
              __s32   imm;            /* signed immediate constant */
      };
      
      - BPF_X/BPF_K is 1 bit used to encode source operand of instruction
      In classic:
        BPF_X - means use register X as source operand
        BPF_K - means use 32-bit immediate as source operand
      In internal:
        BPF_X - means use 'src_reg' register as source operand
        BPF_K - means use 32-bit immediate as source operand
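
To make the BPF_X/BPF_K distinction for the internal format concrete, here is a small userspace sketch. It mirrors the struct quoted above using `<stdint.h>` types; the struct name `sketch_insn` and the helper `src_operand()` are hypothetical, and the sign extension of the immediate to 64 bits is an assumption of this sketch. The values of `BPF_K`, `BPF_X` and `BPF_SRC()` match the classic BPF header definitions.

```c
#include <stdint.h>

/* Userspace mirror of the instruction layout quoted above (hypothetical name).
 * Bit-fields on uint8_t are accepted by GCC and Clang. */
struct sketch_insn {
    uint8_t code;       /* opcode */
    uint8_t dst_reg:4;  /* dest register */
    uint8_t src_reg:4;  /* source register */
    int16_t off;        /* signed offset */
    int32_t imm;        /* signed immediate constant */
};

#define BPF_K 0x00                      /* source operand is the immediate */
#define BPF_X 0x08                      /* source operand is a register */
#define BPF_SRC(code) ((code) & 0x08)   /* extract the 1-bit source selector */

/* Resolve the source operand of an instruction against a register file. */
static uint64_t src_operand(const struct sketch_insn *insn, const uint64_t *regs)
{
    if (BPF_SRC(insn->code) == BPF_X)
        return regs[insn->src_reg];        /* internal: use 'src_reg' */
    return (uint64_t)(int64_t)insn->imm;   /* 32-bit immediate, sign-extended here */
}
```

In classic BPF the `BPF_X` branch would always read the single register X; in the internal format the same bit selects whichever register `src_reg` names.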
      
      Suggested-by: Chema Gonzalez <chema@google.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Chema Gonzalez <chema@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • powerpc/powernv: Enable POWER8 doorbell IPIs · d4e58e59
      Michael Neuling authored
      
      
      This patch enables POWER8 doorbell IPIs on powernv.
      
      Since doorbells can only IPI within a core, we test to see when we can use
      doorbells and if not we fall back to XICS.  This also enables hypervisor
      doorbells to wake us up from nap/sleep via the LPCR PECEDH bit.
      
      Based on tests by Anton, the best case IPI latency between two threads dropped
      from 894ns to 512ns.
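
The doorbell-or-XICS choice can be sketched as below. This is a simplified model, not the patch's code: the function names are hypothetical, and it assumes hardware threads are numbered so that each core owns a contiguous group of `threads_per_core` ids.

```c
#include <stdbool.h>

enum ipi_method { IPI_DOORBELL, IPI_XICS };

/* Assumption of this sketch: threads of one core have contiguous ids. */
static bool same_core(int cpu_a, int cpu_b, int threads_per_core)
{
    return cpu_a / threads_per_core == cpu_b / threads_per_core;
}

/* Doorbells only reach threads in the sender's core; otherwise use XICS. */
static enum ipi_method pick_ipi(int src, int dst, int threads_per_core)
{
    return same_core(src, dst, threads_per_core) ? IPI_DOORBELL : IPI_XICS;
}
```

For example, with 8 threads per core, an IPI from thread 0 to thread 7 would take the doorbell path, while one from thread 0 to thread 8 would fall back to XICS.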
      
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Fix killed EEH event · 5c7a35e3
      Gavin Shan authored
      
      
      On the PowerNV platform, EEH errors are reported by IO accessors or by a
      poller driven by interrupt. After the PE is isolated, we won't produce
      another EEH event for the PE. The current implementation can lose an EEH
      event in this way:
      
      The interrupt handler queues one "special" event, which drives the poller.
      The EEH thread hasn't picked up the special event yet. IO accessors kick in,
      the frozen PE is marked as "isolated" and an EEH event is queued to the list.
      The EEH thread runs because of the special event and purges all existing EEH
      events. However, we never produce another EEH event for the frozen PE.
      Eventually, the PE is marked as "isolated" and we have no EEH event to
      recover it.
      
      The patch fixes the issue by keeping EEH events for PEs that have been
      marked as "isolated", with the help of an additional "force" argument to
      eeh_remove_event().
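
The effect of the "force" flag can be illustrated with a simplified model of the event list. The types and the function `remove_events()` below are hypothetical stand-ins and do not reproduce the kernel's eeh_remove_event() or its data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a PE and a queued EEH event. */
struct pe {
    bool isolated;
};

struct event {
    struct pe *pe;
    bool queued;
};

/*
 * Purge queued events and return how many were removed.
 * Without force, events whose PE is already isolated are kept, because
 * no further event will be produced for that PE.
 */
static size_t remove_events(struct event *ev, size_t n, bool force)
{
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (!ev[i].queued)
            continue;
        if (!force && ev[i].pe->isolated)
            continue;  /* still needed to recover the frozen PE */
        ev[i].queued = false;
        removed++;
    }
    return removed;
}
```

A purge triggered by the special event would pass `force = false`, so the event for the isolated PE survives and the EEH thread can still recover it.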
      
      Reported-by: Rolf Brudeseth <rolfb@us.ibm.com>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: fix typo 'CONFIG_PMAC' · 6e0fdf9a
      Paul Bolle authored
      
      
      Commit b0d278b7 ("powerpc/perf_event: Reduce latency of calling
      perf_event_do_pending") added a check for CONFIG_PMAC where a check for
      CONFIG_PPC_PMAC was clearly intended.
      
      Fixes: b0d278b7 ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending")
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: fix typo 'CONFIG_PPC_CPU' · b69a1da9
      Paul Bolle authored
      
      
      Commit cd64d169 ("powerpc: mtmsrd not defined") added a check for
      CONFIG_PPC_CPU where a check for CONFIG_PPC_FPU was clearly intended.
      
      Fixes: cd64d169 ("powerpc: mtmsrd not defined")
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Don't escalate non-existing frozen PE · 71b540ad
      Gavin Shan authored
      
      
      Commit cb5b242c ("powerpc/eeh: Escalate error on non-existing PE")
      escalates the frozen state on a non-existing PE to a fenced PHB. It
      was meant to improve kdump reliability. After that, commit 361f2a2a
      ("powrpc/powernv: Reset PHB in kdump kernel") was introduced to
      issue a complete reset on all PHBs to increase the reliability of
      the kdump kernel.
      
      Commit cb5b242c is therefore no longer useful, so revert it.
      
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Report frozen parent PE prior to child PE · 1ad7a72c
      Gavin Shan authored
      
      
      When we have the corner case of a frozen parent and child PE at the
      same time, we have to handle the frozen parent PE prior to the
      child. Without clearing the frozen state on the parent PE, the child
      PE can't be recovered successfully.
      
      The patch searches the EEH PE hierarchy tree and returns the topmost
      frozen PE to be handled. It ensures the frozen parent PE will be
      handled prior to the child PE.
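
One way to picture the search is a walk from the reported PE up through its parents, remembering the highest frozen one. The struct and function below are hypothetical and only model the idea; the kernel's PE hierarchy and its state queries are more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified PE node with a link to its parent. */
struct pe_node {
    struct pe_node *parent;
    bool frozen;
};

/*
 * Walk from pe towards the root and return the highest frozen PE on that
 * path, or NULL if none is frozen.
 */
static struct pe_node *topmost_frozen(struct pe_node *pe)
{
    struct pe_node *top = NULL;

    for (; pe != NULL; pe = pe->parent) {
        if (pe->frozen)
            top = pe;  /* a frozen ancestor supersedes any frozen descendant */
    }
    return top;
}
```

Returning the parent first means its frozen state is cleared before recovery of the child is attempted.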
      
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Clear frozen state for child PE · 2c665992
      Gavin Shan authored
      
      
      Since commit cb523e09 ("powerpc/eeh: Avoid I/O access during PE
      reset"), the PE is kept in frozen state at the hardware level until
      the PE reset is completely done. After that, we explicitly clear
      the frozen state of the affected PE. However, the affected PE might
      have frozen child PEs, and we need to clear their frozen state as
      well. Otherwise, the recovery is going to fail.
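
Clearing the state for a PE and everything below it amounts to a tree traversal. The sketch below uses a hypothetical first-child/next-sibling layout and a hypothetical `clear_frozen()` helper; it is not the kernel's PE structure or API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified PE tree node. */
struct pe_tree {
    struct pe_tree *first_child;
    struct pe_tree *next_sibling;
    bool frozen;
};

/* Clear the frozen flag on pe and all of its descendants.
 * Returns the number of PEs whose state was actually cleared. */
static int clear_frozen(struct pe_tree *pe)
{
    int cleared = 0;

    if (pe->frozen) {
        pe->frozen = false;
        cleared++;
    }
    for (struct pe_tree *c = pe->first_child; c != NULL; c = c->next_sibling)
        cleared += clear_frozen(c);
    return cleared;
}
```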
      
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Reduce panic timeout from 180s to 10s · 4817fc32
      Anton Blanchard authored
      
      
      We've already dropped the default pseries timeout to 10s, do
      the same for powernv.
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/xmon: avoid format string leaking to printk · 50b66dbf
      Kees Cook authored
      
      
      This makes sure format strings cannot leak into printk (the string has
      already been correctly processed for format arguments).
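
The general pattern behind this kind of fix can be shown in userspace with snprintf() standing in for printk. The helper `emit_message()` is hypothetical; the point is only that an already-formatted string is passed as an argument to a constant "%s" format instead of being used as the format itself.

```c
#include <stdio.h>

/*
 * Copy an already-formatted message into out.
 * Because msg is an argument to "%s", any '%' characters it contains are
 * copied literally instead of being interpreted as conversion specifiers.
 */
static int emit_message(char *out, size_t size, const char *msg)
{
    return snprintf(out, size, "%s", msg);
}
```

Had the message been passed as the format string, a sequence such as `%d` inside it would be parsed as a conversion with no matching argument, which is undefined behavior.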
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/perf: Ensure all EBB register state is cleared on fork() · 3df48c98
      Michael Ellerman authored
      
      
      In commit 330a1eb7 "Core EBB support for 64-bit book3s" I messed up
      clear_task_ebb(). It clears some but not all of the task's Event Based
      Branch (EBB) registers when we duplicate a task struct.
      
      That allows a child task to observe the EBBHR & EBBRR of its parent,
      which it should not be able to do.
      
      Fix it by clearing EBBHR & EBBRR.
      
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: stable@vger.kernel.org [v3.11+]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Fix reading of OPAL msglog · caf69ba6
      Joel Stanley authored
      memory_read_from_buffer returns a signed value, so ret should be
      ssize_t.
      
      Fixes the following issue reported by David Binderman:
      
        [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style)
        Checking if unsigned variable 'ret' is less than zero.
        [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style)
        Checking if unsigned variable 'ret' is less than zero.
      
        Local variable "ret" is of type size_t. This is always unsigned,
        so it is pointless to check if it is less than zero.
      
        https://bugzilla.kernel.org/show_bug.cgi?id=77551
      
      
      
      Fixing this exposes a real bug for the case where the entire count
      bytes are successfully read in the POS_WRAP case. The second
      memory_read_from_buffer will return EINVAL, causing the entire read to
      return EINVAL to userspace, despite the data being copied correctly. The
      fix is to test for the case where the data has been read and return
      early.
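
The early-return fix can be modeled in userspace as follows. This is a sketch under stated assumptions: `read_from_buffer()` is a simplified stand-in for the kernel helper (it rejects a negative position with a negative error code), `wrapped_read()` and `SKETCH_EINVAL` are hypothetical names, and the buffer is assumed to have wrapped so that the oldest data sits at `buf[out_pos..size)`.

```c
#include <string.h>
#include <sys/types.h>  /* ssize_t (POSIX) */

#define SKETCH_EINVAL 22  /* stand-in for EINVAL */

/* Simplified stand-in: copy up to count bytes from from[*ppos..available). */
static ssize_t read_from_buffer(char *to, size_t count, long *ppos,
                                const char *from, size_t available)
{
    long pos = *ppos;

    if (pos < 0)
        return -SKETCH_EINVAL;
    if ((size_t)pos >= available)
        return 0;
    if (count > available - (size_t)pos)
        count = available - (size_t)pos;
    memcpy(to, from + pos, count);
    *ppos = pos + (long)count;
    return (ssize_t)count;
}

/* Read from a wrapped log: oldest bytes first, then the bytes before out_pos. */
static ssize_t wrapped_read(char *to, size_t count, long pos,
                            const char *buf, size_t size, size_t out_pos)
{
    size_t avail = size - out_pos;
    ssize_t first, ret;  /* signed, so negative error codes are detectable */

    first = read_from_buffer(to, count, &pos, buf + out_pos, avail);
    if (first < 0)
        return first;
    to += first;
    count -= (size_t)first;
    pos -= (long)avail;

    /* Everything requested came from the first segment: return early.
     * Otherwise pos may now be negative and the second read would fail. */
    if (count == 0)
        return first;

    ret = read_from_buffer(to, count, &pos, buf, out_pos);
    if (ret < 0)
        return ret;
    return first + ret;
}
```

Without the `count == 0` check, a short read that is fully satisfied by the first segment would leave `pos` negative and the second call would report an error even though the data had already been copied.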
      
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/spufs: Remove duplicate SPUFS_CNTL_MAP_SIZE define · be8f9642
      Dan Carpenter authored
      
      
      The SPUFS_CNTL_MAP_SIZE define is cut and pasted twice so we can delete
      the second instance.
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Jeremy Kerr <jk@ozlabs.org>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>