  1. Oct 26, 2016
    • scripts/faddr2line: Fix "size mismatch" error · efdb4167
      Josh Poimboeuf authored
      
      
      I'm not sure how we missed this problem before.  When I take a function
      address and size from an oops and give it to faddr2line, it usually
      complains about a size mismatch:
      
        $ scripts/faddr2line ~/k/vmlinux write_sysrq_trigger+0x51/0x60
        skipping write_sysrq_trigger address at 0xffffffff815731a1 due to size mismatch (0x60 != 83)
        no match for write_sysrq_trigger+0x51/0x60
      
      The problem is caused by differences in how kallsyms and faddr2line
      determine a function's size.
      
      kallsyms calculates a function's size by parsing the output of 'nm -n'
      and subtracting the current function's address from the next function's
      address.  This means that nop instructions after the end of the function
      are included in the size.
      
      In contrast, faddr2line reads the size from the symbol table, which does
      *not* include the ending nops in the function's size.
      
      Change faddr2line to calculate the size from the output of 'nm -n' to be
      consistent with kallsyms and oops outputs.
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/bd313ed7c4003f6b1fda63e825325c44a9d837de.1477405374.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
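The size calculation described in the faddr2line commit above can be sketched in a few lines. This is a minimal Python model, not the actual faddr2line script (which is a shell script): the function name `sizes_from_nm` and the neighbouring symbol `sysrq_handle_loglevel` are assumptions made for illustration, and the addresses are chosen only to be consistent with the `write_sysrq_trigger+0x51/0x60` example.

```python
def sizes_from_nm(nm_output):
    """Return {symbol: size} computed from `nm -n` style text.

    Each usable line looks like "<hex address> <type> <name>".  A symbol's
    size is the distance to the next higher address, so padding such as
    trailing nops is counted, as in the kallsyms-style calculation.
    """
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines and undefined symbols (no address)
        addr, _type, name = fields
        symbols.append((int(addr, 16), name))

    symbols.sort()  # `nm -n` is already sorted; sorting keeps the sketch robust
    sizes = {}
    for (addr, name), (next_addr, _next_name) in zip(symbols, symbols[1:]):
        sizes[name] = next_addr - addr
    return sizes  # the last symbol has no successor, so it gets no size


# Hypothetical nm output; only the 0x60 distance mirrors the example above.
sample = """\
ffffffff81573150 t write_sysrq_trigger
ffffffff815731b0 t sysrq_handle_loglevel
"""
sizes = sizes_from_nm(sample)
print(hex(sizes["write_sysrq_trigger"]))
```

With this method the size includes the padding up to the next symbol, whereas the symbol table in the example reports 83 (0x53) bytes for the same function.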
  2. Oct 25, 2016
  3. Oct 21, 2016
    • x86/dumpstack: Print orig_ax in __show_regs() · 6fa81a12
      Josh Poimboeuf authored
      
      
      The value of regs->orig_ax contains potentially useful debugging data:
      For syscalls it contains the syscall number.  For interrupts it contains
      the (negated) vector number.  To reduce noise, print it only if it has a
      useful value (i.e., something other than -1).
      
      Here's what it looks like for a write syscall:
      
        RIP: 0033:[<00007f53ad7b1940>] 0x7f53ad7b1940
        RSP: 002b:00007fff8de66558 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f53ad7b1940
        RDX: 0000000000000002 RSI: 00007f53ae0ca000 RDI: 0000000000000001
        ...
      
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/93f0fe0307a4af884d3fca00edabcc8cff236002.1476973742.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
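The "print it only if it has a useful value" rule from the orig_ax commit above can be modelled as follows. This is a hedged Python sketch rather than the kernel's C code; the helper name `describe_orig_ax` is hypothetical, and the register is treated as an unsigned 64-bit value so that -1 corresponds to all bits set.

```python
MASK64 = (1 << 64) - 1


def describe_orig_ax(orig_ax):
    """Return the ORIG_RAX fragment for a register dump, or '' when it is -1."""
    value = orig_ax & MASK64          # view the value as unsigned 64-bit
    if value == MASK64:               # -1: no syscall or vector recorded
        return ""
    return " ORIG_RAX: %016x" % value


# Syscall number 1 (write), as in the example dump above.
line = "RSP: 002b:00007fff8de66558 EFLAGS: 00000246" + describe_orig_ax(1)
print(line)
```

For an orig_ax of -1 the helper returns an empty string, so the RSP/EFLAGS line is printed unchanged.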
    • x86/dumpstack: Fix duplicate RIP address display in __show_regs() · 1141c3e3
      Josh Poimboeuf authored
      
      
      The RIP address is shown twice in __show_regs().  Before:
      
        RIP: 0010:[<ffffffff81070446>]  [<ffffffff81070446>] native_write_msr+0x6/0x30
      
      After:
      
        RIP: 0010:[<ffffffff81070446>] native_write_msr+0x6/0x30
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/b3fda66f36761759b000883b059cdd9a7649dcc1.1476973742.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/dumpstack: Print any pt_regs found on the stack · 3b3fa11b
      Josh Poimboeuf authored
      
      
      Now that we can find pt_regs registers on the stack, print them.  Here's
      an example of what it looks like:
      
        Call Trace:
         <IRQ>
         [<ffffffff8144b793>] dump_stack+0x86/0xc3
         [<ffffffff81142c73>] hrtimer_interrupt+0xb3/0x1c0
         [<ffffffff8105eb86>] local_apic_timer_interrupt+0x36/0x60
         [<ffffffff818b27cd>] smp_apic_timer_interrupt+0x3d/0x50
         [<ffffffff818b06ee>] apic_timer_interrupt+0x9e/0xb0
        RIP: 0010:[<ffffffff818aef43>]  [<ffffffff818aef43>] _raw_spin_unlock_irq+0x33/0x60
        RSP: 0018:ffff880079c4f760  EFLAGS: 00000202
        RAX: ffff880078738000 RBX: ffff88007d3da0c0 RCX: 0000000000000007
        RDX: 0000000000006d78 RSI: ffff8800787388f0 RDI: ffff880078738000
        RBP: ffff880079c4f768 R08: 0000002199088f38 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81e0d540
        R13: ffff8800369fb700 R14: 0000000000000000 R15: ffff880078738000
         <EOI>
         [<ffffffff810e1f14>] finish_task_switch+0xb4/0x250
         [<ffffffff810e1ed6>] ? finish_task_switch+0x76/0x250
         [<ffffffff818a7b61>] __schedule+0x3e1/0xb20
         ...
         [<ffffffff810759c8>] trace_do_page_fault+0x58/0x2c0
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
         [<ffffffff818b1dd8>] async_page_fault+0x28/0x30
        RIP: 0010:[<ffffffff8145b062>]  [<ffffffff8145b062>] __clear_user+0x42/0x70
        RSP: 0018:ffff880079c4fd38  EFLAGS: 00010202
        RAX: 0000000000000000 RBX: 0000000000000138 RCX: 0000000000000138
        RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000061b640
        RBP: ffff880079c4fd48 R08: 0000002198feefd7 R09: ffffffff82a40928
        R10: 0000000000000001 R11: 0000000000000000 R12: 000000000061b640
        R13: 0000000000000000 R14: ffff880079c50000 R15: ffff8800791d7400
         [<ffffffff8145b043>] ? __clear_user+0x23/0x70
         [<ffffffff8145b0fb>] clear_user+0x2b/0x40
         [<ffffffff812fbda2>] load_elf_binary+0x1472/0x1750
         [<ffffffff8129a591>] search_binary_handler+0xa1/0x200
         [<ffffffff8129b69b>] do_execveat_common.isra.36+0x6cb/0x9f0
         [<ffffffff8129b5f3>] ? do_execveat_common.isra.36+0x623/0x9f0
         [<ffffffff8129bcaa>] SyS_execve+0x3a/0x50
         [<ffffffff81003f5c>] do_syscall_64+0x6c/0x1e0
         [<ffffffff818afa3f>] entry_SYSCALL64_slow_path+0x25/0x25
        RIP: 0033:[<00007fd2e2f2e537>]  [<00007fd2e2f2e537>] 0x7fd2e2f2e537
        RSP: 002b:00007ffc449c5fc8  EFLAGS: 00000246
        RAX: ffffffffffffffda RBX: 00007ffc449c8860 RCX: 00007fd2e2f2e537
        RDX: 000000000127cc40 RSI: 00007ffc449c8860 RDI: 00007ffc449c6029
        RBP: 00007ffc449c60b0 R08: 65726f632d667265 R09: 00007ffc449c5e20
        R10: 00000000000005a7 R11: 0000000000000246 R12: 000000000127cc40
        R13: 000000000127ce05 R14: 00007ffc449c6029 R15: 000000000127ce01
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5cc2c512ec82cfba00dd22467644d4ed751a48c0.1476973742.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/dumpstack: Print stack identifier on its own line · 79439d8e
      Josh Poimboeuf authored
      
      
      show_trace_log_lvl() prints the stack id (e.g. "<IRQ>") without a
      newline so that any stack address printed after it will appear on the
      same line.  That causes the first stack address to be vertically
      misaligned with the rest, making it visually cluttered and slightly
      confusing:
      
        Call Trace:
         <IRQ> [<ffffffff814431c3>] dump_stack+0x86/0xc3
         [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160
         [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0
         ...
         <EOI> [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60
         [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
      
      It will look worse once we start printing pt_regs registers found in the
      middle of the stack:
      
        <IRQ> RIP: 0010:[<ffffffff8189c6c3>]  [<ffffffff8189c6c3>] _raw_spin_unlock_irq+0x33/0x60
        RSP: 0018:ffff88007876f720  EFLAGS: 00000206
        RAX: ffff8800786caa40 RBX: ffff88007d5da140 RCX: 0000000000000007
        ...
      
      Improve readability by adding a newline to the stack name:
      
        Call Trace:
         <IRQ>
         [<ffffffff814431c3>] dump_stack+0x86/0xc3
         [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160
         [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0
         ...
         <EOI>
         [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60
         [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
      
      Now that "continued" lines are no longer needed, we can also remove the
      hack of using the empty string (aka KERN_CONT) and replace it with
      KERN_DEFAULT.
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/9bdd6dee2c74555d45500939fcc155997dc7889e.1476973742.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
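The layout change in the stack identifier commit above amounts to emitting the stack name as a line of its own instead of as a prefix of the first address. A small Python model of the new layout, assuming a hypothetical `format_trace` helper and a simplified input of (stack name, frames) pairs; the kernel itself does this with printk calls in C.

```python
def format_trace(sections):
    """Format (stack_name, frames) pairs with one item per output line.

    stack_name is a marker such as "<IRQ>", or None for the task stack.
    frames is a list of (address, symbol_text) tuples.
    """
    lines = ["Call Trace:"]
    for stack_name, frames in sections:
        if stack_name:
            lines.append(" " + stack_name)  # marker on its own line
        for addr, symbol in frames:
            lines.append(" [<%016x>] %s" % (addr, symbol))
    return "\n".join(lines)


trace = format_trace([
    ("<IRQ>", [(0xffffffff814431c3, "dump_stack+0x86/0xc3")]),
    ("<EOI>", [(0xffffffff810e1c84, "finish_task_switch+0xb4/0x250")]),
])
print(trace)
```

Because every line is now complete by itself, no line needs to be continued by a later print call, which is the property the commit relies on to drop the KERN_CONT usage.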
    • x86/unwind: Create stack frames for saved syscall registers · acb4608a
      Josh Poimboeuf authored
      
      
      The entry code doesn't encode the pt_regs pointer for syscalls.  But the
      pt_regs are always at the same location, so we can add a manual check
      for them.
      
      A later patch prints them as part of the oops stack dump.  They could be
      useful, for example, to determine the arguments to a system call.
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e176aa9272930cd3f51fda0b94e2eae356677da4.1476973742.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
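The "manual check" for syscall registers mentioned in the commit above can be illustrated with plain address arithmetic. The commit only says the pt_regs are always at the same location; this sketch assumes that location is the very top of the task's kernel stack, with a 16 KiB stack and a 168-byte pt_regs (21 saved 64-bit registers). The helper names and the example stack base are hypothetical.

```python
THREAD_SIZE = 16 * 1024   # assumed kernel stack size in bytes
PT_REGS_SIZE = 21 * 8     # assumed: 21 saved 64-bit registers (r15 .. ss)


def syscall_pt_regs(stack_base):
    """Address where syscall pt_regs would sit for a stack starting at stack_base."""
    return stack_base + THREAD_SIZE - PT_REGS_SIZE


def is_syscall_regs(candidate, stack_base):
    """True if candidate is exactly the fixed syscall pt_regs slot."""
    return candidate == syscall_pt_regs(stack_base)


stack_base = 0xffff880079c4c000          # hypothetical stack start
regs_addr = syscall_pt_regs(stack_base)
print(hex(regs_addr))
```

Since the slot is fixed relative to the stack, an unwinder needs no encoded pointer for this case; it only has to compare the address it has reached against the computed slot.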
    • x86/entry/unwind: Create stack frames for saved interrupt registers · 946c1911
      Josh Poimboeuf authored
      
      
      With frame pointers, when a task is interrupted, its stack is no longer
      completely reliable because the function could have been interrupted
      before it had a chance to save the previous frame pointer on the stack.
      So the caller of the interrupted function could get skipped by a stack
      trace.
      
      This is problematic for live patching, which needs to know whether a
      stack trace of a sleeping task can be relied upon.  There's currently no
      way to detect if a sleeping task was interrupted by a page fault
      exception or preemption before it went to sleep.
      
      Another issue is that when dumping the stack of an interrupted task, the
      unwinder has no way of knowing where the saved pt_regs registers are, so
      it can't print them.
      
      This solves those issues by encoding the pt_regs pointer in the frame
      pointer on entry from an interrupt or an exception.
      
      This patch also updates the unwinder to be able to decode it, because
      otherwise the unwinder would be broken by this change.
      
      Note that this causes a change in the behavior of the unwinder: each
      instance of a pt_regs on the stack is now considered a "frame".  So
      callers of unwind_get_return_address() will now get an occasional
      'regs->ip' address that would have previously been skipped over.
      
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8b9f84a21e39d249049e0547b559ff8da0df0988.1476973742.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
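One way to encode a pt_regs pointer in the frame pointer, as the commit above describes, is to tag the pointer's least significant bit: real frame pointers and pt_regs addresses are at least 2-byte aligned, so a set low bit can mark "this is a pt_regs pointer". The commit message does not spell out the encoding, so treat the low-bit scheme and the function names below as assumptions; the sketch models addresses as Python integers.

```python
def encode_frame_pointer(regs_addr):
    """Tag an aligned pt_regs address so it can be stored as a frame pointer."""
    if regs_addr & 1:
        raise ValueError("pt_regs address must have its low bit clear")
    return regs_addr | 1


def decode_frame_pointer(bp):
    """Return the pt_regs address if bp is tagged, else None (ordinary frame)."""
    if not bp & 1:
        return None
    return bp & ~1


regs = 0xffff880079c4f6b8               # hypothetical pt_regs address
encoded = encode_frame_pointer(regs)
print(hex(encoded), hex(decode_frame_pointer(encoded)))
```

An unwinder following the frame-pointer chain would call the decode step on each saved value: a `None` result means a normal frame, while a decoded address is reported as a pt_regs "frame" whose `regs->ip` becomes the next return address.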
    • entry/64: Remove unused ZERO_EXTRA_REGS macro · 29a6d796
      Alexander Kuleshov authored
      
      
      Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Link: http://lkml.kernel.org/r/20161020120704.24042-1-kuleshovmail@gmail.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. Oct 20, 2016
  5. Oct 19, 2016