Skip to content
  1. May 24, 2017
    • Ingo Molnar's avatar
      tools/include: Sync kernel ABI headers with tooling headers · 6e30437b
      Ingo Molnar authored
      Sync (copy) the following v4.12 kernel headers to the tooling headers:
      
        arch/x86/include/asm/disabled-features.h:
        arch/x86/include/uapi/asm/kvm.h:
        arch/powerpc/include/uapi/asm/kvm.h:
        arch/s390/include/uapi/asm/kvm.h:
        arch/arm/include/uapi/asm/kvm.h:
        arch/arm64/include/uapi/asm/kvm.h:
      
         - 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
           fortunately none of the (in-kernel) tooling relied on it
      
         - new KVM_DEV calls added
      
        arch/x86/include/asm/required-features.h:
      
         - 5-level paging hardware ABI detail added
      
        arch/x86/include/asm/cpufeatures.h:
      
         - new CPU feature added
      
        arch/x86/include/uapi/asm/vmx.h:
      
         - new VMX exit conditions
      
      None of the changes requires fixes in the tooling source code.
      
      This addresses the following warnings:
      
        Warning: include/uapi/linux/stat.h differs from kernel
        Warning: arch/x86/include/asm/disabled-features.h differs from kernel
        Warning: arch/x86/include/asm/required-features.h differs from kernel
        Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
        Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
        Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
      
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6e30437b
    • Namhyung Kim's avatar
      perf tools: Put caller above callee in --children mode · 7111ffff
      Namhyung Kim authored
      
      
      The __hpp__sort_acc() sorts entries using callchain depth in order to
      put callers above in children mode.  But it assumed the callchain order
      was callee-first.  Now default (for children) is caller-first so the
      order of entries is reverted.
      
      For example, consider following case:
      
        $ perf report --no-children
        ..l
        # Overhead  Command  Shared Object        Symbol
        # ........  .......  ...................  ..........................
        #
            99.44%  a.out    a.out                [.] main
                    |
                    ---main
                       __libc_start_main
                       _start
      
      Then children mode should show 'start' above '__libc_start_main' since
      it's the caller (parent) of the __libc_start_main.  But it's reversed:
      
        # Children      Self  Command  Shared Object    Symbol
        # ........  ........  .......  ...............  .....................
        #
            99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
            99.61%     0.00%  a.out    a.out            [.] _start
            99.54%    99.44%  a.out    a.out            [.] main
      
      This patch fixes it.
      
        # Children      Self  Command  Shared Object    Symbol
        # ........  ........  .......  ...............  .....................
        #
            99.61%     0.00%  a.out    a.out            [.] _start
            99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
            99.54%    99.44%  a.out    a.out            [.] main
      
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7111ffff
    • Milian Wolff's avatar
      perf report: Do not drop last inlined frame · 4d53b9d5
      Milian Wolff authored
      
      
      The very last inlined frame, i.e. the one furthest away from the
      non-inlined frame, was silently dropped. This is apparent when
      comparing the output of `perf script` and `addr2line`:
      
      ~~~~~~
        $ perf script --inline
        ...
        a.out 26722 80836.309329:      72425 cycles:
                           21561 __hypot_finite (/usr/lib/libm-2.25.so)
                            ace3 hypot (/usr/lib/libm-2.25.so)
                             a4a main (a.out)
                                 std::abs<double>
                                 std::_Norm_helper<true>::_S_do_it<double>
                                 std::norm<double>
                                 main
                           20510 __libc_start_main (/usr/lib/libc-2.25.so)
                             bd9 _start (a.out)
      
        $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
        0x0000000000000a4a
        std::__complex_abs(doublecomplex )
        /usr/include/c++/6.3.1/complex:589
        double std::abs<double>(std::complex<double> const&)
        /usr/include/c++/6.3.1/complex:597
        double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
        /usr/include/c++/6.3.1/complex:654
        double std::norm<double>(std::complex<double> const&)
        /usr/include/c++/6.3.1/complex:664
        main
        /tmp/inlining.cpp:14
      ~~~~~
      
      Note how `std::__complex_abs` is missing from the `perf script`
      output. This is similarly showing up in `perf report`. The patch
      here fixes this issue, and the output becomes:
      
      ~~~~~
        a.out 26722 80836.309329:      72425 cycles:
                           21561 __hypot_finite (/usr/lib/libm-2.25.so)
                            ace3 hypot (/usr/lib/libm-2.25.so)
                             a4a main (a.out)
                                 std::__complex_abs
                                 std::abs<double>
                                 std::_Norm_helper<true>::_S_do_it<double>
                                 std::norm<double>
                                 main
                           20510 __libc_start_main (/usr/lib/libc-2.25.so)
                             bd9 _start (a.out)
      ~~~~~
      
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4d53b9d5
    • Milian Wolff's avatar
      perf report: Always honor callchain order for inlined nodes · 28071f51
      Milian Wolff authored
      
      
      So far, the inlined nodes where only reversed when we built perf
      against libbfd. If that was not available, the addr2line fallback
      code path was missing the inline_list__reverse call.
      
      Now we always add the nodes in the correct order within
      inline_list__append. This removes the need to reverse the list
      and also ensures that all callers construct the list in the right
      order.
      
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      28071f51
    • Namhyung Kim's avatar
      perf script: Add --inline option for debugging · 325fbff5
      Namhyung Kim authored
      
      
      The --inline option is to show inlined functions in callchains.
      
      For example:
      
        $ perf script
        a.out  5644 11611.467597:     309961 cycles:u:
                           790 main (/home/namhyung/tmp/perf/a.out)
                         20511 __libc_start_main (/usr/lib/libc-2.25.so)
                           8ba _start (/home/namhyung/tmp/perf/a.out)
        ...
      
        $ perf script --inline
        a.out  5644 11611.467597:     309961 cycles:u:
                           790 main (/home/namhyung/tmp/perf/a.out)
                               std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
                               std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                               std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                               main
                         20511 __libc_start_main (/usr/lib/libc-2.25.so)
                           8ba _start (/home/namhyung/tmp/perf/a.out)
        ...
      
      Reviewed-and-tested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      325fbff5
    • Milian Wolff's avatar
      perf report: Fix off-by-one for non-activation frames · 1982ad48
      Milian Wolff authored
      
      
      As the documentation for dwfl_frame_pc says, frames that
      are no activation frames need to have their program counter
      decremented by one to properly find the function of the caller.
      
      This fixes many cases where perf report currently attributes
      the cost to the next line. I.e. I have code like this:
      
      ~~~~~~~~~~~~~~~
        #include <thread>
        #include <chrono>
      
        using namespace std;
      
        int main()
        {
          this_thread::sleep_for(chrono::milliseconds(1000));
          this_thread::sleep_for(chrono::milliseconds(100));
          this_thread::sleep_for(chrono::milliseconds(10));
      
          return 0;
        }
      ~~~~~~~~~~~~~~~
      
      Now compile and record it:
      
      ~~~~~~~~~~~~~~~
        g++ -std=c++11 -g -O2 test.cpp
        echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
        perf record \
          --event sched:sched_stat_sleep \
          --event sched:sched_process_exit \
          --event sched:sched_switch --call-graph=dwarf \
          --output perf.data.raw \
          ./a.out
        echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
        perf inject --sched-stat --input perf.data.raw --output perf.data
      ~~~~~~~~~~~~~~~
      
      Before this patch, the report clearly shows the off-by-one issue.
      Most notably, the last sleep invocation is incorrectly attributed
      to the "return 0;" line:
      
      ~~~~~~~~~~~~~~~
        Overhead  Source:Line
        ........  ...........
      
         100.00%  core.c:0
                  |
                  ---__schedule core.c:0
                     schedule
                     do_nanosleep hrtimer.c:0
                     hrtimer_nanosleep
                     sys_nanosleep
                     entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
                     __nanosleep_nocancel .:0
                     std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
                     |
                     |--90.08%--main test.cpp:9
                     |          __libc_start_main
                     |          _start
                     |
                     |--9.01%--main test.cpp:10
                     |          __libc_start_main
                     |          _start
                     |
                      --0.91%--main test.cpp:13
                                __libc_start_main
                                _start
      ~~~~~~~~~~~~~~~
      
      With this patch here applied, the issue is fixed. The report becomes
      much more usable:
      
      ~~~~~~~~~~~~~~~
        Overhead  Source:Line
        ........  ...........
      
         100.00%  core.c:0
                  |
                  ---__schedule core.c:0
                     schedule
                     do_nanosleep hrtimer.c:0
                     hrtimer_nanosleep
                     sys_nanosleep
                     entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
                     __nanosleep_nocancel .:0
                     std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
                     |
                     |--90.08%--main test.cpp:8
                     |          __libc_start_main
                     |          _start
                     |
                     |--9.01%--main test.cpp:9
                     |          __libc_start_main
                     |          _start
                     |
                      --0.91%--main test.cpp:10
                                __libc_start_main
                                _start
      ~~~~~~~~~~~~~~~
      
      Similarly it works for signal frames:
      
      ~~~~~~~~~~~~~~~
        __noinline void bar(void)
        {
          volatile long cnt = 0;
      
          for (cnt = 0; cnt < 100000000; cnt++);
        }
      
        __noinline void foo(void)
        {
          bar();
        }
      
        void sig_handler(int sig)
        {
          foo();
        }
      
        int main(void)
        {
          signal(SIGUSR1, sig_handler);
          raise(SIGUSR1);
      
          foo();
          return 0;
        }
      ~~~~~~~~~~~~~~~~
      
      Before, the report wrongly points to `signal.c:29` after raise():
      
      ~~~~~~~~~~~~~~~~
        $ perf report --stdio --no-children -g srcline -s srcline
        ...
         100.00%  signal.c:11
                  |
                  ---bar signal.c:11
                     |
                     |--50.49%--main signal.c:29
                     |          __libc_start_main
                     |          _start
                     |
                      --49.51%--0x33a8f
                                raise .:0
                                main signal.c:29
                                __libc_start_main
                                _start
      ~~~~~~~~~~~~~~~~
      
      With this patch in, the issue is fixed and we instead get:
      
      ~~~~~~~~~~~~~~~~
         100.00%  signal   signal            [.] bar
                  |
                  ---bar signal.c:11
                     |
                     |--50.49%--main signal.c:29
                     |          __libc_start_main
                     |          _start
                     |
                      --49.51%--0x33a8f
                                raise .:0
                                main signal.c:27
                                __libc_start_main
                                _start
      ~~~~~~~~~~~~~~~~
      
      Note how this patch fixes this issue for both unwinding methods, i.e.
      both dwfl and libunwind. The former case is straight-forward thanks
      to dwfl_frame_pc(). For libunwind, we replace the functionality via
      unw_is_signal_frame() for any but the very first frame.
      
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1982ad48
    • Milian Wolff's avatar
      perf report: Fix memory leak in addr2line when called by addr2inlines · b21cc978
      Milian Wolff authored
      
      
      When a filename was found in addr2line it was duplicated via strdup()
      but never freed. Now we pass NULL and handle this gracefully in
      addr2line.
      
      Detected by Valgrind:
      
        ==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
        ==16331==    at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
        ==16331==    by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
        ==16331==    by 0x52769F: addr2line (srcline.c:256)
        ==16331==    by 0x52769F: addr2inlines (srcline.c:294)
        ==16331==    by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
        ==16331==    by 0x574D7A: inline__fprintf (hist.c:41)
        ==16331==    by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
        ==16331==    by 0x57518A: __callchain__fprintf_graph (hist.c:212)
        ==16331==    by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
        ==16331==    by 0x57738E: hist_entry__fprintf (hist.c:628)
        ==16331==    by 0x57738E: hists__fprintf (hist.c:882)
        ==16331==    by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
        ==16331==    by 0x44A20F: report__browse_hists (builtin-report.c:491)
        ==16331==    by 0x44A20F: __cmd_report (builtin-report.c:624)
        ==16331==    by 0x44A20F: cmd_report (builtin-report.c:1054)
        ==16331==    by 0x4A49CE: run_builtin (perf.c:296)
        ==16331==    by 0x4A4CC0: handle_internal_command (perf.c:348)
        ==16331==    by 0x434371: run_argv (perf.c:392)
        ==16331==    by 0x434371: main (perf.c:530)
      
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b21cc978
    • Milian Wolff's avatar
      perf report: Don't crash on invalid maps in `-g srcline` mode · 7d4df089
      Milian Wolff authored
      
      
      I just hit a segfault when doing `perf report -g srcline`.
      Valgrind pointed me at this code as the culprit:
      
        ==8359== Invalid read of size 8
        ==8359==    at 0x3096D9: map__rip_2objdump (map.c:430)
        ==8359==    by 0x2FC1A3: match_chain_srcline (callchain.c:645)
        ==8359==    by 0x2FC1A3: match_chain (callchain.c:700)
        ==8359==    by 0x2FC1A3: append_chain (callchain.c:895)
        ==8359==    by 0x2FC1A3: append_chain_children (callchain.c:846)
        ==8359==    by 0x2FF719: callchain_append (callchain.c:944)
        ==8359==    by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
        ==8359==    by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
        ==8359==    by 0x33195C: hist_entry_iter__add (hist.c:1050)
        ==8359==    by 0x258F65: process_sample_event (builtin-report.c:204)
        ==8359==    by 0x30D60C: perf_session__deliver_event (session.c:1310)
        ==8359==    by 0x30D60C: ordered_events__deliver_event (session.c:119)
        ==8359==    by 0x310D12: __ordered_events__flush (ordered-events.c:210)
        ==8359==    by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
        ==8359==    by 0x30DD3C: perf_session__process_user_event (session.c:1349)
        ==8359==    by 0x30DD3C: perf_session__process_event (session.c:1475)
        ==8359==    by 0x30FC3C: __perf_session__process_events (session.c:1867)
        ==8359==    by 0x30FC3C: perf_session__process_events (session.c:1921)
        ==8359==    by 0x25A985: __cmd_report (builtin-report.c:575)
        ==8359==    by 0x25A985: cmd_report (builtin-report.c:1054)
        ==8359==    by 0x2B9A80: run_builtin (perf.c:296)
        ==8359==  Address 0x70 is not stack'd, malloc'd or (recently) free'd
      
      This patch fixes the issue.
      
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      [ Remove dependency from another change ]
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7d4df089
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 56fff1bb
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Fix the i2c-designware regression of rc2.
      
        Also, a DMA buffer fix for the tiny-usb driver where the USB core now
        loudly complains about the non DMA-capable buffer"
      
      [ I had cherry-picked the designware fix separately because it hit my
        laptop, but here is the proper sync with the i2c tree   - Linus ]
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: designware: Fix bogus sda_hold_time due to uninitialized vars
        i2c: i2c-tiny-usb: fix buffer not being DMA capable
      56fff1bb
  2. May 23, 2017
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · fde8e33d
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "This fixes a regression in the skcipher interface that allows bogus
        key parameters to hit underlying implementations which can cause
        crashes"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: skcipher - Add missing API setkey checks
      fde8e33d
    • Linus Torvalds's avatar
      Merge tag 'pstore-v4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · fadd2ce5
      Linus Torvalds authored
      Pull pstore fix from Kees Cook:
       "Marta noticed another misbehavior in EFI pstore, which this fixes.
      
        Hopefully this is the last of the v4.12 fixes for pstore!"
      
      * tag 'pstore-v4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        efi-pstore: Fix write/erase id tracking
      fadd2ce5
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 74a9e7db
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These revert a 4.11 change that turned out to be problematic and add a
        .gitignore file.
      
        Specifics:
      
         - Revert a 4.11 commit related to the ACPI-based handling of laptop
           lids that made changes incompatible with existing user space stacks
           and broke things there (Lv Zheng).
      
         - Add .gitignore to the ACPI tools directory (Prarit Bhargava)"
      
      * tag 'acpi-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "ACPI / button: Remove lid_init_state=method mode"
        tools/power/acpi: Add .gitignore file
      74a9e7db
    • Linus Torvalds's avatar
      Merge tag 'pm-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 801099be
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix RTC wakeup from suspend-to-idle broken recently, fix CPU
        idleness detection condition in the schedutil cpufreq governor, fix a
        cpufreq driver build failure, fix an error code path in the power
        capping framework, clean up the hibernate core and update the
        intel_pstate documentation.
      
        Specifics:
      
         - Fix RTC wakeup from suspend-to-idle broken by the recent rework of
           ACPI wakeup handling (Rafael Wysocki).
      
         - Update intel_pstate driver documentation to reflect the current
           code and explain how it works in more detail (Rafael Wysocki).
      
         - Fix an issue related to CPU idleness detection on systems with
           shared cpufreq policies in the schedutil governor (Juri Lelli).
      
         - Fix a possible build issue in the dbx500 cpufreq driver (Arnd
           Bergmann).
      
         - Fix a function in the power capping framework core to return an
           error code instead of 0 when there's an error (Dan Carpenter).
      
         - Clean up variable definition in the hibernation core (Pushkar
           Jambhlekar)"
      
      * tag 'pm-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: dbx500: add a Kconfig symbol
        PM / hibernate: Declare variables as static
        PowerCap: Fix an error code in powercap_register_zone()
        RTC: rtc-cmos: Fix wakeup from suspend-to-idle
        PM / wakeup: Fix up wakeup_source_report_event()
        cpufreq: intel_pstate: Document the current behavior and user interface
        cpufreq: schedutil: use now as reference when aggregating shared policy requests
      801099be
    • Jan Kiszka's avatar
      i2c: designware: Fix bogus sda_hold_time due to uninitialized vars · ad258fb9
      Jan Kiszka authored
      
      
      We need to initializes those variables to 0 for platforms that do not
      provide ACPI parameters. Otherwise, we set sda_hold_time to random
      values, breaking e.g. Galileo and IOT2000 boards.
      
      Reported-and-tested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: default avatarTobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
      Fixes: 9d640843
      
       ("i2c: designware: don't infer timings described by ACPI from clock rate")
      Signed-off-by: default avatarJan Kiszka <jan.kiszka@siemens.com>
      Reviewed-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: default avatarJarkko Nikula <jarkko.nikula@linux.intel.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ad258fb9
    • Kees Cook's avatar
      efi-pstore: Fix write/erase id tracking · c10e8031
      Kees Cook authored
      
      
      Prior to the pstore interface refactoring, the "id" generated during
      a backend pstore_write() was only retained by the internal pstore
      inode tracking list. Additionally the "part" was ignored, so EFI
      would encode this in the id. This corrects the misunderstandings
      and correctly sets "id" during pstore_write(), and uses "part"
      directly during pstore_erase().
      
      Reported-by: default avatarMarta Lofstedt <marta.lofstedt@intel.com>
      Fixes: 76cc9580 ("pstore: Replace arguments for write() API")
      Fixes: a61072aa
      
       ("pstore: Replace arguments for erase() API")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Tested-by: default avatarMarta Lofstedt <marta.lofstedt@intel.com>
      c10e8031
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 86ca984c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Mostly netfilter bug fixes in here, but we have some bits elsewhere as
        well.
      
         1) Don't do SNAT replies for non-NATed connections in IPVS, from
            Julian Anastasov.
      
         2) Don't delete conntrack helpers while they are still in use, from
            Liping Zhang.
      
         3) Fix zero padding in xtables's xt_data_to_user(), from Willem de
            Bruijn.
      
         4) Add proper RCU protection to nf_tables_dump_set() because we
            cannot guarantee that we hold the NFNL_SUBSYS_NFTABLES lock. From
            Liping Zhang.
      
         5) Initialize rcv_mss in tcp_disconnect(), from Wei Wang.
      
         6) smsc95xx devices can't handle IPV6 checksums fully, so don't
            advertise support for offloading them. From Nisar Sayed.
      
         7) Fix out-of-bounds access in __ip6_append_data(), from Eric
            Dumazet.
      
         8) Make atl2_probe() propagate the error code properly on failures,
            from Alexey Khoroshilov.
      
         9) arp_target[] in bond_check_params() is used uninitialized. This
            got changes from a global static to a local variable, which is how
            this mistake happened. Fix from Jarod Wilson.
      
        10) Fix fallout from unnecessary NULL check removal in cls_matchall,
            from Jiri Pirko. This is definitely brown paper bag territory..."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
        net: sched: cls_matchall: fix null pointer dereference
        vsock: use new wait API for vsock_stream_sendmsg()
        bonding: fix randomly populated arp target array
        net: Make IP alignment calulations clearer.
        bonding: fix accounting of active ports in 3ad
        net: atheros: atl2: don't return zero on failure path in atl2_probe()
        ipv6: fix out of bound writes in __ip6_append_data()
        bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
        smsc95xx: Support only IPv4 TCP/UDP csum offload
        arp: always override existing neigh entries with gratuitous ARP
        arp: postpone addr_type calculation to as late as possible
        arp: decompose is_garp logic into a separate function
        arp: fixed error in a comment
        tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0
        netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT
        ebtables: arpreply: Add the standard target sanity check
        netfilter: nf_tables: revisit chain/object refcounting from elements
        netfilter: nf_tables: missing sanitization in data from userspace
        netfilter: nf_tables: can't assume lock is acquired when dumping set elems
        netfilter: synproxy: fix conntrackd interaction
        ...
      86ca984c
    • Jiri Pirko's avatar
      net: sched: cls_matchall: fix null pointer dereference · 2d76b2f8
      Jiri Pirko authored
      Since the head is guaranteed by the check above to be null, the call_rcu
      would explode. Remove the previously logically dead code that was made
      logically very much alive and kicking.
      
      Fixes: 985538ee
      
       ("net/sched: remove redundant null check on head")
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2d76b2f8
    • WANG Cong's avatar
      vsock: use new wait API for vsock_stream_sendmsg() · 499fde66
      WANG Cong authored
      As reported by Michal, vsock_stream_sendmsg() could still
      sleep at vsock_stream_has_space() after prepare_to_wait():
      
        vsock_stream_has_space
          vmci_transport_stream_has_space
            vmci_qpair_produce_free_space
              qp_lock
                qp_acquire_queue_mutex
                  mutex_lock
      
      Just switch to the new wait API like we did for commit
      d9dc8b0f
      
       ("net: fix sleeping for sk_wait_event()").
      
      Reported-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Jorgen Hansen <jhansen@vmware.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      499fde66
    • Jarod Wilson's avatar
      bonding: fix randomly populated arp target array · 72ccc471
      Jarod Wilson authored
      In commit dc9c4d0f, the arp_target array moved from a static global
      to a local variable. By the nature of static globals, the array used to
      be initialized to all 0. At present, it's full of random data, which
      that gets interpreted as arp_target values, when none have actually been
      specified. Systems end up booting with spew along these lines:
      
      [   32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
      [   32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
      [   32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
      [   32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
      [   32.204892] lacp0: Setting MII monitoring interval to 100
      [   32.211071] lacp0: Removing ARP target 216.124.228.17
      [   32.216824] lacp0: Removing ARP target 218.160.255.255
      [   32.222646] lacp0: Removing ARP target 185.170.136.184
      [   32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
      [   32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
      [   32.243987] lacp0: Removing ARP target 56.125.228.17
      [   32.249625] lacp0: Removing ARP target 218.160.255.255
      [   32.255432] lacp0: Removing ARP target 15.157.233.184
      [   32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
      [   32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
      [   32.276632] lacp0: Removing ARP target 16.0.0.0
      [   32.281755] lacp0: Removing ARP target 218.160.255.255
      [   32.287567] lacp0: Removing ARP target 72.125.228.17
      [   32.293165] lacp0: Removing ARP target 218.160.255.255
      [   32.298970] lacp0: Removing ARP target 8.125.228.17
      [   32.304458] lacp0: Removing ARP target 218.160.255.255
      
      None of these were actually specified as ARP targets, and the driver does
      seem to clean up the mess okay, but it's rather noisy and confusing, leaks
      values to userspace, and the 255.255.255.255 spew shows up even when debug
      prints are disabled.
      
      The fix: just zero out arp_target at init time.
      
      While we're in here, init arp_all_targets_value in the right place.
      
      Fixes: dc9c4d0f
      
       ("bonding: reduce scope of some global variables")
      CC: Mahesh Bandewar <maheshb@google.com>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarJarod Wilson <jarod@redhat.com>
      Acked-by: default avatarAndy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      72ccc471
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-sleep' and 'powercap' · bb47e964
      Rafael J. Wysocki authored
      * pm-sleep:
        PM / hibernate: Declare variables as static
        RTC: rtc-cmos: Fix wakeup from suspend-to-idle
        PM / wakeup: Fix up wakeup_source_report_event()
      
      * powercap:
        PowerCap: Fix an error code in powercap_register_zone()
      bb47e964
    • Rafael J. Wysocki's avatar
      Merge branches 'acpi-button' and 'acpi-tools' · e3170cc0
      Rafael J. Wysocki authored
      * acpi-button:
        Revert "ACPI / button: Remove lid_init_state=method mode"
      
      * acpi-tools:
        tools/power/acpi: Add .gitignore file
      e3170cc0
    • Rafael J. Wysocki's avatar
      Merge branches 'intel_pstate', 'pm-cpufreq' and 'pm-cpufreq-sched' · 079c1812
      Rafael J. Wysocki authored
      * intel_pstate:
        cpufreq: intel_pstate: Document the current behavior and user interface
      
      * pm-cpufreq:
        cpufreq: dbx500: add a Kconfig symbol
      
      * pm-cpufreq-sched:
        cpufreq: schedutil: use now as reference when aggregating shared policy requests
      079c1812
    • David S. Miller's avatar
      net: Make IP alignment calulations clearer. · e4eda884
      David S. Miller authored
      
      
      The assignmnet:
      
      	ip_align = strict ? 2 : NET_IP_ALIGN;
      
      in compare_pkt_ptr_alignment() trips up Coverity because we can only
      get to this code when strict is true, therefore ip_align will always
      be 2 regardless of NET_IP_ALIGN's value.
      
      So just assign directly to '2' and explain the situation in the
      comment above.
      
      Reported-by: default avatar"Gustavo A. R. Silva" <garsilva@embeddedor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e4eda884
    • Jarod Wilson's avatar
      bonding: fix accounting of active ports in 3ad · 751da2a6
      Jarod Wilson authored
      As of 7bb11dc9 and 0622cab0, bond slaves in a 3ad bond are not
      removed from the aggregator when they are down, and the active slave count
      is NOT equal to number of ports in the aggregator, but rather the number
      of ports in the aggregator that are still enabled. The sysfs spew for
      bonding_show_ad_num_ports() has a comment that says "Show number of active
      802.3ad ports.", but it's currently showing total number of ports, both
      active and inactive. Remedy it by using the same logic introduced in
      0622cab0
      
       in __bond_3ad_get_active_agg_info(), so sysfs, procfs and
      netlink all report the number of active ports. Note that this means that
      IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of
      NUM_PORTS, and thus perhaps should be renamed for clarity.
      
      Lightly tested on a dual i40e lacp bond, simulating link downs with an ip
      link set dev <slave2> down, was able to produce the state where I could
      see both in the same aggregator, but a number of ports count of 1.
      
      MII Status: up
      Active Aggregator Info:
              Aggregator ID: 1
              Number of ports: 2 <---
      Slave Interface: ens10
      MII Status: up <---
      Aggregator ID: 1
      Slave Interface: ens11
      MII Status: up
      Aggregator ID: 1
      
      MII Status: up
      Active Aggregator Info:
              Aggregator ID: 1
              Number of ports: 1 <---
      Slave Interface: ens10
      MII Status: down <---
      Aggregator ID: 1
      Slave Interface: ens11
      MII Status: up
      Aggregator ID: 1
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: default avatarJarod Wilson <jarod@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      751da2a6
    • Alexey Khoroshilov's avatar
      net: atheros: atl2: don't return zero on failure path in atl2_probe() · bd703a15
      Alexey Khoroshilov authored
      
      
      If dma mask checks fail in atl2_probe(), it breaks off initialization,
      deallocates all resources, but returns zero.
      
      The patch adds proper error code return value and
      make error code setup unified.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bd703a15
  3. May 22, 2017
    • Eric Dumazet's avatar
      ipv6: fix out of bound writes in __ip6_append_data() · 232cd35d
      Eric Dumazet authored
      
      
      Andrey Konovalov and idaifish@gmail.com reported crashes caused by
      one skb shared_info being overwritten from __ip6_append_data()
      
      Andrey program lead to following state :
      
      copy -4200 datalen 2000 fraglen 2040
      maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200
      
      The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
      fraggap, 0); is overwriting skb->head and skb_shared_info
      
      Since we apparently detect this rare condition too late, move the
      code earlier to even avoid allocating skb and risking crashes.
      
      Once again, many thanks to Andrey and syzkaller team.
      
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Reported-by: default avatar <idaifish@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      232cd35d
    • Jan Kiszka's avatar
      i2c: designware: Fix bogus sda_hold_time due to uninitialized vars · e2c82492
      Jan Kiszka authored
      We need to initializes those variables to 0 for platforms that do not
      provide ACPI parameters. Otherwise, we set sda_hold_time to random
      values, breaking e.g. Galileo and IOT2000 boards.
      
      Fixes: 9d640843
      
       ("i2c: designware: don't infer timings described by ACPI from clock rate")
      Signed-off-by: default avatarJan Kiszka <jan.kiszka@siemens.com>
      Reviewed-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: default avatarJarkko Nikula <jarkko.nikula@linux.intel.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      e2c82492
    • Sebastian Reichel's avatar
      i2c: i2c-tiny-usb: fix buffer not being DMA capable · 5165da59
      Sebastian Reichel authored
      
      
      Since v4.9 i2c-tiny-usb generates the below call trace
      and longer works, since it can't communicate with the
      USB device. The reason is, that since v4.9 the USB
      stack checks, that the buffer it should transfer is DMA
      capable. This was a requirement since v2.2 days, but it
      usually worked nevertheless.
      
      [   17.504959] ------------[ cut here ]------------
      [   17.505488] WARNING: CPU: 0 PID: 93 at drivers/usb/core/hcd.c:1587 usb_hcd_map_urb_for_dma+0x37c/0x570
      [   17.506545] transfer buffer not dma capable
      [   17.507022] Modules linked in:
      [   17.507370] CPU: 0 PID: 93 Comm: i2cdetect Not tainted 4.11.0-rc8+ #10
      [   17.508103] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      [   17.509039] Call Trace:
      [   17.509320]  ? dump_stack+0x5c/0x78
      [   17.509714]  ? __warn+0xbe/0xe0
      [   17.510073]  ? warn_slowpath_fmt+0x5a/0x80
      [   17.510532]  ? nommu_map_sg+0xb0/0xb0
      [   17.510949]  ? usb_hcd_map_urb_for_dma+0x37c/0x570
      [   17.511482]  ? usb_hcd_submit_urb+0x336/0xab0
      [   17.511976]  ? wait_for_completion_timeout+0x12f/0x1a0
      [   17.512549]  ? wait_for_completion_timeout+0x65/0x1a0
      [   17.513125]  ? usb_start_wait_urb+0x65/0x160
      [   17.513604]  ? usb_control_msg+0xdc/0x130
      [   17.514061]  ? usb_xfer+0xa4/0x2a0
      [   17.514445]  ? __i2c_transfer+0x108/0x3c0
      [   17.514899]  ? i2c_transfer+0x57/0xb0
      [   17.515310]  ? i2c_smbus_xfer_emulated+0x12f/0x590
      [   17.515851]  ? _raw_spin_unlock_irqrestore+0x11/0x20
      [   17.516408]  ? i2c_smbus_xfer+0x125/0x330
      [   17.516876]  ? i2c_smbus_xfer+0x125/0x330
      [   17.517329]  ? i2cdev_ioctl_smbus+0x1c1/0x2b0
      [   17.517824]  ? i2cdev_ioctl+0x75/0x1c0
      [   17.518248]  ? do_vfs_ioctl+0x9f/0x600
      [   17.518671]  ? vfs_write+0x144/0x190
      [   17.519078]  ? SyS_ioctl+0x74/0x80
      [   17.519463]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
      [   17.519959] ---[ end trace d047c04982f5ac50 ]---
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarSebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: default avatarTill Harbaum <till@harbaum.org>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      5165da59
    • Linus Torvalds's avatar
      Linux 4.12-rc2 · 08332893
      Linus Torvalds authored
      08332893
    • Linus Torvalds's avatar
      x86: fix 32-bit case of __get_user_asm_u64() · 33c9e972
      Linus Torvalds authored
      The code to fetch a 64-bit value from user space was entirely buggered,
      and has been since the code was merged in early 2016 in commit
      b2f68038 ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit
      kernels").
      
      Happily the buggered routine is almost certainly entirely unused, since
      the normal way to access user space memory is just with the non-inlined
      "get_user()", and the inlined version didn't even historically exist.
      
      The normal "get_user()" case is handled by external hand-written asm in
      arch/x86/lib/getuser.S that doesn't have either of these issues.
      
      There were two independent bugs in __get_user_asm_u64():
      
       - it still did the STAC/CLAC user space access marking, even though
         that is now done by the wrapper macros, see commit 11f1a4b9
      
      
         ("x86: reorganize SMAP handling in user space accesses").
      
         This didn't result in a semantic error, it just means that the
         inlined optimized version was hugely less efficient than the
         allegedly slower standard version, since the CLAC/STAC overhead is
         quite high on modern Intel CPU's.
      
       - the double register %eax/%edx was marked as an output, but the %eax
         part of it was touched early in the asm, and could thus clobber other
         inputs to the asm that gcc didn't expect it to touch.
      
         In particular, that meant that the generated code could look like
         this:
      
              mov    (%eax),%eax
              mov    0x4(%eax),%edx
      
         where the load of %edx obviously was _supposed_ to be from the 32-bit
         word that followed the source of %eax, but because %eax was
         overwritten by the first instruction, the source of %edx was
         basically random garbage.
      
      The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark
      the 64-bit output as early-clobber to let gcc know that no inputs should
      alias with the output register.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@kernel.org   # v4.8+
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      33c9e972
    • Linus Torvalds's avatar
      Clean up x86 unsafe_get/put_user() type handling · 334a023e
      Linus Torvalds authored
      Al noticed that unsafe_put_user() had type problems, and fixed them in
      commit a7cc722f
      
       ("fix unsafe_put_user()"), which made me look more
      at those functions.
      
      It turns out that unsafe_get_user() had a type issue too: it limited the
      largest size of the type it could handle to "unsigned long".  Which is
      fine with the current users, but doesn't match our existing normal
      get_user() semantics, which can also handle "u64" even when that does
      not fit in a long.
      
      While at it, also clean up the type cast in unsafe_put_user().  We
      actually want to just make it an assignment to the expected type of the
      pointer, because we actually do want warnings from types that don't
      convert silently.  And it makes the code more readable by not having
      that one very long and complex line.
      
      [ This patch might become stable material if we ever end up back-porting
        any new users of the unsafe uaccess code, but as things stand now this
        doesn't matter for any current existing uses. ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      334a023e
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · f3926e4c
      Linus Torvalds authored
      Pull misc uaccess fixes from Al Viro:
       "Fix for unsafe_put_user() (no callers currently in mainline, but
        anyone starting to use it will step into that) + alpha osf_wait4()
        infoleak fix"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        osf_wait4(): fix infoleak
        fix unsafe_put_user()
      f3926e4c
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 970c305a
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "A single scheduler fix:
      
        Prevent idle task from ever being preempted. That makes sure that
        synchronize_rcu_tasks() which is ignoring idle task does not pretend
        that no task is stuck in preempted state. If that happens and idle was
        preempted on a ftrace trampoline the machine crashes due to
        inconsistent state"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/core: Call __schedule() from do_idle() without enabling preemption
      970c305a
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7a3d627
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A set of small fixes for the irq subsystem:
      
         - Cure a data ordering problem with chained interrupts
      
         - Three small fixlets for the mbigen irq chip"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Fix chained interrupt data ordering
        irqchip/mbigen: Fix the clear register offset calculation
        irqchip/mbigen: Fix potential NULL dereferencing
        irqchip/mbigen: Fix memory mapping code
      e7a3d627
    • Xin Long's avatar
      bridge: start hello_timer when enabling KERNEL_STP in br_stp_start · 6d18c732
      Xin Long authored
      Since commit 76b91c32 ("bridge: stp: when using userspace stp stop
      kernel hello and hold timers"), bridge would not start hello_timer if
      stp_enabled is not KERNEL_STP when br_dev_open.
      
      The problem is even if users set stp_enabled with KERNEL_STP later,
      the timer will still not be started. It causes that KERNEL_STP can
      not really work. Users have to re-ifup the bridge to avoid this.
      
      This patch is to fix it by starting br->hello_timer when enabling
      KERNEL_STP in br_stp_start.
      
      As an improvement, it's also to start hello_timer again only when
      br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is
      no reason to start the timer again when it's NO_STP.
      
      Fixes: 76b91c32
      
       ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
      Reported-by: default avatarHaidong Li <haili@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: default avatarIvan Vecera <cera@cera.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6d18c732
    • Nisar Sayed's avatar
      smsc95xx: Support only IPv4 TCP/UDP csum offload · fe0cd8ca
      Nisar Sayed authored
      
      
      When TX checksum offload is used, if the computed checksum is 0 the
      LAN95xx device do not alter the checksum to 0xffff.  In the case of ipv4
      UDP checksum, it indicates to receiver that no checksum is calculated.
      Under ipv6, UDP checksum yields a result of zero must be changed to
      0xffff. Hence disabling checksum offload for ipv6 packets.
      
      Signed-off-by: default avatarNisar Sayed <Nisar.Sayed@microchip.com>
      
      Reported-by: default avatarpopcorn mix <popcornmix@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fe0cd8ca
    • David S. Miller's avatar
      Merge branch 'arp-always-override-existing-neigh-entries-with-gratuitous-ARP' · 776ee323
      David S. Miller authored
      Ihar Hrachyshka says:
      
      ====================
      arp: always override existing neigh entries with gratuitous ARP
      
      This patchset is spurred by discussion started at
      https://patchwork.ozlabs.org/patch/760372/
      
       where we figured that there is no
      real reason for enforcing override by gratuitous ARP packets only when
      arp_accept is 1. Same should happen when it's 0 (the default value).
      
      changelog v2: handled review comments by Julian Anastasov
      - fixed a mistake in a comment;
      - postponed addr_type calculation to as late as possible.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      776ee323
    • Ihar Hrachyshka's avatar
      arp: always override existing neigh entries with gratuitous ARP · 7d472a59
      Ihar Hrachyshka authored
      
      
      Currently, when arp_accept is 1, we always override existing neigh
      entries with incoming gratuitous ARP replies. Otherwise, we override
      them only if new replies satisfy _locktime_ conditional (packets arrive
      not earlier than _locktime_ seconds since the last update to the neigh
      entry).
      
      The idea behind locktime is to pick the very first (=> close) reply
      received in a unicast burst when ARP proxies are used. This helps to
      avoid ARP thrashing where Linux would switch back and forth from one
      proxy to another.
      
      This logic has nothing to do with gratuitous ARP replies that are
      generally not aligned in time when multiple IP address carriers send
      them into network.
      
      This patch enforces overriding of existing neigh entries by all incoming
      gratuitous ARP packets, irrespective of their time of arrival. This will
      make the kernel honour all incoming gratuitous ARP packets.
      
      Signed-off-by: default avatarIhar Hrachyshka <ihrachys@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7d472a59
    • Ihar Hrachyshka's avatar
      arp: postpone addr_type calculation to as late as possible · d9ef2e7b
      Ihar Hrachyshka authored
      The addr_type retrieval can be costly, so it's worth trying to avoid its
      calculation as much as possible. This patch makes it calculated only
      for gratuitous ARP packets. This is especially important since later we
      may want to move is_garp calculation outside of arp_accept block, at
      which point the costly operation will be executed for all setups.
      
      The patch is the result of a discussion in net-dev:
      http://marc.info/?l=linux-netdev&m=149506354216994
      
      
      
      Suggested-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarIhar Hrachyshka <ihrachys@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d9ef2e7b
    • Ihar Hrachyshka's avatar
      arp: decompose is_garp logic into a separate function · 6fd05633
      Ihar Hrachyshka authored
      
      
      The code is quite involving already to earn a separate function for
      itself. If anything, it helps arp_process readability.
      
      Signed-off-by: default avatarIhar Hrachyshka <ihrachys@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6fd05633