  1. Dec 18, 2016
  2. Dec 17, 2016
    • Daniel Borkmann's avatar
      bpf, test_verifier: fix a test case error result on unprivileged · 0eb6984f
      Daniel Borkmann authored
      Running ./test_verifier as unprivileged lets 1 out of 98 tests fail:
      
        [...]
        #71 unpriv: check that printk is disallowed FAIL
        Unexpected error message!
        0: (7a) *(u64 *)(r10 -8) = 0
        1: (bf) r1 = r10
        2: (07) r1 += -8
        3: (b7) r2 = 8
        4: (bf) r3 = r1
        5: (85) call bpf_trace_printk#6
        unknown func bpf_trace_printk#6
        [...]
      
      The test case is correct, just that the error outcome changed with
      ebb676da ("bpf: Print function name in addition to function id").
      Same as with e00c7b21 ("bpf: fix multiple issues in selftest suite
      and samples") issue 2), so just fix up the function name.
      
      Fixes: ebb676da ("bpf: Print function name in addition to function id")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0eb6984f
    • Daniel Borkmann's avatar
      bpf: fix regression on verifier pruning wrt map lookups · a08dd0da
      Daniel Borkmann authored
      Commit 57a09bf0 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
      registers") introduced a regression where existing programs stopped
      loading due to reaching the verifier's maximum complexity limit,
      whereas prior to this commit they were loading just fine; the affected
      program has roughly 2k instructions.
      
      What was found is that state pruning couldn't be performed effectively
      anymore due to mismatches of the verifier's register state, in particular
      in the id tracking. It doesn't mean that 57a09bf0 is incorrect per
      se, but rather that the verifier needs to perform a lot more work for
      the same program with regard to the map lookups involved.
      
      Since commit 57a09bf0 is only about tracking registers with type
      PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
      until they are promoted through pattern matching with a NULL check to
      either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
      id becomes irrelevant for the transitioned types.
      
      For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
      but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
      transferred further into other types that don't make use of it. Among
      others, one example is where UNKNOWN_VALUE is set on function call
      return with RET_INTEGER return type.
      
      states_equal() will then fall through the memcmp() on register state;
      note that the second memcmp() uses offsetofend(), so the id is part of
      that since d2a4dd37 ("bpf: fix state equivalence"). But the bisect
      pointed already to 57a09bf0, where we really reach beyond complexity
      limit. What I found was that states_equal() often failed in this
      case due to id mismatches in spilled regs with registers in type
      PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
      a memcmp() on their reg state and don't have any other optimizations
      in place, therefore also id was relevant in this case for making a
      pruning decision.
      
      We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
      For the affected program, it resulted in a ~17 fold reduction of
      complexity and let the program load fine again. Selftest suite also
      runs fine. The only other place where env->id_gen is used currently is
      through direct packet access, but for these cases id is long living, thus
      a different scenario.
      
      Also, the current logic in mark_map_regs() is not fully correct when
      marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
      reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
      its id is reset and any subsequent registers that hold the original id
      and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
      anymore, since mark_map_reg() reuses the uncached regs[regno].id that
      was just overridden. Note, we don't need to cache it outside of
      mark_map_regs(), since it's called once on this_branch and the other
      time on other_branch, which are both two independent verifier states.
      A test case for this is added here, too.
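
The id caching described above can be illustrated with a small standalone model. This is only a sketch: `enum reg_type` and `struct reg_state` here are simplified stand-ins, not the kernel's definitions, the function signatures are assumed for illustration, and spilled stack registers are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the verifier's register types; not the kernel definitions. */
enum reg_type {
	UNKNOWN_VALUE,
	PTR_TO_MAP_VALUE,
	PTR_TO_MAP_VALUE_OR_NULL,
};

struct reg_state {
	enum reg_type type;
	unsigned int id;
};

/* Promote one register if it still carries the id being resolved. */
static void mark_map_reg(struct reg_state *regs, size_t regno,
			 unsigned int id, enum reg_type type)
{
	struct reg_state *reg = &regs[regno];

	if (reg->type == PTR_TO_MAP_VALUE_OR_NULL && reg->id == id) {
		reg->type = type;
		/* The id is irrelevant after promotion, so reset it. */
		reg->id = 0;
	}
}

/* Promote every register that shares the id of regs[regno]. */
static void mark_map_regs(struct reg_state *regs, size_t nregs,
			  size_t regno, enum reg_type type)
{
	unsigned int id;
	size_t i;

	assert(regno < nregs);

	/*
	 * Cache the id first: mark_map_reg() resets regs[regno].id, so
	 * re-reading it inside the loop would miss the remaining aliases.
	 */
	id = regs[regno].id;

	for (i = 0; i < nregs; i++)
		mark_map_reg(regs, i, id, type);
}
```

In this model, two registers holding the same id are both promoted and end up with id 0, while a register with a different id is left untouched. Without the cached copy, the second alias would be compared against the already reset id and skipped.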
      
      Fixes: 57a09bf0 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a08dd0da
    • David Ahern's avatar
      net: vrf: Drop conntrack data after pass through VRF device on Tx · eb63ecc1
      David Ahern authored
      Locally originated traffic in a VRF fails in the presence of a POSTROUTING
      rule. For example,
      
          $ iptables -t nat -A POSTROUTING -s 11.1.1.0/24  -j MASQUERADE
          $ ping -I red -c1 11.1.1.3
          ping: Warning: source address might be selected on device other than red.
          PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
          ping: sendmsg: Operation not permitted
      
      Worse, the above causes random corruption resulting in a panic in random
      places (I have not seen a consistent backtrace).
      
      Call nf_reset to drop the conntrack info following the pass through the
      VRF device.  The nf_reset is needed on Tx but not Rx because of the order
      in which NF_HOOKs are hit: on Rx the VRF device is after the real ingress
      device and on Tx it is before the real egress device. Connection
      tracking should be tied to the real egress device and not the VRF device.
      
      Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb63ecc1
    • David Ahern's avatar
      net: vrf: Fix NAT within a VRF · a0f37efa
      David Ahern authored
      Connection tracking with VRF is broken because the pass through the VRF
      device drops the connection tracking info. Removing the call to nf_reset
      allows DNAT and MASQUERADE to work across interfaces within a VRF.
      
      Fixes: 73e20b76 ("net: vrf: Add support for PREROUTING rules on vrf device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0f37efa
    • David S. Miller's avatar
      Merge branch 'cls_flower-mask' · 8a9f5fdf
      David S. Miller authored
      
      
      Paul Blakey says:
      
      ====================
      net/sched: cls_flower: Fix mask handling
      
      This series fixes how the mask is handled.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a9f5fdf
    • Paul Blakey's avatar
      net/sched: cls_flower: Use masked key when calling HW offloads · f93bd17b
      Paul Blakey authored
      Zero bits on the mask signify a "don't care" on the corresponding bits
      in key. Some HWs require those bits on the key to be zero. Since these
      bits are masked anyway, it's okay to provide the masked key to all
      drivers.
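
A minimal sketch of what "providing the masked key" means, assuming the key and mask are treated as equal-length byte arrays. The helper name `set_masked_key` and its signature are hypothetical and chosen for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the masked key by ANDing each key byte
 * with the corresponding mask byte. Bits that are zero in the mask
 * ("don't care") therefore come out as zero in the result.
 */
static void set_masked_key(uint8_t *mkey, const uint8_t *key,
			   const uint8_t *mask, size_t len)
{
	size_t i;

	assert(mkey && key && mask);

	for (i = 0; i < len; i++)
		mkey[i] = key[i] & mask[i];
}
```

Since the "don't care" bits never take part in matching, zeroing them does not change which packets match, which is why the same masked key can be handed to every driver.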
      
      Fixes: 5b33f488 ('net/flower: Introduce hardware offload support')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f93bd17b
    • Paul Blakey's avatar
      net/sched: cls_flower: Use mask for addr_type · 970bfcd0
      Paul Blakey authored
      When addr_type is set, mask should also be set.
      
      Fixes: 66530bdf ('sched,cls_flower: set key address type when present')
      Fixes: bc3103f1 ('net/sched: cls_flower: Classify packet in ip tunnels')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      970bfcd0
    • Bartosz Folta's avatar
      net: macb: Added PCI wrapper for Platform Driver. · 83a77e9e
      Bartosz Folta authored
      
      
      There are hardware PCI implementations of the Cadence GEM network
      controller. This patch allows such hardware to be used by reusing the
      existing Platform Driver.
      
      Signed-off-by: Bartosz Folta <bfolta@cadence.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83a77e9e
    • Thomas Falcon's avatar
      ibmveth: calculate gso_segs for large packets · 94acf164
      Thomas Falcon authored
      
      
      Include calculations to compute the number of segments
      that comprise an aggregated large packet.
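
The commit message does not spell out the calculation. As a rough sketch, assuming the segment count is the payload length divided by the MSS and rounded up, it could look like the following. The function name `gso_segs_for` and its parameters are hypothetical, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of MSS-sized segments needed to carry
 * the payload of an aggregated packet. total_len covers headers plus
 * payload; hdr_len is the combined header length.
 */
static uint32_t gso_segs_for(uint32_t total_len, uint32_t hdr_len,
			     uint32_t mss)
{
	uint32_t payload;

	assert(mss > 0);

	/* No payload means no segments. */
	if (total_len <= hdr_len)
		return 0;

	payload = total_len - hdr_len;

	/* Ceiling division: a partial final segment still counts. */
	return (payload + mss - 1) / mss;
}
```

For example, with an MSS of 1448 a payload of 65160 bytes yields 45 segments, and one additional byte yields 46.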
      
      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      94acf164
    • Timur Tabi's avatar
      net: qcom/emac: don't try to claim clocks on ACPI systems · 026acd5f
      Timur Tabi authored
      
      
      On ACPI systems, clocks are not available to drivers directly.  They are
      handled exclusively by ACPI and/or firmware, so there is no clock driver.
      Calls to clk_get() always fail, so we should not even attempt to claim
      any clocks on ACPI systems.
      
      Signed-off-by: Timur Tabi <timur@codeaurora.org>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      026acd5f
    • Jeroen De Wachter's avatar
      encx24j600: bugfix - always move ERXTAIL to next packet in encx24j600_rx_packets · ebe5236d
      Jeroen De Wachter authored
      
      
      Before, encx24j600_rx_packets did not update encx24j600_priv's next_packet
      member when an error occurred during packet handling (either because the
      packet's RSV header indicates an error or because the encx24j600_receive_packet
      method can't allocate an sk_buff).
      
      If the next_packet member is not updated, the ERXTAIL register will be set to
      the same value it had before, which means the bad packet remains in the
      component's memory and its RSV header will be read again when a new packet
      arrives. If the RSV header indicates a bad packet or if sk_buff allocation
      continues to fail, new packets will be stored in the component's memory until
      that memory is full, after which packets will be dropped.
      
      The SETPKTDEC command is always executed though, so the encx24j600 hardware has
      an incorrect count of the packets in its memory.
      
      To prevent this, the next_packet member should always be updated, allowing the
      packet to be skipped (either because it's bad, as indicated in its RSV header,
      or because allocating an sk_buff failed). In the allocation failure case, this
      does mean dropping a valid packet, but dropping the oldest packet to keep as
      much memory as possible available for new packets seems preferable to keeping
      old (but valid) packets around while dropping new ones.
      
      Signed-off-by: Jeroen De Wachter <jeroen.de_wachter.ext@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebe5236d
    • David S. Miller's avatar
      Merge branch 'hisilicon-netdev-dev' · ea7a2b9a
      David S. Miller authored
      Dongpo Li says:
      
      ====================
      net: ethernet: hisilicon: set dev->dev.parent before PHY connect
      
      This patch series builds atop:
      ec988ad7 ("phy: Don't increment MDIO bus refcount unless it's a
      different owner")

      I have checked all the hisilicon ethernet drivers and found that only two
      need to be fixed to make sure dev->dev.parent is set before PHY connect.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea7a2b9a
    • Dongpo Li's avatar
      net: ethernet: hip04: Call SET_NETDEV_DEV() · 8cd1f70f
      Dongpo Li authored
      The hip04 driver calls into PHYLIB which now checks for
      net_device->dev.parent, so make sure we do set it before calling into
      any MDIO/PHYLIB related function.
      
      Fixes: ec988ad7 ("phy: Don't increment MDIO bus refcount unless it's a different owner")
      Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8cd1f70f
    • Dongpo Li's avatar
      net: ethernet: hisi_femac: Call SET_NETDEV_DEV() · 2087d421
      Dongpo Li authored
      The hisi_femac driver calls into PHYLIB which now checks for
      net_device->dev.parent, so make sure we do set it before calling into
      any MDIO/PHYLIB related function.
      
      Fixes: ec988ad7 ("phy: Don't increment MDIO bus refcount unless it's a different owner")
      Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2087d421
    • Andrew Lunn's avatar
      net: dsa: mv88e6xxx: Fix opps when adding vlan bridge · 66e2809d
      Andrew Lunn authored
      A port is not necessarily assigned to a netdev. And a port does not
      need to be a member of a bridge. So when iterating over all ports,
      check before using the netdev and bridge_dev for a port. Otherwise we
      dereference a NULL pointer.
      
      Fixes: da9c359e ("net: dsa: mv88e6xxx: check hardware VLAN in use")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      66e2809d
    • Thomas Gleixner's avatar
      net/3com/3c515: Fix timer handling, prevent leaks and crashes · e28ceeb1
      Thomas Gleixner authored
      
      
      The timer handling in this driver is broken in several ways:
      
      - corkscrew_open() initializes and arms a timer before requesting the
        device interrupt. If the request fails the timer stays armed.
      
        A second call to corkscrew_open will unconditionally reinitialize the
        queued timer and arm it again. Also, an immediate device removal will leave
        the timer queued because close() is not called (open() failed) and
        therefore nothing issues del_timer().
      
        The reinitialization corrupts the link chain in the timer wheel hash
        bucket and causes a NULL pointer dereference when the timer wheel tries
        to operate on that hash bucket. Immediate device removal lets the link
        chain poke into freed and possibly reused memory.
      
        Solution: Arm the timer after the successful irq request.
      
      - corkscrew_close() uses del_timer()
      
        On close the timer is disarmed with del_timer() which lets the following
        code race against a concurrent timer expiry function.
      
        Solution: Use del_timer_sync() instead
      
      - corkscrew_close() calls del_timer() unconditionally
      
        del_timer() is invoked even if the timer was never initialized. This
        works by chance because the struct containing the timer is zeroed at
        allocation time.
      
        Solution: Move the setup of the timer into corkscrew_setup().
      
      Reported-by: Matthew Whitehead <tedheadster@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e28ceeb1
  3. Dec 14, 2016
  4. Dec 13, 2016