Skip to content
  1. Apr 09, 2014
    • Daniel Borkmann's avatar
      net: sctp: wake up all assocs if sndbuf policy is per socket · 52c35bef
      Daniel Borkmann authored
      SCTP charges chunks for wmem accounting via skb->truesize in
      sctp_set_owner_w(), and sctp_wfree() respectively as the
      reverse operation. If a sender runs out of wmem, it needs to
      wait via sctp_wait_for_sndbuf(), and gets woken up by a call
      to __sctp_write_space() mostly via sctp_wfree().
      
      __sctp_write_space() is being called per association. Although
      we assign sk->sk_write_space() to sctp_write_space(), which
      is then being done per socket, it is only used if send space
      is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
      is set and therefore not invoked in sock_wfree().
      
      Commit 4c3a5bda ("sctp: Don't charge for data in sndbuf
      again when transmitting packet") fixed an issue where in case
      sctp_packet_transmit() manages to queue up more than sndbuf
      bytes, sctp_wait_for_sndbuf() will never be woken up again
      unless it is interrupted by a signal. However, a still
      remaining issue is that if net.sctp.sndbuf_policy=0, that is
      accounting per socket, and one-to-many sockets are in use,
      the reclaimed write space from sctp_wfree() is 'unfairly'
      handed back on the server to the association that is the lucky
      one to be woken up again via __sctp_write_space(), while
      the remaining associations are never be woken up again
      (unless by a signal).
      
      The effect disappears with net.sctp.sndbuf_policy=1, that
      is wmem accounting per association, as it guarantees a fair
      share of wmem among associations.
      
      Therefore, if we have reclaimed memory in case of per socket
      accounting, wake all related associations to a socket in a
      fair manner, that is, traverse the socket association list
      starting from the current neighbour of the association and
      issue a __sctp_write_space() to everyone until we end up
      waking ourselves. This guarantees that no association is
      preferred over another and even if more associations are
      taken into the one-to-many session, all receivers will get
      messages from the server and are not stalled forever on
      high load. This setting still leaves the advantage of per
      socket accounting in touch as an association can still use
      up global limits if unused by others.
      
      Fixes: 4eb701df
      
       ("[SCTP] Fix SCTP sendbuffer accouting.")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      52c35bef
    • Dan Carpenter's avatar
      isdnloop: several buffer overflows · 7563487c
      Dan Carpenter authored
      
      
      There are three buffer overflows addressed in this patch.
      
      1) In isdnloop_fake_err() we add an 'E' to a 60 character string and
      then copy it into a 60 character buffer.  I have made the destination
      buffer 64 characters and I'm changed the sprintf() to a snprintf().
      
      2) In isdnloop_parse_cmd(), p points to a 6 characters into a 60
      character buffer so we have 54 characters.  The ->eazlist[] is 11
      characters long.  I have modified the code to return if the source
      buffer is too long.
      
      3) In isdnloop_command() the cbuf[] array was 60 characters long but the
      max length of the string then can be up to 79 characters.  I made the
      cbuf array 80 characters long and changed the sprintf() to snprintf().
      I also removed the temporary "dial" buffer and changed it to use "p"
      directly.
      
      Unfortunately, we pass the "cbuf" string from isdnloop_command() to
      isdnloop_writecmd() which truncates anything over 60 characters to make
      it fit in card->omsg[].  (It can accept values up to 255 characters so
      long as there is a '\n' character every 60 characters).  For now I have
      just fixed the memory corruption bug and left the other problems in this
      driver alone.
      
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7563487c
  2. Apr 08, 2014
  3. Apr 06, 2014
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · d80e773f
      David S. Miller authored
      
      
      Pablo Neira Ayuso says:
      
      ====================
      The following patchset contains Netfilter fixes for your net tree, they
      are:
      
      * Use 16-bits offset and length fields instead of 8-bits in the conntrack
        extension to avoid an overflow when many conntrack extension are used,
        from Andrey Vagin.
      
      * Allow to use cgroup match from LOCAL_IN, there is no apparent reason
        for not allowing this, from Alexey Perevalov.
      
      * Fix build of the connlimit match after recent changes to let it scale
        up that result in a divide by zero compilation error in UP, from
        Florian Westphal.
      
      * Move the lock out of the structure connlimit_data to avoid a false
        sharing spotted by Eric Dumazet and Jesper D. Brouer, this needed as
        part of the recent connlimit scalability improvements, also from
        Florian Westphal.
      
      * Add missing module aliases in xt_osf to fix loading of rules using
        this match, from Kirill Tkhai.
      
      * Restrict set names in nf_tables to 15 characters instead of silently
        trimming them off, from me.
      
      * Fix wrong format in nf_tables request module call for chain types,
        spotted by Florian Westphal, patch from me.
      
      * Fix crash in xtables when it fails to copy the counters back to userspace
        after having replaced the table already.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d80e773f
  4. Apr 05, 2014
    • Thomas Graf's avatar
      netfilter: Can't fail and free after table replacement · c58dd2dd
      Thomas Graf authored
      
      
      All xtables variants suffer from the defect that the copy_to_user()
      to copy the counters to user memory may fail after the table has
      already been exchanged and thus exposed. Return an error at this
      point will result in freeing the already exposed table. Any
      subsequent packet processing will result in a kernel panic.
      
      We can't copy the counters before exposing the new tables as we
      want provide the counter state after the old table has been
      unhooked. Therefore convert this into a silent error.
      
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: default avatarThomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      c58dd2dd
  5. Apr 04, 2014
  6. Apr 03, 2014
    • YOSHIFUJI Hideaki / 吉藤英明's avatar
      isdnloop: Validate NUL-terminated strings from user. · 77bc6bed
      YOSHIFUJI Hideaki / 吉藤英明 authored
      
      
      Return -EINVAL unless all of user-given strings are correctly
      NUL-terminated.
      
      Signed-off-by: default avatarYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      77bc6bed
    • Alexei Starovoitov's avatar
      net: ti: fix CPTS driver build on arm · 79eb9d28
      Alexei Starovoitov authored
      fix build errors:
      drivers/net/ethernet/ti/cpts.c:266:12: error: 'ETH_HLEN' undeclared (first use in this function)
      drivers/net/ethernet/ti/cpts.c:276:23: error: 'VLAN_HLEN' undeclared (first use in this function)
      
      Fixes: 408eccce
      
       ("net: ptp: move PTP classifier in its own file")
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Suggested-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      79eb9d28
    • Mike Rapoport's avatar
      net: vxlan: fix crash when interface is created with no group · 5933a7bb
      Mike Rapoport authored
      
      
      If the vxlan interface is created without explicit group definition,
      there are corner cases which may cause kernel panic.
      
      For instance, in the following scenario:
      
      node A:
      $ ip link add dev vxlan42  address 2c:c2:60:00:10:20 type vxlan id 42
      $ ip addr add dev vxlan42 10.0.0.1/24
      $ ip link set up dev vxlan42
      $ arp -i vxlan42 -s 10.0.0.2 2c:c2:60:00:01:02
      $ bridge fdb add dev vxlan42 to 2c:c2:60:00:01:02 dst <IPv4 address>
      $ ping 10.0.0.2
      
      node B:
      $ ip link add dev vxlan42 address 2c:c2:60:00:01:02 type vxlan id 42
      $ ip addr add dev vxlan42 10.0.0.2/24
      $ ip link set up dev vxlan42
      $ arp -i vxlan42 -s 10.0.0.1 2c:c2:60:00:10:20
      
      node B crashes:
      
       vxlan42: 2c:c2:60:00:10:20 migrated from 4011:eca4:c0a8:6466:c0a8:6415:8e09:2118 to (invalid address)
       vxlan42: 2c:c2:60:00:10:20 migrated from 4011:eca4:c0a8:6466:c0a8:6415:8e09:2118 to (invalid address)
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000046
       IP: [<ffffffff8143c459>] ip6_route_output+0x58/0x82
       PGD 7bd89067 PUD 7bd4e067 PMD 0
       Oops: 0000 [#1] SMP
       Modules linked in:
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.14.0-rc8-hvx-xen-00019-g97a5221-dirty #154
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       task: ffff88007c774f50 ti: ffff88007c79c000 task.ti: ffff88007c79c000
       RIP: 0010:[<ffffffff8143c459>]  [<ffffffff8143c459>] ip6_route_output+0x58/0x82
       RSP: 0018:ffff88007fd03668  EFLAGS: 00010282
       RAX: 0000000000000000 RBX: ffffffff8186a000 RCX: 0000000000000040
       RDX: 0000000000000000 RSI: ffff88007b0e4a80 RDI: ffff88007fd03754
       RBP: ffff88007fd03688 R08: ffff88007b0e4a80 R09: 0000000000000000
       R10: 0200000a0100000a R11: 0001002200000000 R12: ffff88007fd03740
       R13: ffff88007b0e4a80 R14: ffff88007b0e4a80 R15: ffff88007bba0c50
       FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 0000000000000046 CR3: 000000007bb60000 CR4: 00000000000006e0
       Stack:
        0000000000000000 ffff88007fd037a0 ffffffff8186a000 ffff88007fd03740
        ffff88007fd036c8 ffffffff814320bb 0000000000006e49 ffff88007b8b7360
        ffff88007bdbf200 ffff88007bcbc000 ffff88007b8b7000 ffff88007b8b7360
       Call Trace:
        <IRQ>
        [<ffffffff814320bb>] ip6_dst_lookup_tail+0x2d/0xa4
        [<ffffffff814322a5>] ip6_dst_lookup+0x10/0x12
        [<ffffffff81323b4e>] vxlan_xmit_one+0x32a/0x68c
        [<ffffffff814a325a>] ? _raw_spin_unlock_irqrestore+0x12/0x14
        [<ffffffff8104c551>] ? lock_timer_base.isra.23+0x26/0x4b
        [<ffffffff8132451a>] vxlan_xmit+0x66a/0x6a8
        [<ffffffff8141a365>] ? ipt_do_table+0x35f/0x37e
        [<ffffffff81204ba2>] ? selinux_ip_postroute+0x41/0x26e
        [<ffffffff8139d0c1>] dev_hard_start_xmit+0x2ce/0x3ce
        [<ffffffff8139d491>] __dev_queue_xmit+0x2d0/0x392
        [<ffffffff813b380f>] ? eth_header+0x28/0xb5
        [<ffffffff8139d569>] dev_queue_xmit+0xb/0xd
        [<ffffffff813a5aa6>] neigh_resolve_output+0x134/0x152
        [<ffffffff813db741>] ip_finish_output2+0x236/0x299
        [<ffffffff813dc074>] ip_finish_output+0x98/0x9d
        [<ffffffff813dc749>] ip_output+0x62/0x67
        [<ffffffff813da9f2>] dst_output+0xf/0x11
        [<ffffffff813dc11c>] ip_local_out+0x1b/0x1f
        [<ffffffff813dcf1b>] ip_send_skb+0x11/0x37
        [<ffffffff813dcf70>] ip_push_pending_frames+0x2f/0x33
        [<ffffffff813ff732>] icmp_push_reply+0x106/0x115
        [<ffffffff813ff9e4>] icmp_reply+0x142/0x164
        [<ffffffff813ffb3b>] icmp_echo.part.16+0x46/0x48
        [<ffffffff813c1d30>] ? nf_iterate+0x43/0x80
        [<ffffffff813d8037>] ? xfrm4_policy_check.constprop.11+0x52/0x52
        [<ffffffff813ffb62>] icmp_echo+0x25/0x27
        [<ffffffff814005f7>] icmp_rcv+0x1d2/0x20a
        [<ffffffff813d8037>] ? xfrm4_policy_check.constprop.11+0x52/0x52
        [<ffffffff813d810d>] ip_local_deliver_finish+0xd6/0x14f
        [<ffffffff813d8037>] ? xfrm4_policy_check.constprop.11+0x52/0x52
        [<ffffffff813d7fde>] NF_HOOK.constprop.10+0x4c/0x53
        [<ffffffff813d82bf>] ip_local_deliver+0x4a/0x4f
        [<ffffffff813d7f7b>] ip_rcv_finish+0x253/0x26a
        [<ffffffff813d7d28>] ? inet_add_protocol+0x3e/0x3e
        [<ffffffff813d7fde>] NF_HOOK.constprop.10+0x4c/0x53
        [<ffffffff813d856a>] ip_rcv+0x2a6/0x2ec
        [<ffffffff8139a9a0>] __netif_receive_skb_core+0x43e/0x478
        [<ffffffff812a346f>] ? virtqueue_poll+0x16/0x27
        [<ffffffff8139aa2f>] __netif_receive_skb+0x55/0x5a
        [<ffffffff8139aaaa>] process_backlog+0x76/0x12f
        [<ffffffff8139add8>] net_rx_action+0xa2/0x1ab
        [<ffffffff81047847>] __do_softirq+0xca/0x1d1
        [<ffffffff81047ace>] irq_exit+0x3e/0x85
        [<ffffffff8100b98b>] do_IRQ+0xa9/0xc4
        [<ffffffff814a37ad>] common_interrupt+0x6d/0x6d
        <EOI>
        [<ffffffff810378db>] ? native_safe_halt+0x6/0x8
        [<ffffffff810110c7>] default_idle+0x9/0xd
        [<ffffffff81011694>] arch_cpu_idle+0x13/0x1c
        [<ffffffff8107480d>] cpu_startup_entry+0xbc/0x137
        [<ffffffff8102e741>] start_secondary+0x1a0/0x1a5
       Code: 24 14 e8 f1 e5 01 00 31 d2 a8 32 0f 95 c2 49 8b 44 24 2c 49 0b 44 24 24 74 05 83 ca 04 eb 1c 4d 85 ed 74 17 49 8b 85 a8 02 00 00 <66> 8b 40 46 66 c1 e8 07 83 e0 07 c1 e0 03 09 c2 4c 89 e6 48 89
       RIP  [<ffffffff8143c459>] ip6_route_output+0x58/0x82
        RSP <ffff88007fd03668>
       CR2: 0000000000000046
       ---[ end trace 4612329caab37efd ]---
      
      When vxlan interface is created without explicit group definition, the
      default_dst protocol family is initialiazed to AF_UNSPEC and the driver
      assumes IPv4 configuration. On the other side, the default_dst protocol
      family is used to differentiate between IPv4 and IPv6 cases and, since,
      AF_UNSPEC != AF_INET, the processing takes the IPv6 path.
      
      Making the IPv4 assumption explicit by settting default_dst protocol
      family to AF_INET4 and preventing mixing of IPv4 and IPv6 addresses in
      snooped fdb entries fixes the corner case crashes.
      
      Signed-off-by: default avatarMike Rapoport <mike.rapoport@ravellosystems.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5933a7bb
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · cd6362be
      Linus Torvalds authored
      Pull networking updates from David Miller:
       "Here is my initial pull request for the networking subsystem during
        this merge window:
      
         1) Support for ESN in AH (RFC 4302) from Fan Du.
      
         2) Add full kernel doc for ethtool command structures, from Ben
            Hutchings.
      
         3) Add BCM7xxx PHY driver, from Florian Fainelli.
      
         4) Export computed TCP rate information in netlink socket dumps, from
            Eric Dumazet.
      
         5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
            Dichtel.
      
         6) Convert many drivers to pci_enable_msix_range(), from Alexander
            Gordeev.
      
         7) Record SKB timestamps more efficiently, from Eric Dumazet.
      
         8) Switch to microsecond resolution for TCP round trip times, also
            from Eric Dumazet.
      
         9) Clean up and fix 6lowpan fragmentation handling by making use of
            the existing inet_frag api for it's implementation.
      
        10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.
      
        11) Auto size SKB lengths when composing netlink messages based upon
            past message sizes used, from Eric Dumazet.
      
        12) qdisc dumps can take a long time, add a cond_resched(), From Eric
            Dumazet.
      
        13) Sanitize netpoll core and drivers wrt.  SKB handling semantics.
            Get rid of never-used-in-tree netpoll RX handling.  From Eric W
            Biederman.
      
        14) Support inter-address-family and namespace changing in VTI tunnel
            driver(s).  From Steffen Klassert.
      
        15) Add Altera TSE driver, from Vince Bridgers.
      
        16) Optimizing csum_replace2() so that it doesn't adjust the checksum
            by checksumming the entire header, from Eric Dumazet.
      
        17) Expand BPF internal implementation for faster interpreting, more
            direct translations into JIT'd code, and much cleaner uses of BPF
            filtering in non-socket ocntexts.  From Daniel Borkmann and Alexei
            Starovoitov"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
        netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
        net: Add a test to see if a skb is freeable in irq context
        qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
        net: ptp: move PTP classifier in its own file
        net: sxgbe: make "core_ops" static
        net: sxgbe: fix logical vs bitwise operation
        net: sxgbe: sxgbe_mdio_register() frees the bus
        Call efx_set_channels() before efx->type->dimension_resources()
        xen-netback: disable rogue vif in kthread context
        net/mlx4: Set proper build dependancy with vxlan
        be2net: fix build dependency on VxLAN
        mac802154: make csma/cca parameters per-wpan
        mac802154: allow only one WPAN to be up at any given time
        net: filter: minor: fix kdoc in __sk_run_filter
        netlink: don't compare the nul-termination in nla_strcmp
        can: c_can: Avoid led toggling for every packet.
        can: c_can: Simplify TX interrupt cleanup
        can: c_can: Store dlc private
        can: c_can: Reduce register access
        can: c_can: Make the code readable
        ...
      cd6362be