  1. Dec 15, 2013
  2. Dec 13, 2013
    • xen-netback: fix gso_prefix check · a3314f3d
      Paul Durrant authored
      
      
      There is a mistake in checking the gso_prefix mask when passing large
      packets to a guest. The wrong shift is applied to the bit - the raw skb
      gso type is used rather than the translated one. This leads to large packets
      being handed to the guest without the GSO metadata. This patch fixes the
      check.
      
      The mistake manifested as errors whilst running Microsoft HCK large packet
      offload tests between a pair of Windows 8 VMs. I have verified this patch
      fixes those errors.
      
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
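The wrong-shift mistake described in this commit can be sketched in isolation. The following is a minimal illustration, not the netback code: the constant names and values are assumptions chosen for the example, with the stack's GSO flags modelled as bit masks and the ring protocol's GSO types as small integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only: the stack's skb GSO flags are bit masks,
 * while the ring protocol numbers its GSO types 1, 2, ... */
enum { SKB_GSO_TCPV4_FLAG = 1 << 0, SKB_GSO_TCPV6_FLAG = 1 << 4 };
enum { GSO_TYPE_NONE = 0, GSO_TYPE_TCPV4 = 1, GSO_TYPE_TCPV6 = 2 };

/* Translate the skb flag set into the protocol-level GSO type. */
static int translate_gso_type(unsigned int skb_gso_flags)
{
    if (skb_gso_flags & SKB_GSO_TCPV4_FLAG)
        return GSO_TYPE_TCPV4;
    if (skb_gso_flags & SKB_GSO_TCPV6_FLAG)
        return GSO_TYPE_TCPV6;
    return GSO_TYPE_NONE;
}

/* Buggy check: shifts by the raw skb flag value. */
static bool needs_prefix_buggy(unsigned int skb_gso_flags, uint32_t prefix_mask)
{
    return ((1u << skb_gso_flags) & prefix_mask) != 0;
}

/* Fixed check: shifts by the translated type. */
static bool needs_prefix_fixed(unsigned int skb_gso_flags, uint32_t prefix_mask)
{
    int type = translate_gso_type(skb_gso_flags);

    return type != GSO_TYPE_NONE && ((1u << type) & prefix_mask) != 0;
}
```

With a prefix mask that has only the TCPv6 bit set, the buggy check misses a TCPv6 packet because it shifts by the flag value instead of the type number, while the fixed check matches it.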
    • net: make neigh_priv_len in struct net_device 16bit instead of 8bit · a0a9663d
      Sebastian Siewior authored
      
      
      neigh_priv_len is defined as u8. With all debug options enabled, struct
      ipoib_neigh is 200 bytes. The largest part is sk_buff_head at 96
      bytes, of which the spinlock takes 72 bytes.
      The size value still fits in this u8, leaving some room for more.
      
      On -RT, struct ipoib_neigh puts on weight and grows to 392 bytes. The main
      reason is sk_buff_head at 288 bytes, and the heavy part there is the spinlock
      at 192 bytes. This no longer fits into neigh_priv_len, and gcc
      complains.
      
      This patch changes neigh_priv_len from 8 bits to 16 bits. Since the
      following element (dev_id) is 16 bits wide and is followed by a spinlock
      which is aligned, the struct keeps its total size of 3200 (allmodconfig) /
      2048 (with as much debug off as possible) bytes on x86-64.
      On x86-32 the struct is 1856 (allmodconfig) / 1216 (with as much debug
      off as possible) bytes long. The numbers were measured with and without
      the patch to confirm that this change does not increase the size of the
      struct.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
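The size argument in this commit can be illustrated with a small layout sketch. The structs below are hypothetical stand-ins, not the real struct net_device: an 8-byte-aligned member plays the role of the aligned spinlock, and the helper name is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the neighbouring fields of struct net_device;
 * the 8-byte-aligned member plays the role of the aligned spinlock. */
struct layout_before {
    uint8_t  neigh_priv_len;   /* 1 byte, then 1 byte of padding */
    uint16_t dev_id;
    uint64_t aligned_member;
};

struct layout_after {
    uint16_t neigh_priv_len;   /* fills the former padding byte */
    uint16_t dev_id;
    uint64_t aligned_member;
};

/* Returns 1 if a private-area size fits in an 8-bit length field. */
static int fits_in_u8(size_t len)
{
    return len <= UINT8_MAX;
}
```

On common ABIs both layouts have the same size, because the wider field only consumes padding that was already there. A 200-byte private area fits in the 8-bit field, while the 392 bytes seen on -RT do not.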
    • drivers: net: cpsw: fix for cpsw crash when build as modules · f280e89a
      Mugunthan V N authored
      
      
      When CPSW and Davinci MDIO are built as modules, CPSW crashes when
      accessing CPSW registers in the CPSW probe. The built-in case works
      because the CPSW clocks are enabled in the Davinci MDIO probe. Fix this
      by enabling the clocks before accessing the version register and moving
      the other register accesses to the CPSW device open.
      
      Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: napi: don't prematurely request a tx event · d9601a36
      Paul Durrant authored
      
      
      This patch changes the RING_FINAL_CHECK_FOR_REQUESTS in
      xenvif_build_tx_gops to a check for RING_HAS_UNCONSUMED_REQUESTS as the
      former call has the side effect of advancing the ring event pointer and
      therefore inviting another interrupt from the frontend before the napi
      poll has actually finished, thereby defeating the point of napi.
      
      The event pointer is updated by RING_FINAL_CHECK_FOR_REQUESTS in
      xenvif_poll, the napi poll function, if the work done is less than the
      budget i.e. when actually transitioning back to interrupt mode.
      
      Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com>
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
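The difference between the two ring checks named in this commit can be sketched with a toy ring. This is a simplified model under stated assumptions: the struct and function names are hypothetical, there is no shared memory and there are no barriers, and only the side effect on the event pointer is modelled.

```c
#include <stdbool.h>

/* Toy ring: indices only, no shared memory or barriers. */
struct toy_ring {
    unsigned int req_prod;   /* written by the frontend */
    unsigned int req_cons;   /* advanced by the backend */
    unsigned int req_event;  /* frontend notifies once req_prod reaches this */
};

/* Pure check: no side effects. */
static bool ring_has_unconsumed(const struct toy_ring *r)
{
    return r->req_prod != r->req_cons;
}

/* Final check: if the ring looks empty, re-arm the event pointer
 * and look again for requests that raced in. */
static bool ring_final_check(struct toy_ring *r)
{
    if (ring_has_unconsumed(r))
        return true;
    r->req_event = r->req_cons + 1;
    return ring_has_unconsumed(r);
}
```

In this model only the final check touches req_event, which is why it belongs at the point where polling really ends and interrupt mode resumes.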
    • xen-netback: napi: fix abuse of budget · 10574059
      Paul Durrant authored
      
      
      netback seems to be somewhat confused about the napi budget parameter. The
      parameter is supposed to limit the number of skbs processed in each poll,
      but netback has this confused with grant operations.
      
      This patch fixes that, properly limiting the work done in each poll. Note
      that this limit makes sure we do not process any more data from the shared
      ring than we intend to pass back from the poll. This is important to
      prevent tx_queue potentially growing without bound.
      
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
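The budget rule described in this commit can be sketched as a toy poll loop. This is not the netback implementation; the struct, the function name and the irq_enabled flag are assumptions standing in for the real queue and for completing the napi poll.

```c
/* Toy poll: 'pending' models skbs waiting on the shared ring. */
struct toy_queue {
    int pending;
    int irq_enabled;
};

/* Process at most 'budget' packets; re-enable interrupts only when
 * the queue drained before the budget ran out. */
static int toy_poll(struct toy_queue *q, int budget)
{
    int work_done = 0;

    while (work_done < budget && q->pending > 0) {
        q->pending--;          /* stand-in for handling one skb */
        work_done++;
    }
    if (work_done < budget)
        q->irq_enabled = 1;    /* stand-in for completing the poll */
    return work_done;
}
```

Because the loop never takes more from the queue than it reports back, nothing accumulates between polls.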
  3. Dec 12, 2013
  4. Dec 11, 2013
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem · 33457ff7
      John W. Linville authored
    • udp: ipv4: fix an use after free in __udp4_lib_rcv() · 8afdd99a
      Eric Dumazet authored
      Dave Jones reported a use after free in the UDP stack:
      
      [ 5059.434216] =========================
      [ 5059.434314] [ BUG: held lock freed! ]
      [ 5059.434420] 3.13.0-rc3+ #9 Not tainted
      [ 5059.434520] -------------------------
      [ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there!
      [ 5059.434815]  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
      [ 5059.435012] 3 locks held by named/863:
      [ 5059.435086]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940
      [ 5059.435295]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410
      [ 5059.435500]  #2:  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
      [ 5059.435734]
      stack backtrace:
      [ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365]
      [ 5059.436052] Hardware name:                  /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
      [ 5059.436223]  0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
      [ 5059.436390]  ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
      [ 5059.436554]  ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
      [ 5059.436718] Call Trace:
      [ 5059.436769]  <IRQ>  [<ffffffff8153a372>] dump_stack+0x4d/0x66
      [ 5059.436904]  [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160
      [ 5059.437037]  [<ffffffff8141c490>] ? __sk_free+0x110/0x230
      [ 5059.437147]  [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150
      [ 5059.437260]  [<ffffffff8141c490>] __sk_free+0x110/0x230
      [ 5059.437364]  [<ffffffff8141c5c9>] sk_free+0x19/0x20
      [ 5059.437463]  [<ffffffff8141cb25>] sock_edemux+0x25/0x40
      [ 5059.437567]  [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280
      [ 5059.437685]  [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0
      [ 5059.437805]  [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240
      [ 5059.437925]  [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70
      [ 5059.438038]  [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0
      [ 5059.438155]  [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00
      [ 5059.438269]  [<ffffffff8149d7f5>] udp_rcv+0x15/0x20
      [ 5059.438367]  [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410
      [ 5059.438492]  [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410
      [ 5059.438621]  [<ffffffff81468653>] ip_local_deliver+0x43/0x80
      [ 5059.438733]  [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0
      [ 5059.438843]  [<ffffffff81468926>] ip_rcv+0x296/0x3f0
      [ 5059.438945]  [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940
      [ 5059.439074]  [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940
      [ 5059.442231]  [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10
      [ 5059.442231]  [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60
      [ 5059.442231]  [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0
      [ 5059.442231]  [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0
      [ 5059.442231]  [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169]
      [ 5059.442231]  [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0
      [ 5059.442231]  [<ffffffff810478cd>] __do_softirq+0xed/0x240
      [ 5059.442231]  [<ffffffff81047e25>] irq_exit+0x125/0x140
      [ 5059.442231]  [<ffffffff81004241>] do_IRQ+0x51/0xc0
      [ 5059.442231]  [<ffffffff81542bef>] common_interrupt+0x6f/0x6f
      
      We need to keep a reference on the socket, by using skb_steal_sock()
      at the right place.
      
      Note that another patch is needed to fix a race in
      udp_sk_rx_dst_set(), as we hold no lock protecting the dst.
      
      Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sctp' · 4585a79d
      David S. Miller authored
      
      
      Wang Weidong says:
      
      ====================
      sctp: check the rto_min and rto_max
      
      v6 -> v7:
        -patch2: fix the whitespace issues pointed out by Daniel
      
      v5 -> v6:
        split the first patch of v5 into patch1 and patch2, and remove the
        macro in constants.h
      
        -patch1: do rto_min/max socket option handling in its own patch, and
         fix the check of rto_min/max.
        -patch2: do rto_min/max sysctl handling in its own patch.
        -patch3: add Suggested-by Daniel.
      
      v4 -> v5:
        - patch1: add macro in constants.h and fix up spacing as
          suggested by Daniel
        - patch2: add a patch to fix up the spacing of do_hmac_alg to match
          do_rto_min[max]
      
      v3 -> v4:
        -patch1: fix the direct use of init_net, as suggested by Vlad.
      
      v2 -> v3:
        -patch1: add a proc_handler to check rto_min and rto_max, as suggested
         by Vlad
      
      v1 -> v2:
        -patch1: fix the From name, as pointed out by David, and
         add the ACK by Neil
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: fix up a spacing · b486b228
      wangweidong authored
      
      
      Fix up the spacing of proc_sctp_do_hmac_alg to match
      proc_sctp_do_rto_min[max] in sysctl.c.
      
      Suggested-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: add check rto_min and rto_max in sysctl · 4f3fdf3b
      wangweidong authored
      
      
      rto_min should be smaller than rto_max, and rto_max should be larger
      than rto_min. Add two proc_handlers to check this.
      
      Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: check the rto_min and rto_max in setsockopt · 85f935d4
      wangweidong authored
      
      
      When rto_min or rto_max is set to 0, leave the current value unchanged.
      Also reject the case where rto_min > rto_max.
      
      Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
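The setsockopt rule described in this commit can be sketched as a small validation helper. The struct and function names are hypothetical and this is not the SCTP code; it only illustrates "0 keeps the current value" combined with the rto_min > rto_max rejection.

```c
#include <errno.h>

struct rto_limits {
    unsigned int rto_min;   /* milliseconds */
    unsigned int rto_max;   /* milliseconds */
};

/* A requested value of 0 means "keep the current setting". */
static int set_rto_limits(struct rto_limits *cur,
                          unsigned int new_min, unsigned int new_max)
{
    unsigned int rto_min = new_min ? new_min : cur->rto_min;
    unsigned int rto_max = new_max ? new_max : cur->rto_max;

    if (rto_min > rto_max)
        return -EINVAL;

    cur->rto_min = rto_min;
    cur->rto_max = rto_max;
    return 0;
}
```

The check runs on the merged values, so a new minimum is also validated against an unchanged maximum, and a rejected request leaves the current limits untouched.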
    • ipv6: do not erase dst address with flow label destination · ce7a3bdf
      Florent Fourcot authored
      This patch follows b579035f
      ("ipv6: remove old conditions on flow label sharing").
      
      Since there is no reason to restrict a label to a
      destination, we should not erase the destination value of a
      socket with the value contained in the flow label storage.
      
      This patch makes it possible to actually use the same flow label for
      more than one destination.
      
      Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: properly latch and use autoclose value from sock to association · 9f70f46b
      Neil Horman authored
      
      
      Currently, sctp associations latch a socket's autoclose value to an association
      at association init time, subject to capping constraints from the max_autoclose
      sysctl value.  This leads to an odd situation where an application may set a
      socket level autoclose timeout, but sctp will silently limit the autoclose
      timeout to something less than that.
      
      Fix this by modifying the autoclose setsockopt function to check the limit, cap
      it and warn the user via syslog that the timeout is capped.  This will allow
      getsockopt to return valid autoclose timeout values that reflect what subsequent
      associations actually use.
      
      While we're at it, also eliminate the assoc->autoclose variable. It duplicates
      what's in the timeout array, which leads to multiple sources for the same
      information that may differ (as the former isn't subject to any capping).  This
      gives us the timeout information in a canonical place and saves some space in
      the association structure as well.
      
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      CC: Wang Weidong <wangweidong1@huawei.com>
      CC: David Miller <davem@davemloft.net>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
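The capping step described in this commit can be sketched as a small helper applied at setsockopt time. The function name and the capped out-parameter are assumptions made for the example; the real code also logs a warning, which is left to the caller here.

```c
/* Cap a requested autoclose timeout (seconds) to the configured maximum.
 * Sets *capped so the caller can log a warning. */
static unsigned int cap_autoclose(unsigned int requested,
                                  unsigned int max_autoclose, int *capped)
{
    *capped = requested > max_autoclose;
    return *capped ? max_autoclose : requested;
}
```

Storing the capped value at set time means a later read returns the timeout that associations will actually use.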
    • Merge branch 'tipc' · 231df15f
      David S. Miller authored
      
      
      Jon Maloy says:
      
      ====================
      tipc: corrections related to tasklet job mechanism
      
      These commits correct two bugs related to TIPC's service for launching
      functions for asynchronous execution in a separate tasklet.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: protect handler_enabled variable with qitem_lock spin lock · 00ede977
      Ying Xue authored
      
      
      'handler_enabled' is a global flag indicating whether the TIPC
      signal handling service is enabled or not. The lack of lock
      protection for this flag incurs a risk for contention, so that
      a tipc_k_signal() call might queue a signal handler to a destroyed
      signal queue, with unpredictable results. To correct this, we let
      the already existing 'qitem_lock' protect the flag, as it already
      does with the queue itself. This way, we ensure that the flag
      always is consistent across all cores.
      
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: correct the order of stopping services at rmmod · 993b858e
      Jon Paul Maloy authored
      
      
      The 'signal handler' service in TIPC is a mechanism that makes it
      possible to postpone execution of functions, by launching them into
      a job queue for execution in a separate tasklet, independent of
      the launching execution thread.
      
      When we do rmmod on the tipc module, this service is stopped after
      the network service. At the same time, the stopping of the network
      service may itself launch jobs for execution, with the risk that these
      functions may be scheduled for execution after the data structures
      meant to be accessed by the job have already been deleted. We have
      seen this happen, most often resulting in an oops.
      
      This commit ensures that the signal handler is the very first to be
      stopped when TIPC is shut down, so there are no surprises during
      the cleanup of the other services.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 · 388d3335
      Nat Gurumoorthy authored
      
      
      The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
      uninitialized. From power on reset this register may have garbage in it. The
      Register Base Address register defines the device local address of a
      register. The data pointed to by this location is read or written using
      the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
      garbage any read or write of Register Data Register (PCI 128) will cause the
      PCI bus to lock up. The TCO watchdog will fire and bring down the system.
      
      Signed-off-by: Nat Gurumoorthy <natg@google.com>
      Acked-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Revert macvtap/tun truncation signalling changes. · bbd37626
      David S. Miller authored
      
      
      Jason Wang and Michael S. Tsirkin are still discussing how
      to properly fix this.
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvtap: signal truncated packets · 730054da
      Jason Wang authored
      
      
      macvtap_put_user() never returns a value greater than the iov length, which in
      fact bypasses the truncation check in macvtap_recvmsg(). Fix this by always
      returning the size of the packet plus the possible vlan header so that the
      truncation check works.
      
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: unbreak truncated packet signalling · 923347bb
      Jason Wang authored
      Commit 6680ec68 (tuntap: hardware vlan tx support) breaks the truncated
      packet signal by never returning a length greater than the iov length in
      tun_put_user(). This patch fixes this by always returning the length of the
      packet plus the possible vlan header. The caller can detect a truncated
      packet by comparing the return value with the iov length.
      
      Reported-by: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
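The return-value convention described in this commit can be sketched with a toy copy routine. The names are hypothetical and the vlan header is left out; the sketch only shows why the function must report the full packet length rather than the number of bytes copied.

```c
#include <stddef.h>
#include <string.h>

/* Copy as much of the packet as fits, but report the full packet length. */
static size_t toy_put_user(const unsigned char *pkt, size_t pkt_len,
                           unsigned char *buf, size_t buf_len)
{
    size_t copied = pkt_len < buf_len ? pkt_len : buf_len;

    memcpy(buf, pkt, copied);
    return pkt_len;
}

/* Caller-side check: a return value larger than the buffer means truncation. */
static int was_truncated(size_t ret, size_t buf_len)
{
    return ret > buf_len;
}
```

If the routine returned the copied byte count instead, the result could never exceed the buffer size and the caller's comparison would never fire.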
    • vxlan: release rt when found circular route · fffc15a5
      Fan Du authored
      
      
      Otherwise the dst is leaked.
      All other tunnel device transmit implementations have been checked,
      and none of them has this problem.
      
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
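The leak described in this commit follows a common pattern: a reference taken at lookup time must be dropped on every error path. The sketch below is a toy model with hypothetical names and a plain integer reference count, not the vxlan transmit code.

```c
/* Toy reference-counted route. */
struct toy_route {
    int refcnt;
    int out_ifindex;
};

static struct toy_route *route_get(struct toy_route *rt)
{
    rt->refcnt++;
    return rt;
}

static void route_put(struct toy_route *rt)
{
    rt->refcnt--;
}

/* Returns 0 on success (the reference is handed on with the packet),
 * -1 if the route points back at the tunnel device itself. */
static int toy_xmit(struct toy_route *table_rt, int tunnel_ifindex)
{
    struct toy_route *rt = route_get(table_rt);

    if (rt->out_ifindex == tunnel_ifindex) {
        route_put(rt);   /* without this, the reference leaks */
        return -1;
    }
    return 0;
}
```

After a rejected circular route the count is back at its starting value, which is the property the fix restores.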
    • net: unix: allow set_peek_off to fail · 12663bfc
      Sasha Levin authored
      
      
      unix_dgram_recvmsg() holds the readlock of the socket until recv
      is complete.
      
      At the same time, we may try to setsockopt(SO_PEEK_OFF), which will hang until
      unix_dgram_recvmsg() completes (which can take a while) without allowing
      us to break out of it, triggering a hung task spew.
      
      Instead, allow set_peek_off to fail, so that userspace will not hang.
      
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sfc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc · 88b07b36
      David S. Miller authored
      
      
      Ben Hutchings says:
      
      ====================
      Several fixes for the PTP hardware support added in 3.7:
      1. Fix filtering of PTP packets on the TX path to be robust against bad
      header lengths.
      2. Limit logging on the RX path in case of a PTP packet flood, partly
      from Laurence Evans.
      3. Disable PTP hardware when the interface is down so that we don't
      receive RX timestamp events, from Alexandre Rames.
      4. Maintain clock frequency adjustment when a time offset is applied.
      
      Also fixes for the SFC9100 family support added in 3.12:
      5. Take the RX prefix length into account when applying NET_IP_ALIGN,
      from Andrew Rybchenko.
      6. Work around a bug that breaks communication between the driver and
      firmware, from Robert Stonehouse.
      
      Please also queue these up for the appropriate stable branches.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: allwinner: emac: Add missing free_irq · e9c56f8d
      Maxime Ripard authored
      
      
      The sun4i-emac driver uses devm_request_irq at .ndo_open time, but relies on
      the managed device mechanism to actually free it. This causes a problem whenever
      someone restarts the interface, because the interrupt is still held and has
      not yet been released.
      
      Fall back to using the regular request_irq at .ndo_open time, and introduce a
      free_irq during .ndo_stop.
      
      Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: David S. Miller <davem@davemloft.net>
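The open/stop pairing described in this commit can be sketched with a toy interrupt line. All names here are hypothetical and the model is deliberately simple: a line can be held by one owner, and requesting a held line fails.

```c
#include <errno.h>

/* Toy interrupt line: at most one handler may hold it. */
struct toy_irq {
    int held;
};

static int toy_request_irq(struct toy_irq *irq)
{
    if (irq->held)
        return -EBUSY;
    irq->held = 1;
    return 0;
}

static void toy_free_irq(struct toy_irq *irq)
{
    irq->held = 0;
}

/* open/stop pair: the line is requested on open and released on stop. */
static int toy_open(struct toy_irq *irq)
{
    return toy_request_irq(irq);
}

static void toy_stop(struct toy_irq *irq)
{
    toy_free_irq(irq);
}
```

If the stop path did not release the line, the second open in a down/up cycle would fail because the line would still be held.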