Skip to content
  1. Jun 02, 2011
  2. Jun 01, 2011
  3. May 28, 2011
  4. May 27, 2011
  5. May 26, 2011
    • Flavio Leitner's avatar
      bonding: documentation and code cleanup for resend_igmp · 94265cf5
      Flavio Leitner authored
      
      
      Improves the documentation about how IGMP resend parameter
      works, fix two missing checks and coding style issues.
      
      Signed-off-by: default avatarFlavio Leitner <fbl@redhat.com>
      Acked-by: default avatarRick Jones <rick.jones2@hp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      94265cf5
    • Neil Horman's avatar
      bonding: prevent deadlock on slave store with alb mode (v3) · 9fe0617d
      Neil Horman authored
      
      
      This soft lockup was recently reported:
      
      [root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
      [root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
      bonding: bond5: doing slave updates when interface is down.
      bonding bond5: master_dev is not up in bond_enslave
      [root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
      bonding: bond5: doing slave updates when interface is down.
      
      BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
      CPU 12:
      Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
      be2d
      Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
      RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
      .text.lock.spinlock+0x26/00
      RSP: 0018:ffff810113167da8  EFLAGS: 00000286
      RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
      RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
      RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
      R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
      R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
      FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0
      
      Call Trace:
       [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
       [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
       [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
       [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
       [<ffffffff8006457b>] __down_write_nested+0x12/0x92
       [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
       [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
       [<ffffffff80016b87>] vfs_write+0xce/0x174
       [<ffffffff80017450>] sys_write+0x45/0x6e
       [<ffffffff8005d28d>] tracesys+0xd5/0xe0
      
      It occurs because we are able to change the slave configuarion of a bond while
      the bond interface is down.  The bonding driver initializes some data structures
      only after its ndo_open routine is called.  Among them is the initalization of
      the alb tx and rx hash locks.  So if we add or remove a slave without first
      opening the bond master device, we run the risk of trying to lock/unlock a
      spinlock that has garbage for data in it, which results in our above softlock.
      
      Note that sometimes this works, because in many cases an unlocked spinlock has
      the raw_lock parameter initialized to zero (meaning that the kzalloc of the
      net_device private data is equivalent to calling spin_lock_init), but thats not
      true in all cases, and we aren't guaranteed that condition, so we need to pass
      the relevant spinlocks through the spin_lock_init function.
      
      Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
      the ndo_init path, so they are ready for use by the bond_store_slaves path.
      
      Change notes:
      v2) Based on conversation with Jay and Nicolas it seems that the ability to
      enslave devices while the bond master is down should be safe to do.  As such
      this is an outlier bug, and so instead we'll just initalize the errant spinlocks
      in the init path rather than the open path, solving the problem.  We'll also
      remove the warnings about the bond being down during enslave operations, since
      it should be safe
      
      v3) Fix spelling error
      
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatar <jtluka@redhat.com>
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: nicolas.2p.debian@gmail.com
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: default avatarJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9fe0617d