Commit 109d67e4 authored by Faisal Latif, committed by Roland Dreier

RDMA/nes: Fix hang issues for large cluster dynamic connections



When running a large cluster setup, connections hung after many hours
of testing.  Fixing this required going over the code and making sure
the rexmit entry is properly removed based on the cm_node's state and
the packet received.  Also, when receiving a FIN packet, check the seq#
and make sure there were no errors before calling handle_fin().

Following are the changes done in nes_cm.c:

* handle_ack_pkt() needs to return an error value so that, in case of
  error, handle_fin() is not called. Some cleanup was done while going
  over the code.

* handle_rst_pkt(): handling of the cm_node's NES_CM_STATE_LAST_ACK
  state was missing.

* process_packet(): when a FIN-only packet is received, call
  check_seq() before processing it.

* handle_fin_pkt(): we were calling cleanup_retrans_entry() for all
  conditions, even if the packet needs to be dropped.
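
The control flow described in the list above can be sketched roughly as
follows. This is a simplified user-space model, not the actual nes_cm.c
code: every sketch_* / SK_* name, the struct layouts, and the exact state
transitions are hypothetical stand-ins chosen for illustration. In
particular, clearing the retransmit entry only when a FIN is accepted is an
assumption about the intent of the fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real nes_cm.c definitions. */
enum sketch_state { SK_ESTABLISHED, SK_LAST_ACK, SK_CLOSED };

struct sketch_node {
	enum sketch_state state;
	uint32_t rcv_nxt;     /* next sequence number expected from the peer */
	bool rexmit_pending;  /* stands in for the retransmit (rexmit) entry */
	int fin_handled;      /* number of FINs actually acted upon */
};

struct sketch_pkt {
	uint32_t seq;
	bool ack, fin, rst;
	bool ack_valid;       /* models whether ACK processing succeeds */
};

/* Returns 0 if the packet carries the expected sequence number. */
static int sketch_check_seq(const struct sketch_node *n,
			    const struct sketch_pkt *p)
{
	return p->seq == n->rcv_nxt ? 0 : -1;
}

/* Returns an error so the caller can skip FIN handling on a bad ACK. */
static int sketch_handle_ack(struct sketch_node *n,
			     const struct sketch_pkt *p)
{
	if (!p->ack_valid)
		return -1;
	n->rexmit_pending = false;
	return 0;
}

/* Assumption: drop the retransmit entry only when the FIN is accepted. */
static void sketch_handle_fin(struct sketch_node *n)
{
	if (n->state != SK_ESTABLISHED)
		return; /* FIN dropped: leave the retransmit entry alone */
	n->rexmit_pending = false;
	n->state = SK_LAST_ACK;
	n->fin_handled++;
}

/* RST handling covers LAST_ACK as well as ESTABLISHED. */
static void sketch_handle_rst(struct sketch_node *n)
{
	switch (n->state) {
	case SK_ESTABLISHED:
	case SK_LAST_ACK:
		n->rexmit_pending = false;
		n->state = SK_CLOSED;
		break;
	case SK_CLOSED:
		break;
	}
}

static int sketch_process_packet(struct sketch_node *n,
				 const struct sketch_pkt *p)
{
	if (p->rst) {
		sketch_handle_rst(n);
		return 0;
	}
	if (p->ack) {
		int ret = sketch_handle_ack(n, p);

		if (ret)
			return ret; /* bad ACK: never reach FIN handling */
		if (p->fin)
			sketch_handle_fin(n);
		return 0;
	}
	if (p->fin) {
		/* FIN-only packet: validate the sequence number first. */
		if (sketch_check_seq(n, p))
			return -1;
		sketch_handle_fin(n);
	}
	return 0;
}
```

In this model, a FIN-only packet with an unexpected sequence number, or an
ACK+FIN packet whose ACK processing fails, leaves both the state and the
retransmit entry untouched.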

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
parent 4e9c3900