[c-nsp] Cisco CSM issues

Matt Buford matt at overloaded.net
Mon Jul 17 15:33:01 EDT 2006


"Rubens Kuhl Jr." <rubensk at gmail.com> wrote:

> Scenario is a one-arm configuration, with host routes so VIP traffic
> goes to the CSM, and policy routing of real-server traffic to make return
> traffic flow through the CSM. CSM version is 4.3(a), running in a Cat6500
> on 12.2(18)SXF4. It's a fault-tolerant configuration with one CSM in each
> 6500, with CSM FT and HSRP between the boxes.

Your topology (policy routing) sounds exactly like what I am doing for my 
large-scale shared (many customers per CSM pair) deployments.  Interesting, 
as I've never come across anyone else doing things this way.  Any time I 
talk to TAC I have to explain to them how this works, as they expect real 
servers to route through the CSMs naturally...
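
For readers who have not seen this design, the packet path can be sketched 
as a toy model.  This is a minimal Python sketch, not CSM or IOS 
configuration; the addresses, the Packet type and all function names are 
hypothetical, and ports are left out to keep it short.

```python
from dataclasses import dataclass, replace

# All addresses below are hypothetical documentation addresses.
VIP = "192.0.2.10"        # virtual IP, host-routed to the CSM
REAL = "10.0.0.11"        # real server behind the VIP
CLIENT = "198.51.100.7"   # client on the outside


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def csm_inbound(pkt, conns):
    """Client-to-server: rewrite VIP -> real server and remember the flow."""
    assert pkt.dst == VIP
    conns[(pkt.src, REAL)] = VIP
    return replace(pkt, dst=REAL)


def policy_route(pkt):
    """Stand-in for policy routing: traffic sourced from a real server is
    sent to the CSM instead of following the normal routing table."""
    return "csm" if pkt.src == REAL else "normal-routing"


def csm_outbound(pkt, conns):
    """Server-to-client: restore the VIP as the source (the un-NAT step)."""
    vip = conns[(pkt.dst, pkt.src)]
    return replace(pkt, src=vip)


conns = {}
to_server = csm_inbound(Packet(CLIENT, VIP), conns)
reply = Packet(src=to_server.dst, dst=to_server.src)
next_hop = policy_route(reply)          # the reply is steered back to the CSM
to_client = csm_outbound(reply, conns)  # the client sees the VIP as source
```

Without the policy-routing step the reply would follow the normal routing 
table, bypass the CSM, and reach the client with the real server's address 
as its source.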

> Client-to-server traffic goes OK: the CSM receives the packet, NATs the
> destination to the real server and sends it. But when a server-to-client
> packet (starting with the very first SYN+ACK) is received at the CSM, it
> does the real-to-VIP NAT back and then sends the packet to the wrong
> destination MAC. It has a route so all traffic would go to the MSFC, but
> it instead fills in a random MAC from its ARP table. The packet goes
> nowhere... 1 second later, a garbage packet appears with client IP to VIP,
> coming from the CSM MAC, and then the CSM itself generates RST packets to
> both client and real server to close the connection.

I haven't run into anything like this.  I have run into a number of 
significant bugs, though.  After more than a year of issues, I've finally 
settled on a setup that seems to work reasonably well.

> Any similar experiences, or CSM versions with a solid reliability track record?

I am at 4.1(7).  I was originally advised by TAC not to venture into 4.2 
territory.  I was told 4.1 was the stable branch and 4.2 was the feature 
branch.  I wasn't even aware 4.3 was out...  At this point, I only have one 
bug outstanding: large fragmented UDP packets (specifically RADIUS 
requests) are corrupted during the un-NAT process, and no workaround is 
available.  I put the one affected customer on his own private CSM running 
a 4.1(4-engineering) release as a temporary fix until Cisco has the bug 
fixed in current versions.  I'm told 4.2(4) will fix this bug, and that I 
should move to 4.2 when that comes out (soon).
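
As background on why large RADIUS requests end up fragmented, here is a 
small arithmetic sketch in Python.  It says nothing about the cause of the 
CSM bug itself, which is not known from this thread.  The 1500-byte MTU, 
the 20-byte IPv4 header without options and the function name are 
assumptions for illustration; 4096 bytes is the maximum RADIUS packet 
length.

```python
def ipv4_fragments(udp_payload_len, mtu=1500, ip_header=20, udp_header=8):
    """Return (offset, size, has_udp_header) for each IPv4 fragment.

    Offsets and sizes count bytes of IP payload (UDP header + UDP data).
    Every fragment except the last must carry a multiple of 8 bytes.
    """
    total = udp_header + udp_payload_len
    per_fragment = (mtu - ip_header) // 8 * 8
    fragments = []
    offset = 0
    while offset < total:
        size = min(per_fragment, total - offset)
        # Only the fragment at offset 0 contains the UDP header (ports).
        fragments.append((offset, size, offset == 0))
        offset += size
    return fragments


# A maximum-size RADIUS packet (4096 bytes of UDP payload):
for offset, size, has_ports in ipv4_fragments(4096):
    print(offset, size, has_ports)
```

Only the first fragment carries the UDP ports, so any device that rewrites 
addresses per flow has to associate the later fragments with the first one 
by IP ID rather than by port.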


