> Are you still running SXF15a? David's advice was already to move to
> SXI to stay out of trouble, as the SXF train is already EOS and will
> hit end of software maintenance by December 2011. If you need to stay
> on SXF, go to SXF17 and then try to troubleshoot.

Okay, I updated the box to SXI3 about 12 hours ago.
Still the same issue though: losing BGP / OSPF sessions (hold time
expired), and the SNMP graphs look like crap again.

> My first guess: have you had any problems with TCAMs overflowing
> in the past? If so, reload the box in the nearest service window
> to clean up the cache and TCAM contents. I'm only guessing that's your
> problem, but mysterious traffic drops with no process showing high
> RP/SP CPU may point to that. Also, as David noted, check for any
> errors/drops on the interfaces themselves.

Due to the IOS upgrade the box has been rebooted - so we can rule this
out, I guess?
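
To double-check rather than guess, these are the commands I assume are
the right ones for spotting a FIB TCAM exception and interface
errors/drops on a Sup720 (te8/1 is only an example port, repeat for
the others; corrections welcome):

show mls cef exception status
show platform hardware capacity forwarding
show interfaces tenGigabitEthernet 8/1 counters errors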

> Any CoPP configured on the box? mls rate-limiters?

No CoPP configured yet (shame on me), but sh proc cpu does not reveal
any strange or unusual load.

mls rate-limiters:

mls rate-limit unicast cef glean 5000 10
mls rate-limit unicast ip rpf-failure 1000 10
mls rate-limit unicast ip icmp redirect 1000 10
mls rate-limit unicast ip icmp unreachable no-route 1000 10
mls rate-limit unicast ip icmp unreachable acl-drop 1000 10
mls rate-limit unicast ip errors 1000 10
mls rate-limit all ttl-failure 1000 10
mls rate-limit all mtu-failure 1000 10

One more thing I am wondering about:

I have two 6704s, te8/1-4 and te9/1-4. Some OSPF adjacencies are on
one card, some on the other. The busy VLAN with a few thousand servers
is also channeled across both cards. Would it be better to regroup the
VLAN onto, let's say, te8/1-4 and move everything that is backbone
related (OSPF/iBGP) to te9/1-4? I am not sure if I am hitting any
fabric limitations.
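
As far as I know, these are the commands that should show whether the
fabric channels on slots 8 and 9 are anywhere near their limits or
logging errors. This is just my assumption of where to look, not a
confirmed diagnosis:

show fabric utilization all
show fabric errors
show platform hardware capacity fabric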

I really do not know where else to look...


