[j-nsp] Continuous SNMP traps for ospfTxRetransmit alarms
Jackson, William
william.jackson at gibtele.com
Tue Jan 26 18:20:11 EST 2016
Hi
Trying to get some other opinions here on a problem I have.
We have an EX4300 switch running OSPF on an IRB interface.
On the same VLAN there are multiple other OSPF routers, so we have a DR, a BDR and DROTHER nodes.
We are seeing a constant flow of traps for OSPF retransmits from the EX. This is not traffic affecting; it only affects management/monitoring.
This appears to be due to the way Juniper has implemented this part of the RFC. I was wondering whether other people have run into this situation and what mitigation steps they took, apart from redesigning the network :)
JTAC have stated the following (that it is working as designed):
Update: Juniper's implementation of the OSPF flooding mechanism can occasionally result in unnecessary LSA retransmissions. This has no operational impact and does not affect convergence times. See the analysis and explanation below.
Explanation/Analysis: Please refer to PR 35491 for more details:
Both DR-OTHER and DR ROUTER receive this LSA from other neighbors and flood the LSA. DR-OTHER floods to AllDRouters (224.0.0.6) and DR ROUTER floods to AllSPFRouters (224.0.0.5).
Thus DR-OTHER and DR ROUTER each add the LSA to the retransmit list for neighbor BDR ROUTER as well as to the retransmit list for each other.
DR-OTHER receives the update from DR ROUTER and treats it as an implied acknowledgement. RFC2328 Section 13 Step (7).
Similarly DR ROUTER receives the update from DR-OTHER and treats it as an implied acknowledgement. RFC2328 Section 13 Step (7).
BDR ROUTER receives both updates and processes the update from DR-OTHER first. Since it is acting as BDR on the subnet, it should add the LSA to the retransmit lists of all neighbors on that interface (which in this case are DR-OTHER and DR ROUTER) but will not flood that LSA back out. RFC2328 Section 13.3 Step (1d) and Section 13.3 Step (4).
BDR ROUTER then processes the update from DR ROUTER and it sends a direct ack to DR ROUTER.
*****Note that a direct ack is sent to the unicast address and is addressed to DR ROUTER (this is not the expected behavior and it leads to OSPF retransmissions).
*****Section 13.5 and Table 19 in RFC 2328 describe what BDR ROUTER should do in this case.
*****The expected behavior is to send a delayed acknowledgment to AllSPFRouters, which removes the LSA from the retransmit lists of both DR-OTHER and DR ROUTER.
Root cause: Juniper sends a direct ack instead of a delayed ack, and that is the reason for the retransmissions. The direct ack is sent to the unicast address of DR ROUTER, whereas a delayed ack would have been addressed to AllSPFRouters and would have removed the LSA from the retransmit lists of both DR-OTHER and DR ROUTER.
DR ROUTER receives the direct ack from BDR ROUTER and removes the entry from the retransmit list for BDR ROUTER, whereas DR-OTHER ends up retransmitting the LSA when the retransmit timer fires.
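The sequence JTAC describes can be sketched as a small bookkeeping exercise on retransmit lists. This is only an illustrative model, not Junos code: the function name `pending_after_ack` and the `"direct"`/`"delayed"` labels are made up for the example, and it assumes exactly the three routers named in the analysis.

```python
def pending_after_ack(ack_mode):
    """Return the retransmit lists still non-empty after BDR acknowledges.

    ack_mode is a hypothetical label: "direct" (unicast ack to DR only)
    or "delayed" (multicast ack to AllSPFRouters).
    """
    # After flooding, each sender waits for an ack from its adjacent neighbors.
    retransmit = {
        "DR-OTHER": {"DR", "BDR"},
        "DR": {"DR-OTHER", "BDR"},
    }

    # DR-OTHER and DR see each other's update and treat it as an implied ack.
    retransmit["DR-OTHER"].discard("DR")
    retransmit["DR"].discard("DR-OTHER")

    # BDR acknowledges: a direct ack reaches only DR, while a delayed ack
    # sent to AllSPFRouters reaches both DR and DR-OTHER.
    if ack_mode == "direct":
        ack_receivers = ["DR"]
    elif ack_mode == "delayed":
        ack_receivers = ["DR", "DR-OTHER"]
    else:
        raise ValueError(f"unknown ack mode: {ack_mode}")

    for router in ack_receivers:
        retransmit[router].discard("BDR")

    # Anything left here is retransmitted when the retransmit timer fires.
    return {router: sorted(waiting) for router, waiting in retransmit.items() if waiting}


print(pending_after_ack("direct"))
print(pending_after_ack("delayed"))
```

In this model the direct ack leaves DR-OTHER still waiting on BDR, which corresponds to the retransmission (and the trap) described above, while the delayed ack clears both lists.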
The following was taken from RFC 2328 which clearly explains the above situation:
                                Action taken in state
Circumstances                   Backup                          All other states
___________________________________________________________________________________________
LSA has been flooded back       No acknowledgment sent.         No acknowledgment sent.
out receiving interface
(see Section 13, step 5b).
___________________________________________________________________________________________
LSA is more recent than         Delayed acknowledgment sent     Delayed acknowledgment sent.
database copy, but was not      if advertisement received
flooded back out receiving      from Designated Router,
interface                       otherwise do nothing
___________________________________________________________________________________________
LSA is a duplicate, and was     Delayed acknowledgment sent     No acknowledgment sent.
treated as an implied           if advertisement received
acknowledgment (see Section     from Designated Router,
13, step 7a).                   otherwise do nothing
___________________________________________________________________________________________
LSA is a duplicate, and was     Direct acknowledgment sent.     Direct acknowledgment sent.
not treated as an implied
acknowledgment.
___________________________________________________________________________________________
LSA's LS age is equal to        Direct acknowledgment sent.     Direct acknowledgment sent.
MaxAge, and there is no
current instance of the LSA
in the link state database,
and none of router's
neighbors are in states
Exchange
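
The table above can also be read as a simple lookup. The following is a minimal sketch of that lookup, not an OSPF implementation: the circumstance labels, the `ack_action` function and the `"none"`/`"delayed"`/`"direct"` return values are all hypothetical names chosen for the example.

```python
# Marker for the Backup-state cells that depend on who sent the LSA.
FROM_DR_ONLY = "delayed_if_from_dr"

# Each entry maps a (hypothetical) circumstance label to
# (action in Backup state, action in all other states), per the table above.
TABLE_19 = {
    "flooded_back": ("none", "none"),
    "more_recent_not_flooded": (FROM_DR_ONLY, "delayed"),
    "duplicate_implied_ack": (FROM_DR_ONLY, "none"),
    "duplicate_not_implied_ack": ("direct", "direct"),
    "maxage_no_instance": ("direct", "direct"),
}


def ack_action(circumstance, is_backup, from_dr):
    """Return "none", "delayed" or "direct" for a received LSA."""
    backup_action, other_action = TABLE_19[circumstance]
    if not is_backup:
        return other_action
    if backup_action == FROM_DR_ONLY:
        # The Backup router acknowledges only copies received from the DR.
        return "delayed" if from_dr else "none"
    return backup_action


# BDR receiving the DR's copy, read as a duplicate treated as an implied ack.
print(ack_action("duplicate_implied_ack", is_backup=True, from_dr=True))
```

On one reading of the scenario, the BDR handling the copy from the DR falls under the "duplicate treated as an implied acknowledgment" row, for which the table gives a delayed acknowledgment; the "more recent, not flooded back" row gives the same result for a Backup router receiving from the DR.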
I would have thought there might be a CLI statement to switch between strict RFC behavior and the Juniper implementation?
Any comments?
thanks
William