[j-nsp] Continuous SNMP traps for ospfTxRetransmit alarms

Jackson, William william.jackson at gibtele.com
Tue Jan 26 18:20:11 EST 2016


Hi

Trying to get some other opinions here on a problem I have.

We have an EX4300 switch which is running OSPF on an IRB interface.
On the same VLAN there are multiple other OSPF routers, thus we have a DR, a BDR and DROTHER nodes.
We are seeing a constant flow of traps for OSPF retransmits from the EX. This is not traffic-affecting; it only affects management/monitoring.

It would seem that this is due to the way Juniper has implemented this part of the RFC. I was wondering whether other people have run into this situation and what mitigation steps they took, apart from redesigning the network :)

JTAC have stated the following (in short, that it is working as designed):



Update: Juniper's implementation of the OSPF flooding mechanism can occasionally result in unnecessary LSA retransmissions. This has no operational impact, and does not affect convergence times. See the analysis and explanation below.



Explanation/Analysis: Please refer to PR 35491 for more details:



Both DR-OTHER and DR ROUTER receive this LSA from other neighbors and flood the LSA. DR-OTHER floods to AllDRouters (224.0.0.6) and DR ROUTER floods to AllSPFRouters (224.0.0.5).



Thus DR-OTHER and DR ROUTER add the LSA to Neighbor BDR ROUTER's retransmit list as well as each other's retransmission list.

DR-OTHER receives the update from DR ROUTER and treats it as an implied acknowledgement. RFC2328 Section 13 Step (7).



Similarly DR ROUTER receives the update from DR-OTHER and treats it as an implied acknowledgement. RFC2328 Section 13 Step (7).



BDR ROUTER receives both updates and processes the update from DR-OTHER first. Since it is acting as BDR on the subnet, it should add the LSA to the retransmit lists of all neighbors on that interface (which in this case are DR-OTHER and DR ROUTER) but will not flood that LSA back out. RFC2328 Sections 13.3 Step (1d) and 13.3 Step (4).



BDR ROUTER then processes the update from DR ROUTER and sends a direct ack to DR ROUTER.



*****Note that the direct ack is sent to the unicast address of DR ROUTER (this is not the expected behavior, and it leads to OSPF retransmissions).

*****Section 13.5 and Table 19 of RFC 2328 describe what BDR ROUTER should do in this case.

*****The expected behavior is to send a delayed acknowledgment to AllSPFRouters, which removes the LSA from the retransmit lists of both DR-OTHER and DR ROUTER.



Root cause: Juniper sends a direct ack instead of a delayed ack, and that is the reason for the retransmissions. The direct ack is sent to the unicast address of DR ROUTER, whereas a delayed ack would have been addressed to AllSPFRouters and would have removed the LSA from the retransmission lists of both DR-OTHER and DR ROUTER.

DR ROUTER receives the direct ack from BDR ROUTER and removes the entry from the retransmit list for BDR ROUTER, whereas DR-OTHER ends up retransmitting the LSA when the retransmit timer fires.
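
To make the sequence above easier to follow, here is a minimal Python sketch of the retransmit-list bookkeeping on a three-router segment. It is only an illustration of the explanation quoted above, not Junos code: the function name pending_after_bdr_ack and the "direct"/"delayed" mode strings are made up for this example, and timers, sequence numbers and the BDR's own retransmit lists are left out.

```python
def pending_after_bdr_ack(ack_mode):
    """Return, per flooding router, the neighbors still awaiting an ack.

    ack_mode is a hypothetical switch for this sketch:
      "direct"  - BDR unicasts its ack to the DR only (behavior described by JTAC)
      "delayed" - BDR multicasts a delayed ack to AllSPFRouters (RFC 2328 Table 19)
    """
    # After flooding, each sender lists every neighbor on the segment as unacknowledged.
    retransmit = {
        "DROTHER": {"DR", "BDR"},   # DROTHER flooded to AllDRouters (224.0.0.6)
        "DR": {"BDR", "DROTHER"},   # DR flooded to AllSPFRouters (224.0.0.5)
    }

    # DR and DROTHER each see the other's copy of the same LSA and treat it
    # as an implied acknowledgment (RFC 2328 Section 13, step 7).
    retransmit["DROTHER"].discard("DR")
    retransmit["DR"].discard("DROTHER")

    # Which routers actually receive the BDR's acknowledgment?
    if ack_mode == "direct":
        ack_receivers = ["DR"]
    elif ack_mode == "delayed":
        ack_receivers = ["DR", "DROTHER"]
    else:
        raise ValueError("ack_mode must be 'direct' or 'delayed'")

    for router in ack_receivers:
        retransmit[router].discard("BDR")

    return {router: sorted(pending) for router, pending in retransmit.items()}


if __name__ == "__main__":
    for mode in ("direct", "delayed"):
        print(mode, pending_after_bdr_ack(mode))
```

With "direct", DROTHER is left with BDR on its retransmit list, which is the retransmission (and hence the trap) described above; with "delayed", both lists end up empty.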





The following table (Table 19) is taken from RFC 2328, Section 13.5, and covers the above situation:




                                     Action taken in state
   Circumstances            Backup                All other states
   _________________________________________________________________
   LSA  has                 No  acknowledgment    No  acknowledgment
   been  flooded back       sent.                 sent.
   out receiving  in-
   terface  (see Sec-
   tion 13, step 5b).
   _________________________________________________________________
   LSA   is                 Delayed acknowledg-   Delayed       ack-
   more  recent  than       ment sent if adver-   nowledgment sent.
   database copy, but       tisement   received
   was   not  flooded       from    Designated
   back out receiving       Router,  otherwise
   interface                do nothing
   _________________________________________________________________
   LSA is a                 Delayed acknowledg-   No  acknowledgment
   duplicate, and was       ment sent if adver-   sent.
   treated as an  im-       tisement   received
   plied  acknowledg-       from    Designated
   ment (see  Section       Router,  otherwise
   13, step 7a).            do nothing
   _________________________________________________________________
   LSA is a                 Direct acknowledg-    Direct acknowledg-
   duplicate, and was       ment sent.            ment sent.
   not treated as  an
   implied       ack-
   nowledgment.
   _________________________________________________________________
   LSA's LS                 Direct acknowledg-    Direct acknowledg-
   age is equal to          ment sent.            ment sent.
   MaxAge, and there is
   no current instance
   of the LSA
   in the link state
   database, and none
   of router's neighbors
   are in states Exchange
   or Loading (see
   Section 13, step 4).
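
For anyone who finds the table hard to read in this form, the same decision can be sketched as a small Python lookup. The enum and function names below are invented for illustration, and the sketch covers only the rows of the table above; it is not taken from any router implementation.

```python
from enum import Enum


class Circumstance(Enum):
    FLOODED_BACK_OUT = 1           # LSA flooded back out receiving interface
    NEWER_NOT_FLOODED_BACK = 2     # more recent than database copy, not flooded back
    DUPLICATE_IMPLIED_ACK = 3      # duplicate, treated as an implied acknowledgment
    DUPLICATE_NOT_IMPLIED_ACK = 4  # duplicate, not treated as an implied acknowledgment
    MAXAGE_NO_INSTANCE = 5         # LS age is MaxAge and no database instance exists


def ack_action(circumstance, in_backup_state, from_dr):
    """Return "none", "delayed" or "direct" following the rows of the table.

    in_backup_state: True if the receiving interface is in state Backup.
    from_dr: True if the LSA was received from the Designated Router.
    """
    if circumstance is Circumstance.FLOODED_BACK_OUT:
        return "none"
    if circumstance is Circumstance.NEWER_NOT_FLOODED_BACK:
        if in_backup_state:
            return "delayed" if from_dr else "none"
        return "delayed"
    if circumstance is Circumstance.DUPLICATE_IMPLIED_ACK:
        if in_backup_state:
            return "delayed" if from_dr else "none"
        return "none"
    # The last two rows call for a direct acknowledgment in every state.
    return "direct"


if __name__ == "__main__":
    # The case discussed above: BDR gets a duplicate from the DR and
    # treats it as an implied acknowledgment.
    print(ack_action(Circumstance.DUPLICATE_IMPLIED_ACK, True, True))
```

For the case in this thread (interface in state Backup, duplicate received from the DR and treated as an implied acknowledgment) the table gives a delayed acknowledgment, not a direct one.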



I would have thought there might be a CLI statement to switch between strict RFC behavior and the Juniper implementation?
Any comments?

thanks

William


