[j-nsp] strange problem on chassis cluster

Matthias Brumm matthias at commy.de
Sat Sep 4 08:36:25 EDT 2010


  Hi!

The update on the customer system, which is not in use at the moment,
was catastrophic. Not only does the cluster not seem to come up
correctly, I also have difficulty reaching the system. The flowd process
shows high CPU usage. Could that have something to do with it?

I did not explain why I mentioned the customer system. We have
different hardware there (J4350) but the same problem. Right after a
change to the interface configuration, its only BGP session, the one to
our main router, is dropped. And at the moment, after the upgrade, I am
unable to reach the system.

Matthias

On 04.09.10 12:50, Matthias Brumm wrote:
> Hi!
>
> We have a very strange problem on two chassis clusters with 10.0R3.10
> (will try updating to R4.7 today).
>
> One chassis cluster (2x J6350) is our main system.
> The other (2x J4350) is a system located at our customer's site.
>
> The two clusters speak BGP with each other. For the customer
> system, this is the only BGP session. Our main system has a full BGP
> mesh to our other locations and edge systems. To understand the
> problem, I will reduce this to three BGP sessions:
>
> A) BGP session to AMS-IX over VLAN 1
> B) BGP session to ECIX over VLAN 1
> C) BGP session to ECIX over VLAN 2
>
> Two switches are involved. VLAN 1 is configured on both switches to make
> it available in Amsterdam and Düsseldorf. VLAN 2 is configured only on
> the switch facing Düsseldorf, to have a backup in case the first
> switch fails.
>
> The day before yesterday, I started two pings to the ECIX router: one
> from my local workstation, the other from the main cluster.
>
> If I configure something on the redundant interfaces, as soon as I
> commit, the first ping stays normal while the second jumps to +30ms
> (normally around 6ms). 2-3 minutes later, both pings stop and the BGP
> session drops due to hold time expiration. This is the only BGP session
> that is dropped. After a few minutes, the pings and the BGP session come
> back. Every other BGP session, even the one to Düsseldorf over VLAN 2,
> stays up.
>
> I then moved the main load to Düsseldorf onto VLAN 2. This time, that
> BGP session was dropped, while the other stayed up. The session to
> Düsseldorf carries the main load, with around 260000 prefixes.
>
> Matthias


