[j-nsp] Strange VRRP problem -- question about restarting process

John Neiberger jneiberger at gmail.com
Fri Nov 2 18:43:44 EDT 2012


Okay, I've been looking at this for a little bit and it's just really
bizarre. I was wrong about the connectivity earlier. It's really just a
single Cisco 4948 in the middle between these two MX960s. IGMP snooping is
not enabled, nor are there any inbound filters on the routers. I have
verified that our RE filter is allowing VRRP. We have verified with the
monitor traffic command that router 1 is sending and receiving VRRP
multicasts, but router 2 is only sending them, not receiving any.

The switch is a pretty vanilla config. The two links are in the same VLAN
and there are no special features enabled, like MAC filtering or whatever.
It's very straightforward, which is why we're all stumped. Something is
stopping those multicasts from reaching router 2, but for the life of me I
don't see what it could be.


On Fri, Nov 2, 2012 at 3:53 PM, John Neiberger <jneiberger at gmail.com> wrote:

> Sorry for the lack of replies. I got swamped today and haven't had a
> chance to look at this much. Another one of our engineers has been working
> it. I did notice that the three interfaces I originally looked at back when
> this started all seem to be fine now. However, this weird behavior seems to
> have moved to some other interfaces. I'm going to need to investigate a bit
> more to find out what changed when I wasn't looking.  :)
>
> We do not have IGMP snooping enabled on the Cisco switches and we have no
> inbound filters that would block traffic. In fact, we have this identical
> config on several different routers and dozens of interfaces and switches
> with no problem. Whatever is wrong seems to be isolated to this router.
> I'll try to regroup and get the latest info.
>
> Thanks!
> John
>
>
> On Fri, Nov 2, 2012 at 11:18 AM, Alex Arseniev <alex.arseniev at gmail.com> wrote:
>
>> Well, that's fairly straightforward - either (1) VRRP on master [J]
>> stopped sending or (2) CSCO switches stopped forwarding VRRP hellos, or (3)
>> backup [J] drops incoming VRRP hellos.
>> You can verify (1) by using "monitor traffic interface <blah> no-resolve
>> size 9999".
>> (2) could be verified with SPAN/RSPAN
>> (3) cannot be verified with "monitor traffic interface" _if_ there is an
>> input FW filter. "monitor traffic interface" a.k.a. tcpdump does not
>> capture packets dropped by a FW filter. Which raises a question: do you
>> have an input FW filter on the VRRP interfaces or lo0? If yes, do you allow
>> "protocol vrrp" as well as AH/proto 51, and have you added/changed the VRRP
>> auth type recently? Proto 51 is used when VRRP MD5 auth is configured. In
>> any case, I'd suggest configuring a FW filter to log/syslog incoming VRRP
>> packets (dst.ip 224.0.0.18/32) on backup [J].
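>>
>> A minimal sketch of such a logging filter, in "set" form. The filter name
>> LOG-VRRP, the term and counter names, and the interface ge-0/0/0 unit 0
>> are placeholders, not taken from the actual routers. It assumes the
>> interface has no input filter already, since this statement would replace
>> an existing one.
>>
>> ```
>> # Log, syslog and count anything sent to the VRRP group address, then accept it
>> set firewall family inet filter LOG-VRRP term vrrp from destination-address 224.0.0.18/32
>> set firewall family inet filter LOG-VRRP term vrrp then count vrrp-in
>> set firewall family inet filter LOG-VRRP term vrrp then log
>> set firewall family inet filter LOG-VRRP term vrrp then syslog
>> set firewall family inet filter LOG-VRRP term vrrp then accept
>> # Accept everything else; a Junos filter otherwise ends in an implicit discard
>> set firewall family inet filter LOG-VRRP term other then accept
>> # Apply inbound on the backup router's VRRP interface (placeholder name)
>> set interfaces ge-0/0/0 unit 0 family inet filter input LOG-VRRP
>> ```
>>
>> After commit, "show firewall filter LOG-VRRP" should show whether the
>> vrrp-in counter increments, and "show firewall log" lists matched packets.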
>> HTH
>> Rgds
>> Alex
>>
>> ----- Original Message -----
>> From: "John Neiberger" <jneiberger at gmail.com>
>> To: <juniper-nsp at puck.nether.net>
>> Sent: Friday, November 02, 2012 3:37 PM
>> Subject: [j-nsp] Strange VRRP problem -- question about restarting process
>>
>>
>>> We have a very odd problem that we've been dealing with for a couple of
>>> weeks. JTAC is involved but we have not come to a resolution yet. The gist
>>> of the problem is that we have two MX960s and we're running VRRP on
>>> multiple interfaces with different Cisco switches in between each pair of
>>> Juniper interfaces.
>>>
>>> [J] ----- [C]----[C]------ [J]
>>>
>>> The switches are just layer two and we're running VRRP on the routers. The
>>> problem is that one day, three of the interfaces on the backup router
>>> suddenly stopped receiving VRRP messages from its peer. JTAC seems to think
>>> that the Cisco switches just suddenly stopped forwarding VRRP messages to
>>> the backup router, but that makes zero sense unless some bizarre issue just
>>> happened to occur on multiple unrelated switches at exactly the same
>>> moment. I'm still leaning toward a problem on the router.
>>>
>>> Which leads me to my question. What is the risk of restarting the VRRP
>>> process? I see we have "soft" and "graceful" as options. Both sound fairly
>>> low-risk. I'm tempted to just restart the process on the backup router to
>>> see if that fixes the problem.
>>>
>>> What do you think?
>>>
>>> Thanks,
>>> John