Re: policy routing performance

From: Oded Comay (comay@post.tau.ac.il)
Date: Tue Jan 11 2000 - 04:29:57 EST


Please note that tests using new features in IOS 12.0S showed much improved
performance when using policy routing and GRE tunneling (see below). To
summarize, the 7500 in the tested configuration was placed in a loop, such
that packets flowed both ways. We were able to squeeze 45K PPS in each
direction (I think some people would count this as 4x45K = 180K PPS, since
packets are counted both incoming and outgoing).

Oded.

>From comay@post.tau.ac.il Sun Aug 1 21:38:39 1999
Date: Sun, 1 Aug 1999 21:38:38 +0300 (IDT)
From: Oded Comay <comay@post.tau.ac.il>
X-Sender: comay@ccsg.tau.ac.il
Reply-To: Oded Comay <comay@post.tau.ac.il>
To: CISCO-L - Israeli Cisco Kids <cisco-l@LISTSERV.AC.IL>
Subject: GigaPop capacity
Message-ID: <Pine.SGI.3.96-heb-2.07.990731214359.7024R-100000@ccsg.tau.ac.il>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

First, Conclusions:

 . T3 is 44736Kbps. T3 framing leaves us with 44210Kbps. In the tests
   we could get 43Mbps. I suspect IOS doesn't report the HDLC encapsulation
   overhead (about 8 bytes/packet), which would make the utilization close
   to 44Mbps.

 . The satellite link latency is 277ms (554ms round trip). A terrestrial
   link over this distance would probably induce 50-60ms latency (100-120ms
   round trip).

 . Using both a tunnel and policy-based routing, the system can sustain 45K
   packets/sec in each direction. This can be improved by about 10% by
   upgrading the VIP running the Fast Ethernet on GP2 to a VIP2-50 (for some
   reason, the cards running the Fast Ethernets were supplied as VIP2-40s).

 . The average packet size on the Internet is close to 500 bytes. The
   system has no problem sustaining a T3 assuming this packet size. To get
   close to OC3 rate, all VIP2-40 cards will need to be upgraded (a rough
   calculation follows this list).
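
As a rough sanity check on that last point (my own back-of-the-envelope
numbers, not measured in the tests), the packet rate needed to fill a T3
versus an OC3 at a 500-byte average packet size is approximately:

\[
\frac{44.7\,\text{Mbit/s}}{500 \times 8\,\text{bit}} \approx 11\text{K pps},
\qquad
\frac{155\,\text{Mbit/s}}{500 \times 8\,\text{bit}} \approx 39\text{K pps}
\]

so a T3 sits comfortably below the measured 45K pps ceiling, while an OC3
is close to it, which is why the remaining VIP2-40s would need upgrading.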

Now, Test methodology:

The following setup was established in order to test the gigapop capacity:

    2xFE       T3      2xFE
GP4======GP3-------GP2======GP1

A GRE tunnel was defined between GP3 and GP2. A policy route-map was
defined on GP3 so that packets arriving on the I1 Ethernet interface are
directed into the tunnel. A similar route-map is defined on GP2, which
directs all packets arriving on the tunnel to the I1 interface of GP1. This
setup allows us to create a multi-hop routing loop (the reason for doing so
will become clear later) as follows: statically route a destination X from
GP1 to GP2. On GP2, X is routed to GP3. GP3 routes X to GP4. GP4 routes X
back to GP3's I1 interface, which brings it to GP1 over the tunnel via the
above-mentioned policy routing (a configuration sketch follows the table
below). A looping packet takes the following route:

Router(in)             Router(out)

GP1(FA0/0)             GP1(FA1/0)
GP2(FA1/0/0)           GP2(Se0/0/0)
GP3(Se4/0/0)           GP3(FA1/1/0)
GP4(FA1/1/0)           GP4(FA1/0/0)
GP3(FA1/0/0)           GP3(Tunnel/Se4/0/0)
GP2(Tunnel/Se0/0/0)    GP2(FA1/1/0)
GP1...
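
For concreteness, here is a minimal sketch of the GP2 side of this setup
together with a static route that closes the loop. The interface names
follow the hop table above, but the tunnel number, route-map name, GP1
next-hop address (192.0.2.1) and test destination X (10.199.199.0/24) are
hypothetical and not taken from the real configurations:

-------------GP2 sketch (hypothetical names/addresses)-------------
interface Tunnel1
 ip route-cache policy
 ip policy route-map from-tunnel1
!
route-map from-tunnel1 permit 10
 ! send everything arriving on the tunnel toward GP1's I1 interface
 set ip next-hop 192.0.2.1
!
! route the test destination X back across the T3 toward GP3
ip route 10.199.199.0 255.255.255.0 Serial0/0/0
--------------------------------------------------------------------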

Not only does the loop exercise all components of the system, it also
works as an amplifier. Injecting a packet destined to X into any of the
routers results in the packet looping between the routers until its TTL
expires. This allows us to easily flood the system, even at a moderate
injection rate of packets destined to X. Flooding with different packet
sizes lets us reveal bottlenecks within the GP.
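
To get a feel for the amplification (again my own rough arithmetic, not a
measurement): each injected packet is forwarded roughly TTL times in total
before being dropped, so an injection rate r produces an aggregate
forwarding load across the loop on the order of

\[
r_{\text{aggregate}} \approx r_{\text{inject}} \times \mathrm{TTL},
\qquad
\text{e.g. } 200\ \text{pps} \times 255 \approx 51\text{K pps}
\]

ignoring how that load is split between the individual routers and
directions.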

In the following tests, all routers were running IOS 12.0(5)S, with
distributed CEF (almost everywhere), flow switching, and policy switching
enabled. Distributed CEF was not used on one of GP2's Fast Ethernets (and,
naturally, not on GP1, which is a 7200). This was done after the VIP CPU
utilization there was found to be much higher than on the other routers.
This issue still needs to be investigated, since that VIP was only supposed
to do simple routing.
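
The switching features were enabled per interface roughly as follows (a
sketch: the interface commands match the router setup quoted at the end of
this message, while the global ip cef distributed line and the interface
name here are only illustrative):

ip cef distributed
!
interface FastEthernet1/0/0
 ip route-cache flow
 ip route-cache policy
 ip route-cache distributed
 ip policy route-map to-tunnel1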

The test results are reported in the table below (the odd packet sizes come
from adding 28 bytes of IP+ICMP header overhead to the ping payload, e.g. a
1470-byte payload becomes a 1498-byte packet):

Packet Size          1498       528        68

GP1 CPU              7%         16%        70%
 FA0/0 in pps/bps    4.2K/52M   11K/49M    47K/31M
 FA1/0 out pps/bps   4.3K/52M   11K/49M    46K/30M

GP2 CPU              12%        28%        99%
 FA1/0/0 out         3.5K/43M   10K/42M    45K/30M
 FA1/1/0 in          3.6K/43M   10K/43M    46K/30M
 Se0/0/0 in          3.7K/41M   10K/42M    45K/30M
 Se0/0/0 out         3.5K/43M   10K/42M    46K/25M
 Tunnel in           3.6K/43M   10K/43M    45K/35M
 VIP 0 CPU           16%        34%        99%
 VIP 1 CPU           22%        50%        87%

GP3 CPU              8%         21%        78%
 FA1/0/0 in          3.6K/43M   10K/43M    46K/30M
 FA1/1/0 out         3.6K/43M   10K/43M    46K/30M
 Se4/0/0 in          3.6K/44M   10K/43M    46K/26M
 Se4/0/0 out         3.5K/43M   10K/43M    46K/35M
 Tunnel out          3.6K/44M   10K/44M    46K/34M
 VIP 1 CPU           22%        52%        93%
 VIP 4 CPU           16%        35%        87%

GP4 CPU              0%         0%         0%
 FA1/0/0 out         3.6K/43M   10K/44M    45K/30M
 FA1/1/0 in          3.6K/43M   10K/43M    46K/30M
 VIP 1 CPU           18%        45%        88%
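
(The figures above were presumably collected with show commands along the
lines of those quoted later in this message; the prompt, interface and VIP
numbers below are only illustrative:

GP3#show interfaces FastEthernet1/0/0 | include rate
GP3#show processes cpu | include utilization
GP3#show controllers vip 1 tech-support | include utilization )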

Comments:

 . GP1 is a 7200 NPE-200 (R5000@200MHz). All the others are 7500s with an
   RSP4 (R5000@200MHz). The VIPs running the Fast Ethernets are VIP2-40s
   (R4700@100MHz). The VIPs running the T3 are VIP2-50s (R5000@200MHz).

 . For each packet, the Ethernet overhead is 14 bytes, and the tunnel
   overhead is 24 bytes (see the worked example after this list).

 . There is some anomaly in the way the tunnel bandwidth utilization is
   reported by IOS.
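
As a worked example of those per-packet overheads (my arithmetic, using the
528-byte test packets; the 24 tunnel bytes are the standard 20-byte outer
IP header plus a 4-byte GRE header):

\[
528 + 24 = 552\ \text{bytes inside the GRE tunnel on the T3},
\qquad
528 + 14 = 542\ \text{bytes on a Fast Ethernet hop}.
\]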

Puf.

Oded.

On Tue, 11 Jan 2000, Hank Nussbacher wrote:

>At 21:39 10/01/00 -0600, Edward Henigin wrote:
>
>The following benchmark and the extracted emails below were done by a
>colleague here in Israel during the summer:
>
>>I have got some preliminary figures with respect to policy-based routing
>>and tunneling running at the same time on a 7505/RSP4 with VIP2-50 cards.
>>The setup is such that everything received on a specific interface is
>>policy routed into a tunnel (see router setup below). As a result, the
>>router was able to process about 3500 large (1400-byte) packets per second
>>in each direction:
>>
>>chicago-gp3#sh int tun1 | inc bits
>> 5 minute input rate 41985000 bits/sec, 3643 packets/sec
>> 5 minute output rate 38692000 bits/sec, 3369 packets/sec
>>chicago-gp3#sh int ser4/0/0 | inc bits
>> 5 minute input rate 40299000 bits/sec, 3530 packets/sec
>> 5 minute output rate 37748000 bits/sec, 3527 packets/sec
>>
>>At the same time, CPU utilization was:
>>
>>chicago-gp3#sh proc cpu | inc util
>>CPU utilization for five seconds: 24%/22%; one minute: 23%; five minutes: 22%
>>
>>and :
>>
>>chicago-gp3#sh controllers vip 4 tech-support | inc util
>>CPU utilization for five seconds: 17%/17%; one minute: 15%; five minutes: 15%
>>
>>With tiny (40bytes) packets, utilization was:
>>
>>chicago-gp3#sh int tun1 | inc bits/sec
>> 5 minute input rate 9476000 bits/sec, 6708 packets/sec
>> 5 minute output rate 8580000 bits/sec, 6232 packets/sec
>>chicago-gp3#sh int ser4/0/0 | inc bits/sec
>> 5 minute input rate 8809000 bits/sec, 6691 packets/sec
>> 5 minute output rate 9078000 bits/sec, 6691 packets/sec
>>chicago-gp3#sh proc cpu | inc util
>>CPU utilization for five seconds: 40%/37%; one minute: 38%; five minutes: 36%
>>chicago-gp3#sh controllers vip 4 tech | inc util
>>CPU utilization for five seconds: 22%/22%; one minute: 20%; five minutes: 19%
>>
>>
>>FYI.
>>
>>Oded.
>>
>>-------------Router Setup-------------
>>interface Tunnel1
>> ip address 192.114.99.129 255.255.255.240
>> no ip directed-broadcast
>> ip route-cache policy
>> ip route-cache flow
>> tunnel source FastEthernet1/0/0
>> tunnel destination 192.114.99.49
>>
>>interface FastEthernet1/0/0
>> ip address 192.114.101.49 255.255.255.240
>> no ip directed-broadcast
>> ip route-cache policy
>> ip route-cache flow
>> ip route-cache distributed
>> ip policy route-map to-tunnel1
>>
>>route-map to-tunnel1 permit 10
>> set interface Tunnel1
>>
>>
>
>
>>      Is anyone doing policy routing on backbone interfaces? So
>>it would be on an Internet traffic mix, running tens to hundreds of
>>megabits?
>>
>>      I'm considering doing this for a specific application.
>>I'm concerned about the potential performance hit (even after ip
>>route-cache policy), and it would be nice to have flow stats available,
>>but I guess I can live without them.
>>
>> Ed
>>
>>
>


