[cisco-bba] Problem with post-authentication timeouts?

Scott Lambert lambert at lambertfam.org
Tue Nov 15 04:06:49 EST 2005


I am having difficulties with certain users being unable to get online
or having a delay of 30 or so seconds between PPPoE authenticated and
passing traffic.  I think the former may be a side effect of the latter.

None of my known cisco PIX users on this ATM trunk can get online.  They
successfully authenticate three times per minute.  However, the PIX says
"Could not determine remote IP address" and sends a PPP term request.

When a technician goes on site and uses a Windows XP SP2 laptop at
the end of the ethernet cable that was plugged into the PIX they
report that they are able to get online, but there is a delay between
authentication, as the same username the PIX is using, and being able
to actually pass traffic.  The delay is reported at between 32 and 120
seconds.  There is a report of no notable delay for one attempt sometime
in the middle of the test process.

When I go on-site with my PowerBook, I authenticate as the same username
with no notable delay between authentication and passing traffic.

We had no problems until Friday afternoon when the power company dropped
the grid in half the town, including our office.  The generator and UPS
kept the routers and servers up just fine during the 2.5 hour outage,
but one of the outlets that supported our smartjack boxes is apparantly
not a generator backed circuit.  This was the first outage since I've
been here.  One of the guys trying to help while I looked into why the
T1s were down from the router perspective, unplugged the telco fiber
cabinet thinking that it was the one not on a generator backed circuit.

Power to the cabinet had already been down during the wait for the
generator to pick up the load.  It was assumed that the cabinet had
enough batteries to take care of itself for 30 seconds.  I don't have
proof that the cabinet ever lost power, but it's possible.  I think it
is likely due to the fact that all our ADSL PPPoE sessions on the DS3
from that cabinet were dropped.

At that point, no one was able to authenticate.  Even our radius
authentication for management access to IOS would not authenticate.
Reloading the RADIUS server did not fix the problem so I punted and
reloaded the router, after doing a "write".

When the router came back up, users were able to authenticate and we
thought everything was hunky dory.  We had a few users still reporting
difficulty authenticating, but power-cycleing there CPE fixed those
that reported problems Friday afternoon.  

All, with the exception of one company using a PIX.  That user was
authenticating three times per minute but unable to keep the connection
for even a second.  They had a Zywall which had been pulled from service
because they wanted a "business class router" sitting on the shelf next
to the PIX.  When we swapped it in for the PIX, their DSL connection
came up and everybody was happy.  So, we thought "hmm, PIX got fried."

This morning we walked into a hellstorm of complaints from the
businesses that had gone home before the power came up Friday.

One of them is using a PIX.  They have the same three authentications
per minute symptoms as the Friday place had.  These guys have more going
on behind the PIX than the Friday place and a router swap isn't going
to cut it.  So the techs go out and play with the windows client.  No
problem with the DSL line, just a funky post authentication delay. 

All over town, even the areas that didn't lose power are having similar
problems.  For the most part a power cycle and patience are able to
get them up.  But the users with routers are more likely to still have
problems.  We have one report that power cycling a linksys router didn't
fix the problem but hitting the "Disconnect" button on the PPPoE status
page and then hitting the "Connect" button made that site work.

I can't find anything that has changed on my aggragation box, Cisco
3661, from before the power event to after.  There are no changes in
RADIUS from before to after.  Well over 100 customers on the problem
aggregation box are able to get on easily and probably don't know there
was ever a problem.  At the end of Monday, we had 12 open problem
reports for this issue.  I have 200+ users on this box.  Not all of them
leave their CPE powered at all times.

This is a representative radius authentication sequence from the
aggregation router, Cisco 3661:

Nov 15 01:56:17 router-e0 1727136: 3d11h: RADIUS(00034691): Storing nasport 268566566 in rad_db
Nov 15 01:56:17 router-e0 1727137: 3d11h: RADIUS(00034691): Config NAS IP: 0.0.0.0
Nov 15 01:56:17 router-e0 1727138: 3d11h: RADIUS/ENCODE(00034691): acct_session_id: 265149
Nov 15 01:56:17 router-e0 1727139: 3d11h: RADIUS(00034691): sending
Nov 15 01:56:17 router-e0 1727140: 3d11h: RADIUS/ENCODE: Best Local IP-Address 10.61.218.5 for Radius-Server 10.61.218.2
Nov 15 01:56:17 router-e0 1727141: 3d11h: RADIUS(00034691): Send Access-Request to 10.61.218.2:1645 id 21745/59, len 80
Nov 15 01:56:17 router-e0 1727142: 3d11h: RADIUS:  authenticator C1 CB 72 5C 6E 0B C8 EE - A0 3A 44 8E 1D 97 00 26
Nov 15 01:56:17 router-e0 1727143: 3d11h: RADIUS:  User-Name           [1]   13  "domain.net"
Nov 15 01:56:17 router-e0 1727144: 3d11h: RADIUS:  User-Password       [2]   18  
Nov 15 01:56:17 router-e0 1727145: *
Nov 15 01:56:17 router-e0 1727146: 3d11h: RADIUS:  NAS-Port-Type       [61]  6   Virtual                   [5]
Nov 15 01:56:17 router-e0 1727147: 3d11h: RADIUS:  NAS-Port            [5]   6   268566566                 
Nov 15 01:56:17 router-e0 1727148: 3d11h: RADIUS:  Connect-Info        [77]  5   "dsl"
Nov 15 01:56:17 router-e0 1727149: 3d11h: RADIUS:  Service-Type        [6]   6   Outbound                  [5]
Nov 15 01:56:17 router-e0 1727150: 3d11h: RADIUS:  NAS-IP-Address      [4]   6   10.61.218.5              
Nov 15 01:56:17 router-e0 1727151: 3d11h: RADIUS: Received from id 21745/59 10.61.218.2:1645, Access-Reject, len 39
Nov 15 01:56:17 router-e0 1727152: 3d11h: RADIUS:  authenticator 67 A0 AD 3A AC 78 99 61 - 1E 4E 03 D1 6E 50 CB 8D
Nov 15 01:56:17 router-e0 1727153: 3d11h: RADIUS:  Reply-Message       [18]  19  
Nov 15 01:56:17 router-e0 1727154: 3d11h: RADIUS:   0D 0A 41 63 63 65 73 73 20 64 65 6E 69 65 64 0D  [??Access denied?]
Nov 15 01:56:17 router-e0 1727155: 3d11h: RADIUS:   0A                                               [?]
Nov 15 01:56:17 router-e0 1727156: 3d11h: RADIUS(00034691): Received from id 21745/59
Nov 15 01:56:17 router-e0 1727157: 3d11h: RADIUS/DECODE: Reply-Message fragments, 17, total 17 bytes
Nov 15 01:56:17 router-e0 1727158: 3d11h: RADIUS:  AAA Unsupported     [150] 10  
Nov 15 01:56:17 router-e0 1727159: 3d11h: RADIUS:   31 2F 30 2F 30 2F 32 2E                          [1/0/0/2.]
Nov 15 01:56:17 router-e0 1727160: 3d11h: RADIUS(00034691): Using existing nas_port 268566566
Nov 15 01:56:17 router-e0 1727161: 3d11h: RADIUS(00034691): Config NAS IP: 0.0.0.0
Nov 15 01:56:17 router-e0 1727162: 3d11h: RADIUS/ENCODE(00034691): acct_session_id: 265149
Nov 15 01:56:17 router-e0 1727163: 3d11h: RADIUS(00034691): sending
Nov 15 01:56:17 router-e0 1727164: 3d11h: RADIUS/ENCODE: Best Local IP-Address 10.61.218.5 for Radius-Server 10.61.218.2
Nov 15 01:56:17 router-e0 1727165: 3d11h: RADIUS(00034691): Send Access-Request to 10.61.218.2:1645 id 21745/60, len 97
Nov 15 01:56:17 router-e0 1727166: 3d11h: RADIUS:  authenticator 39 5D 5F 1F 9C 8F 47 80 - 2F 91 17 FD 75 75 37 EC
Nov 15 01:56:17 router-e0 1727167: 3d11h: RADIUS:  Framed-Protocol     [7]   6   PPP                       [1]
Nov 15 01:56:17 router-e0 1727168: 3d11h: RADIUS:  User-Name           [1]   24  "username at domain.net"
Nov 15 01:56:17 router-e0 1727169: 3d11h: RADIUS:  User-Password       [2]   18  *
Nov 15 01:56:17 router-e0 1727170: 3d11h: RADIUS:  NAS-Port-Type       [61]  6   Virtual                   [5]
Nov 15 01:56:17 router-e0 1727171: 3d11h: RADIUS:  NAS-Port            [5]   6   268566566                 
Nov 15 01:56:17 router-e0 1727172: 3d11h: RADIUS:  Connect-Info        [77]  5   "dsl"
Nov 15 01:56:17 router-e0 1727173: 3d11h: RADIUS:  Service-Type        [6]   6   Framed                    [2]
Nov 15 01:56:17 router-e0 1727174: 3d11h: RADIUS:  NAS-IP-Address      [4]   6   10.61.218.5              
Nov 15 01:56:17 router-e0 1727175: 3d11h: RADIUS: Received from id 21745/60 10.61.218.2:1645, Access-Accept, len 44
Nov 15 01:56:17 router-e0 1727176: 3d11h: RADIUS:  authenticator 9D D7 FF 4D 6F 08 70 3D - 59 FB 8B 90 DA 3E 35 3C
Nov 15 01:56:17 router-e0 1727177: 3d11h: RADIUS:  Service-Type        [6]   6   Framed                    [2]
Nov 15 01:56:17 router-e0 1727178: 3d11h: RADIUS:  Framed-Protocol     [7]   6   PPP                       [1]
Nov 15 01:56:17 router-e0 1727179: 3d11h: RADIUS:  Framed-IP-Netmask   [9]   6   255.255.255.255           
Nov 15 01:56:17 router-e0 1727180: 3d11h: RADIUS:  Framed-IP-Address   [8]   6   10.66.76.136              
Nov 15 01:56:17 router-e0 1727181: 3d11h: RADIUS(00034691): Received from id 21745/60
Nov 15 01:56:17 router-e0 1727182: 3d11h: RADIUS/ENCODE(00034691): Acct-session-id pre-pended with Nas Port = 1/0/0/2.38
Nov 15 01:56:17 router-e0 1727183: 3d11h: RADIUS(00034691): Using existing nas_port 268566566
Nov 15 01:56:18 router-e0 1727184: 3d11h: RADIUS(00034691): Config NAS IP: 0.0.0.0
Nov 15 01:56:18 router-e0 1727185: 3d11h: RADIUS/ENCODE: Best Local IP-Address 10.61.218.5 for Radius-Server 10.61.218.2
Nov 15 01:56:18 router-e0 1727186: 3d11h: RADIUS/ENCODE(00034691): Acct-session-id pre-pended with Nas Port = 1/0/0/2.38
Nov 15 01:56:18 router-e0 1727187: 3d11h: RADIUS(00034691): Using existing nas_port 268566566
Nov 15 01:56:18 router-e0 1727188: 3d11h: RADIUS(00034691): Config NAS IP: 0.0.0.0
Nov 15 01:56:18 router-e0 1727189: 3d11h: RADIUS/ENCODE: Best Local IP-Address 10.61.218.5 for Radius-Server 10.61.218.2
Nov 15 01:56:18 router-e0 1727190: 3d11h: RADIUS:  AAA Unsupported     [150] 10  
Nov 15 01:56:18 router-e0 1727191: 3d11h: RADIUS:   31 2F 30 2F 30 2F 32 2E                          [1/0/0/2.]

$ radlast -10 username
username   router   268566566 10.66.76.136    Tue Nov 15 02:00 - 02:00  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 02:00 - 02:00  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:59 - 01:59  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:59 - 01:59  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:59 - 01:59  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:58 - 01:58  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:58 - 01:58  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:58 - 01:58  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:57 - 01:57  (00:00)
username   router   268566566 10.66.76.136    Tue Nov 15 01:57 - 01:57  (00:00)

Here is a ppp negotiation debug session from the PIX:
PPP virtual access open, ifc = 0

Xmit Link Control Protocol pkt, Action code is: Config Request, len is: 6
tPkt dump: 05064affaac1
LCP Option: MAGIC_NUMBER, len: 6, data: 4affaac1

Rcvd Link Control Protocol pkt, Action code is: Config Request, len is: 14
Pkt dump: 010405d40304c02305064096ed14
LCP Option: Max_Rcv_Units, len: 4, data: 05d4
LCP Option: AUTHENTICATION_TYPES, len: 4, data: c023
LCP Option: MAGIC_NUMBER, len: 6, data: 4096ed14

Xmit Link Control Protocol pkt, Action code is: Config ACK, len is: 14
Pkt dump: 010405d40304c02305064096ed14
LCP Option: Max_Rcv_Units, len: 4, data: 05d4
LCP Option: AUTHENTICATION_TYPES, len: 4, data: c023
LCP Option: MAGIC_NUMBER, len: 6, data: 4096ed14

Rcvd Link Control Protocol pkt, Action code is: Config ACK, len is: 6
Pkt dump: 05064affaac1
LCP Option: MAGIC_NUMBER, len: 6, data: 4affaac1

Xmit Link Control Protocol pkt, Action code is: Echo Request, len is: 4
Pkt dump: 4affaac1

Rcvd Link Control Protocol pkt, Action code is: Echo Reply, len is: 4
Pkt dump: 4096ed14

PPP upap rcvd authen ack: 4096ed14
Rcvd IP Control Protocol pkt, Action code is: Config Request, len is: 6
Pkt dump: 030600000000
IPCP Option: Config IP, IP = 0.0.0.0

Xmit IP Control Protocol pkt, Action code is: Config Request, len is: 6
Pkt dump: 030600000000
IPCP Option: Config IP, IP = 0.0.0.0

Xmit IP Control Protocol pkt, Action code is: Config Request, len is: 6
Pkt dump: 030600000000
IPCP Option: Config IP, IP = 0.0.0.0

Xmit IP Control Protocol pkt, Action code is: Config Reject, len is: 6
Pkt dump: 030600000000
IPCP Option: Config IP, IP = 0.0.0.0

Rcvd IP Control Protocol pkt, Action code is: Config NAK, len is: 6
Pkt dump: 030641424c88
IPCP Option: Config IP, IP = 10.66.76.136

Xmit IP Control Protocol pkt, Action code is: Config Request, len is: 6
Pkt dump: 030641424c88
IPCP Option: Config IP, IP = 10.66.76.136

Rcvd IP Control Protocol pkt, Action code is: Config Request, len is: 0

Xmit IP Control Protocol pkt, Action code is: Config ACK, len is: 0

Rcvd IP Control Protocol pkt, Action code is: Config ACK, len is: 6
Pkt dump: 030641424c88
IPCP Option: Config IP, IP = 10.66.76.136
Could not determine remote IP address
Xmit IP Control Protocol pkt, Action code is: Termination Request, len is: 37
Pkt dump: 436f756c64206e6f742064657465726d696e652072656d6f74652049502061646472657373

PPP va close, device = 1

Xmit Link Control Protocol pkt, Action code is: Termination Request, len is: 15
Pkt dump: 50656572205465726d696e61746564

RADIUS users file:
username      Auth-Type = System
        Service-Type = Framed-User,
        Framed-Protocol = PPP,
        Framed-Netmask = 255.255.255.255,
        Framed-Address = 10.66.76.136

DEFAULT Auth-Type = System, Group = "dsl", Simultaneous-Use = 5
        Service-Type = Framed-User,
        Framed-Protocol = PPP,
        Framed-Netmask = 255.255.255.255,
        Ascend-Client-Primary-DNS = 10.61.218.2,
        Ascend-Client-Secondary-DNS = 10.61.218.3,
        Ascend-Client-Assign-DNS = DNS-Assign-Yes,
#       Ascend-Data-Filter = "ip in forward tcp est",
#       Ascend-Data-Filter = "ip in forward dstip 10.61.218.4/32",
#       Ascend-Data-Filter = "ip in drop tcp dstport = 25",
#       Ascend-Data-Filter = "ip in forward",
        Session-Timeout = 0,
        Port-Limit = 1,
        Framed-MTU = 1500

(The Ascend-Data-Filter lines were originally active.  I took them out
in an attempt to simplify during troubleshooting.)

I've munged the IPs, username, and domain name consistently.  If I've
given away my passwords in the encoded data, please let me know.

I would appreciate any insight you guys can provide.  This stuff is
still over my head.  

It only seems to affect our core town where the power outage occured.
The power event region was the East half of the town.  However the
complaints seem to be in the southern half of the town, in and out of
the power event region.  This router services towns all over the state
without, registerred, complaints from outside our town.  That and the
lack of config changes from working to non-working have me thinking it
could be a problem with the DSLAMs in the power affected areas.  Maybe
some of the DSLAMs service into the non-power affected region?

-- 
Scott Lambert                    KC5MLE                       Unix SysAdmin
lambert at lambertfam.org



More information about the cisco-bba mailing list