ASR9000 BNG debugging PPPoE sessions
Understanding how to debug PPPoE in IOS-XR for the ASR9K
A detailed step-by-step guide for debugging PPPoE sessions.
Understanding BNG Architecture
Before we start troubleshooting sessions it is important to understand the architecture and how things link together. The overview below shows the different components as they relate to PPP(oE).
In the next modules the various debugs will be elaborated on, along with some common things that generally go wrong.
1) A session is initiated in PPPoE by the reception of a PADI packet. This PADI is a broadcast packet hitting the control plane. If the session terminates on a physical (sub)interface, PPPoE is handled on the linecard; if the session terminates on a bundle (sub)interface, PPPoE is handled by the RSP.
2) The reception of a PADI triggers a “session-start” event in the control policy. During the session-start event we need to apply the dynamic template that holds at least the LCP-specific parameters for when the session continues to the PPP phase. We could also do “pre-authentication” here based on PPPoE tag information in the session-start event.
3) After we have sent the PADO to the client, and the client has selected us as BNG, a PADR is received, which triggers the session-activate event. In this event we need a template, either provided during session-start with NCP parameters (like unnumbered info), or more specific template info can be provided during this session-activate event. At this time the subscriber interface is created when we transmit the PADS, and we are now committed for 3 minutes to that subscriber session (in IOS, the NCP timeout).
4) The session now moves to the PPP LCP phase and will try to complete LCP based on the LCP parameters defined in the dynamic-template provided during the session-start event.
5) During LCP we generally negotiate some sort of authentication protocol, and when we do, we enter the authentication phase. While the control policy is still in the session-activate state we start our CHAP or PAP exchange to retrieve the user credentials.
6) Even if CHAP/PAP is not negotiated we can still do an authorization request, but probably not on username/password, given the absence of identity retrieval by PPP authentication. This authen action is defined in the session-start event of the control policy.
7) In XR, in the absence of local authentication capabilities, we need to use RADIUS (or, less commonly, TACACS) to initiate the access-request.
8) The RADIUS interaction will give us a response, or none when it times out. In either case an event is triggered in the policy plane again, which we can act on to provide further directives, or we can “live with” the response RADIUS has given us.
9) The events triggered are authen-failure (access-reject) or authen-no-response (timeout from our radius-server/list). If the response is a success/accept, we continue the session and start NCP.
10) IPCP is started, and when it completes the route is installed.
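The ten steps above can be summarised as a small lookup table. The following is only an illustrative sketch, not how IOS-XR implements it: the stage names, the input labels such as "PADS-sent" or "lcp-open", and the `walk` helper are all hypothetical. Only the packet names and the control policy event names come from the steps above.

```python
from enum import Enum


class Stage(Enum):
    # Stage names are invented for this sketch.
    IDLE = "idle"
    SESSION_START = "session-start"
    SESSION_ACTIVATE = "session-activate"
    LCP = "lcp"
    AUTH = "authentication"
    POLICY_DECISION = "policy decides next action"
    NCP = "ncp"
    UP = "up"


# (current stage, input) -> (next stage, control policy event raised or None)
TRANSITIONS = {
    (Stage.IDLE, "PADI"): (Stage.SESSION_START, "session-start"),
    (Stage.SESSION_START, "PADR"): (Stage.SESSION_ACTIVATE, "session-activate"),
    (Stage.SESSION_ACTIVATE, "PADS-sent"): (Stage.LCP, None),
    (Stage.LCP, "lcp-open"): (Stage.AUTH, None),
    (Stage.AUTH, "access-accept"): (Stage.NCP, None),
    (Stage.AUTH, "access-reject"): (Stage.POLICY_DECISION, "authen-failure"),
    (Stage.AUTH, "radius-timeout"): (Stage.POLICY_DECISION, "authen-no-response"),
    (Stage.NCP, "ipcp-open"): (Stage.UP, None),
}


def walk(inputs):
    """Replay a list of inputs and return (input, stage, event) per step.

    An input that is not expected in the current stage raises KeyError.
    """
    stage = Stage.IDLE
    trace = []
    for item in inputs:
        stage, event = TRANSITIONS[(stage, item)]
        trace.append((item, stage, event))
    return trace


if __name__ == "__main__":
    happy_path = ["PADI", "PADR", "PADS-sent", "lcp-open", "access-accept", "ipcp-open"]
    for item, stage, event in walk(happy_path):
        print(f"{item:14} -> {stage.name:16} event={event}")
```

When a session fails to come up, working out which row of such a table was the last one reached tells you which debug to look at next.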
This document will focus on debugging and understanding each of these components in depth.
This picture above shows the 4 important packets from PPPoE.
The PADI is a broadcast packet, the PADO is a packet with SMAC of the BNG and the DMAC of the subscriber.
The subscriber will send a unicast PADR packet with the DMAC of the BNG it wants to establish a session with, followed by a PADS from the BNG with the pppoe session id.
This session id is unique for the segment and is to be used for EVERY packet that is sent forward from this session.
Note: The BNG will verify the SMAC, arriving access interface and pppoe session ID to prevent spoofing
The session ID is part of the PPPoE header that is slapped on to the packet. Together with the PPP protocol ID this adds 8 bytes (hence the 1492 maximum MTU size when PPPoE encap is used).
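The 1492 figure follows from simple arithmetic. As a minimal sketch, assuming a standard 1500-byte Ethernet payload (no jumbo frames), the 8 bytes of overhead break down into the 6-byte PPPoE header and the 2-byte PPP protocol ID. The constant and function names are made up for the example.

```python
ETHERNET_PAYLOAD_MAX = 1500  # assumption: standard Ethernet MTU, no jumbo frames
PPPOE_HEADER_LEN = 6         # ver/type (1) + code (1) + session id (2) + length (2)
PPP_PROTOCOL_ID_LEN = 2      # e.g. 0xc021 for LCP, 0x0021 for IPv4


def pppoe_max_mtu(ethernet_mtu=ETHERNET_PAYLOAD_MAX):
    """Largest L3 packet that fits once the PPPoE and PPP overhead is subtracted."""
    return ethernet_mtu - PPPOE_HEADER_LEN - PPP_PROTOCOL_ID_LEN


if __name__ == "__main__":
    print(pppoe_max_mtu())  # 1492
```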
PPPoE header explained:
MAC header: This is a standard ethernet II header with ethertype 0x8863 for PPPoE control (PADx messages) and 0x8864 for PPPoE data packets, including PPP/LCP.
Although the protocol is not shown here because debug pppoe protocol is off, we can tell that this is a PADI because the packet has a broadcast destination on L2 and the packet starts with 1109, the “09” being the PADI code.
The added 0’s are also expected here, because the PADx packets are very small and padding has to be added to them.
The LEN provided (46) is the Ethernet payload length: 46 bytes is the minimum Ethernet payload, which is why the padding is needed.
The PPPoE length is the 2-byte field that follows the 2-byte session ID.
RP/0/RSP0/CPU0:Sep1 15:37:36.372 : pppoe_ma: Bundle-Ether100.100: O dst 0019.2f43.9a38 src b4a4.e392.208b: len 21 0x11070000000f010100000102000
RP/0/RSP0/CPU0:Sep1 15:37:36.374 : pppoe_ma: Bundle-Ether100.100: I dst b4a4.e392.208b src 0019.2f43.9a38: len 46 0x11190000000f0102000741394b2
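The hex dump at the end of these debug lines can be decoded by hand, or with a few lines of script. The sketch below is not a Cisco tool and `parse_pppoe_header` is a made-up helper. It reads only the first 6 bytes (the PPPoE header), because the dumps in the debug are truncated.

```python
import struct

# PPPoE discovery codes; 0x00 is used for session (data) packets.
PPPOE_CODES = {
    0x09: "PADI",
    0x07: "PADO",
    0x19: "PADR",
    0x65: "PADS",
    0xA7: "PADT",
    0x00: "SESSION",
}


def parse_pppoe_header(hex_dump):
    """Decode the 6-byte PPPoE header from a debug hex dump such as '0x1107...'."""
    text = hex_dump[2:] if hex_dump.lower().startswith("0x") else hex_dump
    if len(text) < 12:
        raise ValueError("need at least 6 bytes (12 hex digits) for a PPPoE header")
    ver_type, code, session_id, length = struct.unpack("!BBHH", bytes.fromhex(text[:12]))
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "code": PPPOE_CODES.get(code, hex(code)),
        "session_id": session_id,
        "payload_len": length,
    }


if __name__ == "__main__":
    print(parse_pppoe_header("0x11070000000f010100000102000"))  # outgoing packet above
    print(parse_pppoe_header("0x11190000000f0102000741394b2"))  # incoming packet above
```

For the outgoing packet this gives code PADO, session ID 0 and a PPPoE payload length of 15, which together with the 6-byte header matches the "len 21" in the debug. The incoming packet is a PADR with the same payload length, padded to "len 46".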
RP/0/RSP0/CPU0:Sep 1 15:51:23.188 : pppoe_ma: Session: Bundle-Ether100.100.pppoe7889: Received AAA Session Up Complete cb
RP/0/RSP0/CPU0:Sep 1 15:51:23.189 : pppoe_ma: Session: Received AAA batch end notication
This command is not necessarily that useful for normal troubleshooting, but it does signify the MTU being set on the interface and the session creation from PPPoE internally.
Common things to watch out for:
PPPoE sessions can fail to establish for various reasons. Here are some gotchas that I have run into many, many times:
Service name matching: is the pppoe service name matching what the BBA group is configured for?
Control plane policing
Incorrect vlan (combo)
PPP consists of 3 phases, LCP, Authentication and NCP in order to establish a session allowing the transport of L3 protocols over it.
During LCP initial link parameters are agreed on such as MRU (like MTU), authentication protocol. If an Authentication protocol is agreed on, then we enter the Auth phase in which credentials are exchanged that likely are being handed off to radius for verification.
After that the NCP phase starts, which allows us to establish an L3 protocol connection, such as IPCP for IPv4 or OSICP for CLNS. The A9K supports only IPCP and soon IPV6CP (with XR 4.3.0).
During LCP and NCP the exchange of options is done via the following packets. Each of these packets contains options.
Configure-Request (CONFREQ): A packet sent with the options that one side of the PPP connection would like to propose.
Configure-Ack (CONFACK): A response to a request indicating that all options in the request are acceptable.
Configure-Reject (CONFREJ): A response to a request indicating that an option can’t be honoured (e.g. one side does MLP and the other side does not, so we reject the option).
Configure-Nak (CONFNAK): A response to a request indicating that the options in the NAK packet can be fulfilled, but the proposed option value is not desirable. E.g. one side proposes CHAP authentication and the other side can only do PAP.
Requests and responses are linked together by a field in the packet called the “ID”.
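To illustrate how the ID ties a response back to a request, here is a minimal sketch. It assumes the standard PPP packet header (1-byte code, 1-byte ID, 2-byte length) and the standard configure codes 1 to 4. The `PendingRequests` class is hypothetical and only exists for this example.

```python
import struct

CONFIG_CODES = {1: "CONFREQ", 2: "CONFACK", 3: "CONFNAK", 4: "CONFREJ"}


def parse_header(packet):
    """Return (packet name, id, length) from the first 4 bytes of an LCP/NCP packet."""
    code, ident, length = struct.unpack("!BBH", packet[:4])
    return CONFIG_CODES.get(code, f"code-{code}"), ident, length


class PendingRequests:
    """Tracks the CONFREQs we sent so a response can be matched on its ID."""

    def __init__(self):
        self._pending = {}

    def sent(self, ident, options):
        self._pending[ident] = options

    def received(self, packet):
        name, ident, _ = parse_header(packet)
        if name == "CONFREQ":
            # The peer's own request: it uses the peer's ID space, not ours.
            return None
        # Options are None when the ID does not match anything we sent.
        return name, self._pending.pop(ident, None)


if __name__ == "__main__":
    pending = PendingRequests()
    pending.sent(1, ["MRU 1492", "AuthProto CHAP"])
    print(pending.received(bytes([2, 1, 0, 4])))  # CONFACK for our ID 1
```

Note that each side numbers its own requests, so seeing an incoming CONFREQ with ID 1 while our own CONFREQ with ID 1 is outstanding is normal.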
After the session is established, LCP echo requests, a keepalive mechanism, are exchanged and have to be acknowledged.
Keepalives from the peer are responded to by the 9k in hardware. The 9k also originates keepalives, which have to be responded to by the peer.
Some helpful hints explaining the debugs from the component
Command: debug ppp negotiation
RP/0/RSP0/CPU0:Sep 15 12:43:18.244 : PPP-MA: LCP: Bundle-Ether100.100.pppoe1: [Initial]: Up Event
RP/0/RSP0/CPU0:Sep 15 12:43:18.248 : PPP-MA: LCP: Bundle-Ether100.100.pppoe1: [Initial]: Change to state Closed
At the start of this debug the location is RSP0, because PPP for bundle sessions (as indicated by the interface name) is terminated by the RSP. Sessions terminating on a physical interface are terminated on the linecard.
I’ll remove the node and interface from the debug lines moving forward to save space.
In each line, identify the protocol (LCP), the LCP state itself in brackets [ ], the direction of the packet (O for output, I for input), the packet type (e.g. CONFREQ) and the ID. At this point we expect a response from the peer on our ID 1.
.362 LCP: [Req-Sent]:EndpointDisc 1 Local (0x13070174657374)
We have moved to state Req-Sent, and we are getting an incoming configure request, also with ID 1. We are expected to send a response to ID 1 now as well. Note that I deliberately let the client request the MRRU and EndpointDisc options, used for multilink, which we don’t support on the 9k.
.363 LCP: Peer's MRRU: Option received on non-MP interface (reject)
.363 LCP: Peer's ED: Option received on non-MP interface (reject)
PPP already identifies the unsupported options.
.363 LCP: [Req-Sent]: O CONFREJ id 1 len 15
.364 LCP: [Req-Sent]:MRRU 1524 (0x110405f4)
.364 LCP: [Req-Sent]:EndpointDisc 1 Local (0x13070174657374)
Here we are sending a REJECT on ID 1, which is a response to the peer's request with ID 1.
We are rejecting the options for MLP. We should expect a new request to come in from the peer with a new offer. Hopefully, the peer is not persistent in its multilink request. If it is, and requests it again, we reject it again. How long can that go on? That is governed by the ppp max-configure configuration knob, default is 3.
After 3 attempts to configure the link we will terminate the ppp session.
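When a capture of these debugs gets long, it can help to pull out just the packet lines. The regular expression below is only a sketch written against the abbreviated line format used in this document (for example ".363 LCP: [Req-Sent]: O CONFREJ id 1 len 15"). It is an assumption that other releases print the same layout, and `parse_packet_line` is a made-up helper.

```python
import re

PACKET_LINE = re.compile(
    r"(?P<protocol>[A-Z0-9]+): "   # LCP, IPCP, ...
    r"(?:\S+: )?"                  # optional interface name, if it was not stripped
    r"\[(?P<state>[^\]]+)\]:\s*"   # FSM state in brackets
    r"(?P<direction>[IO]) "        # I = input, O = output
    r"(?P<packet>[A-Z]+) "         # CONFREQ, CONFACK, CONFNAK, CONFREJ, ...
    r"id (?P<id>\d+) len (?P<len>\d+)"
)


def parse_packet_line(line):
    """Return the fields of a packet line, or None for option or state-change lines."""
    match = PACKET_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["id"] = int(fields["id"])
    fields["len"] = int(fields["len"])
    return fields


if __name__ == "__main__":
    print(parse_packet_line(".363 LCP: [Req-Sent]: O CONFREJ id 1 len 15"))
    print(parse_packet_line(".364 LCP: [Req-Sent]:MRRU 1524 (0x110405f4)"))  # None
```

Grouping the parsed lines by direction and ID makes a repeating CONFREQ/CONFREJ loop easy to spot.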
.373 LCP: [Req-Sent]: I CONFACK id 1 len 19
.373 LCP: [Req-Sent]:MRU 1492 (0x010405d4)
.373 LCP: [Req-Sent]:AuthProto CHAP (0x0305c22305)
The peer did like our proposal and acknowledged it. If the peer didn’t like our CHAP authentication protocol but can do PAP, it would have sent us a PPP CONFNAK packet with the authproto option in it. This would tell us to switch auth protocols IF we are configured for that.
If not, and we are persistent on CHAP, we also end up in a deadlock, and after the default of 3 attempts to configure we drop the session.
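The hex value printed in parentheses after each option in the debug is the raw option as type, length and value. The sketch below decodes only the four option types that appear in the debugs above (MRU, AuthProto, MRRU, EndpointDisc). `decode_lcp_option` is a hypothetical helper and the returned tuple layout is invented for the example.

```python
import struct

AUTH_PROTOCOLS = {0xC023: "PAP", 0xC223: "CHAP"}


def decode_lcp_option(hex_text):
    """Decode one LCP option from the hex shown in the debug, e.g. '0x010405d4'."""
    text = hex_text[2:] if hex_text.lower().startswith("0x") else hex_text
    raw = bytes.fromhex(text)
    opt_type, opt_len = raw[0], raw[1]
    if opt_len != len(raw):
        raise ValueError("option length field does not match the data")
    data = raw[2:]
    if opt_type == 1:    # Maximum-Receive-Unit
        return "MRU", struct.unpack("!H", data)[0]
    if opt_type == 3:    # Authentication-Protocol (CHAP carries an extra algorithm byte)
        proto = struct.unpack("!H", data[:2])[0]
        return "AuthProto", AUTH_PROTOCOLS.get(proto, hex(proto))
    if opt_type == 17:   # Multilink MRRU
        return "MRRU", struct.unpack("!H", data)[0]
    if opt_type == 19:   # Multilink Endpoint Discriminator: class byte + address
        return "EndpointDisc", (data[0], data[1:])
    return f"type-{opt_type}", data


if __name__ == "__main__":
    for option in ("0x010405d4", "0x0305c22305", "0x110405f4", "0x13070174657374"):
        print(option, decode_lcp_option(option))
```

Running this on the values from the debug gives MRU 1492, AuthProto CHAP, MRRU 1524 and an endpoint discriminator of class 1 with the address bytes spelling "test".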
The client can also request a specific address, but if we ACK that request it effectively becomes a semi-static IP address. Many providers therefore prefer to renegotiate the address: they always send a NAK on the address option and provide a new address from the pool.
When the secret key is misconfigured, the authenticator in the RADIUS header will be wrong. Some RADIUS servers then always reply with a reject, while others silently discard the message.
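To see why a wrong secret shows up as a bad authenticator, recall that the authenticator of a RADIUS reply is an MD5 hash over the reply header, the authenticator of the original request, the reply attributes and the shared secret. The following is a minimal sketch of that check. The function names and the sample secrets are made up, and this is not how the router implements it.

```python
import hashlib
import hmac
import struct


def response_authenticator(code, ident, attributes, request_auth, secret):
    """MD5(Code + ID + Length + RequestAuth + Attributes + Secret)."""
    length = 20 + len(attributes)  # 20-byte RADIUS header plus attributes
    header = struct.pack("!BBH", code, ident, length)
    return hashlib.md5(header + request_auth + attributes + secret).digest()


def reply_is_authentic(reply, request_auth, secret):
    """Check the authenticator of a reply against our locally configured secret."""
    code, ident, length = struct.unpack("!BBH", reply[:4])
    received = reply[4:20]
    attributes = reply[20:length]
    expected = response_authenticator(code, ident, attributes, request_auth, secret)
    return hmac.compare_digest(received, expected)


if __name__ == "__main__":
    request_auth = bytes(range(16))  # normally 16 random bytes from the access-request
    # Build an Access-Reject (code 3) the way a server holding the secret "right" would.
    auth = response_authenticator(3, 7, b"", request_auth, b"right")
    reply = struct.pack("!BBH", 3, 7, 20) + auth
    print(reply_is_authentic(reply, request_auth, b"right"))  # True
    print(reply_is_authentic(reply, request_auth, b"wrong"))  # False
```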
RADIUS packets are sent with an 8-bit ID, which is not enough at large scale, so we extend it with the source port. Is show radius showing bad authenticators? Then you likely have ID overload. Make sure your RADIUS server supports extended source ports.
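The effect of the 8-bit ID is easy to quantify: only 256 requests can be outstanding per source port, so every extra source port adds another 256. The allocator below is purely illustrative. The class name and the port numbers are hypothetical, and it does not describe the actual XR implementation.

```python
from collections import deque


class RadiusIdAllocator:
    """Hands out (source port, ID) pairs; each port has its own 8-bit ID space."""

    IDS_PER_PORT = 256  # the RADIUS identifier field is one byte

    def __init__(self, source_ports):
        self._free = {port: deque(range(self.IDS_PER_PORT)) for port in source_ports}

    def capacity(self):
        return self.IDS_PER_PORT * len(self._free)

    def allocate(self):
        for port, ids in self._free.items():
            if ids:
                return port, ids.popleft()
        raise RuntimeError("ID overload: no free ID on any source port")

    def release(self, port, ident):
        # Call this once the reply arrived or the request finally timed out.
        self._free[port].append(ident)


if __name__ == "__main__":
    allocator = RadiusIdAllocator([50000, 50001])  # hypothetical source ports
    print(allocator.capacity())                    # 512 outstanding requests
    for _ in range(256):
        allocator.allocate()
    print(allocator.allocate())                    # (50001, 0): the second port takes over
```

If the server keys its duplicate detection on the ID alone and ignores the source port, replies can be matched to the wrong request, which is one way to end up with bad authenticators.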
Are the reply items set properly for the session type? For instance, IP sessions do not come up when the service-type is set to framed or outbound.
When we attempt to apply a feature that cannot be applied, such as an ACL that doesn’t exist, should we reject the session or keep it alive unrestricted?
NOTE: The radius-profile for an IPoE session should NOT include any service-type. Having Framed-User as service type in an access accept for an IP subscriber will cause the session to fail.
PPP/PPPoE Control Policy Events
The format of the debug is:
NODE : Time : <process>[pid] : [PACKET-direction] Interface mac-addr