Cisco Support Community

ASR9000 BNG debugging PPPoE sessions

Introduction

Understanding how to debug PPPoE in IOS-XR for the ASR9K

Problem Description

A detailed step-by-step guide to debugging PPPoE sessions.

Understanding BNG Architecture

Before we start troubleshooting sessions, it is important to understand how the components link together. The overview below shows the different components as they relate to PPP(oE).

The following sections elaborate on the various debugs and on some common things that go wrong.

policy-arch.JPG

1) A PPPoE session is initiated by the reception of a PADI packet. The PADI is a broadcast packet that hits the control plane. If the session terminates on a physical (sub)interface, PPPoE is handled on the linecard; if the session terminates on a bundle (sub)interface, PPPoE is handled by the RSP.

2) The reception of a PADI triggers a “session-start” event in the control policy. During the session-start event we need to apply the dynamic template that holds at least the LCP-specific parameters for when the session continues to the PPP phase. We could also do “pre-authentication” here, based on the PPPoE tag information available in the session-start event.

3) After we have sent the PADO to the client and the client has selected us as its BNG, a PADR is received, which triggers the session-activate event. In this event we need a template with NCP parameters (such as unnumbered info); this can be the template provided during session-start, or more specific template info can be provided during the session-activate event. The subscriber interface is created when we transmit the PADS, and we are then committed to that subscriber session for 3 minutes (in IOS, the NCP timeout).

4) The session now moves to the PPP LCP phase and tries to complete LCP based on the LCP parameters defined in the dynamic template provided during the session-start event.

5) During LCP we generally negotiate some sort of authentication protocol, and when we do, we enter the authentication phase. While the control policy is still in the session-activate state, we start our CHAP or PAP exchange to retrieve the user credentials.

6) Even if CHAP/PAP is not negotiated we can still do an authorization request, but probably not on username/password, because PPP authentication has not retrieved an identity. This authentication action is defined in the session-start event of the control policy.

7) In XR, in the absence of local authentication capabilities, we need to use RADIUS (or, less commonly, TACACS) to initiate the access-request.

8) The RADIUS interaction gives us a response, or none when it times out. In either case an event is triggered in the policy plane again, which we can act on to provide further directives, or we can “live with” the response RADIUS has given us.

9) The events triggered are authen-failure (access-reject) or authen-no-response (timeout from our radius-server/list). If the response is a success/accept, we continue the session and start NCP.

10) IPCP is started, and when it completes the route is installed.
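The steps above can be summarised as a lookup from trigger to control-policy event. The Python sketch below is only an illustration of that flow: the event names (session-start, session-activate, authen-failure, authen-no-response) come from the text, while the trigger strings and the helper name next_event are invented for this example and are not IOS-XR identifiers.

```python
# Illustrative only: the trigger strings are invented for this sketch.
CONTROL_POLICY_EVENTS = {
    "PADI received": "session-start",
    "PADR received": "session-activate",
    "RADIUS access-reject": "authen-failure",
    "RADIUS timeout": "authen-no-response",
}


def next_event(trigger):
    """Return the control-policy event raised by a trigger.

    Returns None when the trigger is not in the table, for example an
    access-accept, after which the session simply continues to NCP.
    """
    return CONTROL_POLICY_EVENTS.get(trigger)


for trigger in ("PADI received", "PADR received", "RADIUS access-accept"):
    print(f"{trigger:22} -> {next_event(trigger)}")
```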

This document focuses on debugging and understanding each of these components in depth.

PPPoE

pppoe-disc.JPG

The picture above shows the 4 important packets of PPPoE discovery.

The PADI is a broadcast packet. The PADO is a unicast packet with the SMAC of the BNG and the DMAC of the subscriber.

The subscriber then sends a unicast PADR packet with the DMAC of the BNG it wants to establish a session with, followed by a PADS from the BNG carrying the PPPoE session ID.

This session ID is unique for the segment and is to be used in EVERY subsequent packet sent on this session.

Note: The BNG verifies the SMAC, the arriving access interface and the PPPoE session ID to prevent spoofing.

The session ID is part of the 6-byte PPPoE header that is prepended to the packet. Together with the 2-byte PPP protocol field that follows it, this adds 8 bytes of overhead (hence the 1492 maximum MTU size when PPPoE encapsulation is used).
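The 1492 figure can be checked with simple arithmetic. This is a minimal sketch, assuming a standard 1500-byte Ethernet payload; the 8 bytes of overhead are the 6-byte PPPoE header (version/type, code, session ID, length) plus the 2-byte PPP protocol field. The constant names are made up for this example.

```python
ETHERNET_PAYLOAD_MTU = 1500  # assumed standard Ethernet payload size
PPPOE_HEADER_LEN = 6         # version/type (1) + code (1) + session ID (2) + length (2)
PPP_PROTOCOL_ID_LEN = 2      # PPP protocol field that follows the PPPoE header

pppoe_mtu = ETHERNET_PAYLOAD_MTU - PPPOE_HEADER_LEN - PPP_PROTOCOL_ID_LEN
print(pppoe_mtu)  # 1492
```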

PPPoE header explained:


| MAC header | PPPoE header | Data ::: |

MAC header: This is a standard ethernet II header with ethertype 0x8863 for PPPoE control (PADx messages) and 0x8864 for PPPoE data packets, including PPP/LCP.

PPPoE header:


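 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
+-----------+-----------+-----------------------+-----------------------------------------------+
|  Version  |   Type    |         Code          |                  Session ID                   |
+-----------+-----------+-----------------------+-----------------------------------------------+
|                    Length                     |                   Data :::                    |
+-----------------------------------------------+-----------------------------------------------+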

Version. 4 bits.
Protocol version. Must be set to 1.

Type. 4 bits.
Must be set to 1.

Code. 8 bits.
Identifies the packet type, for example 0x09 for a PADI.

Session ID. 16 bits, unsigned.

Length. 16 bits.
Size of the Data field in bytes.

Data. Variable length.
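As a rough illustration of the field layout above, the following Python sketch unpacks the 6 header bytes. The function name parse_pppoe_header is hypothetical, and the code table only contains the five discovery codes listed in this article. The sample bytes are the first 6 bytes of the PADS dump shown later in this article.

```python
import struct

# Discovery codes as listed in the packet type list of this article.
PPPOE_CODES = {0x09: "PADI", 0x07: "PADO", 0x19: "PADR", 0x65: "PADS", 0xA7: "PADT"}


def parse_pppoe_header(raw):
    """Decode the 6-byte PPPoE header found at the start of raw (bytes)."""
    if len(raw) < 6:
        raise ValueError("a PPPoE header needs at least 6 bytes")
    # Network byte order: 1 byte version/type, 1 byte code,
    # 2 bytes session ID, 2 bytes length.
    ver_type, code, session_id, length = struct.unpack("!BBHH", raw[:6])
    return {
        "version": ver_type >> 4,      # upper 4 bits
        "type": ver_type & 0x0F,       # lower 4 bits
        "code": PPPOE_CODES.get(code, hex(code)),
        "session_id": session_id,
        "length": length,
    }


header = parse_pppoe_header(bytes.fromhex("11651ed00004"))
print(header)
```

For these bytes the session ID decodes to 7888 (0x1ed0), which is the number that appears in the subscriber interface name.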

Debugging PPPoE

The first step in debugging is to evaluate the PPPoE protocol and packets.

Command: debug pppoe protocol

Provides basic packet-level debugging from a PPPoE protocol point of view.

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

0019.2f43.9a38: this is the subscriber mac-address

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]:    vlan-id-outer 100

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]:    Service-name:

RP/0/RSP0/CPU0:Sep  1 15:27:09.633 : pppoe_ma[346]: [PADO-Sent]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

RP/0/RSP0/CPU0:Sep  1 15:27:09.633 : pppoe_ma[346]: [PADO-Sent]:    vlan-id-outer 100

RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]:    vlan-id-outer 100

RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]:    Service-name:

RP/0/RSP0/CPU0:Sep  1 15:27:09.853 : pppoe_ma[346]: [PADS-Sent]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

RP/0/RSP0/CPU0:Sep  1 15:27:09.853 : pppoe_ma[346]: [PADS-Sent]:    vlan-id-outer 100

The  format of the debug is:

NODE : Time : <process>[pid] : [PACKET-direction] Interface mac-addr

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

In this example the PPPoE session completed perfectly, although little information about the packet contents is provided.

Things to look for are:

  • Is PPPoE completing properly, up to a PADS?
  • Is the interface shown the expected one?
  • Is the MAC address the expected one?
  • Is the VLAN (stack) what we’d expect?
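The checks above can be partly automated when a lot of debug output has to be scanned. The following Python sketch is not a Cisco tool; it assumes the line format shown in this section, and the names LINE_RE, discovery_progress and completed are made up for illustration.

```python
import re

# Matches the first line of each packet in 'debug pppoe protocol' output, e.g.
# ... pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38
LINE_RE = re.compile(
    r"pppoe_ma\[\d+\]: \[(?P<packet>PAD[IORST])-(?P<direction>Recv|Sent)\]: "
    r"(?P<interface>\S+) peer-mac (?P<mac>[0-9a-f.]+)"
)

EXPECTED = ["PADI-Recv", "PADO-Sent", "PADR-Recv", "PADS-Sent"]


def discovery_progress(log_text):
    """Return {(interface, mac): [packet-direction, ...]} in log order."""
    seen = {}
    for match in LINE_RE.finditer(log_text):
        key = (match["interface"], match["mac"])
        seen.setdefault(key, []).append(f'{match["packet"]}-{match["direction"]}')
    return seen


def completed(steps):
    """True when the four discovery packets appear in order (retries allowed)."""
    remaining = iter(steps)
    # 'step in remaining' consumes the iterator up to the first match,
    # so this checks that EXPECTED is a subsequence of steps.
    return all(step in remaining for step in EXPECTED)


sample = """\
RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38
RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]:    vlan-id-outer 100
RP/0/RSP0/CPU0:Sep  1 15:27:09.633 : pppoe_ma[346]: [PADO-Sent]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38
RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38
RP/0/RSP0/CPU0:Sep  1 15:27:09.853 : pppoe_ma[346]: [PADS-Sent]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38
"""

progress = discovery_progress(sample)
for (interface, mac), steps in progress.items():
    state = "complete" if completed(steps) else f"stalled after {steps[-1]}"
    print(interface, mac, state)
```

A subscriber that only ever shows PADI-Recv and PADO-Sent would be reported as stalled after PADO-Sent, which points at the client or the path back to it.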

Command: debug pppoe packet

Provides the detailed packet contents from PPPoE.

PADI

RP/0/RSP0/CPU0:Sep  1 15:37:36.372 : pppoe_ma[346]: Bundle-Ether100.100: I dst ffff.ffff.ffff src 0019.2f43.9a38: len 46 0x11090000000401010000000000000000000000000000000000000000000000000000000000000000000000000000

Although the packet name is not shown here because debug pppoe protocol is off, we can tell that this is a PADI: the L2 destination is broadcast, and the packet starts with 0x1109, where 0x11 is the version/type byte and 0x09 is the PADI code.

The trailing zeros are expected as well: PADx packets are very small, so padding has to be added to them.

The LEN provided (46) is the length of the Ethernet payload (PPPoE header, tags and padding), here padded to the 46-byte Ethernet minimum.

The PPPoE length field (highlighted in orange in the original post) is the 0x0004 that follows the all-zero session ID.

PADO

RP/0/RSP0/CPU0:Sep  1 15:37:36.372 : pppoe_ma[346]: Bundle-Ether100.100: O dst 0019.2f43.9a38 src b4a4.e392.208b: len 21 0x11070000000f010100000102000741394b2d424e47

PADR

RP/0/RSP0/CPU0:Sep  1 15:37:36.374 : pppoe_ma[346]: Bundle-Ether100.100: I dst b4a4.e392.208b src 0019.2f43.9a38: len 46 0x11190000000f0102000741394b2d424e470101000000000000000000000000000000000000000000000000000000

PADS

RP/0/RSP0/CPU0:Sep  1 15:37:36.591 : pppoe_ma[346]: Bundle-Ether100.100: O dst 0019.2f43.9a38 src b4a4.e392.208b: len 10 0x11651ed0000401010000


The assigned PPPoE session ID (highlighted in red in the original post) is 0x1ed0. The subscriber interface that is created should carry this number in decimal, in this case Bundle-Ether100.100.pppoe7888.

Packet type list:

Type   Code   Direction

PADI   0x09   In only

PADO   0x07   Out only

PADR   0x19   In only

PADS   0x65   Out only

PADT   0xa7   In and Out

Things to look for are:

  • Are the packet types the expected ones?
  • Do the packet lengths match the actual content?
  • Do the packets travel in the direction they are meant to be sent?
  • Is the MAC address the expected one?
  • Is the VLAN (stack) what we’d expect?
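To check whether the lengths match the content, the hex dump can be decoded offline. This is a minimal Python sketch, not part of IOS-XR: the function name decode_dump is hypothetical, only the two tag types that appear in the dumps above are named, and it assumes the reported len covers the PPPoE header, the tags and any padding.

```python
import struct

# Only the tag types seen in the dumps above; other tags print as hex.
TAG_NAMES = {0x0101: "Service-Name", 0x0102: "AC-Name"}


def decode_dump(reported_len, hex_dump):
    """Decode a 'debug pppoe packet' hex dump and sanity-check its lengths."""
    if hex_dump.lower().startswith("0x"):
        hex_dump = hex_dump[2:]
    raw = bytes.fromhex(hex_dump)
    _ver_type, code, session_id, payload_len = struct.unpack("!BBHH", raw[:6])
    payload = raw[6:6 + payload_len]  # anything after this is padding

    # Walk the tag list: 2-byte type, 2-byte length, then the value.
    tags = []
    offset = 0
    while offset + 4 <= len(payload):
        tag_type, tag_len = struct.unpack("!HH", payload[offset:offset + 4])
        value = payload[offset + 4:offset + 4 + tag_len]
        tags.append((TAG_NAMES.get(tag_type, hex(tag_type)), value))
        offset += 4 + tag_len

    lengths_ok = len(raw) == reported_len and 6 + payload_len <= reported_len
    return {"code": code, "session_id": session_id, "tags": tags, "lengths_ok": lengths_ok}


pado = decode_dump(21, "0x11070000000f010100000102000741394b2d424e47")
pads = decode_dump(10, "0x11651ed0000401010000")
print(pado)
print(pads)
```

In the PADO from this article the AC-Name tag decodes to the ASCII string A9K-BNG, and 6 header bytes plus 15 payload bytes add up to the reported len of 21.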

Command: debug pppoe session detail

RP/0/RSP0/CPU0:Sep  1 15:51:22.917 : pppoe_ma[346]: Session: Bundle-Ether100.100: Initializing new session

RP/0/RSP0/CPU0:Sep  1 15:51:22.917 : pppoe_ma[346]: Session: Creating and inserting a new session idb on parent interface Bundle-Ether100.100, with session id 7889

RP/0/RSP0/CPU0:Sep  1 15:51:22.917 : pppoe_ma[346]: Session: Successfully added new session idb to database of parent Bundle-Ether100.100. Parent session IDB count = 3

RP/0/RSP0/CPU0:Sep  1 15:51:22.918 : pppoe_ma[346]: Session: Queuing interface create on parent interface Bundle-Ether100.100, with session id 7889

LC/0/0/CPU0:Sep  1 15:51:23.038 : pppoe_ea[288]: Interface: 0x0007b920: Creating and inserting a new intf idb

LC/0/1/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: Creating and inserting a new intf idb

LC/0/0/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: Successfully added new intf idb to database

LC/0/0/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE received

LC/0/0/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(1), idb lookups/creates completed successfully

LC/0/1/CPU0:Sep  1 15:51:23.040 : pppoe_ea[288]: Interface: 0x0007b920: Successfully added new intf idb to database

LC/0/1/CPU0:Sep  1 15:51:23.040 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE received

LC/0/1/CPU0:Sep  1 15:51:23.040 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(1), idb lookups/creates completed successfully

LC/0/0/CPU0:Sep  1 15:51:23.044 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(2), Hardware programming completed successfully

LC/0/1/CPU0:Sep  1 15:51:23.045 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(2), Hardware programming completed successfully

LC/0/0/CPU0:Sep  1 15:51:23.045 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(3), IMP interface create completed successfully

LC/0/1/CPU0:Sep  1 15:51:23.046 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(3), IMP interface create completed successfully

RP/0/RSP0/CPU0:Sep  1 15:51:23.050 : pppoe_ma[346]: Session: 0x00000000: Received interface create cb: No error

RP/0/RSP0/CPU0:Sep  1 15:51:23.052 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Inserting a session idb into global database

RP/0/RSP0/CPU0:Sep  1 15:51:23.053 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Successfully added new session idb to global database. Session IDB count = 3

RP/0/RSP0/CPU0:Sep  1 15:51:23.053 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received MTU notification, with MTU actual 1500

RP/0/RSP0/CPU0:Sep  1 15:51:23.057 : pppoe_ma[346]: Session: Received AAA batch start notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.057 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received AAA Session Create cb: No error

RP/0/RSP0/CPU0:Sep  1 15:51:23.058 : pppoe_ma[346]: Session: Received AAA batch end notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.075 : pppoe_ma[346]: Session: Received SubDB batch start notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.075 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received Activate Config cb: No error

RP/0/RSP0/CPU0:Sep  1 15:51:23.076 : pppoe_ma[346]: Session: Received SubDB batch end notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.159 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Interface ready call succeeded

RP/0/RSP0/CPU0:Sep  1 15:51:23.188 : pppoe_ma[346]: Session: Received AAA batch start notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.188 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received AAA Session Up Complete cb

RP/0/RSP0/CPU0:Sep  1 15:51:23.189 : pppoe_ma[346]: Session: Received AAA batch end notication

This command is not that useful for normal troubleshooting, but it does show the MTU being set on the interface and how PPPoE creates the session internally.

Common things to watch out for:

PPPoE sessions can fail to establish for various reasons. Here are some gotchas that I have run into many times:

  • Service name matching: is the pppoe service name matching what the BBA group is configured for?
  • PPPoE throttling
  • Control plane policing
  • Incorrect vlan (combo)
  • Malformed tags

PPP

PPP goes through 3 phases (LCP, authentication and NCP) to establish a session that allows the transport of L3 protocols.

During LCP the initial link parameters are agreed on, such as the MRU (comparable to the MTU) and the authentication protocol. If an authentication protocol is agreed on, we enter the authentication phase, in which credentials are exchanged that are likely handed off to RADIUS for verification.

After that the NCP phase starts, which allows us to establish an L3 protocol connection such as IPCP for IPv4 or OSICP for CLNS. The A9K supports only IPCP, and soon IPV6CP (with XR 4.3.0).

ppp_setup.jpg

During LCP and NCP, options are exchanged via the following packets. Each of these packets contains options.

CONFIGURE-REQUEST

A packet carrying the options that one side of the PPP connection would like to propose.

CONFIGURE-ACK

A response to a request indicating that all options in the request are acceptable.

CONFIGURE-REJECT

A response to a request indicating that an option can’t be honoured (e.g. one side does MLP and the other side does not, so we reject the option).

CONFIGURE-NAK

A response to a request indicating that the options in the NAK packet can be supported, but that the proposed value is not desirable; the NAK carries a value that would be acceptable. E.g. one side proposes CHAP authentication and the other side can only do PAP.

Requests and responses are linked together by a field in the packet called the “ID”.
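The choice between the three response types can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the IOS-XR implementation: the function name choose_response and its parameters are hypothetical, options are modelled as a plain dictionary, and unsupported options are rejected before any values are NAKed (the order seen in the IPCP debug later in this article).

```python
def choose_response(requested, supported, preferred):
    """Pick the reply to a Configure-Request.

    requested -- {option name: value} proposed by the peer
    supported -- set of option names this side implements
    preferred -- {option name: function(value) -> value we want};
                 the function returns the value unchanged if it is fine
    """
    # 1. Options we do not implement at all are rejected.
    rejected = {name: value for name, value in requested.items() if name not in supported}
    if rejected:
        return "CONFREJ", rejected

    # 2. Options we implement but whose value we dislike are NAKed
    #    with the value we would accept instead.
    naked = {}
    for name, value in requested.items():
        wanted = preferred.get(name, lambda v: v)(value)
        if wanted != value:
            naked[name] = wanted
    if naked:
        return "CONFNAK", naked

    # 3. Everything is acceptable: echo the options back in an ACK.
    return "CONFACK", dict(requested)


# LCP example: the peer proposes multilink options we do not support.
lcp_reply = choose_response(
    {"MRU": 1492, "MagicNumber": 0x659FF390, "MRRU": 1524, "EndpointDisc": "test"},
    supported={"MRU", "MagicNumber", "AuthProto"},
    preferred={},
)

# IPCP example: the peer asks for an address and we hand out a pool address.
hand_out_pool_address = {"Address": lambda value: "199.1.1.1"}
ipcp_first = choose_response({"Address": "0.0.0.0"}, {"Address"}, hand_out_pool_address)
ipcp_second = choose_response({"Address": "199.1.1.1"}, {"Address"}, hand_out_pool_address)

print(lcp_reply)
print(ipcp_first)
print(ipcp_second)
```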

After the session is established, LCP echo requests (a keepalive mechanism) are exchanged and have to be acknowledged.

Keepalives from the peer are answered by the 9k in hardware. The 9k also originates keepalives, which have to be answered by the peer.

Debugging PPP

Some helpful hints explaining the debugs from the component

Command: debug ppp negotiation

RP/0/RSP0/CPU0:Sep 15 12:43:18.244 : PPP-MA[343]: LCP: Bundle-Ether100.100.pppoe1: [Initial]: Up Event

RP/0/RSP0/CPU0:Sep 15 12:43:18.248 : PPP-MA[343]: LCP: Bundle-Ether100.100.pppoe1: [Initial]: Change to state Closed

At the start of this debug the location is RSP0 because PPP for bundle sessions, as indicated by the interface, is terminated on the RSP. Sessions on a physical interface are terminated on the linecard.

I’ll omit the node and interface from the debug lines from here on to save space.

LCP debug

.244 LCP: [Initial]: Up Event

.248 LCP: [Initial]: Change to state Closed

.249 LCP: [Closed]: Down Event

.250 LCP: [Closed]: Change to state Initial

.250 LCP: [Initial]: Up Event

.251 LCP: [Initial]: Change to state Closed

.358 LCP: [Closed]: Open Event

.358 LCP: [Closed]: Initialize-Restart-Counter

.358 LCP: [Closed]: O CONFREQ id 1 len 19

.359 LCP: [Closed]:    MRU 1492 (0x010405d4)

.359 LCP: [Closed]:    AuthProto CHAP (0x0305c22305)

.359 LCP: [Closed]:    MagicNumber 0x1e7824c4 (0x05061e7824c4)

In each line, identify the protocol (LCP), the LCP state in brackets [ ], the direction of the packet (O for output, I for input), the packet type (CONFREQ) and the ID. At this point we expect a response from the peer to our ID 1.

.360 LCP: [Closed]: Change to state Req-Sent

.360 LCP: [Req-Sent]: Open Event

.361 LCP: [Req-Sent]: I CONFREQ id 1 len 25

.361 LCP: [Req-Sent]:    MRU 1492 (0x010405d4)

.362 LCP: [Req-Sent]:    MagicNumber 0x659ff390 (0x0506659ff390)

.362 LCP: [Req-Sent]:    MRRU 1524 (0x110405f4)

.362 LCP: [Req-Sent]:    EndpointDisc 1 Local (0x13070174657374)

We have moved to state Req-Sent and receive an incoming configure request, also with ID 1; we are now expected to send a response to the peer's ID 1 as well. Note that I deliberately let the client request the MRRU and EndpointDisc options, which are used for multilink, something we don’t support on the 9k.

.363 LCP: Peer's MRRU: Option received on non-MP interface (reject)

.363 LCP: Peer's ED: Option received on non-MP interface (reject)

PPP already identifies the unsupported options.

.363 LCP: [Req-Sent]: O CONFREJ id 1 len 15

.364 LCP: [Req-Sent]:    MRRU 1524 (0x110405f4)

.364 LCP: [Req-Sent]:    EndpointDisc 1 Local (0x13070174657374)

Here we are sending a REJECT on ID 1, which is the response to the peer's request with ID 1 (highlighted in blue in the original post).

We are rejecting the options for MLP and should expect a new request from the peer with a new offer. Hopefully the peer is not persistent in its multilink request. If it is, and requests it again, we reject it again. How long can that go on? That is governed by the ppp max-configure configuration knob; the default is 3.

After 3 attempts to configure the link we terminate the PPP session.

.373 LCP: [Req-Sent]: I CONFACK id 1 len 19

.373 LCP: [Req-Sent]:    MRU 1492 (0x010405d4)

.373 LCP: [Req-Sent]:    AuthProto CHAP (0x0305c22305)

.374 LCP: [Req-Sent]:    MagicNumber 0x1e7824c4 (0x05061e7824c4)

The peer liked our proposal and acknowledged it. If the peer didn’t like our CHAP authentication protocol but could do PAP, it would have sent us a PPP CONFNAK packet with the AuthProto option in it. This would tell us to switch authentication protocols, IF we are configured for that.

If not, and we are persistent on CHAP, we end up in a deadlock as well, and after the default of 3 attempts to configure we drop the session.

.374 LCP: [Req-Sent]: Initialize-Restart-Counter

.374 LCP: [Req-Sent]: Change to state Ack-Rcvd

.374 LCP: [Ack-Rcvd]: I CONFREQ id 2 len 14

.374 LCP: [Ack-Rcvd]:    MRU 1492 (0x010405d4)

.375 LCP: [Ack-Rcvd]:    MagicNumber 0x659ff390 (0x0506659ff390)

We receive a new proposal from the peer in which it has retracted the MLP options. Note that the ID has changed to 2; the ID MUST increase on every request sent.

.375 LCP: [Ack-Rcvd]: O CONFACK id 2 len 14

.375 LCP: [Ack-Rcvd]:    MRU 1492 (0x010405d4)

.375 LCP: [Ack-Rcvd]:    MagicNumber 0x659ff390 (0x0506659ff390)

.375 LCP: [Ack-Rcvd]: Change to state Open

That is a proposal we can honor, so we send an ACK for ID 2. At this point we have both sent an ACK and received an ACK.

.376 LCP: [Open]: Report This-Layer-Up
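The hex string printed in parentheses after each LCP option in the debug above is the raw option: one byte type, one byte length (including those two bytes), then the data. The following Python sketch decodes it. The function name decode_lcp_options is hypothetical, and the tables only cover the options that appear in this trace.

```python
import struct

# Option types seen in the trace above.
LCP_OPTION_NAMES = {1: "MRU", 3: "AuthProto", 5: "MagicNumber", 17: "MRRU", 19: "EndpointDisc"}
# PPP protocol numbers carried in the AuthProto option.
AUTH_PROTOCOLS = {0xC223: "CHAP", 0xC023: "PAP"}


def decode_lcp_options(hex_options):
    """Decode a run of LCP options given as a hex string."""
    raw = bytes.fromhex(hex_options)
    options = []
    offset = 0
    while offset + 2 <= len(raw):
        opt_type, opt_len = raw[offset], raw[offset + 1]
        if opt_len < 2:
            raise ValueError("option length must include the 2-byte header")
        data = raw[offset + 2:offset + opt_len]
        name = LCP_OPTION_NAMES.get(opt_type, f"type {opt_type}")
        if opt_type in (1, 17):        # MRU / MRRU: 16-bit size
            value = struct.unpack("!H", data)[0]
        elif opt_type == 3:            # first 2 bytes name the auth protocol
            proto = struct.unpack("!H", data[:2])[0]
            value = AUTH_PROTOCOLS.get(proto, hex(proto))
        elif opt_type == 5:            # 32-bit magic number
            value = "0x" + data.hex()
        else:                          # leave anything else as raw hex
            value = data.hex()
        options.append((name, value))
        offset += opt_len
    return options


our_request = decode_lcp_options("010405d4" + "0305c22305" + "05061e7824c4")
peer_extras = decode_lcp_options("110405f4" + "13070174657374")
print(our_request)
print(peer_extras)
```

The three options of our CONFREQ are 4 + 5 + 6 = 15 bytes long; adding the 4-byte LCP packet header (code, ID, length) gives the "len 19" shown in the debug.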

IPCP debug

This handshake is the same as LCP, with the same packet types, just for a different protocol. IPCP instead of LCP.

.397 IPCP: [Initial]: I CONFREQ id 1 len 22

.398 IPCP: [Initial]:    Address 0.0.0.0 (0x030600000000)

.398 IPCP: [Initial]:    PrimaryDNS 0.0.0.0 (0x810600000000)

.398 IPCP: [Initial]:    SecondaryDNS 0.0.0.0 (0x830600000000)

.398 IPCP: [Initial]: Conf-Req packet stalled

.400 IPCP: [Initial]: Open Event

.400 IPCP: [Initial]: Change to state Starting

.402 IPCP: [Starting]: Report This-Layer-Started

.402 IPCP: [Starting]: Up Event

.402 IPCP: [Starting]: Initialize-Restart-Counter

.403 IPCP: [Starting]: O CONFREQ id 1 len 10

.403 IPCP: [Starting]:    Address 101.101.1.1 (0x030665650101)

Our address is that of the interface the session is unnumbered to, derived from the dynamic template or RADIUS.

.404 IPCP: [Starting]: Change to state Req-Sent

.404 IPCP: [Req-Sent]: Restarting stalled Conf-Req packet

.405 IPCP: [Req-Sent]: I CONFREQ id 1 len 22

.405 IPCP: [Req-Sent]:    Address 0.0.0.0 (0x030600000000)

.405 IPCP: [Req-Sent]:    PrimaryDNS 0.0.0.0 (0x810600000000)

.405 IPCP: [Req-Sent]:    SecondaryDNS 0.0.0.0 (0x830600000000)

.406 IPCP: [Req-Sent]: I CONFACK id 1 len 10

.406 IPCP: [Req-Sent]:    Address 101.101.1.1 (0x030665650101)

.406 IPCP: [Req-Sent]: Initialize-Restart-Counter

.406 IPCP: [Req-Sent]: Change to state Ack-Rcvd

.408 IPCP: Peer's Primary DNS address: 0.0.0.0 (reject)

.408 IPCP: Peer's Secondary DNS address: 0.0.0.0 (reject)

.408 IPCP: [Ack-Rcvd]: O CONFREJ id 1 len 16

.408 IPCP: [Ack-Rcvd]:    PrimaryDNS 0.0.0.0 (0x810600000000)

.408 IPCP: [Ack-Rcvd]:    SecondaryDNS 0.0.0.0 (0x830600000000)

Client requested dns servers and we don’t have any info for them.

.409 IPCP: [Ack-Rcvd]: I CONFREQ id 2 len 10

.409 IPCP: [Ack-Rcvd]:    Address 0.0.0.0 (0x030600000000)

Client requests an ip address

.410 IPCP: [Ack-Rcvd]: O CONFNAK id 2 len 10

.410 IPCP: [Ack-Rcvd]:    Address 199.1.1.1 (0x0306c7010101)

We provide a pool address.

The client can also request a specific address. If we ACK that, it effectively becomes a semi-static IP address, so many providers prefer to renegotiate the address: they always send a NAK on the address option and provide a new pool address.

.411 IPCP: [Ack-Rcvd]: I CONFREQ id 3 len 10

.411 IPCP: [Ack-Rcvd]:    Address 199.1.1.1 (0x0306c7010101)

.411 IPCP: [Ack-Rcvd]: O CONFACK id 3 len 10

.411 IPCP: [Ack-Rcvd]:    Address 199.1.1.1 (0x0306c7010101)

.411 IPCP: [Ack-Rcvd]: Change to state Open

.412 IPCP: [Open]: Report This-Layer-Up
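The IPCP options in this trace can be decoded the same way as the LCP ones: one byte type, one byte length, then a 4-byte IPv4 address. A small Python sketch follows; the function name decode_ipcp_option is hypothetical and only the three option types from the debug above are named.

```python
import ipaddress

# Option types seen in the trace above.
IPCP_OPTION_NAMES = {0x03: "Address", 0x81: "PrimaryDNS", 0x83: "SecondaryDNS"}


def decode_ipcp_option(hex_option):
    """Decode one 6-byte IPCP address-style option from its hex form."""
    raw = bytes.fromhex(hex_option)
    if len(raw) != 6 or raw[1] != 6:
        raise ValueError("expected a 6-byte option with length field 6")
    name = IPCP_OPTION_NAMES.get(raw[0], hex(raw[0]))
    address = ipaddress.IPv4Address(raw[2:6])
    return name, str(address), address.is_unspecified


for option in ("030600000000", "0306c7010101", "030665650101", "810600000000"):
    name, address, asks_for_value = decode_ipcp_option(option)
    note = " (peer asks us to supply a value)" if asks_for_value else ""
    print(f"{name} {address}{note}")
```

An address of 0.0.0.0 in a CONFREQ is the client asking to be given a value, which is why the BNG answers it with a CONFNAK carrying the pool address.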

LCP echos debug

RP/0/RSP0/CPU0:Sep 15 12:43:19.405 LCP: [Open]: I ECHOREQ id 1 len 12 magic 0x659ff390

RP/0/RSP0/CPU0:Sep 15 12:43:19.405 LCP: [Open]: O ECHOREP id 1 len 12 magic 0x1e7824c4

These are incoming echo requests and our outgoing replies; note that the ID increments every time.

Also note the magic numbers in these packets, which are the magic numbers negotiated during LCP.

The A9K does NOT do magic number validation.

RP/0/RSP0/CPU0:Sep 15 12:43:29.646 LCP: [Open]: I ECHOREQ id 2 len 12 magic 0x659ff390

RP/0/RSP0/CPU0:Sep 15 12:43:29.646 LCP: [Open]: O ECHOREP id 2 len 12 magic 0x1e7824c4
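Since the A9K does not validate magic numbers, a check can be done offline on the debug output if a looped link is suspected. The Python sketch below assumes the echo line format shown above; the names ECHO_RE and check_echo are made up, and the rule applied (an inbound packet carrying our own magic number suggests a loopback) is the usual PPP interpretation rather than anything the A9K enforces.

```python
import re

ECHO_RE = re.compile(
    r"LCP: \[Open\]: (?P<dir>[IO]) (?P<kind>ECHOREQ|ECHOREP) "
    r"id (?P<id>\d+) len \d+ magic (?P<magic>0x[0-9a-f]+)"
)


def check_echo(line, local_magic, peer_magic):
    """Classify one LCP echo debug line against the negotiated magic numbers."""
    match = ECHO_RE.search(line)
    if match is None:
        return "not an LCP echo line"
    magic = int(match["magic"], 16)
    if match["dir"] == "I":
        if magic == local_magic:
            return "looped back"  # we are receiving our own magic number
        return "ok" if magic == peer_magic else "unexpected magic"
    # Outbound packets should carry our own magic number.
    return "ok" if magic == local_magic else "unexpected magic"


LOCAL_MAGIC = 0x1E7824C4  # from our CONFREQ in the LCP debug
PEER_MAGIC = 0x659FF390   # from the peer's CONFREQ

inbound = "RP/0/RSP0/CPU0:Sep 15 12:43:19.405 LCP: [Open]: I ECHOREQ id 1 len 12 magic 0x659ff390"
outbound = "RP/0/RSP0/CPU0:Sep 15 12:43:19.405 LCP: [Open]: O ECHOREP id 1 len 12 magic 0x1e7824c4"
print(check_echo(inbound, LOCAL_MAGIC, PEER_MAGIC))
print(check_echo(outbound, LOCAL_MAGIC, PEER_MAGIC))
```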

Note: I requested the lcp echo debug to be moved over to a different debug command. At the time of writing this article it is part of the debug ppp negotiation


AAA

radius.JPG

When the control policy determines that user authentication is in order, an access-request with several attributes is sent to the RADIUS server.

The radius server can check these attributes and reply back with either an ACCESS-ACCEPT or an ACCESS-REJECT.

The Reject can have a reply message stating the reason for reject, but that is optional.

An ACCEPT can come with reply attributes that instruct the BNG to apply features or session parameters to the session.

Debugging AAA

Some helpful hints explaining the debugs from the component

Command: Debug radius

Command: Debug aaa authentication/authorization/accounting

Common things to watch out for:

Some things that can go wrong with RADIUS are:

  • When the secret key is misconfigured, the authenticator in the RADIUS header will be wrong. Some RADIUS servers then always reply with a reject; others silently discard the message.
  • RADIUS packets are sent with an 8-bit ID, which is not enough at large scale, so we extend it with the source port. Is show radius showing bad authenticators? Then you likely have ID overload. Make sure your RADIUS server supports extended source ports.
  • Are the reply items set properly for the session type? For instance, IP sessions do not come up when the service-type is set to framed or outbound.
  • When a feature cannot be applied, such as an ACL that doesn’t exist, should we reject the session or keep it alive unrestricted?
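To illustrate why a wrong secret key or a reply matched to the wrong request shows up as a bad authenticator: the response authenticator is an MD5 hash over the reply header, the authenticator of the original request, the reply attributes and the shared secret (RFC 2865). The Python sketch below recomputes it; the function names, the secret and the sample values are invented for this example.

```python
import hashlib
import hmac
import struct


def response_authenticator(code, identifier, attributes, request_authenticator, secret):
    """MD5(Code + ID + Length + RequestAuth + Attributes + Secret), per RFC 2865."""
    length = 20 + len(attributes)  # 20-byte RADIUS header plus attributes
    header = struct.pack("!BBH", code, identifier, length)
    return hashlib.md5(header + request_authenticator + attributes + secret).digest()


def verify_response(packet, request_authenticator, secret):
    """Return True when the reply's authenticator matches what we expect."""
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    received = packet[4:20]
    attributes = packet[20:length]
    expected = response_authenticator(code, identifier, attributes, request_authenticator, secret)
    return hmac.compare_digest(received, expected)


# Hypothetical lab values, for illustration only.
request_auth = bytes(range(16))   # authenticator of the access-request we sent
secret = b"lab-secret"
access_accept = 2                 # RADIUS code for Access-Accept
reply = (
    struct.pack("!BBH", access_accept, 42, 20)
    + response_authenticator(access_accept, 42, b"", request_auth, secret)
)

print(verify_response(reply, request_auth, secret))            # matching secret
print(verify_response(reply, request_auth, b"wrong-secret"))   # misconfigured secret
print(verify_response(reply, bytes(16), secret))               # reply matched to the wrong request
```

The last case is what ID overload looks like: the reply is valid, but it is checked against a different outstanding request that happens to share the same 8-bit ID.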

NOTE: The radius-profile for an IPoE session should NOT include any service-type. Having Framed-User as service type in an access accept for an IP subscriber will cause the session to fail.

PPP/PPPoE Control Policy Events

handling-events.jpg


Version history
Revision #: 1 of 1
Last update: 11-07-2011 07:02 AM
Comments
New Member

Hi, Xander!

First want to thank you for your great articles regarding ASR9k and IOS XR! Keep up with the good work!

I am trying to configure PPPoE BNG session on ASR9k, 4.3.0 with radius authentication, but so far without any success. (DHCP was OK though). I keep receiving some strange error and I cannot troubleshoot what exactly is the problem. I guess there is a problem when the iEdge tries to send the aaa request to the radius (point 7 in the beginning of the troubleshooting guide).

The error I keep receiving is:

%L2-PPP_MA-4-ERR_AAA_SESSION_IF : Unknown Interface ID 08000860: Unexpected error encountered whilst receiving an update response: 'iEdge' detected the 'warning' condition 'iEdge invalid argument'

Here is the log from the “debug ppp all”, when trying to establish PPPoE session:

RP/0/RSP0/CPU0:ASR9K-2-SE#RP/0/RSP0/CPU0:Mar 13 16:55:20.542 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Up Event

RP/0/RSP0/CPU0:Mar 13 16:55:20.542 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Change to state Closed

RP/0/RSP0/CPU0:Mar 13 16:55:20.542 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]: Down Event

RP/0/RSP0/CPU0:Mar 13 16:55:20.542 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]: Change to state Initial

RP/0/RSP0/CPU0:Mar 13 16:55:20.542 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Up Event

RP/0/RSP0/CPU0:Mar 13 16:55:20.542 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Change to state Closed

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]: Open Event

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]: Initialize-Restart-Counter

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]:   MRU 1492 (0x010405d4)

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]: O CONFREQ id 1 len 19

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]:   AuthProto CHAP (0x0305c22305)

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]: Change to state Req-Sent

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Closed]:   MagicNumber 0x395e89d3 (0x0506395e89d3)

RP/0/RSP0/CPU0:Mar 13 16:55:20.648 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: Open Event

RP/0/RSP0/CPU0:Mar 13 16:55:23.650 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: Timeout+ Event

RP/0/RSP0/CPU0:Mar 13 16:55:23.650 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: O CONFREQ id 2 len 19

RP/0/RSP0/CPU0:Mar 13 16:55:23.650 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]:   MRU 1492 (0x010405d4)

RP/0/RSP0/CPU0:Mar 13 16:55:23.650 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]:   MagicNumber 0x395e89d3 (0x0506395e89d3)

RP/0/RSP0/CPU0:Mar 13 16:55:23.650 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]:   AuthProto CHAP (0x0305c22305)

RP/0/RSP0/CPU0:Mar 13 16:55:26.651 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: Timeout+ Event

RP/0/RSP0/CPU0:Mar 13 16:55:26.651 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: O CONFREQ id 3 len 19

RP/0/RSP0/CPU0:Mar 13 16:55:26.651 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]:   MRU 1492 (0x010405d4)

RP/0/RSP0/CPU0:Mar 13 16:55:26.651 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]:   AuthProto CHAP (0x0305c22305)

RP/0/RSP0/CPU0:Mar 13 16:55:26.651 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]:   MagicNumber 0x395e89d3 (0x0506395e89d3)

RP/0/RSP0/CPU0:Mar 13 16:55:26.753 : PPP-MA[374]: IPCP: Bundle-Ether2.101.pppoe24: [Initial]: Close Event

RP/0/RSP0/CPU0:Mar 13 16:55:26.753 : PPP-MA[374]: %L2-PPP_MA-4-ERR_AAA_SESSION_IF : Unknown Interface ID 08000860: Unexpected error encountered whilst receiving an update response: 'iEdge' detected the 'warning' condition 'iEdge invalid argument'

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: Down Event

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Req-Sent]: Change to state Starting

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Starting]: Close Event

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Starting]: Change to state Initial

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Report This-Layer-Finished

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Close Event

RP/0/RSP0/CPU0:Mar 13 16:55:26.754 : PPP-MA[374]: LCP: Bundle-Ether2.101.pppoe24: [Initial]: Report Closed

And here is the configuration, I’ve followed your config guide here…

aaa attribute format MY_AUTH

mac-address plus circuit-id plus remote-id separator #

!

aaa attribute format NAS_PORT_FORMAT

circuit-id plus remote-id separator .

!

aaa radius attribute nas-port format e SSAAPPPPQQQQQQQQQQVVVVVVVVVVUUUU type 32

aaa radius attribute nas-port-id format NAS_PORT_FORMAT

aaa group server radius radius-lab

server 100.100.0.2 auth-port 1812 acct-port 1813

!

aaa authorization subscriber default group radius

aaa authentication subscriber default group radius

dynamic-template

type ppp PPP_TPL_GRT

ppp authentication chap

ppp ipcp dns 1.1.1.1 2.2.2.2

ppp ipcp peer-address pool POOL_GRT

ipv4 unnumbered Loopback1001

!

interface Loopback1001

ipv4 address 20.20.0.254 255.255.255.255

!

pool vrf default ipv4 POOL_GRT

address-range 20.20.0.0 20.20.255.255

policy-map type control subscriber PPP_PM1

event session-start match-first

class type control subscriber PPPoE do-until-failure

   10 activate dynamic-template PPP_TPL_GRT

!

!

event session-activate match-first

class type control subscriber PPPoE do-until-failure

   10 authenticate aaa list default

!

!

end-policy-map

pppoe bba-group PPPOE_ACCESS

service selection disable

interface Bundle-Ether2.101

service-policy type control subscriber PPP_PM1

pppoe enable bba-group PPPOE_ACCESS

encapsulation dot1q 101

I can confirm that the radius server 100.100.0.2 is reachable and working, but no single packet is seen on its interfaces during the entire process. Also i can see that for a very short time the Subscriber session is up and then it is torn down.

RP/0/RSP0/CPU0:ASR9K-2-SE#show subscriber session all detail

Interface:                Bundle-Ether2.101.pppoe20

Circuit ID:               Unknown

Remote ID:                Unknown

Type:                     PPPoE

IPv4 State:               Up Pending, Wed Mar 13 16:32:13 2013

IPv6 State:               Up Pending, Wed Mar 13 16:32:13 2013

Mac Address:              000c.2994.47d0

Account-Session Id:       00000053

Nas-Port:                 Unknown

User name:                unknown

Outer VLAN ID:            101

Subscriber Label:         0x00000053

Created:                  Wed Mar 13 16:32:13 2013

State:                    Connected

Authentication:           unauthenticated

Access-interface:         Bundle-Ether2.101

Policy Executed:

policy-map type control subscriber PPP_PM1

  event Session-Start match-first [at Wed Mar 13 16:32:13 2013]

    class type control subscriber PPPoE do-until-failure [Succeeded]

      10 activate dynamic-template PPP_TPL_GRT [Succeeded]

Session Accounting: disabled

Last COA request received: unavailable

Can you advise how to troubleshoot further, and what the problem could be?

Thank you in advance,

Hristo

Cisco Employee

Thank you Hristo! I will

aha in this problem it looks like we are not receiving any confreq's from the client nor a conf<response> (whether that be an ack, nak or rej) on our proposal.

So this looks like a unidirectional PPP scenario and that can have various reasons.

Can you look at the PPP debugs on the client to see if it sees our packets, and whether it is responding to our proposals and originating a request of its own?

A proc restart on ppp_ma may help if there is a stall going on in the ppp manager.

There could also be a loadbalancing issue on your bundle (unlikely, but worth checking, so verify with a single member in the bundle to see).

Or some switches have reordering issues when it comes down to COS varying packets. So try to see if the switch is properly forwarding the PPP packets the 9k is originating.

Finally, if you don't need the features from 4.3.0, 4.2.3 might be a better release option for you if you are targeting a deployment shortly.

regards!

xander

New Member

Hi, Xander!

Thanks for the quick reply!

From the client side I can see that conf-ack messages are sent to the MAC address of the 9k. I've even sniffed the L2 port of the switch where the 9k is connected, and I can see the conf-ack there also. I've restarted the ppp_ma process, and the bundle interface has only one member.

I guess I will try downgrading to 4.2.3 or contacting TAC for assistance.

Thank you for your time!

Hristo

Cisco Employee

Hi Hristo,

OK, so the client is responding perfectly but we are not seeing their packets; they could be dropped either in the punt path or at the ppp process level. I think it is best to open a TAC case for this for better triaging, as there is something buggy going on. Get a show tech ppp when you do. In the meantime, if you just want to try to recover and continue a test, try reloading the LC and/or use 4.2.3.

regards!

xander

New Member

Hi, Xander!

Everything is now OK!

We were facing Caveat 4 from the upgrade guide:

http://www.cisco.com/web/Cisco_IOS_XR_Software/pdf/ASR9000_Upgrade_Procedure_430.pdf

After clean install (Turboboot) of 4.3.0, everything is working!

Thanks for your time and help,

Hristo

Cisco Employee

Hi Hristo, that is great to hear, glad to see it is fixed!!

enjoy playing with it

xander

Bronze

Hi,

I have the same problem now. I was running 4.3.0 and now 4.3.1.

Is it enough to deactivate BNG pie and activate it again?

And why is there no BNG package in 4.2.3?

Hi Xander

 Can I run debug radius and aaa authentication in production? My concern is that the ASR9K will die if I run these debug commands. I am using an ASR9001 with around 4000 subscribers. I would like to troubleshoot why some users can't pass authentication. If you have any other solution, please advise.

 

Thank you.

Pichet

Cisco Employee

hi pichet,

fortunately xr is not classic ios :) so you can enable a debug, and even though there will be many messages, XR can handle it and will throttle the output if needed; it will never clog the full system. worst case is a process crash, but that is also a contained event.

In any case, there is a great filtering option available too on the radius debug:

 

RP/0/RSP0/CPU0:A9K-BNG#debug radius filter ?
  access-interface  Filter on subscriber access-interface
  inner-vlan-id     Filter on subscriber inner-vlan-id
  ipv4-address      Filter on subscriber IPv4 address
  ipv6-address      Filter on subscriber IPv6 address
  mac-address       Filter on subscriber MAC address
  nas-port-id       Filter on subscriber nas-port-id
  outer-vlan-id     Filter on subscriber outer-vlan-id
  username          Filter on subscriber username

 

cheers

xander

New Member

Hi Xander,

Hope you are doing fine.

A quick question:

Are there any plans on enabling the following (they don't provide any output in 5.2.4):

RP/0/RSP0/CPU0:bbras-llu-kln-02#debug radius ?
  accounting      RADIUS Accounting Debugging
  authentication  RADIUS Authentication Debugging
  authorization   RADIUS Authorization Debugging
  configuration   RADIUS Configuration Debugging

and on supporting debug filtering for them?

It would be quite useful because "debug radius" produces extra output that cannot be filtered by username or vlan, and it is kind of difficult to get clean output for further processing.

Regards,

Dimitris

 

Cisco Employee

the debug aaa <xxx> cmd's have little purpose for the iedge/bng.

you could use the debug subscriber subset that has very nice filters such as first trigger for a particular mac, username, session, access interface etc.

 

debug radius can be filtered like this, not sure if you have seen that:

RP/0/RSP0/CPU0:A9K-BNG#debug radius filter ?
  access-interface  Filter on subscriber access-interface
  inner-vlan-id     Filter on subscriber inner-vlan-id
  ipv4-address      Filter on subscriber IPv4 address
  ipv6-address      Filter on subscriber IPv6 address
  mac-address       Filter on subscriber MAC address
  nas-port-id       Filter on subscriber nas-port-id
  outer-vlan-id     Filter on subscriber outer-vlan-id
  username          Filter on subscriber username

cheers

xander

New Member

We need to have a way to check radius attributes for troubleshooting reasons.

We've seen the debug radius filters and we have opened an SR (636152695) because we couldn't work out how the filter is applied (it is quite tricky).

What I am saying is that even with debug radius filter enabled (e.g. debug radius filter username xander), the output contains extra info which cannot be filtered (e.g. some CoA packets that have nothing to do with the specific username) and it becomes quite difficult to analyze.

Maybe more specific debug options (e.g. debug radius accounting filter username xander) would be more helpful, but for now not even debug radius accounting without a filter is working, although it is available (same for authentication/authorization) :-S

Cisco Employee

you are correct, I am also noticing that in 5.2.4 the debug radius filter is not working as it should, nor does it filter correctly on the radius message type.

I'll review your case and make sure that the right ddts is filed for the right fix for this.

no workaround that I can see at this moment. (except for dumping everything and using manual grep filters).

xander

New Member

Thanks Xander,

Very helpful as always :)

New Member

Hi Xander, how are you? May I ask a question?

In the PTA model, with sessions terminating on a bundle interface, where are the PPP keepalives processed?

In the RSP or in the NP of the linecard?

And with pw-headend?

In the RSP or in the NP of the linecard?

Best regards!

Cisco Employee

PW-Ether is a virtual interface, meaning that packets can be received on multiple interfaces/cards. Hence the subscriber sessions must be terminated on the RSP.

New Member

Hi Alek, thanks! My question was where the keepalives are processed in both cases.

Let me see if I understood your answer: you are saying that because the sessions are terminated on the RSP, the keepalives are processed there, right?

Regards

Javier

Cisco Employee

That's correct. :)

New Member

Hello Alex,

thanks for that great guide, made me understand the new PPPoE/IPoE Architecture on XR machines to get some basic system running.

Now - your post covers Authentication.

It does not cover Authorization and Accounting (which I haven't yet got working).

I run a freeradius server and would like to set up a complete system, covering Authorization (Profiles/Disconnects etc) as well as Accounting packets.

Would be great if you (or someone) could point me in the right direction.

Thanks,

Heiko

Bronze

Hi Heiko,

check these two links please.

https://supportforums.cisco.com/document/77526/asr9000-bng-training-guide-setting-pppoe-and-ipoe-sessions

https://supportforums.cisco.com/document/77646/asr9000-understanding-bng-configuration-walkthrough

If you have difficulties come back and we will help you out.

p
New Member

Can anyone tell me the reason for the log below?

RP/0/RSP0/CPU0:Aug xxx : PPP-MA[374]: %L2-PPP_MA-4-ERR_AAA_SESSION_IF : Bundle-Etherxxxxxxxxxx: Unexpected error encountered whilst receiving an activate response: 'AAA_BASE' detected the 'fatal' condition 'AAA Radius Server Failure' 

Cisco Employee

hi there, when a session is getting fully activated, it needs an OK from AAA if you have configured that directive in the control policy.

if it fails, whether because of no response, bad attributes that can't be honored, or a reject, then the session has to be let go.

I think, based on the specific message here, the radius wasn't responding.

debug radius can help here, and you can also check show subscr ses fail history to get an idea of the disconnect reasons. And show radius to see how the radius server is performing (not responding, rejects etc)

cheers

xander

New Member

Hi All

Can anyone help me out? I am migrating a BRAS from a 7200 to an ASR9010 running IOS XR 5.3.3.

Currently I have two solutions that are not working; only the normal BNG sessions are able to come up.

By normal I mean a dynamic template with pool and loopback under the default vrf.

When I try to assign PPPoE users to a VRF, it does not work. (Radius is in the global table; I used two dynamic templates, one for start and one for activation.)

debugs show me the radius request is accepted

Also a second issue: since the old BRAS 7200 was sitting behind the 9010, some circuits end up as xconnects on the 9010 and were being sent to the BRAS over an L2 connection. To migrate these circuits I decided to implement pseudowire headend for PPPoE, but with no success.

RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#show run dynamic-template
Fri Sep 30 17:59:36.388 BOL
dynamic-template
type ppp PPP_1
ppp authentication pap
ppp ipcp dns 200.87.100.10 200.87.100.40
ppp ipcp peer-address pool FIH100
ipv4 unnumbered Loopback0
!
type ppp PPP_2
ppp authentication pap
ppp ipcp dns 200.87.100.10 200.87.100.40
ppp ipcp peer-address pool FIH500
ipv4 unnumbered Loopback0
!
type ppp PPP_3
ppp authentication pap
ppp ipcp dns 200.87.100.10 200.87.100.40
ppp ipcp peer-address pool FIC100
ipv4 unnumbered Loopback0
!
type ppp PPP_99
ppp authentication pap
ppp ipcp dns 10.240.234.11 10.240.234.12
ppp ipcp peer-address pool QRUN
vrf qrunnal
!
type ppp PPP_aaa
ppp authentication pap
ipv4 unnumbered Loopback0
!
!

RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#show run policy-map type control ?
subscriber Subscriber control policy-map
RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#show run policy-map type control subscriber
Fri Sep 30 17:59:59.594 BOL
policy-map type control subscriber PPP_PM_QRUN
event session-start match-first
class type control subscriber PPP do-until-failure
10 activate dynamic-template PPP_aaa
!
!
event session-activate match-first
class type control subscriber PPP do-until-failure
10 authenticate aaa list default
20 activate dynamic-template PPP_99
!
!
end-policy-map
!
policy-map type control subscriber PPP_PM_BASICO
event session-start match-first
class type control subscriber PPP do-until-failure
10 activate dynamic-template PPP_1
!
!
event session-activate match-all
class type control subscriber PPP do-until-failure
1 authenticate aaa list default
!
!
end-policy-map
!
policy-map type control subscriber PPP_PM_SILVER
event session-start match-first
class type control subscriber PPP do-until-failure
10 activate dynamic-template PPP_2
!
!
event session-activate match-first
class type control subscriber PPP do-until-failure
1 authenticate aaa list default
!
!
end-policy-map
!
policy-map type control subscriber PPP_PM_PREMIUM
event session-start match-first
class type control subscriber PPP do-until-failure
10 activate dynamic-template PPP_3
!
!
event session-activate match-first
class type control subscriber PPP do-until-failure
1 authenticate aaa list default
!
!
end-policy-map
!

RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#
RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#show run l2vpn xconnect group PWHE_G
Fri Sep 30 18:00:22.112 BOL
l2vpn
xconnect group PWHE_G
p2p PWHE-1
interface PW-Ether100
neighbor ipv4 190.129.253.1 pw-id 4017
pw-class PWHE
!
!
p2p PWHE-2
interface PW-Ether200
neighbor ipv4 190.129.253.2 pw-id 4020
pw-class PWHE
!
!
!
!

RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#show run inter PW-Ether100.4017
Fri Sep 30 18:00:36.205 BOL
interface PW-Ether100.4017
service-policy type control subscriber PPP_PM_SILVER
pppoe enable bba-group PPP-SILVER
encapsulation ambiguous dot1q 4017 second-dot1q 10-3499
!

RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#show run inter GigabitEthernet0/1/0/8.3888
Fri Sep 30 18:01:07.459 BOL
interface GigabitEthernet0/1/0/8.3888
description ### PRUEBA PPPoE ###
service-policy type control subscriber PPP_PM_QRUN
pppoe enable bba-group PPP-QRUN
encapsulation dot1q 3888
!

RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#
RP/0/RSP0/CPU0:TDD-CE-CENTRAL-01#

Cisco Employee

the loopback you are applying is done during start, but is used for both non-vrf and vrf users,

if only non-vrf users work, that means that the loopback is in the global table.

solution is to apply a vrf-enabled loopback for those users in the vrf.

in start you can provide the template and details; in activate you don't need to reapply the template per se.

another option is to pass on the unnumbered information (and vrf) from radius.

this allows you to use a single template for all users, and radius to profile the vrf, unnumbered and possibly pool information.

cheers!

xander

New Member

Thanks a lot Xander,(xthuijs)

With your recommendation vrf users are getting authenticated as well as non-vrf users; I am using a loopback in the vrf for that purpose.

One issue I still can't figure out is the use of the pw-ether interface.

If I piggyback the connections that are received over the pseudowires it works, but each time I try to use the PW interfaces to process the traffic directly, without having to piggyback, I am not able to see any traffic under the xconnect in either the TX or RX direction.

Not sure if I am missing something regarding the pw-ether interface.

 l2vpn
xconnect group PWHE_G
p2p PWHE-1
interface PW-Ether100
neighbor ipv4 190.129.253.1 pw-id 4017
pw-class PWHE
!
!

interface PW-Ether100.4017
service-policy type control subscriber PPP_PM_SILVER
pppoe enable bba-group PPP-SILVER
encapsulation ambiguous dot1q 4017 second-dot1q 10-3499
!

New Member

Hello,

I'm trying to enable accounting packets on cisco side, and tried many things, like:

aaa accounting network default start-stop group radius

aaa accounting service default group radius

aaa accounting subscriber default group radius

aaa authorization subscriber default group radius

aaa authentication subscriber default group radius

aaa service-accounting extended

Resulting:

Interface:                TenGigE0/0/2/1.1555.pppoe370

Circuit ID:               Unknown

Remote ID:                Unknown

Type:                     PPPoE:PTA

IPv4 State:               Up, Thu Oct 27 06:42:57 2016

IPv4 Address:             x.x.x.x, VRF: default

Mac Address:              6a12.0b26.8e99

Account-Session Id:       04000172

Nas-Port:                 Unknown

User name:                logn

Formatted User name:      unknown

Client User name:         logn

Outer VLAN ID:            1555

Subscriber Label:         0x04000029

Created:                  Thu Oct 27 06:42:53 2016

State:                    Activated

Authentication:           authenticated

Authorization:            authorized

Access-interface:         TenGigE0/0/2/1.1555

Policy Executed:

policy-map type control subscriber politicaPPP

  event Session-Start match-all [at Thu Oct 27 06:42:53 2016]

    class type control subscriber classePPP do-until-failure [Succeeded]

      1 activate dynamic-template tempF [Succeeded]

  event Session-Activate match-first [at Thu Oct 27 06:42:57 2016]

    class type control subscriber classePPP do-until-failure [Succeeded]

      10 authenticate aaa list default [Succeeded]

      20 authorize aaa list default [Succeeded]

Session Accounting: disabled

Last COA request received: unavailable

[Last IPv6 down]

Monitoring from freeradius I can't see any accounting packets arriving.

RP/0/RSP0/CPU0:ios#show radius accounting

Thu Oct 27 06:55:22.616 UTC

Server: x.x.x.x, port: 1813/

    0 requests, 0 pending, 0 retransmits

    0 responses, 0 timeouts, 0 bad responses

    0 bad authenticators, 0 unknown types, 0 dropped

    0 ms latest rtt

    Throttled: 0 transactions, 0 timeout, 0 failures

    Estimated Throttled Accounting Transactions: 0

    Maximum Throttled Accounting Transactions: 0

    Automated TEST Stats:

        0 requests, 0 timeouts, 0 response, 0 pending

Can anyone give me the steps necessary to enable it?

Thanks

Cisco Employee

hi!

it seems like you have not enabled the accounting on the session via radius attributes in access-accept or on the dynamic template as per:

"Session Accounting: disabled"

you'd need to add this to the dynamic template:

dynamic-template
 type ppp TPL
  accounting aaa list default type session periodic-interval 60
 !

the periodic-interval 60 part (italic in the original post) is optional; add it if you'd like to see interim accounting records.
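The diagnosis above rests on two things visible in the pasted outputs: the session reports "Session Accounting: disabled", and the accounting counters for the radius server are all zero. A minimal Python sketch of that check follows; the function names are hypothetical, and the parsing assumes the field layout shown in this thread (it only reads the first request counter, so it is not meant for multi-server output).

```python
import re


def session_accounting_enabled(session_detail: str) -> bool:
    """Read the 'Session Accounting' field from session detail output."""
    for line in session_detail.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "session accounting":
            return value.strip().lower() != "disabled"
    raise ValueError("no 'Session Accounting' field in the output")


def first_request_counter(radius_output: str) -> int:
    """Return the first '<n> requests' counter from 'show radius accounting'."""
    match = re.search(r"^\s*(\d+)\s+requests,", radius_output, re.MULTILINE)
    if match is None:
        raise ValueError("no '<n> requests' counter in the output")
    return int(match.group(1))


# Lines copied from the outputs posted above.
SESSION = "State:                    Activated\nSession Accounting: disabled\n"
RADIUS = "Server: x.x.x.x, port: 1813/\n    0 requests, 0 pending, 0 retransmits\n"

if not session_accounting_enabled(SESSION) and first_request_counter(RADIUS) == 0:
    print("accounting is not enabled on the session; no requests were sent")
```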

cheers

xander

New Member

Hello,

Great it worked! Thank you very much.

However, I still have some questions:

1) On my other NAS, the CalledStationId column in radacct is inserted as:

eth1.209:00:26:b9:8e:12:19 (which means interface eth1.209 which is vlan 209:mac). I would like to set something similar on cisco.

2) I've tested as well a cisco av-pair to understand how it works: 

Cisco-AVPair = "ip:dns-servers=x.x.x.236", however, I'm a bit confused about the correct way to send rate values for the connection (download/upload). 

Thanks

Cisco Employee

awesome wilson, yeah you can do the CSID too but you need to define the format you like to use, something like this:

RP/0/RSP0/CPU0:A9K-BNG(config)#

aaa radius attribute calling-station-id format CSID

aaa attribute format CSID
 format-string length 253 "%s.%s:%s" physical-port inner-vlan-id client-mac-address-ietf
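To make the format string concrete, here is a small Python sketch of how "%s.%s:%s" expands once the three fields are filled in. It only mimics the string assembly; how the router derives each field is not modelled, and the helper names are made up for this example. The sample values come from the session shown earlier in the thread (port 1 of TenGigE0/0/2/1, VLAN 1555, MAC 6a12.0b26.8e99).

```python
def ietf_mac(cisco_mac: str) -> str:
    """Convert a Cisco dotted MAC (6a12.0b26.8e99) to IETF form (6a-12-0b-26-8e-99)."""
    digits = cisco_mac.replace(".", "").lower()
    if len(digits) != 12:
        raise ValueError("expected 12 hex digits, got %r" % cisco_mac)
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))


def expand_csid(physical_port: str, vlan_id: str, cisco_mac: str) -> str:
    """Fill the "%s.%s:%s" format string with port, VLAN id and IETF-style MAC."""
    return "%s.%s:%s" % (physical_port, vlan_id, ietf_mac(cisco_mac))


print(expand_csid("1", "1555", "6a12.0b26.8e99"))  # 1.1555:6a-12-0b-26-8e-99
```

The printed value has the same shape as the calling-station-id that shows up in the accounting records quoted further down.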

cheers

xander

New Member

Wonderful! But what about setting rates?

I was trying first to implement qos-policy-add-class using a static policy-map:

Example:

policy-map controlebanda

class class-default

  shape average 5 mbps

!

end-policy-map

!

But I'm not sure why it didn't work or if this is the correct path.

Cisco Employee

a debug qos-ma/ea will help identify what is not ok about the policy pushed.

if you don't need a parameterized or custom profile per user, or you have only a small set of services/rates, it is best to define them locally, but the policy needs to be hierarchical:

policy-map child

class class-default

queue-limit 5 packets

policy-map parentRATEX

class class-default

shape av <rate>

service-policy child

and then apply that on the dynamic template OR pass it via radius

ip:sub-qos-policy-out=parentRATEX

xander

New Member

I'm getting duplicated records. Example:

| 0400017e         | username  | 1.1555:6a-12-0b-26-8e-99 | 2016-10-27 13:49:22 | 0000-00-00 00:00:00 |

| 0400017e         | username  | 1.1555:6a-12-0b-26-8e-99 | 2016-10-27 13:49:22 | 0000-00-00 00:00:00 |

Do you see any reason for that?

New Member

Hello xander,

I'm still having trouble defining profiles per group of subscribers.

In my case, let's say I have 100 customers with 10Mbps download and 5 Mbps upload.

Some questions that arose:

1) Which is best for an ISP scenario, shaping or policing? It seems that for performance with many users policing will be better, am I right?

2) I would like to prioritize a few types of traffic, example: voip, P2P, http, ftp. So, how can I define those types?

Let me show what I'm exactly looking for:

To create a "plan" 10mbps/5mbps that will have QoS in something like:

256kbps minimum guaranteed for VOIP.

1Mbps minimum guaranteed for games traffic type (an example).

I need to pass those rates (10/5mbps) from radius and it will use a "default" profile created within cli with the QoS rules above.

I'm expecting to use:

+= ip:sub-qos-policy-out=bwcontrol(download=10240)

+= ip:sub-qos-policy-in=bwcontrol(upload=5098)

Can you help me with this? I'm sure it will be a great resource for many users.

Cisco Employee

for your requirement I would recommend a simple policer inbound, set to the desired rate. For the downstream, considering you need some per-traffic-class prioritization and bw guarantees, put a parent shaper in place and provide the shaping and prioritization for the specific services in your child classes.

cheers!

xander

New Member

Xander,

I tried this as a test:

dynamic-template

type ppp tempFOL

  ppp authentication pap

  ppp ipcp peer-address pool POOL

  service-policy input controlebanda

  service-policy output controlebandain

  accounting aaa list default type session periodic-interval 60

  ipv4 unnumbered TenGigE0/0/2/0

!

!

policy-map controlebanda

class class-default

  shape average 10 mbps

!

end-policy-map

!

policy-map controlebandain

class class-default

  police rate 5 mbps

  !

!

end-policy-map

!

When I connect, I  receive:

LC/0/0/CPU0:Nov  4 09:31:45.235 : pppoe_ma[389]: Session: TenGigE0/0/2/1.1555.pppoe464: Received Activate Config cb: 'qos-ea' detected the 'warning' condition 'Ingress queueing features is not supported on this line card'

LC/0/0/CPU0:Nov  4 09:31:45.235 : pppoe_ma[389]: Session: [ERROR] TenGigE0/0/2/1.1555.pppoe464: Activate config called back with error: 'qos-ea' detected the 'warning' condition 'Ingress queueing features is not supported on this line card'

What am I doing wrong?

Cisco Employee

ingress queuing is a waste of resources really, unless you are very itchy on MEF type stuff :)

also the a9k typhoon linecard doesn't support ingress queuing when the port to npu load is >30G.

egress queuing is not a problem.

so you need to flip around your service-policy input/output on the template:

  service-policy output controlebanda

  service-policy input controlebandain

cheers

xander

New Member

Sorry for that!

It worked. Then I tested it on speedtest and I'm getting these results:

9,48Mbps of download ( which I need to be near 10Mbps)

2,5Mbps of upload. (which I need to be near 5Mbps).

What do I need to work on to get better results? Maybe burst? How can I do that?

Cisco Employee

the client you use may not take L2 overhead and all that into consideration that the shaper does account for.

for your policer inbound you probably want to play with the bursts a bit.

check the asr9000 quality of service architecture document for some recommendations on setting up bursts with policers.
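The two points above can be put into rough numbers. The Python sketch below is back-of-the-envelope arithmetic only, under stated assumptions: a 1492-byte PPP MTU, 40 bytes of IP+TCP headers, and 30 bytes of per-frame L2 overhead counted by the shaper (PPPoE 6 + PPP 2 + Ethernet 14 + dot1q 4 + FCS 4). What the platform really counts may differ, and the 100 ms burst window is purely an illustrative value, not a recommendation taken from the QoS architecture document.

```python
def tcp_goodput_mbps(shaped_mbps: float, ppp_mtu: int = 1492,
                     ip_tcp_headers: int = 40, l2_overhead: int = 30) -> float:
    """Estimate what a speed test sees when the shaper counts L2 overhead.

    All default sizes are assumptions for a full-size TCP segment over PPPoE.
    """
    payload = ppp_mtu - ip_tcp_headers   # bytes of TCP payload per frame
    frame = ppp_mtu + l2_overhead        # bytes the shaper is assumed to count
    return shaped_mbps * payload / frame


def burst_bytes(rate_mbps: float, window_ms: float) -> int:
    """Bytes that arrive at rate_mbps during window_ms (a candidate burst size)."""
    return int(rate_mbps * 1_000_000 * window_ms / 1000 / 8)


print(round(tcp_goodput_mbps(10), 2))  # 9.54
print(burst_bytes(5, 100))             # 62500
```

Under these assumptions a 10 Mbps shaper yields roughly 9.5 Mbps of TCP payload, which is in the same range as the 9,48 Mbps reported above. The second function only converts a rate and a time window into bytes, as a starting point for experimenting with the policer burst.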

xander

New Member

Great. I'll look that.

The other problem I'm actually having is setting from radius:

ip:sub-qos-policy-out=controlebanda(rate=10240)

policy-map controlebanda

class class-default

  shape average $rate = 1024 kbps

!

end-policy-map

LC/0/0/CPU0:Nov  4 10:45:45.339 : qos_ma[314]: th: 1, qos_ma_imc_populate_caps: Caps add  failed: ifh: 0x4008080, policy: controlebanda(rate=20240), caps type: 1, caps num: 166: 'infra-app-obj' detected the 'resource not available' condition 'app-obj: Not enough memory'

Didn't get it.

Cisco Employee

yeah, the parameter passing doesn't work for that radius attribute. You can use parameterized qos instead, using the (add-class) directives in the VSAs to selectively build your pmap.

though it is recommended to define your service profiles in XR and reference the name in radius, this scales better from a resource perspective and call set up rate.

xander

New Member

What about this?

http://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k_r6-0/bng/configuration/guide/b-bng-cg60xasr9k/b-bng-cg60xasr9k_chapter_0110.html#concept_CFC96DAA1A784A818898D40BB0F07FB4

dynamic-template type service SERVICE-POLICY-OUT    

    service-policy output out-policy merge 10

policy-map out-policy

    class class-default

       shape average $shape-rate= 100000 Kbps

service-policy output-child

policy-map output-child

      class class-default

Cisco-avpair = "subscriber:sa=SERVICE-POLICY-OUT(shape-rate=1203000)"
Bronze

Hi,

I think that you wrote somewhere how to check the available resources on A9K, i.e. how many PPP sessions can be established. 

I forgot to write it down. Can you please tell me how to check this?

Cisco Employee

hey smail,

was that maybe subscriber manager statistics summary totals?

like this:

RP/0/RSP0/CPU0:A9K-BNG#show subscr man stat sum total

Mon Nov 21 10:17:47.720 EDT

[ IEDGE SUMMARY STATISTICS ]

Location: 0/RSP0/CPU0

IEDGE SUMMARY

=============

Control Policy errors

  Subscriber control policy not applied on interface = 0

  No class match in Start Request                    = 65

Attribute format warnings

  NAS Port                                           = 0

  NAS Port id                                        = 0

  Destination station id                             = 65

  Calling station id                                 = 65

  User Name                                          = 0

User Profile Statistics

  User Profile Install                               = 77

  User Profile Install errors                        = 0

  User Profile Removes                               = 76

  User Profile Errors                                = 0

Session Disconnect Flow Control

  Inflight                                           = 0

  Queued                                             = 0

Location: 0/0/CPU0

IEDGE SUMMARY

=============

Control Policy errors

  Subscriber control policy not applied on interface = 0

  No class match in Start Request                    = 0

Attribute format warnings

  NAS Port                                           = 0

  NAS Port id                                        = 0

  Destination station id                             = 0

  Calling station id                                 = 0

  User Name                                          = 0

User Profile Statistics

  User Profile Install                               = 0

  User Profile Install errors                        = 0

  User Profile Removes                               = 0

  User Profile Errors                                = 0

Session Disconnect Flow Control

  Inflight                                           = 0

  Queued                                             = 0

Location: 0/2/CPU0

IEDGE SUMMARY

=============

Control Policy errors

  Subscriber control policy not applied on interface = 0

  No class match in Start Request                    = 0

Attribute format warnings

  NAS Port                                           = 0

  NAS Port id                                        = 0

  Destination station id                             = 0

  Calling station id                                 = 0

  User Name                                          = 0

User Profile Statistics

  User Profile Install                               = 0

  User Profile Install errors                        = 0

  User Profile Removes                               = 0

  User Profile Errors                                = 0

Session Disconnect Flow Control

  Inflight                                           = 0

  Queued     

Bronze

Hmm I think that it's not this one. 

I need to check how much HW resources (memory, NP etc) I have on my BNG for max. number of sessions?

Customer has two MOD80 where only one bay is used. This means one NP per MOD80...32K sessions per NP.

Cisco Employee

hi Smail,

there is a command to check the number of allocated WFQ profiles ("show qoshal resource summary"), which is the common bottleneck.

hth,

/Aleksandar

Bronze

Thanks. I found it also in BRKSPG-2904. 

Can you please help me out to understand the output (it's attached)

We have around 26K PPP sessions.

Two MOD80 where only NP0 is in use (bay 1).

Under  QoS-EA I see Total 51813 for loc0/1 and for loc0/2 it's 25861.

Total number of session on BE10 (loc0/1) is: 12965

On BE20 (loc0/2) it's 13021

We are using RP based session but BE has only one member.

Plan is to acquire additional MPA's and SFP's. That is why I need your help so I can fully understand the limits on A9K.

Cisco Employee

The usual bottleneck in high scale deployments is the number of allocated L3 queues. We support 8k queues at Level 3.

  SUMMARY per NP: 
  =========================
   Policy Instances: Ingress 25913 Egress 25915  Total: 51828
   Entities: (L4 level: Queues)
     Level        Chunk 0           Chunk 1           Chunk 2           Chunk 3          
     L4        20(   20/   20)    4(    4/    4)    4(    4/    4)    4(    4/    4)
     L3(8Q)     5(    5/    5)    1(    1/    1)    1(    1/    1)    1(    1/    1)
     L3(16Q)    0(    0/    0)    0(    0/    0)    0(    0/    0)    0(    0/    0)
     L2         4(    4/    4)    1(    1/    1)    1(    1/    1)    1(    1/    1)
     L1         7(    7/    7)    0(    0/    0)    0(    0/    0)    0(    0/    0)

You can explicitly bind a subscriber access interface to a chunk of your choice:

interface Bundle-Ether8.2
 service-policy output svlan-parent subscriber-parent resource-id 1

where the "resource-id" is the chunk number.

New Member

Hello,

We tried to put cisco on production yesterday and have some issues, I will describe my scenario:

Licenses:

FeatureID: A9K-BNG-LIC-8K (Slot based, Permanent) 

  Total licenses 2

  Available for use         2

  Allocated to location     0

  Active                    0

  Store name             Permanent

  Store index               1

    Pool: Owner

      Total licenses in pool: 2

      Status: Available     2    Operational:    0

We enable it and started to receive connections, but we had two problems:

The connections didn't get over 8600 sessions (while debugging I couldn't see any explicit errors, but the sessions show up in connecting status). The licenses never move to allocated/active status.

The other issue I saw is accounting records being duplicated, I'm pasting my config below:

aaa accounting network default start-stop group radius

aaa group server radius RADIUSGROUP

server IP auth-port 1812 acct-port 1813

source-interface TenGigE0/0/2/0

!

dynamic-template

type ppp temp

  ppp authentication pap

  ppp ipcp dns x.x.x.x

  ppp ipcp peer-address pool poolIPs

  accounting aaa list default type session periodic-interval 60

  ipv4 unnumbered TenGigE0/0/2/0

!

type service REDIRECIONAMENTO

  service-policy type pbr polREDIRECIONAMENTO

!

!

interface TenGigE0/0/2/1.40

service-policy type control subscriber politicaPPP

pppoe enable bba-group provider

encapsulation dot1q 40

!

interface TenGigE0/0/2/1.41

service-policy type control subscriber politicaPPP

pppoe enable bba-group provider

encapsulation dot1q 41

!

interface TenGigE0/0/2/1.42

service-policy type control subscriber politicaPPP

pppoe enable bba-group provider

encapsulation dot1q 42

!

interface TenGigE0/0/2/1.43

service-policy type control subscriber politicaPPP

pppoe enable bba-group provider

encapsulation dot1q 43

!

interface TenGigE0/0/2/1.44

service-policy type control subscriber politicaPPP

pppoe enable bba-group provider

encapsulation dot1q 44

!

aaa attribute format CSID

format-string length 253 "%s.%s:%s" physical-port outer-vlan-id client-mac-address-ietf

!

aaa radius attribute calling-station-id format CSID

aaa accounting subscriber default group radius

aaa authorization subscriber default group radius

aaa authentication subscriber default group radius

pppoe bba-group provider

sessions mac limit 1

!

class-map type control subscriber match-any classePPP

match protocol ppp

end-class-map

!

!

policy-map type control subscriber politicaPPP

event session-start match-all

  class type control subscriber classePPP do-until-failure

   1 activate dynamic-template temp

  !

!

event session-activate match-first

  class type control subscriber classePPP do-until-failure

   10 authenticate aaa list default

   20 authorize aaa list default format FULL_AUTH password use-from-line

  !

!

end-policy-map

!

Cisco Employee

it's hard to say from 10000 feet. Sessions can fail to connect at many points. Best is to run "sh subscriber manager disconnect-history last" to see the reason for disconnect.

If subscribers fail to authenticate, a quick look into "sh radius" will also show you whether the server is reachable, whether it's the radius server that sends a disconnect,  whether there's a timeout reaching the radius, etc.

/Aleksandar

New Member

Alek, do you see any reason for duplicated acct records?

Do you see the licensing session limit as a possible cause, since I can't see the licenses in activated/allocated mode?