ISE behind load balancer

bert.lefevre
Level 1

I have a question regarding ISE profiling servers that are placed behind a load balancer:

If you have an ISE environment where both computers and users are authenticated, and Machine Access Restriction (MAR) is enabled (so users can only authenticate on a previously authenticated machine), are the ISE servers aware of all successful computer authentications handled by the other ISE servers?

For example:

There are 2 ISE appliances (ISE01 and ISE02) behind a load balancer.

A user starts up his computer, and computer authentication is handled by ISE01 (and the authentication is successful). When the user then logs in on that computer, the load balancer chooses ISE02 to authenticate the user.

Will ISE02 be aware that the corresponding computer was already successfully authenticated on ISE01, so that the user is able to log in? Or will it deny the user authentication because it thinks the computer is not (yet) authenticated and Machine Access Restriction is enabled?

Kind regards,

Bert

1 Accepted Solution

Accepted Solutions

Nicolas Darchis
Cisco Employee

"Are the ISE servers aware of all successful computer authentications handled by the other ISE servers?"

=> No.

They are independent servers that just replicate their configuration.

So a user should always authenticate with the same ISE.

Moreover, a load balancer kills profiling, since profiling requires you to span some traffic to an ISE node.

View solution in original post

28 Replies


Nicolas,

Thanks a lot for this explanation. At least now I'm warned that we shouldn't place them behind a load balancer (although load balancing ISE policy servers is mentioned in the Cisco ISE user guide under section 9, "Setting up Cisco ISE in a Distributed Environment").

Actually, I don't understand why Cisco didn't implement synchronization of the machine cache for MAR (which is in fact just a cache of the MAC addresses of authenticated computers) between ISE servers in the same node group. Synchronizing a table of MAC addresses isn't a big challenge, I assume? Or is there another reason this wasn't implemented?

Implementing this synchronization would be a big improvement if you ask me, as it adds extra redundancy in case one ISE server fails and users try to log on to machines that were already authenticated against the failed ISE.
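For illustration only, the machine cache described above can be pictured as a per-node table of MAC addresses with timestamps, and "synchronization" as a merge of those tables. A toy Python sketch (the class and method names are invented for this example; real MAR replication must also handle deletions, clock skew, and update volume, which is part of why it is not trivial):

```python
import time

class MarCache:
    """Toy MAR cache: maps a machine's MAC address to the time its
    machine authentication succeeded. Entries expire after `ttl`
    seconds (ISE's MAR aging time plays a similar role)."""
    def __init__(self, ttl=28800):
        self.ttl = ttl
        self.entries = {}  # MAC -> timestamp of last machine auth

    def record_machine_auth(self, mac, now=None):
        self.entries[mac] = now if now is not None else time.time()

    def machine_authenticated(self, mac, now=None):
        now = now if now is not None else time.time()
        ts = self.entries.get(mac)
        return ts is not None and now - ts < self.ttl

    def merge_from(self, other):
        """Naive sync: keep the newest timestamp for each MAC."""
        for mac, ts in other.entries.items():
            if ts > self.entries.get(mac, 0.0):
                self.entries[mac] = ts

# Machine auth lands on ISE01, but the user auth is sent to ISE02:
ise01, ise02 = MarCache(), MarCache()
ise01.record_machine_auth("00:11:22:33:44:55", now=1000.0)
print(ise02.machine_authenticated("00:11:22:33:44:55", now=1001.0))  # False: no sync
ise02.merge_from(ise01)
print(ise02.machine_authenticated("00:11:22:33:44:55", now=1001.0))  # True after merge
```

Without the merge step, ISE02 denies the user exactly as described in the original question.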

Kind regards

I would guess it's probably a bit more complex than MAC table synchronization; the real-time synchronization in particular could take a lot of bandwidth/CPU, I imagine. But yes, it would make sense as a feature request. I think the feature request list for ISE must be 1 km long by now :-)

I'll check with the developers if it's already on their roadmap or not.

After checking: MAR cache synchronization will be present in ACS 5.4.

Logically it should also be included in a future ISE release, but I have no further details.

Nicolas,

I'm glad to hear that the MAR cache sync is already in development for ACS and I hope it will be soon implemented in ISE also. I'll keep an eye on new release notes.

Thanks a lot!


>> They are independent servers that just replicate their configuration.

So a user should always authenticate with the same ISE.

Moreover, a load balancer kills profiling, since profiling requires you to span some traffic to an ISE. <<

Not entirely correct. Policy Service nodes are most certainly supported behind a load balancer, which is the intention of a node group; this is often the preferred method for high availability and scaling. In addition to supporting load distribution of RADIUS and other requests, members of a node group maintain a heartbeat to determine whether a peer member has failed. If so, the Monitoring node is queried for any transient sessions that may require clean-up via RADIUS CoA, to help ensure that an endpoint is not left in a defunct auth state. LB functionality will depend on the load balancer used: Cisco ACE, for example, supports stickiness of RADIUS transactions based on source IP, Calling-Station-ID, or Framed-IP-Address.
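Conceptually, the stickiness described above is just a deterministic mapping from Calling-Station-ID (the endpoint MAC) to a Policy Service node, so machine auth and user auth from the same endpoint always land on the same node. A minimal Python sketch of the idea (node names are hypothetical, and real load balancers implement this in their persistence configuration, not in code):

```python
import hashlib

# Hypothetical Policy Service nodes behind the virtual IP.
PSNS = ["ise01.example.com", "ise02.example.com"]

def pick_psn(calling_station_id, psns=PSNS):
    """Deterministic 'stickiness': hash the RADIUS Calling-Station-ID
    so every request for one endpoint maps to the same PSN."""
    digest = hashlib.sha256(calling_station_id.encode()).digest()
    return psns[int.from_bytes(digest[:4], "big") % len(psns)]

mac = "00-11-22-33-44-55"
# Machine auth and the later user auth resolve to the same node:
assert pick_psn(mac) == pick_psn(mac)
```

With per-endpoint persistence like this, the MAR problem from the original question does not arise, because both authentications hit the same MAR cache.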

The impact of LB on profiling or other Policy Service node functions depends on the service/probe in question. For services like client provisioning, posture, and central web auth, HTTPS redirection always occurs back to the node that terminated the RADIUS session, so LB is transparent provided direct access is permitted to the real IP for redirected HTTPS transactions (RADIUS transactions would be sent to the virtual IP).

Specific to profiling, SNMP queries can be triggered and will be sent by the Policy Service node that received the RADIUS Accounting Start packet (assuming the RADIUS probe is enabled) or the SNMP trap (assuming the SNMP Trap probe is enabled). SPAN is only one data collection method, used primarily for HTTP or DHCP capture. Methods other than SPAN/RSPAN are available to capture this data, but if SPAN is used, then it is correct that there is no specific mechanism to move SPANs from one interface to another in case of NIC or node failure. I believe intelligent taps are available that can accomplish this, or else traffic can be mirrored to multiple nodes at the cost of duplicating profile data.

As noted, replication of MAR cache will be added to ACS 5.4, and no, this feature is not altogether trivial due to the number of transactions and updates that must be replicated and kept in sync across each node performing RADIUS services. 

/CH

Hi,

Thanks, this is useful information.

In the documentation, it mentions that when using a node group the NAS should have all of the ISE nodes configured under AAA to allow CoA. Would it be possible to use the VIP address and NAT the ISE nodes when they initiate an outbound connection from behind the ACE for CoA, or is RADIUS a bit deeper than that?

Would you configure a node group for a pair of Policy Service nodes at a remote site that were not load balanced? What makes this specific to Policy Service nodes behind a LB? Assume both Policy Service nodes were configured in all NASs at that particular site.

I assume that when profiling is carried out, all data is replicated to the admin node anyway (using DHCP helper, DNS, SNMP). When you start to look at a distributed ISE architecture with profiling, it gets messy: potentially a lot of helper and SNMP addresses have to be configured on the NASs.

Thanks.

Hi  Craig,

 

We are in the process of migrating our ISE infrastructure from ACE to F5.

We followed your document for the configuration.

 

All looks OK except EAP-TLS authentication (PEAP user and computer authentication work fine).

In the document there is nothing special mentioned that needs to be done for TLS.

 

I think it may be related to fragmentation, but I'm not sure.

I can also add that if we point the NADs to the PSNs directly, it works.

The problem only occurs when we use the VIP.

(PEAP works with the VIP as well.)

 

Do you know if something special needs to be done for TLS to work?

Any information or hint is appreciated.

 

Thanks,

Laszlo

 

It is not uncommon to see RADIUS load-balancing issues with EAP-TLS related to fragmentation. The typical cases are either 1) failure of the load balancer to reassemble large RADIUS packets (for example, TLS with larger key sizes), or 2) dropping of fragments by the load balancer because they are deemed too small. For the first case, both Cisco ACE and F5 LTM should accommodate automatic reassembly when using the standard LB mechanism for RADIUS. LTM does not reassemble FastL4 by default, but that profile is normally not used, and the guide does not use it for RADIUS. If fragments are too small, on both ACE and LTM you would need to change the default minimum fragment size to accept the exceptionally small fragment for reassembly. This can serve as a workaround, but I recommend finding and eliminating the device causing RADIUS packets to be fragmented below a reasonable size.
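As a rough illustration of why EAP-TLS in particular trips over this: a large RADIUS/UDP datagram carrying a TLS certificate chain is fragmented at the IP layer, and the trailing fragment can be very small. A quick Python sketch of the fragment arithmetic (illustrative only; sizes assume IPv4 with a 20-byte header and a 1500-byte MTU, ignoring IP options):

```python
def udp_fragments(payload_len, mtu=1500):
    """Split an IPv4/UDP datagram into per-fragment data sizes.
    The first fragment carries the 8-byte UDP header inside its data,
    and every fragment except the last must carry a multiple of 8
    data bytes."""
    total = 8 + payload_len            # UDP header + payload
    per_frag = (mtu - 20) // 8 * 8     # usable bytes per fragment, 8-byte aligned
    sizes = []
    while total > 0:
        take = min(per_frag, total)
        sizes.append(take)
        total -= take
    return sizes

# A ~3000-byte RADIUS packet carrying a large TLS certificate chain:
print(udp_fragments(3000))  # → [1480, 1480, 48]
```

With a ~3000-byte payload the last fragment carries only 48 bytes, which is exactly the kind of exceptionally small fragment that a minimum-fragment-size check will drop.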

Another common issue in load balancing is failure to understand the exact path taken by the entire flow to and from the real servers. Often ingress packets take one path but responses take another; this asymmetry frequently results in packet drops by the load balancer or another device in the path.

/CH

Hi laposilaszlo,

Did you end up resolving the issue with EAP-TLS?

I am using an F5 to load balance RADIUS and am having the same issue.

I am not sure I want to alter the fragment size as a workaround.

Regards,

Raj

Hi Raj,

Yes we solved it.

In our case it was the Nexus switch: it has a security feature that discards small UDP packets,

and the last part of the certificate exchange was a small UDP fragment that got discarded.

So we disabled this check and all is OK now.

After that we had a problem on the F5 regarding UDP fragments, which was solved by an F5 upgrade.

This was a long time ago, so the fix should be in current releases.

laszlo

Hi Laszlo,

Thanks for the quick reply!

We run a Nexus core as well. Could you please tell me how to check/disable the feature?

We are running ISE 2.2 and the issue still seems to persist.

We are running version 11.5.3 on the F5.

Regards,

Raj

MTN-GDC-AGG-N7018A-1# show hardware forwarding ip verify module 3

 

IPv4 and v6 IDS Checks         Status     Packets Failed 

-----------------------------+---------+------------------

address source broadcast       Enabled    0              

address source multicast       Enabled    0              

address destination zero       Enabled    0              

address identical              Enabled    134            

address reserved               Enabled    2334940        

address class-e                Disabled   --

checksum                       Enabled    0              

protocol                       Enabled    0              

fragment                       Enabled    34254          

length minimum                 Enabled    0              

length consistent              Enabled    0              

length maximum max-frag        Enabled    0              

length maximum udp             Disabled   --

length maximum max-tcp         Enabled    0              

tcp flags                      Disabled   --

tcp tiny-frag                  Enabled    176552         

version                        Enabled    0              

-----------------------------+---------+------------------

IPv6 IDS Checks                Status     Packets Failed 

-----------------------------+---------+------------------

length consistent              Enabled    0              

length maximum max-frag        Enabled    0              

length maximum udp             Disabled   --

length maximum max-tcp         Enabled    0              

tcp tiny-frag                  Enabled    0              

version                        Enabled    0              

 

Workaround:

 

Disable the packet length check using the commands:

config t
no hardware ip verify length
