Below is the description of the problem.
I've tried deleting all sub-interfaces on the Juniper and re-adding them one at a time. Sometimes the CPU on the switches stays high (we start seeing VRRP MAC flaps, as if something goes into a loop) and other times it's fine, but what is consistent is the following:
We have a MGT VLAN, VLAN 2, which is an L3VPN on the Junipers. It is carried on the trunk ports between the 2 Cisco switches. When I allow all VLANs on the uplink trunk ports from the Cisco to the Juniper and bring up VLAN 2, I get packet loss and slow SSH response to the 2 switches on this MGT VLAN (CPU remains normal on the Cisco switches, i.e. around 11%). If I only allow VLAN 2 on the trunk/port channel up to LONIR2 (MX960), it's fine: I don't get packet loss or slow SSH, telnet, etc. response times. This only happens on the link between Cisco switch BRIIS2 (4948) and LONIR2 (MX960). The issue is currently occurring on 2 of our sites. The only common factors are that Virgin/NTL provide the tail circuit and that it goes through our LONIR2 router.
The packet loss and slow response occur when I'm coming from our MGT server, which resides behind a MGT VPN. That VPN overlaps with this MGT VPN to reach the Bristol site.
We currently have 16 sites with the same setup. Some have LACP enabled on the links (as some of the telco carriers don't provide link loss forwarding) and some do not. Of the two sites where this is occurring, one has LACP enabled and the other does not, so I don't believe this has anything to do with traffic across the LACP link.
When testing each hop along the way I don't get any packet loss or slow response; it only happens end-to-end.
We've now run packet captures on both ends of the link and noticed that no PVST+ information is coming across from the Bristol end.
Any suggestions are more than welcome.