Getting continuous B3-TCA alarms on SONET links


Dear team,

I am getting continuous alarms on my SONET links from my 7606 router.

Jan  2 13:04:23.637: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared

Jan  2 13:04:33.645: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared

Jan  2 13:16:22.637: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared

Jan  2 13:16:32.637: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared

Jan  2 14:22:26.677: %SONET-4-ALARM:  POS3/3/0: B3-TCA declared

Jan  2 14:22:26.677: %SONET-4-ALARM:  POS3/3/1: B3-TCA declared

Jan  2 14:22:36.681: %SONET-4-ALARM:  POS3/3/0: B3-TCA cleared

Jan  2 14:22:36.681: %SONET-4-ALARM:  POS3/3/1: B3-TCA cleared

Jan  2 20:41:49.779: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared

Jan  2 20:41:59.799: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared

Jan  2 20:47:11.305: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared

Jan  2 20:47:24.325: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared

Jan  2 21:05:22.831: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared

Jan  2 21:05:32.843: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared

Jan  2 21:31:56.729: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared

Does anybody have any clarification on this?

Thanks in advance

Mahesh

Giuseppe Larosa Mon, 01/04/2010 - 02:45

Hello Mahesh,

I see that multiple links are involved, which calls for a different approach: if it were a single link, I would say contact the service provider.

What is your hardware (SIP xx with SPA), and which IOS version are you running on the 7606?

And how are the POS ports configured?

About the error: an error threshold is reached and then cleared. POS interfaces have multiple bit error rate checks running on their framing structure.

Edit:

The error message decoder doesn't provide more information:

1. %L2-SONET-4-ALARM   [chars]: [chars] [chars] The specified SONET Alarm has been declared or released.

Recommended Action: Recommended action is to repair the source of the alarm.

Related documents- No specific documents apply to this error message.

The datasheet provides some info:

https://www.cisco.com/en/US/prod/collateral/modules/ps2831/ps4373/product_data_sheet09186a008009223e_ps368_Products_Data_Sheet.html

Path:
Path Alarm Indication Signal (PAIS)
Path Remote Defect Indication (PRDI)
Path Remote Error Indication (PREI)
Error Counts for B3
>>>>>> Threshold Crossing Alarms (TCA) for B3
Loss of Pointer (LOP)
New Pointer Events (NEWPTR)
Positive Stuffing Event (PSE)
Negative Stuffing Event (NSE)
Path Unequipped Indication Signal (PUNEQ)
Path Payload Mismatch (PPLM) Indication Signal

Your error is at the SONET/SDH path level, so it may be wise to contact the service provider.

Hope to help

Giuseppe

marikakis Tue, 01/05/2010 - 08:17

Hi Mahesh,

I once again agree with Giuseppe. Path level alarm practically means 'somewhere away' from your router's immediate cable connection to the provider network. It doesn't take a lot of time for the alarms to clear (usually 10 seconds). If you aren't getting any other close alarms, it is better to contact your circuit provider.

Do your interfaces get into up/down or down/down state? Down/down means check your cables or your router (however, it's not very likely that all your cables suddenly became faulty). Up/down means your cables are most probably fine, and in this case I wouldn't recommend removing your cables or doing shut/no shuts and things like that. This will only confuse the issue and your provider about where the problem lies, and you might get a delayed resolution.

Sometimes, an original problem in the provider network (resulting in some difference in power levels received on your side, for example) might cause your device to subsequently get somewhat crazy, and in this case you might also need to do something on your side for things to get back to normal, but that's not a very likely possibility. The most likely cause for such issues is a problem in the provider network that causes many circuits to be re-routed in a short period of time, or some clocking issue, or something similar. The good thing about this case is that your circuit provider most likely already knows about the issue, since many customers will be calling.

Kind Regards,

Maria

Edit: By the way, if you do monitor the RTT of your links, you might see the re-routing in action by observing differences in the RTT of your circuits after each alarm. This is because the paths within a circuit provider's network don't generally have the same delay.

viyuan700 Tue, 01/05/2010 - 11:57

"Path level alarm practically means 'somewhere away' from your router's immediate cable connection to the provider network."

"1. %L2-SONET-4-ALARM   [chars]: [chars] [chars] The specified SONET Alarm has been declared or released.

Recommended Action: Recommended action is to repair the source of the alarm."

Hi Maria,

A path level alarm does not mean that the problem is away from the router's immediate cable. Giuseppe's message recommends repairing the source of the alarm. The source of the alarm here is the POS card in the router. There are 3 BIPs in SONET/SDH.

B1 checks the section and is used by service provider equipment (between 2 immediately adjacent pieces of equipment, which can be a MUX or a regenerator).

B2 checks the line and is used by service provider equipment (between 2 immediately adjacent MUXes only).

B3 checks the path, meaning this value is set by the router's card and checked by the router card at the other end, and these bits are not accessed by service provider equipment.

I am not trying to say that it cannot be changed in the service provider network, only that path level does not mean what you explained. This is my understanding of SONET; I can be wrong.

The service provider is not always the culprit, though most of the time they are (you can check messages here on this forum where it is said that service provider techs/engineers don't know things, etc.). I once wasted 3-4 days proving to a customer that the ATM card in his router was not working, whereas he was suspecting the service provider equipment.
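To make the parity idea concrete, here is a minimal Python sketch of a BIP-8 (bit-interleaved parity) check, the kind of calculation behind the B1 and B3 bytes. It assumes even parity per bit position, which reduces to XOR-ing all covered bytes together. The function names `bip8` and `b3_errors` and the four-byte payload are made up for illustration; a real path terminating card computes B3 over the whole payload envelope of the previous frame, in hardware.

```python
from functools import reduce


def bip8(payload: bytes) -> int:
    """Even-parity BIP-8: bit i of the result is the XOR of bit i of every byte."""
    return reduce(lambda acc, b: acc ^ b, payload, 0)


def b3_errors(received_b3: int, payload: bytes) -> int:
    """Count bit positions (0 to 8) where the received B3 disagrees
    with the parity recomputed over the received payload."""
    return bin(received_b3 ^ bip8(payload)).count("1")


# Hypothetical payload; a real SPE is far larger.
payload = bytes([0x12, 0x34, 0x56, 0x78])
sent_b3 = bip8(payload)

# Same payload with a single bit flipped in transit (0x56 -> 0x57).
corrupted = bytes([0x12, 0x34, 0x57, 0x78])

print(b3_errors(sent_b3, payload))    # clean frame
print(b3_errors(sent_b3, corrupted))  # frame with one flipped bit
```

Each mismatching parity bit counts as one B3 error. A B3-TCA is declared when the rate of such errors crosses the threshold configured on the receiving interface, so the alarm tells you the far-end transmitter's parity and the received payload disagreed somewhere along the path, not where along the path it happened.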

marikakis Tue, 01/05/2010 - 12:58

Hi,

I might not have been very technically accurate with the expression I used in quotes. However, I also talked about the possibility of other 'closer' alarms. By that I meant a Loss Of Signal for example. I do not have any prejudice about circuit provider engineers. It's only that from my experience there have been only a few cases with router issues, and many more cases with circuit provider issues. I keep all options open. I have used the words 'likely' and 'possibly' many times. The reason for this is the following: When you have a problem with a link, you might be on the phone with many people (local operators, remote router operators, and maybe more than one circuit provider involved in the end-to-end circuit). Such situations tend to get messy not because someone is stupid, but because when too many people are involved, it gets harder to understand what's going on. So, you try to work with possibilities because you have to in order to speed things up. Of course the first question should be to the local operators: did you guys do anything or should I start calling outside?

Back to this case: We do not know the remote termination points of those 3 circuits that have the alarms. If they are on the same router, then the remote router should be checked as well, as you said. If they terminate on different routers, then it seems 'unlikely' to me (but not impossible) for all 3 remote routers to simultaneously have problems. The local router should be checked just in case, but the duration of the errors, the fact that 2 of the links reported errors simultaneously on one occasion, and the fact that not all alarms happened close in time make me suspect either a circuit provider issue, or even a provider maintenance procedure (if we also take into account the day the alarms occurred, Saturday, Jan 2), or a circuit provider issue followed by a maintenance procedure to put things back to work. All those are just guesses of course, and by no means have I found the specific provider equipment at fault.
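The duration and timing observations above can be checked mechanically. The following is a rough Python sketch that pairs each "declared" message with the next "cleared" message on the same port and reports how long the alarm lasted. The sample lines are copied from the original post; the regular expression and the helper name `alarm_durations` are assumptions made for this example, not part of any Cisco tooling.

```python
import re
from datetime import datetime

# A few of the syslog lines from the original post.
LOG = """\
Jan  2 13:04:23.637: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared
Jan  2 13:04:33.645: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared
Jan  2 14:22:26.677: %SONET-4-ALARM:  POS3/3/0: B3-TCA declared
Jan  2 14:22:26.677: %SONET-4-ALARM:  POS3/3/1: B3-TCA declared
Jan  2 14:22:36.681: %SONET-4-ALARM:  POS3/3/0: B3-TCA cleared
Jan  2 14:22:36.681: %SONET-4-ALARM:  POS3/3/1: B3-TCA cleared
Jan  2 20:47:11.305: %SONET-4-ALARM:  POS3/2/0: B3-TCA declared
Jan  2 20:47:24.325: %SONET-4-ALARM:  POS3/2/0: B3-TCA cleared
"""

# Assumed line layout: "<Mon> <day> <hh:mm:ss.mmm>: %SONET-4-ALARM: <port>: B3-TCA <state>"
PATTERN = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d\.\d{3}): %SONET-4-ALARM:\s+(\S+): "
    r"B3-TCA (declared|cleared)$"
)


def alarm_durations(log_text):
    """Return (port, declared_at, seconds) for each declared/cleared pair."""
    open_alarms = {}  # port -> time the alarm was declared
    results = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if not match:
            continue
        stamp, port, state = match.groups()
        # The log carries no year, so strptime falls back to 1900; only differences matter here.
        when = datetime.strptime(" ".join(stamp.split()), "%b %d %H:%M:%S.%f")
        if state == "declared":
            open_alarms[port] = when
        elif port in open_alarms:
            started = open_alarms.pop(port)
            results.append((port, started, round((when - started).total_seconds(), 3)))
    return results


for port, started, seconds in alarm_durations(LOG):
    print(f"{port} declared at {started:%H:%M:%S} lasted {seconds} s")
```

Run against the full log, a listing like this makes it easy to see which ports alarm at the same instant and whether the alarms cluster around particular times, which is the kind of evidence a circuit provider will ask for.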

Kind Regards,

Maria

viyuan700 Tue, 01/05/2010 - 14:14

Hi,

I agree 110% that most of the time it is the SP who has problems, as there is a lot of equipment involved. Well, you have no prejudice against SPs, but there are a few here who have.

There are only B3 alarms and no other alarms; that's why I was thinking of also checking the cables at both ends. The service provider option had already been mentioned.

marikakis Tue, 01/05/2010 - 14:29

I have seen cases where prejudices cause issues to persist because someone refuses to check their router, in a similar way to what you mentioned, or considers the router an error-free machine. I have also seen a problem in the router right after a problem in the circuit provider's network was resolved (had to remove cables and re-insert them for things to go back to normal). Prejudices do not help to get issues resolved, so do not worry about what others say, and let's get this issue resolved!

vmiller Tue, 01/05/2010 - 12:19

Let's get a show controllers pos 3/2/0 and see what we see there.

Also, the show int POS 3/2/0.
