Solved: Re: Setting up Call Home help

sgmorale1 · ‎12-12-2013

Hi,

So I'm trying to set up email notifications for one of our UCS setups and I'm running into some issues, or rather, I'm not sure what I'm running into.

Let me paint a picture and see if anyone can help me figure out what is wrong.

I am under the assumption that Call Home can be set up internally, without the need to contact Cisco, for the purpose of receiving email alerts whenever there are any issues in the system (the Cisco-assisted one being Smart Call Home, which I'm not interested in at the moment). Is this a correct assumption?

The requirements for setting up Call Home ask for IP connectivity between the fabric interconnects and the SMTP server. Currently the fabrics and the mail server are on different VLANs; however, we have access lists to allow SMTP traffic between those VLANs. In addition, there's precedent of other servers on the same VLAN as the fabrics contacting the SMTP server to send out notifications, so it is confirmed that traffic of that nature flows between those two VLANs.

So on to the configuration.

On the General tab for Call Home, my email is in the Email field for Contact Information as well as in the From and Reply To fields. The SMTP server IP and port 25 are also specified in the corresponding section.

On the Profiles tab there are the 3 default profiles (CiscoTAC-1, full_txt, short_txt) plus an additional profile created by me. Alert groups in the extra profile are Cisco TAC, Diagnostic, and Environmental. No other options are available. It also has "Warning" as its level, Full Txt format, and my email as recipient.

In the Policies tab I enabled equipment-degraded, equipment-inoperable, equipment-problem, and link-down.

So, knowing these settings, should I expect email notifications when a disk goes down or, in the case of link-down, when a host restarts (the reason I added that one in, to test)? Or am I missing something? Is there any other info that I could provide to help explain my predicament?

Thanks in advance,

Stephen

--
Stephen Garcia
Ringling College of Art and Design

Keny Perez · ‎12-13-2013

Stephen,

Yes, Call Home is just internal, while Smart Call Home is the one that opens a case with TAC, and you need to pay for that feature. So you are right with your first assumption.

Regarding the connection between the Fabrics and the SMTP server, run a ping from the FIs and check connectivity; if it works you should be fine. The commands should be:

UCS# connect local-mgmt

B(local-mgmt)# ping x.x.x.x <<< If you do not specify the number of pings to send with "count", the ping will run continuously (equivalent to -t)
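Keep in mind that a successful ping only shows that ICMP gets through; Call Home needs a TCP session to the SMTP port. As a complementary check, here is a minimal Python sketch that tries to open a TCP connection to port 25. It is meant to be run from a host on the same VLAN as the FIs (not on the FI itself), and the function name smtp_port_reachable is made up for this illustration.

```python
import socket


def smtp_port_reachable(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout.

    Hypothetical helper for illustration; it only checks that the TCP
    handshake succeeds, it does not speak SMTP.
    """
    try:
        # create_connection performs the full SYN / SYN-ACK / ACK handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks
        return False
```

Calling smtp_port_reachable("x.x.x.x") with the SMTP server address should return True if the path and the access lists allow the connection. A False here while ping works would point at filtering or routing of TCP traffic rather than at the Call Home configuration.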

The Call Home settings look totally fine to me, but as far as I know (IMHO) inoperable disks are NOT reported by Call Home, while a link down is definitely reported.

Just remember that the Policies not only have to be created, they need to be enabled too; and also remember that the system will only send messages classified according to the level you specified (Warning) in the profile you created.

You can also use this document for further troubleshooting, like sending a "System Inventory" and checking that it went through.

http://www.cisco.com/en/US/products/ps10280/products_tech_note09186a0080bef123.shtml#topic1 <<< See "Troubleshooting methodology".

I hope this helps and if it did, please rate it.

-Kenny

Cisco Support Community is also present in Spanish:

https://supportforums.cisco.com/community/spanish/data_center

sgmorale1 · ‎12-13-2013

Kenny,

Thanks for the tips. I managed to get the debug information for the inventory check test as well as packet captures, and I'm getting some timeout messages. I also checked on our mail appliance, and the entire subnet where the FIs are is listed both as a trusted relay and as allowed to send emails, for good measure.

However, after the FI tries to send the message, this is what I get back (note that allAllerts is my added profile and that I raised the level for CiscoTAC-1 so it wouldn't try to send to that one):

...

...(tail of output)

...

2013 Dec 13 11:21:57.901619 callhome: Destination profile allAllerts

2013 Dec 13 11:21:57.901875 callhome: IN send_callhome_email_mesg with msg_id :1376941062

2013 Dec 13 11:21:57.902153 callhome: executing command:mv /isan/etc/callhome.d/workspace//workspaceyqECY9 /callhome/spool/J47pAb

2013 Dec 13 11:21:57.931028 callhome: In function schedule_procjob for msg_id: 1376941062, alert: SAM_ALERT_INVENTORY_NORMAL

2013 Dec 13 11:21:57.931274 callhome: calling procjob_fork for msg_id: 1376941062, alert: SAM_ALERT_INVENTORY_NORMAL

2013 Dec 13 11:21:57.938574 callhome: Callhome handle 21512 added to procjob list

2013 Dec 13 11:21:57.938806 callhome: retcode[0] 23 : retcode[1] 0

2013 Dec 13 11:21:57.939024 callhome: EXIT transport_callhome_mesg

2013 Dec 13 11:21:57.939242 callhome: format_and_send successfull for transport method 1 , msg_id 1376941062, destination_profile allAllerts

2013 Dec 13 11:21:57.939474 callhome: At least one message sent successfully

2013 Dec 13 11:21:57.939716 callhome: Exiting send_callhome_mesg

2013 Dec 13 11:21:57.940146 callhome: Successfully formatted and sent the message for alert : SAM_ALERT_INVENTORY_NORMAL, message id : 1376941062

2013 Dec 13 11:21:57.940373 callhome: Sending the message for alert : SAM_ALERT_INVENTORY_NORMAL, message id : 1376941062

2013 Dec 13 11:21:57.940654 callhome: alert group ALL has been configured for dest profile full_txt

2013 Dec 13 11:21:57.941666 callhome: IN dispatch_callhome_mesg with msg_id 1376941062

2013 Dec 13 11:21:57.941978 callhome: executing command:/callhome/spool/J47pAb/script

2013 Dec 13 11:21:57.948214 callhome: alert group ALL has been configured for dest profile full_txt

2013 Dec 13 11:21:57.948541 callhome: alert group ALL has been configured for dest profile full_txt

2013 Dec 13 11:21:57.948860 callhome: alert group ALL has been configured for dest profile full_txt

2013 Dec 13 11:21:57.949182 callhome: mts_dest_profile_conf: Got the destination profile full_txt

2013 Dec 13 11:21:57.949503 callhome: mts_dest_profile_conf: Got the destination profile short_txt

2013 Dec 13 11:21:57.949824 callhome: mts_dest_profile_conf: Got the destination profile CiscoTAC-1

2013 Dec 13 11:21:57.950155 callhome: mts_dest_profile_conf: Got the destination profile allAllerts

wireshark-broadcom-rcpu-dissector: ethertype=0xde08, devicetype=0x0

2013-12-13 11:21:58.011991 {masked.FI.IP} -> {masked.smtp.server.IP} TCP 59882 > smtp [SYN] Seq=0 Len=0 MSS=1460 TSV=623763289 TSER=0 WS=9

2013-12-13 11:22:01.010655 {masked.FI.IP} -> {masked.smtp.server.IP} TCP 59882 > smtp [SYN] Seq=0 Len=0 MSS=1460 TSV=623763589 TSER=0 WS=9

2013-12-13 11:22:07.010139 {masked.FI.IP} -> {masked.smtp.server.IP} TCP 59882 > smtp [SYN] Seq=0 Len=0 MSS=1460 TSV=623764189 TSER=0 WS=9

2013-12-13 11:22:19.009151 {masked.FI.IP} -> {masked.smtp.server.IP} TCP 59882 > smtp [SYN] Seq=0 Len=0 MSS=1460 TSV=623765389 TSER=0 WS=9

2013-12-13 11:22:43.007164 {masked.FI.IP} -> {masked.smtp.server.IP} TCP 59882 > smtp [SYN] Seq=0 Len=0 MSS=1460 TSV=623767789 TSER=0 WS=9

2013 Dec 13 11:22:43.038224 callhome: timeout 13 11:21:58 2013> <4> <4> Command line arguments (14): <4> ^I0: smtpclient <4> ^I1: -f <4> ^I2: {masked@email} <4> ^I3: -r <

2013 Dec 13 11:22:43.038532 callhome: smtp failure

2013 Dec 13 11:22:43.039716 callhome: In cleanup_spool:

2013 Dec 13 11:22:43.039943 callhome: executing command:rm -rf /callhome/spool/J47pAb

2013 Dec 13 11:22:43.043505 callhome: Exiting dispatch_callhome_mesg ret_val

2013 Dec 13 11:22:43.046040 callhome: procjobcb_job_done: Handle 21512 syserr "SUCCESS" ret_info_size 2120 ret_syserr 0 ret_info b7ee0820

2013 Dec 13 11:22:43.046275 callhome: procjob callback called for alert : SAM_ALERT_INVENTORY_NORMAL, message id : 108270162, with procjob handle 21512

2013 Dec 13 11:22:43.046495 callhome: Found procjob handle 21512 with msg_id 1376941062

2013 Dec 13 11:22:43.046903 callhome: IN CALLHOME_SEND_EMAIL

2013 Dec 13 11:22:43.047127 callhome: problem in transporting the message Error in transporting email message for allAllerts SMTPclient: sockfd opened...:4

2013 Dec 13 11:22:43.047248 callhome: Unable to send callhome message for alert : SAM_ALERT_INVENTORY_NORMAL, message id : 1376941062

2013 Dec 13 11:22:43.047261 callhome: The specified message level for destination profile: full_txt is higher than the level for alert SAM_ALERT_INVENTORY_NORMAL(1) The specified message level for destination profile: short_txt is higher than the level for alert SAM_ALERT_INVENTORY_NORMAL(1) The specified me

2013 Dec 13 11:22:43.047416 callhome: mts response sent for alert : SAM_ALERT_INVENTORY_NORMAL, message id : 1376941062

2013 Dec 13 11:22:43.047436 callhome: setting error code for last callhome message to 1077608465

2013 Dec 13 11:22:43.047537 callhome: No of elements in the process list is 0

UCS2-6296UP-A(nxos)# 5 packets captured

I'm not sure where it's getting stuck. This is with all TCP traffic allowed.

--
Stephen Garcia
Ringling College of Art and Design

Keny Perez · ‎12-13-2013

Stephen,

3 Things....

1-Can you change, in the profile, the "Level" from "Warning" to "Normal" and try to send the inventory again?

2-What version of UCSM are you running? There is a bug, https://tools.cisco.com/bugsearch/bug/CSCtk84080, that was affecting some environments because the inventory emails were too big: they were over 1MB in size for each chassis present in the domain, so the particular customer's email rules did not allow the email to be sent. For example, a domain with 18 chassis would mean about 18MB to send the inventory email...

3-What happens when you ping the SMTP server from the FIs?

Last but not least, you might need to open a TAC case using the keyword "Error Logs and Messages" to analyze the output you shared, in particular for the messages below:

2013 Dec 13 11:22:43.038532 callhome: smtp failure

2013 Dec 13 11:22:43.047127 callhome: problem in transporting the message Error in transporting email message for allAllerts SMTPclient: sockfd opened...:4

2013 Dec 13 11:22:43.047261 callhome: The specified message level for destination profile: full_txt is higher than the level for alert SAM_ALERT_INVENTORY_NORMAL(1) The specified message level for destination profile: short_txt is higher than the level for alert
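That last message is the profile level filter at work: an alert is only sent to a profile whose level is at or below the alert's own level. Here is a rough Python sketch of that comparison. Only Normal = 1 is confirmed by the log line above; the rest of the numeric scale is an assumption based on the NX-OS Call Home severity table and should be checked against your UCSM release. The names LEVELS and profile_accepts are made up for the illustration.

```python
# Assumed numeric severities (NX-OS-style Call Home scale).
# Only "normal": 1 is confirmed by the debug output in this thread.
LEVELS = {
    "debug": 0,
    "normal": 1,
    "notification": 2,
    "warning": 3,
    "minor": 4,
    "major": 5,
    "critical": 6,
    "fatal": 7,
    "disaster": 8,
    "catastrophic": 9,
}


def profile_accepts(profile_level: str, alert_level: str) -> bool:
    """Return True if an alert at alert_level passes a profile set to profile_level."""
    # The profile level acts as a minimum threshold for the alert severity.
    return LEVELS[alert_level] >= LEVELS[profile_level]


# A profile at "warning" drops the Normal-level inventory alert,
# while a profile at "normal" lets it through.
print(profile_accepts("warning", "normal"))  # False
print(profile_accepts("normal", "normal"))   # True
```

Under these assumptions, a test inventory message (level Normal) would never reach a profile left at Warning, which is why lowering the level for the test makes sense.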

-Kenny

sgmorale1 · ‎12-13-2013

Well Kenny, thanks for the help. Getting that debugging information helped out a lot. I had to step back and look at the bigger picture to get to the bottom of it. Turns out it was the gateway between those VLANs. Kind of silly after solving the issue, but it's working beautifully now. I just have to polish up the severity and types of errors that get sent, though; and the next time a hard disk fails I'll post back on whether it triggered a notification or not, for clarification.
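For anyone landing here later: the packet capture posted earlier in the thread is consistent with that kind of path problem. The FI sends the same SYN from port 59882 five times and no SYN-ACK ever comes back, which is the usual TCP retransmission pattern when nothing answers. A small Python sketch, using only the timestamps from that capture, shows the gaps between the retries doubling (the function name retransmit_gaps is made up for this illustration):

```python
from datetime import datetime

# Timestamps of the five SYN packets from the capture in this thread
SYN_TIMES = [
    "2013-12-13 11:21:58.011991",
    "2013-12-13 11:22:01.010655",
    "2013-12-13 11:22:07.010139",
    "2013-12-13 11:22:19.009151",
    "2013-12-13 11:22:43.007164",
]


def retransmit_gaps(stamps):
    """Return the gaps between consecutive timestamps, rounded to whole seconds."""
    parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f") for s in stamps]
    return [round((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


print(retransmit_gaps(SYN_TIMES))  # [3, 6, 12, 24]
```

The last retry goes out at 11:22:43.007 and the callhome process logs its timeout at 11:22:43.038, so the "smtp failure" is simply the TCP connection never being established, not a rejection by the mail server.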

-Stephen

--
Stephen Garcia
Ringling College of Art and Design

Keny Perez · ‎12-13-2013

Stephen,

My pleasure, I am glad to see the community helped you solve the first issue posted here.

Hope to see you replying soon.

-Kenny