This document describes Network Time Protocol (NTP) for Cisco Unified Communications Manager (CUCM). This document will cover the purpose of NTP with CUCM, configuration of NTP, what data to collect for troubleshooting, example analysis of the data, and related resources for additional research.
There are no specific requirements for this document.
This document is not restricted to specific software and hardware versions as currently supported software and hardware include this feature.
The purpose of NTP with CUCM is to ensure the servers are aware of the correct time. Timing for CUCM is important because Voice over Internet Protocol (VoIP) is extremely sensitive to time. Another reason time is important for CUCM is that each server in a cluster must keep its time closely synchronized with the other servers in the cluster. This is due to database replication requirements. Lastly, time is important for troubleshooting because the logs need correct timestamps.
It is important to note that CUCM requires certain NTP servers. Windows NTP server is not supported for CUCM; however, other types such as Linux NTP sources, Cisco IOS NTP sources, and Nexus OS NTP sources are acceptable. Although other Cisco solutions may use Windows servers for NTP, UC solutions such as CallManager, Unity, IM&P, etc., are unable to do so and require either a Linux-based or IOS-based NTP solution. This is because Windows Time Services often use SNTP, which Linux systems have difficulty synchronizing to.
The CUCM publisher needs an NTP source which is not a member of the CUCM cluster; therefore the CUCM publisher synchronizes its time with the NTP server. In this exchange the CUCM publisher is an NTP client.
The CUCM subscribers synchronize their time with the CUCM publisher. In this exchange the CUCM publisher is an NTP server whereas the CUCM subscribers are NTP clients.
When CUCM is being installed there is a prompt to determine if the server is the first node in the cluster.
If the server is not the first node in the cluster, then the installation wizard will move past the NTP configuration phase; however, you will be prompted for the NTP server(s) if it is the first node in the cluster.
When troubleshooting an NTP issue you should collect the following data from whichever CUCM server(s) are experiencing NTP issues:
CUCM Publisher:
Version: 11.5.1.15900-18
FQDN: cucm-115.home.lab
IP addresses: 192.168.7.100
Google NTP Server:
FQDN: time1.google.com
IP addresses: 216.239.35.0
Notice the port number is 123. This is the port for NTP. In the output from the command below we can see the NTP version is 4 as noted by the "NTPv4". We can also take note of the publisher acting as a client when communicating with "time1.google.com"; however, it works as a server when communicating with cucm-sub1 / cucm-sub2 / cucm-sub3.
From the CLI of the publisher, run the command "utils network capture port 123". Wait until you see traffic (this can take a little time, or it may be instant), then press Ctrl+C. Look in the traffic to find where your publisher is communicating with its NTP server and the NTP server is communicating with the publisher (if the NTP server is not replying, the issue is in the network or with the NTP server). The primary focus of this output is the NTP version. In CUCM 9 and later, NTP version 3 (NTPv3) can cause issues, and an NTP source that uses NTPv4 should be the NTP server for the publisher.
admin:utils network capture size all count 10000000 port 123
Executing command with options:
size=128 count=1000 interface=eth0 src=dest= port=123 ip=
16:08:43.199710 IP cucm-sub3.home.lab.39417 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:08:43.199737 IP cucm-115.home.lab.ntp > cucm-sub3.home.lab.39417: NTPv4, Server, length 48
16:08:43.199823 IP cucm-sub3.home.lab.39417 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:08:43.199859 IP cucm-115.home.lab.ntp > cucm-sub3.home.lab.39417: NTPv4, Server, length 48
16:09:01.640980 IP cucm-115.home.lab.50141 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.654675 IP time1.google.com.ntp > cucm-115.home.lab.50141: NTPv4, Server, length 48
16:09:01.654733 IP cucm-115.home.lab.50141 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.667368 IP time1.google.com.ntp > cucm-115.home.lab.50141: NTPv4, Server, length 48
16:09:01.668612 IP cucm-115.home.lab.50141 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.681366 IP time1.google.com.ntp > cucm-115.home.lab.50141: NTPv4, Server, length 48
16:09:01.681518 IP cucm-115.home.lab.50141 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.694108 IP time1.google.com.ntp > cucm-115.home.lab.50141: NTPv4, Server, length 48
16:09:01.875016 IP cucm-115.home.lab.48422 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.884476 IP cucm-sub3.home.lab.58072 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:01.884568 IP cucm-115.home.lab.ntp > cucm-sub3.home.lab.58072: NTPv4, Server, length 48
16:09:01.884954 IP cucm-sub3.home.lab.58072 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:01.884999 IP cucm-115.home.lab.ntp > cucm-sub3.home.lab.58072: NTPv4, Server, length 48
16:09:01.885381 IP cucm-sub3.home.lab.58072 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:01.885423 IP cucm-115.home.lab.ntp > cucm-sub3.home.lab.58072: NTPv4, Server, length 48
16:09:01.886147 IP cucm-sub3.home.lab.58072 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:01.886184 IP cucm-115.home.lab.ntp > cucm-sub3.home.lab.58072: NTPv4, Server, length 48
16:09:01.888555 IP time1.google.com.ntp > cucm-115.home.lab.48422: NTPv4, Server, length 48
16:09:01.888642 IP cucm-115.home.lab.48422 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.900926 IP time1.google.com.ntp > cucm-115.home.lab.48422: NTPv4, Server, length 48
16:09:01.901017 IP cucm-115.home.lab.48422 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.913497 IP time1.google.com.ntp > cucm-115.home.lab.48422: NTPv4, Server, length 48
16:09:01.913566 IP cucm-115.home.lab.48422 > time1.google.com.ntp: NTPv4, Client, length 48
16:09:01.926693 IP time1.google.com.ntp > cucm-115.home.lab.48422: NTPv4, Server, length 48
16:09:02.038981 IP cucm-sub2.home.lab.42078 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.039117 IP cucm-115.home.lab.ntp > cucm-sub2.home.lab.42078: NTPv4, Server, length 48
16:09:02.039281 IP cucm-sub2.home.lab.42078 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.039345 IP cucm-115.home.lab.ntp > cucm-sub2.home.lab.42078: NTPv4, Server, length 48
16:09:02.039434 IP cucm-sub2.home.lab.42078 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.039535 IP cucm-115.home.lab.ntp > cucm-sub2.home.lab.42078: NTPv4, Server, length 48
16:09:02.039607 IP cucm-sub2.home.lab.42078 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.039814 IP cucm-115.home.lab.ntp > cucm-sub2.home.lab.42078: NTPv4, Server, length 48
16:09:02.066544 IP cucm-sub1.home.lab.46400 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.066622 IP cucm-115.home.lab.ntp > cucm-sub1.home.lab.46400: NTPv4, Server, length 48
16:09:02.066751 IP cucm-sub1.home.lab.46400 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.066892 IP cucm-115.home.lab.ntp > cucm-sub1.home.lab.46400: NTPv4, Server, length 48
16:09:02.066968 IP cucm-sub1.home.lab.46400 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.067104 IP cucm-115.home.lab.ntp > cucm-sub1.home.lab.46400: NTPv4, Server, length 48
16:09:02.067155 IP cucm-sub1.home.lab.46400 > cucm-115.home.lab.ntp: NTPv4, Client, length 48
16:09:02.067189 IP cucm-115.home.lab.ntp > cucm-sub1.home.lab.46400: NTPv4, Server, length 48
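The "NTPv4", "Client", and "Server" labels in the capture are decoded from the first byte of each 48-byte NTP packet, which packs the leap indicator (2 bits), version number (3 bits), and mode (3 bits). The following Python sketch illustrates that layout only; it is not how CUCM or the capture tool is implemented, and the function names build_client_request and describe_header are hypothetical helpers chosen for this example.

```python
import struct

# Mode values from the NTP packet header (3 = client, 4 = server).
NTP_MODES = {
    1: "Symmetric Active",
    2: "Symmetric Passive",
    3: "Client",
    4: "Server",
    5: "Broadcast",
}


def build_client_request(version: int = 4) -> bytes:
    """Return a minimal 48-byte client request with all other fields zeroed."""
    # Leap indicator = 0, version in bits 3-5, mode 3 (client) in bits 0-2.
    first_byte = (0 << 6) | (version << 3) | 3
    return struct.pack("!B47x", first_byte)


def describe_header(packet: bytes) -> str:
    """Summarize version, mode, and length in the style of the capture lines."""
    first_byte = packet[0]
    version = (first_byte >> 3) & 0x7
    mode = first_byte & 0x7
    mode_name = NTP_MODES.get(mode, f"mode {mode}")
    return f"NTPv{version}, {mode_name}, length {len(packet)}"


if __name__ == "__main__":
    print(describe_header(build_client_request()))       # an NTPv4 client request
    print(describe_header(bytes([0x24]) + bytes(47)))    # a header as an NTPv4 server would send it
```

A packet whose first byte decodes to version 3 would be the NTPv3 case that the text above warns about for CUCM 9 and later.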
When viewing the packet capture in Wireshark, the display filter I used is udp.port == 123. With that filter I could see the CUCM publisher communicating with the Google NTP server and I could see the CUCM publisher communicating with the CUCM subscribers.
NOTE: All nodes will show the current time in UTC regardless of the time zone of the server (listed below UTC time). This makes it easy to compare times on the different CUCM nodes.
NOTE: If there is a time difference of 15 minutes or more, it is expected that DB replication will be broken.
1) If the publisher is ahead by 15 minutes, this can result in the pub sending data to the sub and the sub would have a delay in processing the data because it has not yet reached the time in the timestamp of the packets from the publisher (this is expected behavior in this type of situation).
2) If the subscriber is ahead by 15 minutes, this would result in the subscriber dropping the data from the publisher because the subscriber sees it as old data (15 minutes old).
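The comparison described in the notes above can be sketched in Python. This is a minimal example that assumes the "Current time in UTC is" line has been copied from utils ntp status on two nodes at the same moment, and that the day and month names are in English. The subscriber reading is made up for illustration, and parse_utc and clock_gap are hypothetical helper names.

```python
from datetime import datetime, timedelta

# Threshold quoted in the notes above for broken DB replication.
REPLICATION_LIMIT = timedelta(minutes=15)


def parse_utc(line: str) -> datetime:
    """Parse a 'Current time in UTC is : ...' line from utils ntp status."""
    stamp = line.split(" : ", 1)[1].strip()
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S UTC %Y")


def clock_gap(pub_line: str, sub_line: str) -> timedelta:
    """Return publisher time minus subscriber time (positive: publisher is ahead)."""
    return parse_utc(pub_line) - parse_utc(sub_line)


pub = "Current time in UTC is : Fri Sep 6 20:54:50 UTC 2019"
sub = "Current time in UTC is : Fri Sep 6 20:38:10 UTC 2019"  # hypothetical subscriber reading

gap = clock_gap(pub, sub)
if abs(gap) >= REPLICATION_LIMIT:
    ahead = "publisher" if gap > timedelta(0) else "subscriber"
    print(f"{ahead} is ahead by {abs(gap)}; expect DB replication problems")
else:
    print(f"clocks differ by {abs(gap)}; within the 15 minute limit")
```

The sign of the gap tells you which of the two cases above applies: a positive value means the publisher is ahead, a negative value means the subscriber is ahead.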
admin:utils ntp status
ntpd (pid 28435) is running...

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 216.239.35.0    .GOOG.           1 u   44   64    3   11.724   -0.021   0.064

unsynchronised
polling server every 8 s

Current time in UTC is : Fri Sep 6 20:54:50 UTC 2019
Current time in America/New_York is : Fri Sep 6 16:54:50 EDT 2019
admin:
The text below explains the fields in the utils ntp status output above.
The very first column contains the "tally code" character. Short overview:
  * the source you are synchronized to (syspeer)
  # source selected, distance exceeds maximum value
  o the PPS (Pulse Per Second) source if your ntpd (ppspeer, only if you have a PPS capable system and refclock)
  + candidate, i.e. it is considered a good source
  - outlier, i.e. quality is not good enough
  x falseticker, i.e. this one is considered to distribute bad time
  blank: source discarded, failed sanity
See the Select field of the Peer status word on the NTP Event Messages and Status Words page for more information on the tally codes.
remote
  the hostname or IP of the remote machine.
refid
  the identification of the time source to which the remote machine is synced (for example, a radio clock or another NTP server).
st
  the stratum of the remote machine. 16 is "unsynchronized". 0 is the best value, that could be (for example) a radio clock or the NTP server's private caesium clock (see http://www.eecis.udel.edu/~mills/ntp/html/index.html#intro for more information about NTP in general).
t
  types available:
  l = local (such as a GPS, WWVB)
  u = unicast (most common)
  m = multicast
  b = broadcast
  - = netaddr
when
  how many seconds since the last poll of the remote machine.
poll
  the polling interval in seconds.
reach
  an 8-bit left-shift register. Any 1 bit means that a "time packet" was received. The rightmost bit indicates the status of the last connection with the NTP server. The value is displayed as an octal number. Use a calculator in programmer mode to translate from OCT to BIN: for example, 377 translates to 11111111. Each 1 means a successful connection to the NTP server. If you just start an NTP service, and it connects successfully with its server, this number will change as follows (assuming connectivity is good):
  00000001 = 001
  00000011 = 003
  00000111 = 007
  00001111 = 017
  00011111 = 037
  00111111 = 077
  01111111 = 177
  11111111 = 377
delay
  the time delay (in milliseconds) to communicate with the remote.
offset
  the offset (in milliseconds) between our time and that of the remote.
jitter
  the observed jitter (in milliseconds) of time with the remote.
The explanations above come from http://support.ntp.org/bin/view/Support/TroubleshootingNTP#Section_9.4.
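As an alternative to the calculator step described for the reach field, the octal-to-binary translation can be sketched in a few lines of Python. This is only an illustration of the conversion; decode_reach is a hypothetical helper name, and the sample values are the reach of 3 from the utils ntp status output above plus two values chosen for comparison.

```python
def decode_reach(reach: str) -> dict:
    """Translate the octal reach value into its 8 poll-result bits."""
    value = int(reach, 8)
    if not 0 <= value <= 0o377:
        raise ValueError(f"reach must be between 0 and 377 octal, got {reach}")
    bits = format(value, "08b")
    return {
        "bits": bits,                      # oldest poll on the left, newest on the right
        "successes": bits.count("1"),      # how many of the last 8 polls were answered
        "last_poll_ok": bits[-1] == "1",   # rightmost bit is the most recent poll
    }


for sample in ("3", "377", "376"):
    print(sample, decode_reach(sample))
```

A reach of 3 decodes to 00000011, which means only the two most recent polls have been answered, as expected shortly after the NTP service starts. A value such as 376 (11111110) would mean the most recent poll went unanswered after seven successful ones.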
admin:utils diagnose test

Log file: platform/log/diag1.log

Starting diagnostic test(s)
===========================
test - disk_space : Passed (available: 6463 MB, used: 12681 MB)
skip - disk_files : This module must be run directly and off hours
test - service_manager : Passed
test - tomcat : Passed
test - tomcat_deadlocks : Passed
test - tomcat_keystore : Passed
test - tomcat_connectors : Passed
test - tomcat_threads : Passed
test - tomcat_memory : Passed
test - tomcat_sessions : Passed
skip - tomcat_heapdump : This module must be run directly and off hours
test - validate_network : Passed
test - raid : Passed
test - system_info : Passed (Collected system information in diagnostic log)
test - ntp_reachability : Passed
test - ntp_clock_drift : Passed
test - ntp_stratum : Passed
skip - sdl_fragmentation : This module must be run directly and off hours
skip - sdi_fragmentation : This module must be run directly and off hours

Diagnostics Completed

The final output will be in Log file: platform/log/diag1.log

Please use 'file view activelog platform/log/diag1.log' command to see the output

admin:
If NTP were to fail in the utils diagnose test output you would see something similar to this:
admin:utils diagnose test

Log file: platform/log/diag1.log

Starting diagnostic test(s)
===========================
test - disk_space : Passed (available: 6463 MB, used: 12681 MB)
skip - disk_files : This module must be run directly and off hours
test - service_manager : Passed
test - tomcat : Passed
test - tomcat_deadlocks : Passed
test - tomcat_keystore : Passed
test - tomcat_connectors : Passed
test - tomcat_threads : Passed
test - tomcat_memory : Passed
test - tomcat_sessions : Passed
skip - tomcat_heapdump : This module must be run directly and off hours
test - validate_network : Passed
test - raid : Passed
test - system_info : Passed (Collected system information in diagnostic log)
test - ntp_reachability : Warning The NTP service is restarting, it can take about 5 minutes.
test - ntp_clock_drift : Warning The local clock is not synchronised. None of the designated NTP servers are reachable/functioning or legitimate.
test - ntp_stratum : Warning The local clock is not synchronised. None of the designated NTP servers are reachable/functioning or legitimate.
skip - sdl_fragmentation : This module must be run directly and off hours
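When this output has been saved to a text file, the NTP results can be picked out without reading every line. The Python sketch below is one possible approach, assuming one result per line in the "test - name : result" form shown above. The sample text is abbreviated from that output, and ntp_failures is a hypothetical helper name.

```python
import re

# Matches lines such as "test - ntp_stratum : Passed" or "skip - disk_files : ...".
RESULT = re.compile(r"^(test|skip) - (\w+)\s*:\s*(.*)$")


def ntp_failures(output: str) -> dict:
    """Return the NTP tests that ran but did not report Passed, with their messages."""
    failures = {}
    for line in output.splitlines():
        match = RESULT.match(line.strip())
        if not match:
            continue
        kind, name, detail = match.groups()
        if kind == "test" and name.startswith("ntp_") and not detail.startswith("Passed"):
            failures[name] = detail
    return failures


sample = """\
test - validate_network : Passed
test - ntp_reachability : Warning The NTP service is restarting, it can take about 5 minutes.
test - ntp_clock_drift : Warning The local clock is not synchronised. None of the designated NTP servers are reachable/functioning or legitimate.
test - ntp_stratum : Passed
skip - sdl_fragmentation : This module must be run directly and off hours
"""

for name, detail in ntp_failures(sample).items():
    print(f"{name}: {detail}")
```

An empty result means all three NTP tests (ntp_reachability, ntp_clock_drift, ntp_stratum) reported Passed or were not present in the text.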
run sql select pkid,name,dbinfo('utc_to_datetime', cdrtime) as CDRTIME from device where cdrtime > getCurrTime()
This command compares the current time to the cdrtime (when the table was modified). If the customer used a bad NTP source during install/upgrade and later corrected it, the database will go out of sync every time a change is made. This issue would not be seen with the typical NTP commands (for example, utils ntp status) because the customer has since moved away from the bad NTP source to a good one.
Moving from the bad NTP source to a good one is the right step; however, it does not fix the tables that were created during install/upgrade.
When one runs this command the expected output should look like this:
admin:run sql select pkid,name,dbinfo('utc_to_datetime', cdrtime) as CDRTIME from device where cdrtime > getCurrTime()
pkid name cdrtime
==== ==== =======
admin:
If you have output similar to the output below, it is a sign that the NTP source used during install/upgrade should not have been used and has caused problems that will affect database replication (the command was run in the year 2015):
admin:run sql select pkid,name,dbinfo('utc_to_datetime', cdrtime) as CDRTIME from device where cdrtime > getCurrTime()
pkid                                 name  cdrtime
==================================== ===== =====================
bf80dd31-9911-43ce-81fd-a99ec0333fb5 MTP_2 2016-09-11 14:38:14.0
4c38fc05-760d-4afb-96e8-69333c195e74 CFB_2 2016-09-11 14:38:14.0
90878c80-e213-4c7e-82b9-6c780aac72f3 ANN_2 2016-09-11 14:38:14.0
08b5bff4-da94-4dfb-88af-ea9ffa96872c MOH_2 2016-09-11 14:38:14.0
93320e4d-1b73-4099-9a7c-c4cddfadb5d9 MTP_3 2016-09-11 14:38:14.0
a6850d42-5f0a-49ce-9fa3-80d45b800e23 CFB_3 2016-09-11 14:38:14.0
9963c9cb-58b0-4191-93e1-8676584f6461 ANN_3 2016-09-11 14:38:14.0
def79fb7-c801-4fb3-85fb-4e94310bf0bd MOH_3 2016-09-11 14:38:14.0
4cd64584-089b-4331-9291-79774330cbc2 MTP_4 2016-09-11 14:38:14.0
27b18882-db83-4d14-8bce-d3f8dc439610 CFB_4 2016-09-11 14:38:14.0
a40da882-e04f-4649-b2eb-2f79d1289e81 ANN_4 2016-09-11 14:38:14.0
36575ff4-cdea-4945-87e7-638cc555463e MOH_4 2016-09-11 14:38:14.0
The pcap collected from the lab setup is attached to this TechZone.