NTP 4.2.8 loses sync, system time far out
Good $TIMEOFDAY,
I am running a number of virtual OpenBSD hosts on VMware vSphere, with the stock ntpd from packages. Their ntpd.conf files are pretty straightforward:
Code:
# $OpenBSD: ntpd.conf,v 1.14 2015/07/15 20:28:37 ajacoutot Exp $
#
# See ntpd.conf(5) and /etc/examples/ntpd.conf
servers pool.ntp.org
constraints from "https://www.google.com"

One of them keeps losing sync and drifts waaay out. A restart of ntpd brings the system time back on track, but after a few minutes it loses sync again and drifts out. ntpd is started with ntpd_flags='-s -v'. Here's a typical output:
Code:
Nov 15 13:57:24 n2 ntpd[9622]: ntp engine ready
Nov 15 13:57:51 n2 ntpd[94990]: set local clock to Tue Nov 15 13:57:51 CET 2016 (offset 26.320751s)
Nov 15 13:57:52 n2 ntpd[9622]: constraint reply from 172.217.16.36: offset -0.877704
Nov 15 13:58:09 n2 ntpd[9622]: peer 52.59.88.68 now valid
Nov 15 13:58:13 n2 ntpd[9622]: peer 176.9.253.75 now valid
Nov 15 13:58:15 n2 ntpd[9622]: peer 78.46.189.152 now valid
Nov 15 13:58:18 n2 ntpd[9622]: peer 87.106.126.46 now valid
Nov 15 14:02:22 n2 ntpd[39746]: adjusting local clock by 3.956680s
Nov 15 14:02:22 n2 ntpd[9622]: clock is now synced
Nov 15 14:02:34 n2 ntpd[9622]: peer 52.59.88.68 now invalid
Nov 15 14:06:11 n2 ntpd[39746]: adjusting local clock by 3.816125s
Nov 15 14:06:11 n2 ntpd[9622]: clock is now unsynced

This is the log right after restarting ntpd. From that point on, the clock stays out of sync and veers off. An almost identical virtual machine, running on the same host, is unaffected. I'm particularly wondering what's happening here:
Code:
Nov 15 14:06:11 n2 ntpd[9622]: clock is now unsynced

Matthias
Last edited by ocicat; 16th November 2016 at 10:08 PM. Reason: Please use [code] & [/code] tags when posting file contents.
|
|||
Uh...things to look at, off the top of my head:
VMware can set time on an OpenBSD guest via a sensor. Check your VM settings to see if you have that enabled/disabled on the affected VM vs the others.
Do you enable the sensor in any ntpd.conf? I think "sensor *" is in the default config.
You can also check the current status of servers, sensors, and constraints with "ntpctl -sa".
|
|||
The ntpd.conf files contain the directive
Code:
sensor *

I have two more or less identical OpenBSD 6.0 virtual machines with 100% identical ntpd.conf files. Currently I am watching both. One is drifting around by no more than 200 milliseconds (normally way below that); the other loses sync within minutes and hardly ever gets back into sync (probably because the maximum allowed deviation is exceeded).
Last edited by ocicat; 16th November 2016 at 12:00 PM. Reason: Please use [code] & [/code] tags when posting file contents.
|
|||
In the VM's configuration on the host, you can enable or disable controlling the guest's clock. That's the only other thing I can think to check.
Also try finding a local timeserver (or two) and using that instead of the pool servers. The way the pool works, each system can get different servers at any time. There is no consistency there for comparison.
EDIT: Also, you said "stock ntpd from packages". Did you install NTPd from packages, or do you mean the base OpenNTPD?
|
|||
dmesg | grep vmt has
Code:
vmt0 at pvbus0
vmt0 at pvbus0

three of the above entries on the reference host, one entry on another, also unaffected (properly working) host.
EDIT:
Code:
sysctl | grep hw
...
hw.sensors.vmt0.timedelta0=0.042866 secs, OK, Tue Nov 15 17:12:41.440
...
Last edited by ocicat; 16th November 2016 at 06:41 PM. Reason: Please use [code] & [/code] tags when posting file contents.
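To compare this reading across several machines, the timedelta value can be pulled out of the sysctl line with a few lines of Python. This is only a minimal sketch that assumes the line format shown in the post above; the name `parse_timedelta` is made up for illustration and is not part of any OpenBSD tool.

```python
import re

# Assumed line format, taken from the sysctl output quoted in the thread:
# hw.sensors.vmt0.timedelta0=0.042866 secs, OK, Tue Nov 15 17:12:41.440
SENSOR_RE = re.compile(
    r"^hw\.sensors\.(?P<dev>\w+)\.timedelta\d+="
    r"(?P<secs>-?\d+(?:\.\d+)?) secs, (?P<status>\w+)"
)


def parse_timedelta(line):
    """Return (device, offset_in_seconds, status), or None if the line
    is not a timedelta sensor reading."""
    match = SENSOR_RE.match(line.strip())
    if match is None:
        return None
    return match.group("dev"), float(match.group("secs")), match.group("status")


if __name__ == "__main__":
    sample = "hw.sensors.vmt0.timedelta0=0.042866 secs, OK, Tue Nov 15 17:12:41.440"
    print(parse_timedelta(sample))
```

Feeding it the output of `sysctl hw.sensors` from each guest would give one number per machine that is easy to compare side by side.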
|
|||
EDIT: In the VM configuration I have found a "synchronize guest time with host" option. However, it is unchecked in ALL machines.
Last edited by MatthiasKoch; 15th November 2016 at 04:32 PM.
|
||||
The built-in ntpd(8) is not the same as the net/ntp package.
Packages are third-party applications that are not included with OpenBSD, but have been ported to OpenBSD.
If you are using the built-in ntpd, the daemon is running /usr/sbin/ntpd. If you are using the third-party net/ntp package, the daemon is running /usr/local/sbin/ntpd.
Which are you using?
|
|||
Looks like I confused the terminology here... it's the built-in ntpd, running from /usr/sbin/ntpd.
|
|
|||
Code:
sensor *
Last edited by ocicat; 16th November 2016 at 12:05 PM. Reason: Please use [code] & [/code] tags when posting file contents.
|
|||
No idea if this is related... I notice that the reference machines readjust their clock frequencies from time to time, like
Code:
adjusting clock frequency by 0.061162 to -9.519647ppm
Last edited by ocicat; 16th November 2016 at 12:08 PM. Reason: Please use [code] & [/code] tags when posting file contents.
|
||||
Is it possible you have discovered a bug? Yes. If so, is it a bug in OpenNTPD? net/ntp? VMware? vmt(4)? I couldn't even begin to guess.
|
|||
What confuses me most is that both the reference machine and the defective one have been created from the same template. I have already tried moving them from one physical machine to another in the cluster... no effect.
A year ago someone reported a very similar problem with FreeBSD, which points to a virtual hardware problem. However, it wasn't explained what had been done to the hardware to fix it.
|
||||
If the problem moves with the guest from host to host, then there is something about that virtual machine which is different than the others. Determine exactly what is different.
|
|
|||
A short update: I have installed net/ntp from packages and found that this one, too, is unable to keep the time. Most likely this narrows it down to a hardware problem. I have removed it again and put the original ntpd back into action.
A typical log excerpt shows that after starting ntpd, it manages to set the clock once. After a short time, the clock loses sync; sometimes (for some reason) it gets back into sync before losing it for good.
Code:
Nov 16 16:17:09 n2 ntpd[13697]: ntp engine ready
Nov 16 16:17:21 n2 ntpd[96214]: set local clock to Wed Nov 16 16:17:21 CET 2016 (offset 12.031074s)
Nov 16 16:17:22 n2 ntpd[13697]: constraint reply from 216.58.208.36: offset -0.700312
Nov 16 16:17:46 n2 ntpd[13697]: peer 192.168.2.10 now valid
Nov 16 16:18:35 n2 ntpd[16182]: adjusting local clock by 12.031074s
Nov 16 16:19:07 n2 ntpd[16182]: adjusting local clock by 0.841569s
Nov 16 16:22:53 n2 ntpd[13697]: clock is now synced
Nov 16 16:24:28 n2 ntpd[16182]: adjusting local clock by 1.007848s
Nov 16 16:30:48 n2 ntpd[16182]: adjusting local clock by 3.983608s
Nov 16 16:35:00 n2 ntpd[16182]: adjusting local clock by 3.734702s
Nov 16 16:35:00 n2 ntpd[13697]: clock is now unsynced
Nov 16 16:39:16 n2 ntpd[16182]: adjusting local clock by 5.437277s
Nov 16 16:39:48 n2 ntpd[16182]: adjusting local clock by 6.278187s
Nov 16 16:40:50 n2 ntpd[16182]: adjusting local clock by 8.963168s
Nov 16 16:44:30 n2 ntpd[16182]: adjusting local clock by 7.865542s
Nov 16 16:45:32 n2 ntpd[16182]: adjusting local clock by 7.565163s
Last edited by ocicat; 16th November 2016 at 06:36 PM. Reason: Please use [code] & [/code] tags when posting file contents.
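When comparing a drifting guest with a healthy one, it can help to reduce such a log to just the clock adjustments and the synced/unsynced transitions. The following is a rough sketch, assuming the syslog line format seen in the excerpt above; the function name `summarize` and the sample constant are illustrative only, and the sample is a subset of the posted log lines.

```python
import re

# Both patterns assume the syslog format shown in the thread:
# "Nov 16 16:19:07 n2 ntpd[16182]: adjusting local clock by 0.841569s"
PREFIX = r"^(\w{3} +\d+ [\d:]{8}) \S+ ntpd\[\d+\]: "
ADJUST_RE = re.compile(PREFIX + r"adjusting local clock by (-?\d+\.\d+)s$")
STATE_RE = re.compile(PREFIX + r"clock is now (synced|unsynced)$")


def summarize(log_text):
    """Return (adjustments, states): lists of (timestamp, value) tuples."""
    adjustments, states = [], []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = ADJUST_RE.match(line)
        if match:
            adjustments.append((match.group(1), float(match.group(2))))
            continue
        match = STATE_RE.match(line)
        if match:
            states.append((match.group(1), match.group(2)))
    return adjustments, states


# A subset of the log lines posted above.
SAMPLE_LOG = """\
Nov 16 16:19:07 n2 ntpd[16182]: adjusting local clock by 0.841569s
Nov 16 16:22:53 n2 ntpd[13697]: clock is now synced
Nov 16 16:24:28 n2 ntpd[16182]: adjusting local clock by 1.007848s
Nov 16 16:35:00 n2 ntpd[16182]: adjusting local clock by 3.734702s
Nov 16 16:35:00 n2 ntpd[13697]: clock is now unsynced
Nov 16 16:40:50 n2 ntpd[16182]: adjusting local clock by 8.963168s
"""

if __name__ == "__main__":
    adjustments, states = summarize(SAMPLE_LOG)
    for stamp, seconds in adjustments:
        print(f"{stamp}  adjust {seconds:+.6f}s")
    for stamp, state in states:
        print(f"{stamp}  {state}")
```

Running the same summary over the log of the reference machine would show whether its adjustments stay small while the affected guest's keep growing.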
|
|||
For a final check I have disabled ntp on both the defector and the reference machine. After 18 hrs of running without being synced, the defector is off by 19 minutes, the reference machine by 3 seconds.
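Expressed as a rate, those two figures are far apart. A back-of-the-envelope sketch, assuming "19 minutes" and "3 seconds" over "18 hrs" are exact (they are rounded figures from the post) and using a made-up helper name `drift_ppm`:

```python
def drift_ppm(offset_seconds, elapsed_seconds):
    """Clock drift in parts per million: offset accumulated per elapsed time."""
    return offset_seconds / elapsed_seconds * 1_000_000


ELAPSED = 18 * 3600  # 18 hours without ntpd, in seconds

defector_ppm = drift_ppm(19 * 60, ELAPSED)   # off by 19 minutes
reference_ppm = drift_ppm(3, ELAPSED)        # off by 3 seconds

if __name__ == "__main__":
    print(f"defector:  {defector_ppm:.0f} ppm")
    print(f"reference: {reference_ppm:.1f} ppm")
    print(f"defector gains or loses about {19 * 60 / (18 * 60):.2f} s per minute")
```

That comes to roughly 17,600 ppm (about 1.8%) for the defector against about 46 ppm for the reference machine. For scale, the frequency correction logged on a reference machine earlier in the thread was around -9.5 ppm, so the defector's drift is in a different league from anything seen on the healthy guests.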
|
|
|||
This one could be interesting.
vmt0 is present on all of my 6.0 and 5.9 machines (6 machines altogether):
# dmesg | grep vmt
vmt0 at pvbus0
According to vmt(4), vmt reports the guest's hostname and first non-loopback IP address to the host. On my vSphere Web Client, I see this in the summary panel for every machine:
<machinename>
Guest OS: OpenBSD6.0
Compatibility: ESXi 5.5 and later (VM version 10)
VMware Tools: Running, version:2147483647 (Guest Managed)
DNS name: <machine's FQDN>
IP Addresses: <machine's IP>
Host: <name of physical host>
This is consistent for all machines, with the exception of the one that's drifting. In that machine's summary panel, the entry for VMware Tools says
VMware Tools: Not running, version:2147483647 (Guest Managed)
and the DNS Name and IP Addresses entries are missing. This would indicate that VMware Tools is not running, because a) the entry on the summary panel says so and b) hostname and IP address aren't reported back.
Last edited by MatthiasKoch; 17th November 2016 at 01:42 PM. Reason: typo
|
||||
I'll ask again, then.
What can you discover is different about the virtual machine that has a malfunctioning clock? Does diff(1) show a difference in kernel messages? In packages installed? In daemons provisioned? If there is no apparent difference, consider comparing all of the configuration files in /etc.
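If the outputs have already been collected from both guests as text (for example dmesg or a package listing saved to files), the comparison suggested above can also be scripted. A minimal sketch using Python's standard difflib module; the helper name `diff_outputs` and the sample strings are hypothetical and only stand in for real captured output.

```python
import difflib


def diff_outputs(reference_text, suspect_text, name="dmesg"):
    """Return unified-diff lines between two captured command outputs."""
    return list(
        difflib.unified_diff(
            reference_text.splitlines(),
            suspect_text.splitlines(),
            fromfile=f"reference/{name}",
            tofile=f"suspect/{name}",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical stand-ins for the real output of each machine.
    reference = "line common to both\nline only on reference"
    suspect = "line common to both\nline only on suspect"
    for line in diff_outputs(reference, suspect):
        print(line)
```

An empty result means the two captures are identical, which is itself useful: it rules that source out and moves the search on to the next one.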
|
|||
# sysctl | grep hw
among other things, reveals this:
Code:
hw.model=Intel(R) Xeon(R) CPU X5650 @ 2.67GHz

The reference machine reports an X5550. This is definitely a difference. The full output of the defector is
Code:
# sysctl | grep hw
hw.machine=amd64
hw.model=Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
hw.ncpu=1
hw.byteorder=1234
hw.pagesize=4096
hw.disknames=cd0:,sd0:264b5329c6cf44b1,fd0:
hw.diskcount=3
hw.sensors.acpiac0.indicator0=On (power supply)
hw.sensors.vmt0.timedelta0=-102.742591 secs, OK, Thu Nov 17 15:17:59.983
hw.cpuspeed=2666
hw.vendor=VMware, Inc.
hw.product=VMware Virtual Platform
hw.version=None
hw.serialno=VMware-42 0d 04 ba d8 72 18 e4-f8 30 99 33 56 cb 69 7d
hw.uuid=420d04ba-d872-18e4-f830-993356cb697d
hw.physmem=4278059008
hw.usermem=4278046720
hw.ncpufound=4
hw.allowpowerdown=1

Code:
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured

By the way, I do appreciate your patience. Seriously.
EDIT: It appears that the virtual CPU is originally created as Xeon X5550, and after adding more cores it reports as X5650. Adding more cores to the reference machine changed the type to 5650 there too, but it still runs properly. The change of the CPU type appears to be unrelated, as both machines are running on 5650 now, with their behaviour unchanged.
Last edited by MatthiasKoch; 18th November 2016 at 12:26 PM. Reason: change of CPU type probably unrelated
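Two values in that output stand out when read together: hw.ncpu=1 and hw.ncpufound=4, which matches the three "cpu at mainbus0: not configured" lines. A small sketch that loads sysctl output into a dictionary makes such comparisons between machines easy to automate; `parse_sysctl` is a made-up helper name, and the sample is a subset of the lines posted above. Whether the unused CPUs have anything to do with the drift is not established in the thread.

```python
def parse_sysctl(text):
    """Parse 'key=value' lines as printed by sysctl into a dict of strings."""
    values = {}
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition("=")
        if sep:  # skip lines without '='
            values[key] = value
    return values


# A subset of the sysctl output posted above.
SAMPLE = """\
hw.ncpu=1
hw.sensors.vmt0.timedelta0=-102.742591 secs, OK, Thu Nov 17 15:17:59.983
hw.ncpufound=4
"""

hw = parse_sysctl(SAMPLE)
# CPUs detected by the kernel minus CPUs it actually uses.
unused_cpus = int(hw["hw.ncpufound"]) - int(hw["hw.ncpu"])

if __name__ == "__main__":
    print(f"CPUs found but not in use: {unused_cpus}")
    print(f"vmt0 timedelta: {hw['hw.sensors.vmt0.timedelta0']}")
```

Running the same parse on the reference machine and comparing the two dictionaries key by key would list every hw.* value that differs between the guests.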
Tags: clock, ntpd, virtual machine