Time and Clock Synchronization

by David Kohanbash on January 23, 2014

Hi all
Time. Don’t you wish you had more of it? If you were a robot you would probably be worrying about having consistent time. Why do we need consistent time? Many robots have multiple systems/computers generating and using data. If those systems/computers are not in sync, the robot can end up acting on data that is old or that appears to have occurred in the future. In addition to keeping everything on the robot in sync, you want consistent timing information in your logs. Consistent logs make the data more accurate and easier to post-process. Even if only one system in your robot uses time, you might need to sync its clock to some absolute global reference.

The timing problem continues even after you perform an initial synchronization of the clocks. The internal crystal oscillators in the computers/devices all drift (at a non-linear rate). The drift rate is based on how good the crystal is, the type of crystal, vibration, shock, and the temperature of the crystal. It should be noted that better crystals/clocks also consume more power and cost more than the ones with higher drift. The video linked at the bottom of this page has a good discussion on crystal oscillator drift.
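To get a feel for how quickly drift adds up, here is a rough back-of-the-envelope sketch. It assumes a constant drift rate expressed in parts per million (ppm), which is a simplification since real drift is non-linear, and the ppm values are hypothetical examples rather than figures for any particular crystal.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_seconds(ppm: float, elapsed_seconds: float) -> float:
    """Accumulated clock error for a clock that runs off by `ppm` parts per million.

    Assumes the drift rate stays constant over the interval (a simplification).
    """
    return ppm * 1e-6 * elapsed_seconds


if __name__ == "__main__":
    # Hypothetical drift rates, chosen only for illustration.
    for ppm in (1, 20, 100):
        per_day = drift_seconds(ppm, SECONDS_PER_DAY)
        print(f"{ppm:>4} ppm -> about {per_day:.3f} s of error per day")
```

Under these assumptions a 20 ppm clock gains or loses a bit under two seconds per day, which is already a lot if you are fusing sensor data.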

There is also the question of absolute vs. relative time. In many cases relative time (i.e. time since the test began) is all you care about, however there are definitely times where you need to know the absolute time (the time that your watch would say). In many embedded systems you might not know what the absolute time is, and the clock might start from 0 when the computer boots up. There are many cases when you are analyzing data and your first step is subtracting the start time from all of the other times; in those cases relative time works great. However, when you are doing a long study during the day and night, or if the computers all go down for the night and you want an accurate representation of when data was collected, you will want absolute time.
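As a small illustration of the two views of time, here is a sketch that turns a list of absolute Unix timestamps into relative times and also renders one as a human-readable UTC time. The function names and the sample timestamps are made up for this example.

```python
from datetime import datetime, timezone


def to_relative(timestamps):
    """Convert absolute timestamps to seconds since the first sample."""
    if not timestamps:
        return []
    start = timestamps[0]
    return [t - start for t in timestamps]


def to_absolute_utc(timestamp):
    """Render a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


# Made-up log timestamps (seconds since the Unix epoch).
log_times = [1390435200.0, 1390435200.5, 1390435201.25]

print(to_relative(log_times))         # [0.0, 0.5, 1.25]
print(to_absolute_utc(log_times[0]))  # 2014-01-23T00:00:00+00:00
```

Note that the relative version only works if the device clock was set correctly, or at least consistently, while the data was being logged.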

So how do we get consistent timing from all of the different systems in the robot? There are several approaches that we will talk about: manual, software based, and hardware based.

Manual

Manually setting the time is the most basic way to sync various computers, but it is the least reliable and accurate. There are several ways to manually sync computers together. The first is to set the clocks by hand (similar to setting your watch based on a friend’s watch). Another way is to start all of the processes at the same time at the start of each log/test, so that all of the logging starts together and you can easily figure out the time offset between systems. A third way is to add a comment (or unique marker) into each log file so you can figure out the exact spot where the times overlap. This method is often used in video production to line up sound and video streams.
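The marker approach can be sketched in a few lines. This assumes each log is a list of (timestamp, message) pairs and that the same marker text was written to both logs at the same real moment; the log format, the function names, and the "SYNC" marker are all hypothetical.

```python
def offset_from_marker(log_a, log_b, marker):
    """Return (marker time in log_a) - (marker time in log_b)."""
    def first_time(log):
        for timestamp, message in log:
            if message == marker:
                return timestamp
        raise ValueError(f"marker {marker!r} not found")

    return first_time(log_a) - first_time(log_b)


def shift_log(log, offset):
    """Shift every timestamp in a log by `offset` seconds."""
    return [(timestamp + offset, message) for timestamp, message in log]


# Made-up logs from two computers whose clocks disagree.
log_a = [(100.0, "boot"), (105.0, "SYNC"), (107.5, "lidar scan")]
log_b = [(3.0, "boot"), (5.0, "SYNC"), (6.0, "camera frame")]

offset = offset_from_marker(log_a, log_b, "SYNC")
aligned_b = shift_log(log_b, offset)  # log_b expressed on log_a's clock
print(offset, aligned_b)
```

This only corrects a fixed offset at the moment of the marker. It does nothing about drift over the rest of the log.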

A big problem with doing this manually is remembering to set the clocks. It is easy to say that you will do it, but it is even easier to forget. If you forget to sync the clocks, it can be very difficult to figure out the timing in post-processing. You also get reduced accuracy since the synchronization depends on a human doing something. Another big problem is that manual syncing does not account for clock drift. System clocks (even crystals) can drift with time based on temperature.

Software Based

Using a software based method for time synchronization lets the computers update their time automatically without you needing to worry about it (you should still check from time to time to make sure it is working). The most common way to sync time is with the Network Time Protocol (NTP). With NTP there are two pieces: the daemon and the server. The daemon is the piece of code that runs on each computer and gets the time from the server; it can usually keep systems synchronized to within a few milliseconds. If the computer boots up at time 0 and you want to sync it to another computer, you will often have to run the “ntpdate” command to manually sync the clocks before allowing NTP to run.

The server can either be on your local robot network or on an external network. If the robot is able to connect to the internet then you can use a public NTP server based on an atomic clock for accurate time. Note that NTP updates over the internet can be slightly less accurate, depending on network traffic. The other option is to run the server on a local machine on the robot. If you do this, pick a computer that has an accurate clock (or at least one more accurate than the other clocks). Running the server on a local computer is often the better choice. In an ideal world, where you need the “correct” time and not just “synchronized” time, each client will be configured to use multiple (at least 4) NTP servers to help with error detection and error estimation.
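To show what the daemon is estimating, here is a minimal sketch of the standard NTP offset and round-trip delay calculation from the four timestamps of one request/response exchange. It is only the arithmetic, not an NTP client, and the example numbers are invented.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one exchange.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: reply sent        (server clock)
    t4: reply received    (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0  # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)           # time spent on the network
    return offset, delay


# Invented example: the client is 0.250 s behind the server, each network
# direction takes 0.010 s, and the server spends 0.002 s processing.
offset, delay = ntp_offset_and_delay(t1=10.000, t2=10.260, t3=10.262, t4=10.022)
print(f"offset ~ {offset:.3f} s, delay ~ {delay:.3f} s")
```

The offset estimate is only exact when the network delay is the same in both directions, which is one reason a local server on a quiet network tends to do better than one across the internet.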

One common problem with NTP is that if the time gets out of sync by more than 0.128 s (the default step threshold), the clock will jump to the correct value instead of slowly re-achieving synchronization. To change that behavior you can set “tinker step” to “0.0” in your ntp.conf, which disables jumps to correct the time. The downside is that if the clock differs significantly (even by a couple of seconds) from the server’s time, it may never be able to catch up and reach the correct time.
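To see why disabling jumps can hurt, here is a rough estimate of how long gradual correction (slewing) takes. It assumes a maximum slew rate of 500 ppm, which is the commonly documented limit for ntpd. Treat that number as an assumption and check your own implementation. The function name is made up.

```python
MAX_SLEW_PPM = 500.0  # assumed maximum slew rate; verify for your NTP daemon


def slew_time_seconds(offset_seconds, max_slew_ppm=MAX_SLEW_PPM):
    """Best-case time to remove an offset by slewing at the maximum rate.

    Ignores the clock's own drift, which eats into the available slew rate.
    """
    return abs(offset_seconds) / (max_slew_ppm * 1e-6)


if __name__ == "__main__":
    for offset in (0.128, 2.0, 60.0):
        minutes = slew_time_seconds(offset) / 60.0
        print(f"{offset:>7.3f} s offset -> about {minutes:.1f} minutes of slewing")
```

Under that assumption a 2 second error needs over an hour of slewing in the best case, and if the local clock drifts faster than the slew limit the error never closes at all.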

!!! If you are using APIC and ACPI or have a variable frequency processor then NTP might not work well, since it relies on the processor frequency being constant. You can disable APIC and ACPI in the computer’s startup configuration file or in the BIOS. !!!

This is a sample PTP Grandmaster Clock

Another protocol that can be used is the IEEE 1588 Precision Time Protocol (PTP). This protocol can reach sub-microsecond synchronization between clocks, which can be important for certain systems. With PTP you will want to make sure your network interfaces and switches support it in hardware. Without hardware support you are likely to see performance similar to NTP. The best way to test this is to run both NTP and PTP and compare the results. You can also purchase a dedicated “PTP grandmaster clock”, which will give the best software based time keeping for your network (I know it is a hardware box…).

Some possible vendors are Time Tools and Time and Frequency Solutions.

With either of these protocols (and with the hardware options below) you need to minimize latency between the server/master and the client/consumer. For example, you will want the connection to be as point-to-point as possible, with minimal switching and routing delays. If possible, it is a good idea to have a dedicated network for time information and dedicated Ethernet controllers on each system to handle the timing data.
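Here is a toy model of why latency matters, and in particular why unequal latency in the two directions matters. It simulates one two-way time exchange (the kind of exchange these protocols rely on) with made-up delays and shows that the estimated offset is wrong by half of the difference between the two one-way delays. All names and numbers are hypothetical.

```python
def estimated_offset(true_offset, delay_to_server, delay_to_client):
    """Simulate one two-way exchange and return the offset the client computes.

    true_offset is (server clock - client clock), in seconds.
    """
    t1 = 0.0                                 # client sends request (client clock)
    t2 = t1 + delay_to_server + true_offset  # server receives (server clock)
    t3 = t2                                  # server replies immediately
    t4 = t3 - true_offset + delay_to_client  # client receives (client clock)
    return ((t2 - t1) + (t3 - t4)) / 2.0


# Symmetric path: the estimate matches the true offset.
print(estimated_offset(0.0, delay_to_server=0.001, delay_to_client=0.001))

# Asymmetric path: the estimate is off by half the asymmetry,
# here (0.005 - 0.001) / 2.
print(estimated_offset(0.0, delay_to_server=0.005, delay_to_client=0.001))
```

Switches and routers that queue traffic differently in each direction are a typical source of this kind of asymmetry, which is part of the argument for a dedicated timing network.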

Hardware Based

Within the world of hardware there are some options for time keeping. Probably the easiest to use is GPS. Using a GPS receiver you can get the current time, and many GPS units will also output a Pulse per Second (PPS) signal that can be used to aid in timing for other devices. You can use the time from the GPS to update your system clock as well as for your log files. NTP can be configured to use the GPS/PPS signal for improved accuracy. Here is a link that walks through configuring a computer to use NTP with GPS: http://www.rjsystems.nl/en/2100-ntpd-garmin-gps-18-lvc-gpsd.php. I should point out that GPS time is not the same as UTC. GPS time does not include leap seconds, so there was a 16 second offset between the two when this was written (18 seconds since 2017).
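The GPS-to-UTC correction can be sketched as below. This assumes you have the full GPS week number (not the rolled-over value some receivers report), the seconds into that week, and the current leap second count, which you have to supply yourself since it changes over time. The function name is made up.

```python
from datetime import datetime, timedelta, timezone

# GPS time started counting at midnight UTC on January 6, 1980.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)


def gps_to_utc(gps_week, seconds_of_week, leap_seconds):
    """Convert GPS week and seconds-of-week to a UTC datetime.

    leap_seconds is the GPS-minus-UTC offset valid at that time
    (for example 16 in early 2014).
    """
    gps_elapsed = timedelta(weeks=gps_week, seconds=seconds_of_week)
    return GPS_EPOCH + gps_elapsed - timedelta(seconds=leap_seconds)


# GPS week 1776, four days in, with the 16 second offset from 2014.
print(gps_to_utc(1776, 4 * 86400, leap_seconds=16))  # 2014-01-22 23:59:44+00:00
```

Many receivers and tools such as gpsd apply this correction for you, so check what your device actually outputs before subtracting anything.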

With the PTP grandmaster clock from the section above you can feed a GPS signal into it so it can maintain absolute time, and it will also output either a PPS signal or a 10 MHz signal.

Many embedded systems have no battery, so the computer cannot maintain the time between reboots. They often also exhibit large clock drifts. One way to solve this is with a dedicated real time clock chip, such as the ones from ST Microelectronics. Many embedded computers come with a purchasing option to have one installed on the board (or, in the case of a PC104, in the stackup).


Here is an interesting video about the drift in crystal oscillators.


Image from http://upload.wikimedia.org/wikipedia/commons/7/70/Wooden_hourglass_3.jpg

Liked it? Take a second to support David Kohanbash on Patreon!

Comments

As more people use virtual machines either onboard or offboard robots, it is important to remember that their system clock does not necessarily stay in synch with the hardware’s system clock. You may have to tune the parameter for hardware clock speed (each VM software is different but usually exposes this). You also want to turn off any power optimization in the BIOS that can throttle hardware clock speed.

Of course, it is best to use a PPS input or GPS time for precision sensor data inside a VM, but still a good idea to make the clock sane.

That’s a good point. I have not really had to deal with virtual machines on our robots yet. While I like the idea of being able to back up the VM in case you need to recover, I am always nervous about interfacing all of the sensors into the VM and any associated latency.

Have you ever had to deal with that?

Modern virtual machines are pretty amazing. Not just useful for recovery, but also developing offboard and deploying to multiple platforms, especially on a USB drive. I haven’t done formal experiments, but I haven’t noticed any issues. My thought would be that if your application is sensitive to the VM’s added latency then you should be running a real-time OS on non-x86 architecture, anyway.

The place I’ve seen the clock discrepancy was a data collection. Fortunately the team was able to recover the data using logged GPS time.

More info on the subject here:
http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

I should have been more specific: I’ve done formal robot tests without latency issues, but I have not isolated the VM’s contribution specifically.
