In my blog “Causality and Time in MiFID II”, I looked at the central role of global time synchronization in enabling regulatory authorities to reconstruct all events relating to an electronic order, and implicitly in discerning causal relationships between those events. The arcane discipline of precision timekeeping thus becomes an integral part of engaging in electronic trading and of complying with the regulations governing it.
While timekeeping technologies readily available today can deliver time-signals well within the tightest divergence specified in MiFID II (100 microseconds), achieving this reliably is not always easy.
Given this, we have found that it is often not sufficient to simply roll out a network-based time-keeping protocol and assume it will work correctly.
At Corvil, we have our own clocks that we use to timestamp network packets, and we can also consume timestamps from other systems, such as packet brokers (also known as tap aggregation switches). We have learned through experience that it is critical for us to constantly run sanity checks on the external sources of timestamps to ensure that the analytics we report from them are of the same quality as those based on our own timestamps.
It should be no surprise that the core principle we use to do so is causality: we can detect excessive jitter and noise in external clocks from the apparent causality violations that would arise if those timestamps were naively trusted.
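As a minimal sketch of the idea, consider a stream of events whose causal ordering is known independently of the external clock (for example, request before response on the same connection). If a later event carries an earlier external timestamp, something is wrong with that clock. The function below is illustrative only, not Corvil's actual implementation; the tolerance parameter is an assumption to allow for benign measurement noise.

```python
def causality_violations(events, tolerance_ns=0):
    """events: (sequence_no, external_timestamp_ns) pairs in known causal order.
    Return (earlier_seq, later_seq, apparent_reversal_ns) for each adjacent
    pair where the later event's external timestamp precedes the earlier
    event's by more than the tolerance -- an apparent causality violation."""
    violations = []
    for (seq_a, ts_a), (seq_b, ts_b) in zip(events, events[1:]):
        if ts_b < ts_a - tolerance_ns:
            violations.append((seq_a, seq_b, ts_a - ts_b))
    return violations

# Event 3 is timestamped 5us *before* event 2, which we know it followed:
events = [(1, 1_000_000), (2, 1_012_000), (3, 1_007_000), (4, 1_020_000)]
print(causality_violations(events))  # [(2, 3, 5000)]
```

In practice the tolerance would be set from the expected jitter of the external source, so that only genuinely suspicious reversals are flagged.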
Even when NTP or PTP is well deployed, it is normal for network congestion and other noise to occasionally cause slave clocks to drift far (relatively speaking) from the master. When a clean network time-signal is restored, the slave might realise its clock has been running too far ahead of the master, and an important decision is required: most clock-discipline implementations will jump the clock back to being as close as possible to the master.
In some respects, this is a sensible decision, as it corrects the error as quickly as possible. However this can result in a compromised history: an event that occurred before the correction of the clock can end up with a later timestamp than an event after the correction. In the worst case, this could make it look like the later event caused the earlier one, when the opposite is true!
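One way to observe such backward steps is to take paired readings of the wall clock and a monotonic clock (which is never stepped): if wall-clock time advances by less than monotonic time between two readings, the wall clock was jumped backwards in between. This is a sketch assuming a Linux system and Python 3.7+; the threshold parameter is an assumption to absorb the fact that the two reads are not atomic.

```python
import time

def detect_backward_steps(samples, threshold_ns=10_000):
    """samples: (realtime_ns, monotonic_ns) pairs taken back-to-back.
    CLOCK_MONOTONIC never jumps, so if the wall-clock delta between two
    samples falls short of the monotonic delta by more than the threshold,
    the wall clock was stepped backwards. Returns the negative drifts."""
    steps = []
    for (rt_a, mono_a), (rt_b, mono_b) in zip(samples, samples[1:]):
        drift = (rt_b - rt_a) - (mono_b - mono_a)
        if drift < -threshold_ns:
            steps.append(drift)  # negative: ns the wall clock lost
    return steps

# Taking one paired sample (repeat periodically to build the series):
sample = (time.clock_gettime_ns(time.CLOCK_REALTIME),
          time.clock_gettime_ns(time.CLOCK_MONOTONIC))
```

The same comparison explains why interval measurements should use the monotonic clock: a disciplined wall clock can move backwards, but elapsed time cannot.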
These jumps in clocks are usually very small, but small is a relative term: a few microseconds is not noticeable on a human scale but, when MiFID II mandates microsecond-granular timestamps for some events, such jumps can skew the reporting of critical events.
The lesson here is that it is wise to double-check your selection and configuration of reporting systems, to understand how they discipline their clocks, and to put appropriate sanity checks on the timestamps they record. Again, it is key to check for apparent violations of causality.
Having a well-disciplined clock is the foundation for the ability to accurately timestamp events, but is only part of the puzzle. Reading the clock is more complex than it might seem at first glance: you need to decide where exactly in the chain of processing to consult the clock, and you need to consider just how long it takes to actually read it.
This last point might seem odd at first: reading a computer clock is just a matter of copying the integer counter that counts the passing of "ticks" of time. Consider an analogy: if you're running late for work, you'll have no problem checking the time on your watch or smartphone. The second or two it takes you to look at the device makes no difference to how late you'll be. But at the Olympic 100 metre race, winning times are obviously not taken by human judges consulting a watch. The difference between first and second place in the 2012 Olympics was 120ms, and these events are timed using electronic equipment with an accuracy of better than 1ms. By the same token, when dealing with electronic trading events, you need to ensure that the time it takes to read a clock is negligible compared to the speed at which events occur.
So just how long does it take? The standard way to check the time in software on POSIX systems (Linux and other flavors of UNIX) is a system call named gettimeofday. It's instructive to do some tests of this call and measure just how long it takes: on a modern Intel-based platform and a recent version of Linux, we found that it typically takes less than a microsecond or two, but that it not infrequently takes over 10us. In the context of high-performance trading, these are long times. Electronic exchanges can process and match orders in under 100us, and individual gateways and DMA engines can process messages in a few microseconds.
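A rough version of this measurement can be reproduced without writing C: take back-to-back clock readings and examine the distribution of deltas. The sketch below uses Python's `time.clock_gettime_ns`, whose interpreter overhead inflates the absolute numbers relative to a raw gettimeofday loop, but the characteristic shape (a fast typical case with occasional much longer outliers) is the same; the function name and percentile choices are illustrative.

```python
import time

def clock_read_cost_ns(n=100_000):
    """Estimate the cost of reading the wall clock by taking n back-to-back
    readings and summarising the distribution of deltas between them.
    (Python call overhead is included, so treat these as upper bounds.)"""
    deltas = []
    prev = time.clock_gettime_ns(time.CLOCK_REALTIME)
    for _ in range(n):
        now = time.clock_gettime_ns(time.CLOCK_REALTIME)
        deltas.append(now - prev)
        prev = now
    deltas.sort()
    return {"median_ns": deltas[n // 2],
            "p99_ns": deltas[int(n * 0.99)],
            "max_ns": deltas[-1]}

print(clock_read_cost_ns())
```

The gap between the median and the maximum is the point: it is the occasional slow reading, not the typical one, that skews a timestamp.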
To compare again: gettimeofday taking over 10us to return the time is the equivalent of an Olympic judge taking the whole duration of a 100m race to read his stopwatch! There are faster and more reliable ways of getting an accurate reading of the system clock in modern Linux systems, but the lesson here is that you must do your research before just relying on gettimeofday for software timestamps.
It is also important to consider where, exactly, the timestamping occurs. Take a trading firm that submits a high-frequency order to an exchange, an event they must report under MiFID II with a granularity of 1us. There are several options for where the firm could timestamp the submission:

1. In the trading application, as the order is created and handed to the FIX engine.
2. Within the FIX engine, as it processes the order.
3. As the FIX engine writes the encoded message to the network socket.
4. On the wire, as the packet leaves the NIC.
Any of these options is a reasonable candidate for the point at which to timestamp, and yet the difference between the resulting timestamps can be large, relatively speaking. The difference between the first and third timestamps is the latency through the FIX engine, which can be several tens of microseconds, and the second timestamp is at some unknown point in between; the difference between the third and fourth timestamps is the latency of the OS network stack and the NIC (network interface card), which can also be tens of microseconds.
It is striking how large the differences between these various choices of timestamps are in the context of the MiFID II requirements of clock synchronization to within 100us. These differences are also subject to the uncertainty that we have seen can be involved in reading clocks in software, and can be seriously compounded by poor clock-synchronization.
While the timing requirements currently articulated in MiFID II are not too onerous, they are only going to be tightened. We know that earlier drafts of the RTSs specified much tighter maximum divergences. In this environment, establishing and proving compliance with MiFID II will require us all to pay much greater attention to the arcana of clock synchronization and timestamping.