digitalmars.D - The Trouble with MonoTimeImpl (including at least one bug)
- Forest (62/62) Apr 02 2024 I'm working on code that needs to know not only how much time has
- Jonathan M Davis (188/250) Apr 02 2024 Well, it would appear that the core problem here is that you're trying t...
- Forest (59/108) Apr 09 2024 I believe you. I think what I've been trying is reasonable,
- Gregor Mückl (5/7) Apr 03 2024 I think you are looking for `QueryPerformanceFrequency()`.
I'm working on code that needs to know not only how much time has elapsed between two events, but also the granularity of the timestamp counter. (It's networking code, but I think granularity can be important in multimedia stream synchronization and other areas as well. I would expect it to matter in many of the places where using MonoTimeImpl.ticks() makes sense.)

For clarity, I will use "units" to mean the counter's integer value, and "steps" to mean the regular increases in that value.

POSIX exposes counter granularity as nanoseconds-per-step via clock_getres(), and MonoTimeImpl exposes its reciprocal (1/n) via ticksPerSecond(). I'm developing on linux, so this appeared at first to be sufficient for my needs.

However, I discovered later that ticksPerSecond() doesn't return counter granularity on Windows or Darwin. On those platforms, it returns units-per-second instead: the precision of one unit, rather than one step.

This is problematic because:

- The function returns conceptually different information depending on the platform.
- The API offers the needed granularity information on only one platform.
- The API is confusing, by using the word "ticks" for two different concepts.

I think this has gone unnoticed due to a combination of factors:

- Most programs do only simple timing calculations that don't include a granularity term.
- There happens to be a 1:1 unit:step ratio in some common cases, such as on my linux box when using MonoTime's ClockType.normal.
- On Windows and Darwin, MonoTimeImpl uses the same fine-grained source clock regardless of what ClockType is selected. It's possible that these clocks have a 1:1 unit:step ratio as well. (Unconfirmed; I don't have a test environment for these platforms, and I haven't found a definitive statement in their docs.)
- On POSIX, although selecting ClockType.coarse should reveal the problem, it turns out that ticksPerSecond() has a special case when clock steps are >= 1us, that silently discards the platform's clock_getres() result and uses a hard-coded value unit:step ratio, hiding the problem.

Potential fixes/improvements:

1. Give MonoTimeImpl separate functions for reporting units-per-second and steps-per-second (or some other representation of counter granularity, like units-per-step) on all platforms.
2. [...] author used that hard-coded value not because clock_getres() ever returned wrong data, but instead because they misunderstood what clock_getres() does. (Or if *I* have misunderstood it, please enlighten me.)
3. Implement ClockType.coarse with an actually-coarse clock on all platforms that have one. This wouldn't solve the above problems, but it would give programmers access to a presumably more efficient clock and would allow them to avoid Apple's extra scrutiny/hoops for use of a clock that can fingerprint devices.
   https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time

Open questions for people who use Win32 or Darwin:

Does Win32 have an API to get the granularity (units-per-step or steps-per-second) of QueryPerformanceCounter()?

Does Darwin have such an API for mach_absolute_time()?

If the unit:step ratio of the Win32 or Darwin clocks are always 1:1, is that clearly documented somewhere official?

Do either of those platforms offer a coarse monotonic clock?
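The discrepancy described in this post can be checked on a Linux machine with a minimal sketch along these lines. It prints what MonoTimeImpl reports for ClockType.normal and ClockType.coarse next to what clock_getres() reports for the POSIX clocks. The helper name stepNanos is made up for illustration, and pairing ClockType.coarse with CLOCK_MONOTONIC_COARSE is an assumption about how druntime maps clock types on Linux. No output is shown because the values depend on the kernel and the druntime version.

```d
import core.time : ClockType, MonoTimeImpl;
import std.stdio : writefln;

version (linux)
{
    import core.sys.linux.time : CLOCK_MONOTONIC_COARSE;
    import core.sys.posix.time; // clock_getres, CLOCK_MONOTONIC, timespec

    /// Hypothetical helper: nanoseconds per clock step as reported by
    /// clock_getres(), or -1 if the call fails.
    long stepNanos(int clockId)
    {
        timespec res;
        if (clock_getres(clockId, &res) != 0)
            return -1;
        return res.tv_sec * 1_000_000_000L + res.tv_nsec;
    }
}

void main()
{
    // What druntime reports for the two clock types.
    writefln("normal ticksPerSecond: %s",
             MonoTimeImpl!(ClockType.normal).ticksPerSecond);
    writefln("coarse ticksPerSecond: %s",
             MonoTimeImpl!(ClockType.coarse).ticksPerSecond);

    version (linux)
    {
        // What the kernel reports for the underlying POSIX clocks.
        writefln("CLOCK_MONOTONIC step:        %s ns",
                 stepNanos(CLOCK_MONOTONIC));
        writefln("CLOCK_MONOTONIC_COARSE step: %s ns",
                 stepNanos(CLOCK_MONOTONIC_COARSE));
    }
}
```

If the coarse clock's step is well above 1 ns while the coarse ticksPerSecond still reads as nanosecond resolution, that would be the special case described above at work.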
Apr 02 2024
Well, it would appear that the core problem here is that you're trying to use MonoTime in a way that was not designed for or was even thought of when it was written.
For some history here, previously, we had TickDuration, which was used both as a point in time / timestamp of the monotonic clock and as a duration in ticks of the monotonic clock, so it conflated the two, making it potentially quite confusing to use (and making it so that the type system couldn't differentiate between a timestamp of the monotonic clock and a duration in ticks of the monotonic clock). MonoTime was written to replace it and to be only a timestamp, with Duration being left as the only way to represent durations of time (or at least, the only way to represent durations of time with units).

Originally, MonoTime simply used the normal monotonic clock on the system. There was no way to configure it. However, when std.logger was originally being written, it was using MonoTime fairly heavily (I don't know what it does now), and they were finding that it was too much of a performance hit to be getting the monotonic time as frequently as they were. So, to improve the situation, I created MonoTimeImpl as a type which was templated on the type of the monotonic clock, and I made MonoTime an alias to that which used the normal monotonic clock. That way, the logger stuff could use the coarse clock on POSIX systems (other than Mac OS X), and since I was templating it on the clock type, I added support for several of the clock types available on POSIX systems.

For simplicity, I also made it so that systems which didn't actually support a coarse clock would just use the normal monotonic clock so that stuff like std.logger wouldn't have to worry about whether a particular system actually supported the coarse clock. It would just get the coarse clock if it was available and get the normal clock otherwise. But I was not aware of either Windows or Mac OS X having alternative monotonic clocks, so they got none of that support. They just get the normal clock, and trying to use the coarse clock on those systems gives you the normal clock.
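As a rough illustration of the templating described above (a sketch, not code taken from std.logger), the coarse clock can be selected through MonoTimeImpl and then used the same way as MonoTime. The alias name CoarseTime is hypothetical.

```d
import core.time : ClockType, Duration, MonoTime, MonoTimeImpl;

// Hypothetical alias; druntime itself only provides the MonoTime alias.
alias CoarseTime = MonoTimeImpl!(ClockType.coarse);

void main()
{
    // MonoTime is MonoTimeImpl instantiated with the normal clock.
    static assert(is(MonoTime == MonoTimeImpl!(ClockType.normal)));

    // Usage is the same regardless of the clock type: take two
    // timestamps and subtract them to get a Duration.
    immutable start = CoarseTime.currTime;
    immutable end = CoarseTime.currTime;
    Duration elapsed = end - start;

    // A monotonic clock never goes backwards, though a coarse clock
    // may well report zero elapsed time for two back-to-back reads.
    assert(elapsed >= Duration.zero);
}
```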
But the basic design of MonoTime did not change as part of this. The other clocks were really just added for performance reasons. And as far as using MonoTime/MonoTimeImpl goes, it was intended that the type of the clock be irrelevant to its usage.

As for the basic design behind MonoTime, the idea was that you were going to do something like

    immutable start = MonoTime.currTime();
    ...
    immutable end = MonoTime.currTime();
    immutable timeElapsed = end - start;

That's it. It was never intended to tell you _anything_ about the monotonic clock. It was just for getting the time from the monotonic clock so that you could determine how much time had elapsed when you subtracted two monotonic times. The only reason that ticks and ticksPerSecond were exposed was because Duration uses hecto-nanoseconds, whereas the monotonic clock will be more precise than that on some systems, so ticks and ticksPerSecond were provided so that anyone who needed units more precise than hecto-nanoseconds could do the math themselves.

So, while ideally, ticksPerSecond corresponds to the resolution of the system clock, it was never intended as a way to decide anything about the system clock. It's there purely so that the appropriate math can be done on the difference between two different values for ticks, and it never occurred to me that anyone might try to do anything else with it - if nothing else, because I've never encountered code that cared about anything about the monotonic time other than using it to get the difference between two points in time without worrying about the system clock changing on you like can happen with the real time clock.

In principle, ticks is supposed to be a number from the system clock representing the number of ticks of the system clock, with the exact number and its meaning being system-dependent.
How often the number that you can get from the system is updated was not considered at all relevant to the design, and it never occurred to me to call the number anything other than ticks, because in principle, it represents the current tick of the system clock when the time was queried. Either way, it's purely a monotonic timestamp. It's not intended to tell you anything about how often it's updated by the system. ticksPerSecond is then how many of those ticks of the system clock there are per second so that we can do the math to figure out the actual duration of time between two ticks of the system clock.

Windows and Mac OS X both provide what is basically the correct API for that. They have a function which gives you the current tick of the system clock, and they have a function which tells you what you need to know to know how many ticks there are per second (though both of them provide that in a slightly odd way instead of as just an integer value).

Unfortunately, other POSIX systems don't have functions that work that way. Instead, they reused the same function and data type that's used with the real-time clock but changed one of the arguments to let you tell it to use a different clock. So, instead of getting the ticks of the system clock, you get the duration in nanoseconds. So, to make that fit the model of ticks and ticks-per-second, we have to convert that to ticks. For the purposes of MonoTime and how it was designed to be used, we could have just always made it nanoseconds and assumed nanosecond resolution for the system clock, but instead, it attempts to get the actual resolution of the system's monotonic clock and convert the nanoseconds back to that. Obviously, that's also better for anyone who wants to do something with the actual clock resolution, but really, it was done just to try to make the implementation consistent with what happens on other platforms.
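The math that ticks and ticksPerSecond were exposed for can be sketched as follows. This is only an illustration of the intended use, assuming nanoseconds are the unit you want; the variable names are arbitrary.

```d
import core.time : convClockFreq, MonoTime;

void main()
{
    immutable before = MonoTime.currTime;
    // ... the work being timed would go here ...
    immutable after = MonoTime.currTime;

    // Raw difference in system-dependent ticks.
    immutable long ticksPassed = after.ticks - before.ticks;

    // Rescale from ticksPerSecond to 1_000_000_000 units per second,
    // i.e. from ticks to nanoseconds.
    immutable long nsecs =
        convClockFreq(ticksPassed, MonoTime.ticksPerSecond, 1_000_000_000);

    assert(nsecs >= 0);
}
```

MonoTime.ticksToNSecs(ticksPassed) is available for the same conversion. Either way, the result is only meaningful as an elapsed time; nothing in this calculation says how often the underlying counter is actually updated.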
It never occurred to me that anyone would even be trying to query the clock resolution to do something with that information besides what was necessary to convert a duration in ticks to a duration in seconds or fractional seconds. And honestly, if it had occurred to me that anyone would ever consider using ticksPerSecond for anything other than doing math on the difference between two values of ticks, I probably would have provided a function for giving the difference between two ticks in nanoseconds rather than exposing either ticks or ticksPerSecond, since then everything about how that's calculated or stored could be considered to be entirely an implementation detail, particularly since trying to expose information about the system clock gets kind of hairy given that the APIs on each OS are completely different. And that way, all of the issues that you're dealing with right now wouldn't have even come up. You may have then created a feature request to get what you were looking for, but at least you wouldn't have been confused about what MonoTime was doing.

As for the oddities with how ticksPerSecond is set when a weird value is detected, I did some digging, and it comes from two separate issues.

The first - https://issues.dlang.org/show_bug.cgi?id=16797 - has to do with how apparently on some Linux systems (CentOS 6.4/7 were mentioned as specific cases), clock_getres will report 0 for some reason, which was leading to a division by zero. So, to make that work properly, it was set to nanosecond precision just like in the case that already existed.

As for the other case, it looks like that was an issue that predates MonoTime and was originally fixed in TickDuration. The original PR was

https://github.com/dlang/druntime/pull/88

The developer who created that PR reported that with clock_getres, some Linux kernels were giving a bogus value that was close to 1 millisecond when he had determined that (on his system at least) the actual resolution was 1 nanosecond.
And while it's not exactly ideal to just assume that the resolution is 1 nanosecond in that case, it works perfectly well, because on those systems, the way that the monotonic time is actually reported is in nanoseconds. So, to fix those systems, that's what we did. And naturally, that fix made it into MonoTime when it was created so that MonoTime would also work correctly on those systems.

I don't know if either of those fixes is still required, but neither of them violates the intended use case for MonoTime, and since it never occurred to me that someone would try to use ticksPerSecond for anything else other than doing some math on ticks, I thought that it was acceptable to have ticksPerSecond be nanosecond resolution in those cases even if it wasn't ideal. And the exact resolution at which MonoTime gives up and decides to just use nanosecond resolution doesn't matter all that much in that case either, because the math is still all going to come out correctly. And at the time it was written, the coarse clock didn't even enter into the equation, so realistically, the only affected systems were ones that were reporting bizarrely low clock resolutions with clock_getres.

Now, obviously, if you're looking to get the actual resolution of the monotonic clock, the current situation is going to cause problems on any system where clock_getres reports a value that causes MonoTime to decide that it's invalid and that it needs to use nanoseconds instead. So, the question becomes what we do about this situation.

As for your discussion on units vs steps, I have no clue what to do with that and any API. On Windows and Mac OS X, we get a number from the system clock with no information about how often that's updated - either internally or with regards to the number that we get when making the function call. Because they're ticks of the monotonic clock, I called them ticks. But how they're incremented is an implementation detail of the system, and the documentation doesn't say.
All I know is that they're incremented monotonically. So, I guess that what I'm calling ticks, you would call units (Mac OS X appears to call them tick units - https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time), but they're what the system provides, and I'm not aware of anything from those APIs which would allow me to provide any information about what you're calling steps. Those function calls update at whatever rate they update, and without something in the documentation or some other function which provides more information, I don't see how to provide an API that would give that information.

On Linux and the BSDs, the situation isn't really any better. Instead of getting the "tick units", we get nanoseconds. We can then use clock_getres to get the resolution of a given system clock, which then indicates how many units there are per second, but it doesn't say anything about steps.

Though looking over your post again, it sounds like maybe you interpret the clock resolution as being steps? I've always taken it to mean the resolution of the actual clock, not how often the value from clock_gettime would be updated, which if I understand what you're calling units and what you're calling steps correctly would mean that clock_getres was giving the number of units per second, not the number of steps per second. But even if clock_getres actually returns steps, if there is no equivalent on Windows and Mac OS X, then it's not possible to provide a cross-platform solution. So, I don't see how MonoTime could provide what you're looking for here.

Honestly, at this point, I'm inclined to just make it so that ticksPerSecond is always nanoseconds on POSIX systems other than Mac OS X. That way, we're not doing some math that is pointless if all we're trying to do is get monotonic timestamps and subtract them. It should also improve performance slightly if we're not doing that math.
And of course, it completely avoids any of the issues surrounding clock_getres reporting bad values of any kind on any system.

Obviously, either way, I need to improve the documentation on ticks and ticksPerSecond, because they're causing confusion. If we can then add functions of some kind that give you the additional information that you're looking for, then we can look at adding them. But all that MonoTime was designed to deal with was what you're calling units, and I don't see how we can add anything about steps, because I don't know how we'd get that information from the system.

As for adding coarse clocks on other systems, if there are coarse clocks on other systems, then I'm all for adding them. But I'm not aware of any such clocks on either Windows or Mac OS X. That's why they're not part of MonoTime.

- Jonathan M Davis
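For readers following along, the idea of treating the POSIX timestamp as nanoseconds outright can be sketched like this. It is not druntime's implementation, only an illustration of the approach on Linux; monotonicNanos and nanosPerSecond are hypothetical names.

```d
version (linux)
{
    import core.sys.posix.time; // clock_gettime, CLOCK_MONOTONIC, timespec

    /// With nanosecond units, the conversion factor is a constant and
    /// clock_getres() is never consulted.
    enum long nanosPerSecond = 1_000_000_000;

    /// Hypothetical helper: monotonic timestamp expressed directly in
    /// nanoseconds, which is the unit clock_gettime() already uses.
    long monotonicNanos()
    {
        timespec ts;
        if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
            assert(0, "clock_gettime failed");
        return ts.tv_sec * nanosPerSecond + ts.tv_nsec;
    }
}

void main()
{
    version (linux)
    {
        immutable first = monotonicNanos();
        immutable second = monotonicNanos();
        assert(second >= first); // monotonic: never goes backwards
    }
}
```

The trade-off is the one discussed above: elapsed-time math stays correct, but the value no longer carries any hint about the clock's resolution.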
Apr 02 2024
On Wednesday, 3 April 2024 at 00:09:25 UTC, Jonathan M Davis wrote:
> Well, it would appear that the core problem here is that you're trying to use MonoTime in a way that was not designed for or was even thought of when it was written.

I believe you. I think what I've been trying is reasonable, though, given that the docs and API are unclear and the source code calls a platform API that suggests I've been doing it right. Maybe this conversation can lead to improvements for future users.

> In principle, ticks is supposed to be a number from the system clock representing the number of ticks of the system clock, with the exact number and its meaning being system-dependent. How often the number that you can get from the system is updated was not considered at all relevant to the design, and it never occurred to me to call the number anything other than ticks, because in principle, it represents the current tick of the system clock when the time was queried. Either way, it's purely a monotonic timestamp. It's not intended to tell you anything about how often it's updated by the system.

The use of "ticks" has been throwing me. I am familiar with at least two common senses of the word:

1. A clock's basic step forward, as happens when a mechanical clock makes a tick sound. It might take a fraction of a second, or a whole second, or even multiple seconds. This determines clock resolution.
2. One unit in a timestamp, which determines timestamp resolution. On some clocks, this is the same as the first sense, but not on others.

From what you wrote above, I *think* you've generally been using "ticks" in the second sense. Is that right? [Spoiler: Yes, as stated toward the end of your response.]

If so, and if the API's use of "ticks" is intended to be that as well, then I don't see why ticksPerSecond() is calling clock_getres(), which measures "ticks" in the first sense of the word.
(This is my reading of the glibc man page, and is confirmed [...]

> Unfortunately, other POSIX systems don't have functions that work that way. [...] So, instead of getting the ticks of the system clock, you get the duration in nanoseconds. So, to make that fit the model of ticks and ticks-per-second, we have to convert that to ticks. For the purposes of MonoTime and how it was designed to be used, we could have just always made it nanoseconds and assumed nanosecond resolution for the system clock, but instead, it attempts to get the actual resolution of the system's monotonic clock and convert the nanoseconds back to that.

Ah, so it turns out MonoTime is trying to represent "ticks" in the first sense (clock steps / clock resolution). That explains the use of clock_getres(), but it's another source of confusion, both because the API doesn't include anything to make that conversion useful, and because ticksPerSecond() has that hard-coded value that sometimes renders the conversion incorrect.

> As for the oddities with how ticksPerSecond is set when a weird value is detected, I did some digging, and it comes from two separate issues. The first - https://issues.dlang.org/show_bug.cgi?id=16797 - has to do with how apparently on some Linux systems (CentOS 6.4/7 were mentioned as specific cases), clock_getres will report 0 for some reason, which was leading to a division by zero.

Curious. The clock flagged in that bug report is CLOCK_MONOTONIC_RAW, which I have never used. I wonder, could clock_getres() have been reporting 0 because that clock's resolution was finer than that of the result type? Or, could the platform have been determining the result by sampling the clock at two points in time so close together that they landed within the same clock step, thereby yielding a difference of 0?
In either case, perhaps the platform code has been updated since that 2013 CentOS release; it reported 1 when I tried it on my Debian system today.

> As for the other case, it looks like that was an issue that predates MonoTime and was originally fixed in TickDuration. The original PR was https://github.com/dlang/druntime/pull/88 The developer who created that PR reported that with clock_getres, some Linux kernels were giving a bogus value that was close to 1 millisecond when he had determined that (on his system at least) the actual resolution was 1 nanosecond.

I disagree with that developer's reasoning. Why should our standard library override a value reported by the system, even if the value was surprising? If the system was reporting 1 millisecond for a good reason, I would want my code to use that value. If it was a system bug, I would want it confirmed by the system maintainers before meddling with it, and even then, I would want any workaround to be in my application, not library code where the fake value would persist long after the system bug was fixed.

> Honestly, at this point, I'm inclined to just make it so that ticksPerSecond is always nanoseconds on POSIX systems other than Mac OS X. That way, we're not doing some math that is pointless if all we're trying to do is get monotonic timestamps and subtract them. It should also improve performance slightly if we're not doing that math.

I think that makes sense. The POSIX API's units are defined as nanoseconds, after all, so treating them as such would make the code both correct and easier to follow. If we eventually discover APIs or docs on the other platforms that give us clock resolution (in either timestamp units or fractions of a second) like clock_getres() does on POSIX, then it could be exposed through a separate MonoTime method.

> If we can then add functions of some kind that give you the additional information that you're looking for, then we can look at adding them.

Yes, we're thinking along the same lines. :)

Thanks for the thoughtful response.

Forest
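Where a platform offers no API for clock resolution, one fallback is to estimate the step size empirically by reading the clock until its value changes. The sketch below does that with MonoTimeImpl; observedStepTicks is a hypothetical helper, and the result is only an upper bound on the true step, since it also includes the cost of the calls themselves. It busy-waits, so a coarse clock may keep it spinning for several milliseconds per sample.

```d
import core.time : ClockType, MonoTimeImpl;
import std.stdio : writefln;

/// Hypothetical helper: smallest nonzero difference seen between
/// consecutive readings of the given clock, in that clock's ticks.
long observedStepTicks(ClockType clockType)(size_t samples = 5)
{
    alias Clock = MonoTimeImpl!clockType;
    long smallest = long.max;
    foreach (_; 0 .. samples)
    {
        immutable long first = Clock.currTime.ticks;
        long next = Clock.currTime.ticks;
        while (next == first)          // spin until the counter advances
            next = Clock.currTime.ticks;
        if (next - first < smallest)
            smallest = next - first;
    }
    return smallest;
}

void main()
{
    writefln("normal: observed step <= %s ticks (ticksPerSecond = %s)",
             observedStepTicks!(ClockType.normal)(),
             MonoTimeImpl!(ClockType.normal).ticksPerSecond);
    writefln("coarse: observed step <= %s ticks (ticksPerSecond = %s)",
             observedStepTicks!(ClockType.coarse)(),
             MonoTimeImpl!(ClockType.coarse).ticksPerSecond);
}
```

An observed step much larger than one tick would suggest that the counter's units and its steps differ on that clock, which is the distinction this thread is about.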
Apr 09 2024
On Tuesday, 2 April 2024 at 18:35:24 UTC, Forest wrote:
> Does Win32 have an API to get the granularity (units-per-step or steps-per-second) of QueryPerformanceCounter()?

I think you are looking for `QueryPerformanceFrequency()`. There is also this document detailing how Windows implements this timer (including some usage guidance) here:

https://learn.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps
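A minimal sketch of calling `QueryPerformanceFrequency()` from D might look like the following. The function is declared locally with a `long*` parameter, which is an assumption made so the example doesn't depend on how druntime's Windows bindings spell `LARGE_INTEGER`; the variable name is arbitrary. Note that the value is documented as counts per second of the performance counter, so whether it also answers the units-versus-steps question is left open here.

```d
import std.stdio : writefln, writeln;

version (Windows)
{
    // Local declaration (an assumption for this sketch): the Win32
    // function takes a pointer to a 64-bit LARGE_INTEGER and returns
    // a nonzero BOOL on success.
    extern (Windows) int QueryPerformanceFrequency(long* lpFrequency)
        nothrow @nogc;
}

void main()
{
    version (Windows)
    {
        long countsPerSecond;
        if (QueryPerformanceFrequency(&countsPerSecond) != 0)
            writefln("performance counter frequency: %s counts/second",
                     countsPerSecond);
        else
            writeln("QueryPerformanceFrequency() failed");
    }
    else
    {
        writeln("QueryPerformanceFrequency() is only available on Windows.");
    }
}
```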
Apr 03 2024