digitalmars.D - Printing floating points
- deadalnix (10/19) Jan 21 2021 While we are on this, I should point out that there has been a
- Stefan Koch (5/17) Jan 21 2021 I think Ilya has implemented this in mir already.
- jmh530 (3/8) Jan 21 2021 Discussed here
- deadalnix (5/9) Jan 22 2021 ryu pushes thing forward significantly, and proved it was correct
- Walter Bright (3/7) Jan 24 2021 True, but you *can* do it for float values, and the double algorithm is ...
- Berni44 (9/11) Jan 21 2021 This is well known to me (I wrote PR 7264). Unfortunately ryu and
- Walter Bright (3/4) Jan 22 2021 It's Boost licensed, yay!
- Steven Schveighoffer (9/15) Jan 22 2021 I believe Berni44 has already made a PR for this (and did so a while ago...
- Imperatorn (3/21) Jan 22 2021 Interesting, doesn't seem to be ryu though
- Paul Backus (11/15) Jan 22 2021 According to the Ryu README on Github [2], "Ryu generates the
- deadalnix (3/19) Jan 22 2021 People have been suing variation of this for printf. But yes, it
- deadalnix (2/4) Jan 22 2021 Related paper: https://dl.acm.org/doi/pdf/10.1145/3360595
- Berni44 (105/117) Jan 26 2021 Being the author of the PR mentioned here several times, I feel
- Joseph Rushton Wakeling (10/12) Jan 26 2021 It also notes that "the size could be halved to 52 kB with no
- Bruce Carneal (11/24) Jan 26 2021 I'd vote for "overweight and almost-certainly-correct" over "trim
- Joseph Rushton Wakeling (6/16) Jan 26 2021 Yup, I'm in complete agreement with you on this. I am curious if
- Paul Backus (4/8) Jan 26 2021 Judging by this presentation, it's just regular Ryu:
- Bruce Carneal (8/17) Jan 26 2021 This link: https://github.com/microsoft/STL/issues/172
- Berni44 (31/37) Jan 27 2021 I'd like to compare the task to write floating point numbers to
- Bruce Carneal (5/19) Jan 27 2021 I'm quite confident that I have significantly less knowledge in
- Berni44 (5/6) Jan 27 2021 I added one more, going through the code in detail:
- H. S. Teoh (8/22) Jan 26 2021 If there was a way to make these tables pay-as-you-go, I'd vote for it.
- Joseph Rushton Wakeling (4/8) Jan 26 2021 Agree completely -- pay-as-you-go should be important for
- Walter Bright (5/13) Jan 26 2021 I've always thought things like this should be in the operating system. ...
- H. S. Teoh (10/25) Jan 27 2021 Or get it wrong once, and everyone rolls their own, all slightly wrong
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/10) Jan 26 2021 Network transfer costs responsiveness, Web/WASM. 100Kb to do
- Walter Bright (2/6) Jan 26 2021 We only really need it when the C stdlib printf is inadequate.
- Steven Schveighoffer (7/12) Jan 27 2021 There is a way -- dynamic libraries. It's how C does it.
- H. S. Teoh (12/25) Jan 27 2021 [...]
- Steven Schveighoffer (8/34) Jan 29 2021 The idea is it would go into the phobos shared library, not a specific
- deadalnix (4/13) Jan 22 2021 I looked. i can tell you it is not ryu. I would need to dive much
- Walter Bright (3/5) Jan 24 2021 One way is to randomly generate floating point bit patterns and compare ...
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (3/8) Jan 25 2021 You only need to test the boundary cases, so one can do this, no
- Bruce Carneal (9/18) Jan 25 2021 My guess is that knowing what the actual boundary cases are is
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (3/6) Jan 26 2021 Why is that? If the proof is for the algorithm then it has
- Bruce Carneal (13/19) Jan 26 2021 If you can construct a tractable exhaustive proof of an
- John Colvin (15/36) Jan 26 2021 You can probably inductively prove - or at least become quite
- Bruce Carneal (11/52) Jan 26 2021 Yes. If we can define, with confidence, a representative set
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (11/16) Jan 26 2021 In general, if one understands the algorithm as implemented then
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (9/14) Jan 26 2021 Or to put it another way. If you can prove that these implication
- Bruce Carneal (14/28) Jan 26 2021 I think this is a great way to go for simple functions. If the
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (5/7) Jan 26 2021 Not sure what you mean... Why would anyone waste their time on
- Bruce Carneal (19/27) Jan 26 2021 You made an assertion, in a proof discussion, that the tractable
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (9/12) Jan 26 2021 Eh. I argued that you can fully cover an implementation without
- Bruce Carneal (18/26) Jan 27 2021 No, you made an assertion regarding the tractable extension of an
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (24/32) Jan 27 2021 The original statement was that there was only a choice between
- Bruce Carneal (6/10) Jan 27 2021 Great. Two questions: 1) Do you believe that a trivial
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (16/20) Jan 27 2021 Certainly trivial from a theoretical perspective, in the sense
- Bruce Carneal (8/28) Jan 27 2021 Perhaps you mistook my meaning when I used the term "trivial". I
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (7/10) Jan 28 2021 Asking for a mapping does not really make much sense without
- Bruce Carneal (21/32) Jan 28 2021 For the problem under discussion the domain is 0 .. 2**NBits
- Bruce Carneal (2/6) Jan 28 2021 That should be 2**N - 1.
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (30/40) Jan 28 2021 Let's backtrack then. My suggestion was that one does an informal
- Bruce Carneal (19/60) Jan 28 2021 No, of course not. The parenthetical phrase was meant to apply
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (11/17) Jan 29 2021 I am not asserting anything, other than that I don't think full
- Bruce Carneal (9/27) Jan 29 2021 Then I misunderstood you all along. I thought that you were
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (5/7) Jan 29 2021 A formal proof would have to be machine checked. So you need a
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (10/17) Jan 29 2021 And you would also need to annotate the D code with Hoare logic
- sarn (9/17) Jan 29 2021 Have you seen KLEE? It applies a solver like Z3 to normal code
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (11/14) Jan 29 2021 I've only read about KLEE, which looks interesting. Using LLVMIR
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/9) Jan 30 2021 Or more likely 128 boolean expressions, since you probably have
- Afgdr (8/10) Jan 30 2021 how about a full coverage of a half float type, assuming the
- Afgdr (3/13) Jan 30 2021 Well I forgot that it may use LUTs so maybe not in this case, but
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/8) Jan 30 2021 Yes, I get your idea, but I think one has to be a bit more
- Bruce Carneal (9/20) Jan 28 2021 We seem to repeatedly misunderstand one another on this topic so
- Bruce Carneal (16/33) Jan 26 2021 I believed that to be obvious. There is no utility in
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (9/12) Jan 26 2021 Yes, the real issue in practice is constantly changing
https://forum.dlang.org/post/r1hsv2$tt0$1 digitalmars.com On Thursday, 6 February 2020 at 20:29:31 UTC, Walter Bright wrote:This one is always going to be a tough one: https://github.com/dlang/phobos/pull/7264 Floating point code is extremely difficult to get right, and contains a lot of very subtle traps for the unwary. (It's why people are still writing research papers on formatting floating point numbers.) If there's a mistake in it that subtly breaks someone's floating point code, it would cause them to not trust D. Just finding someone capable of reviewing it thoroughly would be very difficult for any community.While we are on this, I should point out that there has been a major breakthrough on the matter fairly recently called ryu. I invite everyone who has an interest in printing floating to look at it. Here are a few useful links: The original paper: https://dl.acm.org/doi/pdf/10.1145/3296979.3192369 Another paper: https://dl.acm.org/doi/pdf/10.1145/3360595 Source code in various languages: https://github.com/ulfjack/ryu
Jan 21 2021
On Thursday, 21 January 2021 at 14:02:52 UTC, deadalnix wrote:https://forum.dlang.org/post/r1hsv2$tt0$1 digitalmars.com On Thursday, 6 February 2020 at 20:29:31 UTC, Walter Bright wrote:I think Ilya has implemented this in mir already. ryu is less of a breakthrough imo. It's a modification of grisu2 (hence the name ryu (dragon in Japanese))[...]While we are on this, I should point out that there has been a major breakthrough on the matter fairly recently called ryu. I invite everyone who has an interest in printing floating to look at it. Here are a few useful links: The original paper: https://dl.acm.org/doi/pdf/10.1145/3296979.3192369 Another paper: https://dl.acm.org/doi/pdf/10.1145/3360595 Source code in various languages: https://github.com/ulfjack/ryu
Jan 21 2021
On Thursday, 21 January 2021 at 14:13:32 UTC, Stefan Koch wrote:[snip] I think Ilya has implemented this in mir already. ryu is less of a breakthrough imo. It's a modification of grisu2 (hence the name ryu (dragon in Japanese))Discussed here https://forum.dlang.org/thread/yobmmccdvbmmbaolehbs forum.dlang.org
Jan 21 2021
On Thursday, 21 January 2021 at 14:13:32 UTC, Stefan Koch wrote:I think Ilya has implemented this in mir already. ryu is less of a breakthrough imo. It's a modification of grisu2 (hence the name ryu (dragon in Japanese))ryu pushes thing forward significantly, and proved it was correct to do so. I can't stress enough the importance of that last part, because there are no way you can check all possible double values by yourself and see if the output is correct.
Jan 22 2021
On 1/22/2021 5:11 AM, deadalnix wrote:ryu pushes thing forward significantly, and proved it was correct to do so. I can't stress enough the importance of that last part, because there are no way you can check all possible double values by yourself and see if the output is correct.True, but you *can* do it for float values, and the double algorithm is just more bits.
Jan 24 2021
On Thursday, 21 January 2021 at 14:02:52 UTC, deadalnix wrote:While we are on this, I should point out that there has been a major breakthrough on the matter fairly recently called ryu.This is well known to me (I wrote PR 7264). Unfortunately ryu and format("%f") have different goals, so ryu cannot be used there. Anyway, I have some ideas of extending the abilities of format, where ryu (or a similiar algorithm) might make it into phobos. But don't expect this to happen too soon. As far as I know, there are allready several implementations of ryu in D available. Berni
Jan 21 2021
On 1/21/2021 6:02 AM, deadalnix wrote:Source code in various languages: https://github.com/ulfjack/ryuIt's Boost licensed, yay! Who wants to get the leet street cred for integrating this into dmd?
Jan 22 2021
On 1/22/21 4:33 AM, Walter Bright wrote:On 1/21/2021 6:02 AM, deadalnix wrote:I believe Berni44 has already made a PR for this (and did so a while ago). Original PR: https://github.com/dlang/phobos/pull/7264 (includes a nice paper on what he did) New PR: https://github.com/dlang/phobos/pull/7757 Or maybe I misunderstand what he did. I can't make heads or tails of this stuff. Amaury, can you take a look at that PR? -SteveSource code in various languages: https://github.com/ulfjack/ryuIt's Boost licensed, yay! Who wants to get the leet street cred for integrating this into dmd?
Jan 22 2021
On Friday, 22 January 2021 at 14:02:53 UTC, Steven Schveighoffer wrote:On 1/22/21 4:33 AM, Walter Bright wrote:Interesting, doesn't seem to be ryu thoughOn 1/21/2021 6:02 AM, deadalnix wrote:I believe Berni44 has already made a PR for this (and did so a while ago). Original PR: https://github.com/dlang/phobos/pull/7264 (includes a nice paper on what he did) New PR: https://github.com/dlang/phobos/pull/7757 Or maybe I misunderstand what he did. I can't make heads or tails of this stuff. Amaury, can you take a look at that PR? -SteveSource code in various languages: https://github.com/ulfjack/ryuIt's Boost licensed, yay! Who wants to get the leet street cred for integrating this into dmd?
Jan 22 2021
On Friday, 22 January 2021 at 16:42:20 UTC, Imperatorn wrote:Interesting, doesn't seem to be ryu thoughFrom the linked PDF [1]:Ryu and other fast algorithms have been rejected, because they cannot be used for printf-like functions due to a different design goal.According to the Ryu README on Github [2], "Ryu generates the shortest decimal representation of a floating point number that maintains round-trip safety." For printf, the user is allowed to specify things like the precision, the field width, and whether leading zeros should be used for padding, so a more flexible algorithm is needed. [1] https://raw.githubusercontent.com/berni44/printFloat/master/printFloat.pdf [2] https://github.com/ulfjack/ryu
Jan 22 2021
On Friday, 22 January 2021 at 18:01:41 UTC, Paul Backus wrote:On Friday, 22 January 2021 at 16:42:20 UTC, Imperatorn wrote:People have been suing variation of this for printf. But yes, it is not suffisient by itself.Interesting, doesn't seem to be ryu thoughFrom the linked PDF [1]:Ryu and other fast algorithms have been rejected, because they cannot be used for printf-like functions due to a different design goal.According to the Ryu README on Github [2], "Ryu generates the shortest decimal representation of a floating point number that maintains round-trip safety." For printf, the user is allowed to specify things like the precision, the field width, and whether leading zeros should be used for padding, so a more flexible algorithm is needed. [1] https://raw.githubusercontent.com/berni44/printFloat/master/printFloat.pdf [2] https://github.com/ulfjack/ryu
Jan 22 2021
On Friday, 22 January 2021 at 21:37:08 UTC, deadalnix wrote:People have been suing variation of this for printf. But yes, it is not suffisient by itself.Related paper: https://dl.acm.org/doi/pdf/10.1145/3360595
Jan 22 2021
Being the author of the PR mentioned here several times, I feel somewhat obliged to write something too. On Saturday, 23 January 2021 at 00:13:16 UTC, deadalnix wrote:Related paper: https://dl.acm.org/doi/pdf/10.1145/3360595That's great. I'll have a deeper look into it, whether there is something, that can be used for our implementation of printf. From first glance: It overcomes some of the speed problems I encountered and where I currently am trying to find something better. The drawback of RYU printf is the large amount of memory needed for the tables involved. The paper states 104KB for doubles (can be compressed on the expense of slowing everthing down). For reals it will be even more. We should make sure, that we do not blow up the executables more than needed. We will probably eventually have to decide between speed and code size - or we'll have to find some sort of a compromise. On Monday, 25 January 2021 at 04:27:49 UTC, Walter Bright wrote:One way is to randomly generate floating point bit patterns and compare the output with that of print formatting on C.The last commit of my PR adds a test, that is doing exactly this. I would be happy, if people could run that test and report differences. (There will be some, because the snprintf implementations are known to contain bugs - see my comments above the test.) To do so, check out the PR, uncomment the line "version = printFloatTest;" and run the unittests of std.format. The test lasts for 30 minutes (you can configure this by changing the value of "duration"). After that it prints some summary. (Feel free to change the test to make it meet your needs.) On Monday, 25 January 2021 at 09:17:25 UTC, Ola Fosheim Grøstad wrote:You only need to test the boundary cases, so one can do this, no problem.I fear, with this approach you'll have a hard time: There are so many boundary cases, it's easy to miss some. - You need to test precision, rounding mode, different algorithms depending on the size of the exponent, nans, infs, subnormals, zero, float/double/real(s) and probably more. On Monday, 25 January 2021 at 21:40:01 UTC, Bruce Carneal wrote:If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper.There is a subtle, but important difference between proving correctness for RYU and printf-implementations: For RYU, the theorem to prove is simple to state: Find one shortest representation of a number, that works in round robin. But what would be the equivalent theorem for printf? Produce the same result, that snprintf produces? But which implementation of snprintf? And what exactly does snprintf produce? Can you write that down? And wouldn't it make more sense to have an implementation that fixes the known bugs in snprintf instead of copying them? (For example, I read about an implementation where snprintf("%.2f",double.inf) produces "#J" instead of the expected rounded up to "J"... *lol*, or printing different results for identical numbers, just because one is float and the other is double.) I'm not completely sure, I have not found the time to read the paper about RYU printf carefully, so I might be wrong, but from first glance I think they only prove, that the algorithm produces the correct digits. It ignores all the flags and stuff which make up about 90% of the PR. The heart of the PR (which could be replaced by RYU printf) is this (version for numbers, that have no fractional part - see below for fractions):int msu = 0; while (msu < count - 1 || mybig[$-1] != 0) {> ulong mod = 0; > foreach (i;msu .. count){> mybig[i] |= mod << 60; // (a) > mod = mybig[i] % 10; // (b) > mybig[i] /= 10; // (c) > } > if (mybig[msu] == 0) > ++msu; > > buffer[--left] = cast(byte) ('0'+mod); > } It's basically: Divide by 10 and print the reminder until you reach zero (printing from right to left). The algorithm is a little bit more complicated, because the numbers involved are up to 1000 digits long and cannot be saved anymore in a single ulong. Going a little bit more into details here: The number which should be print is given in mybig[] as an integer (similar to the implementation of BigInt). The outer loop produces one decimal digit every time it is executed - the digit is saved in buffer (last line). The inner loop is a division by 10 for big numbers. It is quite similar to a long division, like we learned it in school (here division by 3): 736 : 3 = 245 reminder 1. 6 - 13 12 -- 16 15 -- 1 First, we divide (7 / 3 = 2 reminder 1) and add another digit to the reminder (1 => 13) and so on. The same is done in the algorithm: (a) adds another digit, (b) calculates the reminder and (c) divides by 10. (If you worry about the reordering of the steps: This has the benefit, that the loop is without branch and therefore faster; in the first step a zero is added, which doesn't harm.). Finally, the last reminder is the next digit. (And the if below the inner loop is for speed reasons.) As for fractional digits: Here a similar algorithm is used, but this time the number is multiplied each round by 10 and the overflow leads to the next digit. Don't know, if this will comfort you or not, but: This algorithm *is* already part of Phobos since 7th of Februar 2020 and hence in the stable version since v2.091. It's already used for printing floats with %e. The PR is about adding it for %f too.
Jan 26 2021
On Tuesday, 26 January 2021 at 16:25:54 UTC, Berni44 wrote:The paper states 104KB for doubles (can be compressed on the expense of slowing everthing down).It also notes that "the size could be halved to 52 kB with no performance impact". For comparison, a simple "Hello, World!", built with optimizations and stripped, comes to about 700 kB. Even allowing that the 128-bit table (which will support 80-bit reals) is going to be larger, is this really an issue on modern machines? The only circumstances I can see it mattering is in very low-resource embedded use-cases where D stdlib is unlikely to be viable anyway.
Jan 26 2021
On Wednesday, 27 January 2021 at 00:15:41 UTC, Joseph Rushton Wakeling wrote:On Tuesday, 26 January 2021 at 16:25:54 UTC, Berni44 wrote:I'd vote for "overweight and almost-certainly-correct" over "trim but iffy" if the extra poundage is reasonable which, in the non-embedded environments, it appears to be. On another note, it looks like ryu was submitted by MS in late 2019 for inclusion in libc++. Despite a massive 3 minutes of google-fuing, I'm still not sure how far that has gotten but the direction seems pretty clear. A large body of others believe that ryu is a good way to go and are moving to standardize on it. It's not a live-or-die issue for me but ryu looks pretty good.The paper states 104KB for doubles (can be compressed on the expense of slowing everthing down).It also notes that "the size could be halved to 52 kB with no performance impact". For comparison, a simple "Hello, World!", built with optimizations and stripped, comes to about 700 kB. Even allowing that the 128-bit table (which will support 80-bit reals) is going to be larger, is this really an issue on modern machines? The only circumstances I can see it mattering is in very low-resource embedded use-cases where D stdlib is unlikely to be viable anyway.
Jan 26 2021
On Wednesday, 27 January 2021 at 01:16:40 UTC, Bruce Carneal wrote:I'd vote for "overweight and almost-certainly-correct" over "trim but iffy" if the extra poundage is reasonable which, in the non-embedded environments, it appears to be. On another note, it looks like ryu was submitted by MS in late 2019 for inclusion in libc++. Despite a massive 3 minutes of google-fuing, I'm still not sure how far that has gotten but the direction seems pretty clear. A large body of others believe that ryu is a good way to go and are moving to standardize on it. It's not a live-or-die issue for me but ryu looks pretty good.Yup, I'm in complete agreement with you on this. I am curious if the MS implementation and usage carries the extra weight described (ryu alone is more lightweight, it's ryu printf that has the larger tables).
Jan 26 2021
On Wednesday, 27 January 2021 at 01:38:38 UTC, Joseph Rushton Wakeling wrote:Yup, I'm in complete agreement with you on this. I am curious if the MS implementation and usage carries the extra weight described (ryu alone is more lightweight, it's ryu printf that has the larger tables).Judging by this presentation, it's just regular Ryu: https://www.youtube.com/watch?v=4P_kbF0EbZM
Jan 26 2021
On Wednesday, 27 January 2021 at 01:56:55 UTC, Paul Backus wrote:On Wednesday, 27 January 2021 at 01:38:38 UTC, Joseph Rushton Wakeling wrote:This link: https://github.com/microsoft/STL/issues/172 suggests that they're adding 121KB of constant data and that they're compiling 385KB of source code. Wow. Hope that's mostly the tables. Also, per the youtube presentation that Paul pointed at, that does not include 80/128 bit FPs as their "long double" is defined to be "double".Yup, I'm in complete agreement with you on this. I am curious if the MS implementation and usage carries the extra weight described (ryu alone is more lightweight, it's ryu printf that has the larger tables).Judging by this presentation, it's just regular Ryu: https://www.youtube.com/watch?v=4P_kbF0EbZM
Jan 26 2021
On Wednesday, 27 January 2021 at 01:16:40 UTC, Bruce Carneal wrote:I'd vote for "overweight and almost-certainly-correct" over "trim but iffy" if the extra poundage is reasonable which, in the non-embedded environments, it appears to be.I'd like to compare the task to write floating point numbers to building a car: You need a chassis, wheels, seats and other stuff. And you need an engine. The engine in the PR is a simple division by 10 (as explained above). Ryu Printf would be an other engine, a better one when concerned of speed only (but much worse in memory usage). But all else of the car would remain the same (and is not touched by the proof of Ryu Printf, as far as I know). (My plan is to start with a simple engine and replace that one by better ones step by step.) Coming back to your vote, I'd say it's a vote between "overweight and almost-certainly-correct" vs. "trim and correct". I don't say that, because I'm overly self confident, but because the "engine" I used is quite simple. And furthermore, that simple engine is already part of phobos (in the %e-car) sind almost one year, I just reuse it for the %f-car.A large body of others believe that ryu is a good way to go and are moving to standardize on it. It's not a live-or-die issue for me but ryu looks pretty good.We are in complete agreement here. My point is: Adding ryu printf would be the second step before the first step: First build the car with a simple engine (that's what the PR does) and then replace this engine by a better one. I will (probably) do this job, but only in this order. Adding ryu is jet another unrelated step. I've got plans to add ryu, or at least to prepare everything for adding it. But to do this std.format needs some other improvements (including refactoring) first, in order to add some space, where ryu nicely first step here, not absolutely needed, but with all these additions IMHO std.format is getting too large. It also simplifies working in parallel on replacing snprintf and improving the rest of std.format.
Jan 27 2021
On Wednesday, 27 January 2021 at 08:27:13 UTC, Berni44 wrote:On Wednesday, 27 January 2021 at 01:16:40 UTC, Bruce Carneal wrote:I'm quite confident that I have significantly less knowledge in this area than you do. Your explanations/tutorials and corrections are welcome. I'm just happy to see someone working on an important, foundational, capability.I'd vote for "overweight and almost-certainly-correct" over "trim but iffy" if the extra poundage is reasonable which, in the non-embedded environments, it appears to be.I'd like to compare the task to write floating point numbers to building a car: ...A large body of others believe that ryu is a good way to go and are moving to standardize on it. It's not a live-or-die issue for me but ryu looks pretty good.We are in complete agreement here. My point is: Adding ryu printf would be the second step before the first step: First build ...
Jan 27 2021
On Wednesday, 27 January 2021 at 09:22:24 UTC, Bruce Carneal wrote:Your explanations/tutorials and corrections are welcome.I added one more, going through the code in detail: https://github.com/berni44/printFloat/blob/master/walkthrough.pdf I hope this helps to tear down the fears somewhat...
Jan 27 2021
On Wed, Jan 27, 2021 at 12:15:41AM +0000, Joseph Rushton Wakeling via Digitalmars-d wrote:On Tuesday, 26 January 2021 at 16:25:54 UTC, Berni44 wrote:If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue. T -- May you live all the days of your life. -- Jonathan SwiftThe paper states 104KB for doubles (can be compressed on the expense of slowing everthing down).It also notes that "the size could be halved to 52 kB with no performance impact". For comparison, a simple "Hello, World!", built with optimizations and stripped, comes to about 700 kB. Even allowing that the 128-bit table (which will support 80-bit reals) is going to be larger, is this really an issue on modern machines? The only circumstances I can see it mattering is in very low-resource embedded use-cases where D stdlib is unlikely to be viable anyway.
Jan 26 2021
On Wednesday, 27 January 2021 at 02:17:52 UTC, H. S. Teoh wrote:If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.Agree completely -- pay-as-you-go should be important for something like this. (But then, it should be important for the whole stdlib...)
Jan 26 2021
On 1/26/2021 6:20 PM, Joseph Rushton Wakeling wrote:On Wednesday, 27 January 2021 at 02:17:52 UTC, H. S. Teoh wrote:I've always thought things like this should be in the operating system. Get it right once, then everyone uses it. Date/time/DST should be in the OS, so should the ridiculously large amount of code to deal with the level 3 Unicode madness.If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.Agree completely -- pay-as-you-go should be important for something like this. (But then, it should be important for the whole stdlib...)
Jan 26 2021
On Tue, Jan 26, 2021 at 11:16:58PM -0800, Walter Bright via Digitalmars-d wrote:On 1/26/2021 6:20 PM, Joseph Rushton Wakeling wrote:Or get it wrong once, and everyone rolls their own, all slightly wrong in a different way. :-DOn Wednesday, 27 January 2021 at 02:17:52 UTC, H. S. Teoh wrote:I've always thought things like this should be in the operating system. Get it right once, then everyone uses it.If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.Agree completely -- pay-as-you-go should be important for something like this. (But then, it should be important for the whole stdlib...)Date/time/DST should be in the OS, so should the ridiculously large amount of code to deal with the level 3 Unicode madness.The next best thing is to reuse a reputable library that's reasonably widespread, like libicu. But that adds an additional dependency, and the API isn't always conducive to how the language works. Reuse is hard. T -- VI = Visual Irritation
Jan 27 2021
On Wednesday, 27 January 2021 at 02:17:52 UTC, H. S. Teoh wrote:If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.Network transfer costs responsiveness, Web/WASM. 100Kb to do float to string conversion in a web app would be crazy. Same with embedded. Anyway, in most system programming scenarios where you control the format it would be better to use hex for accuracy anyway.
Jan 26 2021
On 1/26/2021 6:17 PM, H. S. Teoh wrote:If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.We only really need it when the C stdlib printf is inadequate.
Jan 26 2021
On 1/26/21 9:17 PM, H. S. Teoh wrote:If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.There is a way -- dynamic libraries. It's how C does it. But seriously, I can't imagine why we are concerned about 100KB of code space. My vibe.d server is 80MB compiled (which I admit is a lot more than I would like, but still). My phone has as much code space as 100 computers from 10 years ago. Are you still using floppies to share D code? -Steve
Jan 27 2021
On Wed, Jan 27, 2021 at 10:18:58PM -0500, Steven Schveighoffer via Digitalmars-d wrote:On 1/26/21 9:17 PM, H. S. Teoh wrote:That's total overkill for 100KB of data.If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.There is a way -- dynamic libraries. It's how C does it.But seriously, I can't imagine why we are concerned about 100KB of code space. My vibe.d server is 80MB compiled (which I admit is a lot more than I would like, but still). My phone has as much code space as 100 computers from 10 years ago. Are you still using floppies to share D code?[...] It's not so much this specific 100KB that I'm concerned about; it's the general principle of pay-as-you-go. You can have 100KB here and 50KB there and 70KB for something else, and pretty soon it adds up to something not so small. Individually they're not worth bothering with; together they can add quite a bit of bloat, which may be desirable to get rid of, for embedded applications say. T -- What doesn't kill me makes me stranger.
Jan 27 2021
On 1/28/21 1:21 AM, H. S. Teoh wrote:On Wed, Jan 27, 2021 at 10:18:58PM -0500, Steven Schveighoffer via Digitalmars-d wrote:The idea is it would go into the phobos shared library, not a specific shared library just for floats.On 1/26/21 9:17 PM, H. S. Teoh wrote:That's total overkill for 100KB of data.If there was a way to make these tables pay-as-you-go, I'd vote for it. Adding 104KB or 52KB even if the code never once formats a float, is not ideal, though granted, in this day and age of cheap, high-capacity RAM not really a big issue.There is a way -- dynamic libraries. It's how C does it.I get that there is a need for it in niche cases. But those are very niche, and I'd expect a custom runtime for them anyway. But the solution to having code that is small is to use a shared library for the common code. This seems like a perfect fit for that. -SteveBut seriously, I can't imagine why we are concerned about 100KB of code space. My vibe.d server is 80MB compiled (which I admit is a lot more than I would like, but still). My phone has as much code space as 100 computers from 10 years ago. Are you still using floppies to share D code?[...] It's not so much this specific 100KB that I'm concerned about; it's the general principle of pay-as-you-go. You can have 100KB here and 50KB there and 70KB for something else, and pretty soon it adds up to something not so small. Individually they're not worth bothering with; together they can add quite a bit of bloat, which may be desirable to get rid of, for embedded applications say.
Jan 29 2021
On Friday, 22 January 2021 at 14:02:53 UTC, Steven Schveighoffer wrote:I believe Berni44 has already made a PR for this (and did so a while ago). Original PR: https://github.com/dlang/phobos/pull/7264 (includes a nice paper on what he did) New PR: https://github.com/dlang/phobos/pull/7757 Or maybe I misunderstand what he did. I can't make heads or tails of this stuff. Amaury, can you take a look at that PR? -SteveI looked. i can tell you it is not ryu. I would need to dive much deeper into it to be confident to include this.
Jan 22 2021
On 1/22/2021 1:35 PM, deadalnix wrote:I looked. i can tell you it is not ryu. I would need to dive much deeper into it to be confident to include this.One way is to randomly generate floating point bit patterns and compare the output with that of print formatting on C.
Jan 24 2021
On Monday, 25 January 2021 at 04:27:49 UTC, Walter Bright wrote:On 1/22/2021 1:35 PM, deadalnix wrote:You only need to test the boundary cases, so one can do this, no problem.I looked. i can tell you it is not ryu. I would need to dive much deeper into it to be confident to include this.One way is to randomly generate floating point bit patterns and compare the output with that of print formatting on C.
Jan 25 2021
On Monday, 25 January 2021 at 09:17:25 UTC, Ola Fosheim Grøstad wrote:On Monday, 25 January 2021 at 04:27:49 UTC, Walter Bright wrote:My guess is that knowing what the actual boundary cases are is equivalent to having solved the original problem. If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper. I also note that RYU's printf incompatibilities mentioned earlier in this thread seem to have been addressed. A recent variant of RYU reportedly accommodates printf.On 1/22/2021 1:35 PM, deadalnix wrote:You only need to test the boundary cases, so one can do this, no problem.I looked. i can tell you it is not ryu. I would need to dive much deeper into it to be confident to include this.One way is to randomly generate floating point bit patterns and compare the output with that of print formatting on C.
Jan 25 2021
On Monday, 25 January 2021 at 21:40:01 UTC, Bruce Carneal wrote:If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper.Why is that? If the proof is for the algorithm then it has nothing to do with proving your implementation to be correct.
Jan 26 2021
On Tuesday, 26 January 2021 at 16:31:35 UTC, Ola Fosheim Grøstad wrote:On Monday, 25 January 2021 at 21:40:01 UTC, Bruce Carneal wrote:If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof. I'm very happy to see the 32 bit floats done exhaustively BTW. It is very suggestive, but not definitive if I understand the problem correctly. The mapping from the domain to the co-domain is the entire problem IIUC so asserting that certain ranges within the domain can be skipped is tantamount to saying that you know the implementation performs correctly in those regions. What am I missing?If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper.Why is that? If the proof is for the algorithm then it has nothing to do with proving your implementation to be correct.
Jan 26 2021
On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:On Tuesday, 26 January 2021 at 16:31:35 UTC, Ola Fosheim Grøstad wrote:You can probably inductively prove - or at least become quite certain - that your implementation is identical to the original algorithm within a given set of inputs, given one good result within that set. What those sets look like depends on the algorithm of course. E.g. given an algorithm for a function with type integer -> integer, if you can prove your implementation of an algorithm is linear w.r.t. its input and you can demonstrate it's correct for 1, you can then infer it will be correct for any other integer input. I think? Off the top of my head. Of course <insert interesting algorithm here> isn't linear w.r.t. its inputs, but maybe it is within some fixed subset of its inputs, so you only have to test one of those to cover the whole set. Also, linearity isn't the only property one can consider.On Monday, 25 January 2021 at 21:40:01 UTC, Bruce Carneal wrote:If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof. I'm very happy to see the 32 bit floats done exhaustively BTW. It is very suggestive, but not definitive if I understand the problem correctly. The mapping from the domain to the co-domain is the entire problem IIUC so asserting that certain ranges within the domain can be skipped is tantamount to saying that you know the implementation performs correctly in those regions. What am I missing?If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper.Why is that? If the proof is for the algorithm then it has nothing to do with proving your implementation to be correct.
Jan 26 2021
On Tuesday, 26 January 2021 at 19:47:47 UTC, John Colvin wrote:On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:Yes. If we can define, with confidence, a representative set which spans the entire domain then we're good to go: inductive, piecewise, whatever. We trade one proof (the original) for another (that the coverage is exhaustive/sufficient). That can be a great trade but I'm not confident in my ability to define such a set in this case (gt 32 bit floats). Perhaps others are confident. However it goes from this point on, I'm very happy to see the progress made towards repeatability, towards solid foundations. My thanks to those hammering on this for the rest of us.On Tuesday, 26 January 2021 at 16:31:35 UTC, Ola Fosheim Grøstad wrote:You can probably inductively prove - or at least become quite certain - that your implementation is identical to the original algorithm within a given set of inputs, given one good result within that set. What those sets look like depends on the algorithm of course. E.g. given an algorithm for a function with type integer -> integer, if you can prove your implementation of an algorithm is linear w.r.t. its input and you can demonstrate it's correct for 1, you can then infer it will be correct for any other integer input. I think? Off the top of my head. Of course <insert interesting algorithm here> isn't linear w.r.t. its inputs, but maybe it is within some fixed subset of its inputs, so you only have to test one of those to cover the whole set. Also, linearity isn't the only property one can consider.On Monday, 25 January 2021 at 21:40:01 UTC, Bruce Carneal wrote:If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof. I'm very happy to see the 32 bit floats done exhaustively BTW. It is very suggestive, but not definitive if I understand the problem correctly. The mapping from the domain to the co-domain is the entire problem IIUC so asserting that certain ranges within the domain can be skipped is tantamount to saying that you know the implementation performs correctly in those regions. What am I missing?If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper.Why is that? If the proof is for the algorithm then it has nothing to do with proving your implementation to be correct.
Jan 26 2021
On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof.Of course you do, you need proof of the tests being correct...can be skipped is tantamount to saying that you know the implementation performs correctly in those regions.In general, if one understands the algorithm as implemented then one can either manually or automatically prove certain regions/ranges. Then you only need to test the regions/ranges that are hard to prove. For instance, for a converter from int to string, the most likely failure point will be tied to crossing between "9" and "0" as well as the ends of the base-2 input. Meaning, if it works for 1,2,3, you know that it also works for 4,5,6,7,8, but maybe not 9,10.
Jan 26 2021
On Tuesday, 26 January 2021 at 20:27:25 UTC, Ola Fosheim Grøstad wrote:For instance, for a converter from int to string, the most likely failure point will be tied to crossing between "9" and "0" as well as the ends of the base-2 input. Meaning, if it works for 1,2,3, you know that it also works for 4,5,6,7,8, but maybe not 9,10.Or to put it another way. If you can prove that these implication holds: 0-1 works => 2-8 works 0-11 works => 12-98 works 0-101 works => 102-998 works etc then you only need to test 9-11, 99-101.
Jan 26 2021
On Tuesday, 26 January 2021 at 20:34:51 UTC, Ola Fosheim Grøstad wrote:On Tuesday, 26 January 2021 at 20:27:25 UTC, Ola Fosheim Grøstad wrote:I think this is a great way to go for simple functions. If the domain ==> co-domain mapping is truly obvious then we can exploit that to write an efficient unittest that we can take as the next best thing to the simpler exhaustive proof. I don't see a mapping for the current problem that would allow for such efficient enumerative testing but it's not my field. Still, if/when you prove that you've fully covered the domain I'd be very excited to read about it. It could be a major advance. Note: by "proof" I do not mean probabilistic evidence. We already have that. Even if you don't come up with proof, the attempt may be worth your time. Good luck.For instance, for a converter from int to string, the most likely failure point will be tied to crossing between "9" and "0" as well as the ends of the base-2 input. Meaning, if it works for 1,2,3, you know that it also works for 4,5,6,7,8, but maybe not 9,10.Or to put it another way. If you can prove that these implication holds: 0-1 works => 2-8 works 0-11 works => 12-98 works 0-101 works => 102-998 works etc then you only need to test 9-11, 99-101.
Jan 26 2021
On Tuesday, 26 January 2021 at 21:10:20 UTC, Bruce Carneal wrote:Even if you don't come up with proof, the attempt may be worth your time. Good luck.Not sure what you mean... Why would anyone waste their time on trying to prove something for code they have no control over, it could change any minute, and which implementation by the way? *shrugs*
Jan 26 2021
On Tuesday, 26 January 2021 at 22:23:34 UTC, Ola Fosheim Grøstad wrote:On Tuesday, 26 January 2021 at 21:10:20 UTC, Bruce Carneal wrote:You made an assertion, in a proof discussion, that the tractable extension of the exhaustive proof to cover larger FP types, was "no problem". Either it really is "no problem", and clarification to the good of all concerned would take little of your time, or you were mistaken and others should not expect proof level enlightenment from you on the topic any time soon. It is certainly "no problem" to extend probabilistic testing but closing the gap between probable and proven appears to me to be quite a bit harder. That's why I had hoped that you'd had a breakthrough. Your inability and/or unwillingness to produce a "no problem" proof suggests that you have not had a breakthrough (Fermat at least had the excuse of dieing :-) ). Ah well. There are a lot of other things to work on. I just hope that we can come to agree on what it means to actually prove something.Even if you don't come up with proof, the attempt may be worth your time. Good luck.Not sure what you mean... Why would anyone waste their time on trying to prove something for code they have no control over, it could change any minute, and which implementation by the way? *shrugs*
Jan 26 2021
On Wednesday, 27 January 2021 at 00:53:11 UTC, Bruce Carneal wrote:You made an assertion, in a proof discussion, that the tractable extension of the exhaustive proof to cover larger FP types, was "no problem".Eh. I argued that you can fully cover an implementation without doing exhaustive testing. Don't confuse what I will waste my time on with what is possible. For algorithms with limited loops and limited input you can also fully unroll them and prove them correct by bit-blasting. Algorithms that are written to execute few instructions are not unlikely candidates for that.
Jan 26 2021
On Wednesday, 27 January 2021 at 06:14:47 UTC, Ola Fosheim Grøstad wrote:On Wednesday, 27 January 2021 at 00:53:11 UTC, Bruce Carneal wrote:No, you made an assertion regarding the tractable extension of an exhaustive proof from 32 bits to longer FP representations for a particular implementation, a problem that I believe to be wholly unlike the simple examples that you later provided (those all enjoyed very clear domain co-domain mappings while suspected opacity of that mapping is the main concern regarding your "no problem" assertion). Also, please understand that I do not regard your time as more valuable than the time of your readers. In that spirit I suggest that direct answers to questions regarding simple topics ("no problem" topics), are preferable to assertions that you don't have time to help others understand. Also note that "I was mistaken" and "I'm not sure what you mean, could you clarify x,y,z?" are perfectly acceptable responses (that I've used on more than one occasion and will, no doubt, use again in the future).You made an assertion, in a proof discussion, that the tractable extension of the exhaustive proof to cover larger FP types, was "no problem".Eh. I argued that you can fully cover an implementation without doing exhaustive testing. Don't confuse what I will waste my time on with what is possible.
Jan 27 2021
On Wednesday, 27 January 2021 at 09:09:08 UTC, Bruce Carneal wrote:No, you made an assertion regarding the tractable extension of an exhaustive proof from 32 bits to longer FP representations for a particular implementation, a problem that I believe to be wholly unlike the simple examplesThe original statement was that there was only a choice between exhaustive testing and a full formal proof. Then there was confusion between proving the correctness of tables, algorithms and the actual implementation. Nobody wants to implement an algorithm for which there are no proofs of correctness. That is the basic foundation we have to presume, so a proof of the abstract algorithm is not wasted even if you only test the implementation. That proof is kinda part of the specification you base your implementation and testing on. When you implement the algorithm you also have to consider how it will be tested (including partial proofs). The "no problem" as related to not having to choose between an exhaustive proof and full formal verification. Neither option is likely in the D community. I pointed out that one can test the boundary cases. I have clearly not studied the details of the implementation. It was a general suggestion for when neither a full formal proof is likely or exhaustive testing is reasonable.valuable than the time of your readers. In that spirit I suggest that direct answers to questions regarding simple topics ("no problem" topics), are preferable to assertions that you don't have time to help others understand.I haven't made any such assertion. If you ask questions, I'll answer them to my ability. Please stop making assumptions. We are talking about finite input, finite input, small scale, and fairly regular. Clearly not intractable.
Jan 27 2021
On Wednesday, 27 January 2021 at 09:37:34 UTC, Ola Fosheim Grøstad wrote:On Wednesday, 27 January 2021 at 09:09:08 UTC, Bruce Carneal wrote: I haven't made any such assertion. If you ask questions, I'll answer them to my ability. Please stop making assumptions.Great. Two questions: 1) Do you believe that a trivial domain/co-domain mapping, like those in the examples you gave, exists for the problem under discussion? and 2) If so, do you have time to provide a proof?
Jan 27 2021
On Wednesday, 27 January 2021 at 17:00:36 UTC, Bruce Carneal wrote:Great. Two questions: 1) Do you believe that a trivial domain/co-domain mapping, like those in the examples you gave, exists for the problem under discussion? andCertainly trivial from a theoretical perspective, in the sense that they are finite and therefore can be easily tabulated. If you want a more compact form you probably should look at modular arithmetics. You left out an important criterion, space-time efficiency. The inverse mapping is also not terribly complex, but first you need to establish what you want to show. In the context of Phobos, minimal length is of little importance, it has not impact on correct usage. We don't have to care if the algorithm produces more digits than necessary, so we can limit ourselves to the inverse mapping from decimal to binary. Basically proving that we then get the x = f⁻¹(f(x)) = id(x).2) If so, do you have time to provide a proof?Proof of what? And are you willing to pay for it? (I suggest you drop the juvenile tone, it is annoying.)
Jan 27 2021
On Wednesday, 27 January 2021 at 22:49:11 UTC, Ola Fosheim Grøstad wrote:On Wednesday, 27 January 2021 at 17:00:36 UTC, Bruce Carneal wrote:Perhaps you mistook my meaning when I used the term "trivial". I could also have used "simple", "readily understood", "clearly provable", etc. Regardless, you clearly do not understand what I'm asking in question 1) and I do not think it likely that additional interaction on this topic will be useful to either of us.Great. Two questions: 1) Do you believe that a trivial domain/co-domain mapping, like those in the examples you gave, exists for the problem under discussion? andCertainly trivial from a theoretical perspective, in the sense that they are finite and therefore can be easily tabulated. If you want a more compact form you probably should look at modular arithmetics. You left out an important criterion, space-time efficiency. The inverse mapping is also not terribly complex, but first you need to establish what you want to show. In the context of Phobos, minimal length is of little importance, it has not impact on correct usage. We don't have to care if the algorithm produces more digits than necessary, so we can limit ourselves to the inverse mapping from decimal to binary. Basically proving that we then get the x = f⁻¹(f(x)) = id(x).2) If so, do you have time to provide a proof?Proof of what? And are you willing to pay for it? (I suggest you drop the juvenile tone, it is annoying.
Jan 27 2021
On Thursday, 28 January 2021 at 02:25:57 UTC, Bruce Carneal wrote:Regardless, you clearly do not understand what I'm asking in question 1) and I do not think it likely that additional interaction on this topic will be useful to either of us.Asking for a mapping does not really make much sense without specifying the domain/codomain, but my answer actually holds even then. Anyway, if you _actually_ are interested in this topic then you could start by looking at automatic test case generation using SMT solvers.
Jan 28 2021
On Thursday, 28 January 2021 at 08:30:40 UTC, Ola Fosheim Grøstad wrote:On Thursday, 28 January 2021 at 02:25:57 UTC, Bruce Carneal wrote:For the problem under discussion the domain is 0 .. 2**NBits (binary FP) and the codomain is the problem specified string output of the function. This has not changed and I had thought it was clear to all concerned throughout. The correctness check for the mapping in this case is trivial, by which I mean: obvious, readily understood, not needing further explanation, ... a *good* thing in this context.Regardless, you clearly do not understand what I'm asking in question 1) and I do not think it likely that additional interaction on this topic will be useful to either of us.Asking for a mapping does not really make much sense without specifying the domain/codomain, but my answer actually holds even then.Anyway, if you _actually_ are interested in this topic then you could start by looking at automatic test case generation using SMT solvers.I am actually interested in proving correctness by establishing full coverage of the domain with something less expensive than direct enumeration, less expensive than a for loop across all possibilities (an SMT-like solution). This is what I thought that you said you had accomplished with a trivial coverage, your "no problem" remark. IOW, I thought you claimed to have solved the problem. I found such a claim remarkable, I did not see it, so I sought clarification. IIUC now, I was mistaken then. You have no particular insight to offer toward the solution of the above problem and, by your lights, never made such a claim. Apparently I took your well meaning remarks much too seriously.
Jan 28 2021
On Thursday, 28 January 2021 at 15:13:57 UTC, Bruce Carneal wrote:On Thursday, 28 January 2021 at 08:30:40 UTC, Ola Fosheim Grøstad wrote: For the problem under discussion the domain is 0 .. 2**NBits (binary FP)...That should be 2**N - 1.
Jan 28 2021
On Thursday, 28 January 2021 at 15:13:57 UTC, Bruce Carneal wrote:For the problem under discussion the domain is 0 .. 2**NBits (binary FP) and the codomain is the problem specified string output of the function. This has not changed and I had thought it was clear to all concerned throughout.Let's backtrack then. My suggestion was that one does an informal proof of the easier aspects one can reason about and then do testing to cover those aspects which one cannot convince oneself of. So in order to list those tests one would have to do a deep analysis of the actual implementation informed by the properties/proofs of the abstract algorithm (the spec). Without such an analysis one cannot give examples of dependencies expressed as propositions. The core problem float2string is comparable to fixedpoint2string which is comparable to integer2string, because the transform from float to fixed point is trivial. (We also note that float is base2 and the target is base10, and that 10=5·2 so we may be rewarded for paying attention to modulo 5.) Then we may choose to represent a base 10 digit with something that is easier to deal with for a solver. We then need to think about, not increments of 1, but relationships in terms of modular arithmetic/congruence - related formalizations.I am actually interested in proving correctness by establishing full coverage of the domain with something less expensive than direct enumeration, less expensive than a for loop across all possibilities (an SMT-like solution).SMT/SAT based solutions are not like a for-loop across all possibilities. Are you actually serious about proving correctness? I don't get that feeling.You have no particular insight to offer toward the solution of the above problem and, by your lights, never made such a claim.I am not sure what you mean by this, but you sure manage to make it sound offensive... I've suggested utilizing bit-blasting, identity transform, modular arithmetics etc. I've never hinted or suggested that I have any intent of working on this problem, and explicitly stated that I have not taken a close look at the Mir implementation, nor do I intend to. I do however refute the idea that an exhaustive test is needed, since the problem involves regular patterns.
Jan 28 2021
On Thursday, 28 January 2021 at 23:27:28 UTC, Ola Fosheim Grøstad wrote:On Thursday, 28 January 2021 at 15:13:57 UTC, Bruce Carneal wrote:No, of course not. The parenthetical phrase was meant to apply to everything that preceeded it. An SMT like solution is what I thought you had asserted, something that substantially cut down the work while provably covering the full domain.For the problem under discussion the domain is 0 .. 2**NBits (binary FP) and the codomain is the problem specified string output of the function. This has not changed and I had thought it was clear to all concerned throughout.Let's backtrack then. My suggestion was that one does an informal proof of the easier aspects one can reason about and then do testing to cover those aspects which one cannot convince oneself of. So in order to list those tests one would have to do a deep analysis of the actual implementation informed by the properties/proofs of the abstract algorithm (the spec). Without such an analysis one cannot give examples of dependencies expressed as propositions. The core problem float2string is comparable to fixedpoint2string which is comparable to integer2string, because the transform from float to fixed point is trivial. (We also note that float is base2 and the target is base10, and that 10=5·2 so we may be rewarded for paying attention to modulo 5.) Then we may choose to represent a base 10 digit with something that is easier to deal with for a solver. We then need to think about, not increments of 1, but relationships in terms of modular arithmetic/congruence - related formalizations.I am actually interested in proving correctness by establishing full coverage of the domain with something less expensive than direct enumeration, less expensive than a for loop across all possibilities (an SMT-like solution).SMT/SAT based solutions are not like a for-loop across all possibilities.Are you actually serious about proving correctness? I don't get that feeling.Well, you've not refuted anything that I can see. You've asserted something, yes, something that I think is quite useful if provable. The intutition that prompts you to make such an assertion is what I'm most interested in. I do not believe the assertion is provable but would be very happy to be proven wrong as that would be a win for everyone. Since you continue to take offense at my writing, where none was intended, I think direct two way communication would be better than continuing here. I'll be hanging out at beerconf if you think it's worth your time to converge on understanding. Even if we aren't able to take your intuition to a proof or refutation quickly, we should at least be able to comprehend a lack of ill will.You have no particular insight to offer toward the solution of the above problem and, by your lights, never made such a claim.I am not sure what you mean by this, but you sure manage to make it sound offensive... I've suggested utilizing bit-blasting, identity transform, modular arithmetics etc. I've never hinted or suggested that I have any intent of working on this problem, and explicitly stated that I have not taken a close look at the Mir implementation, nor do I intend to. I do however refute the idea that an exhaustive test is needed, since the problem involves regular patterns.
Jan 28 2021
On Friday, 29 January 2021 at 02:58:38 UTC, Bruce Carneal wrote:The parenthetical phrase was meant to apply to everything that preceeded it. An SMT like solution is what I thought you had asserted, something that substantially cut down the work while provably covering the full domain.I am not asserting anything, other than that I don't think full formal verification (unless bitblasting works) is within the perimeter of the D community, so that leaves informal proofs combined with selective testing as the viable option that the D community as a whole (not any specific individual) can accomplish. A solver is a tool that can be used to make this viable. Without a solver, maybe a reimplementation with a correspondence proof is easier.better than continuing here. I'll be hanging out at beerconf if you think it's worth your time to converge on understanding.I like the idea of beerconf, but I have a deliverable deadline on monday, so I don't have time this week.
Jan 29 2021
On Friday, 29 January 2021 at 13:42:49 UTC, Ola Fosheim Grøstad wrote:On Friday, 29 January 2021 at 02:58:38 UTC, Bruce Carneal wrote:Then I misunderstood you all along. I thought that you were describing something that would be part of a formal proof. As an aside: I consider an exhaustive f32 proof to be very strong evidence that an implementation will likely be correct for larger formats.The parenthetical phrase was meant to apply to everything that preceeded it. An SMT like solution is what I thought you had asserted, something that substantially cut down the work while provably covering the full domain.I am not asserting anything, other than that I don't think full formal verification (unless bitblasting works) is within the perimeter of the D community, so that leaves informal proofs combined with selective testing as the viable option that the D community as a whole (not any specific individual) can accomplish.A solver is a tool that can be used to make this viable. Without a solver, maybe a reimplementation with a correspondence proof is easier.OK. I hope we have the opportunity to chat directly at a later date. Good luck meeting your deadline.better than continuing here. I'll be hanging out at beerconf if you think it's worth your time to converge on understanding.I like the idea of beerconf, but I have a deliverable deadline on monday, so I don't have time this week.
Jan 29 2021
On Friday, 29 January 2021 at 14:24:17 UTC, Bruce Carneal wrote:Then I misunderstood you all along. I thought that you were describing something that would be part of a formal proof.A formal proof would have to be machine checked. So you need a good solver/verifiers. And experience to make good use of, and patience... I think we can safely say that this is out of scope for the D community at this point.
Jan 29 2021
On Friday, 29 January 2021 at 18:35:54 UTC, Ola Fosheim Grøstad wrote:On Friday, 29 January 2021 at 14:24:17 UTC, Bruce Carneal wrote:And you would also need to annotate the D code with Hoare logic and generate verification conditions. Although there are tools that can help with that. But you would still need expertise to make good use of it. It is not a push-button topic. The only other alternative would be bit blasting, as I mentioned, which may or may not work out. Probably not if the code is too convoluted, but it might. Depends on the solver and structure of the input.Then I misunderstood you all along. I thought that you were describing something that would be part of a formal proof.A formal proof would have to be machine checked. So you need a good solver/verifiers. And experience to make good use of, and patience... I think we can safely say that this is out of scope for the D community at this point.
Jan 29 2021
On Friday, 29 January 2021 at 18:48:42 UTC, Ola Fosheim Grøstad wrote:And you would also need to annotate the D code with Hoare logic and generate verification conditions. Although there are tools that can help with that. But you would still need expertise to make good use of it. It is not a push-button topic. The only other alternative would be bit blasting, as I mentioned, which may or may not work out. Probably not if the code is too convoluted, but it might. Depends on the solver and structure of the input.Have you seen KLEE? It applies a solver like Z3 to normal code with assert() statements, as long as it can be compiled to LLVM bitcode and run with KLEE-compatible standard libraries. The original KLEE doesn't support floating point, but there's a fork that does (which I've never used). https://theartofmachinery.com/2019/05/28/d_and_klee.html (There are some things I need to fix/update in that post.)
Jan 29 2021
On Friday, 29 January 2021 at 21:11:02 UTC, sarn wrote:Have you seen KLEE? It applies a solver like Z3 to normal code with assert() statements, as long as it can be compiled to LLVM bitcode and run with KLEE-compatible standard libraries.I've only read about KLEE, which looks interesting. Using LLVMIR is one approach one can use for analysis when the input/output is tiny. You can (after unrolling) convert the IR to 64 logical propositions. One big logical (boolean) formula for each bit in the output. In this specific application you could try to prove that output bit 1 only depends on input bit 1, output bit 2 only depends on input bit 2 and so on. (After applying the inverse function, so that the function returns the same value as the input.)
Jan 29 2021
On Friday, 29 January 2021 at 22:03:26 UTC, Ola Fosheim Grøstad wrote:input/output is tiny. You can (after unrolling) convert the IR to 64 logical propositions. One big logical (boolean) formula for each bit in the output.Or more likely 128 boolean expressions, since you probably have more chance of resolution if you first show that there is no solution for output1 == false && input1 == true, then you do the same for output1 == true && input1 == false.
Jan 30 2021
On Thursday, 28 January 2021 at 23:27:28 UTC, Ola Fosheim Grøstad wrote:I do however refute the idea that an exhaustive test is needed, since the problem involves regular patterns.how about a full coverage of a half float type, assuming the function tested works the same with float and double and anay arbitrary lond FP type, e.g template param for mantissa and exp. Sof if a smaller FP type fully passes the for loop test, e.g all permutations we can assume bigger FP types will be 100% correctly converted.
Jan 30 2021
On Saturday, 30 January 2021 at 11:02:54 UTC, Afgdr wrote:On Thursday, 28 January 2021 at 23:27:28 UTC, Ola Fosheim Grøstad wrote:Well I forgot that it may use LUTs so maybe not in this case, but otherwise you should get the idea.I do however refute the idea that an exhaustive test is needed, since the problem involves regular patterns.how about a full coverage of a half float type, assuming the function tested works the same with float and double and anay arbitrary lond FP type, e.g template param for mantissa and exp. Sof if a smaller FP type fully passes the for loop test, e.g all permutations we can assume bigger FP types will be 100% correctly converted.
Jan 30 2021
On Saturday, 30 January 2021 at 11:06:34 UTC, Afgdr wrote:Well I forgot that it may use LUTs so maybe not in this case, but otherwise you should get the idea.Yes, I get your idea, but I think one has to be a bit more careful and base what is tested on what the implementation does. For instance, float32 does not expose overflow issues. We can assume that LUTs are correct if they are pre-generated, though.
Jan 30 2021
On Thursday, 28 January 2021 at 08:30:40 UTC, Ola Fosheim Grøstad wrote:On Thursday, 28 January 2021 at 02:25:57 UTC, Bruce Carneal wrote:We seem to repeatedly misunderstand one another on this topic so a quick back and forth, without burdening others in the forum, would be a good way to go from here. I'll be at the end-of-Jan beerconf for much of the time (I'm GMT+8 but will probably overlap with your daytime hours) so if you're interested we can establish a 2-way comm channel there and quickly, I believe, converge on understanding.Regardless, you clearly do not understand what I'm asking in question 1) and I do not think it likely that additional interaction on this topic will be useful to either of us.Asking for a mapping does not really make much sense without specifying the domain/codomain, but my answer actually holds even then. Anyway, if you _actually_ are interested in this topic then you could start by looking at automatic test case generation using SMT solvers.
Jan 28 2021
On Tuesday, 26 January 2021 at 20:27:25 UTC, Ola Fosheim Grøstad wrote:On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:I believed that to be obvious. There is no utility in exhaustively looping over tests that are incorrect. The domain <==> co-domain properties in this case means that the correctness check is direct.If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof.Of course you do, you need proof of the tests being correct...Yes. So now we have to prove that we can subset the exhaustive test. This is something like the algorithm development technique that starts with a backtracking solution (full enumeration) and then prunes away via memoization or other to achieve a lower complexity. If, OTOH, I start in the middle, asserting that I "know" what the shape of the problem is, then I'm on shaky ground. As practicing programmers we live with "shaky ground" every day, but I'm hoping we can do better in this case. It's already moving in a good direction.can be skipped is tantamount to saying that you know the implementation performs correctly in those regions.In general, if one understands the algorithm as implemented then one can either manually or automatically prove certain regions/ranges. Then you only need to test the regions/ranges that are hard to prove. For instance, for a converter from int to string, the most likely failure point will be tied to crossing between "9" and "0" as well as the ends of the base-2 input. Meaning, if it works for 1,2,3, you know that it also works for 4,5,6,7,8, but maybe not 9,10.
Jan 26 2021
On Tuesday, 26 January 2021 at 20:45:22 UTC, Bruce Carneal wrote:As practicing programmers we live with "shaky ground" every day, but I'm hoping we can do better in this case. It's already moving in a good direction.Yes, the real issue in practice is constantly changing requirements and evolving software over time. We don't really have programming languages that support restructuring/modifying code very well. And well, if you have lots of "correct" tests, it becomes even more expensive to change... So, breaking things up into smaller units with concepts like actors might be a way to go, perhaps. Dunno.
Jan 26 2021