digitalmars.D - printf() metaprogramming challenge

Walter Bright (111/111) May 23 2019 While up at night with jetlag at DConf, I started toying about solving a...

Jonathan Marler (111/229) May 23 2019 It uses mixin, so not pretty, but it works...

Les De Ridder (42/43) May 23 2019 Similar solution:
Alex (4/22) May 23 2019 this can all be simplified:

Yuxuan Shui (11/15) May 23 2019 What a coincidence, I had this exact problem today as well. It

Andrei Alexandrescu (2/24) May 23 2019 Did you try .expand with the tuple?

Yuxuan Shui (2/27) May 23 2019 It's 1 character shorter to just write someTuple[0..$] :)

ag0aep6g (28/32) May 23 2019 I don't know if this satisfies the "no extra overhead" rule. Maybe when
bpr (4/15) May 23 2019 Are you sure this works for betterC? It's been a while for me,

Radu (5/20) May 24 2019 Indeed it doesn't work with -betterC flag. Easily testable on

Petar Kirov [ZombineDev] (10/34) May 24 2019 In cases like this, one needs to use the enum lambda trick:

Sebastiaan Koppe (3/12) May 24 2019 Ohh, that is nice one. Thanks!
Radu (3/21) May 24 2019 Yes, good point! I forgot about this trick.

Petar Kirov [ZombineDev] (13/14) May 24 2019 Best verified on d.godbolt.org. Compare:

Radu (5/20) May 24 2019 I used the same method to generate C header files for a betterC
Andrei Alexandrescu (3/23) May 24 2019 Interesting. These problems seem to be implementation-specific, not

Jacob Carlborg (11/40) May 24 2019 This is kind of nice, but I would prefer to have a complete

Walter Bright (5/11) May 24 2019 C's sprintf is already @nogc nothrow and pure. Doing our own is not that...

Jacob Carlborg (11/17) May 24 2019 Technically it's not pure because it access `errno`, that's what I meant...

Walter Bright (14/28) May 24 2019 The C standard doesn't say printf can set errno. Be that as it may, I di...

Andrei Alexandrescu (8/16) May 24 2019 This 100x. Once C++ variadics were out, everybody and their cat had an
Jonathan Marler (32/48) May 24 2019 It took me about an hour to port this "float to string"

Walter Bright (14/18) May 24 2019 https://github.com/ulfjack/ryu says: "The Java implementation differs fr...

Jonathan Marler (10/33) May 24 2019 I didn't design an implementation in an hour, I just ported one :)

Ola Fosheim Grøstad (10/14) May 25 2019 It is quite interesting that you get that performance without

Patrick Schluter (12/22) May 25 2019 L1 instruction cache are small and the cost of code bloat is only

Ola Fosheim Grøstad (8/16) May 25 2019 Yes, in benchmarking one should only test full applications… I

Mike Franklin (20/24) May 24 2019 That may be true, but one problem with `printf` is it is much too

Jonathan Marler (3/8) May 24 2019 My implementation is "pay for what you use". A pure D
Andrei Alexandrescu (5/26) May 25 2019 The high impact part is the metaprogramming and introspection machinery....

Ola Fosheim Grøstad (4/8) May 25 2019 Programmers are looking for solutions, not machinery…
bpr (12/17) May 25 2019 I'd think you'd be commenting on the "Issue 5710" thread then, as

Ola Fosheim Grøstad (22/32) May 25 2019 AFAIK, Ulf Adams is stating that the Java implementation is
Ola Fosheim Grøstad (23/33) May 25 2019 AFAIK, Ulf Adams is stating that the Java specification is
Joseph Rushton Wakeling (12/23) May 25 2019 FWIW the Ryu algorithm looks like a serious piece of work — see

Walter Bright (15/29) May 25 2019 Thank you. I've saved a copy of the paper. It it is indeed

Walter Bright <newshound2 digitalmars.com> writes:

While up at night with jetlag at DConf, I started toying about solving a small 
problem. In order to use printf(), the format specifiers in the printf format 
string have to match the types of the rest of the parameters. This is well known 
to be brittle and error-prone, especially when refactoring the types of the 
arguments.

(Of course, this is not a problem with writefln() and friends, but that isn't 
available in the dmd front end, nor when using betterC. Making printf better 
would mesh nicely with betterC. Note that many C compilers have extensions to 
tell you if there's a mismatch, but they won't fix it for you.)

I thought why not use D's metaprogramming to fix it. Some ground rules:

1. No extra overhead
2. Completely self-contained
3. Only %s specifiers are rewritten
4. %% is handled
5. diagnose mismatch between number of specifiers and number of arguments
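
Rule 5 can be illustrated in isolation. What follows is only a minimal sketch, not the code from this post: `countSpecs` and `checkedPrintf` are hypothetical names, the counter treats every `%x` other than `%%` as consuming exactly one argument (so it would miscount `%.*s`), and it does no rewriting of the format string.

```d
import core.stdc.stdio : printf;

/// Hypothetical helper: counts conversion specifiers in a printf
/// format string, treating "%%" as a literal percent sign.
size_t countSpecs(string f)
{
    size_t n;
    size_t i;
    while (i + 1 < f.length)
    {
        if (f[i] != '%')
        {
            ++i;
            continue;
        }
        if (f[i + 1] != '%')
            ++n;    // "%x" is assumed to consume one argument
        i += 2;     // skip "%%" or the first character of the specifier
    }
    return n;
}

/// Hypothetical wrapper: rejects a count mismatch at compile time,
/// then forwards to printf unchanged.
int checkedPrintf(string f, A...)(A args)
{
    static assert(countSpecs(f) == A.length,
        "number of format specifiers does not match number of arguments");
    enum string fmt = f;
    return printf(fmt.ptr, args);
}

void main()
{
    static assert(countSpecs("hello %s %s %% done") == 2);
    checkedPrintf!"%d of %d (100%%)\n"(3, 4);
    // checkedPrintf!"%d of %d\n"(3);   // rejected by the static assert
}
```

Because the check is a `static assert` evaluated through CTFE, a mismatch is reported when the call is compiled and nothing is added to the run-time path.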

Here's my solution:

     int i;
     dprintf!"hello %s %s %s %s betty\n"(3, 4.0, &i, "abc".ptr);

gets rewritten to:

     printf("hello %d %g %p %s betty\n", 3, 4.0, &i, "abc".ptr);

The code at the end accomplishes this. Yay!

But what I'd like to do is extend it to convert a `string s` argument into a 
`cast(int)s.length, s.ptr` tuple and use the "%.*s" specifier for it.
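
For reference, this is what the hand-written form of that conversion looks like. It is a small sketch with a hypothetical helper name (`formatSlice`), and it uses `snprintf` rather than `printf` only so the result can be checked with an assert.

```d
import core.stdc.stdio : snprintf;

/// Hypothetical helper: formats a D slice with "%.*s" into buf and
/// returns the number of characters written (excluding the NUL).
int formatSlice(char[] buf, string s)
{
    // "%.*s" consumes two arguments: an int precision, then the pointer.
    // s.length is a size_t, so the cast to int is required.
    return snprintf(buf.ptr, buf.length, "[%.*s]", cast(int) s.length, s.ptr);
}

void main()
{
    char[32] buf;
    string text = "hello world";
    // A slice is not NUL-terminated; the precision limits what is read.
    int n = formatSlice(buf[], text[0 .. 5]);
    assert(buf[0 .. n] == "[hello]");
}
```

The goal stated above is to have `dprintf` produce this pair of arguments automatically whenever an argument is a `string`.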

I completely failed at that. I suspect the language has a deficiency in 
manipulating expression tuples.

Does anyone see a way to make this work?

Note: In order to minimize template bloat, I refactored most of the work into a 
regular function, minimizing the size of the template expansions.

------ Das Code ------------
import core.stdc.stdio : printf;

template Seq(A ...) { alias Seq = A; }

int dprintf(string f, A ...)(A args)
{
     enum Fmts = Formats!(A);
     enum string s = formatString(f, Fmts);
     __gshared const(char)* s2 = s.ptr;
     return printf(Seq!(s2, args[0..2], args[2..4]));
}

template Formats(T ...)
{
     static if (T.length == 0)
	enum Formats = [ ];
     else static if (T.length == 1)
	enum Formats = [Spec!(T[0])];
     else
	enum Formats = [Spec!(T[0])] ~ Formats!(T[1 .. T.length]);
}

template Spec(T : byte)    { enum Spec = "%d"; }
template Spec(T : short)   { enum Spec = "%d"; }
template Spec(T : int)     { enum Spec = "%d"; }
template Spec(T : long)    { enum Spec = "%lld"; }

template Spec(T : ubyte)   { enum Spec = "%u"; }
template Spec(T : ushort)  { enum Spec = "%u"; }
template Spec(T : uint)    { enum Spec = "%u"; }
template Spec(T : ulong)   { enum Spec = "%llu"; }

template Spec(T : float)   { enum Spec = "%g"; }
template Spec(T : double)  { enum Spec = "%g"; }
template Spec(T : real)    { enum Spec = "%Lg"; }

template Spec(T : char)    { enum Spec = "%c"; }
template Spec(T : wchar)   { enum Spec = "%c"; }
template Spec(T : dchar)   { enum Spec = "%c"; }

template Spec(T : immutable(char)*)   { enum Spec = "%s"; }
template Spec(T : const(char)*)       { enum Spec = "%s"; }
template Spec(T : T*)                 { enum Spec = "%p"; }

/******************************************
  * Replace %s format specifiers in f with corresponding specifiers in A[].
  * Other format specifiers are left as is.
  * Number of format specifiers must match A.length.
  * Params:
  *	f = printf format string
  *	A = replacement format specifiers
  * Returns:
  *	replacement printf format string
  */
string formatString(string f, string[] A ...)
{
     string r;
     size_t i;
     size_t ai;
     while (i < f.length)
     {
	if (f[i] != '%' || i + 1 == f.length)
	{
	    r ~= f[i];
	    ++i;
	    continue;
	}
	char c = f[i + 1];
	if (c == '%')
	{
	    r ~= "%%";
	    i += 2;
	    continue;
	}
	assert(ai < A.length, "not enough arguments");
	string fmt = A[ai];
	++ai;
	if (c == 's')
	{
	    r ~= fmt;
	    i += 2;
	    continue;
	}
	r ~= '%';
	++i;
	continue;
     }
     assert(ai == A.length, "not enough formats");
     return r;
}
----- End Of Das Code ----------

May 23 2019

Jonathan Marler <johnnymarler gmail.com> writes:

On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 [snip]

It uses mixin, so not pretty, but it works...

void main()
{
     int i = 0;
     dprintf!"hello %s %s %s %s betty\n"(3, 4.0, &i, "abc".ptr);

     const msg = "AAA!";
     dprintf!"A dstring '%s'\n"(msg[0 .. 3]);
}

template Seq(A ...) { alias Seq = A; }

int dprintf(string f, A ...)(A args)
{
     import core.stdc.stdio : printf;

     enum Fmts = Formats!(A);
     enum string s = formatString(f, Fmts);
     __gshared const(char)* s2 = s.ptr;
     enum call = function() {
         import std.conv : to;
         string printfCall = "printf(s2";
         foreach(i, T; A)
         {
            static if (is(T : string))
            {
                // "%.*s" expects an int precision followed by the pointer
                printfCall ~= ", cast(int)args[" ~ i.to!string
                     ~ "].length, args[" ~ i.to!string ~ "].ptr";
            }
            else
            {
                printfCall ~= ", args[" ~ i.to!string ~ "]";
            }

         }
         return printfCall ~ ")";
     }();
     //pragma(msg, call); // uncomment to see the final call
     return mixin(call);
}

template Formats(T ...)
{
     static if (T.length == 0)
	    enum Formats = [];
     else static if (T.length == 1)
	    enum Formats = [Spec!(T[0])];
     else
	    enum Formats = [Spec!(T[0])] ~ Formats!(T[1 .. T.length]);
}

template Spec(T : byte)    { enum Spec = "%d"; }
template Spec(T : short)   { enum Spec = "%d"; }
template Spec(T : int)     { enum Spec = "%d"; }
template Spec(T : long)    { enum Spec = "%lld"; }

template Spec(T : ubyte)   { enum Spec = "%u"; }
template Spec(T : ushort)  { enum Spec = "%u"; }
template Spec(T : uint)    { enum Spec = "%u"; }
template Spec(T : ulong)   { enum Spec = "%llu"; }

template Spec(T : float)   { enum Spec = "%g"; }
template Spec(T : double)  { enum Spec = "%g"; }
template Spec(T : real)    { enum Spec = "%Lg"; }

template Spec(T : char)    { enum Spec = "%c"; }
template Spec(T : wchar)   { enum Spec = "%c"; }
template Spec(T : dchar)   { enum Spec = "%c"; }
template Spec(T : string)  { enum Spec = "%.*s"; }

template Spec(T : immutable(char)*)   { enum Spec = "%s"; }
template Spec(T : const(char)*)       { enum Spec = "%s"; }
template Spec(T : T*)                 { enum Spec = "%p"; }

/******************************************
  * Replace %s format specifiers in f with corresponding specifiers in A[].
  * Other format specifiers are left as is.
  * Number of format specifiers must match A.length.
  * Params:
  *	f = printf format string
  *	A = replacement format specifiers
  * Returns:
  *	replacement printf format string
  */
string formatString(string f, string[] A ...)
{
     string r;
     size_t i;
     size_t ai;
     while (i < f.length)
     {
	if (f[i] != '%' || i + 1 == f.length)
	{
	    r ~= f[i];
	    ++i;
	    continue;
	}
	char c = f[i + 1];
	if (c == '%')
	{
	    r ~= "%%";
	    i += 2;
	    continue;
	}
	assert(ai < A.length, "not enough arguments");
	string fmt = A[ai];
	++ai;
	if (c == 's')
	{
	    r ~= fmt;
	    i += 2;
	    continue;
	}
	r ~= '%';
	++i;
	continue;
     }
     assert(ai == A.length, "not enough formats");
     return r;
}

May 23 2019

Les De Ridder <les lesderid.net> writes:

On Thursday, 23 May 2019 at 22:48:33 UTC, Jonathan Marler wrote:
 It uses mixin, so not pretty, but it works...

Similar solution:

--- printf.d	2019-05-24 00:48:44.840543714 +0200
+++ printf_s.d	2019-05-24 00:52:47.829178613 +0200
@@ -1,13 +1,12 @@
  import core.stdc.stdio : printf;

-template Seq(A ...) { alias Seq = A; }
-
  int dprintf(string f, A ...)(A args)
  {
      enum Fmts = Formats!(A);
      enum string s = formatString(f, Fmts);
+    alias args_ = Args!(args);
      __gshared const(char)* s2 = s.ptr;
-    return printf(Seq!(s2, args[0..2], args[2..4]));
+    mixin( q{return printf(s2, } ~ args_ ~ q{);} );
  }

  template Formats(T ...)
@@ -42,6 +41,22 @@
  template Spec(T : const(char)*)       { enum Spec = "%s"; }
  template Spec(T : T*)                 { enum Spec = "%p"; }

+template Spec(T : string)  { enum Spec = "%.*s"; }
+
+template Args(A ...)
+{
+    static if (A.length == 0)
+	enum Args = "";
+    else static if (A.length == 1)
+	enum Args = Arg!(A[0]);
+    else
+	enum Args = Arg!(A[0]) ~ ", " ~ Args!(A[1 .. A.length]);
+}
+
+template Arg(alias string arg) { enum Arg = "cast(int)"~arg.stringof~".length,"~arg.stringof~".ptr"; }
+
+template Arg(alias arg) { enum Arg = arg.stringof; }
+
  /******************************************
   * Replace %s format specifiers in f with corresponding specifiers in A[].
   * Other format specifiers are left as is.

May 23 2019

Alex <AJ gmail.com> writes:

 template Spec(T : byte)    { enum Spec = "%d"; }
 template Spec(T : short)   { enum Spec = "%d"; }
 template Spec(T : int)     { enum Spec = "%d"; }
 template Spec(T : long)    { enum Spec = "%lld"; }

 template Spec(T : ubyte)   { enum Spec = "%u"; }
 template Spec(T : ushort)  { enum Spec = "%u"; }
 template Spec(T : uint)    { enum Spec = "%u"; }
 template Spec(T : ulong)   { enum Spec = "%llu"; }

 template Spec(T : float)   { enum Spec = "%g"; }
 template Spec(T : double)  { enum Spec = "%g"; }
 template Spec(T : real)    { enum Spec = "%Lg"; }

 template Spec(T : char)    { enum Spec = "%c"; }
 template Spec(T : wchar)   { enum Spec = "%c"; }
 template Spec(T : dchar)   { enum Spec = "%c"; }
 template Spec(T : string)  { enum Spec = "%.*s"; }

 template Spec(T : immutable(char)*)   { enum Spec = "%s"; }
 template Spec(T : const(char)*)       { enum Spec = "%s"; }
 template Spec(T : T*)                 { enum Spec = "%p"; }

This can all be simplified:

static foreach (k, v; ["byte":"%d", "short":"%d", ...])
     mixin(`template Spec(T : `~k~`) { enum Spec = "`~v~`"; }`);

The string mixin is not necessary, but it is easier than using an AliasSeq.

May 23 2019

Yuxuan Shui <yshuiv7 gmail.com> writes:

On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 [snip]

 I completely failed at that. I suspect the language has a 
 deficiency in manipulating expression tuples.

 

What a coincidence, I had this exact problem today as well. It 
seems currently the only way to do this is either with mixins, or 
using tuple.

Assuming you already have all of the arguments in a tuple:

     auto args = tuple(...);

And args[x] is a string, you can do this:

     auto args_prime = tuple(args[0..x], args[x].length, 
args[x].ptr, args[x+1..$]);

You then need to do some template magic to expand all such 
arguments... Using mixin is probably better.
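
A runnable sketch of the tuple approach for a single string argument, under a few assumptions: it uses `std.typecons.tuple` (so it is not betterC-compatible), the index `x` is known at compile time, `demo` is a hypothetical name, and `snprintf` stands in for `printf` so the output can be checked.

```d
import core.stdc.stdio : snprintf;
import std.typecons : tuple;

/// Hypothetical demo: splits the string at index x into (length, pointer)
/// and forwards the flattened tuple to snprintf.
int demo(char[] buf)
{
    auto args = tuple(42, "abc", 1.5);
    enum x = 1;     // index of the string argument

    // Everything before x, then the int length and pointer for "%.*s",
    // then everything after x.
    auto flat = tuple(args.expand[0 .. x],
                      cast(int) args.expand[x].length,
                      args.expand[x].ptr,
                      args.expand[x + 1 .. $]);

    return snprintf(buf.ptr, buf.length, "%d %.*s %g", flat.expand);
}

void main()
{
    char[64] buf;
    int n = demo(buf[]);
    assert(buf[0 .. n] == "42 abc 1.5");
}
```

Handling an arbitrary number of string arguments at arbitrary positions still needs the recursive template work mentioned above.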

May 23 2019

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 5/23/19 6:58 PM, Yuxuan Shui wrote:
 On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 [snip]

 I completely failed at that. I suspect the language has a deficiency 
 in manipulating expression tuples.

 
 What a coincidence, I had this exact problem today as well. It seems 
 currently the only way to do this is either with mixins, or using tuple.
 
 Assuming you already have all of the arguments in a tuple:
 
      auto args = tuple(...);
 
 And args[x] is a string, you can do this:
 
      auto args_prime = tuple(args[0..x], args[x].length, args[x].ptr, 
 args[x+1..$]);
 
 You then need to do some template magic to expand all such arguments... 
 Using mixin is probably better.

Did you try .expand with the tuple?

May 23 2019

Yuxuan Shui <yshuiv7 gmail.com> writes:

On Friday, 24 May 2019 at 00:41:31 UTC, Andrei Alexandrescu wrote:
 On 5/23/19 6:58 PM, Yuxuan Shui wrote:
 On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 [snip]

 I completely failed at that. I suspect the language has a 
 deficiency in manipulating expression tuples.

 
 What a coincidence, I had this exact problem today as well. It 
 seems currently the only way to do this is either with mixins, 
 or using tuple.
 
 Assuming you already have all of the arguments in a tuple:
 
      auto args = tuple(...);
 
 And args[x] is a string, you can do this:
 
      auto args_prime = tuple(args[0..x], args[x].length, 
 args[x].ptr, args[x+1..$]);
 
 You then need to do some template magic to expand all such 
 arguments... Using mixin is probably better.

 Did you try .expand with the tuple?

It's 1 character shorter to just write someTuple[0..$] :)

May 23 2019

ag0aep6g <anonymous example.com> writes:

On 23.05.19 21:33, Walter Bright wrote:
 But what I'd like it to do is to extend it to convert a `string s` 
 argument into `cast(int)s.length, s.ptr` tuple and use the "%.*s" 
 specifier for it.

[...]
 Does anyone see a way to make this work?

I don't know if this satisfies the "no extra overhead" rule. Maybe when 
`arrlen` and `arrptr` are inlined?

int dprintf(string f, A ...)(A args)
{
     enum Fmts = Formats!(A);
     enum string s = formatString(f, Fmts);
     __gshared const(char)* s2 = s.ptr;
     import std.meta: staticMap;
     return printf(Seq!(s2, staticMap!(arg, args)));
}

template arg(alias a)
{
     static if (is(typeof(a) == string))
         alias arg = Seq!(arrlen!a, arrptr!a);
     else alias arg = a;
}

// "%.*s" takes an int precision, so the length is cast here
int arrlen(alias a)() { return cast(int) a.length; }
auto arrptr(alias a)() { return a.ptr; }

template Spec(T : string) { enum Spec = "%.*s"; }

void main()
{
     int i;
     dprintf!"hello %s %s %s %s betty %s\n"(3, 4.0, &i,
         "abc".ptr, "foobar");
}

// ... rest of the code unchanged ...

May 23 2019

bpr <brogoff gmail.com> writes:

On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 string formatString(string f, string[] A ...)
 {
     string r;
     size_t i;
     size_t ai;
     while (i < f.length)
     {
 	if (f[i] != '%' || i + 1 == f.length)
 	{
 	    r ~= f[i];


Are you sure this works for betterC? It's been a while for me, 
but I think it won't, the string appends will stop it.

Good job at getting unit tests and final switch in!

 ----- End Of Das Code ----------

May 23 2019

Radu <void null.pt> writes:

On Friday, 24 May 2019 at 04:49:22 UTC, bpr wrote:
 On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 string formatString(string f, string[] A ...)
 {
     string r;
     size_t i;
     size_t ai;
     while (i < f.length)
     {
 	if (f[i] != '%' || i + 1 == f.length)
 	{
 	    r ~= f[i];


 Are you sure this works for betterC? It's been a while for me, 
 but I think it won't, the string appends will stop it.

 Good job at getting unit tests and final switch in!

 ----- End Of Das Code ----------


Indeed, it doesn't work with the -betterC flag. This is easily testable on 
run.dlang.io

This probably would work if CTFE was supported when compiling 
with betterC.

May 24 2019

Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:

On Friday, 24 May 2019 at 07:09:56 UTC, Radu wrote:
 On Friday, 24 May 2019 at 04:49:22 UTC, bpr wrote:
 On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
 string formatString(string f, string[] A ...)
 {
     string r;
     size_t i;
     size_t ai;
     while (i < f.length)
     {
 	if (f[i] != '%' || i + 1 == f.length)
 	{
 	    r ~= f[i];


 Are you sure this works for betterC? It's been a while for me, 
 but I think it won't, the string appends will stop it.

 Good job at getting unit tests and final switch in!

 ----- End Of Das Code ----------


 Indeed it doesn't work with -betterC flag. Easily testable on 
 run.dlang.io

 This probably would work if CTFE was supported when compiling 
 with betterC.


In cases like this, one needs to use the enum lambda trick:

// Before:
string foo(string arg1) { /* .. */ }

// After:
enum foo(string arg1) = () { /* .. */ };


(Replace `string arg1` with all compile-time and run-time 
parameters that `foo` may take.)

That way, `foo` won't reach the code-generator and hence you 
won't get errors with `-betterC`.
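
A small self-contained sketch of the trick. `shout` is a hypothetical example, not code from this thread, and the sketch assumes the lambda is invoked immediately with a trailing `()`, which the abbreviated snippet above does not show. The result is a compile-time string constant, so the GC appends inside the lambda run only during CTFE.

```d
import core.stdc.stdio : puts;

// Hypothetical example. The enum holds the lambda's result, not the
// lambda itself, because of the trailing `()`.
enum shout(string s) = () {
    string r;
    foreach (c; s)
    {
        char u = c;
        if (u >= 'a' && u <= 'z')
            u -= 32;    // ASCII lower case to upper case
        r ~= u;         // GC append, evaluated at compile time only
    }
    return r ~ "!";
}();

static assert(shout!"abc" == "ABC!");

void main()
{
    enum string msg = shout!"hello";
    puts(msg.ptr);      // msg is emitted as an ordinary string literal
}
```

When building with `-betterC`, the entry point would be `extern (C) int main()` instead of `void main()`.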

May 24 2019

Sebastiaan Koppe <mail skoppe.eu> writes:

On Friday, 24 May 2019 at 08:00:31 UTC, Petar Kirov [ZombineDev] 
wrote:
 In cases like this, one needs to use the enum lambda trick:

 // Before:
 string foo(string arg1) { /* .. */ }

 // After:
 enum foo(string arg1) = () { /* .. */ };


 (Replace `string arg1` with all compile-time and run-time 
 parameters that `foo` may take.)

 That way, `foo` won't reach the code-generator and hence you 
 won't get errors with `-betterC`.

Ohh, that is a nice one. Thanks!

May 24 2019

Radu <void null.pt> writes:

On Friday, 24 May 2019 at 08:00:31 UTC, Petar Kirov [ZombineDev] 
wrote:
 On Friday, 24 May 2019 at 07:09:56 UTC, Radu wrote:
 On Friday, 24 May 2019 at 04:49:22 UTC, bpr wrote:
 [...]


 Indeed it doesn't work with -betterC flag. Easily testable on 
 run.dlang.io

 This probably would work if CTFE was supported when compiling 
 with betterC.


 In cases like this, one needs to use the enum lambda trick:

 // Before:
 string foo(string arg1) { /* .. */ }

 // After:
 enum foo(string arg1) = () { /* .. */ };


 (Replace `string arg1` with all compile-time and run-time 
 parameters that `foo` may take.)

 That way, `foo` won't reach the code-generator and hence you 
 won't get errors with `-betterC`.

Yes, good point! I forgot about this trick.

May 24 2019

Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:

On Friday, 24 May 2019 at 09:52:58 UTC, Radu wrote:
 Yes, good point! I forgot about this trick.

Best verified on d.godbolt.org. Compare:

* https://d.godbolt.org/z/E8aoBg - compiles without -betterC, 
generates a ton of bloat

* https://d.godbolt.org/z/GGh9c1 - same, but doesn't compile with 
-betterC

* https://d.godbolt.org/z/mPQMcc - compiles with -betterC

* https://run.dlang.io/gist/run-dlang/1caf15c8c7dded16ba812353361feda9 - what 
I would like to write, but currently produces too much bloat and doesn't work 
with -betterC

The examples above were inspired by 
https://twitter.com/Cor3ntin/status/1127210941718962177. I wanted 
to check how D with -betterC would compare to C++23+ w.r.t code 
gen (bloat).

May 24 2019

Radu <void null.pt> writes:

On Friday, 24 May 2019 at 12:14:09 UTC, Petar Kirov [ZombineDev] 
wrote:
 On Friday, 24 May 2019 at 09:52:58 UTC, Radu wrote:
 Yes, good point! I forgot about this trick.

 Best verified on d.godbolt.org. Compare:

 * https://d.godbolt.org/z/E8aoBg - compiles without -betterC, 
 generates a ton of bloat

 * https://d.godbolt.org/z/GGh9c1 - same, but doesn't compile 
 with -betterC

 * https://d.godbolt.org/z/mPQMcc - compiles with -betterC

 * https://run.dlang.io/gist/run-dlang/1caf15c8c7dded16ba812353361feda9 - 
 what I would like to write, but currently produces too much bloat and 
 doesn't work with -betterC

 The examples above, were inspired by 
 https://twitter.com/Cor3ntin/status/1127210941718962177. I 
 wanted to check how D with -betterC would compare to C++23+ 
 w.r.t code gen (bloat).

I used the same method to generate C header files for a betterC 
library, I know it works and doesn't produce runtime bloat.

Too bad it is something you forget, i.e. it is not obvious :)

May 24 2019

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 5/24/19 8:14 AM, Petar Kirov [ZombineDev] wrote:
 On Friday, 24 May 2019 at 09:52:58 UTC, Radu wrote:
 Yes, good point! I forgot about this trick.

 
 Best verified on d.godbolt.org. Compare:
 
 * https://d.godbolt.org/z/E8aoBg - compiles without -betterC, generates 
 a ton of bloat
 
 * https://d.godbolt.org/z/GGh9c1 - same, but doesn't compile with -betterC
 
 * https://d.godbolt.org/z/mPQMcc - compiles with -betterC
 
 * https://run.dlang.io/gist/run-dlang/1caf15c8c7dded16ba812353361feda9 - 
 what I would like to write, but currently produces too much bloat and 
 doesn't work with -betterC
 
 The examples above, were inspired by 
 https://twitter.com/Cor3ntin/status/1127210941718962177. I wanted to 
 check how D with -betterC would compare to C++23+ w.r.t code gen (bloat).

Interesting. These problems seem to be implementation-specific, not 
fundamental.

May 24 2019

Jacob Carlborg <doob me.com> writes:

On 2019-05-23 21:33, Walter Bright wrote:
 While up at night with jetlag at DConf, I started toying about solving a 
 small problem. In order to use printf(), the format specifiers in the 
 printf format string have to match the types of the rest of the 
 parameters. This is well known to be brittle and error-prone, especially 
 when refactoring the types of the arguments.
 
 (Of course, this is not a problem with writefln() and friends, but that 
 isn't available in the dmd front end, nor when using betterC. Making 
 printf better would mesh nicely with betterC. Note that many C compilers 
 have extensions to tell you if there's a mismatch, but they won't fix it 
 for you.)
 
 I thought why not use D's metaprogramming to fix it. Some ground rules:
 
 1. No extra overhead
 2. Completely self-contained
 3. Only %s specifiers are rewritten
 4. %% is handled
 5. diagnose mismatch between number of specifiers and number of arguments
 
 Here's my solution:
 
      int i;
      dprintf!"hello %s %s %s %s betty\n"(3, 4.0, &i, "abc".ptr);
 
 gets rewritten to:
 
      printf("hello %d %g %p %s betty\n", 3, 4.0, &i, "abc".ptr);
 

This is kind of nice, but I would prefer to have a complete 
implementation of sprintf written in D that is @nogc, @safe, nothrow and 
pure, to avoid having to add various hacks to apply these attributes.

It would be nice if it recognized objects and called `toString` or `toChars` 
as well.

I can also add that there was this guy at DConf who said that if a D 
string is to be passed to a C library, one should manually pass the 
pointer and length separately, without any magic ;)

-- 
/Jacob Carlborg

May 24 2019

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2019 8:35 AM, Jacob Carlborg wrote:
 This is kind of nice, but I would prefer to have a complete implementation 
 written in D (of sprintf) that is @nogc @safe nothrow and pure. To avoid having 
 to add various hacks to apply these attributes.

C's sprintf is already @nogc nothrow and pure. Doing our own is not that easy, 
in particular, the floating point formatting is a fair amount of tricky work.

Besides, this is a few lines of code, and would fit in fine with betterC.


 I can also add that there was this guy at DConf that said that if a D string 
 should be passed to a C library it should manually pass the pointer and length 
 separately without any magic ;)

That wouldn't work with %.*s because the .length argument must be cast to int.

May 24 2019

Jacob Carlborg <doob me.com> writes:

On 2019-05-24 20:39, Walter Bright wrote:

 C's sprintf is already @nogc nothrow and pure.

Technically it's not pure because it accesses `errno`, that's what I meant 
with "various hacks".

 Doing our own is not that 
 easy, in particular, the floating point formatting is a fair amount of 
 tricky work.

Stefan Koch has an implementation for that [3], even works at CTFE. Not 
sure if it's compatible with the C implementation though.

 That wouldn't work with %.*s because the .length argument must be cast 
 to int.

Of course it works. The DMD code base is littered with calls to printf 
that pass D strings the manual way, pointer and length separately, 
including the cast.

[3] https://github.com/UplinkCoder/fpconv/blob/master/src/fpconv_ctfe.d

-- 
/Jacob Carlborg

May 24 2019

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2019 12:15 PM, Jacob Carlborg wrote:
 On 2019-05-24 20:39, Walter Bright wrote:
 
 C's sprintf is already @nogc nothrow and pure.

 
 Technically it's not pure because it accesses `errno`, that's what I meant with 
 "various hacks".

The C standard doesn't say printf can set errno. Be that as it may, I did find 
one printf that did:

"If a multibyte character encoding error occurs while writing wide characters, 
errno is set to EILSEQ and a negative number is returned."

http://www.cplusplus.com/reference/cstdio/printf/

It's pure if not sending it malformed UTF.


 Doing our own is not that easy, in particular, the floating point formatting 
 is a fair amount of tricky work.

 Stefan Koch has an implementation for that [3], even works at CTFE. Not sure if 
 it's compatible with the C implementation though.

I have one, too, the DMC++ one, though it doesn't do the fp formatting exactly 
right. I infer Stefan's doesn't, either, simply because his test suite spans 
lines 574-583 and is completely inadequate.

You can get an idea of what is required by reading:

https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf


 That wouldn't work with %.*s because the .length argument must be cast to int.

 Of course it works. The DMD code base is littered with calls to printf that pass 
 D strings the manual way, with the pointer and length given separately, including 
 the cast.

The compiler doesn't know to do the cast when passing `string` arguments by 
.ptr/.length.
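
A minimal sketch of the manual style under discussion, assuming a C99-style snprintf as exposed by core.stdc.stdio. `formatSlice` is a hypothetical helper name, not something from DMD or druntime. The precision argument of `%.*s` is a C `int`, while `.length` is `size_t`, so the cast has to be spelled out at each call site:

```d
import core.stdc.stdio : snprintf;

// Hypothetical helper: formats a D slice into `buf` through C's "%.*s".
// "%.*s" reads its precision as a C int, but s.length is size_t,
// so the cast is written by hand; the compiler will not insert it.
int formatSlice(char[] buf, const(char)[] s) @nogc nothrow
{
    return snprintf(buf.ptr, buf.length, "%.*s", cast(int) s.length, s.ptr);
}

unittest
{
    char[16] buf;
    string hello = "hello, world";
    // The slice is not NUL-terminated at its end; the precision bounds the read.
    int n = formatSlice(buf[], hello[0 .. 5]);
    assert(n == 5);
    assert(buf[0 .. 5] == "hello");
}
```

The cast silently truncates for slices longer than `int.max`, which is one reason to keep it visible rather than implicit.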

May 24 2019

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 5/24/19 2:39 PM, Walter Bright wrote:
 On 5/24/2019 8:35 AM, Jacob Carlborg wrote:
 This is kind of nice, but I would prefer to have a complete 
 implementation written in D (of sprintf) that is @nogc @safe nothrow 
 and pure. To avoid having to add various hacks to apply these attributes.

 
 C's sprintf is already @nogc nothrow and pure. Doing our own is not that 
 easy, in particular, the floating point formatting is a fair amount of 
 tricky work.

This 100x. Once C++ variadics were out, everybody and their cat had an 
article about "safely replacing the printf family". Invariably the 
implementations ditched printf and as a consequence were bulky and they 
did awfully with floating point numbers. The right approach in 10% of 
the code is to check the arguments during compilation and then forward 
to the C function. The only remaining slightly tricky part (for printing 
to a string) is figuring out the maximum buffer size needed.
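
A rough sketch of that approach, not a definitive implementation: the format string is derived from the argument types at compile time, unsupported types are rejected during compilation, and the actual formatting is forwarded to C's snprintf. The names `spec`, `formatOf` and `sformat` are made up for illustration, only `int` and `double` are handled, and the size probe assumes a C99-conforming snprintf that returns the required length when given an empty buffer:

```d
import core.stdc.stdio : snprintf;

// Hypothetical helper: maps a D type to a printf conversion at compile time.
// Any other type fails to compile instead of misbehaving at run time.
template spec(T)
{
    static if (is(T == int)) enum spec = "%d";
    else static if (is(T == double)) enum spec = "%g";
    else static assert(0, "no printf conversion for " ~ T.stringof);
}

// Hypothetical helper: joins the conversions for all argument types with spaces.
template formatOf(Args...)
{
    static if (Args.length == 0) enum formatOf = "";
    else static if (Args.length == 1) enum formatOf = spec!(Args[0]);
    else enum formatOf = spec!(Args[0]) ~ " " ~ formatOf!(Args[1 .. $]);
}

// Forwards to snprintf with the compile-time format string.
// Returns the length the full output needs, excluding the terminating NUL.
int sformat(Args...)(char[] buf, Args args) @nogc nothrow
{
    enum fmt = formatOf!Args;
    return snprintf(buf.ptr, buf.length, fmt.ptr, args);
}

unittest
{
    static assert(formatOf!(int, double) == "%d %g");
    static assert(!__traits(compiles, formatOf!(string)));

    // A first pass with an empty buffer only measures the required size.
    int needed = sformat(null, 42, 1.5);
    assert(needed == 6); // "42 1.5"

    char[32] buf;
    int n = sformat(buf[], 42, 1.5);
    assert(buf[0 .. n] == "42 1.5");
}
```

Measuring with a first call and formatting with a second costs two passes; computing a static upper bound per conversion would avoid that, at the price of more compile-time machinery.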

May 24 2019

Jonathan Marler <johnnymarler gmail.com> writes:

On Friday, 24 May 2019 at 18:39:41 UTC, Walter Bright wrote:
 On 5/24/2019 8:35 AM, Jacob Carlborg wrote:
 This is kind of nice, but I would prefer to have a complete 
 implementation written in D (of sprintf) that is @nogc @safe 
 nothrow and pure. To avoid having to add various hacks to 
 apply these attributes.

 C's sprintf is already @nogc nothrow and pure. Doing our own is 
 not that easy, in particular, the floating point formatting is 
 a fair amount of tricky work.

It took me about an hour to port this "float to string" 
implementation to D:

https://github.com/ulfjack/ryu
https://github.com/dragon-lang/mar/blob/master/src/mar/ryu.d

You can use `floatToString` to print a default-formatted float, 
or you can add your own formats by calling `f2d` which gives you 
the exponent and mantissa.

I only added support for 32-bit floats though.  Will add support 
for more when I need it.

 Besides, this is a few lines of code, and would fit in fine 
 with betterC.

True

 I can also add that there was this guy at DConf that said that 
 if a D string should be passed to a C library it should 
 manually pass the pointer and length separately without any 
 magic ;)

 That wouldn't work with %.*s because the .length argument must 
 be cast to int.

Not sure if you'll find it helpful, but I wrote my own "print" 
framework in my library that's meant to be usable in -betterC and 
with/without druntime/phobos.

https://github.com/dragon-lang/mar/blob/master/Print.md
https://github.com/dragon-lang/mar/tree/master/src/mar/print

It doesn't use format strings; instead, it allows you to return a 
struct with a "print" function, e.g.

import mar.print;

int a = 42;
sprint("a is: ", a);
sprint("a in hex is: 0x", a.formatHex);

struct Point
{
     int x;
     int y;
     auto print(P)(P printer) const
     {
         return printArgs(printer, x, ',', y);
     }
}

sprint("point is ", Point(1, 2));

May 24 2019

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2019 2:07 PM, Jonathan Marler wrote:
 It took me about an hour to port this "float to string" implementation to D:
 
 https://github.com/ulfjack/ryu
 https://github.com/dragon-lang/mar/blob/master/src/mar/ryu.d

https://github.com/ulfjack/ryu says: "The Java implementation differs from the 
output of Double.toString in some cases: sometimes the output is shorter (which 
is arguably more accurate) and sometimes the output may differ in the precise 
digits output" which I find fairly concerning. Please review the paper I linked 
to in my reply to Jacob.

Floating point formatting is not something that can be knocked out in an hour. 
You can get a "mostly working" implementation that way, but not a serious, 
robust, correct implementation with the expected flexibility. (And the test 
cases to prove it correct.)

The fact that people write academic papers about it should be good evidence.

C's printf has been hammered on by literally generations of programmers over 3 
decades. While the interface to it is old-fashioned and unsafe, the guts of it 
are rock solid, fast, and correct.

May 24 2019

Jonathan Marler <johnnymarler gmail.com> writes:

On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:
 On 5/24/2019 2:07 PM, Jonathan Marler wrote:
 It took me about an hour to port this "float to string" 
 implementation to D:
 
 https://github.com/ulfjack/ryu
 https://github.com/dragon-lang/mar/blob/master/src/mar/ryu.d

 https://github.com/ulfjack/ryu says: "The Java implementation 
 differs from the output of Double.toString in some cases: 
 sometimes the output is shorter (which is arguably more 
 accurate) and sometimes the output may differ in the precise 
 digits output" which I find fairly concerning. Please review 
 the paper I linked to in my reply to Jacob.

 Floating point formatting is not something that can be knocked 
 out in an hour. You can get a "mostly working" implementation 
 that way, but not a serious, robust, correct implementation 
 with the expected flexibility. (And the test cases to prove it 
 correct.)

 The fact that people write academic papers about it should be 
 good evidence.

 C's printf has been hammered on by literally generations of 
 programmers over 3 decades. While the interface to it is 
 old-fashioned and unsafe, the guts of it are rock solid, fast, 
 and correct.

I didn't design an implementation in an hour, I just ported one :)

Ulf's algorithm can be implemented in only a few hundred lines 
and apparently is the fastest implementation to-date that 
maintains a 100% robust algorithm. At least that's what I remember 
from watching his video.

https://pldi18.sigplan.org/details/pldi-2018-papers/20/Ry-Fast-Float-to-String-Conversion

He explains in the video why this is a hard problem and tries to 
explain his paper/algorithm.  But it's very new, only a year 
old I think.  Cool innovation.

May 24 2019

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Friday, 24 May 2019 at 23:55:13 UTC, Jonathan Marler wrote:
 Ulf's algorithm can be implemented in only a few hundred lines 
 and apparently is the fastest implementation to-date that 
 maintains a 100% robust algorithm.

It is quite interesting that you get that performance without 
bloat.

I wonder if it is faster than the special cased float 
implementations. (using an estimator that chooses a faster 
floating point version where it works).

 But it's very new, only a year old I think.  Cool innovation.

Yes:

ACM SIGPLAN Notices - PLDI '18
Volume 53 Issue 4, April 2018
Pages 270-282

May 25 2019

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Saturday, 25 May 2019 at 07:26:47 UTC, Ola Fosheim Grøstad 
wrote:
 On Friday, 24 May 2019 at 23:55:13 UTC, Jonathan Marler wrote:
 Ulf's algorithm can be implemented in only a few hundred lines 
 and apparently is the fastest implementation to-date that 
 maintains a 100% robust algorithm.


 It is quite interesting that you get that performance without 
 bloat.

L1 instruction caches are small, and the cost of code bloat is only 
rarely counted. Benchmarks are overwhelmingly well behaved 
with respect to instruction caches, 
so optimisation for the instruction cache gets neglected.

On one of our projects I once had a heavily optimised function with a lot 
of subcases, loop unrolling, etc. In the test benchmark it was the 
fastest of all the alternatives. In the final application, though, 
a simple 2-line loop in pure C outran it. 
With valgrind's cachegrind I discovered that the 
misses in the instruction cache made a big, big difference.

 I wonder if it is faster than the special cased float 
 implementations. (using an estimator that chooses a faster 
 floating point version where it works).

 But it's very new, only a year old I think.  Cool innovation.

May 25 2019

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Saturday, 25 May 2019 at 11:27:43 UTC, Patrick Schluter wrote:
 L1 instruction caches are small, and the cost of code bloat is 
 only rarely counted. Benchmarks are overwhelmingly well 
 behaved with respect to instruction caches.

Yes, in benchmarking one should only test full applications… I 
guess invalidating the cache every N iterations is a possibility 
in a synthetic benchmark.

 So optimisation for the instruction cache gets 
 neglected.

Right, I'm interested in seeing what Mike Franklin does for 
embedded. A minimalistic framework would be interesting to see.

 concrete application. With valgrind cachegrind I discovered 
 that the misses in instruction cache made a big, big, 
 difference.

Interesting, I really need to try that cachegrind some time. 
Sounds very useful.

May 25 2019

Mike Franklin <slavo5150 yahoo.com> writes:

On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:

 C's printf has been hammered on by literally generations of 
 programmers over 3 decades. While the interface to it is 
 old-fashioned and unsafe, the guts of it are rock solid, fast, 
 and correct.

That may be true, but one problem with `printf` is it is much too 
large and inefficient for some problem domains [1].

Rust has a more efficient `printf` alternative which is not 
dependent on a runtime or libc [2].

D could offer a *much* more efficient, pay-for-what-you-use 
implementation that doesn't require libc, a runtime, etc., like 
Rust's implementation.  It wouldn't be easy (especially wrt 
floating point types), but it would be a great benefit to D and 
its users.  Maybe I'll add it to dlang/projects [3].

There seems to be a perception about C that because it's old and 
proven, it's magical.  There's nothing `printf` is doing that D 
can't do better, if someone would just be willing to do the hard 
work.

Mike


[1] - [...] use printf() - 
https://embeddedgurus.com/stack-overflow/tag/printf/
[2] - std.fmt : https://doc.rust-lang.org/std/fmt/
[3] - dlang/projects - https://github.com/dlang/projects

May 24 2019

Jonathan Marler <johnnymarler gmail.com> writes:

On Friday, 24 May 2019 at 23:58:46 UTC, Mike Franklin wrote:
 On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:

 [...]

 That may be true, but one problem with `printf` is it is much 
 too large and inefficient for some problem domains [1].

 [...]

My implementation is "pay for what you use".  A pure D 
implementation that's also extensible.

May 24 2019

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/24/19 7:58 PM, Mike Franklin wrote:
 On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:
 
 C's printf has been hammered on by literally generations of 
 programmers over 3 decades. While the interface to it is old-fashioned 
 and unsafe, the guts of it are rock solid, fast, and correct.

 
 That may be true, but one problem with `printf` is it is much too large 
 and inefficient for some problem domains [1].
 
 Rust has a more efficient `printf` alternative which is not dependent on 
 a runtime or libc [2].
 
 D could offer a *much* more efficient, pay-for-what-you-use 
 implementation that doesn't require libc, a runtime, etc., like Rust's 
 implementation.  It wouldn't be easy (especially wrt floating point 
 types), but it would be a great benefit to D and its users.  Maybe I'll 
 add it to dlang/projects [3].
 
 There seems to be a perception about C that because it's old and proven, 
 it's magical.  There's nothing `printf` is doing that D can't do better, 
 if someone would just be willing to do the hard work.

The high impact part is the metaprogramming and introspection machinery. 
This is where D can contribute something innovative to the larger 
programming community. Yet another implementation of formatting 
primitives is low impact.

May 25 2019

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Saturday, 25 May 2019 at 14:40:09 UTC, Andrei Alexandrescu 
wrote:
 The high impact part is the metaprogramming and introspection 
 machinery. This is where D can contribute something innovative 
 to the larger programming community. Yet another implementation 
 of formatting primitives is low impact.

Programmers are looking for solutions, not machinery…

Many people might want to use D for Arduino if it were plug-and-play.

May 25 2019

bpr <brogoff gmail.com> writes:

On Saturday, 25 May 2019 at 14:40:09 UTC, Andrei Alexandrescu 
wrote:
 The high impact part is the metaprogramming and introspection 
 machinery. This is where D can contribute something innovative 
 to the larger programming community.

I'd think you'd be commenting on the "Issue 5710" thread then, as 
that very much does affect the metaprogramming capabilities of D. 
I do agree that this is an area where D shines and where D can 
show off the most compared to its competitors.

 Yet another implementation of formatting primitives is low 
 impact.

You're probably right, but the people implementing this in D are 
already doing or have done it, so that's a good thing. Also, as 
someone who will likely only use the betterC subset of D, having 
some better primitives than printf in BetterC is a small boon, 
though I'd rather have something more like the new C++ fmt 
library than like printf.

May 25 2019

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:
 https://github.com/ulfjack/ryu says: "The Java implementation 
 differs from the output of Double.toString in some cases: 
 sometimes the output is shorter (which is arguably more 
 accurate) and sometimes the output may differ in the precise 
 digits output" which I find fairly concerning. Please review 
 the paper I linked to in my reply to Jacob.

AFAIK, Ulf Adams is stating that the Java implementation is 
sloppy (my word).

He states that other implementations provide more digits than is 
necessary to get an accurate representation.

https://dl.acm.org/citation.cfm?id=3192369

 C's printf has been hammered on by literally generations of 
 programmers over 3 decades. While the interface to it is 
 old-fashioned and unsafe, the guts of it are rock solid, fast, 
 and correct.

Not really. He argues that the C spec isn't clear, so he follows 
more stringent criteria than C printf.


Burger and Dybvig found errors in implementations from DEC, HP 
and SGI:
https://www.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96.pdf


Others claim to have found roundoff errors in common printf 
implementations, e.g.: «The implementation that ships with 
Microsoft Visual C++ 2010 Express sometimes has an off-by-one 
error when rounding to the closest value.»

http://www.ryanjuckett.com/programming/printing-floating-point-numbers/

  (I haven't checked the claim, but it would not surprise me).


It is clear that not using the C standard lib will bring more 
consistent and portable results across platforms, even if the C 
version is correct, as the C standard leaves wiggle room.  This 
can be important in scientific computing when comparing results 
from various platforms.

May 25 2019

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:
 https://github.com/ulfjack/ryu says: "The Java implementation 
 differs from the output of Double.toString in some cases: 
 sometimes the output is shorter (which is arguably more 
 accurate) and sometimes the output may differ in the precise 
 digits output" which I find fairly concerning. Please review 
 the paper I linked to in my reply to Jacob.

AFAIK, Ulf Adams is stating that the Java specification is 
unclear, so it is up for debate as to whether the Java 
implementation is wrong or whether the spec should be reviewed.

He also states that other implementations provide more digits 
than is necessary to get an accurate representation.

https://dl.acm.org/citation.cfm?id=3192369
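
To make the "more digits than necessary" point concrete, here is a small sketch. It assumes the platform's snprintf and strtod are correctly rounded (true of glibc, not guaranteed by the C standard), and `roundTrips` is a hypothetical helper name. It is not Ryu; it only shows that 17 significant digits always recover a double while far fewer can be enough:

```d
import core.stdc.stdio : snprintf;
import core.stdc.stdlib : strtod;

// Hypothetical helper: formats `value` with `digits` significant digits,
// parses the text back, and reports whether the same double comes out.
bool roundTrips(double value, int digits) @nogc nothrow
{
    char[64] buf;
    snprintf(buf.ptr, buf.length, "%.*g", digits, value);
    return strtod(buf.ptr, null) == value;
}

unittest
{
    // 17 significant digits always suffice for an IEEE 754 double ...
    assert(roundTrips(0.1, 17));
    // ... but here a single digit already identifies the same double,
    // which is the kind of shortest output Ryu aims to produce.
    assert(roundTrips(0.1, 1));
    // One third, by contrast, does not survive with only 15 digits.
    assert(!roundTrips(1.0 / 3.0, 15));
}
```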

 C's printf has been hammered on by literally generations of 
 programmers over 3 decades. While the interface to it is 
 old-fashioned and unsafe, the guts of it are rock solid, fast, 
 and correct.

Not really. He argues that the C spec isn't clear, so he follows 
more stringent criteria than C printf.


Burger and Dybvig found errors in implementations from DEC, HP 
and SGI:
https://www.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96.pdf


Others claim to have found roundoff errors in common printf 
implementations, e.g.: «The implementation that ships with 
Microsoft Visual C++ 2010 Express sometimes has an off-by-one 
error when rounding to the closest value.»

http://www.ryanjuckett.com/programming/printing-floating-point-numbers/

  (I haven't checked the claim, but it would not surprise me).


It is clear that not using the C standard lib will bring more 
consistent and portable results across platforms, even if the C 
version is correct, as the C standard leaves wiggle room.  This 
can be important in scientific computing when comparing results 
from various platforms.

May 25 2019

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Friday, 24 May 2019 at 22:59:10 UTC, Walter Bright wrote:
 https://github.com/ulfjack/ryu says: "The Java implementation 
 differs from the output of Double.toString in some cases: 
 sometimes the output is shorter (which is arguably more 
 accurate) and sometimes the output may differ in the precise 
 digits output" which I find fairly concerning. Please review 
 the paper I linked to in my reply to Jacob.

FWIW the Ryu algorithm looks like a serious piece of work — see 
this paper, which references (and compares itself in detail to) the 
paper you linked to:
https://dl.acm.org/citation.cfm?id=3192369

It covers in some detail the rationale for the differences you 
note.

 Floating point formatting is not something that can be knocked 
 out in an hour. You can get a "mostly working" implementation 
 that way, but not a serious, robust, correct implementation 
 with the expected flexibility. (And the test cases to prove it 
 correct.)

One interesting remark in the paper on the Ryu algorithm: "We did 
not compare our implementation against the C standard library 
function printf, as its specification does not include the 
correctness criteria set forth by Steele and White [15], and, 
accordingly, neither the glibc nor the MacOS implementation does."

May 25 2019

Walter Bright <newshound2 digitalmars.com> writes:

On 5/25/2019 1:42 AM, Joseph Rushton Wakeling wrote:
 FWIW the Ryu algorithm looks like a serious piece of work — see this paper, 
 which references (and compares itself in detail to) the paper you linked to:
 https://dl.acm.org/citation.cfm?id=3192369
 
 It covers in some detail the rationale for the differences you note.

Thank you. I've saved a copy of the paper. If it is indeed

1. faster
2. more accurate
3. supports all the options (precision, etc.) with %e %f %g

then it is indeed a candidate for inclusion in Phobos' std.format. But not for 
this exercise.

 Floating point formatting is not something that can be knocked out in an hour. 
 You can get a "mostly working" implementation that way, but not a serious, 
 robust, correct implementation with the expected flexibility. (And the test 
 cases to prove it correct.)

 
 One interesting remark in the paper on the Ryu algorithm: "We did not compare 
 our implementation against the C standard library function printf, as its 
 specification does not include the correctness criteria set forth by Steele and 
 White [15], and, accordingly, neither the glibc nor the MacOS implementation does."

The C standard library does indeed not have correctness criteria for floating 
point formatting nor for the trig functions, and that does lead to some sloppy 
implementations, but the mainstream compilers do it well and are expected to.

I once attended a presentation on numerics by a math professor, who said the C 
trig functions on FreeBSD were extremely reliable. He was rather upset (and 
didn't believe me) when I said I'd tested them and found errors in the last digit.

It should be a point of pride for the D language to have the floating point 
"good to the last bit" and I'm always interested in contributions that get us there.

May 25 2019
