digitalmars.D.announce - std.format with wstring and dstring
- Robert Schadek (27/27) Sep 07 I spend one day at dconf this year and removed wstring and
- monkyyy (3/30) Sep 07 wouldnt be far faster to just rip out all the c api complexity
- Robert Schadek (3/5) Sep 07 No, not at all. The compile time goes into all the duck typing of
- monkyyy (3/10) Sep 08 Based on what? This api is crazy
- Richard (Rikki) Andrew Cattermole (18/18) Sep 07 std.format is on my list of modules that I want replaced.
- H. S. Teoh (41/64) Sep 08 IMO, a lot of template bloat could be removed by internally converting
- Richard (Rikki) Andrew Cattermole (20/88) Sep 08 Agreed it is too much boilerplate.
- H. S. Teoh (21/80) Sep 08 Oh you mean use a string building internally? That makes sense. I
- IchorDev (4/6) Sep 08 As I understand it, Phobos 3 will basically not support UTF-16 or
- Dom DiSc (5/13) Sep 08 This "simplicity" is gone in the moment you start working with
- H. S. Teoh (6/10) Sep 08 [...]
I spent one day at DConf this year and removed wstring and dstring support from std.format to improve compile speed. std.format.FormatSpec is no longer a template on the character type, and the bitfield template was removed as well.

The dub package can be found here:
https://code.dlang.org/packages/std2_format
https://github.com/burner/std2.format

When compiling the format call below with ldc and -ftime-trace

```
import std2.format;
//import std.format;

void main(){
    string s = format("Hello %s %s %.2f", "World", 1337, 13.37);
    assert(s == "Hello World 1337 13.37", s);
}
```

the overall compile time decreases from 290ms to 223ms, and the frontend time for the format call goes from 71ms to 23ms.

Currently, the alias this and toString tests fail, and I can't really figure out why. Some float tests fail as well. PRs are always welcome.

Meta: Removing wstring and dstring support from std.format for Phobos 3 should be looked at, IMHO.
 Sep 07
On Sunday, 7 September 2025 at 12:29:08 UTC, Robert Schadek wrote:
> I spend one day at dconf this year and removed wstring and dstring support from std.format to improve compile speed. std.format.FormatSpec is no longer a template on the character type, and the bitfield template was removed as well.
>
> The dub package can be found here https://code.dlang.org/packages/std2_format https://github.com/burner/std2.format
>
> When compiling the below format call with ldc and -ftime-trace
>
> ```
> import std2.format;
> //import std.format;
> void main(){
>     string s = format("Hello %s %s %.2f", "World", 1337, 13.37);
>     assert(s == "Hello World 1337 13.37", s);
> }
> ```
>
> The overall compile time decreases from 290ms to 223ms and the frontend time for the format call goes from 71ms to 23ms.
>
> Currently, alias this and toString tests fail. And I can't really figure out why. Also some float tests fails. PR's are always welcome.
>
> Meta: Removing wstring and dstring support from std.format for phobos 3 should be looked at IMHO.

Wouldn't it be far faster to just rip out all the C API complexity and just do a simple, sane API?
 Sep 07
On Sunday, 7 September 2025 at 13:15:21 UTC, monkyyy wrote:
> wouldnt be far faster to just rip out all the c api complexity and just do a simple sane api?

No, not at all. The compile time goes into all the duck typing of toString on classes and structs and nested formats like %(%s,%)
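
For readers who have not used the two features named in this reply, here is a minimal sketch against the current std.format API. The `Point` struct is made up for illustration; the delegate-based `toString` is one of several signatures std.format probes for, and that probing is presumably the duck typing being referred to.

```d
import std.format : format, formattedWrite;

// Hypothetical user type, not taken from the thread.
struct Point
{
    int x, y;

    // Sink-based toString: std.format detects this member by introspection
    // and calls it instead of its default struct formatting.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink.formattedWrite("(%d, %d)", x, y);
    }
}

void main()
{
    // Nested format: %( ... %) applies the inner spec to every element;
    // whatever follows the inner spec is used as the separator.
    assert(format("%(%s,%)", [1, 2, 3]) == "1,2,3");

    // The user-defined toString is picked up for %s.
    assert(format("%s", Point(1, 2)) == "(1, 2)");
}
```

Each distinct argument type that reaches `format` triggers its own round of this detection at compile time.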
 Sep 07
On Sunday, 7 September 2025 at 19:09:01 UTC, Robert Schadek wrote:
> On Sunday, 7 September 2025 at 13:15:21 UTC, monkyyy wrote:
>> wouldnt be far faster to just rip out all the c api complexity and just do a simple sane api?
>
> No, not at all. The compile time goes into all the duck typing of toString on classes and structs and nested formats like %(%s,%)

Based on what? This API is crazy. In what world are "nested" formats a simple or sane API?
 Sep 08
std.format is on my list of modules that I want replaced.
It's written with a lot of error conditions, rather than just adapting to 
the inputs. Hence lots of potential for exceptions being thrown that 
don't need to be.
Multiple template-instance multipliers in ``formattedWrite``: format string, output range.
What I'd like to do is to force IES for one or more values; if you want 
finer-grained control, you would do one value at a time (``formatValue``).
```d
writeln(i"$i ${i:X}: $(j + 1)/${(j + 1):X} $1 ${0:X}", k, obj, "atend");
```
Require the use of a string builder, rather than any old output range.
This is a required change for pretty printing. It requires the use of 
arbitrary inserts and removals that ranges can't do.
Every template parameter like these that you have is a multiplier of 
instances, and that isn't good for compile times. Simplifying them down 
may seem like a pain, but it helps quite significantly. Given that there 
are some clear requirements and use cases we can in fact simplify it 
without hurting anyone enough to care.
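
As a point of comparison, formatting one value at a time is already possible with the current Phobos API. The sketch below uses today's `formatValue` and `singleSpec`, not the replacement proposed in this post; note that each `formatValue` call is instantiated on the writer type, the value type and the spec's character type, which are the multipliers in question.

```d
import std.array : appender;
import std.format : formatValue, singleSpec;

void main()
{
    auto buf = appender!string();

    // singleSpec parses exactly one format specifier.
    auto hex = singleSpec("%X");
    auto dec = singleSpec("%s");

    // One call per value; the writer here is an Appender!string.
    formatValue(buf, 255, hex);
    buf.put(' ');
    formatValue(buf, 42, dec);

    assert(buf.data == "FF 42");
}
```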
 Sep 07
On Mon, Sep 08, 2025 at 01:09:02PM +1200, Richard (Rikki) Andrew Cattermole via Digitalmars-d-announce wrote:
> std.format is on my list of modules that I want replaced.
>
> Its written with a lot of error conditions, rather than just adapting to the inputs. Hence lots of potential for exceptions being thrown that don't need to be.
>
> Multiple multipliers in ``formattedWrite``: format string, output range.

IMO, a lot of template bloat could be removed by internally converting output ranges to a delegate of static type that receives a const(char)[] and writes to whatever output range was passed in from user code. 90% of the std.format code does not actually care for the concrete type of the output range; we do not need a copy of the entire formatting code for every output range type passed in. Just erase the type at the entry function and make most of the formatting code non-templated.

> What I'd like to do is to force IES for one or more values, if you want finer grained control you want do one value a time (``formatValue``).
>
> ```d
> writeln(i"$i ${i:X}: $(j + 1)/${(j + 1):X} $1 ${0:X}", k, obj, "atend");
> ```

I'm still a fan of old-school format strings, I've to admit. Having to manually type `formatValue(x), formatValue(y), ...` is just way too much boilerplate.

> Require the use of a string builder, rather than any old output range.

Too much boilerplate to use a string builder.

> This is a required change for pretty printing. It requires the use of arbitrary inserts and removals that ranges can't do.

Format strings should be just strings. It should not accept arbitrary ranges (does it do that currently?). Arguments may be ranges.

> Every template parameter like these that you have is a multiplier of instances, and that isn't good for compile times. Simplifying them down may seem like a pain, but it helps quite significantly. Given that there are some clear requirements and use cases we can in fact simplify it without hurting anyone enough to care.

See, the thing is that the current implementation of std.format goes about things the wrong way. It really should be just a thin wrapper template, the sole purpose of which is to unpack the incoming arguments and forward them to non-templated formatting functions. Or, at least, formatting functions templated only on a *single* argument type (like formatValue!int, formatValue!string, formatValue!float, ...), or perhaps just overload on various basic types, maybe plus a couple of templates for handling structs and classes, not on the entire `Args...` tuple of types. The latter causes combinatorial explosion of template instances, which is both bloating and needless.

Only the top-level std.format.format needs to be templated on `Args...`. This should be split up so that instead of O(n*m) template instantiations we have only O(n) template instantiations (or preferably, O(1) template instantiations if all the type-dependent stuff is handled at the top level, and all lower-level functions are isolated formatting functions that only do one job each).

Also, I dream of the day when we can pass compile-time format strings to std.format and it will *only* instantiate the formatting functions that you actually use. Float-formatting functions are particularly complex and bloaty; why should your program pay for that extra baggage if you never actually format a float? The various formatting functions should be pulled in only when you actually use them. Just because you call std.format with "%d" should not also pull in the whole shebang for formatting floats, structs, classes, BigInts, and who knows what else.

T

--
Economics: (n.) The science of explaining why yesterday's predictions didn't come true today.
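
The layering described in this post can be sketched roughly as follows. This is an illustration under assumptions, not how std.format is implemented: `Sink`, `putText`, `putBool`, `putInt` and `formatLite` are hypothetical names, there is no format string, and arguments are simply joined with spaces. The point is only that the `Args...` template is a thin dispatcher and everything below it is compiled once.

```d
import std.array : appender;
import std.traits : isIntegral;

// Type-erased sink: the formatting core only ever sees this delegate type.
alias Sink = void delegate(const(char)[]);

// Non-templated helpers, independent of the caller's output type.
void putText(Sink sink, const(char)[] s) { sink(s); }

void putBool(Sink sink, bool b) { sink(b ? "true" : "false"); }

void putInt(Sink sink, long v)
{
    char[20] buf;                 // fits "-9223372036854775808"
    size_t pos = buf.length;
    // Work on the magnitude as ulong so that long.min does not overflow.
    ulong mag = v < 0 ? 0UL - cast(ulong) v : cast(ulong) v;
    do
    {
        buf[--pos] = cast(char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);
    if (v < 0) buf[--pos] = '-';
    sink(buf[pos .. $]);          // the sink must copy; buf lives on the stack
}

// Thin front end: the only part templated on the whole argument list.
// Assumption of this sketch: ulong values above long.max are out of scope.
string formatLite(Args...)(Args args)
{
    auto app = appender!string();
    Sink sink = (const(char)[] s) { app.put(s); };

    foreach (i, arg; args)
    {
        alias T = typeof(arg);
        static if (i > 0) sink(" ");
        static if (is(T == bool)) putBool(sink, arg);
        else static if (isIntegral!T) putInt(sink, arg);
        else static if (is(T : const(char)[])) putText(sink, arg);
        else static assert(false, "unsupported argument type: " ~ T.stringof);
    }
    return app.data;
}

void main()
{
    assert(formatLite("Hello", 1337, true) == "Hello 1337 true");
    assert(formatLite(-42, long.min) == "-42 -9223372036854775808");
}
```

Adding a new argument combination instantiates only another copy of the small `formatLite` loop; the helpers are shared by every call site.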
 Sep 08
On 09/09/2025 3:39 AM, H. S. Teoh wrote:
> On Mon, Sep 08, 2025 at 01:09:02PM +1200, Richard (Rikki) Andrew Cattermole via Digitalmars-d-announce wrote:
>> std.format is on my list of modules that I want replaced. Its written with a lot of error conditions, rather than just adapting to the inputs. Hence lots of potential for exceptions being thrown that don't need to be. Multiple multipliers in ``formattedWrite``: format string, output range.
>
> IMO, a lot of template bloat could be removed by internally converting output ranges to a delegate of static type that receives a const(char)[] and writes to whatever output range was passed in from user code. 90% of the std.format code does not actually care for the concrete type of the output range; we do not need a copy of the entire formatting code for every output range type passed in. Just erase the type at the entry function and make most of the formatting code non-templated.
>
>> What I'd like to do is to force IES for one or more values, if you want finer grained control you want do one value a time (``formatValue``).
>>
>> ```d
>> writeln(i"$i ${i:X}: $(j + 1)/${(j + 1):X} $1 ${0:X}", k, obj, "atend");
>> ```
>
> I'm still a fan of old-school format strings, I've to admit. Having to manually type `formatValue(x), formatValue(y), ...` is just way too much boilerplate.

Agreed, it is too much boilerplate. If we have to add a runtime string option then we can; the machinery will all be there. However, it shouldn't be the option people reach for. How long have we been recommending the template parameter for formatting over the runtime one? A good 10+ years now, right?

>> Require the use of a string builder, rather than any old output range.
>
> Too much boilerplate to use a string builder.

Nah. We use appenders in place of the string builder today. It's a direct 1:1 swap.

>> This is a required change for pretty printing. It requires the use of arbitrary inserts and removals that ranges can't do.
>
> Format strings should be just strings. It should not accept arbitrary ranges (does it do that currently?). Arguments may be ranges.

The reply isn't matching what I said?

>> Every template parameter like these that you have is a multiplier of instances, and that isn't good for compile times. Simplifying them down may seem like a pain, but it helps quite significantly. Given that there are some clear requirements and use cases we can in fact simplify it without hurting anyone enough to care.
>
> See, the thing is that the current implementation of std.format goes about things the wrong way. It really should be just a thin wrapper template, the sole purpose of which is to unpack the incoming arguments and forward them to non-templated formatting functions. Or, at least, formatting functions templated only on a *single* argument type (like formatValue!int, formatValue!string, formatValue!float, ...), or perhaps just overload on various basic types, maybe plus a couple of templates for handling structs and classes, not on the entire `Args...` tuple of types. The latter causes combinatorial explosion of template instances, which is both bloating and needless. Only the top-level std.format.format needs to be templated on `Args...`. This should be split up so that instead of O(n*m) template instantiations we have only O(n) template instantiations (or preferably, O(1) template instantiations if all the type-dependent stuff is handled at the top level, and all lower-level functions are isolated formatting functions that only do one job each).

Looks like it's three template parameters: writer, format spec char and value type. Two of which I want gone.

Doesn't appear to have a central dispatcher; it's leaving it to overloading, which is not a good design IMO.

We'd like it to be closer to mine (although for whatever reason I've still got the builder templated): https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/text/format/rawwrite.d#L13

> Also, I dream of the day when we can pass compile-time format strings to std.format and it will *only* instantiate the formatting functions that you actually use. Float-formatting functions are particularly complex and bloaty; why should your program pay for that extra baggage if you never actually format a float? The various formatting functions should be pulled in only when you actually use them. Just because you call std.format with "%d" should not also pull in the whole shebang for formatting floats, structs, classes, BigInts, and who knows what else.

It already is. https://github.com/dlang/phobos/blob/master/std/format/internal/write.d#L575
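
For reference, the "template parameter" form mentioned in this reply looks like this in current Phobos. The sketch only shows the compile-time checking that the form provides; it makes no claim about which internal formatting functions end up instantiated.

```d
import std.format : format;

void main()
{
    // Format string passed as a template argument: it is validated
    // against the argument types while compiling.
    assert(format!"%s has %d items"("cart", 3) == "cart has 3 items");

    // An incompatible argument is rejected at compile time.
    static assert(!__traits(compiles, format!"%d"(4.03)));

    // The run-time form takes the same format string as an ordinary argument.
    assert(format("%s has %d items", "cart", 3) == "cart has 3 items");
}
```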
 Sep 08
On Tue, Sep 09, 2025 at 02:26:20PM +1200, Richard (Rikki) Andrew Cattermole via Digitalmars-d-announce wrote:
> On 09/09/2025 3:39 AM, H. S. Teoh wrote:

[...]

>> I'm still a fan of old-school format strings, I've to admit. Having to manually type `formatValue(x), formatValue(y), ...` is just way too much boilerplate.
>
> Agreed it is too much boilerplate. If we have to add a runtime string option then we can, the machinery will all be there. However it shouldn't be the option people should be reaching for. How long have we been recommending the template parameter for formatting over the runtime one? A good 10+ years now right.

>>> Require the use of a string builder, rather than any old output range.
>>
>> Too much boilerplate to use a string builder.
>
> Nah. We use appenders in place of the string builder today.

Oh, you mean use a string builder internally? That makes sense. I misunderstood; I thought you meant for user code to use a string builder.

[...]

>> See, the thing is that the current implementation of std.format goes about things the wrong way. It really should be just a thin wrapper template, the sole purpose of which is to unpack the incoming arguments and forward them to non-templated formatting functions. Or, at least, formatting functions templated only on a *single* argument type (like formatValue!int, formatValue!string, formatValue!float, ...), or perhaps just overload on various basic types, maybe plus a couple of templates for handling structs and classes, not on the entire `Args...` tuple of types. The latter causes combinatorial explosion of template instances, which is both bloating and needless. Only the top-level std.format.format needs to be templated on `Args...`. This should be split up so that instead of O(n*m) template instantiations we have only O(n) template instantiations (or preferably, O(1) template instantiations if all the type-dependent stuff is handled at the top level, and all lower-level functions are isolated formatting functions that only do one job each).
>
> Looks like its three template parameters: writer, format spec char and value type. Two of which I want gone.

Yeah, the writer should be type-erased to a delegate that receives string data, and the format spec should not be templated on char type. Value type should pretty much be the only template parameter.

> Doesn't appear to have a central dispatcher, its leaving it to overloading which is not a good design IMO.

Yeah it's a mess. :-/ I did look at this code some years ago, hoping to find low-hanging fruit to improve it, but gave up after struggling with the tangled mess that it was in.

> We'd like it to be closer to mine (although for whatever reason I've still got the builder templated): https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/text/format/rawwrite.d#L13

Standardizing to a delegate for the builder allows us to swap out different builders without incurring any template bloat. Not sure if that's necessary, but could be a nice escape hatch just in case an unusual case arises. Templates are powerful but sometimes type erasure is called for.

>> Also, I dream of the day when we can pass compile-time format strings to std.format and it will *only* instantiate the formatting functions that you actually use. Float-formatting functions are particularly complex and bloaty; why should your program pay for that extra baggage if you never actually format a float? The various formatting functions should be pulled in only when you actually use them. Just because you call std.format with "%d" should not also pull in the whole shebang for formatting floats, structs, classes, BigInts, and who knows what else.
>
> It already is. https://github.com/dlang/phobos/blob/master/std/format/internal/write.d#L575

+1.

T

--
Talk is cheap. Whining is actually free. -- Lars Wirzenius
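
A small sketch of the "swap out different builders" idea, assuming a delegate type like the one discussed here. `Sink` and `writeGreeting` are hypothetical names chosen for the example; the routine is compiled once and used with two unrelated builders.

```d
import std.array : appender;

// Assumed common sink type; not an existing Phobos alias.
alias Sink = void delegate(const(char)[]);

// Non-templated routine: knows nothing about where the text ends up.
void writeGreeting(Sink sink, const(char)[] name)
{
    sink("Hello, ");
    sink(name);
    sink("!");
}

void main()
{
    // Builder 1: collect the text into a string.
    auto app = appender!string();
    writeGreeting((const(char)[] s) { app.put(s); }, "World");
    assert(app.data == "Hello, World!");

    // Builder 2: only count the code units that would be written.
    size_t total;
    writeGreeting((const(char)[] s) { total += s.length; }, "World");
    assert(total == 13);
}
```

The cost is an indirect call per chunk of output, which is the usual trade-off of type erasure against template instantiation.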
 Sep 08
On Sunday, 7 September 2025 at 12:29:08 UTC, Robert Schadek wrote:
> Meta: Removing wstring and dstring support from std.format for phobos 3 should be looked at IMHO.

As I understand it, Phobos 3 will basically not support UTF-16 or UTF-32 anymore except for encoding conversion. As someone who appreciates UTF-32's elegant simplicity, I find this saddening.
 Sep 08
On Monday, 8 September 2025 at 08:49:53 UTC, IchorDev wrote:
> On Sunday, 7 September 2025 at 12:29:08 UTC, Robert Schadek wrote:
>> Meta: Removing wstring and dstring support from std.format for phobos 3 should be looked at IMHO.
>
> As I understand it, Phobos 3 will basically not support UTF-16 or UTF-32 anymore except for encoding conversion. As someone who appreciates UTF-32's elegant simplicity, I find this saddening.

This "simplicity" is gone the moment you start working with graphemes. A "unit" on the screen may consist of multiple characters no matter which encoding you use, so why not stay with a single one (UTF-8)?
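
A short illustration of the grapheme point, using existing Phobos functions (`std.uni.byGrapheme`, `std.range.walkLength`). The example string is an "e" followed by a combining acute accent, which renders as a single character on screen.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    string  utf8  = "e\u0301";   // 'e' + U+0301 COMBINING ACUTE ACCENT
    dstring utf32 = "e\u0301"d;

    assert(utf8.length == 3);    // UTF-8 code units: 1 + 2
    assert(utf32.length == 2);   // UTF-32 code units, i.e. code points

    // Neither length matches what the user sees: one grapheme.
    assert(utf8.byGrapheme.walkLength == 1);
    assert(utf32.byGrapheme.walkLength == 1);
}
```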
 Sep 08
On Sun, Sep 07, 2025 at 12:29:08PM +0000, Robert Schadek via Digitalmars-d-announce wrote:
> I spend one day at dconf this year and removed wstring and dstring support from std.format to improve compile speed. std.format.FormatSpec is no longer a template on the character type, and the bitfield template was removed as well.

[...]

+1, time to get rid of excess baggage.

T

--
PENTIUM = Produces Erroneous Numbers Thru Incorrect Understanding of Mathematics
 Sep 08
 monkyyy <crazymonkyyy gmail.com>