digitalmars.D.learn - What are best practices around toString?

reply christian.koestlin <christian.koestlin gmail.com> writes:
Dear Dlang experts,

up until now I was perfectly happy with implementing `(override) 
string toString() const` or something to get nicely formatted 
(mostly debug) output for my structs, classes and exceptions.

But recently I stumbled upon 
https://wiki.dlang.org/Defining_custom_print_format_specifiers 
and additionally 
https://github.com/dlang/dmd/blob/4ff1eec2ce7d990dcd58e5b641ef3d0a1676b9bb/druntim
/src/object.d#L2637, which at first sight is great because it provides the same
customization of an object's representation with fewer memory allocations.

When grepping through phobos, there are a bunch of "different" 
signatures implemented for this, e.g.

```d
...
phobos/std/typecons.d:        void toString(DG)(scope DG sink) const
...
phobos/std/typecons.d:        void toString(DG, Char)(scope DG sink,  scope const ref FormatSpec!Char fmt) const
...
phobos/std/typecons.d:        void toString()(scope void delegate(const(char)[]) sink, scope const ref FormatSpec!char fmt)
...
phobos/std/sumtype.d:        void toString(this This, Sink, Char)(ref Sink sink, const ref FormatSpec!Char fmt);
...
```
to just show a few.

Furthermore, when one works with instances of structs, classes or 
exceptions, an `aInstance.toString()` call does not "work" when one only 
implements the sink interface (which is to be expected), whereas 
`std.conv.to!string` or a formatted write with `%s` always 
works (no matter what was used to implement the `toString`).
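
To illustrate the behaviour described above, here is a minimal sketch. The `Point` struct is hypothetical and implements only the delegate-sink overload; the sketch assumes a reasonably recent Phobos.

```d
import std.conv : to;
import std.format : format, formattedWrite;

// Hypothetical example type: only the sink overload is defined,
// there is no `string toString()`.
struct Point
{
    int x, y;

    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink("Point(");
        formattedWrite(sink, "%d, %d", x, y);
        sink(")");
    }
}

void main()
{
    auto p = Point(1, 2);

    // Both of these go through std.format, which finds the sink overload.
    assert(p.to!string == "Point(1, 2)");
    assert(format("%s", p) == "Point(1, 2)");

    // A direct call without a sink argument does not compile.
    static assert(!__traits(compiles, p.toString()));
}
```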

So I wonder, what is best practice in the community and would it 
make sense to add something to dscanner, that "warns" on usages 
of `aInstance.toString()`?


Kind regards,
Christian
Sep 30 2022
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 30 September 2022 at 13:11:56 UTC, christian.koestlin 
wrote:
 Dear Dlang experts,

 up until now I was perfectly happy with implementing 
 `(override) string toString() const` or something to get nicely 
 formatted (mostly debug) output for my structs, classes and 
 exceptions.
Human beings read extremely slowly compared to how quickly the GC can allocate and free `string`s as needed, so there is no need to complicate your code with more text formatting strategies unless you want to generate this debug output far faster than a human can actually read it.
 But recently I stumbled upon 
 https://wiki.dlang.org/Defining_custom_print_format_specifiers 
 and additionally 
 https://github.com/dlang/dmd/blob/4ff1eec2ce7d990dcd58e5b641ef3d0a1676b9bb/druntim
/src/object.d#L2637, which at first sight is great because it provides the same
customization of an object's representation with fewer memory allocations.

 When grepping through phobos, there are a bunch of "different" 
 signatures implemented for this, e.g.

 ```d
 ...
 phobos/std/typecons.d:        void toString(DG)(scope DG sink) const
 ...
 phobos/std/typecons.d:        void toString(DG, Char)(scope DG sink,  scope const ref FormatSpec!Char fmt) const
 ...
 phobos/std/typecons.d:        void toString()(scope void delegate(const(char)[]) sink, scope const ref FormatSpec!char fmt)
 ...
 phobos/std/sumtype.d:        void toString(this This, Sink, Char)(ref Sink sink, const ref FormatSpec!Char fmt);
 ...
 ```
 to just show a few.
The `FormatSpec` parameter only belongs there if you're actually going to do something useful with it in your `toString` implementation. Even if you are going to use it, you should probably still provide a convenience overload with a default specifier.
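
As a rough sketch of that advice: the hypothetical `Rgb` struct below is not from Phobos or from this thread. Its full overload is only there because it branches on `fmt.spec`, and the convenience overload forwards a default-initialized `FormatSpec!char`, whose `spec` is assumed to default to `'s'`.

```d
import std.format : FormatSpec, format, formattedWrite;

// Hypothetical example type.
struct Rgb
{
    ubyte r, g, b;

    // Full overload: does something useful with the format specifier.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char fmt) const
    {
        if (fmt.spec == 'x')
            formattedWrite(sink, "#%02x%02x%02x", r, g, b);
        else
            formattedWrite(sink, "rgb(%d, %d, %d)", r, g, b);
    }

    // Convenience overload: forwards with a default specifier
    // (assumption: a default-initialized FormatSpec has spec == 's').
    void toString(scope void delegate(const(char)[]) sink) const
    {
        FormatSpec!char fmt;
        toString(sink, fmt);
    }
}

void main()
{
    auto c = Rgb(255, 128, 0);
    assert(format("%s", c) == "rgb(255, 128, 0)");
    assert(format("%x", c) == "#ff8000");

    // The convenience overload can also be called directly with any sink.
    string s;
    c.toString((const(char)[] chunk) { s ~= chunk; });
    assert(s == "rgb(255, 128, 0)");
}
```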
 Furthermore, when one works with instances of structs, classes 
 or exceptions, an `aInstance.toString()` call does not "work" when 
 one only implements the sink interface (which is to be expected), 
 whereas `std.conv.to!string` or a formatted write with `%s` 
 always works (no matter what was used to implement the 
 `toString`).
I generally do something like this:

```D
struct A {
    string message;
    int enthusiasm;

    void toString(DG)(scope DG sink) scope const @safe
        if(is(DG : void delegate(scope const(char[])) @safe)
        || is(DG : void function(scope const(char[])) @safe))
    {
        import std.format : formattedWrite;

        sink(message);
        sink(" x ");
        formattedWrite!"%d"(sink, enthusiasm);
        sink("!");
    }

    string toString() scope const pure @safe {
        StringBuilder builder;
        toString(&(builder.opCall)); // Find the exact string length.
        builder.allocate();
        toString(&(builder.opCall)); // Actually write the chars.
        return builder.finish();
    }
}
```

So, the first `toString` overload defines how to format the value to text, while the second overload does memory management and forwards the formatting work to the first.

`StringBuilder` is a utility shared across the entire project:

```D
struct StringBuilder {
private:
    char[] buffer;
    size_t next;

public:
    void opCall(scope const(char[]) str) scope pure @safe nothrow @nogc {
        const curr = next;
        next += str.length;
        if(buffer !is null)
            buffer[curr .. next] = str[];
    }

    void allocate() scope pure @safe nothrow {
        buffer = new char[next];
        next = 0;
    }

    void allocate(const(size_t) maxLength) scope pure @safe nothrow {
        buffer = new char[maxLength];
        next = 0;
    }

    string finish() pure @trusted nothrow @nogc {
        assert(buffer !is null);
        string ret = cast(immutable) buffer[0 .. next];
        buffer = null;
        next = 0;
        return ret;
    }
}
```

The first formatting pass to find the required buffer length can be skipped if you can somehow pre-calculate the maximum possible length, or if you prefer the common strategy of repeatedly re-allocating the buffer with exponentially increasing size used by the likes of `std.array.Appender`. Since the API for `toString` remains the same regardless, you are free to choose the best strategy for each type.
Oct 01 2022
parent reply Salih Dincer <salihdb hotmail.com> writes:
On Saturday, 1 October 2022 at 08:26:43 UTC, tsbockman wrote:
 So, the first `toString` overload defines how to format the 
 value to text, while the second overload does memory management 
 and forwards the formatting work to the first.

 `StringBuilder` is a utility shared across the entire project:
Is `Appender` not good enough, at least in terms of allocating memory and accumulating a string?

Thanks...

SDB 79
Oct 01 2022
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 1 October 2022 at 10:02:34 UTC, Salih Dincer wrote:
 On Saturday, 1 October 2022 at 08:26:43 UTC, tsbockman wrote:
 `StringBuilder` is a utility shared across the entire project:
 Is `Appender` not good enough, at least in terms of allocating memory and accumulating a string?
`Appender` is a legitimate option, but unless it is provided with a good estimate of the final length at the beginning, it will allocate several times for a longer string, and the final buffer will be, on average, 50% larger than needed. Neither of these things is a major problem, but `StringBuilder` is only a few lines of code to perfectly minimize allocation, so why not?
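
For comparison, a sketch of the `Appender` route with an up-front length estimate might look like the following. The `Shout` struct is a hypothetical stand-in with the same fields as `A`, and the reserved size is an assumed upper bound, not a measured one.

```d
import std.array : appender;
import std.format : formattedWrite;

// Hypothetical example type.
struct Shout
{
    string message;
    int enthusiasm;

    // Formatting logic lives in the sink overload.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink(message);
        sink(" x ");
        formattedWrite(sink, "%d", enthusiasm);
        sink("!");
    }

    // Single formatting pass: reserve an estimated maximum length first.
    string toString() const
    {
        auto app = appender!string();
        // Assumed bound: message, " x " (3), an int including
        // its sign (at most 11 chars), and "!" (1).
        app.reserve(message.length + 3 + 11 + 1);
        toString((const(char)[] chunk) { app.put(chunk); });
        return app.data;
    }
}

void main()
{
    assert(Shout("hello", 3).toString() == "hello x 3!");
}
```

This trades the second formatting pass for a buffer that may be slightly larger than the final string.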
Oct 01 2022
next sibling parent Salih Dincer <salihdb hotmail.com> writes:
On Saturday, 1 October 2022 at 17:50:54 UTC, tsbockman wrote:
 but unless it is provided with a good estimate of the final
 length at the beginning, it will allocate several times for
 a longer string, and the final buffer will be, on average, 50% 
 larger than needed.
I see, it's smart!

SDB 79
Oct 02 2022
prev sibling parent christian.koestlin <christian.koestlin gmail.com> writes:
On Saturday, 1 October 2022 at 17:50:54 UTC, tsbockman wrote:
 On Saturday, 1 October 2022 at 10:02:34 UTC, Salih Dincer wrote:
 On Saturday, 1 October 2022 at 08:26:43 UTC, tsbockman wrote:
 `StringBuilder` is a utility shared across the entire project:
 Is `Appender` not good enough, at least in terms of allocating memory and accumulating a string?
 `Appender` is a legitimate option, but unless it is provided with a good estimate of the final length at the beginning, it will allocate several times for a longer string, and the final buffer will be, on average, 50% larger than needed. Neither of these things is a major problem, but `StringBuilder` is only a few lines of code to perfectly minimize allocation, so why not?
Thanks a lot. One needs to go through the serialization twice, but perhaps that's better than reallocating memory.

Kind regards,
Christian
Oct 06 2022