
digitalmars.dip.ideas - opNew operator overload

reply Dennis <dkorpel gmail.com> writes:


When a `struct Allocator` or `class Allocator` defines `T 
opNew(T, A...)(A a) => ...`, then `allocator.new Item("x")` will 
be rewritten to `allocator.opNew!Item("x")`.



Some D programs use a different allocator than the built-in GC 
allocator you get when using the `new` operator. To make 
allocation type safe, allocator libraries often offer a templated 
function that creates an instance of an arbitrary type using that 
allocator. However, these template functions have to awkwardly 
use a synonym for `new`, since it's a reserved keyword. A common 
choice is 'make' or 'alloc' (short for 'allocate').

Some real examples:

* `make!T` or `makeArray!T` in 
[std.experimental.allocator](https://github.com/dlang-community/stdx-allocator/blob/4903a249e83b9797cc1d5167a13701a6dfd111a7/source/stdx/allocator/typed.d#L283)
* `alloc!T` in 
[memutils](https://code.dlang.org/packages/memutils)
* `constructNew!T` in 
[memterface](https://git.sleeping.town/ichordev/memterface/src/commit/6e4e4984927d9e2039dd887c3edf637ba3a4a305/source/memterface/ctor.d#L33)

Since D programmers are familiar with `new T()` syntax, enabling 
it for custom allocators makes code both more pleasing to read 
and easier to refactor when switching from using the GC to a 
custom allocator. It also makes it easier to switch from one 
custom allocator to another when there's a unified syntax, since 
you don't need to learn and apply new function signatures every 
time.



- D runtime hooks have been translated to templates. `new X()` is 
already being rewritten to instances of `_d_newitemT`, 
`_d_newarrayT`, or `_d_newclassT`, so overriding the implicitly 
imported object.d module lets you override the allocator with a 
custom one. This is hacky though, and only provides the means to 
switch a single global allocator since there's still no way to 
pass an allocator parameter.
- [Class allocators have been 
deprecated](https://dlang.org/deprecate.html#Class%20allocators%20and%20deallocators) 
because "Classes should not be responsible for their own (de)allocation 
strategy". This proposal is different: it doesn't let the class decide which 
allocator to use, but requires the allocator object to be named explicitly on 
each allocation with `new`.



The rewrite works just like existing operator overloading.
`opNew` has to be defined inside an aggregate.
The `typeof(new T())` will be passed as the first template 
parameter, while the arguments passed to `new` are passed as 
run-time arguments.

```D
struct GcAllocator
{
     // Simple wrapper, can be swapped out later
     T opNew(T, A...)(A args) => new T(args);
}

void main()
{
     GcAllocator allocator;
     allocator.new int(3);       // allocator.opNew!(int*)(3)
     allocator.new int[3];       // allocator.opNew!(int[])(3)
     allocator.new int[](3);     // allocator.opNew!(int[])(3)
     allocator.new int[3][4];    // allocator.opNew!(int[][])(3, 4)
     allocator.new Struct;       // allocator.opNew!(Struct*)()
     allocator.new Struct(args); // allocator.opNew!(Struct*)(args)
     allocator.new Class(args);  // allocator.opNew!Class(args)
}
```
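
Until such a rewrite exists, the type mapping above can be tried out in current D by calling `opNew` explicitly. The following is only a minimal sketch under assumptions: `MallocAllocator`, `Point`, and `Node` are hypothetical names, the allocator never frees anything, and it handles just single values, one-dimensional arrays, and classes.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : malloc;

struct Point { int x, y; }

class Node
{
    int value;
    this(int value) { this.value = value; }
}

// Hypothetical allocator; `opNew` is an ordinary member template here.
struct MallocAllocator
{
    // T is the type of the whole `new` expression: int*, int[], Node, ...
    T opNew(T, A...)(A args)
    {
        static if (is(T == class))
        {
            // Class: T is already the reference type.
            enum size = __traits(classInstanceSize, T);
            void* raw = malloc(size);
            assert(raw !is null, "out of memory");
            return emplace!T(raw[0 .. size], args);
        }
        else static if (is(T == Elem[], Elem))
        {
            // Dynamic array: the single argument is the length.
            static assert(A.length == 1, "this sketch handles one dimension only");
            const size_t n = args[0];
            auto raw = cast(Elem*) malloc(Elem.sizeof * n);
            assert(raw !is null || n == 0, "out of memory");
            Elem[] result = raw[0 .. n];
            foreach (ref e; result)
                emplace(&e); // default-initialize each element
            return result;
        }
        else static if (is(T == Target*, Target))
        {
            // Single value: T is a pointer to the constructed type.
            auto raw = cast(Target*) malloc(Target.sizeof);
            assert(raw !is null, "out of memory");
            return emplace(raw, args);
        }
        else
            static assert(0, "unsupported type for this sketch: " ~ T.stringof);
    }
}

void main()
{
    MallocAllocator allocator;
    int* i = allocator.opNew!(int*)(3);        // proposed: allocator.new int(3)
    int[] a = allocator.opNew!(int[])(3);      // proposed: allocator.new int[3]
    Point* p = allocator.opNew!(Point*)(1, 2); // proposed: allocator.new Point(1, 2)
    Node n = allocator.opNew!Node(7);          // proposed: allocator.new Node(7)

    assert(*i == 3);
    assert(a == [0, 0, 0]);
    assert(p.x == 1 && p.y == 2);
    assert(n.value == 7);
}
```

Note that the explicit calls have to spell out `int*` and `Point*`, whereas the proposed syntax would derive those from `typeof(new T())`.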



The `.new` syntax is already used for [explicit instantiation of 
a nested 
class](https://dlang.org/spec/class.html#nested-explicit). 
However, I consider this an obscure legacy feature. For backwards 
compatibility, this will have precedence over the operator 
overload.
The only limitation this brings is that allocators may not 
contain a nested class, which shouldn't be a problem in practice.
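
For reference, this is roughly what the existing nested-class form looks like in current D. `Outer` and `Inner` are illustrative names only:

```d
class Outer
{
    int base = 10;

    // A non-static nested class keeps a hidden reference to an Outer instance.
    class Inner
    {
        int offset;
        this(int offset) { this.offset = offset; }

        // `base` is read from the enclosing Outer object.
        int sum() { return base + offset; }
    }
}

void main()
{
    auto outer = new Outer;
    // Existing syntax: create an Inner whose context is `outer`.
    auto inner = outer.new Inner(5);
    assert(inner.sum() == 15);
}
```

Because `outer.new Inner(5)` already has this meaning, the proposed rewrite would only apply when the type after `new` is not a nested class of the object on the left.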



One could imagine a zero-argument variant like this:

```D
struct Unique(T)
{
     T t;
     Unique opNew() => new Unique(T.init);
}

void main()
{
     auto a = Unique!int.new;
}
```

However, there might be an expectation that it's the same as:

```D
auto a = new Unique!int;
```

So I'm not sure whether it's a good idea. It could be added in 
the future.



The with statement does not allow you to implicitly call `opNew`, 
as that could lead to surprising results and breakage.

```D
with (a)
     auto p = new int; // don't try a.opNew!int
```



Since the GC takes care of freeing memory, there is usually no 
need to call `object.destroy()` on an object created with `new`. 
When the custom allocator does region-based memory management, 
there's no need to free individual objects either. When the 
allocator does require manual freeing of objects, one could 
implement `destroy` as a member function of the allocator.

```D
struct Allocator
{
     T opNew(T, A...)(A args) => new T(args);
     void destroy(T)(T o) {}
}

void main()
{
     Allocator allocator;
     auto o = allocator.new Object;
     allocator.destroy(o);
}
```

Since `destroy` is a library function in object.d, which can 
already be shadowed by a member function, there's no need for any 
special consideration here.
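
As a rough sketch of an allocator that does need manual freeing, the member `destroy` could run the destructor and then release the memory. Everything here is an assumption for illustration: `MallocAllocator` and `Handle` are hypothetical, `opNew` is called explicitly because the proposed syntax does not exist yet, and only single values are supported.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

struct Handle
{
    static bool finalized; // set by the destructor, for demonstration only
    int id;
    ~this() { finalized = true; }
}

// Hypothetical malloc-backed allocator with a matching `destroy` member.
struct MallocAllocator
{
    T opNew(T, A...)(A args)
    {
        static if (is(T == Target*, Target))
        {
            auto raw = cast(Target*) malloc(Target.sizeof);
            assert(raw !is null, "out of memory");
            return emplace(raw, args);
        }
        else
            static assert(0, "this sketch only allocates single values");
    }

    void destroy(T)(T* p)
    {
        if (p is null)
            return;
        .destroy(*p); // leading dot: the module-level destroy from object.d
        free(p);
    }
}

void main()
{
    MallocAllocator allocator;
    Handle* h = allocator.opNew!(Handle*)(7); // proposed: allocator.new Handle(7)
    assert(h.id == 7);
    allocator.destroy(h); // resolves to the member, not object.destroy
    assert(Handle.finalized);
}
```

Inside the member function the plain name `destroy` refers to the member itself, which is why the sketch uses the module scope operator to reach the library function.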
Apr 05
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Adapting this to have a right version might be quite a good idea too:

```d
struct Foo {
	static Bar opNewRight(Allocator, Args)(ref Allocator, Args) {
		
	}
}

struct Bar {
	Foo* _;
}
```

I'd suggest making this be more preferred than ``opNew``. So that a type 
can control and see its memory allocator (for i.e. storage and then 
deallocation).
Apr 05
parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 5 April 2025 at 18:23:34 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Adapting this to have a right version might be quite a good 
 idea too
I don't see what problem that's trying to solve.
Apr 05
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 06/04/2025 6:29 AM, Dennis wrote:
 On Saturday, 5 April 2025 at 18:23:34 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 Adapting this to have a right version might be quite a good idea too
 I don't see what problem that's trying to solve.
If you have a reference counted type you may want to control the memory allocator used (i.e. data structure). It has to be able to deallocate and that means it needs the allocator stored in its state.
Apr 05
parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 5 April 2025 at 18:31:17 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 It has to be able to deallocate and that means it needs the 
 allocator stored in its state.
If you need to store an allocator as a field, then that allocator must be passed as a regular constructor parameter. No need to be clever.
Apr 05
parent bauss <jacobbauss gmail.com> writes:
On Saturday, 5 April 2025 at 18:41:41 UTC, Dennis wrote:
 On Saturday, 5 April 2025 at 18:31:17 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 It has to be able to deallocate and that means it needs the 
 allocator stored in its state.
 If you need to store an allocator as a field, then that allocator must be passed as a regular constructor parameter. No need to be clever.
I agree with this.
Apr 05
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Saturday, 5 April 2025 at 13:49:57 UTC, Dennis wrote:


 When a `struct Allocator` or `class Allocator` defines `T 
 opNew(T, A...)(A a) => ...`, then `allocator.new Item("x")` 
 will be rewritten to `allocator.opNew!Item("x")`.



 [...]

 Since D programmers are familiar with `new T()` syntax, 
 enabling it for custom allocators makes code both more pleasing 
 to read and easier to refactor when switching from using the GC 
 to a custom allocator. It also makes it easier to switch from 
 one custom allocator to another when there's a unified syntax, 
 since you don't need to learn and apply new function signatures 
 every time.
* D programmers are familiar with using `new` to allocate with the GC. It's not obvious that giving a new, different meaning to the same syntax will make code easier to read and understand.
* Some D programmers are also familiar with the existing [`obj.new T` syntax for instantiating nested classes.][1] In general, differentiating between this existing syntax and your proposed syntax will require non-local information (i.e., looking up the type of `obj`).
* It's not obvious to me that refactoring from `new T(args)` to `alloc.new T(args)` is any easier than refactoring from `new T(args)` to `alloc.make!T(args)`.
* It's also not obvious to me that refactoring from `alloc1.new T` to `alloc2.new T` is any easier than refactoring from `alloc1.make!T` to `alloc2.make!T`.

Overall, I'm not convinced that this proposal offers any compelling benefits compared to the current `make!T` approach. The fact that it overlaps with an existing syntax (nested class `new`) is also a strike against it.

[1]: https://dlang.org/spec/class.html#nested-explicit
Apr 05
parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 5 April 2025 at 18:43:57 UTC, Paul Backus wrote:
 * D programmers are familiar with using `new` to allocate with 
 the GC. It's not obvious that giving a new, different meaning 
 to the same synatx will make code easier to read and understand.
Well you're obviously not supposed to make it do anything different than allocate just like new. I don't see how this is anything different from overloading ~= for a custom array type. As long as it means appending, there should be no surprises.
 * Some D programmers are also familiar with the existing 
 [`obj.new T` syntax for instantiating nested classes.][1]
Yeah I wish that feature didn't exist.
 * It's not obvious to me that refactoring from `new T(args)` to 
 `alloc.new T(args)` is any easier than refactoring from `new 
 T(args)` to `alloc.make!T(args)`.

 * It's also not obvious to me that refactoring from `alloc1.new 
 T` to `alloc2.new T` is any easier than refactoring from 
 `alloc1.make!T` to `alloc2.make!T`.
It's about being standard/uniform and recognizable. Not everyone uses `make`. Having to parenthesize array types like `make!(int[])` is annoying, so you also have `makeArray!int`, others use `alloc`, or `alloc.array!int` etc.

When scanning D code, I can easily locate creation of objects with `new` syntax. Calls to `make!T` or `constructNew!T` don't stand out from other functions like `read!T` or `parse!T`.

In Java you have ArrayList where you have to use `list.get(i)` and `list.size()` instead of `list[i]` and `list.length`, which is perfectly functional and it's not a complex refactor, but when reading code it's harder to recognize where array operations happen.

In D, I love how you can use operator overloading to create library arrays that are almost drop-in replacements for language arrays. It helps when incrementally refactoring GC-based array code to use custom allocators for performance / portability (to WebAssembly in my case). I thought it would be nice to expand this library array type love to the `new` operator when allocating fixed size arrays, for example `auto pixels = new ubyte[4 * width * height];`.

That being said, I'm not 100% sold on the idea myself; I can see downsides as well. As an alternative, I guess we could officially declare `make!T` to be the library equivalent of `new` and hope it catches on.
Apr 05
parent reply Paul Backus <snarwin gmail.com> writes:
On Saturday, 5 April 2025 at 19:30:49 UTC, Dennis wrote:
 It's about being standard/uniform and recognizable. Not 
 everyone uses `make`.
Not everyone will use `opNew` either. If you are capable of achieving uniformity with `opNew`, you are also capable of achieving uniformity without it.
 Having to parenthesize array types like `make!(int[])` is 
 annoying, so you also have `makeArray!int`, others use `alloc`, 
 or `alloc.array!int` etc.
I don't think those are any uglier than `new int[](n)`, and they're certainly less confusing than `new int[n]`. In any case, "writing parentheses is annoying sometimes" is kind of a weak justification for introducing new syntax.
 When scanning D code, I can easily locate creation of objects 
 with `new` syntax. Calls to make!T or constructNew!T don't 
 stand out from other functions like read!T or parse!T.
I cannot think of a single situation where I've been reading D code and felt the need to "easily locate creation of objects." I *have* been in situations where I wanted to know where a *specific* object was being created, and grepped for `new T` to find it--but I could have grepped for `make!T` just as easily.
 In D, I love how you can use operator overloading to create 
 library arrays that are almost drop-in replacements for 
 language arrays. It helps when incrementally refactoring 
 GC-based array code to use custom allocators for performance / 
 portability (to WebAssembly in my case).

 I thought it would be nice to expand this library array type 
 love to the `new` operator when allocating fixed size arrays, 
 for example `auto pixels = new ubyte[4 * width * height];`.
So, let me get this straight: you're proposing that `myAllocator.new ubyte[n]` would (potentially) return some custom library type, instead of a `ubyte[]`? That's...well, there are a lot of words I could use to describe it, but let's just say that "uniform" and "consistent" would not be among them.
Apr 05
next sibling parent Dennis <dkorpel gmail.com> writes:
On Saturday, 5 April 2025 at 21:39:01 UTC, Paul Backus wrote:
 Not everyone will use `opNew` either. If you are capable of 
 achieving uniformity with `opNew`, you are also capable of 
 achieving uniformity without it.
Like I said, we can try to convince everyone to use 'make' as the one and only synonym for 'new', but when you offer an operator overload with nicer syntax, people naturally gravitate towards that. I've never seen someone create an array that you have to index with `array.index(i)` in D.
 In any case, "writing parentheses is annoying sometimes" is 
 kind of a weak justification for introducing new syntax.
Huh? Making code nicer to read/write is the whole point of syntax. It's why Lisp has its reputation, and why D allows `make!T` instead of `make!(T)`, and why some propose to make parentheses around if conditions optional.
 I cannot think of a single situation where I've been reading D 
 code and felt the need to "easily locate creation of objects."
That's understandable, many programmers don't (need to) care where memory allocations happen.
 I *have* been in situations where I wanted to know where a 
 *specific* object was being created, and grepped for `new T` to 
 find it--but I could have grepped for `make!T` just as easily.
But would someone who doesn't know about all the allocator libraries in use by a project know to include make!T in their search, as well as alloc!T etc.?
 So, let me get this straight: you're proposing that 
 `myAllocator.new ubyte[n]` would (potentially) return some 
 custom library type, instead of a `ubyte[]`?
No, not at all. `opNew` should always return the same type as the equivalent `new` expression; I should probably add to the DIP that the compiler verifies that.

To illustrate what I'm talking about, here's the actual git history of a line of code in an OpenGL application of mine:

```
lightBias = newArrayMalloced!float(w * h);
lightBias = newArrayM!float(w * h);
lightBias = allocator.array!float(w * h);
lightBias = new float[](w * h);
```

You can tell I've been struggling with my memory allocator APIs. However, I've never struggled with the API for my custom allocator-backed array types, because `opIndex` and `opOpAssign!"~"` are the obvious choices, which makes refactoring from `int[]` to `Array!int` easy. Had `opNew` existed, I would have used that from the start and been content with it.

Yes, this might just be a very 'me' problem. And again, it's no big deal to rename `newArrayMalloced` to `newArrayM` in 35 places. I just thought this DIP could be a little nicety, filling a small hole in D's operator overloading story. I posted this DIP here to poll if anyone else is interested in this.
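
To make the array comparison concrete, here is a minimal sketch of a library array that overloads indexing and appending (the `~=` overload is spelled `opOpAssign!"~"` in current D). The `Array` type below is a hypothetical illustration for plain-data element types, not the one from the application mentioned above.

```d
import core.stdc.stdlib : free, realloc;

// Hypothetical malloc-backed array; only suitable for plain-data elements.
struct Array(T)
{
    private T* ptr;
    private size_t len, cap;

    @disable this(this); // no copies, so the buffer has exactly one owner

    ~this() { free(ptr); }

    size_t length() const { return len; }
    alias opDollar = length;

    // Overloads `array[i]`.
    ref inout(T) opIndex(size_t i) inout
    {
        assert(i < len, "index out of bounds");
        return ptr[i];
    }

    // Overloads `array ~= value`.
    void opOpAssign(string op : "~")(T value)
    {
        if (len == cap)
        {
            cap = cap ? cap * 2 : 4;
            ptr = cast(T*) realloc(ptr, cap * T.sizeof);
            assert(ptr !is null, "out of memory");
        }
        ptr[len++] = value;
    }
}

void main()
{
    Array!int arr; // previously: int[] arr;
    arr ~= 10;
    arr ~= 20;
    assert(arr.length == 2);
    assert(arr[0] == 10 && arr[$ - 1] == 20);
}
```

The code in `main` reads the same as it would with a built-in `int[]`; only the declaration changes.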
Apr 05
prev sibling parent reply Mike Shah <mshah.475 gmail.com> writes:
On Saturday, 5 April 2025 at 21:39:01 UTC, Paul Backus wrote:
 [...]
Just tossing in my two cents: when I first came to D (coming from a C++ background), one of the earlier things I tried to do was overload 'new', and I was sad it was missing. Having this overload can be really useful for instrumentation to track allocations and just have them all stem from 'new'.

One use case where I did this was for various factory functions I had in a game. I was able to overload 'new' temporarily and count how many objects were allocated, and (with an overloaded 'delete') see how long those objects lived. This helped me later make an object pool of a fixed size (based on the maximum allocations observed) to recycle objects from in gaming applications.

The other use case I have used 'new' and 'delete' overloads for in C++ is a simple memory leak tracker, counting and building a map of allocations.

Indeed, I am currently writing my own 'alloc' function, but the consistency would be very useful to me as someone who wants to write more linters/tooling (and even just for having extra debug output in my containers library). Having folks use opNew versus alloc/make/create/spawn/heapAlloc/etc. would be quite nice!

I suspect looking at Odin's 'contexts' for their allocators may provide more motivation: https://odin-lang.org/docs/overview/#allocators. I like the motivation here to be able to swap allocators in and out and have standard operator overloads for new and/or destroy. 'with' maybe seems like the tool here, but I need to think about it some more.
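
The instrumentation use case can be sketched in current D as a wrapper that counts allocations per type and forwards to another allocator. All names here (`Counting`, `GcAllocator`, `Bullet`, `Enemy`) are hypothetical, `opNew` is called explicitly since the proposed syntax does not exist, and the backing allocator only handles classes to keep the example short.

```d
// Hypothetical backing allocator that simply forwards to the GC.
struct GcAllocator
{
    T opNew(T, A...)(A args)
    if (is(T == class))
    {
        return new T(args);
    }
}

// Hypothetical wrapper: counts allocations per type, then delegates.
struct Counting(Backing)
{
    Backing backing;
    size_t[string] counts; // number of allocations, keyed by type name

    T opNew(T, A...)(A args)
    {
        counts.require(T.stringof) += 1; // inserts 0 on first use
        return backing.opNew!T(args);
    }
}

class Bullet
{
    int damage;
    this(int damage) { this.damage = damage; }
}

class Enemy {}

void main()
{
    Counting!GcAllocator allocator;
    foreach (i; 0 .. 3)
        allocator.opNew!Bullet(i); // proposed: allocator.new Bullet(i)
    auto enemy = allocator.opNew!Enemy();

    assert(allocator.counts["Bullet"] == 3);
    assert(allocator.counts["Enemy"] == 1);
}
```

The observed maximum per type is the kind of number that could later size a fixed object pool.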
Apr 05
parent reply Johan <j j.nl> writes:
On Saturday, 5 April 2025 at 23:25:17 UTC, Mike Shah wrote:
 Just tossing in my two cents, that when I first came to D 
 (having been from a C++ background) one of the earlier things I 
 tried to do was overload 'new' and was sad it was missing.
I think you misunderstand the OP. This is not at all similar to C++ `new` overloading. What C++ has is deprecated in D: https://dlang.org/deprecate.html#Class%20allocators%20and%20deallocators.

The OP is merely proposing syntactic sugar that changes `allocator.new T` into `allocator.opNew!T`. The functionality is already possible today with clear, unambiguous syntax. Therefore, in my opinion, the proposal is *not* an improvement. It makes the language worse by adding the new learning burden of an ambiguous* syntax.

-Johan

*ambiguous because it shares its syntax with explicit instantiation of a nested class, as mentioned by OP.
Apr 06
parent Mike Shah <mshah.475 gmail.com> writes:
On Sunday, 6 April 2025 at 13:22:06 UTC, Johan wrote:
 [...]
I see -- indeed overloading 'new' seems to be a deprecation -- thanks for sharing that (That is actually the kind of thing I would want to have, but free functions are also fine)! I understand the proposal a bit more now and need to think on it from the standpoint of consistency.
Apr 06
prev sibling parent monkyyy <crazymonkyyy gmail.com> writes:
On Saturday, 5 April 2025 at 13:49:57 UTC, Dennis wrote:
 The with statement does not allow you to implicitly call opNew, 
 as that could lead to surprising results and breakage.
Without `with(allocator):` you're not actually making the syntactical leap that allows code to be part of someone else's grand plan of memory management, making the entire enterprise worthless. I expect that to be the outcome, so therefore all libs of allocators are worthless, but in theory you should be aiming for the prize of replacing the GC with someone's grand plan, I'd argue via compiler flag, but if you're not even handling the simpler case....
Apr 05