
digitalmars.D - C is Brittle D is Plastic

reply Walter Bright <newshound2 digitalmars.com> writes:
It's true that writing code in C doesn't automatically make it faster.

For example, string manipulation. 0-terminated strings (the default in C) are, 
frankly, an abomination. String processing code is a tangle of strlen, strcpy, 
strncpy, strcat, all of which require repeated passes over the string looking 
for the 0. (Even worse, reloading the string into the cache just to find its 
length makes things even slower.)

Worse is the problem that, in order to slice a string, a malloc is needed to 
copy the slice to. And then carefully manage the lifetime of that slice.

The fix is simple - use length-delimited strings. D relies on them to great 
effect. This can be done in C, but there is no succor from the language, and 
such a package is not standardized. I've proposed a simple enhancement for C to 
make them work https://www.digitalmars.com/articles/C-biggest-mistake.html but 
nobody in the C world has any interest in it (which is baffling, as it is so 
simple!).
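
As a rough sketch of what length-delimited strings buy you in D (the helper name `tail` is made up for illustration, and the literal lives in static memory so there is no lifetime to manage here): the length is read directly, and a slice is just a new pointer/length pair over the same bytes.

```d
import std.stdio;

// Hypothetical helper: everything from index `from` to the end.
// Returns a view into the caller's string; nothing is copied.
string tail(string s, size_t from)
{
    return s[from .. $];
}

void main()
{
    string s = "hello, world";

    // The length is stored alongside the pointer, so this is O(1):
    // no scan for a terminating 0.
    assert(s.length == 12);

    // Slicing builds a new (pointer, length) pair over the same memory.
    string word = tail(s, 7);
    assert(word == "world");
    assert(word.ptr == s.ptr + 7); // shares storage with s, no malloc

    writeln(word, " has length ", word.length);
}
```

The equivalent C would typically need a strlen, a malloc, a copy, and a matching free somewhere later.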

Another source of slowdown in C that became apparent over the years is C is a 
brittle language, rather than a plastic one. The first algorithm selected for a 
C project gets so welded into it that it cannot be changed without great 
difficulty. (And we all know that algorithms are the key to speed, not coding 
details.) Why does this happen with C?

It's because one cannot switch back and forth between a reference type and a 
value type without extensively rewriting every use of it. For example:

```C
struct S { int a; };
int foo(struct S s) { return s.a; }
int bar(struct S *s) { return s->a; }
```
To switch between reference and value, it's necessary to go through all the code
swapping . and ->. It's just too tedious and never happens. In D:
```D
struct S { int a; }
int foo(S s) { return s.a; }
int bar(S *s) { return s.a; }
```
Working on D shows that there is no reason for the C and C++ -> operator to even
exist, the . operator covers both bases!
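
One way to see the plasticity, as a minimal sketch (the template name `getA` is invented for this example): a single function body compiles unchanged whether it is handed the struct by value or by pointer, because `.` dereferences the pointer automatically.

```d
struct S { int a; }

// One body serves both S and S*: `s.a` auto-dereferences
// when s is a pointer.
int getA(T)(T s) { return s.a; }

void main()
{
    S value = S(42);
    S* pointer = &value;

    assert(getA(value) == 42);   // T is S
    assert(getA(pointer) == 42); // T is S*
}
```

Switching a data structure between value and reference then only touches the declaration and the places that create it, not every member access.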
Mar 21
next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 Another source of slowdown in C that became apparent over the 
 years is C is a brittle language, rather than a plastic one.
Another example where D shines in this regard is UFCS allowing you to turn fields into methods and vice versa:
```D
struct S
{
    ubyte* ptr;
    ubyte* end;
    size_t length() => end - ptr;
}
```
```D
struct S
{
    ubyte* ptr;
    size_t length;
    ubyte* end() => ptr + length;
}
```
You don't need to change `.end` into `.end()`, `.end` works on both fields and methods.
Mar 22
next sibling parent reply Sergey <kornburn yandex.ru> writes:
On Sunday, 22 March 2026 at 11:56:00 UTC, Dennis wrote:
 You don't need to change `.end` into `.end()`, `.end` works on 
 both fields and methods.
Some languages consider this a downside, for the sake of clarity: a human or LLM reading the code has to work out whether something is a method or a field, a structure or a pointer to a structure.
Mar 22
next sibling parent reply Kapendev <alexandroskapretsos gmail.com> writes:
On Sunday, 22 March 2026 at 12:23:28 UTC, Sergey wrote:
 On Sunday, 22 March 2026 at 11:56:00 UTC, Dennis wrote:
 You don't need to change `.end` into `.end()`, `.end` works on 
 both fields and methods.
Some languages consider this as a downside. Because of clarity for human or llm who is reading the code - which is trying to understand what is it method or field, structure or pointer to the structure..
The same can be said for replacing the `->` operator with a `.`. For example: `int a = variable.value;` Is `variable` a pointer or a number in this example? You don't know. Do you care? No, and if you do then you should know what your code is doing in the first place lol
Mar 22
parent reply Sergey <kornburn yandex.ru> writes:
On Sunday, 22 March 2026 at 15:05:41 UTC, Kapendev wrote:
 The same can be said for replacing the `->` operator with a `.`.
 For example: `int a = variable.value;`
 Is `variable` a pointer or a number in this example?
 You don't know. Do you care? No, and if you do then you should 
 know what your code is doing in the first place lol
This is the wrong example, Kap.

Imagine you have a function with many rows, and in it you are calling another function like this:
```d
// ... many rows above
another_func(foo.bar, foo.baz, x);
// ... many rows below
```

Here, for the x parameter you need to pass a reference to foo. But you don't know what type foo is now: is it a pointer or not? You will have to go somewhere above, find foo, then go back to another_func and change x.
Mar 22
next sibling parent Sergey <kornburn yandex.ru> writes:
On Sunday, 22 March 2026 at 16:32:25 UTC, Sergey wrote:
 here for x parameter you need to give a reference to foo. But 
 you don't know which type foo now - is it a pointer or not..
 You will have to go somewhere above search the foo somewhere - 
 then go back to another_func and change the x..
Of course, tooling partially solves this: with a properly configured editor you can see it more easily.

But there are other, more complicated examples (with templates involved) where tooling won't help you.
Mar 22
prev sibling parent Kapendev <alexandroskapretsos gmail.com> writes:
On Sunday, 22 March 2026 at 16:32:25 UTC, Sergey wrote:
 This is wrong example Kap.

 Imagine you have a function.. with many rows.
 And then you a calling another function there like:
 ```d
 // ... many rows above
 another_func(foo.bar, foo.baz, x);
 // ... many rows below
 ```

 here for x parameter you need to give a reference to foo. But 
 you don't know which type foo now - is it a pointer or not..
 You will have to go somewhere above search the foo somewhere - 
 then go back to another_func and change the x..
You will get a compiler error telling you the type. It's the same thing.
Mar 22
prev sibling parent reply Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Sunday, 22 March 2026 at 12:23:28 UTC, Sergey wrote:
 On Sunday, 22 March 2026 at 11:56:00 UTC, Dennis wrote:
 You don't need to change `.end` into `.end()`, `.end` works on 
 both fields and methods.
Some languages consider this as a downside. Because of clarity for human or llm who is reading the code - which is trying to understand what is it method or field, structure or pointer to the structure..
and it’s considered a win for each. Neither human nor AI should care if it’s a property or a field. The whole point of properties is being largely equivalent to fields when used. The rather annoying thing about D is that properties aren’t really first-class, but piggy-backed on functions.
Mar 25
next sibling parent reply Sergey <kornburn yandex.ru> writes:
On Wednesday, 25 March 2026 at 13:20:16 UTC, Quirin Schroll wrote:

 and it’s considered a win for each. Neither human nor AI should 
 care if it’s a property or a field. The whole point of 
 properties is being largely equivalent to fields when used. The 
 rather annoying thing about D is that properties aren’t really 
 first-class, but piggy-backed on functions.
For example, languages like Rust, Go, and C++ deliberately avoid property syntax because:
- It hides computation behind field access
- Makes performance less obvious
- Reduces clarity (is it a field or a function?)
Mar 25
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/25/2026 7:06 AM, Sergey wrote:
 For example languages like Rust, Go, and C++ deliberately avoid property
syntax 
 because:
 - It hides computation behind field access
 - Makes performance less obvious
 - Reduces clarity (is it a field or a function?)
I don't think C++ can claim any advantage here. Simply declaring a variable in C++ can invisibly invoke all kinds of code (same in D!). General operator overloading also suffers from this.
Mar 29
prev sibling parent reply Sergey <kornburn yandex.ru> writes:
On Wednesday, 25 March 2026 at 13:20:16 UTC, Quirin Schroll wrote:

 and it’s considered a win for each. Neither human nor AI should 
 care if it’s a property or a field. The whole point of 
 properties is being largely equivalent to fields when used. The 
 rather annoying thing about D is that properties aren’t really 
 first-class, but piggy-backed on functions.
Also Zig mentioned it explicitly: https://ziglang.org/learn/why_zig_rust_d_cpp/ So nothing odd
Mar 25
parent reply libxmoc <libxmoc gmail.com> writes:
On Wednesday, 25 March 2026 at 14:08:41 UTC, Sergey wrote:
 On Wednesday, 25 March 2026 at 13:20:16 UTC, Quirin Schroll 
 wrote:

 one and it’s considered a win for each. Neither human nor AI 
 should care if it’s a property or a field. The whole point of 
 properties is being largely equivalent to fields when used. 
 The rather annoying thing about D is that properties aren’t 
 really first-class, but piggy-backed on functions.
Also Zig mentioned it explicitly: https://ziglang.org/learn/why_zig_rust_d_cpp/ So nothing odd
general purpose programming language that targets machines, while JIT. With that being said, properties are not liked by everyone, Jeffrey Richter[1], a renowned .NET expert and author of CLR via [1] https://www.goodreads.com/book/show/7121415-clr-via-c
Mar 29
parent Kapendev <alexandroskapretsos gmail.com> writes:
On Sunday, 29 March 2026 at 10:33:14 UTC, libxmoc wrote:

 general purpose programming language that targets machines, 

 with a JIT.

 With that being said, properties are not liked by everyone, 
 Jeffrey Richter[1], a renowned .NET expert and author of CLR 


 [1] https://www.goodreads.com/book/show/7121415-clr-via-c
I get why some people avoid them, but any feature can be abused. If you are hiding 2000 loc inside a property, then it's a you problem. This makes me think of `scope(exit)`/`defer` for some reason. Simple feature, but some people like to spam it so much that the control flow gets hard to follow.
Mar 29
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Nice example!
Mar 22
prev sibling parent Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Sunday, 22 March 2026 at 11:56:00 UTC, Dennis wrote:
 On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 Another source of slowdown in C that became apparent over the 
 years is C is a brittle language, rather than a plastic one.
 Another example where D shines in this regard is UFCS allowing you to turn fields into methods and vice versa:
 ```D
 struct S
 {
     ubyte* ptr;
     ubyte* end;
     size_t length() => end - ptr;
 }
 ```
 ```D
 struct S
 {
     ubyte* ptr;
     size_t length;
     ubyte* end() => ptr + length;
 }
 ```
 You don't need to change `.end` into `.end()`, `.end` works on both fields and methods.
That isn’t UFCS, that’s just empty parentheses being optional. Those are completely orthogonal language concepts.
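
To make the distinction concrete, here is a small sketch (the free function `isEmpty` is invented for illustration). The member function is reachable as `s.end` purely because empty parentheses may be omitted; the free function is reachable as `s.isEmpty` only because UFCS rewrites the call to `isEmpty(s)`.

```d
struct S
{
    ubyte* ptr;
    size_t length;

    // Member function: `s.end` works because empty parentheses are optional.
    ubyte* end() { return ptr + length; }
}

// Free function at module scope: `s.isEmpty()` is rewritten
// to `isEmpty(s)` by UFCS.
bool isEmpty(S s) { return s.length == 0; }

void main()
{
    ubyte[4] buf;
    auto s = S(buf.ptr, buf.length);

    assert(s.end == s.end());        // optional parentheses, no UFCS involved
    assert(s.end - s.ptr == 4);
    assert(s.isEmpty == isEmpty(s)); // UFCS plus optional parentheses
    assert(!s.isEmpty);
}
```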
Mar 25
prev sibling next sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 It's because one cannot switch back and forth between a 
 reference type and a value type without extensively rewriting 
 every use of it. For example:

 ```C
 struct S { int a; }
 int foo(struct S s) { return s.a; }
 int bar(struct S *s) { return s->a; }
 ```
 To switch between reference and value, it's necessary to go 
 through all the code swapping . and ->.
Not really. If one has a little forethought and commits to accessing the members via ->, the two become interchangeable, with no performance impact.
```c
int inner1 (struct foo fa)
{
    struct foo *fp = &fa;
    fp->a += 1;
    return fp->a + 1;
}
```
One can also adapt in the other direction if one started with pointer arguments (add const as desired):
```c
int inner3 (struct foo *fp)
{
    struct foo fa = *fp;
    fa.a += 1;
    return fa.a + 1;
}
```
DF
Mar 22
parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Sunday, 22 March 2026 at 17:08:05 UTC, Derek Fawcus wrote:
 On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 To switch between reference and value, it's necessary to go 
 through all the code swapping . and ->.
Not really, if one has a little forethought, and commits to accessing the members via ->, the two become interchangeable, with no performance impact.
One can even play macro games (using _Generic) to make a wrapper for the function, such that one does not even need to go through and update the call sites to add or remove the & .

However going back to the original point, as of late last year / early this year it is now obsolete. That specific example of switching the code to use . or -> is something which the LLM assisted coding agents are particularly adept at.

Such refactoring tasks are what I've been using it for, where it is quite easy to specify some form of grep pattern and then rules to update the code. So these days that'll take less than an hour to achieve.

DF
Mar 22
next sibling parent Lance Bachmeier <no spam.net> writes:
On Sunday, 22 March 2026 at 17:19:26 UTC, Derek Fawcus wrote:

 However going back to the original point, as of late last year 
 / early this year it is now obsolete.  That specific example of 
 switching the code to use . or -> is something which the LLM 
 assisted coding agents are particularly adept at.

 Such refactoring tasks are what I've been using it for, where 
 it is quite easy to specify some form of grep pattern and then 
 rules to update the code.  So these days that'll take less than 
 an hour to achieve.
If that's the standard (unlimited time and resources) then it was never really a thing. You could have always hired someone to make the changes for you. There's really no reason to use high-level programming languages at all if you're willing to make those assumptions.
Mar 22
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/22/2026 10:19 AM, Derek Fawcus wrote:
 One can even play macro games (using _Generic) to make a wrapper for the 
 function, such that one does not even need to go through and update the call 
 sites to add or remove the & .
If _Generic is needed, you have outgrown C. (I am regularly baffled by the weird useless things added to C, but the *needed* things are totally ignored.)
 However going back to the original point, as of late last year / early this year
 it is now obsolete.  That specific example of switching the code to use . or ->
 is something which the LLM assisted coding agents are particularly adept at.
D doesn't need a coding agent to refactor the code. It paid off handsomely with the Warp program I did a few years ago. I found it very easy to try different algorithms and data structures.
 Such refactoring tasks are what I've been using it for, where it is quite easy 
 to specify some form of grep pattern and then rules to update the code.  So 
 these days that'll take less than an hour to achieve.
I've done these sorts of refactorings. It takes 0 minutes, because the compiler figures it out and does it. It also doesn't wind up with a blizzard of diffs in the git history. It doesn't need code reviewing, either.

As I've become more experienced, my tolerance for all the weird kludges necessary for professional C is getting lower and lower. LLMs don't really solve the problem, as the wrangled C code is just as ugly looking.

I've used LLMs to a modest extent, and am impressed by them. So, I might be wrong about the above. But it's fun to talk about it!
Mar 22
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/03/2026 2:51 PM, Walter Bright wrote:
 As I've become more experienced, my tolerance for all the weird kludges 
 necessary for professional C getting lower and lower. LLMs don't really 
 solve the problem, as the wrangled C code is just as ugly looking.
It's not like we can talk, we've got some real uglies hiding out in our implementation.

Like monitor proxy support. Uck, I'd love to gut that out of druntime and swap to an operator overload. Alas, that's not approved, but hopefully that will change.
Mar 22
prev sibling next sibling parent Lance Bachmeier <no spam.net> writes:
On Monday, 23 March 2026 at 01:51:40 UTC, Walter Bright wrote:

 As I've become more experienced, my tolerance for all the weird 
 kludges necessary for professional C getting lower and lower. 
 LLMs don't really solve the problem, as the wrangled C code is 
 just as ugly looking.
I recently watched a YouTube video about how LLMs produce almost correct code. It's just good enough that errors slip past the code reviewers, and then down the road you're dealing with those errors as they get caught in production. It hit home, because it matches my experience. The trustworthy solution to the problem under discussion here is to have the LLM write a parser and transpile the old code to its new form. Using D and not having to decide on the trustworthy path is obviously a win.
Mar 22
prev sibling parent Derek Fawcus <dfawcus+dlang employees.org> writes:
On Monday, 23 March 2026 at 01:51:40 UTC, Walter Bright wrote:
 On 3/22/2026 10:19 AM, Derek Fawcus wrote:
 LLMs don't really solve the problem, as the wrangled C code is 
 just as ugly looking.
In the example I cited, that was not an issue. I was able to tightly specify the replacement, such that it effectively acted as an editor "search and replace" facility on steroids. So the result was exactly what I would have written.

I'm still in the process of evaluating LLM based agents in my workflow, but the above is one where I have concluded they are useful. The one detrimental issue being the volume of change which may result from a global application to fix a poor pattern.

However, for your quoted example, that would not arise, as there would only be the one (or small set of) function(s) implementing the algorithm to be changed, together with use of _Generic hidden by a macro to catch all of the call sites. Only if that algorithmic change then proved to be valuable would it be worth the cost of updating the call instances, and eliminating the _Generic using macro. Then the review cost for the LLM generated diffs would be minimal.

As to using an LLM agent to produce 'original' code for a project, I'm still skeptical.
Mar 23
prev sibling next sibling parent reply user1234 <user1234 12.de> writes:
On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 [...]
 Working on D shows that there is no reason for the C and C++ -> 
 operator to even exist, the . operator covers both bases!
Yes from the user POV. From the compiler-details POV I think that the semantic checks for that are awfully complex. Add to this your system of UFCS. Boom, 1000 lines of code to disentangle what the programmer really meant. However, it's certain that once implemented properly it's 100% profit for hundreds of programmers.
Mar 23
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/23/2026 8:12 AM, user1234 wrote:
 On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 [...]
 Working on D shows that there is no reason for the C and C++ -> operator to 
 even exist, the . operator covers both bases!
Yes from the user POV. From the compiler-details POV I think that the semantic checks for that are awfully complex. Add to this your system of UFCS. Boom, 1000 lines of code to disentangle what the programmer really meant. However, it's certain that once implemented properly it's 100% profit for hundreds of programmers.
I admit the implementation code is not a dreamboat.

But in defense, in the 1980s I attempted to develop the be-all and end-all windowing user interface library. I soon discovered that what is intuitive and easy for the user is a giant mess of special cases in the implementation. And when the implementation was simple and consistent, the users thought it completely unintuitive.

Longtime D users might remember when I was the lone holdout on how the name lookup worked in D. I thought it was perfectly straightforward, and it was easy to implement in clean code. Nobody agreed with me. So now we have a complicated 2-phase lookup that everybody likes. Go figure!

P.S. C and C++ require semantic analysis in order to parse the source code correctly (the C cast expressions, and the >> tokens in C++). These are unnecessary and simply bad language design.
Mar 23
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/03/2026 12:48 PM, Walter Bright wrote:
 But in defense, in the 1980s I attempted to develop the be-all and end- 
 all windowing user interface library.
Did you ever study IBM CUA?
Mar 23
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/23/2026 4:49 PM, Richard (Rikki) Andrew Cattermole wrote:
 Did you ever study IBM CUA?
I had the book but never read it. It came out years after my attempt.
Mar 23
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/03/2026 3:55 PM, Walter Bright wrote:
 On 3/23/2026 4:49 PM, Richard (Rikki) Andrew Cattermole wrote:
 Did you ever study IBM CUA?
I had the book but never read it. It came out years after my attempt.
I've got it as well, the last of the series I think it is.

https://www.amazon.com/Object-Oriented-Interface-Design-Common-Guidelines/dp/1565291700

You should open it up, it talks about how it combines earlier CUA specifications ;) It originally was about TUIs, not even graphical.
Mar 23
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/23/2026 8:01 PM, Richard (Rikki) Andrew Cattermole wrote:
 On 24/03/2026 3:55 PM, Walter Bright wrote:
 On 3/23/2026 4:49 PM, Richard (Rikki) Andrew Cattermole wrote:
 Did you ever study IBM CUA?
I had the book but never read it. It came out years after my attempt.
I've got it as well, the last of the series I think it is. https://www.amazon.com/Object-Oriented-Interface-Design-Common-Guidelines/dp/1565291700 You should open it up, it talks about how it combines earlier CUA specifications ;) It originally was about TUI's, not even graphical.
I sold the book (!) because I wanted to focus on other things and the book fetched a good price.
Mar 24
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 24/03/2026 8:05 PM, Walter Bright wrote:
 On 3/23/2026 8:01 PM, Richard (Rikki) Andrew Cattermole wrote:
 On 24/03/2026 3:55 PM, Walter Bright wrote:
 On 3/23/2026 4:49 PM, Richard (Rikki) Andrew Cattermole wrote:
 Did you ever study IBM CUA?
I had the book but never read it. It came out years after my attempt.
I've got it as well, the last of the series I think it is. https://www.amazon.com/Object-Oriented-Interface-Design-Common-Guidelines/dp/1565291700 You should open it up, it talks about how it combines earlier CUA specifications ;) It originally was about TUI's, not even graphical.
I sold the book (!) because I wanted to focus on other things and the book fetched a good price.
Oh lol, it's cheap now and also on archive.org.
Mar 24
prev sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On Monday, 23 March 2026 at 23:48:27 UTC, Walter Bright wrote:

 So now we have a complicated 2-phase lookup that everybody 
 likes.
Again, not everybody likes it.
Mar 24
parent reply user1234 <user1234 12.de> writes:
On Tuesday, 24 March 2026 at 10:36:19 UTC, Max Samukha wrote:
 On Monday, 23 March 2026 at 23:48:27 UTC, Walter Bright wrote:

 So now we have a complicated 2-phase lookup that everybody 
 likes.
Again, not everybody likes it.
I don't know what this "2-phase lookup system" is referring to (my interest in D started around 2014, let's say, and I have the feeling that this is an older topic). What was the problem in the first place?
Mar 24
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Mar 24, 2026 at 12:38:28PM +0000, user1234 via Digitalmars-d wrote:
 On Tuesday, 24 March 2026 at 10:36:19 UTC, Max Samukha wrote:
 On Monday, 23 March 2026 at 23:48:27 UTC, Walter Bright wrote:
 
 So now we have a complicated 2-phase lookup that everybody likes.
Again, not everybody likes it.
I don't know what this "2-phases lookup system" is refering to (my interest in D started by 2014 let's say and I have the feeling that this is an older topic). What was the problem in first place ?
In the beginning, D's symbol lookup mechanism was very simple and straightforward: whenever an import statement is encountered, symbols from an imported module are loaded and injected into the symbol table at the current scope. This is straightforward to implement (you literally translate "import xyz" as parsing xyz and adding its public symbols to the current scope's symbol table).

However, that led to counterintuitive behaviours like:

    void main() {
        myFunc("abc");
    }
    void myFunc(string text) {
        writeln(text);  // prints "abc"
        import std.conv;
        writeln(text);  // prints ""
    }

Walter was very resistant to changing this behaviour, as all proposed solutions involved "inelegant" complications in the semantics and implementation of "import". Eventually, however, the crowd prevailed and Walter relented.

T

-- 
Why does a pound of hamburger have less energy than a pound of steak? Because it is in the ground state.
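
For comparison, a minimal sketch of how this case is expected to behave with a reasonably recent compiler, assuming the two-phase lookup mentioned upthread (enclosing scopes are searched first, imports second). The function name `pick` is invented for the example.

```d
import std.stdio;

string pick(string text)
{
    import std.conv; // also declares a function named `text`
    return text;     // resolves to the parameter, not std.conv.text
}

void main()
{
    assert(pick("abc") == "abc");
    writeln(pick("abc"));
}
```

The parameter wins because it is found in the first phase, before any imported symbol is considered.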
Mar 24
next sibling parent reply user1234 <user1234 12.de> writes:
On Tuesday, 24 March 2026 at 14:36:57 UTC, H. S. Teoh wrote:
 On Tue, Mar 24, 2026 at 12:38:28PM +0000, user1234 via 
 Digitalmars-d wrote:
 On Tuesday, 24 March 2026 at 10:36:19 UTC, Max Samukha wrote:
 On Monday, 23 March 2026 at 23:48:27 UTC, Walter Bright 
 wrote:
 
 So now we have a complicated 2-phase lookup that everybody 
 likes.
Again, not everybody likes it.
I don't know what this "2-phases lookup system" is refering to (my interest in D started by 2014 let's say and I have the feeling that this is an older topic). What was the problem in first place ?
 In the beginning, D's symbol lookup mechanism was very simple and straightforward: whenever an import statement is encountered, symbols from an imported module are loaded and injected into the symbol table at the current scope. This is straightforward to implement (you literally translate "import xyz" as parsing xyz and adding its public symbols to the current scope's symbol table).

 However, that led to counterintuitive behaviours like:

     void main() {
         myFunc("abc");
     }
     void myFunc(string text) {
         writeln(text);  // prints "abc"
         import std.conv;
         writeln(text);  // prints ""
     }

 Walter was very resistant to changing this behaviour, as all proposed solutions involved "inelegant" complications in the semantics and implementation of "import". Eventually, however, the crowd prevailed and Walter relented.

 T
Thanks.

I don't want to be rude and reopen old wounds, but this looks more like a problem of `text` being either a variable or a function; since D allows function calls without parentheses, it looks to me like it has never been a lookup problem. Now imagine std.conv.text is a global variable. That does not change anything. Order of declaration matters in the local scope.

Anyway, despite this curious change, the front-end is still fast.
Mar 24
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Mar 24, 2026 at 03:25:29PM +0000, user1234 via Digitalmars-d wrote:
 On Tuesday, 24 March 2026 at 14:36:57 UTC, H. S. Teoh wrote:
[...]
 In the beginning, D's symbol lookup mechanism was very simple and
 straightforward: whenever an import statement is encountered,
 symbols from an imported module are loaded and injected into the
 symbol table at the current scope.  This is straightforward to
 implement (you literally translate "import xyz" as parsing xyz and
 adding its public symbols to the current scope's symbol table).
 
 However, that led to counterintuitive behaviours like:
 
 	void main() {
 		myFunc("abc");
 	}
 	void myFunc(string text) {
 		writeln(text);	// prints "abc"
 		import std.conv;
 		writeln(text);	// prints ""
 	}
[...]
 I would not like to be rude and reopen old wounds but this looks more
 a problem of `text` being either a variable or a function and as D
 allows function calls without params to me it looks like it has never
 been a problem of lookup. Now imagine std.conv.text is a global
 variable. That does not change anything. Order of declaration matters
 in the local scope.
It's not about `text` being a function called without parens; it's about an imported module hijacking symbols in the local scope. The same problem occurs in this situation:

```
// xyz.d
module xyz;
static immutable text = "haha you got hijacked";

// main.d
module main;
void main() {
    myFunc("abc");
}
void myFunc(string text) {
    writeln(text);  // prints "abc"
    import xyz;     // N.B.: we never imported `text` explicitly
    writeln(text);  // prints "haha you got hijacked"
}
```

This is a problem because perhaps module xyz didn't declare `text` when this code was written. Then later on, upstream rewrote the implementation and added a declaration of `text`. Suddenly, unrelated code in myFunc() has broken because its semantics silently changed from under its carpet: the newly-added declaration of `text` in xyz has hijacked the function parameter `text` in an unrelated module that just happened to import it.

This is only one instance of the problem. Another instance concerns *private* symbols in the imported module causing a conflict with a local symbol in the importing module. (Remember, this is a logical consequence of the simple and elegant concept that `import` simply injects symbols into the local scope -- the injected symbols include private symbols. Treating private symbols specially was one of the "inelegant" things that complicated the implementation of imports, among other things.)

T

-- 
My stomach is flat. The L is just silent!
Mar 24
parent reply user1234 <user1234 12.de> writes:
On Tuesday, 24 March 2026 at 15:41:59 UTC, H. S. Teoh wrote:
 On Tue, Mar 24, 2026 at 03:25:29PM +0000, user1234 via 
 Digitalmars-d wrote:
 [...]
[...]
 [...]
[...]
 [...]
It's not about `text` being a function called without parens; it's about an imported module hijacking symbols in the local scope. The same problem occurs in this situation: [...]
Sorry, I see it now. It's the old "furnisher" problem. There was something about that a few days ago on Y news https://www.boxyuwu.blog/posts/an-incoherent-rust/
Mar 24
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/24/2026 12:23 PM, user1234 wrote:
 https://www.boxyuwu.blog/posts/an-incoherent-rust/
If I understand this correctly, D has solved this problem:
```d
import a; // exports 'x'
import b; // exports 'x'
int z = x; // error, multiple definition of 'x'
```
The resolution:
```d
import a; // exports 'x'
import b; // exports 'x'
alias x = b.x;
int z = x; // b.x is selected
```
Mar 24
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 3/25/26 03:32, Walter Bright wrote:
 On 3/24/2026 12:23 PM, user1234 wrote:
 https://www.boxyuwu.blog/posts/an-incoherent-rust/
 If I understand this correctly, D has solved this problem:
 ```d
 import a;   // exports 'x'
 import b;   // exports 'x'
 int z = x; // error, multiple definition of 'x'
 ```
 The resolution:
 ```d
 import a;   // exports 'x'
 import b;   // exports 'x'
 alias x = b.x;
 int z = x; // b.x is selected
 ```
Then so has Rust:
```rust
mod a { pub const x : i32 = 2; }
mod b { pub const x : i32 = 3; }
use a::*;
use b::*;
const z : i32 = x; // error: `x` is ambiguous
fn main(){ print!("{}", z); }
```
The resolution:
```rust
mod a { pub const x : i32 = 2; }
mod b { pub const x : i32 = 3; }
use a::*;
use b::*;
use b::x as x;
const z : i32 = x; // b::x is selected
fn main(){ print!("{}", z); } // 3
```

The issue is a bit more fundamental to their traits design than just disambiguating names. They don't even have names for implementations of traits.

Once you have names, in order to detect mismatches between associated types, you need some sort of dependent typing, for example path-dependent types. This is because implementations are values, and the associated types depend on these values.

Enforcing that there can be only one implementation of a trait for a given type gets around this dependency: you can treat the mapping from implementation to associated type as just a mapping from the type to the associated type given by the unique implementation.
Mar 26
parent Walter Bright <newshound2 digitalmars.com> writes:
It does look like the same idea in the example you gave.

D has a lot less typing!
Mar 27
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 3/24/26 15:36, H. S. Teoh wrote:
 However, that led to counterintuitive behaviours like:
 
 	void main() {
 		myFunc("abc");
 	}
 	void myFunc(string text) {
 		writeln(text);	// prints "abc"
 		import std.conv;
 		writeln(text);	// prints ""
 	}
 
 Walter was very resistant to changing this behaviour, as all proposed
 solutions involved "inelegant" complications in the semantics and
 implementation of "import".  Eventually, however, the crowd prevailed
 and Walter relented.
Not really, the problem is not fixed:

```d
import std.stdio;

string readAndLog(string filename){
    import std.file;
    auto text=readText(filename);
    write(filename," read successfully!\n");
    return text;
}

void main(){
    writeln(readAndLog("important_data.txt"));
}
```

https://github.com/dlang/dmd/issues/19272

So what we have now is a compromise: It's both broken and not entirely straightforward.
Mar 24
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Mar 24, 2026 at 08:50:26PM +0100, Timon Gehr via Digitalmars-d wrote:
[...]
 ```d
 import std.stdio;
 
 string readAndLog(string filename){
     import std.file;
     auto text=readText(filename);
     write(filename," read successfully!\n");
     return text;
 }
 
 void main(){
     writeln(readAndLog("important_data.txt"));
 }
 ```
 
 https://github.com/dlang/dmd/issues/19272
 
 So what we have now is a compromise: It's both broken and not entirely
 straightforward.
And the solution will add yet more complication to an already complex import symbol resolution system. :-D T -- What do you get if you throw a grenade into a French kitchen? Linoleum blown apart.
Mar 24
parent reply user1234 <user1234 12.de> writes:
On Tuesday, 24 March 2026 at 20:17:16 UTC, H. S. Teoh wrote:
 On Tue, Mar 24, 2026 at 08:50:26PM +0100, Timon Gehr via 
 Digitalmars-d wrote: [...]
 ```d
 import std.stdio;
 
 string readAndLog(string filename){
     import std.file;
     auto text=readText(filename);
     write(filename," read successfully!\n");
     return text;
 }
 
 void main(){
     writeln(readAndLog("important_data.txt"));
 }
 ```
 
 https://github.com/dlang/dmd/issues/19272
 
 So what we have now is a compromise: It's both broken and not 
 entirely straightforward.
And the solution will add yet more complication to an already complex import symbol resolution system. :-D T
git revert is so simple
Mar 24
parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Mar 24, 2026 at 11:31:25PM +0000, user1234 via Digitalmars-d wrote:
 On Tuesday, 24 March 2026 at 20:17:16 UTC, H. S. Teoh wrote:
 On Tue, Mar 24, 2026 at 08:50:26PM +0100, Timon Gehr via Digitalmars-d
 wrote: [...]
[...]
 https://github.com/dlang/dmd/issues/19272
 
 So what we have now is a compromise: It's both broken and not
 entirely straightforward.
And the solution will add yet more complication to an already complex import symbol resolution system. :-D
[...]
 git revert is so simple
Yes, and it will equally simply break a ton of user code that has come to rely on the new behaviour, thus causing outrage among users at the next release. :-P T -- In theory, there is no difference between theory and practice.
Mar 24
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
A shout out to Martin Nowak who implemented the 2 phase lookup.
Mar 24
prev sibling next sibling parent reply matheus <matheus gmail.com> writes:
On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 It's true that writing code in C doesn't automatically make it 
 faster.

 ...
Yes, C has its flaws, but it's an older language that is still popular these days. In the '90s it was king among developers, and even today there are projects being written in it. It's hard to compare with today's more flexible languages, so it looks brittle, but at the same time it is less complex than newer languages. Coming from C, I sometimes get lost reading code in newer languages, where people make simple things more complex.

Matheus.
Mar 30
next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Monday, 30 March 2026 at 12:31:37 UTC, matheus wrote:
 It's hard to compare with today's more flexible languages, so it
 looks brittle, but at the same time it is less complex than newer
 languages. Coming from C, I sometimes get lost reading code in
 newer languages, where people make simple things more complex.
I can relate. When reading D code, code often looks over-engineered. For example by aggressively deduplicating code using a string mixin and CTFE instead of normal functions.

```D
mixin(TraceHook!("Tarr", "_d_arrayappendcTX"));
```
https://github.com/dlang/dmd/blob/371c7a5e3ef9c84d84bea869e2476ad050fbc916/druntime/src/core/internal/array/appending.d#L125

---

On the other hand, most C code bases I find come with their own macro language to make up for C lacking the most basic necessities. For example:

https://github.com/EpicGamesExt/raddebugger/blob/7021d6056a8058b34e7b026e728b5f44346be1a1/src/base/base_core.h#L412

```C
#define ArrayCount(a) (sizeof(a) / sizeof((a)[0]))
//...
typedef uint32_t U32;
typedef uint64_t U64;
//...
global U64 max_U64 = 0xffffffffffffffffull;
//...
#define DeferLoop(begin, end) for(int _i_ = ((begin), 0); !_i_; _i_ += 1, (end))
#define DeferLoopChecked(begin, end) for(int _i_ = 2 * !(begin); (_i_ == 2 ? ((end), 0) : !_i_); _i_ += 1, (end))
#define EachIndex(it, count) (U64 it = 0; it < (count); it += 1)
#define EachElement(it, array) (U64 it = 0; it < ArrayCount(array); it += 1)
#define EachEnumVal(type, it) (type it = (type)0; it <
#define EachNonZeroEnumVal(type, it) (type it = (type)1; it <
#define EachInRange(it, range) (U64 it = (range).min; it < (range).max; it += 1)
#define EachNode(it, T, first) (T *it = first; it != 0; it = it->next)
#define EachBit(it, flags) (U64 (_i_) = (flags), it = (flags) & -(flags); (_i_) != 0; (_i_) &= ((_i_) - 1), it = (flags) & -(flags))
```

Not looking so simple anymore :-)

I'm glad D users at least use standard `.length`, `ulong.max` and `foreach` instead of everyone inventing their own versions of them.
Mar 30
next sibling parent reply Forum User <forumuser example.com> writes:
On Monday, 30 March 2026 at 14:23:35 UTC, Dennis wrote:

 ```C
 #define ArrayCount(a) (sizeof(a) / sizeof((a)[0]))
 ```
Is this a valid C translation unit?

    #include <stdio.h>
    int (*fun) ();
    int main ()
    {
       printf ("%zu\n", sizeof fun);
       printf ("%zu\n", sizeof (fun) ());
    }

If it is compiled on an LP64 system, what will be printed?
Mar 31
parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 31 March 2026 at 19:39:20 UTC, Forum User wrote:
 On Monday, 30 March 2026 at 14:23:35 UTC, Dennis wrote:

 ```C
 #define ArrayCount(a) (sizeof(a) / sizeof((a)[0]))
 ```
This construct fails if `a` is a pointer rather than an array, because sizeof(a) is then only the size of the pointer, not of the array. To make sure you get a compilation error, you have to use compiler extensions, which makes for non-trivial code. Here is how to make that macro safe in GNU C:

```C
#ifdef __GNUC__
#define ArrayCount(a) (sizeof(struct {int :-!!(__builtin_types_compatible_p(typeof (a), typeof (&0[a])));})+sizeof a/sizeof 0[a])
#else
/* A portable variant, but which doesn't check for array or pointer. */
#define ArrayCount(a) (sizeof a/sizeof 0[a])
#endif
```
Apr 02
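For readers who have not been bitten by this yet, here is a minimal C sketch of the failure mode described above. The table `primes` and both function names are invented for the illustration; only the macro comes from the thread. Compilers may warn about the second function, but it still compiles.

```c
#include <assert.h>
#include <stddef.h>

#define ArrayCount(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table, used only for this illustration. */
static const int primes[] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 };

/* Applied to a real array, the macro yields the element count. */
static size_t count_of_array(void)
{
    return ArrayCount(primes);
}

/* Applied to a pointer, the macro quietly computes
   sizeof(const int *) / sizeof(int), no matter how many
   elements the caller's array really has. */
static size_t count_via_pointer(const int *p)
{
    assert(p != NULL);
    return ArrayCount(p);
}
```

`count_of_array()` gives 10, while `count_via_pointer(primes)` gives `sizeof(const int *) / sizeof(int)`, which is 2 on a typical LP64 system.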
parent Derek Fawcus <dfawcus+dlang employees.org> writes:
On Thursday, 2 April 2026 at 17:37:18 UTC, Patrick Schluter wrote:
 To make sure to get a compilation error one has to use compiler 
 extensions, which gives a non trivial code.
One can do equivalent in standard C (C11) by making use of _Generic. DF
Apr 02
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2026 7:23 AM, Dennis wrote:
 I'm glad D users at least use standard `.length`, `ulong.max` and `foreach` 
 instead of everyone inventing their own versions of them.
The motivation for D's `debug` statements and declarations was a Microsoft manager's lament to me that every C team had a different standard on how to mark debug code. This made it very tedious to share code. `debug` also turns off safety checks, so one can insert whatever is needed to debug the code. Standardizing common things turns out to be a big deal. (It's a rite of passage for every C programmer to develop their own, unique string package. Gawd knows I've probably coded up a couple dozen of them.)
Apr 01
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Wed, Apr 01, 2026 at 09:11:45AM -0700, Walter Bright via Digitalmars-d wrote:
[...]
 (It's a rite of passage for every C programmer to develop their own,
 unique string package. Gawd knows I've probably coded up a couple
 dozen of them.)
Ugh, recently at a work project (> 2G LOC, majority of it in plain C) I had to deal with cleaning up string code for compliance with security standards. Every module has its own way of working with strings; some have fancy buffers with custom functions for building strings (and each gratuitously incompatible with the others), others abuse snprintf and strncat all over the place (hidden O(n^2) costs, anyone?), and yet others outright strcat and sprintf (yes, the unsafe variants!) to a buffer whose size is never checked that's passed around as a bare char*. (It took some digging to discover that everyone uses the same underlying 64k buffer size. Still, extremely scary.) A lot of boilerplate code exists purely to bridge the impedance mismatch when passing strings between modules. In this day and age, not having a standard API for string handling across the entire language is inexcusable. While flexibility is nice to have, some things *need* to be standardized, otherwise it becomes unmanageable with size and leads to needless boilerplate bloat. T -- Democracy: The triumph of popularity over principle. -- C.Bond
Apr 01
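As a rough illustration of the hidden O(n^2) cost mentioned above: every `strcat`-style call has to walk the destination to find the terminating 0 before it can append, so building a string piece by piece rescans it over and over. A sketch of the alternative, where the length travels with the buffer, might look like this in C. `StrBuf`, `strbuf_init` and `strbuf_append` are hypothetical names, not an existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity builder: the length is stored next to the
   buffer, so appending never rescans for the terminating 0. */
typedef struct {
    char  *data;  /* caller-owned storage */
    size_t len;   /* bytes used, excluding the terminator */
    size_t cap;   /* total size of data, including room for the terminator */
} StrBuf;

static StrBuf strbuf_init(char *storage, size_t cap)
{
    assert(storage != NULL && cap > 0);
    storage[0] = '\0';
    StrBuf b = { storage, 0, cap };
    return b;
}

/* Appends n bytes from src. Returns 0 on success, -1 if they do not fit;
   on failure the buffer is left unchanged. */
static int strbuf_append(StrBuf *b, const char *src, size_t n)
{
    assert(b != NULL && src != NULL);
    if (n >= b->cap - b->len)   /* need room for n bytes plus the 0 */
        return -1;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

Each append costs time proportional to the piece being added rather than to everything already in the buffer, and the capacity check happens in one place instead of at every call site.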
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
So true. When I'm asked to review a C project, the first thing I do is look for 
strncat(). Code that uses it is guaranteed to be broken.
Apr 01
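One common pitfall that may be behind this rule of thumb (an assumption on my part, the post does not say which bug is meant): the third argument of `strncat` is the maximum number of characters to take from the source, not the size of the destination, and the terminating 0 is written on top of that. A sketch of a correct call, wrapped in a hypothetical helper named `append_truncating`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the 0-terminated string in dst, whose
   total size is dst_size bytes, truncating if needed.
   Returns 1 if all of src fit, 0 if it was truncated. */
static int append_truncating(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);          /* a full scan on every call */
    assert(used < dst_size);            /* dst must already be terminated in bounds */
    size_t room = dst_size - used - 1;  /* keep one byte for the terminator */

    /* Passing dst_size here instead of room would be the classic mistake. */
    strncat(dst, src, room);
    return strlen(src) <= room;
}
```

Even when used correctly, the function still has to rescan `dst` on every call, which is the quadratic behaviour complained about earlier in the thread.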
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/1/2026 11:55 AM, H. S. Teoh wrote:
 I had to deal with cleaning up string code for compliance with security
 standards.  Every module has its own way of working with strings; some
 have fancy buffers with custom functions for building strings (and each
 gratuitously incompatible with the others), others abuse snprintf and
 strncat all over the place (hidden O(n^2) costs, anyone?), and yet
 others outright strcat and sprintf (yes, the unsafe variants!) to a
 buffer whose size is never checked that's passed around as a bare char*.
 (It took some digging to discover that everyone uses the same underlying
 64k buffer size. Still, extremely scary.)
I made this into a slide for the upcoming Elegant D presentation at Yale! Too on target to pass up.
Apr 01
prev sibling parent reply ShadoLight <ettienne.gilbert gmail.com> writes:
On Wednesday, 1 April 2026 at 18:55:29 UTC, H. S. Teoh wrote:

 Ugh, recently at a work project (> 2G LOC ...
Wait, what, holy mackerel ... 2G LOC like in 2 *billion* LOC ? What is your company coding ... a new control system for the Starship Enterprise ? ;-)
Apr 02
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Thu, Apr 02, 2026 at 08:03:26AM +0000, ShadoLight via Digitalmars-d wrote:
 On Wednesday, 1 April 2026 at 18:55:29 UTC, H. S. Teoh wrote:
 
 Ugh, recently at a work project (> 2G LOC ...
Wait, what, holy mackerel ... 2G LOC like in 2 *billion* LOC ?
OK, I miscalculated, it's actually closer to 40M LOC, lol. But still, large enough not to sneeze at.
 What is your company coding ... a new control system for the Starship
 Enterprise ? ;-)
No, just a huge patchwork of a lot of legacy code written over the course of the last 20-25 years or so. A lot of stuff from 20 years ago that nobody dares to touch because it Just Works(tm) (well, *most* of the time) and nobody understands it anymore 'cos the original dev has moved on. The evil strcat's and strcpy's are mostly in this code.

T

--
Why is the doctor so calm? Because he has a lot of patients.
Apr 02
parent reply Kapendev <alexandroskapretsos gmail.com> writes:
On Thursday, 2 April 2026 at 14:41:47 UTC, H. S. Teoh wrote:
 No, just a huge patchwork of a lot of legacy code written over 
 the course of the last 20-25 years or so.  A lot of stuff from 
 20 years ago that nobody dares to touch because it Just 
 Works(tm) (well, *most* of the time) and nobody understands it 
 anymore 'cos the original dev has moved on.
Good thing we live in the future and LLMs can fix this. Well, they can fix everything actually. Even this message was written, checked and sent by a very helpful LLM.
Apr 02
parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Thu, Apr 02, 2026 at 03:27:42PM +0000, Kapendev via Digitalmars-d wrote:
 On Thursday, 2 April 2026 at 14:41:47 UTC, H. S. Teoh wrote:
 No, just a huge patchwork of a lot of legacy code written over the
 course of the last 20-25 years or so.  A lot of stuff from 20 years
 ago that nobody dares to touch because it Just Works(tm) (well,
 *most* of the time) and nobody understands it anymore 'cos the
 original dev has moved on.
Good thing we live in the future and LLMs can fix this. Well, they can fix everything actually. Even this message was written, checked and sent by a very helpful LLM.
I am still a skeptic. T -- Synonym rolls: just like grammar used to make them.
Apr 02
prev sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 02/04/2026 5:11 AM, Walter Bright wrote:
 On 3/30/2026 7:23 AM, Dennis wrote:
 
     I'm glad D users at least use standard `.length`, `ulong.max` and
     `foreach` instead of everyone inventing their own versions of them.
 
 The motivation for D's `debug` statements and declarations was a 
 Microsoft manager's lament to me that every C team had a different 
 standard on how to mark debug code. This made it very tedious to share 
 code. `debug` also turns off safety checks, so one can insert whatever 
 is needed to debug the code.
My one gripe with version and debug statements is the non-standard names for the identifiers. It should've been tied to the module system:

```d
debug(my.pack:whatever) {}
debug(my.mod.ule:thing) {}
debug(:feature) {} // automatically for module
```

Without this, it's always a worry for me about namespace clashing.
 Standardizing common things turns out to be a big deal.
I am going to quote you on that one, a lot, thanks for the quote!
Apr 01
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
I think I'm going to steal that for my upcoming talk on "Elegant D"! Thank you, 
Dennis!
Apr 01
parent reply Dennis <dkorpel gmail.com> writes:
On Wednesday, 1 April 2026 at 16:12:28 UTC, Walter Bright wrote:
 I think I'm going to steal that for my upcoming talk on 
 "Elegant D"! Thank you, Dennis!
Go right ahead!
Apr 01
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/1/2026 1:54 PM, Dennis wrote:
 Go right ahead!
Ahead warp factor 6!
Apr 01
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2026 5:31 AM, matheus wrote:
 Yes, C has its flaws, but it's an older language that is still popular these days. In 
 the '90s it was king among developers, and even today there are projects being 
 written in it. It's hard to compare with today's more flexible languages, so it 
 looks brittle, but at the same time it is less complex than newer languages. 
 Coming from C, I sometimes get lost reading code in newer languages, where people 
 make simple things more complex.
C is indeed a simple language, but writing higher level code with it is awkward. Especially when one resorts to using the preprocessor for metaprogramming.
Apr 01
parent reply Araq <rumpf_a web.de> writes:
On Wednesday, 1 April 2026 at 16:04:08 UTC, Walter Bright wrote:
 On 3/30/2026 5:31 AM, matheus wrote:
 Yes, C has its flaws, but it's an older language that is still 
 popular these days. In the '90s it was king among developers, 
 and even today there are projects being written in it. It's hard 
 to compare with today's more flexible languages, so it looks 
 brittle, but at the same time it is less complex than newer 
 languages. Coming from C, I sometimes get lost reading code in 
 newer languages, where people make simple things more complex.
C is indeed a simple language, but writing higher level code with it is awkward. Especially when one resorts to using the preprocessor for metaprogramming.
Well it's neither particularly simple to implement (is `T*x` a declaration or a multiplication?) nor to use (so much UB...) nor to read (too bad if the author used the preprocessor extensively)... There is nothing "simple" about it; it lacks many important features though... If you want simple, look into Forth or Scheme or Oberon.
Apr 01
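The `T*x` question above can be made concrete with a small C sketch: the same three tokens are a declaration or a multiplication depending on what `T` names at that point, which is why a C parser needs symbol-table feedback. The function names are made up for the illustration.

```c
#include <assert.h>

typedef int T;   /* at file scope, T names a type */

/* Here T is the typedef, so "T*x" declares x as a pointer to int. */
static int parsed_as_declaration(void)
{
    int value = 5;
    T*x = &value;
    assert(x != 0);
    return *x;
}

/* Here a local variable named T hides the typedef, so the very same
   tokens "T*x" form a multiplication. */
static int parsed_as_multiplication(void)
{
    int T = 6, x = 7;
    return T*x;
}
```

The first function returns 5 and the second returns 42, although the token sequence `T*x` is identical in both.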
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/1/2026 9:35 AM, Araq wrote:
 Well it's neither particularly simple to implement (is `T*x` a declaration or a 
 multiplication?) nor to use (so much UB...) nor to read (too bad if the author 
 used the preprocessor extensively)... There is nothing "simple" about it; it 
 lacks many important features though... If you want simple, look into Forth or 
 Scheme or Oberon.
As programming languages go, C is simple to implement, which gave it a big early boost in popularity. That doesn't mean it is simple to write robust, elegant code in it!
Apr 01
parent Araq <rumpf_a web.de> writes:
On Wednesday, 1 April 2026 at 16:45:26 UTC, Walter Bright wrote:
 On 4/1/2026 9:35 AM, Araq wrote:
 Well it's neither particularly simple to implement (is `T*x` a 
 declaration or a multiplication?) nor to use (so much UB...) 
 nor to read (too bad if the author used the preprocessor 
 extensively)... There is nothing "simple" about it; it lacks 
 many important features though... If you want simple, look 
 into Forth or Scheme or Oberon.
As programming languages go, C is simple to implement, which gave it a big early boost in popularity. That doesn't mean it is simple to write robust, elegant code in it!
Sure, if you ignore anything like `volatile` and sequence points and `_Atomic` and const qualifiers and its two-step tokenization when you directly embed the preprocessor in your parser for speed... Then it's simple, sure. In general, don't read the spec, don't read the sections about valid type based aliasing and it's fine.
Apr 01
prev sibling next sibling parent reply Cid Lib <CidLib gmail.com> writes:
On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 ...
 
 The fix is simple - use length-delimited strings. D relies on 
 them to great effect. This can be done in C, but there is no 
 succor from the language, and such a package is not 
 standardized. I've proposed a simple enhancement for C to make 
 them work 
 https://www.digitalmars.com/articles/C-biggest-mistake.html but 
 nobody in the C world has any interest in it (which is 
 baffling, as it is so simple!).

 ...
C isn't just a language; it's a global ecosystem. They aren't *ignoring* you because the idea is bad. To the C world, 'simple' means "don't break the ABI." C has been using the "manual fat pointer" for 50 years. And it'll still be doing it for another 50 years. C isn't a sportscar; it's the *foundation* of interoperability. You don't want your foundation changing very often, if at all. In the end, there is no successor to C on the horizon, because C's unchanging nature is its greatest. It's also its greatest weakness.
Apr 02
parent reply Cid Lib <CidLib gmail.com> writes:
On Friday, 3 April 2026 at 06:53:41 UTC, Cid Lib wrote:
 In the end, there is no successor to C on the horizon, because 
 C's unchanging nature is its greatest.

 It's also its greatest weakness.
oops. correction: In the end, there is no successor to C on the horizon, because C's unchanging nature is its greatest strength. It's also its greatest weakness.
Apr 02
parent reply Walter Bright <newshound2 digitalmars.com> writes:
C changes all the time.

My proposal for arrays is completely upwards compatible with C. It won't break 
anything.
Apr 03
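For context, this is roughly what the "manual fat pointer" looks like when done by hand in today's C. It is only a library-level sketch with hypothetical names (`IntSlice`, `slice_get`, `slice_range`), not the syntax proposed in the linked article. It shows the two things a pointer-plus-length pair buys: bounds-checked access, and slicing without a malloc and copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": a pointer that carries its element count. */
typedef struct {
    const int *ptr;
    size_t     len;
} IntSlice;

/* Bounds-checked read: returns 1 and stores the element in *out,
   or returns 0 if index is out of range. */
static int slice_get(IntSlice s, size_t index, int *out)
{
    assert(out != NULL);
    if (index >= s.len)
        return 0;
    *out = s.ptr[index];
    return 1;
}

/* Sub-slice [from, to): no copy and no malloc, just a narrower view.
   Out-of-range requests yield an empty slice. */
static IntSlice slice_range(IntSlice s, size_t from, size_t to)
{
    IntSlice empty = { s.ptr, 0 };
    if (from > to || to > s.len)
        return empty;
    IntSlice view = { s.ptr + from, to - from };
    return view;
}
```

Note that a view like this does not own its memory, so it says nothing about lifetime: the caller still has to keep the underlying array alive for as long as the slice is in use.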
parent reply Cid Lib <cidlib gmail.com> writes:
On Friday, 3 April 2026 at 16:50:52 UTC, Walter Bright wrote:
 C changes all the time.

 My proposal for arrays is completely upwards compatible with C. 
 It won't break anything.
When a project needs that kind of safety, the answer should not be to try to 'fix' C. Instead you should move to a language that was built with safety principles from the ground up.

Fixing 'pointer decay' with fat pointers is like upgrading the locks on your front door while the house is built on a shifting swamp.

Fat pointers are just a 'safety' mirage. They are 'temporally' blind. You have to deal with the life cycle of memory, not just the boundaries of memory.

Even with a fat pointer, you will still have to contend with these issues in C:

- the "use-after-free" problem.
- the "uninitialized memory" problem.
- null dereference (A fat pointer can still have a NULL base address.)
.. .... ......

It's a classic case of diminishing returns. While a fat pointer solves the most 'infamous' C vulnerability - the buffer overflow - it leaves the rest of the minefield untouched. In fact, it might even make some bugs harder to find by providing a false sense of security.

And really, after all of the above, I haven't even begun to mount my technical 'prosecution' against a fat pointer for C.
Apr 04
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 05/04/2026 12:15 PM, Cid Lib wrote:
 On Friday, 3 April 2026 at 16:50:52 UTC, Walter Bright wrote:
 C changes all the time.

 My proposal for arrays is completely upwards compatible with C. It 
 won't break anything.
When a project needs that kind of safety, the answer should not be to try to 'fix' C. Instead you should move to a language that was built with safety principles from the ground up. Fixing 'pointer decay' with fat pointers is like upgrading the locks on your front door while the house is built on a shifting swamp.
That's right. This is why static analysis engines like Astrée exist. If you're not using the best analysis engine in existence to prove safety, you are not safe. Only C and C++ are supported. Changing languages doesn't fix this.

https://www.absint.com/astree/index.htm
Apr 04
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/4/2026 6:27 PM, Richard (Rikki) Andrew Cattermole wrote:
 https://www.absint.com/astree/index.htm
I don't see mention of use-after-free, double free, or memory leak detection. There's no static analyzer that can detect this reliably and have C semantics.
Apr 06
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 07/04/2026 5:17 PM, Walter Bright wrote:
 On 4/4/2026 6:27 PM, Richard (Rikki) Andrew Cattermole wrote:
 https://www.absint.com/astree/index.htm
I don't see mention of use-after-free, double free, or memory leak detection. There's no static analyzer that can detect this reliably and have C semantics.
https://www.absint.com/astree/compliance.htm

Page CWE:

401 Improper release of memory before removing last reference (memory leak)
415 Double free
416 Use after free

Astrée is literally the static analyzer you build when money is not a limitation.

The reason I am so gung-ho on it is that it's quite literally a marvel of engineering, right up there with Teletypes and the Blackbird. Just a 21st century feat, not a 20th.

It's a shame we can't go and play with it; we don't have that kind of budget.
Apr 07
parent reply Walter Bright <newshound2 digitalmars.com> writes:
```
T* p = NULL;
if (foo())
     p = (T*)malloc(...);
...dum dee dum dee dum...
if (foo())
{
     *p = 3;
     free(p);
}
```
If the static analyzer cannot determine what foo() returns, it cannot determine 
if the code is bad or not. It's the halting problem.

I've seen the equivalent of this in the wild. It's bad style, but it exists.
Apr 07
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/04/2026 8:24 AM, Walter Bright wrote:
 ```
 T* p = NULL;
 if (foo())
      p = (T*)malloc(...);
 ...dum dee dum dee dum...
 if (foo())
 {
      *p = 3;
      free(p);
 }
 ```
 If the static analyzer cannot determine what foo() returns, it cannot 
 determine if the code is bad or not. It's the halting problem.
 
 I've seen the equivalent of this in the wild. It's bad style, but it 
 exists.
1. Dedicated static analyzers work with whole program analysis, via IRs.

2. This is solved, without the use of meet operations. Basically the entire state context gets duplicated on if statement completion, one for each branch, and everything after gets evaluated with both contexts. This appears to be why GCC will only error on one branch with the static analyzer errors. This has a research paper on it, and I was able to come up with it on my own too. I haven't done it in the fast DFA engine because it's not exactly fast.
Apr 07
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/7/2026 4:52 PM, Richard (Rikki) Andrew Cattermole wrote:
 1. Dedicated static analyzers work with whole program analysis, via IR's.
The halting problem cannot be solved.
 2. This is solved, without the use of meet operations. Basically the entire 
 state context gets duplicated on if statement completion, one for each branch, 
 and everything after gets evaluated with both contexts. This appears to be why 
 GCC will only error on one branch with the static analyzer errors. This has a 
 research paper on it, and I was able to come up with it on my own too. I haven't 
 done it in the fast DFA engine because its not exactly fast.
The program's flow depends on its inputs, and a static analyzer doesn't know what they are. BTW, if the dedicated static analyzers work, why does AI keep finding security bugs in Linux code and everything else? Most recently an array buffer overflow that had been in the kernel for 20 years.
Apr 08
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Wed, Apr 08, 2026 at 01:01:01PM -0700, Walter Bright via Digitalmars-d wrote:
 On 4/7/2026 4:52 PM, Richard (Rikki) Andrew Cattermole wrote:
[...]
 2. This is solved, without the use of meet operations. Basically the
 entire state context gets duplicated on if statement completion, one
 for each branch, and everything after gets evaluated with both
 contexts.  This appears to be why GCC will only error on one branch
 with the static analyzer errors. This has a research paper on it,
 and I was able to come up with it on my own too. I haven't done it
 in the fast DFA engine because its not exactly fast.
The program's flow depends on its inputs, and a static analyzer doesn't know what they are.
[...]

I've dabbled a bit in algorithms of this kind. Generally, you don't work with concrete values, because then you'll run into performance issues and the good ole halting problem. Instead, you work at a slightly more abstract level, such as with value sets (a generalization of VRP actually). You assume that parameters can take on any value in their type (and perhaps a fancy analyser will reduce this initial set according to in-contracts -- but that's only possible for simple contracts; with D's contracts a halting problem can potentially lurk here; if the contract is too complex the algorithm can just take the worst case of taking the full set of values for that type). Then you iterate over the statements and eliminate portions of this set, e.g. when the parameter is assigned to, etc., and propagate the set over assignments.

You also don't do actual flow, because that can be arbitrarily complex and lead to the halting problem; instead, you do a simplified flow: when you encounter an if-statement, you step over both branches and compute the value set for both, then at the end you assume the worst case and take their union. Loops are handled the same way: you step through the loop body once, then assume the worst and reduce the set only if you're able to prove that *every* loop iteration reduces it. If you can't prove that, assume the worst and take the unreduced set prior to the loop. There may be some simple loops (e.g. foreach) where more advanced reasoning can be applied, but basically you apply special cases where you can to reduce the value set, and where you can't just assume the worst / take the conservative assumption.

Then at the end, if you're able to reduce the set, then the missing elements are provably impossible, and you can reason about whether the remaining values can possibly overrun bounds.

But as Rikki said, all of this won't be fast. And obviously there will be cases it cannot catch. So it's a toss-up whether such a thing is worth implementing in a compiler.

T

--
"You know, maybe we don't *need* enemies." "Yeah, best friends are about all I can take." -- Calvin & Hobbes
Apr 08
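The value-set idea described above can be sketched in a few lines of C. This is a deliberately tiny model, not how any particular compiler or analyzer does it: value sets are approximated by a single closed interval, and `Range`, `range_join`, `range_assume_less` and `range_index_is_safe` are hypothetical names. Joining two branches takes the smallest interval covering both, which is a conservative stand-in for their union.

```c
#include <assert.h>

/* Hypothetical abstract value: every integer in [lo, hi] is possible. */
typedef struct {
    long lo;
    long hi;
} Range;

static Range range_make(long lo, long hi)
{
    assert(lo <= hi);
    Range r = { lo, hi };
    return r;
}

/* After an if/else, assume the worst case: the smallest range that
   covers what either branch could have produced. */
static Range range_join(Range a, Range b)
{
    Range r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Inside the true branch of "if (v < limit)", values >= limit are
   impossible. The caller must ensure the branch is reachable. */
static Range range_assume_less(Range v, long limit)
{
    assert(v.lo < limit);
    Range r = { v.lo, v.hi < limit - 1 ? v.hi : limit - 1 };
    return r;
}

/* An index is provably safe only if every possible value is in bounds. */
static int range_index_is_safe(Range index, long length)
{
    return index.lo >= 0 && index.hi < length;
}
```

As a usage example, take an input `i` known only to lie in [0, 255], and code of the form `if (i < 16) j = i; else j = 0;`. The true branch gives `range_assume_less(i, 16)`, the else branch gives `range_make(0, 0)`, and their join is [0, 15], so indexing a 16-element array with `j` needs no bounds check, while indexing it with `i` cannot be proven safe.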
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Thank you for the interesting explanation of how it works!

I see its value as a way to remove array bounds checks for the cases where it 
can prove no overflow problems. So I see it as an optimization.

But it does not *solve* the problem. Array bounds checks solve it.
Apr 08
prev sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/04/2026 8:34 AM, H. S. Teoh wrote:
 You also don't do actual flow, because that can be arbitrarily complex
 and lead to the halting problem; instead, you do a simplified flow: when
 you encounter an if-statement, you step over both branches and compute
 the value set for both, then at the end you assume the worst case and
 take their union.  Loops are handled the same way: you step through the
 loop body once, then assume the worst and reduce the set only if you're
 able to prove that *every* loop iteration reduces it. If you can't prove
 that, assume the worst and take the unreduced set prior to the loop.
 There may be some simple loops (e.g. foreach) where more advanced
 reasoning can be applied, but basically you apply special cases where
 you can to reduce the value set, and where you can't just assume the
 worst / take the conservative assumption.
You're talking about doing a meet operation on lattices. Walter knows how to do this, he has implemented this before I was ever born. However this is the original way, there are other solutions for how to handle it now that don't use the meet operation, and instead keep all the state around. But alas more cpu and ram costs. They could not be implemented in the early 2000's when static analyzers surged in implementation and usage.
 Then at the end, if you're able to reduce the set, then the missing
 elements are provably impossible, and you can reason about whether the
 remaining values can possibly overrun bounds.
 
 But as Rikki said, all of this won't be fast.  And obviously there will
 be cases it cannot catch.  So it's a toss-up whether such a thing is
 worth implementing in a compiler.
After a bit more reading of:

https://www.nist.gov/publications/sate-vi-report-bug-injection-and-collection

I notice that Frama-C with Eva also meets the sound criteria and it is free.
Apr 08
prev sibling next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/04/2026 8:01 AM, Walter Bright wrote:
 BTW, if the dedicated static analyzers work, why does AI keep finding 
 security bugs in Linux code and everything else? Most recently an array 
 buffer overflow that had been in the kernel for 20 years.
Hardly anyone is using dedicated static analyzers, let alone paying for one that is price upon request. The most people do is use the static analyzer that comes with their compiler, which quite frankly is a toy in comparison.

The Linux kernel is an outlier, as they've been using dedicated static analyzers for well over 20 years. I covered this well over a year ago now at a monthly meeting.

On top of this, there are multiple dedicated efforts to apply static analyzers to the kernel, which have found problems and gotten them fixed. Quite frankly when was the last time you ran into a BSOD regularly? A good 20 years ago right? Hint, this is why. Everyone uses some kind of static analyzer for kernels, which has improved reliability significantly.

https://lwn.net/Articles/412750/
https://repo.or.cz/w/smatch.git
https://linuxtesting.org/results/ldv

I've mentioned this one in the past, it was originally written by Linus: https://sparse.docs.kernel.org/en/latest/

So to answer your question:

1. They are not being used.
2. They are not using the best analyzer(s) as they cost significant money.
3. Not all bugs being found are being fixed (sigh).

The reason I keep referencing Astrée is because it's at the maximum of what we as a species can do: it can analyze multi-threading and prevent deadlocks. Whereas others like ikos, which is free, are for single threaded applications only.

Given how prevalent software is now in aerospace, the fact that planes aren't falling out of the sky on a regular basis is pretty incredible. No amount of hardware can make up for how invasive it is.
Apr 08
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/8/2026 1:48 PM, Richard (Rikki) Andrew Cattermole wrote:
 Given how prevalent software is now in aerospace, the fact that planes aren't 
 falling out of the sky on a regular basis is pretty incredible. No amount of 
 hardware can make up for how invasive it is.
Having done design work for 3 years at Boeing, I can tell you that the software does fail. So does the hardware. And so does the nut behind the wheel.

The reason planes don't fall out of the sky is:

* backup systems and workarounds

* (I don't know how often this happens, but airplanes are given the green light to fly even when many things are broken. There is a "minimum equipment list" which specifies what cannot be let slide.)

The reality is one cannot make perfect parts. But one can greatly reduce the consequences by making the system *tolerant* of faults. It's a different mindset.

Note that the Fukushima reactor and the Deepwater Horizon rig did not have backup systems. And so when something went wrong, a zipper effect resulted. (I don't recall the details, but I went through the zipper for both of them. It was rather astonishing for me as redundancy and fault tolerance was hammered into me.)
Apr 08
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/04/2026 9:33 AM, Walter Bright wrote:
 On 4/8/2026 1:48 PM, Richard (Rikki) Andrew Cattermole wrote:
 Given how prevalent software is now in aerospace, the fact that planes 
 aren't falling out of the sky on a regular basis is pretty incredible. 
 No amount of hardware can make up for how invasive it is.
Having done design work for 3 years at Boeing, I can tell you that the software does fail. So does the hardware. And so does the nut behind the wheel. The reason planes don't fall out of the sky is: * backup systems and workarounds * (I don't know how often this happens, but airplanes are given the green light to fly even when many things are broken. There is a "minimum equipment list" which specifies what cannot be let slide.) The reality is one cannot make perfect parts. But one can greatly reduce the consequences by making the system *tolerant* of faults. It's a different mindset. Note that the Fukushima reactor and the Deepwater Horizon rig did not have backup systems. And so when something went wrong, a zipper effect resulted. (I don't recall the details, but I went through the zipper for both of them. It was rather astonishing for me as redundancy and fault tolerance was hammered into me.)
Over the last 50 years, software has replaced hardware, and simpler hardware became more complex. Both introduce their own new risks that didn't exist when you were working on it.

The significantly increased surface area and responsibility of behavior controlled by software, and the fact that aircraft are still flying without catastrophic failures on a regular basis due to software alone, is what is so amazing.

While I'm sure we could, if we really tried, find examples where software failed and could not recover, finding hardware faults causing software faults is a lot easier than finding software faults by themselves. I.e. https://www.eplaneai.com/news/airbus-to-update-a380-engine-software-by-q1-2026

Note: I picked Airbus to search for, because I know that they use Astrée.

By all accounts, we should be seeing a lot more catastrophic level problems with articles on software failures in aircraft, yet we are not. The surface area is simply too large for humans to be the only ones catching problems.
Apr 08
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/8/2026 3:02 PM, Richard (Rikki) Andrew Cattermole wrote:
 By all accounts, we should be seeing a lot more catastrophic level problems with 
 articles on software failures in aircraft, yet we are not. The surface area is 
 simply too large for humans to be the only ones catching problems.
Problems where the backups take over do not make the news. Some systems are even triply redundant. When I was at Boeing, the backup system did not use the same hardware, programming language, or algorithms. Different teams did the programming, and were not allowed to communicate with the other team.
Apr 08
prev sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/04/2026 8:01 AM, Walter Bright wrote:
 BTW, if the dedicated static analyzers work, why does AI keep finding 
 security bugs in Linux code and everything else?
https://www.ffmpeg.org/security.html "Note, we have recently seen a spike in false positives. Make sure that what you report are real issues by careful human verification." Due to LLMs: https://x.com/FFmpeg/status/2041895360839237952 Early static analyzers had a lot of false positives, which resulted in the term: static analysis fatigue. But unlike early static analyzers, LLMs can't be fixed. There is no code that can be altered to get the desired behavior.
Apr 08
prev sibling next sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Sunday, 5 April 2026 at 00:15:33 UTC, Cid Lib wrote:
 Fixing 'pointer decay' with fat pointers is like upgrading the 
 locks on your front door while the house is built on a shifting 
 swamp.
It has always been possible to avoid pointer decay for fixed size buffers, it is just that the syntax is ugly, and the use cases are limited. Having been assigned to work on some old code, adjusting APIs to use that form (or, where difficult, the newer "char buf[static X]") combined with static analysis (and warning flags) has been an easy improvement. Rewriting in a different language is a much more risky task, and generally an impossible sell to management. DF
Apr 05
next sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Sunday, 5 April 2026 at 10:28:42 UTC, Derek Fawcus wrote:
 On Sunday, 5 April 2026 at 00:15:33 UTC, Cid Lib wrote:
 Fixing 'pointer decay' with fat pointers is like upgrading the 
 locks on your front door while the house is built on a 
 shifting swamp.
It has always been possible to avoid pointer decay for fixed size buffers, it is just that that the syntax is ugly, and the use cases are limited.
Indeed. A function returning a pointer to an array of pointers is wild, for example: char *(*function_name(int param1, const char *msg, ...))[20];
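For what it's worth, a typedef tames that kind of declarator considerably. The following is only a sketch: the names `name_table`, `get_table_raw`, `get_table` and `table_capacity` are made up for illustration and do not come from any real code base.

```c
#include <assert.h>
#include <stddef.h>

/* An array of 20 pointers to char, named so the declarator stays readable. */
typedef char *name_table[20];

/* Static storage, so returning its address is valid. */
static name_table table;

/* Raw form: function returning a pointer to an array of 20 pointers to char. */
char *(*get_table_raw(void))[20]
{
    return &table;
}

/* The same return type, spelled with the typedef. */
name_table *get_table(void)
{
    return &table;
}

/* Hypothetical usage: the array length survives in the type, so sizeof works. */
size_t table_capacity(void)
{
    name_table *t = get_table();
    assert(t == get_table_raw());      /* both spellings have the same type */
    return sizeof *t / sizeof (*t)[0]; /* number of elements in the array */
}
```

Because the result is a pointer to the whole array rather than to its first element, the element count is still part of the type on the caller's side.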
Apr 06
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/5/2026 3:28 AM, Derek Fawcus wrote:
 On Sunday, 5 April 2026 at 00:15:33 UTC, Cid Lib wrote:
 Fixing 'pointer decay' with fat pointers is like upgrading the locks on your 
 front door while the house is built on a shifting swamp.
It has always been possible to avoid pointer decay for fixed size buffers, it is just that that the syntax is ugly, and the use cases are limited.
I've never seen it used. It's too awkward.
 Having been assigned to work on some old code, adjusting APIs to use that form 
 (or where difficult the newer "char buf[static X]" combined with static analysis 
 (and warning flags) has been an easy improvement.
Static analysis cannot reliably detect array overflows. It's that old halting problem.
 Rewriting in a different language is a much more risky task, and generally an 
 impossible sell to management.
AI has proven to be useful in translating code to a different language.
Apr 06
next sibling parent reply Carl sagat <carlsagat outlook.com> writes:
On Tuesday, 7 April 2026 at 05:20:38 UTC, Walter Bright wrote:
 AI has proven to be useful in translating code to a different 
 language.
True, in fact AI has made programming languages less important these days, since you can port code between languages in minutes. The most used or trained ones will be more reliable with this translation. Write code in this language x. Now port it to that language y. Now rewrite using this feature z. The old programming language features war is over. What matters is what language is easier and less prone to errors for AI to write code. Even C can come back in style with all the source trained on over several years.
Apr 06
next sibling parent Serg Gini <kornburn yandex.ru> writes:
On Tuesday, 7 April 2026 at 05:34:51 UTC, Carl sagat wrote:
 Even C can come back in style with all the sourced trained over 
 several years.
It is not minutes. Large code bases are still hard and expensive to port. Languages themselves have been less relevant for years, and their AI performance is not too important. Ecosystem and adoption in common services were in the past, and I think still are, more important.
Apr 06
prev sibling parent Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Tuesday, 7 April 2026 at 05:34:51 UTC, Carl sagat wrote:
 On Tuesday, 7 April 2026 at 05:20:38 UTC, Walter Bright wrote:
 AI has proven to be useful in translating code to a different 
 language.
True, in fact AI made programming languages less important these days, since you can port a code in minutes between languages. The most used or trained ones will be more reliable with this translation. Write code in this language x. Now port it in that language y. Now rewrite using this feature z. The old programming language features war is over. What matters is what language is easier and less prone to errors for AI to write code. Even C can come back in style with all the sourced trained over several years.
deterministic and give you an error when something cannot be translated directly. The problem with AI is that it’s not deterministic and that it’s not reliable. If something can’t be easily translated, most AIs will overconfidently do something that doesn’t work. Then you’re in for a debugging session of the worst kind: You’ll be debugging code that no human wrote, not even a colleague who retired years ago.
Apr 13
prev sibling parent reply Derek Fawcus <dfawcus+dlang employees.org> writes:
On Tuesday, 7 April 2026 at 05:20:38 UTC, Walter Bright wrote:
 On 4/5/2026 3:28 AM, Derek Fawcus wrote:
 It has always been possible to avoid pointer decay for fixed 
 size buffers, it is just that that the syntax is ugly, and the 
 use cases are limited.
I've never seen it used. It's too awkward.
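For returning buffers, yes. For passing buffers in to functions, it is actually not that bad, and one now generally gets complaints from the compilers over errors. Something like this:

```c
#include <stdio.h>

/* Declared before use; the parameter is a pointer to a whole array of 23 chars. */
int callee(char (*buffer)[23]);

void caller(void)
{
    char buffer[23];
    int n = callee(&buffer); /* &buffer has type char (*)[23], so no decay */
    (void)n;
}

int callee(char (*buffer)[23])
{
    enum { buflen = sizeof *buffer }; /* 23, recovered from the type */
    return snprintf(*buffer, buflen, "Foobar!");
}
```

possibly using another enum, or #define, rather than the literal 23.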
 Having been assigned to work on some old code, adjusting APIs 
 to use that form (or where difficult the newer "char 
 buf[static X]" combined with static analysis (and warning 
 flags) has been an easy improvement.
Static analysis cannot reliably detect array overflows. It's that old halting problem.
It is not about being reliable, it is about catching the low hanging fruit, and catching them as early as possible. The compilers now often flag that case, 3rd party SA catches a bit more. Covering the other instances can often be achieved with various dynamic sanitisers (ASAN, bounds sanitizer, etc). I actually found and fixed a handful of errors simply from changing the callee definition in a few instances to that 'buf[static X]' form, and having the gcc flag it. In that case because there were so many 3rd party SA complaints that folks had stopped paying attention to them. So it is not just capabilities, it is an issue of practices.
 Rewriting in a different language is a much more risky task, 
 and generally an impossible sell to management.
AI has proven to be useful in translating code to a different language.
The major issue there is review bandwidth. It probably ain't going to happen, rewrites from scratch are more likely. What I am following with interest is Fil-C, which does add a GC. If the performance cost can be sufficiently improved, then that strikes me as the reasonable way forward for legacy C code, until it is chucked out and replaced by something written in a different language. DF
Apr 07
parent Walter Bright <newshound2 digitalmars.com> writes:
```
void process(size_t size, char buf[static size]) {
     // Compiler can assume buf points to at least 'size' elements
}
```

Is just a failure in language design:

1. it requires two independent pieces of information

2. it repeats itself

3. cannot declare a VLA as a variable

4. cannot return a VLA

5. cannot encapsulate a VLA in a struct

I don't recall it ever being used. It's next to useless.
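
For contrast, here is a rough sketch of what a library-level length-delimited type looks like in today's C. It is not the language enhancement proposed in the linked article, only an illustration; `str_slice`, `kv_pair` and the helper functions are hypothetical names. Pointer and length travel as one value, and that value can be declared as a variable, returned from a function, and stored in a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-delimited view: pointer and length travel together. */
typedef struct {
    const char *ptr;
    size_t      len;
} str_slice;

/* Build a slice from a 0-terminated string; the only strlen pass needed. */
str_slice slice_from_cstr(const char *s)
{
    str_slice r = { s, strlen(s) };
    return r;
}

/* Slicing returns a view by value: no malloc, no copy. Bounds are clamped. */
str_slice slice_sub(str_slice s, size_t begin, size_t end)
{
    if (end > s.len) end = s.len;
    if (begin > end) begin = end;
    str_slice r = { s.ptr + begin, end - begin };
    return r;
}

/* A slice can also be a struct member. */
typedef struct {
    str_slice key;
    str_slice value;
} kv_pair;

/* Split "key=value" at the first '='; value is empty if there is none. */
kv_pair split_kv(str_slice line)
{
    const char *eq = memchr(line.ptr, '=', line.len);
    size_t k = eq ? (size_t)(eq - line.ptr) : line.len;
    kv_pair kv = { slice_sub(line, 0, k), slice_sub(line, k + 1, line.len) };
    return kv;
}
```

The catch is the one already mentioned in this thread: nothing here is standardized, so every project ends up with its own incompatible version.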
Apr 07
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
The biggest problem, by far, with C arrays is buffer overflows. The evidence is 
from people who collect statistics on causes of bugs in the field.

To be memory safe with arrays, the GC can be used.
Apr 06
parent reply Indraj Gandham <newsgroups indraj.net> writes:
Just on the subject of buffer overflows, since I don't think anyone else 
has mentioned it here: Intel and AMD have been shipping x86 processors 
with Control-flow Enforcement Technology (CET) since 2020.

To mitigate ROP, when the CPU executes a CALL instruction, it will push 
the return address to both the normal process stack and a shadow stack 
stored in a protected region that cannot be modified except by the CPU. 
When the CPU executes the RET instruction, the addresses are popped from 
both stacks and compared. If they do not match, a fault will be raised.

To mitigate JOP, the CPU uses a state machine. When an indirect JMP or CALL 
instruction is executed, it moves from an idle state into a waiting 
state. The very next instruction must be ENDBRANCH (ENDBR32 or ENDBR64). If 
it is, the state returns to idle. If it isn't, a fault will be raised.

These mitigations do have limitations; for example the JOP protection 
does not verify whether the target is actually the correct destination 
for a given call site. It also does not protect against data 
modification. For compatibility reasons the operating system will likely 
disable CET for the entire process if one of the libraries the 
application links against was not compiled with CET support. JOP 
protection is not yet (?) available on AMD processors.

However -- CET may mean the difference between a vulnerability allowing 
for some degree of local data corruption and an attacker gaining 
complete control over your application.

You can enable CET in GCC and GDC by passing -fcf-protection=full.

The performance overhead of CET is generally minimal and most major 
Linux distributions are attempting to enable it (or have already enabled 
it) by default when compiling packages.

(Of course, it does nothing to protect against double-free, UAF etc.)
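
To make the shadow stack comparison concrete, here is a toy software model of it. This only illustrates the push/pop/compare logic described above; it is not how the hardware implements it and it uses no real CET interface. All names (`call_stacks`, `model_call`, `model_ret`, `overflow_is_caught`) are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_DEPTH = 16 };

/* Toy model: 'normal' stands for the writable process stack,
   'shadow' for the protected copy that ordinary stores cannot reach. */
typedef struct {
    uintptr_t normal[STACK_DEPTH];
    uintptr_t shadow[STACK_DEPTH];
    size_t    top;
} call_stacks;

/* Models CALL: the return address is pushed to both stacks. */
bool model_call(call_stacks *cs, uintptr_t return_addr)
{
    if (cs->top == STACK_DEPTH) return false;
    cs->normal[cs->top] = return_addr;
    cs->shadow[cs->top] = return_addr;
    cs->top++;
    return true;
}

/* Models RET: pop both and compare; a mismatch stands in for the fault. */
bool model_ret(call_stacks *cs, uintptr_t *return_addr)
{
    if (cs->top == 0) return false;
    cs->top--;
    if (cs->normal[cs->top] != cs->shadow[cs->top]) return false; /* "fault" */
    *return_addr = cs->normal[cs->top];
    return true;
}

/* Hypothetical scenario: an overflow overwrites the saved return address. */
bool overflow_is_caught(void)
{
    call_stacks cs = { .top = 0 };
    uintptr_t addr = 0;
    model_call(&cs, 0x1000);
    cs.normal[0] = 0xBAD;          /* the overflow touches only the normal stack */
    return !model_ret(&cs, &addr); /* true when the mismatch is detected */
}
```

Note what the model does not cover: the overflow itself still happened, so any data it clobbered on the way to the return address stays corrupted.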
Apr 08
parent Walter Bright <newshound2 digitalmars.com> writes:
DMD has a switch to generate Indirect Branch Tracking code: -fIBT
Apr 08
prev sibling next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 22/03/2026 5:47 PM, Walter Bright wrote:
 The fix is simple - use length-delimited strings. D relies on them to 
 great effect. This can be done in C, but there is no succor from the 
 language, and such a package is not standardized. I've proposed a simple 
 enhancement for C to make them work 
 https://www.digitalmars.com/articles/C-biggest-mistake.html but nobody in 
 the C world has any interest in it (which is baffling, as it is so simple!).
It is not officially documented: https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm Unless it's there, it does not exist. But there is plenty of desire for memory safety in C, including full blown static analysis prescribed by the spec. You should be contacting Uecker, who is leading the memory safety workgroup. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3211.pdf
Apr 03
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2026 12:03 AM, Richard (Rikki) Andrew Cattermole wrote:
 You should be contacting Uecker who is leading the memory safety workgroup. 
 https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3211.pdf
Done.
Apr 03
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 04/04/2026 7:49 AM, Walter Bright wrote:
 On 4/3/2026 12:03 AM, Richard (Rikki) Andrew Cattermole wrote:
 You should be contacting Uecker who is leading the memory safety 
 workgroup. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3211.pdf
Done.
Awesome! I look forward to reading about it.
Apr 03
prev sibling parent reply Cid Lib <cidlib gmail.com> writes:
On Friday, 3 April 2026 at 18:49:19 UTC, Walter Bright wrote:
 On 4/3/2026 12:03 AM, Richard (Rikki) Andrew Cattermole wrote:
 You should be contacting Uecker who is leading the memory 
 safety workgroup. 
 https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3211.pdf
Done.
If I were on the ISO C Committee, I would not see my job as 'fixing' C, by trying to give it a brain, but rather ensuring the language remains the 'somewhat' mindless guardian of legacy code that it is, while simultaneously ensuring it remains a language that can continue to be used in cases where it would not be considered professionally negligent to do so. The only proposal I would 100% vote for, would be a proposal for categorizing the C language to the world more correctly - i.e. renaming it to UnsafeC (for example). Every attempt to 'fix' C, eventually just results in a different language entirely. "The great thing about C is that it doesn't try to be smarter than you. If I see p = q, I know exactly what the CPU is doing. If p is a fat pointer, now the compiler is doing multi-word copies and hidden arithmetic behind my back. And that is not C!" - (A Hypothetical Linus)
Apr 04
next sibling parent Kapendev <alexandroskapretsos gmail.com> writes:
On Sunday, 5 April 2026 at 02:34:08 UTC, Cid Lib wrote:
 On Friday, 3 April 2026 at 18:49:19 UTC, Walter Bright wrote:
 On 4/3/2026 12:03 AM, Richard (Rikki) Andrew Cattermole wrote:
 You should be contacting Uecker who is leading the memory 
 safety workgroup. 
 https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3211.pdf
Done.
If I were on the ISO C Committee, I would not see my job as 'fixing' C, by trying to give it a brain, but rather ensuring the language remains the 'somewhat' mindless guardian of legacy code that it is, while simultaneously ensuring it remains a language that can continue to be used in cases where it would not be considered professionally negligent to do so. The only proposal I would 100% vote for, would be a proposal for categorizing the C language to the world more correctly - i.e. renaming it to UnsafeC (for example). Every attempt to 'fix' C, eventually just results in a different language entirely. "The great thing about C is that it doesn't try to be smarter than you. If I see p = q, I know exactly what the CPU is doing. If p is a fat pointer, now the compiler is doing multi-word copies and hidden arithmetic behind my back. And that is not C!" - (A Hypothetical Linus)
A slice/view/span is literally just a struct and `=` would do exactly what you think it would do. You can even avoid `slice[0]` with a theoretical "BetterC" by using something like `__sget(slice, index)` to keep things more C like. Would return null on error.
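
A rough sketch of that idea in plain C, assuming nothing beyond what is said above. The accessor is called `sget` here instead of `__sget`, because identifiers with a leading double underscore are reserved for the implementation; `int_slice`, `slice_sum` and `sget_demo` are likewise invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* A slice is just a struct holding a pointer and a length. */
typedef struct {
    int   *ptr;
    size_t len;
} int_slice;

/* Checked element access: pointer to the element, or NULL if out of bounds. */
int *sget(int_slice s, size_t index)
{
    return index < s.len ? &s.ptr[index] : NULL;
}

/* Sums the slice using only checked access. */
long slice_sum(int_slice s)
{
    long total = 0;
    for (size_t i = 0; ; i++) {
        int *p = sget(s, i);
        if (p == NULL) break; /* ran off the end: stop instead of overflowing */
        total += *p;
    }
    return total;
}

int sget_demo(void)
{
    int data[4] = { 1, 2, 3, 4 };
    int_slice p = { data, 4 };
    int_slice q = p;            /* plain struct assignment: pointer and length */
    assert(q.ptr == p.ptr && q.len == p.len);
    assert(sget(q, 4) == NULL); /* one past the end is rejected */
    return (int)slice_sum(q);
}
```

Returning NULL pushes the error handling onto the caller, which is very C-like, for better or worse.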
Apr 04
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/4/2026 7:34 PM, Cid Lib wrote:
 If I were on the ISO C Committee, I would not see my job as 'fixing' C, by 
 trying to give it a brain, but rather ensuring the language remains the 
 'somewhat' mindless guardian of legacy code that it is, while simultaneously 
 ensuring it remains a language that can continue to be used in cases where it 
 would not be considered professionally negligent to do so.
The C committee adds new features all the time.
 The only proposal I would 100% vote for, would be a proposal for categorizing 
 the C language to the world more correctly - i.e. renaming it to UnsafeC (for 
 example).
 
 Every attempt to 'fix' C, eventually just results in a different language entirely.
My proposal is a simple addition to C. C has already copied features from D (like `static_assert`).
 "The great thing about C is that it doesn't try to be smarter than you. If I
 see p = q, I know exactly what the CPU is doing. If p is a fat pointer, now the 
 compiler is doing multi-word copies and hidden arithmetic behind my back. And 
 that is not C!" - (A Hypothetical Linus)
`p = q` does not involve hidden arithmetic. It's an assignment of two words. Like `long long` is (often) an assignment of two words.
Apr 06
prev sibling parent reply Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
 It's true that writing code in C doesn't automatically make it 
 faster.

 For example, string manipulation. 0-terminated strings (the 
 default in C) are, frankly, an abomination. String processing 
 code is a tangle of strlen, strcpy, strncpy, strcat, all of 
 which require repeated passes over the string looking for the 
 0. (Even worse, reloading the string into the cache just to 
 find its length makes things even slower.)

 Worse is the problem that, in order to slice a string, a malloc 
 is needed to copy the slice to. And then carefully manage the 
 lifetime of that slice.

 The fix is simple - use length-delimited strings.
Working with C++ and having implemented my own `span` and `string_view` types, I think D’s strings are under-appreciated. They’re by far the best strings I’ve seen in a programming language.

C++ has: `char*` and `const char*`, `char[]`, `const char[]`, `std::string`, `std::string_view`, `std::span<char>`, and `std::span<const char>` (those times 7 because there’s `wchar_t`, `char8_t`, `char16_t` and `char32_t` and `signed`/`unsigned char`) (those times 2 for `volatile` to be pedantic). At least. Knowing when to use which is not straightforward. Comparing substrings efficiently is difficult, I always have to look up the arguments for `compare`. Most people just allocate substrings and compare those naïvely. Returning a length-delimited string by mutable content and with a default is impossible before C++20’s `span`: You can return `data()` or `nullptr` erasing the length, or return a `string_view` making the characters `const`. `std::string_view` and `span`s easily dangle, so using them e.g. for map keys is an issue: If *one* key would be dangling, you *have* to use `std::string` and copy all the keys, simply because C++ doesn’t have a GC; and then, you have `std::less<>` on your ordered map type or `std::equal_to` and a custom hash on your unordered map type because you still want to look up using `string_view` keys without copying them. Appending is done with `+` and only ever returns `std::string`. C++ has no `switch`ing on strings; if they’re short, you can write a `constexpr` utility function that maps them to numbers and switch on them. To top it all off, (`unsigned`) `char*` are also used for random data (instead of `void*`), and (`un`)`signed char` as the smallest integer types. C has a small subset of this, which makes it arguably worse.

D has: `immutable(char)[]`, `const(char)[]`, and `char[]` (times 3 for `wchar` and `dchar`). Their use-cases are straightforward, you never need to decide vanilla/signed/unsigned or `wchar` vs `char16_t`/`char32_t`. You can just return a `char[]` you have and an empty one as the default. They can’t easily dangle. Appending has its own operator `~` and you can “just append” things. Comparing substrings is straightforward. You can just use `string` as map key type and just perform lookup with a `char[]`. You can just `switch` on strings. (Honestly, D should add `switch` for all slice types over switchable types: You should be able to switch over `int[]` and `int[][]`.) Sure, there’s also `shared` and `inout`, which are basically in the same camp as C++’s `volatile`: You rarely encounter them.

Built-in length-delimited strings (or slices/spans) would be a win for C, but C++ shows they’re not a panacea. The GC enables D’s length-delimited strings to be great instead of just good; when you disable the GC, they’re still good and do profit off of the GC existing at compile-time. You can build a static string at compile-time in D. That’s impossible in C++ before C++26; I’m not sure if C++26’s reflection can do it. A lot of C++ code is C++03, lacking even basics, and lots more is C++14 (still many Linux distros’ default compiler’s default), which has no `string_view`; C++20 brought `span` and transparent lookup in unordered containers. Before actually working with those, I couldn’t have imagined how terrible it was.

The worst part about D’s strings are auto-decoding and that literals include a secret zero past the end without you asking for it.

About the last thing, maybe in the next Edition, we can have `""z` strings that request a secret zero and only add the zero if a non-z string is used to initialize an `immutable(char)*` or `const(char)*` (or make it an error to omit the `z` in that case). That would allow for some compression to be done by the compiler: If you have strings in your code like `"BC"` and `"ABCD"`, it could just re-use the segment. With the secret zero, it can only re-use suffixes.
Apr 03
next sibling parent Nick Treleaven <nick geany.org> writes:
On Friday, 3 April 2026 at 12:20:19 UTC, Quirin Schroll wrote:
 The worst part about D’s strings are auto-decoding and that 
 literals include a secret zero past the end without you asking 
 for it.

 About the last thing, maybe in the next Edition, we can have 
 `""z` strings that request a secret zero and only add the zero 
 if a non-z string is used to initialize an  `immutable(char)*` 
 or `const(char)*` (or make it an error to omit the `z` in that 
 case). That would allow for some compression to be done by the 
 compiler: If you have strings in your code like `"BC"` and 
 `"ABCD"`, it could just re-use segment. With the secret zero, 
 it can only re-use suffixes.
If we decide to break existing code in an edition, it should not cause unwanted silent runtime behaviour change. Especially something so prone to buffer overruns!

If you want a literal to not be zero-terminated, presumably you could use a static array?

```d
immutable char[2] s = "hi";
```

That will be a nicer workaround with length inference: https://dlang.org/changelog/pending.html#dmd.sarr-length-infer
Apr 03
prev sibling next sibling parent reply Kapendev <alexandroskapretsos gmail.com> writes:
On Friday, 3 April 2026 at 12:20:19 UTC, Quirin Schroll wrote:
 The worst part about D’s strings are auto-decoding and that 
 literals include a secret zero past the end without you asking 
 for it.

 About the last thing, maybe in the next Edition, we can have 
 `""z` strings that request a secret zero and only add the zero 
 if a non-z string is used to initialize an  `immutable(char)*` 
 or `const(char)*` (or make it an error to omit the `z` in that 
 case). That would allow for some compression to be done by the 
 compiler: If you have strings in your code like `"BC"` and 
 `"ABCD"`, it could just re-use segment. With the secret zero, 
 it can only re-use suffixes.
It should be the other way around if that ever becomes a thing. The defaults being safe with a zero at the end and the unsafe version requiring an explicit symbol thing.
Apr 03
next sibling parent reply Kapendev <alexandroskapretsos gmail.com> writes:
On Friday, 3 April 2026 at 14:51:00 UTC, Kapendev wrote:
 It should be the other way around if that ever becomes a thing. 
 The defaults being safe with a zero at the end and the unsafe 
 version requiring an explicit symbol thing.
Also imagine the confusion when people read old comments saying that D string literals are zero terminated, and now they are not.
Apr 03
parent reply Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Friday, 3 April 2026 at 14:58:52 UTC, Kapendev wrote:
 On Friday, 3 April 2026 at 14:51:00 UTC, Kapendev wrote:
 It should be the other way around if that ever becomes a 
 thing. The defaults being safe with a zero at the end and the 
 unsafe version requiring an explicit symbol thing.
Also imagine the confusion when people read old comments saying that D string literals are zero terminated, and now they are not.
If we apply that standard, Editions aren’t worth it. Almost every property of the D language has been talked about.

Maybe you missed it, but I wrote that if the literal is assigned to a char-type pointer, the NUL should be added. It’s not technically needed for safety because if you read over the buffer, that’s a you problem, but we should try to make C code that’s syntactically correct work correctly.

However, if you assign a string literal to a slice type first and later derive a pointer, that’s already `@system`. It has to be because you can’t know if the pointer can be dereferenced. A valid, non-null pointer that can’t be dereferenced is the one-past-the-end pointer of an array.

```d
void main() @safe
{
    const char* p = "abc"; // okay, safe
    string s = "abc";
    const char* q = s.ptr; // error, system operation in safe function
}
```

Essentially, if you `array.ptr` you say: I know what I’m doing. And if you don’t that’s on you. Moving to a newer Edition without learning what changes it brings, I repeat myself, that’s on you. Editions aren’t worth it if we can only add features that don’t require the programmer read into them.

If you decide to write `@system` code, you need to actually know the language well, and you will be expected to know that Edition 2027 which added `""z` literals made ordinary literals not NUL terminated. `@system` is an expert tool, and if that’s above your weight class, you need to step up your game.

If you don’t do C interop yourself, it won’t affect you.

If you do C interop, replacing all your `""` literals by `""z` literals is a simple regex replace away plus a linter configuration that keeps you from accidentally using `""`. However, I don’t think you pass many slice typed things to C functions. If you don’t use a linter on a big project, I’d say that’s on you as well.

I expected this to be controversial, but not to that degree.
Apr 13
parent reply Dejan Lekic <dejan.lekic gmail.com> writes:
On Monday, 13 April 2026 at 09:56:19 UTC, Quirin Schroll wrote:
 If you don’t do C interop yourself, it won’t affect you.

 If you do C interop, replacing all your `""` literals by `""z` 
 literals is a simple regex replace away plus a linter 
 configuration that keeps you from accidentally using `""`. 
 However, I don’t think you pass many slice typed things to C 
 functions. If you don’t use a linter on a big project, I’d say 
 that’s on you as well.

 I expected this to be controversial, but not to that degree.
Luckily this is just an edition so I do not need to use it. - I wonder who will? Is there a comprehensive document describing this decision (DIP perhaps)? I am curious to know what do you do when you use packageX that contains a function that takes a string - how do you know whether you pass it a new "" string or ""z string? You need to go and read the code to understand what that function does, right? Or ""z is not a "string" but some other type (zstring)?
Apr 13
parent Serg Gini <kornburn yandex.ru> writes:
On Monday, 13 April 2026 at 10:20:33 UTC, Dejan Lekic wrote:
 Luckily this is just an edition so I do not need to use it. - I 
 wonder who will?
it works for Rust :P https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2Fjsh156txovp91.png
Apr 13
prev sibling parent Dejan Lekic <dejan.lekic gmail.com> writes:
On Friday, 3 April 2026 at 14:51:00 UTC, Kapendev wrote:

 It should be the other way around if that ever becomes a thing. 
 The defaults being safe with a zero at the end and the unsafe 
 version requiring an explicit symbol thing.
Finally someone who uses brain... Imagine something like `i"foo"wz` in your code... D haters will feed on this, and they would be right this time.
Apr 03
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
I agree with everything you wrote, except:

 The worst part about D’s strings are auto-decoding
That's a Phobos-only feature, and will be dumped for the next Phobos.
 and that literals include a 
 secret zero past the end without you asking for it.
I think it's quite nice! It makes using string literals assigned to pointers (or implicitly converted to pointers) work seamlessly with C. Also,

```d
pragma(msg, "hello".length);
immutable string abc = "hello";
pragma(msg, abc.length);
```

prints:

5Lu
5LU

How is it a problem?
Apr 03
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 04/04/2026 6:27 AM, Walter Bright wrote:
 I agree with everything you wrote, except:
 
 The worst part about D’s strings are auto-decoding
That's a Phobos-only feature, and will be dumped for the next Phobos.
 and that literals include a secret zero past the end without you 
 asking for it.
I think it's quite nice! It makes using string literals assigned to pointers (or implicitly converted to pointers) work seamlessly with C. Also,

```d
pragma(msg, "hello".length);
immutable string abc = "hello";
pragma(msg, abc.length);
```

prints:

5Lu
5LU

How is it a problem?
Prefix string interning, can't optimize the text segment so only one copy of a given sequence of bytes is present.

As in: AB, ABC, ABCD
Only needs ABCD in text segment.

Whereas: AB\0, ABC\0, ABCD\0
Can't be optimized down to just ABCD\0.

Idk how much that matters, but that was the argument.
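
The prefix-sharing argument can be modelled by hand in C, as a rough sketch rather than a description of what any compiler or linker actually does. The `slice` type, the `pool` buffer, and the helpers `pool_prefix` and `slice_equals` are all hypothetical names made up for the illustration. With pointer-plus-length strings, AB, ABC and ABCD can all point at one stored copy of the bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-delimited string view; not a standard C type. */
typedef struct {
    const char *ptr;
    size_t len;
} slice;

/* The single stored copy of the bytes. C appends a NUL here,
   but the slices below never rely on it. */
static const char pool[] = "ABCD";

/* Returns a slice over the first n bytes of the shared pool. */
static slice pool_prefix(size_t n) {
    assert(n <= sizeof pool - 1); /* at most the 4 payload bytes */
    slice s = { pool, n };
    return s;
}

/* Compares a slice with a NUL-terminated C string by length and content. */
static int slice_equals(slice s, const char *z) {
    return strlen(z) == s.len && memcmp(s.ptr, z, s.len) == 0;
}
```

Here `pool_prefix(2)`, `pool_prefix(3)` and `pool_prefix(4)` stand for AB, ABC and ABCD and share the same starting address. A NUL-terminated AB could not live inside that buffer, because the byte after B has to be C rather than 0.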
Apr 03
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2026 4:32 PM, Richard (Rikki) Andrew Cattermole wrote:
 Prefix string interning, can't optimize the text segment so only one copy of a 
 given sequence of bytes is present.
 
 As in: AB, ABC, ABCD
 Only needs ABCD in text segment.
 
 Whereas: AB\0, ABC\0, ABCD\0
 Can't be optimized down to just ABCD\0.
 
 Idk how much that matters, but that was the argument.
You are correct, but in practice I cannot see it being an issue, as it is not worse in any way than existing C string literals.
Apr 06