digitalmars.D.learn - mixin template's alias parameter ... ignored ?
- someone (22/22) Jul 10 2021 ```d
- Ali Çehreli (20/25) Jul 10 2021 The only way that I know is to take a string parameter and use it with a...
- someone (29/31) Jul 10 2021 Yes, that I tried, but the structure has a lot of lines of codes
- someone (375/376) Jul 11 2021 Primarily to Ali & Steve for their help, be advised, this post
- ag0aep6g (28/62) Jul 11 2021 [...]> alias stringUTF16 = dstring; /// same as immutable(dchar)[];>
- someone (34/64) Jul 12 2021 I can't believe this one ... these lines were introduced almost a
- jfondren (56/69) Jul 12 2021 Local variables don't have a visibility in the sense of public or
- someone (29/101) Jul 12 2021 Some days ago I assumed scope was, as I previously stated, the
- Mike Parker (79/99) Jul 12 2021 DIPs are handled in this repository:
- Ali Çehreli (15/18) Jul 12 2021 I think you are doing it only for literal values but in general, casts
- someone (16/36) Jul 12 2021 nope, I'll never do such a downcast UNLESS I previously tested
- Ali Çehreli (22/54) Jul 12 2021 Cumbersome because one has to make sure existing casts are correct after...
- someone (9/55) Jul 13 2021 Hmmm ... I'll be reconsidering my cast usage approach then.
- ag0aep6g (17/38) Jul 12 2021 `scope` is not a visibility level.
- someone (32/66) Jul 12 2021 Well, that explains why it is not listed among the visibility
- Mike Parker (2/15) Jul 12 2021 Hopefully, my post above will shed some light on this.
- Mike Parker (28/35) Jul 12 2021 And I meant to add... local variables are by default visible only
- someone (10/45) Jul 12 2021 Yes. This one I understood from the beginning -it was on Ali's
- someone (10/11) Jul 12 2021 Yes Mike, a *lot*.
- ag0aep6g (16/28) Jul 12 2021 Yes. Let me rephrase and elaborate: I'm not sure what the current status...
- someone (508/539) Jul 13 2021 ACK. So for the time being I'll be reverting all my input
- Adam D Ruppe (5/17) Jul 11 2021 This creates a struct with the literal name `lstrStructureID`.
- Steven Schveighoffer (12/38) Jul 11 2021 when I've done this kind of stuff, what I usually do is:
- someone (4/15) Jul 11 2021 Thanks for your tip Steve, I ended with something similar, I'll
- zjh (2/5) Jul 11 2021 Could you explain more detail?
- Adam D Ruppe (3/4) Jul 11 2021 It is just normal code with a normal name. The fact there's
- someone (6/8) Jul 11 2021 As I mentioned in my previous reply to Ali this could be viable
```d
mixin template templateUGC (
   typeStringUTF,
   alias lstrStructureID
   ) {

   public struct lstrStructureID {

      typeStringUTF whatever;

   }

}

mixin templateUGC!(string, "gudtUGC08");
mixin templateUGC!(dstring, "gudtUGC16");
mixin templateUGC!(wstring, "gudtUGC32");

void main() {

   gudtUGC32 something; /// Error: undefined identifier `gudtUGC32`

}
```

I cannot manage to get this right; not even with:

```d
public struct mixin(lstrStructureID) { ... }
```

because the argument seems to require a complete statement.
Jul 10 2021
On 7/10/21 10:20 PM, someone wrote:
> mixin template templateUGC ( typeStringUTF, alias lstrStructureID ) {
>    public struct lstrStructureID {

The only way that I know is to take a string parameter and use it with a string mixin:

    mixin template templateUGC ( typeStringUTF, string lstrStructureID ) {
        mixin("public struct " ~ lstrStructureID ~ q{
            {
                typeStringUTF whatever;
            }
        });
    }

    mixin templateUGC!(string, "gudtUGC08");
    mixin templateUGC!(dstring, "gudtUGC16");
    mixin templateUGC!(wstring, "gudtUGC32");

    void main() {
        gudtUGC32 something;
    }

Ali
Jul 10 2021
On Sunday, 11 July 2021 at 05:54:48 UTC, Ali Çehreli wrote:
> The only way that I know is to take a string parameter and use it with a string mixin:

Yes, that I tried, but the structure has a lot of lines of code, so it is impractical and of course it would turn out difficult to debug. Since this seems to be a dead end, I reshuffled some things around:

```d
/// for illustration purposes only:

alias stringUTF08 = string;  /// = immutable(char )[];
alias stringUTF16 = dstring; /// = immutable(dchar)[];
alias stringUTF32 = wstring; /// = immutable(wchar)[];

alias stringUGC08 = gudtUGC!(stringUTF08);
alias stringUGC16 = gudtUGC!(stringUTF16);
alias stringUGC32 = gudtUGC!(stringUTF32);

public struct gudtUGC(typeStringUTF) {

   typeStringUTF whatever;

   ... lots of functions using typeStringUTF here

}

void main() {

   version (useUTF08) { stringUGC08 lugcSequence3 = stringUGC08(r"..."c); }
   version (useUTF16) { stringUGC16 lugcSequence3 = stringUGC16(r"..."d); }
   version (useUTF32) { stringUGC32 lugcSequence3 = stringUGC32(r"..."w); }

}
```

It works. Thanks Ali :) !
Jul 10 2021
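A possible middle ground between the two approaches above, as a sketch only: keep the struct body as an ordinary template (so it stays readable and debuggable) and use the string mixin for nothing more than a one-line alias. The names `UGC` and `DeclareUGC` are hypothetical and not from the thread; the 16/32 names are paired here with `wstring`/`dstring` respectively.

```d
// The body is written once, as ordinary code.
struct UGC(S) {
    S whatever;
    // functions using S would go here
}

// Only the one-line alias declaration is generated from the string.
mixin template DeclareUGC(S, string name) {
    mixin("alias " ~ name ~ " = UGC!S;");
}

mixin DeclareUGC!(string,  "gudtUGC08");
mixin DeclareUGC!(wstring, "gudtUGC16"); // wstring holds UTF-16 code units
mixin DeclareUGC!(dstring, "gudtUGC32"); // dstring holds UTF-32 code units

unittest {
    gudtUGC32 something;
    static assert(is(typeof(something.whatever) == dstring));
    static assert(is(gudtUGC08 == UGC!string));
}
```

Compiling with `dmd -main -unittest` should be enough to check it. Plain `alias` declarations, as in the post above, do the same job when the names do not need to come from strings.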
On Sunday, 11 July 2021 at 05:54:48 UTC, Ali Çehreli wrote:
> Ali

Primarily to Ali & Steve for their help, be advised, this post will be somewhat ... long.

Some bit of background to begin with: a week or so ago I posted asking advice on code safeness, and I still haven't replied to the ones that kindly answered. Seeing some replies, and encountering a code issue regarding string manipulation, I pretty soon figured out that I still did not have solid knowledge on many basic things regarding D, so I put the brakes on, went back to square one and started reading and researching some things a bit more ... slowly.

One of the things that struck me this week is that UniCode string manipulation in many cases is more complex than I previously thought, because there is no precise concept of what a character is in UniCode, at least not the way we are used to with plain-old-ASCII. After reading a lot about it (this was good: https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/) I learned of code-units, code-points, abstract-graphemes, grapheme-clusters, and the like. And I learned the inner details of the UTF encodings and that UTF-32 is best (almost required) for string processing (easier, faster, etc) and of course UTF-8 for definitive storage, and UTF-16 to the trashcan unless you need to interface with Windows (I was previously using UTF-8 within all my code for processing).

So, in order to manipulate a string, say, left(n), right(n), substr(n,m), ie: the usual stuff for many languages/libraries, I need to operate on grapheme-clusters and not on code-points and never ever on code-units, at least for unexpected text, ie: incoming text, user-input, etc, the things that we can not control beforehand.

Both primary D books, Andrei's and Ali's ones, as well as the D documentation, have plenty of examples but they are mainly focused on simple things like strings having nothing-out-of-the-ordinary.
They perform string manipulation mainly slicing the source string (ie: the char array) with the functions of std.range like take, takeOne, etc. I needed to set this things once-and-for-all for my code and thus I decided to build a grapheme-aware UDT that once instantiated with any given string will provide the usual string manipulation functions so I can forget the minutiae about them. The unittest at the bottom has many usage examples. The whole UDT needed to be templated for the three string types (string, dstring, wstring -and nothing else) and this was what produced this post to begin with. This issue was solved, not the way I liked to, but solved. The code works alas for something grapheme arrays (foreach always missing the last one). I ended up with the following (as usual advice/suggestions welcomed): ```d /// testing D on 2021-06~07 import std.algorithm : map, joiner; import std.array : array; import std.conv : to; import std.range : walkLength, take, tail, drop, dropBack; import std.stdio; import std.uni : Grapheme, byGrapheme; alias stringUGC = Grapheme; alias stringUGC08 = gudtUGC!(stringUTF08); alias stringUGC16 = gudtUGC!(stringUTF16); alias stringUGC32 = gudtUGC!(stringUTF32); alias stringUTF08 = string; /// same as immutable(char )[]; alias stringUTF16 = dstring; /// same as immutable(dchar)[]; alias stringUTF32 = wstring; /// same as immutable(wchar)[]; void main() {} //mixin templateUGC!(stringUTF08, r"gudtUGC08"w); /// if these main() //mixin templateUGC!(stringUTF16, r"gudtUGC16"w); //mixin templateUGC!(stringUTF32, r"gudtUGC32"w); //template templateUGC ( // typeStringUTF, // alias lstrStructureID // ) { public struct gudtUGC(typeStringUTF) { /// UniCode grapheme cluster‐aware string manipulation void popFront() { ++pintSequenceCurrent; } bool empty() { return pintSequenceCurrent == pintSequenceCount; } typeStringUTF front() { return toUTFtake(pintSequenceCurrent); } private stringUGC[] pugcSequence; private size_t pintSequenceCount = 
cast(size_t) 0; private size_t pintSequenceCurrent = cast(size_t) 0; property public size_t count() { return pintSequenceCount; } this(scope const typeStringUTF lstrSequence) { decode(lstrSequence); } safe public size_t decode( scope const typeStringUTF lstrSequence ) { scope size_t lintSequenceCount = cast(size_t) 0; if (lstrSequence is null) { pugcSequence = null; pintSequenceCount = cast(size_t) 0; pintSequenceCurrent = cast(size_t) 0; } else { pugcSequence = lstrSequence.byGrapheme.array; pintSequenceCount = pugcSequence.walkLength; pintSequenceCurrent = cast(size_t) 1; lintSequenceCount = pintSequenceCount; } return lintSequenceCount; } safe public typeStringUTF encode() { /// UniCode grapheme cluster to UniCode UTF‐encoded string scope typeStringUTF lstrSequence = null; if (pintSequenceCount >= cast(size_t) 1) { lstrSequence = pugcSequence .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF toUTFtake( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintStart, scope const size_t lintCount = cast(size_t) 1 ) { scope typeStringUTF lstrSequence = null; if (lintStart <= lintStart + lintCount) { scope size_t lintRange1 = lintStart - cast(size_t) 1; scope size_t lintRange2 = lintRange1 + lintCount; if (lintRange1 >= cast(size_t) 0 && lintRange2 <= pintSequenceCount) { lstrSequence = pugcSequence[lintRange1..lintRange2] .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } } return lstrSequence; } safe public typeStringUTF toUTFtakeL( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintCount ) { scope typeStringUTF lstrSequence = null; if (lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .take(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF toUTFtakeR( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintCount ) { scope typeStringUTF lstrSequence = 
null; if (lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .tail(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF toUTFchopL( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintCount ) { scope typeStringUTF lstrSequence = null; if (lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .drop(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF toUTFchopR( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintCount ) { scope typeStringUTF lstrSequence = null; if (lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .dropBack(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF toUTFpadL( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintCount, scope const typeStringUTF lstrPadding = cast(typeStringUTF) r" " ) { scope typeStringUTF lstrSequence = null; if (lintCount > pintSequenceCount) { lstrSequence = null; /// pending } return lstrSequence; } safe public typeStringUTF toUTFpadR( /// UniCode grapheme cluster to UniCode UTF‐encoded string scope const size_t lintCount, scope const typeStringUTF lstrPadding = cast(typeStringUTF) r" " ) { scope typeStringUTF lstrSequence = null; if (lintCount > pintSequenceCount) { lstrSequence = null; /// pending } return lstrSequence; } /* safe public gudtUGC(typeStringUTF) take( scope const size_t lintStart, scope const size_t lintCount = cast(size_t) 1 ) { /// the idea behind this new set of functions (returning a new object) is to enable the following one‐liner constructions: /// assert(lugcSequence3.take(35, 3).take(1,2).take(1,1).encode() == cast(stringUTF) r"日"); /// ooops … error: function declaration without return type. 
(Note that constructors are always named `this`) /// ooops … error: no identifier for declarator ` safe gudtUGC(typeStringUTF)` scope gudtUGC(typeStringUTF) lugcSequence; if (lintStart <= lintStart + lintCount) { scope size_t lintRange1 = lintStart - cast(size_t) 1; scope size_t lintRange2 = lintRange1 + lintCount; if (lintRange1 >= cast(size_t) 0 && lintRange2 <= pintSequenceCount) { lugcSequence = gudtUGC(typeStringUTF)(pugcSequence[lintRange1..lintRange2] .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ); } } return lugcSequence; }*/ } //} unittest { version (useUTF08) { scope stringUTF08 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"c; scope stringUTF08 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"c; scope stringUTF08 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"c; } version (useUTF16) { scope stringUTF16 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"d; scope stringUTF16 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"d; scope stringUTF16 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"d; } version (useUTF32) { scope stringUTF32 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"w; scope stringUTF32 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"w; scope stringUTF32 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"w; } scope size_t lintSequence1sizeUTF = lstrSequence1.length; scope size_t lintSequence2sizeUTF = lstrSequence2.length; scope size_t lintSequence3sizeUTF = lstrSequence3.length; scope size_t lintSequence1sizeUGA = lstrSequence1.walkLength; scope size_t lintSequence2sizeUGA = lstrSequence2.walkLength; scope size_t lintSequence3sizeUGA = lstrSequence3.walkLength; scope size_t lintSequence1sizeUGC = lstrSequence1.byGrapheme.walkLength; scope size_t lintSequence2sizeUGC = lstrSequence2.byGrapheme.walkLength; scope size_t lintSequence3sizeUGC = 
lstrSequence3.byGrapheme.walkLength; assert(lintSequence1sizeUGC == cast(size_t) 50); assert(lintSequence2sizeUGC == cast(size_t) 50); assert(lintSequence3sizeUGC == cast(size_t) 50); assert(lintSequence1sizeUGA == cast(size_t) 50); assert(lintSequence2sizeUGA == cast(size_t) 50); assert(lintSequence3sizeUGA == cast(size_t) 52); version (useUTF08) { assert(lintSequence1sizeUTF == cast(size_t) 50); assert(lintSequence2sizeUTF == cast(size_t) 60); assert(lintSequence3sizeUTF == cast(size_t) 91); } version (useUTF16) { assert(lintSequence1sizeUTF == cast(size_t) 50); assert(lintSequence2sizeUTF == cast(size_t) 50); assert(lintSequence3sizeUTF == cast(size_t) 52); } version (useUTF32) { assert(lintSequence1sizeUTF == cast(size_t) 50); assert(lintSequence2sizeUTF == cast(size_t) 50); assert(lintSequence3sizeUTF == cast(size_t) 57); } /// the following should be the same regardless of the encoding being used and is the whole point of this UDT being made: version (useUTF08) { alias stringUTF = stringUTF08; scope stringUGC08 lugcSequence3 = stringUGC08(lstrSequence3); } version (useUTF16) { alias stringUTF = stringUTF16; scope stringUGC16 lugcSequence3 = stringUGC16(lstrSequence3); } version (useUTF32) { alias stringUTF = stringUTF32; scope stringUGC32 lugcSequence3 = stringUGC32(lstrSequence3); } assert(lugcSequence3.encode() == lstrSequence3); assert(lugcSequence3.toUTFtake(21) == cast(stringUTF) r"р"); assert(lugcSequence3.toUTFtake(27) == cast(stringUTF) r"й"); assert(lugcSequence3.toUTFtake(35) == cast(stringUTF) r"日"); assert(lugcSequence3.toUTFtake(37) == cast(stringUTF) r"語"); assert(lugcSequence3.toUTFtake(21, 7) == cast(stringUTF) r"русский"); assert(lugcSequence3.toUTFtake(35, 3) == cast(stringUTF) r"日本語"); assert(lugcSequence3.toUTFtakeL(1) == cast(stringUTF) r"ä"); assert(lugcSequence3.toUTFtakeR(1) == cast(stringUTF) r"😎"); assert(lugcSequence3.toUTFtakeL(7) == cast(stringUTF) r"äëåčñœß"); assert(lugcSequence3.toUTFtakeR(16) == cast(stringUTF) r"日本語 = 
japanese 😎"); assert(lugcSequence3.toUTFchopL(10) == cast(stringUTF) r"russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"); assert(lugcSequence3.toUTFchopR(21) == cast(stringUTF) r"äëåčñœß … russian = русский 🇷🇺"); version (useUTF08) { scope stringUTF08 lstrSequence3reencoded; } version (useUTF16) { scope stringUTF16 lstrSequence3reencoded; } version (useUTF32) { scope stringUTF32 lstrSequence3reencoded; } for ( size_t lintSequenceUGC = cast(size_t) 1; lintSequenceUGC <= lintSequence3sizeUGC; ++lintSequenceUGC ) { lstrSequence3reencoded ~= lugcSequence3.toUTFtake(lintSequenceUGC); } assert(lstrSequence3reencoded == lstrSequence3); lstrSequence3reencoded = null; version (useUTF08) { foreach (stringUTF08 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } } version (useUTF16) { foreach (stringUTF16 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } } version (useUTF32) { foreach (stringUTF32 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } } assert(lstrSequence3reencoded == lstrSequence3); /// ooops … } ```
Jul 11 2021
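On the "foreach always missing the last one" problem: in the posted struct, `pintSequenceCurrent` starts at 1 while `empty` returns true as soon as it equals `pintSequenceCount`, which appears to end the iteration one element early. A minimal sketch of the range primitives with a zero-based cursor follows; `GraphemeCursor` and `countVisited` are hypothetical names used only for illustration.

```d
import std.array : array;
import std.uni : Grapheme, byGrapheme;

struct GraphemeCursor {
    private Grapheme[] items;
    private size_t current; // zero-based index of the next element

    this(string text) { items = text.byGrapheme.array; }

    // Exhausted only once the cursor has moved past the last element.
    @property bool empty() const { return current >= items.length; }
    @property ref Grapheme front() { return items[current]; }
    void popFront() { ++current; }
}

// Counts how many elements a foreach over the cursor actually visits.
size_t countVisited(string text) {
    size_t seen;
    foreach (g; GraphemeCursor(text)) ++seen;
    return seen;
}

unittest {
    assert(countVisited("") == 0);
    // 'e' plus a combining acute accent forms one cluster, 'x' the second.
    assert(countVisited("e\u0301x") == 2);
}
```

With a one-based cursor, the equivalent condition would be `current > count` rather than `current == count`.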
On 12.07.21 03:37, someone wrote:
> I ended up with the following (as usual advice/suggestions welcomed):
> [...]
> alias stringUTF16 = dstring; /// same as immutable(dchar)[];
> alias stringUTF32 = wstring; /// same as immutable(wchar)[];

Bug: You mixed up `wstring` and `dstring`. `wstring` is UTF-16. `dstring` is UTF-32.

> [...]
> public struct gudtUGC(typeStringUTF) { /// UniCode grapheme cluster‐aware string manipulation

Style: `typeStringUTF` is a type, so it should start with a capital letter (`TypeStringUTF`).

> [...]
> private size_t pintSequenceCount = cast(size_t) 0;
> private size_t pintSequenceCurrent = cast(size_t) 0;

Style: There's no need for the casts (throughout).

> [...]
> safe public typeStringUTF encode() { /// UniCode grapheme cluster to UniCode UTF‐encoded string
> scope typeStringUTF lstrSequence = null;
> [...]
> return lstrSequence;
> }

Bug: `scope` makes no sense if you want to return `lstrSequence` (throughout).

> safe public typeStringUTF toUTFtake( /// UniCode grapheme cluster to UniCode UTF‐encoded string
> scope const size_t lintStart,
> scope const size_t lintCount = cast(size_t) 1
> ) {

Style: `scope` does nothing on `size_t` parameters (throughout).

> [...]
> if (lintStart <= lintStart + lintCount) {
> [...]
> scope size_t lintRange1 = lintStart - cast(size_t) 1;

Possible bug: Why subtract 1?

> scope size_t lintRange2 = lintRange1 + lintCount;
> if (lintRange1 >= cast(size_t) 0 && lintRange2 <= pintSequenceCount) {

Style: The first half of that condition is pointless. `lintRange1` is unsigned, so it will always be greater than or equal to 0. If you want to defend against overflow, you have to do it before subtracting.

> [...]
> safe public typeStringUTF toUTFpadL( /// UniCode grapheme cluster to UniCode UTF‐encoded string
> scope const size_t lintCount,
> scope const typeStringUTF lstrPadding = cast(typeStringUTF) r" "

Style: Cast is not needed (throughout).

> ) {
> [...]
Jul 11 2021
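The `wstring`/`dstring` mix-up above can be checked at compile time. The short sketch below only uses built-in types; the emoji literal is merely a convenient character outside the Basic Multilingual Plane.

```d
// The three built-in string types and their code units.
static assert(is(string  == immutable(char)[]));  // UTF-8
static assert(is(wstring == immutable(wchar)[])); // UTF-16
static assert(is(dstring == immutable(dchar)[])); // UTF-32
static assert(wchar.sizeof == 2 && dchar.sizeof == 4);

unittest {
    // .length counts code units, so the same character gives different lengths.
    assert("😎".length  == 4); // four UTF-8 bytes
    assert("😎"w.length == 2); // one UTF-16 surrogate pair
    assert("😎"d.length == 1); // one UTF-32 code unit
}
```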
On Monday, 12 July 2021 at 05:33:22 UTC, ag0aep6g wrote:
> Bug: You mixed up `wstring` and `dstring`. `wstring` is UTF-16. `dstring` is UTF-32.

I can't believe this one ... these lines were introduced almost a week ago LoL !

> Style: `typeStringUTF` is a type, so it should start with a capital letter (`TypeStringUTF`).

Style is a personal preference; I am not following D style conventions (if any) nor do I follow any other language style conventions; I have my personal style and I apply it everywhere, I think it is not important which style you use, what is important in the end is that you adhere to your chosen style all the time -unless, of course, you are contributing to x project which states its own style and then there's no choice but to follow it.

> private size_t pintSequenceCount = cast(size_t) 0;
> private size_t pintSequenceCurrent = cast(size_t) 0;
>
> Style: There's no need for the casts (throughout).

I know. I do these primarily because of muscle memory and secondly because I try to write code thinking someone not knowing the language details may be porting it later so I tend to state the obvious; besides, it won't hurt, and it helps me in many ways.

> safe public typeStringUTF encode() {
> scope typeStringUTF lstrSequence = null;
> [...]
> return lstrSequence;
> }
>
> Bug: `scope` makes no sense if you want to return `lstrSequence` (throughout).

Teach me please: if I declare a variable right after the function declaration like this one ... ain't scope its default visibility ? I understand (not quite sure whether correct or not right now) that everything you declare without explicitly stating its visibility (public/private/whatever) becomes scope ie: what in many languages are called a local variable. What actually is the visibility of lstrSequence without my scope declaration ?

> safe public typeStringUTF toUTFtake(
> scope const size_t lintStart,
> scope const size_t lintCount = cast(size_t) 1
> ) {
>
> Style: `scope` does nothing on `size_t` parameters (throughout).

A week ago I was using [in] almost everywhere for parameters, ain't [in] an alias for [scope const] ? Did I get it wrong ? I'm not talking style here, I'm talking unexpected (to me) functionality.

> scope size_t lintRange1 = lintStart - cast(size_t) 1;
> scope size_t lintRange2 = lintRange1 + lintCount;
>
> Possible bug: Why subtract 1?

Because ranges are zero-based for their first argument and one-based for their second; ie: something[n..m] where m should always be one beyond the one we want.

> if (lintRange1 >= cast(size_t) 0 && lintRange2 <= pintSequenceCount) {
>
> Style: The first half of that condition is pointless. `lintRange1` is unsigned, so it will always be greater than or equal to 0. If you want to defend against overflow, you have to do it before subtracting.

Indeed. Refactored the code (previously were int parameters) and got stuck in the wrong place !

All in all, thank you very much for your detailed reply, this kind of stuff is what helps me most understanding the language nuances :)
Jul 12 2021
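To illustrate the point about guarding before the subtraction: with `size_t`, `0 - 1` wraps around to `size_t.max`, so a `>= 0` test afterwards can never fail. The helper below is a hypothetical sketch (the name `toSliceBounds` and its signature are assumptions, not part of the posted code) that converts a one-based start and a count into zero-based `[lo .. hi)` bounds without ever underflowing or overflowing.

```d
/// Converts a one-based start plus a count into zero-based slice bounds.
/// Returns false (leaving lo and hi at 0) when the request does not fit.
bool toSliceBounds(size_t start, size_t count, size_t length,
                   out size_t lo, out size_t hi) {
    if (start == 0) return false;          // check BEFORE subtracting
    immutable first = start - 1;           // safe: start >= 1
    if (first > length) return false;      // start lies beyond the data
    if (count > length - first) return false; // avoids computing first + count too early
    lo = first;
    hi = first + count;
    return true;
}

unittest {
    size_t lo, hi;
    assert(toSliceBounds(1, 3, 5, lo, hi) && lo == 0 && hi == 3);
    assert(!toSliceBounds(0, 1, 5, lo, hi)); // would have wrapped to size_t.max
    assert(!toSliceBounds(5, 2, 5, lo, hi)); // runs past the end
}
```

A caller could then slice with `data[lo .. hi]` only when the function returns true.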
On Monday, 12 July 2021 at 22:35:27 UTC, someone wrote:
> > Bug: `scope` makes no sense if you want to return `lstrSequence` (throughout).
>
> Teach me please: if I declare a variable right after the function declaration like this one ... ain't scope its default visibility ? I understand (not quite sure whether correct or not right now) that everything you declare without explicitly stating its visibility (public/private/whatever) becomes scope ie: what in many languages are called a local variable. What actually is the visibility of lstrSequence without my scope declaration ?

Local variables don't have a visibility in the sense of public or private. They do have a 'scope' in the general computer science sense, and a variable can be said to be in or out of scope at different points in a program, but this is the case without regard for whether the variable is declared with D's `scope`. What `scope` says is https://dlang.org/spec/attribute.html#scope

> For local declarations, scope ... means that the destructor for an object is automatically called when the reference to it goes out of scope.

The value of a normal, non-scope local variable has a somewhat indefinite lifetime: you have to examine the program and think about operations on the variable to be sure about that lifetime. Does it survive the function? Might it die even before the function completes? Does it live until the next GC collection or until the program ends? These are questions you can ask. For a `scope` variable, the lifetime of its value ends with the scope of the variable.

Consider:

```d
import std.stdio : writeln, writefln;
import std.conv : to;
import core.memory : pureMalloc, pureFree;

class Noisy {
    static int ids;
    int* id;
    this() {
        id = cast(int*) pureMalloc(int.sizeof);
        *id = ids++;
    }
    ~this() {
        writefln!"[%d] I perish."(*id);
        pureFree(id);
    }
}

Noisy f() {
    scope n = new Noisy;
    return n;
}

void main() {
    scope a = f();
    writeln("Checking a.n...");
    writefln!"a.n = %d"(*a.id);
}
```

Which has this output on my system:

```d
[0] I perish.
Checking a.n...
Error: program killed by signal 11
```

Or with -preview=dip1000, this dmd output:

```d
Error: scope variable `n` may not be returned
```

The lifetime of the Noisy object bound by `scope n` is the same as the scope of the variable, and the variable goes out of scope when the function returns, so the Noisy object is destructed at that point.
Jul 12 2021
On Monday, 12 July 2021 at 23:18:57 UTC, jfondren wrote:
> Local variables don't have a visibility in the sense of public or private. [...]

Some days ago I assumed scope was, as I previously stated, the local default scope, and explicitly added scope to all my *local* variables. Soon afterward I encountered a situation which gave me the "program killed by signal 11" which I did not fully-understand why it was happening at all, because it never occurred to me it was connected to my previous scope refactor. Now I understand.

Regarding -preview=dip1000 (and the explicit error description that could have helped me a lot back then) : DMD man page says the preview switch lists upcoming language features, so DIP1000 is something like a D proposal as I glanced somewhere sometime ago ... where do DIPs get listed (docs I mean) ?

So, every *local* variable within a chunk of code, say, a function, should be declared without anything else to avoid this type of behavior ? I mean, anything in code that it is not private/public/etc.

Or, as I presume, every *local* meaning *aux* variable that won't need to survive the function should be declared scope but *not* the one we are returning ... lstrSequence in my specific case ?

Can I declare everything *scope* within and on the last line using lstrSequence.dup instead ? dup/idup duplicates the variable (the first allowing mutability while the second not) right ?

Which one of the following approaches do you consider best practice if you were directed to explicitly state as much behavior as possible ?

Your reply with this example included was very illustrating to me -right to the point. Thanks a lot for your time :) !
Jul 12 2021
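On the `dup`/`idup` question: both copy the array contents, `dup` yielding a mutable copy and `idup` an immutable one. The sketch below shows the difference; `replaceFirst` is a hypothetical helper written only for this illustration. Whether such a copy is needed on return at all is a separate matter: a non-`scope` local that holds the result can be returned directly.

```d
/// Returns a copy of `text` with its first code unit replaced by `c`.
string replaceFirst(string text, char c) {
    char[] editable = text.dup;        // mutable copy; `text` itself is untouched
    if (editable.length) editable[0] = c;
    return editable.idup;              // immutable copy, safe to hand out
}

unittest {
    string original = "abc";
    assert(replaceFirst(original, 'x') == "xbc");
    assert(original == "abc");         // the source string did not change
    assert(replaceFirst("", 'x') == "");
}
```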
On Monday, 12 July 2021 at 23:45:57 UTC, someone wrote:Regarding -preview=dip1000 (and the explicit error description that could have helped me a lot back then) : DMD man page says the preview switch lists upcoming language features, so DIP1000 is something like a D proposal as I glanced somewhere sometime ago ... where do DIPs get listed (docs I mean) ?DIPs are handled in this repository: https://github.com/dlang/DIPs This is a list of every DIP that is going through or has gone through the review process: https://github.com/dlang/DIPs/blob/master/DIPs/README.md DIP1000 is here: https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1000.md But it doesn't describe the actual implementation, as described here: https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1000.md#addendum I don't know what all the differences are, as I haven't followed it.So, every *local* variable within a chunk of code, say, a function, should be declared without anything else to avoid this type of behavior ? I mean, anything in code that it is not private/public/etc.Not "without anything", but without scope---unless you're using -preview=dip1000, or unless you're applying it to class references (see below).Or, as I presume, every *local* meaning *aux* variable that won't need to survive the function should be declared scope but *not* the one we are returning ... lstrSequence in my specific case ? Can I declare everything *scope* within and on the last line using lstrSequence.dup instead ? dup/idup duplicates the variable (the first allowing mutability while the second not) right ? 
Which one of the following approaches do you consider best practice if you were directed to explicitly state as much behavior as possible ?Consider this example, which demonstrates the original purpose of scope prior to DIP 1000: ```d import std.stdio; class C { int id; this(int id) { this.id = id; } } struct S { int id; this(int id) { this.id = id; } } void main() { { C c1 = new C(1); scope c2 = new C(2); S s1 = S(1); S* s2 = new S(2); scope s3 = new S(3); writeln("The inner scope is exiting now."); } writeln("Main is exiting now."); } static ~this() { writeln("The GC will cleanup after this point."); } ``` Classes are reference types and must be allocated. c1 is allocated on the GC and lives beyond its scope. By applying the scope attribute to c2, its destructor is forced to execute when its scope exits. It is not allocated on the GC, but on the stack. Structs are value types, so s1 is automatically allocated on the stack. Its destructor will be always be called when the scope exits. s2 is a pointer allocated on the GC heap, so its lifetime is managed by the GC and it exists beyond its scope. s3 is also of type S*. The scope attribute has no effect on it, and it is still managed by the GC. If you want stack allocation and RAII destructors for structs, you just use the default behavior like s1. You can run it here: https://run.dlang.io/is/iu7QiO Someone else will have to explain what DIP 1000 actually does right now (if anyone really knows). What I'm certain about is that it prevents things like this: ```d void func() { int i = 10; int* pi = &i; return pi; ``` The compiler has always raised an error when it encountered something like `return &i`, but the above would slip by. With -preview=dip1000, that is also an error. But scope isn't needed on either variable for it to do so. Beyond that, my knowledge of DIP 1000's implementation is limited. But I do know that scope has no effect on variables with no indirections. 
It's all about indirections (pointers & references). At any rate, DIP 1000 is not yet ready for prime time. Getting it to that state is a current priority of the language maintainers. So for now, you probably just shouldn't worry about scope at all.
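
On the `dup`/`idup` part of the question: a minimal sketch, assuming a made-up helper named `replaceFirst` (it is not from the module discussed in this thread). `.dup` yields a mutable copy of an array and `.idup` yields an immutable one, so the original string is never touched:

```d
import std.stdio;

// Hypothetical helper, for illustration only.
string replaceFirst(string s, char c) {
    if (s.length == 0) return s;
    char[] m = s.dup;   // mutable copy of the immutable characters
    m[0] = c;           // allowed: m owns its own data
    return m.idup;      // immutable copy that is safe to hand out
}

void main() {
    string original = "abc";
    string changed = replaceFirst(original, 'X');
    assert(original == "abc"); // the source string is unchanged
    assert(changed == "Xbc");
    writeln(original, " -> ", changed);
}
```

Note that this works on UTF-8 code units, which is fine for the ASCII input used here and only for that.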
Jul 12 2021
On 7/12/21 3:35 PM, someone wrote:

>>> private size_t pintSequenceCurrent = cast(size_t) 0;
>>
>> Style: There's no need for the casts (throughout).
>
> [...] besides, it won't hurt, and it helps me in many ways.

I think you are doing it only for literal values but in general, casts can be very cumbersome and harmful. For example, if we change the parameter from 'int' to 'long', the cast in the function body is a bug to be chased and fixed:

    // Used to be 'int arg'
    void foo(long arg) {
        // ...
        auto a = cast(int)arg; // BUG?
        // ...
    }

    void main() {
        foo(long.max);
    }

Ali
Jul 12 2021
On Monday, 12 July 2021 at 23:25:13 UTC, Ali Çehreli wrote:

> On 7/12/21 3:35 PM, someone wrote:
>>>> private size_t pintSequenceCurrent = cast(size_t) 0;
>>>
>>> Style: There's no need for the casts (throughout).
>>
>> [...] besides, it won't hurt, and it helps me in many ways.
>
> I think you are doing it only for literal values but in general, casts can be very cumbersome and harmful.

Cumbersome and harmful ... could you explain ?

> For example, if we change the parameter from 'int' to 'long', the cast in the function body is a bug to be chased and fixed:
>
>     // Used to be 'int arg'
>     void foo(long arg) {
>         // ...
>         auto a = cast(int)arg; // BUG?
>         // ...
>     }

nope, I'll never do such a downcast UNLESS I previously tested with if () {} for proper int range; I use cast a lot, but this is mainly because I am used to strongly-typed languages etc etc, for example if for whatever reason I have to:

    ushort a = 250;
    ubyte b = cast(ubyte) a;

I'll do:

    ushort a = 250;
    ubyte b = cast(ubyte) 0; /// redundant of course; but we don't have nulls in D for ints so this is muscle-memory
    if (a <= 255) { /// or ubyte.max instead of 255 (I think it is possible)
        b = cast(ubyte) a;
    }

>     void main() {
>         foo(long.max);
>     }
>
> Ali
Jul 12 2021
On 7/12/21 5:42 PM, someone wrote:

> On Monday, 12 July 2021 at 23:25:13 UTC, Ali Çehreli wrote:
>> On 7/12/21 3:35 PM, someone wrote:
>>>>> private size_t pintSequenceCurrent = cast(size_t) 0;
>>>>
>>>> Style: There's no need for the casts (throughout).
>>>
>>> [...] besides, it won't hurt, and it helps me in many ways.
>>
>> I think you are doing it only for literal values but in general, casts can be very cumbersome and harmful.
>
> Cumbersome and harmful ... could you explain ?

Cumbersome because one has to make sure existing casts are correct after changing a type. Harmful because it bypasses the compiler's type checking.

>> For example, if we change the parameter from 'int' to 'long', the cast in the function body is a bug to be chased and fixed:
>>
>>     // Used to be 'int arg'
>>     void foo(long arg) {
>>         // ...
>>         auto a = cast(int)arg; // BUG?
>>         // ...
>>     }
>
> nope, I'll never do such a downcast

The point was, nobody did a downcast in that code. The original parameter was 'int' so cast(int) was "correct" initially. Then somebody changed the parameter to "long" and the cast became potentially harmful.

> UNLESS I previously tested with if () {} for proper int range; I use cast a lot, but this is mainly because I am used to strongly-typed languages etc etc,

Hm. I am used to strongly-typed languages as well and that's exactly why I *avoid* casts as much as possible. :)

> for example if for whatever reason I have to:
>
>     ushort a = 250;
>     ubyte b = cast(ubyte) a;
>
> I'll do:
>
>     ushort a = 250;
>     ubyte b = cast(ubyte) 0; /// redundant of course; but we don't have

We have a different way of looking at this. :) My first preference would be:

    ubyte b;

This alternative has less typing than your method and is easier to change the code because 'ubyte' appears only in one place. (DRY principle.)

    auto b = ubyte(0);

Another alternative:

    auto b = ubyte.init;

Ali
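
The three cast-free spellings in this post can be checked side by side. This is only a small sketch; the variable names `b1`, `b2` and `b3` and the helper `zeroes` are invented for the comparison:

```d
// Returns the three zero values so they can be compared in one place.
ubyte[3] zeroes() {
    ubyte b1;              // default-initialized to ubyte.init, i.e. 0
    auto b2 = ubyte(0);    // the type is spelled once, on the right
    auto b3 = ubyte.init;  // same value, stated explicitly

    // No cast is involved, yet all three are of type ubyte.
    static assert(is(typeof(b1) == ubyte));
    static assert(is(typeof(b2) == ubyte));
    static assert(is(typeof(b3) == ubyte));
    return [b1, b2, b3];
}

void main() {
    assert(zeroes() == [0, 0, 0]);
}
```

Because `ubyte` is written only once per declaration, changing the type later means editing a single token, which is the DRY argument made above.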
Jul 12 2021
On Tuesday, 13 July 2021 at 05:26:56 UTC, Ali Çehreli wrote:

> Cumbersome because one has to make sure existing casts are correct after changing a type.

ACK.

> Harmful because it bypasses the compiler's type checking.

Hmmm ... I'll be reconsidering my cast usage approach then.

>>> For example, if we change the parameter from 'int' to 'long', the cast in the function body is a bug to be chased and fixed:
>>>
>>>     // Used to be 'int arg'
>>>     void foo(long arg) {
>>>         // ...
>>>         auto a = cast(int)arg; // BUG?
>>>         // ...
>>>     }
>>
>> nope, I'll never do such a downcast
>
> The point was, nobody did a downcast in that code. The original parameter was 'int' so cast(int) was "correct" initially. Then somebody changed the parameter to "long" and the cast became potentially harmful.

ACK.

>> UNLESS I previously tested with if () {} for proper int range; I use cast a lot, but this is mainly because I am used to strongly-typed languages etc etc,
>
> Hm. I am used to strongly-typed languages as well and that's exactly why I *avoid* casts as much as possible. :)

ACK. I'll be revisiting the whole matter. I just re-read your http://ddili.org/ders/d.en/cast.html chapter. I did not have a clear understanding between the difference of to!(...) and cast() for example; and, re-reading integer promotion and arithmetic conversions refreshed my knowledge at this point.

>> for example if for whatever reason I have to:
>>
>>     ushort a = 250;
>>     ubyte b = cast(ubyte) a;
>>
>> I'll do:
>>
>>     ushort a = 250;
>>     ubyte b = cast(ubyte) 0; /// redundant of course; but we don't have
>
> We have a different way of looking at this. :) My first preference would be:
>
>     ubyte b;
>
> This alternative has less typing than your method and is easier to change the code because 'ubyte' appears only in one place. (DRY principle.)
>
>     auto b = ubyte(0);
>
> Another alternative:
>
>     auto b = ubyte.init;
>
> Ali
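
For the difference between `to!(...)` and `cast()` mentioned here, a short sketch may help. The function names `narrowChecked` and `narrowUnchecked` are made up for this comparison; `std.conv.to` checks the range at run time and throws `ConvOverflowException`, while `cast` silently keeps the low bits:

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// Hypothetical helpers, for illustration only.
ubyte narrowChecked(ushort a)   { return a.to!ubyte; }      // range-checked
ubyte narrowUnchecked(ushort a) { return cast(ubyte) a; }   // truncates

void main() {
    // In range: both agree.
    assert(narrowChecked(250) == 250);
    assert(narrowUnchecked(250) == 250);

    // Out of range: to! refuses, cast wraps around (300 - 256 == 44).
    assertThrown!ConvOverflowException(narrowChecked(300));
    assert(narrowUnchecked(300) == 44);

    // The 'int' to 'long' scenario from this thread: the cast keeps compiling.
    assert(cast(int) long.max == -1);
}
```

With `to!`, the manual `if (a <= ubyte.max)` guard is no longer needed, and a later change of the source type cannot silently turn the conversion into a truncation.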
Jul 13 2021
On Monday, 12 July 2021 at 22:35:27 UTC, someone wrote:

> On Monday, 12 July 2021 at 05:33:22 UTC, ag0aep6g wrote:

[...]

> Teach me please: if I declare a variable right after the function declaration like this one ... ain't scope its default visibility ? I understand (not quite sure whether correct or not right now) that everything you declare without explicitly stating its visibility (public/private/whatever) becomes scope ie: what in many languages are called a local variable. What actually is the visibility of lstrSequence without my scope declaration ?

`scope` is not a visibility level.

`lstrSequence` is local to the function, so visibility (`public`, `private`, ...) doesn't even apply.

Most likely, you don't have any use for `scope` at the moment. You're obviously not compiling with `-preview=dip1000`. And neither should you, because the feature is not ready for a general audience yet.

[...]

>> Style: `scope` does nothing on `size_t` parameters (throughout).
>
> A week ago I was using [in] almost everywhere for parameters, ain't [in] an alias for [scope const] ? Did I get it wrong ? I'm not talking style here, I'm talking unexpected (to me) functionality.

I'm not sure where we stand with `in`, but let's say that it means `scope const`. The `scope` part of `scope const` still does nothing to a `size_t`. These are all the same: `in size_t`, `const size_t`, `scope const size_t`.

>>> scope size_t lintRange1 = lintStart - cast(size_t) 1;
>>> scope size_t lintRange2 = lintRange1 + lintCount;
>>
>> Possible bug: Why subtract 1?
>
> Because ranges are zero-based for their first argument and one-based for their second; ie: something[n..m] where m should always be one-beyond than the one we want.

That doesn't make sense. A length of zero is perfectly fine. It's just an empty range. You're making `lintStart` one-based for no reason.
Jul 12 2021
On Monday, 12 July 2021 at 23:28:29 UTC, ag0aep6g wrote:

> `scope` is not a visibility level.

Well, that explains why it is not listed among the visibility attributes to begin with -something that at first glance seemed weird to me.

> `lstrSequence` is local to the function, so visibility (`public`, `private`, ...) doesn't even apply.

Being *local* to ... ain't imply visibility too regardless scope not being a visibility attribute ? I mean, scope is restricting the variable to be leaked outside the function/whatever and to me it seems like restricted to be seen from the outside. *Please note* that I am not making an argument against the implementation, I am just trying to understand why it is not being classified as another visibility attribute given that more-or-less has the same concept as a local variable like in other languages.

> Most likely, you don't have any use for `scope` at the moment.

Almost sure if you say so given your vast knowledge of D against my humble first steps LoL.

> You're obviously not compiling with `-preview=dip1000`.

Nope. I didn't know it even existed.

> And neither should you, because the feature is not ready for a general audience yet.

ACK.

> [...]
>>> Style: `scope` does nothing on `size_t` parameters (throughout).
>>
>> A week ago I was using [in] almost everywhere for parameters, ain't [in] an alias for [scope const] ? Did I get it wrong ? I'm not talking style here, I'm talking unexpected (to me) functionality.
>
> I'm not sure where we stand with `in`

You mean *we* = D developers ?

> but let's say that it means `scope const`

This I stated because I read it somewhere in the docs, it was not my assumption.

> The `scope` part of `scope const` still does nothing to a `size_t`. These are all the same:
>
>     in size_t
>     const size_t
>     scope const size_t

OK. Specifically to integers nothing then. But, what about strings and whatever else ? I put them more-or-less as a general rule or so was the idea when I replaced the in's in the parameters app-wide.

>>>> scope size_t lintRange1 = lintStart - cast(size_t) 1;
>>>> scope size_t lintRange2 = lintRange1 + lintCount;
>>>
>>> Possible bug: Why subtract 1?
>>
>> Because ranges are zero-based for their first argument and one-based for their second; ie: something[n..m] where m should always be one-beyond than the one we want.
>
> That doesn't make sense. A length of zero is perfectly fine. It's just an empty range. You're making `lintStart` one-based for no reason.

For a UDT like mine I think it has a lot of sense because when I think of a string and I want to chop/count/whatever on it my mind works one-based not zero-based. Say "abc" needs b my mind works a lot easier mid("abc", 2, 1) than mid("abc", 1, 1) and besides I am *not* returning a range or a reference slice to a range or whatever I am returning a whole new string construction. If I would be returning a range I will follow common sense since I don't know what will be done thereafter of course.
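
To make the two conventions concrete, here is a rough sketch. The function `mid` below is a guess at the shape of the one-based helper described in this post, not its actual implementation, and it indexes UTF-8 code units rather than grapheme clusters, so it only suits the ASCII input shown:

```d
// Hypothetical one-based wrapper around D's zero-based, half-open slicing.
string mid(string s, size_t start, size_t count) {
    if (start < 1 || start - 1 > s.length) return null;
    immutable lo = start - 1;                 // one-based to zero-based
    if (count > s.length - lo) return null;   // would run past the end
    return s[lo .. lo + count];               // [lo, lo + count)
}

void main() {
    // Native slicing: first index inclusive, second index exclusive.
    assert("abc"[1 .. 2] == "b");
    assert("abc"[1 .. 1].length == 0);   // an empty slice is perfectly valid

    // The one-based view of the same operation.
    assert(mid("abc", 2, 1) == "b");
    assert(mid("abc", 1, 3) == "abc");
    assert(mid("abc", 3, 2) is null);    // out of bounds
}
```

The `start - 1` translation is the single place where the two conventions meet, which is also where an off-by-one slip is most likely.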
Jul 12 2021
On Tuesday, 13 July 2021 at 01:03:11 UTC, someone wrote:

> Being *local* to ... ain't imply visibility too regardless scope not being a visibility attribute ? I mean, scope is restricting the variable to be leaked outside the function/whatever and to me it seems like restricted to be seen from the outside. *Please note* that I am not making an argument against the implementation, I am just trying to understand why it is not being classified as another visibility attribute given that more-or-less has the same concept as a local variable like in other languages.

> OK. Specifically to integers nothing then. But, what about strings and whatever else ? I put them more-or-less as a general rule or so was the idea when I replaced the in's in the parameters app-wide.

Hopefully, my post above will shed some light on this.
Jul 12 2021
On Tuesday, 13 July 2021 at 02:22:46 UTC, Mike Parker wrote:

> On Tuesday, 13 July 2021 at 01:03:11 UTC, someone wrote:
>> Being *local* to ... ain't imply visibility too regardless scope not being a visibility attribute ? I mean, scope is restricting the variable to be leaked outside the function/whatever and to me it seems like restricted to be seen from the outside.

And I meant to add... local variables are by default visible only inside the scope in which they are declared and, by extension, any inner scopes within that scope, and can never be visible outside.

```d
{
    // Scope A
    // x can never be visible here
    {
        // Scope B
        int x;
        {
            // Scope C
            // x is visible here
        }
    }
}
```

The only possible use for your concept of scope applying to visibility would be to prevent x from being visible in Scope C. But since we already have the private attribute, it would make more sense to use that instead, e.g., `private int x` would not be visible in scope C.

I don't know of any language that has that kind of feature, or if it would even be useful. But at any rate, there's no need for a visibility attribute to prevent outer scopes from seeing a local variable, as that's already impossible.
Jul 12 2021
On Tuesday, 13 July 2021 at 02:34:07 UTC, Mike Parker wrote:

> On Tuesday, 13 July 2021 at 02:22:46 UTC, Mike Parker wrote:
>> On Tuesday, 13 July 2021 at 01:03:11 UTC, someone wrote:
>>> Being *local* to ... ain't imply visibility too regardless scope not being a visibility attribute ? I mean, scope is restricting the variable to be leaked outside the function/whatever and to me it seems like restricted to be seen from the outside.
>
> And I meant to add... local variables are by default visible only inside the scope in which they are declared and, by extension, any inner scopes within that scope, and can never be visible outside.
>
> ```d
> {
>     // Scope A
>     // x can never be visible here
>     {
>         // Scope B
>         int x;
>         {
>             // Scope C
>             // x is visible here
>         }
>     }
> }
> ```

Yes. This one I understood from the beginning -it was on Ali's book and previously I remember seeing it in Andrei's one too IIRC.

http://ddili.org/ders/d.en/name_space.html

The thing that I supposed started my confusion was the lack of a statement for it, nothing more; something like:

    whatever int x;

... it was more of form than concept.

> The only possible use for your concept of scope applying to visibility would be to prevent x from being visible in Scope C. But since we already have the private attribute, it would make more sense to use that instead, e.g., `private int x` would not be visible in scope C.

No. My concept is/was the same that the one above. It was form not function.

> I don't know of any language that has that kind of feature, or if it would even be useful. But at any rate, there's no need for a visibility attribute to prevent outer scopes from seeing a local variable, as that's already impossible.

Me neither.
Jul 12 2021
On Tuesday, 13 July 2021 at 02:22:46 UTC, Mike Parker wrote:

> Hopefully, my post above will shed some light on this.

Yes Mike, a *lot*.

Your previous example was crystal-clear -it makes a lot of sense for some class usage scenarios I am thinking of but not for what I did with my example.

Now I understand a couple of things more clearly. I was using scope thinking it was something else -now glancing at my code using scope like the way I did is ... pointless; period. I am getting rid of all those statements.

Thanks a lot for your example and the links :) !
Jul 12 2021
On 13.07.21 03:03, someone wrote:

> On Monday, 12 July 2021 at 23:28:29 UTC, ag0aep6g wrote:

[...]

>> I'm not sure where we stand with `in`
>
> You mean *we* = D developers ?

Yes. Let me rephrase and elaborate: I'm not sure what the current status of `in` is. It used to mean `const scope`. But DIP1000 changes the effects of `scope` and there was some discussion about its relation to `in`.

Checking the spec, it says that `in` simply means `const` unless you use `-preview=in`. The preview switch makes it `const scope` again, but that's not all. There's also something about passing by reference.

https://dlang.org/spec/function.html#in-params

[...]

> For a UDT like mine I think it has a lot of sense because when I think of a string and I want to chop/count/whatever on it my mind works one-based not zero-based. Say "abc" needs b my mind works a lot easier mid("abc", 2, 1) than mid("abc", 1, 1) and besides I am *not* returning a range or a reference slice to a range or whatever I am returning a whole new string construction. If I would be returning a range I will follow common sense since I don't know what will be done thereafter of course.

I think you're setting yourself up for off-by-one bugs by going against the grain like that. Your functions are one-based. The rest of the D world, including the standard library, is zero-based. You're bound to forget to account for the difference.

But it's your code, and you can do whatever you want, of course. Just looked like it might be a mistake.
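
On the `in` point, the `const` part can be observed directly. This is a minimal sketch, and the function `countZeros` is invented for the purpose: whether or not `-preview=in` is passed, an `in` parameter cannot be written through.

```d
// Hypothetical function, for illustration only.
size_t countZeros(in int[] values) {
    // 'in' implies const here, so assigning to an element must not compile.
    static assert(!__traits(compiles, values[0] = 1));

    size_t n;
    foreach (v; values)
        if (v == 0) ++n;
    return n;
}

void main() {
    assert(countZeros([0, 1, 0, 2]) == 2);
    assert(countZeros([]) == 0);
}
```

What the sketch does not show is the `scope` and by-reference behaviour that the preview switch adds; for that, the spec section linked above is the thing to read.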
Jul 12 2021
On Tuesday, 13 July 2021 at 05:37:49 UTC, ag0aep6g wrote:

> On 13.07.21 03:03, someone wrote:
>> On Monday, 12 July 2021 at 23:28:29 UTC, ag0aep6g wrote:
>
> [...]
>
>>> I'm not sure where we stand with `in`
>>
>> You mean *we* = D developers ?
>
> Yes. Let me rephrase and elaborate: I'm not sure what the current status of `in` is. It used to mean `const scope`. But DIP1000 changes the effects of `scope` and there was some discussion about its relation to `in`. Checking the spec, it says that `in` simply means `const` unless you use `-preview=in`. The preview switch makes it `const scope` again, but that's not all. There's also something about passing by reference. https://dlang.org/spec/function.html#in-params

ACK. So for the time being I'll be reverting all my input parameters to const (unless ref or out of course) and when the whole in DIP matter resolves (one way or the other) I'll revert them (or not) accordingly. Parameters declared `in` read more naturally (and akin to `out`) than `const` but is form not function what I need to get right right now.

>> For a UDT like mine I think it has a lot of sense because when I think of a string and I want to chop/count/whatever on it my mind works one-based not zero-based. Say "abc" needs b my mind works a lot easier mid("abc", 2, 1) than mid("abc", 1, 1) and besides I am *not* returning a range or a reference slice to a range or whatever I am returning a whole new string construction. If I would be returning a range I will follow common sense since I don't know what will be done thereafter of course.
>
> I think you're setting yourself up for off-by-one bugs by going against the grain like that. Your functions are one-based. The rest of the D world, including the standard library, is zero-based. You're bound to forget to account for the difference.

And I think you have a good point. I'll reconsider.

> But it's your code, and you can do whatever you want, of course.
Just looked like it might be a mistake.All in all the whole module was updated accordingly and it seems it is working as expected (further testing needed) but, in the meantime, I learned a lot of things following the advice given by you, Ali, and others in this forum: ```d /// implementation-bugs [-] using foreach (with this structure) 20483 unittest's last line /// implementation‐tasks [+] reconsider making this whole UDT zero‐based as suggested by ag0aep6g—has a good point /// implementation‐tasks [+] reconsider excessive cast usage as suggested by Ali: bypassing compiler checks could be potentially harmful … cast and integer promotion http://ddili.org/ders/d.en/cast.html /// implementation‐tasks [-] for the time being input parameters are declared const instead of in; eventually they'll be back to in when the related DIP was setted once and for all; but, definetely—not scope const /// implementation‐tasks‐possible [-] pad[L|R] /// implementation‐tasks‐possible [-] replicate/repeat /// implementation‐tasks‐possible [-] replace(string, string) /// implementation‐tasks‐possible [-] translate(string, string) … same‐size strings matching one‐to‐one /// usage: array slicing can be used for usual things like: left() right() substr() etc … mainly when grapheme‐clusters are not expected at all /// usage: array slicing needs a zero‐based first range argument and a second one one‐based (or one‐past‐beyond; which it is somehow … counter‐intuitive module fw.types.UniCode; import std.algorithm : map, joiner; import std.array : array; import std.conv : to; import std.range : walkLength, take, tail, drop, dropBack; /// repeat, padLeft, padRight import std.stdio; import std.uni : Grapheme, byGrapheme; /// within this file: gudtUGC shared static this() { } /// the following will be executed only‐once per‐app: static this() { } /// the following will be executed only‐once per‐thread: static ~this() { } /// the following will be executed only‐once per‐thread: shared static ~this() 
{ } /// the following will be executed only‐once per‐app: alias stringUGC = Grapheme; alias stringUGC08 = gudtUGC!(stringUTF08); alias stringUGC16 = gudtUGC!(stringUTF16); alias stringUGC32 = gudtUGC!(stringUTF32); alias stringUTF08 = string; /// same as immutable(char )[]; alias stringUTF16 = wstring; /// same as immutable(wchar)[]; alias stringUTF32 = dstring; /// same as immutable(dchar)[]; /// mixin templateUGC!(stringUTF08, r"gudtUGC08"d); /// mixin templateUGC!(stringUTF16, r"gudtUGC16"d); /// mixin templateUGC!(stringUTF32, r"gudtUGC32"d); /// template templateUGC (typeStringUTF, alias lstrStructureID) { aliases in main() public struct gudtUGC(typeStringUTF) { /// UniCode grapheme‐cluster‐aware string manipulation (implemented for one‐based operations) /// provides: public property size_t count /// provides: public size_t decode(typeStringUTF strSequence) /// provides: public typeStringUTF encode() /// provides: public gudtUGC!(typeStringUTF) take(size_t intStart, size_t intCount = 1) /// provides: public gudtUGC!(typeStringUTF) takeL(size_t intCount) /// provides: public gudtUGC!(typeStringUTF) takeR(size_t intCount) /// provides: public gudtUGC!(typeStringUTF) chopL(size_t intCount) /// provides: public gudtUGC!(typeStringUTF) chopR(size_t intCount) /// provides: public gudtUGC!(typeStringUTF) padL(size_t intCount, typeStringUTF strPadding = r" ") /// provides: public gudtUGC!(typeStringUTF) padR(size_t intCount, typeStringUTF strPadding = r" ") /// provides: public typeStringUTF takeasUTF(size_t intStart, size_t intCount = 1) /// provides: public typeStringUTF takeLasUTF(size_t intCount) /// provides: public typeStringUTF takeRasUTF(size_t intCount) /// provides: public typeStringUTF chopLasUTF(size_t intCount) /// provides: public typeStringUTF chopRasUTF(size_t intCount) /// provides: public typeStringUTF padL(size_t intCount, typeStringUTF strPadding = r" ") /// provides: public typeStringUTF padR(size_t intCount, typeStringUTF strPadding = r" ") /// 
usage; eg: stringUGC32("äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese"d).take(35, 3).take(1,2).take(1,1).encode(); /// 日 /// usage; eg: stringUGC32("äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese"d).take(35).encode(); /// 日 /// usage; eg: stringUGC32("äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese"d).takeasUTF(35); /// 日 void popFront() { ++pintSequenceCurrent; } bool empty() { return pintSequenceCurrent == pintSequenceCount; } typeStringUTF front() { return takeasUTF(pintSequenceCurrent); } private stringUGC[] pugcSequence; private size_t pintSequenceCount = cast(size_t) 0; private size_t pintSequenceCurrent = cast(size_t) 0; property public size_t count() { return pintSequenceCount; } this( const typeStringUTF lstrSequence ) { /// (1) given UTF‐encoded sequence decode(lstrSequence); } safe public size_t decode( /// UniCode (UTF‐encoded → grapheme‐cluster) sequence const typeStringUTF lstrSequence ) { /// (1) given UTF‐encoded sequence size_t lintSequenceCount = cast(size_t) 0; if (lstrSequence is null) { pugcSequence = null; pintSequenceCount = cast(size_t) 0; pintSequenceCurrent = cast(size_t) 0; } else { pugcSequence = lstrSequence.byGrapheme.array; pintSequenceCount = pugcSequence.walkLength; pintSequenceCurrent = cast(size_t) 1; lintSequenceCount = pintSequenceCount; } return lintSequenceCount; } safe public typeStringUTF encode() { /// UniCode (grapheme‐cluster → UTF‐encoded) sequence typeStringUTF lstrSequence = null; if (pintSequenceCount >= cast(size_t) 1) { lstrSequence = pugcSequence .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public gudtUGC!(typeStringUTF) take( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence const size_t lintStart, const size_t lintCount = cast(size_t) 1 ) { /// (1) given start position >= 1 /// (2) given count >= 1 gudtUGC!(typeStringUTF) lugcSequence; if (lintStart >= cast(size_t) 1 && lintCount >= cast(size_t) 1) { size_t lintRange1 = lintStart - cast(size_t) 1; size_t 
lintRange2 = lintRange1 + lintCount; if (lintRange2 <= pintSequenceCount) { lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence[lintRange1..lintRange2] .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ); } } return lugcSequence; } safe public gudtUGC!(typeStringUTF) takeL( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence const size_t lintCount ) { /// (1) given count >= 1 gudtUGC!(typeStringUTF) lugcSequence; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence .take(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ); } return lugcSequence; } safe public gudtUGC!(typeStringUTF) takeR( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence const size_t lintCount ) { /// (1) given count >= 1 gudtUGC!(typeStringUTF) lugcSequence; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence .tail(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ); } return lugcSequence; } safe public gudtUGC!(typeStringUTF) chopL( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence const size_t lintCount ) { /// (1) given count >= 1 gudtUGC!(typeStringUTF) lugcSequence; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence .drop(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ); } return lugcSequence; } safe public gudtUGC!(typeStringUTF) chopR( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence const size_t lintCount ) { /// (1) given count >= 1 gudtUGC!(typeStringUTF) lugcSequence; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence .dropBack(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ); } return lugcSequence; } safe public typeStringUTF takeasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence const size_t lintStart, const size_t 
lintCount = cast(size_t) 1 ) { /// (1) given start position >= 1 /// (2) given count >= 1 typeStringUTF lstrSequence = null; if (lintStart >= cast(size_t) 1 && lintCount >= cast(size_t) 1) { size_t lintRange1 = lintStart - cast(size_t) 1; size_t lintRange2 = lintRange1 + lintCount; if (lintRange2 <= pintSequenceCount) { lstrSequence = pugcSequence[lintRange1..lintRange2] .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } } return lstrSequence; } safe public typeStringUTF takeLasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence const size_t lintCount ) { /// (1) given count >= 1 typeStringUTF lstrSequence = null; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .take(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF takeRasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence const size_t lintCount ) { /// (1) given count >= 1 typeStringUTF lstrSequence = null; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .tail(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF chopLasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence const size_t lintCount ) { /// (1) given count >= 1 typeStringUTF lstrSequence = null; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .drop(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF chopRasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence const size_t lintCount ) { /// (1) given count >= 1 typeStringUTF lstrSequence = null; if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) { lstrSequence = pugcSequence .dropBack(lintCount) .map!((ref g) => g[]) .joiner .to!(typeStringUTF) ; } return lstrSequence; } safe public typeStringUTF padLasUTF( /// UniCode 
(grapheme‐cluster → UTF‐encoded) sequence const size_t lintCount, const typeStringUTF lstrPadding = cast(typeStringUTF) r" " ) { /// (1) given count >= 1 /// [2] given padding (default is a single blank space) typeStringUTF lstrSequence = null; if (lintCount >= cast(size_t) 1 && lintCount > pintSequenceCount) { lstrSequence = null; /// pending } return lstrSequence; } safe public typeStringUTF padRasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence const size_t lintCount, const typeStringUTF lstrPadding = cast(typeStringUTF) r" " ) { /// (1) given count >= 1 /// [2] given padding (default is a single blank space) typeStringUTF lstrSequence = null; if (lintCount >= cast(size_t) 1 && lintCount > pintSequenceCount) { lstrSequence = null; /// pending } return lstrSequence; } } unittest { version (useUTF08) { stringUTF08 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"c; stringUTF08 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"c; stringUTF08 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"c; } version (useUTF16) { stringUTF16 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"w; stringUTF16 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"w; stringUTF16 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"w; } version (useUTF32) { stringUTF32 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"d; stringUTF32 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"d; stringUTF32 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"d; } size_t lintSequence1sizeUTF = lstrSequence1.length; size_t lintSequence2sizeUTF = lstrSequence2.length; size_t lintSequence3sizeUTF = lstrSequence3.length; size_t lintSequence1sizeUGA = lstrSequence1.walkLength; size_t lintSequence2sizeUGA = lstrSequence2.walkLength; size_t lintSequence3sizeUGA = lstrSequence3.walkLength; size_t lintSequence1sizeUGC = 
lstrSequence1.byGrapheme.walkLength;
    size_t lintSequence2sizeUGC = lstrSequence2.byGrapheme.walkLength;
    size_t lintSequence3sizeUGC = lstrSequence3.byGrapheme.walkLength;

    assert(lintSequence1sizeUGC == cast(size_t) 50);
    assert(lintSequence2sizeUGC == cast(size_t) 50);
    assert(lintSequence3sizeUGC == cast(size_t) 50);

    assert(lintSequence1sizeUGA == cast(size_t) 50);
    assert(lintSequence2sizeUGA == cast(size_t) 50);
    assert(lintSequence3sizeUGA == cast(size_t) 52);

    version (useUTF08) {
        assert(lintSequence1sizeUTF == cast(size_t) 50);
        assert(lintSequence2sizeUTF == cast(size_t) 60);
        assert(lintSequence3sizeUTF == cast(size_t) 91);
    }

    version (useUTF16) {
        assert(lintSequence1sizeUTF == cast(size_t) 50);
        assert(lintSequence2sizeUTF == cast(size_t) 50);
        assert(lintSequence3sizeUTF == cast(size_t) 57);
    }

    version (useUTF32) {
        assert(lintSequence1sizeUTF == cast(size_t) 50);
        assert(lintSequence2sizeUTF == cast(size_t) 50);
        assert(lintSequence3sizeUTF == cast(size_t) 52);
    }

    /// the following should be the same regardless of the encoding being used and is the whole point of this UDT being made:

    version (useUTF08) { alias stringUTF = stringUTF08; stringUGC08 lugcSequence3 = stringUGC08(lstrSequence3); }
    version (useUTF16) { alias stringUTF = stringUTF16; stringUGC16 lugcSequence3 = stringUGC16(lstrSequence3); }
    version (useUTF32) { alias stringUTF = stringUTF32; stringUGC32 lugcSequence3 = stringUGC32(lstrSequence3); }

    assert(lugcSequence3.encode() == lstrSequence3);

    assert(lugcSequence3.take(35, 3).take(1,2).take(1,1).encode() == cast(stringUTF) r"日");

    assert(lugcSequence3.take(21).encode() == cast(stringUTF) r"р");
    assert(lugcSequence3.take(27).encode() == cast(stringUTF) r"й");
    assert(lugcSequence3.take(35).encode() == cast(stringUTF) r"日");
    assert(lugcSequence3.take(37).encode() == cast(stringUTF) r"語");
    assert(lugcSequence3.take(21, 7).encode() == cast(stringUTF) r"русский");
    assert(lugcSequence3.take(35, 3).encode() == cast(stringUTF) r"日本語");

    assert(lugcSequence3.takeasUTF(21) == cast(stringUTF) r"р");
    assert(lugcSequence3.takeasUTF(27) == cast(stringUTF) r"й");
    assert(lugcSequence3.takeasUTF(35) == cast(stringUTF) r"日");
    assert(lugcSequence3.takeasUTF(37) == cast(stringUTF) r"語");
    assert(lugcSequence3.takeasUTF(21, 7) == cast(stringUTF) r"русский");
    assert(lugcSequence3.takeasUTF(35, 3) == cast(stringUTF) r"日本語");

    assert(lugcSequence3.takeL(1).encode() == cast(stringUTF) r"ä");
    assert(lugcSequence3.takeR(1).encode() == cast(stringUTF) r"😎");
    assert(lugcSequence3.takeL(7).encode() == cast(stringUTF) r"äëåčñœß");
    assert(lugcSequence3.takeR(16).encode() == cast(stringUTF) r"日本語 = japanese 😎");

    assert(lugcSequence3.takeLasUTF(1) == cast(stringUTF) r"ä");
    assert(lugcSequence3.takeRasUTF(1) == cast(stringUTF) r"😎");
    assert(lugcSequence3.takeLasUTF(7) == cast(stringUTF) r"äëåčñœß");
    assert(lugcSequence3.takeRasUTF(16) == cast(stringUTF) r"日本語 = japanese 😎");

    assert(lugcSequence3.chopL(10).encode() == cast(stringUTF) r"russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎");
    assert(lugcSequence3.chopR(21).encode() == cast(stringUTF) r"äëåčñœß … russian = русский 🇷🇺");

    assert(lugcSequence3.chopLasUTF(10) == cast(stringUTF) r"russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎");
    assert(lugcSequence3.chopRasUTF(21) == cast(stringUTF) r"äëåčñœß … russian = русский 🇷🇺");

    version (useUTF08) { stringUTF08 lstrSequence3reencoded; }
    version (useUTF16) { stringUTF16 lstrSequence3reencoded; }
    version (useUTF32) { stringUTF32 lstrSequence3reencoded; }

    for (
        size_t lintSequenceUGC = cast(size_t) 1;
        lintSequenceUGC <= lintSequence3sizeUGC;
        ++lintSequenceUGC
        ) {

        lstrSequence3reencoded ~= lugcSequence3.takeasUTF(lintSequenceUGC);

    }

    assert(lstrSequence3reencoded == lstrSequence3);

    lstrSequence3reencoded = null;

    version (useUTF08) { foreach (stringUTF08 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } }
    version (useUTF16) { foreach (stringUTF16 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } }
    version (useUTF32) { foreach (stringUTF32 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } }

    //assert(lstrSequence3reencoded == lstrSequence3); /// ooops …

}
```
Jul 13 2021
On Sunday, 11 July 2021 at 05:20:49 UTC, someone wrote:
> ```d
> mixin template templateUGC ( typeStringUTF, alias lstrStructureID ) {
>     public struct lstrStructureID {
>         typeStringUTF whatever;
>     }
> ```

This creates a struct with the literal name `lstrStructureID`, just like any other name. So it is NOT the value of the variable.

> ```d
> public struct mixin(lstrStructureID) { ... }
> ```
> because the argument seems to require a complete statement.

Indeed, you'd have to mixin the whole thing, like `mixin("public struct " ~ lstrStructureID ~ " { ... } ");`
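For readers following along, a minimal sketch of the "mixin the whole thing" route might look like the code below. The helper `declareStruct` and its parameters are hypothetical names introduced only for this illustration; the struct name `gudtUGC08` and the field `whatever` are borrowed from the snippets in this thread.

```d
// Hypothetical helper (not from the thread): builds the source text of a
// complete struct declaration. It runs at compile time via CTFE.
string declareStruct(string name, string fieldType)
{
    return "public struct " ~ name ~ " { " ~ fieldType ~ " whatever; }";
}

// The whole declaration is mixed in, so the struct's name is the string's value.
mixin(declareStruct("gudtUGC08", "string"));

void main()
{
    gudtUGC08 value;
    value.whatever = "hello";

    // The declared symbol really is called gudtUGC08.
    static assert(__traits(identifier, gudtUGC08) == "gudtUGC08");
    assert(value.whatever == "hello");
}
```

The catch is visible even at this size: the struct body lives inside a string, so it gets no syntax highlighting and any error in it is reported against generated code.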
Jul 11 2021
On 7/11/21 8:49 AM, Adam D Ruppe wrote:
> On Sunday, 11 July 2021 at 05:20:49 UTC, someone wrote:
>> ```d
>> mixin template templateUGC ( typeStringUTF, alias lstrStructureID ) {
>>     public struct lstrStructureID {
>>         typeStringUTF whatever;
>>     }
>> ```
>
> This creates a struct with the literal name `lstrStructureID`, just like any other name. So it is NOT the value of the variable.
>
>> ```d
>> public struct mixin(lstrStructureID) { ... }
>> ```
>> because the argument seems to require a complete statement.
>
> Indeed, you'd have to mixin the whole thing, like `mixin("public struct " ~ lstrStructureID ~ " { ... } ");`

When I've done this kind of stuff, what I usually do is:

```d
struct Thing {
    ... // actual struct
}

mixin("alias ", lstrStructureID, " = Thing;");
```

The downside is that the actual struct name symbol will be `Thing`, or whatever you called it. But at least you are not writing lots of code using mixins.

-Steve
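A minimal sketch of how this alias trick could be wired into the mixin template from the original question follows. It assumes the second parameter is declared as `string` rather than `alias` (a change made here for clarity, not something stated in the thread), and it relies on the multi-argument form of `mixin(...)` that Steve's snippet already uses.

```d
mixin template templateUGC(typeStringUTF, string lstrStructureID)
{
    // The struct body is ordinary, fully checked code under a fixed name.
    public struct Thing
    {
        typeStringUTF whatever;
    }

    // Only this one-line alias declaration goes through a string mixin.
    mixin("alias ", lstrStructureID, " = Thing;");
}

mixin templateUGC!(string, "gudtUGC08");

void main()
{
    gudtUGC08 value;
    value.whatever = "hello";
    assert(value.whatever == "hello");

    // The alias and the struct are the same type; the type's own name stays Thing.
    static assert(is(gudtUGC08 == Thing));
    static assert(gudtUGC08.stringof == "Thing");
}
```

The final `static assert` shows the downside Steve mentions: compiler messages and `.stringof` will say `Thing`, not `gudtUGC08`.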
Jul 11 2021
On Sunday, 11 July 2021 at 13:14:23 UTC, Steven Schveighoffer wrote:
> When I've done this kind of stuff, what I usually do is:
>
> ```d
> struct Thing {
>     ... // actual struct
> }
>
> mixin("alias ", lstrStructureID, " = Thing;");
> ```
>
> The downside is that the actual struct name symbol will be `Thing`, or whatever you called it. But at least you are not writing lots of code using mixins.
>
> -Steve

Thanks for your tip, Steve. I ended up with something similar; I'll be posting my whole example below.
Jul 11 2021
On Sunday, 11 July 2021 at 12:49:28 UTC, Adam D Ruppe wrote:
> This creates a struct with the literal name `lstrStructureID`, just like any other name. So it is NOT the value of the variable.

Could you explain in more detail?
Jul 11 2021
On Sunday, 11 July 2021 at 13:30:27 UTC, zjh wrote:
> Could you explain in more detail?

It is just normal code with a normal name. The fact there's another variable with the same name doesn't change anything.
Jul 11 2021
```d
mixin template templateUGC ( typeStringUTF, alias lstrStructureID){
    public struct lstrStructureID {
        typeStringUTF w;
    }
}
mixin templateUGC!(string, "gudtUGC08");
```

You say "This creates a struct with the literal name `lstrStructureID`". I tried it and it compiles, but I don't know what it generates. Could you explain in more detail?
Jul 11 2021
On Sunday, 11 July 2021 at 14:04:14 UTC, zjh wrote:

It just generates a struct named `lstrStructureID`.
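One way to check this conclusion is to ask the compiler which names exist after the mixin. The sketch below reuses the snippet above unchanged and adds only `static assert` checks; it rests on the assumption that a declaration inside the mixin template may reuse the parameter's name, which matches the report above that the snippet compiles.

```d
mixin template templateUGC(typeStringUTF, alias lstrStructureID)
{
    // `lstrStructureID` here is a brand-new declaration name,
    // not a use of the template parameter's value.
    public struct lstrStructureID
    {
        typeStringUTF w;
    }
}

mixin templateUGC!(string, "gudtUGC08");

// The mixed-in struct is reachable under the literal name ...
static assert(is(lstrStructureID == struct));
static assert(lstrStructureID.stringof == "lstrStructureID");

// ... while nothing called gudtUGC08 was ever declared.
static assert(!__traits(compiles, gudtUGC08.init));

void main()
{
    lstrStructureID s;
    s.w = "ok";
    assert(s.w == "ok");
}
```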
Jul 11 2021
On Sunday, 11 July 2021 at 12:49:28 UTC, Adam D Ruppe wrote:
> Indeed, you'd have to mixin the whole thing, like `mixin("public struct " ~ lstrStructureID ~ " { ... } ");`

As I mentioned in my previous reply to Ali, this could be viable for one-liners or so, but for chunks of code with, say, a couple hundred lines for one UDT, it will soon become debug/maintenance hell, so it is clearly a no-go, for my specific case at least.
Jul 11 2021