digitalmars.D - First Impressions!
- A Guy With an Opinion (162/162) Nov 27 2017 Hi,
- docandrew (20/30) Nov 27 2017 Good feedback overall, thanks for checking it out. You're not
- rikki cattermole (25/186) Nov 27 2017 It's on our TODO list.
- A Guy With an Opinion (15/27) Nov 27 2017 That's good to hear.
- ketmar (13/17) Nov 27 2017 basically, default initializers aren't meant to give a "usable value", t...
- A Guy With an Opinion (13/32) Nov 27 2017 Eh...I still don't agree. I think C and C++ just gave that style
- A Guy With an Opinion (4/44) Nov 27 2017 Also, C and C++ didn't just have undefined behavior, sometimes it
- codephantom (3/6) Nov 27 2017 set to?
- Patrick Schluter (4/10) Nov 28 2017 It's only auto variables that are undefined. statics and code
- ketmar (9/10) Nov 27 2017 anyway, it is something that won't be changed, 'cause there may be code
- Patrick Schluter (20/60) Nov 28 2017 Just a little anecdote of a maintainer of a legacy project in C.
- Kagamin (4/6) Nov 30 2017 UCS2 was awesome. UTF-16 is used by Java, JavaScript,
- Walter Bright (12/18) Dec 01 2017 "was" :-) Those are pretty much pre-surrogate pair designs, or based on ...
- H. S. Teoh (12/27) Dec 01 2017 This is not true in Asia, esp. where the CJK block is extensively used.
- Walter Bright (6/15) Dec 02 2017 Are you sure about that? I know that Asian languages will be longer in U...
- Jacob Carlborg (6/10) Dec 02 2017 Not necessarily. I've seen code in non-English languages, i.e. when the
- Patrick Schluter (13/41) Dec 02 2017 That's true in theory, in practice it's not that severe as the
- Patrick Schluter (4/14) Dec 02 2017 106% for Korean, copied the wrong column. Traditional Chinese was
- Joakim (25/49) Dec 02 2017 Yep, that's why five years back many of the major Chinese sites
- Patrick Schluter (23/56) Dec 03 2017 Summary
- Andrei Alexandrescu (3/11) Dec 04 2017 BTW has anyone been in contact with Xah Lee? Perhaps we could commission...
- Joakim (4/16) Dec 04 2017 I traded email with him last summer, emailed you his email
- Adam D. Ruppe (65/83) Nov 27 2017 Yes, indeed, and many of them don't help much in finding the real
- A Guy With an Opinion (6/7) Nov 27 2017 I actually did try something like that, because I remembered
- Michael V. Franklin (21/88) Nov 27 2017 I come from a heavy C#/C++ background. I also I *felt* this as
- A Guy With an Opinion (15/21) Nov 27 2017 I'd be happy to submit an issue, but I'm not quite sure I'd be
- Michael V. Franklin (6/13) Nov 27 2017 If this was on the forum, please point me to it. I'll see if I
- A Guy With an Opinion (3/16) Nov 27 2017 https://forum.dlang.org/thread/vcvlffjxowgdvpvjsijq@forum.dlang.org
- Steven Schveighoffer (47/95) Nov 28 2017 Hi Guy, welcome, and I wanted to say I was saying "me too" while reading...
- A Guy With an Opinion (7/12) Nov 28 2017 That's exactly what it was I think. As I stated before, I tried
- A Guy With an Opinion (7/9) Nov 28 2017 On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole
- Guillaume Piolat (20/21) Nov 28 2017 You are not supposed to come to this forum with well-balanced
- Jack Stouffer (16/27) Nov 28 2017 Attributes were one of my biggest hurdles when working on my own
- Adam D. Ruppe (3/5) Nov 28 2017 That doesn't quite work since it doesn't descend into aggregates.
- Jacob Carlborg (4/6) Nov 28 2017 And if your project is a library.
- A Guy With an Opinion (2/4) Nov 28 2017 I take it adding those inverse attributes is no trivial thing?
- Michael V. Franklin (7/8) Nov 28 2017 It would require a DIP: https://github.com/dlang/DIPs
- Mike Parker (4/7) Nov 28 2017 It's awaiting formal review. I'll move it forward when the formal
- A Guy With a Question (4/12) Nov 29 2017 How well does phobos play with it? I'm finding, for instance,
- Adam D. Ruppe (9/10) Nov 28 2017 Technically, it is extremely trivial.
- Dukc (6/11) Nov 30 2017 In fact I believe it is. When you have something unsafe you can
- Walter Bright (7/15) Nov 30 2017 Sooner or later your code will exhibit bugs if it assumes that char==cod...
- Joakim (7/25) Nov 30 2017 Java, .NET, Qt, Javascript, and a handful of others use UTF-16
- Walter Bright (2/9) Nov 30 2017 I stand corrected.
- Jonathan M Davis (26/35) Nov 30 2017 I get the impression that the stuff that uses UTF-16 is mostly stuff tha...
- A Guy With a Question (19/42) Nov 30 2017 I don't think that's true though. Haven't you always been able to
- A Guy With a Question (7/20) Nov 30 2017 I think it also simplifies the logic. You are not always looking
- Jonathan M Davis (33/54) Nov 30 2017 Even if that were true, UTF-16 code units are not code points. If you wa...
- Walter Bright (6/10) Dec 01 2017 UTF-8 is not the cause of that particular problem, it's caused by the Un...
- Jonathan M Davis (16/26) Dec 01 2017 Oh, definitely. UTF-8 is arguably the best that Unicode has, but Unicode...
- Walter Bright (24/36) Dec 02 2017 Yup. I've presented that point of view a couple times on HackerNews, and...
- Patrick Schluter (5/11) Dec 02 2017 Where it gets really fun is the when there is color composition
- H. S. Teoh (27/33) Dec 02 2017 The same can be argued for the icon mania started by the GUI craze in
- Walter Bright (4/5) Dec 02 2017 Even worse, companies go and copyright their icons, guaranteeing they ha...
- Steven Schveighoffer (5/10) Dec 04 2017 I like this site for icons. Only requires you to reference them in your
- Kagamin (2/6) Dec 04 2017 What happened when you ran vi for the first time?
- codephantom (11/17) Dec 02 2017 The real problem, is that sometimes people don't feel like a
- Ola Fosheim Grøstad (3/5) Dec 02 2017 https://splinternews.com/violent-emoji-are-starting-to-get-people-in-tro...
- codephantom (13/18) Dec 02 2017 No. Humans never express negative emotions, and also, never
- codephantom (6/11) Dec 02 2017 btw. Good article here, further demonstrating my point..
- Ola Fosheim Grøstad (16/19) Dec 02 2017 They are used as symbols culturally, which is how written
- Nicholas Wilson (4/12) Nov 30 2017 I assume you meant UTF32 not UCS32, given UCS2 is Microsoft's
- Walter Bright (4/7) Nov 30 2017 I meant UCS-4, which is identical to UTF-32. It's hard keeping all that ...
- A Guy With a Question (6/14) Nov 30 2017 It's also worth mentioning that the more I think about it, the
- Walter Bright (8/11) Nov 30 2017 Both Windows and Java selected UTF16 before surrogates were added, so it...
- A Guy With a Question (13/31) Nov 30 2017 As long as you understand it's limitations I think most bugs can
- Jonathan M Davis (19/30) Nov 30 2017 The reality of the matter is that if you want to write fully valid Unico...
- Patrick Schluter (7/9) Nov 30 2017 Not even that in most cases. Only if you use unstructured text
- Patrick Schluter (10/13) Nov 30 2017 To give just an example of what can go wrong with UTF-16. Reading
- Steven Schveighoffer (4/17) Nov 30 2017 iopipe handles this:
- Patrick Schluter (9/27) Nov 30 2017 It was only to give an example. With UTF-8 people who implement
- A Guy With a Question (5/34) Dec 01 2017 Most problems with UTF16 is applicable to UTF8. The only issue
- Patrick Schluter (6/28) Dec 01 2017 That's what I said. UTF-16 and UTF-8 have the same issues, but
- Patrick Schluter (3/18) Dec 01 2017 I meant isolated code-units, of course.
- Steven Schveighoffer (4/9) Dec 01 2017 Hehe, it's impossible for me to talk about code points and code units
- Jonathan M Davis (13/20) Dec 01 2017 What, you mean that Unicode can be confusing? No way! ;)
- A Guy With a Question (6/31) Dec 01 2017 And dealing with that complexity can often introduce bugs in
- Walter Bright (3/4) Dec 02 2017 Yeah, I forgot to mention that one. As if anyone remembers to put in the...
- Kagamin (15/35) Nov 30 2017 Then do it the C# way. There's choice.
Hi, I've been using D for a personal project for about two weeks now and just thought I'd share my initial impression just in case it's useful! I like feedback on things I do, so I just assume others do too. Plus my opinion is the best on the internet! You will see (hopefully the sarcasm is obvious, otherwise I'll just appear pompous). It would probably be better if I did a retrospective after my project is completed, but with life who knows if that will happen. I could lose interest or something and not finish it. And then you guys wouldn't know my opinion. I can't allow that.

I'll start off by saying I like the overall experience. I come from a C# and C++ background with a little bit of C mixed in. For the most part I use C# on a day to day basis. I did do a three year stint working with C/C++ (mostly C++), but I never really enjoyed it much. C++ is overly verbose, overly complicated, and overly littered with poor design decisions. C# on the other hand has for the most part been a delight. The only problem is I don't find it to be the best when it comes to generative programming. C# has generics, but for the most part it's always struck me as more specialized for container types, and to do anything remotely outside of its purpose takes a fair bit of cleverness. I'm sick of being clever in that aspect. So here are some impressions, good and bad:

+ Most things I rely on from C# and the .NET framework, like files and unicode, have fairly direct counterparts in D.

+ D code so far is pushing me towards more "flat" code (for a lack of a better way to phrase it), and so far that has helped tremendously when it comes to porting C#. C# is kind of the opposite. With its namespace -> class -> method coupled with lock, using, etc... you tend to do a lot of nesting. You are generally 3 '{' in before any true logic even begins. Then couple that with try/catch, IDisposable/using, locking, and then if/else, and it can get quite chaotic very easily. So right away, I saw my C# code simplify as I translated it, and I think it has to do with the flatness. I'm not sure if that opinion will hold when I delve into 'static if' a little more, but so far my uses of it haven't really dampened that opinion.

+ Visual D. It might be that I had poor expectations of it, because I read D's tooling was poor on the internet (and nothing is ever wrong on the internet), however, the combination of Visual D and DMD actually exceeded my expectations. I've been quite happy with it. It was relatively easy to set up and worked as I would expect it to work. It lets me debug, add breakpoints, and does the basic syntax highlighting I would expect. It could have a few other features, but for a project that is not corporate backed, it was really above what I could have asked for.

+ So far, compiling is fast. And from what I hear it will stay fast. A big motivator. The one commercial C++ project I worked on was a beast and could take an hour+ to compile if you needed to compile something. C# has gotten me accustomed to not having to go to the bathroom, get a drink, etc... before returning to find out I'm on the linking step. I'm used to: if it doesn't take less than ten seconds (probably less), then I prep myself for an error to deal with. I want this to remain.

- Some of the errors from DMD are a little strange. I don't want to crap on this too much, because for the most part it's fine. However occasionally it throws errors I still can't really work out why THAT is the error it gave me.
Some of you may have seen my question in the "Learn" forum about not knowing to use static in an embedded class, but the error was the following:

Error: 'this' is only defined in non-static member functions

I'd say the errors so far are above some of the cryptic stuff C++ can throw at you (however, I haven't delved that deeply into D templates yet, so don't hold me to this yet), but in terms of quality I'd put it somewhere between C++ and C#.

+ The standard library so far is really good. Nullable worked as I thought it should. I just guessed a few of the methods based on what I had seen at that point and got it right. So it appears consistent and intuitive. I also like the fact I can peek at the code and understand it by just reading it. Unlike with C++ where I still don't know how some of the stuff is *really* implemented. The STL almost seems like it's written in a completely different language than the stuff it enables. For instance, I figured out how to do packages by seeing it in Phobos.

- ...however, where are all of the collections? No Queue? No Stack? No HashTable? I've read that it's not a big focus because some of the built in stuff *can* behave like those. In C# I use List<> and Dictionary<> quite a bit, so I'm not looking forward to having to hand roll my own or use things that aren't fundamentally them. This is definitely the biggest negative I've come across. I want a queue, not something that *can* behave as a queue. I definitely expected more from a language that is this old.

+ Packages and 'public import'. I really think it's useful to forward imports/using statements. It kind of packages everything that is required to use that thing in your namespace/package together. So you don't have to include a dozen things. C and C++ can do this with their #includes, but in an unsatisfactory way. At least in my opinion.

- Modules. I like modules better than #include, but I don't like how there is this gravity that kind of pulls me to associate a module with a file. It appears you don't have to, because I can do the package thing, but whenever I try to do things outside that one idiom I end up in a soup of errors. I'm sure I'm just not used to it, but so far it's been a little dissatisfying. Sometimes I want where it is physically on my file system to be different from how I include it in other source files. C#'s namespaces are really the standard to beat or meet.

+ Unit tests. Finally built in unit tests. Enough said here. If the lack of collections was the biggest negative, this is the biggest positive. I would like to enable them at build time if possible though.

- Attributes. I had another post in the Learn forum about attributes which was unfortunate. At first I was excited because it seems like on the surface it would help me write better code, but it gets a little tedious and tiresome to have to remember to decorate code with them. It seems like most of them should have been the defaults. I would have preferred if the compiler helped me and reminded me. I asked if there was a way to enforce them globally, which I guess there is, but I guess there's also not a way to turn some of them off afterwards. A bit unfortunate. But at least I can see some solutions to this.

- The defaults for primitives seem off. They seem to encourage errors. I don't think that is the best design decision even if it encourages the errors to be caught as quickly as possible. I think the better decision would be to not have the errors occur. When I asked about this, there seemed to be a disassociation between the spec and the implementation. The spec says a declaration should error if not explicitly set, but the implementation just initializes them to something that is likely to error.
Like NaN for floats which I would have thought would have been 0 based on prior experiences with other languages.

- Immutable. I'm not sure I fully understand it. On the surface it seemed like const but transitive. I tried having a method return an immutable value, but when I used it in my unit test I got some weird errors about objects not being able to return immutable (I forget the exact error... apologies). I refactored to use const, and it all worked as I expected, but I don't get why the immutable didn't work. I was returning a value type, so I don't see why passing it in assert(object.errorCount == 0) would have triggered errors. But it did. I have a set of classes that keep track of snapshots of specific counts that seems like a perfect fit for immutable (because I don't want those 'snapshots' to change... like ever), but I kept getting errors trying to use it like const. The type string seems to be an immutable(char[]) which works exactly the way I was expecting, and I haven't run into problems, so I'm not sure what the problem was. I'm just more confused knowing that string works, but what I did didn't.

+- Unicode support is good. Although I think D's string type should have probably been utf16 by default. Especially considering the utf module states: "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'." Seems like the natural fit for me. Plus for the vast majority of use cases I am pretty guaranteed a char = codepoint. Not the biggest issue in the world and maybe I'm just being overly critical here.

+ Templates seem powerful. I've only fiddled thus far, but I don't think I've quite comprehended their usefulness yet. It will probably take me some time to figure out how to wield them effectively. One thing I accidentally stumbled upon that I liked was that I could simulate inheritance in structs with them, by using the mixin keyword (see the sketch below). That was cool, and I'm not even sure if that is what they were really meant to enable.

So those are just some of my thoughts. Tell me why I'm wrong :P
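A minimal sketch of that struct-mixin trick (hypothetical names, not the poster's actual code, but it compiles as plain D):

import std.stdio : writeln;

// A mixin template injects these members into whatever mixes it in,
// giving struct "inheritance" of fields and methods without classes.
mixin template Commons()
{
    int id;
    void printId() { writeln("id = ", id); }
}

struct Foo { mixin Commons; }
struct Bar { mixin Commons; string extra; }

void main()
{
    Foo f;
    f.id = 1;
    f.printId(); // prints: id = 1

    Bar b;
    b.id = 2;
    b.printId(); // same members, shared via the mixin
}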
Nov 27 2017
On Tuesday, 28 November 2017 at 03:01:33 UTC, A Guy With an Opinion wrote:

- ...however, where are all of the collections? No Queue? No Stack? No HashTable? I've read that it's not a big focus because some of the built in stuff *can* behave like those. In C# I use List<> and Dictionary<> quite a bit, so I'm not looking forward to having to hand roll my own or use things that aren't fundamentally them. This is definitely the biggest negative I've come across. I want a queue, not something that *can* behave as a queue. I definitely expected more from a language that is this old.

Good feedback overall, thanks for checking it out. You're not wrong, but some of the design decisions that feel strange to newcomers at first have been heavily debated, generally well-reasoned, and just take some time to get used to. That sounds like a cop-out, but stick with it and I think you'll find that a lot of the decisions make sense - see the extensive discussion on NaN-default for floats, for example.

Just one note about the above comment though: the std.container.dlist doubly-linked list has methods that you can use to put together stacks and queues easily: https://dlang.org/phobos/std_container_dlist.html

Also, D's associative arrays implement a hash map https://dlang.org/spec/hash-map.html, which I think should take care of your C# Dictionary<> needs.

Anyhow, D is a big language (for better and sometimes worse), so it's easy to miss some of the good nuggets buried within the spec/library.

-Doc
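For instance, a queue and a stack on top of DList, plus the built-in associative array standing in for HashTable - a quick sketch using only documented operations (the names are made up):

import std.container.dlist : DList;
import std.stdio : writeln;

void main()
{
    // Queue: insert at the back, remove from the front.
    DList!int queue;
    queue.insertBack(1);
    queue.insertBack(2);
    writeln(queue.front); // 1
    queue.removeFront();

    // Stack: insert and remove at the same end.
    DList!int stack;
    stack.insertBack(1);
    stack.insertBack(2);
    writeln(stack.back); // 2
    stack.removeBack();

    // Built-in associative array doing hash table duty.
    int[string] table;
    table["answer"] = 42;
    writeln(table["answer"]); // 42
}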
Nov 27 2017
On 28/11/2017 3:01 AM, A Guy With an Opinion wrote:

[...]
- ...however, where are all of the collections? No Queue? No Stack? No HashTable? I've read that it's not a big focus because some of the built in stuff *can* behave like those. In C# I use List<> and Dictionary<> quite a bit, so I'm not looking forward to having to hand roll my own or use things that aren't fundamentally them. This is definitely the biggest negative I've come across. I want a queue, not something that *can* behave as a queue. I definitely expected more from a language that is this old.

It's on our TODO list. Allocators need to come out of experimental, and some form of RC, before we tackle it again. In the meantime https://github.com/economicmodeling/containers is pretty good.

+ Packages and 'public import'. I really think it's useful to forward imports/using statements. It kind of packages everything that is required to use that thing in your namespace/package together. So you don't have to include a dozen things. C and C++ can do this with their #includes, but in an unsatisfactory way. At least in my opinion.

- Modules. I like modules better than #include, but I don't like how there is this gravity that kind of pulls me to associate a module with a file. It appears you don't have to, because I can do the package thing, but whenever I try to do things outside that one idiom I end up in a soup of errors. I'm sure I'm just not used to it, but so far it's been a little dissatisfying. Sometimes I want where it is physically on my file system to be different from how I include it in other source files. C#'s namespaces are really the standard to beat or meet.

Modules are a fairly well understood concept from the ML family. You are just not used to it :) Keep in mind we do have namespaces for binding to C++ code, and I haven't heard of anybody abusing them for the purpose of namespacing. They tend to be ugly hacks with ambiguity running through them. Of course I never had to use them in C++, so I'm sure somebody can give you some war stories with them ;)

+ Unit tests. Finally built in unit tests. Enough said here. If the lack of collections was the biggest negative, this is the biggest positive. I would like to enable them at build time if possible though.

I keep saying it: if you don't have unit tests built in, you don't care about code quality!

- Attributes. I had another post in the Learn forum about attributes which was unfortunate. At first I was excited because it seems like on the surface it would help me write better code, but it gets a little tedious and tiresome to have to remember to decorate code with them. It seems like most of them should have been the defaults. I would have preferred if the compiler helped me and reminded me.
I asked if there was a way to enforce them globally, which I guess there is, but I guess there's also not a way to turn some of them off afterwards. A bit unfortunate. But at least I can see some solutions to this.

You don't need to bother with them for most code :)

- The defaults for primitives seem off. They seem to encourage errors. [...] The spec says a declaration should error if not explicitly set, but the implementation just initializes them to something that is likely to error. Like NaN for floats which I would have thought would have been 0 based on prior experiences with other languages.

Doesn't mean the other languages are right either.

- Immutable. I'm not sure I fully understand it. On the surface it seemed like const but transitive. [...] I'm just more confused knowing that string works, but what I did didn't.

+- Unicode support is good. Although I think D's string type should have probably been utf16 by default. [...] Plus for the vast majority of use cases I am pretty guaranteed a char = codepoint. Not the biggest issue in the world and maybe I'm just being overly critical here.

That uses a lot more memory, UTF-16 instead of UTF-8. I would argue for UTF-32 instead of 16. If you need a wstring, use a wstring! Be aware Microsoft is alone in thinking that UTF-16 was awesome. Everybody else standardized on UTF-8 for Unicode.

+ Templates seem powerful. [...] One thing I accidentally stumbled upon that I liked was that I could simulate inheritance in structs with them, by using the mixin keyword. That was cool, and I'm not even sure if that is what they were really meant to enable.

And that is where we use alias this instead. Do wish it was fully implemented though (multiple alias this).

Welcome!
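For reference, a tiny sketch of the alias this approach rikki mentions (hypothetical types; it gives a struct subtype-style forwarding to another struct):

import std.stdio : writeln;

struct Base
{
    int x;
    void describe() { writeln("x = ", x); }
}

struct Derived
{
    Base base;
    alias base this; // unknown members of Derived are forwarded to base

    int y;
}

void main()
{
    Derived d;
    d.x = 5;      // resolved as d.base.x
    d.describe(); // resolved as d.base.describe()
    assert(d.base.x == 5);
}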
Nov 27 2017
On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole wrote:

It's on our TODO list. Allocators need to come out of experimental, and some form of RC, before we tackle it again. In the meantime https://github.com/economicmodeling/containers is pretty good.

That's good to hear.

I keep saying it: if you don't have unit tests built in, you don't care about code quality!

I just like not having to create a throwaway project to test my code. It's nice to just use unit tests for what I used to create console apps for, and then it forever ensures my code works the same!

You don't need to bother with them for most code :)

That seems to be what people here are saying, but that seems so sad...

Doesn't mean the other languages are right either.

That is true, but I'm still unconvinced that making the person's program likely to error is better than initializing a number to 0. Zero is such a fundamental default for so many things. And it would be consistent with the other number types.

If you need a wstring, use a wstring! Be aware Microsoft is alone in thinking that UTF-16 was awesome. Everybody else standardized on UTF-8 for Unicode.

I do come from that world, so there is a chance I'm just comfortable with it.
Nov 27 2017
A Guy With an Opinion wrote:

That is true, but I'm still unconvinced that making the person's program likely to error is better than initializing a number to 0. Zero is such a fundamental default for so many things. And it would be consistent with the other number types.

basically, default initializers aren't meant to give a "usable value", they're meant to give a *defined* value, so we don't have UB. that is, just initialize your variables explicitly, don't rely on defaults. writing:

int a;
a += 42;

is still bad code, even if you know that `a` is guaranteed to be zero.

int a = 0;
a += 42;

is the "right" way to write it. if you'll look at default values from this PoV, you'll see that NaN makes more sense than zero. if there was a NaN for ints, ints would be inited with it too. ;-)
Nov 27 2017
On Tuesday, 28 November 2017 at 04:12:14 UTC, ketmar wrote:

[...] basically, default initializers aren't meant to give a "usable value", they're meant to give a *defined* value, so we don't have UB. [...] if you'll look at default values from this PoV, you'll see that NaN makes more sense than zero. if there was a NaN for ints, ints would be inited with it too. ;-)

Eh... I still don't agree. I think C and C++ just gave that style of coding a bad rap due to the undefined behavior. But the issue is it was undefined behavior. A lot of language features aim to make things well defined and have less verbose representations. Once a language matures, that's what a big portion of its newer features become: less verbose shortcuts of commonly done things. I agree it's important that it's well defined, I'm just thinking it should be a value that someone actually wants some notable fraction of the time. Not something no one wants ever. I could be persuaded, but so far I'm not drinking the koolaid on that. It's not the end of the world, I was just confused when my float was NaN.
Nov 27 2017
On Tuesday, 28 November 2017 at 04:17:18 UTC, A Guy With an Opinion wrote:

[...] I agree it's important that it's well defined, I'm just thinking it should be a value that someone actually wants some notable fraction of the time. Not something no one wants ever. [...]

Also, C and C++ didn't just have undefined behavior, sometimes it has inconsistent behavior. Sometimes int a; is actually set to 0.
Nov 27 2017
On Tuesday, 28 November 2017 at 04:19:40 UTC, A Guy With an Opinion wrote:Also, C and C++ didn't just have undefined behavior, sometimes it has inconsistent behavior. Sometimes int a; is actually set to 0.set to?
Nov 27 2017
On Tuesday, 28 November 2017 at 04:19:40 UTC, A Guy With an Opinion wrote:On Tuesday, 28 November 2017 at 04:17:18 UTC, A Guy With an Opinion wrote:It's only auto variables that are undefined. statics and code unit (aka globals) are defined.[...]Also, C and C++ didn't just have undefined behavior, sometimes it has inconsistent behavior. Sometimes int a; is actually set to 0.
Nov 28 2017
A Guy With an Opinion wrote:

Eh... I still don't agree.

anyway, it is something that won't be changed, 'cause there may be code that relies on current default values. i'm not really trying to change your mind, i just tried to give a rationale behind the choice. that's why `char.init` is 255 too, not zero. still, explicit variable initialization looks better to me. with default init, it is hard to say if the author just forgot to initialize a variable and it happens to work, or if he knows about the default value and used it. with explicit initialization, the reader doesn't have to guess what the default value is.
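the defaults themselves are easy to check; a tiny example (the values come straight from the language definition):

import std.stdio : writeln;

void main()
{
    float f; // float.init is NaN: any use of it shows up visibly
    int i;   // int.init is 0: ints have no invalid bit pattern to use
    char c;  // char.init is 0xFF: an invalid UTF-8 code unit, on purpose

    writeln(f);           // nan
    writeln(i);           // 0
    writeln(cast(int) c); // 255
}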
Nov 27 2017
On Tuesday, 28 November 2017 at 04:17:18 UTC, A Guy With an Opinion wrote:

[...] I agree it's important that it's well defined, I'm just thinking it should be a value that someone actually wants some notable fraction of the time. Not something no one wants ever. [...]

Just a little anecdote from a maintainer of a legacy project in C. My predecessors in that project had the habit of systematically initializing any auto declared variable at the beginning of a function. The code base was initiated in the early '90s and written by people who were typical BASIC programmers, so the consequence was that functions were very often hundreds of lines long and they all started with a lot of declarations. What really surprised me over the years of reviewing that code was how often I found bugs because the variables had been wrongly initialised. By initialising with 0 or NULL, the data flow analysis was essentially suppressed at the start, so that it could not detect when variables were used before they had been properly populated with the right values the functionality required. These kinds of bugs were very subtle. To make it short: 0 is an arbitrary number that often is the right value, but when it isn't, it can be a pain to detect that it was the wrong value.
Nov 28 2017
On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole wrote:Be aware Microsoft is alone in thinking that UTF-16 was awesome. Everybody else standardized on UTF-8 for Unicode.UCS2 was awesome. UTF-16 is used by Java, JavaScript, Objective-C, Swift, Dart and ms tech, which is 28% of tiobe index.
Nov 30 2017
On 11/30/2017 9:23 AM, Kagamin wrote:On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole wrote:"was" :-) Those are pretty much pre-surrogate pair designs, or based on them (Dart compiles to JavaScript, for example). UCS2 has serious problems: 1. Most strings are in ascii, meaning UCS2 doubles memory consumption. Strings in the executable file are twice the size. 2. The code doesn't work well with C. C doesn't even have a UCS2 type. 3. There's no reasonable way to audit the code to see if it handles surrogate pairs correctly. Surrogate pairs occur only rarely, so the code is never tested for it, and the bugs may remain latent for many, many years. With UTF8, multibyte code points are much more common, so bugs are detected much earlier.Be aware Microsoft is alone in thinking that UTF-16 was awesome. Everybody else standardized on UTF-8 for Unicode.UCS2 was awesome. UTF-16 is used by Java, JavaScript, Objective-C, Swift, Dart and ms tech, which is 28% of tiobe index.
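The difference is easy to demonstrate in D itself (a small sketch; the counts follow directly from the UTF encoding rules):

void main()
{
    // U+1F600 lies outside the Basic Multilingual Plane.
    string  s = "\U0001F600";  // UTF-8
    wstring w = "\U0001F600"w; // UTF-16
    dstring d = "\U0001F600"d; // UTF-32

    assert(s.length == 4); // four UTF-8 code units
    assert(w.length == 2); // a surrogate pair, the case UCS2 code mishandles
    assert(d.length == 1); // one code point
}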
Dec 01 2017
On Fri, Dec 01, 2017 at 03:04:44PM -0800, Walter Bright via Digitalmars-d wrote:On 11/30/2017 9:23 AM, Kagamin wrote:This is not true in Asia, esp. where the CJK block is extensively used. A CJK block character is 3 bytes in UTF-8, meaning that string sizes are 150% of the UCS2 encoding. If your code contains a lot of CJK text, that's a lot of bloat. But then again, in non-Latin locales you'd generally store your strings separately of the executable (usually in l10n files), so this may not be that big an issue. But the blanket statement "Most strings are in ASCII" is not correct. T -- Bare foot: (n.) A device for locating thumb tacks on the floor.On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole wrote:"was" :-) Those are pretty much pre-surrogate pair designs, or based on them (Dart compiles to JavaScript, for example). UCS2 has serious problems: 1. Most strings are in ascii, meaning UCS2 doubles memory consumption. Strings in the executable file are twice the size.Be aware Microsoft is alone in thinking that UTF-16 was awesome. Everybody else standardized on UTF-8 for Unicode.UCS2 was awesome. UTF-16 is used by Java, JavaScript, Objective-C, Swift, Dart and ms tech, which is 28% of tiobe index.
Dec 01 2017
On 12/1/2017 3:16 PM, H. S. Teoh wrote:This is not true in Asia, esp. where the CJK block is extensively used. A CJK block character is 3 bytes in UTF-8, meaning that string sizes are 150% of the UCS2 encoding. If your code contains a lot of CJK text, that's a lot of bloat. But then again, in non-Latin locales you'd generally store your strings separately of the executable (usually in l10n files), so this may not be that big an issue. But the blanket statement "Most strings are in ASCII" is not correct.Are you sure about that? I know that Asian languages will be longer in UTF-8. But how much data that programs handle is in those languages? The language of business, science, programming, aviation, and engineering is english. Of course, D itself is agnostic about that. The compiler, for example, accepts strings, identifiers, and comments in Chinese in UTF-16 format.
Dec 02 2017
On 2017-12-02 11:02, Walter Bright wrote:

Are you sure about that? I know that Asian languages will be longer in UTF-8. But how much data that programs handle is in those languages? The language of business, science, programming, aviation, and engineering is english.

Not necessarily. I've seen code in non-English languages, i.e. when the identifiers are non-English. But of course, most programming languages will use English for keywords and built-in functions.

-- /Jacob Carlborg
Dec 02 2017
On Friday, 1 December 2017 at 23:16:45 UTC, H. S. Teoh wrote:

[...] This is not true in Asia, esp. where the CJK block is extensively used. A CJK block character is 3 bytes in UTF-8, meaning that string sizes are 150% of the UCS2 encoding. If your code contains a lot of CJK text, that's a lot of bloat.

That's true in theory; in practice it's not that severe, as the CJK languages are never isolated and appear embedded in a lot of ASCII. You can read here a case study [1] which shows 106% for Simplified Chinese, 76% for Traditional Chinese, 129% for Japanese and 94% for Korean. These numbers are for pure text. Publish it on the web embedded in bloated html and there goes the size advantage of UTF-16.

But then again, in non-Latin locales you'd generally store your strings separately of the executable (usually in l10n files), so this may not be that big an issue. But the blanket statement "Most strings are in ASCII" is not correct.

False, in the sense that isolated pure text is rare and is generally delivered inside some file format, most times ASCII based like docx, odf, tmx, xliff, akoma ntoso etc...

[1]: https://stackoverflow.com/questions/6883434/at-all-times-text-encoded-in-utf-8-will-never-give-us-more-than-a-50-file-size
Dec 02 2017
On Saturday, 2 December 2017 at 10:35:50 UTC, Patrick Schluter wrote:

[...] You can read here a case study [1] which shows 106% for Simplified Chinese, 76% for Traditional Chinese, 129% for Japanese and 94% for Korean. These numbers are for pure text. [...]

106% for Korean; I copied the wrong column. Traditional Chinese was smaller, probably because of whitespace.
Dec 02 2017
On Friday, 1 December 2017 at 23:16:45 UTC, H. S. Teoh wrote:On Fri, Dec 01, 2017 at 03:04:44PM -0800, Walter Bright via Digitalmars-d wrote:Yep, that's why five years back many of the major Chinese sites were still not using UTF-8: http://xahlee.info/w/what_encoding_do_chinese_websites_use.html That led that Chinese guy to also rant against UTF-8 a couple years ago: http://xahlee.info/comp/unicode_utf8_encoding_propaganda.html Considering China buys more smartphones than the US and Europe combined, it's time people started recognizing their importance when it comes to issues like this: https://www.statista.com/statistics/412108/global-smartphone-shipments-global-region/ Regarding the unique representation issue Jonathan brings up, I've heard people say that was to provide an easier path for legacy encodings, ie some used combining characters and others didn't, so Unicode chose to accommodate both so both groups would move to Unicode. It would be nice if the Unicode people spent their time pruning and regularizing what they have, rather than adding more useless stuff. Speaking of which, completely agree with Walter and Jonathan that there's no need to add emoji and other such symbols to Unicode, should have never been added. Unicode is supposed to standardize long-existing characters, not promote marginal new symbols to characters. If there's a real need for it, chat software will figure out a way to do it, no need to add such symbols to the Unicode character set.On 11/30/2017 9:23 AM, Kagamin wrote:This is not true in Asia, esp. where the CJK block is extensively used. A CJK block character is 3 bytes in UTF-8, meaning that string sizes are 150% of the UCS2 encoding. If your code contains a lot of CJK text, that's a lot of bloat.On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole wrote:"was" :-) Those are pretty much pre-surrogate pair designs, or based on them (Dart compiles to JavaScript, for example). UCS2 has serious problems: 1. Most strings are in ascii, meaning UCS2 doubles memory consumption. Strings in the executable file are twice the size.Be aware Microsoft is alone in thinking that UTF-16 was awesome. Everybody else standardized on UTF-8 for Unicode.UCS2 was awesome. UTF-16 is used by Java, JavaScript, Objective-C, Swift, Dart and ms tech, which is 28% of tiobe index.
Dec 02 2017
On Saturday, 2 December 2017 at 22:16:09 UTC, Joakim wrote:

[...] Yep, that's why five years back many of the major Chinese sites were still not using UTF-8: http://xahlee.info/w/what_encoding_do_chinese_websites_use.html

Summary: Taiwan sites almost all use UTF-8. Very old ones still use BIG5. Mainland China sites mostly still use GBK or GB2312, but a few newer ones use UTF-8. Many top Japan and Korea sites also use UTF-8, but some use EUC (Extended Unix Code) variants. This probably means that UTF-8 might dominate in the future. mmmh

That led that Chinese guy to also rant against UTF-8 a couple years ago: http://xahlee.info/comp/unicode_utf8_encoding_propaganda.html

A rant from someone reproaching a video for not giving reasons why utf-8 is good, while himself giving no reasons why utf-8 is bad. I'm not denying the issues with utf-8, only that the ranter doesn't provide any useful info on what issues "Asians" encounter with it, besides legacy reasons (which are important, but do not enter into judging the technical quality of an encoding). Add to that that he advocates for GB18030, which is quite inferior to utf-8 except in the legacy support area (here are some of the advantages of utf-8 that GB18030 does not possess: auto-synchronization, algorithmic mapping of codepoints, error detection). If his only beef with utf-8 is the size of CJK text, then he shouldn't argue for UTF-32, as he seems to do at the end.
Dec 03 2017
On 12/2/17 5:16 PM, Joakim wrote:Yep, that's why five years back many of the major Chinese sites were still not using UTF-8: http://xahlee.info/w/what_encoding_do_chinese_websites_use.html That led that Chinese guy to also rant against UTF-8 a couple years ago: http://xahlee.info/comp/unicode_utf8_encoding_propaganda.htmlBTW has anyone been in contact with Xah Lee? Perhaps we could commission him to write some tutorial material for D. -- Andrei
Dec 04 2017
On Monday, 4 December 2017 at 21:23:51 UTC, Andrei Alexandrescu wrote:On 12/2/17 5:16 PM, Joakim wrote:I traded email with him last summer, emailed you his email address just now.Yep, that's why five years back many of the major Chinese sites were still not using UTF-8: http://xahlee.info/w/what_encoding_do_chinese_websites_use.html That led that Chinese guy to also rant against UTF-8 a couple years ago: http://xahlee.info/comp/unicode_utf8_encoding_propaganda.htmlBTW has anyone been in contact with Xah Lee? Perhaps we could commission him to write some tutorial material for D. -- Andrei
Dec 04 2017
On Tuesday, 28 November 2017 at 03:01:33 UTC, A Guy With an Opinion wrote:

- Some of the errors from DMD are a little strange.

Yes, indeed, and many of them don't help much in finding the real source of your problem. I think improvements to dmd's error messages would be very welcome.

- ...however, where are all of the collections? No Queue? No Stack? No HashTable?

I always say "meh" to that because any second year student can slap those together in... well, for a second year student, maybe a couple hours, but after that you're looking at just a few minutes, especially leveraging D's built in arrays and associative arrays as your foundation. Sure, they'd be nice to have, but it isn't a dealbreaker in the slightest. Try turning Dictionary<string, string> into D's string[string], for example.

Sometimes I want where it is physically on my file system to be different from how I include it in other source files.

This is a common misconception, though one promoted by several of the tools: you don't actually need to match file system layout to modules. OK, sure, D does require one module == one file. But the file name and location is not actually tied to the import name you use in code. They can be anything, you just need to pass the list of files to the compiler so it can parse them and figure out the names.

- Attributes. I had another post in the Learn forum about attributes which was unfortunate.

Yeah, of course, from my post there you know my basic opinion on them. I've written in more detail about them elsewhere and don't feel like it tonight, but I think they are a big failure right now... but they could be fixed if we're willing to take a few risks (and I think we are).

- Immutable. I'm not sure I fully understand it. On the surface it seemed like const but transitive.

const is transitive too. So the difference is really that `const` means YOU won't change it, whereas `immutable` means NOBODY will change it. What's important there is that to make something immutable, you need to prove to the compiler's satisfaction that nobody else can change it either. const/immutable in D isn't as common as in its family of languages (C++ notably), but when you do get to use it - at least once you get to know it - it is useful.

I was returning a value type, so I don't see why passing it in assert(object.errorCount == 0) would have triggered errors.

Was the object itself immutable? I suspect you wrote something like this:

immutable int errorCount() { return ...; }

But this is a curious syntax... the `immutable` there actually applies to the *object*, not the return value! It means you can call this method on an immutable object (in fact, it means you MUST call it on an immutable object; const is the middle ground that allows you to call it on either).

immutable(int) errorCount() { return ...; }

note the parens, is how you apply it to the return value. Yes, this is kinda weird, and style guides tend to suggest putting the qualifiers after the argument list for the `this` thing instead of before... but the language allows it before, so it trips up a LOT of people like this. (There's a fuller sketch of this at the end of this post.)

The type string seems to be an immutable(char[]) which works exactly the way I was expecting,

It is actually `immutable(char)[]`. The parens are important here - it applies to the contents of the array, but not the array itself.

+- Unicode support is good. Although I think D's string type should have probably been utf16 by default. Especially considering the utf module states:

Note that it has UTF-16 built in as well, with almost equal support.
Put `w` at the end of a literal:

`"this literal is UTF-16"w` // notice the w after the closing quote

and you get utf16. It considers that to be `wstring` instead of `string`, but it works basically the same. If you are doing a lot of Windows API work, this is pretty useful!

That was cool, and I'm not even sure if that is what they were really meant to enable.

yes, indeed. plugging my book https://www.packtpub.com/application-development/d-cookbook i talk about much of this stuff in there
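A compilable sketch of that qualifier-placement difference (hypothetical type, not the original poster's code):

struct Counter
{
    int count;

    // Qualifier written before the declaration applies to `this`:
    // this method is callable only on immutable objects.
    immutable int errorCount() { return count; }

    // The parenthesized form qualifies the return type instead;
    // the method itself is an ordinary one.
    immutable(int) errorCount2() { return count; }
}

void main()
{
    immutable Counter a = Counter(0);
    assert(a.errorCount == 0);    // fine: a is immutable

    Counter b;
    // assert(b.errorCount == 0); // error: b is not immutable
    assert(b.errorCount2 == 0);   // fine: qualifier is on the return value
}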
Nov 27 2017
On Tuesday, 28 November 2017 at 04:24:46 UTC, Adam D. Ruppe wrote:immutable(int) errorCount() { return ...; }I actually did try something like that, because I remembered seeing the parens around the string definition. I think at that point I was just so riddled with errors I just took a step back and went back to something I know. Just to make sure I wasn't going insane.
Nov 27 2017
On Tuesday, 28 November 2017 at 03:01:33 UTC, A Guy With an Opinion wrote:

+ D code so far is pushing me towards more "flat" code (for a lack of a better way to phrase it), and so far that has helped tremendously when it comes to porting C#. C# is kind of the opposite. With its namespace -> class -> method coupled with lock, using, etc... you tend to do a lot of nesting. You are generally 3 '{' in before any true logic even begins. Then couple that with try/catch, IDisposable/using, locking, and then if/else, and it can get quite chaotic very easily. So right away, I saw my C# code simplify as I translated it, and I think it has to do with the flatness. I'm not sure if that opinion will hold when I delve into 'static if' a little more, but so far my uses of it haven't really dampened that opinion.

I come from a heavy C#/C++ background. I also *felt* this as well, but never really consciously thought about it, until you mentioned it :-)

- Some of the errors from DMD are a little strange. I don't want to crap on this too much, because for the most part it's fine. However occasionally it throws errors I still can't really work out why THAT is the error it gave me. Some of you may have seen my question in the "Learn" forum about not knowing to use static in an embedded class, but the error was the following: Error: 'this' is only defined in non-static member functions

Please submit things like this to the issue tracker. They are very easy to fix, and if I'm aware of them, I'll probably do the work. But, please provide a code example and offer a suggestion of what you would prefer it to say; it just makes things easier.

- Modules. I like modules better than #include, but I don't like how there is this gravity that kind of pulls me to associate a module with a file. It appears you don't have to, because I can do the package thing, but whenever I try to do things outside that one idiom I end up in a soup of errors. I'm sure I'm just not used to it, but so far it's been a little dissatisfying. Sometimes I want where it is physically on my file system to be different from how I include it in other source files. C#'s namespaces are really the standard to beat or meet.

I feel the same. I don't like that modules are tied to files; it seems like such an arbitrary limitation. We're not alone: https://youtu.be/6_xdfSVRrKo?t=353

- Attributes. I had another post in the Learn forum about attributes which was unfortunate. At first I was excited because it seems like on the surface it would help me write better code, but it gets a little tedious and tiresome to have to remember to decorate code with them. It seems like most of them should have been the defaults. I would have preferred if the compiler helped me and reminded me. I asked if there was a way to enforce them globally, which I guess there is, but I guess there's also not a way to turn some of them off afterwards. A bit unfortunate. But at least I can see some solutions to this.

Yep. One of my pet peeves in D.

- The defaults for primitives seem off. They seem to encourage errors. I don't think that is the best design decision even if it encourages the errors to be caught as quickly as possible. I think the better decision would be to not have the errors occur. When I asked about this, there seemed to be a disassociation between the spec and the implementation. The spec says a declaration should error if not explicitly set, but the implementation just initializes them to something that is likely to error. Like NaN for floats which I would have thought would have been 0 based on prior experiences with other languages.

Another one of my pet peeves in D.
Though this post (http://forum.dlang.org/post/tcldaatzzbhjoamnvniu@forum.dlang.org) made me realize we might be able to do something about that.

> +- Unicode support is good. Although I think D's string type should have probably been utf16 by default. Especially considering the utf module states: "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'."

See http://utf8everywhere.org/

> + Templates seem powerful. I've only fiddled thus far, but I don't think I've quite comprehended their usefulness yet. It will probably take me some time to figure out how to wield them effectively. One thing I accidentally stumbled upon that I liked was that I could simulate inheritance in structs with them, by using the mixin keyword. That was cool, and I'm not even sure if that is what they were really meant to enable.

Templates, CTFE, and mixins are gravy! And D's the only language I know of that has this symbiotic feature set.

> So those are just some of my thoughts. Tell me why I'm wrong :P

I share much of your perspective. Thanks for the interesting read.

Mike
Nov 27 2017
On Tuesday, 28 November 2017 at 04:37:04 UTC, Michael V. Franklin wrote:
> Please submit things like this to the issue tracker. They are very easy to fix, and if I'm aware of them, I'll probably do the work. But please provide a code example and offer a suggestion of what you would prefer it to say; it just makes things easier.

I'd be happy to submit an issue, but I'm not quite sure I'd be the best to determine an error message (at least not this early), mainly because I have no clue what it was yelling at me about. I only knew to add static because I told people my intentions and they suggested it. I guess having a non-static nested class is a valid feature imported from the Java world; I'm just not as familiar with that specific feature of Java. Therefore I have no idea what the text really had to do with anything. Maybe appending "if you meant to make a static class" would have been helpful. I fiddled with Rust a little too, and that's something they tend to do very well: verbose error messages.

> We're not alone: https://youtu.be/6_xdfSVRrKo?t=353

And he was so much better at articulating it than I was. Another [...]
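For reference, the shape of code that seems to trigger this (a sketch reconstructed from the description; the class names are made up, since the original snippet wasn't posted):

    class Outer
    {
        class Inner { }  // non-static: carries a hidden reference to an Outer instance

        static Inner make()
        {
            return new Inner(); // error: constructing Inner needs an enclosing Outer 'this'
        }
    }

Declaring it as 'static class Inner' removes the hidden outer reference, and the error goes away.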
Nov 27 2017
On Tuesday, 28 November 2017 at 04:48:57 UTC, A Guy With an Opinion wrote:
> I'd be happy to submit an issue, but I'm not quite sure I'd be the best to determine an error message (at least not this early), mainly because I have no clue what it was yelling at me about. I only knew to add static because I told people my intentions and they suggested it. I guess having a non-static nested class is a valid feature imported from the Java world.

If this was on the forum, please point me to it. I'll see if I can understand what's going on and do something about it.

Thanks,
Mike
Nov 27 2017
On Tuesday, 28 November 2017 at 05:16:54 UTC, Michael V. Franklin wrote:
> If this was on the forum, please point me to it. I'll see if I can understand what's going on and do something about it.

https://forum.dlang.org/thread/vcvlffjxowgdvpvjsijq@forum.dlang.org
Nov 27 2017
On 11/27/17 10:01 PM, A Guy With an Opinion wrote:
> Hi,

Hi Guy, welcome, and I wanted to say I was saying "me too" while reading your post. I used C# for years, and the biggest thing I agree with you on is the generic programming. I was also using D at the time, and using generics felt like eating a superbly under-baked cake. A few points:

> - Some of the errors from DMD are a little strange. I don't want to crap on this too much, because for the most part it's fine. However, occasionally it throws errors where I still can't really work out why THAT is the error it gave me. Some of you may have seen my question in the "Learn" forum about not knowing to use static in an embedded class, but the error was the following:
>
> Error: 'this' is only defined in non-static member functions

Yes, this is simply a bad error message. Many of our bad error messages come from something called "lowering", where one piece of code is converted to another piece of code, and then the error message happens on the converted code. So essentially you are getting errors on code you didn't write! They are more difficult to fix, since we can't change the real error message (it applies to real code as well), and the code that generated the lowered code is decoupled from the error. I think this is one of those cases.

> I'd say the errors so far are above some of the cryptic stuff C++ can throw at you (however, I haven't delved that deeply into D templates yet, so don't hold me to this yet), but in terms of quality I'd put it [...]

Once you use templates a lot, the error messages explode in cryptology :) But generally, you can get the gist of your errors if you can half-way decipher the mangling.

> - ...however, where are all of the collections? No Queue? No Stack? No HashTable? I've read that it's not a big focus because some of the built-in [...] not looking forward to having to hand roll my own or use something that [...] aren't fundamentally them. This is definitely the biggest negative I've come across. I want a queue, not something that *can* behave as a queue. I definitely expected more from a language that is this old.

I haven't touched this in years, but it should still work pretty well (if you try it and it doesn't compile for some reason, please submit an issue there): https://github.com/schveiguy/dcollections [...] interface hierarchy. That being said, Queue is just so easy to implement given a linked list, I never bothered :)

> + Unit tests. Finally built in unit tests. Enough said here. If the lack of collections was the biggest negative, this is the biggest positive. I would like to enable them at build time if possible though.

+1000

About the running of unit tests at build time, many people version their main function like this:

version(unittest) void main() {}
else int main(string[] args) // real declaration
{
   ...
}

This way, when you build with -unittest, you only run unit tests, and exit immediately. So enabling them at build time is quite easy.

> - Attributes. I had another post in the Learn forum about attributes which was unfortunate. At first I was excited because it seems like on the surface it would help me write better code, but it gets a little tedious and tiresome to have to remember to decorate code with them. It seems like most of them should have been the defaults. I would have preferred if the compiler helped me and reminded me. I asked if there was a way to enforce them globally, which I guess there is, but I guess there's also not a way to turn some of them off afterwards. A bit unfortunate.
> But at least I can see some solutions to this.

If you are using more templates (and I use them more the more I write D code), you will not have this problem. Templates infer almost all attributes.

> - Immutable. I'm not sure I fully understand it. On the surface it seemed like const but transitive. I tried having a method return an immutable value, but when I used it in my unit test I got some weird errors about objects not being able to return immutable (I forget the exact error... apologies). I refactored to use const, and it all worked as I expected, but I don't get why the immutable didn't work. I was returning a value type, so I don't see why assert(object.errorCount == 0) would have triggered errors. But it did.

This is likely because of Adam's suggestion -- you were incorrectly declaring a function that returned an immutable like this:

immutable T foo();

Where the immutable *doesn't* apply to the return value, but to the function itself. immutable applied to a function is really applying immutable to the 'this' reference.

> + Templates seem powerful. I've only fiddled thus far, but I don't think I've quite comprehended their usefulness yet. It will probably take me some time to figure out how to wield them effectively. One thing I accidentally stumbled upon that I liked was that I could simulate inheritance in structs with them, by using the mixin keyword. That was cool, and I'm not even sure if that is what they were really meant to enable.

Templates and generative programming are what hooks you on D. You will be spoiled when you work in other languages :)

-Steve
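To make the placement rule concrete, a minimal sketch (type and member names made up):

    struct Counter
    {
        int n;

        // 'immutable' out front qualifies the hidden 'this' parameter:
        // this method is callable only on an immutable(Counter)
        immutable int count() { return n; }

        // parenthesized form: the *return type* is immutable(int),
        // and 'this' stays mutable
        immutable(int) count2() { return n; }
    }

    void main()
    {
        Counter c;
        // c.count();          // error: mutable 'c' can't call an immutable method
        auto a = c.count2();   // fine

        immutable Counter ic;
        auto b = ic.count();   // fine
    }

The same placement rule applies to const; since a const method is callable on a mutable object, that would explain why the const refactor appeared to just work.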
Nov 28 2017
On Tuesday, 28 November 2017 at 13:17:16 UTC, Steven Schveighoffer wrote:
> This is likely because of Adam's suggestion -- you were incorrectly declaring a function that returned an immutable like this:
>
> immutable T foo();

That's exactly what it was, I think. As I stated before, I tried to do immutable(T), but I was drowning in errors at that point, so I just took a step back. I'll try to refactor it back to using immutable. Honestly, I obviously didn't quite know what I was doing.
Nov 28 2017
On Tuesday, 28 November 2017 at 13:17:16 UTC, Steven Schveighoffer wrote:
> https://github.com/schveiguy/dcollections

On Tuesday, 28 November 2017 at 03:37:26 UTC, rikki cattermole wrote:
> https://github.com/economicmodeling/containers

Thanks. I'll check both out. It's not that I don't want to write them; it's just that I don't want to stop what I'm doing when I need them and write them. It takes me out of my thought process.
Nov 28 2017
On Tuesday, 28 November 2017 at 03:01:33 UTC, A Guy With an Opinion wrote:
> So those are just some of my thoughts. Tell me why I'm wrong :P

You are not supposed to come to this forum with well-balanced opinions and reasonable arguments. It's not colourful enough to be heard! Instead make a dent in the universe. Prepare your most impactful, most offensive statements to push your personal agenda of what your own system programming language would be like, if you had the stamina. Use doubtful analogies and references to languages with wildly different goals than D. Prepare to abuse the volunteers, and say how much you would dare to use D, if only it would do "just this one obvious change". Having this feature would make the BlobTech industry switch to D overnight! And you haven't asked for any new feature, especially no new _syntax_ was demanded! I don't know, find anything: "It would be nice to have a shortcut syntax for when you want to add zero. Writing 0 + x is cumbersome, when +x would do it. It has the nice benefit of unifying unary and binary operators, and thus leads to a simplified implementation."

Do you realize the dangers of looking satisfied?
Nov 28 2017
On Tuesday, 28 November 2017 at 03:01:33 UTC, A Guy With an Opinion wrote:
> - Attributes. I had another post in the Learn forum about attributes which was unfortunate. At first I was excited because it seems like on the surface it would help me write better code, but it gets a little tedious and tiresome to have to remember to decorate code with them. It seems like most of them should have been the defaults. I would have preferred if the compiler helped me and reminded me. I asked if there was a way to enforce them globally, which I guess there is, but I guess there's also not a way to turn some of them off afterwards. A bit unfortunate. But at least I can see some solutions to this.

Attributes were one of my biggest hurdles when working on my own projects. For example, it's a huge PITA when you have to add a debug writeln deep down in your call stack, and it ends up violating a bunch of function attributes further up. Thankfully, wrapping statements in debug {} allows you to ignore pure and @safe violations in that code if you compile with the flag -debug.

Also, you can apply attributes to your whole project by adding them to main:

void main(string[] args) @safe {}

Although this isn't recommended, as almost no program can be completely @safe. You can do it on a per-file basis by putting the attributes at the top like so:

@safe:
pure:
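As a small sketch of that debug escape hatch (compile with -debug to activate the block):

    import std.stdio : writeln;

    int square(int x) pure @safe
    {
        // writeln is impure, but statements under a debug condition
        // are exempt from pure/@safe checking
        debug writeln("x = ", x);
        return x * x;
    }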
Nov 28 2017
On Tuesday, 28 November 2017 at 16:14:52 UTC, Jack Stouffer wrote:
> You can do it on a per-file basis by putting the attributes at the top like so

That doesn't quite work since it doesn't descend into aggregates. And you can't turn most of them off.
Nov 28 2017
On 2017-11-28 17:24, Adam D. Ruppe wrote:
> That doesn't quite work since it doesn't descend into aggregates. And you can't turn most of them off.

And it doesn't work if your project is a library.

-- 
/Jacob Carlborg
Nov 28 2017
On Tuesday, 28 November 2017 at 16:24:56 UTC, Adam D. Ruppe wrote:
> That doesn't quite work since it doesn't descend into aggregates. And you can't turn most of them off.

I take it adding those inverse attributes is no trivial thing?
Nov 28 2017
On Tuesday, 28 November 2017 at 19:34:27 UTC, A Guy With an Opinion wrote:
> I take it adding those inverse attributes is no trivial thing?

It would require a DIP: https://github.com/dlang/DIPs

This DIP is related (https://github.com/dlang/DIPs/blob/master/DIPs/DIP1012.md) but I don't know what's happening with it.

Mike
Nov 28 2017
On Tuesday, 28 November 2017 at 19:39:19 UTC, Michael V. Franklin wrote:
> This DIP is related (https://github.com/dlang/DIPs/blob/master/DIPs/DIP1012.md) but I don't know what's happening with it.

It's awaiting formal review. I'll move it forward when the formal review queue clears out a bit.
Nov 28 2017
On Tuesday, 28 November 2017 at 22:08:48 UTC, Mike Parker wrote:
> It's awaiting formal review. I'll move it forward when the formal review queue clears out a bit.

How well does Phobos play with it? I'm finding, for instance, that it's not playing too well with nothrow. Things throw and I don't understand why.
Nov 29 2017
On Tuesday, 28 November 2017 at 19:34:27 UTC, A Guy With an Opinion wrote:
> I take it adding those inverse attributes is no trivial thing?

Technically, it is extremely trivial. Politically, that's a different matter. There have been arguments before about the words or the syntax (is it "@gc" or "@nogc(false)", for example? TBH I think the latter is kinda elegant, but the former works too; I just want something that works) and the process (so much paperwork!) and all kinds of nonsense.
Nov 28 2017
On Tuesday, 28 November 2017 at 16:14:52 UTC, Jack Stouffer wrote:
> you can apply attributes to your whole project by adding them to main
>
> void main(string[] args) @safe {}
>
> Although this isn't recommended, as almost no program can be completely @safe.

In fact I believe it can be. When you have something unsafe, you can manually wrap it with @trusted. The same goes for nothrow, since you can catch everything thrown. But putting @nogc on main is of course not recommended except in special cases, and pure is completely out of the question.
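A small sketch of both escape hatches (function names made up):

    import std.stdio : writeln;

    void log(string msg) nothrow
    {
        try
            writeln(msg);        // writeln may throw; catching satisfies nothrow
        catch (Exception e) { }
    }

    void caller() @safe
    {
        int[3] arr = [1, 2, 3];
        // .ptr and pointer arithmetic are not allowed in @safe code;
        // a @trusted lambda vouches for this one expression
        int second = () @trusted { return *(arr.ptr + 1); }();
    }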
Nov 30 2017
On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
> +- Unicode support is good. Although I think D's string type should have probably been utf16 by default. Especially considering the utf module states: "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'." Seems like the natural fit for me. Plus for the vast majority of use cases I am pretty guaranteed a char = codepoint. Not the biggest issue in the world and maybe I'm just being overly critical here.

Sooner or later your code will exhibit bugs if it assumes that char==codepoint with UTF16, because of surrogate pairs.

https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java

As far as I can tell, pretty much the only users of UTF16 are Windows programs. Everyone else uses UTF8 or UCS32.

I recommend using UTF8.
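A two-assert illustration of the trap in D:

    import std.range : walkLength;

    void main()
    {
        wstring s = "\U0001F4A9"w;  // a single character outside the BMP
        assert(s.length == 2);      // two UTF-16 code units: a surrogate pair
        assert(s.walkLength == 1);  // one code point once auto-decoded
    }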
Nov 30 2017
On Thursday, 30 November 2017 at 10:19:18 UTC, Walter Bright wrote:
> Sooner or later your code will exhibit bugs if it assumes that char==codepoint with UTF16, because of surrogate pairs.
>
> https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
>
> As far as I can tell, pretty much the only users of UTF16 are Windows programs. Everyone else uses UTF8 or UCS32. I recommend using UTF8.

Java, .NET, Qt, Javascript, and a handful of others use UTF-16 too, some starting off with the earlier UCS-2:

https://en.m.wikipedia.org/wiki/UTF-16#Usage

Not saying either is better, each has their flaws, just pointing out it's more than just Windows.
Nov 30 2017
On 11/30/2017 2:39 AM, Joakim wrote:
> Java, .NET, Qt, Javascript, and a handful of others use UTF-16 too, some starting off with the earlier UCS-2:
>
> https://en.m.wikipedia.org/wiki/UTF-16#Usage
>
> Not saying either is better, each has their flaws, just pointing out it's more than just Windows.

I stand corrected.
Nov 30 2017
On Thursday, November 30, 2017 03:37:37 Walter Bright via Digitalmars-d wrote:
> On 11/30/2017 2:39 AM, Joakim wrote:
>> Java, .NET, Qt, Javascript, and a handful of others use UTF-16 too, some starting off with the earlier UCS-2: https://en.m.wikipedia.org/wiki/UTF-16#Usage
>
> I stand corrected.

I get the impression that the stuff that uses UTF-16 is mostly stuff that picked an encoding early on in the Unicode game and thought that they picked one that guaranteed that a code unit would be an entire character. Many of them picked UCS-2 and then switched later to UTF-16, but once they picked a 16-bit encoding, they were kind of stuck. Others - most notably C/C++ and the *nix world - picked UTF-8 for backwards compatibility, and once it became clear that UCS-2 / UTF-16 wasn't going to cut it for a code unit representing a character, most stuff that went Unicode went UTF-8.

Language-wise, I think that most of the UTF-16 use is driven by Java and C# (the latter both because it was copying Java and because the Win32 API had gone with UCS-2 / UTF-16). So, that's had a lot of influence on folks, though most others have gone with UTF-8 for backwards compatibility and because it typically takes up less space for non-Asian text. But the use of UTF-16 in Windows and Java does seem to have resulted in some folks thinking that wide characters mean Unicode, and narrow characters mean ASCII.

I really wish that everything would just go to UTF-8 and that UTF-16 would die, but that would just break too much code. And if we were willing to do that, I'm sure that we could come up with a better encoding than UTF-8 (e.g. getting rid of Unicode normalization as being a thing and never having multiple encodings for the same character), but _that_'s never going to happen.

- Jonathan M Davis
Nov 30 2017
On Thursday, 30 November 2017 at 17:56:58 UTC, Jonathan M Davis wrote:
> I get the impression that the stuff that uses UTF-16 is mostly stuff that picked an encoding early on in the Unicode game and thought that they picked one that guaranteed that a code unit would be an entire character.

I don't think that's true though. Haven't you always been able to combine two codepoints into one visual representation (Ä for example)? To me it's still two characters to look for when going through the string, but the UI or text interpreter might choose to combine them. So in certain domains, such as trying to visually represent the character, yes, a codepoint is not a character, if by character you mean the visual representation. But what we are referring to as a character can kind of morph depending on context. When you are running through the data in the algorithm behind the scenes, you care about the *information*, therefore the codepoint. And we are really just having a semantics battle if someone calls that a character.

> Many of them picked UCS-2 and then switched later to UTF-16, but once they picked a 16-bit encoding, they were kind of stuck. Others - most notably C/C++ and the *nix world - picked UTF-8 for backwards compatibility, and once it became clear that UCS-2 / UTF-16 wasn't going to cut it for a code unit representing a character, most stuff that went Unicode went UTF-8.

That's only because C used ASCII and a char was a byte. UTF-8 is in line with this, so literally nothing needs to change to get pretty much the same behavior. It makes sense. With this in mind, it actually might make sense for D to use it.
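The Ä case in D, as a quick sketch:

    import std.uni : NFC, normalize;

    void main()
    {
        string composed   = "\u00C4";  // Ä as a single code point
        string decomposed = "A\u0308"; // 'A' plus COMBINING DIAERESIS
        assert(composed != decomposed);                // different code points...
        assert(normalize!NFC(decomposed) == composed); // ...same text after NFC
    }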
Nov 30 2017
On Thursday, 30 November 2017 at 17:56:58 UTC, Jonathan M Davis wrote:
> Language-wise, I think that most of the UTF-16 use is driven by Java and C# (the latter both because it was copying Java and because the Win32 API had gone with UCS-2 / UTF-16). So, that's had a lot of influence on folks, though most others have gone with UTF-8 for backwards compatibility and because it typically takes up less space for non-Asian text. But the use of UTF-16 in Windows and Java does seem to have resulted in some folks thinking that wide characters mean Unicode, and narrow characters mean ASCII.

I think it also simplifies the logic. You are not always looking to represent the codepoints symbolically. You are just trying to see what information is in them. Therefore, if you can practically treat a codepoint as the unit of data behind the scenes, it simplifies the logic.
Nov 30 2017
On Thursday, November 30, 2017 18:32:46 A Guy With a Question via Digitalmars-d wrote:
> I think it also simplifies the logic. You are not always looking to represent the codepoints symbolically. You are just trying to see what information is in them. Therefore, if you can practically treat a codepoint as the unit of data behind the scenes, it simplifies the logic.

Even if that were true, UTF-16 code units are not code points. If you want to operate on code points, you have to go to UTF-32. And even if you're at UTF-32, you have to worry about Unicode normalization, otherwise the same information can be represented differently even if all you care about is code points and not graphemes. And of course, some stuff really does care about graphemes, since those are the actual characters.

Ultimately, you have to understand how code units, code points, and graphemes work and what you're doing with a particular algorithm so that you know at which level you should operate and where the pitfalls are. Some code can operate on code units and be fine; some can operate on code points; and some can operate on graphemes. But there is no one-size-fits-all solution that makes it all magically easy and efficient to use.

And UTF-16 does _nothing_ to improve any of this over UTF-8. It's just a different way to encode code points. And really, it makes things worse, because it usually takes up more space than UTF-8, and it makes it easier to miss when you screw up your Unicode handling, because more UTF-16 code units are valid code points than UTF-8 code units are, but they still aren't all valid code points. So, if you use UTF-8, you're more likely to catch your mistakes.

Honestly, I think that the only good reason to use UTF-16 is if you're interacting with existing APIs that use UTF-16, and even then, I think that in most cases, you're better off using UTF-8 and converting to UTF-16 only when you have to. Strings eat less memory that way, and mistakes are more easily caught. And if you're writing cross-platform code in D, then Windows is really the only place that you're typically going to have to deal with UTF-16, so it definitely works better in general to favor UTF-8 in D programs. But regardless, at least D gives you the tools to deal with the different Unicode encodings relatively cleanly and easily, so you can use whichever Unicode encoding you need to. Most D code is going to use UTF-8 though.

- Jonathan M Davis
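The three levels in D, as a small sketch:

    import std.range : walkLength;
    import std.uni : byGrapheme;

    void main()
    {
        string s = "e\u0301";                  // 'e' + COMBINING ACUTE ACCENT
        assert(s.length == 3);                 // UTF-8 code units
        assert(s.walkLength == 2);             // code points (auto-decoded)
        assert(s.byGrapheme.walkLength == 1);  // graphemes -- the visible character
    }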
Nov 30 2017
On 11/30/2017 9:56 AM, Jonathan M Davis wrote:
> I'm sure that we could come up with a better encoding than UTF-8 (e.g. getting rid of Unicode normalization as being a thing and never having multiple encodings for the same character), but _that_'s never going to happen.

UTF-8 is not the cause of that particular problem; it's caused by the Unicode committee being a committee. Other Unicode problems are caused by the committee trying to add semantic information to code points, which causes nothing but problems. I.e. the committee forgot that Unicode is a character set, and nothing more.
Dec 01 2017
On Friday, December 01, 2017 15:54:31 Walter Bright via Digitalmars-d wrote:
> UTF-8 is not the cause of that particular problem; it's caused by the Unicode committee being a committee. Other Unicode problems are caused by the committee trying to add semantic information to code points, which causes nothing but problems. I.e. the committee forgot that Unicode is a character set, and nothing more.

Oh, definitely. UTF-8 is arguably the best that Unicode has, but Unicode in general is what's broken, because the folks designing it made poor choices. And personally, I think that their worst decisions tend to be at the code point level (e.g. having the same character being representable by different combinations of code points).

Quite possibly the most depressing thing that I've run into with Unicode though was finding out that emojis had their own code points. Emojis are specifically representable by a sequence of existing characters (usually ASCII), because they came from folks trying to represent pictures with text. The fact that they're then trying to put those pictures into the Unicode standard just blatantly shows that the Unicode folks have lost sight of what they're up to. It's like if they started trying to add Unicode characters for words. It makes no sense. But unfortunately, we just have to live with it... :(

- Jonathan M Davis
Dec 01 2017
On 12/1/2017 8:08 PM, Jonathan M Davis wrote:
> And personally, I think that their worst decisions tend to be at the code point level (e.g. having the same character being representable by different combinations of code points).

Yup. I've presented that point of view a couple times on HackerNews, and some Unicode people took umbrage at that. The case they presented fell a little flat.

> Quite possibly the most depressing thing that I've run into with Unicode though was finding out that emojis had their own code points. Emojis are specifically representable by a sequence of existing characters (usually ASCII), because they came from folks trying to represent pictures with text. The fact that they're then trying to put those pictures into the Unicode standard just blatantly shows that the Unicode folks have lost sight of what they're up to. It's like if they started trying to add Unicode characters for words. It makes no sense. But unfortunately, we just have to live with it... :(

Yah, I've argued against that, too. And those "international" icons are arguably one of the dumber ideas to ever sweep the world, yet they seem to be celebrated without question. Have you ever tried to look up an icon in a dictionary? It doesn't work. So if you don't know what an icon means, you're hosed. If it is a word you don't understand, you can look it up in a dictionary. Furthermore, you don't need to know English to know what "ON" means. There is no more cognitive difficulty asking someone what "ON" means than there is asking what "|" means. Is an illiterate person from XxLand really going to understand that "|" means "ON" without help?

My car has a bunch of emoticons labeling the controls. I can't figure out what any of them do without reading the manual, or just pushing random buttons until what I want happens. One button has an icon on it that looks like a snowflake. What does that do? Turn on the A/C? Defrost the frosty windows? Set the AWD in slippery mode? Turn on the Christmas lights? On my pre-madness truck, they're labeled in English. Never had any trouble with that.

Part of the problem I've seen is that people do things like "vote for my emoji/icon and I'll vote for yours!" And then when they get something accepted, they wear it as a badge of status and write articles saying how you, too, can get your whatever accepted as an icon. It's madness, madness I say!
Dec 02 2017
On Saturday, 2 December 2017 at 10:20:10 UTC, Walter Bright wrote:
> Yup. I've presented that point of view a couple times on HackerNews, and some Unicode people took umbrage at that. The case they presented fell a little flat.
>
> [...]

Where it gets really fun is when there is color composition for emoticons:

U+1F466 = 👦
U+1F466 U+1F3FF = 👦🏿
Dec 02 2017
On Sat, Dec 02, 2017 at 02:20:10AM -0800, Walter Bright via Digitalmars-d wrote:
[...]
> My car has a bunch of emoticons labeling the controls. I can't figure out what any of them do without reading the manual, or just pushing random buttons until what I want happens. One button has an icon on it that looks like a snowflake. What does that do? Turn on the A/C? Defrost the frosty windows? Set the AWD in slippery mode? Turn on the Christmas lights?

The same can be argued for the icon mania started by the GUI craze in the 90's that has now become the de facto standard. Some icons are more obvious than others, but nowadays GUI toolbars are full of inscrutable icons of unclear meaning that are basically opaque unless you already have prior knowledge of what they're supposed to represent. Thankfully most(?) GUI programs have enough sanity left to provide tooltips with textual labels for what each button means. Still, it betrays the emperor's invisible clothes of the "graphics == intuitive" mantra -- you still have to learn the icons just like you have to learn the keywords of a text-based UI, before you can use the software effectively.

Reminds me also of the infamous Mystery Meat navigation style of the 90's, where people would use images for navigation weblinks on their website, such that you basically don't know where they're linking to until you click on them.

This is why I think GUIs and the whole "desktop metaphor" craze are heading the wrong direction, and why 95% of my computer usage is via a text terminal. There's a place for graphical interfaces, but it's gone too far these days. But thanks to Unicode emoticons, we can now have icons on my text terminal too, isn't that just wonderful?! Esp. when a missing/incompatible font causes them to show up as literal blank boxes. The power of a standardized, universal character set, lemme tell ya!

T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
Dec 02 2017
On 12/2/2017 5:59 PM, H. S. Teoh wrote:
> [...]

Even worse, companies go and copyright their icons, guaranteeing they have to be substantially different for every company! If there ever was an Emperor's New Clothes, it's icons and emojis.
Dec 02 2017
On 12/2/17 11:28 PM, Walter Bright wrote:
> Even worse, companies go and copyright their icons, guaranteeing they have to be substantially different for every company!

I like this site for icons. Only requires you to reference them in your about box: https://icons8.com/

-Steve
Dec 04 2017
On Sunday, 3 December 2017 at 01:59:58 UTC, H. S. Teoh wrote:
> Still, it betrays the emperor's invisible clothes of the "graphics == intuitive" mantra -- you still have to learn the icons just like you have to learn the keywords of a text-based UI, before you can use the software effectively.

What happened when you ran vi for the first time?
Dec 04 2017
On Saturday, 2 December 2017 at 04:08:54 UTC, Jonathan M Davis wrote:
> The fact that they're then trying to put those pictures into the Unicode standard just blatantly shows that the Unicode folks have lost sight of what they're up to. It's like if they started trying to add Unicode characters for words. It makes no sense. But unfortunately, we just have to live with it... :(

The real problem is that sometimes people don't feel like a little cat with a smiling face. Sometimes, people actually get pissed off at something, and would like to express it. Do the people on the unicode consortium consider such communication to be invalid? Where are the emojis for saying... I'm pissed off at this... or that...

(unicode consortium == emoji censorship)

https://www.google.com.au/search?q=fuck+you+emoticon&source=lnms&tbm=isch&sa=X&ved=0ahUKEwiWkMzMpOvXAhWIj5QKHVnGC5YQ_AUICigB&biw=1536&bih=736
Dec 02 2017
On Saturday, 2 December 2017 at 12:25:22 UTC, codephantom wrote:
> Do the people on the unicode consortium consider such communication to be invalid?

https://splinternews.com/violent-emoji-are-starting-to-get-people-in-trouble-wit-1793845130

On the other hand, try to google "emoji sexual"…
Dec 02 2017
On Saturday, 2 December 2017 at 16:44:56 UTC, Ola Fosheim Grøstad wrote:
> https://splinternews.com/violent-emoji-are-starting-to-get-people-in-trouble-wit-1793845130
>
> On the other hand, try to google "emoji sexual"…

No. Humans never express negative emotions, and also, never communicate a desire to have sex. That explains a lot about the unicode consortium. 's', 'e', 'x' is ok, just not together.

Q. What's the difference between a politician and an emoji?
A. Nothing. You cannot take either at face value.

...oops, politics again. I should know better.

But my wider point is, unicode emojis are useless if they only contain those that 'some' consider to be politically correct, or socially acceptable. The Unicode consortium is a bunch of ... (I don't have the unicode emoji representation yet to complete that sentence).
Dec 02 2017
On Sunday, 3 December 2017 at 01:11:14 UTC, codephantom wrote:
> But my wider point is, unicode emojis are useless if they only contain those that 'some' consider to be politically correct, or socially acceptable. The Unicode consortium is a bunch of ... (I don't have the unicode emoji representation yet to complete that sentence).

BTW, good article here, further demonstrating my point:

"We're talking about engineers that are concerned about standards and internationalization issues who now have to do something more in line with Apple or Google's marketing teams,"

https://www.buzzfeed.com/charliewarzel/thanks-to-apples-influence-youre-not-getting-a-rifle-emoji
Dec 02 2017
On Saturday, 2 December 2017 at 04:08:54 UTC, Jonathan M Davis wrote:
> Emojis are specifically representable by a sequence of existing characters (usually ASCII), because they came from folks trying to represent pictures with text.

They are used as symbols culturally, which is how written languages happen, so I think the real question is whether they have just implemented the ones that have become widespread over a long period of time or whether they have deliberately created completely new ones... It makes sense for the most used ones. E.g. I don't want "8-(3+4)" to render as "😳3+4" ;-)

There is also a difference between Ø and ∅, because the meaning is different. Too bad the same does not apply to arrows (math vs non-math usage). So yeah, they could do better, but it's not too bad. If something is widely used in a way that gives signs a different meaning, then it makes sense to introduce a new symbol for it, so that one can both render them slightly differently and so that programs can interpret them correctly.
Dec 02 2017
On Thursday, 30 November 2017 at 10:19:18 UTC, Walter Bright wrote:
> Sooner or later your code will exhibit bugs if it assumes that char==codepoint with UTF16, because of surrogate pairs.
>
> https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
>
> As far as I can tell, pretty much the only users of UTF16 are Windows programs. Everyone else uses UTF8 or UCS32. I recommend using UTF8.

I assume you meant UTF32 not UCS32, given UCS2 is Microsoft's half-assed UTF16.
Nov 30 2017
On 11/30/2017 2:47 AM, Nicholas Wilson wrote:
> I assume you meant UTF32 not UCS32, given UCS2 is Microsoft's half-assed UTF16.

I meant UCS-4, which is identical to UTF-32. It's hard keeping all that stuff straight. Sigh.

https://en.wikipedia.org/wiki/UTF-32
Nov 30 2017
On Thursday, 30 November 2017 at 11:41:09 UTC, Walter Bright wrote:
> I meant UCS-4, which is identical to UTF-32. It's hard keeping all that stuff straight. Sigh.
>
> https://en.wikipedia.org/wiki/UTF-32

It's also worth mentioning that the more I think about it, the UTF8 vs. UTF16 thing was probably not worth mentioning with the rest of the things I listed out. It's pretty minor and more of a preference.
Nov 30 2017
On 11/30/2017 5:22 AM, A Guy With a Question wrote:
> It's also worth mentioning that the more I think about it, the UTF8 vs. UTF16 thing was probably not worth mentioning with the rest of the things I listed out. It's pretty minor and more of a preference.

Both Windows and Java selected UTF16 before surrogates were added, so it was a reasonable decision made in good faith. But an awful lot of Windows/Java code has latent bugs in it because of not dealing with surrogates.

D is designed from the ground up to work smoothly with UTF8/UTF16 multi-codeunit encodings. If you do decide to use UTF16, please take advantage of this and deal with surrogates correctly. When you do decide to give up on UTF16 (!) and go with UTF8, your code will be easy to convert to UTF8.
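For instance, a foreach with a dchar loop variable decodes surrogate pairs on the fly:

    void main()
    {
        wstring s = "a\U0001F4A9b"w;  // 4 code units, 3 code points
        int n;
        foreach (dchar c; s)          // the surrogate pair arrives as one dchar
            ++n;
        assert(n == 3);
    }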
Nov 30 2017
On Thursday, 30 November 2017 at 10:19:18 UTC, Walter Bright wrote:
> Sooner or later your code will exhibit bugs if it assumes that char==codepoint with UTF16, because of surrogate pairs. [...] I recommend using UTF8.

As long as you understand its limitations, I think most bugs can be avoided. Where UTF16 breaks down is pretty well defined, and also super rare. I think UTF32 would be great too, but it seems like just a waste of space 99% of the time. UTF8 isn't horrible; I am not going to never use D because it uses UTF8 (that would be silly), especially when wstring also seems baked into the language. However, it can complicate code because you pretty much always have to assume character != codepoint outside of ASCII. I can see a reasonable person arguing that forcing you to assume character != code point is actually a good thing. And that is a valid opinion.
Nov 30 2017
On Thursday, November 30, 2017 13:18:37 A Guy With a Question via Digitalmars-d wrote:
> As long as you understand its limitations, I think most bugs can be avoided. Where UTF16 breaks down is pretty well defined, and also super rare. I think UTF32 would be great too, but it seems like just a waste of space 99% of the time. UTF8 isn't horrible; I am not going to never use D because it uses UTF8 (that would be silly), especially when wstring also seems baked into the language. However, it can complicate code because you pretty much always have to assume character != codepoint outside of ASCII. I can see a reasonable person arguing that forcing you to assume character != code point is actually a good thing. And that is a valid opinion.

The reality of the matter is that if you want to write fully valid Unicode, then you have to understand the differences between code units, code points, and graphemes, and since it really doesn't make sense to operate at the grapheme level for everything (it would be terribly slow and is completely unnecessary for many algorithms), you pretty much have to come to accept that in the general case, you can't assume that something like a char represents an actual character, regardless of its encoding.

UTF-8 vs UTF-16 doesn't change anything in that respect except for the fact that more characters fit fully in a single UTF-16 code unit than in a UTF-8 code unit, so it's easier to think that you're correctly handling Unicode when you actually aren't. And if you're not dealing with Asian languages, UTF-16 uses up more space than UTF-8. But either way, they're both wrong if you're trying to treat a code unit as a code point, let alone a grapheme. It's just that we have a lot of programmers who only deal with English and thus don't as easily hit the cases where their code is wrong. For better or worse, UTF-16 hides it better than UTF-8, but the problem exists in both.

Honestly, I think that the only good reason to use UTF-16 is if you're interacting with existing APIs that use UTF-16, and even then, I think that in most cases, you're better off using UTF-8 and converting to UTF-16 only when you have to. Strings eat less memory that way, and mistakes are more easily caught. And if you're writing cross-platform code in D, then Windows is really the only place that you're typically going to have to deal with UTF-16, so it definitely works better in general to favor UTF-8 in D programs. But regardless, at least D gives you the tools to deal with the different Unicode encodings relatively cleanly and easily, so you can use whichever Unicode encoding you need to. Most D code is going to use UTF-8 though.

- Jonathan M Davis
Nov 30 2017
On Thursday, 30 November 2017 at 17:40:08 UTC, Jonathan M Davis wrote:
> [...] And if you're not dealing with Asian languages, UTF-16 uses up more space than UTF-8.

Not even that in most cases. Only with unstructured text can it happen that UTF-16 needs less space than UTF-8. In most cases, the text is embedded in some sort of ML (html, odf, docx, tmx, xliff, akoma ntoso, etc...), which tips the balance back to the side of UTF-8.
Nov 30 2017
On Thursday, 30 November 2017 at 17:40:08 UTC, Jonathan M Davis wrote:
> English and thus don't as easily hit the cases where their code is wrong. For better or worse, UTF-16 hides it better than UTF-8, but the problem exists in both.

To give just an example of what can go wrong with UTF-16: reading a file in UTF-16 and converting it to something else like UTF-8 or UTF-32, reading block by block, and hitting an SMP codepoint exactly at the buffer limit -- high surrogate at the end of the first buffer, low surrogate at the start of the next. If you don't think about it => 2 invalid characters instead of your nice poop 💩 emoji character (emojis are in the SMP, and they are more and more frequent).
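A minimal sketch of that failure mode (the block size is contrived to force the split):

    import std.exception : assertThrown;
    import std.utf : decode, UTFException;

    void main()
    {
        wstring text = "a\U0001F4A9b"w;  // code units: 'a', high, low, 'b'
        wstring block = text[0 .. 2];    // a 2-code-unit "buffer" splits the pair
        size_t i = 1;
        assertThrown!UTFException(decode(block, i)); // lone high surrogate
    }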
Nov 30 2017
On 11/30/17 1:20 PM, Patrick Schluter wrote:
> To give just an example of what can go wrong with UTF-16: reading a file in UTF-16 and converting it to something else like UTF-8 or UTF-32, reading block by block, and hitting an SMP codepoint exactly at the buffer limit -- high surrogate at the end of the first buffer, low surrogate at the start of the next.

iopipe handles this:

http://schveiguy.github.io/iopipe/iopipe/textpipe/ensureDecodeable.html

-Steve
Nov 30 2017
On Thursday, 30 November 2017 at 19:37:47 UTC, Steven Schveighoffer wrote:
> iopipe handles this:
>
> http://schveiguy.github.io/iopipe/iopipe/textpipe/ensureDecodeable.html

It was only to give an example. With UTF-8, people who implement the low-level code generally think about multiple code units at the buffer boundary; with UTF-16 it's often forgotten. In UTF-16 there are also 2 other common pitfalls that exist in UTF-8 as well but are less consciously acknowledged: overlong encoding and isolated codepoints. So UTF-16 has the same issues as UTF-8, plus some more: endianness and size.
Nov 30 2017
On Friday, 1 December 2017 at 06:07:07 UTC, Patrick Schluter wrote:
> [...] So UTF-16 has the same issues as UTF-8, plus some more: endianness and size.

Most problems with UTF16 are applicable to UTF8. The only issue that isn't is that if you are just dealing with ASCII, it's a bit of a waste of space.
Dec 01 2017
On Friday, 1 December 2017 at 12:21:22 UTC, A Guy With a Question wrote:
> Most problems with UTF16 are applicable to UTF8. The only issue that isn't is that if you are just dealing with ASCII, it's a bit of a waste of space.

That's what I said. UTF-16 and UTF-8 have the same issues, but UTF-16 has even 2 more: endianness and bloat for ASCII. All 3 encodings have their pluses and minuses; that's why D supports all 3, but with a preference for UTF-8.
Dec 01 2017
On Friday, 1 December 2017 at 06:07:07 UTC, Patrick Schluter wrote:
> [...] overlong encoding and isolated codepoints. So UTF-16 has the same issues as UTF-8, plus some more: endianness and size.

I meant isolated code-units, of course.
Dec 01 2017
On 12/1/17 7:26 AM, Patrick Schluter wrote:
>> isolated codepoints.
>
> I meant isolated code-units, of course.

Hehe, it's impossible for me to talk about code points and code units without having to pause and consider which one I mean :)

-Steve
Dec 01 2017
On Friday, December 01, 2017 09:49:08 Steven Schveighoffer via Digitalmars-d wrote:
> Hehe, it's impossible for me to talk about code points and code units without having to pause and consider which one I mean :)

What, you mean that Unicode can be confusing? No way! ;)

LOL. I have to be careful with that too. What bugs me even more though is that the Unicode spec talks about code points being characters, and then talks about combining characters for grapheme clusters - and this in spite of the fact that what most people would consider a character is a grapheme cluster and _not_ a code point. But they presumably had to come up with new terms for a lot of this nonsense, and that's not always easy. Regardless, what they came up with is complicated enough that it's arguably a miracle whenever a program actually handles Unicode text 100% correctly. :|

- Jonathan M Davis
Dec 01 2017
On Friday, 1 December 2017 at 18:31:46 UTC, Jonathan M Davis wrote:
> Regardless, what they came up with is complicated enough that it's arguably a miracle whenever a program actually handles Unicode text 100% correctly. :|

And dealing with that complexity can often introduce bugs in its own right, because it's hard to get right. That's why sometimes it's easier just to simplify things and to exclude certain ways of looking at the string.
Dec 01 2017
On 11/30/2017 10:07 PM, Patrick Schluter wrote:
> endianness

Yeah, I forgot to mention that one. As if anyone remembers to put in the Byte Order Mark :-(
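Checking for one is cheap when it's actually there (a sketch; the function name is made up):

    // true for a little-endian UTF-16 BOM (0xFF 0xFE);
    // 0xFE 0xFF would be big-endian, and no BOM at all means guessing
    bool hasUTF16LEBom(const(ubyte)[] raw)
    {
        return raw.length >= 2 && raw[0] == 0xFF && raw[1] == 0xFE;
    }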
Dec 02 2017
On Tuesday, 28 November 2017 at 03:01:33 UTC, A Guy With an Opinion wrote:
> - Attributes. I had another post in the Learn forum about attributes which was unfortunate. At first I was excited because it seems like on the surface it would help me write better code, but it gets a little tedious and tiresome to have to remember to decorate code with them.

> I think the better decision would be to not have the errors occur.

Hehe, I'm not against living in an ideal world either.

> - Immutable. I'm not sure I fully understand it. On the surface it seemed like const but transitive. I tried having a method return an immutable value, but when I used it in my unit test I got some weird errors about objects not being able to return immutable (I forget the exact error... apologies).

That's the point of a static type system: if you make a mistake, the code doesn't compile.

> +- Unicode support is good. Although I think D's string type should have probably been utf16 by default. Especially considering the utf module states: "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'." Seems like the natural fit for me.

UTF-16 is inadequate for the range '\u0000' <= character <= '\U0010FFFF', though. UCS2 was adequate (for '\u0000' <= character <= '\uFFFF'), but lost relevance. UTF-16 is only backward compatibility for early adopters of unicode based on UCS2.

> Plus for the vast majority of use cases I am pretty guaranteed a char = codepoint.

That way only end users will be able to catch bugs in a production system. It's not the best strategy, is it? Text is often persistent data; how do you plan to fix a text handling bug when corruption has accumulated for years and spilled all over the place?
Nov 30 2017