digitalmars.D.announce - Adding Unicode operators to D

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/


Andrei
Oct 22 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Correx:

http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/

Andrei

Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei
Oct 22 2008
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu"  wrote
 Correx:

 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/

 Andrei
No thanks. Please let's only use operators that are on the keys of my keyboard. I don't fancy having to type key digraphs or trigraphs to write code. I understand that others already have this problem, but I don't. This would be a huge detractor from D for me. I'd definitely support a language fork at that point, or at least refuse to deal with any code that has unicode operators. I think you'd find others feel the same way.

Why can't the emacs module solution that was used for the chevrons work here? That is, when emacs sees:

x opCross(y);

display it as

x x y

(of course, assume the middle x is the cross symbol, I have no idea how to type it). And upon save, regenerate the correct code. I see no issue with something like that. This is all the compiler is doing anyways...

Note that any operators for unicode would be user-defined anyways; the standard operator symbols already cover what actually gets generated to machine code. That is, unicode operator X is invariably going to map to opX, so there is no benefit to the compiler performing this step instead of an editor.

-Steve
Oct 22 2008
next sibling parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 Why can't the emacs module solution work that was used for the chevrons?
Beeeecause not everyone uses emacs?
Oct 22 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Jarrett Billingsley" wrote
 On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Why can't the emacs module solution work that was used for the chevrons?
 Beeeecause not everyone uses emacs?
Including myself ;) But I really meant the same *type* of solution. If you use another editor, especially if it is used for coding, it probably has a macro feature that you can use for doing this. -Steve
Oct 22 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 23, 2008 at 10:36 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 No thanks.  Please let's only use operators that are on the keys of my
 keyboard. I don't fancy having to type key digraphs or trigraphs to try and
 write code.
 [...]
 Why can't the emacs module solution work that was used for the chevrons?
Actually, the solutions aren't that far apart. Andrei's solution displays XXX as YYY. With the actual Unicode version you'd still type XXX; it would just actually be replaced by YYY instead of merely being displayed as YYY.

The nice thing about getting such AutoCorrect replacements working well across a wide range of editors is that it has benefits beyond just typing unicode characters. You can have it insert code snippets when you type [[main]], for example. Some people have also said that some of the existing characters are hard to type on their non-US keyboards; you could define replacements for those.

I'm certainly not saying going Unicode is the right thing to do right now. It's more like trying to explore what has to change (if anything) before it really becomes viable to introduce Unicode. The topic seems to keep coming up in a lot of places, so I think it is inevitable that we will eventually see more and more languages start using it.

---bb
Oct 22 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 23, 2008 at 10:45 AM, Jarrett Billingsley
<jarrett.billingsley gmail.com> wrote:
 On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Why can't the emacs module solution work that was used for the chevrons?
 Beeeecause not everyone uses emacs?
In fact, I think there are only like three of us using emacs. :-) So it's not a very general solution. But I think the point is that you should be able to implement something similar in many editors. Although I think the trick of showing one thing but saving another is more tricky for most editors than just replacing the strings outright a la AutoCorrect. --bb
Oct 22 2008
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"davidl" wrote
 On Thu, 23 Oct 2008 09:36:29 +0800, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 "Andrei Alexandrescu"  wrote
 Correx:

 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/

 Andrei
 No thanks. Please let's only use operators that are on the keys of my keyboard. I don't fancy having to type key digraphs or trigraphs to try and write code. I understand that others already have this problem, but I don't. This would be a huge detractor from D for me. I'd definitely support a language fork at that point, or at least refuse to deal with any code that has unicode operators. I think you'd find others feel the same way. Why can't the emacs module solution work that was used for the chevrons? That is, when emacs sees: x opCross(y); display it as x x y (of course, assume the middle x is the cross symbol, I have no idea how to type it). And upon save, regenerate the correct code. I see no issue with something like that. This is all the compiler is doing anyways...
 Everything you worry about is just poor editor. Why do you think an editor can affect the language?
All that is being proposed right now is syntax sugar. Cross product, dot product, union, etc. All of these will map to a function, so there is no reason to require compiler support (that is, they don't translate directly to assembly/machine code). I'm proposing the editor be used to do the sugar instead of the compiler. Right now Unicode is not universally accepted by all editors, ASCII is. Right now, I don't have cross product symbol on my keyboard, all currently supported symbols I do have. Why should my experience with D be severely affected by your desire for syntax sugar?
 And It complexes the language, if it's not priorly converted by the 
 programmer. Also it possibly sets up
 future restrictions of extending the language in the correct direction!
Today, I can call opX functions instead of using the appropriate operator. This is no different.
 In your case: x opCross(y) , why identifier opCross(identifier) is 
 considered as identifier x identifier?
 So would the typical operator overload function declaration should be 
 considered that way?

 x opCross(y)
 {
 }

 x x y
 {
 }

 or even

 x opCross(y, m){}

 --->

 x x y, m  {}

 also consider a template declaration

 Matrix opCross(T)(T a)
 {
 }

 should it be considered as Matrix x T (T a)?

 If not , how do you distinguish in all those circumstances(and not all 
 possible "shouldn't be" situations are listed here)
The editor module would have to be (and can be) smarter than that. -Steve
Oct 23 2008
prev sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 23 Oct 2008 18:21:18 +0800,
davidl wrote:
 Everything you worry about is just poor editor. Why do you think an
 editor can affect the language?
I think an editor is not the only thing that displays your program's source. I think that compiler's error message should be readable over a TTY terminal. Otherwise you're limited to working with fancy graphical shells.
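A minimal sketch of the terminal concern raised here, written in Python for brevity: a tool that has to print a diagnostic containing Unicode operators on a plain ASCII terminal could substitute ASCII spellings before writing the message. The table and the function name `tty_safe` are hypothetical; only the names `opCross`, `opUnion` and `opIntersection` echo ones used in this thread.

```python
# Hypothetical ASCII spellings for a few operator symbols (an assumption,
# not anything a D compiler actually does).
ASCII_FALLBACK = {
    "\u00d7": "opCross",         # × cross product
    "\u222a": "opUnion",         # ∪ union
    "\u2229": "opIntersection",  # ∩ intersection
}


def tty_safe(message: str) -> str:
    """Return a pure-ASCII version of a diagnostic message."""
    for symbol, name in ASCII_FALLBACK.items():
        message = message.replace(symbol, name)
    # Anything still outside ASCII is shown as a backslash escape.
    return message.encode("ascii", "backslashreplace").decode("ascii")


diagnostic = tty_safe("error: cannot apply \u00d7 to (int, Matrix)")
```

Here `diagnostic` is plain ASCII with the symbol replaced by `opCross`; a symbol missing from the table, such as U+2208, comes out as the escape `\u2208`, which stays readable over ssh but is hardly pleasant.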
Oct 23 2008
parent KennyTM~ <kennytm gmail.com> writes:
Sergey Gromov wrote:
 Thu, 23 Oct 2008 18:21:18 +0800,
 davidl wrote:
 Everything you worry about is just poor editor. Why do you think an
 editor can affect the language?
 I think an editor is not the only thing that displays your program's source. I think that compiler's error message should be readable over a TTY terminal. Otherwise you're limited to working with fancy graphical shells.
I agree. My real world experience: Sometimes I need to code over ssh. The server admin only installed vim (which I don't use) and nano, no emacs. Probably there could be a vim module also (is it possible?), but those are just palliatives.
Oct 23 2008
prev sibling next sibling parent Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
Andrei Alexandrescu Wrote:

 Correx:
 
 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/
 
 Andrei
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei
Java allows unicode variable names. The Greek letter 'pi' is a valid variable name in Java (see www.jscience.org for an example). Having said that, I've had Java IDEs choke on these. An opportunity may exist here for someone to create/modify a D language IDE that supports same. [Although Descent (being Eclipse-based and therefore Java-based) should have a leg up already.] I know projects exist that intend to be 'the' D IDE (written in D, for D, etc.). Maybe this could be a discriminator that makes one stand out. Paul
Oct 22 2008
prev sibling next sibling parent reply Spacen Jasset <spacenjasset yahoo.co.uk> writes:
Andrei Alexandrescu wrote:
 Correx:
 
 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/



 Andrei
I haven't really ever felt the need for such things. It would require editor support, and I think it could hinder readability, as one would have to know that the symbol 'x' is, say, cross product. It isn't always; it depends on the mathematical domain.

There are, I believe, far more pressing matters, and this feature would make editor support a bit more difficult at a time when there isn't enough editor and/or IDE support for D. I would personally prefer that it not be added to the language in the near future. This is of course only my preference, which in honesty may be biased, but isn't entirely for selfish reasons.
Oct 23 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Fri, Oct 24, 2008 at 3:42 AM, Spacen Jasset <spacenjasset yahoo.co.uk> wrote:
 I haven't really ever felt the need for such things. It would require editor
 support and I think that it could hinder readability as one would have to
 know that symbol 'x' is say, crossproduct. -- It isn't always, it depends on
 the mathematical domain.

 There are, I believe, far more pressing matters, and this feature would make
 editor support a bit more difficult, and we are currently in the days where
 there isn't enough editor and/or ide support for D. I would personally
 prefer it not be added to the language in the near future, this is of course
 only my preference, which in honesty may be biased but isn't entirely for
 self reasons.
I think that's the conclusion I'm coming to as well. While the use of Unicode would have some advantages, there are various technical issues with it (for instance, I haven't been able to figure out how to get the DOS console in Windows to display UTF-8). I think those issues can all be solved, but it would be a large distraction for the D community. Better to let some big, well-funded, massively popular language pioneer in this area. If some language with a billion programmers decided to use Unicode, then you can bet that most of these infrastructure problems would start to disappear quickly as annoyed programmers start scratching their own itches and complaining to the people who write the tools they use.

Realistically, if I complain to any software vendor now that their editor doesn't work well with D because they don't have funky Unicode functionality, the response is likely to be "Sounds like a problem with D, whatever that is". If the language were Java or C++, though, they would have little choice but to take the complaint seriously, regardless of the effort required.

--bb
Oct 23 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Bill Baxter wrote:
 I think that's the conclusion I'm coming to as well.  While the use
 of Unicode would have some advantages, there are various technical
 issues with it (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8).  I think those issues can
 all be solved, but it would be a large distraction for the D
 community.  Better to let some big, well-funded, massively popular
 language pioneer in this area.  If some language with a billion
 programmers decided to use Unicode, then you can bet that most of
 these infrastructure problems would start to disappear quickly as
 annoyed programmers start scratching their own itches and as they
 start complaining to the people who write the tools they use.
 
 Realistically, if I complain to any software vendor now that their
 editor doesn't work well with D because they don't have funky Unicode
 functionality, the response is likely to be "Sounds like a problem
 with D, whatever that is".  If the language were Java or C++, though,
 they would have little choice but to take the complaint seriously,
 regardless of the effort required.
Unfortunately, you might be right in that D is not currently in a position to force the issue.
Oct 23 2008
parent "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:gdr4pe$2uje$1 digitalmars.com...
 Bill Baxter wrote:
 I think that's the conclusion I'm coming to as well.  While the use
 of Unicode would have some advantages, there are various technical
 issues with it (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8).  I think those issues can
 all be solved, but it would be a large distraction for the D
 community.  Better to let some big, well-funded, massively popular
 language pioneer in this area.  If some language with a billion
 programmers decided to use Unicode, then you can bet that most of
 these infrastructure problems would start to disappear quickly as
 annoyed programmers start scratching their own itches and as they
 start complaining to the people who write the tools they use.

 Realistically, if I complain to any software vendor now that their
 editor doesn't work well with D because they don't have funky Unicode
 functionality, the response is likely to be "Sounds like a problem
 with D, whatever that is".  If the language were Java or C++, though,
 they would have little choice but to take the complaint seriously,
 regardless of the effort required.
 Unfortunately, you might be right in that D is not currently in a position to force the issue.
My various thoughts:

Whatever language does end up forcing the issue is going to come up against (inertial) resistance, either successfully or unsuccessfully. If D, right now, were to be the language to attempt to force the issue, then like you two have said, it would probably be unsuccessful. So, in order for the unicode transition to ever be successful, it would have to be some other language (or a version of D later down the road) that forces the issue.

However, if D and/or other similarly less-than-mainstream (I hate referring to D that way, BTW) languages already had useful unicode support in a way that *wasn't* trying to force the issue (i.e., purely optional, with perfectly acceptable ASCII fallbacks) when that "force the issue" language does come along, then that can help cut down on the resistance that the "force the issue" language encounters. We might not be able to crack the chicken-and-egg problem, but we could help weaken it by providing a little extra incentive of our own (again, as long as it was in a way that wasn't forceful).

I do agree, though, with the people who have said that D has more important things to focus on right now than unicode. And I would add that I see most of D's biggest strengths as things where it cleans up and fixes the mistakes made by the more pioneering languages like C++ or Java. So I think it would be in true D style (in a good way) to wait for something else, like maybe Fortress, to go muck around in unicode, and then we can design our unicode to clean up the mistakes those languages will inevitably end up making (instead of leading our own language into a corner by making those "pioneer" mistakes ourselves). Plus, hopefully by that time we'll have finally taken care of the more pressing issues that we're currently facing. (Like eliminating forward reference issues!! Please!!)

I hope that all made sense. I guess my summary is: Hold off on official unicode stuff for now and learn from others' unicode mistakes. But, if we do put official unicode stuff in right now, keep it in a way that doesn't force the issue. And as for unofficial unicode stuff, I say go ahead, play around with it, post it, do whatever.
Oct 23 2008
prev sibling parent reply Don <nospam nospam.com.au> writes:
Andrei Alexandrescu wrote:
 Correx:
 
 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/



 Andrei
Entering this debate late: I think that operator overloading itself is syntactic sugar, and primarily exists for numerical programmers. So it's not so unreasonable to support operator overloading which is not hugely intelligible to non-mathematicians. "Funny" operators should never be seen by anyone without a mathematical background.

However, I'm not so sure how common they'd actually be. The strongest use case seems to me to be the situation where multiple related operations exist, but only one operator is available. The classic example is vector products, where we have:
- vector dot vector
- vector cross vector
- elementwise product of two vectors.
But we only have one opMul. So it would be useful to have alternate multiplication signs available. Adding × (opCross) as a multiplication which is non-associative would, I think, be quite generally useful.

But I think there aren't actually very many other operators which are easy to justify on mathematical grounds, largely because most unary operations look quite OK when implemented as functions, and mathematicians don't have a huge number of binary operators. Other than dot product, cross product, and convolution, there's the exclusive or symbol (+ with a circle around it), and everything else is pretty obscure.

Apart from the dot and cross product, the inability to have superscripts and subscripts in variable names (and comments!) is a much bigger issue, in my experience. Oh. And the lack of an exponentiation operator. I miss the old Commodore 64 up-arrow for power <g>

If you could completely ignore keyboard and display issues, and use any unicode character as an operator, which ones would you actually use?
Oct 28 2008
parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?
I'd use dot "⋅" and cross "×" products for 3D, union "∪" and intersection "∩", subset "⊂" and superset "⊃" and their negative forms. I don't think I'd use anything else. Well, comparisons look better when converted into appropriate unicode.
Oct 28 2008
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Sergey Gromov:
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.
I just want to note that the whole thread is almost unreadable on the digitalmars.com/webnews/, because it doesn't digest unicode chars at all. So adding unicode to D will give problems to show code.

Unrelated to the unicode, but related to those opSubset, opSuperset, etc.: while implementing a set() class with the same API as the Python sets, I have seen there are the following operators/methods too:

issubset(other)
set <= other
    Test whether every element in the set is in other.

set < other
    Test whether the set is a true subset of other, that is, set <= other and set != other.

issuperset(other)
set >= other
    Test whether every element in other is in the set.

set > other
    Test whether the set is a true superset of other, that is, set >= other and set != other.

A full opCmp can't be defined on sets, so I think in D1 we can't overload <= >= among sets... I think this is a problem that has to be solved in D2, because sets are important enough.

Bye,
bearophile
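A short illustration of why set inclusion is only a partial order, using Python's built-in sets, whose operators are the ones quoted above. The variable names are arbitrary. An integer-returning comparison has to answer "less", "equal" or "greater", and none of those fits the third pair below.

```python
a = {1, 2}
b = {1, 2, 3}
c = {3, 4}

# Subset tests, as in the Python set API quoted above.
assert a <= b and a.issubset(b)      # every element of a is in b
assert a < b                         # true subset: a <= b and a != b
assert b >= a and b.issuperset(a)    # every element of a is in b
assert b > a                         # true superset

# a and c are incomparable: neither contains the other, and they differ.
assert not (a <= c) and not (a >= c)
assert not (a < c) and not (a > c) and a != c
```

For `a` and `c`, all of `<`, `==` and `>` are false at once, which is exactly the case a three-way integer result cannot encode.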
Oct 28 2008
parent reply KennyTM~ <kennytm gmail.com> writes:
bearophile wrote:
 Sergey Gromov:
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.
 [...]
 A full opCmp can't be defined on sets, so I think in D1 we can't overload <= >= among sets... I think this is a problem that has to be solved in D2, because sets are important enough.
If the two sets are incomparable, just return NaN... We need an opCmp that returns a float :)
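A minimal sketch of the "return NaN" idea, in Python rather than D. The function names `set_cmp`, `is_subset` and `is_not_subset` are hypothetical and this is not the poster's own implementation; it only shows that a float-valued comparison has room for a fourth, "incomparable" answer, because every ordered comparison against NaN is false.

```python
import math


def set_cmp(a, b) -> float:
    """Hypothetical float-returning comparison for sets.

    Returns 0.0 if equal, -1.0 if a is a true subset of b,
    1.0 if a is a true superset of b, and NaN if incomparable.
    """
    if a == b:
        return 0.0
    if a < b:
        return -1.0
    if a > b:
        return 1.0
    return math.nan


def is_subset(a, b) -> bool:
    # NaN <= 0 is False, so incomparable sets are never subsets.
    return set_cmp(a, b) <= 0


def is_not_subset(a, b) -> bool:
    # True both for "greater" and for the unordered (NaN) case.
    return not (set_cmp(a, b) <= 0)
```

With `{1, 2}` and `{3, 4}`, `set_cmp` yields NaN, so `is_subset` is false in both directions while `is_not_subset` is true, which matches the reading of "not a subset of".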
Oct 28 2008
parent KennyTM~ <kennytm gmail.com> writes:
KennyTM~ wrote:
 bearophile wrote:
 Sergey Gromov:
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their 
 negative forms.
  I don't think I'd use anything else.
 [...]
 A full opCmp can't be defined on sets, so I think in D1 we can't overload <= >= among sets... I think this is a problem that has to be solved in D2, because sets are important enough.
 If the two sets are incomparable, just return NaN... We need an opCmp that returns a float :)
Actually I've made a working solution. Even the exotic operators like !<= (not a subset of, ⊈) work too. It's designed for demonstration, not performance, though.
Oct 28 2008
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and intersection "∩", subset "⊂" and superset "⊃" and their negative forms. I don't think I'd use anything else. Well, comparisons look better when converted into appropriate unicode.
In my opinion, a workable feature is this:

* Functions can be defined with a leading backspace. They will be usable with the infix notation.
* There is a way of specifying that precedence of a function defined as above is the same as precedence of a built-in operator.
* Functions whose name is the same as an HTML entity name for a symbol can be replaced with the actual symbol.

Andrei
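A small sketch of the third point, in Python: looking up the symbol that corresponds to a function name via the HTML entity table. The helper name `symbol_for` is hypothetical; `html.entities.name2codepoint` is the standard-library table of HTML 4 entity names, which may not be the exact set of names the proposal has in mind.

```python
from html.entities import name2codepoint
from typing import Optional


def symbol_for(function_name: str) -> Optional[str]:
    """Return the symbol whose HTML entity name equals the function name, if any."""
    codepoint = name2codepoint.get(function_name)
    return chr(codepoint) if codepoint is not None else None


# Entity names for some of the operators mentioned in this thread.
candidates = ["times", "sdot", "cup", "cap", "sub", "sup", "oplus"]
symbols = {name: symbol_for(name) for name in candidates}
```

Under this table a function named `times` would pair with "×" and `cup` with "∪", while a name such as `cross` has no HTML 4 entity and would stay an ordinary function.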
Oct 28 2008
next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Wed, Oct 29, 2008 at 4:12 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.

 Well, comparisons look better when converted into appropriate unicode.
 In my opinion, a workable feature is this:

 * Functions can be defined with a leading backspace. They will be usable
 with the infix notation.
Did you mean backslash?  I hope you're not suggesting we write
^HinfixOperator. :-)
 * There is a way of specifying that precedence of a function defined as
 above is the same as precedence of a built-in operator.
Workable, but it ain't what Walter calls parsing.
 * Functions of which name is the same as an HTML entity name for a symbol
 can be replaced with the actual symbol.
--bb
Oct 28 2008
prev sibling next sibling parent Don <nospam nospam.com.au> writes:
Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and intersection "∩", subset "⊂" and superset "⊃" and their negative forms. I don't think I'd use anything else. Well, comparisons look better when converted into appropriate unicode.
 In my opinion, a workable feature is this: * Functions can be defined with a leading backspace. They will be usable with the infix notation. * There is a way of specifying that precedence of a function defined as above is the same as precedence of a built-in operator.
Do we really need to do that? How many Unicode binary operators are there? This list of symbols which work in web browsers is very short.

http://en.wikipedia.org/wiki/Wikipedia:Mathematical_symbols

The interesting thing about this second list is just how short it is, and how many of the items in it are comparison operators. Any of the unicode comparison operators could be given the same precedence as <, > and 'in'. Cross should be given the same precedence as opMul and opDiv. That just leaves oplus and otimes, which probably have the same precedence as plus and mul.

You can do the same thing with this list:

http://en.wikipedia.org/wiki/Unicode_Mathematical_Operators

And you find that the precedence of almost everything is easy to determine. Seems like 90% of them are relational operators.

Specifying the precedence of each unicode operator (e.g. by a lookup table) would be adequate for any use case I can imagine, and it wouldn't make syntactic analysis any more ambiguous.
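The lookup-table idea can be sketched as a tiny precedence-climbing parser, here in Python. Everything in it is illustrative: the numeric levels, the token pattern and the function names are assumptions, and the placement of "⋅" at the multiplicative level is a guess, since the post only places cross, oplus, otimes and the comparisons. The point it tries to show is that with a fixed table the parser never needs to know what any operator is declared to mean.

```python
import re

# Fixed precedence table, known to the parser in advance.
PRECEDENCE = {
    "<": 1, ">": 1, "\u2282": 1, "\u2283": 1,                # relational: < > ⊂ ⊃
    "+": 2, "-": 2, "\u2295": 2,                             # additive: + - ⊕
    "*": 3, "/": 3, "\u00d7": 3, "\u2297": 3, "\u22c5": 3,   # multiplicative: * / × ⊗ ⋅
}

TOKEN = re.compile(r"\s*(\w+|\S)")


def tokenize(source: str) -> list:
    """Split into identifiers/numbers and single-character symbols."""
    return TOKEN.findall(source)


def parse_atom(tokens: list):
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return inner
    return token


def parse(tokens: list, min_prec: int = 1):
    """Precedence climbing; returns nested tuples (op, left, right)."""
    left = parse_atom(tokens)
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right operand binds only tighter operators.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


tree = parse(tokenize("a \u00d7 b + c \u22c5 d"))
```

For the input `a × b + c ⋅ d` the resulting `tree` groups both products under the `+`, exactly as `a * b + c * d` would be grouped, and no type or symbol-table lookup was involved.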
 * Functions of which name is the same as an HTML entity name for a 
 symbol can be replaced with the actual symbol.
Oct 29 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 * There is a way of specifying that precedence of a function defined as 
 above is the same as precedence of a built-in operator.
That throws out the ability to parse without semantic analysis. It's not worth it.
Oct 29 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Andrei Alexandrescu wrote:
 * There is a way of specifying that precedence of a function defined 
 as above is the same as precedence of a built-in operator.
That throws out the ability to parse without semantic analysis. It's not worth it.
It doesn't per a previous post of mine, but I agree it's still not worth it. Andrei
Oct 29 2008
prev sibling parent Benji Smith <dlanguage benjismith.net> writes:
Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and intersection "∩", subset "⊂" and superset "⊃" and their negative forms. I don't think I'd use anything else. Well, comparisons look better when converted into appropriate unicode.
I have pretty much the same list. For me the really compelling case for unicode characters isn't in finding more operators. It's the brackets!! --benji
Oct 28 2008
prev sibling next sibling parent reply Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
It would be very nice to have unicode operators. But what opFooBar functions do users need (most)? opDotProduct and opCrossProduct would be definitely cool.
Oct 22 2008
next sibling parent Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 23:37:43 +0000, Moritz Warning wrote:

 On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:
 
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
It would be very nice to have unicode operators. But what opFooBar functions do users need (most)? opDotProduct and opCrossProduct would be definitely cool.
sorry posted in d.announce by .. accident. :/
Oct 22 2008
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Moritz Warning" <moritzwarning web.de> wrote in message 
news:gdodg7$1f5o$1 digitalmars.com...
 On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 Andrei
It would be very nice to have unicode operators. But what opFooBar functions do users need (most)? opDotProduct and opCrossProduct would be definitely cool.
I'd certainly like opIntersection and maybe opUnion.
Oct 22 2008
prev sibling next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
Oct 22 2008
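The AutoCorrect idea from the post above can be sketched roughly like this. It is only an illustration in Python, not how Word or any particular editor does it: `AUTOCORRECT` and `autocorrect` are hypothetical names, and apart from "(X)", which the post mentions, the trigger sequences are made up for the example.

```python
# Hypothetical trigger table. Only "(X)" comes from the post;
# the other entries are illustrative assumptions.
AUTOCORRECT = {
    "(X)": "\u00d7",  # multiplication / cross sign
    "(.)": "\u22c5",  # dot operator
    "(U)": "\u222a",  # union
}


def autocorrect(text, table=AUTOCORRECT):
    """Replace every registered trigger sequence with its Unicode symbol."""
    # Longest triggers first, so a short trigger cannot shadow a longer one.
    for trigger in sorted(table, key=len, reverse=True):
        text = text.replace(trigger, table[trigger])
    return text


print(autocorrect("m3 = m1 (X) m2"))
```

A real editor would apply the substitution as you type rather than on a whole string, but the table lookup is the entire mechanism, which is why adding symbols only means adding definitions.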
next sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Thu, 23 Oct 2008 09:52:34 +0900, Bill Baxter wrote:

 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

 (My comment cross posted here from reddit)
 
 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.
 
 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.
 
 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.
 
 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every time
 you type "(X)" a funky unicode character instantly replaces those chars.
 
 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready to
 support input of Unicode chars in any language just by adding the right
 definitions.
 
 --bb
I don't find this terribly appealing. Walter mentions having thrown out support for 16-bit processors and such. Why not throw out 32-bit too? Those are going out of style. The point is, it's not the language's job to force a change of hardware. And support via a text editor is also not acceptable. Going the software support route relies on the OS to support a universal, easy method to enter unicode. As for D's case, I say support unicode for these new operators, but provide the same function with keyboard-provided symbols.
Oct 22 2008
prev sibling next sibling parent reply Don <nospam nospam.com.au> writes:
Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more.
I agree. There is in fact a fairly defensible subset of Unicode: those characters which are easy to type on some keyboard. This would include chevrons, currency symbols (especially pound, euro, yen), European accented characters (not terribly useful) and a couple of other punctuation marks. After all, if it's painful to type a Euro symbol on your keyboard, you're heading for oblivion. The list is pretty much equivalent to the US-International keyboard layout in Windows. There aren't many useful characters in there, but it might be enough. The chevrons and the inverted ? and ! are perhaps the most interesting, since they are paired. The multiply sign isn't bad, though. With the German keyboards I have to use, some of these are less painful to type than {}.
Oct 23 2008
parent Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 23 Oct 2008 09:36:39 +0200,
Don wrote:
 « » ? ? ¶ § ¬ ? ? ? ? ? ¤ ? © ®
Lots of question marks here. This sucks.
Oct 23 2008
prev sibling parent reply Spacen Jasset <spacenjasset yahoo.co.uk> writes:
Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand. If a set membership test operator and a few others are introduced, then really to be "complete" all the set operators must be added, and implemented. Furthermore, the introduction of set operators should really mean that you can use them on something by default, that means implementing sets that presumably are usable, quick, and are worth using, otherwise people will roll their own (all the time) in many different ways. Unicode symbol 'x' may look better, but is it really more readable? I think it is -- a bit, and it may be cool, but I don't think it's one of the things that is going to make developing software significantly easier. Why unicode anyway? In the same way that editor support is required to actually type them in, why not let the editor render them? So instead of symbol 'x' in the source code, say: m3 = m1 cross_product m2 as an infix notation in a similar way to the (unary) sizeof operator. While cross_product is a bit long and unwieldy, any capable editor can replace the rendition of that keyword with a symbol for it. But in editors that can't, it can still be typed in and/or displayed easily. Another option includes providing cross_product as an 'alias' and 'X' as well. Which then leads on to the introduction of a facility to add arbitrary operators, which could be interesting because you can supply any operator you see fit for the domains that you use that require it. -- This doesn't provide exactly the right solution though, as all the additions would be 'non standard' and I can see books in the future recommending people not use unicode operators, because editors don't have support for them.
If D is to be used on a wide variety of platforms, which would be desirable if it is to gain traction, then editor support barriers like this could impede its progress.
Oct 25 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
 If a set membership test operator and a few others are introduced, then 
 really to be "complete" all the set operators must be added, and 
 implemented.
 
 Furthermore, the introduction of set operators should really mean that 
 you can use them on something by default, that means implementing sets 
 that presumably are usable, quick, and are worth using, otherwise people 
 will roll their own (all the time) in many different ways.
 
 Unicode symbol 'x' may look better, but is it really more readable? I 
 think it is -- a bit, and it may be cool, but I don't think it's one of 
 the things that is going to make developing software significantly easier.
I think "cool" has not a lot to do with it. For scientific code, it's closer to a necessity. Andrei
Oct 25 2008
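The readability point above can be made concrete with a small sketch. It is written in Python only for illustration; `Vec3` is a hypothetical class, and using `@` as an ASCII stand-in for the cross sign is an assumption of the example, not something proposed in the thread. The same identity, a × (b × c) = b (a · c) − c (a · b), is written once with named methods and once with infix operators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float

    # Named-method style: every operation is a call.
    def add(self, o):
        return Vec3(self.x + o.x, self.y + o.y, self.z + o.z)

    def sub(self, o):
        return Vec3(self.x - o.x, self.y - o.y, self.z - o.z)

    def scale(self, k):
        return Vec3(self.x * k, self.y * k, self.z * k)

    def dot(self, o):
        return self.x * o.x + self.y * o.y + self.z * o.z

    def cross(self, o):
        return Vec3(self.y * o.z - self.z * o.y,
                    self.z * o.x - self.x * o.z,
                    self.x * o.y - self.y * o.x)

    # Operator style: the same operations bound to infix symbols.
    __add__ = add
    __sub__ = sub
    __mul__ = scale
    __matmul__ = cross  # assumption: "@" stands in for the cross sign


a, b, c = Vec3(1, 2, 3), Vec3(4, 5, 6), Vec3(7, 8, 10)

# Method notation.
lhs_methods = a.cross(b.cross(c))
rhs_methods = b.scale(a.dot(c)).sub(c.scale(a.dot(b)))

# Infix notation.
lhs_ops = a @ (b @ c)
rhs_ops = b * a.dot(c) - c * a.dot(b)

print(lhs_methods == rhs_methods, lhs_ops == rhs_ops)
```

With only three operations the method form is still readable; the nesting grows quickly once a formula has five or six of them, which is the cliff being described.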
next sibling parent reply Spacen Jasset <spacenjasset yahoo.co.uk> writes:
Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
Yes, that is indeed a fair point and I agree. D is a "systems programming language." [sic] though; and so what will people use it for in the main? I suggest that communities that require scientific code have options now, and that they can and do choose languages for the purpose which have better support for their needs than D might achieve.
 If a set membership test operator and a few others are introduced, then 
 really to be "complete" all the set operators must be added, and 
 implemented.

 Furthermore, the introduction of set operators should really mean that 
 you can use them on something by default, that means implementing sets 
 that presumably are usable, quick, and are worth using, otherwise 
 people will roll their own (all the time) in many different ways.

 Unicode symbol 'x' may look better, but is it really more readable? I 
 think it is -- a bit, and it may be cool, but I don't think it's one 
 of the things that is going to make developing software significantly 
 easier.
I think "cool" has not a lot to do with it. For scientific code, it's closer to a necessity.
On my use of "cool" I only brought it up as this thread has a few mentions of the word and it's a bit nebulous. I, personally, am more concerned with practicality than "cool".
 
 
 Andrei
What I think of unicode symbols therefore depends on whether D should be more scientifically oriented or not. If it should be, then unicode symbols would undoubtedly be a benefit. My responses were guided by the assumption that D was more generic in nature, though.
Oct 25 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 3:46 AM, Spacen Jasset <spacenjasset yahoo.co.uk> wrote:
 I am not entirely sure that 30 or (x amount) of new operators would be a
 good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 =
 m1 X m2 ? and how often will that happen? It's also going to make the
 language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
Yes, heavy math code is hard to read in the current situation. I almost always prefix any significant math with a comment giving the equations being implemented in a more compact notation. Having to write the same thing in two different ways like that is a waste of effort. It would be very cool if I could just write it once and have it look like it does in my notebook.
 Yes, that is indeed a fair point and I agree. D is a "systems programming
 language." [sic] though; and so what will people use it for in the main?
D is a compile-to-the-metal language that is of interest to anyone who ranks performance high on their list of priorities. Mathematicians and scientists are among the few remaining groups where maximum speed is still needed. Games are another area, and games are becoming more and more sophisticated mathematically under the hood.
 I suggest that communities that require scientific code have options now, and
 that they can and do choose languages for the purpose which have better
 support for their needs than D might achieve.
The traditional math languages suck at doing anything besides math. Want to do a bit of math then display the results interactively in an OpenGL window? With Fortran?! Ha! On the other end there are the Matlab and NumPy-type solutions. They are convenient for tinkering around and displaying some results, but these are not good for performance. D has both. So I think D has potential to gain traction in the world of math-heavy computing. But anyway, I got convinced several posts back that the time is not yet ripe for Unicode in D. So I'm not gonna argue that D go Unicode now. I'm just saying that math code is hard to read, and that heavy math users are a good target audience for D because they need performance, but don't necessarily want to give up general-purposeness. --bb
Oct 25 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:
 On the other end there are the Matlab and NumPy-type solutions.  They
 are convenient for tinkering around and displaying some results, but
 these are not good for performance.
I have seen many scientific programs that use numpy, so sometimes it's fast enough. But it forces you to write everything in a vector programming style, which a procedural programmer needs time to learn. Normal C/D/C++ code is more flexible, you can work on single items too in a fast way, while in numpy you can go fast only when you work in bulk, on vectors. On the other hand numpy offers you some higher level operations on arrays that are currently missing in D, like certain complex slicing operations, that may reduce your code length significantly, increasing code readability (because it looks more like formulas); I can show you some examples if you want. Note that in D there are no built-in rectangular dynamic arrays, which are basic stuff in numpy/matlab. Bye, bearophile
Oct 25 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 5:10 AM, bearophile <bearophileHUGS lycos.com> wrote:
 Bill Baxter:
 On the other end there are the Matlab and NumPy-type solutions.  They
 are convenient for tinkering around and displaying some results, but
 these are not good for performance.
I have seen many scientific programs that use numpy, so sometimes it's fast enough. But it forces you to write everything in a vector programming style, that a procedural programmer needs time to learn. Normal C/D/C++ code is more flexible, you can work on single items too in a fast way, while in numpy you can go fast only when you work in bulk, on vectors.
Yep, C/D/C++ is easier. The SciPy.org site has a growing section of their wiki devoted to how to make your code fast using various levels of python/native hybrids. I was using python heavily for numerical stuff for a while and it got to the point where I realized that the time I spent trying to figure out how to vectorize things and use other tricks to make things fast, and to make python modules out of external code I wanted to call, etc. was actually more work than it would be to just use D for everything. Sure, Python does have some nice features as a language that D lacks, but from 10,000 ft D is a lot closer to Python than C++ in terms of ease of use. Also, while Python is nice for arrays and number crunching, I found the lack of typing to be a liability when it comes to complicated graph structures. Instead of nicely typed pointers that the compiler can tell apart, you end up with 23 different integer index variables that you have to keep straight. And finally, also type related, there's the annoyance that you have to actually run your app to detect typos. I'm sure there are ways to work around all those issues, but to me D's a lot easier. I simply don't need the workarounds. I still fire up NumPy and Matplotlib for analyzing the results from my D programs. And SymPy is great too. I just don't use it as my main development language any more.
 On the other hand numpy offers you some higher level operations on arrays that
are currently missing in D, like certain complex slicing operations, that may
reduce your code length significantly, increasing code readability (because it
looks more like formulas); I can show you some examples if you want.
No thanks! Been there, done that!
 Note that in D there's no built-in rectangular dynamic arrays, that are basic
stuff in numpy/matlab.
I've got my dflat and gobo (http://www.dsource.org/projects/multiarray) that are working for me pretty well. They could use some full-time loving to make more operations work intuitively, but the basics work ok. --bb
Oct 25 2008
parent bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:

was actually more work than it would be to just use D for everything.<
Mixing languages isn't nice, I agree. That's why I too use D for several purposes. But if you have to change your code very often (and if your problems are of a certain kind that allow a natural vectorization), then having vectorial (short) code may have some advantages. Think about how much C++ code you would need to write to implement the programs of this book: http://wiki.deductivethinking.com/wiki/Python_Programs_for_Modelling_Infectious_Diseases_book So it allows a more explorative way of coding.
Sure Python does have some nice features as a language that D lacks, but from
10,000 ft  D is a lot closer to Python than C++ in terms of ease of use.<
My experience with the ShedSkin compiler shows me that most of those features that D lacks (complex slices, list comps, generators, short syntax, some near-zero-cost safeties, etc) are absent because of cultural or inertial reasons present in the minds of people used to C/C++, and not because they can't be present/added in a language like D. ShedSkin translates Python code to clean C++ code, showing that it can be done, that it gives advantages, and that it's not too difficult to do. It shows once and for all that you can have a C++-class language with a short and nice syntax, etc. Hopefully the Delight language has less of the cultural inertia coming from C/C++, so it may become a better compromise than D itself.
I've got my dflat and gobo (http://www.dsource.org/projects/multiarray) that
are working for me pretty well.  They could use some full-time loving to make
more operations work intuitively, but the basics work ok.<
Nice stuff, lot of stuff. More comments require more study of that code. D (Tango) may gain from having more batteries. Bye, bearophile
Oct 25 2008
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Spacen Jasset wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
Yes, that is indeed a fair point and I agree. D is a "systems programming language." [sic] though; and so what will people use it for in the main? I suggest that communities that require scientific code have options now, and that they can and do choose languages for the purpose which have better support for their needs than D might achieve.
Surprisingly there's not a lot of choice, witnessed by the prevalence of Fortran for scientific code. One interesting thing is that quite a few scientific coders mess with D and hang out around here, such as Don Clugston, Bill Baxter, bearophile, Benji Smith (he's doing machine learning if I remember correctly) and, if I may aspire to the status, yours truly. (I remain with an unformed opinion regarding Unicode operators.) Andrei
Oct 25 2008
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
But what operators would be added? Some mathematician programmers might want vector and matrix operators, others set operators, others still derivation/integration operators, and so on. Where would we stop? I don't deny it might be useful for them, but it does seem like too specific a need to integrate in the language. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 26 2008
next sibling parent KennyTM~ <kennytm gmail.com> writes:
Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
But what operators would be added? Some mathematician programmers might want vector and matrix operators, others set operators, others still derivation/integration operators, and so on. Where would we stop? I don't deny it might be useful for them, but it does seem like too specific a need to integrate in the language.
Composition may be useful for functional programming (I've never used any functional programming paradigm except "reduce".) Matrix operations: + - * .tr() .inv() .det() etc are already sufficient for most jobs. Vector operations: Maybe an operator for cross product. Set operators: Just use + - * (| ~ &) instead like Pascal. So only 2 Unicode operators I see are really useful and the replacements are ugly: Composition (o) and cross product (×).
Oct 26 2008
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
But what operators would be added? Some mathematician programmers might want vector and matrix operators, others set operators, others still derivation/integration operators, and so on. Where would we stop? I don't deny it might be useful for them, but it does seem like too specific a need to integrate in the language.
I was thinking of allowing a general way of defining one Unicode character to stand in as one operator, and then have libraries implement the actual operators. There's the remaining problem of different libraries defining the same character to mean different operators. This may not be huge as math subdomains tend to be rather consistent in their use of operators. Across math subdomains, types and overloading can take care of things. Also, ascii representation should be allowed for operators, and one nice thing about Unicode characters is that many have HTML ascii and human-readable names, see http://www.fileformat.info/format/w3c/htmlentity.htm. So \unicodecharname may be a good alternate way to enter these operators. For example, the empty set could be \empty, and the cross-product could be written as \times. So c = a \times b; doesn't quite look bad to me. One nice thing about this is that we don't need to pore over naming and such, we just use stuff that others (creators and users alike) have already pored over. Saves on documentation writing too :o). Andrei
Oct 26 2008
parent KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:
 Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
But what operators would be added? Some mathematician programmers might want vector and matrix operators, others set operators, others still derivation/integration operators, and so on. Where would we stop? I don't deny it might be useful for them, but it does seem like too specific a need to integrate in the language.
I was thinking of allowing a general way of defining one Unicode character to stand in as one operator, and then have libraries implement the actual operators. There's the remaining problem of different libraries defining the same character to mean different operators. This may not be huge as math subdomains tend to be rather consistent in their use of operators. Across math subdomains, types and overloading can take care of things. Also, ascii representation should be allowed for operators, and one nice thing about Unicode characters is that many have HTML ascii and human-readable names, see http://www.fileformat.info/format/w3c/htmlentity.htm. So \unicodecharname may be a good alternate way to enter these operators. For example, the empty set could be \empty, and the cross-product could be written as \times. So c = a \times b; doesn't quite look bad to me. One nice thing about this is that we don't need to pore over naming and such, we just use stuff that others (creators and users alike) have already pored over. Saves on documentation writing too :o). Andrei
LaTeX in D? :p Anyway we already have \&times; and \&empty; so we could reuse them in source code level as I've described somewhere in this thread. auto torque = position \&times; force; This is uglier than auto torque = position \times force; but it gives a uniform syntax between escape sequences inside and outside strings. The problem is you may have to invent some names, i.e. the composition operator ∘ (U+2218 ring operator) has no name in SGML entities. In LaTeX it is represented as \circ but \&circ; is already taken by ˆ (U+02C6 modifier letter circumflex accent). And you'll need to predefine the associativity and operation precedence too. ;) See my other entry in this thread.
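The clash over \&circ; is easy to check against the HTML 4 entity table. A minimal sketch in Python, used here only because its standard library ships that table as html.entities.name2codepoint; the describe helper is a made-up name for illustration, and none of this is D syntax or a proposed D API:

```python
from html.entities import name2codepoint  # HTML 4 entity names -> code points


def describe(name):
    """Return 'U+XXXX' for an HTML 4 entity name, or None if the name is undefined."""
    code_point = name2codepoint.get(name)
    return None if code_point is None else "U+%04X" % code_point


# "circ" resolves to the modifier circumflex accent, not the ring operator U+2218,
# so a composition operator would need a name invented for it.
for name in ("times", "empty", "circ"):
    print(name, describe(name))
```

Any name the table lacks comes back as None, which is exactly the case where a new name would have to be chosen.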
Oct 26 2008
prev sibling parent Charles Hixson <charleshixsn earthlink.net> writes:
Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
(My comment cross posted here from reddit) I think the right way to do it is not to make everything Unicode. All the pressure on the existing symbols would be dramatically relieved by the addition of just a handful of new symbols. The truth is keyboards aren't very good for inputting Unicode. That isn't likely to change. Yes they've dealt with the problem in Asian languages by using IMEs but in my opinion IMEs are horrible to use. Some people seem to argue it's a waste to go to Unicode only for a few symbols. If you're going to go Unicode, you should go whole hog. I'd argue the exact opposite. If you're going to go Unicode, it should be done in moderation. Use as little Unicode as necessary and no more. As for how to input unicode -- Microsoft Word solved that problem ages ago, assuming we're talking about small numbers of special characters. It's called AutoCorrect. You just register your unicode symbol as a misspelling for "(X)" or something unique like that and then every time you type "(X)" a funky unicode character instantly replaces those chars. Yeh, not many editors support such a feature. But it's very easy to implement. And with that one generic mechanism, your editor is ready to support input of Unicode chars in any language just by adding the right definitions. --bb
I am not entirely sure that 30 or (x amount) of new operators would be a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that happen? It's also going to make the language more difficult to learn and understand.
I have noticed that in pretty much all scientific code, the f(a, b) and a.f(b) notations fall off a readability cliff when the number of operators grows only to a handful. Lured by simple examples like yours, people don't see that as a problem until they actually have to read or write such code. Adding temporaries and such is not that great because it further takes the algorithm away from its mathematical form just for serving a notation that was the problem in the first place.
But what operators would be added? Some mathematician programmers might want vector and matrix operators, others set operators, others still derivation/integration operators, and so on. Where would we stop? I don't deny it might be useful for them, but it does seem like too specific a need to integrate in the language.
Perhaps what needs to be added is a syntax for defining character to function correspondence? That way people could define the binary functions that they need, and then define a corresponding character string that represented it. I once recommended that Eiffel include a means of defining user operators (i.e., binary functions that sit between the terms on which they operate) using the name syntax thusly: Starts and ends with '|' and doesn't contain any whitespace. Must be surrounded by whitespace when used. I.e. 1 |X|-3 would be forbidden, as there is no whitespace following the |X| operator. That still seems like a good rule to me. If you want to include unicode, that's no problem. And the function could also be used as: X(1, -3) with identical meaning. I.e., marking a function as an operator by surrounding it with pipes would be purely syntax sugar. Note that such operators would have a precedence higher than assignment, but lower than everything else, so in practice the choice would be between writing: X (1, -3) and writing: (1 |X| -3) unless all one were doing is making an assignment. This is analogous to the class member variable in object methods, or the class name in class methods, except that that is often understood. OTOH, I'm not certain how much such syntax buys you. P.S.: another possibility, which is more in line with current D syntax requires an assignment of the operator character to a function that starts with op. As in '+' is associated with opAdd. However even though this is more in line with current D syntax, it seems to buy you a lot less. And it seems to require that the operator be a single character. This appears to me to be more work than it's worth for the return. Even the approach that I suggested is probably marginal. P.P.S: Any system that requires that a specific IDE or editor be used is not going to work. Not unless the IDE were provided with the language, and even then the most successful examples I can think of are EMACS and Smalltalk.
(I'm excluding programs that don't run on Linux, as I have no familiarity with either how they function or how popular they are. Probably, though, one could include Visual Basic and maybe some others. But one certainly couldn't include Basic, merely one dialect of it.)
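The |X| rule described above (starts and ends with '|', contains no whitespace, must be surrounded by whitespace) is small enough to sketch. This is only an illustration in Python, not Eiffel or D; the desugar function and its error messages are made up here, and it handles a single "operand |name| operand" expression rather than a full grammar:

```python
import re

# An operator token: '|', then one or more non-space, non-pipe characters, then '|'.
OPERATOR = re.compile(r"\|([^\s|]+)\|")


def desugar(expr):
    """Rewrite 'a |name| b' as 'name(a, b)'; reject pipes not set off by whitespace."""
    tokens = expr.split()
    for tok in tokens:
        # A token that contains '|' but is not exactly |name| breaks the spacing rule.
        if "|" in tok and not OPERATOR.fullmatch(tok):
            raise ValueError("operator must be surrounded by whitespace: %r" % tok)
    if len(tokens) != 3 or not OPERATOR.fullmatch(tokens[1]):
        raise ValueError("expected: <operand> |name| <operand>")
    name = OPERATOR.fullmatch(tokens[1]).group(1)
    return "%s(%s, %s)" % (name, tokens[0], tokens[2])


print(desugar("1 |X| -3"))  # the pure-sugar reading: X(1, -3)
```

With this reading, "1 |X|-3" is rejected because "|X|-3" arrives as one token, and a Unicode name such as |×| passes through unchanged.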
Oct 26 2008
prev sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset  
<spacenjasset yahoo.co.uk> wrote:

 Why unicode anyway? In the same way that editor support is required to  
 actually type them in, why not let the editor render them. So instead of  
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.


 While cross_product is a bit long and unwieldy any editor capable can  
 replace the rendition of that keyword with a symbol for it. But in  
 editors that don't it means that it still can be typed in and/or  
 displayed easily.

 Another option includes providing cross_product as an 'alias' and 'X'  
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any
 operator you see fit for the domains that you use that require it. --
 This provide exactly the right solution though as all the additions
 would be 'non standard' and I can see books in the future recommending
 people not use unicode operators, because editors don't have support for
 them.
This made me think. What if we /could/ define arbitrary infix operators in D? I'm thinking something along the lines of:

operator cross_product(T, U)
{
  static if (T.opCross)
  {
    T.opCross(T)
  }
  else static if (U.opCross)
  {
    U.opCross_r(T);
  }
  else
  {
    static assert(false, "Operator not applicable to operands.");
  }
}

alias cross_product ×;

I'm not sure if this is possible, but it sure would please downs. :P
-- Simen
Oct 26 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So instead of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.


 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in editors
 that don't it means that it still can be typed in and/or displayed easily.
 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any operator
 you see fit for the domains that you use that require it. -- This provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.
This made me think. What if we /could/ define arbitrary infix operators in
 D? I'm thinking something along the lines of:


 operator cross_product(T, U)
 {
  static if (T.opCross)
  {
    T.opCross(T)
  }
  else static if (U.opCross)
  {
    U.opCross_r(T);
  }
  else
  {
    static assert(false, "Operator not applicable to operands.");
  }
 }

 alias cross_product ×;


 I'm not sure if this is possible, but it sure would please downs. :P
What's the precedence of your user-defined in-fix operator? --bb
Oct 26 2008
parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:

 On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas  
 <simen.kjaras gmail.com> wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset  
 <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So instead  
 of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.


 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in  
 editors
 that don't it means that it still can be typed in and/or displayed  
 easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any operator
 you see fit for the domains that you use that require it. -- This provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.
This made me think. What if we /could/ define arbitrary infix operators in D? I'm thinking something along the lines of: operator cross_product(T, U) { static if (T.opCross) { T.opCross(T) } else static if (U.opCross) { U.opCross_r(T); } else { static assert(false, "Operator not applicable to operands."); } } alias cross_product ×; I'm not sure if this is possible, but it sure would please downs. :P
What's the precedence of your user-defined in-fix operator? --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p -- Simen
Oct 26 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 8:23 AM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:
 On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:
 On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas <simen.kjaras gmail.com>
 wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset
 <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So instead of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.
 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in
 editors
 that don't it means that it still can be typed in and/or displayed
 easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any
 operator
 you see fit for the domains that you use that require it. -- This
 provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.
This made me think. What if we /could/ define arbitrary infix operators in D? I'm thinking something along the lines of: operator cross_product(T, U) { static if (T.opCross) { T.opCross(T) } else static if (U.opCross) { U.opCross_r(T); } else { static assert(false, "Operator not applicable to operands."); } } alias cross_product ×; I'm not sure if this is possible, but it sure would please downs. :P
What's the precedence of your user-defined in-fix operator? --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p
Same thing goes for downs' in-fix operators. I think his syntax is /infix/ which means that his ops always have the same precedence as division. I'm guessing this Python Cookbook recipe is very similar to Downs' technique. It discusses pros and cons and such. http://code.activestate.com/recipes/384122/ --bb
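For anyone who hasn't seen the trick, here is a minimal sketch of the general idea in Python. It is not a copy of the linked recipe or of downs' D code; the Infix class and the cross and sub names are assumptions made up for this illustration. The point is that a /op/ b is just two divisions, so the fake operator inherits division's precedence:

```python
class Infix:
    """Wrap a two-argument function so it can be written as  a /op/ b."""

    def __init__(self, function):
        self.function = function

    def __rtruediv__(self, left):
        # Handles  left /op : remember the left operand and wait for the right one.
        return Infix(lambda right: self.function(left, right))

    def __truediv__(self, right):
        # Handles  (left /op) / right : the left operand is already bound.
        return self.function(right)


cross = Infix(lambda a, b: (a[1] * b[2] - a[2] * b[1],
                            a[2] * b[0] - a[0] * b[2],
                            a[0] * b[1] - a[1] * b[0]))
sub = Infix(lambda a, b: a - b)

print((1, 0, 0) /cross/ (0, 1, 0))  # (0, 0, 1)
print(10 - 4 /sub/ 1)               # 7: /sub/ binds tighter than binary minus
print(2 * 3 /sub/ 1)                # 5: same level as *, so grouped left to right
```

The last line is the catch: 2 * 3 /sub/ 1 groups as (2 * 3) /sub/ 1, which may or may not be what the author of the operator wanted.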
Oct 26 2008
parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Mon, 27 Oct 2008 00:41:26 +0100, Bill Baxter <wbaxter gmail.com> wrote:
 Same thing goes for downs' in-fix operators.  I think his syntax is
 /infix/ which means that his ops always have the same precedence as
 division.
 I'm guessing this Python Cookbook recipe is very similar to Downs'
 technique.  It discusses pros and cons and such.
 http://code.activestate.com/recipes/384122/

 --bb
An interesting read, though I have looked at downs' code before. It occurred to me now that this could sorta have been fixed with a preprocessor, just define an operator to have the same precedence as an already existing operator, define an alias that gets replaced with /foo/, +foo+, or whatever operator you chose. I guess we're stuck waiting for macros in the meantime. -- Simen
Oct 26 2008
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Simen Kjaeraas wrote:
 On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:
 
 On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas 
 <simen.kjaras gmail.com> wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset 
 <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So 
 instead of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.


 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in 
 editors
 that don't it means that it still can be typed in and/or displayed 
 easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any operator
 you see fit for the domains that you use that require it. -- This provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.
This made me think. What if we /could/ define arbitrary infix operators in D? I'm thinking something along the lines of: operator cross_product(T, U) { static if (T.opCross) { T.opCross(T) } else static if (U.opCross) { U.opCross_r(T); } else { static assert(false, "Operator not applicable to operands."); } } alias cross_product ×; I'm not sure if this is possible, but it sure would please downs. :P
What's the precedence of your user-defined in-fix operator? --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p
An operator could always be defined to have the same precedence as an existing operator, which it has to specify. Andrei
Oct 26 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 What's the precedence of your user-defined in-fix operator?

 --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p
An operator could always be defined to have the same precedence as an existing operator, which it has to specify.
Walter said in a previous post a few days ago when I suggested it that that would kill D's easy parsability. You say no? I'm no parser expert, so hard for me to say. --bb
Oct 26 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 
 What's the precedence of your user-defined in-fix operator?

 --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p
An operator could always be defined to have the same precedence as an existing operator, which it has to specify.
Walter said in a previous post a few days ago when I suggested it that that would kill D's easy parsability. You say no? I'm no parser expert, so hard for me to say.
It can be done, but it's kinda involved. You define a grammar in which all operators have the same precedence. Consequently you compile any expression into a list of operands and operators. That makes the language parsable without semantic info. Then the semantic stage transforms the list into a tree. Cecil does that. Andrei
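Roughly, the two stages could look like the sketch below. It is in Python rather than D, it is not how Cecil or any D compiler does it, and every name in it (BUILTIN_PRECEDENCE, declare_operator, parse_flat, build_tree) is made up for illustration. All operators are treated as left-associative binary operators, and a new operator has to name an existing one whose precedence it shares:

```python
BUILTIN_PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def declare_operator(table, symbol, same_as):
    """Return a new table in which `symbol` shares the precedence of `same_as`."""
    extended = dict(table)
    extended[symbol] = table[same_as]
    return extended


def parse_flat(source):
    """'Parsing' stage: no precedence knowledge, just a flat token list."""
    return source.split()


def build_tree(flat, table):
    """Semantic stage: turn the flat list into nested (op, left, right) tuples."""
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(flat) or flat[pos] in table:
            raise ValueError("missing operand")
        tok = flat[pos]
        pos += 1
        return tok

    def climb(min_prec):
        nonlocal pos
        left = operand()
        while pos < len(flat):
            op = flat[pos]
            if op not in table:
                raise ValueError("not a declared operator: %r" % op)
            if table[op] < min_prec:
                break
            pos += 1
            right = climb(table[op] + 1)  # +1 keeps operators left-associative
            left = (op, left, right)
        return left

    return climb(0)


table = declare_operator(BUILTIN_PRECEDENCE, "×", same_as="*")
print(build_tree(parse_flat("a + b × c"), table))  # ('+', 'a', ('×', 'b', 'c'))
```

Note that the first stage accepts any token sequence at all; whether a token really is an operator is only discovered in build_tree, once the table of declared operators is known.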
Oct 26 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 11:43 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 What's the precedence of your user-defined in-fix operator?

 --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p
An operator could always be defined to have the same precedence as an existing operator, which it has to specify.
Walter said in a previous post a few days ago when I suggested it that that would kill D's easy parsability. You say no? I'm no parser expert, so hard for me to say.
It can be done, but it's kinda involved. You define a grammar in which all operators have the same precedence. Consequently you compile any expression into a list of operands and operators. That makes the language parsable without semantic info. Then the semantic stage transforms the list into a tree. Cecil does that.
I see. So the price you pay is that you defer more decisions till semantic stage. I.e. "a b c d e" is allowed to parse into an amorphous list, then in the semantic pass you decide if 'b' and 'd' are actually legal operators or not. --bb
Oct 26 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 11:43 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 What's the precedence of your user-defined in-fix operator?

 --bb
Yup, I realized this myself as well. Seemed like such a great idea when I only thought of it for three seconds. :p
An operator could always be defined to have the same precedence as an existing operator, which it has to specify.
Walter said in a previous post a few days ago when I suggested it that that would kill D's easy parsability. You say no? I'm no parser expert, so hard for me to say.
It can be done, but it's kinda involved. You define a grammar in which all operators have the same precedence. Consequently you compile any expression into a list of operands and operators. That makes the language parsable without semantic info. Then the semantic stage transforms the list into a tree. Cecil does that.
I see. So the price you pay is that you defer more decisions till semantic stage. I.e. "a b c d e" is allowed to parse into an amorphous list, then in the semantic pass you decide if 'b' and 'd' are actually legal operators or not.
Yah. Something tells me Walter won't embark on that soon. Andrei
Oct 26 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Yah. Something tells me Walter won't embark on that soon.
Not a chance <g>. Producing an amorphous list of tokens isn't what I'd call "parsing".
Oct 26 2008
prev sibling next sibling parent Max Samukha <samukha voliacable.com.removethis> writes:
On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/


Andrei
I'm already having problems with unicode: the news reader I'm using doesn't display the characters correctly (maybe it's time to update). If unicode can be avoided, please avoid it.
Oct 22 2008
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

Few random thoughts on the subject:
- Someday probably programming languages will use some Unicode symbols. I don't
know if Fortress will succeed, but I think someday some language will do.
Probably Unicode symbols will be used as in Fortress, to improve the
readability of the code, and not as in APL to transform the code into
hieroglyphics.
- Another good thing that Fortress does is that there are always *nice* looking
ways to write the same code in pure ASCII. So there are usually intuitive 2 or
3 char long translations of all the accepted Unicode symbols. This is very
positive, so you can write/read Fortress with a normal ASCII editor too.
- My editor, programming font, newsreader, IDEs, and probably more things,
currently have problems with Unicode texts.
- Novels in English and other languages show that you can express very complex
and refined thoughts with just very few characters. But you need some space to
write a novel/short story. Mathematics shows that a judicious usage of standard
and widely used symbols helps a lot in decreasing the space used to represent
formulas, etc.
- Fortress and the Mathematica language are designed for physics and
mathematics. D language can be used for that, but it's mostly a system
language. So symbols are more used and more important in Fortress than D. So
their purposes and targets are different.
- I like the idea of using *few* Unicode symbols in my programs, they can
reduce code size and they may even improve readability.
- Python3 allows Unicode identifiers, mostly to allow people in all parts of the
world to write variable names in their languages.
- But seeing the disadvantages in the end I think that in practice adopting
Unicode for D programs is currently bad.
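
The "ASCII spelling for every symbol" idea above can be sketched as a plain text substitution, the kind an editor plugin might run on load and save. This is only a minimal sketch: the three-entry table is hypothetical (it is not Fortress's actual mapping), and a real tool would have to skip string literals and comments, which this one does not.

```d
import std.array : replace;

// Hypothetical digraph table, for illustration only.
immutable string[2][] table = [
    ["!=", "≠"],
    ["<=", "≤"],
    [">=", "≥"],
];

/// Replace each ASCII digraph with its Unicode symbol (for display).
string toUnicode(string src)
{
    foreach (pair; table)
        src = src.replace(pair[0], pair[1]);
    return src;
}

/// Replace each Unicode symbol with its ASCII digraph (for saving).
string toAscii(string src)
{
    foreach (pair; table)
        src = src.replace(pair[1], pair[0]);
    return src;
}
```

Because the file on disk stays ASCII, the compiler never needs to know about the symbols; the cost is that the pretty form only exists inside tools that apply the table.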

Bye,
bearophile
Oct 23 2008
next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in all parts of
the world to write variable names in their languages.
So does D.
Oct 23 2008
next sibling parent reply Max Samukha <samukha voliacable.com.removethis> writes:
On Thu, 23 Oct 2008 04:23:29 -0700, Robert Fraser
<fraserofthenight gmail.com> wrote:

bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in all parts of
the world to write variable names in their languages.
So does D.
I'd like to note that identifiers in a non-English language are considered bad style by many programmers. Besides, a large share of software projects nowadays are international. Imagine the participants of the Linux project each writing identifiers in their own language.
Oct 23 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Max Samukha wrote:
 On Thu, 23 Oct 2008 04:23:29 -0700, Robert Fraser 
 <fraserofthenight gmail.com> wrote:
 
 bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in
 all parts of the world to write variable names in their languages.
 
So does D.
I'd like to note that identifiers in a non-English language are considered bad style by many programmers. Besides, a large share of software projects nowadays are international. Imagine the participants of the Linux project each writing identifiers in their own language.
isn't that something that should be decided upon on a per-project basis? I agree that it'll be bad for Linux, but each project has its own objectives. for example, what if you're teaching a programming course for kids? it'll be easier for them writing in their own native language. I could easily imagine a small start-up writing in their own native language (let's say Hebrew) as one way for obfuscating the source code, so as to protect their IP. there are, I'm sure, more use-cases.
Oct 23 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
I always use English for variable names, instead of my language, because I've
had my share of debugging code with variables in other languages and it's not a
nice thing to do.

Regarding Python code, its std libs keep identifiers in English only, but when
they invented the One Laptop per Child, which uses Python a lot, they
decided that 'kids' may enjoy using variable names in their language...

Bye,
bearophile
Oct 23 2008
parent Max Samukha <samukha voliacable.com.removethis> writes:
On Thu, 23 Oct 2008 08:33:16 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

I always use English for variable names, instead of my language, because I've
had my share of debugging code with variables in other languages and it's not a
nice thing to do.

Regarding Python code, its std libs keep identifiers in English only, but when
they invented the One Laptop per Child, which uses Python a lot, they
decided that 'kids' may enjoy using variable names in their language...

Bye,
bearophile
Keep children away from Python. Let them have happy lives :)
Oct 23 2008
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in all 
 parts of the world to write variable names in their languages.
So does D.
D currently allows Unicode in identifiers, comments, and strings. In fact, D source text is defined to be Unicode.
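
A small sketch of what that looks like in practice, assuming a present-day D2 compiler and a UTF-8 source file. The names (`π`, `θ`, `площадь`) are made up for illustration; only letters are used, since symbols are not valid identifier characters.

```d
import std.math : PI;

// Greek letters are valid identifier characters.
enum double π = PI;

/// Arc length for radius r and angle θ (in radians).
double arcLength(double r, double θ)
{
    return r * θ;
}

/// Identifiers need not be English: "площадь" is Russian for "area".
double площадь(double r)
{
    return π * r * r;
}

// Strings and comments may hold any Unicode text, symbols included: ∩ ∪ ≠
enum formula = "τ = r × F";
```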
Oct 23 2008
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
[snip] (No need to single me out. It's Walter's post, and besides I don't have a formed opinion on Unicode symbols.) Andrei
Oct 23 2008
prev sibling next sibling parent Yigal Chripun <yigal100 gmail.com> writes:
Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei
A few thoughts on the subject:
- others already mentioned, i think, smalltalk as an example. smalltalk bundles as part of the language also the complete environment and IDE so they can add Unicode chars without worrying much about editor support. in D this is an issue as D doesn't provide an "official" D editor. The support largely exists for Unicode - even plain notepad supports Unicode fully but that doesn't mean people are using any of the many editors that have this feature.
- smalltalk uses left-arrow as assignment op. the way you enter it is by typing "<_" so this is similar to Bill's suggestion, i.e. define a short sequence of chars to be replaced by a Unicode char in the file source.
- why not generalize the concept? a few ideas (syntax is not important here, just the idea itself):
  1) bool compare as == (A a, A b) {}
     you can add an op alias to your function, maybe define an anonymous function with an alias to be used only as op.
  2) provide a way to specify which functions can be used as infix functions (Scala does that IIRC) and maybe even specify precedence somehow, so that downs' map function could be written as:
     infix void map(...) {}
     and used as:
     dg map array;
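
For comparison, a minimal sketch of the closest thing later versions of D2 offer to the "infix function" idea: uniform function call syntax, where `f(a, b)` can be written `a.f(b)`. This feature postdates the thread, it is not true infix notation, and it gives no control over precedence. The helper names `twice`, `plus` and `doubled` are hypothetical.

```d
import std.algorithm : map;
import std.array : array;

int twice(int x) { return 2 * x; }

/// An ordinary free function; callable as plus(1, 2) or 1.plus(2).
int plus(int a, int b) { return a + b; }

/// `xs.map!twice` is the same call as `map!twice(xs)`.
int[] doubled(int[] xs)
{
    return xs.map!twice.array;
}
```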
Oct 23 2008
prev sibling next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei
I suggest not. There are problems if you adopt Unicode as operators:

======

1) My editor supports Unicode, but my keyboard doesn't. So how do I type ∩ and ∪ for a set«T»?

1.1) What if the library writer forgets to provide an alternative, ASCII-only name? [This is also a problem of using Unicode in identifiers in general.]

1.2) Some suggested auto-correction in the IDE. Again, what if I used notepad/nano/TextEdit to code?

I had suggested this once before, but let me put it formally here. If you really want to support Unicode operators in source code,

- Firstly, ditch the ability to replace \xxx with '\xxx' when it appears without the quotes (so “char x = \n;” won't compile).
- Then, replace \xxx with the character represented in source level, so
    Vector3D«real» τ = r × F;
  can be written as
    Vector3D!(real) \&tau; = r \&times; F;
- You don't need to introduce a separate trigraph.
- But this suggestion does trigger some people's trigraph-phobia. [Yell no! Now! :) ]
- It may make the source code difficult to parse grammatically.
- It will make the source code difficult to read, just look at the number of semicolons in the ASCII encoded version.
- But at least you can compile your code.

======

2) This is regarding the rejection of « & » to be supported even if the emacs module goes official. Of course it turns out it is not, but let's think of these scenarios:

2.1) OK it turns out ∩ and ∪ and «T» were just .opIntersect(x) and .opUnion(x) and !(T) pretty-printed in emacs; the compiler won't accept these characters anyway. But sometimes I forget and just copy a portion of this code to nano/geany/whatever and then it stops compiling!

2.2) Well this copy&paste problem has been solved in the IDE level by inverting the pretty printing while copying. But now I publish my fantastic, pretty-printed D program in a web page/PDF/whatever, and people just complain the compiler won't accept it!

I still believe if you're going to transform D code to Unicode visually, the compiler must accept these visual replacements as well.

May I also take Mathematica as an example. The programming language itself uses a heavy load of non-ASCII characters, and the IDE also pretty-prints them as nice mathematical formulas, but in the “source code” level they are just escape sequences. So on screen you see
    E^(I π) + 1
but in the source code you'll see
    E^(I \[Pi]) + 1
However, if you type in “E^(I π) + 1” in a plain .nb file and open it with the Get[] function (think of it as “import xx.d”) it can still correctly display the result “0”.

======

3) There are over 800 unary or binary operators in Unicode[1]. How are you going to opXXX all of them? Assume your blog entry doesn't mean the simple “!=” ↦ “≠” transformation.

======

4) These are regarding how to define the following if you are going to support overloading for all these 800 operators:

4.1) [Big problem] Operator precedence? (One person may want ∧ to mean the wedge product (so it has higher precedence than + and -) but another wants it to mean logical AND (so lower than + and -).)

4.2) Associativity? How to determine if an operator is left-associative, right-associative or both? (∧ as wedge product is both, while ∧ as a power function pow(a,b) is right-assoc.)

4.3) [Minor problem] Commutativity? Or will we need to write opXXX and opXXX_r all the time?

Possible ways out: introduce some attributes like
    [Associative, Commutative]
    FuzzyBool operator∧ (FuzzyBool x, FuzzyBool y) { return min(x,y); }
but it's not D. :) Or predefine the meaning, precedence and associativity for each operator, so e.g. ∧ always means the wedge product and not logical AND, just like now ^ always means XOR and not the power function. Or just require the programmer to always put the parentheses.

Ref: [1] A rough word count in http://www.unicode.org/Public/math/revision-11/MathClass-11.txt. The actual number is higher than this.
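
To make points 1.1 and 4.3 concrete, here is a minimal sketch using the operator overloading syntax of later D2 versions (`opBinary` / `opBinaryRight`, not the `opXXX` / `opXXX_r` names of the time). `Vec3` is a hypothetical type. The built-in operators keep their fixed precedence, the mirrored form of `*` has to be spelled out by hand, and the products that would want `×` or `⋅` simply get ASCII names.

```d
struct Vec3
{
    double x, y, z;

    /// Component-wise sum, reachable as `a + b`.
    Vec3 opBinary(string op : "+")(Vec3 rhs) const
    {
        return Vec3(x + rhs.x, y + rhs.y, z + rhs.z);
    }

    /// Scaling as `v * k`.
    Vec3 opBinary(string op : "*")(double k) const
    {
        return Vec3(x * k, y * k, z * k);
    }

    /// Scaling as `k * v`: commutativity is not assumed, so it is written out.
    Vec3 opBinaryRight(string op : "*")(double k) const
    {
        return this * k;
    }

    /// Cross product under a plain ASCII name instead of `×`.
    Vec3 cross(Vec3 rhs) const
    {
        return Vec3(y * rhs.z - z * rhs.y,
                    z * rhs.x - x * rhs.z,
                    x * rhs.y - y * rhs.x);
    }

    /// Dot product under a plain ASCII name instead of `⋅`.
    double dot(Vec3 rhs) const
    {
        return x * rhs.x + y * rhs.y + z * rhs.z;
    }
}
```

With this, `r.cross(F)` stands in for `r × F`, and nobody has to decide what precedence `×` should have.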
Oct 23 2008
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
KennyTM~ wrote:
 
 1.2) Some suggested auto-correction in the IDE. Again what if I used 
 notepad/nano/TextEdit to code?
 
Then I suggest a change in career... ^^' -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 24 2008
prev sibling next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 23 Oct 2008 00:27:58 +0200, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/


 Andrei
I really like the idea of having more unicode in the language, but I feel these should be fairly limited. There are times I feel that more operators (especially, as has been mentioned, opCross and opDotProduct) would be nice to have, but it's just sugar, really. As an example, while I'd enjoy seeing code like this, I'm not sure I'd enjoy writing it (Note that I am prone to exaggerations):

int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];
a = readInt();
if (a ∈ b) // Element of - "in"
{
    float c = 2.00001;
    float d = readInt();
    writefln(c ≈ ⌈d⌉ ); // Approximately equal, ceil
    myClass c = getInstance();
    if (∃c) // c exists, i.e. "!is null"
    {
        writefln(√(c.foo)); // I thought this should work in D today, using "alias sqrt √;", but it seems the compiler chokes on it. :(
    }
    ∀element∈b // New foreach syntax!
    {
        element *= ¼;
    }
}

-- Simen
Oct 23 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:

    writefln(√(c.foo)); // I thought this should work in D today, using
 "alias sqrt √;", but it seems the compiler chokes on it. :(
According to the spec, you can only use "UniversalAlpha" Unicode characters in your identifiers. Supposedly those are defined in ISO/IEC 9899:1999(E) Appendix D. But I'm guessing the ISO did not define square-root-symbol as an alpha character. --bb
Oct 23 2008
parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 23 Oct 2008 23:47:59 +0200, Bill Baxter <wbaxter gmail.com> wrote:

 On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen.kjaras gmail.com>  
 wrote:

    writefln(√(c.foo)); // I thought this should work in D today, using
 "alias sqrt √;", but it seems the compiler chokes on it. :(
According to the spec, you can only use "UniversalAlpha" Unicode characters in your identifiers. Supposedly those are defined in ISO/IEC 9899:1999(E) Appendix D. But I'm guessing the ISO did not define square-root-symbol as an alpha character. --bb
That seems to make sense indeed. -- Simen
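
A minimal sketch of the distinction, written in the alias syntax of later D2 versions (the thread's form would be "alias sqrt корень;"). The alias name is hypothetical; the commented-out line is expected to be rejected by the lexer, since U+221A is a math symbol rather than a letter.

```d
import std.math : sqrt;

// A name made of letters is accepted: "корень" is Russian for "root".
alias корень = sqrt;

// Expected to fail to compile, because "√" is not an identifier character:
// alias √ = sqrt;
```

So the limitation is about which character class the symbol belongs to, not about Unicode as such.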
Oct 23 2008
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Simen Kjaeraas wrote:
 
 As an example, while I'd enjoy seeing code like this, I'm not sure I'd 
 enjoy writing it (Note that I am prone to exaggerations):
 
 int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();
Hum, interesting example, it actually made me realize that 'null' would be an ideal candidate for having a Unicode symbol of its own. Does anyone have suggestions for a possible one? Preferably somewhat circle-shaped. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 24 2008
next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Fri, 24 Oct 2008 18:52:03 +0200, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 Simen Kjaeraas wrote:
  As an example, while I'd enjoy seeing code like this, I'm not sure I'd  
 enjoy writing it (Note that I am prone to exaggerations):
  int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();
Hum, interesting example, it actually made me realize that 'null' would be an ideal candidate for having a Unicode symbol of its own. Does anyone have suggestions for a possible one? Preferably somewhat circle-shaped.
Well, we Norwegians have the Ø (HTML entity &Oslash;, Latin-1 character 216), which looks a lot like the empty set symbol. -- Simen
Oct 24 2008
prev sibling parent reply KennyTM~ <kennytm gmail.com> writes:
Bruno Medeiros wrote:
 Simen Kjaeraas wrote:
 As an example, while I'd enjoy seeing code like this, I'm not sure I'd 
 enjoy writing it (Note that I am prone to exaggerations):

 int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();
Hum, interesting example, it actually made me realize that 'null' would be an ideal candidate for having a Unicode symbol of its own. Does anyone have suggestions for a possible one? Preferably somewhat circle-shaped.
auto Ø = null; // \&Oslash; I assume you're not serious...
Oct 24 2008
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
KennyTM~ wrote:
 Bruno Medeiros wrote:
 Simen Kjaeraas wrote:
 As an example, while I'd enjoy seeing code like this, I'm not sure 
 I'd enjoy writing it (Note that I am prone to exaggerations):

 int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();
Hum, interesting example, it actually made me realize that 'null' would be an ideal candidate for having a Unicode symbol of its own. Does anyone have suggestions for a possible one? Preferably somewhat circle-shaped.
auto Ø = null; // \&Oslash; I assume you're not serious...
It's an interesting and effective way to save some typing, and it might be even more readable (but with a symbol other than Ø). But I probably would not use it anyway, since I like to write very standardized code, that other people can easily recognize and read. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 26 2008
prev sibling next sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei
I'm unsure about this idea. I don't know if it would be worthwhile, but I would say there are two aspects that likely would need to be observed for this to work out favorably:

* Having non-Unicode versions of the symbols/keywords available in Unicode, such that non-Unicode editing and viewing is always possible as a fallback. This has some important consequences though, such as making Unicode-symbol-usage unable to solve the shortage of brackets for, for example, the template instantiation syntax (because an alternative ASCII notation would still be necessary).

* Having a way to directly input the Unicode symbols in the keyboard. One reason is because of typing succinctness, and another, is because I find the alternative (have the editor/IDE automatically change an ASCII character sequence into a Unicode symbol) to have several disadvantages: First is that it doesn't work outside the editors/IDEs configured to do so, (which is a bummer, there is actually plenty of code written outside that: newsgroups, articles, forums, bug reports, IRC, etc.). Second, I personally like that the editor always require exactly N backspaces to erase N typed characters[*].

So, anyone knows if it is possible on Windows (I believe in Unix it is) to configure your keyboard mapping with custom settings? For example, if I press AltGr-O, it inputs some Unicode character of my choosing?

[*] As a sidenote, this is also why I don't like having my editor configured to insert 4 spaces on TAB-press. Unless, the editor is also smart enough to delete the 4 spaces on one backspace/delete and move 4 spaces on one move cursor operation (arrow key press).

-- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 24 2008
next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Fri, 24 Oct 2008 18:28:51 +0200, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
   
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
    Andrei
I'm unsure about this idea. I don't know if it would be worthwhile, but I would say there are two aspects that likely would need to be observed for this to work out favorably: * Having non-Unicode versions of the symbols/keywords available in Unicode, such that non-Unicode editing and viewing is always possible as a fallback. This has some important consequences though, such as making Unicode-symbol-usage unable to solve the shortage of brackets for, for example, the template instantiation syntax (because an alternative ASCII notation would still be necessary). * Having a way to directly input the Unicode symbols in the keyboard. One reason is because of typing succinctness, and another, is because I find the alternative (have the editor/IDE automatically change an ASCII character sequence into a Unicode symbol) to have several disadvantages: First is that it doesn't work outside the editors/IDEs configured to do so, (which is a bummer, there is actually plenty of code written outside that: newsgroups, articles, forums, bug reports, IRC, etc.). Second, I personally like that the editor always require exactly N backspaces to erase N typed characters[*]. So, anyone knows if it is possible on Windows (I believe in Unix it is) to configure your keyboard mapping with custom settings? For example, if I press AltGr-O, it inputs some Unicode character of my choosing?
I'd guess this oughtta do it: http://www.microsoft.com/globaldev/tools/msklc.mspx -- Simen
Oct 24 2008
next sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Simen Kjaeraas wrote:
 On Fri, 24 Oct 2008 18:28:51 +0200, Bruno Medeiros 
 <brunodomedeiros+spam com.gmail> wrote:
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
  http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
    Andrei
I'm unsure about this idea. I don't know if it would be worthwhile, but I would say there are two aspects that likely would need to be observed for this to work out favorably: * Having non-Unicode versions of the symbols/keywords available in Unicode, such that non-Unicode editing and viewing is always possible as a fallback. This has some important consequences though, such as making Unicode-symbol-usage unable to solve the shortage of brackets for, for example, the template instantiation syntax (because an alternative ASCII notation would still be necessary). * Having a way to directly input the Unicode symbols in the keyboard. One reason is because of typing succinctness, and another, is because I find the alternative (have the editor/IDE automatically change an ASCII character sequence into a Unicode symbol) to have several disadvantages: First is that it doesn't work outside the editors/IDEs configured to do so, (which is a bummer, there is actually plenty of code written outside that: newsgroups, articles, forums, bug reports, IRC, etc.). Second, I personally like that the editor always require exactly N backspaces to erase N typed characters[*]. So, anyone knows if it is possible on Windows (I believe in Unix it is) to configure your keyboard mapping with custom settings? For example, if I press AltGr-O, it inputs some Unicode character of my choosing?
I'd guess this oughtta do it: http://www.microsoft.com/globaldev/tools/msklc.mspx
Yes, exactly that! I had the impression there was such a program for Windows, but couldn't remember the name. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 26 2008
prev sibling parent Robert Fraser <fraserofthenight gmail.com> writes:
Simen Kjaeraas wrote:
 So, anyone knows if it is possible on Windows (I believe in Unix it 
 is) to configure your keyboard mapping with custom settings? For 
 example, if I press AltGr-O, it inputs some Unicode character of my 
 choosing?
I'd guess this oughtta do it: http://www.microsoft.com/globaldev/tools/msklc.mspx
I remember this same question being asked on a Microsoft DL when I was working there, and all the answers given were for third-party tools like KeyTweak ( http://webpages.charter.net/krumsick/ ) ;-P . Good to know there's an MS one.
Oct 26 2008
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Bruno Medeiros:
* Having non-unicode versions of the symbols/keywords available in Unicode,
such that non-Uunicode editing and viewing is always possible as a fallback.
This has some important consequences though, such as making
Unicode-symbol-usage unable to solve the shortage of brackets for, for example,
the template instantiation syntax (because an alternative ASCII notation would
still be necessary).<
Fortress uses pairs of symbols to denote various sequence literals. Some of http://a6systems.com/fsharpsheet.pdf

Creates the list:
let lsgen2 = [0 .. 2 .. 8]
Gives: [0;2;4;6;8]
Note: 0 .. 2 .. 8 corresponds to the Python slice-with-stride syntax 0:9:2 (F# includes the end point, Python excludes it)

Create the array:
let argen2 = [|0 .. 2 .. 8|]
Gives: [|0;2;4;6;8|]

Creating a seq (that is lazy):
let s = seq { for i in 0 .. 10 do yield i }

more functional (as them are useful in Scala too, that is partially functional. functional-procedural-OOP hybrids almost like D2 will want to become, D2 is so languages like Haskell are functional all the way), this is an Augmented Discriminated Union:

type BinTree<'a> =
    | Node of BinTree<'a> * 'a * BinTree<'a>
    | Leaf
    with member self.Depth() =
        match self with
        | Leaf -> 0
        | Node(l, _, r) -> 1 + l.Depth() + r.Depth()

lazy/nonlazy collection generators too, this is the third iteration of my ideas on this topic (if you think succinctness in (partially) functional languages is useless, think again. It allows to use certain things instead of falling back to more procedural idioms):

auto flat = (abs(el) for(row: mat) for(el: row) if (el % 2)); // lazy
auto multi = [c:mulIter(c, i) for(i,c: "abcdef")]; // AA
auto squares = void[x*x for(x: 0..100)]; // set
void[int] squares = [x*x for(x: 0..100)];// set, alternative syntax
auto squares = {x*x for x in xrange(100)}; // set, alternative syntax
auto squares = {| x*x for(x: 0..100) |}; // list?
auto squares = [| x*x for(x: 0..100) |]; // multiset? something else?

Bye, bearophile
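
For comparison, a rough sketch of how the lazy and eager cases above can be written with the range library of later D2 versions (std.range and std.algorithm), with no new literal syntax. The function names are hypothetical, and this is not the comprehension syntax proposed in the post.

```d
import std.algorithm : filter, joiner, map;
import std.array : array;
import std.math : abs;
import std.range : iota;

/// Eager: 0, 2, 4, 6, 8, like F#'s [0 .. 2 .. 8] (iota's end is exclusive).
int[] evens()
{
    return iota(0, 9, 2).array;
}

/// Lazy: each square is computed only when the range is iterated.
auto lazySquares(int n)
{
    return iota(0, n).map!(x => x * x);
}

/// Flatten a matrix, keep the odd elements, take absolute values.
int[] flatOddAbs(int[][] mat)
{
    return mat.joiner
              .filter!(el => el % 2 != 0)
              .map!(el => abs(el))
              .array;
}
```

Leaving off the final `.array` keeps the whole pipeline lazy, which is the distinction the `seq { ... }` and `( ... for ... )` forms are after.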
Oct 24 2008
prev sibling next sibling parent reply ore-sama <spam here.lot> writes:
Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not. Anyone using a shell for Windows that works and supports UTF-8 properly? --bb
Oct 24 2008
next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 25 Oct 2008 06:43:19 +0900,
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not. Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
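
The "chcp 65001" step can also be done from inside a program. A minimal sketch, assuming the Win32 bindings that ship with present-day druntime (core.sys.windows.windows); the function name is hypothetical. It only changes how the console decodes the program's output; the font still has to contain the glyphs.

```d
/// Ask the Windows console to treat program output as UTF-8.
/// Returns true on success; on other systems it does nothing and returns true.
bool useUtf8Console()
{
    version (Windows)
    {
        import core.sys.windows.windows;
        return SetConsoleOutputCP(65001) != 0; // 65001 is the UTF-8 code page
    }
    else
    {
        return true;
    }
}
```

Calling this once at the start of main, before any writeln, is the intended use.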
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not. Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage. --bb
Oct 24 2008
next sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not. Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage. --bb
so don't use type. use notepad instead... notepad <filewith-utf8.txt> also, MSYS gives you all the linux tools if you really need to be shell only. last resort: nothing stops you from implementing your own "cat" application in D with full Unicode support. most if not all linux shell tools are separate executables anyway and if any still do not support unicode it'll be trivial to roll your own replacements for the bad ones.
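
The "roll your own cat" suggestion is only a few lines with the standard library of later D2 versions. A minimal sketch, with a hypothetical function name: std.file.readText reads a file and validates it as UTF-8, throwing if it is malformed.

```d
import std.file : readText;

/// Read each file as UTF-8 (readText validates the encoding)
/// and return the concatenated contents.
string catFiles(string[] paths)
{
    string result;
    foreach (path; paths)
        result ~= readText(path);
    return result;
}
```

A main function would pass `args[1 .. $]` to it and write the result to stdout; whether the console then shows the right glyphs is still down to the code page and font.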
Oct 24 2008
next sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not. Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage. --bb
so don't use type. use notepad instead... notepad <filewith-utf8.txt> also, MSYS gives you all the linux tools if you really need to be shell only. last resort: nothing stops you from implementing your own "cat" application in D with full Unicode support. most if not all linux shell tools are separate executables anyway and if any still do not support unicode it'll be trivial to roll your own replacements for the bad ones.
Oh, and one of my favorite tricks in Windows is to install cygwin (usually at "C:\cygwin" or whatever their boneheaded installer insists on using) and then add the bin path ("C:\cygwin\bin") to the windows PATH. That way, I can continue using the ordinary windows shell (which I prefer, since it doesn't force me to use the nutty directory names that the cygwin shell uses), but I can still access all the linux commands. Calling grep from a windows shell is the bestest! --benji
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not. Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage. --bb
so don't use type. use notepad instead... notepad <filewith-utf8.txt> also, MSYS gives you all the linux tools if you really need to be shell only. last resort: nothing stops you from implementing your own "cat" application in D with full Unicode support. most if not all linux shell tools are separate executables anyway and if any still do not support unicode it'll be trivial to roll your own replacements for the bad ones.
Oh, and one of my favorite tricks in Windows is to install cygwin (usually at "C:\cygwin" or whatever their boneheaded installer insists on using) and then add the bin path ("C:\cygwin\bin") to the windows PATH. That way, I can continue using the ordinary windows shell (which I prefer, since it doesn't force me to use the nutty directory names that the cygwin shell uses), but I can still access all the linux commands. Calling grep from a windows shell is the bestest!
But that has the same problem. Cygtools don't understand Windows paths, so they barf when you say "grep c:\foo.txt". But the Windows shell will only autocomplete Windows-style paths. I've found the gnuwin32 tools to work a little better on that front. --bb
Oct 24 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Bill Baxter wrote:
But that has the same problem. Cygtools don't understand Windows paths, so they barf when you say "grep c:\foo.txt". But the Windows shell will only autocomplete Windows-style paths. I've found the gnuwin32 tools to work a little better on that front. --bb
Wha??? The "grep" tool doesn't read the path. The *shell* interprets the path and passes the text to the program. That's how all the gnu tools are able to pipe their results from one tool to the other. Or at least, that's how I assume it works. Cuz I use grep like every single day. On the "cmd.exe" shell. With windows paths.

In fact, just for you, I tested this:

    grep -i "SHAZZAM" "C:\Documents and Settings\benji\Desktop\my filename with spaces.txt"

Worked like a charm. If the path doesn't have spaces, I have no problem with this:

    grep -i "SHAZZAM" C:\file.txt

I tried it in both "command.com" and in "cmd.exe" and didn't experience any problem in either environment.

The key is to never never never use the cygwin shell. It's a piece of garbage. But using the executables from the "cygwin\bin" directory within the windows shell... Priceless! --benji
Oct 24 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 11:39 AM, Benji Smith <dlanguage benjismith.net> wrote:
Wha??? The "grep" tool doesn't read the path. The *shell* interprets the path and passes the text to the program. That's how all the gnu tools are able to pipe their results from one tool to the other. Or at least, that's how I assume it works. Cuz I use grep like every single day. On the "cmd.exe" shell. With windows paths. In fact, just for you, I tested this: grep -i "SHAZZAM" "C:\Documents and Settings\benji\Desktop\my filename with spaces.txt" Worked like a charm. If the path doesn't have spaces, I have no problem with this: grep -i "SHAZZAM" C:\file.txt I tried it in both "command.com" and in "cmd.exe" and didn't experience any problem in either environment. The key is to never never never use the cygwin shell. It's a piece of garbage. But using the executables from the "cygwin\bin" directory within the windows shell... Priceless!
Oh, I didn't realize that. There is one thing that doesn't work, which is probably what gave me the impression it was broken -- Windows paths with wildcards don't work. Like "grep c:\Windows\*.txt". But you're right that it does seem to work for both windows paths, and local wildcards, just not Windows paths with wildcards. But that's great. Thanks for the info. Actually I used to put cygwin\bin on my path years ago, but stopped doing it at some point and switched to gnuwin32. I was under the impression that it worked better then, but actually I've had some trouble with gnuwin32 recently. --bb
Oct 24 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory within
 the windows shell... Priceless!
Oh, I didn't realize that. There is one thing that doesn't work, which is probably what gave me the impression it was broken -- Windows paths with wildcards don't work. Like "grep c:\Windows\*.txt". But you're right that it does seem to work for both windows paths, and local wildcards, just not Windows paths with wildcards. But that's great. Thanks for the info. Actually I used to put cygwin\bin on my path years ago, but stopped doing it at some point and switched to gnuwin32. I was under the impression that it worked better then, but actually I've had some trouble with gnuwin32 recently.
Glad I could be of service! --benji
Oct 24 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Benji Smith" wrote
 Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory within
 the windows shell... Priceless!
Oh, I didn't realize that. There is one thing that doesn't work, which is probably what gave me the impression it was broken -- Windows paths with wildcards don't work. Like "grep c:\Windows\*.txt". But you're right that it does seem to work for both windows paths, and local wildcards, just not Windows paths with wildcards.
It's not the paths with wildcards that are the problem. In this case, it is the shell. Grep is expecting the shell to expand the wildcards, as it does on unix. For example, you can use this old trick to list all files in the current directory if ls suddenly becomes unavailable:

    echo *

That is all shell builtin; no executables are run. If you ran this from a windows shell you would get the same error:

    grep text /cygdrive/c/Windows/*.txt

The windows shell expects the application to handle wildcard expansion, which is why windows command line programs don't always work the same way. Every program has to build in wildcard expansion to support it. -Steve
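The shell-side expansion described above can be seen with a small sketch. It assumes a POSIX shell such as Cygwin's bash; show_args is a hypothetical helper invented for this illustration, not part of any tool mentioned in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print each argument exactly as the callee received it.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Work in a scratch directory so the expansion is predictable.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt b.txt

show_args *.txt      # the shell expands the glob: the callee sees a.txt and b.txt
show_args '*.txt'    # quoting suppresses expansion: the callee sees one literal *.txt
```

The callee never learns whether its arguments came from a glob. Under cmd.exe the first call would behave like the second, which is why a program run there has to do its own expansion if it wants any.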
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 1:40 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
It's not the paths with wildcards that is the problem. In this case, it is the shell. Grep is expecting the shell to expand the wildcards, as it does on unix.
Read again. Particularly this part: "it does seem to work for both windows paths, **and local wildcards**, just not Windows paths with wildcards". (emphasis added) "grep Foo *.txt" works just fine. "grep Foo c:\*.txt" does not. --bb
Oct 24 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Bill Baxter" wrote
Read again. Particularly this part: "it does seem to work for both windows paths, **and local wildcards**, just not Windows paths with wildcards". (emphasis added) "grep Foo *.txt" works just fine. "grep Foo c:\*.txt" does not.
Then that must be something grep is doing extra. Or perhaps the Windows console selectively expands wildcards? I have no idea. It seems weird that grep would expand only current-directory wildcards (try grep Foo *, and see if it works. Windows normally only expands *.* to mean 'all files'). But in the case of using a cygwin shell, the shell expands all wildcards before passing arguments to grep. That much I do know. I haven't really had a need to use the windows shell in a long time ;) -Steve
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 2:09 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
Then that must be something grep is doing extra.
Yep, that was what I said.
 Or perhaps the Windows
 console selectively expands wildcards?  I have no idea.
Don't think so. "echo *" still dutifully prints a "*" to the console. Cygwin grep is doing it, probably in an attempt to be more useful when used from the DOS prompt.
 It seems weird that
 grep would expand only current-directory wildcards (try grep Foo *, and see
 if it works.
Yep that works.
 Windows normally only expands *.* to mean 'all files').
If by that you mean Windows command line programs usually expand *.*, then yeh.
 But in the case of using a cygwin shell, the shell expands all wildcards before
 passing arguments to grep.  That much I do know.  I haven't really had a
 need to use the windows shell in a long time ;)
Yep that's true for Bash. An easy way to tell the Windows shell does nothing is by compiling and running:

    import std.stdio;

    void main(string[] args) {
        writefln("Args: %s", args);
    }

And passing it some wildcards. It never expands anything. Only thing it does do is mess with quotes some. Here's an example:

    C:\> args.exe * "C:\Program Files" *.* c:\*
    Args: [args,*,C:\Program Files,*.*,c:\*]

--bb
Oct 24 2008
parent Benji Smith <dlanguage benjismith.net> writes:
Bill Baxter wrote:
 "it does seem to work for both windows paths, **and local wildcards**,
 just not Windows paths with wildcards".
 (emphasis added)

 "grep Foo *.txt"  works just fine.  "grep Foo c:\*.txt"  does not.
Then that must be something grep is doing extra.
Yep, that was what I said.
 Or perhaps the Windows
 console selectively expands wildcards?  I have no idea.
Don't think so. "echo *" still dutifully prints a "*" to the console. Cygwin grep is doing it, probably in an attempt to be more useful when used from the DOS prompt.
 It seems weird that
 grep would expand only current-directory wildcards (try grep Foo *, and see
 if it works.
Interesting. About 90% of the time, I run grep with the "recursion" flag, so I haven't thought about wildcard expansion in ages.

    grep -R "some text" .

I do know that "wc" does wildcard expansion, even with paths, but you have to use forward slashes. So, to count lines in D programs from the windows shell:

    wc -l /dev/*.d

Unfortunately, there's no "recursion" flag for wc, so I end up doing something dumb like this:

    wc -l /dev/*.d
    wc -l /dev/*/*.d
    wc -l /dev/*/*/*.d

Etc. Hmmmmmm. I really should just compile my own wc. After all, Walter's already written the sample code. --benji
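One possible way around the missing recursion flag, short of compiling a custom wc, is to let find walk the tree. This is only a sketch: it assumes a POSIX shell and the Cygwin find on the PATH, and count_d_lines is a hypothetical name chosen for the example.

```shell
#!/bin/sh
# Count lines in every *.d file under a directory, recursing into subdirectories.
# count_d_lines is a hypothetical helper; the directory argument defaults to ".".
count_d_lines() {
    # find walks the tree, cat concatenates every match, and wc -l counts once.
    # The pattern is quoted so that find, not the shell, does the matching.
    find "${1:-.}" -type f -name '*.d' -exec cat {} + | wc -l
}
```

Calling `count_d_lines /dev` would print a single total for the whole tree instead of one wc run per directory depth.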
Oct 25 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
 But that has the same problem.  Cygtools don't understand Windows
 paths, so they barf when you say "grep c:\foo.txt".  But the Windows shell
 will only autocomplete Windows-style paths.

 I've found the gnuwin32 tools to work a little better on that front.
Wha??? The "grep" tool doesn't read the path. The *shell* interprets the path and passes the text to the program. That's how all the gnu tools are able to pipe their results from one tool to the other. Or at least, that's how I assume it works.
No, that's how it works with the Bash shell and most Unix shells, but the Windows console doesn't do that stuff. It's up to each app to interpret and expand wildcards like *.txt. So the cygwin progs must be explicitly checking to see if they got a * from a stupid DOS console and doing the glob themselves. But the implementation is apparently imperfect since it doesn't work on full DOS paths with wildcards. --bb
Oct 24 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Benji Smith" wrote
Wha??? The "grep" tool doesn't read the path. The *shell* interprets the path and passes the text to the program. That's how all the gnu tools are able to pipe their results from one tool to the other. Or at least, that's how I assume it works.
No, grep accepts either input. The shell does not change paths to windows style; that is what cygpath is for. But it does interpret backslashes, so you have to double all those. So for instance, in a cygwin shell, this works also:

    grep -i "SHAZZAM" C:\\Documents\ and\ Settings\\benji\\Desktop\\my\ filename\ with\ spaces.txt

The arguments are passed as they are; grep is just smart enough to use either one. Probably many tools are that way. I wouldn't know, because I usually use the /cygdrive/c/... form.
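As a side note on the backslash doubling, a POSIX shell also leaves backslashes alone inside single quotes, so the two spellings below should produce the same argument. This is a small sketch assuming bash or another POSIX shell; the path is just the one from the example above, shortened.

```shell
#!/bin/sh
# Two spellings of the same Windows path in a POSIX shell.
escaped=C:\\Documents\ and\ Settings\\benji   # every backslash doubled, every space escaped
quoted='C:\Documents and Settings\benji'      # single quotes keep each character literal

# printf with %s prints the value without interpreting backslashes again.
printf '%s\n' "$escaped"
printf '%s\n' "$quoted"
```

Both lines print the path with single backslashes, which is the text a program such as grep would receive.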
 The key is to never never never use the cygwin shell. It's a piece of 
 garbage. But using the executables from the "cygwin\bin" directory within 
 the windows shell... Priceless!
Without the cygwin shell, you lose all bash features, like for, or backticks to execute a command and use its output. The paths are a minor annoyance IMO. Using the cmd.exe shell is ok for simple tasks, but it pales severely in comparison to the power of bash. So piece of garbage it is not. Something you don't understand how to use properly? definitely ;) -Steve
Oct 24 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 1:33 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
Without the cygwin shell, you lose all bash features, like for, or backticks to execute a command and use its output. The paths are a minor annoyance IMO. Using the cmd.exe shell is ok for simple tasks, but it pales severely in comparison to the power of bash. So piece of garbage it is not. Something you don't understand how to use properly? definitely ;)
Yeh, I love the bash shell. Really the only thing keeping me from using it for D work is the fact that it won't auto-complete Windows filenames. --bb
Oct 24 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Bill Baxter" wrote
Yeh, I love the bash shell. Really the only thing keeping me from using it for D work is the fact that it won't auto-complete Windows filenames.
It's ugly, but can be aliased or scripted, look into cygpath:

    cygpath -w /cygdrive/c/filename.txt

outputs:

    C:\filename.txt

so you can use dmd combined with cygpath:

    dmd `cygpath -w /cygdrive/c/path/to/d/files/*.d`

It wouldn't take much to write a bash script to do this for you... -Steve
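A wrapper along those lines might look like the sketch below. It is an illustration only: wdmd is a hypothetical name, it assumes cygpath and dmd are both on the PATH, and it assumes that any argument starting with "-" is a compiler switch while everything else is a path.

```shell
#!/bin/sh
# Hypothetical wrapper: convert path arguments to Windows form, then run dmd.
# Assumption: arguments beginning with "-" are switches and pass through untouched.
wdmd() {
    n=$#
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        case $arg in
            -*) ;;                             # compiler switch: leave as-is
            *)  arg=$(cygpath -w "$arg") ;;    # path: convert to Windows style
        esac
        set -- "$@" "$arg"                     # re-append, which preserves the order
        n=$((n - 1))
    done
    dmd "$@"
}
```

With this in a bash startup file, tab completion can produce /cygdrive/c/... paths as usual and `wdmd /cygdrive/c/path/to/d/files/*.d` hands dmd the converted names.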
Oct 24 2008
prev sibling parent Benji Smith <dlanguage benjismith.net> writes:
Steven Schveighoffer wrote:
 So piece of garbage it is not.  Something you don't understand how to use 
 properly? definitely ;)
Definitely! I hope you'll agree that hyperbole is the best thing in the world :) --benji
Oct 25 2008
prev sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 10:23 AM, Yigal Chripun <yigal100 gmail.com> wrote:
so don't use type. use notepad instead... notepad <filewith-utf8.txt>
Ok what about grep and sort and uniq then? Can notepad do that? I have all these tools that work fine in my DOS shell. I never use "type". It was simply meant as the most basic possible tool -- as in if "type" doesn't work nothing will.
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
I think part of the problem I had with Cygwin shell was that it can't auto-complete dos filenames, but D programs on Windows can't accept Cygwin paths. So it was a pain to work with command-line tools (like DMD itself) that take filenames. So I don't think MSYS helps there either.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.
Oct 24 2008
prev sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console. --benji
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net> wrote:
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Oct 24 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Write a tiny little D program and see what you get on the console:

    import tango.io.Stdout;
    void main()
    {
        Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
    }

I don't know anything about the "type" command, and whether it supports UTF-8. But the console itself ought to be able to handle it. Try compiling the above code and see what happens.
--benji
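For anyone on Phobos rather than Tango, here is a minimal sketch of the same test that also switches the console output to code page 65001 from inside the program, roughly what "chcp 65001" does at the prompt. It assumes a D2 compiler with druntime's Windows bindings. The helper name cardSuits is made up for illustration, and a TrueType console font such as Lucida Console is still needed.

```d
import std.stdio : writeln;

version (Windows)
{
    import core.sys.windows.wincon : SetConsoleOutputCP;
}

/// The four card suit symbols U+2660, U+2663, U+2665 and U+2666
/// (hypothetical helper, not part of any library).
string cardSuits()
{
    return "\u2660\u2663\u2665\u2666";
}

void main()
{
    version (Windows)
    {
        // 65001 is the UTF-8 code page, the same number given to "chcp 65001".
        SetConsoleOutputCP(65001);
    }
    writeln("spade, club, heart, diamond: ", cardSuits());
}
```

Each of these symbols takes three bytes in UTF-8, so the string's .length is 12 even though it holds four characters. A console that treats those bytes as a single-byte code page will show a dozen junk characters instead.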
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Write a tiny little D program and see what you get on the console: import tango.io.Stdout; void main() { Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666"); } I don't know anything about the "type" command, and whether it supports UTF-8. But the console itself ought to be able to handle it. Try compiling the above code and see what happens. --benji
Ah, I see. I guess more what I want to know is if I had utf-8 source code and the D compiler spit out a message about one of the lines, would that error message come out as garbage? Same for ddbg -- if I'm debugging and say "ps" for "print source" will the result be garbage. I was thinking that "type" would be a simple test if that sort of thing would work. But maybe type is just borked. I did try "cat" and "more" too I think, with same result, though. --bb
Oct 24 2008
next sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Write a tiny little D program and see what you get on the console: import tango.io.Stdout; void main() { Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666"); } I don't know anything about the "type" command, and whether it supports UTF-8. But the console itself ought to be able to handle it. Try compiling the above code and see what happens. --benji
Ah, I see. I guess more what I want to know is if I had utf-8 source code and the D compiler spit out a message about one of the lines, would that error message come out as garbage? Same for ddbg -- if I'm debugging and say "ps" for "print source" will the result be garbage. I was thinking that "type" would be a simple test if that sort of thing would work. But maybe type is just borked. I did try "cat" and "more" too I think, with same result, though. --bb
MSYS does autocomplete. It's not perfect, but it works. The path will look Unix-like, though, i.e. /c/program files/...
From what I know (WinXP SP2), the console works for Unicode except for RTL languages like Hebrew. As someone else already noted, this is legacy tech which you shouldn't be using anyway. I don't know if it's fixed in
There is also other third-party stuff as well.
Oct 24 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 11:53 AM, Yigal Chripun <yigal100 gmail.com> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Write a tiny little D program and see what you get on the console: import tango.io.Stdout; void main() { Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666"); } I don't know anything about the "type" command, and whether it supports UTF-8. But the console itself ought to be able to handle it. Try compiling the above code and see what happens. --benji
Ah, I see. I guess more what I want to know is if I had utf-8 source code and the D compiler spit out a message about one of the lines, would that error message come out as garbage? Same for ddbg -- if I'm debugging and say "ps" for "print source" will the result be garbage. I was thinking that "type" would be a simple test if that sort of thing would work. But maybe type is just borked. I did try "cat" and "more" too I think, with same result, though. --bb
Msys does autocomplete. it's not perfect but it works. the path will look unix like though.. i.e. /c/program files/...
Right, that's what Cygwin does too, and it's useless if I want to call the DMD compiler.

    dmd foo.d /c/libs/mydlib.lib
    "Error: what do you think this is, Linux?"
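One possible workaround for the path mismatch is a small wrapper that rewrites MSYS-style paths before handing them to a native tool. The following is only a sketch, written for a D2 compiler with Phobos. The function name msysToWindowsPath is hypothetical, and it handles only the "/c/..." drive form shown above, not Cygwin's "/cygdrive/c/..." form.

```d
import std.array : replace;
import std.ascii : isAlpha, toUpper;
import std.conv : text;
import std.stdio : writeln;

/// Converts an MSYS-style path such as "/c/libs/mydlib.lib" to
/// "C:\libs\mydlib.lib". Anything that does not start with
/// "/<letter>/" is returned unchanged.
string msysToWindowsPath(string path)
{
    if (path.length >= 3 && path[0] == '/' && isAlpha(path[1]) && path[2] == '/')
    {
        // Drive letter, colon, then the rest with forward slashes flipped.
        return text(toUpper(path[1]), ":", path[2 .. $].replace("/", "\\"));
    }
    return path;
}

void main(string[] args)
{
    // Print each argument the way a native Windows tool would want it.
    foreach (arg; args[1 .. $])
        writeln(msysToWindowsPath(arg));
}
```

Relative names like foo.d pass through untouched, so the output could be fed straight to dmd from a shell script.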
 from what I know (winXP sp 2) - console works for unicode Except for RTL
 languages like Hebrew. as someone else already noted, this is legacy
 tech which you shouldn't be using anyway. I don't know if it's fixed in

 there are also other 3rd party stuff as well..
Yeah, I've heard of that. Do you (or anyone) have any actual experience with PowerShell? It doesn't seem to be standard equipment even on my new Vista box. Does it require a separate download? Strange if it really is supposed to be "the new way". --bb
Oct 24 2008
parent Robert Fraser <fraserofthenight gmail.com> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 11:53 AM, Yigal Chripun <yigal100 gmail.com> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Write a tiny little D program and see what you get on the console: import tango.io.Stdout; void main() { Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666"); } I don't know anything about the "type" command, and whether it supports UTF-8. But the console itself ought to be able to handle it. Try compiling the above code and see what happens. --benji
Ah, I see. I guess more what I want to know is if I had utf-8 source code and the D compiler spit out a message about one of the lines, would that error message come out as garbage? Same for ddbg -- if I'm debugging and say "ps" for "print source" will the result be garbage. I was thinking that "type" would be a simple test if that sort of thing would work. But maybe type is just borked. I did try "cat" and "more" too I think, with same result, though. --bb
Msys does autocomplete. it's not perfect but it works. the path will look unix like though.. i.e. /c/program files/...
Right that's what Cygwin does too, and it's useless if I want to call the DMD compiler. dmd foo.d /c/libs/mydlib.lib "Error: what do you think this is, Linux?"
 from what I know (winXP sp 2) - console works for unicode Except for RTL
 languages like Hebrew. as someone else already noted, this is legacy
 tech which you shouldn't be using anyway. I don't know if it's fixed in

 there are also other 3rd party stuff as well..
Yeh, i've heard of that. Do you (or anyone) have any actual experience with PowerShell? It doesn't seem to be standard equipment on my new Vista box even. Does it require a separate download? Strange if it really is supposed to be "the new way". --bb
PowerShell is MS's concession that there are things better done in a console environment, especially for developers & power users. And, yes, it works very well (I'm a fan...). It also contains aliases for many of the common Unix commands (e.g. ls works like dir). It doesn't come as the default on most OSes simply because Microsoft doesn't expect the average home user to need it. It does come default on Windows Server 2008, because Microsoft expects it to be a useful utility to server admins.
Oct 25 2008
prev sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?
A regular Windows console supports UTF-8 to some extent: * Change console font to Lucida Console * issue "chcp 65001" You can even get more fonts into there with a bit of hackery.
I did that but "type <filewith-utf8.txt>" still prints garbage.
That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to the console. The only special thing I did was change the font to Lucida Console.
Ok. Thanks for the info. Knowing that it has actually worked for at least one person gives me motivation to try again. --bb
Write a tiny little D program and see what you get on the console: import tango.io.Stdout; void main() { Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666"); } I don't know anything about the "type" command, and whether it supports UTF-8. But the console itself ought to be able to handle it. Try compiling the above code and see what happens. --benji
Ah, I see. I guess more what I want to know is if I had utf-8 source code and the D compiler spit out a message about one of the lines, would that error message come out as garbage? Same for ddbg -- if I'm debugging and say "ps" for "print source" will the result be garbage. I was thinking that "type" would be a simple test if that sort of thing would work. But maybe type is just borked. I did try "cat" and "more" too I think, with same result, though.
They all work for me: type, cat, less. The file is UTF-8 with BOM. Error messages are printed correctly, displaying all the characters in a buggy symbol. But now I remember: it fails to execute any batch files when it's in the 65001 codepage. More precisely, it executes exactly one line from a batch file, as if there were no more lines. So this pseudo-Unicode mode is useless.
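Since the file that displays correctly here is UTF-8 with a BOM, it may be worth checking whether a file that prints as garbage has one. The sketch below (D2 with Phobos, function name hasUtf8Bom invented for the example) only reports whether the three BOM bytes are present. Whether that changes what "type" or the compiler prints is not something it settles.

```d
import std.file : read;
import std.stdio : writeln;

/// True if data starts with the UTF-8 byte order mark EF BB BF.
bool hasUtf8Bom(const(ubyte)[] data)
{
    return data.length >= 3
        && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: bomcheck <file>");
        return;
    }
    // read() returns the raw bytes of the file without any decoding.
    auto data = cast(const(ubyte)[]) read(args[1]);
    writeln(args[1], hasUtf8Bom(data) ? ": UTF-8 BOM present" : ": no UTF-8 BOM");
}
```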
Oct 27 2008
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
Any text-based program uses the same Windows console (unless it's a GUI application, and it uses controls to create a text box, etc). Including cygwin shell. To say it's a legacy technology is like saying Linux is a legacy technology because it's command line based. It's a false experience promoted by Microsoft to try and spread FUD about OSes that mainly support command line tools, like Linux. But command line tools are extremely useful and powerful, much easier to develop, and IMO easier to use. For instance, if you want to find all files that contain a certain text, grep -R text / and you're done. On windows it's 'click the start menu, select search, wait for the search window to pop up, click on the dog, etc'. Freaking annoying if you ask me ;)
 Anyone using a shell for Windows that works and supports UTF-8 properly?
I would guess it should work properly, most everything in windows supports unicode. Perhaps you have some configuration setting not set properly? I'd suggest searching msdn. -Steve
Oct 24 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
Any text-based program uses the same Windows console (unless it's a GUI application, and it uses controls to create a text box, etc). Including cygwin shell. To say it's a legacy technology is like saying Linux is a legacy technology because it's command line based. It's a false experience promoted by Microsoft to try and spread FUD about OSes that mainly support command line tools, like Linux. But command line tools are extremely useful and powerful, much easier to develop, and IMO easier to use. For instance, if you want to find all files that contain a certain text, grep -R text / and you're done. On windows it's 'click the start menu, select search, wait for the search window to pop up, click on the dog, etc'. Freaking annoying if you ask me ;)
 Anyone using a shell for Windows that works and supports UTF-8 properly?
I would guess it should work properly, most everything in windows supports unicode. Perhaps you have some configuration setting not set properly? I'd suggest searching msdn. -Steve
Windows console AKA DOS Box *is* in fact legacy technology. It is ideas from Linux and incorporated in it. Also, it doesn't have to be an either/or situation regarding CLI vs GUI. There's Apple's Quicksilver (IIRC the name), which is a GUI app with a CLI-like interface. It has the best of both worlds. PowerShell is GUI based as well. IMO, CLI should be provided as just a widget in the GUI world and not a separate entity.
Oct 25 2008
next sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Sat, Oct 25, 2008 at 8:57 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
Any text-based program uses the same Windows console (unless it's a GUI application, and it uses controls to create a text box, etc). Including cygwin shell. To say it's a legacy technology is like saying Linux is a legacy technology because it's command line based. It's a false experience promoted by Microsoft to try and spread FUD about OSes that mainly support command line tools, like Linux. But command line tools are extremely useful and powerful, much easier to develop, and IMO easier to use. For instance, if you want to find all files that contain a certain text, grep -R text / and you're done. On windows it's 'click the start menu, select search, wait for the search window to pop up, click on the dog, etc'. Freaking annoying if you ask me ;)
 Anyone using a shell for Windows that works and supports UTF-8 properly?
I would guess it should work properly, most everything in windows supports unicode. Perhaps you have some configuration setting not set properly? I'd suggest searching msdn. -Steve
 PowerShell is GUI based as well.
After downloading it and giving it a try, I find this claim somewhat suspect. What makes you say it's GUI based? It has the exact same decorations and goofy menu options as a regular non-GUI Windows console. If it were really a GUI, I doubt they would go through the extra programming effort required to make it look *exactly* like a console app. --bb
Oct 25 2008
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 8:57 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
Any text-based program uses the same Windows console (unless it's a GUI application, and it uses controls to create a text box, etc). Including cygwin shell. To say it's a legacy technology is like saying Linux is a legacy technology because it's command line based. It's a false experience promoted by Microsoft to try and spread FUD about OSes that mainly support command line tools, like Linux. But command line tools are extremely useful and powerful, much easier to develop, and IMO easier to use. For instance, if you want to find all files that contain a certain text, grep -R text / and you're done. On windows it's 'click the start menu, select search, wait for the search window to pop up, click on the dog, etc'. Freaking annoying if you ask me ;)
 Anyone using a shell for Windows that works and supports UTF-8 
 properly?
I would guess it should work properly, most everything in windows supports unicode. Perhaps you have some configuration setting not set properly? I'd suggest searching msdn. -Steve
 PowerShell is GUI based as well.
After downloading it and giving it a try, I find this claim somewhat suspect. What makes you say it's GUI based? It has the exact same decorations and goofy menu options as a regular non-GUI Windows console. If it were really a GUI, I doubt they would go through the extra programming effort required to make it look *exactly* like a console app.
I've never used powershell, but most likely you are correct. I think there is a confusion of terms here. Windows Console is the GUI that comes up with the black window, and displays text. It serves as a terminal, not a shell. This is not 'old' technology, it's just an integral piece of the OS. cmd.exe is the command interpreter, which is definitely crappy technology (and somewhat old). The responsible party for displaying UTF properly is the console, not the shell. -Steve
Oct 25 2008
parent ore-sama <spam here.lot> writes:
Steven Schveighoffer Wrote:

 The responsible party for displaying UTF properly is the console, not the 
 shell.
 
One important feature of legacy technology is that it must not change, for compatibility with legacy code. stdout is just an opaque pipe with no means to specify the text encoding, and legacy applications write OCP-encoded text to stdout. That's why the console expects OCP output, and breaking this convention would break legacy applications, piping, etc. BTW, cmd.exe can in fact produce UTF-16 output.
Oct 26 2008
prev sibling next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Bill Baxter wrote:
 Yigal Chripun wrote:
 PowerShell is GUI based as well.
After downloading it and giving it a try, I find this claim somewhat suspect. What makes you say it's GUI based? It has the exact same decorations and goofy menu options as a regular non-GUI Windows console. If it were really a GUI, I doubt they would go through the extra programming effort required to make it look *exactly* like a console app. --bb
It uses the same console application to do the displaying/execution. And, yes, this application sucks (ever done any serious copy/paste in it?) There's PoshConsole ( http://www.codeplex.com/PoshConsole ), but that TODO list is a bit extensive ;-P. Hopefully by Win7 time, the Windows group gets around to fixing the console, but that's like hoping they'll fix Paint or Notepad ;-P.
Oct 25 2008
next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 9:18 AM, Robert Fraser
<fraserofthenight gmail.com> wrote:
 Bill Baxter wrote:
 Yigal Chripun wrote:
 PowerShell is GUI based as well.
After downloading it and giving it a try, I find this claim somewhat suspect. What makes you say it's GUI based? It has the exact same decorations and goofy menu options as a regular non-GUI Windows console. If it were really a GUI, I doubt they would go through the extra programming effort required to make it look *exactly* like a console app. --bb
It uses the same console application to do the displaying/execution. And, yes, this application sucks (ever done any serious copy/paste in it?) There's PoshConsole ( http://www.codeplex.com/PoshConsole ), but that TODO list is a bit extensive ;-P. Hopefully by Win7 time, the Windows group gets around to fixing the console, but that's like hoping they'll fix Paint or Notepad ;-P.
I'm using "Console2" as my facade on the console window. Works pretty nicely. http://sourceforge.net/projects/console/ --bb
Oct 25 2008
prev sibling next sibling parent KennyTM~ <kennytm gmail.com> writes:
Robert Fraser wrote:
 Bill Baxter wrote:
 Yigal Chripun wrote:
 PowerShell is GUI based as well.
After downloading it and giving it a try, I find this claim somewhat suspect. What makes you say it's GUI based? It has the exact same decorations and goofy menu options as a regular non-GUI Windows console. If it were really a GUI, I doubt they would go through the extra programming effort required to make it look *exactly* like a console app. --bb
It uses the same console application to do the displaying/execution. And, yes, this application sucks (ever done any serious copy/paste in it?) There's PoshConsole ( http://www.codeplex.com/PoshConsole ), but that TODO list is a bit extensive ;-P. Hopefully by Win7 time, the Windows group gets around to fixing the console, but that's like hoping they'll fix Paint or Notepad ;-P.
Hey, they have fixed MSPaint and WordPad! :)
Oct 25 2008
prev sibling parent reply torhu <no spam.invalid> writes:
Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. 
 And, yes, this application sucks (ever done any serious copy/paste in it?)
That works fine for me if I enable Quick edit mode in the options. Then the right mouse button will do both copy and paste.
Oct 26 2008
parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)
That works fine for me if I enable Quick edit mode in the options. Then the right mouse button will do both copy and paste.
Except it only does block-oriented rectangular selection, which is odd for something that is primarily line-oriented. --bb
Oct 26 2008
parent reply torhu <no spam.invalid> writes:
Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)
That works fine for me if I enable Quick edit mode in the options. Then the right mouse button will do both copy and paste.
Except it only does block-oriented rectangular selection, which is odd for something that is primarily line-oriented.
Yeah, that's true. Pretty stupid.
Oct 26 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
torhu wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. 
 And,
 yes, this application sucks (ever done any serious copy/paste in it?)
That works fine for me if I enable Quick edit mode in the options. Then the right mouse button will do both copy and paste.
Except it only does block-oriented rectangular selection, which is odd for something that is primarily line-oriented.
Yeah, that's true. Pretty stupid.
My main problem is that you can't do it just with the keyboard, which is my standard method. I also take issue with the fact you can't copy more than is visible on a single screen, which goes along with the block selection mode.
Oct 26 2008
parent "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 1:52 PM, Robert Fraser
<fraserofthenight gmail.com> wrote:
 torhu wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution.
 And,
 yes, this application sucks (ever done any serious copy/paste in it?)
That works fine for me if I enable Quick edit mode in the options. Then the right mouse button will do both copy and paste.
Except it only does block-oriented rectangular selection, which is odd for something that is primarily line-oriented.
Yeah, that's true. Pretty stupid.
My main problem is that you can't do it just with the keyboard, which is my standard method. I also take issue with the fact you can't copy more than is visible on a single screen, which goes along with the block selection mode.
By the way I tried running powershell as a tab inside the Console2 prog I mentioned before and it does work fine. --bb
Oct 26 2008
prev sibling parent Yigal Chripun <yigal100 gmail.com> writes:
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 8:57 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
Any text-based program uses the same Windows console (unless it's a GUI application, and it uses controls to create a text box, etc). Including cygwin shell. To say it's a legacy technology is like saying Linux is a legacy technology because it's command line based. It's a false experience promoted by Microsoft to try and spread FUD about OSes that mainly support command line tools, like Linux. But command line tools are extremely useful and powerful, much easier to develop, and IMO easier to use. For instance, if you want to find all files that contain a certain text, grep -R text / and you're done. On windows it's 'click the start menu, select search, wait for the search window to pop up, click on the dog, etc'. Freaking annoying if you ask me ;)
 Anyone using a shell for Windows that works and supports UTF-8 properly?
I would guess it should work properly, most everything in windows supports unicode. Perhaps you have some configuration setting not set properly? I'd suggest searching msdn. -Steve
 PowerShell is GUI based as well.
After downloading it and giving it a try, I find this claim somewhat suspect. What makes you say it's GUI based? It has the exact same decorations and goofy menu options as a regular non-GUI Windows console. If it were really a GUI, I doubt they would go through the extra programming effort required to make it look *exactly* like a console app. --bb
I've just checked (it's been a long time since I used it) and you're correct. I don't know why I remembered it as being GUI based; maybe the blue color threw me off. Sorry for the confusion. But I'm sure that there are 3rd party GUI based shells for Windows.
Oct 27 2008
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Yigal Chripun wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
Any text-based program uses the same Windows console (unless it's a GUI application, and it uses controls to create a text box, etc). Including cygwin shell. To say it's a legacy technology is like saying Linux is a legacy technology because it's command line based. It's a false experience promoted by Microsoft to try and spread FUD about OSes that mainly support command line tools, like Linux. But command line tools are extremely useful and powerful, much easier to develop, and IMO easier to use. For instance, if you want to find all files that contain a certain text, grep -R text / and you're done. On windows it's 'click the start menu, select search, wait for the search window to pop up, click on the dog, etc'. Freaking annoying if you ask me ;)
 Anyone using a shell for Windows that works and supports UTF-8 properly?
I would guess it should work properly, most everything in windows supports unicode. Perhaps you have some configuration setting not set properly? I'd suggest searching msdn. -Steve
windows console AKA DOS Box *is* in fact legacy technology. It is ideas from Linux and incorporated in it.
Windows has gotten a lot better in the recent times - ever since it finally started to imitate Unix :o).
 Also, it doesn't have to be either/or situation regarding CLI vs GUI.
 There's Apple's quicksilver (IIRC the name) which is a gui app with CLI
 like interface. it has the best from both worlds. PowerShell is GUI
 based as well. IMO, CLI should be provided as just a widget in the GUI
 world and not a separate entity.
I'm not sure I understand. Widget in the GUI = a window with text in it living side by side, or embedded with, graphical windows? That's been the case for a long time. Andrei
Oct 25 2008
prev sibling parent ore-sama <spam here.lot> writes:
Bill Baxter Wrote:

 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)
Console is a legacy technology (you even still call it "DOS"), why expect features from it?
So tell me what the alternative is? I had trouble with running D tools from a Cygwin shell. Can't remember if I tried MSYS or not.
GUI, of course. MSYS's console is in fact a GUI.
Oct 25 2008
prev sibling next sibling parent ore-sama <spam here.lot> writes:
Bill Baxter Wrote:

 import std.stdio;
 void main(string[] args) {  writefln("Args: %s", args); }
 
 And passing it some wildcards.  It never expands anything.  Only thing
 it does do is mess with quotes some.  Here's an example:
 
 C:\> args.exe * "C:\Program Files" *.* c:\*
 Args: [args,*,C:\Program Files,*.*,c:\*]
It's not Windows that does this: the program's standard startup module gets the command line with GetCommandLine() and parses it into string[] args.
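Since the shell passes wildcards through untouched, a program that wants them expanded has to do it itself. A minimal sketch of that, assuming a current Phobos (std.file.dirEntries with a glob pattern); expandWildcards is a hypothetical helper name, not something the runtime provides, and it only handles wildcards in the last path component. Note that the glob rules are Phobos's, so "*.*" matches only names containing a dot, unlike cmd.exe.

```d
import std.algorithm : canFind;
import std.file : dirEntries, SpanMode;
import std.path : baseName, dirName;
import std.stdio : writefln;

/// Hypothetical helper: expands arguments containing '*' or '?' into the
/// matching directory entries; other arguments are passed through unchanged.
string[] expandWildcards(string[] args)
{
    string[] result;
    foreach (arg; args)
    {
        if (!arg.canFind('*') && !arg.canFind('?'))
        {
            result ~= arg; // plain argument, nothing to expand
            continue;
        }
        immutable before = result.length;
        // Match the last path component against the entries of its directory.
        foreach (entry; dirEntries(dirName(arg), baseName(arg), SpanMode.shallow))
            result ~= entry.name;
        if (result.length == before)
            result ~= arg; // nothing matched: keep the pattern as typed
    }
    return result;
}

void main(string[] args)
{
    // args[0] is the program name, so expand only the remaining arguments.
    writefln("Args: %s", expandWildcards(args[1 .. $]));
}
```

Keeping an unmatched pattern as typed mirrors what Unix shells do by default; a program could just as well report an error instead.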
Oct 25 2008
prev sibling next sibling parent ore-sama <spam here.lot> writes:
Bill Baxter Wrote:

 I did that but "type <filewith-utf8.txt>"  still prints garbage.
 
 --bb
If an application prints garbage, this indicates that it's implemented incorrectly or is not encoding-aware. A correctly implemented application should transcode text to OCP before printing to the console. This is what std.stdio.writef is supposed to do.
Oct 25 2008
prev sibling next sibling parent Kevin Bealer <kevinbealer gmail.com> writes:
Andrei Alexandrescu Wrote:

 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
I think this is a bad idea -- there are a lot of places that don't use Unicode or don't support 8 bit clean translation, and the operators in question would be a pain to use every time they were needed, since there is no obvious way to type them. And I don't just mean organizations that drag their feet, but also special cases within every new technology that have these blind spots. Does your cell phone web browser correctly display these symbols? Does the program "less" display these correctly? If you think it's just a matter of time, maybe, but consider that IBM still uses EBCDIC internally in mainframes. A lot of languages using only punctuation based syntax are already hard to read because of it, e.g. Perl can be very hard to read in some cases. Using the word "and" would make a lot of languages easier to read than using "&&". The standardized meanings should be kept, but I would favor something like $( stuff )$, $[ more stuff ]$ and so on rather than using special unicode tokens. modify bracket usage and "#text" to indicate special symbols as an extension of the #line and #function directives. If ".operation" is good enough for every method call, then why rather than importing thousands of individual extension operators that are only readable in the unicode-speaking contexts. Kevin
Oct 25 2008
prev sibling parent Alix Pexton <alixD.TpextonNO SPAMgmailD.Tcom> writes:
Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei
I've been following this thread without really having an opinion to offer, but I just had a thought... We already know that D's CTFE and templates can be used together to parse DSLs (matrix ops, regular expressions and IIRC Scheme too) and turn them into optimal native code. That suggests to me that it is already possible to write D code that can turn an expression written in established mathematical/scientific notation (complete with unicode symbols) into either conventional D code, or machine code. What I am not sure of is whether it would be possible to make it general enough to work with all mathematical dialects (I seem to remember some overlapping in ways that might be problematic). A complete solution would have to be able to define new operators (including their associativity and precedence) in such a way that they can be looked up by the templates that evaluate the expression. Another related thought I had: Would it be possible to write a compile-time parser that turned MathML into code? I'm not even sure if MathML is structured enough to represent the underlying meaning of an expression rather than just its graphical form. Perhaps it would be more interesting to write the code that did the transformation in the opposite direction, turning expressions written in D into MathML ^^ A...
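A very rough sketch of the CTFE idea, assuming a current D2 compiler and Phobos: a function that runs at compile time rewrites a few unicode operators as ordinary D operators, and a string mixin compiles the result. toD is a made-up helper name, and the symbol table is only an illustration. It does plain text substitution, so it does not address the hard part mentioned above (user-defined operators with their own associativity and precedence).

```d
import std.array : replace;

/// Hypothetical helper: rewrites a handful of unicode operators as the
/// equivalent built-in D operators. Pure text substitution, no parsing.
string toD(string expr)
{
    return expr
        .replace("×", "*")
        .replace("÷", "/")
        .replace("≤", "<=")
        .replace("≠", "!=");
}

void main()
{
    // toD is evaluated at compile time here, because static assert and
    // mixin both require compile-time constants.
    static assert(toD("6 × 7") == "6 * 7");
    static assert(mixin(toD("6 × 7")) == 42);

    // The mixed-in expression can refer to ordinary local variables.
    int a = 3, b = 4;
    bool ok = mixin(toD("a × b ≤ 12 ÷ 1"));
    assert(ok);
}
```

The source file has to be saved as UTF-8 for the literals to work, which is the same editor and tooling question the rest of this thread is arguing about.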
Oct 26 2008