digitalmars.D - Semantics of toString
- Justin Johansson (14/14) Nov 05 2009 I assert that the semantics of "toString" or similarly named/purposed me...
- Michal Minich (10/34) Nov 05 2009 My practice tells me to use toString only for debugging - to quickly get...
- Ary Borenszweig (7/8) Nov 05 2009 A string useful for debugging purposes and, when possible, useful for
- Jesse Phillips (5/27) Nov 05 2009 Well some Java author said that the toString method was only intended for...
- Justin Whear (4/26) Nov 05 2009 Two things:
- Nick Sabalausky (9/28) Nov 05 2009 (Deliberately not reading the other replies before posting...)
- Don (3/19) Nov 05 2009 It's a hack from the early days of D. Should be unavailable unless the
- Nick Sabalausky (3/7) Nov 05 2009 What don't you like about it?
- Don (18/28) Nov 05 2009 It cannot even do the most basic stuff.
- Yigal Chripun (17/45) Nov 06 2009 The first issue you raise is IMO a problem with writefln and not with
- Pelle Månsson (3/56) Nov 06 2009 How do you do %.3f in {}-notation?
- Yigal Chripun (12/68) Nov 06 2009 That is incorrect since in my example I use the format string to switch
- Leandro Lucarella (20/28) Nov 06 2009 This is horrible, horrible for internationalization, you just can't assu...
- Andrei Alexandrescu (13/40) Nov 06 2009 I think you found a bug in Phobos. I tried this:
- Leandro Lucarella (25/54) Nov 06 2009 Yes.
- Andrei Alexandrescu (4/52) Nov 06 2009 Thanks!
- Yigal Chripun (24/46) Nov 06 2009 F in the above is _not_ a type specifier. It is a format specifier that
- Andrei Alexandrescu (4/79) Nov 06 2009 Not sure to what extent it helps, but Phobos supports positional
- dsimcha (10/29) Nov 05 2009 readers
- div0 (17/20) Nov 05 2009 -----BEGIN PGP SIGNED MESSAGE-----
- Justin Johansson (6/26) Nov 05 2009 There are some interesting replies coming along here. Thanks everybody ...
- Lutger (4/5) Nov 08 2009 Whatever you got, give it to me as a string for my printf debugging whil...
- Lutger (14/22) Nov 08 2009 My other reply didn't take the language agnostic into account, sorry.
- Justin Johansson (16/47) Nov 08 2009 Thanks for that Lutger.
- Lutger (10/67) Nov 10 2009 Your design makes better sense (to me at least) because it is based on w...
- Don (6/75) Nov 10 2009 There is a definite use for such a thing. But the existing toString()
- Lutger (5/11) Nov 10 2009 Since you are in the know and probably the biggest toString() hater arou...
- Justin Johansson (11/24) Nov 10 2009 I have a feeling (and I may well be wrong) that toString might be used i...
- Bill Baxter (17/39) Nov 10 2009 nd:
- Justin Johansson (5/47) Nov 10 2009 I think you are right; if I can dig up what it was, and if relevant to t...
- Don (7/18) Nov 10 2009 I'm hoping someone will come up with a design.
- Justin Johansson (2/26) Nov 10 2009 That's starting to look like a "serialize" method!
- Steven Schveighoffer (13/43) Nov 10 2009 As it should. I should be able to print a 10000 element container witho...
- Andrei Alexandrescu (6/57) Nov 10 2009 Walter does not feel strongly about Phobos. The save() method in "On
- Bill Baxter (13/33) Nov 10 2009 That looks pretty good, actually.
- Don (18/56) Nov 10 2009 The thing is, the toString() function is essentially a virtual function
- Bill Baxter (15/77) Nov 10 2009 Structs can't have virtual functions... so what do you mean?
- grauzone (5/28) Nov 10 2009 Just put it into an "interface DebugOutput", remove Object.toString(),
- Don (4/19) Nov 10 2009 How are you supposed to print one with it? It doesn't help.
- Andrei Alexandrescu (4/28) Nov 10 2009 I think the best option for toString is to take an output range and
- Denis Koroskin (4/27) Nov 10 2009 It means toString() must be either a template, or accept an abstract
- Andrei Alexandrescu (3/34) Nov 10 2009 It should take an interface.
- Bill Baxter (8/46) Nov 10 2009 before
- Andrei Alexandrescu (3/40) Nov 10 2009 I am not sure. Opinions as always are welcome.
- Bill Baxter (11/64) Nov 10 2009 g
- Don (8/59) Nov 11 2009 It also needs to be used by structs, which aren't inherited from Object....
- Denis Koroskin (7/49) Nov 11 2009 Some ranges may be polymorphic, so having base interface hierarchy in
- Andrei Alexandrescu (5/59) Nov 11 2009 It can't be clone() because it doesn't clone. For example say you have a...
- Denis Koroskin (6/65) Nov 11 2009 Well, range doesn't own any of the contents it covers, so deep copy is
- Andrei Alexandrescu (5/9) Nov 11 2009 Well so the second sentence contradicts the first. Let me put it another...
- Bill Baxter (4/13) Nov 11 2009 makeBreadCrumb() ?
- Philippe Sigaud (12/16) Nov 11 2009 different name - opSlice
- Denis Koroskin (3/22) Nov 11 2009 It remembers array bounds, not contents.
- Steven Schveighoffer (38/41) Nov 12 2009 Bad idea...
- Steven Schveighoffer (4/17) Nov 12 2009 Oops, I meant 3 virtual functions -- front, popNext, and empty.
- Denis Koroskin (10/32) Nov 12 2009 Output range has only one method: put.
- Steven Schveighoffer (18/54) Nov 12 2009 I was referring to range's ability to interact with foreach. An output ...
- Andrei Alexandrescu (3/5) Nov 12 2009 I think this particular point is incorrect.
- dsimcha (21/26) Nov 12 2009 Most of the overhead from indirect function calls come from the fact tha...
- Andrei Alexandrescu (7/47) Nov 12 2009 I think that, on the contrary, working with a delegate is less generic.
- Don (7/54) Nov 12 2009 How? It seems to introduce more requirements on the implementation, but
- Andrei Alexandrescu (4/59) Nov 12 2009 That seems plausible.
- Justin Johansson (5/55) Nov 12 2009 Which you mean -- interfaces, classes or both?
- Andrei Alexandrescu (3/56) Nov 12 2009 My understanding is that the costs are comparable.
- Andrei Alexandrescu (30/81) Nov 12 2009 You are right. If range interfaces accommodate block transfers, this
- Steven Schveighoffer (54/115) Nov 12 2009 IIRC, I don't think C++ iostreams use polymorphism, and I don't think th...
- Andrei Alexandrescu (23/137) Nov 12 2009 Oh yes they do. (Did you even google?) Virtual multiple inheritance, the...
- Steven Schveighoffer (28/81) Nov 12 2009 From my C++ book, it appears to only use virtual inheritance. I don't ...
- Andrei Alexandrescu (14/109) Nov 12 2009 You're right, but there is an issue because as far as I can recall these...
- Bill Baxter (16/27) Nov 12 2009 d
- Steven Schveighoffer (16/27) Nov 12 2009 Yep, you are right. It appears the reason they do this is so the
- Andrei Alexandrescu (8/43) Nov 12 2009 One problem I just realized is that, if we e.g. offer only put(in
- Steven Schveighoffer (22/60) Nov 12 2009 char[1] buf;
- Andrei Alexandrescu (5/74) Nov 12 2009 I was just thinking of offering an interface that offers utf8 and utf16
- Steven Schveighoffer (37/98) Nov 12 2009 :O
- Andrei Alexandrescu (6/115) Nov 12 2009 Well a stack-allocated buffer is stack-allocated, and passing a slice
- Bill Baxter (5/7) Nov 12 2009 Nonsense! Developers spend a lot of time debugging. Helping people
- Andrei Alexandrescu (8/18) Nov 12 2009 Sorry sorry. I just meant to say it's not worth coming with an airtight
- Steven Schveighoffer (9/24) Nov 12 2009 The main purpose to serialize is to be able to deserialize. The main
- Yigal Chripun (9/39) Nov 12 2009 I'd add to that the a format facility should be locale aware as in .Net.
- Steven Schveighoffer (6/8) Nov 12 2009 Debugging is not always done by the developer on his system where a
- Steven Schveighoffer (92/104) Nov 12 2009 Some rudimentary attempts at benchmarking:
- dsimcha (6/20) Nov 12 2009 Your benchmarks don't show that the direct call is much faster. You had...
- Steven Schveighoffer (12/36) Nov 12 2009 The direct call was 5 seconds faster. Divide by 10 billion and you get ...
- dsimcha (8/45) Nov 12 2009 Yes, about 0.5 nanoseconds. In other words, if your CPU is roughly 2 GH...
- Justin Johansson (10/86) Nov 10 2009 s/over-my-dead-body/over-your-dead-body/ :-)
- Bill Baxter (17/94) Nov 10 2009 of
- bearophile (4/6) Nov 10 2009 I have added a toString to my copy of the BigInt.
- Don (16/97) Nov 10 2009 I almost always want to print the value out in hex. And with some kind
- bearophile (7/9) Nov 10 2009 This may help:
- Bill Baxter (4/11) Nov 10 2009 Though they may be useful, those don't look to have anything to do
- bearophile (4/6) Nov 10 2009 Don has said: "But the performance would still be very poor, and that's ...
- Don (3/9) Nov 10 2009 It's problem 2 from my original posts: being able to output something
- Bill Baxter (5/9) Nov 10 2009 Maybe it's just my ignorance of BigNum issues, but those links look to
- bearophile (13/16) Nov 10 2009 Look the numeral() function inside here from those blog posts:
- Bill Baxter (11/25) Nov 10 2009 r by 10, and accumulate the modulus as the digit, converted to ['0', '9'...
- bearophile (4/5) Nov 10 2009 You are welcome.
- Bill Baxter (5/12) Nov 10 2009 ut
- Denis Koroskin (74/191) Nov 10 2009 Yes, it would solve half of the toString problems.
- Bill Baxter (55/191) Nov 10 2009 cs
- Don (10/161) Nov 10 2009 One thing it doesn't (easily) handle is the case where an int argument
- bearophile (4/6) Nov 10 2009 See my post about vectorized lazyness.
- Genghis Khan (2/117) Nov 12 2009 Asian users have one prominent...
- HOSOKAWA Kenchi (5/32) Nov 12 2009 That is true. UTF8 works well.
I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PLs (including but not limited to D) is ill-defined.

To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language-agnostic ideology.

If there are more than, say, two or three different views on the said semantics, then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then I guess I'm left concludeless.

Just thinking in the language round-up that this is (just another) one of the things we should address as a community.

So what does "toString" mean to you?

**beers, Justin

**caveat: free beer offer available in-store only
Nov 05 2009
Hello Justin,

> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined.
> To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
> If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless.
> Just thinking in the language round-up that this is (just another) one of the things we should address as a community.
> So what does "toString" mean to you?
> **beers, Justin
> **caveat: free beer offer available in-store only

My practice tells me to use toString only for debugging: to quickly get a string representation of an object in human-readable format, and nothing else, ever. So it is good that toString is part of D's Object class. It is quite unsuitable for, e.g., serializing an object to XML/HTML or other formats. You may later find that your object should be toString-ed not only to XML but now to JSON as well... Better to use a specific method for a specific purpose.

What matters more to me among Object's methods is opEquals being part of them. But that is a different story.
Nov 05 2009
Justin Johansson wrote:
> So what does "toString" mean to you?

A string useful for debugging purposes and, when possible, useful for programming tasks. For example, in Java there's StringWriter, whose toString method returns the String being written; I think that's fine. An XML node might return its XML representation. But most of the time an object doesn't have a use as a string.
Nov 05 2009
Justin Johansson Wrote:
> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined.
> To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
> If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless.
> Just thinking in the language round-up that this is (just another) one of the things we should address as a community.
> So what does "toString" mean to you?
> **beers, Justin
> **caveat: free beer offer available in-store only

Well, some Java author said that the toString method was only intended for debugging, but list containers use it, so... I don't have that reference :(

You can also check out the question on StackOverflow:
http://stackoverflow.com/questions/563676/is-tostring-only-useful-for-debugging

But personally, I think output to the end user should not come from toString, and program logic should not be based on the string returned from toString.
Nov 05 2009
Justin Johansson Wrote:
> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined.
> To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
> If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless.
> Just thinking in the language round-up that this is (just another) one of the things we should address as a community.
> So what does "toString" mean to you?
> **beers, Justin
> **caveat: free beer offer available in-store only

Two things:

1) Primarily for debugging purposes. It's very convenient.
2) As a default behavior in a few cases. For instance, if a listbox widget hasn't been given a view that knows how to render objects of type Foo, it can default to rendering the results of toString.
Nov 05 2009
"Justin Johansson" <free beer.com> wrote in message news:hcuhet$15a2$1 digitalmars.com...
> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined.
> To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
> If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless.
> Just thinking in the language round-up that this is (just another) one of the things we should address as a community.
> So what does "toString" mean to you?
> **beers, Justin
> **caveat: free beer offer available in-store only

(Deliberately not reading the other replies before posting...)

It means to me, obtain a string-representation of an object (or an instance of a non-class type) in whatever form is reasonably appropriate for the given type. This string representation might include all data, but this is not guaranteed. It might be unique to each object, but this is not guaranteed. It might be fully-suitable for serialization, but this is not guaranteed.
Nov 05 2009
Justin Johansson wrote:
> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined.
> To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
> If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless.
> Just thinking in the language round-up that this is (just another) one of the things we should address as a community.
> So what does "toString" mean to you?

It's a hack from the early days of D. Should be unavailable unless the -debug flag is set, to discourage people from using it. I hate it.
Nov 05 2009
"Don" <nospam nospam.com> wrote in message news:hcvf9l$91i$1 digitalmars.com...
> Justin Johansson wrote:
> > So what does "toString" mean to you?
> It's a hack from the early days of D. Should be unavailable unless the -debug flag is set, to discourage people from using it. I hate it.

What don't you like about it?
Nov 05 2009
Nick Sabalausky wrote:
> "Don" <nospam nospam.com> wrote in message news:hcvf9l$91i$1 digitalmars.com...
> > Justin Johansson wrote:
> > > So what does "toString" mean to you?
> > It's a hack from the early days of D. Should be unavailable unless the -debug flag is set, to discourage people from using it. I hate it.
> What don't you like about it?

It cannot even do the most basic stuff.

(1) You can't even make a struct that behaves like an int.

    struct MyInt
    {
        int z;
        string toString() { .... }
    }

    void main()
    {
        int a = 400;
        MyInt b = 400;
        writefln("%05d %05d", a, b);
        writefln("%x %x", a, b);
    }

(2) It doesn't behave like a stream. Suppose you have XmlDoc.toString(). You can't emit the doc, piece by piece. You have to create the ENTIRE string in one go!
Nov 05 2009
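For readers following Don's point (1): Phobos releases that postdate this thread document a toString overload that receives the format specifier, which lets a wrapper type forward %05d or %x to the value it wraps. The following is a minimal sketch, assuming a recent std.format; MyInt is the struct from the post above, and constructing it as MyInt(400) is a substitution for the post's `MyInt b = 400;`.

```d
import std.format : FormatSpec, format, formatValue;
import std.range.primitives : isOutputRange;
import std.stdio : writefln;

struct MyInt
{
    int z;

    // Overload documented in std.format: the writer is an output range
    // and spec carries the width, flags and format character requested.
    void toString(W)(ref W writer, scope const ref FormatSpec!char spec) const
        if (isOutputRange!(W, char))
    {
        // Forward the caller's specifier to the wrapped int.
        formatValue(writer, z, spec);
    }
}

void main()
{
    int a = 400;
    auto b = MyInt(400);
    writefln("%05d %05d", a, b);
    writefln("%x %x", a, b);
}
```

Because the specifier is forwarded unchanged, the struct formats however the built-in int would for the same format string.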
On 06/11/2009 07:34, Don wrote:
> It cannot even do the most basic stuff.
> (1) You can't even make a struct that behaves like an int.
>     struct MyInt { int z; string toString() { .... } }
>     void main() { int a = 400; MyInt b = 400; writefln("%05d %05d", a, b); writefln("%x %x", a, b); }
> (2) It doesn't behave like a stream. Suppose you have XmlDoc.toString() You can't emit the doc, piece by piece. You have to create the ENTIRE string in one go!

The first issue you raise is IMO a problem with writefln and not with toString, since writefln doesn't handle user-defined types properly. I think that writefln (btw, horrible name) should only deal with strings and their formatting, and all other types need to provide an (optionally formatted) string. A numeric type would provide formatting of properties like number of decimal places, thousands separator, etc., while a user-defined specification type could provide a type of standard format:

    auto spec = new Specification(HTML);
    string ansi = spec.toString(Specification.ANSI);
    string iso = spec.toString(Specification.ISO);

The C-style format string that specifies types is a horrible, horrible thing and should be removed.

Regarding the second issue:

    foreach (node; XmlDoc.preOrder())
        writefln("{0}", node.toString());
Nov 06 2009
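On Don's point (2), the streaming concern can also be handled inside toString itself: later Phobos releases accept a toString that takes a sink delegate and pushes fragments to it, so no complete string is ever built. A minimal sketch follows, assuming that overload is available; Node is a hypothetical stand-in for an XML document type, not a real library class.

```d
import std.stdio : write, writeln;

// Hypothetical XML-like node, used only for illustration.
struct Node
{
    string tag;
    Node[] children;

    // Sink-based toString: every fragment is handed to the sink as soon
    // as it is produced, so the whole document is never concatenated.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink("<");
        sink(tag);
        sink(">");
        foreach (ref child; children)
            child.toString(sink);
        sink("</");
        sink(tag);
        sink(">");
    }
}

void main()
{
    auto doc = Node("a", [Node("b"), Node("c")]);
    // Stream the document to stdout piece by piece.
    doc.toString((const(char)[] piece) { write(piece); });
    writeln();
}
```

The caller decides where the pieces go (stdout, a file, a buffer), which is the property the plain string-returning toString lacks.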
Yigal Chripun wrote:
> The first issue you raise is IMO a problem with writefln and not with toString since writefln doesn't handle user-defined types properly. I think that writefln (btw, horrible name) should only deal with strings and their formatting and all other types need to provide an (optionally formatted) string. a numeric type would provide formatting of properties like number of decimal places, thousands separator, etc while user defined specification type could provide a type of standard format.
>     auto spec = new Specification(HTML);
>     string ansi = spec.toString(Specification.ANSI);
>     string iso = spec.toString(Specification.ISO);
> the c style format string that specifies types is a horrible horrible thing and should be removed.

How do you do %.3f in {}-notation?

Your formatting string should be written as

    writeln(ansi, " ", iso);
Nov 06 2009
On 06/11/2009 12:34, Pelle Månsson wrote:
> How do you do %.3f in {}-notation?

    writefln("{0:F3}", value);

> Your formatting string should be written as writeln(ansi, " ", iso);

That is incorrect, since in my example I use the format string to switch the order of the strings (hence the numbers inside the {}).

Please go and read the Tango documentation, starting with
http://www.dsource.org/projects/tango/wiki/TutCSharpFormatter
It also has links to the MSDN docs that describe the modifiers, for instance:
http://msdn.microsoft.com/en-us/library/dwhawy9k%28VS.100%29.aspx

This is one area in Phobos that needs to be rewritten from scratch or, better yet, should use Tango. I'm still waiting for when hell will freeze over and Tango and Phobos will be merged together in one consistent API.
Nov 06 2009
Yigal Chripun, on November 6 at 14:23, you wrote to me:
> > > the c style format string that specifies types is a horrible horrible thing and should be removed.
> > How do you do %.3f in {}-notation?
> writefln("{0:F3}", value);
> > Your formatting string should be written as writeln(ansi, " ", iso);

This is horrible, horrible for internationalization; you just can't assume how a language orders words.

Anyway, about the type in the format, I think it's nice. As you just proved, Tango has it too: "{0:F3}" is saying "treat the value as a float and format it that way". The deal is, the type should not be used to know the size of the parameter on the stack like in C's printf(); it should be just a hint to convert the value to another type.

So, type specification is important. Variable reordering is important too, and you even have it in POSIX's printf():

    printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);

(see http://www.opengroup.org/onlinepubs/9699919799/functions/printf.html)

I like printf()'s format (I don't know if it's just because I'm used to it though :).

--
Leandro Lucarella (AKA luca) http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
"All mail clients suck. This one just sucks less." -me, circa 1995
Nov 06 2009
Leandro Lucarella wrote:
> This is horrible, horrible for internationalization, you just can't assume how a language order words. Anyway, about the type in the format, I think it's nice, as you just proved, tango have it too "{0:F3}" is saying "treat the value as a float and format it that way". The deal is, the type should not be used to know the size of the parameter in the stack like in C's printf(), it should be just a hint to convert the value to another type.
> So, type specification is important. Variables reordering is important too, and you even have it in POSIX's printf():
>     printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);
> (see http://www.opengroup.org/onlinepubs/9699919799/functions/printf.html)
> I like printf()'s format (I don't know if it's just because I'm used to it though :).

I think you found a bug in Phobos. I tried this:

    import std.stdio;

    void main()
    {
        int hour = 1, min = 2, precision = 2, sec = 3;
        writef("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);
    }

and it prints

    1:002:003

But it should really print:

    1:02:03

right?

Andrei
Nov 06 2009
Andrei Alexandrescu, on November 6 at 08:50, you wrote to me:
> > So, type specification is important. Variables reordering is important too, and you even have it in POSIX's printf():
> >     printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);
> > (see http://www.opengroup.org/onlinepubs/9699919799/functions/printf.html)
> > I like printf()'s format (I don't know if it's just because I'm used to it though :).
> I think you found a bug in Phobos. I tried this:
>     import std.stdio;
>     void main() { int hour = 1, min = 2, precision = 2, sec = 3; writef("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec); }
> and it prints 1:002:003
> But it should really print: 1:02:03
> right?

Yes.

------------------------
$ cat t.c
#include <stdio.h>

int main()
{
    int hour = 1, min = 2, precision = 2, sec = 3;
    printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);
    return 0;
}
$ make t
cc t.c -o t
$ ./t
1:02:03
-----------------------

--
Leandro Lucarella (AKA luca) http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Vaporeso, finding himself enveloped by depression, decides to end his life by drinking Chinato Garda mixed 50% with kerosene. In the face of this harsh ordeal he loses mobility in his right limbs: lower and upper. At that moment he is regarded as the leading man of the Western left-wing movement.
Nov 06 2009
Leandro Lucarella wrote:
> Yes.
> ------------------------
> $ cat t.c
> #include <stdio.h>
> int main() { int hour = 1, min = 2, precision = 2, sec = 3; printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec); return 0; }
> $ make t
> cc t.c -o t
> $ ./t
> 1:02:03
> -----------------------

Thanks!

http://d.puremagic.com/issues/show_bug.cgi?id=3479

Andrei
Nov 06 2009
On 06/11/2009 15:38, Leandro Lucarella wrote:
> This is horrible, horrible for internationalization, you just can't assume how a language order words. Anyway, about the type in the format, I think it's nice, as you just proved, tango have it too "{0:F3}" is saying "treat the value as a float and format it that way". The deal is, the type should not be used to know the size of the parameter in the stack like in C's printf(), it should be just a hint to convert the value to another type.
> So, type specification is important. Variables reordering is important too, and you even have it in POSIX's printf():
>     printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);
> (see http://www.opengroup.org/onlinepubs/9699919799/functions/printf.html)
> I like printf()'s format (I don't know if it's just because I'm used to it though :).

F in the above is _not_ a type specifier. It is a format specifier that means "fixed". Moreover, each type defines its own format specifiers, and there's also a way to custom-format stuff.

Here are some more examples (from MSDN):

    string myName = "Fred";
    Console.WriteLine(String.Format("Name = {0}, hours = {1:hh}, minutes = {1:mm}", myName, DateTime.Now));
    // Depending on the current time, the example displays output like the following:
    // Name = Fred, hours = 11, minutes = 30

    string FormatString1 = String.Format("{0:dddd MMMM}", DateTime.Now);
    string FormatString2 = DateTime.Now.ToString("dddd MMMM");

    Console.WriteLine("{0:F}", DateTime.Now); // NOT float
    // F for DateTime means Full date/time pattern (long time).

Another aspect of the .NET design is that it's locale-aware, e.g.:

    // Display using pt-BR culture's short date format
    DateTime thisDate = new DateTime(2008, 3, 15);
    CultureInfo culture = new CultureInfo("pt-BR");
    Console.WriteLine(thisDate.ToString("d", culture)); // Displays 15/3/2008

Besides, the printf format is plain unreadable. It's like comparing ASCII to Unicode: D moved to native Unicode support and should move to this much better design as well.
Nov 06 2009
Yigal Chripun wrote:
> On 06/11/2009 12:34, Pelle Månsson wrote:
> > How do you do %.3f in {}-notation?
> writefln("{0:F3}", value);
> > Your formatting string should be written as writeln(ansi, " ", iso);
> That is incorrect since in my example I use the format string to switch the order of the strings. ( hence the numbers inside the {} )
> Please go and read the tango documentation starting with http://www.dsource.org/projects/tango/wiki/TutCSharpFormatter it has also links to the MSDN docs which describe the modifiers: for instance: http://msdn.microsoft.com/en-us/library/dwhawy9k%28VS.100%29.aspx
> This is one area in phobos that needs to be rewritten from scratch or better yet, use tango. I'm still waiting for when hell will freeze over and tango and phobos will be merged together in one consistent API.

Not sure to what extent it helps, but Phobos supports positional parameters too.

Andrei
Nov 06 2009
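For readers who have not seen positional parameters in Phobos, here is a minimal sketch of how they can be used to reorder arguments, assuming a current std.format. The helper name swapOrder and the sample values are made up for illustration and are not from the thread.

```d
import std.format : format;

// Hypothetical helper: the format string alone decides the output order.
// "%2$s" selects the second argument and "%1$s" the first.
string swapOrder(string first, string second)
{
    return format("%2$s %1$s", first, second);
}

void main()
{
    // Illustrative values; the names echo the ansi/iso example quoted above.
    string ansi = "ANSI text";
    string iso = "ISO text";
    assert(swapOrder(ansi, iso) == "ISO text ANSI text");

    // Precision is written printf-style: three digits after the point.
    assert(format("%.3f", 3.14159) == "3.142");
}
```

This reorders arguments without changing the call site, which is the property asked for when translating format strings.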
== Quote from Don (nospam nospam.com)'s article
> Justin Johansson wrote:
>> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology. If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless. Just thinking in the language round-up that this is (just another) one of the things we should address as a community. So what does "toString" mean to you?
> It's a hack from the early days of D. Should be unavailable unless the -debug flag is set, to discourage people from using it. I hate it.

Why? You've said this several times w/o giving your reason. IMHO toString() is a great way to get a default string representation of something. If you care about the formatting details, then you use a non-special method. How else would you recommend giving objects a sane default string representation?
Nov 05 2009
dsimcha wrote:
> Why? You've said this several times w/o giving your reason. IMHO toString() is a great way to get a default string representation of something.

And that's *exactly* what is wrong. There is *never* a good default for anything. Just look at all the discussion of nullable. (shit people even complain about float.init == NaN)

-- 
My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk
Nov 05 2009
Don Wrote:
> Justin Johansson wrote:
>> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology. If there are more than, say, two or three different views on the said semantics then my "ill-definition" assertion is surely correct. If there are no replies on this matter, then guess I'm left concludeless. Just thinking in the language round-up that this is (just another) one of the things we should address as a community. So what does "toString" mean to you?
> It's a hack from the early days of D. Should be unavailable unless the -debug flag is set, to discourage people from using it. I hate it.

There are some interesting replies coming along here. Thanks everybody for chipping in. I must admit though, when I read Don's reply just now the first thought that went through my mind was "Sweet!"

Justin
Nov 05 2009
Justin Johansson wrote: ...
> So what does "toString" mean to you?

Whatever you got, give it to me as a string for my printf debugging while my debugger is broken.
Nov 08 2009
Justin Johansson wrote:
> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.

My other reply didn't take the language-agnostic angle into account, sorry. The semantics of toString would depend on the object. I would think there are three general types of objects:
1. objects with only one sensible or one clear default string representation, like integers. Maybe even none of these exist (except strings themselves?)
2. objects that, given some formatting options or locale, have a clear string representation: floating points, dates, currency and the like.
3. objects that have no sensible default representation.
toString() would not make sense for type 3 objects, and for type 2 objects only as part of a formatting / localization package. toString() as a debugging aid sometimes doubles as a formatter for type 1 and type 2 objects, but that may be more confusing than it's worth.
Nov 08 2009
Lutger Wrote:
> Justin Johansson wrote:
>> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
> My other reply didn't take the language-agnostic angle into account, sorry. The semantics of toString would depend on the object. I would think there are three general types of objects:
> 1. objects with only one sensible or one clear default string representation, like integers. Maybe even none of these exist (except strings themselves?)
> 2. objects that, given some formatting options or locale, have a clear string representation: floating points, dates, currency and the like.
> 3. objects that have no sensible default representation.
> toString() would not make sense for type 3 objects, and for type 2 objects only as part of a formatting / localization package. toString() as a debugging aid sometimes doubles as a formatter for type 1 and type 2 objects, but that may be more confusing than it's worth.

Thanks for that Lutger. Do you think it would make better sense if programming languages/their libraries separated functions/methods which are currently loosely purposed as "toString" into methods which are more specific to the types you suggest (leaving only the types/classifications and number thereof to argue about)?

In my own D project, I've introduced a toDebugString method and left toString alone. There are times when I like D's default toString printing out the name of the object class. For debug purposes there are times also when I like to see a string printed out in quotes so you can tell the difference between "123" and 123. Then again, and since I'm working on a scripting language, sometimes I like to see debug output distinguish between different numeric types.

Anyway, going by the replies on this topic, it looks like most people view toString as being good for debug purposes and that's about it.

Cheers
Justin
Nov 08 2009
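The toDebugString idea described above could look roughly like the following. This is only a sketch under stated assumptions: ScriptValue and its fields are invented for illustration and are not taken from Justin's project.

```d
import std.format : format;

// Hypothetical script value; the type and field names are assumptions.
struct ScriptValue
{
    bool isText;
    long number;
    string text;

    // Plain display form: a string and a number can look identical here.
    string toString() const
    {
        return isText ? text : format("%d", number);
    }

    // Debug form: strings are quoted and numbers carry their type name.
    string toDebugString() const
    {
        return isText ? format("\"%s\"", text) : format("long(%d)", number);
    }
}

void main()
{
    auto s = ScriptValue(true, 0, "123");
    auto n = ScriptValue(false, 123, null);

    assert(s.toString() == n.toString());   // both display as 123
    assert(s.toDebugString() == `"123"`);   // quoted, so clearly a string
    assert(n.toDebugString() == "long(123)");
}
```

Keeping the two methods separate means the display form stays free of debugging noise, while the debug form can be as verbose as needed.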
Justin Johansson wrote:
> Lutger Wrote:
>> Justin Johansson wrote:
>>> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
>> My other reply didn't take the language-agnostic angle into account, sorry. The semantics of toString would depend on the object. I would think there are three general types of objects:
>> 1. objects with only one sensible or one clear default string representation, like integers. Maybe even none of these exist (except strings themselves?)
>> 2. objects that, given some formatting options or locale, have a clear string representation: floating points, dates, currency and the like.
>> 3. objects that have no sensible default representation.
>> toString() would not make sense for type 3 objects, and for type 2 objects only as part of a formatting / localization package. toString() as a debugging aid sometimes doubles as a formatter for type 1 and type 2 objects, but that may be more confusing than it's worth.
> Thanks for that Lutger. Do you think it would make better sense if programming languages/their libraries separated functions/methods which are currently loosely purposed as "toString" into methods which are more specific to the types you suggest (leaving only the types/classifications and number thereof to argue about)? In my own D project, I've introduced a toDebugString method and left toString alone. There are times when I like D's default toString printing out the name of the object class. For debug purposes there are times also when I like to see a string printed out in quotes so you can tell the difference between "123" and 123. Then again, and since I'm working on a scripting language, sometimes I like to see debug output distinguish between different numeric types. Anyway, going by the replies on this topic, it looks like most people view toString as being good for debug purposes and that's about it. Cheers Justin

Your design makes better sense (to me at least) because it is based on why you want a string from some object. Take .NET for example: it does provide very elaborate and nice formatting options based on toString() with parameters. For some types however, the default toString() gives you the name of the type itself, which is in no way related to formatting an object. You learn to work with it, but I find it a bit muddled.

As a last note, I think people view toString as a debug thing mostly because it is very underpowered.
Nov 10 2009
Lutger wrote:
> Justin Johansson wrote:
>> Lutger Wrote:
>>> Justin Johansson wrote:
>>>> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
>>> My other reply didn't take the language-agnostic angle into account, sorry. The semantics of toString would depend on the object. I would think there are three general types of objects:
>>> 1. objects with only one sensible or one clear default string representation, like integers. Maybe even none of these exist (except strings themselves?)
>>> 2. objects that, given some formatting options or locale, have a clear string representation: floating points, dates, currency and the like.
>>> 3. objects that have no sensible default representation.
>>> toString() would not make sense for type 3 objects, and for type 2 objects only as part of a formatting / localization package. toString() as a debugging aid sometimes doubles as a formatter for type 1 and type 2 objects, but that may be more confusing than it's worth.
>> Thanks for that Lutger. Do you think it would make better sense if programming languages/their libraries separated functions/methods which are currently loosely purposed as "toString" into methods which are more specific to the types you suggest (leaving only the types/classifications and number thereof to argue about)? In my own D project, I've introduced a toDebugString method and left toString alone. There are times when I like D's default toString printing out the name of the object class. For debug purposes there are times also when I like to see a string printed out in quotes so you can tell the difference between "123" and 123. Then again, and since I'm working on a scripting language, sometimes I like to see debug output distinguish between different numeric types. Anyway, going by the replies on this topic, it looks like most people view toString as being good for debug purposes and that's about it. Cheers Justin
> Your design makes better sense (to me at least) because it is based on why you want a string from some object. Take .NET for example: it does provide very elaborate and nice formatting options based on toString() with parameters. For some types however, the default toString() gives you the name of the type itself, which is in no way related to formatting an object. You learn to work with it, but I find it a bit muddled. As a last note, I think people view toString as a debug thing mostly because it is very underpowered.

There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
Nov 10 2009
Don wrote: ...
> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.

Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
Nov 10 2009
Lutger Wrote:
> Don wrote: ...
>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.

I have a feeling (and I may well be wrong) that toString might be used in relation to associative arrays. I implemented an AA recently based upon a struct key (I think). Though I cannot remember the exact details, I do remember DMD saying something about toString not implemented and so without thinking I gave the struct a toString and that kept DMD happy. Since the code was throw-away I didn't bother to investigate.

Like I say, I cannot remember the details but others may recall some similar experience. For all I know it may be a case of RTFM?

beers,
Justin
Nov 10 2009
On Tue, Nov 10, 2009 at 3:59 AM, Justin Johansson <no spam.com> wrote:
> Lutger Wrote:
>> Don wrote: ...
>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
> I have a feeling (and I may well be wrong) that toString might be used in relation to associative arrays. I implemented an AA recently based upon a struct key (I think). Though I cannot remember the exact details I do remember DMD saying something about toString not implemented and so without thinking I gave the struct a toString and that kept DMD happy. Since the code was throw-away I didn't bother to investigate. Like I say, I cannot remember the details but others may recall some similar experience. For all I know it may be a case of RTFM?

Shouldn't be the case. From TFM:
"""
Classes can be used as the KeyType. For this to work, the class definition must override the following member functions of class Object:
• hash_t toHash()
• bool opEquals(Object)
• int opCmp(Object)
"""

--bb
Nov 10 2009
Bill Baxter Wrote:
> On Tue, Nov 10, 2009 at 3:59 AM, Justin Johansson <no spam.com> wrote:
>> Lutger Wrote:
>>> Don wrote: ...
>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
>> I have a feeling (and I may well be wrong) that toString might be used in relation to associative arrays. I implemented an AA recently based upon a struct key (I think). Though I cannot remember the exact details I do remember DMD saying something about toString not implemented and so without thinking I gave the struct a toString and that kept DMD happy. Since the code was throw-away I didn't bother to investigate. Like I say, I cannot remember the details but others may recall some similar experience. For all I know it may be a case of RTFM?
> Shouldn't be the case. From TFM: """ Classes can be used as the KeyType. For this to work, the class definition must override the following member functions of class Object: hash_t toHash() bool opEquals(Object) int opCmp(Object) """ --bb

I think you are right; if I can dig up what it was, and if relevant to this discussion, I'll post it. Ignore what I said for the moment.

Just wondering now though, and in reference to Lutger's comment

> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.

how much core code would be broken if toString was actually banished?
Nov 10 2009
Lutger wrote:
> Don wrote: ...
>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.

I'm hoping someone will come up with a design. Straw man:

    void toString(void delegate(const(char)[]) sink, string fmt)
    {
        // fmt holds the format string from writefln/formatln.
        // call sink() to print partial results.
    }
Nov 10 2009
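To make the straw man above concrete, here is a minimal sketch of a type that emits its text piece by piece through a sink. IntList is a hypothetical container, and treating fmt as a per-element format string is an assumption of this sketch, not something the proposal specifies.

```d
import std.format : format;

// Hypothetical container used only for illustration.
struct IntList
{
    int[] items;

    // Signature follows the straw man; how fmt is interpreted is assumed.
    void toString(scope void delegate(const(char)[]) sink, string fmt) const
    {
        sink("[");
        foreach (i, item; items)
        {
            if (i > 0)
                sink(", ");
            sink(format(fmt, item)); // one small piece at a time
        }
        sink("]");
    }
}

void main()
{
    auto list = IntList([10, 255, 400]);

    // The caller decides what a "sink" is; here it appends to a buffer
    // and counts how many pieces arrived.
    char[] collected;
    size_t calls;
    list.toString((const(char)[] piece) { collected ~= piece; ++calls; }, "%x");

    assert(collected == "[a, ff, 190]");
    assert(calls == 7); // never one big string built inside IntList
}
```

A sink that writes straight to a file or socket would avoid holding the whole representation in memory, which is the streaming behaviour missing from the no-argument toString().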
Don Wrote:
> Lutger wrote:
>> Don wrote: ...
>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
> I'm hoping someone will come up with a design. Straw man:
>     void toString(void delegate(const(char)[]) sink, string fmt)
>     {
>         // fmt holds the format string from writefln/formatln.
>         // call sink() to print partial results.
>     }

That's starting to look like a "serialize" method!
Nov 10 2009
On Tue, 10 Nov 2009 07:49:11 -0500, Justin Johansson <no spam.com> wrote:
> Don Wrote:
>> Lutger wrote:
>>> Don wrote: ...
>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
>> I'm hoping someone will come up with a design. Straw man:
>>     void toString(void delegate(const(char)[]) sink, string fmt)
>>     {
>>         // fmt holds the format string from writefln/formatln.
>>         // call sink() to print partial results.
>>     }
> That's starting to look like a "serialize" method!

As it should. I should be able to print a 10000 element container without having to load a string representation of 10000 elements in memory.

I'd also like to see the name toString changed to something more appropriate, like output(). And although I think a direct translation is mostly possible, emulating writefln string formatting from tango would be a burden. I don't know if there's any way around it without coming up with some complicated "formatting provider" interface/object implementation, and I don't think it's worth it.

Unfortunately, I doubt Walter accepts this; it's been proposed in the past without success.

-Steve
Nov 10 2009
Steven Schveighoffer wrote:
> On Tue, 10 Nov 2009 07:49:11 -0500, Justin Johansson <no spam.com> wrote:
>> Don Wrote:
>>> Lutger wrote:
>>>> Don wrote: ...
>>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>>>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
>>> I'm hoping someone will come up with a design. Straw man:
>>>     void toString(void delegate(const(char)[]) sink, string fmt)
>>>     {
>>>         // fmt holds the format string from writefln/formatln.
>>>         // call sink() to print partial results.
>>>     }
>> That's starting to look like a "serialize" method!
> As it should. I should be able to print a 10000 element container without having to load a string representation of 10000 elements in memory. I'd also like to see the name toString changed to something more appropriate, like output(). And although I think a direct translation is mostly possible, emulating writefln string formatting from tango would be a burden. I don't know if there's any way around it without coming up with some complicated "formatting provider" interface/object implementation, and I don't think it's worth it. Unfortunately, I doubt Walter accepts this; it's been proposed in the past without success. -Steve

Walter does not feel strongly about Phobos. The save() method in "On Iteration" intentionally makes it possible to define ranges as interfaces, which in turn should pave the way towards defining a coherent text streaming mechanism.

Andrei
Nov 10 2009
On Tue, Nov 10, 2009 at 4:40 AM, Don <nospam nospam.com> wrote:
> Lutger wrote:
>> Don wrote: ...
>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
> I'm hoping someone will come up with a design. Straw man:
>     void toString(void delegate(const(char)[]) sink, string fmt)
>     {
>         // fmt holds the format string from writefln/formatln.
>         // call sink() to print partial results.
>     }

That looks pretty good, actually. I guess I would like to see plain no-arg toString() still supported.

A default toString() could be implemented in terms of the fancy one as:

    // needs: import std.exception : assumeUnique;
    string toString()
    {
        char[] buf;
        // the sink appends each partial result to buf
        toString((const(char)[] s) { buf ~= s; }, "");
        return assumeUnique(buf);
    }

could be a mixin in a library I suppose.

I think I would like to see the format strings not necessarily tied to writefln's particular format.

--bb
Nov 10 2009
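The convenience wrapper described above, a no-argument toString() built on top of the sink-based one, might be sketched as follows. Point is a hypothetical type, and std.array.appender is a substitution made for this sketch in place of the char[] buffer from the post.

```d
import std.array : appender;
import std.conv : to;

// Hypothetical type used only for illustration.
struct Point
{
    int x, y;

    // Sink-based version in the style of the straw man; fmt is ignored here.
    void toString(scope void delegate(const(char)[]) sink, string fmt) const
    {
        sink("(");
        sink(x.to!string);
        sink(", ");
        sink(y.to!string);
        sink(")");
    }

    // Convenience overload that gathers the pieces into one string.
    string toString() const
    {
        auto buf = appender!string();
        toString((const(char)[] piece) { buf.put(piece); }, "");
        return buf.data;
    }
}

void main()
{
    assert(Point(3, -4).toString() == "(3, -4)");
}
```

Since the wrapper is identical for every type, it is the kind of code that could live in a library mixin or a free function rather than being repeated per type.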
Bill Baxter wrote:
> On Tue, Nov 10, 2009 at 4:40 AM, Don <nospam nospam.com> wrote:
>> Lutger wrote:
>>> Don wrote: ...
>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
>> I'm hoping someone will come up with a design. Straw man:
>>     void toString(void delegate(const(char)[]) sink, string fmt)
>>     {
>>         // fmt holds the format string from writefln/formatln.
>>         // call sink() to print partial results.
>>     }
> That looks pretty good, actually. I guess I would like to see plain no-arg toString() still supported.

The thing is, the toString() function is essentially a virtual function present in every struct. Each one of those functions needs a very strong justification to exist.

> A default toString() could be implemented in terms of the fancy one as:
>     string toString()
>     {
>         char[] buf;
>         toString((const(char)[] s) { buf ~= s; }, "");
>         return assumeUnique(buf);
>     }
> could be a mixin in a library I suppose.

More for the benefit of consumers, or producers? Because

    void toString(void delegate(const(char)[]) sink, string fmt)
    {
        sink("xxx");
    }

isn't much more complex than:

    string toString()
    {
        return "xxx";
    }

other than the signature.

> I think I would like to see the format strings not necessarily tied to writefln's particular format.

I think the format strings are actually pretty similar, Tango vs writefln? There might be enough common ground. I think the Tango format is a slight superset of the writefln one.
Nov 10 2009
On Tue, Nov 10, 2009 at 7:29 AM, Don <nospam nospam.com> wrote:
> Bill Baxter wrote:
>> On Tue, Nov 10, 2009 at 4:40 AM, Don <nospam nospam.com> wrote:
>>> Lutger wrote:
>>>> Don wrote: ...
>>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>>>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
>>> I'm hoping someone will come up with a design. Straw man:
>>>     void toString(void delegate(const(char)[]) sink, string fmt)
>>>     {
>>>         // fmt holds the format string from writefln/formatln.
>>>         // call sink() to print partial results.
>>>     }
>> That looks pretty good, actually. I guess I would like to see plain no-arg toString() still supported.
> The thing is, the toString() function is essentially a virtual function present in every struct. Each one of those functions needs a very strong justification to exist.

Structs can't have virtual functions... so what do you mean?

>> A default toString() could be implemented in terms of the fancy one as:
>>     string toString()
>>     {
>>         char[] buf;
>>         toString((const(char)[] s) { buf ~= s; }, "");
>>         return assumeUnique(buf);
>>     }
>> could be a mixin in a library I suppose.
> More for the benefit of consumers, or producers?

Consumers. I was just thinking it would be a little annoying to have to reproduce the above 3-line snippet of code every time I want to get the string version of an object. But I guess such needs can be adequately served by std.string.format or sformat. So scratch that, no old-style toString() needed.

> Because
>     void toString(void delegate(const(char)[]) sink, string fmt)
>     {
>         sink("xxx");
>     }
> isn't much more complex than:
>     string toString()
>     {
>         return "xxx";
>     }
> other than the signature.

Yeh, for authors of toString methods it's fine. Well, a different way to write delegates would be nice, but that's a different discussion.

>> I think I would like to see the format strings not necessarily tied to writefln's particular format.
> I think the format strings are actually pretty similar, Tango vs writefln? There might be enough common ground. I think the Tango format is a slight superset of the writefln one.

Pretty similar, maybe, but I'd be surprised if they just happened to be identical without any attempt at compatibility having been made.

--bb
Nov 10 2009
Don wrote:
> Lutger wrote:
>> Don wrote: ...
>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.

How are you supposed to print a BigInt then?

>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
> I'm hoping someone will come up with a design. Straw man:
>     void toString(void delegate(const(char)[]) sink, string fmt)
>     {
>         // fmt holds the format string from writefln/formatln.
>         // call sink() to print partial results.
>     }

Just put it into an "interface DebugOutput", remove Object.toString(), and be done with it. That interface could be defined in the same module as writefln or format, and its use will be clear.
Nov 10 2009
grauzone wrote:
> Don wrote:
>> Lutger wrote:
>>> Don wrote: ...
>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
> How are you supposed to print a BigInt then?

How are you supposed to print one with it? It doesn't help. (The problem is even more obvious if you consider BigFloat).

> Just put it into an "interface DebugOutput", remove Object.toString(), and be done with it. That interface could be defined in the same module as writefln or format, and its use will be clear.

BigInt is a struct, so it doesn't have interfaces.
Nov 10 2009
Don wrote:
> grauzone wrote:
>> Don wrote:
>>> Lutger wrote:
>>>> Don wrote: ...
>>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>> How are you supposed to print a BigInt then?
> How are you supposed to print one with it? It doesn't help. (The problem is even more obvious if you consider BigFloat).
>> Just put it into an "interface DebugOutput", remove Object.toString(), and be done with it. That interface could be defined in the same module as writefln or format, and its use will be clear.
> BigInt is a struct, so it doesn't have interfaces.

Structs are a different matter. Nothing dictates that a struct should have a toString method, or what arguments that method should have, right? (There's this compiler/runtime hack to make struct toString work with writefln, but now that writefln uses compile time varargs, it can go.)
Nov 10 2009
grauzone wrote:
> Don wrote:
>> grauzone wrote:
>>> Don wrote:
>>>> Lutger wrote:
>>>>> Don wrote: ...
>>>>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>>> How are you supposed to print a BigInt then?
>> How are you supposed to print one with it? It doesn't help. (The problem is even more obvious if you consider BigFloat).
>>> Just put it into an "interface DebugOutput", remove Object.toString(), and be done with it. That interface could be defined in the same module as writefln or format, and its use will be clear.
>> BigInt is a struct, so it doesn't have interfaces.
> Structs are a different matter. Nothing dictates that a struct should have a toString method, or what arguments that method should have, right? (There's this compiler/runtime hack to make struct toString work with writefln, but now that writefln uses compile time varargs, it can go.)

This discussion is about that hack. Yes, it might be unnecessary if compile time varargs work sufficiently well.
Nov 10 2009
Don wrote:
> Lutger wrote:
>> Don wrote: ...
>>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
>> Since you are in the know and probably the biggest toString() hater around: are there plans (or rejections thereof) to change toString() before D2 turns gold? Seems to me it could break quite some code.
> I'm hoping someone will come up with a design. Straw man:
>     void toString(void delegate(const(char)[]) sink, string fmt)
>     {
>         // fmt holds the format string from writefln/formatln.
>         // call sink() to print partial results.
>     }

I think the best option for toString is to take an output range and write to it. (The sink is a simplified range.)

Andrei
Nov 10 2009
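For readers following the straw man above, here is a minimal sketch of how a struct might implement the sink-based signature and how a caller could collect the pieces. The `Point` type, the `render` helper, and the meaning given to `fmt` (only `"x"` for hexadecimal is understood) are assumptions made for illustration; they are not part of the proposal or of Phobos.

```d
import std.conv : to;

struct Point
{
    int x, y;

    // Straw-man signature from the post: emit partial results through `sink`.
    // Assumption: this sketch interprets fmt == "x" as "print in hex".
    void toString(scope void delegate(const(char)[]) sink, string fmt) const
    {
        immutable uint radix = (fmt == "x") ? 16 : 10;
        sink("(");
        sink(to!string(x, radix));
        sink(", ");
        sink(to!string(y, radix));
        sink(")");
    }
}

// Hypothetical helper: gathers everything the sink receives into one string.
string render(T)(auto ref const T value, string fmt = "")
{
    char[] buf;
    value.toString((const(char)[] piece) { buf ~= piece; }, fmt);
    return buf.idup;
}

void main()
{
    assert(render(Point(16, 32)) == "(16, 32)");
    assert(render(Point(16, 32), "x") == "(10, 20)");
}
```

The type never builds a complete string itself; whether the pieces end up in a buffer, a file, or a counter is the caller's decision.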
On Wed, 11 Nov 2009 02:49:54 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> I think the best option for toString is to take an output range and write to it. (The sink is a simplified range.)

It means toString() must be either a template, or accept an abstract InputRange interface?
Nov 10 2009
Denis Koroskin wrote:
> It means toString() must be either a template, or accept an abstract InputRange interface?

It should take an interface.

Andrei
Nov 10 2009
2009/11/10 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> Denis Koroskin wrote:
>> It means toString() must be either a template, or accept an abstract InputRange interface?
> It should take an interface.

So yet another type in object.d? Or require users to import something specific in every module that's going to use toString?

--bb
Nov 10 2009
Bill Baxter wrote:
> So yet another type in object.d? Or require users to import something specific in every module that's going to use toString?

I am not sure. Opinions as always are welcome.

Andrei
Nov 10 2009
On Tue, Nov 10, 2009 at 5:27 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Bill Baxter wrote:
>> So yet another type in object.d? Or require users to import something specific in every module that's going to use toString?
> I am not sure. Opinions as always are welcome.

That's why my opinion is that the delegate idea is nice. :-) But I guess toString is already defined by Object, right? So it would make sense for an interface needed by an Object method to be defined in object.d. I suppose it could be an interface defined inside the Object class itself? (Does that work? Can you define interfaces inside classes?)

--bb
Nov 10 2009
Bill Baxter wrote:
> That's why my opinion is that the delegate idea is nice. :-) But I guess toString is already defined by Object, right? So it would make sense for an interface needed by an Object method to be defined in object.d. I suppose it could be an interface defined inside the Object class itself? (Does that work? Can you define interfaces inside classes?)

It also needs to be used by structs, which aren't inherited from Object. So I don't see how a nested interface could work.

I suggest a design acceptance criterion: the simplest case should be about as simple as:

    return "xxx";

or

    put("xxx");
Nov 11 2009
On Wed, 11 Nov 2009 04:27:45 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Bill Baxter wrote:
>> So yet another type in object.d? Or require users to import something specific in every module that's going to use toString?
> I am not sure. Opinions as always are welcome.

Some ranges may be polymorphic, so having a base interface hierarchy in Phobos would be useful anyway.

BTW, save() is already implemented and used throughout Phobos under a different name: opSlice (i.e. auto copy = range[]). It's a bikeshed discussion, but why save() and not opSlice(), or even clone()?
Nov 11 2009
Denis Koroskin wrote:
> BTW, save() is already implemented and used throughout Phobos under a different name: opSlice (i.e. auto copy = range[]). It's a bikeshed discussion, but why save() and not opSlice(), or even clone()?

It can't be clone() because it doesn't clone. For example, say you have a T[]: one would expect clone() to actually copy the content. But using opSlice is a good idea.

Andrei
Nov 11 2009
On Wed, 11 Nov 2009 18:50:47 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> It can't be clone() because it doesn't clone. For example, say you have a T[]: one would expect clone() to actually copy the content. But using opSlice is a good idea.

Well, a range doesn't own any of the contents it covers, so a deep copy is impossible. Yet there is also the .dup array property, which pretends to be a standard way of creating instance copies.
Nov 11 2009
Denis Koroskin wrote:
> Well, a range doesn't own any of the contents it covers, so a deep copy is impossible. Yet there is also the .dup array property, which pretends to be a standard way of creating instance copies.

Well, so the second sentence contradicts the first. Let me put it another way: you have the entire vocabulary at your disposal to define save(). Wouldn't you think clone() may be a bit more confusing than others?

Andrei
Nov 11 2009
2009/11/11 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> Well, so the second sentence contradicts the first. Let me put it another way: you have the entire vocabulary at your disposal to define save(). Wouldn't you think clone() may be a bit more confusing than others?

makeBreadCrumb() ? :-)

--bb
Nov 11 2009
Denis:
> BTW, save() is already implemented and used throughout Phobos under a different name: opSlice (i.e. auto copy = range[]). It's a bikeshed discussion, but why save() and not opSlice(), or even clone()?

2009/11/11 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> It can't be clone() because it doesn't clone. For example, say you have a T[]: one would expect clone() to actually copy the content. But using opSlice is a good idea.

I don't get it. Shouldn't save() copy the content? Do you mean we could use opSlice() (the parameterless version) as a save function and write "auto r2 = r1[];"?

But, again, maybe I'm missing something: for dyn. arrays (aka the range archetype), opSlice is not a save, it's just an alias. So using opSlice doesn't work for remembering positions with arrays.

Philippe
Nov 11 2009
On Wed, 11 Nov 2009 20:08:52 +0300, Philippe Sigaud <philippe.sigaud gmail.com> wrote:
> But, again, maybe I'm missing something: for dyn. arrays (aka the range archetype), opSlice is not a save, it's just an alias. So using opSlice doesn't work for remembering positions with arrays.

It remembers array bounds, not contents.
Nov 11 2009
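The distinction drawn in this exchange (a slice or save() remembers bounds, while .dup copies contents) can be checked with a small sketch. It assumes a current Phobos, where the array range primitives live in `std.range.primitives`; the variable names are illustrative only.

```d
import std.range.primitives : popFront, save;

void main()
{
    int[] data = [1, 2, 3];

    auto r = data[];            // a range over the whole array
    auto checkpoint = r[];      // parameterless slice: copies the bounds only
    auto saved = r.save;        // for built-in arrays, save behaves the same way

    r.popFront();               // advance the original range
    assert(r == [2, 3]);
    assert(checkpoint == [1, 2, 3]);   // the earlier position is still reachable
    assert(saved == [1, 2, 3]);

    checkpoint[0] = 42;         // elements are shared, not copied
    assert(data[0] == 42);

    auto copy = data.dup;       // .dup is what actually duplicates the contents
    copy[1] = 7;
    assert(data[1] == 2);       // the original is unaffected
}
```

So for arrays, `r[]` does work as a position bookmark, but writes through it remain visible in the original, which is why a name like clone() could mislead.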
On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> I think the best option for toString is to take an output range and write to it. (The sink is a simplified range.)

Bad idea...

A range only makes sense as a struct, not an interface/object. I'll tell you why: performance.

Ranges are special in two respects:

1. They are foreachable. I think everyone agrees that calling 2 interface functions per loop iteration is much lower performing than using opApply, which calls one delegate function per loop. My recommendation -- use opApply when dealing with polymorphism. I don't think there's a way around this.

2. They are useful for passing to std.algorithm. But std.algorithm is template-interfaced. No need for using interfaces because the correct instantiation will be chosen.

If you are intending to add a streaming module that uses ranges, would it not be templated for the range type as std.algorithm is? If not, the next logical choice is a delegate, which requires no vtable lookup. Using an interface is just asking for a performance penalty for not much gain.

Here's what I mean by not much gain: I would expect a stream range that does output to have a method in it for outputting a buffer (I'd laugh at you if you wanted to define a stream range that outputs a character at a time). So the difference between:

    x.toString(outputRange, format)

and

    x.toString(&outputRange.sink, format)

is pretty darn minimal, and if outputRange is an interface or object, this saves a virtual call per buffer write. Plus the second form is more universal: you can pass any delegate, and not have to use a range type to wrap a delegate.

Don't fall into the "OOP newbie" trap -- where just because you've found a new concept that is amazing, you want to use it for everything. I say this because I've seen in the past where someone discovers the power of OOP and then wants to use it for everything, when in some cases, it's overkill. Just look at some Java "classes"...

From another thread:
> Walter does not feel strongly about Phobos.

Huh? I feel like this sentence doesn't make sense, so maybe there's a typo.

-Steve
Nov 12 2009
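To make the opApply recommendation concrete, here is a minimal sketch of polymorphic iteration through a single virtual opApply call, with the loop body invoked as a delegate once per element. `IntCollection`, `ArrayCollection`, and `sum` are hypothetical names used only for this illustration.

```d
// Hypothetical interface: iteration is exposed through opApply rather than
// through separate empty/front/popFront virtual methods.
interface IntCollection
{
    int opApply(scope int delegate(ref int) dg);
}

final class ArrayCollection : IntCollection
{
    private int[] items;

    this(int[] items) { this.items = items; }

    // One virtual call starts the loop; the body runs as one delegate
    // call per element.
    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref item; items)
        {
            if (auto result = dg(item))
                return result;   // propagate break/return from the loop body
        }
        return 0;
    }
}

int sum(IntCollection c)
{
    int total = 0;
    foreach (ref int value; c)   // the compiler rewrites this to c.opApply(...)
        total += value;
    return total;
}

void main()
{
    IntCollection c = new ArrayCollection([1, 2, 3, 4]);
    assert(sum(c) == 10);
}
```

This sketch shows only the shape of the technique; it makes no claim about measured performance relative to an interface-based range.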
On Thu, 12 Nov 2009 08:22:26 -0500, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> 1. They are foreachable. I think everyone agrees that calling 2 interface functions per loop iteration is much lower performing than using opApply, which calls one delegate function per loop. My recommendation -- use opApply when dealing with polymorphism. I don't think there's a way around this.

Oops, I meant 3 virtual functions -- front, popNext, and empty.

-Steve
Nov 12 2009
On Thu, 12 Nov 2009 16:23:22 +0300, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> Oops, I meant 3 virtual functions -- front, popNext, and empty.

Output range has only one method: put.

I'm not sure, but I don't think there is a performance difference between calling a virtual function through an interface and invoking a delegate.

But I agree passing a delegate is more generic. You can substitute an output range with a delegate (obj.toString(&range.put, fmt)) without any performance hit, but not vice versa (obj.toString(new DelegateWrapRange(&myput), fmt) implies an additional allocation and additional indirection per range.put call).
Nov 12 2009
On Thu, 12 Nov 2009 08:56:06 -0500, Denis Koroskin <2korden gmail.com> wrote:
> Output range has only one method: put.

I was referring to range's ability to interact with foreach. An output range wouldn't qualify as a foreachable entity anyways (and rightfully so). Just covering all the bases.

> I'm not sure, but I don't think there is a performance difference between calling a virtual function through an interface and invoking a delegate.

Yes, there is:

A delegate is equivalent to a struct member function call (load data pointer, i.e. this; push args; call function).

A virtual function uses a vtable to look up the function address, and then is equivalent to a struct member call.

An interface function call is equivalent to a virtual call with the added penalty that you might have to adjust the 'this' pointer before calling.

> But I agree passing a delegate is more generic. You can substitute an output range with a delegate (obj.toString(&range.put, fmt)) without any performance hit, but not vice versa (obj.toString(new DelegateWrapRange(&myput), fmt) implies an additional allocation and additional indirection per range.put call).

You can use scope classes to avoid the allocation, but you can't get around the virtual/interface call penalty.

But even if a range is a struct, it's simply a different form of delegate, one in which you undoubtedly call only one member function. Might as well use a delegate to allow the most usefulness.

-Steve
Nov 12 2009
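The substitution described in this exchange, passing `&range.put` where a delegate sink is expected, might look like the following sketch. `StringBuilder` and `Celsius` are hypothetical types invented for the example; the toString signature is the straw man quoted earlier in the thread, not an established Phobos API.

```d
import std.conv : to;

// Hypothetical output-range-like class with a single put method.
class StringBuilder
{
    char[] data;
    void put(const(char)[] s) { data ~= s; }
}

struct Celsius
{
    int degrees;

    // Straw-man signature from the thread; fmt is ignored in this sketch.
    void toString(scope void delegate(const(char)[]) sink, string fmt) const
    {
        sink(to!string(degrees));
        sink(" C");
    }
}

void main()
{
    auto temp = Celsius(21);

    // 1. An object with put() adapts to the delegate form without a wrapper:
    //    &builder.put is already a void delegate(const(char)[]).
    auto builder = new StringBuilder;
    temp.toString(&builder.put, "");
    assert(builder.data == "21 C");

    // 2. Any other delegate works as well, e.g. one that only counts characters.
    size_t count;
    temp.toString((const(char)[] s) { count += s.length; }, "");
    assert(count == 4);
}
```

Going the other way, from an arbitrary delegate to an interface-typed range, would need a wrapper object, which is the asymmetry Denis points out.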
Steven Schveighoffer wrote:
> A delegate is equivalent to a struct member function call (load data pointer, i.e. this; push args; call function).

I think this particular point is incorrect.

Andrei
Nov 12 2009
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
> Steven Schveighoffer wrote:
>> A delegate is equivalent to a struct member function call (load data pointer, i.e. this; push args; call function).
> I think this particular point is incorrect.
> Andrei

Most of the overhead from indirect function calls comes from the fact that they (usually) can't be inlined, not because they are indirect. The struct member function call is faster mostly because it can be inlined, not because it's direct.

Here's roughly what the ASM would look like for a call to a member function of a struct on the stack, if I is a metasyntactic variable for any immediate value:

    mov EAX, EBP;   // Copy frame pointer to EAX.
    add EAX, I;     // Add the offset of the struct to EAX.
    push EAX;       // EAX is now the this ptr. Push it.
    call I;         // Call the function.

And for a delegate that lives on the stack:

    mov EAX, [EBP + I];   // Move delegate's this ptr into EAX.
    push EAX;             // Push delegate's this ptr onto stack.
    call [EBP + I];       // Call whatever address is at offset I from EBP.

I've actually benchmarked how much indirect function calls cost compared to direct calls that aren't inlined. The short answer is it's not measurable, at least when calling the same function indirectly in a loop over and over. It could in theory cause pipeline stalls because it's a branch, but according to some Intel optimization manual Don posted here a while back, modern CPUs predict the address of indirect function calls in their branch predictor. This means that if the same path is taken again and again, the overhead will be negligible.
Nov 12 2009
Denis Koroskin wrote:
> Output range has only one method: put.
>
> I'm not sure, but I don't think there is a performance difference between calling a virtual function through an interface and invoking a delegate.
>
> But I agree passing a delegate is more generic. You can substitute an output range with a delegate (obj.toString(&range.put, fmt)) without any performance hit, but not vice versa (obj.toString(new DelegateWrapRange(&myput), fmt) implies an additional allocation and additional indirection per range.put call).

I think that, on the contrary, working with a delegate is less generic. A delegate is cost-wise much like a class with only one (non-final) method. Since we're taking that hit already, we may as well define actual interfaces and classes that have multiple methods. That makes things more flexible and more efficient.

Andrei
Nov 12 2009
Andrei Alexandrescu wrote:
> I think that, on the contrary, working with a delegate is less generic. A delegate is cost-wise much like a class with only one (non-final) method. Since we're taking that hit already, we may as well define actual interfaces and classes that have multiple methods. That makes things more flexible and more efficient.

How? It seems to introduce more requirements on the implementation, but I'm not seeing any benefit in exchange.

FWIW, with regard to performance, I can easily imagine the compiler being able to perform the equivalent of a "named return value" optimisation on a delegate return, giving some chance of inlining. That's a lot less obvious with an interface.
Nov 12 2009
Don wrote:
> How? It seems to introduce more requirements on the implementation, but I'm not seeing any benefit in exchange.

The benefit is that it allows writing all character widths.

> FWIW, with regard to performance, I can easily imagine the compiler being able to perform the equivalent of a "named return value" optimisation on a delegate return, giving some chance of inlining. That's a lot less obvious with an interface.

That seems plausible.

Andrei
Nov 12 2009
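One way to read the "all character widths" argument is that an interface can offer several put overloads, whereas a single delegate type is tied to one character type. The sketch below illustrates that reading only; `TextSink` and `Utf8Sink` are hypothetical names and do not reflect any interface actually proposed in the thread or present in Phobos.

```d
import std.utf : toUTF8;

// Hypothetical sink interface with one overload per character width.
interface TextSink
{
    void put(const(char)[] s);
    void put(const(wchar)[] s);
    void put(const(dchar)[] s);
}

// Collects everything as UTF-8, transcoding the wider inputs.
class Utf8Sink : TextSink
{
    char[] data;
    void put(const(char)[] s)  { data ~= s; }
    void put(const(wchar)[] s) { data ~= toUTF8(s); }
    void put(const(dchar)[] s) { data ~= toUTF8(s); }
}

void main()
{
    auto utf8 = new Utf8Sink;
    TextSink sink = utf8;

    string  a = "abc";
    wstring b = "def"w;
    dstring c = "ghi"d;

    sink.put(a);   // char overload
    sink.put(b);   // wchar overload
    sink.put(c);   // dchar overload

    assert(utf8.data == "abcdefghi");
}
```

A delegate-based design would need either three delegate parameters or a fixed choice of character type, which is the trade-off under debate here.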
Andrei Alexandrescu Wrote:
> I think that, on the contrary, working with a delegate is less generic. A delegate is cost-wise much like a class with only one (non-final) method. Since we're taking that hit already, we may as well define actual interfaces and classes that have multiple methods. That makes things more flexible and more efficient.

"Since we're taking that hit already, we may as well define actual interfaces and classes that have multiple methods."

Which do you mean -- interfaces, classes or both? Don't interfaces have a higher cost than classes?

Justin
Nov 12 2009
Justin Johansson wrote:
> "Since we're taking that hit already, we may as well define actual interfaces and classes that have multiple methods."
> Which do you mean -- interfaces, classes or both? Don't interfaces have a higher cost than classes?

My understanding is that the costs are comparable.

Andrei
Nov 12 2009
Steven Schveighoffer wrote:
> On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> I think the best option for toString is to take an output range and write to it. (The sink is a simplified range.)
> Bad idea... A range only makes sense as a struct, not an interface/object. I'll tell you why: performance.

You are right. If range interfaces accommodate block transfers, this problem may be addressed. I agree that one virtual call per character output would be overkill. (I seem to recall it's one of the reasons why C++'s iostreams are so inefficient.)

> Ranges are special in two respects:
> 1. They are foreachable. I think everyone agrees that calling 2 interface functions per loop iteration is much lower performing than using opApply, which calls one delegate function per loop. My recommendation -- use opApply when dealing with polymorphism. I don't think there's a way around this.
> 2. They are useful for passing to std.algorithm. But std.algorithm is template-interfaced. No need for using interfaces because the correct instantiation will be chosen.
> If you are intending to add a streaming module that uses ranges, would it not be templated for the range type as std.algorithm is? If not, the next logical choice is a delegate, which requires no vtable lookup. Using an interface is just asking for a performance penalty for not much gain.

I think the cost of calling through the delegate is roughly the same as a virtual call.

> Here's what I mean by not much gain: I would expect a stream range that does output to have a method in it for outputting a buffer (I'd laugh at you if you wanted to define a stream range that outputs a character at a time).

Well I'd laugh at you if you thought I'm that brain dead :o).

> So the difference between:
>     x.toString(outputRange, format)
> and
>     x.toString(&outputRange.sink, format)
> is pretty darn minimal, and if outputRange is an interface or object, this saves a virtual call per buffer write. Plus the second form is more universal, you can pass any delegate, and not have to use a range type to wrap a delegate.
> Don't fall into the "OOP newbie" trap -- where just because you've found a new concept that is amazing, you want to use it for everything. I say this because I've seen in the past where someone discovers the power of OOP and then wants to use it for everything, when in some cases, it's overkill. Just look at some Java "classes"...

There is no need to worry that I'll fall into at least that particular OOP newbie trap. What I think we should do is define a text output interface that allows writing individual characters of all widths and also arrays of all widths. That would be a universal means for text output.

    interface TextOutputStream {
        void put(dchar); // also accommodates char and wchar
        void put(in char[]);
        void put(in wchar[]);
        void put(in dchar[]);
    }

The toString method (re-baptized as toStream) would take such an interface. Better ideas are always welcome. Perhaps I'm falling into another OOP newbie trap! (Seriously!)

One possible course of action would be to extend the text output stream to print (and possibly format) some or all primitive types, a la today's phobos streams. That would make TextOutputStream fatter and more diluted, something that I don't like. But then we might define a FormattingTextOutputStream that extends TextOutputStream with all that stuff.

> From another thread:
>> Walter does not feel strongly about Phobos.
> Huh? I feel like this sentence doesn't make sense, so maybe there's a typo.

I meant to say, Walter does not want to do library design.

Andrei
Nov 12 2009
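For readers following the TextOutputStream proposal in the post above, here is a minimal sketch of how such an interface might be implemented and consumed. Only the interface itself comes from the post; StringOutputStream, Point, and the toStream body are hypothetical names made up for illustration, and the sketch assumes the current Phobos std.utf and std.conv modules.

```d
import std.conv : to;
import std.utf : encode, toUTF8;

// Interface as written in the post; everything below it is hypothetical.
interface TextOutputStream
{
    void put(dchar c);          // also accommodates char and wchar
    void put(in char[] s);
    void put(in wchar[] s);
    void put(in dchar[] s);
}

// Hypothetical in-memory implementation that stores everything as UTF-8.
class StringOutputStream : TextOutputStream
{
    private char[] buf;

    void put(dchar c)
    {
        char[4] tmp;                       // a code point needs at most 4 UTF-8 units
        immutable n = encode(tmp, c);
        buf ~= tmp[0 .. n];
    }
    void put(in char[] s)  { buf ~= s; }
    void put(in wchar[] s) { buf ~= toUTF8(s); }
    void put(in dchar[] s) { buf ~= toUTF8(s); }

    string result() const { return buf.idup; }
}

// Hypothetical user type that writes itself to a stream instead of returning a string.
struct Point
{
    int x, y;

    void toStream(TextOutputStream os) const
    {
        os.put("Point(");
        os.put(x.to!string);
        os.put(", "w);          // a UTF-16 literal goes through the wchar[] overload
        os.put(y.to!string);
        os.put(')');            // a single character goes through put(dchar)
    }
}
```

Each put call here is one interface call per block of text, not per character, which is the property both posters say they want.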
On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> You are right. If range interfaces accommodate block transfers, this problem may be addressed. I agree that one virtual call per character output would be overkill. (I seem to recall it's one of the reasons why C++'s iostreams are so inefficient.)

IIRC, I don't think C++ iostreams use polymorphism, and I don't think they use the "one char at a time" method.

> I think the cost of calling through the delegate is roughly the same as a virtual call.

Not exactly. I think you are right that struct member calls are faster than delegates, but only slightly. The difference being that a struct member call does not need to load the function address from the stack, it can hard-code the address directly. However, virtual calls have to be lower performing because you are doing two indirections, one to the class vtable, then one to the function address itself. Plus those two locations are most likely located on the heap, not the stack, and so may not be in the cache.

> What I think we should do is define a text output interface that allows writing individual characters of all widths and also arrays of all widths. That would be a universal means for text output.
>     interface TextOutputStream {
>         void put(dchar); // also accommodates char and wchar
>         void put(in char[]);
>         void put(in wchar[]);
>         void put(in dchar[]);
>     }
> The toString method (re-baptized as toStream) would take such an interface. Better ideas are always welcome. Perhaps I'm falling another OOP newbie trap! (Seriously!)

This still fits within a single function, which takes one of the 3 widths (pick one, they can all be translated to each other):

    void put(in char[] str)
    {
        foreach(dchar dc; str)
        {
            put((&dc)[0..1]);
        }
    }

Note that you probably want to build a buffer of dchars instead of putting one at a time, but you get the idea. Also, putting a single character is probably pretty uncommon, but can be handled in a similar fashion.

That being said, one other point that makes all this moot is -- toString is for debugging, not for general purpose. We don't need to support everything that is possible. You should be able to say "hey, toString only accepts char[], deal." Of course, you could substitute wchar[] or dchar[], but I think by far char[] is the most common (and is the default type for string literals).

That's not to say there is no reason to have a TextOutputStream object. Such a thing is perfectly usable for a toString which takes a char[] delegate sink, just pass &put. In fact, there could be a default toString function in Object that does just that:

    class Object {
        ...
        void toString(void delegate(in char[] buf) put, string fmt) const {}
        void toString(TextOutputStream tos, string fmt) const { toString(&tos.put, fmt); }
    }

Of course, then TextOutputStream has to be druntime-accessible, so maybe it's not a great idea... But there are ways around that:

    abstract class BaseTextOutputStream : TextOutputStream {
        void format(const Object o, string fmt) { o.toString(&this.put, fmt); }
    }

>> From another thread:
>>> Walter does not feel strongly about Phobos.
>> Huh? I feel like this sentence doesn't make sense, so maybe there's a typo.
> I meant to say, Walter does not want to do library design.

I'm trying to remember but I thought he did care about this particular issue, but it may be muddled in my memory. Also note that toString has special status from the compiler in regards to structs (that hack with the xtoString function in the struct's typeinfo), so it doesn't just affect library code.

-Steve
Nov 12 2009
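The delegate-sink form of toString argued for in the post above can be sketched roughly as follows. The signature mirrors the one in the post; Temperature and render are hypothetical names, a struct is used instead of a class only to keep the sketch clear of Object.toString, and the fmt argument is accepted but ignored.

```d
import std.conv : to;

// Hypothetical value type using the sink-style signature from the post.
struct Temperature
{
    int degrees;

    // Writes the textual form in pieces; no intermediate string is built here.
    // `fmt` is part of the proposed signature but unused in this sketch.
    void toString(scope void delegate(in char[] buf) put, string fmt) const
    {
        put("Temperature(");
        put(degrees.to!string);
        put(")");
    }
}

// Hypothetical helper: the "simple inner function" acts as the sink
// and collects the pieces into one string.
string render(in Temperature t)
{
    char[] acc;
    void sink(in char[] chunk) { acc ~= chunk; }
    t.toString(&sink, null);
    return acc.idup;
}
```

The caller decides where the text goes (an array, a file, a socket) by choosing the delegate, which is the "more universal" property claimed for this form.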
Steven Schveighoffer wrote:
> IIRC, I don't think C++ iostreams use polymorphism

Oh yes they do. (Did you even google?) Virtual multiple inheritance, the works.

http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/

> , and I don't think they use the "one char at a time" method.

Well they do offer one char at a time and also a block transfer.

http://msdn.microsoft.com/en-us/library/760t8w1z%28VS.80%29.aspx

I'm not sure how the heck but they still manage to call one virtual method per char, otherwise they'd be plenty fast, which they aren't. I seem to recall write() has a default implementation that calls put() in a loop or something. It's not a topic that I want to study closely. iostreams suck, why spend time on learning the quirks of a broken design.

> Not exactly. I think you are right that struct member calls are faster than delegates, but only slightly. The difference being that a struct member call does not need to load the function address from the stack, it can hard-code the address directly. However, virtual calls have to be lower performing because you are doing two indirections, one to the class vtable, then one to the function address itself. Plus those two locations are most likely located on the heap, not the stack, and so may not be in the cache.

I think the only way to figure is to measure. For one thing I disagree with the comment about the cache - a vtable is quite likely to be warm after a couple of calls. I know one thing - Walter's old format function used delegates and it was unusably slow.

> This still fits within a single function, which takes one of the 3 widths (pick one, they can all be translated to each other):
>     void put(in char[] str) { foreach(dchar dc; str) { put((&dc)[0..1]); } }
> Note that you probably want to build a buffer of dchars instead of putting one at a time, but you get the idea.

I don't get the idea. I'm seeing one virtual call per character.

> Also, putting a single character is probably pretty uncommon, but can be handled in a similar fashion.

I'm not sure about the uncommonality of outputting one character, but it may be good to discourage it just to not foster slow code.

> That being said, one other point that makes all this moot is -- toString is for debugging, not for general purpose. We don't need to support everything that is possible. You should be able to say "hey, toString only accepts char[], deal." Of course, you could substitute wchar[] or dchar[], but I think by far char[] is the most common (and is the default type for string literals).

I was hoping we could elevate the usefulness of toString a bit.

> That's not to say there is no reason to have a TextOutputStream object. Such a thing is perfectly usable for a toString which takes a char[] delegate sink, just pass &put. In fact, there could be a default toString function in Object that does just that:
>     class Object {
>         ...
>         void toString(void delegate(in char[] buf) put, string fmt) const {}
>         void toString(TextOutputStream tos, string fmt) const { toString(&tos.put, fmt); }
>     }

I'd agree with the delegate idea if we established that UTF-8 is favored compared to all other formats.

Andrei
Nov 12 2009
On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Steven Schveighoffer wrote:
>> IIRC, I don't think C++ iostreams use polymorphism
> Oh yes they do. (Did you even google?) Virtual multiple inheritance, the works.
> http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/

From my C++ book, it appears to only use virtual inheritance. I don't know enough about virtual inheritance to know how that changes function calls. As far as virtual functions, only the destructor is virtual, so there is no issue there.

>>     void put(in char[] str) { foreach(dchar dc; str) { put((&dc)[0..1]); } }
>> Note that you probably want to build a buffer of dchars instead of putting one at a time, but you get the idea.
> I don't get the idea. I'm seeing one virtual call per character.

You missed the note. I didn't implement it, but you could easily implement a stack-allocated buffer to cache the conversions, passing multiple converted code-points at once. But I don't think it's even worth discussing per my other points.

>> That being said, one other point that makes all this moot is -- toString is for debugging, not for general purpose. We don't need to support everything that is possible. You should be able to say "hey, toString only accepts char[], deal." Of course, you could substitute wchar[] or dchar[], but I think by far char[] is the most common (and is the default type for string literals).
> I was hoping we could elevate the usefulness of toString a bit.

Whatever kind of data the output stream gets, it's going to convert it to the format it wants anyways (as for stdout, I think that would be utf8), the only benefit is if you have data stored in a different width that you wanted to output. Calling a conversion function in that case I think is reasonable enough, and saves the output stream from having to convert/deal with it.

In other words, I don't think it's going to be that common a case where you need anything other than utf8 output, and therefore I don't think the cost of creating an interface, making virtual calls, disallowing simple delegate passing etc is worth the convenience *just in case* you have data stored as wchar[] you want to output.

> I'd agree with the delegate idea if we established that UTF-8 is favored compared to all other formats.

D seems to favor UTF8 -- it is the default type for string literals. I don't think I've ever used dchar, and I usually only use wchar to talk to Win32 functions when required.

The question I'd ask is -- how common is it where the versions other than char[] would be more convenient?

-Steve
Nov 12 2009
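The "stack-allocated buffer to cache the conversions" mentioned in the post above was not implemented in the thread. One way it might look is sketched below; putTranscoded, the sink parameter, and the buffer size of 64 are all assumptions chosen for illustration.

```d
// Hypothetical adapter: forwards UTF-8 text to a UTF-32 sink in batches,
// so the sink is called once per chunk rather than once per code point.
void putTranscoded(in char[] str, scope void delegate(in dchar[]) sink)
{
    dchar[64] buf;              // lives on the stack; the size 64 is arbitrary
    size_t n = 0;
    foreach (dchar dc; str)     // foreach decodes the UTF-8 input into code points
    {
        buf[n++] = dc;
        if (n == buf.length)    // buffer full: hand the whole chunk to the sink
        {
            sink(buf[0 .. n]);
            n = 0;
        }
    }
    if (n > 0)
        sink(buf[0 .. n]);      // flush whatever is left
}
```

With this shape, a 100-character input reaches the sink in two calls instead of 100, which addresses the "one virtual call per character" objection at the cost of a small fixed buffer.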
Steven Schveighoffer wrote:
> From my C++ book, it appears to only use virtual inheritance. I don't know enough about virtual inheritance to know how that changes function calls. As far as virtual functions, only the destructor is virtual, so there is no issue there.

You're right, but there is an issue because as far as I can recall these functions' implementations do end up calling a virtual function per char; that might be streambuf.overflow. I'm not keen on investigating this any further, but I'd be grateful if you shared any related knowledge.

At the end of the day, there seems to be violent agreement that we don't want one virtual call per character or one delegate call per character.

> Whatever kind of data the output stream gets, it's going to convert it to the format it wants anyways (as for stdout, I think that would be utf8), the only benefit is if you have data stored in a different width that you wanted to output.

I'm not sure.

http://www.gnu.org/s/libc/manual/html_node/Streams-and-I18N.html#Streams-and-I18N

gnu defines means to set and detect a utf-16 console, which dmd observes (grep std/ for fwide). But then I'm not sure how many are using that kind of stuff.

> D seems to favor UTF8 -- it is the default type for string literals. I don't think I've ever used dchar, and I usually only use wchar to talk to Win32 functions when required.
> The question I'd ask is -- how common is it where the versions other than char[] would be more convenient?

I don't know. I think Asian-language users might give a salient answer.

Andrei
Nov 12 2009
On Thu, Nov 12, 2009 at 10:46 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>>> I'd agree with the delegate idea if we established that UTF-8 is favored compared to all other formats.
>> D seems to favor UTF8 -- it is the default type for string literals. I don't think I've ever used dchar, and I usually only use wchar to talk to Win32 functions when required.
>> The question I'd ask is -- how common is it where the versions other than char[] would be more convenient?
> I don't know. I think Asian-language users might give a salient answer.

This isn't authoritative, but I don't think utf-16 is commonly used in Japan (except for calling Windows APIs). If you look at Mozilla the default Japanese encoding listed is Shift-JIS. A lot of Japanese email still gets sent as ISO-2022-JP. Otherwise utf-8 I think. A quick look at www.asahi.com shows they're using EUC-JP. nicovideo.jp is using utf-8. I seem to recall that my Japanese Visual Studio even saved files in utf-8, or at least could be set to use utf-8.

In short, I think utf-8 is closer to being a widely accepted standard for documents over there than utf-16 is.

--bb
Nov 12 2009
On Thu, 12 Nov 2009 13:46:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> From my C++ book, it appears to only use virtual inheritance. I don't know enough about virtual inheritance to know how that changes function calls. As far as virtual functions, only the destructor is virtual, so there is no issue there.
> You're right, but there is an issue because as far as I can recall these functions' implementation do end up calling a virtual function per char; that might be streambuf.overflow. I'm not keen on investigating this any further, but I'd be grateful if you shared any related knowledge.

Yep, you are right. It appears the reason they do this is so the conversion to the appropriate width can be done per character (and is a no-op for char).

> At the end of the day, there seem to be violent agreement that we don't want one virtual call per character or one delegate call per character.

After running my tests, it appears the virtual call vs. delegate difference is so negligible, and the virtual call vs. direct call difference is only slightly less negligible, that I think the virtualness may not matter. However, I think avoiding one *call* per character is a worthy goal.

This doesn't mean I change my mind :) I still think there is little benefit to having to conjure up an entire object just to convert something to a string vs. writing a simple inner function.

One way to find out is to support only char[], and see who complains :) It'd be much easier to go from supporting char[] to supporting all the widths than going from supporting all to just one.

-Steve
Nov 12 2009
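The tests mentioned in the post above are not shown in the thread. A comparison of that kind might be set up along the following lines; this is only a sketch, with Sink, CountingSink, timeInterface and timeDelegate as hypothetical names, and it assumes the std.datetime.stopwatch module from current Phobos.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;

// Hypothetical sink interface with a single block-write method.
interface Sink
{
    void put(in char[] s);
}

// Hypothetical implementation that only counts the characters it receives.
class CountingSink : Sink
{
    size_t total;
    void put(in char[] s) { total += s.length; }
}

// Time `iterations` calls made through the interface (virtual dispatch).
Duration timeInterface(Sink sink, size_t iterations)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. iterations)
        sink.put("abc");
    return sw.peek();
}

// Time `iterations` calls made through a delegate.
Duration timeDelegate(scope void delegate(in char[]) put, size_t iterations)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. iterations)
        put("abc");
    return sw.peek();
}
```

Calling timeInterface(sink, n) and timeDelegate(&sink.put, n) on the same CountingSink gives two durations to compare. Any numbers from a loop like this depend heavily on the compiler, optimization flags, and hardware, so they say little beyond the machine they were run on.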
Steven Schveighoffer wrote:
> After running my tests, it appears the virtual call vs. delegate is so negligible, and the virtual call vs. direct call is only slightly less negligible, I think the virtualness may not matter. However, I think avoiding one *call* per character is a worthy goal.
> This doesn't mean I change my mind :) I still think there is little benefit to having to conjure up an entire object just to convert something to a string vs. writing a simple inner function.
> One way to find out is to support only char[], and see who complains :) It'd be much easier to go from supporting char[] to supporting all the widths than going from supporting all to just one.

One problem I just realized is that, if we e.g. offer only put(in char[]) or a delegate to that effect, we make it impossible to output one character efficiently. The (&c)[0 .. 1] trick will not work in safe mode. You'd have to allocate a one-element array dynamically.

Also, many OSs adopted UTF-16 as their standard format. It may be wise to design for compatibility.

Andrei
Nov 12 2009
On Thu, 12 Nov 2009 14:40:12 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> One problem I just realized is that, if we e.g. offer only put(in char[]) or a delegate to that effect, we make it impossible to output one character efficiently. The (&c)[0 .. 1] trick will not work in safe mode. You'd have to allocate a one-element array dynamically.

    char[1] buf;
    buf[0] = c;
    put(buf);

Although it would be a useful feature to be able to convert a value type to an array of one element reference, especially since that should be as safe as taking a slice of a static array.

Another solution, although I'm unaware of the added costs:

    void toString(void delegate(in char[]...) put, string fmt);

> Also, many OSs adopted UTF-16 as their standard format. It may be wise to design for compatibility.

So you want toString's to look like this?

    version(utf16isdefault)
    {
        textobj.put("Array: "w);
        ...
    }
    else
    {
        textobj.put("Array: ");
        ...
    }

-Steve
Nov 12 2009
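The `in char[]...` idea from the post above can be tried with an ordinary function. What follows is a minimal sketch, assuming current D2 syntax for typesafe variadic parameters; `put` and `collected` are hypothetical names chosen for illustration, and whether the delegate form written in the post behaves the same way was not checked here.

```d
import std.stdio : writeln;

string collected;   // hypothetical output target, for illustration only

// Typesafe variadic parameter: callers may pass an existing slice
// or a list of individual chars, which the compiler packs into an array.
void put(const(char)[] pieces...)
{
    // Appending copies the characters, so the temporary array
    // built for the call is not kept after put returns.
    collected ~= pieces;
}

void main()
{
    put("Array: ");      // an existing slice
    put('[');            // a single character, no explicit one-element buffer
    put('1', ',', '2');  // several characters in one call
    put(']');
    writeln(collected);  // prints "Array: [1,2]"
}
```

The caller never writes the one-element buffer by hand; the cost of the temporary array the compiler builds is the open question raised in the post.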
Steven Schveighoffer wrote:
> On Thu, 12 Nov 2009 14:40:12 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> One problem I just realized is that, if we e.g. offer only put(in char[]) or a delegate to that effect, we make it impossible to output one character efficiently. The (&c)[0 .. 1] trick will not work in safe mode. You'd have to allocate a one-element array dynamically.
>
>     char[1] buf;
>     buf[0] = c;
>     put(buf);

This would not compile in SafeD.

> Although it would be a useful feature to be able to convert a value type to an array of one element reference, especially since that should be as safe as taking a slice of a static array.
>
> Another solution, although I'm unaware of the added costs:
>
>     void toString(void delegate(in char[]...) put, string fmt);

>> Also, many OSs adopted UTF-16 as their standard format. It may be wise to design for compatibility.
> So you want toString's to look like this?
>
>     version(utf16isdefault)
>     {
>         textobj.put("Array: "w);
>         ...
>     }
>     else
>     {
>         textobj.put("Array: ");
>         ...
>     }
>
> -Steve

I was just thinking of offering an interface that offers utf8 and utf16 and utf32.

Andrei
Nov 12 2009
On Thu, 12 Nov 2009 16:19:39 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Steven Schveighoffer wrote:
>>     char[1] buf;
>>     buf[0] = c;
>>     put(buf);
> This would not compile in SafeD.

:O Why not? I would expect that using a local buffer would be the main way for converting non-string things to strings, or to avoid calling the delegate/vfunction lots of times.

i.e. if I want to output an integer i:

    if(i == 0)
        put("0");
    else
    {
        // assumes i > 0; a sign would need separate handling
        char[20] buf;
        int idx = buf.length - 1;
        while(i != 0)
        {
            buf[idx] = cast(char)('0' + i % 10); // store the digit character, not the raw value
            --idx;
            i /= 10;
        }
        put(buf[idx + 1 .. $]); // idx is one before the first digit; no compily in SafeD???
    }

Do I have to allocate a heap buffer in SafeD?

>>> Also, many OSs adopted UTF-16 as their standard format. It may be wise to design for compatibility.
>> So you want toString's to look like this?
>>
>>     version(utf16isdefault)
>>     {
>>         textobj.put("Array: "w);
>>         ...
>>     }
>>     else
>>     {
>>         textobj.put("Array: ");
>>         ...
>>     }
> I was just thinking of offering an interface that offers utf8 and utf16 and utf32.

Yes, and your explanation for this is that many OSes adopt UTF-16 as their standard format. My expectation is that the outputter will convert to the required OS format anyway, regardless of what you pass it, so why should we write code to cater to what the OS wants? I'd like to write string-handling code once and be done with it, not try to optimize my toString functions so that they use the "right" methods for the current OS.

I asserted that the only reason you want to use the functions other than the char[] version is the case where your data is *stored* as wchar[] or dchar[]. Otherwise, it makes no sense to do the conversion because the outputter already does it for you. So the question becomes: how often do you need to output data that's already in dchar[] or wchar[] format, and is it worth passing around a list of functions just in case you need that, or should you just call a conversion routine the few times you need it?

Let's not forget that this is mainly for debugging...

-Steve
Nov 12 2009
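The digit loop in the post above is a fragment, so here is a self-contained variant that can be compiled and run. It is only a sketch, assuming current D2: `putInt` and `intToText` are hypothetical helper names, the sink signature is an assumption, and the sign handling is an addition that the post does not cover.

```d
import std.stdio : writeln;

// Formats `value` into a stack buffer and hands a single slice to `put`.
void putInt(long value, scope void delegate(const(char)[]) put)
{
    char[20] buf;                  // enough for "-9223372036854775808"
    size_t idx = buf.length;       // index of the first character written so far
    immutable negative = value < 0;

    // Work on the unsigned magnitude so that long.min does not overflow.
    ulong rest = negative ? 0UL - cast(ulong) value : cast(ulong) value;

    do
    {
        buf[--idx] = cast(char)('0' + rest % 10);
        rest /= 10;
    } while (rest != 0);

    if (negative)
        buf[--idx] = '-';

    put(buf[idx .. $]);            // one sink call for the whole number
}

// Collects the sink output into a string, for demonstration.
string intToText(long value)
{
    string result;
    putInt(value, (const(char)[] piece) { result ~= piece; });
    return result;
}

void main()
{
    writeln(intToText(0));        // prints "0"
    writeln(intToText(-1234));    // prints "-1234"
    writeln(intToText(long.min)); // prints "-9223372036854775808"
}
```

The sink is called once per number rather than once per digit, which is the goal stated in the thread. Whether the slice of `buf` is accepted in safe code is exactly the point under dispute and is not settled by this sketch.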
Steven Schveighoffer wrote:
> On Thu, 12 Nov 2009 16:19:39 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> This would not compile in SafeD.
> :O Why not? I would expect that using a local buffer would be the main way for converting non-string things to strings, or to avoid calling the delegate/vfunction lots of times.

Well a stack-allocated buffer is stack-allocated, and passing a slice out of it to a function may cause the function to escape the slice.

> i.e. if I want to output an integer i:
>
>     if(i == 0)
>         put("0");
>     else
>     {
>         char[20] buf;
>         int idx = buf.length - 1;
>         while(i != 0)
>         {
>             buf[idx] = cast(char)('0' + i % 10);
>             --idx;
>             i /= 10;
>         }
>         put(buf[idx + 1 .. $]); // no compily in SafeD???
>     }
>
> Do I have to allocate a heap buffer in SafeD?

I'm afraid so. Unless of course you have a put(dchar) routine handy :o).

>> I was just thinking of offering an interface that offers utf8 and utf16 and utf32.
> Yes, and your explanation for this is that many OSes adopt UTF-16 as their standard format. My expectation is that the outputter will convert to the required OS format anyway, regardless of what you pass it, so why should we write code to cater to what the OS wants? I'd like to write string-handling code once and be done with it, not try to optimize my toString functions so that they use the "right" methods for the current OS.
>
> I asserted that the only reason you want to use the functions other than the char[] version is the case where your data is *stored* as wchar[] or dchar[]. Otherwise, it makes no sense to do the conversion because the outputter already does it for you. So the question becomes: how often do you need to output data that's already in dchar[] or wchar[] format, and is it worth passing around a list of functions just in case you need that, or should you just call a conversion routine the few times you need it?
>
> Let's not forget that this is mainly for debugging...

If it's mainly for debugging maybe it's not worth spending time on.

Andrei
Nov 12 2009
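The concern about a function escaping the slice can be made concrete. The sketch below assumes current D2 and uses hypothetical names throughout (`keep`, `copy`, `firstPieces`); it compares a sink that stores the slice it receives with one that copies the characters, while the producer reuses its stack buffer between calls.

```d
import std.stdio : writeln;

// Returns [first piece as seen by the slice-keeping sink,
//          first piece as seen by the copying sink].
string[2] firstPieces()
{
    const(char)[][] kept;   // sink state: the slices themselves
    string[] copied;        // sink state: independent copies

    void keep(const(char)[] piece) { kept ~= piece; }        // escapes the slice
    void copy(const(char)[] piece) { copied ~= piece.idup; } // copies the data

    char[4] buf = "AAAA";   // stack buffer owned by the producer
    keep(buf[]);
    copy(buf[]);

    buf[] = 'B';            // the producer reuses its buffer
    keep(buf[]);
    copy(buf[]);

    // Read the kept slice while buf is still alive; after this function
    // returns, the kept slices would refer to dead stack memory.
    return [kept[0].idup, copied[0]];
}

void main()
{
    auto r = firstPieces();
    writeln(r[0]); // prints "BBBB": the kept slice follows the buffer
    writeln(r[1]); // prints "AAAA": the copy is unaffected
}
```

A sink that copies is safe to hand a stack slice; a sink that stores the slice is not, and the caller cannot tell the two apart from the signature alone. That is presumably why a checked subset of the language would have to reject the call.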
On Thu, Nov 12, 2009 at 1:54 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> Let's not forget that this is mainly for debugging...
> If it's mainly for debugging maybe it's not worth spending time on.

Nonsense! Developers spend a lot of time debugging. Helping people debug their programs is certainly worth spending time on.

--bb
Nov 12 2009
Bill Baxter wrote:
> On Thu, Nov 12, 2009 at 1:54 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>>> Let's not forget that this is mainly for debugging...
>> If it's mainly for debugging maybe it's not worth spending time on.
> Nonsense! Developers spend a lot of time debugging. Helping people debug their programs is certainly worth spending time on.
> --bb

Sorry sorry. I just meant to say it's not worth coming up with an airtight design. We might afford some extra conversions and extra virtual calls I guess.

But that being said, I'd so much want to start thinking of an actual text serialization infrastructure. Why develop one later with the mention "well use that stuff for debugging only, this is the real stuff"?

Andrei
Nov 12 2009
On Thu, 12 Nov 2009 17:13:06 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Bill Baxter wrote:
>> On Thu, Nov 12, 2009 at 1:54 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>>>> Let's not forget that this is mainly for debugging...
>>> If it's mainly for debugging maybe it's not worth spending time on.
>> Nonsense! Developers spend a lot of time debugging. Helping people debug their programs is certainly worth spending time on.
>> --bb
> Sorry sorry. I just meant to say it's not worth coming with an airtight design. We might afford some extra conversions and extra virtual calls I guess. But that being said, I'd so much want to start thinking of an actual text serialization infrastructure. Why develop one later with the mention "well use that stuff for debugging only, this is the real stuff."

The main purpose to serialize is to be able to deserialize. The main reason to print debug information is so a person can read it. I don't know if those two goals overlap enough.

I think we need both. Maybe one uses the other, I'm not sure, but a way to say "here's how you interact with writefln and friends" would be very nice.

-Steve
Nov 12 2009
Steven Schveighoffer wrote:
> On Thu, 12 Nov 2009 17:13:06 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> Sorry sorry. I just meant to say it's not worth coming with an airtight design. We might afford some extra conversions and extra virtual calls I guess. But that being said, I'd so much want to start thinking of an actual text serialization infrastructure. Why develop one later with the mention "well use that stuff for debugging only, this is the real stuff."
> The main purpose to serialize is to be able to deserialize. The main reason to print debug information is so a person can read it. I don't know if those two goals overlap enough.
>
> I think we need both. Maybe one uses the other, I'm not sure, but a way to say "here's how you interact with writefln and friends" would be very nice.
>
> -Steve

I'd add to that that a format facility should be locale-aware, as in .NET, i.e. (pseudo-code):

    auto str = format("{0}", 2.4, CurrentCulture); // or specify a specific locale

str will be either "2.4" or "2,4" based on locale. This serves an entirely different purpose from serialization, even though both have common parts. You can't and shouldn't try to de-serialize the above text representation.
Nov 12 2009
On Thu, 12 Nov 2009 16:54:13 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> Let's not forget that this is mainly for debugging...
> If it's mainly for debugging maybe it's not worth spending time on.

Debugging is not always done by the developer on his system where a debugger is available. The main use I see for toString is logging (for the purpose of debugging post-mortem failures on customers' systems).

-Steve
Nov 12 2009
On Thu, 12 Nov 2009 11:14:56 -0500, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> I think the cost of calling through the delegate is roughly the same as a virtual call.
> Not exactly. I think you are right that struct member calls are faster than delegates, but only slightly. The difference being that a struct member call does not need to load the function address from the stack, it can hard-code the address directly. However, virtual calls have to be lower performing because you are doing two indirections, one to the class vtable, then one to the function address itself. Plus those two locations are most likely located on the heap, not the stack, and so may not be in the cache.

Some rudimentary attempts at benchmarking:

testme.d:

    struct S
    {
        void foo(int x){}
    }

    interface I
    {
        void foo(int x);
    }

    class C : I
    {
        void foo(int x){}
    }

    const loopcount = 10_000_000_000L;

    void doVirtual()
    {
        C c = new C;
        for(auto x = loopcount; x > 0; x--)
            c.foo(x);
    }

    void doInterface()
    {
        I i = new C;
        for(auto x = loopcount; x > 0; x--)
            i.foo(x);
    }

    void doDelegate()
    {
        auto d = new C;
        auto dg = &d.foo;
        for(auto x = loopcount; x > 0; x--)
            dg(x);
    }

    void doStruct()
    {
        S s;
        for(auto x = loopcount; x > 0; x--)
            s.foo(x);
    }

    void main(char[][] args)
    {
        switch(args[1])
        {
        case "virtual":
            doVirtual();
            break;
        case "interface":
            doInterface();
            break;
        case "struct":
            doStruct();
            break;
        case "delegate":
            doDelegate();
            break;
        }
    }

Results:

    [steves steveslaptop testd]$ time ./testme interface
    real    1m18.152s
    user    1m16.638s
    sys     0m0.015s

    [steves steveslaptop testd]$ time ./testme virtual
    real    1m11.146s
    user    1m10.497s
    sys     0m0.014s

    [steves steveslaptop testd]$ time ./testme struct
    real    1m5.828s
    user    1m5.249s
    sys     0m0.011s

    [steves steveslaptop testd]$ time ./testme delegate
    real    1m10.464s
    user    1m9.856s
    sys     0m0.010s

According to this, delegates are slightly faster than virtual calls, but not by much. By far a direct call is faster, but I was surprised at how little overhead virtual calls add in relation to the loop counter. I had to use 10 billion loops or else the difference was undetectable.

I used dmd 1.046 -release -O (the -release is needed to get rid of the class method checking the invariant every call).

The relevant assembly for calling a virtual method is:

    mov ECX,[EBX]
    mov EAX,EBX
    push dword ptr -8[EBP]
    call dword ptr 014h[ECX]

and the assembly for calling a delegate is:

    push dword ptr -8[EBP]
    mov EAX,-010h[EBP]
    call EBX

-Steve
Nov 12 2009
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
> By far a direct call is faster, but I was surprised at how little overhead virtual calls add in relation to the loop counter. I had to use 10 billion loops or else the difference was undetectable.
>
> I used dmd 1.046 -release -O (the -release is needed to get rid of the class method checking the invariant every call).
>
> The relative assembly for calling a virtual method is:
>
>     mov ECX,[EBX]
>     mov EAX,EBX
>     push dword ptr -8[EBP]
>     call dword ptr 014h[ECX]
>
> and the assembly for calling a delegate is:
>
>     push dword ptr -8[EBP]
>     mov EAX,-010h[EBP]
>     call EBX
>
> -Steve

Your benchmarks don't show that the direct call is much faster. You had inlining disabled. Was this intentional? If so, it proves my point that most of the overhead from virtual calls comes from the fact that they can't usually be inlined, not because they're virtual.
Nov 12 2009
On Thu, 12 Nov 2009 12:38:00 -0500, dsimcha <dsimcha yahoo.com> wrote:
> == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
>> By far a direct call is faster, but I was surprised at how little overhead virtual calls add in relation to the loop counter. I had to use 10 billion loops or else the difference was undetectable.
> Your benchmarks don't show that the direct call is much faster. You had inlining disabled. Was this intentional? If so, it proves my point that most of the overhead from virtual calls comes from the fact that they can't usually be inlined, not because they're virtual.

The direct call was 5 seconds faster. Divide by 10 billion and you get a small but present amount.

Inlining makes the struct member function call disappear (b/c foo does nothing!), so it's not really a relevant benchmark.

I did the "struct" version as a baseline. Consider that the struct version is the cost of doing the loop increments, pushing the 'this' pointer and argument, and calling the function. Any difference from that is the overhead of virtual/delegate/interface calls.

Inlining is not possible with delegates (yet), so it's not really important for this argument.

-Steve
Nov 12 2009
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
> On Thu, 12 Nov 2009 12:38:00 -0500, dsimcha <dsimcha yahoo.com> wrote:
>> Your benchmarks don't show that the direct call is much faster. You had inlining disabled. Was this intentional? If so, it proves my point that most of the overhead from virtual calls comes from the fact that they can't usually be inlined, not because they're virtual.
> The direct call was 5 seconds faster. Divide by 10 billion and you get a small but present amount.

Yes, about 0.5 nanoseconds. In other words, if your CPU is roughly 2 GHz, about one **clock cycle**. This is definitely negligible IMHO.

> Inlining makes the struct member function call disappear (b/c foo does nothing!), so it's not really a relevant benchmark.

Right, my point is that the overhead of indirect function calls compared to direct function calls is too small to ever be worth considering, assuming the direct function call is not inlined. However, when the direct function call may be inlined, this is where indirect calls really hurt, because they usually can't be inlined.

> I did the "struct" version as a baseline. Consider that the struct version is the cost of doing the loop increments, pushing the 'this' pointer and argument, and calling the function. Any difference from that is the overhead of virtual/delegate/interface calls.
>
> Inlining is not possible with delegates (yet), so it's not really important for this argument.
>
> -Steve
Nov 12 2009
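The per-call figures quoted above follow directly from the posted timings. The snippet below redoes that arithmetic as a small sketch, using the `user` times from the benchmark post and the struct run as the baseline; `overheadNs` is a hypothetical helper name, and the choice of `user` rather than `real` time is an assumption.

```d
import std.stdio : writefln;

enum double calls = 10_000_000_000.0; // loopcount from the benchmark post

// Extra time per call relative to the baseline run, in nanoseconds.
double overheadNs(double seconds, double baselineSeconds)
{
    return (seconds - baselineSeconds) / calls * 1e9;
}

void main()
{
    // `user` times from the posted runs, converted to seconds.
    immutable structTime = 65.249;  // 1m5.249s, direct call baseline

    writefln("virtual:   %.2f ns", overheadNs(70.497, structTime)); // 0.52 ns
    writefln("delegate:  %.2f ns", overheadNs(69.856, structTime)); // 0.46 ns
    writefln("interface: %.2f ns", overheadNs(76.638, structTime)); // 1.14 ns
}
```

At roughly 2 GHz one clock cycle lasts 0.5 ns, so the virtual and delegate figures come to about one cycle per call and the interface figure to about two, on that one machine and compiler.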
Don Wrote:
> Lutger wrote:
>> Justin Johansson wrote:
>>> Lutger Wrote:
>>>> Justin Johansson wrote:
>>>>> I assert that the semantics of "toString" or similarly named/purposed methods/functions in many PL's (including and not limited to D) is ill-defined. To put this statement into perspective, I would be most appreciative of D NG readers responding with their own idea(s) of what the semantics of "toString" are (or should be) in a language agnostic ideology.
>>>> My other reply didn't take the language agnostic into account, sorry. Semantics of toString would depend on the object, I would think there are three general types of objects:
>>>>
>>>> 1. objects with only one sensible or one clear default string representations, like integers. Maybe even none of these exist (except strings themselves?)
>>>> 2. objects that, given some formatting options or locale have a clear string representation. floating points, dates, currency and the like.
>>>> 3. objects that have no sensible default representation.
>>>>
>>>> toString() would not make sense for 3) type objects and only for 2) type objects as part of a formatting / localization package. toString() as a debugging aid sometimes doubles as a formatter for 1) and 2) class objects, but that may be more confusing than it's worth.
>>> Thanks for that Lutger. Do you think it would make better sense if programming languages/their libraries separated functions/methods which are currently loosely purposed as "toString" into methods which are more specific to the types you suggest (leaving only the types/classifications and number thereof to argue about)?
>>>
>>> In my own D project, I've introduced a toDebugString method and left toString alone. There are times when I like D's default toString printing out the name of the object class. For debug purposes there are times also when I like to see a string printed out in quotes so you can tell the difference between "123" and 123. Then again, and since I'm working on a scripting language, sometimes I like to see debug output distinguish between different numeric types.
>>>
>>> Anyway going by the replies on this topic, looks like most people view toString as being good for debug purposes and that about it.
>>>
>>> Cheers Justin
>> Your design makes better sense (to me at least) because it is based on why you want a string from some object. Take .NET for example: it does provide very elaborate and nice formatting options based on toString() with parameters. For some types however, the default toString() gives you the name of the type itself which is in no way related to formatting an object. You learn to work with it, but I find it a bit muddled. As a last note, I think people view toString as a debug thing mostly because it is very underpowered.
> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.

s/over-my-dead-body/over-your-dead-body/ :-)

At least those are the words that Brendan Eich uses when people seek to make JavaScript multi-threaded.

http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html
http://www.teknico.net/misc/fortune/concurrency.en.txt
Google: http://www.google.com.au/#hl=en&q=Brendan+Eich+"your+dead+body"

Best regards and thanks to all respondents on "toString" topic,

Justin
Nov 10 2009
On Tue, Nov 10, 2009 at 2:51 AM, Don <nospam nospam.com> wrote:
> Lutger wrote:
>> Your design makes better sense (to me at least) because it is based on why you want a string from some object. Take .NET for example: it does provide very elaborate and nice formatting options based on toString() with parameters. For some types however, the default toString() gives you the name of the type itself which is in no way related to formatting an object. You learn to work with it, but I find it a bit muddled. As a last note, I think people view toString as a debug thing mostly because it is very underpowered.
> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.

You can definitely do something with it -- printf debugging. And if I were using BigInt, that's exactly why I'd want BigInt to have a toString.

Just out of curiosity, how does someone print out the value of a BigInt right now?

--bb
Nov 10 2009
Bill Baxter:
> Just out of curiosity, how does someone print out the value of a BigInt right now?

I have added a toString to my copy of the BigInt.

Bye,
bearophile
Nov 10 2009
Bill Baxter wrote:
> On Tue, Nov 10, 2009 at 2:51 AM, Don <nospam nospam.com> wrote:
>> There is a definite use for such a thing. But the existing toString() is much, much worse than useless. People think you can do something with it, but you can't. eg, people have asked for BigInt to support toString(). That is an over-my-dead-body.
> You can definitely do something with it -- printf debugging. And if I were using BigInt, that's exactly why I'd want BigInt to have a toString.

I almost always want to print the value out in hex. And with some kind of digit separators, so that I can see how many digits it has.

> Just out of curiosity, how does someone print out the value of a BigInt right now?

In Tango, there's just .toHex() and .toDecimalString(). It needs proper formatting options; that's the biggest thing which isn't done. I hit one too many compiler segfaults and started patching the compiler instead <g>.

But I really want a decent toString(). Given a BigInt n, you should be able to just do

    writefln("%s %x", n, n);     // Phobos
    formatln("{0} {0:X}", n);    // Tango

To solve this part of the issue, it would be enough to have toString() take a string parameter (it would be "x" or "X" in this case):

    string toString(string fmt);

But the performance would still be very poor, and that's much more difficult to solve.
Nov 10 2009
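A minimal sketch of the `string toString(string fmt)` shape suggested in the post above, written here only for illustration. The type `Id` and the helper `digits` are hypothetical stand-ins (a plain `ulong` wrapper, not BigInt), and the set of accepted format strings is an assumption.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a numeric type such as BigInt.
struct Id
{
    ulong value;

    // The proposed signature: the format specifier arrives as a string.
    string toString(string fmt) const
    {
        switch (fmt)
        {
            case "s", "d": return digits(value, 10, "0123456789");
            case "x":      return digits(value, 16, "0123456789abcdef");
            case "X":      return digits(value, 16, "0123456789ABCDEF");
            default:       throw new Exception("unsupported format: " ~ fmt);
        }
    }

    // Illustrative helper: repeated division, digits written right to left.
    private static string digits(ulong v, uint radix, string alphabet)
    {
        char[64] buf;
        size_t pos = buf.length;
        do
        {
            buf[--pos] = alphabet[cast(size_t)(v % radix)];
            v /= radix;
        } while (v != 0);
        // The complete result is still allocated in one piece.
        return buf[pos .. $].idup;
    }
}

void main()
{
    auto n = Id(48_879);
    // prints "48879 beef BEEF"
    writeln(n.toString("s"), " ", n.toString("x"), " ", n.toString("X"));
}
```

The format string solves the hex/decimal question, but every call still returns a freshly allocated string, which is the performance concern raised at the end of the post.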
Don:
> But the performance would still be very poor, and that's much more difficult to solve.

This may help:
http://fredrik-j.blogspot.com/2008/07/making-division-in-python-faster.html
http://fredrik-j.blogspot.com/2008/07/division-sequel-with-bonus-material.html
http://bugs.python.org/issue3451

Bye,
bearophile
Nov 10 2009
On Tue, Nov 10, 2009 at 6:11 AM, bearophile <bearophileHUGS lycos.com> wrote:
> Don:
>> But the performance would still be very poor, and that's much more difficult to solve.
> This may help:
> http://fredrik-j.blogspot.com/2008/07/making-division-in-python-faster.html
> http://fredrik-j.blogspot.com/2008/07/division-sequel-with-bonus-material.html
> http://bugs.python.org/issue3451

Though they may be useful, those don't look to have anything to do with formatting user types into strings, which is the subject at hand.

--bb
Nov 10 2009
Bill Baxter:
> Though they may be useful, those don't look to have anything to do with formatting user types into strings, which is the subject at hand.

Don has said: "But the performance would still be very poor, and that's much more difficult to solve."

And those links show a way to quickly convert a large multi-precision integer into a string. What is it that I am missing?

Bye,
bearophile
Nov 10 2009
bearophile wrote:
> Don has said: "But the performance would still be very poor, and that's much more difficult to solve." And those links show a way to quickly convert a large multi-precision integer into a string. What is it that I am missing?

It's problem 2 from my original posts: being able to output something large (eg an xml doc) in a piece-by-piece manner.
Nov 10 2009
On Tue, Nov 10, 2009 at 7:04 AM, bearophile <bearophileHUGS lycos.com> wrote:
> Don has said: "But the performance would still be very poor, and that's much more difficult to solve." And those links show a way to quickly convert a large multi-precision integer into a string. What is it that I am missing?

Maybe it's just my ignorance of BigNum issues, but those links look to me to be about division and not generating string representations. Are those somehow synonymous in BigInt land?

--bb
Nov 10 2009
Bill Baxter:
> Maybe it's just my ignorance of BigNum issues, but those links look to me to be about division and not generating string representations. Are those somehow synonymous in BigInt land?

Look at the numeral() function inside here from those blog posts:
http://www.dd.chalmers.se/~frejohl/code/div.py

To convert a positive integer to a string you have to keep dividing the number by 10, accumulating each remainder as the next digit, converted to a character in ['0', '9']. When the number is zero you are done:

    n = 541489
    result = ""
    while n:
        n, digit = divmod(n, 10)
        result = "0123456789"[digit] + result

But all those large divisions are slow if the number is huge. So that div.py python program shows a faster algorithm that does something smarter, to decrease the computational complexity of all that.

Bye,
bearophile
Nov 10 2009
On Tue, Nov 10, 2009 at 9:16 AM, bearophile <bearophileHUGS lycos.com> wrote:
> But all those large divisions are slow if the number is huge. So that div.py python program shows a faster algorithm that does something smarter, to decrease the computational complexity of all that.

Well, anyway, slowness of BigInt is not what Don was referring to. He was talking about the general slowness of a toString interface that forces allocating enough memory to hold the entire result, instead of being able to dole out the result piecemeal.

--bb
Nov 10 2009
Bill Baxter:
> Well, anyway, [...]

You are welcome.

Bye,
bearophile
Nov 10 2009
On Tue, Nov 10, 2009 at 4:30 AM, Don <nospam nospam.com> wrote:
>> Just out of curiosity, how does someone print out the value of a BigInt right now?
> In Tango, there's just .toHex() and .toDecimalString(). Needs proper formatting options, it's the biggest thing which isn't done. I hit one too many compiler segfaults and started patching the compiler instead <g>. But I really want a decent toString().

Ah, ok. So there is something, it's just not called "toString".

--bb
Nov 10 2009
On Tue, 10 Nov 2009 15:30:20 +0300, Don <nospam nospam.com> wrote:
> To solve this part of the issue, it would be enough to have toString() take a string parameter. (it would be "x" or "X" in this case).
>     string toString(string fmt);
> But the performance would still be very poor, and that's much more difficult to solve.

Yes, it would solve half of the toString problems. Another part (i.e. memory allocation) could be solved by providing an optional buffer to the toString:

    char[] toString(string format = "s" /* comes from %s which is a default qualifier */,
                    char[] buffer = null)
    {
        // operate on the buffer, possibly resizing it
        // which is safe and fast - it only allocates
        // when *really* necessary, instead of always, as now
        return buffer;
    }

You can use it almost the same way you used it before:

    string s = assumeUnique(someObject.toString()); // because we return a mutable string now

Optimization example:

    int sprintf(string format, ...)
    {
        char[512] preallocatedBuffer;
        char[] buffer = preallocatedBuffer[]; // buffer may grow, but
                                              // initially points to a preallocatedBuffer
        char[] storage = buffer[];            // storage for a current element
        ...
        for (...) { // iterate over qualifiers (and arguments)
            string currentQualifier = format[i..j];
            auto currentArgument = argsTuple[n];
            char[] result = currentArgument.toString(currentQualifier, storage);
            if (result.ptr is storage.ptr) {
                // okay, string was constructed in-place
                storage = storage[result.length..$];
            } else {
                // storage didn't have enough space for the whole
                // string (a reallocation occurred)
                size_t offset = buffer.length - storage.length;
                // increase the capacity
                buffer.length *= 2;
                // append our string to the buffer
                buffer[offset..offset+result.length] = result[];
                // renew the temporary storage
                storage = preallocatedBuffer[];
            }
        }
        ...
    }

Another example:

    class Array(T)
    {
        // ...
        private T[] elements;

        char[] toString(string format, char[] buffer)
        {
            auto builder = StringBuilder(buffer); // reallocates when no space left
            builder.append("[");
            foreach (i, o; elements) {
                if (i > 0) builder.append(", "); // separator
                buffer = builder.getBuffer()[builder.length..$];
                char[] result = o.toString(format, buffer);
                if (result.ptr is buffer.ptr) {
                    // no reallocation
                    builder.length += result.length; // without copying
                } else {
                    builder.append(result);
                }
            }
            builder.append("]");
            return builder.toString();
        }
    }

    auto array = new Array!(int);
    array ~= [0, 1, 2, 3, 4];
    assert(array.toString() == "[0, 1, 2, 3, 4]");

It's not very easy to take advantage of, but it's usable the old way (well, almost). Any ideas?
Nov 10 2009
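A small self-contained sketch of the optional-buffer idea from the post above, for illustration only. The type `Pair` and the helper `appendTo` are hypothetical names, and the doubling growth policy is an assumption; the point is just that the caller can tell from the returned pointer whether the text was built in place.

```d
// Hypothetical helper: copies `text` into `buffer` at `pos`,
// allocating a larger buffer only when the current one is too small.
char[] appendTo(char[] buffer, ref size_t pos, const(char)[] text)
{
    if (pos + text.length > buffer.length)
    {
        auto grown = new char[(pos + text.length) * 2];
        grown[0 .. pos] = buffer[0 .. pos];
        buffer = grown;
    }
    buffer[pos .. pos + text.length] = text[];
    pos += text.length;
    return buffer;
}

// Hypothetical user type with a buffer-accepting toString.
struct Pair
{
    string key;
    string value;

    char[] toString(char[] buffer = null) const
    {
        size_t pos = 0;
        buffer = appendTo(buffer, pos, key);
        buffer = appendTo(buffer, pos, "=");
        buffer = appendTo(buffer, pos, value);
        return buffer[0 .. pos];
    }
}

void main()
{
    auto p = Pair("lang", "D");

    char[32] roomy;
    char[] inPlace = p.toString(roomy[]);
    assert(inPlace == "lang=D");
    assert(inPlace.ptr is roomy.ptr);   // built in the caller's buffer

    char[2] tiny;
    char[] moved = p.toString(tiny[]);
    assert(moved == "lang=D");
    assert(moved.ptr !is tiny.ptr);     // too small, so a new buffer was allocated
}
```

As the post itself notes, the caller has to do the pointer comparison and bookkeeping, which is what makes this approach awkward to use.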
2009/11/10 Denis Koroskin <2korden gmail.com>:
> Yes, it would solve half of the toString problems. Another part (i.e. memory allocation) could be solved by providing an optional buffer to the toString:
>     char[] toString(string format = "s" /* comes from %s which is a default qualifier */, char[] buffer = null)
>     {
>         // operate on the buffer, possibly resizing it
>         // which is safe and fast - it only allocates
>         // when *really* necessary, instead of always, as now
>         return buffer;
>     }

With Don's delegate idea, if you do have a toString with special performance concerns, then it can use its own stack-allocated buffer.

    void toString(void delegate(const(char)[]) put, string format) {
        char[512] preallocBuffer;
        foreach( ... ) {
            ...
            put(preallocBuffer[0..lenUsed]);
        }
    }

Which in some cases (like writefln) should be almost as efficient as passing a buffer in. It avoids willy-nilly unbounded allocations anyway.

But the nice thing is that it's easy to upgrade to. You can keep it simple and leave toString pretty much like you had it before, just changing the signature and the return.

    void toString(void delegate(const(char)[]) put, string format) {
        char[] ret;
        foreach( ... ) {
            ...
            ret ~= "...";
        }
        put(ret); // only this line needed to change for Don-style toString
    }

And to get the string you just need to call format:

    assert(std.string.format("%s", thing) == "blah");

If the buffer is going to be passed in, then probably it should be passed in as a full fledged output stream object with .write() methods and such. I don't want to have to worry about buffer management to write a toString method. That should be encapsulated. But it seems to me that Don's method offers exactly the right minimality of interface to allow encapsulating that management without requiring it to be done in a heavy-handed way.

--bb
Nov 10 2009
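A runnable sketch of the delegate-based toString described in the post above, offered as an illustration rather than as the thread's actual code. The type `IntList` and the helper `formatInt` are hypothetical; the only scratch memory is a small stack buffer, and the caller decides what to do with each chunk.

```d
// Hypothetical helper: writes the decimal form of `value` at the end of
// `buf` and returns the slice that was used.
char[] formatInt(int value, char[] buf)
{
    size_t pos = buf.length;
    long v = value;              // widen so that int.min can be negated
    const bool negative = v < 0;
    if (negative) v = -v;
    do
    {
        buf[--pos] = cast(char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (negative) buf[--pos] = '-';
    return buf[pos .. $];
}

// Hypothetical user type whose toString hands out pieces through `put`.
struct IntList
{
    int[] items;

    void toString(scope void delegate(const(char)[]) put) const
    {
        char[16] scratch;        // stack buffer, large enough for any int
        put("[");
        foreach (i, item; items)
        {
            if (i > 0) put(", ");
            put(formatInt(item, scratch[]));
        }
        put("]");
    }
}

void main()
{
    char[] collected;
    size_t calls;
    auto list = IntList([1, -20, 300]);

    // This caller chooses to collect the pieces; a writer could emit them directly.
    list.toString((const(char)[] chunk) { collected ~= chunk; ++calls; });

    assert(collected == "[1, -20, 300]");
    assert(calls == 7);          // "[", "1", ", ", "-20", ", ", "300", "]"
}
```

The type itself never allocates here; whether any allocation happens is entirely up to the delegate that receives the chunks.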
Bill Baxter wrote:
> With Don's delegate idea, if you do have a toString with special performance concerns, then it can use its own stack-allocated buffer.
>     void toString(void delegate(const(char)[]) put, string format) {
>         char[512] preallocBuffer;
>         foreach( ... ) {
>             ...
>             put(preallocBuffer[0..lenUsed]);
>         }
>     }

Thanks. 'put' is so much better than 'sink'. <g>

> If the buffer is going to be passed in, then probably it should be passed in as a full fledged output stream object with .write() methods and such. I don't want to have to worry about buffer management to write a toString method. That should be encapsulated. But it seems to me that Don's method offers exactly the right minimality of interface to allow encapsulating that management without requiring it to be done in a heavy-handed way.

One thing it doesn't (easily) handle is the case where an int argument gives the length of another one (eg the "%*s" writefln format). I guess this can still be handled (very inefficiently) by converting the parameter value into a text number -- generally, though, that'd only be for direct interchangeability with a built-in type; you'd normally do such things by calling a member function on the struct.

The other issue is grauzone's comment: perhaps compile-time varargs make this whole approach obsolete.
Nov 10 2009
Don:
> It's problem 2 from my original posts: being able to output something large (eg an xml doc) in a piece-by-piece manner.

See my post about vectorized laziness.

Bye,
bearophile
Nov 10 2009
Andrei Alexandrescu Wrote:
> Steven Schveighoffer wrote:
>> On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>>> Steven Schveighoffer wrote:
>>>> On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>>>>> Steven Schveighoffer wrote:
>>>>>> On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>>>>>>> I think the best option for toString is to take an output range and write to it. (The sink is a simplified range.)
>>>>>> Bad idea... A range only makes sense as a struct, not an interface/object. I'll tell you why: performance.
>>>>> You are right. If range interfaces accommodate block transfers, this problem may be addressed. I agree that one virtual call per character output would be overkill. (I seem to recall it's one of the reasons why C++'s iostreams are so inefficient.)
>>>> IIRC, I don't think C++ iostreams use polymorphism
>>> Oh yes they do. (Did you even google?) Virtual multiple inheritance, the works.
>>> http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/
>> From my C++ book, it appears to only use virtual inheritance. I don't know enough about virtual inheritance to know how that changes function calls. As far as virtual functions, only the destructor is virtual, so there is no issue there.
> You're right, but there is an issue because as far as I can recall these functions' implementations do end up calling a virtual function per char; that might be streambuf.overflow. I'm not keen on investigating this any further, but I'd be grateful if you shared any related knowledge. At the end of the day, there seems to be violent agreement that we don't want one virtual call per character or one delegate call per character.

>>>>>>     void put(in char[] str)
>>>>>>     {
>>>>>>         foreach(dchar dc; str)
>>>>>>         {
>>>>>>             put((&dc)[0..1]);
>>>>>>         }
>>>>>>     }
>>>>>> Note that you probably want to build a buffer of dchars instead of putting one at a time, but you get the idea.
>>>>> I don't get the idea. I'm seeing one virtual call per character.
>>>> You missed the note. I didn't implement it, but you could easily implement a stack-allocated buffer to cache the conversions, passing multiple converted code-points at once. But I don't think it's even worth discussing per my other points.

>>>>>> That being said, one other point that makes all this moot is -- toString is for debugging, not for general purpose. We don't need to support everything that is possible. You should be able to say "hey, toString only accepts char[], deal." Of course, you could substitute wchar[] or dchar[], but I think by far char[] is the most common (and is the default type for string literals).
>>>>> I was hoping we could elevate the usefulness of toString a bit.
>>>> Whatever kind of data the output stream gets, it's going to convert it to the format it wants anyways (as for stdout, I think that would be utf8), the only benefit is if you have data stored in a different width that you wanted to output. Calling a conversion function in that case I think is reasonable enough, and saves the output stream from having to convert/deal with it. In other words, I don't think it's going to be that common a case where you need anything other than utf8 output, and therefore the cost of creating an interface, making virtual calls, disallowing simple delegate passing etc is worth the convenience *just in case* you have data stored as wchar[] you want to output.

>>>>>> That's not to say there is no reason to have a TextOutputStream object. Such a thing is perfectly usable for a toString which takes a char[] delegate sink, just pass &put. In fact, there could be a default toString function in Object that does just that:
>>>>>>     class Object
>>>>>>     {
>>>>>>         ...
>>>>>>         void toString(void delegate(in char[] buf) put, string fmt) const {}
>>>>>>         void toString(TextOutputStream tos, string fmt) const
>>>>>>         { toString(&tos.put, fmt); }
>>>>>>     }
>>>>> I'd agree with the delegate idea if we established that UTF-8 is favored compared to all other formats.
>>>> D seems to favor UTF8 -- it is the default type for string literals. I don't think I've ever used dchar, and I usually only use wchar to talk to Win32 functions when required. The question I'd ask is -- how common is it where the versions other than char[] would be more convenient?
>>> I don't know. I think Asian-language users might give a salient answer.

I'm not sure.
http://www.gnu.org/s/libc/manual/html_node/Streams-and-I18N.html#Streams-and-I18N
gnu defines means to set and detect a utf-16 console, which dmd observes (grep std/ for fwide). But then I'm not sure how many are using that kind of stuff.
Nov 12 2009
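A minimal sketch of the stack-allocated conversion buffer mentioned in the quoted exchange above, so that the sink is called once per block instead of once per character. The function name `putBuffered` and the block size of 64 are assumptions made for this illustration; nothing like it is shown in the thread.

```d
// Hypothetical adapter: decodes UTF-8 input and forwards it to a dchar sink
// in blocks, using a fixed-size stack buffer as the cache.
void putBuffered(const(char)[] str, scope void delegate(const(dchar)[]) sink)
{
    dchar[64] buffer;            // stack-allocated cache of decoded code points
    size_t used = 0;

    foreach (dchar dc; str)      // foreach with dchar decodes the UTF-8 input
    {
        buffer[used++] = dc;
        if (used == buffer.length)
        {
            sink(buffer[0 .. used]);   // one sink call per full block
            used = 0;
        }
    }
    if (used > 0)
        sink(buffer[0 .. used]);       // flush the final partial block
}

void main()
{
    dchar[] received;
    size_t calls;

    putBuffered("héllo, wörld", (const(dchar)[] block) {
        received ~= block;
        ++calls;
    });

    assert(received == "héllo, wörld"d);
    assert(calls == 1);          // 12 code points fit into a single block
}
```

With this shape the number of delegate (or virtual) calls grows with the number of blocks, not with the number of characters, which is the property both sides of the exchange say they want.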
Bill Baxter Wrote:
> On Thu, Nov 12, 2009 at 10:46 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> I don't know. I think Asian-language users might give a salient answer.
> This isn't authoritative, but I don't think utf-16 is commonly used in Japan (except for calling Windows APIs). If you look at Mozilla the default Japanese encoding listed is Shift-JIS. A lot of Japanese email still gets sent as ISO-2022-JP. Otherwise utf-8 I think. A quick look at www.asahi.com shows they're using EUC-JP. nicovideo.jp is using utf-8. I seem to recall that my Japanese Visual Studio even saved files in Utf-8, or at least could be set to use utf-8. In short, I think utf-8 is closer to being a widely accepted standard for documents over there than utf-16 is.
> --bb

That is true. UTF8 works well. Few people still believe in the dream of a fixed-length UTF16. Surrogate Pairs must die.

We, maybe not only Japanese but all Asian users, also need converters between UTFs and traditional local encodings. Implementations are up to local users.
Nov 12 2009