digitalmars.D - Google Code Jam 2011 Language Usage
- Peter Alexander (24/24) May 08 2011 The Google Code Jam is a programming competition where you have to solve...
- bearophile (9/25) May 08 2011 But a person from Japan has used D to be among the top ten, this is good...
- Peter Alexander (10/12) May 08 2011 Unfortunately the ranks in the first don't mean much at all.
- Keywan Ghadami (1/1) May 08 2011 just an idea:new name for d -> d2lang
- Andrew Wiley (44/68) May 08 2011 I was one of the D users, although I wasn't really worried about competi...
- Timon Gehr (41/83) May 08 2011 Well, I don't like D's readf either (I use scanf, 2-3x faster and better
- Timon Gehr (7/7) May 08 2011 Whoops, there was a mistake:
- Andrei Alexandrescu (4/7) May 08 2011 Looking forward to detailed feedback about readf. It was implemented in
- Andrej Mitrovic (19/19) May 08 2011 I'm very happy with using Jesse's interact library for user input:
- Andrej Mitrovic (1/1) May 08 2011 *that checks if a delegate throws and returns true if so*
- Timon Gehr (48/55) May 08 2011 What I consider the most important points about readf:
- Peter Alexander (4/12) May 09 2011 std.readf is broken.
- Andrei Alexandrescu (5/20) May 09 2011 That's not a bug, see my comment in
- Timon Gehr (15/22) May 09 2011 In my experience readf behavior is not very useful for routine coding ta...
- Andrei Alexandrescu (28/51) May 09 2011 If this assessment would be reverted by simply inserting spaces in the
- Andrei Alexandrescu (91/147) May 09 2011 So far so good. By design one space in readf means "skip all whitespace"...
- Timon Gehr (78/226) May 09 2011 using
- Andrew Wiley (50/151) May 08 2011 What bothers me about that code is that you had to write a string to
- Jonathan M Davis (11/115) May 08 2011 stdin is already a struct in D. To do it in a more Java-like manner woul...
- bearophile (39/41) May 08 2011 I have tried to implement a D solution to the first problem, because its...
The Google Code Jam is a programming competition where you solve algorithmic problems in whatever programming language you like. Statistics on the languages used in the first round have been collected:

http://www.go-hero.net/jam/11/languages

Some select figures for the languages used to solve the first question:

C++      5032
Java     2321
C         532
Haskell   100
Clojure    13
GO         13
D           5
Scheme      5

(In the previous three years, D had between 2 and 4 entries for the first question, so not much has changed, even though the total number of contestants has increased quite dramatically.)

Generally, I believe people tend to use the language they are most familiar with, and people who know more than one language will choose the most expressive one. Stability of implementations could also be an issue.

Obviously you can't draw too many conclusions from this alone, but more data is always better. Take what you will from it.
May 08 2011
Peter Alexander:

> Some select figures for languages used to solve the first question:
> C++ 5032 Java 2321 C 532 Haskell 100 Clojure 13 GO 13 D 5 Scheme 5

The third most used language is Python.

> (In previous 3 years, D had between 2-4 entries for the first question, so not much change, despite total contestant counts increasing quite dramatically)

But a person from Japan used D and finished among the top ten, which is good:
http://www.go-hero.net/jam/11/name/hos.lyric

The first, second and third persons are using the most used, second most used and third most used languages (C++, Java, Python) :-)

> Obviously you can't draw too many conclusions from this alone, but more data is always better. Take what you will from it.

From those numbers it looks like D isn't gaining mindshare, unfortunately. Go is appreciated, even if much less than Python. The supported languages include Cobol, Fortran and many others, but I don't see Ada.

Bye,
bearophile
May 08 2011
On 8/05/11 12:39 PM, bearophile wrote:
> But a person from Japan has used D to be among the top ten, this is good:
> http://www.go-hero.net/jam/11/name/hos.lyric

Unfortunately, the ranks in the first round don't mean much at all. Most rounds last only a few hours, so everyone competes at the same time, but the first round lasts 24 hours, so most participants just come in and solve the problems whenever they want. That means the people at the top of the board in the first round are simply those who started as soon as the competition opened.

---

Interestingly, that contestant barely used any of D's features. The code he wrote may as well have been C++.
May 08 2011
On Sun, May 8, 2011 at 6:10 AM, Peter Alexander <peter.alexander.au gmail.com> wrote:
> The Google Code Jam is a programming competition where you have to solve algorithmic problems using whatever programming language you like. The stats of what programming languages were used in the first round were collected:
> http://www.go-hero.net/jam/11/languages
> Some select figures for languages used to solve the first question:
> C++ 5032 Java 2321 C 532 Haskell 100 Clojure 13 GO 13 D 5 Scheme 5
> (In previous 3 years, D had between 2-4 entries for the first question, so not much change, despite total contestant counts increasing quite dramatically)
> Generally, I believe people tend to use the language they are most familiar with, and for people that know more than one language they will choose the one that is most expressive. Stability of implementations could also be an issue.
> Obviously you can't draw too many conclusions from this alone, but more data is always better. Take what you will from it.

I was one of the D users, although I wasn't really worried about competing. I just wanted to see how D would compare after doing so many programming contests in Java. The main thing that frustrated me was that getting input in D wasn't anywhere near as straightforward as it is in Java. For the first problem, I'd do something like this in Java:

    Scanner in = new Scanner(System.in);
    int numTests = in.nextInt();
    for(int test = 0; test < numTests; tests++) { //need the test index for output
        int numSteps = in.nextInt();
        for(; numSteps < 0; numSteps--)
            char robot = in.nextChar();
            int button = in.nextInt();
            //solve the problem!
        }
        //print the output!
    }

In D, that looked like this:

    string line;
    int num;
    stdin.readln(line);
    formattedRead(line, "%s", &num);
    for(int casen = 0; casen < num; casen++) {
    ...

In a few places, I could have used stdin.readf instead of readln/formattedRead, but not many because the number of items within a test is on the same line as the items.

I could have just been missing something, but something that was trivial in Java became brittle in D because I had to exactly match the whitespace for things to work. I suppose I could have read a line and used splitter to split on whitespace, but that would make me have to watch more state and would wind up looking like this:

    string line;
    stdin.readln(line);
    auto split = split(line);
    int num = to!int(split[0]);
    split = split[1..$];
    ...

Actually... now that I'm looking at that, if I wrote a Scanner-like class based on this, is there any chance it could go into Phobos? Seems like between split and to, we could get something much less brittle working.
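A Scanner-like helper built on split and to, along the lines suggested in this post, might look roughly like the sketch below. The names `TokenReader` and `next` are hypothetical and not part of Phobos; the sketch assumes line-oriented text input and throws when the input runs out.

```d
import std.array : split;
import std.conv : to;
import std.stdio : File;

// Hypothetical helper (not part of Phobos): hands out whitespace-separated
// tokens one at a time and pulls in a new line whenever it runs out.
struct TokenReader
{
    File input;        // source of lines, e.g. std.stdio.stdin
    string[] pending;  // tokens of the current line not yet consumed

    // Returns the next token converted to T; throws at end of input.
    T next(T)()
    {
        while (pending.length == 0)
        {
            string line = input.readln();
            if (line.length == 0)      // readln returns an empty string only at EOF
                throw new Exception("unexpected end of input");
            pending = line.split();    // no separator given: split on whitespace
        }
        string token = pending[0];
        pending = pending[1 .. $];
        return to!T(token);
    }
}
```

With this, the contest loop would read `auto r = TokenReader(stdin); int numTests = r.next!int();` (after importing `stdin` from std.stdio), and it would not matter whether the item count shares a line with the items.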
May 08 2011
Andrew Wiley wrote:
> I was one of the D users, although I wasn't really worried about competing. I just wanted to see how D would compare after doing so many programming contests in Java. The main thing that frustrated me was that getting input in D wasn't anywhere near as straightforward as it is in Java. For the first problem, I'd do something like this in Java:
>
>     Scanner in = new Scanner(System.in);
>     int numTests = in.nextInt();
>     for(int test = 0; test < numTests; tests++) { //need the test index for output
>         int numSteps = in.nextInt();
>         for(; numSteps < 0; numSteps--)
>             char robot = in.nextChar();
>             int button = in.nextInt();
>             //solve the problem!
>         }
>         //print the output!
>     }

Well, I don't like D's readf either (I use scanf, which is 2-3x faster and has better whitespace handling). That said, you really made my day. The problem is not that reading input in D is less straightforward than in Java; the problem is that you are used to Java's way of doing IO (which I pretty much dislike, though I guess it is a matter of taste). You do not actually have to bother with string handling at all when doing IO in C/C++/D.

Reading an array of integers:

    int[100000] array; //somewhere in static storage, faster
    ...
    scanf("%d",&n);
    foreach(ref x;array) scanf("%d",&x);

Or, with some heap activity involved and actually more keystrokes, but some people like this way:

    readf("%s",&n);//read number of items
    int[] array=to!(int[])(split(strip(readln())));

How I would have written your example in D:

    int numTests; scanf("%d", &numTests);
    foreach(test;0..numTests){
        int numSteps; scanf("%d", &numSteps);
        foreach(step;0..numSteps){ //you have a bug in this line of your Java code introducing a looooong loop
            char robot; scanf("%c", &robot);
            int button; scanf("%d", &button);
            //solve the problem!
        }
        //print the output
    }

> In D, that looked like this:
>
>     string line;
>     int num;
>     stdin.readln(line);
>     formattedRead(line, "%s", &num);
>     for(int casen = 0; casen < num; casen++) {
>     ...
>
> In a few places, I could have used stdin.readf instead of readln/formattedRead, but not many because the number of items within a test is on the same line as the items.

That is not a problem at all: you can read the first few elements with readf and the rest of the line with readln.

> I could have just been missing something, but something that was trivial in Java became brittle in D because I had to exactly match the whitespace for

I actually think Java's way is brittle. You have to instantiate a class just to read IO.

> things to work. I suppose I could have read a line and used splitter to split on whitespace, but that would make me have to watch more state and would wind up looking like this:
>
>     string line;
>     stdin.readln(line);
>     auto split = split(line);
>     int num = to!int(split[0]);
>     split = split[1..$];

I don't get this.

> ...
> Actually... now that I'm looking at that, if I wrote a Scanner-like class based on this, is there any chance it could go into Phobos? Seems like between split and to, we could get something much less brittle working.

No chance, that is not the way D/Phobos works. You do not get a class for everything that would not need one (just as Phobos does not have a writer class for output). However, I agree that Phobos has to provide some better input handling, since using possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled.) I may try to implement a meaningful "read" function.

Timon
May 08 2011
Whoops, there was a mistake:

Reading array of integers:

    int[100000] array; //somewhere in static storage, faster
    ...
    scanf("%d",&n);
    foreach(ref x;array[0..n]) scanf("%d",&x); // note the slice

Timon
May 08 2011
On 5/8/11 3:04 PM, Timon Gehr wrote:However I agree that Phobos has to provide some better input handling, since using possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled) I may try to implement a meaningful "read" function.Looking forward to detailed feedback about readf. It was implemented in a hurry so definitely it has a long way to go. Andrei
May 08 2011
I'm very happy with using Jesse's interact library for user input:
https://github.com/he-the-great/JPDLibs/tree/cmdln

Last time I used it, I combined it with std.conv since I needed either a number or a "q" from the user, e.g.:

    int input;
    auto line = userInput!string("Enter value:");
    if (line == "q")
    {
        quit();
    }
    else if (!throws!(ConvException)( { input = to!int(line); } )) // try converting to int
    {
        if (input >= -127 && input <= 127)
        {
            // do something
        }
    }

Here throws() is just a custom function that asserts that a delegate throws.
May 08 2011
*that checks if a delegate throws and returns true if so*
May 08 2011
Andrei Alexandrescu wrote:
> On 5/8/11 3:04 PM, Timon Gehr wrote:
>> However I agree that Phobos has to provide some better input handling, since using possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled) I may try to implement a meaningful "read" function.
>
> Looking forward to detailed feedback about readf. It was implemented in a hurry so definitely it has a long way to go.
>
> Andrei

What I consider the most important points about readf:

1. Whitespace handling is different from scanf's. It is much stricter and even feels inconsistent, e.g.:

    int a,b;
    readf("%s %s",&a,&b);//input "1 2\n" read.
    readf("%s %s",&a,&b);//input "1 2\n" read (and a==1 && b==2).
    readf("%s",&a);//input "1\n" read. yay.
    readf("%s",&a);//input " 1\n" skipped. All subsequent input is skipped too.
    readf("%s ",&a);//input "1 \n" read.
    readf("%s ",&a);//input "1\n" skipped, presumably because the trailing space (!) is missing.
    readf(" %s",&a);//input "1\n" read.
    readf("\t%s",&a);//input "1\n": exception is thrown.
    readf("%s\n",&a);//input "1\n" read.
    readf("%s\n",&a);//input "1 \n": exception is thrown.
    readf("%s\t\n",&a);//input "1\t\n" read.
    readf("%s \n",&a);//input "1 \n" skipped. readf throws an exception after any further input.

And some more; I do not remember all of them. Exceptions are most of the time only as useful as "Enforcement failed". You (almost?) never want this behavior, even at the points where it marginally makes sense. It would be nice to have an optional whitespace-enforcing version that _really_ enforces it (as opposed to the current implementation), but that should not be the default. And then it should be consistent (also on skipping or exception throwing).

2. readf takes pointers. Ugly, end of story. I even like C++ cin with all its '>>' more. scanf has that problem too, but it is a C function; you _cannot_ expect it to do any better than that. D has variadic template functions that may take ref parameters. It can be done entirely pointer-free.

3. Nonsense like readf("mooh",&a); cannot be caught at compile time. When/why did you throw away the idea of static overloads? It would have been a powerful feature, and very useful for this case. scanf in C/C++ does not have this problem, because most modern compilers generate warnings for this. But that is making some functions "more equal than the others".

4. readf is slow. It is about 3-4 times slower than scanf (not 2-3, as I mistakenly claimed before). I think this is just a quality of implementation issue, but it is important. Especially for programming competitions where there are time limits, you do not want IO to unnecessarily become a major bottleneck. (Input files can be huge.) Other than that, D is by far the most convenient language I have ever tried to solve small algorithmic tasks in.

5. Not really readf related: there's writef(ln) and there is write(ln). And then there is readf. I will provide a proof-of-concept for the read function soon.

Timon
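Point 2 says reading can be made pointer-free with variadic templates and ref parameters. A minimal sketch of that idea follows; `readValues` is a hypothetical name, not the Phobos API, and the sketch simply forwards each argument to `File.readf` with a leading space in the format so that whitespace before each value is skipped. It is only meant for numeric targets, because `%s` into a string has no natural stopping point here.

```d
import std.stdio : File;

// Hypothetical wrapper, not the Phobos API: reads one whitespace-separated
// value per argument, taking the targets by reference instead of by pointer.
void readValues(Args...)(File input, ref Args args)
{
    // Compile-time loop over the argument types; i indexes the matching argument.
    foreach (i, Unused; Args)
        input.readf(" %s", &args[i]);  // the leading space skips whitespace before the value
}
```

A call then reads `int a; double b; readValues(stdin, a, b);` with no address-of operators at the call site.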
May 08 2011
On 8/05/11 11:57 PM, Timon Gehr wrote:
> Andrei Alexandrescu wrote:
>> Looking forward to detailed feedback about readf. It was implemented in a hurry so definitely it has a long way to go.
>>
>> Andrei
>
> What I consider the most important points about readf:
> 1. Whitespace handling is different than scanf. It is much stricter and even feels inconsistent, Eg:

std.readf is broken.
http://d.puremagic.com/issues/show_bug.cgi?id=4656

This bug makes it quite difficult to evaluate readf. I just use scanf now.
May 09 2011
On 5/9/11 2:53 AM, Peter Alexander wrote:
> On 8/05/11 11:57 PM, Timon Gehr wrote:
>> Andrei Alexandrescu wrote:
>>> Looking forward to detailed feedback about readf. It was implemented in a hurry so definitely it has a long way to go.
>>>
>>> Andrei
>>
>> What I consider the most important points about readf:
>> 1. Whitespace handling is different than scanf. It is much stricter and even feels inconsistent, Eg:
>
> std.readf is broken.
> http://d.puremagic.com/issues/show_bug.cgi?id=4656
> This bug makes it quite difficult to evaluate readf. I just use scanf now.

That's not a bug, see my comment in http://d.puremagic.com/issues/show_bug.cgi?id=4656. The error message _is_ a bug though!

Andrei
May 09 2011
Andrei Alexandrescu wrote:
> I've implemented readf to be a fair amount more Nazi about whitespace than scanf in an attempt to improve its precision. Scanf has been famously difficult to use for complex input parsing and validation, and I attribute some of that to its laissez-faire attitude toward whitespace. I'd be glad to relax some of readf's insistence on precise whitespace handling if there's enough evidence that that serves most of our users. I personally believe that the current behavior (strict by default, easy to relax) is best.

In my experience readf behavior is not very useful for routine coding tasks that involve some IO. If you really need to have very strict requirements about the input format, readf does not serve you well, because a ' ' still skips all whitespace, a failure to read leaves the file pointer in an undefined position etc. All carryovers from scanf. I never want to use scanf when there is a valid chance of invalid input.

As far as I can see, neither readf nor scanf can be used for sophisticated input validation or parsing of non-trivial input. You have to do it manually. How does readf make things better with strict(er) whitespace handling? What behavior is by design, what behavior is caused by bugs? Can you give a real-world example where readf design clearly beats scanf design? (as it is the default it should be almost always better, but I fail to see it)

Apart from that, what about the other points I mentioned?

Timon
May 09 2011
On 5/9/11 12:43 PM, Timon Gehr wrote:
> Andrei Alexandrescu wrote:
>> I've implemented readf to be a fair amount more Nazi about whitespace than scanf in an attempt to improve its precision. Scanf has been famously difficult to use for complex input parsing and validation, and I attribute some of that to its laissez-faire attitude toward whitespace. I'd be glad to relax some of readf's insistence on precise whitespace handling if there's enough evidence that that serves most of our users. I personally believe that the current behavior (strict by default, easy to relax) is best.
>
> In my experience readf behavior is not very useful for routine coding tasks that involve some IO.

If this assessment would be reverted by simply inserting spaces in the formatting string, I'd be hard pressed to agree. I do agree that readf behavior is surprising if you expect 100% scanf compatibility. This is intentional and beneficial as I believe scanf is wanting in more than one way.

> If you really need to have very strict requirements about the input format, readf does not serve you well, because a ' ' still skips all whitespace, a failure to read leaves the file pointer in an undefined position etc.

That is not an issue (albeit some of the underlying machinery is not yet implemented). If you want to skip at most one space but no other whitespace, insert "%*1[ ]" in the formatting string. To skip any number of spaces, insert "%*[ ]". Skipping exactly one space is not supported at the formatting string level, but you can always read one character with %c and then enforce that the character is ' '. I agree that that could be improved. What's needed is a specification for the minimum number of characters read, e.g. "%*1.1[ ]" for scanning and skipping exactly one space.

In contrast, having e.g. %d skip all whitespace is a losing proposition if you want to do precision parsing. This is because that behavior can't be disabled. That's why I excised it.

Reading is greedy. Failure to read leaves the pointer in a defined position, but we need to improve documentation.

> All carryovers from scanf. I never want to use scanf when there is a valid chance of invalid input.

I agree, but that's a problem with scanf that should and could be fixed. There's almost always a chance of invalid input.

> As far as I can see, neither readf nor scanf can be used for sophisticated input validation or parsing of non-trivial input. You have to do it manually. How does readf make things better with strict(er) whitespace handling?

Far as I can see, implementing the Posix %[charset] extension would make readf a powerful one-stop shop for parsing input. Of course its speed needs to be up to snuff too. And of course its specification can be improved, which is where your input is very valuable.

> What behavior is by design, what behavior is caused by bugs? Can you give a real-world example where readf design clearly beats scanf design? (as it is the default it should be almost always better, but I fail to see it) Apart from that, what about the other points I mentioned?

I answered all of these in my other, longer post.

Andrei
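The whitespace rules debated in this exchange can be tried without a terminal by parsing strings with std.format.formattedRead. The sketch below assumes that formattedRead follows the same format-string rules as readf; `readsTwoInts` is a hypothetical helper introduced only for illustration.

```d
import std.format : formattedRead;

// Hypothetical helper for illustration: parses two ints from `text` according
// to `fmt` and reports whether both were read without an exception.
bool readsTwoInts(string text, string fmt, out int a, out int b)
{
    try
    {
        // formattedRead consumes from `text`, which is a local copy here.
        return formattedRead(text, fmt, &a, &b) == 2;
    }
    catch (Exception e)
    {
        return false;  // e.g. a conversion error on unexpected whitespace
    }
}
```

Under that assumption, `readsTwoInts("1 \n\t 2", "%s %s", a, b)` succeeds because the single space in the format matches any run of whitespace, while `readsTwoInts("1 2", "%s%s", a, b)` fails because the second `%s` meets a space that nothing in the format allows it to skip.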
May 09 2011
On 5/8/11 5:57 PM, Timon Gehr wrote:
> Andrei Alexandrescu wrote:
>> On 5/8/11 3:04 PM, Timon Gehr wrote:
>>> However I agree that Phobos has to provide some better input handling, since using possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled) I may try to implement a meaningful "read" function.
>>
>> Looking forward to detailed feedback about readf. It was implemented in a hurry so definitely it has a long way to go.
>>
>> Andrei
>
> What I consider the most important points about readf:

Thanks very much for providing detailed feedback.

> 1. Whitespace handling is different than scanf. It is much stricter and even feels inconsistent, Eg:
>
>     int a,b;
>     readf("%s %s",&a,&b);//input "1 2\n" read.
>     readf("%s %s",&a,&b);//input "1 2\n" read (and a==1&& b==2).

So far so good. By design one space in readf means "skip all whitespace".

>     readf("%s",&a);//input "1\n" read. yay.
>     readf("%s",&a);//input " 1\n" skipped. All subsequent input is skipped too.

I'm not seeing skipping in my tests; I do see an exception being thrown. Here's how I test:

    import std.stdio;
    void main() {
        int a, b;
        readf("%s",&a);
        assert(a == 1);
        readf("%s",&b);
        assert(b == 2);
    }

    dmd ./test && echo '1\n 2' | ./test

The first input is read into 'a' and reading stops just at the \n. Next you're trying to read "\n 2" into b, which fails due to the strict whitespace handling. To fix this, you'd need to insert a space before the second "%s".

I'm not hooked on this strict whitespace handling, but I think it makes a lot of sense particularly when you want to make sure the input looks exactly as you think it should. With scanf you can't have precise parsing even if you wanted; with readf all you need is to insert a space.

Precision is important. For example, Hive uses a \t for field separation when streaming to a file. It is very important to figure that you have one tab there versus two (two means a NULL field was in between).

>     readf("%s ",&a);//input "1 \n" read.
>     readf("%s ",&a);//input "1\n" skipped, presumably because the trailing space (!) is missing.

On my machine this passes:

    import std.stdio;
    void main() {
        int a, b;
        readf("%s ",&a);
        assert(a == 1);
        readf("%s ",&b);
        assert(b == 2);
    }

    dmd ./test && echo '1\n 2' | ./test

The explanation is that, again, a space means "skip all whitespace". So the first space eats the "\n " and the second space eats the final "\n" in the input (produced by echo). Please adjust this example so it unduly fails.

>     readf(" %s",&a);//input "1\n" read.
>     readf("\t%s",&a);//input "1\n": exception is thrown.

A "\t" in the formatting string for readf simply requires a tab. To skip over any number of tabs, do this:

    readf("%*1[\t]%s",&a);

That instructs readf to read, but not store, a string consisting of at most one tab. (To skip multiple tabs drop the "1".) This functionality is not yet implemented.

>     readf("%s\n",&a);//input "1\n" read.
>     readf("%s\n",&a);//input "1 \n": exception is thrown.

That is as expected - if you specify \n readf expects a \n.

>     readf("%s\t\n",&a);//input "1\t\n" read.
>     readf("%s \n",&a);//input "1 \n" skipped. readf throws an exception after any further input.

My testbed:

    import std.stdio;
    void main() {
        int a, b;
        readf("%s\t\n",&a);
        assert(a == 1);
        readf("%s \n",&b);
        assert(b == 2);
    }

    dmd ./test && echo "1\t\n2 " | ./test

It fails because it can't find the last \n. That's a bug.

> And some more, I do not remember all of them. Exceptions are most of the time only as useful as "Enforcement failed". You (almost?) never want this behavior, even at the points it marginally makes sense. It would be nice to have an optional whitespace-enforcing version that _really_ enforces it (as opposed to the current implementation), but that should not be the default. And then it should be consistent (also on skipping or exception throwing).

Except for one bug and one lacking implementation artifact, I find the current behavior consistent with a strict approach to whitespace handling.

> 2. readf takes pointers. Ugly, end of story. I even like C++ cin with all its '>>' more. scanf has that problem too, but it is a C function, you _cannot_ expect it to do any better than that. D has variadic template functions that may take ref parameters. It can be done entirely pointer-free.

When I implemented readf, ref variadic arguments weren't working. I'd be hesitant to change it right now as it does not improve actual functionality and disrupts current uses. But I agree ideally it should accept parameters by reference.

> 3. nonsense like readf("mooh",&a); cannot be caught at compile time. When/Why did you throw away the idea of static overloads? It would have been a powerful feature, and very useful for this case. scanf in C/C++ does not have this problem, because most modern compilers generate warnings for this. But that is making some functions "more equal than the others"

One early version I had was doing that and spelled

    readf!"format string"(arguments);

Unfortunately, sometimes runtime-computed formatting strings are needed and useful (see the recent std.log discussion...) so I decided to go with dynamic formatting for now. Once we get that right, providing an optional compile-time-checked formatting function shouldn't be too difficult with CTFE.

> 4. readf is slow. It is about 3-4 times slower than scanf (not 2-3, as I mistakenly claimed before). I think this is just a quality of implementation issue, but it is important.

I agree. I'm amazed readf is not slower actually. It uses by character file iteration, by far the slowest (and most embarrassing) code I wrote in Phobos: each character read entails one call to getc() to fetch the character, one call to ungetc() to restore the stream position, and finally one more call to getc() to move forward. The code is correct but very slow. Some C APIs provide undocumented means to peek at the next character in the stream without actually advancing the stream, which is what we need. I know how to do it on most Unixen and Walter knows how to do it on his own cstdlib implementation. We didn't have the time yet, and I'm glad the matter is under spotlight.

> Especially for programming competitions where there are time limits, you do not want IO to unnecessarily become a mayor bottleneck. (Input files can be huge)

Agreed.

> Other than that, D is WAY the most convenient language I have ever tried to solve small algorithmic tasks in. 5. Not really readf related: There's writef(ln) and there is write(ln). And then there is readf. I will provide a proof-of-concept for the read function soon.

Good idea. I suggest you provide a template read(T)() that mimics the functionality of Java's nextInt, nextFloat etc:

    auto a = stdin.next!int();
    auto b = stdin.next!double();
    auto s = stdin.next!string("\n"); // read a string up to \n
    ...

Andrei
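The getc()/ungetc() round trip blamed for readf's slowness in this post can be sketched with the C stdio bindings that ship with D. `peekChar` is a hypothetical name; this illustrates the portable peek pattern the post describes, not the actual Phobos code.

```d
import core.stdc.stdio : EOF, FILE, getc, ungetc;

// Looks at the next character of a C stream without consuming it, using a
// getc/ungetc round trip. Returns EOF at end of input.
int peekChar(FILE* stream)
{
    int c = getc(stream);
    if (c != EOF)
        ungetc(c, stream);  // push the character back so the next read sees it again
    return c;
}
```

Peeking and then consuming a character this way costs three library calls per character, which is the overhead a platform-specific peek into the stream buffer would remove.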
May 09 2011
Sry, overlooked this post. Andrei Alexandrescu wrote:On 5/8/11 5:57 PM, Timon Gehr wrote:usingAndrei Alexandrescu wrote:On 5/8/11 3:04 PM, Timon Gehr wrote:However I agree that Phobos has to provide some better input handling, sinceI tested inputting manually in terminal. The exception is thrown only when I provide an EOF. Seems like the input is not being skipped after all, but readf does not return until there is an EOF.Thanks very much for providing detailed feedback.What I consider the most important points about readf:possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled) I may try to implement a meaningful "read" function.Looking forward to detailed feedback about readf. It was implemented in a hurry so definitely it has a long way to go. Andrei1. Whitespace handling is different than scanf. It is much stricter and even feels inconsistent, Eg: int a,b; readf("%s %s",&a,&b);//input "1 2\n" read. readf("%s %s",&a,&b);//input "1 2\n" read (and a==1&& b==2).So far so good. By design one space in readf means "skip all whitespace".readf("%s",&a);//input "1\n" read. yay. readf("%s",&a);//input " 1\n" skipped. All subsequent input is skipped too.I'm not seeing skipping in my tests; I do see an exception being thrown. Here's how I test: import std.stdio; void main() { int a, b; readf("%s",&a); assert(a == 1); readf("%s",&b); assert(b == 2); } dmd ./test && echo '1\n 2' | ./testI'm not hooked on this strict whitespace handling, but I think it makes a lot of sense particularly when you want to make sure the input looks exactly as you think it should. With scanf you can't have precise parsing even if you wanted; with readf all you need is to insert a space. Precision is important. For example, Hive uses a \t for field separation when streaming to a file. 
It is very important to figure that you have one tab there versus two (two means a NULL field was in between).It should be possible to do that with scanf using %[] if I'm not mistaken.readf("%s ",&a);//input "1 \n" read. readf("%s ",&a);//input "1\n" skipped, presumably because the trailing space (!) is missing. On my machine this passes: import std.stdio; void main() { int a, b; readf("%s ",&a); assert(a == 1); readf("%s ",&b); assert(b == 2); } dmd ./test && echo '1\n 2' | ./test The explanation is that, again, a space means "skip all whitespace". So the first space eats the "\n " and the second space eats the final "\n" in the input (produced by echo). Please adjust this example so it unduly fails.Again, misinterpretation on my side. Typing into the terminal expects new input until a non-whitespace character is inserted. Should be fine, but can be surprising.I did not know it would ever be! That removes many of my concerns. (and the 'read' function removes the rest)readf(" %s",&a);//input "1\n" read. readf("\t%s",&a);//input "1\n": exception is thrown.A "\t" in the formatting string for readf simply requires a tab. To skip over any number of tabs, do this: readf("%*1[\t]%s",&a); That instructs readf to read, but not store, a string consisting of at most one tab. (To skip multiple tabs drop the "1".) This functionality is not yet implemented.At least I found one. =)readf("%s\n",&a);//input "1\n" read. readf("%s\n",&a);//input "1 \n": exception is thrown.That is as expected - if you specify \n readf expects a \n.readf("%s\t\n",&a);//input "1\t\n" read. readf("%s \n",&a);//input "1 \n" skipped. readf throws an exception after any further input.My testbed: import std.stdio; void main() { int a, b; readf("%s\t\n",&a); assert(a == 1); readf("%s \n",&b); assert(b == 2); } dmd ./test && echo "1\t\n2 " | ./test It fails because it can't find the last \n. That's a bug.And some more, I do not remember all of them. 
Exceptions are most of the time only as useful as "Enforcement failed". You (almost?) never want this behavior, even at the points it marginally makes sense. It would be nice to have an optional whitespace-enforcing version that _really_ enforces it (as opposed to the current implementation), but that should not be the default. And then it should be consistent (also on skipping or exception throwing).Except for one bug and one lacking implementation artifact, I find the current behavior consistent with a strict approach to whitespace handling.Agreed. Thanks for your explanations!We can have both, since it will never be possible to read in raw pointers: import std.stdio; import std.conv; private bool containsPointersImpl(T...)(){ //nesting this inside containsPointer template removes eponymous template trick. Is this a bug? foreach(t;T) static if(is(t U:U*)) return true; return false; } template containsPointers(T...){enum containsPointers=containsPointersImpl!T();} private bool onlyPointersImpl(T...)(){ foreach(t;T) static if(!is(t U:U*)) return false; return true; } template onlyPointers(T...){enum onlyPointers=onlyPointersImpl!T();} private string _readfImpl(int len){ string res="return std.stdio.stdin.readf(format,"; foreach(t;0..len) res~="&args["~to!string(t)~"], "; res~=");"; return res; } int _readf(T...)(string format, ref T args) if(!containsPointers!T){mixin(_readfImpl(T.length));} //classic definition for backwards compatibility. int _readf(T...)(string format, T args) if(onlyPointers!T){ return std.stdio.stdin.readf(format, args); } void main(){ int a; _readf(" %s",&a); writeln(a); _readf(" %s",a); writeln(a); }2. readf takes pointers. Ugly, end of story. I even like C++ cin with all its '>>' more. scanf has that problem too, but it is a C function, you _cannot_ expect it to do any better than that. D has variadic template functions that may take ref parameters. 
It can be done entirely pointer-free.

When I implemented readf, ref variadic arguments weren't working. I'd be hesitant to change it right now as it does not improve actual functionality and disrupts current uses. But I agree ideally it should accept parameters by reference.

The problem I see here is that the dynamic version still cannot be checked when passed a statically known format string. Why did you drop the idea of allowing something like
int readf(T...)(static string format, T args)
?

3. nonsense like readf("mooh",&a); cannot be caught at compile time. When/Why did you throw away the idea of static overloads? It would have been a powerful feature, and very useful for this case. scanf in C/C++ does not have this problem, because most modern compilers generate warnings for this. But that is making some functions "more equal than the others"

One early version I had was doing that and spelled
readf!"format string"(arguments);
Unfortunately, sometimes runtime-computed formatting strings are needed and useful (see the recent std.log discussion...) so I decided to go with dynamic formatting for now. Once we get that right, providing an optional compile-time-checked formatting function shouldn't be too difficult with CTFE.

Yes, I think it should support:
auto a = read!int;
auto b = read!double;
auto s = read!string("\n"); // this could be an overload on immutability. alternative would be read!(string,"\n"); I do not know.
auto x = read!(int[])(50); // read an array of 50 integers separated by whitespace
auto y = read!(int[],",")(50); // read an array of 50 integers separated by commas
auto z = read!(int[],", ")(50); // read an array of 50 integers separated by commas and whitespace
Plus the same for every type that can be to!type(string)'d.

But also: read should replace readf wherever possible in the following forms:
int a; double b; string s;
read(a,b,s);//reads whitespace-separated a, b and s in turn. (delimiter could be changed by template argument or so)
char[] c=new char[1000];
read(c); // only relocates c if the number of read characters exceeds 1000.
One problem I see: An evildoer could provide a huge input, filling up the whole RAM. I think this vulnerability is also present in readln. Any ideas?

Non-string arrays are handled this way:
int[100] arr;
read(arr); // reads 100 integers and stores in arr
read(arr[0..20]); //reads 20 integers into the first 20 slots of arr
int[] arr = new int[100];
read(arr); //ditto
Rationale: reading input should not /require/ heap activity.

The read function would cover all cases where no strict whitespace handling is required, and readf would take the rest! I think that would be a very nice solution.

Timon

4. readf is slow. It is about 3-4 times slower than scanf (not 2-3, as I mistakenly claimed before). I think this is just a quality of implementation issue, but it is important.

I agree. I'm amazed readf is not slower actually. It uses by-character file iteration, by far the slowest (and most embarrassing) code I wrote in Phobos: each character read entails one call to getc() to fetch the character, one call to ungetc() to restore the stream position, and finally one more call to getc() to move forward. The code is correct but very slow. Some C APIs provide undocumented means to peek at the next character in the stream without actually advancing the stream, which is what we need. I know how to do it on most Unixen and Walter knows how to do it on his own cstdlib implementation. We didn't have the time yet, and I'm glad the matter is under spotlight.

Especially for programming competitions where there are time limits, you do not want IO to unnecessarily become a major bottleneck. (Input files can be huge)

Agreed.

Other than that, D is WAY the most convenient language I have ever tried to solve small algorithmic tasks in.

5. Not really readf related: There's writef(ln) and there is write(ln). And then there is readf.
I will provide a proof-of-concept for the read function soon.

Good idea. I suggest you provide a template read(T)() that mimics the functionality of Java's nextInt, nextFloat etc:
auto a = stdin.next!int();
auto b = stdin.next!double();
auto s = stdin.next!string("\n"); // read a string up to \n
...
Andrei
May 09 2011
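The next!T idea suggested above can be sketched in a few lines. This is only an illustration under stated assumptions: TokenReader is a hypothetical name, not a Phobos type, and it works on a string that has already been read rather than on stdin directly, so the peek-ahead and performance issues discussed for readf do not arise here.

```d
import std.array : split;
import std.conv : to;
import std.exception : enforce;
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): hands out whitespace-separated
/// tokens one at a time, converted to the requested type.
struct TokenReader
{
    private string[] tokens;

    this(string input)
    {
        // split() without a separator splits on any run of whitespace,
        // so spaces, tabs and newlines are all treated alike.
        tokens = input.split();
    }

    /// Returns the next token converted to T; throws if no tokens remain.
    T next(T)()
    {
        enforce(tokens.length > 0, "no more input");
        auto tok = tokens[0];
        tokens = tokens[1 .. $];
        return tok.to!T;
    }
}

void main()
{
    auto input = TokenReader("3 2.5\n hello");
    auto a = input.next!int();
    auto b = input.next!double();
    auto s = input.next!string();
    writeln(a, " ", b, " ", s);
}
```

The trade-off is the one raised in the thread: splitting everything up front allocates on the heap, which a read function working directly on the stream could avoid.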
On Sun, May 8, 2011 at 3:04 PM, Timon Gehr <timon.gehr gmx.ch> wrote:

Andrew Wiley wrote:

What bothers me about that code is that you had to write a string to represent something that should be implicit. It may just be that formattedRead is more strict than scanf, but I had problems getting whitespace to behave properly with format code strings. Plus, when you just type %d, what if I want a long? What if I want an infinite precision integer? These things aren't solved by C function calls, and trying to come up with a string format code for every possible input would needlessly complicate things.

I was one of the D users, although I wasn't really worried about competing. I just wanted to see how D would compare after doing so many programming contests in Java. The main thing that frustrated me was that getting input in D wasn't anywhere near as straightforward as it is in Java. For the first problem, I'd do something like this in Java:
Scanner in = new Scanner(System.in);
int numTests = in.nextInt();
for(int test = 0; test < numTests; tests++) { //need the test index for output
    int numSteps = in.nextInt();
    for(; numSteps < 0; numSteps--)
        char robot = in.nextChar();
        int button = in.nextInt();
        //solve the problem!
    }
    //print the output!
}

Well, I don't like D's readf either (I use scanf, 2-3x faster and better whitespace handling). That said, you really made my day. The problem is not that reading input in D is less straightforward than in Java, the problem is, that you are used to Java's way of doing IO. (which I pretty much dislike, I guess it is a matter of taste) You do not actually have to bother with string handling at all when doing IO in C/C++/D. Reading array of integers:
int[100000] array; //somewhere in static storage, faster
...
scanf("%d",&n);
foreach(ref x;array) scanf("%d",&x);
Or, some heap activity involved, and actually more keystrokes, but some people like this way:
readf("%s",&n);//read number of items
int[] array=to!(int[])(split(strip(readln())));
How I would have written your example in D.
int numTests; scanf("%d", &numTests);
foreach(test;0..numTests){
    int numSteps; scanf("%d", &numSteps);
    foreach(step;0..numSteps){ //you have a bug in this line of your Java code introducing a looooong loop
        char robot; scanf("%c", &robot);
        int button; scanf("%d", &button);
        //solve the problem!
    }
    //print the output
}

As a note, I recently discovered while running through some D1 code that %c isn't a format code recognized by the D2 formatting functions. I realize this is C though.

The documentation seems to imply that readf reads an entire line. Was I just misunderstanding it?

In D, that looked like this:
string line;
int num;
stdin.readln(line);
formattedRead(line, "%s", &num);
for(int casen = 0; casen < num; casen++) {
...
In a few places, I could have used stdin.readf instead of readln/formattedRead, but not many because the number of items within a test is on the same line as the items.

That is not a problem at all, you can read the first few elements with readf and the rest of the line with readln

That doesn't make it brittle, that makes it heavy and/or overkill. What's brittle is when I have to exactly match whitespace, write strings for things that should be implicit, and keep track of more state than is strictly necessary. Java's Scanner is nice because you ask for an integer and get an integer, and as long as you ask for the right things in the right order, you don't have to track any state whatsoever. Keeping track of where you are in the input stream is something better left to the code doing the reading rather than the user. Your way doesn't involve state, but it also doesn't generalize to other types of streams.

I could have just been missing something, but something that was trivial in Java became brittle in D because I had to exactly match the whitespace for things to work.

I actually think Java's way is brittle. You have to instantiate a class just to read IO.

It's simple. I have a line that looks like this:
4 3 2 67 5
The first number is the number of numbers that follow, and the code looks like this:
string line = "4 3 2 67 5";
auto tokens = split(line);
int num = to!int(tokens[0]);
tokens = tokens[1..$];
foreach(index; 0..num) {
    int cur = to!int(tokens[0]);
    tokens = tokens[1..$];
    // do things
}
I realize this is just a more complicated version of your heap code above, but suppose I needed to read an integer, a string, and a floating point number for each item. This scales up quite nicely to that sort of thing.

I suppose I could have read a line and used splitter to split on whitespace, but that would make me have to watch more state and would wind up looking like this:
string line;
stdin.readln(line);
auto split = split(line);
int num = to!int(split[0]);
split = split[1..$];

I don't get this.

Yes, if I had thought a bit more, I wouldn't have said class. This could just be implemented as a few simple methods for reading primitives from string ranges (or character ranges, actually, as that would be more general). I would expect something like this to appear with the stream API that we'll hopefully build at some point. A class would probably be overkill.

... Actually... now that I'm looking at that, if I wrote a Scanner-like class based on this, is there any chance it could go into Phobos? Seems like between split and to, we could get something much less brittle working.

No chance, that is not the way D/Phobos works. You do not have a class for everything that would not need one. (just like Phobos does not have a writer class for output)

However I agree that Phobos has to provide some better input handling, since using possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled) I may try to implement a meaningful "read" function.

I think that input handling like this should be built on top of a stream API, and because that API isn't here yet, improving input may be premature. Or it may be too useful to wait.
May 08 2011
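The count-prefixed line from the message above ("4 3 2 67 5") can be handled with split and to alone. A minimal sketch, assuming the whole line is already in a string; parseCounted is a hypothetical name introduced for illustration, and the enforce checks are an addition that the original snippet did not have.

```d
import std.array : split;
import std.conv : to;
import std.exception : enforce;
import std.stdio : writeln;

/// Hypothetical helper: parses "N v1 v2 ... vN" and returns the N values.
int[] parseCounted(string line)
{
    auto tokens = line.split();          // split on any whitespace
    enforce(tokens.length > 0, "empty line");

    immutable count = tokens[0].to!size_t;
    enforce(tokens.length >= count + 1, "line is shorter than its count");

    int[] values;
    foreach (tok; tokens[1 .. count + 1])
        values ~= tok.to!int;
    return values;
}

void main()
{
    writeln(parseCounted("4 3 2 67 5"));
}
```

Because split() ignores runs of spaces, tabs and trailing newlines, this avoids the exact whitespace matching that made the formattedRead version feel brittle.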
Andrew Wiley wrote:

stdin is already a struct in D. To do it in a more Java-like manner would likely involve having a templated read function which is templated on the type that you want to get out of stdin next. Essentially, you'd do something like std.conv.parse directly on stdin by having it as part of std.stdio.File. Now, personally, I just always read in the whole line and then use std.conv.parse on it. I'm not sure if that actually costs you anything in terms of functionality, though it might be possible to implement a templated read function on std.stdio.File more efficiently. And using parse like that, you can get much friendlier I/O which is closer to what you'd get with Scanner in Java.
- Jonathan M Davis

I was one of the D users, although I wasn't really worried about competing. I just wanted to see how D would compare after doing so many programming contests in Java. The main thing that frustrated me was that getting input in D wasn't anywhere near as straightforward as it is in Java. For the first problem, I'd do something like this in Java:
Scanner in = new Scanner(System.in);
int numTests = in.nextInt();
for(int test = 0; test < numTests; tests++) { //need the test index for output
    int numSteps = in.nextInt();
    for(; numSteps < 0; numSteps--)
        char robot = in.nextChar();
        int button = in.nextInt();
        //solve the problem!
    }
    //print the output!
}

Well, I don't like D's readf either (I use scanf, 2-3x faster and better whitespace handling). That said, you really made my day. The problem is not that reading input in D is less straightforward than in Java, the problem is, that you are used to Java's way of doing IO. (which I pretty much dislike, I guess it is a matter of taste) You do not actually have to bother with string handling at all when doing IO in C/C++/D. Reading array of integers:
int[100000] array; //somewhere in static storage, faster
...
scanf("%d",&n);
foreach(ref x;array) scanf("%d",&x);
Or, some heap activity involved, and actually more keystrokes, but some people like this way:
readf("%s",&n);//read number of items
int[] array=to!(int[])(split(strip(readln())));
How I would have written your example in D.
int numTests; scanf("%d", &numTests);
foreach(test;0..numTests){
    int numSteps; scanf("%d", &numSteps);
    foreach(step;0..numSteps){ //you have a bug in this line of your Java code introducing a looooong loop
        char robot; scanf("%c", &robot);
        int button; scanf("%d", &button);
        //solve the problem!
    }
    //print the output
}

In D, that looked like this:
string line;
int num;
stdin.readln(line);
formattedRead(line, "%s", &num);
for(int casen = 0; casen < num; casen++) {
...
In a few places, I could have used stdin.readf instead of readln/formattedRead, but not many because the number of items within a test is on the same line as the items.

That is not a problem at all, you can read the first few elements with readf and the rest of the line with readln

I could have just been missing something, but something that was trivial in Java became brittle in D because I had to exactly match the whitespace for things to work.

I actually think Java's way is brittle. You have to instantiate a class just to read IO.

I suppose I could have read a line and used splitter to split on whitespace, but that would make me have to watch more state and would wind up looking like this:
string line;
stdin.readln(line);
auto split = split(line);
int num = to!int(split[0]);
split = split[1..$];

I don't get this.

... Actually... now that I'm looking at that, if I wrote a Scanner-like class based on this, is there any chance it could go into Phobos? Seems like between split and to, we could get something much less brittle working.

No chance, that is not the way D/Phobos works. You do not have a class for everything that would not need one. (just like Phobos does not have a writer class for output)

However I agree that Phobos has to provide some better input handling, since using possibly unsafe C functions is the best way to do it by now. (I think readf is severely crippled) I may try to implement a meaningful "read" function.
May 08 2011
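The "read the whole line, then use std.conv.parse on it" approach described above might look roughly like this. It is a sketch, not the poster's actual code: take is a hypothetical helper, and the stripLeft call is there on the assumption that parse does not skip leading whitespace for numeric targets.

```d
import std.conv : parse;
import std.stdio : writeln;
import std.string : stripLeft;

/// Hypothetical helper: consumes one value of type T from the front of s,
/// skipping leading whitespace first. s is advanced past the parsed text.
T take(T)(ref string s)
{
    s = s.stripLeft();
    return parse!T(s);   // parse takes its source by ref and consumes it
}

void main()
{
    string line = "12 3.5 rest";   // stands in for a line read with readln
    auto n = take!int(line);
    auto x = take!double(line);
    writeln(n, " ", x, " [", line, "]");
}
```

Since parse advances the string it is given, no separate position has to be tracked, which is close to what Scanner offers in Java.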
Andrew Wiley:

The main thing that frustrated me was that getting input in D wasn't anywhere near as straightforward as it is in Java. For the first problem,

I have tried to implement a D solution to the first problem, because its input is a bit more complex. I have used C++ code written by the winner as starting point. After several failed D versions (this is BAD for D2/Phobos), I've written a Python prototype and then I have translated it to D2:

import std.stdio, std.math, std.conv, std.string, std.array, std.algorithm;

auto next(R)(ref R range) {
    auto result = range.front();
    range.popFront();
    return result;
}

void main() {
    auto fin = File("input.txt");
    auto fout = File("output.txt", "w");
    foreach (i; 0 .. to!int(fin.readln().strip())) {
        int[2] lastP = 1;
        int[2] lastT = 0;
        int t = 0;
        auto parts = splitter(fin.readln().strip(), " ");
        foreach (_; 0 .. to!int(next(parts))) {
            string s = next(parts);
            int q = to!int(next(parts));
            int id = cast(int)(s == "B");
            t = max(t, abs(q - lastP[id]) + lastT[id]) + 1;
            lastP[id] = q;
            lastT[id] = t;
        }
    }
}

Three problems I've found in translating the prototype:
- A next() function/method is missing, but I needed it, so I have had to define it, to keep code from becoming hairy and quite less readable.
- to!int expects a stripped string. In my code I am never sure to have a stripped string coming from input, so I have to always add a strip(), this is dumb:
foreach (i; 0 .. to!int(fin.readln().strip())) {
==>
foreach (i; 0 .. to!int(fin.readln())) {
- std.algorithm.splitter() doesn't default to splitting on whitespace as std.string.split() does. This is bad because in this program I need to add a strip() and in general it's bad because if there are two spaces, or a newline, it causes a mess, so I'd like a new overload of splitter() that acts as split():
auto parts = splitter(fin.readln().strip(), " ");
==>
auto parts = splitter(fin.readln());

Bye,
bearophile
May 08 2011
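The per-case logic of the program above can be isolated into a function that takes one input line, which also sidesteps the strip() and splitter() complaints by using the whitespace-splitting split(). A minimal sketch: solveCase is a hypothetical name, the recurrence is the one from the posted code, and the sample lines in main are assumed to follow the "N robot button ..." layout that the posted code reads.

```d
import std.algorithm.comparison : max;
import std.array : split;
import std.conv : to;
import std.math : abs;
import std.stdio : writefln;

/// Hypothetical helper: solves one test case given as "N R1 P1 R2 P2 ...",
/// where each R is "O" or "B" and each P is a button position.
int solveCase(string line)
{
    auto parts = line.split();   // splits on any whitespace, no strip() needed
    int[2] lastP = 1;            // last position of each robot
    int[2] lastT = 0;            // time of each robot's last button press
    int t = 0;                   // time of the most recent press overall

    immutable n = parts[0].to!int;
    foreach (i; 0 .. n)
    {
        auto robot = parts[1 + 2 * i];
        immutable q = parts[2 + 2 * i].to!int;
        immutable id = robot == "B" ? 1 : 0;
        // The robot can start walking right after its own last press, but
        // cannot press before the previous press in the sequence.
        t = max(t, abs(q - lastP[id]) + lastT[id]) + 1;
        lastP[id] = q;
        lastT[id] = t;
    }
    return t;
}

void main()
{
    foreach (i, line; ["4 O 2 B 1 B 2 O 4", "3 O 5 O 8 B 100", "2 B 2 B 1"])
        writefln("Case #%s: %s", i + 1, solveCase(line));
}
```

Indexing into the token array replaces the hand-written next() helper, at the cost of materialising all tokens up front.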