digitalmars.D.bugs - [Issue 11282] New: std.process: add capability for two-way inter-process communication without deadlock
http://d.puremagic.com/issues/show_bug.cgi?id=11282

           Summary: std.process: add capability for two-way inter-process
                    communication without deadlock
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: andrei erdani.com

From communication with Hans Fugal:

    import std.stdio;
    import std.range;
    import std.process;

    void main() {
        // exactly how large this needs to be to block the pipeline is probably
        // somewhat system-dependent, though likely very similar for most
        // boxen in a given family (linux, osx, etc).
        string s = repeat('a', 4096 * 64).array;
        auto p = pipeProcess(["cat"]);
        p.stdin.write(s);
        p.stdin.close;
        writeln(p.stdout.byChunk(4096));
    }

dtruss -f output (OSX, I can give strace on linux with a little more effort if it makes a difference):

    ...
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    44424/0x1092df: write_nocancel(0x4, "aaaa...", 0x4000) = 16384 0
    ...
    44425/0x1092eb: read(0x0, "aaaa...", 0x10000) = 65536 0
    44425/0x1092eb: write(0x1, "aaaa...", 0x10000) = 65536 0
    44425/0x1092eb: read(0x0, "aaaa...", 0x10000) = 65536 0
    (hang)

Threading and low-level workarounds are of course an option; this isn't a bug. But it would be friendly to take a page from the Python playbook and provide something that makes this kind of use case easy. Or at least warn about the possibility and provide a pointer as to possible workarounds in the docs.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Oct 16 2013
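
For reference, the Python behavior that the report points to can be sketched as below. This is only an illustration, not part of the original report: it assumes a POSIX system with `cat` on the PATH, and `roundtrip` is a hypothetical helper name. It sends the same 4096 * 64 bytes through `cat` using `subprocess.Popen.communicate()`, which handles the write and read sides together so the caller does not hit the write-everything-then-read-everything hang.

```python
import subprocess


def roundtrip(data: bytes) -> bytes:
    """Send `data` to `cat` and return everything it writes back.

    `roundtrip` is a hypothetical helper used only for this illustration.
    """
    proc = subprocess.Popen(
        ["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    # communicate() feeds stdin and drains stdout without the caller having
    # to order the two steps, then closes stdin and waits for the child.
    out, _ = proc.communicate(data)
    return out


if __name__ == "__main__":
    payload = b"a" * (4096 * 64)  # same size as in the D example
    echoed = roundtrip(payload)
    print(len(echoed) == len(payload))
```

Note that `communicate()` returns the whole of stdout as one in-memory value, so it trades the pipe deadlock for unbounded buffering in the caller.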
Brad Roberts <braddr puremagic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |braddr puremagic.com

How much more buffering do you recommend? The bottom line is that there has to be a limit before blocking enters the picture; otherwise infinite buffering is required, and that's just impossible.
Oct 16 2013
I think a primitive that would use different threads for reading and writing would help.
Oct 16 2013
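
The thread-based primitive suggested above might look roughly like the following sketch. It is written in Python rather than D purely for illustration, `roundtrip_threaded` is a hypothetical name, and it assumes a POSIX system with `cat` available. One thread writes to the child's stdin while the calling thread drains stdout, so a full pipe on the write side can no longer stall the read side.

```python
import subprocess
import threading


def roundtrip_threaded(argv, data: bytes, chunk_size: int = 4096) -> bytes:
    """Write `data` from a helper thread while this thread reads stdout.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Blocking here is harmless: the calling thread keeps draining stdout.
        try:
            proc.stdin.write(data)
        except BrokenPipeError:
            pass  # the child exited without reading all of its input
        finally:
            try:
                proc.stdin.close()  # signals EOF to the child
            except BrokenPipeError:
                pass

    writer = threading.Thread(target=feed)
    writer.start()

    chunks = []
    while True:
        chunk = proc.stdout.read(chunk_size)
        if not chunk:  # empty read means the child closed its stdout
            break
        chunks.append(chunk)

    writer.join()
    proc.stdout.close()
    proc.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"a" * (4096 * 64)
    print(roundtrip_threaded(["cat"], payload) == payload)
```

The pipe deadlock goes away, but the result is still accumulated in memory, which is the concern raised later in the thread.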
Hans Fugal <hans fugal.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hans fugal.net

Hi Brad, the idea is to have something like Python's subprocess.Popen.communicate(), which (conceptually) writes a string to the subprocess's stdin and then reads from its stdout and stderr until EOF. This must be implemented with alternating event-driven writes and reads (i.e. using select(); different threads would work, but that's unnecessary overhead) to avoid the pipe deadlock that you can get when you try to write everything and then read everything.

This is a common pattern that I think is worth making easy. It is not the only pattern: sometimes you need more control, stdin isn't known up front, you need parallelism and pipelining, etc. But it greatly simplifies the cases where it applies.

1. http://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
Oct 16 2013
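
The alternating, event-driven write/read loop described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the implementation of any library: it assumes a POSIX system (pipes must be usable with `selectors`), `communicate_select` is a hypothetical name, and only stdin and stdout are handled to keep it short.

```python
import os
import selectors
import subprocess


def communicate_select(argv, data: bytes, chunk_size: int = 4096) -> bytes:
    """Single-threaded write/read multiplexing over the child's pipes.

    Hypothetical helper for illustration only (POSIX assumed).
    """
    # bufsize=0 gives raw, unbuffered pipe objects so os.read/os.write
    # on the file descriptors see exactly what the selector reports.
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0
    )
    in_fd = proc.stdin.fileno()
    out_fd = proc.stdout.fileno()
    os.set_blocking(in_fd, False)  # never block on a full stdin pipe

    sel = selectors.DefaultSelector()
    sel.register(out_fd, selectors.EVENT_READ)
    pending = memoryview(data)
    if pending:
        sel.register(in_fd, selectors.EVENT_WRITE)
    else:
        proc.stdin.close()

    chunks = []
    while sel.get_map():  # loop until both pipes are unregistered
        for key, _ in sel.select():
            if key.fd == in_fd:
                try:
                    written = os.write(in_fd, pending[:chunk_size])
                except BlockingIOError:
                    written = 0  # pipe filled up again; retry later
                except BrokenPipeError:
                    written = len(pending)  # child stopped reading
                pending = pending[written:]
                if not pending:
                    sel.unregister(in_fd)
                    proc.stdin.close()  # signals EOF to the child
            else:
                chunk = os.read(out_fd, chunk_size)
                if chunk:
                    chunks.append(chunk)
                else:  # EOF on the child's stdout
                    sel.unregister(out_fd)
                    proc.stdout.close()

    sel.close()
    proc.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"a" * (4096 * 64)
    print(communicate_select(["cat"], payload) == payload)
```

Each pass through the loop only touches a pipe the selector reported as ready, so neither side can block while the other has work to do.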
Right, I'm familiar with the problem space. The issue is one that Python chooses to make the user's problem, and it is only really visible in the docs, if users happen to see it:

    Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

That's a bit more cavalier than I prefer in our standard library. If we require both the input range and the output sink to be supplied, that puts the choice front and center in the API instead of leaving it as a buyer-beware implementation detail.
Oct 16 2013
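
One way to read the "input range plus output sink" idea above is the sketch below. It is a guess at the shape of such an API, written in Python for illustration; `pump`, its parameters, and the use of a writer thread are all assumptions, not anything proposed verbatim in the thread. The caller supplies an iterable of byte blocks and a callable that receives each stdout chunk, so the function itself never holds more than one chunk of output at a time.

```python
import subprocess
import threading
from typing import Callable, Iterable, Sequence


def pump(
    argv: Sequence[str],
    source: Iterable[bytes],
    sink: Callable[[bytes], None],
    chunk_size: int = 4096,
) -> int:
    """Stream `source` into the child and pass stdout chunks to `sink`.

    Hypothetical API for illustration only. Returns the child's exit status.
    """
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Pull blocks lazily, so an unbounded source is never fully buffered.
        try:
            for block in source:
                proc.stdin.write(block)
        except BrokenPipeError:
            pass  # the child exited early
        finally:
            try:
                proc.stdin.close()
            except BrokenPipeError:
                pass

    writer = threading.Thread(target=feed)
    writer.start()

    while True:
        # read1() returns as soon as some data is available.
        chunk = proc.stdout.read1(chunk_size)
        if not chunk:
            break
        sink(chunk)  # what happens to the data is the caller's decision

    writer.join()
    proc.stdout.close()
    return proc.wait()


if __name__ == "__main__":
    total = 0

    def count(chunk: bytes) -> None:
        global total
        total += len(chunk)

    blocks = (b"a" * 4096 for _ in range(64))  # generated on demand
    status = pump(["cat"], blocks, count)
    print(status, total)
```

A caller who does want everything in memory can pass a list's `append` as the sink, which makes the buffering decision explicit at the call site.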
Wouldn't a select/threads-based approach take care of the buffering issue?
Oct 16 2013
Andrei, no: you have to buffer all of stdout before you return (and passing all of stdin before calling is a form of buffering too).

A source/sink approach would work just fine. Perhaps you can already do that with ProcessPipes's stdin/stdout/stderr? I'm still ramping up on the rich stdlib, so I'm not sure which constructs to use. What I'd want is a way to easily plug p.stdin into some kind of source (maybe a string that already exists, i.e. is already buffered) and p.stdout into some kind of sink (maybe just a string that grows, i.e. buffers), plus some way to know when the process has finished so I can extract the stdout. In other words, if it's easy to go string -> process -> string by wiring together a couple of existing classes, then all we have to do is document it.
Oct 16 2013
To elaborate, there are 4 buffers involved. Two of them are in the operating system (the pipe buffers), and two are in our space. If we try to flush all of our buffer before reading, then the pipe buffers can fill up and cause deadlock.

Brad is saying that just solving the pipe buffers problem doesn't go as far as he'd like. He also wants to solve the problem of having to buffer in the program, i.e. the general case where stdin may be very large (or infinite) and the code processing stdout doesn't want to read all the way to EOF, which may be very far away (or never arrive).
Oct 16 2013
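
The operating-system side of this can be observed directly. The sketch below (Python, for illustration; `pipe_capacity` is a hypothetical name and a POSIX system is assumed) fills a fresh pipe that nobody reads from and counts how many bytes it accepts before a write would block. That figure is the per-pipe limit the write-everything-first pattern runs into; it is system-dependent, and on Linux the default is typically 64 KiB.

```python
import os


def pipe_capacity() -> int:
    """Count the bytes a fresh pipe accepts before a write would block.

    Hypothetical helper for illustration only (POSIX assumed).
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode turns "would block" into BlockingIOError
    # instead of hanging the way the deadlocked writer does.
    os.set_blocking(write_fd, False)
    total = 0
    try:
        while True:
            total += os.write(write_fd, b"a" * 4096)
    except BlockingIOError:
        pass  # the pipe buffer is full
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total


if __name__ == "__main__":
    print(pipe_capacity())
```

Any payload larger than this, sent to a child that echoes its input, has to be interleaved with reads or it will stall.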
(Also, I find it somewhat ironic that I am advocating for the equivalent of Python's subprocess.Popen.communicate(), since 80% of the time I am grumbling that there isn't something more general because communicate() doesn't do what I need. So I am definitely in favor of a more general but still deadlock-free approach.)
Oct 16 2013
Restating a little: moving the limits out of the pipe layer and into the library layer doesn't solve the underlying fundamental problem, which is that buffering forever is untenable. So that requires a usage pattern change.

Simply increasing the size of the buffers might make more apps work, but it also means that when the still-arbitrary limits are hit, it's harder to understand why the limits exist. I prefer low limits here because they make it obvious early that the pattern is broken.
Oct 16 2013