digitalmars.D.learn - missing data with parallel and stdin
- moechofe (26/26) May 23 2016 Hi, I write a script that take a list of files from STDIN,
- Jack Stouffer (8/15) May 23 2016 Boy that's a confusing way to write that. Here's a clearer version
- moechofe (6/8) May 23 2016 Like this?:
- Era Scarecrow (9/15) May 23 2016 Last night I took the code sample and left copy out, everything
Hi, I wrote a script that takes a list of files from STDIN, computes some stuff, and copies the files under new names. I have 33k lines of input but only get 3k-5k files in the destination folder. This does not happen if I remove the .parallel() function. What did I do wrong?

    void delegate(string source,string dest) handler;
    if(use_symlink)
        handler = delegate(string s,string d){ symlink(s,d); };
    else
        handler = delegate(string s,string d){ copy(s,d); };

    foreach(entry; parallel(stdin.byLineCopy))
    try
    {
        auto source = buildPath(static_path,entry);
        auto md5 = digest!MD5(File(source).byChunk(64*1024));
        auto hash = toHexString!(LetterCase.lower)(md5);
        auto file = text(hash,'_',baseName(entry));
        auto dest = buildPath(hashed_path,file);
        handler(source,dest);
        writeln(entry,' ',file);
    }
    catch(Exception e)
    {
        error("Couldn't read, hash or copy %s",entry);
    }
May 23 2016
On Monday, 23 May 2016 at 08:59:31 UTC, moechofe wrote:
> void delegate(string source,string dest) handler;
> if(use_symlink)
>     handler = delegate(string s,string d){ symlink(s,d); };
> else
>     handler = delegate(string s,string d){ copy(s,d); };

Boy that's a confusing way to write that. Here's a clearer version

    if(use_symlink)
        handler = delegate(string s,string d){ symlink(s,d); };
    else
        handler = delegate(string s,string d){ copy(s,d); };

> What did I do wrong?

Sounds like a data race problem. Use a lock on the file write operation and see if that helps.
May 23 2016
On Monday, 23 May 2016 at 14:16:13 UTC, Jack Stouffer wrote:
> Sounds like a data race problem. Use a lock on the file write operation and see if that helps.

Like this?:

    synchronized(mutex) copy(source,dest);

That didn't solve anything. What I observe is: when the process is slower, more files are copied.
May 23 2016
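For reference, the lock suggested above could be set up roughly as follows. This is only a minimal sketch: the thread never shows how `mutex` was declared, so the `Mutex` object, the hypothetical `appendAll` helper, and the array append standing in for `copy` are all assumptions made for illustration. As reported above, serializing the copy did not bring back the missing files.

```d
import core.sync.mutex : Mutex;
import std.parallelism : parallel;

// Hypothetical helper: appends every item to one array from a
// parallel loop, guarding the append with a mutex.
int[] appendAll(int[] items)
{
    auto mutex = new Mutex; // assumed; the thread does not show this declaration
    int[] result;

    foreach (item; parallel(items))
    {
        // Only one worker thread at a time may run this block.
        synchronized (mutex)
        {
            result ~= item; // stand-in for copy(source, dest)
        }
    }
    return result;
}

void main()
{
    import std.algorithm.sorting : sort;
    import std.array : array;
    import std.range : iota;

    auto items = iota(1000).array;
    auto got = appendAll(items);
    sort(got); // worker order is not deterministic, so sort before comparing
    assert(got == items);
}
```

The lock only protects the guarded statement; everything else in the loop body still runs concurrently.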
On Monday, 23 May 2016 at 15:53:23 UTC, moechofe wrote:
> On Monday, 23 May 2016 at 14:16:13 UTC, Jack Stouffer wrote:
>> Sounds like a data race problem. Use a lock on the file write operation and see if that helps.
>
> That didn't solve anything. What I observe is: when the process is slower, more files are copied.

Last night I took the code sample and left copy out; everything else I got working. When I ran it, I noticed it was only running on one core, and it worked fine. However, when I put in a number for how many to work on at once (adding any number to parallel's call), it would crash the program quite often, generally because it couldn't close files it was scanning. Looking over the documentation, you appear to be using parallel correctly, so I don't know why it isn't working.
May 23 2016
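Nothing in this thread confirms the cause. One way to narrow it down, offered only as a sketch, is to read all of stdin into an array before the parallel loop and count how many entries the loop actually visits, using an explicit work unit size as in the experiment described above. The `processAll` helper and the `visited` counter are hypothetical names, and the hashing and copying are left out.

```d
import core.atomic : atomicLoad, atomicOp;
import std.array : array;
import std.parallelism : parallel;
import std.stdio : stdin, writeln;

// Hypothetical helper: runs the loop body once per entry and returns
// how many entries were actually visited.
size_t processAll(string[] entries)
{
    shared size_t visited;

    // Work unit size 1: each task handles a single entry.
    foreach (entry; parallel(entries, 1))
    {
        // ... hash and copy `entry` here, as in the original loop ...
        atomicOp!"+="(visited, 1);
    }
    return atomicLoad(visited);
}

void main()
{
    // Materialize stdin first so parallel() iterates over a
    // random-access array instead of a lazy input range.
    string[] entries = stdin.byLineCopy.array;
    writeln("read ", entries.length, " lines, visited ", processAll(entries));
}
```

If the two numbers match but files are still missing, the loss would be happening inside the hash or copy step rather than in how the lines are distributed to workers.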