digitalmars.D.learn - Multi-threaded sorting of text file
- MGW (15/15) Jan 03 2020 Need help:
- Alex (5/21) Jan 03 2020 As far as I know, there isn't any native in D. Maybe I overlooked
- Ali Çehreli (8/25) Jan 04 2020 How long are the lines? If 1K bytes, 100M would fit in memory just fine....
Need help: There's a large text file (hundreds of thousands of lines). The structure is as follows:
2345|wedwededwedwedwe ......
872625|rfrferwewweww .....
23|rergrferfefer ....
................
The file needs to be sorted by the first field, giving:
23|rergrferfefer.......
2345|wedwededwedwedwe.......
872625|rfrferwewweww.......
There are also N CPUs (from 4 to 8) and 16 GB of memory. I need to come up with an algorithm in D for fast sorting using multithreading.
Jan 03 2020
On Saturday, 4 January 2020 at 07:51:49 UTC, MGW wrote:
> Need help: There's a large text file (hundreds of thousands of lines). The structure is as follows:
> 2345|wedwededwedwedwe ......
> 872625|rfrferwewweww .....
> 23|rergrferfefer ....
> ................
> The file needs to be sorted by the first field, giving:
> 23|rergrferfefer.......
> 2345|wedwededwedwedwe.......
> 872625|rfrferwewweww.......
> There are also N CPUs (from 4 to 8) and 16 GB of memory. I need to come up with an algorithm in D for fast sorting using multithreading.

As far as I know, D has no native multithreaded sort. Maybe I overlooked something at code.dlang.org. But there are plenty of examples out in the wild. I found this on the first try:
https://stackoverflow.com/questions/23531625/multithreaded-sorting-application/23532317
Jan 03 2020
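One common shape for a multithreaded sort, which may or may not match the linked answer, is to split the records into chunks, sort each chunk on its own thread, and then merge the sorted chunks. Below is a minimal D sketch of that idea. It assumes the first field is a non-negative integer that fits in `ulong` and that every line contains a `|`. The names `Record`, `parseLine`, and `sortByKeyInParallel` are hypothetical and do not come from the thread.

```d
import std.algorithm : multiwayMerge, sort;
import std.array : array;
import std.conv : to;
import std.parallelism : parallel;
import std.range : chunks;
import std.string : indexOf;

// Hypothetical record type: the parsed key plus the untouched line.
struct Record
{
    ulong key;
    string line;
}

// Assumption: the line contains '|' and the text before it is an
// unsigned integer. Malformed lines are not handled in this sketch.
Record parseLine(string line)
{
    immutable bar = line.indexOf('|');
    return Record(line[0 .. bar].to!ulong, line);
}

Record[] sortByKeyInParallel(Record[] records, size_t nChunks)
{
    if (records.length == 0)
        return records;
    if (nChunks == 0)
        nChunks = 1;

    // Round up so that at most nChunks slices cover the whole array.
    immutable chunkSize = (records.length + nChunks - 1) / nChunks;
    Record[][] parts = records.chunks(chunkSize).array;

    // Each slice is sorted in place. A work unit size of 1 hands one
    // slice at a time to the worker threads of the default task pool.
    foreach (part; parallel(parts, 1))
        part.sort!"a.key < b.key"();

    // Sequential k-way merge of the sorted slices into a new array.
    return multiwayMerge!"a.key < b.key"(parts).array;
}
```

With 4 to 8 cores, passing the core count as `nChunks` is a reasonable starting point. The final merge runs on one thread and is not stable for equal keys, so this sketch only parallelizes the chunk-sorting step.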
On 1/3/20 11:51 PM, MGW wrote:
> Need help: There's a large text file (hundreds of thousands of lines).

How long are the lines? If they are about 1K bytes each, the whole file would be on the order of 100 MB, which would fit in memory just fine. There is a parallel quick sort example on the std.parallelism page:
https://dlang.org/phobos/std_parallelism.html

> The structure is as follows:
> 2345|wedwededwedwedwe ......
> 872625|rfrferwewweww .....
> 23|rergrferfefer ....
> .................
> The file needs to be sorted by the first field, giving:
> 23|rergrferfefer.......
> 2345|wedwededwedwedwe.......
> 872625|rfrferwewweww.......

Are you going to write the result back to a file? Then you would hardly notice any improvement from parallelism, because the relative slowness of I/O would determine the overall performance.

> There are also N CPUs (from 4 to 8) and 16 GB of memory. I need to come up with an algorithm in D for fast sorting using multithreading.

Ali
Jan 04 2020
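Both points in the reply above (the data fits in memory, and I/O may dominate) can be checked by timing the read, sort, and write phases separately before adding any threads. Below is a rough single-threaded D sketch. It assumes the first field is a non-negative integer that fits in `ulong`, and it skips lines without a `|`. The names `Row`, `Timings`, and `sortFileTimed` are hypothetical and do not come from the thread.

```d
import core.time : Duration;
import std.algorithm : sort;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : readText;
import std.stdio : File;
import std.string : indexOf, lineSplitter;

// Hypothetical helper types for this sketch.
struct Row
{
    ulong key;    // first field, parsed once
    string line;  // slice of the file contents, without the newline
}

struct Timings
{
    Duration readTime, sortTime, writeTime;
}

Timings sortFileTimed(string inPath, string outPath)
{
    Timings t;
    auto sw = StopWatch(AutoStart.yes);

    // Phase 1: read the whole file into memory and parse each key once.
    auto text = readText(inPath);
    Row[] rows;
    foreach (line; text.lineSplitter)
    {
        immutable bar = line.indexOf('|');
        if (bar < 0)
            continue; // assumption: lines without '|' are ignored
        rows ~= Row(line[0 .. bar].to!ulong, line);
    }
    t.readTime = sw.peek;
    sw.reset();

    // Phase 2: sort in memory by the numeric key.
    rows.sort!"a.key < b.key"();
    t.sortTime = sw.peek;
    sw.reset();

    // Phase 3: write the lines back out in sorted order.
    auto output = File(outPath, "w");
    foreach (row; rows)
        output.writeln(row.line);
    output.close();
    t.writeTime = sw.peek;

    return t;
}
```

If `sortTime` turns out to be small next to `readTime` and `writeTime` on the real file, parallelizing the sort is unlikely to change the total much. If it is the largest of the three, a parallel sort is worth trying.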