digitalmars.D.learn - automate tuple creation
- forkit (46/46) Jan 19 2022 so I have this code below, that creates an array of tuples.
- forkit (2/2) Jan 19 2022 On Wednesday, 19 January 2022 at 21:59:15 UTC, forkit wrote:
- H. S. Teoh (23/34) Jan 19 2022 Why can't you just use a loop to initialize it?
- Ali Çehreli (24/27) Jan 19 2022 That works but would be unnecessarily slow and be against the idea of
- Ali Çehreli (6/10) Jan 19 2022 But that's a mistake: If rnd is thread-local like that, it should be
- forkit (38/38) Jan 19 2022 On Wednesday, 19 January 2022 at 22:35:58 UTC, Ali Çehreli wrote:
- forkit (5/5) Jan 19 2022 On Wednesday, 19 January 2022 at 23:22:17 UTC, forkit wrote:
- H. S. Teoh (21/23) Jan 19 2022 Premature optimization. ;-) There's nothing wrong with allocating an
- Ali Çehreli (7/14) Jan 19 2022 Not in this case because I am pointing at premature pessimization. :)
- forkit (12/12) Jan 19 2022 On Wednesday, 19 January 2022 at 21:59:15 UTC, forkit wrote:
- H. S. Teoh (12/18) Jan 19 2022 Do the id's have to be unique? If not, std.random.uniform() would do
- forkit (76/77) Jan 19 2022 yep...
- forkit (27/31) Jan 19 2022 arrg!
- forkit (73/73) Jan 19 2022 On Thursday, 20 January 2022 at 04:38:39 UTC, forkit wrote:
- bauss (5/9) Jan 20 2022 Don't make them random then, but use an incrementor.
- forkit (104/108) Jan 20 2022 The 'uniqueness' of id would actually be created in the database.
- Stanislav Blinov (29/53) Jan 20 2022 Allocating 4 megs to generate 10 numbers??? You can generate a
- forkit (27/39) Jan 20 2022 Nice. Thanks. I had to compromise a little though, as assumUnique
- forkit (15/15) Jan 20 2022 On Thursday, 20 January 2022 at 21:16:46 UTC, forkit wrote:
- Steven Schveighoffer (13/20) Jan 20 2022 Because it would allow altering const data.
- forkit (17/18) Jan 20 2022 I'm not sure I understand. At what point in this function is
- Ali Çehreli (18/32) Jan 20 2022 If that were allowed, you could mutate elements of record and would
- Ali Çehreli (7/9) Jan 20 2022 As H. S. Teoh would add at this point, that is not idiomatic but the
- forkit (147/147) Jan 20 2022 On Thursday, 20 January 2022 at 23:49:59 UTC, Ali Çehreli wrote:
- forkit (7/7) Jan 20 2022 On Friday, 21 January 2022 at 01:35:40 UTC, forkit wrote:
- Ali Çehreli (36/51) Jan 20 2022 Does that make just the following definition @safe or the entire module
- forkit (11/26) Jan 20 2022 Oh. this was intentional, as I wanted to write once, and only
- forkit (6/12) Jan 20 2022 oops. looking back at that code, it seems I didn't write what i
- H. S. Teoh (7/9) Jan 20 2022 [...]
- forkit (27/29) Jan 20 2022 :-)
- forkit (6/31) Jan 20 2022 actually something not right with Appender I think...
- Stanislav Blinov (4/6) Jan 21 2022 You're using writeln, which goes through C I/O buffered writes.
- forkit (32/32) Jan 21 2022 On Friday, 21 January 2022 at 08:53:26 UTC, Stanislav Blinov
- forkit (137/137) Jan 21 2022 On Friday, 21 January 2022 at 09:10:56 UTC, forkit wrote:
- H. S. Teoh (16/21) Jan 21 2022 Actually you don't even need to do this, unless you want precise control
- Steven Schveighoffer (7/19) Jan 21 2022 Yeah, iota is a random-access range, so you can just pass it directly,
- forkit (9/15) Jan 21 2022 thanks. that makes more sense actually ;-)
- forkit (18/18) Jan 21 2022 On Friday, 21 January 2022 at 21:01:11 UTC, forkit wrote:
- forkit (19/19) Jan 21 2022 On Friday, 21 January 2022 at 21:43:38 UTC, forkit wrote:
- H. S. Teoh (8/27) Jan 21 2022 What's the point of calling .dup here? The only reference to records is
- forkit (99/103) Jan 21 2022 good pickup. thanks ;-)
- forkit (22/22) Jan 21 2022 On Friday, 21 January 2022 at 22:25:32 UTC, forkit wrote:
- Steven Schveighoffer (16/43) Jan 21 2022 oof! use enums for compile-time strings ;)
- forkit (8/15) Jan 21 2022 yes, I was thinking this over as I was waking up this morning,
- H. S. Teoh (22/32) Jan 21 2022 [...]
- Steven Schveighoffer (24/31) Jan 20 2022 The compiler rules aren't enforced based on what code you wrote, it
so I have this code below, that creates an array of tuples.

but instead of hardcoding 5 tuples (or hardcoding any amount of tuples), what I really want to do is automate the creation of however many tuples I ask for:

i.e. instead of calling this:

createBoolMatrix(mArrBool);

I would call something like this:

createBoolMatrix(mArrBool, 5); // create an array of 5 tuples.

Some ideas about direction would be welcome ;-)

// ---
module test;

import std.stdio;
import std.range;
import std.traits;
import std.random;

@safe:

void main()
{
    uint[][] mArrBool;
    createBoolMatrix(mArrBool);
    process(mArrBool);
}

void process(T)(const ref T t) if (isForwardRange!T && !isInfinite!T)
{
    t.writeln; // sample output -> [[0, 1], [1, 0], [1, 1], [1, 1], [1, 1]]
}

void createBoolMatrix(ref uint[][] m)
{
    auto rnd = Random(unpredictableSeed);

    // btw. below does register with -profile=gc
    m = [
        [cast(uint)rnd.dice(0.6, 1.4), cast(uint)rnd.dice(0.4, 1.6)].randomShuffle(rnd),
        [cast(uint)rnd.dice(0.6, 1.4), cast(uint)rnd.dice(0.4, 1.6)].randomShuffle(rnd),
        [cast(uint)rnd.dice(0.6, 1.4), cast(uint)rnd.dice(0.4, 1.6)].randomShuffle(rnd),
        [cast(uint)rnd.dice(0.6, 1.4), cast(uint)rnd.dice(0.4, 1.6)].randomShuffle(rnd),
        [cast(uint)rnd.dice(0.6, 1.4), cast(uint)rnd.dice(0.4, 1.6)].randomShuffle(rnd)
    ];
}
// --
Jan 19 2022
On Wednesday, 19 January 2022 at 21:59:15 UTC, forkit wrote:

oh. that randomShuffle was unnecessary ;-)
Jan 19 2022
On Wed, Jan 19, 2022 at 09:59:15PM +0000, forkit via Digitalmars-d-learn wrote:
> so I have this code below, that creates an array of tuples.
>
> but instead of hardcoding 5 tuples (or hardcoding any amount of tuples), what I really want to do is automate the creation of however many tuples I ask for:
>
> i.e. instead of calling this:
>
> createBoolMatrix(mArrBool);
>
> I would call something like this:
>
> createBoolMatrix(mArrBool, 5); // create an array of 5 tuples.

Why can't you just use a loop to initialize it?

uint[][] createBoolMatrix(size_t n) {
    auto result = new uint[][n]; // allocate outer array
    foreach (ref row; result) {
        row = new uint[n]; // allocate inner array
        foreach (ref cell; row) {
            cell = cast(uint) rnd.dice(0.6, 1.4);
        }
    }
    return result;
}

Or, if you wanna use those new-fangled range-based idioms:

uint[][] createBoolMatrix(size_t n) {
    return iota(n)
        .map!(i => iota(n)
            .map!(j => cast(uint) rnd.dice(0.6, 1.4))
            .array)
        .array;
}

T

-- 
Verbing weirds language. -- Calvin (& Hobbes)
Jan 19 2022
On 1/19/22 13:59, forkit wrote:
> void createBoolMatrix(ref uint[][] m)
> {
>     auto rnd = Random(unpredictableSeed);

That works but would be unnecessarily slow and be against the idea of random number generators. The usual approach is, once you have a randomized sequence, you just continue using it. For example, I move rnd to module scope and initialize it once.

Random rnd;

shared static this() {
    rnd = Random(unpredictableSeed);
}

auto randomValue() {
    return cast(uint)rnd.dice(0.6, 1.4);
}

// Returning a dynamically allocated array looks expensive
// here. Why not use a struct or std.typecons.Tuple instead?
auto randomTuple() {
    return [ randomValue(), randomValue() ];
}

void createBoolMatrix(ref uint[][] m, size_t count) {
    import std.algorithm : map;
    import std.range : iota;
    m = count.iota.map!(i => randomTuple()).array;
}

Ali
Jan 19 2022
On 1/19/22 14:33, Ali Çehreli wrote:
> Random rnd;
>
> shared static this() {
>     rnd = Random(unpredictableSeed);
> }

But that's a mistake: If rnd is thread-local like that, it should be initialized in a 'static this' (not 'shared static this'). Otherwise, only the main thread's 'rnd' would be randomized, which is the only thread that executes 'shared static this' blocks.

Ali
Jan 19 2022
On Wednesday, 19 January 2022 at 22:35:58 UTC, Ali Çehreli wrote:

so I combined ideas from all responses:

// --
module test;

import std.stdio : writeln;
import std.range : iota, isForwardRange, hasSlicing, hasLength, isInfinite, array;
import std.random : Random, unpredictableSeed, dice;
import std.algorithm : map;

@safe:

Random rnd;

static this()
{
    rnd = Random(unpredictableSeed);
}

void main()
{
    uint[][] mArrBool;

    // e.g: create a matrix consisting of 5 tuples, with each tuple containing 3 random bools (0 or 1)
    createBoolMatrix(mArrBool,5, 2);

    process(mArrBool);
}

void createBoolMatrix(ref uint[][] m, size_t numberOfTuples, size_t numberOfBoolsInTuple)
{
    m = iota(numberOfTuples)
        .map!(i => iota(numberOfBoolsInTuple)
            .map!(numberOfBoolsInTuple => cast(uint) rnd.dice(0.6, 1.4))
            .array).array;
}

void process(T)(const ref T t) if (isForwardRange!T && hasSlicing!T && hasLength!T && !isInfinite!T)
{
    t.writeln;
}
//--
Jan 19 2022
On Wednesday, 19 January 2022 at 23:22:17 UTC, forkit wrote:

oops

// e.g: create a matrix consisting of 5 tuples, with each tuple containing 3 random bools (0 or 1)
createBoolMatrix(mArrBool,5, 3);
Jan 19 2022
On Wed, Jan 19, 2022 at 02:33:02PM -0800, Ali Çehreli via Digitalmars-d-learn wrote:
[...]
> // Returning a dynamically allocated array looks expensive
> // here. Why not use a struct or std.typecons.Tuple instead?

Premature optimization. ;-) There's nothing wrong with allocating an array.

If you're worried about memory efficiency, you could allocate the entire matrix in a single block and just assemble slices of it in the outer block, like this:

uint[][] createBoolMatrix(size_t count) {
    auto buffer = new uint[count*count];
    return iota(count).map!(i => buffer[count*i .. count*(i+1)])
        .array;
}

This lets you do only 2 GC allocations instead of (1+count) GC allocations. May help with memory fragmentation if `count` is large and you create a lot of these things. But I honestly wouldn't bother with this unless your memory profiler is reporting a problem in this aspect of your program. It just adds complexity to your code (== poorer long-term maintainability) for meager benefits.

T

-- 
What do you call optometrist jokes? Vitreous humor.
Jan 19 2022
On 1/19/22 15:21, H. S. Teoh wrote:
> On Wed, Jan 19, 2022 at 02:33:02PM -0800, Ali Çehreli via Digitalmars-d-learn wrote:
> [...]
>> // Returning a dynamically allocated array looks expensive
>> // here. Why not use a struct or std.typecons.Tuple instead?
>
> Premature optimization. ;-)

Not in this case because I am pointing at premature pessimization. :) There is no reason to use two-element dynamic arrays when uint[2], Tuple!(uint, uint), and structs are available.

> There's nothing wrong with allocating an array.

Agreed.

Ali
Jan 19 2022
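A minimal sketch of what Ali's suggestion could look like, assuming each tuple always holds exactly two values. The alias `Pair` and the function `createPairs` are hypothetical names, not code from the thread. With a static array as the element type, the whole matrix lives in one GC block instead of one block per row.

```d
import std.random : Random, dice, unpredictableSeed;

// Hypothetical alias: a fixed-size row stored inline in the outer array.
alias Pair = uint[2];

Pair[] createPairs(size_t count, ref Random rnd)
{
    // One allocation for the whole matrix; rows are not separate GC blocks.
    Pair[] result = new Pair[count];
    foreach (ref row; result)
        foreach (ref cell; row)
            cell = cast(uint) rnd.dice(0.6, 1.4); // yields 0 or 1
    return result;
}

void main()
{
    import std.stdio : writeln;
    auto rnd = Random(unpredictableSeed);
    createPairs(5, rnd).writeln;
}
```

The trade-off is that the row width is fixed at compile time, so this only fits the case where the number of values per tuple never varies at run time.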
On Wednesday, 19 January 2022 at 21:59:15 UTC, forkit wrote:

so at the moment i can get a set number of tuples, with a set number of bool values contained within each tuple.

e.g.
createBoolMatrix(mArrBool,3, 2);
[[1, 0], [1, 1], [1, 0]]

my next challenge (more for myself, but happy for input).. is to enhance this to return an associative array:

e.g.
createBoolAssociativeMatrix(mArrBool,3, 2);
[ [1000:[1, 0]], [1001:[1, 1]], [1001:[1, 0]]]

where 1000 is some random id...
Jan 19 2022
On Thu, Jan 20, 2022 at 12:12:56AM +0000, forkit via Digitalmars-d-learn wrote:
[...]
> createBoolAssociativeMatrix(mArrBool,3, 2);
>
> [ [1000:[1, 0]], [1001:[1, 1]], [1001:[1, 0]]]
>
> where 1000 is some random id...

Do the id's have to be unique? If not, std.random.uniform() would do the job.

If they have to be unique, you can either use a sequential global counter (a 64-bit counter will suffice -- you won't exhaust it for at least 60+ years of bumping the counter once per CPU tick at 8.4 GHz), or use an AA of ids already generated and just call uniform() to generate a new one until it doesn't collide anymore.

T

-- 
A mathematician learns more and more about less and less, until he knows everything about nothing; whereas a philosopher learns less and less about more and more, until he knows nothing about everything.
Jan 19 2022
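The second option H. S. Teoh describes (remember the ids already handed out in an AA and redraw on collision) might be sketched as below. The function name `makeUniqueIDs` and its parameters are assumptions chosen for illustration. The AA is used purely as a set.

```d
import std.random : Random, uniform, unpredictableSeed;

// Hypothetical helper: draw `count` distinct ids from the range [lo, hi).
uint[] makeUniqueIDs(size_t count, uint lo, uint hi, ref Random rnd)
{
    assert(count <= hi - lo, "range too small for the requested number of ids");

    bool[uint] seen; // associative array used as a set of ids already drawn
    uint[] ids;
    ids.reserve(count);

    while (ids.length < count)
    {
        immutable id = uniform(lo, hi, rnd);
        if (id !in seen) // average O(1) membership test
        {
            seen[id] = true;
            ids ~= id;
        }
        // on a collision, simply loop and draw again
    }
    return ids;
}

void main()
{
    import std.stdio : writeln;
    auto rnd = Random(unpredictableSeed);
    makeUniqueIDs(10, 999_000_000, 1_000_000_000, rnd).writeln;
}
```

Redrawing works well while the requested count is small relative to the range; as the count approaches the size of the range, collisions become frequent and a shuffle or a counter is likely a better fit.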
On Thursday, 20 January 2022 at 00:30:44 UTC, H. S. Teoh wrote:
> Do the id's have to be unique?

yep...

I'm almost there ;-)

// ---
module test;

import std.stdio : writeln;
import std.range : iota, isForwardRange, hasSlicing, hasLength, isInfinite;
import std.array : array, Appender;
import std.random : Random, unpredictableSeed, dice, choice;
import std.algorithm : map, uniq;

@safe:

Random rnd;

static this()
{
    rnd = Random(unpredictableSeed);
}

void main()
{
    int recordsNeeded = 5;

    uint[] uniqueIDs;
    makeUniqueIDs(uniqueIDs, recordsNeeded);
    writeln(uniqueIDs);

    uint[][] mArrBool;

    // e.g: create a matrix consisting of 5 tuples,
    // with each tuple containing 3 random bools (0 or 1)
    createBoolMatrix(mArrBool,recordsNeeded, 3);

    // process just writeln's its argument at the moment
    process(mArrBool); // [[1, 1, 1], [0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 1, 0]]

    // to do (integrate a single value taken from uniqueIDs so that each tuple looks like this: [999575454:[1, 1, 1]]
    // e.g.
    // processRecords(records);
    // output from above should look like this below:
    // [ [999575454:[1, 1, 1]], [999704246:[0, 0, 1]], [999969331:[1, 1, 1]], [999678591:[1, 1, 1]], [999691754:[1, 1, 0]] ]
}

void createBoolMatrix(ref uint[][] m, size_t numberOfTuples, size_t numberOfBoolsInTuple)
{
    m = iota(numberOfTuples)
        .map!(i => iota(numberOfBoolsInTuple)
            .map!(numberOfBoolsInTuple => cast(uint) rnd.dice(0.6, 1.4))
            .array).array;
}

void process(T)(const ref T t) if (isForwardRange!T && hasSlicing!T && hasLength!T && !isInfinite!T)
{
    t.writeln;
}

void processRecords(T)(const ref T t) if (isForwardRange!T && hasSlicing!T && hasLength!T && !isInfinite!T)
{
    t.writeln;
}

void makeUniqueIDs(ref uint[] arr, size_t sz)
{
    // id needs to be 9 digits, and needs to start with 999
    int[] a = iota(999_000_000, 1_000_000_000).array;
    // can produce a max of 1_000_000 records.

    Appender!(uint[]) appndr;
    // pre-allocate space to avoid costly reallocations
    appndr.reserve(sz+1);

    foreach(value; 1..(sz + 1))
        appndr ~= cast(uint)a.choice(rnd);

    // just interesting to see how often this asserts.
    //assert(appndr[].array == appndr[].uniq.array);

    arr = appndr[].uniq.array;

    // function should not return if this asserts (i.e. app will exit)
    assert(arr[].array == arr[].uniq.array);
}
// ---
Jan 19 2022
On Thursday, 20 January 2022 at 04:00:59 UTC, forkit wrote:
> void makeUniqueIDs(ref uint[] arr, size_t sz)
> {
>     ...
> }

arrg! what was i thinking! ;-)

// ---
void makeUniqueIDs(ref uint[] arr, size_t sz)
{
    arr.reserve(sz);

    // id needs to be 9 digits, and needs to start with 999
    int[] a = iota(999_000_000, 1_000_000_000).array;
    // above will contain 1_000_000 records that we can choose from.

    int i = 0;
    uint x;
    while(i != sz)
    {
        x = cast(uint)a.choice(rnd);

        // ensure every id added is unique.
        // (on a collision, just loop and pick again)
        if (!arr.canFind(x))
        {
            arr ~= x;
            i++;
        }
    }
}
//------
Jan 19 2022
On Thursday, 20 January 2022 at 04:38:39 UTC, forkit wrote:

all done ;-)

// ---
module test;

import std.stdio : writeln;
import std.range : iota, isForwardRange, hasSlicing, hasLength, isInfinite;
import std.array : array, Appender;
import std.random : Random, unpredictableSeed, dice, choice;
import std.algorithm : map, uniq, canFind;

@safe:

Random rnd;

static this()
{
    rnd = Random(unpredictableSeed);
}

void main()
{
    int recordsNeeded = 2;
    int boolValuesNeeded = 3;

    uint[] uniqueIDs;
    makeUniqueIDs(uniqueIDs, recordsNeeded);

    uint[][] tuples;
    createBoolMatrix(tuples, recordsNeeded, boolValuesNeeded);

    uint[][uint][] records = CreateTupleDictionary(uniqueIDs, tuples);
    processRecords(records);
}

auto CreateTupleDictionary(ref uint[] ids, ref uint[][] tuples)
{
    uint[][uint][] records;

    foreach(i, id; ids)
        records ~= [ ids[i] : tuples[i] ];

    return records.dup;
}

void processRecords(T)(const ref T t) if (isForwardRange!T && hasSlicing!T && hasLength!T && !isInfinite!T)
{
    t.writeln;

    // output from above should look like this:
    // [[999583661:[1, 1, 0]], [999273256:[1, 1, 1]]]

    // hoping to explore parallel here too...
}

void createBoolMatrix(ref uint[][] m, size_t numberOfTuples, size_t numberOfBoolsInTuple)
{
    m = iota(numberOfTuples)
        .map!(i => iota(numberOfBoolsInTuple)
            .map!(numberOfBoolsInTuple => cast(uint) rnd.dice(0.6, 1.4))
            .array).array;
}

void makeUniqueIDs(ref uint[] arr, size_t sz)
{
    arr.reserve(sz);

    // id needs to be 9 digits, and needs to start with 999
    int[] a = iota(999_000_000, 1_000_000_000).array;
    // above will contain 1_000_000 records that we can choose from.

    int i = 0;
    uint x;
    while(i != sz)
    {
        x = cast(uint)a.choice(rnd);

        // ensure every id added is unique.
        if (!arr.canFind(x))
        {
            arr ~= x;
            i++;
        }
    }
}
// ---
Jan 19 2022
On Thursday, 20 January 2022 at 04:00:59 UTC, forkit wrote:
> On Thursday, 20 January 2022 at 00:30:44 UTC, H. S. Teoh wrote:
>> Do the id's have to be unique?
>
> yep...

Don't make them random then, but use an incrementor.

If you can have ids that aren't integers then you could use uuids too.

https://dlang.org/phobos/std_uuid.html
Jan 20 2022
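Both of bauss's options can be sketched in a few lines. The struct `IdCounter` and its starting value are hypothetical, picked only to match the "9 digits, starts with 999" requirement from earlier in the thread; `randomUUID` is the Phobos function from the linked `std.uuid` module.

```d
import std.uuid : UUID, randomUUID;

// Hypothetical sequential id source: unique by construction, no collision check.
struct IdCounter
{
    uint next = 999_000_000;

    uint take()
    {
        return next++;
    }
}

void main()
{
    import std.stdio : writeln;

    IdCounter counter;
    writeln(counter.take()); // 999000000
    writeln(counter.take()); // 999000001

    // If ids need not be integers, a random UUID avoids bookkeeping entirely.
    UUID id = randomUUID();
    writeln(id);
}
```

The counter gives predictable ids, which may or may not matter for test data; the UUID gives unpredictable ones at the cost of a longer, non-numeric key.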
On Thursday, 20 January 2022 at 10:11:10 UTC, bauss wrote:
> Don't make them random then, but use an incrementor.
>
> If you can have ids that aren't integers then you could use uuids too.
>
> https://dlang.org/phobos/std_uuid.html

The 'uniqueness' of id would actually be created in the database.

I'm just creating a dataset to simulate an export.

I'm pretty much done, just wish -profile=gc was working in createUniqueIDArray(..)

// ---------------
module test;
@safe:

import std.stdio : write, writef, writeln, writefln;
import std.range : iota, isForwardRange, hasSlicing, hasLength, isInfinite;
import std.array : array, byPair;
import std.random : Random, unpredictableSeed, dice, choice;
import std.algorithm : map, uniq, canFind;

debug { import std; }

Random rnd;
static this() { rnd = Random(unpredictableSeed); }

void main()
{
    const int recordsNeeded = 10;
    const int valuesPerRecord = 8;

    int[] idArray;
    createUniqueIDArray(idArray, recordsNeeded);

    int[][] valuesArray;
    createValuesArray(valuesArray, recordsNeeded, valuesPerRecord);

    int[][int][] records = CreateDataSet(idArray, valuesArray, recordsNeeded);

    ProcessRecords(records);
}

void ProcessRecords(ref const(int[][int][]) recArray)
{
    void processRecord(ref int id, ref const(int)[] result)
    {
        writef("%s\t%s", id, result);
    }

    foreach(ref record; recArray)
    {
        foreach (ref rp; record.byPair)
        {
            processRecord(rp.expand);
        }
        writeln;
    }
}

int[][int][] CreateDataSet(ref int[] idArray, ref int[][] valuesArray, int numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);
    debug { writefln("records.capacity is %s", records.capacity); }

    foreach(i, id; idArray)
        records ~= [ idArray[i] : valuesArray[i] ]; // NOTE: does register with -profile=gc

    return records.dup;
}

void createValuesArray(ref int[][] m, size_t recordsNeeded, size_t valuesPerRecord)
{
    m = iota(recordsNeeded)
        .map!(i => iota(valuesPerRecord)
            .map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4))
            .array).array; // NOTE: does register with -profile=gc
}

void createUniqueIDArray(ref int[] idArray, int recordsNeeded)
{
    idArray.reserve(recordsNeeded);
    debug { writefln("idArray.capacity is %s", idArray.capacity); }

    // id needs to be 9 digits, and needs to start with 999
    // below will contain 1_000_000 records that we can choose from.
    int[] ids = iota(999_000_000, 1_000_000_000).array; // NOTE: does NOT register with -profile=gc

    int i = 0;
    int x;
    while(i != recordsNeeded)
    {
        x = ids.choice(rnd);

        // ensure every id added is unique.
        if (!idArray.canFind(x))
        {
            idArray ~= x; // NOTE: does NOT register with -profile=gc
            i++;
        }
    }
}

/+
sample output:

999623777    [0, 0, 1, 1, 1, 0, 0, 0]
999017078    [1, 0, 1, 1, 1, 1, 1, 1]
999269073    [1, 1, 0, 0, 1, 1, 0, 1]
999408504    [0, 1, 1, 1, 1, 1, 0, 0]
999752314    [1, 0, 0, 1, 1, 1, 1, 0]
999660730    [0, 1, 0, 0, 1, 1, 1, 1]
999709822    [1, 1, 1, 0, 1, 1, 0, 0]
999642248    [1, 1, 1, 0, 0, 1, 1, 0]
999533069    [1, 1, 1, 0, 0, 0, 0, 0]
999661591    [1, 1, 1, 1, 1, 0, 1, 1]
+/
// ---------------
Jan 20 2022
On Thursday, 20 January 2022 at 12:15:56 UTC, forkit wrote:
> void createUniqueIDArray(ref int[] idArray, int recordsNeeded)
> {
>     idArray.reserve(recordsNeeded);
>     debug { writefln("idArray.capacity is %s", idArray.capacity); }
>
>     // id needs to be 9 digits, and needs to start with 999
>     // below will contain 1_000_000 records that we can choose from.
>     int[] ids = iota(999_000_000, 1_000_000_000).array; // NOTE: does NOT register with -profile=gc
>
>     int i = 0;
>     int x;
>     while(i != recordsNeeded)
>     {
>         x = ids.choice(rnd);
>
>         // ensure every id added is unique.
>         if (!idArray.canFind(x))
>         {
>             idArray ~= x; // NOTE: does NOT register with -profile=gc
>             i++;
>         }
>     }
> }

Allocating 4 megs to generate 10 numbers??? You can generate a random number between 999000000 and 1000000000.

```
immutable(int)[] createUniqueIDArray(int recordsNeeded)
{
    import std.random;
    import std.algorithm.searching : canFind;
    int[] result = new int[recordsNeeded];

    int i = 0;
    int x;
    while(i != recordsNeeded)
    {
        // id needs to be 9 digits, and needs to start with 999
        x = uniform(999*10^^6, 10^^9);

        // ensure every id added is unique.
        if (!result[0 .. i].canFind(x))
            result[i++] = x;
    }
    import std.exception : assumeUnique;
    return result.assumeUnique;
}

void main()
{
    import std.stdio;
    createUniqueIDArray(10).writeln;
}
```

Only one allocation, and it would be tracked with -profile=gc...
Jan 20 2022
On Thursday, 20 January 2022 at 12:40:09 UTC, Stanislav Blinov wrote:
> Allocating 4 megs to generate 10 numbers??? You can generate a random number between 999000000 and 1000000000.
>
> ...
>         // id needs to be 9 digits, and needs to start with 999
>         x = uniform(999*10^^6, 10^^9);
>
>         // ensure every id added is unique.
>         if (!result[0 .. i].canFind(x))
>             result[i++] = x;
>     }
>     import std.exception : assumeUnique;
>     return result.assumeUnique;
> ...

Nice. Thanks.

I had to compromise a little though, as assumeUnique is @system, and all my code is @safe (and trying to avoid the need for an inline @system wrapper ;-)

//---
void createUniqueIDArray(ref int[] idArray, int recordsNeeded)
{
    idArray.reserve(recordsNeeded);
    debug { writefln("idArray.capacity is %s", idArray.capacity); }

    int i = 0;
    int x;
    while(i != recordsNeeded)
    {
        // generate a random 9 digit id that starts with 999
        x = uniform(999*10^^6, 10^^9); // thanks Stanislav!

        // ensure every id added is unique.
        if (!idArray.canFind(x))
        {
            idArray ~= x; // NOTE: does NOT register with -profile=gc
            i++;
        }
    }
}
//---
Jan 20 2022
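For reference, the kind of inline wrapper forkit wanted to avoid is usually written as a `@trusted` lambda. The sketch below is an assumption about how that could look, not code from the thread; `makeSquares` is a hypothetical stand-in for any function that fills a fresh array and then freezes it.

```d
@safe:

import std.exception : assumeUnique;

// Hypothetical example function: builds a fresh array, then returns it as immutable.
immutable(int)[] makeSquares(size_t n)
{
    int[] result = new int[n];
    foreach (i, ref x; result)
        x = cast(int)(i * i);

    // `result` has no other references at this point, so treating it as
    // immutable is sound. The @trusted lambda confines the unchecked call
    // to this single expression; the rest of the function stays @safe.
    return (() @trusted => result.assumeUnique)();
}

void main()
{
    import std.stdio : writeln;
    makeSquares(5).writeln; // [0, 1, 4, 9, 16]
}
```

The burden of proof moves to the programmer: the wrapper is only correct as long as nothing else holds a mutable reference to the array.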
On Thursday, 20 January 2022 at 21:16:46 UTC, forkit wrote:

Cannot work out why I cannot pass valuesArray in as ref const??

get error: Error: cannot append type `const(int[])[const(int)]` to type `int[][int][]`

// --
int[][int][] CreateDataSet(ref const int[] idArray, ref const(int[][]) valuesArray, const int numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);

    foreach(i, id; idArray)
        records ~= [ idArray[i] : valuesArray[i] ];

    return records.dup;
}
// ---
Jan 20 2022
On 1/20/22 5:07 PM, forkit wrote:
> On Thursday, 20 January 2022 at 21:16:46 UTC, forkit wrote:
>
> Cannot work out why I cannot pass valuesArray in as ref const??
>
> get error: Error: cannot append type `const(int[])[const(int)]` to type `int[][int][]`

Because it would allow altering const data. e.g.:

```d
const(int[])[const(int)] v = [1: [1, 2, 3]];
int[][int][] arr = [v]; // assume this works
arr[0][1][0] = 5; // oops, just set v[1][0]
```

General rule of thumb is that you can convert the HEAD of a structure to mutable from const, but not the TAIL (the stuff it points at). An associative array is a pointer-to-implementation construct, so it's a reference.

-Steve
Jan 20 2022
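Steve's head/tail rule can be checked on a plain slice. The following is a small sketch under the assumption that a slice behaves like the AA case in this respect; the function name `headTailDemo` is made up for illustration.

```d
// Hypothetical demo of the HEAD vs TAIL rule on a slice.
void headTailDemo()
{
    const(int[]) a = [1, 2, 3];

    // HEAD conversion: the slice itself (pointer + length) is copied by value,
    // so the copy may be re-pointed, while the elements stay const.
    const(int)[] b = a;
    b = b[1 .. $];
    assert(b == [2, 3]);

    // TAIL conversion is rejected: it would expose a's elements as mutable.
    static assert(!__traits(compiles, { int[] c = a; }));

    // Getting a mutable view requires an explicit copy of the data.
    int[] d = a.dup;
    d[0] = 5;
    assert(a[0] == 1); // the original is untouched
}

void main()
{
    headTailDemo();
}
```

Applied to the thread's problem, this suggests that building a mutable `int[][int][]` from a const `valuesArray` needs either a copy of each row or a result type whose elements are themselves const.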
On Thursday, 20 January 2022 at 22:31:17 UTC, Steven Schveighoffer wrote:
> Because it would allow altering const data.

I'm not sure I understand. At what point in this function is valuesArray modified, and thus preventing it being passed in with const?

// ---
int[][int][] CreateDataSet
(ref const int[] idArray, ref int[][] valuesArray, const int numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);

    foreach(i, const id; idArray)
        records ~= [ idArray[i] : valuesArray[i] ];

    return records.dup;
}
// ----
Jan 20 2022
On 1/20/22 15:01, forkit wrote:
> On Thursday, 20 January 2022 at 22:31:17 UTC, Steven Schveighoffer wrote:
>> Because it would allow altering const data.
>
> I'm not sure I understand. At what point in this function is valuesArray modified, and thus preventing it being passed in with const?
>
> // ---
>
> int[][int][] CreateDataSet
> (ref const int[] idArray, ref int[][] valuesArray, const int numRecords)
> {
>     int[][int][] records;

Elements of records are mutable.

>     records.reserve(numRecords);
>
>     foreach(i, const id; idArray)
>         records ~= [ idArray[i] : valuesArray[i] ];

If that were allowed, you could mutate elements of records and would break the promise to your caller.

Aside: There is no reason to pass arrays and associative arrays as 'ref const' in D as they are already reference types. Unlike C++, there is no copying of the elements. When you pass by value, just a couple of fundamental types are copied.

Furthermore and in theory, there may be a performance penalty when an array is passed by reference because elements would be accessed by dereferencing twice: Once for the parameter reference and once for the .ptr property of the array. (This is in theory.)

void foo(ref const int[]) {} // Unnecessary
void foo(const int[]) {} // Idiomatic
void foo(in int[]) {} // Intentful :)

Passing arrays by reference makes sense when the function will mutate the argument.

Ali
Jan 20 2022
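Ali's aside about when `ref` matters for slices can be illustrated with a short sketch. The three function names are hypothetical. The point is that a by-value slice parameter shares the elements but not the length, so only a `ref` parameter lets the callee grow the caller's array.

```d
// Appends to a by-value copy of the slice: the caller's length is unchanged.
void appendByValue(int[] arr)
{
    arr ~= 42;
}

// Appends through a reference: the caller sees the new element.
void appendByRef(ref int[] arr)
{
    arr ~= 42;
}

// Read-only access needs neither ref nor a copy of the elements.
int total(const(int)[] arr)
{
    int sum = 0;
    foreach (x; arr)
        sum += x;
    return sum;
}

void main()
{
    int[] a = [1, 2];

    appendByValue(a);
    assert(a.length == 2);   // still [1, 2]

    appendByRef(a);
    assert(a == [1, 2, 42]); // grown in the caller

    assert(total(a) == 45);
}
```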
On 1/20/22 15:10, Ali Çehreli wrote:
> void foo(const int[]) {} // Idiomatic

As H. S. Teoh would add at this point, that is not idiomatic but the following are (with different meanings):

void foo(const(int)[]) {} // Idiomatic
void foo(const(int[])) {} // Idiomatic

> void foo(in int[]) {} // Intentful :)

I still like that one. :)

Ali
Jan 20 2022
On Thursday, 20 January 2022 at 23:49:59 UTC, Ali Çehreli wrote:

so here is final code, in idiomatic D, as far as I can tell ;-)

curious output when using -profile=gc

.. a line referring to:
std.array.Appender!(immutable(char)[]).Appender.Data std.array.Appender!string.Appender.this C:\D\dmd2\windows\bin\..\..\src\phobos\std\array.d:3330

That's not really helpful, as I'm not sure what line of my code it's referring to.

// ---------------

/+
=====================================================================
This program creates a sample dataset consisting of 'random' records,
and then outputs that dataset to a file.

Arguments can be passed on the command line,
or otherwise default values are used instead.

Example of that output can be seen at the end of this code.
=====================================================================
+/

module test;
@safe

import std.stdio : write, writef, writeln, writefln;
import std.range : iota;
import std.array : array, byPair;
import std.random : Random, unpredictableSeed, dice, choice, uniform;
import std.algorithm : map, uniq, canFind;
import std.conv : to;
import std.stdio : File;
import std.format;

debug { import std; }

Random rnd;
static this() { rnd = Random(unpredictableSeed); } // thanks Ali

void main(string[] args)
{
    int recordsNeeded, valuesPerRecord;
    string fname;

    if(args.length < 4)
    {
        recordsNeeded = 10;
        valuesPerRecord = 8;
        fname = "D:/rnd_records.txt";
    }
    else
    {
        // assumes valid values being passed in ;-)
        recordsNeeded = to!int(args[1]);
        valuesPerRecord = to!int(args[2]);
        fname = args[3];
    }

    int[] idArray;
    createUniqueIDArray(idArray, recordsNeeded);

    int[][] valuesArray;
    createValuesArray(valuesArray, recordsNeeded, valuesPerRecord);

    int[][int][] records = CreateDataSet(idArray, valuesArray, recordsNeeded);

    ProcessRecords(records, fname);

    writefln("All done. Check if records written to %s", fname);
}

void createUniqueIDArray
(ref int[] idArray, const(int) recordsNeeded)
{
    idArray.reserve(recordsNeeded);
    debug { writefln("idArray.capacity is %s", idArray.capacity); }

    int i = 0;
    int x;
    while(i != recordsNeeded)
    {
        // id needs to be 9 digits, and needs to start with 999
        x = uniform(999*10^^6, 10^^9); // thanks Stanislav

        // ensure every id added is unique.
        if (!idArray.canFind(x))
        {
            idArray ~= x; // NOTE: does NOT appear to register with -profile=gc
            i++;
        }
    }
}

void createValuesArray
(ref int[][] valuesArray, const(int) recordsNeeded, const(int) valuesPerRecord)
{
    valuesArray = iota(recordsNeeded)
        .map!(i => iota(valuesPerRecord)
            .map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4))
            .array).array; // NOTE: does register with -profile=gc
}

int[][int][] CreateDataSet
(const(int)[] idArray, int[][] valuesArray, const(int) numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);
    debug { writefln("records.capacity is %s", records.capacity); }

    foreach(i, const id; idArray)
    {
        // NOTE: below does register with -profile=gc
        records ~= [ idArray[i] : valuesArray[i] ];
    }

    return records.dup;
}

void ProcessRecords
(in int[][int][] recArray, const(string) fname)
{
    auto file = File(fname, "w");
    scope(exit) file.close;

    string[] formattedRecords;
    formattedRecords.reserve(recArray.length);
    debug { writefln("formattedRecords.capacity is %s", formattedRecords.capacity); }

    void processRecord(const(int) id, const(int)[] values)
    {
        // NOTE: below does register with -profile=gc
        formattedRecords ~= id.to!string ~ values.format!"%(%s,%)";
    }

    foreach(ref const record; recArray)
    {
        foreach (ref rp; record.byPair)
        {
            processRecord(rp.expand);
        }
    }

    foreach(ref rec; formattedRecords)
        file.writeln(rec);
}

/+
sample file output:

9992511730,1,0,1,0,1,0,1
9995369731,1,1,1,1,1,1,1
9993136031,1,0,0,0,1,0,0
9998979051,1,1,1,1,0,1,1
9998438090,1,1,0,1,1,0,0
9995132750,0,0,1,0,1,1,1
9997123630,0,1,1,1,0,1,1
9998351590,1,0,0,1,1,1,1
9991454121,1,1,1,1,1,0,1
9997673520,1,1,1,1,1,1,1
+/
// ---------------
Jan 20 2022
On Friday, 21 January 2022 at 01:35:40 UTC, forkit wrote:

oops. nasty mistake to make ;-)

module test;
@safe

should be:

module test;
@safe:
Jan 20 2022
On 1/20/22 17:35, forkit wrote:
> module test;
> @safe

Does that make just the following definition @safe or the entire module @safe? Trying... Yes, I am right. To make the module safe, use the following syntax:

@safe:

> idArray.reserve(recordsNeeded);
[...]
> idArray ~= x; // NOTE: does NOT appear to register with -profile=gc

Because you've already reserved enough memory above. Good.

> int[][int][] records;
> records.reserve(numRecords);

That's good for the array part. However...

> // NOTE: below does register with -profile=gc
> records ~= [ idArray[i] : valuesArray[i] ];

The right hand side is a freshly generated associative array. For every element of 'records', there is a one-element AA created. AA will need to allocate memory for its element. So, GC allocation is expected there.

> string[] formattedRecords;
> formattedRecords.reserve(recArray.length);
[...]
> // NOTE: below does register with -profile=gc
> formattedRecords ~= id.to!string ~ values.format!"%(%s,%)";

Again, although 'formattedRecords' has reserved memory, the right hand side has dynamic memory allocations.

1) id.to!string allocates

2) format allocates memory for its 'string' result (I think the Appender report comes from format's internals.)

3) Operator ~ makes a new string from the previous two

(Somehow, I don't see three allocations though. Perhaps an NRVO is applied there. (?))

I like the following better, which reduces the allocations:

formattedRecords ~= format!"%s%(%s,%)"(id.to!string, values);

> foreach(ref rec; formattedRecords)
> file.writeln(rec);

The bigger question is, why did 'formattedRecords' exist at all? You could have written the output directly to the file.

But even *worse* and with apologies, ;) here is something crazy that achieves the same thing:

void ProcessRecords
(in int[][int][] recArray, const(string) fname)
{
    import std.algorithm : joiner;
    auto toWrite = recArray.map!(e => e.byPair);
    File("rnd_records.txt", "w").writefln!"%(%(%(%s,%(%s,%)%)%)\n%)"(toWrite);
}

I've done lots of trial and error for the required number of nested %( %) pairs. Phew...

Ali
Jan 20 2022
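A more conventional way to follow Ali's "write the output directly" advice might look like the sketch below. It is an illustration under assumptions, not code from the thread: the function name `writeRecords` is hypothetical, and the comma between the id and the values is an added assumption (forkit's original output concatenates them). The function formats straight into any output range, so no intermediate `string[]` is built.

```d
import std.format : formattedWrite;

// Hypothetical helper: formats each record straight into `sink`,
// which can be any output range of characters.
void writeRecords(Sink)(ref Sink sink, const int[][int][] records)
{
    foreach (record; records)
        foreach (id, values; record)
            // Assumption: a comma separates the id from its values.
            sink.formattedWrite!"%s,%(%s,%)\n"(id, values);
}

void main()
{
    import std.array : appender;
    import std.stdio : write;

    int[][int][] records = [[999000001: [0, 1, 1]], [999000002: [1, 1, 0]]];

    auto buf = appender!string;
    writeRecords(buf, records);
    write(buf[]);
}
```

To target a file instead, the same function could be handed `File(fname, "w").lockingTextWriter()` stored in a local variable; the file's own buffering then takes care of batching the writes.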
On Friday, 21 January 2022 at 02:30:35 UTC, Ali Çehreli wrote:
> The bigger question is, why did 'formattedRecords' exist at all? You could have written the output directly to the file.

Oh. this was intentional, as I wanted to write once, and only once, to the file.

The consequence of that decision of course, is the extra memory allocations...

But in my example code I only create 10 records. In reality, my dataset will have 100,000s of records, so I don't want to write 100,000s of times to the same file.

> But even *worse* and with apologies, ;) here is something crazy that achieves the same thing:
>
> void ProcessRecords
> (in int[][int][] recArray, const(string) fname)
> {
>     import std.algorithm : joiner;
>     auto toWrite = recArray.map!(e => e.byPair);
>     File("rnd_records.txt", "w").writefln!"%(%(%(%s,%(%s,%)%)%)\n%)"(toWrite);
> }
>
> I've done lots of trial and error for the required number of nested %( %) pairs. Phew...
>
> Ali

Yes, that does look worse ;-)

But I'm looking into that code to see if I can salvage something from it ;-)
Jan 20 2022
On Friday, 21 January 2022 at 03:45:08 UTC, forkit wrote:

> On Friday, 21 January 2022 at 02:30:35 UTC, Ali Çehreli wrote:
>> The bigger question is, why did 'formattedRecords' exist at all? You could have written the output directly to the file.
>
> Oh. this was intentional, as I wanted to write once, and only once, to the file.

oops. Looking back at that code, it seems I didn't write what I intended :-(

I might have to use a kind of stringbuilder instead, then write a massive string once to the file.
Jan 20 2022
On Fri, Jan 21, 2022 at 03:50:37AM +0000, forkit via Digitalmars-d-learn wrote:
[...]
> I might have to use a kind of stringbuilder instead, then write a massive string once to the file.
[...]

std.array.appender is your friend.

T

-- 
Meat: euphemism for dead animal. -- Flora
Jan 20 2022
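A short sketch of the appender approach suggested above, under stated assumptions: `buildOutput` is a hypothetical helper, and the 32-characters-per-record estimate is a guess. Note that `reserve` on an `Appender!string` counts elements, which for a string means characters rather than records.

```d
import std.array : appender;
import std.format : formattedWrite;

// Hypothetical helper (an assumption, not from the thread): collects all
// records in one growing buffer and returns the finished string.
string buildOutput(const int[] ids, const int[][] values)
{
    auto buf = appender!string();

    // reserve() takes a number of characters here; 32 per record is only
    // a rough guess to cut down on reallocations.
    buf.reserve(ids.length * 32);

    foreach (i, id; ids)
        formattedWrite!"%s,%(%s,%)\n"(buf, id, values[i]);

    return buf.data;
}

void main()
{
    auto text = buildOutput([1, 2], [[0, 1], [1, 1]]);
    assert(text == "1,0,1\n2,1,1\n");
}
```

The returned string can then be handed to a single `file.write(text)` call.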
On Friday, 21 January 2022 at 03:57:01 UTC, H. S. Teoh wrote:

> std.array.appender is your friend.
>
> T

:-)

// --
void ProcessRecords (in int[][int][] recArray, const(string) fname)
{
    auto file = File(fname, "w");
    scope(exit) file.close;

    Appender!string bigString = appender!string;
    bigString.reserve(recArray.length);
    debug { writefln("bigString.capacity is %s", bigString.capacity); }

    void processRecord(const(int) id, const(int)[] values)
    {
        bigString ~= id.to!string ~ values.format!"%(%s,%)" ~ "\n";
    }

    foreach(ref const record; recArray)
    {
        foreach (ref rp; record.byPair)
        {
            processRecord(rp.expand);
        }
    }

    file.write(bigString[]);
}
// ---
Jan 20 2022
On Friday, 21 January 2022 at 04:08:33 UTC, forkit wrote:

> // --
> void ProcessRecords (in int[][int][] recArray, const(string) fname)
> {
>     auto file = File(fname, "w");
>     scope(exit) file.close;
>
>     Appender!string bigString = appender!string;
>     bigString.reserve(recArray.length);
>     debug { writefln("bigString.capacity is %s", bigString.capacity); }
>
>     void processRecord(const(int) id, const(int)[] values)
>     {
>         bigString ~= id.to!string ~ values.format!"%(%s,%)" ~ "\n";
>     }
>
>     foreach(ref const record; recArray)
>     {
>         foreach (ref rp; record.byPair)
>         {
>             processRecord(rp.expand);
>         }
>     }
>
>     file.write(bigString[]);
> }
> // ---

Actually, something is not right with Appender, I think...

100_000 records took 20 sec (ok).

1_000_000 records never finished - after 1 hr 45 min I cancelled the process.

??
Jan 20 2022
On Friday, 21 January 2022 at 03:50:37 UTC, forkit wrote:

> I might have to use a kind of stringbuilder instead, then write a massive string once to the file.

You're using writeln, which goes through C I/O buffered writes. Whether you make one call or several is of little consequence - you're limited by buffer size and options.
Jan 21 2022
On Friday, 21 January 2022 at 08:53:26 UTC, Stanislav Blinov wrote:

Turns out the problem has nothing to do with appender...

It's actually this line:

if (!idArray.canFind(x)):

When I comment this out in the function below, the program does what I want in seconds.

The only problem is, the ids are no longer unique (in the file).

// ---
void createUniqueIDArray (ref int[] idArray, const(int) recordsNeeded)
{
    idArray.reserve(recordsNeeded);
    debug { writefln("idArray.capacity is %s", idArray.capacity); }

    int i = 0;
    int x;
    while(i != recordsNeeded)
    {
        // id needs to be 9 digits, and needs to start with 999
        x = uniform(999*10^^6, 10^^9); // thanks Stanislav

        // ensure every id added is unique.
        //if (!idArray.canFind(x))
        //{
            idArray ~= x; // NOTE: does NOT appear to register with -profile=gc
            i++;
        //}
    }

    debug { writefln("idArray.length = %s", idArray.length); }
}
// ----
Jan 21 2022
On Friday, 21 January 2022 at 09:10:56 UTC, forkit wrote:

ok... in the interest of correcting the code I posted previously...

... here is a version that actually works in secs (for a million records), as opposed to hours!

// ---------------

/+
=====================================================================
This program creates a sample dataset consisting of 'random' records,
and then outputs that dataset to a file.

Arguments can be passed on the command line,
or otherwise default values are used instead.

Example of that output can be seen at the end of this code.
=====================================================================
+/

module test;
@safe:

import std.stdio : write, writef, writeln, writefln;
import std.range : iota, takeExactly;
import std.array : array, byPair, Appender, appender;
import std.random : Random, unpredictableSeed, dice, choice, uniform;
import std.algorithm : map, uniq, canFind, among;
import std.conv : to;
import std.format;
import std.stdio : File;
import std.file : exists;
import std.exception : enforce;

debug { import std; }

Random rnd;
static this() { rnd = Random(unpredictableSeed); } // thanks Ali

void main(string[] args)
{
    int recordsNeeded, valuesPerRecord;
    string fname;

    if(args.length < 4)
    {
        //recordsNeeded = 1_000_000;
        //recordsNeeded = 100_000;
        recordsNeeded = 10;

        valuesPerRecord= 8;

        //fname = "D:/rnd_records.txt";
        fname = "./rnd_records.txt";
    }
    else
    {
        // assumes valid values being passed in ;-)
        recordsNeeded = to!int(args[1]);
        valuesPerRecord = to!int(args[2]);
        fname = args[3];
    }

    debug
        { writefln("%s records, %s values for record, will be written to file: %s", recordsNeeded, valuesPerRecord, fname); }
    else
        { enforce(!exists(fname), "Oop! That file already exists!"); }

    // id needs to be 9 digits, and needs to start with 999
    int[] idArray = takeExactly(iota(999*10^^6, 10^^9), recordsNeeded).array;
    debug { writefln("idArray.length = %s", idArray.length); }

    int[][] valuesArray;
    createValuesArray(valuesArray, recordsNeeded, valuesPerRecord);

    int[][int][] records = CreateDataSet(idArray, valuesArray, recordsNeeded);

    ProcessRecords(records, fname);

    writefln("All done. Check if records written to %s", fname);
}

void createValuesArray
(ref int[][] valuesArray, const(int) recordsNeeded, const(int) valuesPerRecord)
{
    valuesArray = iota(recordsNeeded)
        .map!(i => iota(valuesPerRecord)
        .map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4))
        .array).array; // NOTE: does register with -profile=gc

    debug { writefln("valuesArray.length = %s", valuesArray.length); }
}

int[][int][] CreateDataSet
(const(int)[] idArray, int[][] valuesArray, const(int) numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);
    debug { writefln("records.capacity is %s", records.capacity); }

    foreach(i, const id; idArray)
    {
        // NOTE: below does register with -profile=gc
        records ~= [ idArray[i] : valuesArray[i] ];
    }

    debug { writefln("records.length = %s", records.length); }

    return records.dup;
}

void ProcessRecords
(in int[][int][] recArray, const(string) fname)
{
    auto file = File(fname, "w");
    scope(exit) file.close;

    Appender!string bigString = appender!string;
    bigString.reserve(recArray.length);
    debug { writefln("bigString.capacity is %s", bigString.capacity); }

    // NOTE: forward declaration required for this nested function
    void processRecord(const(int) id, const(int)[] values)
    {
        // NOTE: below does register with -profile=gc
        bigString ~= id.to!string ~ "," ~ values.format!"%(%s,%)" ~ "\n";
    }

    foreach(ref const record; recArray)
    {
        foreach (ref rp; record.byPair)
        {
            processRecord(rp.expand);
        }
    }

    file.write(bigString[]);
}

/+
sample file output:

9992511730,1,0,1,0,1,0,1
9995369731,1,1,1,1,1,1,1
9993136031,1,0,0,0,1,0,0
9998979051,1,1,1,1,0,1,1
9998438090,1,1,0,1,1,0,0
9995132750,0,0,1,0,1,1,1
9997123630,0,1,1,1,0,1,1
9998351590,1,0,0,1,1,1,1
9991454121,1,1,1,1,1,0,1
9997673520,1,1,1,1,1,1,1
+/

// ---------------
Jan 21 2022
On Fri, Jan 21, 2022 at 10:12:42AM +0000, forkit via Digitalmars-d-learn wrote:
[...]
> Random rnd;
> static this() { rnd = Random(unpredictableSeed); } // thanks Ali

Actually you don't even need to do this, unless you want precise control over the initialization of your RNG. If you don't specify the RNG parameter in the calls to std.random functions, they will use the default RNG, which is already initialized with unpredictableSeed.

[...]
> // id needs to be 9 digits, and needs to start with 999
> int[] idArray = takeExactly(iota(999*10^^6, 10^^9), recordsNeeded).array;
[...]

This is wasteful if you're not planning to use every ID in this million-entry long array. Much better to just use an AA to keep track of which IDs have already been generated instead.

Of course, if you plan to use most of the array, then the AA may wind up using more memory than the array. So it depends on your use case.

T

-- 
Never wrestle a pig. You both get covered in mud, and the pig likes it.
Jan 21 2022
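To illustrate the point about the default RNG, here is a small sketch (the helper names `randomId` and `seededBits` are assumptions for illustration). Calls without an RNG argument use std.random's default generator; passing an explicit, fixed-seed `Random` is only needed when the sequence has to be controlled or repeatable.

```d
import std.random : Random, dice, uniform;

// Hypothetical helper: no RNG argument is given, so std.random falls back
// to its default generator.
int randomId()
{
    // 9-digit id starting with 999; the upper bound is exclusive
    return uniform(999_000_000, 1_000_000_000);
}

// Hypothetical helper: an explicit RNG with a fixed seed gives a
// repeatable sequence, which is handy for tests.
size_t[] seededBits(uint seed, size_t n)
{
    auto rnd = Random(seed);
    size_t[] bits;
    foreach (_; 0 .. n)
        bits ~= dice(rnd, 0.6, 1.4); // returns index 0 or 1
    return bits;
}

void main()
{
    immutable id = randomId();
    assert(id >= 999_000_000 && id < 1_000_000_000);

    // Same seed, same sequence
    assert(seededBits(42, 20) == seededBits(42, 20));
}
```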
On 1/21/22 1:36 PM, H. S. Teoh wrote:

> On Fri, Jan 21, 2022 at 10:12:42AM +0000, forkit via Digitalmars-d-learn wrote:
> [...]
>> // id needs to be 9 digits, and needs to start with 999
>> int[] idArray = takeExactly(iota(999*10^^6, 10^^9), recordsNeeded).array;
> [...]
>
> This is wasteful if you're not planning to use every ID in this million-entry long array. Much better to just use an AA to keep track of which IDs have already been generated instead.
>
> Of course, if you plan to use most of the array, then the AA may wind up using more memory than the array. So it depends on your use case.

Yeah, iota is a random-access range, so you can just pass it directly, and not allocate anything.

Looking at the usage, it doesn't need to be an array at all. But modifying the code to properly accept the range might prove difficult for someone not used to it.

-Steve
Jan 21 2022
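One way the "accept the range directly" idea could look, as a sketch rather than the thread's code: `createDataSet` here is a hypothetical templated variant that takes any input range of ints, so `iota` can be passed without first calling `.array`. It assumes `values` has at least as many rows as there are ids.

```d
import std.range : iota;
import std.range.primitives : ElementType, isInputRange;

// Hypothetical variant of CreateDataSet (an assumption, not from the
// thread): the ids come from any input range whose elements convert to int.
int[][int][] createDataSet(R)(R ids, int[][] values)
if (isInputRange!R && is(ElementType!R : int))
{
    int[][int][] records;
    records.reserve(values.length);

    size_t i;
    foreach (id; ids)
        records ~= [id : values[i++]]; // one single-entry AA per record

    return records;
}

void main()
{
    // iota is consumed lazily; no id array is allocated.
    auto records = createDataSet(iota(5, 8), [[1], [2], [3]]);
    assert(records.length == 3);
    assert(records[0][5] == [1]);
    assert(records[2][7] == [3]);
}
```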
On Friday, 21 January 2022 at 18:50:46 UTC, Steven Schveighoffer wrote:

> Yeah, iota is a random-access range, so you can just pass it directly, and not allocate anything.
>
> Looking at the usage, it doesn't need to be an array at all. But modifying the code to properly accept the range might prove difficult for someone not used to it.
>
> -Steve

Thanks. That makes more sense actually ;-)

Now I can get rid of the idArray completely, and just do:

foreach(i, id; enumerate(iota(iotaStartNum, iotaStartNum + recordsNeeded)))
{
    records ~= [ id: valuesArray[i] ];
}
Jan 21 2022
On Friday, 21 January 2022 at 21:01:11 UTC, forkit wrote:

even better, I got rid of all those unnecessary arrays ;-)

// ---
int[][int][] CreateDataSet
(const(int) recordsNeeded, const(int)valuesPerRecord)
{
    int[][int][] records;
    records.reserve(recordsNeeded);

    foreach(i, id; iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate)
    {
        records ~= [ id: iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array ];
    }

    return records.dup;
}
// ---
Jan 21 2022
On Friday, 21 January 2022 at 21:43:38 UTC, forkit wrote:

oops... should be:

// ---
int[][int][] CreateDataSet
(const(int) recordsNeeded, const(int)valuesPerRecord)
{
    int[][int][] records;
    records.reserve(recordsNeeded);
    const int iotaStartNum = 100_000_001;

    foreach(i, id; iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate)
    {
        records ~= [ id: iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array ];
    }

    return records.dup;
}
// ---
Jan 21 2022
On Fri, Jan 21, 2022 at 09:43:38PM +0000, forkit via Digitalmars-d-learn wrote:

> On Friday, 21 January 2022 at 21:01:11 UTC, forkit wrote:
>
> even better, I got rid of all those unnecessary arrays ;-)
>
> // ---
> int[][int][] CreateDataSet
> (const(int) recordsNeeded, const(int)valuesPerRecord)
> {
>     int[][int][] records;
>     records.reserve(recordsNeeded);
>
>     foreach(i, id; iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate)
>     {
>         records ~= [ id: iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array ];
>     }
>
>     return records.dup;
[...]

What's the point of calling .dup here? The only reference to records is going out of scope, so why can't you just return it? The .dup is just creating extra work for nothing.

T

-- 
Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn
Jan 21 2022
On Friday, 21 January 2022 at 21:56:33 UTC, H. S. Teoh wrote:

> What's the point of calling .dup here? The only reference to records is going out of scope, so why can't you just return it? The .dup is just creating extra work for nothing.
>
> T

Good pickup. Thanks ;-)

// ----

module test;
@safe:

import std.stdio : write, writef, writeln, writefln;
import std.range : iota, enumerate;
import std.array : array, byPair, Appender, appender;
import std.random : Random, unpredictableSeed, dice, randomCover;
import std.algorithm : map;
import std.conv : to;
import std.format;
import std.stdio : File;
import std.file : exists;
import std.exception : enforce;

debug { import std; }

Random rnd;
static this() { rnd = Random(unpredictableSeed); }

void main(string[] args)
{
    int recordsNeeded, valuesPerRecord;
    string fname;

    if(args.length < 4)
    {
        recordsNeeded = 10; // default
        valuesPerRecord= 8; // default

        fname = "D:/rnd_records.txt"; // default
        //fname = "./rnd_records.txt"; // default
    }
    else
    {
        // assumes valid values being passed in ;-)
        recordsNeeded = to!int(args[1]);
        valuesPerRecord = to!int(args[2]);
        fname = args[3];
    }

    debug
        { writefln("%s records, %s values for record, will be written to file: %s", recordsNeeded, valuesPerRecord, fname); }
    else
    {
        enforce(!exists(fname), "Oop! That file already exists!");
        enforce(recordsNeeded <= 1_000_000_000, "C'mon! That's too many records!");
    }

    int[][int][] records = CreateDataSet(recordsNeeded, valuesPerRecord);

    ProcessDataSet(records, fname);

    writefln("All done. Check if records written to %s", fname);
}

int[][int][] CreateDataSet
(const(int) recordsNeeded, const(int) valuesPerRecord)
{
    const int iotaStartNum = 100_000_001;

    int[][int][] records;
    records.reserve(recordsNeeded);
    debug { writefln("records.capacity is %s", records.capacity); }

    foreach(i, id; iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate)
    {
        // NOTE: below does register with -profile=gc
        records ~= [ id: iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array ];
    }

    debug { writefln("records.length = %s", records.length); }

    return records;
}

// this creates a big string of 'formatted' records, and outputs that string to a file.
void ProcessDataSet
(in int[][int][] records, const(string) fname)
{
    auto file = File(fname, "w");
    scope(exit) file.close;

    Appender!string bigString = appender!string;
    bigString.reserve(records.length);
    debug { writefln("bigString.capacity is %s", bigString.capacity); }

    // NOTE: forward declaration required for this nested function
    void processRecord(const(int) id, const(int)[] values)
    {
        bigString ~= id.to!string ~ "," ~ values.format!"%(%s,%)" ~ "\n";
    }

    foreach(ref const record; records)
    {
        foreach (ref rp; record.byPair)
        {
            processRecord(rp.expand);
        }
    }

    debug { writeln; writeln(bigString[].until("\n")); writeln; } // display just one record

    file.write(bigString[]);
}

// ----
Jan 21 2022
On Friday, 21 January 2022 at 22:25:32 UTC, forkit wrote:

I really like how alias and mixin can simplify my code even further:

//---
int[][int][] CreateDataSet
(const(int) recordsNeeded, const(int) valuesPerRecord)
{
    int[][int][] records;
    records.reserve(recordsNeeded);

    const int iotaStartNum = 100_000_001;
    alias iotaValues = Alias!"iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate";
    alias recordValues = Alias!"iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array";

    foreach(i, id; mixin(iotaValues))
    {
        records ~= [ id: mixin(recordValues) ];
    }

    return records;
}
//---
Jan 21 2022
On 1/21/22 6:24 PM, forkit wrote:

> On Friday, 21 January 2022 at 22:25:32 UTC, forkit wrote:
>
> I really like how alias and mixin can simplify my code even further:
>
> //---
> int[][int][] CreateDataSet
> (const(int) recordsNeeded, const(int) valuesPerRecord)
> {
>     int[][int][] records;
>     records.reserve(recordsNeeded);
>
>     const int iotaStartNum = 100_000_001;
>     alias iotaValues = Alias!"iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate";
>     alias recordValues = Alias!"iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array";

oof! use enums for compile-time strings ;)

```d
enum iotaValues = "iota(...";
```

>     foreach(i, id; mixin(iotaValues))
>     {
>         records ~= [ id: mixin(recordValues) ];
>     }
>
>     return records;
> }

Not sure I agree that the mixin looks better.

Also, I'm curious about this code:

```d
iota(valuesPerRecord).map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).array;
```

That second `valuesPerRecord` is not used in the lambda, and also it's not referring to the original element, it's the name of a parameter in the lambda.

Are you sure this is doing what you want?

-Steve
Jan 21 2022
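A compact sketch of both remarks above, using made-up helper names (`squares`, `sevens`) rather than anything from the thread: an `enum` string is a compile-time constant that `mixin` can consume, and a lambda parameter is just a name for the current element, whether or not the body uses it.

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical example: the enum holds D source text, which mixin()
// compiles in place as an expression.
int[] squares(int n)
{
    enum expr = "iota(n).map!(i => i * i).array";
    return mixin(expr);
}

// Hypothetical example: the lambda parameter `_` names each element of
// iota(n) but is ignored, so every element maps to the same value.
int[] sevens(int n)
{
    return iota(n).map!(_ => 7).array;
}

void main()
{
    assert(squares(4) == [0, 1, 4, 9]);
    assert(sevens(3) == [7, 7, 7]);
}
```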
On Saturday, 22 January 2022 at 01:33:16 UTC, Steven Schveighoffer wrote:

So I was watching this video by Andrei:

https://www.youtube.com/watch?v=mCrVYYlFTrA

In it, he talked about writing the simplest design that could possibly work....

Which got me thinking....

// ----

module test;
@safe:

import std.stdio : write, writef, writeln, writefln;
import std.range : iota, enumerate;
import std.array : array, byPair, Appender, appender;
import std.random : Random, unpredictableSeed, dice, randomCover;
import std.algorithm : map;
import std.conv : to;
import std.format;
import std.stdio : File;
import std.file : exists;
import std.exception : enforce;
import std.meta : Alias;

debug { import std; }

Random rnd;
static this() { rnd = Random(unpredictableSeed); }

void main(string[] args)
{
    int recordsNeeded, valuesPerRecord;
    string fname;

    if(args.length < 4) // then set defaults
    {
        recordsNeeded = 10;
        valuesPerRecord= 8;

        version(Windows) { fname = "D:/rnd_records.txt"; }
        version(linux) { fname = "./rnd_records.txt"; }
    }
    else
    {
        // assumes valid values being passed in ;-)
        recordsNeeded = to!int(args[1]);
        valuesPerRecord = to!int(args[2]);
        fname = args[3];
    }

    debug
        { writefln("%s records (where a record is: id and %s values), will be written to file: %s", recordsNeeded, valuesPerRecord, fname); }
    else
    {
        enforce(!exists(fname), "Oops! That file already exists!");
        enforce(recordsNeeded <= 1_000_000_000, "C'mon! That's too many records!");
    }

    CreateDataFile(recordsNeeded, valuesPerRecord, fname);

    writefln("All done. Check if records written to %s", fname);
}

void CreateDataFile(const(int) recordsNeeded, const(int) valuesPerRecord, const(string) fname)
{
    auto file = File(fname, "w");
    scope(exit) file.close;

    Appender!string bigString = appender!string;
    bigString.reserve(recordsNeeded);

    const int iotaStartNum = 100_000_001;

    foreach(i, id; iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate)
    {
        bigString
            ~= id.to!string
            ~ ","
            ~ valuesPerRecord.iota.map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4)).format!"%(%s,%)"
            ~ "\n";
    }

    file.write(bigString[]);
}

// ----
Jan 21 2022
On Saturday, 22 January 2022 at 01:33:16 UTC, Steven Schveighoffer wrote:

> That second `valuesPerRecord` is not used in the lambda, and also it's not referring to the original element, it's the name of a parameter in the lambda.
>
> Are you sure this is doing what you want?
>
> -Steve

It just worked, so I didn't think about it too much... but it seems to work either way.

And to be honest, the only part of it I understand is the dice part ;-)

In any case I changed it:

from: valuesPerRecord =>
to: i =>

// ----

void CreateDataFile(const(int) recordsNeeded, const(int) valuesPerRecord, const(string) fname)
{
    auto rnd = Random(unpredictableSeed);

    auto file = File(fname, "w");
    scope(exit) file.close;

    Appender!string bigString = appender!string;
    bigString.reserve(recordsNeeded);

    const int iotaStartNum = 100_000_001;

    foreach(i, id; iota(iotaStartNum, iotaStartNum + recordsNeeded).enumerate)
    {
        bigString
            ~= id.to!string
            ~ ","
            ~ valuesPerRecord.iota.map!(i => rnd.dice(0.6, 1.4)).format!"%(%s,%)"
            ~ "\n";
    }

    file.write(bigString[]);
}

// ---
Jan 22 2022
On Friday, 21 January 2022 at 18:36:42 UTC, H. S. Teoh wrote:

> This is wasteful if you're not planning to use every ID in this million-entry long array. Much better to just use an AA to keep track of which IDs have already been generated instead.
>
> Of course, if you plan to use most of the array, then the AA may wind up using more memory than the array. So it depends on your use case.
>
> T

Yes, I was thinking this over as I was waking up this morning, and thought... what the hell am I doing generating all those numbers that might never get used.

Better to do:

const int iotaStartNum = 100_000_000;
int[] idArray = iota(iotaStartNum, iotaStartNum + recordsNeeded).array;
Jan 21 2022
On Fri, Jan 21, 2022 at 09:10:56AM +0000, forkit via Digitalmars-d-learn wrote:
[...]
> turns out the problem has nothing to do with appender...
>
> It's actually this line:
>
> if (!idArray.canFind(x)):
>
> when i comment this out in the function below, the program does what I want in seconds.
>
> only problem is, the ids are no longer unique (in the file)
[...]

Ah yes, the good ole O(N²) trap that new programmers often fall into. :-)

Using .canFind on an array of generated IDs means scanning the entire array every time you find a non-colliding ID. As the array grows, the cost of doing this increases. The overall effect is O(N²) time complexity, because you're continually scanning the array every time you generate a new ID.

Use an AA instead, and performance should dramatically increase. I.e., instead of:

	size_t[] idArray;
	...
	if (!idArray.canFind(x)): // O(N) cost to scan array

write:

	bool[size_t] idAA;
	...
	if (x in idAA) ... // O(1) cost to look up an ID

T

-- 
VI = Visual Irritation
Jan 21 2022
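The AA lookup described above could be sketched as follows. `uniqueIds` is a hypothetical helper, not code from the thread; it keeps a `bool[int]` of ids already seen, so each uniqueness check is an average O(1) lookup instead of an O(N) scan with `canFind`.

```d
import std.random : Random, uniform;

// Hypothetical helper (an assumption for illustration): generates `count`
// distinct 9-digit ids starting with 999.
int[] uniqueIds(size_t count, ref Random rnd)
{
    bool[int] seen; // ids generated so far
    int[] ids;
    ids.reserve(count);

    while (ids.length < count)
    {
        immutable x = uniform(999_000_000, 1_000_000_000, rnd);

        if (x in seen)      // average O(1) lookup
            continue;       // collision: draw again

        seen[x] = true;
        ids ~= x;
    }
    return ids;
}

void main()
{
    auto rnd = Random(1); // fixed seed only to make the run repeatable
    auto ids = uniqueIds(1_000, rnd);
    assert(ids.length == 1_000);
}
```

The range holds only one million possible ids, so `count` has to stay at or below that or the loop never finishes; close to that limit, collisions also make it slow.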
On 1/20/22 6:01 PM, forkit wrote:

> On Thursday, 20 January 2022 at 22:31:17 UTC, Steven Schveighoffer wrote:
>> Because it would allow altering const data.
>
> I'm not sure I understand. At what point in this function is valuesArray modified, and thus preventing it being passed in with const?

The compiler rules aren't enforced based on what code you wrote; it doesn't have the capability of proving that your code doesn't modify things.

Instead, it enforces simple rules that allow it to prove that const data cannot be modified.

I'll make it into a simpler example:

```d
const int[] arr = [1, 2, 3, 4, 5];
int[] arr2 = arr;
```

This code does not modify any data in arr. But that in itself isn't easy to prove. In order to ensure that arr is never modified, the compiler would have to analyze all the code, and every possible way that arr2 might escape or be used somewhere at some point to modify the data. It doesn't have the capability or time to do that (if I understand correctly, this is NP-hard).

Instead, it just says, you can't convert references from const to mutable without a cast. That guarantees that you can't modify const data. However, it does rule out a certain class of code that might not modify the const data, even if it has the opportunity to.

It's like saying, "we don't let babies play with sharp knives" vs. "we will let babies play with sharp knives but stop them just before they stab themselves."

-Steve
Jan 20 2022
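A small sketch of the rule described above, with a hypothetical helper `mutableCopy`: the implicit const-to-mutable conversion is rejected at compile time, a const view is always allowed, and `.dup` gives a separate mutable copy when one is really needed.

```d
// Hypothetical helper (an assumption for illustration): .dup allocates an
// independent mutable copy, so the const original stays untouched.
int[] mutableCopy(const int[] data)
{
    return data.dup;
}

void main()
{
    const int[] arr = [1, 2, 3, 4, 5];

    // The implicit conversion from const to mutable does not compile.
    static assert(!__traits(compiles, { int[] arr2 = arr; }));

    // Reading through a const view is fine.
    const(int)[] view = arr;
    assert(view.length == 5);

    // Changes to the copy do not affect the original.
    int[] copy = mutableCopy(arr);
    copy[0] = 99;
    assert(arr[0] == 1 && copy[0] == 99);
}
```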