digitalmars.D - dconf dod parser is 20% slower theory
monkyyy (Jul 14):

https://www.youtube.com/watch?v=s_1OG9GwyOw
https://github.com/burner/Darser/blob/dbe6269aba4abce3b0c6ac07d00da6c86ee7c758/presentation/example.d#L150

```d
struct Parser {
    Document[] documents;
    Definitions[] definitionss;
    Definition[] definitions;
    OperationDefinition[] operationDefinitions;
    SelectionSet[] selectionSets;
    OperationType[] operationTypes;
    Selections[] selectionss;
    Selection[] selections;
    FragmentSpread[] fragmentSpreads;
    InlineFragment[] inlineFragments;
    Field[] fields;
    FieldName[] fieldNames;
    Arguments[] argumentss;
    ArgumentList[] argumentLists;
    Argument[] arguments;
    FragmentDefinition[] fragmentDefinitions;
    Directives[] directivess;
    Directive[] directives;
    VariableDefinitions[] variableDefinitionss;
    VariableDefinitionList[] variableDefinitionLists;
    VariableDefinition[] variableDefinitions;
    Variable[] variables;
    DefaultValue[] defaultValues;
    ValueOrVariable[] valueOrVariables;
    Value[] values;
    Type[] types;
}
```

This isn't quite right for SoA; this is adding an extra layer of indirection. If you have some n^2 iteration, let's say:

```d
foreach(i; 0 .. selections.length){
    foreach(j; 0 .. fields.length){
        ref selection() => selections[i];
        ref field() => fields[j];
        foo(selection, field);
    }
}
```

the pointer math isn't being automagically done at *compile time* by the optimizer in the way this would:

```d
struct Parser(int length){
    Selection[length] selections;
    Field[length] fields;
    ...
}
```

(the slice version can't be fully optimized at compile time because `selections[3].offset - fields[3].offset` changes at runtime). In the original code there's extra indirection, a bounds check that may not disappear (because `*.length == all other *.length` is not guaranteed), etc.

I've given up on AoSoA, it probably needs compiler support, but that's the theory for a reason; I think at the lib level we should have a different target for making some bread-and-butter DOD patterns.
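A minimal sketch of the fixed-length, single-struct SoA layout the post above argues for; `Selection`, `Field`, `SoaParser`, `capacity`, and `foo` are placeholder names invented for illustration, not the actual Darser types, and this is just one way the idea could look:

```d
// Placeholder node types standing in for the real parser structs.
struct Selection { int target; }
struct Field     { int name; }

struct SoaParser(int capacity)
{
    // All columns share one compile-time capacity, so the offset between
    // selections[i] and fields[i] is a compile-time constant within the struct.
    Selection[capacity] selections;
    Field[capacity]     fields;
    size_t              used; // one length for every column

    void process()
    {
        // One bound (`used`) covers every column; there are no per-slice
        // .ptr/.length loads into separately allocated blocks.
        foreach (i; 0 .. used)
            foreach (j; 0 .. used)
                foo(selections[i], fields[j]);
    }
}

void foo(ref Selection s, ref Field f) { /* placeholder work */ }

void main()
{
    SoaParser!100 p;  // one flat block on the stack, no GC allocation
    p.used = 3;
    p.process();
}
```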
Robert Schadek (Jul 18):

On Monday, 14 July 2025 at 08:01:42 UTC, monkyyy wrote:

Thank you for taking an interest. I thought about this a bit more, and I think one issue is also that stuff is loaded one L1 cache line at a time, and the .sizeof of each struct is a lot smaller than one L1 cache line, leading to waste. The parse tree being a tree will likely require quite a bit of index/pointer chasing, resulting in more loads.

That being said, what would be nice to try is to turn this:

```d
struct Parser {
    Document[] documents;
    Definitions[] definitionss;
    Definition[] definitions;
    OperationDefinition[] operationDefinitions;
    SelectionSet[] selectionSets;
}
```

into

```d
alias Nodes = SumType!(Document
    , Definitions
    , Definition
    , OperationDefinition
    , SelectionSet);

struct Parser {
    Nodes[] nodes;
}
```

The problem then is that the tree turned into an array should be accessed basically front to back to minimize cache misses. That might be difficult as it's a tree, but hey, there is always something.

If you're interested, the parser generator branch that gives the individual structs is here: https://github.com/burner/Darser/tree/ddd_parser
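A minimal sketch of the single-array `SumType` layout suggested above, using `std.sumtype`; `Document`, `Definition`, and their fields are placeholders rather than the real Darser node types:

```d
import std.stdio;
import std.sumtype;

struct Document   { int firstChild; } // child links as indices into `nodes`
struct Definition { int name; }

alias Node = SumType!(Document, Definition);

struct Parser
{
    Node[] nodes; // one contiguous allocation for every kind of node
}

void main()
{
    Parser p;
    p.nodes = [Node(Document(1)), Node(Definition(42))];

    // Walking the array front to back keeps the accesses sequential,
    // which is the cache-friendly order mentioned in the post.
    foreach (ref n; p.nodes)
        n.match!(
            (Document d)   => writeln("document, first child at ", d.firstChild),
            (Definition d) => writeln("definition ", d.name)
        );
}
```

Because every node lives in one `Node[]`, child links can be stored as indices into the same array, which is what makes a front-to-back traversal order possible at all.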
monkyyy (Jul 18):

On Friday, 18 July 2025 at 08:53:23 UTC, Robert Schadek wrote:
> [...] SumType [...]
> struct Parser { Nodes[] nodes; [...]

I'm still seeing an extra layer of indirection.

Consider `int[100]`, `int[]` and `int[int]`, and the data 1->3, 3->2, 2->4, 4->(-1).

The ideal is `int[100]`: it will allocate either on the stack or in global scope as fast as possible, indexing it should always be inlinable, and I trust even the dumbest prefetcher to see "oh, a 1 wants to see [3]".

`int[int]` is far from ideal, but maybe the prefetcher decides to untangle the hash, and I imagine you used it in places for your non-data-oriented way.

`int[]` has 3 pieces that need to be calculated at runtime: `int[].ptr`, `int[].length` and `int[3]`. a) The prefetcher has to think that "3 is probably in that"; b) the asm has to keep an `int[].ptr` register around, taking up some cache line; c) the prefetcher has to be able to combine `(int[].ptr + 3*int.sizeof)`, which may as well be the hash of the hashtable, that's math with several steps; d) then a context swap to druntime's `assert(3 < int[].length)`; e) then swap back.

---

`int[]` is fast enough *as a range* because it only takes 2 steps to find `[0]`, and then it's ++ing while checking a length, an ancient C for loop; the AAs are likely better for trees unless you go all the way.

---

You have a big old block of memory; put it in a big static array or inline the `void*` math for it to be data-oriented design.

I'm struggling with getting it to work with, you know, none of my DIPs ever being accepted, especially none of the metaprogramming ones (my current best solution is OpenD-dependent); but I think you'd need a pattern of structs that accept pointer abstractions from an array:

```d
struct foo(P){
    ...
    P left;
    P right;
}
struct array(alias T, int N){
    alias P = ...
    T!P[N] data;
    ...
}
array!(foo, 100) bar;
```
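One way the "pointer abstractions from an array" pattern sketched above could be filled in, assuming indices play the role of pointers; the names `Tree`, `Pool`, and `Handle` are invented for illustration, and this is not a proposed library API:

```d
// A node type parameterized on its "pointer" abstraction.
struct Tree(P)
{
    int value;
    P left;
    P right;
}

struct Pool(alias T, int N)
{
    // A "pointer" is just an index into the pool's static array.
    struct Handle
    {
        int index = -1;                      // -1 plays the role of null
        bool isNull() const { return index < 0; }
    }

    alias P = Handle;
    T!P[N] data;                             // one flat block, stack or static storage
    int used;

    P alloc() { return P(used++); }          // no capacity check; sketch only

    ref T!P deref(P p) { return data[p.index]; }
}

void main()
{
    Pool!(Tree, 100) pool;                   // 100 nodes, no heap allocation

    auto root = pool.alloc();
    auto kid  = pool.alloc();
    pool.deref(root).value = 1;
    pool.deref(root).left  = kid;
    pool.deref(kid).value  = 2;

    assert(pool.deref(pool.deref(root).left).value == 2);
}
```

Handles are plain ints, so nodes stay small, the whole pool is one block of memory, and "dereferencing" compiles down to base plus index times size, which is the inlinable indexing the post asks for.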