digitalmars.D - Need for (C++20) Contiguous Range

Per =?UTF-8?B?Tm9yZGzDtnc=?= (11/11) Oct 08 2020 C++20 defines

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (3/15) Oct 08 2020 Optimizations that require linear memory access, like SIMD. I

Per =?UTF-8?B?Tm9yZGzDtnc=?= (3/5) Oct 08 2020 How could `isContiguousRange` be defined in D?

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (3/8) Oct 08 2020 It would probably require a new range interface where front and

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/15) Oct 08 2020 Or a method slice() that return the container as a slice? In c++

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (9/12) Oct 08 2020 To elaborate, c++ containers have a member data() that points to

Steven Schveighoffer (6/17) Oct 08 2020 We have arrays in D, there is no need to duplicate what C++ does because...

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (8/12) Oct 08 2020 std::array and std::vector are array types, but the concept of
IGotD- (7/12) Oct 08 2020 Just clarify, you mean that anything that is continuous memory

Steven Schveighoffer (10/26) Oct 08 2020 Yes. I have argued in the past

Andrei Alexandrescu (4/34) Oct 08 2020 enum bool isDynamicArray(T) = is(DynamicArrayTypeOf!T) &&

Per =?UTF-8?B?Tm9yZGzDtnc=?= (4/5) Oct 08 2020 There's always the possibility of adding a builtin __traits(...).
Per =?UTF-8?B?Tm9yZGzDtnc=?= (4/5) Oct 08 2020 BTW, what was the motivation behind all this special handling of

Per =?UTF-8?B?Tm9yZGzDtnc=?= (2/4) Oct 08 2020 Let's take that in a separate thread.

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/8) Oct 09 2020 So if I create my owner container that has an opCast overload to

Petar Kirov [ZombineDev] (23/32) Oct 09 2020 I think Steven's idea is not that the container itself is a

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (10/15) Oct 09 2020 Alright, I guess I would want to access the underlying buffer

Steven Schveighoffer (23/33) Oct 09 2020 No, just use a cast to an array (if that's what you implemented).

Simen =?UTF-8?B?S2rDpnLDpXM=?= (8/12) Oct 09 2020 There are ranges with contiguous memory that aren't slices -

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (8/12) Oct 09 2020 I guess it in some cases would be technically possible to emulate

IGotD- (8/16) Oct 09 2020 This thread makes me happy that I don't need to meddle with

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (4/10) Oct 09 2020 LOL IIRC Bjarne Stroustrup said that he does not know what will

Steven Schveighoffer (9/18) Oct 09 2020 Map over a contiguous range is going to return from the stack, not from

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (5/9) Oct 09 2020 All kinds of numerics. Like, getting the real parts from a

Steven Schveighoffer (9/17) Oct 09 2020 So basically, a contiguous primitive might be ptr, distance between

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (11/19) Oct 09 2020 I guess that is possible. I think a cacheline aligned bitmask

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (12/15) Oct 09 2020 Ok, so here is another example. Priority queue. Traversing it in

Steven Schveighoffer (9/29) Oct 09 2020 It's fast because the unordered accessor gets a slice, and

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (19/24) Oct 09 2020 Yes, at least the following:

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/8) Oct 09 2020 Not necessarily compression, just as likely performance

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (13/14) Oct 09 2020 Anyway, I think the main benefit of having stride support is that

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (46/60) Oct 09 2020 Yes, C++ has data(), but that does not imply that you provide an

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/10) Oct 09 2020 I meant if you want to add a filter that returns every other
Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (3/3) Oct 09 2020 It might be worth mentioning that C++23 most likely will have

Steven Schveighoffer (3/7) Oct 08 2020 isDynamicArray!T

James Blachly (4/13) Oct 08 2020 Is this trait true of slices pointing to user managed blocks? In

Steven Schveighoffer (4/17) Oct 09 2020 The memory a slice points at has nothing to do with its type. If it's a

FeepingCreature (7/12) Oct 08 2020 `frontArray`, a version of `front` that returned a slice of the

Imperatorn (2/14) Oct 08 2020 The only thing I read was "C++ continuous rage"

H. S. Teoh (7/10) Oct 08 2020 [[...]

Andrei Alexandrescu (7/24) Oct 08 2020 It occured to me we could define that traite, but we got away without it...

Walter Bright (2/4) Oct 08 2020 We didn't "get away" with anything, we just did it right the first time ...

Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:

C++20 defines

- std::ranges::contiguous_range [1] on top of
- std::random_access_iterator [2]

as specializations of

- std::ranges::random_access_range on top of
- std::random_access_iterator

. A contiguous ranges has all its elements stored continuously 
(adjacently) in memory.

Does anybody know the application for this concept/predicate?

[1] https://en.cppreference.com/w/cpp/ranges/contiguous_range
[2] https://en.cppreference.com/w/cpp/iterator/contiguous_iterator

Oct 08 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Thursday, 8 October 2020 at 13:30:30 UTC, Per Nordlöw wrote:
 C++20 defines

 - std::ranges::contiguous_range [1] on top of
 - std::random_access_iterator [2]

 as specializations of

 - std::ranges::random_access_range on top of
 - std::random_access_iterator

 . A contiguous ranges has all its elements stored continuously 
 (adjacently) in memory.

 Does anybody know the application for this concept/predicate?

 [1] https://en.cppreference.com/w/cpp/ranges/contiguous_range
 [2] 
 https://en.cppreference.com/w/cpp/iterator/contiguous_iterator

Optimizations that require linear memory access, like SIMD. I 
believe.

Oct 08 2020

Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:

On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim Grøstad 
wrote:
 Optimizations that require linear memory access, like SIMD. I 
 believe.

How could `isContiguousRange` be defined in D?

Oct 08 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Thursday, 8 October 2020 at 13:59:09 UTC, Per Nordlöw wrote:
 On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim 
 Grøstad wrote:
 Optimizations that require linear memory access, like SIMD. I 
 believe.

 How could `isContiguousRange` be defined in D?

It would probably require a new range interface where front and 
popFront produce slices instead of elements.

Oct 08 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Thursday, 8 October 2020 at 14:06:18 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 8 October 2020 at 13:59:09 UTC, Per Nordlöw wrote:
 On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim 
 Grøstad wrote:
 Optimizations that require linear memory access, like SIMD. I 
 believe.

 How could `isContiguousRange` be defined in D?

 It would probably require a new range interface where front and 
 popFront produce slices instead of elements.

Or a method slice() that return the container as a slice? In c++ 
containers have data and length, so I guess that is an option too.

But it would be more generic to have a front that it is a slice. 
Would work for btrees, deques etc.

Oct 08 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Thursday, 8 October 2020 at 14:18:45 UTC, Ola Fosheim Grøstad 
wrote:
 Or a method slice() that return the container as a slice? In 
 c++ containers have data and length, so I guess that is an 
 option too.

To elaborate, c++ containers have a member data() that points to 
the beginning of the contiguous buffer and a member size() that 
indicates the number of elements. D has a way to indicate length, 
but I don't think there is any commonly used data() equivalent? 
Although one probably could define a freestanding function that 
can be overloaded? Then you could just test with the compiles 
trait.

Oct 08 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/8/20 10:34 AM, Ola Fosheim Grøstad wrote:
 On Thursday, 8 October 2020 at 14:18:45 UTC, Ola Fosheim Grøstad wrote:
 Or a method slice() that return the container as a slice? In c++ 
 containers have data and length, so I guess that is an option too.

 
 To elaborate, c++ containers have a member data() that points to the 
 beginning of the contiguous buffer and a member size() that indicates 
 the number of elements. D has a way to indicate length, but I don't 
 think there is any commonly used data() equivalent? Although one 
 probably could define a freestanding function that can be overloaded? 
 Then you could just test with the compiles trait.
 

We have arrays in D, there is no need to duplicate what C++ does because 
they don't have a real array type.

In D, you would check if the range is a dynamic array, and then use 
ptr/length to deal with optimizations (I do this in iopipe).

-Steve

Oct 08 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Thursday, 8 October 2020 at 14:45:47 UTC, Steven Schveighoffer 
wrote:
 We have arrays in D, there is no need to duplicate what C++ 
 does because they don't have a real array type.

std::array and std::vector are array types, but the concept of 
being contiguous applies to all containers, also third party.

 In D, you would check if the range is a dynamic array, and then 
 use ptr/length to deal with optimizations (I do this in iopipe).

ok, if all custom containers with a linear buffer provides the 
same pointer interface then that would work like data().

But I think the c++ solution is too weak. It does not work with 
deques, btrees etc... :-/

Oct 08 2020

IGotD- <nise nise.com> writes:

On Thursday, 8 October 2020 at 14:45:47 UTC, Steven Schveighoffer 
wrote:
 We have arrays in D, there is no need to duplicate what C++ 
 does because they don't have a real array type.

 In D, you would check if the range is a dynamic array, and then 
 use ptr/length to deal with optimizations (I do this in iopipe).

 -Steve

Just clarify, you mean that anything that is continuous memory 
can be converted to a slice (as supposed to an array, even if the 
relationship is somewhat ambiguous is D).

So if a type with continuous memory storage is converted to a 
slice, will isDynamicArray!T also be valid for the slice?

Oct 08 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/8/20 11:45 AM, IGotD- wrote:
 On Thursday, 8 October 2020 at 14:45:47 UTC, Steven Schveighoffer wrote:
 We have arrays in D, there is no need to duplicate what C++ does 
 because they don't have a real array type.

 In D, you would check if the range is a dynamic array, and then use 
 ptr/length to deal with optimizations (I do this in iopipe).

 
 Just clarify, you mean that anything that is continuous memory can be 
 converted to a slice (as supposed to an array, even if the relationship 
 is somewhat ambiguous is D).

Yes. I have argued in the past 
(https://dlang.org/articles/d-array-article.html) that the "true" array 
type is hidden in the runtime, and that we should just call T[] a slice.

 
 So if a type with continuous memory storage is converted to a slice, 
 will isDynamicArray!T also be valid for the slice?
 

Yes.

I was going to post the code for isDynamicArray, but like many things in 
std.traits, it's needlessly complicated.

It should just be:

enum isDynamicArray(T) = is(T == U[], U);

-Steve

Oct 08 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 10/8/20 12:42 PM, Steven Schveighoffer wrote:
 On 10/8/20 11:45 AM, IGotD- wrote:
 On Thursday, 8 October 2020 at 14:45:47 UTC, Steven Schveighoffer wrote:
 We have arrays in D, there is no need to duplicate what C++ does 
 because they don't have a real array type.

 In D, you would check if the range is a dynamic array, and then use 
 ptr/length to deal with optimizations (I do this in iopipe).

 Just clarify, you mean that anything that is continuous memory can be 
 converted to a slice (as supposed to an array, even if the 
 relationship is somewhat ambiguous is D).

 
 Yes. I have argued in the past 
 (https://dlang.org/articles/d-array-article.html) that the "true" array 
 type is hidden in the runtime, and that we should just call T[] a slice.
 
 So if a type with continuous memory storage is converted to a slice, 
 will isDynamicArray!T also be valid for the slice?

 Yes.
 
 I was going to post the code for isDynamicArray, but like many things in 
 std.traits, it's needlessly complicated.
 
 It should just be:
 
 enum isDynamicArray(T) = is(T == U[], U);

enum bool isDynamicArray(T) = is(DynamicArrayTypeOf!T) && 
!isAggregateType!T;

Is that again about supporting enums? Sigh.

Oct 08 2020

Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:

On Thursday, 8 October 2020 at 17:45:30 UTC, Andrei Alexandrescu 
wrote:
 Is that again about supporting enums? Sigh.

There's always the possibility of adding a builtin __traits(...). 
;)

Oct 08 2020

Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:

On Thursday, 8 October 2020 at 17:45:30 UTC, Andrei Alexandrescu 
wrote:
 Is that again about supporting enums? Sigh.

BTW, what was the motivation behind all this special handling of 
enumerations?

Oct 08 2020

Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:

On Thursday, 8 October 2020 at 19:30:08 UTC, Per Nordlöw wrote:
 BTW, what was the motivation behind all this special handling 
 of enumerations?

Let's take that in a separate thread.

Oct 08 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Thursday, 8 October 2020 at 16:42:51 UTC, Steven Schveighoffer 
wrote:
 It should just be:

 enum isDynamicArray(T) = is(T == U[], U);

So if I create my owner container that has an opCast overload to 
a slice of the underlying buffer, would isDynamicArray be true 
for my container?

If not you need a new test.

Oct 09 2020

Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:

On Friday, 9 October 2020 at 09:52:12 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 8 October 2020 at 16:42:51 UTC, Steven 
 Schveighoffer wrote:
 It should just be:

 enum isDynamicArray(T) = is(T == U[], U);

 So if I create my owner container that has an opCast overload 
 to a slice of the underlying buffer, would isDynamicArray be 
 true for my container?

 If not you need a new test.

I think Steven's idea is not that the container itself is a 
contiguous range, but that slicing it would return a contiguous 
range, which itself is a D slice. I.e. you need to implement 
opSlice() (or opIndex()) with no parameters, which should return 
a slice of the container's contents.

For example, std.array.Appender implements it:
https://dlang.org/phobos/std_array#.Appender.opSlice

However std.container.array doesn't and instead returns a custom 
Range struct.
https://dlang.org/phobos/std_container_array#.Array.opSlice

On the other hand, std.container.array has opSliceAssign:
https://dlang.org/phobos/std_container_array#.Array.opSliceAssign

which covers part of the need for C++'s contiguous range concept.

Another interesting thing to note is that mir-algorithm (which 
implements multi-dimensional slices) has specifically the concept 
of ContiguousSlices:
http://mir-algorithm.libmir.org/mir_ndslice_slice.html#SliceKind

http://mir-algorithm.libmir.org/mir_ndslice_traits.html
http://mir-algorithm.libmir.org/mir_ndslice_slice.html#Slice 
(check the "Internal Binary Representation" section)

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 13:05:44 UTC, Petar Kirov 
[ZombineDev] wrote:
 I think Steven's idea is not that the container itself is a 
 contiguous range, but that slicing it would return a contiguous 
 range, which itself is a D slice. I.e. you need to implement 
 opSlice() (or opIndex()) with no parameters, which should 
 return a slice of the container's contents.

Alright, I guess I would want to access the underlying buffer 
regardless of whether the container is indexable or not...

Although, an optimization protocol probably should let you query 
whether elements can be out of order. And possibly also test 
whether an element is present or not.

Then you could allow direct access to the buffer of a simple hash 
table or priority queue etc.

Hm. Interesting topic.

Oct 09 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/9/20 5:52 AM, Ola Fosheim Grøstad wrote:
 On Thursday, 8 October 2020 at 16:42:51 UTC, Steven Schveighoffer wrote:
 It should just be:

 enum isDynamicArray(T) = is(T == U[], U);

 
 So if I create my owner container that has an opCast overload to a slice 
 of the underlying buffer, would isDynamicArray be true for my container?

No.

 
 If not you need a new test.
 

No, just use a cast to an array (if that's what you implemented).

We are talking about a range that is also contiguous in memory. The only 
logical reason to require this is to get a ptr and a length and use the 
CPU specialized instructions for these things. We have that type 
pattern, it's a slice. C++ didn't have that formally defined, so they 
needed to add something.

When I developed iopipe, I made the declaration that the buffer window 
shall be a random-access range. I thought of the future, where people 
might find new cool ways to use buffers that weren't arrays.

And when I got to thinking about the performance of such buffers (like, 
let's say a ring buffer that has 2 slices over the same underlying 
buffer, so you never have to copy), the cost of *indexing* is going to 
overwhelm the benefit of not having to copy. I even implemented a 
ringbuffer using memory map tricks, and it's not any better than a 
straight array.

So even though I still define the buffer window as having to be a random 
access range, all current buffer types are arrays (that may change if I 
ever get around to implementing an inter-thread iopipe).

Contiguous memory has the properties that allow the CPU to shine, and 
the way to say "I have contiguous memory" in D is: use a slice.

-Steve

Oct 09 2020

Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:

On Friday, 9 October 2020 at 13:31:18 UTC, Steven Schveighoffer 
wrote:
 We are talking about a range that is also contiguous in memory. 
 The only logical reason to require this is to get a ptr and a 
 length and use the CPU specialized instructions for these 
 things. We have that type pattern, it's a slice.

There are ranges with contiguous memory that aren't slices - 
stride and occasionally map over a contiguous range comes to 
mind. Converting these to slices may not be possible, and 
certainly not efficient.

--
   Simen

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 13:38:28 UTC, Simen Kjærås wrote:
 There are ranges with contiguous memory that aren't slices - 
 stride and occasionally map over a contiguous range comes to 
 mind. Converting these to slices may not be possible, and 
 certainly not efficient.

I guess it in some cases would be technically possible to emulate 
fixed stride with a recast to a slice of a struct with padding 
before and after the element you want.

But not very clean.

Microsoft's experimental gsl::span implementation did support 
strides, but I don't think that is possible with the std::span 
that ended up in the standard.  (C++ span is similar to D slices)

Oct 09 2020

IGotD- <nise nise.com> writes:

On Friday, 9 October 2020 at 15:14:19 UTC, Ola Fosheim Grøstad 
wrote:
 I guess it in some cases would be technically possible to 
 emulate fixed stride with a recast to a slice of a struct with 
 padding before and after the element you want.

 But not very clean.

 Microsoft's experimental gsl::span implementation did support 
 strides, but I don't think that is possible with the std::span 
 that ended up in the standard.  (C++ span is similar to D 
 slices)

This thread makes me happy that I don't need to meddle with 
modern C++. In order to use modern C++ you need to become some 
kind of botanist in order to keep track of all the primitives in 
C++. Just like in the vegetable kingdom, you might discover new 
primitives that you didn't know and new primitives seem to cross 
breed all the time creating new ones.

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 16:09:08 UTC, IGotD- wrote:
 This thread makes me happy that I don't need to meddle with 
 modern C++. In order to use modern C++ you need to become some 
 kind of botanist in order to keep track of all the primitives 
 in C++. Just like in the vegetable kingdom, you might discover 
 new primitives that you didn't know and new primitives seem to 
 cross breed all the time creating new ones.

LOL IIRC Bjarne Stroustrup said that he does not know what will 
be in C++23 because he will need a couple of years to play with 
what is in C++20. (Or something along those lines).

Oct 09 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/9/20 9:38 AM, Simen Kjærås wrote:
 On Friday, 9 October 2020 at 13:31:18 UTC, Steven Schveighoffer wrote:
 We are talking about a range that is also contiguous in memory. The 
 only logical reason to require this is to get a ptr and a length and 
 use the CPU specialized instructions for these things. We have that 
 type pattern, it's a slice.

 
 There are ranges with contiguous memory that aren't slices - stride and 
 occasionally map over a contiguous range comes to mind. Converting these 
 to slices may not be possible, and certainly not efficient.

Map over a contiguous range is going to return from the stack, not from 
the data.

A stride is not contiguous, it skips data.

I'm struggling to understand what the benefit of defining these as 
contiguous is. What can you do when defining something that isn't an 
array as contiguous memory? What are the primitives that you use, for 
what purpose?

-Steve

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer 
wrote:
 I'm struggling to understand what the benefit of defining these 
 as contiguous is. What can you do when defining something that 
 isn't an array as contiguous memory? What are the primitives 
 that you use, for what purpose?

All kinds of numerics. Like, getting the real parts from a 
complex set and use an algorithm that has been defined on reals 
only.

Oct 09 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/9/20 11:47 AM, Ola Fosheim Grøstad wrote:
 On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer wrote:
 I'm struggling to understand what the benefit of defining these as 
 contiguous is. What can you do when defining something that isn't an 
 array as contiguous memory? What are the primitives that you use, for 
 what purpose?

 
 All kinds of numerics. Like, getting the real parts from a complex set 
 and use an algorithm that has been defined on reals only.

So basically, a contiguous primitive might be ptr, distance between 
elements, and total elements? And this is something you can plug into a 
low-level API? I can see that being a possibility, but also the 
underlying type has to be an array anyway.

I don't know that we would have to define a range primitive for this. 
Just a set of operations on an array of data, with parameters for the 
other pieces.

-Steve

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 16:05:09 UTC, Steven Schveighoffer 
wrote:
 So basically, a contiguous primitive might be ptr, distance 
 between elements, and total elements? And this is something you 
 can plug into a low-level API? I can see that being a 
 possibility, but also the underlying type has to be an array 
 anyway.

I guess that is possible. I think a cacheline aligned bitmask 
version is needed to do better than what C++ numerics TS will end 
up with, but hard to design.

 I don't know that we would have to define a range primitive for 
 this. Just a set of operations on an array of data, with 
 parameters for the other pieces.

It depends on how common it is to process data without changing 
it?

I don't know. Maybe look at existing code and see what people do. 
Not only D code, but look at what people who use dataflow 
frameworks do.

I guess github would be a good resource.

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer 
wrote:
 What can you do when defining something that isn't an array as 
 contiguous memory? What are the primitives that you use, for 
 what purpose?

Ok, so here is another example. Priority queue. Traversing it in 
order would jump over the place and be slow. A sequential scan 
has major cache-prefetch benefits.

All you need is a new filter "unordered", when that is applied to 
a container that provides an unordered contiguous memory 
interface it can do a sequential scan.

So you would get:

"container.somerangefunction" is in order, but slow.

"container.unordered.somerangefunction" may not be in order, but 
is fast.

Oct 09 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/9/20 12:05 PM, Ola Fosheim Grøstad wrote:
 On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer wrote:
 What can you do when defining something that isn't an array as 
 contiguous memory? What are the primitives that you use, for what 
 purpose?

 
 Ok, so here is another example. Priority queue. Traversing it in order 
 would jump over the place and be slow. A sequential scan has major 
 cache-prefetch benefits.
 
 All you need is a new filter "unordered", when that is applied to a 
 container that provides an unordered contiguous memory interface it can 
 do a sequential scan.
 
 So you would get:
 
 "container.somerangefunction" is in order, but slow.
 
 "container.unordered.somerangefunction" may not be in order, but is fast.
 
 

It's fast because the unordered accessor gets a slice, and 
somerangefunction sees its a slice and does something magic because it's 
contiguous memory.

The question is, would there be any purpose to returning something OTHER 
than a slice for unordered? And I don't mean, you happen to get benefits 
because it's sequential, it has to be something that the algorithm will 
do differently because it sees it's a special type.

-Steve

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 16:57:52 UTC, Steven Schveighoffer 
wrote:
 The question is, would there be any purpose to returning 
 something OTHER than a slice for unordered? And I don't mean, 
 you happen to get benefits because it's sequential, it has to 
 be something that the algorithm will do differently because it 
 sees it's a special type.

Yes, at least the following:

1. alignment guarantees (so you don't have to check that at 
runtime)
2. min-size guarantees (e.g. all slices are at least 2 elements)
3. annotation of elements being valid or not (scanning hashtables 
etc)
4. other guarantees (not-null, not-Nan etc)

You also have the bitsequences/masks I've mentioned that could be 
more extensive, giving various qualities of the slice, so that 
you can scan faster without even loading the data. Both point 3.) 
and 4.) could be bitmasks.

There are also some compression techniques that provide 
skip-pointers. Kinda like skip-lists:

https://en.wikipedia.org/wiki/Skip_list

There are certainly many interesting possibilities, but I think 
bitmasks would go a long way (with the option to skip all-zero 
masks).

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 17:09:44 UTC, Ola Fosheim Grøstad 
wrote:
 There are also some compression techniques that provide 
 skip-pointers. Kinda like skip-lists:

Not necessarily compression, just as likely performance 
optimization of scanning. Although compression techniques use 
such skip-pointers (or rather offsets) to jump from one header to 
the next, bypassing a variable sized data chunk.

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer 
wrote:
 A stride is not contiguous, it skips data.

Anyway, I think the main benefit of having stride support is that 
you can write an algorithm that requires say double or a pair of 
doubles (like a coordinate) and optimize it and compile it to one 
precompiled library and then use it on any sequence of structs. 
All you need to provide is the initial offset and the stride 
offset.

So, if you have some complicated algorithm with maybe some asm in 
it to do cluster analysis of coordinates, then you can use it on 
any uniform struct memory layout. It can even be completely 
different structs as long as they all have the same offset.

Assuming that the stride is a runtime parameter.

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 13:31:18 UTC, Steven Schveighoffer 
wrote:
 We are talking about a range that is also contiguous in memory. 
 The only logical reason to require this is to get a ptr and a 
 length and use the CPU specialized instructions for these 
 things. We have that type pattern, it's a slice. C++ didn't 
 have that formally defined, so they needed to add something.

Yes, C++ has data(), but that does not imply that you provide an 
array interface. Actually, I'm not even sure it implies that the 
elements are in order... but it probably does now that they 
require random-access properties.

Either way, you want a SIMD aligned slice. I guess if one had 
some kind of interface where you fetchUnaligned first and from 
there fetchAligned, and the finish with fetchUnaligned could work.

A better solution is to have a bitmask that tells you whether the 
element is active or not. Then you just provide an aligned 
pointer to the range-algorithm. But this isn't going to work for 
safe code.

Modern SIMD on x86 has this built in.

This also means that you can continue to use SIMD operations 
after filter.

So when you "pop" slices you might get a bit mask sequence like 
this

00001111 11111111 11100000

(I show 8 elements per slice here, probably should be configured 
to 2, 4, 8 16, 32, 64, 128, 256)


Ok, so then you have a filter than will return every other 
element, then you get the same slices, but a different mask 
pattern:

00001010 10101010 10100000

Pretty neat actually.

This would also work with randomly populated buffers (like a 
hash).

 And when I got to thinking about the performance of such 
 buffers (like, let's say a ring buffer that has 2 slices over 
 the same underlying buffer, so you never have to copy), the 
 cost of *indexing* is going to overwhelm the benefit of not 
 having to copy. I even implemented a ringbuffer using memory 
 map tricks, and it's not any better than a straight array.

I've literally spent weeks implementing various (some threadsafe) 
ringbuffers. It can be surprisingly challenging when you start to 
deal with slices and "transactional-like" features. Simple 
concept, but it can be a challenge to get up to speed and also 
being sure that it is correct.

There are some neat tricks one can use when implementing 
ringbuffers that are 2^n depending on what operations you desire. 
One important question is whether you always require one element 
to be unused or not, that tiny detail can have a major impact on 
the implementation :-D.

Anyway, when implementing a ring buffer, only include the 
operations you actually need. The more generic it is, the more 
overhead you'll get. Sadly. It is mind-blowing how many different 
implementation strategies there are. (for such a trivial concept)

It is nearly impossible to create a best-of-breed library 
implementation for that reason. Very sensitive to the use context.

But this is getting off-topic...

 Contiguous memory has the properties that allow the CPU to 
 shine, and the way to say "I have contiguous memory" in D is: 
 use a slice.

But that kinda implies that the buffer is indexable and in order.

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Friday, 9 October 2020 at 13:55:19 UTC, Ola Fosheim Grøstad 
wrote:
 Ok, so then you have a filter than will return every other 
 element, then you get the same slices, but a different mask 
 pattern:

 00001010 10101010 10100000

I meant if you want to add a filter that returns every other 
element then you just do a bitwise and with 10101010 on the mask.

You can do the same with conditionals. E.g. you can get the signs 
of all the elements in the SIMD register as a bitmask.

Oct 09 2020

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

It might be worth mentioning that C++23 most likely will have 
optimized numerics ranges. That was mentioned in a talk on 
CPPCON2020.

Oct 09 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/8/20 9:59 AM, Per Nordlöw wrote:
 On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim Grøstad wrote:
 Optimizations that require linear memory access, like SIMD. I believe.

 
 How could `isContiguousRange` be defined in D?

isDynamicArray!T

-Steve

Oct 08 2020

James Blachly <james.blachly gmail.com> writes:

On 10/8/20 10:06 AM, Steven Schveighoffer wrote:
 On 10/8/20 9:59 AM, Per Nordlöw wrote:
 On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim Grøstad wrote:
 Optimizations that require linear memory access, like SIMD. I believe.

 How could `isContiguousRange` be defined in D?

 
 isDynamicArray!T
 
 -Steve

Is this trait true of slices pointing to user managed blocks? In 
particular I may use arena-allocation strategy but would love to have 
the type also be `isContiguousRange`

Oct 08 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/8/20 6:28 PM, James Blachly wrote:
 On 10/8/20 10:06 AM, Steven Schveighoffer wrote:
 On 10/8/20 9:59 AM, Per Nordlöw wrote:
 On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim Grøstad wrote:
 Optimizations that require linear memory access, like SIMD. I believe.

 How could `isContiguousRange` be defined in D?

 isDynamicArray!T

 
 Is this trait true of slices pointing to user managed blocks? In 
 particular I may use arena-allocation strategy but would love to have 
 the type also be `isContiguousRange`

The memory a slice points at has nothing to do with its type. If it's a 
T[], it's a dynamic array.

-Steve

Oct 09 2020

FeepingCreature <feepingcreature gmail.com> writes:

On Thursday, 8 October 2020 at 13:59:09 UTC, Per Nordlöw wrote:
 On Thursday, 8 October 2020 at 13:43:01 UTC, Ola Fosheim 
 Grøstad wrote:
 Optimizations that require linear memory access, like SIMD. I 
 believe.

 How could `isContiguousRange` be defined in D?

`frontArray`, a version of `front` that returned a slice of the 
next n elements iff there was an internal buffer already in that 
form? That would work with arrays, static buffers (sockets), ring 
buffers... Not sure if you also needed a `popFrontArray`.

I think the standard approach right now is just to define a range 
over slices.

Oct 08 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Thursday, 8 October 2020 at 13:30:30 UTC, Per Nordlöw wrote:
 C++20 defines

 - std::ranges::contiguous_range [1] on top of
 - std::random_access_iterator [2]

 as specializations of

 - std::ranges::random_access_range on top of
 - std::random_access_iterator

 . A contiguous ranges has all its elements stored continuously 
 (adjacently) in memory.

 Does anybody know the application for this concept/predicate?

 [1] https://en.cppreference.com/w/cpp/ranges/contiguous_range
 [2] 
 https://en.cppreference.com/w/cpp/iterator/contiguous_iterator

The only thing I read was "C++ continuous rage"

Oct 08 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Oct 08, 2020 at 03:55:09PM +0000, Imperatorn via Digitalmars-d wrote:
 On Thursday, 8 October 2020 at 13:30:30 UTC, Per Nordl�w wrote:

[...]
 [1] https://en.cppreference.com/w/cpp/ranges/contiguous_range


[[...]
 The only thing I read was "C++ continuous rage"

LOL, that about describes my 15+ years of experience with C++. :-D


T

-- 
Turning your clock 15 minutes ahead won't cure lateness---you're just making
time go faster!

Oct 08 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 10/8/20 9:30 AM, Per Nordlöw wrote:
 C++20 defines
 
 - std::ranges::contiguous_range [1] on top of
 - std::random_access_iterator [2]
 
 as specializations of
 
 - std::ranges::random_access_range on top of
 - std::random_access_iterator
 
 . A contiguous ranges has all its elements stored continuously 
 (adjacently) in memory.
 
 Does anybody know the application for this concept/predicate?
 
 [1] https://en.cppreference.com/w/cpp/ranges/contiguous_range
 [2] https://en.cppreference.com/w/cpp/iterator/contiguous_iterator

It occured to me we could define that traite, but we got away without it 
by implicitly decreeing that anything contiguous is just supposed to be 
a T[].

There would be a few cases where that could be useful, i.e. the result 
of map() over a T[]. But we'd need a strong case on why it would be 
useful to distinguish that from a random-access range.

Oct 08 2020

Walter Bright <newshound2 digitalmars.com> writes:

On 10/8/2020 10:49 AM, Andrei Alexandrescu wrote:
 It occured to me we could define that traite, but we got away without it by 
 implicitly decreeing that anything contiguous is just supposed to be a T[].

We didn't "get away" with anything, we just did it right the first time :-)

Oct 08 2020

D Programming

C/C++ Programming

Other

digitalmars.D - Need for (C++20) Contiguous Range