digitalmars.D - Neat project: add pointer capability to std.bitmanip.bitfields

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Once done, this is a fantastic example of (a) the power of generative 
programming, and (b) the advantages of using library facilities instead 
of built-in features.

https://issues.dlang.org/show_bug.cgi?id=15397

Who would want to take it?


Andrei
Dec 02 2015
next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative programming, and (b) the advantages of using library 
 facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Dec 02 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/02/2015 02:54 PM, Vladimir Panteleev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of generative
 programming, and (b) the advantages of using library facilities
 instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Well it's @system. -- Andrei
Dec 02 2015
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 2 December 2015 at 19:59:14 UTC, Andrei 
Alexandrescu wrote:
 On 12/02/2015 02:54 PM, Vladimir Panteleev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
 Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative
 programming, and (b) the advantages of using library 
 facilities
 instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Well it's @system. -- Andrei
Considering that @system is the default, I really don't think that's enough.
Dec 02 2015
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Thursday, 3 December 2015 at 00:08:17 UTC, Vladimir Panteleev 
wrote:
 On Wednesday, 2 December 2015 at 19:59:14 UTC, Andrei 
 Alexandrescu wrote:
 On 12/02/2015 02:54 PM, Vladimir Panteleev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
 Alexandrescu wrote:
 [...]
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Well it's @system. -- Andrei
Considering that @system is the default, I really don't think that's enough.
That just reminded me to file this: https://issues.dlang.org/show_bug.cgi?id=15399
Dec 02 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/2/15 7:22 PM, Vladimir Panteleev wrote:
 On Thursday, 3 December 2015 at 00:08:17 UTC, Vladimir Panteleev wrote:
 On Wednesday, 2 December 2015 at 19:59:14 UTC, Andrei Alexandrescu wrote:
 On 12/02/2015 02:54 PM, Vladimir Panteleev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei Alexandrescu
 wrote:
 [...]
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Well it's @system. -- Andrei
Considering that @system is the default, I really don't think that's enough.
That just reminded me to file this: https://issues.dlang.org/show_bug.cgi?id=15399
Nice. What I'd say is that at the end of the day there's documentation. -- Andrei
Dec 02 2015
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Thursday, 3 December 2015 at 00:31:46 UTC, Andrei Alexandrescu 
wrote:
 Nice. What I'd say is that at the end of the day there's 
 documentation. -- Andrei
Just to provide a bit of perspective...

Although memory corruption may not seem so scary in short-lived programs where it can be trivially reproduced with the same input, it can be an absolute nightmare when it occurs in long-running server processes, for developers, sysadmins and end-users alike.

A long time ago, a network service I wrote started having severe memory corruption issues. It went from crashing about once a day to once every few hours, and every time it crashed it pissed off a dozen users or more. Great ire and vitriol[1] were expressed towards the service, and everything I tried only seemed to make the situation worse. It took months of studying and playing with D GC internals (incl. unsuccessful attempts to use the GC's own debugging code and additional debugging GC proxies - see Diamond), and finally after three nights of replay debugging a virtual machine which recorded a crash on my home PC, and trying to infer meaning from memory dumps of GC control structures, I tracked down the bug. The final result was the addition of InvalidMemoryOperationError to Druntime.

So, this is why I am on a war campaign against anything that might result in memory corruption in D. Another recent example was the controversy over readln (yes, std.stdio.File.readln, one of the basic operations in any programming language) corrupting memory: sure, this patch will make D programs no longer crash and burn in weird ways, but it will also make readln slower! Think of the benchmarks! Thankfully a solution was found which was both safe and acceptably fast.

You said that "at the end of the day there's documentation". I would argue that at least in this case, it may not be enough. Consider, for example, a hypothetical user type "Pack", which takes a tuple/struct and automatically arranges the fields into a struct such that space is used optimally (e.g. all bools are clumped together into a bitfield, enums are only given as many bits as their .max needs, etc.). Pack only needs to know one property of each field: how many bits it really needs, and as such, it might elect to be agnostic of what a pointer is. If it uses std.bitmanip.bitfields as its backend, it will happily pack a pointer at its user's request, and the user will never see the pointer warning in Pack's documentation. And yes, although it's easy to point the finger at users and say "ha ha, it's your own fault, you did not RTFM", I think we should strive for better than that.

I was recently close to having a repeat of the memory corruption situation (with the same service, too) due to the struct alignment issue. Luckily there was only one week of strife before I narrowed down the bug. Memory corruption may only manifest in certain conditions (e.g. release mode, after updating/switching compilers), and rollback/bisect are not always viable or useful options. Even if the issue I filed had been fixed at the time, it would not have helped in that particular case, since, well, D is unsafe by default, and it is situations such as these that make me occasionally glance in Rust's direction.

[1]: http://dump.thecybershadow.net/41a97cdab29f9cd56340ce8a5163d6f8/rage.txt
Dec 03 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/03/2015 06:13 AM, Vladimir Panteleev wrote:
 You said that "at the end of the day there's documentation". I would
 argue that at least in this case, it may not be enough. Consider, for
 example, a hypothetical user type "Pack", which takes a tuple/struct and
 automatically arranges the fields into a struct such that space is used
 optimally (e.g. all bools are clumped together into a bitfield, enums
 are only given as much bits as their .max needs, etc.). Pack only needs
 to know one property of each field: how many bits it really needs, and
 as such, it might elect to be agnostic of what a pointer is. If it uses
 std.bitmanip.bitfields as its backend, it will happily pack a pointer at
 its user's request, and the user will never see the pointer warning in
 Pack's documentation. And yes, although it's easy to point the finger at
 users and say "ha ha, it's your own fault, you did not RTFM", I think we
 should strive for better than that.
I understand how you feel but I really don't know what else to do, which makes the entire discussion somewhat theoretical. At the end of the day Pack must document its own behavior and its users should have some notion of its characteristics. If Pack wishes to disallow pointers that can be easily done with static introspection. If it doesn't have enough information, it's poorly designed - it can't just take any bits and shove them in any way. -- Andrei
Dec 03 2015
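The static-introspection check mentioned in the post above can be sketched in D roughly as follows. This is a minimal sketch and not code from the thread: `Pack` is the hypothetical type under discussion, and rejecting every field type that has indirections through a template constraint is only one assumed way of writing such a check. `hasIndirections` and `anySatisfy` are existing Phobos templates.

```d
import std.meta : anySatisfy;
import std.traits : hasIndirections;

// Hypothetical Pack front end (assumption, not a Phobos type): it refuses to
// instantiate when any field type contains a pointer, slice, class reference
// or other indirection.
struct Pack(Fields...)
if (!anySatisfy!(hasIndirections, Fields))
{
    // Layout generation (e.g. on top of std.bitmanip.bitfields) would go here.
}

// Plain value types are accepted ...
static assert(is(Pack!(bool, ubyte, short)));
// ... while a raw pointer field is rejected at compile time.
static assert(!is(Pack!(bool, int*)));

void main() {}
```

A constraint like this keeps the decision inside Pack itself, so a user who asks for a pointer field gets a compile error instead of relying on a warning in the backend's documentation.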
parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Thursday, 3 December 2015 at 14:24:52 UTC, Andrei Alexandrescu 
wrote:
 On 12/03/2015 06:13 AM, Vladimir Panteleev wrote:
 You said that "at the end of the day there's documentation". I 
 would
 argue that at least in this case, it may not be enough. 
 Consider, for
 example, a hypothetical user type "Pack", which takes a 
 tuple/struct and
 automatically arranges the fields into a struct such that 
 space is used
 optimally (e.g. all bools are clumped together into a 
 bitfield, enums
 are only given as much bits as their .max needs, etc.). Pack 
 only needs
 to know one property of each field: how many bits it really 
 needs, and
 as such, it might elect to be agnostic of what a pointer is. 
 If it uses
 std.bitmanip.bitfields as its backend, it will happily pack a 
 pointer at
 its user's request, and the user will never see the pointer 
 warning in
 Pack's documentation. And yes, although it's easy to point the 
 finger at
 users and say "ha ha, it's your own fault, you did not RTFM", 
 I think we
 should strive for better than that.
I understand how you feel but I really don't know what else to do, which makes the entire discussion somewhat theoretical.
Doesn't matter as long as it's explicitly opt-in, and there's more than one way to do that. Template parameter flag, an alternative declaration such as unsafeBitfield, disallowing pointers but allowing a shallow wrapper around them ("UnmanagedPtr" OSLT), etc.
Dec 03 2015
prev sibling parent reply rsw0x <anonymous anonymous.com> writes:
On Wednesday, 2 December 2015 at 19:54:26 UTC, Vladimir Panteleev 
wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
 Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative programming, and (b) the advantages of using 
 library facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Iirc I'm the one that originally brought this up. There's no reason for LSB smuggling in pointers to be unsafe. I personally think tricks like these are important in advertising D as a systems language, as I'm often missing some low level features compared to GNU C. Bye.
Dec 02 2015
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Thursday, 3 December 2015 at 01:31:05 UTC, rsw0x wrote:
 On Wednesday, 2 December 2015 at 19:54:26 UTC, Vladimir 
 Panteleev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
 Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative programming, and (b) the advantages of using 
 library facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?
Warning, this is very unsafe and incompatible with the GC. Bit-twiddling GC pointers can lead to memory corruption and very hard-to-track bugs. Such a feature must be opt-in in a very explicit way.
Iirc I'm the one that originally brought this up. There's no reason for lsb smuggling in pointers to be unsafe
True, assuming that:

1. The pointers are still aligned at machine word boundaries
2. The underlying storage type is one that the GC will scan for pointers (e.g. void or void*, not size_t/ubyte)
3. The setters enforce that the discarded pointer bits were zero
4. No more than 4 bits are reused (as the smallest GC object size is 16 bytes)

Point 4 actually relies on the GC's current implementation, which could be an issue (it ties what is allowed to compile in code using the standard library with an implementation detail).
Dec 03 2015
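The conditions listed in the post above can be sketched as a small wrapper. This is a minimal sketch under stated assumptions: `Tagged` is a hypothetical illustration type and not a Phobos API, and the number of reusable bits is bounded here by `T.alignof` (the bound argued for later in the thread) instead of the fixed 4 bits from point 4.

```d
import std.exception : enforce;

// Hypothetical illustration type (assumption, not part of Phobos).
// The constraint only admits as many tag bits as T's alignment leaves free.
struct Tagged(T, uint tagBits)
if ((size_t(1) << tagBits) <= T.alignof)
{
    private enum size_t mask = (size_t(1) << tagBits) - 1;

    // Pointer-typed storage, so the GC scans this word (condition 2).
    // As an ordinary field it is stored at a word-aligned address (condition 1).
    private void* raw;

    // Pointer with the tag bits cleared.
    T* ptr() { return cast(T*)(cast(size_t) raw & ~mask); }

    // Tag stored in the low bits.
    uint tag() { return cast(uint)(cast(size_t) raw & mask); }

    void set(T* p, uint t)
    {
        // Condition 3: the bits about to be reused must already be zero.
        enforce((cast(size_t) p & mask) == 0, "pointer has nonzero low bits");
        enforce(t <= mask, "tag does not fit in the reserved bits");
        raw = cast(void*)(cast(size_t) p | t);
    }
}

void main()
{
    auto p = new int;
    *p = 42;

    Tagged!(int, 2) t;
    t.set(p, 3);
    assert(t.ptr is p && t.tag == 3 && *t.ptr == 42);
}
```

With the tag in place, the stored word is an interior pointer into the same allocation, which is why the storage type has to be one the GC scans.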
parent reply deadalnix <deadalnix gmail.com> writes:
First, it checks for alignment. Considering this:

On Thursday, 3 December 2015 at 09:11:12 UTC, Vladimir Panteleev 
wrote:
 True, assuming that:

 1. The pointers are still aligned at machine word boundaries
No. The pointer needs to be aligned as per underlying data type expectation. If it isn't aligned, the operation that produced this unaligned pointer must be unsafe, not the bitfield capability.
 2. The underlying storage type is one that the GC will scan for 
 pointers (e.g. void or void*, not size_t/ubyte)
Yes, this one is not a problem currently, but can be (and in fact should be with a better GC). Hopefully, this is something I want to improve already and not a blocker considering the API, just an implementation detail.
 3. The setters enforce that the discarded pointer bits were zero
If these bits aren't 0, the operation that set them to 1 is the one that is unsafe.
 4. No more than 4 bits are reused (as the smallest GC object 
 size is 16 bytes)
Not correct. Considering the pointer is of type T*, then it is safe as long as align(T*) <= sizeof(T) which is correct on all architectures I know of (though I wouldn't be surprised if some weird arch not used since the 70s broke this constraint).

The only valid concern here is point 2, but it is currently not a problem with the GC we have, and is simply an implementation issue.
Dec 03 2015
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Friday, 4 December 2015 at 01:35:45 UTC, deadalnix wrote:
 First it check for alignement. Considering this :

 On Thursday, 3 December 2015 at 09:11:12 UTC, Vladimir 
 Panteleev wrote:
 True, assuming that:

 1. The pointers are still aligned at machine word boundaries
No. The pointer needs to be aligned as per underlying data type expectation. If it isn't aligned, the operation that produced this unaligned pointer must be unsafe, not the bitfield capability.
You misunderstood. The bitfield must *store* the pointers at addresses that are aligned at machine word boundaries.
 3. The setters enforce that the discarded pointer bits were 
 zero
If these bits aren't 0, the operation that set them to 1 is the one that is unsafe.
Well, that depends on how many bits are discarded? And that's not log2(T.sizeof). `cast(size_t)ptr % T.sizeof` may not be 0 in all cases.
 4. No more than 4 bits are reused (as the smallest GC object 
 size is 16 bytes)
Not correct. Considering the pointer is of type T*, then it is safe as long as align(T*) <= sizeof(T) which is correct on all architectures I know of (tho I wouldn't be surprised that some weird arch not used since the 70s may break this constraint).
I realized this was off after posting but I don't understand your reasoning either. The size and alignment just put a bound on the number of bits, but without verification in the setter you can't be sure, right?
Dec 04 2015
parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 4 December 2015 at 10:31:19 UTC, Vladimir Panteleev 
wrote:
 I realized this was off after posting but I don't understand 
 your reasoning either. The size and alignment just put a bound 
 on the number of bits, but without verification in the setter 
 you can't be sure, right?
If one of the bits within the alignment is not 0, that means you did something unsafe previously to create that pointer. There should be no safe way (and I know of no safe way) to create such a pointer. In fact, some hardware will outright fault if you try to manipulate unaligned data.
Dec 04 2015
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Friday, 4 December 2015 at 23:38:08 UTC, deadalnix wrote:
 On Friday, 4 December 2015 at 10:31:19 UTC, Vladimir Panteleev 
 wrote:
 I realized this was off after posting but I don't understand 
 your reasoning either. The size and alignment just put a bound 
 on the number of bits, but without verification in the setter 
 you can't be sure, right?
If one of the bit within the alignment is not 0, that mean you did something unsafe previously to create that pointer.
But this only applies to ... pointers to pointers, right? In D, only pointer variables have to be aligned to maintain safety, and even then that only applies to GC pointers (a C function may return an "unaligned" pointer pointer). struct { align(1): ubyte a; int b; } is still quite safe, and so is interpreting a pointer into an array of ubytes as a uint.
Dec 04 2015
parent reply deadalnix <deadalnix gmail.com> writes:
On Saturday, 5 December 2015 at 00:33:15 UTC, Vladimir Panteleev 
wrote:
 On Friday, 4 December 2015 at 23:38:08 UTC, deadalnix wrote:
 On Friday, 4 December 2015 at 10:31:19 UTC, Vladimir Panteleev 
 wrote:
 I realized this was off after posting but I don't understand 
 your reasoning either. The size and alignment just put a 
 bound on the number of bits, but without verification in the 
 setter you can't be sure, right?
If one of the bit within the alignment is not 0, that mean you did something unsafe previously to create that pointer.
But this only applies to ... pointers to pointers, right? In D, only pointer variables have to be aligned to maintain safety, and even then that only applies to GC pointers (a C function may return an "unaligned" pointer pointer). struct { align(1): ubyte a; int b; } is still quite safe, and so is interpreting a pointer into an array of ubytes as an uint.
No, a pointer has some alignment that depends on whatever data it points to. You cannot steal any bits from a char* (and taggedPointer will reject it), you can steal one bit from a short*, and so on. This is checked statically.
Dec 04 2015
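The static check described in the post above boils down to counting the low bits that are guaranteed to be zero in a properly aligned pointer. A minimal sketch follows; `freeLowBits` is a hypothetical helper name used for illustration, not the function Phobos uses internally.

```d
// Hypothetical helper (assumption): number of low bits that are always zero
// in a pointer aligned to `alignment`, i.e. log2 of a power-of-two alignment.
size_t freeLowBits(size_t alignment) pure nothrow @nogc @safe
{
    size_t bits = 0;
    while ((alignment >>= 1) != 0)
        ++bits;
    return bits;
}

// Evaluated at compile time, matching the examples in the post:
static assert(freeLowBits(char.alignof) == 0);  // nothing to steal from char*
static assert(freeLowBits(short.alignof) == 1); // one bit from short*
static assert(freeLowBits(int.alignof) == 2);   // two bits from int*

void main() {}
```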
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/4/15 10:57 PM, deadalnix wrote:
 On Saturday, 5 December 2015 at 00:33:15 UTC, Vladimir Panteleev wrote:
 On Friday, 4 December 2015 at 23:38:08 UTC, deadalnix wrote:
 On Friday, 4 December 2015 at 10:31:19 UTC, Vladimir Panteleev wrote:
 I realized this was off after posting but I don't understand your
 reasoning either. The size and alignment just put a bound on the
 number of bits, but without verification in the setter you can't be
 sure, right?
If one of the bit within the alignment is not 0, that mean you did something unsafe previously to create that pointer.
But this only applies to ... pointers to pointers, right? In D, only pointer variables have to be aligned to maintain safety, and even then that only applies to GC pointers (a C function may return an "unaligned" pointer pointer). struct { align(1): ubyte a; int b; } is still quite safe, and so is interpreting a pointer into an array of ubytes as an uint.
No a pointer has some alignment that depends on whatever data it points to. You cannot steal any bits for a char* (and taggedPointer will reject it) you can steal one bit from a short*, and so on. This is checked statically.
I think what Vladimir is referring to is an align(1) struct:

struct Foo
{
   align(1):
   ubyte a;
   int b;
}

Foo foo;
int *ptr = &foo.b; // not pointing at aligned integer

I think we should identify that tagged* does not support such pointers, and probably the ctor should assert this situation isn't occurring.

-Steve
Dec 04 2015
parent deadalnix <deadalnix gmail.com> writes:
On Saturday, 5 December 2015 at 04:34:03 UTC, Steven 
Schveighoffer wrote:
 I think what Vladimir is referring to is an align(1) struct:

 struct Foo
 {
    align(1):
    ubyte a;
    int b;
 }

 Foo foo;
 int *ptr = &foo.b; // not pointing at aligned integer

 I think we should identify that tagged* does not support such 
 pointers, and probably the ctor should assert this situation 
 isn't occurring.

 -Steve
Yeah that is unsafe. What will happen is very architecture dependent to boot, even if you don't use the bitfield thing. That should probably not be safe.
Dec 05 2015
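The runtime check suggested for this situation could look roughly like the sketch below. `isAligned` is a hypothetical helper, not an existing Phobos function, and the `uint` buffer is used only to make the address of `Foo` predictable for the demonstration.

```d
struct Foo
{
align(1):
    ubyte a;
    int b;
}

// With align(1), b sits directly after a, at offset 1.
static assert(Foo.b.offsetof == 1);

// Hypothetical helper (assumption): true if p satisfies T's natural alignment.
bool isAligned(T)(const(T)* p)
{
    return (cast(size_t) p & (T.alignof - 1)) == 0;
}

void main()
{
    int x;
    assert(isAligned(&x));        // an ordinary int is properly aligned

    // uint storage is at least 4-byte aligned, so a Foo placed at its start
    // has its int member at an address that is 1 modulo 4.
    auto buf = new uint[4];
    auto foo = cast(Foo*) buf.ptr;
    assert(!isAligned(&foo.b));   // a tagged pointer type should reject this
}
```

A check like this only catches the problem at run time, which is the trade-off raised elsewhere in the thread about asserts disappearing in -release builds.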
prev sibling parent reply ZombineDev <valid_email he.re> writes:
On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative programming, and (b) the advantages of using library 
 facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Dec 02 2015
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 2 December 2015 at 23:04:16 UTC, ZombineDev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
 Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative programming, and (b) the advantages of using 
 library facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Yeah, that'd be great if we could remove these scary warnings about the GC on these, this is only FUD. It works just fine with the GC.
Dec 02 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/02/2015 06:38 PM, deadalnix wrote:
 Yeah, that'd be great if we could remove these scary warning about the
 GC on these, this is only FUD. It works just fine with the GC.
LSBs may work, MSBs most likely not. -- Andrei
Dec 02 2015
parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 2 December 2015 at 23:51:40 UTC, Andrei 
Alexandrescu wrote:
 On 12/02/2015 06:38 PM, deadalnix wrote:
 Yeah, that'd be great if we could remove these scary warning 
 about the
 GC on these, this is only FUD. It works just fine with the GC.
LSBs may work, MSBs most likely not. -- Andrei
Yes, this checks that you are only stealing LSBs, and that the alignment allows you to steal that many. All the checks are in place to make it safe.
Dec 02 2015
prev sibling parent reply Marc Schütz <schuetzm gmx.net> writes:
On Wednesday, 2 December 2015 at 23:38:33 UTC, deadalnix wrote:
 On Wednesday, 2 December 2015 at 23:04:16 UTC, ZombineDev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei 
 Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of 
 generative programming, and (b) the advantages of using 
 library facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Yeah, that'd be great if we could remove these scary warning about the GC on these, this is only FUD. It works just fine with the GC.
With the current GC, yes. If we allow this, any future GC implementation will have to expect pointers to be misaligned. If a GC is type aware, it can use this information to reject false pointers without having to look them up. Anyway, I guess that will not affect performance much, so it's probably ok.
Dec 03 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/3/15 7:20 AM, Marc Schütz wrote:
 On Wednesday, 2 December 2015 at 23:38:33 UTC, deadalnix wrote:
 On Wednesday, 2 December 2015 at 23:04:16 UTC, ZombineDev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei Alexandrescu
 wrote:
 Once done, this is a fantastic example of (a) the power of
 generative programming, and (b) the advantages of using library
 facilities instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Yeah, that'd be great if we could remove these scary warning about the GC on these, this is only FUD. It works just fine with the GC.
With the current GC, yes. If we allow this, any future GC implementation will have to expect pointers to be misaligned. If a GC is type aware, it can use this information to reject false pointers without having to look them up. Anyway, I guess that will not affect performance much, so it's probably ok.
First I will say, there is confusion on what is valid and what is not. Misaligned pointers are pointers that are stored misaligned. In other words, they are stored not on a 4-byte or 8-byte boundary for 32 bits or 64 bits arch respectively.

An interior pointer is a pointer that is *properly aligned* but does not point at the first byte of a piece of memory. taggedPointer and taggedClassRef create *interior pointers*, not *misaligned pointers*. Andrei's proposal will create *misaligned pointers*. There is a huge difference.

I can make an interior pointer without casts on any type:

SomeType *pointer = ...;
void[] p = pointer[0..1];
p = p[1..$];

If the GC does not support this being the only pointer to a memory location, then the GC is not suitable for D. Period. Code will break in subtle ways if you use such a GC.

I can't see how a language with void* and/or unions could allow such a GC.

-Steve
Dec 03 2015
next sibling parent Marc Schütz <schuetzm gmx.net> writes:
On Thursday, 3 December 2015 at 13:02:24 UTC, Steven 
Schveighoffer wrote:
 First I will say, there is confusion on what is valid and what 
 is not. Misaligned pointers are pointers that are stored 
 misaligned. In other words, they are stored not on a 4-byte or 
 8-byte boundary for 32 bits or 64 bits arch respectively.

 An interior pointer is a pointer that is *properly aligned* but 
 does not point at the first byte of a piece of memory. 
 taggedPointer and taggedClassRef create *interior pointers*, 
 not *misaligned pointers*. Andrei's proposal will create 
 *misaligned pointers*. There is a huge difference.

 I can make an interior pointer without casts on any type:

 SomeType *pointer = ...;
 void[] p = pointer[0..1];
 p = p[1..$];

 If the GC does not support this being the only pointer to a 
 memory location, then the GC is not suitable for D. Period. 
 Code will break in subtle ways if you use such a GC.

 I can't see how a language with void* and/or unions could allow 
 such a GC.
Indeed, I was talking about interior pointers. But you're right, I missed the fact that void pointers (and some others) can be valid interior pointers even to unstructured values. So the optimization I had in mind is not applicable in D, anyway. We should then just adjust the specification to specifically allow changing the LSBs.
Dec 03 2015
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/3/15 8:02 AM, Steven Schveighoffer wrote:
 An interior pointer is a pointer that is *properly aligned* but does not
 point at the first byte of a piece of memory. taggedPointer and
 taggedClassRef create *interior pointers*, not *misaligned pointers*.
 Andrei's proposal will create *misaligned pointers*. There is a huge
 difference.
I need to correct this. Andrei's proposal does not create misaligned pointers (as he specifically calls for no shifting for such pointers), just pointers to memory that is unrelated to the referenced memory. The effect is the same -- you cannot rely on such pointers to keep the memory in the GC. -Steve
Dec 03 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/02/2015 06:04 PM, ZombineDev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of generative
 programming, and (b) the advantages of using library facilities
 instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Sigh, yes. Both taggedPointer and taggedClassRef should be features of bitfields, not distinct names. One good thing to do would be to integrate those within bitfields, and then later perhaps undocument them. -- Andrei
Dec 02 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/2/15 6:51 PM, Andrei Alexandrescu wrote:
 On 12/02/2015 06:04 PM, ZombineDev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei Alexandrescu wrote:
 Once done, this is a fantastic example of (a) the power of generative
 programming, and (b) the advantages of using library facilities
 instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Sigh, yes. Both taggedPointer and taggedClassRef should be features of bitfields, not distinct names. One good thing to do would be to integrate those within bitfields, and then later perhaps undocumented.
taggedPointer and taggedClassRef are GC safe (despite the incorrect warning listed in the docs). Your proposed mechanism is not.

IMO, we should keep those and close your enhancement, it doesn't add anything useful. Seems to me something that can break very easily.

Phobos should in no way support such egregious casts implicitly. Even in @system code.

Do you have any rationale to prefer arbitrary bitfield pointers over GC safe ones?

-Steve
Dec 03 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/03/2015 07:45 AM, Steven Schveighoffer wrote:
 On 12/2/15 6:51 PM, Andrei Alexandrescu wrote:
 On 12/02/2015 06:04 PM, ZombineDev wrote:
 On Wednesday, 2 December 2015 at 19:39:47 UTC, Andrei Alexandrescu
 wrote:
 Once done, this is a fantastic example of (a) the power of generative
 programming, and (b) the advantages of using library facilities
 instead of built-in features.

 https://issues.dlang.org/show_bug.cgi?id=15397

 Who would want to take it?


 Andrei
So, something like
Sigh, yes. Both taggedPointer and taggedClassRef should be features of bitfields, not distinct names. One good thing to do would be to integrate those within bitfields, and then later perhaps undocumented.
taggedPointer and taggedClassRef are GC safe (despite the incorrect warning listed in the docs). Your proposed mechanism is not.
It can be restricted to support what tagged* do.
 IMO, we should keep those and close your enhancement, it doesn't add
 anything useful. Seems to me something that can break very easily.
Please leave it open, thanks.
 Phobos should in no way support such egregious casts implicitly. Even in
 @system code.

 Do you have any rationale to prefer arbitrary bitfield pointers over GC
 safe ones?
1. The less restricted version offers use of high-order bits as well. If we don't support that, those who need it will do that in client code with the usual liabilities.

2. There's no reason for taggedPointer and taggedClassRef to exist. They should be integrated within bitfields.

Andrei
Dec 03 2015
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/3/15 9:28 AM, Andrei Alexandrescu wrote:
 On 12/03/2015 07:45 AM, Steven Schveighoffer wrote:
 taggedPointer and taggedClassRef are GC safe (despite the incorrect
 warning listed in the docs). Your proposed mechanism is not.
It can be restricted to support what tagged* do.
This is a possibility. Allowing higher bit manipulation is no good for the GC. Allowing lower bit manipulation that extends past a single element is no good also. These restrictions are enforced at compile time by the tagged functions.
 IMO, we should keep those and close your enhancement, it doesn't add
 anything useful. Seems to me something that can break very easily.
Please leave it open, thanks.
I of course would not close it, that is not my place.
 Phobos should in no way support such egregious casts implicitly. Even in
 @system code.

 Do you have any rationale to prefer arbitrary bitfield pointers over GC
 safe ones?
1. The less restricted version offers use of high-order bits as well.
Again, this is not GC-safe. But another thing taggedPointer and taggedClassRef do (that I think your proposal does not) is restrict the lower bits that can be manipulated based on the alignment of the target type.
 If
 we don't support that, those who need it will do that in client code
 with the usual liabilities.
The usual liabilities aren't mitigated by the proposal. I was under the impression that D should allow error-prone code, but shouldn't promote it.
 2. There's no reason for taggedPointer and taggedClassRef to exist. They
 should be integrated within bitfields.
One departure from bitfields for tagged* is that the API does not allow invalid pointer/bitfield specifications (the number of bits reserved for the pointer is implied from the pointer type and arch). Your proposal uses an assert to verify the bits from the pointer source are zero, allowing a possible corruption to occur if you compile with -release. The tagged* types prove at compile time that you will ALWAYS see zero bits there (because of alignment).

As long as bitfields follows the same rules and API, then I think it could potentially be merged. But I don't see a huge value in this, seems like an unnecessary code break.

-Steve
Dec 03 2015
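For reference, this is roughly how the existing `std.bitmanip.taggedPointer` facility discussed above is used. It is a minimal usage sketch: the `Node` type and its field names are made up for illustration, and only the positive case is shown. The number of tag bits is derived from the pointee's alignment, so a `uint*` leaves room for two 1-bit flags.

```d
import std.bitmanip : taggedPointer;

// Illustrative type (assumption): a node whose pointer word also carries
// two boolean flags in the low bits freed by uint's 4-byte alignment.
struct Node
{
    mixin(taggedPointer!(
        uint*, "next",
        bool, "marked", 1,
        bool, "leaf", 1));
}

void main()
{
    auto p = new uint;
    *p = 7;

    Node n;
    n.next = p;
    n.marked = true;
    n.leaf = false;

    // The pointer reads back unchanged and the flags are independent of it.
    assert(n.next is p);
    assert(*p == 7);
    assert(n.marked && !n.leaf);
}
```

Requesting more flag bits than the alignment provides is rejected when the mixin is instantiated, which is the compile-time guarantee the post contrasts with a runtime assert.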
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 3 December 2015 at 12:45:10 UTC, Steven 
Schveighoffer wrote:
 Do you have any rationale to prefer arbitrary bitfield pointers 
 over GC safe ones?
There are various valid uses of this in HHVM for instance. One of the nasty tricks that is used is to allocate the memory for JIT code in the lower 32 bits of memory, and then pad the pointer with 0 to retrieve it. Because of this, various data structures can be compacted, and the address of code can be crammed directly in the instruction stream (at least on x86) when it can't be on 64 bits. There is a talk by Drew Paroski where he explains it (https://www.youtube.com/watch?v=XqK8Yuoq4ig I think, but not sure).

However, I agree with the sentiment. This is the kind of feature you are looking for to get the last few percent and shouldn't be encouraged. That is highly non-portable and probably doesn't belong in an std module.

NB: I considered adding this functionality when doing the taggedPointer thing (x64 has 48 bits of effective address space), but eventually decided against it.
Dec 03 2015