
digitalmars.D - Is void* compatible with function pointers?

reply Ali Çehreli <acehreli yahoo.com> writes:
In C and C++, void* is for data pointers but it works (by accident?) for 
function pointers on all popular platforms.

Does D have anything to say about this topic?

Ali
Jun 23 2014
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Jun 2014 16:30:41 -0400, Ali Çehreli <acehreli yahoo.com> wrote:
 In C and C++, void* is for data pointers but it works (by accident?) for
 function pointers on all popular platforms.
Doing a bit of research, it seems that it is architecture dependent. For instance, a memory model may use pointers that are larger for function addresses than for object pointers. In this case, casting to and from void * would lose data. But most platforms have the same pointer width for both data and functions. So it works, not by accident, but because that's what the architecture dictates. I think it's important to state that the conversion from function pointer to void pointer is not specified as "undefined", but actually just not specified at all. I would say it's architecture-defined.
 Does D have anything to say about this topic?
Since most architectures use same-size words for function addresses and object addresses, D would be fine to say it's defined and valid. I think the extreme outliers are architectures that are not equal, and D will not be harmed too badly by making this distinction. Any D flavor that would be ported to such an architecture may have to be a derived language. -Steve
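A minimal sketch of that size argument in D, not taken from the thread: the alias `Fn` and the constant `fnPtrFitsInVoidPtr` are hypothetical names. It only makes the assumption explicit at compile time, so a port to a target with wider function pointers would fail to build instead of silently truncating.

```d
// Hypothetical alias for some function pointer type.
alias Fn = void function();

// True when a function pointer fits in a void* on the target
// this module is compiled for.
enum bool fnPtrFitsInVoidPtr = Fn.sizeof <= (void*).sizeof;

// Stops the build on a target where the assumption does not hold.
static assert(fnPtrFitsInVoidPtr,
    "function pointers are wider than void* on this target");
```

On the common 32-bit and 64-bit targets both sizes are expected to be equal, so the check costs nothing at run time.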
Jun 23 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Mon, Jun 23, 2014 at 04:49:27PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On Mon, 23 Jun 2014 16:30:41 -0400, Ali Çehreli <acehreli yahoo.com> wrote:
 
In C and C++, void* is for data pointers but it works (by accident?) for
function pointers on all popular platforms.
AFAIK, in C++ casting a function pointer to void* is undefined behaviour, and while it does work for the most common platforms, there are systems for which it would fail catastrophically. This is a particularly nasty sticking point for Posix C++ implementations (you can find discussions on this topic if you google for it), because according to the spec, dlsym() should never work for function symbols in the loaded library, though in practice, all Posix platforms (that I know of) don't actually have any problem with it. [...]
Does D have anything to say about this topic?
Since most architectures use same-size words for function addresses and object addresses, D would be fine to say it's defined and valid. I think the extreme outliers are architectures that are not equal, and D will not be harmed too badly by making this distinction. Any D flavor that would be ported to such an architecture may have to be a derived language.
[...] We *could* technically define D's void* to be the same size as the largest pointer type on that particular platform, and it would avoid the problem. The issue in C++ arises because void* is assumed to point to *data* as opposed to code, so it may be smaller than necessary to be a function pointer. But again, in practice I think it's safe to say that the platforms affected by this issue are few in number and not commonly-used, so we should be safe. T -- Study gravitation, it's a field with a lot of potential.
Jun 23 2014
parent reply "deadalnix" <deadalnix gmail.com> writes:
If it will work that way on most platforms, the other way around
can do VERY nasty things, even on common platforms like ARM.

This is because most CPUs consider the instructions as immutable.
Even x86 does not provide any guarantee (which makes it very hard
to swap implementation outside of a VM).

Casting from void* to function is pretty guaranteed to be
undefined behavior on all platforms, unless you emit the proper
barriers.
Jun 23 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Jun 2014 20:48:34 -0400, deadalnix <deadalnix gmail.com> wrote:

 If it will work that way on most platforms, the other way around
 can do VERY nasty things, even on common platforms like ARM.
Casting to and from a function pointer should not do anything horrific, unless the intermediate type cannot hold the full function pointer. It is no different than casting from immutable pointer to mutable pointer, and then back to immutable pointer. As long as you don't write to it while it's mutable, it's not illegal.
 This is because most CPUs consider the instructions as immutable.
 Even x86 does not provide any guarantee (which makes it very hard
 to swap implementation outside of a VM).
Remember, these are not functions, but function pointers. You are not modifying the function at all.
 Casting from void* to function is pretty guaranteed to be
 undefined behavior on all platforms, unless you emit the proper
 barriers.
Can you demonstrate a C example that will fail on ARM? -Steve
Jun 23 2014
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 24 June 2014 at 01:41:13 UTC, Steven Schveighoffer 
wrote:
 This is because most CPUs consider the instructions as immutable.
 Even x86 does not provide any guarantee (which makes it very hard
 to swap implementation outside of a VM).
Remember, these are not functions, but function pointers. You are not modifying the function at all.
void* is not a function and can come from anything that is mutable.
 Casting from void* to function is pretty guaranteed to be
 undefined behavior on all platforms, unless you emit the proper
 barriers.
Can you demonstrate a C example that will fail on ARM?
Anything that JITs is affected. Consider the pseudocode:

void JIT() {
     void* code = malloc(XXX);
     // Write instructions into the allocated chunk of memory.

     membar(); // Memory barrier to avoid reordering.

     auto fun = cast(void function()) code;
     fun(); // Even if your codegen is correct, anything can happen at this point.
}
Jun 23 2014
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 24 June 2014 at 04:24:51 UTC, deadalnix wrote:
 On Tuesday, 24 June 2014 at 01:41:13 UTC, Steven Schveighoffer 
 wrote:
 This is because most CPUs consider the instructions as immutable.
 Even x86 does not provide any guarantee (which makes it very hard
 to swap implementation outside of a VM).
Remember, these are not functions, but function pointers. You are not modifying the function at all.
void* is not a function and can come from anything that is mutable.
 Casting from void* to function is pretty guaranteed to be
 undefined behavior on all platforms, unless you emit the proper
 barriers.
Can you demonstrate a C example that will fail on ARM?
Anything that JITs is affected. Consider the pseudocode:

void JIT() {
     void* code = malloc(XXX);
     // Write instructions into the allocated chunk of memory.

     membar(); // Memory barrier to avoid reordering.

     auto fun = cast(void function()) code;
     fun(); // Even if your codegen is correct, anything can happen at this point.
}
That's one thing, but casting a function pointer to void* and then back again later, without modification, must be defined, no? That's a very different situation to forging your own executable code.
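The round trip in question can be sketched in D as follows. This is an illustration, not code from the thread: `IntFn`, `twice`, `erase` and `restore` are hypothetical names, and the sketch assumes a target where a function pointer is no wider than `void*`. Whether the language should guarantee the result is exactly what the thread is debating; the code only shows what the cast looks like and what it is expected to do on common platforms.

```d
// Hypothetical function pointer type used for the round trip.
alias IntFn = int function(int);

int twice(int n) { return n * 2; }

// Erase the type of an existing, valid function pointer.
void* erase(IntFn f)
{
    // Assumption: the pointer value survives only if it fits in a void*.
    static assert(IntFn.sizeof <= (void*).sizeof);
    return cast(void*) f;
}

// Restore the original type. The caller must cast back to exactly
// the type that was erased; nothing checks this.
IntFn restore(void* p)
{
    return cast(IntFn) p;
}
```

Calling `restore(erase(&twice))(21)` is expected to behave like `twice(21)`. No memory is written between the two casts, which is what separates this case from the JIT example.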
Jun 23 2014
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Jun 2014 00:24:50 -0400, deadalnix <deadalnix gmail.com> wrote:

 On Tuesday, 24 June 2014 at 01:41:13 UTC, Steven Schveighoffer wrote:
 This is because most CPUs consider the instructions as immutable.
 Even x86 does not provide any guarantee (which makes it very hard
 to swap implementation outside of a VM).
Remember, these are not functions, but function pointers. You are not modifying the function at all.
void* is not a function and can come from anything that is mutable.
 Casting from void* to function is pretty guaranteed to be
 undefined behavior on all platforms, unless you emit the proper
 barriers.
Can you demonstrate a C example that will fail on ARM?
Anything that JITs is affected. Consider the pseudocode:

void JIT() {
     void* code = malloc(XXX);
     // Write instructions into the allocated chunk of memory.

     membar(); // Memory barrier to avoid reordering.

     auto fun = cast(void function()) code;
     fun(); // Even if your codegen is correct, anything can happen at this point.
}
This is not what we are talking about. We are talking about whether a valid function pointer can be cast to void * and back again. It is undefined for C/C++. Should it be defined for D? -Steve
Jun 24 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Jun 24, 2014 at 11:36:26AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On Tue, 24 Jun 2014 00:24:50 -0400, deadalnix <deadalnix gmail.com> wrote:
[...]
Anything that JITs is affected. Consider the pseudocode:

void JIT() {
     void* code = malloc(XXX);
     // Write instructions into the allocated chunk of memory.

     membar(); // Memory barrier to avoid reordering.

     auto fun = cast(void function()) code;
     fun(); // Even if your codegen is correct, anything can happen at this point.
}
This is not what we are talking about. We are talking about whether a valid function pointer can be cast to void * and back again. It is undefined for C/C++. Should it be defined for D?
[...] It's a judgment call. Defining it means that (void*).sizeof must be the size of the largest pointer type on a particular platform, meaning that if, for example, pointer to data is 32-bit but pointer to function is 64-bit, then we're effectively doubling the size of void* (it must be 64-bit to support free casting to/from function pointers) relative to any other non-function pointer. So code that deals only with data void* will have to pay for this cost for no benefit.

Having said that, though, I have my doubts as to how significant this issue is -- almost all modern platforms that I'm aware of have equal pointer sizes for code and data alike -- it's just so much easier to make them uniform size; it makes compilers and linkers easier to implement, and runtime loaders easier to write, etc., not to mention simplifying CPU design (no need for two separate address decoders to deal with code vs. data, use uniform instruction formats for accessing code/data, etc.).

OTOH, perhaps the correct solution is for the user to handle this by using a union, thus guaranteeing that there will be no platform dependent issues (you can even throw delegates in there and it will work correctly, whereas currently, casting a delegate to void* definitely won't work properly):

union GenericPtr {
     void* data;
     void function() funcptr;
     void delegate() dg;
}

void func() {}
int x;

GenericPtr p;
p.data = &x;
p.funcptr = &func;
p.dg = () { do_something(); };
... // etc.

T -- "Holy war is an oxymoron." -- Lazarus Long
Jun 24 2014
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Jun 2014 12:55:09 -0400, H. S. Teoh via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Tue, Jun 24, 2014 at 11:36:26AM -0400, Steven Schveighoffer via  
 Digitalmars-d wrote:
 On Tue, 24 Jun 2014 00:24:50 -0400, deadalnix <deadalnix gmail.com>  
 wrote:
[...]
Anything that JITs is affected. Consider the pseudocode:

void JIT() {
     void* code = malloc(XXX);
     // Write instructions into the allocated chunk of memory.

     membar(); // Memory barrier to avoid reordering.

     auto fun = cast(void function()) code;
     fun(); // Even if your codegen is correct, anything can happen at this point.
}
This is not what we are talking about. We are talking about whether a valid function pointer can be cast to void * and back again. It is undefined for C/C++. Should it be defined for D?
[...] It's a judgment call. Defining it means that (void*).sizeof must be the size of the largest pointer type on a particular platform, meaning that if, for example, pointer to data is 32-bit but pointer to function is 64-bit, then we're effectively doubling the size of void* (it must be 64-bit to support free casting to/from function pointers) relative to any other non-function pointer. So code that deals only with data void* will have to pay for this cost for no benefit.
There are other options. We could simply not support that platform (I don't think this is a huge loss), or we could disallow casting function pointers to void * on that platform. -Steve
Jun 24 2014
prev sibling parent reply "Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:
On Monday, 23 June 2014 at 20:49:27 UTC, Steven Schveighoffer 
wrote:
 Since most architectures use same-size words for function 
 addresses and object addresses, D would be fine to say it's 
 defined and valid. I think the extreme outliers are 
 architectures that are not equal, and D will not be harmed too 
 badly by making this distinction. Any D flavor that would be 
 ported to such an architecture may have to be a derived 
 language.

 -Steve
While it might be fine, I would be concerned that people wouldn't understand the difference between a function and a delegate. They would figure that if you can store a function reference in a void* then you should be able to fit a delegate in as well, and proceed to lose data. I would make it something where the compiler forces you to make an explicit cast. Before that, it should warn you about the potential loss of data.
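To illustrate the function/delegate difference behind that concern, here is a small D sketch that is not part of the thread. `Dg`, `Counter` and `rebuild` are hypothetical names; `.ptr` and `.funcptr` are the delegate properties D provides. The `static assert` on the size reflects the usual two-word delegate layout and is stated here as an assumption about the target.

```d
// Hypothetical delegate type for the illustration.
alias Dg = int delegate();

struct Counter
{
    int n;
    int next() { return ++n; }
}

// Assumption: a delegate occupies two pointer-sized words, so it
// cannot fit in a single void* without losing one of them.
static assert(Dg.sizeof == 2 * (void*).sizeof);

// Split a delegate into its two halves and put it back together.
Dg rebuild(Dg dg)
{
    void* ctx = dg.ptr;     // context pointer (here, the Counter's address)
    auto fn = dg.funcptr;   // address of the code to run
    Dg copy;
    copy.ptr = ctx;
    copy.funcptr = fn;
    return copy;
}
```

Keeping only one of the two halves, as a cast to `void*` would, leaves either the code without its context or the context without its code.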
Jun 23 2014
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Jun 2014 17:16:04 -0400, Chris Williams  
<yoreanon-chrisw yahoo.co.jp> wrote:

 On Monday, 23 June 2014 at 20:49:27 UTC, Steven Schveighoffer wrote:
 Since most architectures use same-size words for function addresses and  
 object addresses, D would be fine to say it's defined and valid. I  
 think the extreme outliers are architectures that are not equal, and D  
 will not be harmed too badly by making this distinction. Any D flavor  
 that would be ported to such an architecture may have to be a derived  
 language.
While it might be fine, I would be concerned that people wouldn't understand the difference between a function and a delegate. They would figure that if you can store a function reference in a void* then you should be able to fit a delegate in as well, and proceed to lose data. I would make it something where the compiler forces you to make an explicit cast. Before that, it should warn you about the potential loss of data.
That shouldn't work, even for an explicit cast. It is currently deprecated; I'm not sure in which version it will be removed (I didn't know it ever worked in the first place!) -Steve
Jun 23 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/23/2014 1:30 PM, Ali Çehreli wrote:
 In C and C++, void* is for data pointers but it works (by accident?) for
 function pointers on all popular platforms.
For 16 bit programs, function pointers can be different sizes from data pointers. But D doesn't support 16 bit programming, and I know of no 32+ bit platform that does this. It is possible to put the x86 in a mode with 48 bit function pointers, but so far as I know no C compiler allows that anyway.
 Does D have anything to say about this topic?
I'd say they were the same size. If, by some quirk, someone wants to port D to a platform where this was not true, they could certainly get it to work, but I don't think it is practical to burden the other 99.99999% with worrying about that.
Jun 24 2014