digitalmars.D - Fragile ABI

reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
http://michelf.ca/blog/2009/some-ideas-for-dynamic-vtables-in-d/

The above blog post, written in 2009, proposes a system for 
solving the Fragile ABI Problem in D. Just wondering whether 
anything like this is a planned feature for druntime.

C++'s fragile ABI makes it very difficult to write class 
libraries without some sort of workaround. For example, RapidXML 
and AGG are distributed as source code; GDI+ is a header-only 
wrapper over an underlying C interface; and Qt makes heavy use of 
the Pimpl idiom, which makes its source code much more complex 
than it needs to be. This is also a major problem for any program 
which wants to expose a plugin API.

It would be nice if D could sidestep this issue. It's frustrating 
that C is currently the only real option for developing native 
libraries without worrying about their ABI.
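
As a rough illustration of the fragility described above, here is a minimal sketch in D. `WidgetV1` and `WidgetV2` are hypothetical stand-ins for two releases of the same library type; they are not taken from any of the libraries mentioned. Client code compiled against the first layout has the field offset and the object size baked in as constants, so inserting a field breaks it even though no client source changed.

```d
import std.stdio;

// Hypothetical library type as shipped in release 1.
struct WidgetV1
{
    int width;
    int height;
}

// The same type in release 2, with a field inserted in the middle.
struct WidgetV2
{
    int width;
    int depth;   // new in release 2
    int height;
}

void main()
{
    // A client compiled against release 1 reads `height` at a fixed offset
    // and allocates a fixed number of bytes; both constants differ in release 2.
    writeln("height offset: ", WidgetV1.height.offsetof, " -> ", WidgetV2.height.offsetof);
    writeln("object size:   ", WidgetV1.sizeof, " -> ", WidgetV2.sizeof);
}
```

The same reasoning applies to vtable slots: adding a virtual method shifts the slot numbers that already-compiled callers use.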
Aug 16 2012
next sibling parent reply "Kagamin" <spam here.lot> writes:
You can also use C++ to develop COM components which have 
standardized ABI.
Aug 16 2012
next sibling parent reply Piotr Szturmaj <bncrbme jadamspam.pl> writes:
Kagamin wrote:
 You can also use C++ to develop COM components which have standardized ABI.
...only on Windows.
Aug 16 2012
next sibling parent reply "Ivan Trombley" <itrombley dot-borg.org> writes:
On Thursday, 16 August 2012 at 15:54:12 UTC, Piotr Szturmaj wrote:
 Kagamin wrote:
 You can also use C++ to develop COM components which have 
 standardized ABI.
 ...only on Windows.
Code compiled with VC has a different vtable layout than code compiled with, say, GCC, so even on Windows there is no standard.
Aug 18 2012
parent "Kagamin" <spam here.lot> writes:
On Saturday, 18 August 2012 at 20:36:11 UTC, Ivan Trombley wrote:
 On Thursday, 16 August 2012 at 15:54:12 UTC, Piotr Szturmaj 
 wrote:
 Kagamin wrote:
 You can also use C++ to develop COM components which have 
 standardized ABI.
 ...only on Windows.
 Code compiled with VC has a different vtable layout than code compiled with say GCC, so even on Windows there is no standard.
You're probably talking about the C++ ABI in the context of multiple and virtual inheritance. I doubt that gcc can't do COM. Even the dmd backend can.
Aug 21 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-08-16 17:54, Piotr Szturmaj wrote:
 Kagamin wrote:
 You can also use C++ to develop COM components which have standardized
 ABI.
 ...only on Windows.
What about XPCOM? http://en.wikipedia.org/wiki/XPCOM
-- /Jacob Carlborg
Aug 20 2012
prev sibling parent reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
I'm aware that just exposing class interfaces (via COM or other 
means) is an option, but that tends to cause many problems of its 
own:

- You can no longer call new or delete on the underlying class; 
you're obliged to use factory methods. This leads to issues with 
templates and stack allocation (although this will be less of an 
issue for D compared to C++).

- You can't directly inherit from any of the library's classes. 
This is an issue for any library which wants to allow "custom 
components" (for example, widgets in a GUI toolkit, dialog boxes 
in a browser plugin, or controls in a desktop panel). Coding 
strictly against interfaces is one workaround, but that often 
makes things more complicated than they need to be.

- The library can't provide template methods for classes exposed 
this way, and class methods can no longer be inlined.

- It forces the developer to maintain both a public interface and 
a private implementation; he can no longer code as though he's 
writing classes for his own use.

All of these are relatively minor issues, but taken together, 
they make library development quite difficult. For example, I 
think it would be almost impossible to develop a COM GUI toolkit.
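
The constraints listed above can be sketched in D as follows. This is only an illustration under stated assumptions: `IWidget`, `WidgetImpl` and `createWidget` are hypothetical names, and everything sits in one module here purely so that the example is self-contained.

```d
import std.stdio;

// Public part of a hypothetical library: only the interface and a factory.
interface IWidget
{
    int width();
    void draw();
}

// Implementation detail. In a real library this class would live in a
// module that is not shipped to clients, so they could neither `new` it
// nor derive from it.
private final class WidgetImpl : IWidget
{
    private int w;
    this(int w) { this.w = w; }
    int width() { return w; }
    void draw() { writeln("widget of width ", w); }
}

// The only way for a client to obtain an instance.
IWidget createWidget(int w)
{
    return new WidgetImpl(w);
}

void main()
{
    IWidget widget = createWidget(42);
    widget.draw();
    assert(widget.width() == 42);
    // Every call above goes through the interface's vtable, so none of
    // them can be inlined into the client.
}
```

Note that the developer now maintains two things (the interface and the implementation), which is the last point in the list.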
Aug 16 2012
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 16 August 2012 at 16:25:01 UTC, R Grocott wrote:
 All of these are relatively minor issues, but taken together, 
 they make library development quite difficult. For example, I 
 think it would be almost impossible to develop a COM GUI 
 toolkit.
And yet that is what DirectX and WinRT are all about.

When you use programming models like COM, you are programming against interfaces; this is also known as component model programming in the CS papers.

Your GUI toolkit just needs to expose interfaces for the different types of events it is required to handle, and you give, via API calls, objects whose instances implement said interfaces.

To avoid too much code rewriting, some component systems, COM included, support a form of delegation where you can implement just a part of the interface, while delegating the remaining calls to a contained object.
-- Paulo
Aug 16 2012
parent reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
 And yet that is what DirectX and WinRT are all about.
DirectX, yes: it's a good example of an OO library with a purely interface-based API. The only DirectX plugin architecture I can bring to mind, though (DirectShow filters), actually uses C++ classes (such as CSource) to hide a lot of the underlying complexity. I get the impression that wouldn't be necessary if interface-based plugins were as simple to create as inheritance-based ones.

Likewise, WinRT actually hides a huge amount of complexity inside its language bindings. Inheriting from a WinRT object using only the underlying COM interfaces involves a lot of hassle, and is a prime example of what I'm talking about. See here:

http://www.interact-sw.co.uk/iangblog/2011/09/25/native-winrt-inheritance
 Your GUI toolkit just needs to expose interfaces for the 
 different
 types of events it is required to handle, and you give via API 
 calls
 objects whose instances implement said interfaces.

 To avoid too much code rewrite, some component systems, COM 
 included,
 support a form of delegation where you can implement just a 
 part of the
 interface, while delegating the remaining calls to a contained 
 object.
That sounds clean in theory - but so does Pimpl, and I know from experience that Pimpl tends to make a horrible mess out of any codebase. It doesn't help that, as far as I know, COM is Windows-only, and D doesn't natively support the method-defaulting system you described.

In practice, I think that proper interoperability w/r/t classes and inheritance will tend to be cleaner than coding strictly against an interface. This can be demonstrated by choosing any base-class derived-class pair, from any OO codebase, and attempting to rewrite that parent-child relationship as a server-client one.
Aug 16 2012
next sibling parent reply "Jacob Carlborg" <doob me.com> writes:
On Thursday, 16 August 2012 at 20:19:27 UTC, R Grocott wrote:

 Likewise, WinRT actually hides a huge amount of complexity 
 inside its language bindings. Inheriting from a WinRT object 
 using only the underlying COM interfaces involves a lot of 
 hassle, and is a prime example of what I'm talking about. See 
 here:

 http://www.interact-sw.co.uk/iangblog/2011/09/25/native-winrt-inheritance
Someone has written a piece of code making it easy to use WinRT from D. I don't have the link but it should be possible to find in some of the newsgroups.
-- /Jacob Carlborg
Aug 17 2012
parent reply "xenon325" <1 mail.net> writes:
On Friday, 17 August 2012 at 13:49:22 UTC, Jacob Carlborg wrote:
 Someone has written a piece of code making it easy to use WinRT 
 from D. I don't have the link but it should be possible to find 
 in some of the newsgroups.
Are you referring to this: http://lunesu.com/uploads/ModernCOMProgramminginD.pdf ?
Aug 22 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-08-22 11:16, xenon325 wrote:
 On Friday, 17 August 2012 at 13:49:22 UTC, Jacob Carlborg wrote:
 Someone has written a piece of code making it easy to use WinRT from
 D. I don't have the link but it should be possible to find in some of
 the newsgroups.
 Are you referring to this: http://lunesu.com/uploads/ModernCOMProgramminginD.pdf
Yes, exactly.
-- /Jacob Carlborg
Aug 22 2012
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Thursday, 16 August 2012 at 20:19:27 UTC, R Grocott wrote:
 That sounds clean in theory - but so does Pimpl, and I know 
 from experience that Pimpl tends to make a horrible mess out of 
 any codebase. It doesn't help that, as far as I know, COM is 
 Windows-only, and D doesn't natively support the 
 method-defaulting system you described.

 In practice, I think that proper interoperability w/r/t classes 
 and inheritance will tend to be cleaner than coding strictly 
 against an interface. This can be demonstrated by choosing any 
 base-class derived-class pair, from any OO codebase, and 
 attempting to rewrite that parent-child relationship as a 
 server-client one.
What you ask for sounds quite similar to COM composition with delegation. If the problem is just fields and vtable shifts, D interfaces should work too. Though there's no easy way for composition and delegation with interfaces in D AFAIK. This already led to the null_t hack, but the problem wasn't really solved.
Aug 20 2012
parent reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
On Monday, 20 August 2012 at 15:26:48 UTC, Kagamin wrote:
 What you ask for sounds quite similar to COM composition with 
 delegation.
Would anybody mind linking to resources which describe COM composition with delegation? It's been suggested twice in this thread as an alternative way to develop a non-fragile API, but anything related to COM is almost invisible to search engines (even more so than D itself).
Aug 20 2012
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Monday, 20 August 2012 at 18:37:00 UTC, R Grocott wrote:
 On Monday, 20 August 2012 at 15:26:48 UTC, Kagamin wrote:
 What you ask for sounds quite similar to COM composition with 
 delegation.
Would anybody mind linking to resources which describe COM composition with delegation? It's been suggested twice in this thread as an alternative way to develop a non-fragile API, but anything related to COM is almost invisible to search engines (even moreso than D itself).
You just need to know Windows development well, as this type of information requires MSDN navigation skills. :)

The best book about COM programming that I know is The Waite Group's COM/DCOM Primer Plus:

http://www.amazon.com/DCOM-Primer-Plus-Waite-Group/dp/0672314924/ref=sr_1_cc_1?s=aps&ie=UTF8&qid=1345529642&sr=1-1-catcorr

Back to how composition works:

http://msdn.microsoft.com/en-us/library/ms678443%28v=vs.85%29
http://msdn.microsoft.com/en-us/library/ms686558%28v=vs.85%29

With ATL the code gets much simpler to follow:

http://support.microsoft.com/kb/173823

Another example from CodeGuru:

http://www.codeguru.com/cpp/com-tech/atl/article.php/c3579/Containment-and-Aggregation.htm
-- Paulo
Aug 20 2012
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Monday, 20 August 2012 at 18:37:00 UTC, R Grocott wrote:
 On Monday, 20 August 2012 at 15:26:48 UTC, Kagamin wrote:
 What you ask for sounds quite similar to COM composition with 
 delegation.
Would anybody mind linking to resources which describe COM composition with delegation? It's been suggested twice in this thread as an alternative way to develop a non-fragile API, but anything related to COM is almost invisible to search engines (even moreso than D itself).
    class MyButton: IButton
    {
        IButton impl;

        // void IButton.text(string)
        @property void text(string t)
        {
            impl.text = t;
        }

        // ... other IButton methods
    }

slightly similar to what emplace does
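
A self-contained version of the snippet above might look like the following. It is a minimal sketch, not a definitive design: the members of `IButton`, the `LibraryButton` class and the `clicks` counter are all assumptions made up for illustration, since the post only shows the `text` setter.

```d
import std.stdio;

// Hypothetical interface exported by a GUI library.
interface IButton
{
    @property void text(string t);
    @property string text();
    void click();
}

// Stand-in for the library's own implementation (hypothetical).
class LibraryButton : IButton
{
    private string text_;
    @property void text(string t) { text_ = t; }
    @property string text() { return text_; }
    void click() { writeln("clicked: ", text_); }
}

// Client class: implements IButton by forwarding to a contained object,
// customising only the call it cares about.
class MyButton : IButton
{
    private IButton impl;
    int clicks;

    this(IButton impl) { this.impl = impl; }

    @property void text(string t) { impl.text = t; }
    @property string text() { return impl.text; }

    void click()
    {
        ++clicks;       // custom behaviour
        impl.click();   // then delegate to the contained object
    }
}

void main()
{
    auto b = new MyButton(new LibraryButton);
    b.text = "OK";
    b.click();
    assert(b.text == "OK");
    assert(b.clicks == 1);
}
```

Every forwarding method has to be written by hand here, which is the "no easy way" mentioned earlier in the thread.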
Aug 21 2012
prev sibling parent reply "David Piepgrass" <qwertie256 gmail.com> writes:
On Monday, 20 August 2012 at 18:37:00 UTC, R Grocott wrote:
 On Monday, 20 August 2012 at 15:26:48 UTC, Kagamin wrote:
 What you ask for sounds quite similar to COM composition with 
 delegation.
Would anybody mind linking to resources which describe COM composition with delegation? It's been suggested twice in this thread as an alternative way to develop a non-fragile API, but anything related to COM is almost invisible to search engines (even moreso than D itself).
There's nothing novel about COM except aggregation, and aggregation is just an implementation detail where a class pretends that it implements an interface but the calls to that interface go to another object, conceptually it's like "alias this" except that a dynamic cast (i.e. QueryInterface) is required to reach the second object:

http://msdn.microsoft.com/en-us/library/ms686558(v=vs.85)

For the most part COM sucks really bad: it is a very ordinary object-oriented ABI but without numerous features that we otherwise take for granted:

- In COM, you can't define static methods
- In COM, you can't overload functions
- In COM, constructors can't have arguments
- In COM, there are no fields, only properties
- In COM, class inheritance is not allowed (an interface IB can inherit from IA, but if you implement a class A that implements IA, you can't write a class B that derives from A and implements IB. In C++/ATL a template-based workaround is possible if A and B are in the same DLL.)

Moreover COM ABIs are fragile, in that there is almost zero support for adding or removing methods without either breaking everything or creating a new, independent, incompatible version (the only exception: you can safely add a method at the end of an interface, if you can be certain that no other interface inherits from it.)

Finally, it's Windows-only (although it has been reimplemented on Linux, e.g. for WINE) and modules must be registered in the Windows Registry.

I think the only reason we still use COM today is that, sadly, there is no other OO standard interoperable with all languages. C++ vtables are the closest competitor; I guess their fatal flaw is that there is no standard for memory management across C++ DLLs.
Aug 21 2012
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Tuesday, 21 August 2012 at 15:15:02 UTC, David Piepgrass wrote:
 [..]
 I think the only reason we still use COM today is that, sadly, 
 there is no other OO standard interoperable with all 
 languages. C++ vtables are the closest competitor; I guess 
 their fatal flaw is that there is no standard for memory 
 management across C++ DLLs.
Even .NET, with its goal of supporting multiple languages, has the CLS as the common set of datatypes and OO concepts to support across .NET languages. Given that OO has so many types of possible implementations, it is hard to implement an ABI that works across multiple languages.

Let's see how the improved COM (WinRT) turns out to be.
-- Paulo
Aug 21 2012
parent reply "David Piepgrass" <qwertie256 gmail.com> writes:
 I think the only reason we still use COM today is that, sadly, 
 there is no other OO standard interoperable with all 
 languages. C++ vtables are the closest competitor; I guess 
 their fatal flaw is that there is no standard for memory 
 management across C++ DLLs.
 Even .NET with its goal of supporting multiple languages has the CLS as the common set of datatypes and OO concepts to support across .NET languages. Given that OO has so many types of possible implementations, it is hard to implement an ABI that works across multiple languages.
Sure, but .NET apps are not limited to CLS. Two different .NET languages can easily interoperate outside the rules of CLS (as long as it is still within the rules of .NET), whereas operating beyond the limits of COM is much harder. Besides that, CLS itself is far more expansive than COM, allowing function overloading, inheritance, constructor arguments, etc.

It's unfortunate that .NET has limitations that make it hard for languages with novel features, like D, to fit in. (D could target .NET, of course, but there would be a significant cost in terms of performance, interoperability with other .NET code, and/or limitations on what D code can do.)
 Let's see how the improved COM (WinRT) turns out to be.
Sadly, WinRT is again intended to be Windows-only, so developers like me that hate lock-in will avoid it in preference for .NET (hi Mono!) and yucky old C.
Aug 21 2012
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 22 August 2012 at 00:15:12 UTC, David Piepgrass 
wrote:
 Let's see how the improved COM (WinRT) turns out to be.
 Sadly, WinRT is again intended to be Windows-only, so developers like me that hate lock-in will avoid it in preference for .NET (hi Mono!) and yucky old C.
Because UNIX systems are still in the stone age in terms of ABI, as they have barely changed since the '70s and no one seems to care enough to change things.

I like UNIX a lot, but I got to know it after learning what is possible in more advanced languages, so it always dismays me that, especially when dealing with most commercial UNIXes, it feels like being in the '70s computing age.

So that leaves only Apple and Microsoft with room for real OS innovation in mainstream OSes, and like any vendor they prefer to look for solutions that fit only their OS.

Mac OS X is also UNIX, but Apple has already changed it quite a lot compared with the other vendors, hence my Apple remark.
-- Paulo
Aug 21 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-22 06:32:29 +0000, "Paulo Pinto" <pjmlp progtools.org> said:

 Because UNIX systems are still in the stone age in terms of ABI, as they have barely changed since the '70s and no one seems to care enough to change things. [..] So that leaves only Apple and Microsoft with room for real OS innovation in mainstream OSes, and like any vendor they prefer to look for solutions that fit only their OS. Mac OS X is also UNIX, but Apple has already changed it quite a lot compared with the other vendors, hence my Apple remark.
Actually, the difference is standardization. Microsoft's COM and Apple's Objective-C runtime are built on top of C APIs (and you can access them through C if you want, although it's a little awkward). COM implementations and Objective-C runtime implementations exist for other UNIXes too, as well as other similar things, but no one is pushing them enough for them to become a standard.
-- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Aug 22 2012
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 22 August 2012 at 12:56:12 UTC, Michel Fortin wrote:
 Actually, the difference is standardization. Microsoft's COM and Apple's Objective-C runtime are built on top of C APIs (and you can access them through C if you want, although it's a little awkward). COM implementations and Objective-C runtime implementations exist for other UNIXes too, as well as other similar things, but no one is pushing them enough for them to become a standard.
Yes, I have to agree. Wasn't Taligent something in that direction?
-- Paulo
Aug 22 2012
prev sibling parent "Kagamin" <spam here.lot> writes:
On Tuesday, 21 August 2012 at 15:15:02 UTC, David Piepgrass wrote:
 Finally, it's Windows-only (although it has been reimplemented 
 on Linux, e.g. for WINE) and modules must be registered in the 
 Windows Registry.
Conceptually, COM is just a set of ideas for interoperable component design, which happens to actually work across languages and frameworks: .NET, C, C++, Delphi, D. The registry works only as a centralized repository of components, which is needed if you want to plug arbitrary components into your system; it has nothing to do with the problem of fragile ABI.
Aug 23 2012
prev sibling next sibling parent reply "dsimcha" <dsimcha yahoo.com> writes:
On Thursday, 16 August 2012 at 14:58:23 UTC, R Grocott wrote:
 C++'s fragile ABI makes it very difficult to write class 
 libraries without some sort of workaround. For example, 
 RapidXML and AGG are distributed as source code; GDI+ is a 
 header-only wrapper over an underlying C interface; and Qt 
 makes heavy use of the Pimpl idiom, which makes its source code 
 much more complex than it needs to be. This is also a major 
 problem for any program which wants to expose a plugin API.
Since pimpl is useful but messy, given D's metaprogramming capabilities, maybe what we need is a Pimpl template in Phobos:

    // The implementation struct.
    struct SImpl
    {
        int a, b, c;
        void fun() {}
    }

    // Automatically generate code for the Pimpl wrapper.
    alias Pimpl!SImpl S;

    auto s = new S;

On the other hand, IIUC Pimpl doesn't solve the vtable part of the problem, only the data members part. (Correct me if I'm wrong here, since I admit to knowing very little about the fragile ABI problem or its workarounds.)
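
One way such a template could be sketched is shown below. This is an assumption-laden illustration, not an existing Phobos API: the `Pimpl` class, its use of `opDispatch`, and the `sum` and `BigImpl` names are all invented for the example. It only demonstrates the forwarding mechanics. Because `opDispatch` is a template, the forwarding code is instantiated in the client, so for real ABI stability the forwarding functions would have to be compiled into the library instead.

```d
// Hidden implementation struct.
struct SImpl
{
    int a, b, c;
    int sum() { return a + b + c; }
}

// A second implementation with more data, used to compare wrapper sizes.
struct BigImpl
{
    int[64] data;
}

// Hypothetical wrapper: holds only a pointer and forwards member access
// to the implementation through opDispatch.
final class Pimpl(Impl)
{
    private Impl* impl;

    this() { impl = new Impl; }

    auto opDispatch(string name, Args...)(Args args)
    {
        static if (__traits(compiles, mixin("impl." ~ name ~ "(args)")))
            return mixin("impl." ~ name ~ "(args)");      // method call
        else static if (Args.length == 1)
            return mixin("impl." ~ name ~ " = args[0]");  // field write
        else
            return mixin("impl." ~ name);                 // field read
    }
}

alias S = Pimpl!SImpl;

// The wrapper's own size does not depend on the implementation's fields.
static assert(__traits(classInstanceSize, Pimpl!SImpl)
           == __traits(classInstanceSize, Pimpl!BigImpl));

void main()
{
    auto s = new S;
    s.a = 1;
    s.b = 2;
    s.c = 3;
    assert(s.sum() == 6);
    assert(s.a == 1);
}
```

As the post says, this addresses only the data-member part of the problem; the wrapper's own virtual methods, if it had any, would still sit in fixed vtable slots.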
Aug 16 2012
parent "R Grocott" <rgrocottbugzilla gmail.com> writes:
Hm. Come to think of it, I have another question for somebody 
knowledgeable about this stuff:

The blog post I linked only talks about rewriting vtables at 
dynamic-link-time. Would it be possible to implement something 
similar for the sizes of structs and scope-classes, and the 
relative address of member variables? dsimcha's post has led me 
to realize that without these additional features, the ABI would 
still be quite fragile.

I know very little about dynamic linking, but I guess it might be 
too expensive to justify? If nothing else, it would probably 
block a few compile-time optimisations (since offsets and sizes 
would have to be treated as unknown variables rather than integer 
constants).
Aug 17 2012
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-16 14:58:22 +0000, "R Grocott" <rgrocottbugzilla gmail.com> said:

 http://michelf.ca/blog/2009/some-ideas-for-dynamic-vtables-in-d/
 
 The above blog post, written in 2009, proposes a system for solving the 
 Fragile ABI Problem in D. Just wondering whether anything like this is 
 a planned feature for druntime.
I wrote this post. I'd still like to have a non-fragile ABI. But there's probably a tradeoff to make: if you want absolute performance, a non-fragile ABI will get in your way. If you're writing a GUI toolkit, performance probably isn't going to be your biggest concern, but if you're writing a game it might. So I think this problem would be worth solving in a way that allows the designer of a class to choose.
 It would be nice if D could sidestep this issue. It's frustrating that 
 C is currently the only real option for developing native libraries 
 without worrying about their ABI.
It's frustrating indeed.

The D/Objective-C proof of concept I developed is interesting in this regard: extern(Objective-C) classes have a non-fragile ABI (for methods, and for fields too if I add support for Apple's modern runtime). So basically you have fragile and non-fragile objects living together, using two distinct class hierarchies. The extern(Objective-C) hierarchy is the non-fragile one, but slightly slower too.
-- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Aug 17 2012
prev sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 16 August 2012 at 14:58:23 UTC, R Grocott wrote:
 http://michelf.ca/blog/2009/some-ideas-for-dynamic-vtables-in-d/

 The above blog post, written in 2009, proposes a system for 
 solving the Fragile ABI Problem in D. Just wondering whether 
 anything like this is a planned feature for druntime.

 C++'s fragile ABI makes it very difficult to write class 
 libraries without some sort of workaround. For example, 
 RapidXML and AGG are distributed as source code; GDI+ is a 
 header-only wrapper over an underlying C interface; and Qt 
 makes heavy use of the Pimpl idiom, which makes its source code 
 much more complex than it needs to be. This is also a major 
 problem for any program which wants to expose a plugin API.

 It would be nice if D could sidestep this issue. It's 
 frustrating that C is currently the only real option for 
 developing native libraries without worrying about their ABI.
This is only so because, in most mainstream OSes, the C ABI is the OS ABI. If you make use of an OS written in, let's say, Modula-2, then the OS ABI would be a Modula-2 ABI, which even C would have to comply with somehow.

I think z/OS, if I'm not mistaken about the OS, is one of the few OSes where you have a good ABI across multiple languages. This is similar to what Microsoft somehow tries to achieve with COM and .NET.
-- Paulo
Aug 17 2012
parent reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
Paulo -

Surely there are ways to work around the OS's native ABI, though, 
even from within compiled code?

Rewriting object vtables (etc.) might be a bit more involved than 
simple name mangling, but I'm almost certain it would be possible 
to implement it as part of druntime. The blog post which I linked 
describes one way to go about it.

Michel -

As far as I can tell, most of the performance cost in your idea 
comes from the additional level of indirection, since you store 
vtable offsets in a global variable which must be read before 
each function call.

This might just be wishful thinking, but would it be possible for 
druntime to write new vtable offsets directly into the program's 
machine code, at dynamic-link time? That would remove all of the 
run-time performance overhead, but, as I say, I'm not sure 
whether it's actually possible.

I think the biggest obstacle is that it would require the .text 
section (or equivalent) of the executable to be writable. I know 
that's possible on Linux, but I'm not sure whether the same is 
true for most other operating systems. I guess it might also be a 
security risk?
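
To make the indirection being discussed concrete, here is a minimal sketch in D. It is based only on the description in this post (vtable offsets stored in a global variable and read before each call); the blog post's actual design is not reproduced here, and `Library`, `bindSlots`, `perimeterSlot` and the three functions are hypothetical names. A plain array of function pointers stands in for a real vtable.

```d
import std.stdio;

alias Method = int function(int);

int area(int side)       { return side * side; }
int perimeter(int side)  { return 4 * side; }
int diagonalSq(int side) { return 2 * side * side; }

// Stand-in for a library release: method names in vtable order, plus the table.
struct Library
{
    string[] names;
    Method[] vtable;
}

Library releaseV1()
{
    return Library(["area", "perimeter"], [&area, &perimeter]);
}

// Release 2 inserts a method before `perimeter`, shifting its slot.
Library releaseV2()
{
    return Library(["area", "diagonalSq", "perimeter"],
                   [&area, &diagonalSq, &perimeter]);
}

// Fragile client: the slot index was fixed when compiling against release 1.
enum size_t fixedPerimeterSlot = 1;

// Non-fragile client: the slot index is a global filled in once at load time.
size_t perimeterSlot;

// Stand-in for what the runtime or dynamic linker would do at load time.
void bindSlots(Library lib)
{
    foreach (i, name; lib.names)
        if (name == "perimeter")
            perimeterSlot = i;
}

void main()
{
    auto lib = releaseV2();
    bindSlots(lib);
    writeln(lib.vtable[fixedPerimeterSlot](3)); // calls diagonalSq: prints 18
    writeln(lib.vtable[perimeterSlot](3));      // calls perimeter: prints 12
}
```

The extra read of `perimeterSlot` on every call is the run-time cost in question; patching the constant directly into the machine code would amount to turning `perimeterSlot` back into something like `fixedPerimeterSlot` after loading.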
Aug 17 2012
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-17 12:36:56 +0000, "R Grocott" <rgrocottbugzilla gmail.com> said:

 Michael -
 
 As far as I can tell, most of the performance cost in your idea comes 
 from the additional level of indirection, since you store vtable 
 offsets in a global variable which must be read before each function 
 call.
 
 This might just be wishful thinking, but would it be possible for 
 druntime to write new vtable offsets directly into the program's 
 machine code, at dynamic-link time? That would remove all of the 
 run-time performance overhead, but, as I say, I'm not sure whether it's 
 actually possible.
 
 I think the biggest obstacle is that it would require the .text section 
 (or equivalent) of the executable to be writable. I know that's 
 possible on Linux, but I'm not sure whether the same is true for most 
 other operating systems. I guess it might also be a security risk?
It's certainly doable if you put it in the right place. But druntime is not the one in charge of dynamic linking: the dynamic linker from the OS is, so that's the ideal place to do such a thing. If you want to do it in druntime it'll be a huge hassle; if doable at all, it'll probably break easily. But on the OS side you hit a rock because of the different approaches to dynamic linking (Windows does not really have a dynamic linker).
-- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Aug 17 2012
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 17 August 2012 at 13:42:27 UTC, Michel Fortin wrote:
 On 2012-08-17 12:36:56 +0000, "R Grocott" [..] (Windows does 
 not really have a dynamic linker).
Why? What is it lacking? From my experience, I don't have much to complain about it. Just wanting to know, not trolling.
-- Paulo
Aug 17 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-17 14:00:42 +0000, "Paulo Pinto" <pjmlp progtools.org> said:

 On Friday, 17 August 2012 at 13:42:27 UTC, Michel Fortin wrote:
 On 2012-08-17 12:36:56 +0000, "R Grocott" [..] (Windows does not really 
 have a dynamic linker).
 Why? What is it lacking? From my experience, I don't have much to complain about it. Just wanting to know, not trolling.
Quote from <http://xenophilia.org/winvunix.html>:
 In Unix, a shared object (.so) file contains code to be used by the 
 program, and also the names of functions and data that it expects to 
 find in the program. When the file is joined to the program, all 
 references to those functions and data in the file's code are changed 
 to point to the actual locations in the program where the functions and 
 data are placed in memory. This is basically a link operation.
 In Windows, a dynamic-link library (.dll) file has no dangling 
 references. Instead, an access to functions or data goes through a 
 lookup table. So the DLL code does not have to be fixed up at runtime 
 to refer to the program's memory; instead, the code already uses the 
 DLL's lookup table, and the lookup table is modified at runtime to 
 point to the functions and data.
-- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
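The contrast drawn in the quoted passage can be sketched as a toy model in C++. This is only an illustration under simplifying assumptions, not how any real loader works: `ImportTable`, `CallSite`, `patch_all` and `real_square` are hypothetical names invented for the example. The table model corresponds to the DLL description (every call reads a shared slot that the loader fills in once), and the patching model corresponds to the shared-object description (the loader rewrites each reference individually).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Fn = int (*)(int);

// Stand-in for a function exported by a library.
int real_square(int x) { return x * x; }

// Table model: call sites never change; they all read one shared slot
// that the loader fills in at load time.
struct ImportTable {
    Fn slots[1] = {nullptr};
};

int call_via_table(const ImportTable& table, int x) {
    assert(table.slots[0] != nullptr);  // loader must have filled the slot
    return table.slots[0](x);           // one extra load on every call
}

// Patching model: each call site carries its own reference, and the
// loader has to visit and rewrite every one of them.
struct CallSite {
    Fn target = nullptr;
    int call(int x) const {
        assert(target != nullptr);      // site must have been patched
        return target(x);
    }
};

// Returns how many call sites were rewritten.
std::size_t patch_all(std::vector<CallSite>& sites, Fn resolved) {
    std::size_t patched = 0;
    for (CallSite& site : sites) {
        site.target = resolved;
        ++patched;
    }
    return patched;
}
```

In the table model the loader does a constant amount of work per imported symbol, while in the patching model the work grows with the number of references; the price of the table is the indirection on each call.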
Aug 17 2012
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 17 August 2012 at 17:42:24 UTC, Michel Fortin wrote:
 On 2012-08-17 14:00:42 +0000, "Paulo Pinto" 
 <pjmlp progtools.org> said:

 On Friday, 17 August 2012 at 13:42:27 UTC, Michel Fortin wrote:
 On 2012-08-17 12:36:56 +0000, "R Grocott" [..] (Windows does 
 not really have a dynamic linker).
Why? What is it lacking? From my experience, I don't have much to complain about it. Just wanting to know, not trolling.
Quote from <http://xenophilia.org/winvunix.html>:
 In Unix, a shared object (.so) file contains code to be used 
 by the program, and also the names of functions and data that 
 it expects to find in the program. When the file is joined to 
 the program, all references to those functions and data in the 
 file's code are changed to point to the actual locations in 
 the program where the functions and data are placed in memory. 
 This is basically a link operation.
 In Windows, a dynamic-link library (.dll) file has no dangling 
 references. Instead, an access to functions or data goes 
 through a lookup table. So the DLL code does not have to be 
 fixed up at runtime to refer to the program's memory; instead, 
 the code already uses the DLL's lookup table, and the lookup 
 table is modified at runtime to point to the functions and 
 data.
To me, this just looks like two different ways to implement dynamic loading. Having used Windows since the Windows 3.1 days, along with many other operating systems beyond what the typical Linux fan has seen, I always find this view of "Linux == UNIX == right, Windows == wrong" sad. The author also acknowledges that AIX, which is a UNIX, has strange dynamic loading rules according to his understanding of what dynamic loading is. -- Paulo
Aug 17 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-17 20:49:21 +0000, "Paulo Pinto" <pjmlp progtools.org> said:

 On Friday, 17 August 2012 at 17:42:24 UTC, Michel Fortin wrote:
 Quote from <http://xenophilia.org/winvunix.html>:
 
 In Unix, a shared object (.so) file contains code to be used by the 
 program, and also the names of functions and data that it expects to 
 find in the program. When the file is joined to the program, all 
 references to those functions and data in the file's code are changed 
 to point to the actual locations in the program where the functions and 
 data are placed in memory. This is basically a link operation.
 In Windows, a dynamic-link library (.dll) file has no dangling 
 references. Instead, an access to functions or data goes through a 
 lookup table. So the DLL code does not have to be fixed up at runtime 
 to refer to the program's memory; instead, the code already uses the 
 DLL's lookup table, and the lookup table is modified at runtime to 
 point to the functions and data.
To me, this just looks like two different ways to implement dynamic loading. Having used Windows since the Windows 3.1 days, along with many other operating systems beyond what the typical Linux fan has seen, I always find this view of "Linux == UNIX == right, Windows == wrong" sad.
I never intended to mean Windows == wrong. What I am saying is that Windows does not have a dynamic linker capable of changing pointers to symbols in the code it loads (which is what a linker does). Code compiled for Windows instead needs to refer to a separate lookup table when accessing symbols from other DLLs. Neither approach is wrong; each has different tradeoffs. Implementing a non-fragile ABI would be possible using a lookup table of runtime-calculated values for field and vtable offsets, and that could be implemented both through a dynamic linker and with DLLs, but accessing the lookup table adds some runtime overhead each time you need to access such a symbol in the code, because of the indirection. In the original linked article (which I wrote), what was proposed was to have the dynamic linker calculate offsets for fields and vtable entries and insert those offsets directly in the code (just like a linker does when it resolves symbols). But for that you'd need a custom linker (both static and dynamic), and probably a custom shared library format. So it's a huge task, especially when you consider that it should run on multiple platforms. But this same approach could make the C++ ABI non-fragile too. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
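A minimal C++ sketch of the lookup-table variant described above, using a hand-rolled vtable rather than the compiler's own. Everything here is hypothetical and made up for illustration (`Object`, `vtable_v2`, `g_slot_get_value`, `resolve_slots_for_v2`); it assumes a library whose second version inserted a new method ahead of `get_value`, so that method's slot moved from 0 to 1.

```cpp
#include <cassert>
#include <cstddef>

// Hand-rolled object model: a method takes the object as an untyped pointer.
using Method = int (*)(void* self);

struct Object {
    const Method* vtable;  // pointer to an array of method slots
    int value;
};

int get_value(void* self)  { return static_cast<Object*>(self)->value; }
int get_double(void* self) { return 2 * static_cast<Object*>(self)->value; }

// Version 2 of the library put get_double in slot 0, pushing get_value to slot 1.
const Method vtable_v2[] = { get_double, get_value };

// Slot index for get_value, filled in at load time instead of being
// baked into the caller's machine code.
std::size_t g_slot_get_value = 0;

// Stand-in for the load-time step that computes the real slot index.
void resolve_slots_for_v2() { g_slot_get_value = 1; }

// Fragile caller: compiled against version 1, where get_value was slot 0.
int call_fragile(Object* obj) {
    assert(obj != nullptr);
    return obj->vtable[0](obj);
}

// Flexible caller: reads the slot index from the global on every call,
// which is the extra indirection discussed above.
int call_flexible(Object* obj) {
    assert(obj != nullptr);
    return obj->vtable[g_slot_get_value](obj);
}
```

With a version 2 object, `call_fragile` silently dispatches to the wrong method, while `call_flexible` still reaches `get_value` at the cost of one additional memory read. The patching proposal from the article would remove that read by writing the resolved index into the call site itself.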
Aug 17 2012
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 8/18/12, Michel Fortin <michel.fortin michelf.ca> wrote:
 But for that you'd need a custom linker
 (both static and dynamic), and probably a custom shared library format.
 So it's a huge task, especially when you consider that it should run on
 multiple platforms.
Isn't this what the DDL project was all about? I don't know what state it was left in, though.
Aug 17 2012
prev sibling next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Saturday, 18 August 2012 at 01:06:10 UTC, Michel Fortin wrote:
 On 2012-08-17 20:49:21 +0000, "Paulo Pinto" 
 <pjmlp progtools.org> said:

 On Friday, 17 August 2012 at 17:42:24 UTC, Michel Fortin wrote:
 Quote from <http://xenophilia.org/winvunix.html>:
 
 In Unix, a shared object (.so) file contains code to be used 
 by the program, and also the names of functions and data 
 that it expects to find in the program. When the file is 
 joined to the program, all references to those functions and 
 data in the file's code are changed to point to the actual 
 locations in the program where the functions and data are 
 placed in memory. This is basically a link operation.
 In Windows, a dynamic-link library (.dll) file has no 
 dangling references. Instead, an access to functions or data 
 goes through a lookup table. So the DLL code does not have 
 to be fixed up at runtime to refer to the program's memory; 
 instead, the code already uses the DLL's lookup table, and 
 the lookup table is modified at runtime to point to the 
 functions and data.
To me, this just looks like two different ways to implement dynamic loading. Having used Windows since the Windows 3.1 days, along with many other operating systems beyond what the typical Linux fan has seen, I always find this view of "Linux == UNIX == right, Windows == wrong" sad.
I never intended to mean Windows == wrong. What I am saying is that Windows does not have a dynamic linker capable of changing pointers to symbols in the code it loads (which is what a linker does). Code compiled for Windows instead needs to refer to a separate lookup table when accessing symbols from other DLLs. Neither approach is wrong; each has different tradeoffs. Implementing a non-fragile ABI would be possible using a lookup table of runtime-calculated values for field and vtable offsets, and that could be implemented both through a dynamic linker and with DLLs, but accessing the lookup table adds some runtime overhead each time you need to access such a symbol in the code, because of the indirection. In the original linked article (which I wrote), what was proposed was to have the dynamic linker calculate offsets for fields and vtable entries and insert those offsets directly in the code (just like a linker does when it resolves symbols). But for that you'd need a custom linker (both static and dynamic), and probably a custom shared library format. So it's a huge task, especially when you consider that it should run on multiple platforms. But this same approach could make the C++ ABI non-fragile too.
Wouldn't DLL redirection with Toolhelp32 help here? But you're right, the main problem is portability across operating systems, especially since most tend to have a C-like linker instead of a more intelligent one. This can only be improved with higher-level linkers, which is a bit like what happens with solutions where the binaries contain bytecode that is only compiled at load time, like .NET and the JVM in their default offerings. Or even the older Lilith system, or Native Oberon with its bytecode modules. So, a solution would be either something like what you describe, or having everyone use a D-based OS. :) -- Paulo
Aug 17 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-18 05:29:18 +0000, "Paulo Pinto" <pjmlp progtools.org> said:

 Wouldn't DLL redirection with Toolhelp32 help here?
No idea what toolhelp32 is. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Aug 18 2012
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Saturday, 18 August 2012 at 12:45:19 UTC, Michel Fortin wrote:
 On 2012-08-18 05:29:18 +0000, "Paulo Pinto" 
 <pjmlp progtools.org> said:

 Wouldn't DLL redirection with Toolhelp32 help here?
No idea what toolhelp32 is.
It is a debugging library introduced in the Win16 days; the 32 suffix denotes the Win32 version. I remember reading articles in DDJ about using it to modify DLL function entries. -- Paulo
Aug 18 2012
prev sibling parent reply "Jacob Carlborg" <doob me.com> writes:
On Saturday, 18 August 2012 at 01:06:10 UTC, Michel Fortin wrote:

 In the original linked article (which I wrote) what was 
 proposed was to have the dynamic linker calculate offsets for 
 fields and vtable entries and insert those offsets directly in 
 the code (just like a linker does when it resolves symbols). 
 But for that you'd need a custom linker (both static and 
 dynamic), and probably a custom shared library format. So it's 
 a huge task, especially when you consider that it should run on 
 multiple platforms. But this same approach could make the C++ 
 ABI non-fragile too.
I'm having a hard time seeing why a regular application, i.e. druntime, couldn't do this. I'm mostly familiar with Mac OS X, and it seems pretty easy to just access the running executable and change what you want in it. That's what the dynamic linker is doing anyway. There's even a flag for object files indicating it's a dynamic linker (I don't know if that is used any more). Sure, it would probably break easily if the runtime of the OS changed (a new version of the dynamic linker, something changing the object format). -- /Jacob Carlborg
Aug 18 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-18 10:20:17 +0000, "Jacob Carlborg" <doob me.com> said:

 On Saturday, 18 August 2012 at 01:06:10 UTC, Michel Fortin wrote:
 
 In the original linked article (which I wrote) what was proposed was to 
 have the dynamic linker calculate offsets for fields and vtable entries 
 and insert those offsets directly in the code (just like a linker does 
 when it resolves symbols). But for that you'd need a custom linker 
 (both static and dynamic), and probably a custom shared library format. 
 So it's a huge task, especially when you consider that it should run on 
 multiple platforms. But this same approach could make the C++ ABI 
 non-fragile too.
I'm having a hard time seeing why a regular application, i.e. druntime, couldn't do this. I'm mostly familiar with Mac OS X, and it seems pretty easy to just access the running executable and change what you want in it. That's what the dynamic linker is doing anyway. There's even a flag for object files indicating it's a dynamic linker (I don't know if that is used any more). Sure, it would probably break easily if the runtime of the OS changed (a new version of the dynamic linker, something changing the object format).
Using a lookup table it could be done. But if you're going to patch the code as a dynamic linker does but after the dynamic linking stage, then you'll have to play around with no-execute flags as well as address layout randomization, and this is going to be ugly. Speaking of OS X, if your app is sandboxed I think it won't be able to do anything like that. Given that sandboxing is the beginning of a trend on many platforms, I'm not sure implementing all that would be worthwhile: all it'd accomplish is make processes that can't be sandboxed run a little faster. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Aug 18 2012
next sibling parent reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
Hey Michel, posted this question earlier in the thread, but I 
think you might have missed it:

 The blog post I linked only talks about rewriting vtables at 
 dynamic-link-time. Would it be possible to implement something 
 similar for the sizes of structs and scope-classes, and the 
 relative address of member variables? dsimcha's post has led me 
 to realize that without these additional features, the ABI 
 would still be quite fragile.
Long story short: Could your proposal realistically be extended to support flexible sizeofs and member-variable offsets? I figure that such a system might as well stabilize the entire class ABI, rather than just vtables. I would guess that flexible sizeofs would increase the runtime cost only very slightly, but flexible member-variables might be more of an issue. The cost could be minimized by looking up offsets only when crossing a library boundary (say, when calling a template method on an extern (D_FlexibleABI) class, or accessing a protected member variable of same), but would that actually be possible?
Aug 18 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-18 14:50:54 +0000, "R Grocott" <rgrocottbugzilla gmail.com> said:

 Hey Michel, posted this question earlier in the thread, but I think you 
 might have missed it:
 
 The blog post I linked only talks about rewriting vtables at 
 dynamic-link-time. Would it be possible to implement something similar 
 for the sizes of structs and scope-classes, and the relative address of 
 member variables? dsimcha's post has led me to realize that without 
 these additional features, the ABI would still be quite fragile.
Long story short: Could your proposal realistically be extended to support flexible sizeofs and member-variable offsets? I figure that such a system might as well stabilize the entire class ABI, rather than just vtables. I would guess that flexible sizeofs would increase the runtime cost only very slightly, but flexible member-variables might be more of an issue. The cost could be minimized by looking up offsets only when crossing a library boundary (say, when calling a template method on an extern (D_FlexibleABI) class, or accessing a protected member variable of same), but would that actually be possible?
Apple's Modern Objective-C runtime has a non-fragile ABI that extends to member variables, and it works very well. It could be implemented pretty much the same way in D. There would be a very small runtime overhead for accessing member variables though: each time you access a member you first need to read a runtime-initialized global variable giving you its offset. As for structs, you could do it the same way, but I don't think you'd get enough benefit to compensate for the drawbacks in performance and elsewhere. To truly have non-fragile structs you'd need to disable almost all compile-time introspection too. That would be very disruptive. So I don't think it makes sense to have non-fragile structs. By the way, did you take a look at the benchmarks? <http://michelf.ca/blog/2009/vtable-benchmarking/> -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
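The member-variable scheme described above (read a runtime-initialized global to get the offset, then access the field) can be sketched in C++ as follows. This is an assumption-laden illustration, not the Objective-C runtime's actual mechanism: `g_size_of_Widget`, `g_offset_Widget_count` and `load_widget_layout_v2` are hypothetical names, and the example pretends that a newer library version placed 16 bytes of new fields ahead of an `int` member called `count`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Layout facts published by the library at load time rather than
// compiled into the client.
std::size_t g_size_of_Widget = 0;
std::size_t g_offset_Widget_count = 0;

// Stand-in for the load-time step: in this imagined version 2, new
// fields occupy the first 16 bytes, so `count` lives at offset 16.
void load_widget_layout_v2() {
    g_offset_Widget_count = 16;
    g_size_of_Widget = 16 + sizeof(int);
}

// The client allocates using the runtime size, never a compile-time sizeof.
std::vector<unsigned char> make_widget() {
    return std::vector<unsigned char>(g_size_of_Widget, 0);
}

// Each access first reads the global offset; that read is the small
// per-access overhead mentioned above.
void set_count(std::vector<unsigned char>& widget, int value) {
    assert(g_offset_Widget_count + sizeof(int) <= widget.size());
    std::memcpy(widget.data() + g_offset_Widget_count, &value, sizeof value);
}

int get_count(const std::vector<unsigned char>& widget) {
    assert(g_offset_Widget_count + sizeof(int) <= widget.size());
    int value = 0;
    std::memcpy(&value, widget.data() + g_offset_Widget_count, sizeof value);
    return value;
}
```

Because the size is only known at run time, the object cannot simply be a fixed-size local variable in client code, which hints at why value types such as structs are harder to make non-fragile than heap-allocated class instances.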
Aug 18 2012
parent "R Grocott" <rgrocottbugzilla gmail.com> writes:
On Saturday, 18 August 2012 at 17:55:22 UTC, Michel Fortin wrote:
 As for structs, you could do it the same way, but I don't think 
 you'd get enough benefit to compensate the drawback in 
 performance and elsewhere. To truly have non-fragile structs 
 you'd need to disable almost all compile-time introspection 
 too. That would be very disruptive.

 So I don't think it makes sense to have non-fragile structs.
Can't D allocate classes on the stack, in some circumstances? Would this lead to the same problems that you just described, or is the problem caused by how structs and classes are used, rather than how they're implemented?
 By the way, did you take a look at the benchmarks?
 <http://michelf.ca/blog/2009/vtable-benchmarking/>
Those are some pretty interesting numbers - seems as though the overhead is only about 10% - 20% over that of a regular vtable dispatch? I would have guessed that it would almost double the overhead, but I suppose that most of the overhead of a function call comes from saving registers, managing the stack, etc. The overhead is so small, in fact, that I'd feel quite comfortable using it in a very performance-sensitive situation (like, say, a rasterizer library), presuming that your benchmarks were accurate. Another strong argument in favour of trying to have this system implemented :)
Aug 18 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-08-18 14:52, Michel Fortin wrote:

 Using a lookup table it could be done.

 But if you're going to patch the code as a dynamic linker does but after
 the dynamic linking stage, then you'll have to play around with
 no-execute flags as well as address layout randomization, and this is
 going to be ugly.
These are usually not pretty things :)
 Speaking of OS X, if your app is sandboxed I think it won't be able to
 do anything like that. Given that sandboxing is the beginning of a trend
 on many platforms, I'm not sure implementing all that would be
 worthwhile: all it'd accomplish is make processes that can't be
 sandboxed run a little faster.
Yeah, that could be a problem. -- /Jacob Carlborg
Aug 18 2012
prev sibling parent reply "R Grocott" <rgrocottbugzilla gmail.com> writes:
On Friday, 17 August 2012 at 13:42:27 UTC, Michel Fortin wrote:
 It's certainly doable if you put it at the right place. But 
 druntime is not the one in charge of dynamic linking: the 
 dynamic linker from the OS is, so that's the ideal place to do 
 such a thing. If you want to do it in druntime it'll be a huge 
 hassle, if doable at all it'll probably break easily. But on 
 the OS side you hit a rock because of different approach to 
 dynamic linking (Windows does not really have a dynamic linker).
So basically, any system to create a more sensible ABI needs to be built on top of the C ABI, name mangling, editable vtables, and nothing else? Ouch. If druntime can't detect whether symbols were dynamically or statically linked, does that mean that your system would add an overhead to *all* virtual function calls, not just those which call into a dynamic library? If so, that'd be a pretty strong argument against implementing anything like it, unless it's something which has to be explicitly switched on (using, say, an "extern(D_FlexibleABI)" statement).
Aug 17 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-08-17 14:13:10 +0000, "R Grocott" <rgrocottbugzilla gmail.com> said:

 On Friday, 17 August 2012 at 13:42:27 UTC, Michel Fortin wrote:
 It's certainly doable if you put it at the right place. But druntime is 
 not the one in charge of dynamic linking: the dynamic linker from the 
 OS is, so that's the ideal place to do such a thing. If you want to do 
 it in druntime it'll be a huge hassle, if doable at all it'll probably 
 break easily. But on the OS side you hit a rock because of different 
 approach to dynamic linking (Windows does not really have a dynamic 
 linker).
So basically, any system to create a more sensible ABI needs to be built on top of the C ABI, name mangling, editable vtables, and nothing else? Ouch.
That's pretty much it yes.
 If druntime can't detect whether symbols were dynamically or statically 
 linked, does that mean that your system would add an overhead to *all* 
 virtual function calls, not just those which call into a dynamic 
 library?
Yes. Neither druntime nor the compiler knows whether you're creating an executable. The typical compilation process is to convert D code to an object file containing machine code. Then that object file is linked either inside an executable or inside a shared library. But the machine code is written before the compiler knows where it'll go.
 If so, that'd be a pretty strong argument against implementing anything 
 like it, unless it's something which has to be explicitly switched on 
 (using, say, an "extern(D_FlexibleABI)" statement).
Exactly my thoughts. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Aug 17 2012
parent "Jacob Carlborg" <doob me.com> writes:
On Friday, 17 August 2012 at 17:42:18 UTC, Michel Fortin wrote:

 Yes. Neither druntime nor the compiler knows whether you're 
 creating an executable. The typical compilation process is to 
 convert D code to an object file containing machine code. Then 
 that object file is linked either inside an executable or 
 inside a shared library. But the machine code is written before 
 the compiler knows where it'll go.
In most cases that is true. But in some cases the compiler does know what will be built. DMD has the -shared and -lib flags, which indicate that a shared library or a static library should be built. -- /Jacob Carlborg
Aug 18 2012
prev sibling parent "Jacob Carlborg" <doob me.com> writes:
On Friday, 17 August 2012 at 12:36:57 UTC, R Grocott wrote:
 Paulo -

 Surely there are ways to work around the OS's native ABI, 
 though, even from within compiled code?

 Rewriting object vtables (etc.) might be a bit more involved 
 than simple name mangling, but I'm almost certain it would be 
 possible to implement it as part of druntime. The blog post 
 which I linked describes one way to go about it.

 Michael -

 As far as I can tell, most of the performance cost in your idea 
 comes from the additional level of indirection, since you store 
 vtable offsets in a global variable which must be read before 
 each function call.

 This might just be wishful thinking, but would it be possible 
 for druntime to write new vtable offsets directly into the 
 program's machine code, at dynamic-link time? That would remove 
 all of the run-time performance overhead, but, as I say, I'm 
 not sure whether it's actually possible.

 I think the biggest obstacle is that it would require the .text 
 section (or equivalent) of the executable to be writable. I 
 know that's possible on Linux, but I'm not sure whether the 
 same is true for most other operating systems. I guess it might 
 also be a security risk?
It's possible to modify the vtable at runtime. Just access it through .__vtable in ClassInfo (I think) and treat it like a regular array. I don't know if that's what you mean. -- /Jacob Carlborg
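The idea of treating a vtable as a regular, writable array can be sketched in C++ with a hand-rolled model. This does not use D's actual ClassInfo or any real runtime API: `ClassInfoModel`, `Instance`, `swap_slot` and the two implementations are hypothetical names chosen for the example, and the slot array is an ordinary `std::vector` standing in for whatever the runtime exposes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Method = int (*)(int);

int original_impl(int x)    { return x + 1; }
int replacement_impl(int x) { return x * 10; }

// Stand-in for per-class runtime info holding a writable slot array.
struct ClassInfoModel {
    std::vector<Method> vtbl;
};

// Instances share their class's table, so an edit affects all of them.
struct Instance {
    ClassInfoModel* info;
    int call(std::size_t slot, int arg) const {
        assert(slot < info->vtbl.size());
        return info->vtbl[slot](arg);
    }
};

// Overwrites one slot and returns the previous entry so it can be restored.
Method swap_slot(ClassInfoModel& info, std::size_t slot, Method replacement) {
    assert(slot < info.vtbl.size());
    Method previous = info.vtbl[slot];
    info.vtbl[slot] = replacement;
    return previous;
}
```

Editing entries like this changes which function a slot points to, but it does not by itself change slot numbering, so callers compiled against fixed indices would still depend on the table's layout.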
Aug 17 2012