digitalmars.D - Static Const Object Polymorphism
- klodkevin (536/536) Mar 23 2015 Hi folks.
- Rikki Cattermole (25/25) Mar 23 2015 To be put this bluntly this is not idiomatic D and for good reason.
- Jacob Carlborg (9/10) Mar 24 2015 I didn't read the whole post but it looks like you've come up with
Hi folks. I just recently started to discovered the D language, I watched some videos about patterns, but I havent really something that comes close to the pattern I'm going to present. I hope the format of this post is correct.If so, I'm sorry to have wasted your time. The idea behind this pattern is quite simple, I call it compile time object orientation. In essence we are replicating some parts of the object oriented model, just at compile time. The result will be extremly flexible and easily configurable code, which to my knowledge can't be produced effectively otherwise, D?) The code that is used can be found here: https://github.com/klodkevin/static_polymorphism_D Let's first start with the problem we are trying to solve. We are going to build a really generic Queue with a lot of compile time options, that is: - The type of the container item. - A pop behavior that defines results for calling pop on an empty queue. (Exception, return null, etc.) - size parameter, with if set to true we get additional function returning us the size. - threaded behavior, setting this flag to true will give us thread safety for each operation - debugFlag, a flag that prints some debug info. Parts of the queue is implemented in the source in queue.d. Now the idea of this pattern is to simply create a compile time constant object called QueueOptions, that groups all these options together. So instead of instantiating the Queue with all five parameters, we simply have one container. Here is some code: enum EmptyBehavior { nullvalue, exception, ignored } // queue option definition class QueueOptions { mixin polymorphType!("type", string); mixin polymorphField!(bool, "threaded", false); mixin polymorphField!(bool, "size", true); mixin polymorphField!(EmptyBehavior, "behavior", EmptyBehavior.nullvalue); mixin polymorphField!(bool, "debugFlag", false); } // implementation for a queue. class Queue(T) { private: // poly mixin polymorphCreate!("Q", T); static if (Q.size) size_t size; // mutex for storage static if (Q.threaded) Mutex m; ... // further implementation. } // our class class MyClass { this(bool t) { value = t; } bool value; void print() { writeln("Have: " ~ std.conv.to!string(value)); } } // overrides some of our default options class Options : QueueOptions { mixin polymorphType!("type", MyClass); mixin polymorphField!(behavior, EmptyBehavior.exception); } // some test unittest { auto queue = new Queue!Options(); queue.push(new MyClass(true)); queue.push(new MyClass(false)); queue.push(new MyClass(false)); // remove and print queue.pop().print(); queue.pop().print(); queue.pop().print(); // throws exception if uncommented. //queue.pop(); } What we do here is we apply some "compiler magic" to "override" static const fields. Here the QueueOptions class defines our five fields with default values, and the Options class derives from it, overriding the EmptyBehavior and the stored class. We now can use the class pretty easily and uncommenting the last line yields to a thrown exception, as specified. To summarize this example, even if we had like 20 compile time parameters, we would still be able to get some nice syntax and a really flexible way of using our queue, depending on the need of the user. The next example covers a sort algorithm. The idea is the same as before. We make our sort algorithm really flexible, if we allow for some additional compile time object that contains the info on how to implement the sort. Here we define: enum SortFunc { normal, quick, bubble, random } class SortOptions { mixin polymorphField!(string, "pred", "a < b"); mixin polymorphField!(bool, "debugFlag", false); } class NormalSortOptions : SortOptions { mixin polymorphField!(SortFunc, "method", SortFunc.normal); mixin polymorphField!(SwapStrategy, "strategy", SwapStrategy.unstable); } class RandomSortOptions : SortOptions { mixin polymorphField!(SortFunc, "method", SortFunc.random); mixin polymorphField!(int, "rng_param", 2); } unittest { // specify our sort function. class Options2: RandomSortOptions { mixin polymorphField!(rng_param, 3); mixin polymorphField!(debugFlag, true); } void rsort(Range)(Range r) { csort!(Options2)(r); } // do some testing. int[] a = [2, 3, 4, 1]; rsort(a); assert(a == [1, 2, 3, 4]); } csort is our custom sort method. (sort.d) As we see we override the default sort method to use our random sort (ok the implementation is just a stub in sort.d for demonstration purposes). Again we can add a tremendous amount of sort options via such compile time objects with this method, making it easy for the user to choose what strategy to use and not bloating the interface of the sort method. For the next example we get really cocky. We want a sorted array, where we can insert elements of any type that we can compare and remain sorted. Of course we add some Sorted array compile time static object: class SortedArrayOptions { mixin polymorphType!("type", string); mixin polymorphField!(bool, "threaded", false); mixin polymorphField!(bool, "debugFlag", false); mixin polymorphType!("sortType", NormalSortOptions); } Now we have the field sortType, which contains our method information used for sorting. So in total, we have a lot of compile time choices to make: - All sorting options, that is 4 parameters for sortType. We also note here that if one sort option would have 8 parameters and the other just 6, we would not need to change any code here. - 3 parameters for our sorted array. So in total we end up with 7 template parameters we want to use. Our SortedArray implementation must be complicated. Wrong! class SortedArray(T) { mixin polymorphCreate!("Q", T); typeof(Q.type)[] data; this() { data = new typeof(Q.type)[0]; } void insert(typeof(Q.type)[] objs) { static if (Q.debugFlag) writeln("inserting."); data ~= objs; csort!(typeof(Q.sortType), typeof(Q.type)[])(data); } auto get() { return data; } } Although we didn't implement threaded and other methods at all, the concepts of sort is completly independent of the container options. This means in fact that we just can forward our template parameters to our sort function immediatly, without even checking them. Changing the sort method to accept more options now leads to next to no changes to our sortedArray class. Also a variable number of template arguments for the sort method leads to no changes. Here is some unittest that demonstrates the usage: unittest { // define our customized sorted array class Options : SortedArrayOptions { class CustomSort : RandomSortOptions { mixin polymorphField!(rng_param, -9); mixin polymorphField!(pred, "a > b"); mixin polymorphField!(debugFlag, true); } // sort option mixin polymorphType!("sortType", CustomSort); mixin polymorphField!(debugFlag, true); mixin polymorphType!("type", int); } alias Sarray = SortedArray!(Options); // do some testing int[] unsorted1 = [2, 3, 4, 1]; int[] unsorted2 = [-1, 0, 5, 6]; auto d = new Sarray(); d.insert(unsorted1); d.insert(unsorted2); assert(d.get() == [6, 5, 4, 3, 2, 1, 0, -1]); // Use our already defined behavior Sarray, just change the type to float class Options2 : Options { mixin polymorphType!("type", float); } alias SarrayFloat = SortedArray!(Options2); // do some more testing float[] unsorted1f = [2f, 3f, 4f, 1f]; float[] unsorted2f = [-1f, 0f, 5f, 6f]; auto df = new SarrayFloat(); df.insert(unsorted1f); df.insert(unsorted2f); assert(df.get() == [6f, 5f, 4f, 3f, 2f, 1f, 0f, -1f]); // does not compile, as we have the type float //df.insert(unsorted1); } As you can see the usage is quite simple as well flexible. Of course the syntax might be better, but im not a magician. From a more abstract view the following is going on using compile time object orientation: We want to combine normal objects using aggregation to build new objects. Unfortunately aggregate templated objects for which we can change the behavior at compile time leads to a bloat of the template signature. Think of it. Lets assume we want to create a general purpose configuration parser for JSON. This file parser has probabily some of the following methods or behavior which we want to choose at compile time: - an allocator to be specified for the data we store? - the memory model to be specified? - The file operations (buffered, unbuffered, stream options, file format, local buffer), ability to write? - Thread safety, async operations, callbacks? - Error handling (checking return null or throw exception, error handling for async operations?) - Parsing engine to be used? As you can see that would lead to huge bloat of template signature, which would essentially be unable to handle. But I can imagine that some parts of it the implementation is somehow orthogonal, that is the file operations, the memory model, the allocator, etc.. This means we would just forward our compile time static object to the allocator, the file object implementing the file operations etc. On the other hand this parser would be really flexible and by sticking with default configurations easy to use. (Maybe just override some 2 or 3 options, or just overriding memory model) If we want more specific usage we can maybe use a better fitting file implemention specific to our needs (e.g. better performance). Also we don't have to repeat the usage pattern every time, just declare one class and use one alias. So having a really general model and beeing efficient is not neccesarily an opposite if the class is well designed. For methods we can basically do the same. Lets assume we don't have compile time static objects and want to specify our really general method. Well, there are basically two options on how to handle this: - Adding an object controlling what will be executed as method parameter: This is ugly. Think of it. Really. Why adding a run time parameter to your method if you want to specify the function at compile time. Think of the implications if you want your class to be generic. Think of the performance. Add a const field to the class that gets initialized by the constructor? Adding a template parameter to your class and then instantiate an object of this class to pass it to the method? Doesn't sound clever to me. - Overloading: Make a number of x methods on e.g. sort, bubbleSort, quickSort, etc. with different parameter inputs. This works good for few sort options. We might even add templates to it to specify the overloading but then we have the signature bloat problem for more advanced options or classes that use this method. On the otherhand deciding what sort function we actually want to use based the type of the function arguments might not be what we want. Looking at a sort!(T)(args) function we make the following observation (Ob): What we want to do: sort So basically the name is acts like an interface, it tells us WHAT we want to do. Template arguments: T The template arguments specify HOW we implement the operation. Using a compile time constant object we can effectively "hide" this fact easily, even if the operation is actually really complicated and can be customized a lot. This means for example library code can just forward the user specified implementation (i.e. of the mempory model) to the underlying handler, without givin much care. (and no bloat of signature) Function arguments: args We know what we do. We know how we do it. Use the args to actually do it. To start the Discussion: I think support for a feature like this in the core language would be a great benefit. Im going to talk a bit about why. 1. Object Orientation In essence we are just replicating whats already there for the runtime. That is object orientation. Just for compile time. We can group template arguments together, we add inheritence, using virtual compile time constansts (must be set by the deriving classes) which basically gives us flexibility, ease to use and more expressive power. So by utilizing a good compile time static object model we would be able to add a common tool of solving programming problems (that is OO) to D at compile time. The exact implementation may vary, but it seems to be easier to use (at least for me), especially if your background is an OO language. 2. Performance and flexibility Since D tries to be a flexible and performant language, this tool is basically perfectly suited for it. We can write general purpose default methods and really complicated low level operations for certain tasks. Changing methods for internal computation of aggregated types would then just mean overriding a variable to specify the other method. Also adding features and new methods is easy, just specify the usage in the compile time object later, or if needed. 3. Better API design Look at std::fstream of C++. At the constructor you can set wether you want to open it as input, as output, as both, as binary or append or.... Looking at our observation (Ob) this is bad style. The actual used implementation should be specified by a compile time argument, if we don't want to change it at runtime. So in none working pseudo code: // maybe in client code or in library code as common usage pattern meta Options : FStreamOptions { mode = read, type = binary } alias Input = fstream!Options; // here Input in(file.a); in.read(); // somewhere else Input in(file.a); in.read(); // somewhere else from douchebag Input in(file.a); in.write(); Now the write shouldn't be available (I know that there is ofstream), but the same can be said about the exception flag. (Have you ever used a file in exception mode and then switched back or vice versa? Really? Why is there a failbit flag in exception mode? Do you really need that?) Of course you could add C++ templates to it, but obviously they are an abomination. (fstream with like six template parameters for general usage? Goodbye.) Another example would be Java: UnsupportedOperationException. Due to how OO is designed there, one may need to throw an unsupported operation exception during runtime due to a call to an operation that is not allowed. (If its not allowed in all cases, then why is the method even there?) Additionally for library writers, the compile time static object may contain logic to verify if the input is correct or not. Now the user just has to look at the implementation of the compile time static object to check what is allowed, instead of the usual class x(T) if(someHugeBulkOfArgumentsSinceWeHaveLike20TemplateParameters). Another example e.g. in Java is: ArrayBlockingQueue, ArrayDeque, ConcurrentLinkedQueue, DelayQueue, LinkedBlockingQueue, LinkedList, SynchronousQueue, etc. Are some queue implementations. Using some compile time static objects we could just use "Queue!Options" and in options specify our exact implementation, i.e. if the implementation is thread safe, if implemented as deque, change some default deque parameters, set someblocking paramter etc. What do you do when you want to use a SynchronousArrayQueue? Or a ConcurrentArrayQueue in Java? I hope you get the point. The Java way is to not use conditional compilation and create for each possible configuration your own class or use some fancy inheritence. It works well most of the time, but sometimes its just screwed up. Because essentially what Java does is doing Queue overload the hard way. It is not clear what implementations are possible, each is not modifiable at compile time etc. With compile time static objects the possible usage pattern would just be in the Options class, which can easily be looked at. Also we don't have an explosion of possible classes or bloat using generics. 4. Expressiveness Also the above example clearly shows the expressiveness of this method. We can group together usage patterns from code to compile time in a compile time static object (here we call the class meta). We could also add a file variable to FStreamOptions, which is then opened for reading, if the file is known at compile time. (the constructor with the file name may be removed then?) That means we remove the stuff that is known at compile time and factor it out so that we see the actual control flow of our program more clearly. Also specifying methods and objects implementations to use is just setting equality to some variables, no fancy overriding or setting of function pointers or adding runtime variables to functions to choose the actual implementation if you know in advance. 5. Easy to add Well I have no idea of the compiler, but I think that the feature shouldn't be that difficult to add, since we just have to choose the last overriding value, like choosing the last overriding function for normal classes. (thats essentially what I do in my buggy implementation) One might also think of adding multiple inheritence, since we are purely value based and hence avoid some difficulty of multiple inheritance. (well not the diamond problem.) There are probably more advantages to this, but let's talk about some disadvantages. Of course, like any language feature they can be abused. 1. Static if hell Putting to much features in one class can lead to difficult code. To much static ifs on like 20 parameters, especially if they are not orthogonal and all interact with each other are a real pain. The possible numbers of classes are 2 to the power of 20, if we just assume bool variables. If they all interact with each other, it gets really messy. So some kind of low coupling of the variables is required. If not, it gets really messy. Using compile time static objects makes it easier to do such stuff. 2. Setting variables to be compile time when they really should be run time. In one of my examples I have written rng_param, the parameter for the random sort. Now this parameter is compile time. If I needed to change that behavior at runtime, I can basically throw away all code that uses this random number generator. On the other hand one may add a flag to indicate wether the rng_param could be changeable at runtime, but to implement this behavior we have to do extra work. 3. Complexity with inheritence Using the presented pattern it is relatively easy to build large structures that are determined completly at compile time. interfaces to write some kind of polymorph code for a large codebasis or hide the implementation. Using compile time static objects we don't need that anymore, at least in some cases. If we use the strategy pattern using templates instead of inheritance, we may end up going 10 layers of static compile time inheritance down in our static compile time object to the one point where it is used. And we can modify the strategy that is used by simply overriding a value. One may start of thinking to use some kind of database for configuration at some point, but ok. Generally speaking, it can get really complex because the implementation is not really "hidden", it is there implicitly all the time. Maybe thats what we want, maybe not. Maybe it's not a problem, since we don't inherit that deep and we use this pattern mainly in library code, but still. On the other hand, there is basically no real world experience (not to my knowledge), so I can't tell. What are your thoughts on this? klodkevin
Mar 23 2015
To be put this bluntly this is not idiomatic D and for good reason. A lot of what you are trying to do, should be handled at runtime. Not at compile time. Ignoring the fact that you posted a wall of text (hence I barely got through half of it), you've come up with a very neat idea of a very configurable queue. Also remember, structs are nearly free, classes are very expensive. OOP advocates don't tell you that. Queues are one thing that can be done via structs. Also protected doesn't do what you think it does. Most of the time it is never used in D at all. I'm going to suggest reading my book[0] regarding Compile Time Function Execution. But it is more for advanced users of D. Unfortunately I didn't find a good queue to point you to. So you could learn more of how it is applied in D. Instead I'll recommend the std.container.array Array container. Ugh, looks like it needs work on the documentation front. Oh well. One thing, template bloat is a big problem. There is a good chance, if you specify something at compile time it will hang around like a barnacle per type initiation. You probably won't notice too big of a compile speed hit on this sort of code. But you are not far from it. I know most of my reply has been pretty negative. But I really do think it is neat. [0] https://leanpub.com/ctfe [1] https://github.com/D-Programming-Language/phobos/blob/master/std/container/array.d
Mar 23 2015
On 2015-03-23 20:17, klodkevin wrote:What are your thoughts on this?I didn't read the whole post but it looks like you've come up with what's called Policy-based design [1]. The only difference is that you combined several template parameters in one single parameter. I have actually started to do the same with the serialization package for Phobos I'm working on. [1] http://en.wikipedia.org/wiki/Policy-based_design -- /Jacob Carlborg
Mar 24 2015