digitalmars.D - RFC: design for an open-source D discrete event simulator, D bugs

"Luís Marques" <luis luismarques.eu> writes:
Dear all,

I wish to solicit your feedback regarding the design for an
open-source D discrete event simulator, and how to work around D
implementation issues that are jeopardising that design.

Context:

The JiST (Java in Simulation Time) simulation engine [1], and the
SWANS network simulator built on top of JiST, follow an
interesting design: instead of explicitly sending messages
between simulation entities, messages are sent through the normal
method call syntax. Such special method calls gain new semantics.
Instead of being normal, synchronous method calls, they instead
schedule the reception of messages for the target entity, at the
appropriate "simulation time". For instance, consider the
following (Java) example from the JiST manual:

       import jist.runtime.JistAPI;

       class hello implements JistAPI.Entity
       {
           public static void main(String[] args)
           {
               System.out.println("simulation start");
               hello h = new hello();
               h.myEvent();
           }

           public void myEvent()
           {
               // this sleep is, misleadingly, asynchronous
               // (it just advances the simulation time -- Luís)
               JistAPI.sleep(1);
               myEvent();
               System.out.println("hello world, t=" +
                   JistAPI.getTime());
           }
       }

Executed with normal Java semantics, this example would crash
with a stack overflow. Executed through JiST, it produces the
following output:

       simulation start
       hello world, t=1
       hello world, t=2
       hello world, t=3
       ...

That's because the sending entity ('hello' is a simulation entity
because it "implements" JistAPI.Entity) does not wait for the
receiving entity to process 'myEvent'. The 'myEvent' message is
just scheduled to be received and handled once the target entity
reaches the same simulation time as the sending entity.

While this design may seem a bit surprising / concerning, the
proof is in the pudding, and its effectiveness can be seen in
SWANS, which implements a fairly complete network simulator while
still being quite clean and comprehensible. Still, there are some
problems with JiST:

      - Despite JiST claiming to use a standard language, a JiST
simulation is not written in Java, but in a slightly different
language, with different semantics. This is not explicit, because
the JiST semantics are introduced through bytecode rewriting of
the desired classes, but the technique is equivalent to having a
JiST->Java preprocessor / compiler. Since JiST is not actively
maintained, that has already caused problems with newer Java
bytecode formats, which are not supported.

       - Java is not D, and the limitations of Java become quite
apparent. For instance, the time returned by JistAPI.getTime() is
of the native type 'long' (floating point becomes problematic in
simulations, as became obvious in the Omnet++ simulator), so
the use of fractional time units becomes cumbersome:

           double y = ...;
           x = (long) (JistAPI.getTime() + (y * Constants.SECOND));

        This is not fully solvable in Java, as the only solution (a
class) implies that a Time type would become a reference type
(not desirable), and would greatly increase overhead (also not
desirable). That's just a trivial example (but core to the
domain); I'm sure I do not need to convince you of the
limitations of Java compared with D.

       - All of this bytecode rewriting eventually becomes
meaningless, because the recommended way to implement
(non-trivial) simulation entities in JiST is with interfaces,
which then requires that time semantics be obtained more
manually, through a proxy object. E.g.:

       import jist.runtime.JistAPI;

       public class proxy
       {
           public static interface myInterface
               extends JistAPI.Proxiable
           {
               void myEvent();
           }

           public static class myEntity implements myInterface
           {
               private myInterface proxy =
                 (myInterface)JistAPI.proxy(this, myInterface.class);

               public myInterface getProxy() { return proxy; }

               public void myEvent()
               {
                   JistAPI.sleep(1);
                   proxy.myEvent();
                   System.out.println("myEvent at t=" +
                       JistAPI.getTime());
               }
           }

           public static void main(String args[])
           {
               myInterface e = (new myEntity()).getProxy();
               e.myEvent();
           }
       }

       - The JiST documentation claims that the Java type system
allows JiST to prevent direct communication between objects of
different entities (outside of simulation time), since JiST
further enforces that simulation entities cannot have public
member variables. That helps with features like automatic
parallelization of simulations. Yet, that guarantee does not
apply to the (most common) scenario with interfaces and proxies,
allowing for bugs to be introduced.

       - Etc.
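
For the time-type point above, here is a minimal sketch of what a
value-type time could look like in D. The name SimTime, the
nanosecond tick and the seconds() helper are assumptions made for
illustration; they are not part of JiST or Dsim.

```d
import std.stdio : writeln;

/// Hypothetical value-type simulation time, stored as integer ticks.
struct SimTime
{
    long ticks; // 1 tick = 1 nanosecond (an assumption for this sketch)

    enum long ticksPerSecond = 1_000_000_000;

    /// Builds a time from (possibly fractional) seconds, rounding once.
    static SimTime seconds(double s)
    {
        import std.math : round;
        return SimTime(cast(long) round(s * ticksPerSecond));
    }

    /// Supports `a + b` and `a - b` on two SimTime values.
    SimTime opBinary(string op)(SimTime rhs) const
        if (op == "+" || op == "-")
    {
        return SimTime(mixin("ticks " ~ op ~ " rhs.ticks"));
    }

    /// Orders times so they can be compared and used as queue keys.
    int opCmp(SimTime rhs) const
    {
        return (ticks > rhs.ticks) - (ticks < rhs.ticks);
    }
}

void main()
{
    auto now = SimTime(5 * SimTime.ticksPerSecond);
    auto later = now + SimTime.seconds(0.25);
    writeln(later.ticks); // 5250000000
    assert(later > now);
}
```

Because SimTime is a struct, it is passed by value with no
allocation, and the conversion from fractional seconds happens in
one place instead of at every call site.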

Since I cannot see a problem without thinking of how I would
better solve it, I started designing a D discrete event
simulator. *Some* of the design objectives were:

       - Support a JiST-like simulation model; if possible as a
library, and thus avoid source/bytecode rewriting.

       - Add more advanced component composition features, as can
be found in Omnet++.

       - Support kick-ass automatic parallelization, by taking
advantage of D's type system. (only immutable objects or copies
can be passed between entities; per-thread variables prevent
accidental sharing)

As I said, I tried quite a few different designs. Here are the
usage basics:

       import dsim;

       class A : Entity!A
       {
           // (public methods can accept any number and type
           // of params, and are executed with sim. time semantics)
           void test()
           {
               sleepAsync(1); // (you can also sleep synchronously)
               test();
               writeln("hello world, t=", time);
           }
       }

       void main()
       {
           A a = A(); // instead of new A();
           a.test();
           sim.endAt(5);
           sim.run();
       }

Output:

       hello world, t=1
       hello world, t=2
       hello world, t=3
       hello world, t=4
       hello world, t=5

As you can see, the model is similar to JiST. (the implementation
is ugly, but the usage interface is nice. see below)
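
To make the semantics concrete, here is a minimal sketch of the
kind of event queue that could sit underneath such calls. It is
not Dsim's implementation: Simulator, Event, schedule() and Hello
are hypothetical names, the scheduling is written out by hand
instead of being generated by a wrapper class, and the
sleepAsync(1) is folded into the delay argument.

```d
import std.stdio : writeln;

/// One pending message: the time it is due and the call to perform.
struct Event
{
    long time;
    void delegate() action;
}

/// Hypothetical minimal scheduler; not Dsim's actual implementation.
struct Simulator
{
    long now;        // current simulation time
    long endTime;    // events due after this time are never run
    Event[] pending; // kept sorted by time, earliest first

    /// Queues `action` to run `delay` time units after `now`.
    void schedule(long delay, void delegate() action)
    {
        auto ev = Event(now + delay, action);
        size_t i = 0;
        // Skip events due at or before the new one (FIFO on ties).
        while (i < pending.length && pending[i].time <= ev.time)
            ++i;
        pending = pending[0 .. i] ~ ev ~ pending[i .. $];
    }

    /// Runs events in time order until none are due by endTime.
    void run()
    {
        while (pending.length > 0 && pending[0].time <= endTime)
        {
            auto ev = pending[0];
            pending = pending[1 .. $];
            now = ev.time; // advance the clock to the event's time
            ev.action();
        }
    }
}

class Hello
{
    private Simulator* sim;
    long[] seen; // times at which test() ran, kept for inspection

    this(Simulator* sim) { this.sim = sim; }

    void test()
    {
        seen ~= sim.now;
        writeln("hello world, t=", sim.now);
        // The "recursive call" is only queued for one time unit
        // later, so the stack never grows.
        sim.schedule(1, &test);
    }
}

void main()
{
    Simulator sim;
    sim.endTime = 5;
    auto a = new Hello(&sim);
    sim.schedule(1, &a.test);
    sim.run(); // prints t=1 through t=5
}
```

The point of the library design is that the user never writes the
schedule() calls; the wrapper would generate them from ordinary
method-call syntax.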

The challenge with the design was the corner cases of the type
system, etc. After having a rough implementation with those
basics working (with simulation time semantics), I started
looking for a design which better integrated all the desired
scenarios with the type system. I finally found a design which I
was reasonably happy with, allowing variants like:

       // example 4, see the code linked (design1)

       interface I1
       {
           void foo();
       }

       static class C : Entity!(C, I1)
       {
           void foo()
           {
               writeln("4: C foo");
           }
       }

       // can use C as a C
       C c = C();
       c.foo();

       // can use C as an Entity!C
       Entity!C ec = C();
       ec.foo();
       ec = c;
       ec.foo();

       // can use C as an untyped Entity
       Entity!Void evoid = C();
       evoid.foo();
       evoid = ec;
       evoid.missingEvent(); // ok for void entity

       // Can use C through its interface
       I1 i1 = C();
       i1.foo();

       // Can use C through an Entity interface
       Entity!I1 ei1 = C();
       ei1.foo();

Due to the experience with previous design attempts for the
simulator, I created a very thorough test suite for this last
design (testing just the usage pattern, not the simulation time
semantics), checking all the valid and invalid possible usages.
That's because after having spent quite a while implementing
other designs I would then discover some edge cases which would
tarnish those designs. But this one seemed to work well so, with
the validation done, today I was going to implement the actual
simulation logic.

Unfortunately, as had happened for other design attempts, I met a
D implementation issue. Specifically, the (ugly) inheritance,
metaprogramming and reflection tricks I was using for the first
design do not work correctly for this last design.

I cannot quickly summarise all the design attempts I've explored,
and all the D issues I've encountered, and how they interact with
all the features I can imagine and desire, so I'll just explain
some basics of the design, issues and bugs and leave it at that
for now:

- public methods have time semantics, public member variables are
not allowed (except if references to other entities, although
they are overridden as properties by Wrapper!T).

- Only one (user accessible) type is used for everything,
Entity!T. Other designs used multiple types (e.g., Gate!T would
be a typed reference to an entity).

- The simulation time semantics are provided to a class C by
having a base class Entity!C and a wrapper class Wrapper!C, which
inherits from C.  Entity!C mimics the interface of C, and
Wrapper!C overrides C's methods to schedule method calls, instead
of executing them synchronously. (actually I also used an
EntityHelper class due to limitations, but let's keep it simple).

- I think the inheritance of C from Entity!C is cool from a user
perspective, compared with having a mixin in C, but I suspect
many of you will disagree with it. It might complicate the
design. I'm not sure I can support all the use cases anyway
without having C derive from some helper class, and certainly it
would be unfriendly for the user to have to add both a base class
and a mixin (has anyone considered adding base classes /
interfaces through a mixin?).

- The statically typed entities (Entity!T) statically type-check
and bind all method calls / message sends. The void entity (
Entity!Void, or Entity!() ) does runtime dispatch (slower).
Messages for untyped entities which are not handled by the entity
class (C, etc.) are received by a generic handler:

           // declared in the Entity!T class
           protected void receive(string name, Variant msg);

       A call to untyped.foo(42, "hello") would be handled as
untyped.receive("foo", {42, "hello"}).

       I haven't quite decided if the default receive() should just
ignore the message or fail, but I'm leaning toward ignoring by
default (simulations are exploratory things). The actual entity
class can override receive() to change the default behavior.

- I'm not sure if, in practice, the runtime composition of
entities (through textual component/scenario configuration files,
like in Omnet++) won't spoil the performance/type-checking of
non-void Entities, and make a lot of things be runtime dispatched
anyway. For instance, if one introduces channel entities between
typed entities (which would delay messages, drop them, etc.).

- A @blocking attribute would allow method calls / message passing
to be blocking (see design0), instead of the default async.

- For now, I've named the simulator Dsim. Suggestions for more
creative names are welcome (simd is misleading).

- The most pressing / blocking issue which is preventing the
newer design from using the implementation code from the first
design is the following issue (and variants):

           template Base(T)
           {
               class Base
               {
                   // this correctly lists "foo" as a member of C
                   pragma(msg, __traits(derivedMembers, T));

                   // but this complains about C not having a "foo"
                   // member...
                   // pragma(msg, __traits(getProtection, T.foo));

                   auto bar()
                   {
                       return T.foo; // yet, it works here!
                   }
               }
           }

           public class C : Base!C
           {
               static void foo()
               {
                   writeln("foo");
               }
           }

       That is important to allow Entity!C to respond to C's method
names, etc.

- In the original design (design0 below), I was able to use
foo(T...)(T msg) in Wrapper!T to override C's foo methods, yet
that was not working today in the newer design (design1), where I
had to use the exact type of foo. I was too tired to debug that
today. I've verified that it does not work in a simpler example,
so I'm not sure why it works in design0. I'll have to look into
that; it's probably something obvious :-)

- In generic code with hierarchies like I use here, "override"
can be a pain. Especially because the behavior of D seems
inconsistent. This works without warning:

       interface I
       {
           void foo();
       }

       abstract class C : I
       {
       }

       class D : C
       {
           void foo() {}
       }

But this doesn't: (needs "override")

           interface I
           {
               void foo();
           }

           abstract class C : I
           {
               abstract void foo();
           }

           class D : C
           {
               void foo()
               {
                   writeln("foo");
               }
           }

This also complains:

           abstract class C
           {
               abstract void foo();
           }

           class D : C
           {
               void foo()
               {
                   writeln("foo");
               }
           }

...even though class C (and foo) are abstract, and thus class D
should IMO not have to use "override" (class C is effectively an
interface). Or at least it should have a behavior consistent with
the first one.
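
Returning to the runtime dispatch for untyped entities described
in the list above: one way to get that behavior in D is
opDispatch. The sketch below is only an illustration under
assumptions. UntypedEntity is a hypothetical stand-in for
Entity!Void, the arguments are packed into a Variant[] rather than
the single Variant of the signature quoted earlier, and the
message is delivered immediately instead of being scheduled.

```d
import std.stdio : writeln;
import std.variant : Variant;

/// Hypothetical stand-in for the untyped entity (Entity!Void).
class UntypedEntity
{
    string[] received; // names of messages seen, for illustration

    /// Generic handler; this default just logs and moves on.
    protected void receive(string name, Variant[] msg)
    {
        received ~= name;
        writeln("received '", name, "' with ", msg.length,
                " argument(s)");
    }

    /// A call to any undeclared method is rewritten by the compiler
    /// to opDispatch!"name"(args) and becomes a runtime message.
    void opDispatch(string name, Args...)(Args args)
    {
        Variant[] packed;
        foreach (arg; args)
            packed ~= Variant(arg);
        receive(name, packed);
    }
}

void main()
{
    auto untyped = new UntypedEntity;
    untyped.foo(42, "hello"); // received 'foo' with 2 argument(s)
    untyped.missingEvent();   // received 'missingEvent' with 0 argument(s)
}
```

Declared members take precedence over opDispatch, so only calls
that the class does not define end up in receive().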

You can find other issues mentioned in the code / referencing
bugzilla Issues:

       https://github.com/luismarques/dsim

        - design1 is the design which I judged fairly acceptable
and validated, but which I discovered today would be problematic
to implement, even though I had already implemented the time
semantics in the old design. Besides the D issues, I would also
appreciate feedback on the design itself.

       - design1-hardcoded-tests is design1 before I started
implementing the time semantics today, so it has the necessary
special-cased code so that the test suite compiles (except for
the tests that are commented out, due to D bugs). It might be
useful for checking design aspects which had to be commented out
in "design1" because they no longer compile.

        - design0 is the original design. The code is messy (it was
not meant to be public), but it might be relevant for
understanding how I was going to implement the time semantics.

        - Other designs were not included for now, because I would
have to dig them up, check if they compile, clean them, explain
them, etc. If necessary I'll add more design experiments.

Destroy! ...wait, no, gently criticise!
Dec 07 2013
"Luís Marques" <luis luismarques.eu> writes:
Missing reference:

[1] http://jist.ece.cornell.edu
Dec 07 2013
"Luís Marques" <luis luismarques.eu> writes:
On Saturday, 7 December 2013 at 22:31:43 UTC, Luís Marques wrote:
> Other designs used multiple types (e.g., Gate!T would
> be a typed reference to an entity).
(not an actual reference type; it was a struct, which referenced an entity).
Dec 07 2013