
digitalmars.D.learn - Walnut

reply Dan <murpsoft hotmail.com> writes:
Hi guys,

I've been working on the Walnut project for a while now by myself, and it's got
one major component remaining before the alpha debugging can start going down -
the parser.

http://dsource.org/projects/walnut/browser/branches/1.9/

The code seems pretty elegant so far, and it should be easy enough to read.  I
recommend starting in Value.d, but I'm sure there are lots of things I've done
wrong while programming it.  I was hoping I could convince some people to take
a look at it and point out my mistakes.

I'm going to be studying how to use Trac today, so if you'd be so kind, please
use the forums for now.

Much appreciated,
Regards,
Dan
Dec 31 2007
parent reply Alan Knowles <alan akbkhome.com> writes:
Dan wrote:
 Hi guys,
 
 I've been working on the Walnut project for a while now by myself, and it's
got one major component remaining before the alpha debugging can start going
down - the parser.
 
 http://dsource.org/projects/walnut/browser/branches/1.9/
 
 The code seems pretty elegant so far, and it should be easy enough to read.  I
recommend starting in Value.d, but I'm sure there are lots of things I've done
wrong while programming it.  I was hoping I could convince some people to take
a look at it and point out my mistakes.
 
As you've opened the door - please regard the below as my personal opinion, and take it as such... ;)

Value.d:
You have made quite heavy use of op*** magic methods. Having done this before a few times, it always bites me in the ass later on - you have to remember what magic is going to occur when you assign/create.

It may be better to switch to more obvious/classic methods: overloaded constructors or an overloaded static method "construct()", to[typename], index(int id), etc. This is going to make the future code a lot easier to read, understand and maintain (and enable others to quickly work out what is going on).

structure.d:
I would be tempted to create a method to generate this type of code, rather than trying to create a solution (the op*** stuff) for a problem that would not have existed if you had gone down the road of using code generators. Have a look at this to get an idea:
http://www.akbkhome.com/svn/gtkDS/wrap/APILookupPhobos.txt

text.d:
Some of this could be autogenerated, and enum'd (might be clearer), e.g. TEXT.undefined

methods.d:
Again, using a code generator would give you the benefits of documentation and of using static D classes to encapsulate each Javascript class (along with making smaller, more manageable files).

Not sure why your standard method call is using varargs (...) - unless I misread the code.

------------
interpreter.d (might be better to rename it tokenizer.d)

Looks like the next big jobs would be:
finish tokenizing (and do some test cases)
creating the opcodes... (this part looks painful and time-consuming, having seen the dmdscript version - parser / expression / statement etc.)
Scope Management?
Opcode runtime (~2800 lines of code in dmdscript)

---
Unfortunately I detest forums - old school (or stubborn); I prefer good ole mailing lists (which I have for most D newsgroups, as I pull the nntp feed into my mailbox).

I'm keeping an eye on Walnut, but since most of my needs for Javascript/DMDScript mean getting results very quickly from hacks to DMDScript, I can't really justify too much real help to Walnut unfortunately - but do keep working on it. As soon as the opcode runtime and parser/opcode builder are done, I'd be pretty keen to retarget all the binding code for DMDScript to be Walnut only.

Regards
Alan
 I'm going to be studying how to use Trac today, so if you'd be so kind, please
use the forums for now.
 
 Much appreciated,
 Regards,
 Dan
 
Jan 01 2008
parent reply Dan Lewis <murpsoft hotmail.com> writes:
Alan Knowles Wrote:
 As you've opened the door - please regard the below as my personal 
 opinon, and take as such... ;)
Of course. : )

If I may, please don't take this as a rejection of your help. I'm simply explaining how I've come to where I am. I'll probably implement at least some of these right away.
 
 Value.d:
 You have made quite heavy use of op*** magic methods, Having done this 
 before a few times, it always bites me in the ass later on.. - as you 
 have to remember what magic is going to occur when you assign/create.
Yeah, the opAssign/opCall is only being used so I can go:

Value v = cast(Value) 4;

instead of:

Value v;
v.i = 4;
v.type = TYPE.NUMBER;

The only magic that ever happens there should be the automatic type property assignment.

Then there's the opCall(Value, Value, Value[] ...) which is to call Functions, and the opIndex, opIndexAssign, opIn_r which are to use Values as Objects.

The promotion of the Value struct to hold Function, Array and Object is a blatant disregard of the ECMAScript spec; however, it *is* semantically consistent, and consistent with the language itself. It could bring a significant structural advantage, as we now have a single primitive to work with - and the original form needed to disambiguate the type of a Value anyway.
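To spell the magic out, it's roughly the following - a much-simplified sketch with made-up field names and a D2-style string, not the actual Value.d:

enum TYPE { UNDEFINED, NUMBER, STRING, OBJECT }

struct Value
{
    TYPE type;
    double num;
    string str;
    Value[string] props;   // the real code avoids AAs; this is only for the sketch

    // "construction" magic: Value(4)
    static Value opCall(double n)
    {
        Value v;
        v.type = TYPE.NUMBER;
        v.num = n;
        return v;
    }

    // assignment magic: v = 4 keeps the type tag in sync automatically
    void opAssign(double n) { type = TYPE.NUMBER; num = n; }

    // object-style property access: v["x"] and v["x"] = w
    // (missing keys would need real handling; this is the bare shape)
    Value opIndex(string name) { return props[name]; }
    void opIndexAssign(Value v, string name) { props[name] = v; }
}

void example()
{
    Value v = Value(4);    // static opCall sets the tag for us
    v = 5;                 // opAssign, tag stays NUMBER
    Value o;
    o.type = TYPE.OBJECT;
    o["x"] = v;            // opIndexAssign: a Value used as an Object
}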
 
 It may be better to switch to more obvious/classic methods, - overloaded 
 constructors or an overloaded static method "construct()", / 
 to[typename], index(int id) etc.
I am actually starting to think that the Value.to[typename] format is cumbersome. In all honesty, I'm not sure whether the output of Value.toString() on a Number object containing a value of 4 should be "4", "[object Number]", "4.0" (it's a double), or what. I was then wondering how this should relate to the methods that we have: Object_prototype_toString, RegExp_prototype_source, etc.

So there'll be a semantic change there somewhere to disambiguate, as I'm sure we both agree that ambiguity is bad.
 = This is going to make the future code alot easier to read, and 
 understand. (along with maintain, enable others to quickly work out what 
 is going on )
 
 structure.d:
 I would be tempted to create a method to generate this type of code, 
It is very tedious to maintain that one. I'll probably try to do something like that soon.

To expand, I had originally hoped to be able to use associative arrays, but they apparently contain a pointer to a complex hashing structure with even more pointers below. I had hoped to simply sort the char[]-pointing structures alphabetically by string and do a binary search, which is probably faster for the small sets typically used for ECMAScript objects.

The structure.d file was an effort to create a static literal which wouldn't need any memcpy or anything of the sort; it would be loaded in via DMA straight from the file and be usable immediately.
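The lookup I have in mind is roughly this (made-up names, not structure.d itself, and Value here is just a one-field stand-in) - keep the properties sorted by key, do a plain binary search, and the whole table can sit in the static data segment:

struct Value { double num; }   // stand-in for the real Value struct

struct Property
{
    char[] key;
    Value  val;
}

// returns a pointer to the property, or null if it isn't there
Property* lookup(Property[] props, char[] key)
{
    size_t lo = 0, hi = props.length;
    while (lo < hi)
    {
        size_t mid = (lo + hi) / 2;
        if (props[mid].key == key) return &props[mid];
        if (props[mid].key < key)  lo = mid + 1;   // array comparison is lexicographic
        else                       hi = mid;
    }
    return null;
}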
 rather than trying to create a solution(the op*** stuff) for a problem 
 that would not have existed if you had gone down the road of using code 
 generators..
 Have a look at this to get an idea..
 http://www.akbkhome.com/svn/gtkDS/wrap/APILookupPhobos.txt
 
 text.d:
 some of this could be autogenerated, and enum'd (might be clearer)
 eg...  TEXT.undefined
Definitely like the enum notation better. : )
 
 methods.d:
 again, using a code generator would give you benefit's of documentation 
 and using static D classes to encapsulate each Javascript class (along 
 with making smaller more manageable files..)
When I converted Walnut 1.x from DMDScript, I was mostly doing it to understand what Walter had written and learn how a good implementation looks. I noticed that there was a lot of redundancy in each of the files, and that my head was filling with all sorts of different constructs as I examined each file. That's why I converted it to aspect oriented. Now the code is so boringly simple that, apart from value.d, it reads like a list.

The problem with encapsulating JavaScript classes with D classes is that the spec requires you to be able to expand JavaScript objects, so you eventually have to use an array notation inside them, as per DMDScript and SpiderMonkey. You end up duplicating several properties between the array notation and the class notation, and there's extensive code to look up the address of an ECMAScript property. This is why even DMDScript property lookup is a few times slower than Lua or Io.
 
 Not sure why your standard method call is using varargs (...) - unless I 
 misread the code..
The Value[] arguments was originally not varargs, and you could pass it an array of Values just fine. My interpretation of varargs is that it converts a set of Values to a Value[] at the caller by prepending the length? So the varargs would simply mean you can now call the function passing:

(self, cc, arg1, arg2, arg3)

as well as:

Value[] args = [ arg1, arg2, arg3 ];
(self, cc, args)

and the call would look identical.
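As far as I can tell that's exactly how D's typed varargs behave - a dynamic array is just a length plus a pointer, so the callee sees the same Value[] slice either way. A tiny standalone check (made-up names, Value trimmed to one field):

import std.stdio;

struct Value { double num; }

Value nativeCall(Value self, Value cc, Value[] args ...)
{
    writefln("%s args", args.length);
    return Value.init;
}

void main()
{
    Value self, cc;
    Value a, b, c;
    nativeCall(self, cc, a, b, c);    // arguments listed individually
    Value[] packed = [a, b, c];
    nativeCall(self, cc, packed);     // or passed as one array - same callee view
}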
 interpreter.d
 (might be better to rename it tokenizer.d)
I'm (now) hoping to run the parser algorithms from the same file, and to make sure it inlines the lexer. I'm not sure if I want to generate tokens and then interpret them, or if I can use the finite state brought about by position in the lexer switch to mean the same thing (preventing a double switch). The problem with that is that I can't seem to think beyond one token very well - the same problem faced by the guys who invented the separation of lexer, parser and interpreter.
 
 Looks like the next big jobs would be:
 finish tokenizing (and do some test cases)
 creating the OPcodes.... (this part looks painful and time-consuming, 
 having seen dmdscript version. - parser / expression / statement etc.)
Yup. I'm facing some analysis paralysis on this one, trying to come up with something cool (the wheel's already been invented, so why not do it round this time?)
 
 Scope Management?
 Opcode runtime (~2800 lines of code in dmdscript)
Yeah, I was hoping to tie scope in with something during parsing of {}. I've already got a Global object which is already being looked at for non-keywords in my [rather pathetic so far] lexer. I think what I want is a bunch of Values which are of TYPE.OBJECT, or perhaps a new type just like it, which carry variables and such.

If I compile all functions down to (unoptimized) native code with the same call interface as the natives (my dream), then I could probably just use the stack to handle scope the natural way, instead of faking it like most interpreters do.
 
 ---
 Unfortunately I detest forum's - old school (or stuborn), i prefer good 
 ole mailing lists (which I have for most D newsgroups, as I pull the 
 nntp feed into my mailbox)
 
 I'm keeping an eye on Walnut, but I since most of my needs for 
 Javascript/DMDscript mean getting results very quickly from hacks to 
 DMDscript, I cant really justify to much real help to Walnut 
 unfortunately - but do keep working on it, as soon as the opcode 
 runtime, parser/opcode builder are done, I'd be pretty keen to retarget 
 all the binding code for DMDscript to be Walnut only.
Actually, Walnut 1.0 is branched from DMDScript, but I reformatted it, cleaned it up and the like. It almost has native ActiveX, more so than JScript. But there are major bugs that I don't understand. Perhaps you'd be more prone to help there than with Walnut 2.x.

Well, that was a HUGE ramble.

Regards,
Dan
Jan 02 2008
parent reply Alan Knowles <alan akbkhome.com> writes:
snip snip..  - lots of bits inline.

Actually I forgot to mention - have you seen ECMAScript 4 (the new one)? 
While aiming for that as a target may be a bit adventurous, it's 
probably worth thinking about how some of it will be implemented eventually.

 Yeah, the opAssign/opCall is only being used so I can go:
 Value v = cast(Value) 4;

 instead of:
 Value v;
 v.i = 4;
 v.type = TYPE.NUMBER;
   
Yeah, I was hoping for something like:

auto v = new Value(4);

which is roughly the same length, and is a little clearer - probably worth thinking about for new code. (If you decide to go for it, I might get bored one day and help you refactor the old code ;)
 The only magic that ever happens there should be the automatic type property
assignment.

 Then there's the opCall(Value, Value, Value[] ...) which is to call Functions,
and the opIndex, opIndexAssign, opIn_r which are to use Values as Objects.

 The promotion of the Value struct to hold Function, Array and Object is a
blatant diregard of the ECMAScript spec, however, it *is* semantically
consistent, and consistent with the language itself.  It could be used to bring
a significant structural advantage as we now have a single primitive to work
with; and since the original form needed to disambiguate the type of a Value
anyways.
   
I suspect this may get into a bit of trouble when you deal with some of the weird and wonderful scoping stuff in Javascript. From what I remember:

CallableFunction extends Object
Value can hold an object...

CallableFunction holds a reference to the FunctionDefinition (code etc.)
FunctionDefinition holds a reference to the creation scope...
   
 It may be better to switch to more obvious/classic methods, - overloaded 
 constructors or an overloaded static method "construct()", / 
 to[typename], index(int id) etc.
     
I am actually starting to think that the Value.to[typename] format is cumbersome, as in all honesty, I'm not sure whether the output of a Value.toString() which is a Number object containing a value of 4 should be "4", "[object Number]", "4.0" (it's a double), or what. I was then wondering how this should relate to the methods that we have; Object_prototype_toString, RegExp_prototype_source, etc. So, there'll be a semantic change there somewhere to disambiguate, as I'm sure we both agree that ambiguity is bad.
As D uses the cast keyword, it's actually marginally shorter:

a = cast(String) theval;
a = theval.toString();

Obviously, you could use as[typename] to make it distinct...
   
 = This is going to make the future code alot easier to read, and 
 understand. (along with maintain, enable others to quickly work out what 
 is going on )

 structure.d:
 I would be tempted to create a method to generate this type of code, 
     
It is very tedious to maintain that one. I'll probably try to do something like that soon. To expand, I had originally hoped to be able to use Associative Arrays, but they apparently contain a pointer to a complex hashing structure with even more pointers below. I had hoped to simply sort the char[] pointing structures based on the strings alphabetically and do a binary search; which is probably faster for the small sets typically used for ECMAScript objects. The structure.d file was an effort to create a static literal which wouldn't need any memcpy or anything of the sort; it would be loaded in via DMA straight from the file and be useable immediately.
When you start binding something like gtk, with craploads of enums expressed as object properties, the whole lookup stuff gets even more complex. I suspect the answer is to have methods to add/get properties etc., use assoc. arrays to start with, and then optimize the crap out of it later.
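Something along these lines is all I mean (hypothetical names, with a stand-in Value) - callers only ever see get/set, so the storage behind them can go from an AA to something cleverer later without touching anything else:

struct Value { double num; }    // stand-in for your Value struct

struct JSObject
{
    Value[string] props;        // the builtin AA, to start with

    Value get(string name)
    {
        if (auto p = name in props) return *p;
        return Value.init;      // i.e. undefined
    }

    void set(string name, Value val) { props[name] = val; }
}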
   

 When I converted Walnut 1.x from DMDScript, I was mostly doing it to
understand more of what Walter had written to learn how a good implementation
looks.

 I noticed that there was alot of redundancy in each of the files, and that my
head was filling with all sorts of different constructs as I examined each
file.  That's why I converted it to aspect oriented.  Now the code is so
boringly simple that apart from value.d it reads like a list.

 The problem with encapsulating JavaScript classes with D classes is that spec
requires you to be able to expand JavaScript objects, so you eventually have to
use an array notation inside that; as per DMDScript and Spidermonkey.  You end
up duplicating several properties inside the array notation and class notation;
and there's extensive code to look up the address of an ECMAscript property. 
This is why even DMDScript property lookup is a few times slower than Lua or Io.
   
Most of the classes are really just static classes - just used to tidy up the code, rather than actually doing real encapsulation. E.g.:

static class Date {
    void getHours(....) { }
    void getTime(....) { }
}

Although I have to add prefixes in the code generator when doing bindings, as library writers seem to have a horrible habit of using D keywords or common method names ;)
   
 Not sure why your standard method call is using varargs (...) - unless I 
 misread the code..
     
The Value[] arguments was originally not a varargs, and you could pass it an array of Values just fine. My interpretation of varargs is that it converts a set of Values to a Value[] at the caller by prepending the length? So the varargs would simply mean you can now call the function passing: (self, cc, arg1, arg2, arg3), as well as: Value[] args = { arg1, arg2, arg3 }; (self, cc, args) and that the call would look identical.
Mmh, kind of cute ;) - that reminds me of those cool language features that confuse other people when they first see them:

[string] / [number] => resulting in an array of strings [Pike]
   
 interpreter.d
 (might be better to rename it tokenizer.d)
     
I'm (now) hoping to run the parser algorithms from the same file, and making sure it inlines the lexer. I'm not sure if I want to generate tokens and then interpret them, or if I can use the finite state brought about by position in the lexer switch to somehow mean the same (preventing a double-switch). The problem with that is that I can't seem to think beyond one token very well - the same one faced by the guys who invented separation of lexer, parser, interpreter.
Actually having the tokenizer available is really useful -
http://www.akbkhome.com/blog.php/View/156/Script_Crusher.html

I think you are being a bit hopeful on that. I face the same problem with the steps: by the time I've finished the parser, my brain is usually at exploding point and I give up ;)

I was wondering - although you can not copy DMDScript directly, if someone (eg. me) wrote a summary of the steps that were involved in the parser/code gen stage (and possibly down to opcodes), then you would not be breaking copyright??? Based on code documentation, rather than actual code???? It would save you a considerable amount of pain.....
   
 Scope Management?
 Opcode runtime (~2800 lines of code in dmdscript)
     
Yeah, I was hoping to tie scope in with something during parsing of {}. I've already got a Global object which is already being looked at for non-keywords in my [rather pathetic so far] lexer. I think what I want is a bunch of Value's, which are of TYPE.OBJECT, or perhaps a new type just like it, which carry variables and stuff.
I need to understand how Walter solved the closures bug in the last release - I copied the code into my repo, but did not have time to understand it.

The problem you get with Javascript is that the scope is not only from Global, but also the creation scope (which may not be known at compile time), and outer layers (eg. functions within functions etc.)
 If I compile all functions down to (unoptimized) native code with the same
call interface as the natives (my dream) then I could probably just use the
stack to handle scope as per the natural way instead of faking it like most
interpreters.
   
I've looked at this a few times; I don't think you will ever get native code out of a scripted language very well (let alone understanding gcc's internals to make it happen ;). One thing to think about is how it may be possible to write your opcode arrays to memory, and how to duplicate your stack (so that your interpreter can eventually handle multi-threaded applications); key to this is making the Value object serializable/unserializable.
 Actually, Walnut 1.0 is branched from DMDScript, but I reformatted it, cleaned
it up and the likes.  It almost has native ActiveX, moreso than JScript.  But
there are major bugs that I don't understand.  Perhaps you'd be more prone to
help there than Walnut 2.x.
   
Have you updated Walnut 1.0 with Walter's last change - the closure fix?

Yes, there's a lot of other stuff I've added to DMDScript that could do with a better home ;)

Regards
Alan
 Well, that was a HUGE ramble.
 Regards,
 Dan
   
Jan 02 2008
parent reply Dan Lewis <murpsoft hotmail.com> writes:
Alan Knowles Wrote:

 snip snip..  - lots of bits inline.
 
 Actually I forgot to mention - have you seen ECMAscript 4 (the new one) 
Yeah, so far I've been targeting ECMAScript 3, but I've seen 4 and have it in mind. I figured I'd worry about the difference after it was running js files.
 Yeah, the opAssign/opCall is only being used so I can go:
 Value v = cast(Value) 4;

 instead of:
 Value v;
 v.i = 4;
 v.type = TYPE.NUMBER;
   
yeah I was hoping for something like: auto v = new Value(4);
I wish I could, but auto only accepts "simple" data types, and constructors can't even be faked in structs. One would need to store it in a class, which involves keeping unwieldy, opaque vtbls and forces us to use the heap and pass by reference (structs can go either way).

What we can do now is go:

Value v = 4;

and have it correctly use opAssign and opCall. What it fails to do is handle things like:

Value myFunc() { return 4; }
(If you decide to go for it, I might 
 get bored one day, and help you refactor the old code ;)
You're always welcome to try refactoring it any which way. If the resultant program is more elegant, it goes in my source.
 I suspect this may get into a bit of trouble when you deal with some of 
 the weird and wonderfull scoping stuff with Javascript.
  From what I remember:
 
 CallableFunction extends Object
 Value can hold an object...
 
 CallableFunction holds a reference to the FunctionDefinition (code etc.)
 FunctionDefinition holds a reference to the Creation scope...
Actually, that's pretty easy. Value stores both callable js functions and js objects, and the js functions hold pointers to native functions.

Getting more challenging, the scope for the function takes two aspects: first we have a bunch of identifiers that we need to know, and second, we need those to be local to the function. My plan was to essentially push Values onto the local part of the call stack (below EBP).
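In sketch form (hypothetical layout, trimmed right down), a callable Value is just a Value whose payload is a pointer to a native function with the standard call interface:

struct Value
{
    double num;     // plus the type tag, string, properties etc. in the real thing
    Value function(Value self, Value cc, Value[] args) native;

    // opCall magic, so a function-Value can be invoked directly: fn(self, cc, args)
    Value opCall(Value self, Value cc, Value[] args)
    {
        return native(self, cc, args);
    }
}

// a "native" with the common call interface
Value addAll(Value self, Value cc, Value[] args)
{
    Value r;
    r.num = 0;
    foreach (a; args) r.num += a.num;
    return r;
}

void example()
{
    Value fn;
    fn.native = &addAll;
    Value one, two;
    one.num = 1; two.num = 2;
    Value self, cc;
    Value r = fn(self, cc, [one, two]);   // opCall dispatches to the native
}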
 It may be better to switch to more obvious/classic methods, - overloaded 
 constructors or an overloaded static method "construct()", / 
 to[typename], index(int id) etc.
as D uses the cast keyword, it's actually marginally shorter, a = cast(String) theval
The problem is that cast(string) X can only be defined either on the string type (which is native to D), or for *one* type via the opCast method. The reason for only being able to do it for one type is that D functions can't be overloaded on return type alone (I dunno, ask Walter?)
 a = theval.toString();
 obviously, you could use as[typename], to make it distinct...
Yeah, that could work.
 I would be tempted to create a method to generate this type of code, 
It is very tedious to maintain that one. I'll probably try to do something like that soon.
When you start binding something like gtk, with craploads of enum's expressed as object properties, the whole lookup stuff gets even more complex. I suspect the answer is to have a method to add/get property etc. and use assoc. arrays to start with, then optimize the crap out of it later..
Yeah, I would, except I wouldn't be able to optimize out AAs - they're part of D, not part of my code. What I believe I could do is use AAs now, and later switch to opIndex, opAssign, opIn_r; but essentially this doesn't provide much benefit over what I've got now - the produced functionality is the same, and I *can* declare a static literal.
 Most of the classes are really just static classes - just used to tidy 
 up the code, rather than actually doing real encapsulation.
 eg.
 static class Date {
       void  getHours(....) { }
       void  getTime(....) { }
 }
Oh, I didn't know that got optimized out.
 Although I have to add prefixes in the code generator when doing 
 bindings, as library writers seem to have a horible habit of using D 
 keywords or common method names ;)
Yeah, I don't mind folks using common names, as long as they tidy up their namespaces. Walnut at this point is not tidy or threadsafe. My program is using static global variables for now, and none of the modules identify themselves as walnut.module.
 interpreter.d
 (might be better to rename it tokenizer.d)
Actually having the tokenizer available is really usefull - http://www.akbkhome.com/blog.php/View/156/Script_Crusher.html
To follow up, today I created the functions interpret() and tokenize(). The tokenizer returns Values, which can now also be non-morphemic tokens (like '=' and 'a bunch of stuff in parens'). The tokenizer puts data into the Value for morphemic tokens (like numbers, strings, etc.). This is actually a good deal better than DMD because the source only needs to be read once.
 I think you are being a bit hopeful on that. - I face the same problem 
 with the steps, by the time I've finished the parser, my brain is usualy 
 at exploding point and I give up ;)
Yeah, I realized that you can only efficiently have a single instruction address, which is why I couldn't move beyond the current token. One could theoretically write a predictive parser, but those are evil.
 
 I was wondering, although you can not copy DMDscript directly, If 
 someone (eg. me) wrote a summary of the steps that where involved in the 
 parser/code gen stage.. - and posibly to opcodes, then you would not be 
 breaking copyright??? based on code documentation, rather than actual 
 code????
 - It would save you a considerable amount of pain.....
Yes, except the objective isn't to copy DMDScript without the license; the objective is to create an engine that's significantly better. At the moment, I would say roughly half the code is written and I'm using 108KB vs DMDScript's 513KB. The parser is the only remaining component before it can (incorrectly) run javascript files. The rest is debugging.
 I need to understand how Walter solved the closures bug in the last 
 release - I copied the code into my repo, but did not have time to 
 understand it.
 The problems you get with Javascript, is that the scope is not only from 
 Global, but also creation scope (which may not be a compile time).. and 
 outer layers (eg. functions within functions etc.)
You have a scope chain, essentially: every function containing this one (including global) is checked in order, from this context up to global. This can be done by examining the stack during runtime, which should only be storing Value structs, so it should be pretty readily understood.
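In data-structure terms it's no more than this (made-up names, stand-in Value; the real thing would walk stack frames of Values rather than an AA per scope):

struct Value { double num; }    // stand-in for the real Value struct

struct Scope
{
    Value[string] vars;         // identifiers local to this activation
    Scope* outer;               // enclosing function's scope; null past Global
}

// look an identifier up from the innermost scope outward
Value* resolve(Scope* s, string name)
{
    for (; s !is null; s = s.outer)
        if (auto p = name in s.vars)
            return p;
    return null;                // ReferenceError territory
}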
 If I compile all functions down to (unoptimized) native code with the same
call interface as the natives (my dream) then I could probably just use the
stack to handle scope as per the natural way instead of faking it like most
interpreters.
   
I've looked at this a few times, I dont think you will ever get native code out of a scripted language very well. (let alone understanding gcc's internals to make it happen ;) - One thing to think about is how it may be possible to write your opcode arrays to memory, and how to
The plan is to write opcodes to memory strictly operating on whatever's in my Value structs. There isn't any ambiguity between types or sizes, and I can probably take advantage of D's inline asm statements to ease things a bit.
 duplicate your stack (so that  you r interpreter can eventually handle  
 multi-threaded applications), key to this is making the Value object 
 serializable/unserializable..
You mean like fork()? I'm thinking of being able to run multiple instances of ecmascript by calling:

global = Global_init();
interpret(source, global, args);

as many times as I like within the same program, from different threads (which someone else can go ahead and figure out how to make).

Value is already serializable - it's a struct. One of the reasons I hate classes is because they're opaque, and thus very hard to serialize.
 Have you updated Walnut 1.0 with Walter's last change? - the closure fix?
Nope. I should though, and I should make 1.x run on D 2.x
 Yes, There are alot of other stuff I've added to DMDscript that could do 
 with a better home ;)
Highly interested. Also, 1.x almost has native ActiveXObject. It needs a few bugs worked out, and I haven't had the brainpower to face it again for a while.

At the moment, fromVariant isn't recognizing whatever type is being passed for numbers (as seen by running test\activex.nut). It's recognizing functions and I think even letting you call them. It's enumerating the properties perfectly. I had to comment out the Put method, but I'd like to refactor set and setByRef into that. That would make ActiveXObject more native to Walnut 1.x than to JScript. : p

Regards,
Dan
Jan 02 2008
parent reply Alan Knowles <alan akbkhome.com> writes:
  .
 
 Yes, except the object isn't to copy DMDScript without the license,
 the objective is to create an engine that's significantly better.  At
 the moment, I would say roughly half the code is written and I'm
 using 108KB vs DMDScript's 513KB.  The parser is the only remaining
 component before it can (incorrectly) run javascript files.  The rest
 is debugging.
 
So is the idea to run the interpreter inside the parsing engine, or are you going to generate opcodes? It wasn't quite clear.

Regards
Alan
Jan 02 2008
parent reply Dan <murpsoft hotmail.com> writes:
Alan Knowles Wrote:

   .
 
 Yes, except the object isn't to copy DMDScript without the license,
 the objective is to create an engine that's significantly better.  At
 the moment, I would say roughly half the code is written and I'm
 using 108KB vs DMDScript's 513KB.  The parser is the only remaining
 component before it can (incorrectly) run javascript files.  The rest
 is debugging.
 
so is the idea to run the interpreter inside the parsing engine? or are you going to generate opcodes? - It wasn't quite clear? Regards Alan
Would you believe me if I said combinations of both?

For now I want to do this:
0) interpret top-level, and compile functions and loops to unoptimized native for execution (probably default behavior)

Later, I'd like it to be able to:
1) tokenize everything and serialize the output.
2) interpret everything on-the-fly, using bytecode for loops and functions.
3) compile the whole program to unoptimized native and serialize the output.
4) run serialized token streams and serialized compiled scripts.

I'm aware that's a tall order. That's why I'm not scheduling all those for Walnut 2.0; they'll come with following minor versions.

Regards,
Dan
Jan 03 2008
parent reply Alan Knowles <alan akbkhome.com> writes:
Yes, sounds like a good approach.. -

Might be worth playing with naming the Parse methods around the grammar 
as documented in the ECMAScript spec.

eg. something like:

// methods? returns true if a statement was found?? or should it just throw 
// an exception???
bool Statement(bool execute=true)
{
	switch(tok) {
		case '{': // Block:
			while(Statement());
			if (tok != '}') throw Error....
		case TEXT.var: //   VariableStatement:
			while(VariableStatement());
			if (tok != ';') throw Error....			
		case ';': // EmptyStatement:
			return true;
		
// ExpressionStatement:  -- this may be tricky.. (it uses lookahead?)
		case TEXT.if:
			if (tok != '(') throw Error....
			bool doif = Expression(); // return true|false?
			if (!Statement(doif)) throw Error...
			if (tok == TEXT.else)
				if (!Statement(!doif)) throw Error...	
			
  IterationStatement:
  ContinueStatement:
  BreakStatement:
  ReturnStatment:
  WithStatement:
  LabelledStatement:
  SwitchStatement:
  ThrowStatement:
  TryStatement:


Regards
Alan


Dan wrote:
 Alan Knowles Wrote:
 
   .
 Yes, except the object isn't to copy DMDScript without the license,
 the objective is to create an engine that's significantly better.  At
 the moment, I would say roughly half the code is written and I'm
 using 108KB vs DMDScript's 513KB.  The parser is the only remaining
 component before it can (incorrectly) run javascript files.  The rest
 is debugging.
so is the idea to run the interpreter inside the parsing engine? or are you going to generate opcodes? - It wasn't quite clear? Regards Alan
Would you believe me if I said combinations of both? For now I want to do this: 0) interpret top-level, and compile functions and loops to unoptimized native for execution (probably default behavior) Later, I'd like it to be able to: 1) tokenize everything and serialize the output. 2) interpret everything on-the-fly, using bytecode for loops, functions 3) compile the whole program to unoptimized native and serialize the output. 4) run serialized token streams, and serialized compiled scripts. I'm aware that's a tall order. That's why I'm not scheduling all those for Walnut 2.0. They'll come with following minor versions. Regards, Dan
Jan 03 2008
parent reply Alan Knowles <alan akbkhome.com> writes:
That little snippet reminded me of another trick:

Don't create Tokens for single-character tokens - eg. -, =, ", ', ..... etc.
Start the Token Enum from 127.

In the case of your Value Type Enum, this should be ok. (may take a bit 
of fudging with the Typedefs on Value.type / Enum creation.)

This enables you to do stuff like the example below:

switch(Value.type) {
     case ':':
     case Token.IF:
     case '=':

Which makes the code considerably more readable (no remembering what 
Token.LT and Token.GT were supposed to be...)
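To make it concrete (hypothetical members - pick whatever names you actually need):

enum Token : int
{
    IF = 128,           // anything past the single-byte character range works
    ELSE,
    IDENTIFIER,
    NUMBER,
    EOF,
}

void dispatch(int tok)
{
    switch (tok)
    {
        case ':':           // single-char tokens are just their char codes
        case '=':
            // ... handle punctuation ...
            break;
        case Token.IF:      // named tokens start above 127, so no collisions
            // ... handle the keyword ...
            break;
        default:
            break;
    }
}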

Regards
Alan




Alan Knowles wrote:
 Yes, sounds like a good approach.. -
 
 Might be worth playing with naming the Parse methods around the Grammer 
 as documented in the ECMAScript spec.
 
 eg. something like:
 
 // methods? returns 1 if statement found?? or should it just throw an 
 exception???
 bool Statement(bool execute=true)
 {
     switch(tok) {
         case '{': // Block:
             while(Statement());
             if (tok != '}') throw Error....
         case TEXT.var: //   VariableStatement:
             while(VariableStatement());
             if (tok != ';') throw Error....           
         case ';': // EmtpyStatement:
             return 1;
        
 ExpressionStatement:  -- this may be tricky.. (it uses lookahead?)
         case TEXT.if:
             if (tok != '(') throw Error....
             bool doif = Expression(); // return true|false?
             if (!Statement(doif)) throw Error...
             if tok == TEXT.else
                 if (!Statement(!doif)) throw Error...   
            
  IterationStatement:
  ContinueStatement:
  BreakStatement:
  ReturnStatment:
  WithStatement:
  LabelledStatement:
  SwitchStatement:
  ThrowStatement:
  TryStatement:
 
 
 Regards
 Alan
 
 
 Dan wrote:
 Alan Knowles Wrote:

   .
 Yes, except the object isn't to copy DMDScript without the license,
 the objective is to create an engine that's significantly better.  At
 the moment, I would say roughly half the code is written and I'm
 using 108KB vs DMDScript's 513KB.  The parser is the only remaining
 component before it can (incorrectly) run javascript files.  The rest
 is debugging.
so is the idea to run the interpreter inside the parsing engine? or are you going to generate opcodes? - It wasn't quite clear? Regards Alan
Would you believe me if I said combinations of both? For now I want to do this: 0) interpret top-level, and compile functions and loops to unoptimized native for execution (probably default behavior) Later, I'd like it to be able to: 1) tokenize everything and serialize the output. 2) interpret everything on-the-fly, using bytecode for loops, functions 3) compile the whole program to unoptimized native and serialize the output. 4) run serialized token streams, and serialized compiled scripts. I'm aware that's a tall order. That's why I'm not scheduling all those for Walnut 2.0. They'll come with following minor versions. Regards, Dan
Jan 03 2008
parent reply Alan Knowles <alan akbkhome.com> writes:
Just checking the code against this - why are you not using Type|Token 
numbers for the keywords?
...
	case TEXT_case:
		v.s = TEXT_case;
		v.type = TYPE.KEYWORD;
		return v;
...

would work a lot better as:
	
	switch(word) {
		case "case" : v.type   = Token.CASE;  return v;
		case "if" :   v.type   = Token.IF;    return v;
		case "else" : v.type   = Token.ELSE;  return v;		
		....
Thinking about this - it may be a good idea to
alias TYPE Token;

It will give you quite a good readability gain.
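Put together, the keyword path would read something like this (hypothetical member names; assumes the named tokens already start above 127 as per my earlier post):

enum TYPE : int { UNDEFINED, NUMBER, STRING, OBJECT, CASE = 128, IF, ELSE }
alias TYPE Token;               // Token.IF reads much better than TYPE.IF in a parser

Token keywordToken(string word)
{
    switch (word)
    {
        case "case": return Token.CASE;
        case "if":   return Token.IF;
        case "else": return Token.ELSE;
        default:     return Token.UNDEFINED;    // i.e. an ordinary identifier
    }
}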

Regards
Alan







Alan Knowles wrote:
 
 That little snippet reminded me of another trick:
 
 Dont create Tokens for single character Tokens - eg. -,=,",',..... etc.
 Start the Token Enum from 127.
 
 In the case of your Value Type Enum, this should be ok. (may take a bit 
 of fudging with the Typedefs on Value.type / Enum creation.)
 
 This enables you to do stuff like the example below:
 
 switch(Value.type) {
     case ':':
     case Token.IF:
     case '=':
 
 Which makes the code considerably more readable. (no remembering what 
 Token.LT Token.GT where supposed to be...
 
 Regards
 Alan
 
 
 
 
 Alan Knowles wrote:
 Yes, sounds like a good approach.. -

 Might be worth playing with naming the Parse methods around the 
 Grammer as documented in the ECMAScript spec.

 eg. something like:

 // methods? returns 1 if statement found?? or should it just throw an 
 exception???
 bool Statement(bool execute=true)
 {
     switch(tok) {
         case '{': // Block:
             while(Statement());
             if (tok != '}') throw Error....
         case TEXT.var: //   VariableStatement:
             while(VariableStatement());
             if (tok != ';') throw Error....                   case 
 ';': // EmtpyStatement:
             return 1;
        ExpressionStatement:  -- this may be tricky.. (it uses lookahead?)
         case TEXT.if:
             if (tok != '(') throw Error....
             bool doif = Expression(); // return true|false?
             if (!Statement(doif)) throw Error...
             if tok == TEXT.else
                 if (!Statement(!doif)) throw Error...              
  IterationStatement:
  ContinueStatement:
  BreakStatement:
  ReturnStatment:
  WithStatement:
  LabelledStatement:
  SwitchStatement:
  ThrowStatement:
  TryStatement:


 Regards
 Alan


 Dan wrote:
 Alan Knowles Wrote:

   .
 Yes, except the object isn't to copy DMDScript without the license,
 the objective is to create an engine that's significantly better.  At
 the moment, I would say roughly half the code is written and I'm
 using 108KB vs DMDScript's 513KB.  The parser is the only remaining
 component before it can (incorrectly) run javascript files.  The rest
 is debugging.
so is the idea to run the interpreter inside the parsing engine? or are you going to generate opcodes? - It wasn't quite clear? Regards Alan
Would you believe me if I said combinations of both? For now I want to do this: 0) interpret top-level, and compile functions and loops to unoptimized native for execution (probably default behavior) Later, I'd like it to be able to: 1) tokenize everything and serialize the output. 2) interpret everything on-the-fly, using bytecode for loops, functions 3) compile the whole program to unoptimized native and serialize the output. 4) run serialized token streams, and serialized compiled scripts. I'm aware that's a tall order. That's why I'm not scheduling all those for Walnut 2.0. They'll come with following minor versions. Regards, Dan
Jan 03 2008
parent reply Alan Knowles <alan akbkhome.com> writes:
Not sure if it's feasible to keep a 2.0 + gdc build working - but these 
changes (attached) work for gdc. It looks like the const(Value) / const(char) 
syntax completely throws gdc, even if it's inside a version(D_Version2) block.
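For example, even something as innocuous as this seems to be enough to upset it - the skipped branch still has to get through gdc's D1 parser:

version (D_Version2)
{
    const(char)[] name;     // D2-only syntax: old gdc chokes while merely parsing this
}
else
{
    char[] name;
}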

Do you have a non-hotmail address? Hotmail is notorious for just trashing 
emails without notice... - hope this gets through.

Or should we take the discussion onto the DMDScript newsgroup?


Regards
Alan
Jan 04 2008
parent Dan Lewis <murpsoft hotmail.com> writes:
Hi Alan,

Sorry it took so long to reply.  I must have left just as you were getting
started.  I definitely like the idea of using charcodes as token types for some
tokens.

Yesterday I tried declaring an enum TEXT : static const(char)[] and couldn't get
it working, even down to char[].  : p

Perchance you know how?
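For reference, the shape of what I was after is roughly this (D 2.x syntax, made-up members - no idea whether gdc swallows it):

enum TEXT : string
{
    undefined = "undefined",
    var       = "var",
    case_     = "case",     // 'case' itself is a D keyword, hence the underscore
}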

I'll examine the diff and make changes as best I can.

We ought to move this to either the dsource.org Walnut forums, Skype
(murposaurus), MSN, hotmail or whatnot.  It's moved away from the subject matter
of digitalmars.D.learn.

Regards,
Dan
Jan 04 2008