digitalmars.D - Javascript bytecode
- Walter Bright (12/12) Dec 18 2012 An interesting datapoint in regards to bytecode is Javascript. Note that...
- Adam D. Ruppe (3/4) Dec 18 2012 Including (a buggy, incomplete subset of) D!
- Max Samukha (2/17) Dec 18 2012 Actually, they call JavaScript an IL for the next ten years.
- Max Samukha (2/3) Dec 18 2012 s/an/the
- Peter Alexander (10/18) Dec 18 2012 Yes, bytecode isn't strictly required, but it's certainly
- Walter Bright (4/15) Dec 18 2012 Bytecode would have added nothing to js but complexity.
- Peter Alexander (14/37) Dec 18 2012 When I say "easier", I'm talking about implementation cost.
- Walter Bright (7/38) Dec 18 2012 D is open source. There is little implementation cost to doing a compile...
- deadalnix (33/40) Dec 18 2012 Let me emit some doubt about that.
- Walter Bright (6/10) Dec 18 2012 I believe you're conflating releasing a "whole project" with what I'm ta...
- H. S. Teoh (16/29) Dec 18 2012 I never liked that motto of Java's. It's one of those things that are
- DypthroposTheImposter (3/3) Dec 18 2012 There is Emscripten which compiles LLVM to javascript, so you
- Walter Bright (5/16) Dec 18 2012 Well, I was thinking of the language, not the runtime library, which is ...
- Andrei Alexandrescu (3/5) Dec 18 2012 The SafeD subset takes care of that.
- H. S. Teoh (12/17) Dec 18 2012 [...]
- Andrei Alexandrescu (3/17) Dec 18 2012 Yes, there are several bugs related to SafeD.
- Brad Roberts (3/16) Dec 18 2012 Are the remaining issues at the compiler, runtime, or phobos levels (or
- Jonathan M Davis (8/25) Dec 18 2012 Quite a few are, but it wouldn't surprise me at all if there are quite a...
- deadalnix (6/19) Dec 18 2012 This is chicken and egg issue. Due to limitations, enforcing
- Rob T (5/10) Dec 18 2012 Unfortunately fixing these will break existing code, or can the
- David Nadlinger (4/14) Dec 19 2012 We *must* take the liberty to fix them; if SafeD is not sound,
- Andrei Alexandrescu (3/16) Dec 19 2012 Yes.
- Rob T (8/23) Dec 19 2012 Don't get me wrong, I agree that broken behavior must always be
- deadalnix (3/13) Dec 19 2012 The code is already broken. The compiler detecting more faulty
- Brad Roberts (2/32) Dec 18 2012 The part I'm particularly interested in is the compiler layer.
- Jacob Carlborg (5/7) Dec 18 2012 CoffeeScript and Dart to mention two other languages that compile to
- deadalnix (8/23) Dec 18 2012 Well, my experience is more like write once, debug everywhere.
- F i L (18/18) Dec 18 2012 Without bytecode, the entire compiler becomes a dependency of a
- Rob T (32/51) Dec 18 2012 I'm not claiming to be an expert in this area either, however it
- Walter Bright (2/5) Dec 18 2012 Evidently you've dismissed all of my posts in this thread on that topic ...
- Max Samukha (8/16) Dec 19 2012 As you dismissed all points in favor of bytecode. Such as it
- Walter Bright (19/27) Dec 19 2012 My arguments were all based on the idea of distributing "compiled" sourc...
- Max Samukha (26/61) Dec 19 2012 I understand that but can not fully agree. The problem is the
- foobar (21/56) Dec 19 2012 There other part of an intermediate representation which you
- Rob T (16/35) Dec 19 2012 Imagine if the compiler were built in a user extensible way, such
- Rob T (38/46) Dec 19 2012 I really am trying to understand your POV, but I'm having a
- Walter Bright (19/42) Dec 19 2012 Mostly. If you use bytecode, you have Yet Another Spec that has to be de...
- eles (6/11) Dec 19 2012 Hey, this is really OT, but I'm interested in. Why do you
- Walter Bright (26/34) Dec 19 2012 It boils down to the overriding expense in spaceflight is weight. There'...
- ixid (9/54) Dec 19 2012 The shuttle was originally intended to be a lot smaller and sit
- David Gileadi (8/17) Dec 19 2012 I had the same question, and Google found me a 2003 article
- Rob T (17/25) Dec 19 2012 As always the answer is never as simple as it seems (just as it
- Walter Bright (13/22) Dec 19 2012 I find it hard to believe that the Canadarm cost more than wings, landin...
- eles (4/7) Dec 20 2012 Thank you to all of you that expressed viewpoints on this issue.
- Andrei Alexandrescu (13/35) Dec 19 2012 I thought the claim was about ASTs vs. bytecode, which slowly segued
- Walter Bright (10/22) Dec 19 2012 Originally, the claim was how modules should be imported in some binary ...
- Rob T (25/27) Dec 19 2012 But those are mostly one-time costs, and for software that has to
- Peter Sommerfeld (8/13) Dec 19 2012 Because you need a D-Programmer to program in D. ;)
- Walter Bright (5/13) Dec 19 2012 I know of zero claims that making a bytecode standard for javascript wil...
- Joakim (8/16) Dec 20 2012 One data point along this line, the most popular javascript
- deadalnix (8/10) Dec 20 2012 Note that in the first place, bytecode discussion has started
- Walter Bright (2/7) Dec 20 2012 No, it doesn't solve that problem at all. I explained why repeatedly.
- deadalnix (3/15) Dec 20 2012 No you explained that java's bytecode doesn't solve that problem.
- Walter Bright (2/14) Dec 20 2012 Please reread all of my messages in the thread. I addressed this.
- Walter Bright (4/6) Dec 20 2012 I did, but obviously you did not find that satisfactory. Let me put it t...
- Rainer Schuetze (8/16) Dec 21 2012 Sorry, can't resist: How about feeding the x86 machine byte code
- Timon Gehr (4/23) Dec 21 2012 Direct hardware support is not achievable because CTFE needs to be pure
- Rainer Schuetze (5/33) Dec 21 2012 True, you would have to trust the library code not to do unpure/unsafe
- Walter Bright (9/24) Dec 21 2012 Not going to work, as CTFE needs type information. CTFE needs to interac...
- Rainer Schuetze (11/41) Dec 21 2012 I think you don't need to care. The CPU can execute it as well without
- Walter Bright (10/15) Dec 21 2012 CPU instructions are as unportable as you can get. All type information ...
- deadalnix (4/12) Dec 21 2012 Optimized LLVM bytecode look like a good candidate for the job.
- jerro (4/7) Dec 21 2012 It's true that it couldn't be automatically decompiled to
- Walter Bright (3/8) Dec 21 2012 I haven't looked at the format, but if it's got type information, that g...
- deadalnix (3/10) Dec 21 2012 Once the optimizer is passed, a lot of it is lost. It is easier
- Max Samukha (8/18) Dec 21 2012 Walter is right that bytecode doesn't solve that problem at all.
- Walter Bright (2/3) Dec 21 2012 I'll bite. What is its advantage over source code?
- Araq (5/9) Dec 21 2012 Interpreting the AST directly: Requires recursion.
- Walter Bright (5/14) Dec 21 2012 Sorry, I don't get this at all. Every bytecode scheme I've seen had a st...
- Mafi (5/17) Dec 21 2012 It don't think that this is such a big deal. Either way you need
- Max Samukha (5/9) Dec 21 2012 It is not about bytecode vs source code. It is about a common
- Max Samukha (4/16) Dec 21 2012 Another example. Many of us here are talking in an intermediate
- Simen Kjaeraas (13/22) Dec 21 2012 But Walter has said that for exactly this purpose, bytecode is useful.
- Max Samukha (4/34) Dec 21 2012 Really? He sounded like the whole world should repent for using
- Andrei Alexandrescu (5/9) Dec 19 2012 I think the important claim here is that an AST and a bytecode have the
- Paulo Pinto (7/22) Dec 18 2012 True, however JavaScript's case is similar to C.
An interesting datapoint in regards to bytecode is Javascript. Note that Javascript is not distributed in bytecode form. There is no Javascript VM. It is distributed as source code. Sometimes that source code is compressed and obfuscated; nevertheless, it is still source code. How the end system chooses to execute the js is up to that end system, and indeed there are a great variety of methods in use.

Javascript proves that bytecode is not required for "write once, run everywhere", which was one of the pitches for bytecode. What is required for w.o.r.e. is a specification for the source code that precludes undefined and implementation defined behavior.

Note also that TypeScript compiles to Javascript. I suspect there are other languages that do so, too.
Dec 18 2012
On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
> I suspect there are other languages that do so, too.

Including (a buggy, incomplete subset of) D!

https://github.com/adamdruppe/dmd/tree/dtojs
Dec 18 2012
On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
> An interesting datapoint in regards to bytecode is Javascript. Note that
> Javascript is not distributed in bytecode form. There is no Javascript
> VM. It is distributed as source code. Sometimes that source code is
> compressed and obfuscated; nevertheless, it is still source code. How
> the end system chooses to execute the js is up to that end system, and
> indeed there are a great variety of methods in use.
>
> Javascript proves that bytecode is not required for "write once, run
> everywhere", which was one of the pitches for bytecode. What is required
> for w.o.r.e. is a specification for the source code that precludes
> undefined and implementation defined behavior.
>
> Note also that TypeScript compiles to Javascript. I suspect there are
> other languages that do so, too.

Actually, they call JavaScript an IL for the next ten years.
Dec 18 2012
On Tuesday, 18 December 2012 at 18:22:40 UTC, Max Samukha wrote:
> Actually, they call JavaScript an IL for the next ten years.

s/an/the
Dec 18 2012
On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
> Javascript proves that bytecode is not required for "write once, run
> everywhere", which was one of the pitches for bytecode. What is required
> for w.o.r.e. is a specification for the source code that precludes
> undefined and implementation defined behavior.

Yes, bytecode isn't strictly required, but it's certainly desirable. Bytecode is much easier to interpret, much easier to compile to, and more compact.

The downside of bytecode is loss of high-level meaning... but that depends on the bytecode. There's nothing stopping the bytecode from being a serialised AST (actually, that would be ideal).

> Note also that TypeScript compiles to Javascript. I suspect there are
> other languages that do so, too.

There are lots. It's probably the most compiled-to high-level language out there (including C).
Dec 18 2012
On 12/18/2012 10:29 AM, Peter Alexander wrote:
> On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
>> Javascript proves that bytecode is not required for "write once, run
>> everywhere", which was one of the pitches for bytecode. What is
>> required for w.o.r.e. is a specification for the source code that
>> precludes undefined and implementation defined behavior.
> Yes, bytecode isn't strictly required, but it's certainly desirable.
> Bytecode is much easier to interpret, much easier to compile to, and
> more compact.

Bytecode would have added nothing to js but complexity. I think you're seriously overestimating the cost of compilation.

> The downside of bytecode is loss of high-level meaning... but that
> depends on the bytecode. There's nothing stopping the bytecode from
> being a serialised AST (actually, that would be ideal).

As I pointed out to Andrei, Java bytecode *is* a serialized AST.
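To see concretely what "bytecode as a serialized AST" means, here is a minimal sketch in D: the tree for a + b * c flattened into a postfix instruction stream by a post-order walk, then rebuilt with a stack. The node types and encoding are invented for illustration; this is not Java's actual class-file format.

    import std.stdio;

    // A tiny expression AST: either a variable or a binary operation.
    class Expr {}
    class Var : Expr { string name; this(string n) { name = n; } }
    class Bin : Expr {
        char op; Expr lhs, rhs;
        this(char o, Expr l, Expr r) { op = o; lhs = l; rhs = r; }
    }

    // One "instruction" of the flattened form.
    struct Op { char kind; string name; } // 'v' = push var, else an operator

    // Serialize: a post-order walk, exactly a stack-machine encoding.
    void serialize(Expr e, ref Op[] code) {
        if (auto v = cast(Var) e)
            code ~= Op('v', v.name);
        else if (auto b = cast(Bin) e) {
            serialize(b.lhs, code);
            serialize(b.rhs, code);
            code ~= Op(b.op, null);
        }
    }

    // Deserialize: replay the instructions against a stack of subtrees.
    Expr deserialize(const(Op)[] code) {
        Expr[] stack;
        foreach (op; code) {
            if (op.kind == 'v')
                stack ~= new Var(op.name);
            else {
                auto rhs = stack[$ - 1];
                auto lhs = stack[$ - 2];
                stack = stack[0 .. $ - 2];
                stack ~= new Bin(op.kind, lhs, rhs);
            }
        }
        return stack[0]; // the reconstructed root
    }

    void main() {
        // a + b * c
        Expr tree = new Bin('+', new Var("a"),
                            new Bin('*', new Var("b"), new Var("c")));
        Op[] code;
        serialize(tree, code);
        foreach (op; code)
            writeln(op.kind == 'v' ? "push " ~ op.name : "apply " ~ op.kind);
        Expr rebuilt = deserialize(code); // round-trips to an equivalent tree
    }

Peter's objection in the next message is about what happens once an optimizer reorders or fuses such an instruction stream: the post-order walk is then no longer a faithful image of the original tree.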
Dec 18 2012
On Tuesday, 18 December 2012 at 19:25:01 UTC, Walter Bright wrote:
> On 12/18/2012 10:29 AM, Peter Alexander wrote:
>> On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
>>> Javascript proves that bytecode is not required for "write once, run
>>> everywhere", which was one of the pitches for bytecode. What is
>>> required for w.o.r.e. is a specification for the source code that
>>> precludes undefined and implementation defined behavior.
>> Yes, bytecode isn't strictly required, but it's certainly desirable.
>> Bytecode is much easier to interpret, much easier to compile to, and
>> more compact.
> Bytecode would have added nothing to js but complexity. I think you're
> seriously overestimating the cost of compilation.

When I say "easier", I'm talking about implementation cost. Consider how easy it is to write a conforming Java byte code interpreter compared to a conforming Java interpreter/compiler. Parsing and semantic analysis are much easier to get wrong than a byte code spec. At the bytecode level, you don't need to worry about function overloading, symbol tables, variable scoping, type inference, forward references etc. etc. All those things are intentional complexities meant to make life easier for the programmer, not the computer. A bytecode doesn't need them.

>> The downside of bytecode is loss of high-level meaning... but that
>> depends on the bytecode. There's nothing stopping the bytecode from
>> being a serialised AST (actually, that would be ideal).
> As I pointed out to Andrei, Java bytecode *is* a serialized AST.

It's not a lossless serialisation -- especially not after optimisation. For example, it's non-trivial to reconstruct the AST of a for loop from optimised bytecode (or even regular bytecode).
Dec 18 2012
On 12/18/2012 12:38 PM, Peter Alexander wrote:
> On Tuesday, 18 December 2012 at 19:25:01 UTC, Walter Bright wrote:
>> On 12/18/2012 10:29 AM, Peter Alexander wrote:
>>> On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
>>>> Javascript proves that bytecode is not required for "write once, run
>>>> everywhere", which was one of the pitches for bytecode. What is
>>>> required for w.o.r.e. is a specification for the source code that
>>>> precludes undefined and implementation defined behavior.
>>> Yes, bytecode isn't strictly required, but it's certainly desirable.
>>> Bytecode is much easier to interpret, much easier to compile to, and
>>> more compact.
>> Bytecode would have added nothing to js but complexity. I think you're
>> seriously overestimating the cost of compilation.
> When I say "easier", I'm talking about implementation cost. Consider how
> easy it is to write a conforming Java byte code interpreter compared to
> a conforming Java interpreter/compiler. Parsing and semantic analysis
> are much easier to get wrong than a byte code spec. At the bytecode
> level, you don't need to worry about function overloading, symbol
> tables, variable scoping, type inference, forward references etc. etc.
> All those things are intentional complexities meant to make life easier
> for the programmer, not the computer. A bytecode doesn't need them.

D is open source. There is little implementation cost to doing a compiler for it. It's a solved problem. A bytecode requires another spec to be written, and if you think it's easy to make a conformant Java VM bytecode interpreter, think again :-)

> It's not a lossless serialisation -- especially not after optimisation.
> For example, it's non-trivial to reconstruct the AST of a for loop from
> optimised bytecode (or even regular bytecode).

Yes, it is trivial. The only thing that is lost are local variable names and comments.
Dec 18 2012
On Tuesday, 18 December 2012 at 21:30:04 UTC, Walter Bright wrote:
> D is open source. There is little implementation cost to doing a
> compiler for it. It's a solved problem.

Let me emit some doubt about that.

First, D is difficult to compile because of the compile time features. The DMD frontend is not the best piece of software I've seen from an extensibility point of view. Plus, if D is open source in its license, it isn't in its way of doing things. When you drop functionality into master for reasons the community isn't even aware of, you don't act like an open source project. You'll find a huge gap between adopting a license and adopting the cultural switch that is required to benefit from open source.

Right now, it is painfully hard to implement a D compiler, for various reasons:
- No language spec exists (and dmd, dlang.org and TDPL often contradict each other).
- The language spec changes constantly.
- Sometimes by surprise!
- Many behaviors now considered standard are historical dmd quirks that are hard to reproduce in any implementation not based on dmd.
- Nothing can be anticipated, because goals are not public.
- Sometimes you'll find 2 specs (-property) for the same thing.
- Many things are deprecated, but it is pretty hard to know which ones.
- It is unknown how to resolve paradoxes created by compile time features.
- I can go on and on.

Right now only dmd-based front ends are production quality, and almost no tooling exists around the language. You'll find very good reasons for that in the points listed above.

> A bytecode requires another spec to be written, and if you think it's
> easy to make a conformant Java VM bytecode interpreter, think again :-)

Nobody ever said that.

> Yes, it is trivial. The only thing that is lost are local variable names
> and comments.

You'll find tools that compact your whole project, losing in the process all names.
Dec 18 2012
On 12/18/2012 1:57 PM, deadalnix wrote:
>> Yes, it is trivial. The only thing that is lost are local variable
>> names and comments.
> You'll find tools that compact your whole project, losing in the process
> all names.

I believe you're conflating releasing a "whole project" with what I'm talking about, which is releasing modules meant to be incorporated into a user project, which won't work if the names are changed. And besides, changing the names doesn't change the fact that Java .class files include 100% of the type information.
Dec 18 2012
On Tue, Dec 18, 2012 at 10:11:37AM -0800, Walter Bright wrote:
> An interesting datapoint in regards to bytecode is Javascript. Note that
> Javascript is not distributed in bytecode form. There is no Javascript
> VM. It is distributed as source code. Sometimes that source code is
> compressed and obfuscated; nevertheless, it is still source code. How
> the end system chooses to execute the js is up to that end system, and
> indeed there are a great variety of methods in use.
>
> Javascript proves that bytecode is not required for "write once, run
> everywhere", which was one of the pitches for bytecode.

I never liked that motto of Java's. It's one of those things that are too good to be true, and papers over very real, complex cross-platform compatibility issues. I prefer "write once, debug everywhere". :-P

> What is required for w.o.r.e. is a specification for the source code
> that precludes undefined and implementation defined behavior.
[...]

What would you do with system-specific things like filesystem manipulation, though? That has to be implementation-defined by definition. And IME, any abstraction that's both (1) completely-defined without any implementation differences and (2) covers every possible platform that ever existed and will exist, is either totally useless from being over-complex and over-engineered, or completely fails to capture the complexity of real-world systems and the details required to work with them efficiently.


T

-- 
WINDOWS = Will Install Needless Data On Whole System -- CompuMan
Dec 18 2012
There is Emscripten, which compiles LLVM bitcode to JavaScript, so you could probably get D into JS that way also:

https://github.com/kripken/emscripten
Dec 18 2012
On 12/18/2012 11:41 AM, H. S. Teoh wrote:
>> What is required for w.o.r.e. is a specification for the source code
>> that precludes undefined and implementation defined behavior.
> [...]
> What would you do with system-specific things like filesystem
> manipulation, though? That has to be implementation-defined by
> definition. And IME, any abstraction that's both (1) completely-defined
> without any implementation differences and (2) covers every possible
> platform that ever existed and will exist, is either totally useless
> from being over-complex and over-engineered, or completely fails to
> capture the complexity of real-world systems and the details required to
> work with them efficiently.

Well, I was thinking of the language, not the runtime library, which is a separate issue. And no, I don't think D can be a systems language *and* eliminate all undefined and implementation defined behavior.
Dec 18 2012
On 12/18/12 3:35 PM, Walter Bright wrote:
> And no, I don't think D can be a systems language *and* eliminate all
> undefined and implementation defined behavior.

The SafeD subset takes care of that.

Andrei
Dec 18 2012
On Tue, Dec 18, 2012 at 07:08:04PM -0500, Andrei Alexandrescu wrote:
> On 12/18/12 3:35 PM, Walter Bright wrote:
>> And no, I don't think D can be a systems language *and* eliminate all
>> undefined and implementation defined behavior.
> The SafeD subset takes care of that.
[...]

Which right now suffers from some silly things like writefln not being able to be made @safe, just because some obscure formatting parameter is un-@safe. Which is exactly how @safe was designed, of course. Except that it makes SafeD ... a bit of a letdown, shall we say? - when it comes to practical real-world applications.

(And just to be clear, I'm all for SafeD, but it does still have a ways to go.)


T

-- 
Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
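The complaint, in miniature (a hedged sketch reflecting the Phobos of late 2012; the exact diagnostic wording is approximate):

    import std.stdio;

    // At the time, writefln was neither annotated @safe nor inferable as
    // @safe, so annotating a caller @safe failed to compile:
    //
    //     @safe void report(int x) { writefln("x = %s", x); }
    //     // Error: safe function 'report' cannot call system function
    //     // 'writefln'   (wording approximate)
    //
    // Practical code therefore ended up @system even when it did nothing
    // unsafe itself:
    void report(int x) @system
    {
        writefln("x = %s", x);
    }

    void main()
    {
        report(42);
    }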
Dec 18 2012
On 12/18/12 7:29 PM, H. S. Teoh wrote:
> On Tue, Dec 18, 2012 at 07:08:04PM -0500, Andrei Alexandrescu wrote:
>> On 12/18/12 3:35 PM, Walter Bright wrote:
>>> And no, I don't think D can be a systems language *and* eliminate all
>>> undefined and implementation defined behavior.
>> The SafeD subset takes care of that.
> [...]
> Which right now suffers from some silly things like writefln not being
> able to be made @safe, just because some obscure formatting parameter is
> un-@safe. Which is exactly how @safe was designed, of course. Except
> that it makes SafeD ... a bit of a letdown, shall we say? - when it
> comes to practical real-world applications. (And just to be clear, I'm
> all for SafeD, but it does still have a ways to go.)

Yes, there are several bugs related to SafeD.

Andrei
Dec 18 2012
On Tue, 18 Dec 2012, Andrei Alexandrescu wrote:
> On 12/18/12 7:29 PM, H. S. Teoh wrote:
>> Which right now suffers from some silly things like writefln not being
>> able to be made @safe, just because some obscure formatting parameter
>> is un-@safe. Which is exactly how @safe was designed, of course. Except
>> that it makes SafeD ... a bit of a letdown, shall we say? - when it
>> comes to practical real-world applications. (And just to be clear, I'm
>> all for SafeD, but it does still have a ways to go.)
> Yes, there are several bugs related to SafeD.
>
> Andrei

Are the remaining issues at the compiler, runtime, or phobos levels (or what combination of the three)? Are the bugs filed?
Dec 18 2012
On Tuesday, December 18, 2012 17:57:50 Brad Roberts wrote:
> On Tue, 18 Dec 2012, Andrei Alexandrescu wrote:
>> On 12/18/12 7:29 PM, H. S. Teoh wrote:
>>> Which right now suffers from some silly things like writefln not
>>> being able to be made @safe, just because some obscure formatting
>>> parameter is un-@safe. Which is exactly how @safe was designed, of
>>> course. Except that it makes SafeD ... a bit of a letdown, shall we
>>> say? - when it comes to practical real-world applications. (And just
>>> to be clear, I'm all for SafeD, but it does still have a ways to go.)
>> Yes, there are several bugs related to SafeD.
>>
>> Andrei
> Are the remaining issues at the compiler, runtime, or phobos levels (or
> what combination of the three)? Are the bugs filed?

Quite a few are, but it wouldn't surprise me at all if there are quite a few which aren't. For instance, AFAIK, no one ever brought up the issue of slicing static arrays being unsafe until just a couple of months ago:

http://d.puremagic.com/issues/show_bug.cgi?id=8838

Such operations should be @system but are currently considered @safe. Who knows how many others we've missed beyond what's currently in bugzilla.

- Jonathan M Davis
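The hole behind that report can be sketched in a few lines of D (illustrative only; exact diagnostics vary by compiler version and flags such as -preview=dip1000):

    @safe int[] f()
    {
        int[3] buf = [1, 2, 3];
        int[] s = buf[];  // slicing a stack-allocated static array
        // return s;      // the issue-8838 hole: this escaped stack
                          // memory, yet the compiler of the day accepted
                          // it as @safe instead of requiring @system
        return s.dup;     // the safe alternative: copy to the GC heap
    }

    void main() @safe
    {
        auto x = f();     // fine: x owns a heap copy
        assert(x == [1, 2, 3]);
    }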
Dec 18 2012
On Wednesday, 19 December 2012 at 01:58:54 UTC, Jonathan M Davis wrote:
>> Are the remaining issues at the compiler, runtime, or phobos levels (or
>> what combination of the three)?
> Quite a few are, but it wouldn't surprise me at all if there are quite a
> few which aren't. For instance, AFAIK, no one ever brought up the issue
> of slicing static arrays being unsafe until just a couple of months ago:
> http://d.puremagic.com/issues/show_bug.cgi?id=8838
> Such operations should be @system but are currently considered @safe.
> Who knows how many others we've missed beyond what's currently in
> bugzilla.

This is a chicken-and-egg issue. Due to limitations, enforcing @safe is hard to do in much code that is actually safe. So you don't notice that some things are considered/not considered @safe/@system when they should be.
Dec 18 2012
On Wednesday, 19 December 2012 at 01:58:54 UTC, Jonathan M Davis wrote:
> Such operations should be @system but are currently considered @safe.
> Who knows how many others we've missed beyond what's currently in
> bugzilla.
>
> - Jonathan M Davis

Unfortunately fixing these will break existing code - or can the behavior be deprecated?

--rt
Dec 18 2012
On Wednesday, 19 December 2012 at 07:14:30 UTC, Rob T wrote:
> On Wednesday, 19 December 2012 at 01:58:54 UTC, Jonathan M Davis wrote:
>> Such operations should be @system but are currently considered @safe.
>> Who knows how many others we've missed beyond what's currently in
>> bugzilla.
>>
>> - Jonathan M Davis
> Unfortunately fixing these will break existing code - or can the
> behavior be deprecated?

We *must* take the liberty to fix them; if SafeD is not sound, it's hardly worth its salt.

David
Dec 19 2012
On 12/19/12 8:13 AM, David Nadlinger wrote:
> On Wednesday, 19 December 2012 at 07:14:30 UTC, Rob T wrote:
>> On Wednesday, 19 December 2012 at 01:58:54 UTC, Jonathan M Davis wrote:
>>> Such operations should be @system but are currently considered @safe.
>>> Who knows how many others we've missed beyond what's currently in
>>> bugzilla.
>>>
>>> - Jonathan M Davis
>> Unfortunately fixing these will break existing code - or can the
>> behavior be deprecated?
> We *must* take the liberty to fix them; if SafeD is not sound, it's
> hardly worth its salt.
>
> David

Yes.

Andrei
Dec 19 2012
On Wednesday, 19 December 2012 at 13:13:32 UTC, David Nadlinger wrote:
> On Wednesday, 19 December 2012 at 07:14:30 UTC, Rob T wrote:
>> On Wednesday, 19 December 2012 at 01:58:54 UTC, Jonathan M Davis wrote:
>>> Such operations should be @system but are currently considered @safe.
>>> Who knows how many others we've missed beyond what's currently in
>>> bugzilla.
>>>
>>> - Jonathan M Davis
>> Unfortunately fixing these will break existing code - or can the
>> behavior be deprecated?
> We *must* take the liberty to fix them; if SafeD is not sound, it's
> hardly worth its salt.
>
> David

Don't get me wrong, I agree that broken behavior must always be fixed and never left in as a "feature". Probably the priority bugs should be the ones where the fix ends up breaking existing code. The sooner these are gotten rid of, the better.

--rt
Dec 19 2012
On Wednesday, 19 December 2012 at 07:14:30 UTC, Rob T wrote:
> On Wednesday, 19 December 2012 at 01:58:54 UTC, Jonathan M Davis wrote:
>> Such operations should be @system but are currently considered @safe.
>> Who knows how many others we've missed beyond what's currently in
>> bugzilla.
>>
>> - Jonathan M Davis
> Unfortunately fixing these will break existing code - or can the
> behavior be deprecated?

The code is already broken. The compiler detecting more faulty code is a plus.
Dec 19 2012
On 12/18/2012 5:58 PM, Jonathan M Davis wrote:
> On Tuesday, December 18, 2012 17:57:50 Brad Roberts wrote:
>> On Tue, 18 Dec 2012, Andrei Alexandrescu wrote:
>>> On 12/18/12 7:29 PM, H. S. Teoh wrote:
>>>> Which right now suffers from some silly things like writefln not
>>>> being able to be made @safe, just because some obscure formatting
>>>> parameter is un-@safe. Which is exactly how @safe was designed, of
>>>> course. Except that it makes SafeD ... a bit of a letdown, shall we
>>>> say? - when it comes to practical real-world applications. (And just
>>>> to be clear, I'm all for SafeD, but it does still have a ways to go.)
>>> Yes, there are several bugs related to SafeD.
>>>
>>> Andrei
>> Are the remaining issues at the compiler, runtime, or phobos levels (or
>> what combination of the three)? Are the bugs filed?
> Quite a few are, but it wouldn't surprise me at all if there are quite a
> few which aren't. For instance, AFAIK, no one ever brought up the issue
> of slicing static arrays being unsafe until just a couple of months ago:
> http://d.puremagic.com/issues/show_bug.cgi?id=8838
> Such operations should be @system but are currently considered @safe.
> Who knows how many others we've missed beyond what's currently in
> bugzilla.
>
> - Jonathan M Davis

The part I'm particularly interested in is the compiler layer.
Dec 18 2012
On 2012-12-18 19:11, Walter Bright wrote:
> Note also that TypeScript compiles to Javascript. I suspect there are
> other languages that do so, too.

CoffeeScript and Dart, to mention two other languages that compile to JavaScript.

-- 
/Jacob Carlborg
Dec 18 2012
On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:
> An interesting datapoint in regards to bytecode is Javascript. Note that
> Javascript is not distributed in bytecode form. There is no Javascript
> VM. It is distributed as source code. Sometimes that source code is
> compressed and obfuscated; nevertheless, it is still source code. How
> the end system chooses to execute the js is up to that end system, and
> indeed there are a great variety of methods in use.
>
> Javascript proves that bytecode is not required for "write once, run
> everywhere", which was one of the pitches for bytecode.

Well, my experience is more like write once, debug everywhere. For both Java AND JavaScript.

> What is required for w.o.r.e. is a specification for the source code
> that precludes undefined and implementation defined behavior.

Isn't SafeD supposed to provide that (as long as you never go through @trusted code)?

> Note also that TypeScript compiles to Javascript. I suspect there are
> other languages that do so, too.

Most things can compile to JavaScript. The motto right now seems to be that if you can do it in JavaScript, someone will. It doesn't mean someone should.
Dec 18 2012
Without bytecode, the entire compiler becomes a dependency of an AOT/JIT compiled program... not only does bytecode allow for faster on-site compilations, it also means half the compiler can be stripped away (so I'm told; I'm not claiming to be an expert here).

I'm actually kinda surprised there hasn't been more of an AOT/JIT compiling push within the D community. D's the best there is at code specialization, but half of that battle seems to be hardware specifics only really known on-site... like SIMD, for example. I've been told many game companies compile against SIMD 3.1 because that's the base-line x64 instruction set. If you could query the hardware post-distribution (vs pre-distribution) without any performance loss or code complication (to the developer), that would be incredibly ideal. (ps. I acknowledge that this would probably _require_ the full compiler, so there's probably not too much value in a D-bytecode.)

The D compiler is small enough for distribution I think (only ~10mb compressed?), but the back-end license restricts it, right?
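As an aside, D can already query the hardware at run time via druntime's core.cpuid, which is one way to pick a specialized code path post-distribution. A minimal sketch - the dispatch scheme and kernel functions are invented for illustration; only the core.cpuid names are real:

    import core.cpuid;
    import std.stdio;

    // Two hypothetical implementations of the same kernel: a baseline one
    // and one that would use the wider instruction set. Here they just
    // report which path was selected.
    void kernelBaseline() { writeln("baseline (SSE2) path"); }
    void kernelSse41()    { writeln("SSE4.1 path"); }

    void main()
    {
        writeln("CPU: ", processor);

        // core.cpuid exposes feature tests such as sse3, ssse3, sse41.
        // Select the best available implementation once, at startup.
        auto kernel = sse41 ? &kernelSse41 : &kernelBaseline;
        kernel();
    }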
Dec 18 2012
On Wednesday, 19 December 2012 at 01:09:14 UTC, F i L wrote:
> Without bytecode, the entire compiler becomes a dependency of an AOT/JIT
> compiled program... not only does bytecode allow for faster on-site
> compilations, it also means half the compiler can be stripped away (so
> I'm told; I'm not claiming to be an expert here).
>
> I'm actually kinda surprised there hasn't been more of an AOT/JIT
> compiling push within the D community. D's the best there is at code
> specialization, but half of that battle seems to be hardware specifics
> only really known on-site... like SIMD, for example. I've been told many
> game companies compile against SIMD 3.1 because that's the base-line x64
> instruction set. If you could query the hardware post-distribution (vs
> pre-distribution) without any performance loss or code complication (to
> the developer), that would be incredibly ideal. (ps. I acknowledge that
> this would probably _require_ the full compiler, so there's probably not
> too much value in a D-bytecode.)
>
> The D compiler is small enough for distribution I think (only ~10mb
> compressed?), but the back-end license restricts it, right?

I'm not claiming to be an expert in this area either, however it seems obvious that there are significant theoretical and practical advantages to using the bytecode concept.

My understanding is that with bytecode and a suitable VM to process it, one can abstract away the underlying high-level language that was used to produce the bytecode; therefore it is possible to use alternate high-level languages with front-ends that compile to the same common bytecode instruction set. This is exactly the same as what is being done with the D front end and other front ends for GCC, except for the difference that the machine code produced needs a physical CPU to process it, and there is no machine code instruction set that is common across all architectures.

Effectively, the bytecode serves as the common native machine code for a standardized virtualized CPU (the VM), and the VM can sit on top of any given architecture (more or less). Of course there are significant execution inefficiencies with this method; however, bytecode can be compiled into native code - keeping in mind that you did not have to transport whatever high-level language was compiled into the bytecode for this to be possible.

So in summary, the primary purpose of bytecode is to serve as an intermediate common language that can be run directly on a VM, or compiled directly into native machine code. There's no need to transport or even know what language was used for producing the bytecode.

As a reminder, this is what "my understanding is", which may be incorrect in one or more areas, so if I'm wrong, I'd like to be corrected.

Thanks

--rt
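A toy model of the "standardized virtualized CPU" Rob describes might look like this in D - the instruction set is invented for illustration, and any front-end language could in principle be compiled down to it:

    import std.stdio;

    // A minimal stack-machine "bytecode": the common instruction set that
    // every host runs, regardless of which source language produced it.
    enum OpCode : ubyte { push, add, mul, print, halt }

    struct Instr { OpCode op; long arg; }

    void run(const Instr[] program)
    {
        long[] stack;
        foreach (ins; program)
        {
            final switch (ins.op)
            {
                case OpCode.push:
                    stack ~= ins.arg;
                    break;
                case OpCode.add:
                    stack[$ - 2] += stack[$ - 1];
                    stack = stack[0 .. $ - 1];
                    break;
                case OpCode.mul:
                    stack[$ - 2] *= stack[$ - 1];
                    stack = stack[0 .. $ - 1];
                    break;
                case OpCode.print:
                    writeln(stack[$ - 1]);
                    break;
                case OpCode.halt:
                    return;
            }
        }
    }

    void main()
    {
        // The equivalent of "print(2 + 3 * 4)" in some source language:
        run([Instr(OpCode.push, 2), Instr(OpCode.push, 3),
             Instr(OpCode.push, 4), Instr(OpCode.mul), Instr(OpCode.add),
             Instr(OpCode.print), Instr(OpCode.halt)]);  // prints 14
    }

A JIT, as discussed elsewhere in the thread, is then "just" a run() that emits native code for the instructions instead of interpreting them.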
Dec 18 2012
On 12/18/2012 11:04 PM, Rob T wrote:
> I'm not claiming to be an expert in this area either, however it seems
> obvious that there are significant theoretical and practical advantages
> to using the bytecode concept.

Evidently you've dismissed all of my posts in this thread on that topic :-)
Dec 18 2012
On Wednesday, 19 December 2012 at 07:22:45 UTC, Walter Bright wrote:
> On 12/18/2012 11:04 PM, Rob T wrote:
>> I'm not claiming to be an expert in this area either, however it seems
>> obvious that there are significant theoretical and practical advantages
>> to using the bytecode concept.
> Evidently you've dismissed all of my posts in this thread on that topic
> :-)

As you dismissed all points in favor of bytecode. Such as it being a standardized AST representation for multiple languages. CLI is all about that, which is reflected in its name. LLVM is used almost exclusively for that purpose (clang is great).

Not advocating bytecode here, but your claiming it is completely useless is so D-ish :).
Dec 19 2012
On 12/19/2012 12:19 AM, Max Samukha wrote:
>> Evidently you've dismissed all of my posts in this thread on that topic
>> :-)
> As you dismissed all points in favor of bytecode.

And I gave detailed reasons why.

> Such as it being a standardized AST representation for multiple
> languages. CLI is all about that, which is reflected in its name. LLVM
> is used almost exclusively for that purpose (clang is great).

My arguments were all based on the idea of distributing "compiled" source code in bytecode format.

The idea of using some common intermediate format to tie together multiple front ends and multiple back ends is something completely different. And, surprise (!), I've done that, too. The original C compiler I wrote for many years was a multipass affair that communicated the data from one pass to the next via an intermediate file. I was forced into such a system because DOS just didn't have enough memory to combine the passes. I dumped it when more memory became available, as it was the source of major slowdowns in the compilation process.

Note that such a system need not be *bytecode* at all, it can just hand the data structure off from one pass to the next. In fact, an actual bytecode requires a serialization of the data structures and then a reconstruction of them - rather pointless.

> Not advocating bytecode here, but your claiming it is completely useless
> is so D-ish :).

I'm not without experience doing everything bytecode is allegedly good at. [...] languages, not so much. There turned out to be no way to efficiently represent D slices in it, for example.
Dec 19 2012
On Wednesday, 19 December 2012 at 08:45:20 UTC, Walter Bright wrote:
> On 12/19/2012 12:19 AM, Max Samukha wrote:
>>> Evidently you've dismissed all of my posts in this thread on that
>>> topic :-)
>> As you dismissed all points in favor of bytecode.
> And I gave detailed reasons why.
>
>> Such as it being a standardized AST representation for multiple
>> languages. CLI is all about that, which is reflected in its name. LLVM
>> is used almost exclusively for that purpose (clang is great).
> My arguments were all based on the idea of distributing "compiled"
> source code in bytecode format.
>
> The idea of using some common intermediate format to tie together
> multiple front ends and multiple back ends is something completely
> different. And, surprise (!), I've done that, too. The original C
> compiler I wrote for many years was a multipass affair that communicated
> the data from one pass to the next via an intermediate file. I was
> forced into such a system because DOS just didn't have enough memory to
> combine the passes. I dumped it when more memory became available, as it
> was the source of major slowdowns in the compilation process.
>
> Note that such a system need not be *bytecode* at all, it can just hand
> the data structure off from one pass to the next. In fact, an actual
> bytecode requires a serialization of the data structures and then a
> reconstruction of them - rather pointless.

I understand that but cannot fully agree. The problem is that the components of such a system are distributed and not binary-compatible. The data structures are intended to be transferred over a stream, and you *have* to serialize at one end and deserialize at the other. For example, we serialize a D host app and a C library into portable pnacl bitcode and transfer it to Chrome for compilation and execution. There is no point in having C, D (or whatever other languages people are going to invent) front-ends on the receiving side. The same applies to JS - people "serialize" ASTs generated from, say, a CoffeeScript source into JS and transfer that to the browser, which "deserializes" JS into an internal AST representation.

Note that I am not arguing that bytecode is the best kind of standard AST representation. I am arguing that there *is* a point in such a serialized representation. Hence, your claim that ILs are *completely* useless is not quite convincing. When we have a single God language (I wouldn't object if it were D, but it is not yet ;)), then there will be no need for complications like ILs.

>> Not advocating bytecode here, but your claiming it is completely
>> useless is so D-ish :).
> I'm not without experience doing everything bytecode is allegedly good
> at.

I am not doubting your experience, but that is an argument from authority.

> [...] languages, not so much. There turned out to be no way to
> efficiently represent D slices in it, for example.

That is the limitation of CLI, not the concept. LLVM does not have that problem.
Dec 19 2012
On Wednesday, 19 December 2012 at 08:45:20 UTC, Walter Bright wrote:
> On 12/19/2012 12:19 AM, Max Samukha wrote:
>>> Evidently you've dismissed all of my posts in this thread on that
>>> topic :-)
>> As you dismissed all points in favor of bytecode.
> And I gave detailed reasons why.
>
>> Such as it being a standardized AST representation for multiple
>> languages. CLI is all about that, which is reflected in its name. LLVM
>> is used almost exclusively for that purpose (clang is great).
> My arguments were all based on the idea of distributing "compiled"
> source code in bytecode format.
>
> The idea of using some common intermediate format to tie together
> multiple front ends and multiple back ends is something completely
> different. And, surprise (!), I've done that, too. The original C
> compiler I wrote for many years was a multipass affair that communicated
> the data from one pass to the next via an intermediate file. I was
> forced into such a system because DOS just didn't have enough memory to
> combine the passes. I dumped it when more memory became available, as it
> was the source of major slowdowns in the compilation process.
>
> Note that such a system need not be *bytecode* at all, it can just hand
> the data structure off from one pass to the next. In fact, an actual
> bytecode requires a serialization of the data structures and then a
> reconstruction of them - rather pointless.
>
>> Not advocating bytecode here, but your claiming it is completely
>> useless is so D-ish :).
> I'm not without experience doing everything bytecode is allegedly good
> at. [...] languages, not so much. There turned out to be no way to
> efficiently represent D slices in it, for example.

The other part of an intermediate representation which you ignored is attaching *multiple backends*, which is important for portability and the web. Applications could be written in SafeD (a subset that is supposed to have no implementation defined or undefined behaviors) and compiled to such an intermediate representation (let's call it common-IR since you don't like "bytecode"). Now, each client platform has its own backend for our common-IR. We can have install-time compilation as in .NET, or JIT as in Java, or both, or maybe some other such method. Having such a format allows adding distribution to the system. The serialization and de-serialization is only pointless when done on the same machine.

Another use case could be a compilation server - we can put only the front-end on client machines and do the optimizations and native code generation on the server. This can be used for example in a browser to allow D scripting. Think for instance about smart-phone browsers.

The dreaded "bytecode" helps to solve all those use cases.
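A skeletal picture of what foobar describes - one common IR, many pluggable backends - could look like the following D sketch. All names and the IR itself are invented for illustration; real backends would of course emit actual code rather than strings.

    import std.conv : to;
    import std.stdio;

    // A deliberately tiny, platform-neutral "common IR".
    enum IrOp { load, add, store }
    struct IrInstr { IrOp op; int a, b; }

    // Each client platform ships one backend for the shared IR; front
    // ends for any source language only ever target the IR.
    interface Backend
    {
        string emit(const IrInstr[] ir);
    }

    class X86Backend : Backend
    {
        string emit(const IrInstr[] ir)
        {
            return "x86: lowered " ~ to!string(ir.length) ~ " IR instructions";
        }
    }

    class JsBackend : Backend
    {
        string emit(const IrInstr[] ir)
        {
            return "js: lowered " ~ to!string(ir.length) ~ " IR instructions";
        }
    }

    void main()
    {
        IrInstr[] ir = [IrInstr(IrOp.load, 0), IrInstr(IrOp.add, 0, 1),
                        IrInstr(IrOp.store, 2)];
        Backend[] targets = [new X86Backend, new JsBackend];
        foreach (b; targets)
            writeln(b.emit(ir));
    }

Install-time compilation, JIT, or a compilation server are then just different places and times at which emit() runs.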
Dec 19 2012
On Wednesday, 19 December 2012 at 10:55:43 UTC, foobar wrote:
> The other part of an intermediate representation which you ignored is
> attaching *multiple backends*, which is important for portability and
> the web. Applications could be written in SafeD (a subset that is
> supposed to have no implementation defined or undefined behaviors) and
> compiled to such an intermediate representation (let's call it common-IR
> since you don't like "bytecode"). Now, each client platform has its own
> backend for our common-IR. We can have install-time compilation as in
> .NET, or JIT as in Java, or both, or maybe some other such method.
> Having such a format allows adding distribution to the system. The
> serialization and de-serialization is only pointless when done on the
> same machine.
>
> Another use case could be a compilation server - we can put only the
> front-end on client machines and do the optimizations and native code
> generation on the server. This can be used for example in a browser to
> allow D scripting. Think for instance about smart-phone browsers.
>
> The dreaded "bytecode" helps to solve all those use cases.

Imagine if the compiler were built in a user-extensible way, such that one could write a plugin module that outputs the compiled code in the form of JVM bytecode which could run directly on the JVM.

It doesn't really matter if Walter is correct or not concerning his views; what matters is that people want X, Y and Z, and no matter how silly it may seem to some people, the silliness is never going to change. Why fight it? Why not do ourselves a favor and embrace it? We don't have to do anything silly, just provide the means to allow people to do what they want to do. Most of it will be bad, which is the case right now and always will be, but every so often someone will do something brilliant.

Why not make D the platform of choice, meaning you really do have a choice?

--rt
Dec 19 2012
On Wednesday, 19 December 2012 at 07:22:45 UTC, Walter Bright wrote:
> On 12/18/2012 11:04 PM, Rob T wrote:
>> I'm not claiming to be an expert in this area either, however it seems
>> obvious that there are significant theoretical and practical advantages
>> to using the bytecode concept.
> Evidently you've dismissed all of my posts in this thread on that topic
> :-)

I really am trying to understand your POV, but I'm having a difficult time with the point concerning performance.

Using the JS code as an example, you are stating that the JS source code itself could just as well be viewed as the "bytecode", and therefore, given what I previously wrote concerning the "advantages", I could replace "bytecode" with "JS source code" and achieve the exact same result. Am I correct?

I will agree that the bytecode could be encoded as JS (or as another language) and used as a common base for interpretation or compilation to machine code. I can also agree that other languages can be "compiled" into the common "bytecode" language provided that it is versatile enough, so from that POV I will agree that you are correct.

I thought that transforming source code into bytecode was an optimization technique intended to improve interpretation performance while preserving portability across architectures, i.e., the bytecode language was designed specifically to improve interpretation performance - but you say that the costs of performing the transformations from a high-level language into the optimized bytecode language far outweigh the advantages of leaving it as-is, i.e., whatever performance gains you get through the transformation are not significant enough to justify the costs of performing it. Is my understanding of your POV correct?

What I'm having trouble understanding is this: if the intention of something like the Java VM was to create a portable virtualized machine that could be used to execute any language, then would it not make sense to select a common bytecode language that was optimized for execution performance, rather than using another common language that was not specifically designed for that purpose?

Do you have a theory or insight that can explain why a situation like the Java bytecode VM came to be, and why it persists despite your suggestion that it is not required or of enough advantage to justify using it (may as well use Java source directly)?

--rt
Dec 19 2012
On 12/19/2012 1:10 AM, Rob T wrote:
> Using the JS code as an example, you are stating that the JS source code
> itself could just as well be viewed as the "bytecode", and therefore,
> given what I previously wrote concerning the "advantages", I could
> replace "bytecode" with "JS source code" and achieve the exact same
> result. Am I correct?

Yes.

> I thought that transforming source code into bytecode was an
> optimization technique intended to improve interpretation performance
> while preserving portability across architectures, i.e., the bytecode
> language was designed specifically to improve interpretation performance
> - but you say that the costs of performing the transformations from a
> high-level language into the optimized bytecode language far outweigh
> the advantages of leaving it as-is, i.e., whatever performance gains you
> get through the transformation are not significant enough to justify the
> costs of performing it. Is my understanding of your POV correct?

Mostly. If you use bytecode, you have Yet Another Spec that has to be defined and conformed to. This has a lot of costs.

> What I'm having trouble understanding is this: if the intention of
> something like the Java VM was to create a portable virtualized machine
> that could be used to execute any language, then would it not make sense
> to select a common bytecode language that was optimized for execution
> performance, rather than using another common language that was not
> specifically designed for that purpose?

Java as we know it evolved from a language that (as I recall) used bytecode to run on embedded systems of very low power. This use failed, and Java was re-imagined to be a network language that transmitted the bytecode over the internet. The rest was attempts to justify the investment in bytecode, or perhaps it simply never occurred to the Java engineers that the bytecode was redundant.

(Bytecode can make sense on 8 bit machines where the target machine simply has no capacity to run even a simple compiler. Such underpowered machines were obsolete by the time Java was developed, but the old ideas died hard.)

> Do you have a theory or insight that can explain why a situation like
> the Java bytecode VM came to be, and why it persists despite your
> suggestion that it is not required or of enough advantage to justify
> using it (may as well use Java source directly)?

Consider the US space shuttle design. It's probably the most wrong-headed engineering design ever, and it persisted because too many billions of dollars and careers were invested into it. Nobody could admit that it was an extremely inefficient and rather crazy design. A couple of NASA engineers have admitted to me privately that they knew this, but to keep their careers they kept their mouths shut.
Dec 19 2012
> Consider the US space shuttle design. It's probably the most
> wrong-headed engineering design ever, and it persisted because too many
> billions of dollars and careers were invested into it. Nobody could
> admit that it was an extremely inefficient and rather crazy design.

Hey, this is really OT, but I'm interested. Why do you consider it such a bad design? Because the shuttle is intended to be reentrant and this is costly? Some other issue? Is it about the design or about the entire idea?

Thanks for answering, and I promise not to further hijack this thread.
Dec 19 2012
On 12/19/2012 2:05 AM, eles wrote:
>> Consider the US space shuttle design. It's probably the most
>> wrong-headed engineering design ever, and it persisted because too many
>> billions of dollars and careers were invested into it. Nobody could
>> admit that it was an extremely inefficient and rather crazy design.
> Hey, this is really OT, but I'm interested. Why do you consider it such
> a bad design? Because the shuttle is intended to be reentrant and this
> is costly? Some other issue? Is it about the design or about the entire
> idea?

It boils down to the overriding expense in spaceflight being weight. There's the notion of "payload", which is the weight of whatever does something useful in space - the whole point of the mission. Every bit of other weight adds a great deal more weight in terms of cost to push it all into orbit.

To make the shuttle return and land, you've got wings, rudder, landing gear, flight control system - basically a huge amount of weight devoted to that. That weight subtracts from what you can push up as payload. All of the lifting capability for that also must be insanely reliable. (And never mind needing things like a custom 747 to transport it around because it's too big to go on the roads, all that money spent trying to make a reusable heat shield, etc.)

Now consider that the only thing that actually has to return is the astronauts. And all they actually need to return is a heat shield and a parachute - i.e. an Apollo capsule.

Thinking about it from basic principles, you need:

1. astronauts
2. payload
3. a way to get the astronauts back

So the idea then is to have two launches:

1. an insanely reliable rocket to push the astronauts up, and nothing else
2. a less reliable (and hence cheap) heavy lift rocket to push the payload up

The two launches dock in space, the astronauts do their job, and the astronauts return via their Apollo-style capsule. Mission accomplished at far, far less cost.
Dec 19 2012
On Wednesday, 19 December 2012 at 21:00:20 UTC, Walter Bright wrote:
> On 12/19/2012 2:05 AM, eles wrote:
>>> Consider the US space shuttle design. It's probably the most
>>> wrong-headed engineering design ever, and it persisted because too
>>> many billions of dollars and careers were invested into it. Nobody
>>> could admit that it was an extremely inefficient and rather crazy
>>> design.
>> Hey, this is really OT, but I'm interested. Why do you consider it such
>> a bad design? Because the shuttle is intended to be reentrant and this
>> is costly? Some other issue? Is it about the design or about the entire
>> idea?
> It boils down to the overriding expense in spaceflight being weight.
> There's the notion of "payload", which is the weight of whatever does
> something useful in space - the whole point of the mission. Every bit of
> other weight adds a great deal more weight in terms of cost to push it
> all into orbit.
>
> To make the shuttle return and land, you've got wings, rudder, landing
> gear, flight control system - basically a huge amount of weight devoted
> to that. That weight subtracts from what you can push up as payload. All
> of the lifting capability for that also must be insanely reliable. (And
> never mind needing things like a custom 747 to transport it around
> because it's too big to go on the roads, all that money spent trying to
> make a reusable heat shield, etc.)
>
> Now consider that the only thing that actually has to return is the
> astronauts. And all they actually need to return is a heat shield and a
> parachute - i.e. an Apollo capsule.
>
> Thinking about it from basic principles, you need:
>
> 1. astronauts
> 2. payload
> 3. a way to get the astronauts back
>
> So the idea then is to have two launches:
>
> 1. an insanely reliable rocket to push the astronauts up, and nothing
>    else
> 2. a less reliable (and hence cheap) heavy lift rocket to push the
>    payload up
>
> The two launches dock in space, the astronauts do their job, and the
> astronauts return via their Apollo-style capsule. Mission accomplished
> at far, far less cost.

The shuttle was originally intended to be a lot smaller and sit atop the central booster, avoiding the issues that caused the Columbia disaster. I believe that design may have been intended to operate in the manner you suggest; however, the CIA demanded that the shuttle be made much larger to accommodate large military satellites, distorting the design and making it a lot less efficient.
Dec 19 2012
On 12/19/12 3:05 AM, eles wrote:
>> Consider the US space shuttle design. It's probably the most
>> wrong-headed engineering design ever, and it persisted because too many
>> billions of dollars and careers were invested into it. Nobody could
>> admit that it was an extremely inefficient and rather crazy design.
> Hey, this is really OT, but I'm interested. Why do you consider it such
> a bad design? Because the shuttle is intended to be reentrant and this
> is costly? Some other issue? Is it about the design or about the entire
> idea?
>
> Thanks for answering, and I promise not to further hijack this thread.

I had the same question, and Google found me a 2003 article:

http://www.spacedaily.com/news/oped-03l.html

which in the wake of Columbia is largely about safety but also about efficiency. Interestingly, the article claims that the shuttle's flaws were largely the result of a) the desire to carry large payloads along with astronauts (as Walter mentions) and b) the choice of fuel, which led to several other expensive and dangerous design choices.
Dec 19 2012
On Wednesday, 19 December 2012 at 21:24:46 UTC, David Gileadi wrote:
> I had the same question, and Google found me a 2003 article:
> http://www.spacedaily.com/news/oped-03l.html
> which in the wake of Columbia is largely about safety but also about
> efficiency. Interestingly, the article claims that the shuttle's flaws
> were largely the result of a) the desire to carry large payloads along
> with astronauts (as Walter mentions) and b) the choice of fuel, which
> led to several other expensive and dangerous design choices.

As always, the answer is never as simple as it seems (just as it is with bytecode, if I'm to attempt to stay on topic).

One of the subgoals of the space shuttle was for it to be able to return not just people, but also to capture and return back to earth an orbiting payload. It also carried along instrumentation such as the Canadarm, a very expensive device that you normally would not want to throw away. The arm was used for deploying the payload and also for performing repair work. It is hard to imagine a throw-away rocket booster approach meeting all of these design goals, and I'm leaving out other abilities you cannot get from a simple return capsule approach.

A mistake would be to use the shuttle for purposes it was not suitable for, such as situations that did not need its unique abilities and could be handled more cheaply.

--rt
Dec 19 2012
On 12/19/2012 4:09 PM, Rob T wrote:
> As always, the answer is never as simple as it seems (just as it is with
> bytecode, if I'm to attempt to stay on topic).
>
> One of the subgoals of the space shuttle was for it to be able to return
> not just people, but also to capture and return back to earth an
> orbiting payload. It also carried along instrumentation such as the
> Canadarm, a very expensive device that you normally would not want to
> throw away. The arm was used for deploying the payload and also for
> performing repair work. It is hard to imagine a throw-away rocket
> booster approach meeting all of these design goals, and I'm leaving out
> other abilities you cannot get from a simple return capsule approach.

I find it hard to believe that the Canadarm cost more than wings, landing gear, a custom 747, etc. (That custom 747 probably cost a cool billion all by itself.) Secondly, the cost of the Canadarm consists of two parts: engineering design and construction. Once the design is done, the incremental construction cost of making multiple ones is way, way, way cheaper.

As for returning an orbiting payload, has that ever happened? And still, one could launch a shell with a heatshield and parachute on it, put that payload into the shell, and drop it into the atmosphere.

I can see needing to return spy satellites with their film canisters, but film is hopelessly obsolete now, and I can't see any such satellites these days.

The shuttle concept was so expensive that it severely stunted what we could do in space, and finally sank the whole manned space program.
Dec 19 2012
The shuttle concept was so expensive that it severely stunted what we could do in space, and finally sank the whole manned space program.Thank you to all of you who expressed viewpoints on this issue. I found the discussion valuable, and reasonable arguments were made on both sides. Anyway, it is too off-topic, so I consider this hijack ended.
Dec 20 2012
On 12/19/12 4:25 AM, Walter Bright wrote:On 12/19/2012 1:10 AM, Rob T wrote:I thought the claim was about ASTs vs. bytecode, which slowly segued into source code vs. byte code. Are we in agreement that there is a cost to translating JS source code to AST format? (The cost may be negligible for some applications, but it's there.) There's also the serialization aspect. Serializing and deserializing an AST takes extra effort because pointers must be fixed up. Bytecode can be designed to avoid most of that cost. On these two accounts alone, one may as well choose bytecode if it ever needs to be read and written. Defining a strategy for pointer serialization is comparable work.Using the JS code as an example, you are stating that the JS source code itself could just as well be viewed as the "bytecode", and therefore, given what I previously wrote concerning the "advantages", I could replace "bytecode" with "JS source code" and achieve the exact same result. Am I correct?Yes.That's not answering the question. AndreiDo you have a theory or insight that can explain why a situation like the Java bytecode VM came to be and why it persists, despite your suggestion that it is not required or of enough advantage to justify using it (may as well use Java source directly)?Consider the US space shuttle design. It's probably the most wrong-headed engineering design ever, and it persisted because too many billions of dollars and careers were invested into it. Nobody could admit that it was an extremely inefficient and rather crazy design. A couple of NASA engineers have admitted to me privately that they knew this, but to keep their careers they kept their mouths shut.
Dec 19 2012
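To make the serialization point above concrete, here is a minimal sketch in D; the Op encoding, the prog stream, and the run function are all invented for illustration, not taken from any real engine. A postfix bytecode stream for (1 + 2) * 3 is a flat byte array: writing it out and reading it back is a plain memory copy, with no pointers to fix up.

    import std.stdio;

    // Opcodes for a toy stack machine (hypothetical encoding).
    enum Op : ubyte { Push, Add, Mul }

    // "(1 + 2) * 3" in postfix form: a flat, position-independent byte stream.
    immutable ubyte[] prog = [
        Op.Push, 1,
        Op.Push, 2,
        Op.Add,
        Op.Push, 3,
        Op.Mul,
    ];

    // Executing the stream needs only an index and an operand stack;
    // serializing it is a straight memory copy.
    int run(const(ubyte)[] code)
    {
        int[] stack;
        for (size_t pc = 0; pc < code.length; ++pc)
        {
            final switch (cast(Op) code[pc])
            {
            case Op.Push:
                stack ~= code[++pc];
                break;
            case Op.Add:
                stack[$ - 2] += stack[$ - 1];
                stack = stack[0 .. $ - 1];
                break;
            case Op.Mul:
                stack[$ - 2] *= stack[$ - 1];
                stack = stack[0 .. $ - 1];
                break;
            }
        }
        return stack[$ - 1];
    }

    void main()
    {
        writeln(run(prog)); // prints 9
    }

An AST, by contrast, is a graph of heap nodes, so dumping it to disk means translating every child pointer into some position-independent form and patching it up again on load, which is exactly the work the flat stream avoids.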
On 12/19/2012 7:44 AM, Andrei Alexandrescu wrote:I thought the claim was about ASTs vs. bytecode, which slowly segued into source code vs. byte code.Originally, the claim was that modules should be imported in some binary format rather than as source code.Are we in agreement that there is a cost to translating JS source code to AST format? (The cost may be negligible for some applications, but it's there.)There is a cost, and it is a killer if you've got an 8-bit CPU with a 2K EPROM as a target. This is no longer relevant.There's also the serialization aspect. Serializing and deserializing an AST takes extra effort because pointers must be fixed up. Bytecode can be designed to avoid most of that cost.Bytecode does not avoid that cost; in fact, bytecode *is* a serialized AST format. (And, btw, the first thing a JIT compiler does is convert bytecode back into an AST.)On these two accounts alone, one may as well choose bytecode if it ever needs to be read and written. Defining a strategy for pointer serialization is comparable work.You're not saving anything.That's not answering the question.Analogies are legitimate answers to questions about motivation.
Dec 19 2012
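A sketch of the reverse direction, illustrating the "bytecode is a serialized AST" claim (again toy, invented types rather than anything from a real JIT): decoding the flat postfix stream from the previous sketch back into a tree is a mechanical stack exercise, which is also part of why such formats decompile so easily.

    // Same toy opcode set as in the earlier sketch.
    enum Op : ubyte { Push, Add, Mul }

    // A toy AST node; real compiler nodes carry types, source locations, etc.
    class Node
    {
        Op op;
        ubyte value;      // operand, used only when op == Op.Push
        Node left, right; // children, used only for Add/Mul
    }

    // Rebuild a tree from the flat postfix stream: each operator pops its
    // operands off a stack of already-built subtrees.
    Node decode(const(ubyte)[] code)
    {
        Node[] stack;
        for (size_t pc = 0; pc < code.length; ++pc)
        {
            auto n = new Node;
            n.op = cast(Op) code[pc];
            if (n.op == Op.Push)
            {
                n.value = code[++pc];
            }
            else
            {
                n.right = stack[$ - 1];
                n.left = stack[$ - 2];
                stack = stack[0 .. $ - 2];
            }
            stack ~= n;
        }
        return stack[$ - 1]; // root of the reconstructed expression tree
    }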
On Wednesday, 19 December 2012 at 09:25:54 UTC, Walter Bright wrote:Mostly. If you use bytecode, you have Yet Another Spec that has to be defined and conformed to. This has a lot of costs.But those are mostly one-time costs, and for software that has to run millions of times over, if there are enough performance gains from first compiling to bytecode, it could be worth the costs over the long term. If there are better methods of producing the same or better results that are not strictly bytecode, then that's another story; however, one goal is to have a common language that amalgamates everything under a common roof. One question I have for you is what percentage performance gain you can expect by using a well-chosen bytecode-like language versus interpreting directly from source code. The other question is whether there are better alternative techniques, for example compiling regular source directly to native code using a JIT approach. In many ways, this seems like the very best approach, which I suppose is precisely what you've been arguing all this time. So perhaps I've managed to convince myself that you are indeed correct. I'll take that stance and see if it sticks. BTW, I'm not a fan of interpreted languages, except for situations where you want to transport code in the form of data, or be able to store it for later portable execution. Lua embedded into a game engine is a good use case example (although why not D!). --rt
Dec 19 2012
On 20.12.2012 01:54, Rob T <rob ucora.com> wrote:I'm not a fan of interpreted languages, except for situations where you want to transport code in the form of data, or be able to store it for later portable execution. Lua embedded into a game engine is a good use case example (although why not D!).Because you need a D programmer to program in D. ;) Scripting languages like Lua reduce the complexity of programming to fit the needs of their users, who are often not programmers. A lot more is needed to program in D than in Lua. BTW: LuaJIT uses the source code, not Lua's byte code. Peter
Dec 19 2012
On 12/19/2012 4:54 PM, Rob T wrote:One question I have for you is what percentage performance gain you can expect by using a well-chosen bytecode-like language versus interpreting directly from source code.I know of zero claims that making a bytecode standard for javascript will improve performance.The other question is whether there are better alternative techniques, for example compiling regular source directly to native code using a JIT approach. In many ways, this seems like the very best approach, which I suppose is precisely what you've been arguing all this time. So perhaps I've managed to convince myself that you are indeed correct. I'll take that stance and see if it sticks.Not exactly; I argue that having a bytecode standard is useless. How a compiler works internally is fairly irrelevant.
Dec 19 2012
On Thursday, 20 December 2012 at 01:41:38 UTC, Walter Bright wrote:On 12/19/2012 4:54 PM, Rob T wrote:One data point along this line: the most popular JavaScript implementation these days is V8, which is implemented as a JavaScript compiler, not with bytecode: http://wingolog.org/archives/2011/08/02/a-closer-look-at-crankshaft-v8s-optimizing-compiler More interesting posts about V8 on that blog: http://wingolog.org/tags/v8One question I have for you is what percentage performance gain you can expect by using a well-chosen bytecode-like language versus interpreting directly from source code.I know of zero claims that making a bytecode standard for javascript will improve performance.
Dec 20 2012
On Thursday, 20 December 2012 at 01:41:38 UTC, Walter Bright wrote:Not exactly; I argue that having a bytecode standard is useless. How a compiler works internally is fairly irrelevant.Note that in the first place, the bytecode discussion started from the need to provide a CTFEable module that does not contain more information than what is in a DI file, as this is a concern for some companies. Bytecode can solve that problem nicely IMO. You mentioned that DI is superior here, but I don't really understand how.
Dec 20 2012
On 12/20/2012 1:30 PM, deadalnix wrote:Note that in the first place, the bytecode discussion started from the need to provide a CTFEable module that does not contain more information than what is in a DI file, as this is a concern for some companies. Bytecode can solve that problem nicely IMO. You mentioned that DI is superior here, but I don't really understand how.No, it doesn't solve that problem at all. I explained why repeatedly.
Dec 20 2012
On Friday, 21 December 2012 at 05:43:18 UTC, Walter Bright wrote:On 12/20/2012 1:30 PM, deadalnix wrote:No, you explained that Java's bytecode doesn't solve that problem, which is quite different.Note that in the first place, the bytecode discussion started from the need to provide a CTFEable module that does not contain more information than what is in a DI file, as this is a concern for some companies. Bytecode can solve that problem nicely IMO. You mentioned that DI is superior here, but I don't really understand how.No, it doesn't solve that problem at all. I explained why repeatedly.
Dec 20 2012
On 12/20/2012 10:05 PM, deadalnix wrote:On Friday, 21 December 2012 at 05:43:18 UTC, Walter Bright wrote:Please reread all of my messages in the thread. I addressed this.On 12/20/2012 1:30 PM, deadalnix wrote:No, you explained that Java's bytecode doesn't solve that problem, which is quite different.Note that in the first place, the bytecode discussion started from the need to provide a CTFEable module that does not contain more information than what is in a DI file, as this is a concern for some companies. Bytecode can solve that problem nicely IMO. You mentioned that DI is superior here, but I don't really understand how.No, it doesn't solve that problem at all. I explained why repeatedly.
Dec 20 2012
On 12/20/2012 10:05 PM, deadalnix wrote:No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.
Dec 20 2012
On 21.12.2012 08:02, Walter Bright wrote:On 12/20/2012 10:05 PM, deadalnix wrote:Sorry, can't resist: How about feeding x86 machine code (including some fixup information) into an interpreter? Maybe not realistic, but a data point in the field of possible "byte codes". The interpreter might even enjoy hardware support ;-) That might not cover all possible architectures, but if the distributed library is compiled for one platform only, CTFEing against another won't make much sense anyway.No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.
Dec 21 2012
On 12/21/2012 09:37 AM, Rainer Schuetze wrote:On 21.12.2012 08:02, Walter Bright wrote:http://bellard.org/jslinux/On 12/20/2012 10:05 PM, deadalnix wrote:Sorry, can't resist: How about feeding x86 machine code (including some fixup information) into an interpreter? Maybe not realistic,No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.but a data point in the field of possible "byte codes". The interpreter might even enjoy hardware support ;-)Direct hardware support is not achievable because CTFE needs to be pure and safe.That might not cover all possible architectures, but if the distributed library is compiled for one platform only, CTFEing against another won't make much sense anyway.
Dec 21 2012
On 21.12.2012 10:20, Timon Gehr wrote:On 12/21/2012 09:37 AM, Rainer Schuetze wrote:Incredible ;-)On 21.12.2012 08:02, Walter Bright wrote:http://bellard.org/jslinux/On 12/20/2012 10:05 PM, deadalnix wrote:Sorry, can't resist: How about feeding x86 machine code (including some fixup information) into an interpreter? Maybe not realistic,No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.True, you would have to trust the library code not to do impure/unsafe operations. Some of this might be verifiable, e.g. by not allowing fixups into mutable global memory.but a data point in the field of possible "byte codes". The interpreter might even enjoy hardware support ;-)Direct hardware support is not achievable because CTFE needs to be pure and safe.That might not cover all possible architectures, but if the distributed library is compiled for one platform only, CTFEing against another won't make much sense anyway.
Dec 21 2012
On 12/21/2012 12:37 AM, Rainer Schuetze wrote:On 21.12.2012 08:02, Walter Bright wrote:Not going to work, as CTFE needs type information. CTFE needs to interact with the current symbols and types in the compilation unit. Just think about what you'd need to do to get CTFE to read the object file for a module and try to execute the code in it, feeding it data and types and symbols from the current compilation unit. Consider:

    add EAX,37
    mov [EAX],EBX

What the heck is EAX pointing at?On 12/20/2012 10:05 PM, deadalnix wrote:Sorry, can't resist: How about feeding x86 machine code (including some fixup information) into an interpreter? Maybe not realistic, but a data point in the field of possible "byte codes". The interpreter might even enjoy hardware support ;-)No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.
Dec 21 2012
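For contrast, here is the kind of information CTFE actually works with. This is ordinary D that compiles as written: the compile-time interpreter sees that a is a const(int)[] and s an int, so it can bounds-check the loop and verify purity while evaluating, none of which can be recovered from a bare mov through an untyped register.

    // Ordinary D; compiles as written. The compiler's built-in interpreter
    // evaluates sum() at compile time using full type information.
    int sum(const(int)[] a)
    {
        int s = 0;
        foreach (x; a)
            s += x;
        return s;
    }

    enum total = sum([1, 2, 3]); // forces compile-time evaluation
    static assert(total == 6);   // checked before any code is generated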
On 21.12.2012 11:28, Walter Bright wrote:On 12/21/2012 12:37 AM, Rainer Schuetze wrote:I think you don't need to care. The CPU can execute it as well without type information. If the data layout of the interpreter values is the same as for the interpreted architecture, all you need to know is the calling convention, the types of the arguments to the function to be executed, and the return type. I'd intercept calls to other functions because the interpreter might want to replace them with non-native versions (e.g. new, or functions where the source code exists). The types of the data passed when executing these calls are known as well.On 21.12.2012 08:02, Walter Bright wrote:Not going to work, as CTFE needs type information. CTFE needs to interact with the current symbols and types in the compilation unit. Just think about what you'd need to do to get CTFE to read the object file for a module and try to execute the code in it, feeding it data and types and symbols from the current compilation unit. Consider:

    add EAX,37
    mov [EAX],EBX

What the heck is EAX pointing at?On 12/20/2012 10:05 PM, deadalnix wrote:Sorry, can't resist: How about feeding x86 machine code (including some fixup information) into an interpreter? Maybe not realistic, but a data point in the field of possible "byte codes". The interpreter might even enjoy hardware support ;-)No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.
Dec 21 2012
On 12/21/2012 3:52 AM, Rainer Schuetze wrote:I think you don't need to care. The CPU can execute it as well without type information. If the data layout of the interpreter values is the same as for the interpreted architecture, all you need to know is the calling convention and the types of the arguments to the function to be executed and the return type.CPU instructions are as unportable as you can get. All type information is lost, as well as all information as to where the values for things come from. Hence, such a format has dependencies on every module and switch it was compiled with, dependencies that cannot be accounted for if they change. It cannot be inlined, no inferences can be made as to purity, and it's hard to see how CTFE could determine if a particular path through that code is supported or not by the CTFE. In fact, it is useless as a means of importing module information - you might as well just link to that object code at link time.
Dec 21 2012
On Friday, 21 December 2012 at 07:03:13 UTC, Walter Bright wrote:On 12/20/2012 10:05 PM, deadalnix wrote:Optimized LLVM bytecode looks like a good candidate for the job. Note that I'm not suggesting this as a spec, but as an example of a possible solution.No, you explained that Java's bytecode doesn't solve that problem, which is quite different.I did, but obviously you did not find that satisfactory. Let me put it this way: Design a bytecode format, and present it here, that is CTFEable and cannot be automatically decompiled.
Dec 21 2012
Optimized LLVM bytecode looks like a good candidate for the job. Note that I'm not suggesting this as a spec, but as an example of a possible solution.It's true that it couldn't be automatically decompiled to something equivalent to the original D source, but it does contain type information. Its human-readable form (LLVM assembly language) is easier to understand than assembly.
Dec 21 2012
On 12/21/2012 12:07 PM, jerro wrote:I haven't looked at the format, but if it's got type information, that goes quite a long way towards supporting automatic decompilation.Optimized LLVM bytecode looks like a good candidate for the job. Note that I'm not suggesting this as a spec, but as an example of a possible solution.It's true that it couldn't be automatically decompiled to something equivalent to the original D source, but it does contain type information. Its human-readable form (LLVM assembly language) is easier to understand than assembly.
Dec 21 2012
On Friday, 21 December 2012 at 20:08:00 UTC, jerro wrote:Once the optimizer has run, a lot of it is lost. It is easier to understand than pure x86 assembly, but it is still quite opaque.Optimized LLVM bytecode looks like a good candidate for the job. Note that I'm not suggesting this as a spec, but as an example of a possible solution.It's true that it couldn't be automatically decompiled to something equivalent to the original D source, but it does contain type information. Its human-readable form (LLVM assembly language) is easier to understand than assembly.
Dec 21 2012
On Thursday, 20 December 2012 at 21:30:44 UTC, deadalnix wrote:On Thursday, 20 December 2012 at 01:41:38 UTC, Walter Bright wrote:Walter is right that bytecode doesn't solve that problem at all. High-level bytecodes like Microsoft IL are trivially decompiled into very readable source code. I did that frequently at one of my jobs when I needed to debug third-party .NET libraries that we didn't have source code for. The advantage of bytecode is not in obfuscation. What Walter is wrong about is that bytecode is entirely pointless.Not exactly; I argue that having a bytecode standard is useless. How a compiler works internally is fairly irrelevant.Note that in the first place, the bytecode discussion started from the need to provide a CTFEable module that does not contain more information than what is in a DI file, as this is a concern for some companies. Bytecode can solve that problem nicely IMO. You mentioned that DI is superior here, but I don't really understand how.
Dec 21 2012
On 12/21/2012 2:13 AM, Max Samukha wrote:What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:On 12/21/2012 2:13 AM, Max Samukha wrote:Interpreting the AST directly: Requires recursion. Interpreting a (stack-based) bytecode: Does not require recursion. That's what an AST-to-bytecode transformation does; it eliminates the recursion. And that is far from being useless.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
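The contrast Araq describes, sketched with toy types (invented for illustration, not from any real interpreter): a tree-walking evaluator recurses, so the host call stack grows with expression depth, whereas a bytecode interpreter such as the run loop earlier in the thread is a single flat loop over the instruction stream with an explicit operand stack. Either way one stack exists; the transformation just turns implicit call frames into an explicit data structure.

    enum Op : ubyte { Push, Add, Mul }

    // Toy AST node, heap-linked rather than flat.
    class Node
    {
        Op op;
        int value;
        Node left, right;

        this(int v) { op = Op.Push; value = v; }
        this(Op o, Node l, Node r) { op = o; left = l; right = r; }
    }

    // Recursive tree walk: every nested subexpression costs a host stack frame.
    int eval(const Node n)
    {
        switch (n.op)
        {
        case Op.Push:
            return n.value;
        case Op.Add:
            return eval(n.left) + eval(n.right);
        case Op.Mul:
            return eval(n.left) * eval(n.right);
        default:
            assert(0); // unreachable for well-formed trees
        }
    }

    void main()
    {
        // (1 + 2) * 3 as a tree; eval recurses to compute it.
        auto e = new Node(Op.Mul,
                          new Node(Op.Add, new Node(1), new Node(2)),
                          new Node(3));
        assert(eval(e) == 9);
    }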
On 12/21/2012 2:37 AM, Araq wrote:On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:Sorry, I don't get this at all. Every bytecode scheme I've seen had a stack and recursion. Furthermore, that's not an argument that transmission of code (and importation of modules) is better done as bytecode than source code.On 12/21/2012 2:13 AM, Max Samukha wrote:Interpreting the AST directly: Requires recursion. Interpreting a (stack-based) bytecode: Does not require recursion. That's what an AST-to-bytecode transformation does; it eliminates the recursion. And that is far from being useless.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
On Friday, 21 December 2012 at 10:37:05 UTC, Araq wrote:On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:I don't think this is such a big deal. Either way you need one stack: either the call stack or the stack machine's stack. It doesn't seem to make a big difference. Am I wrong?On 12/21/2012 2:13 AM, Max Samukha wrote:Interpreting the AST directly: Requires recursion. Interpreting a (stack-based) bytecode: Does not require recursion. That's what an AST-to-bytecode transformation does; it eliminates the recursion. And that is far from being useless.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:On 12/21/2012 2:13 AM, Max Samukha wrote:It is not about bytecode vs. source code. It is about a common platform-independent intermediate representation for multiple languages. JS is such a representation in the browsers, and it is widely used. Is it entirely pointless? I am not convinced it is.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
On Friday, 21 December 2012 at 11:00:01 UTC, Max Samukha wrote:On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:Another example: many of us here are talking in an intermediate language, which is not quite English :) The concept of a common representation works pretty well here.On 12/21/2012 2:13 AM, Max Samukha wrote:It is not about bytecode vs. source code. It is about a common platform-independent intermediate representation for multiple languages. JS is such a representation in the browsers, and it is widely used. Is it entirely pointless? I am not convinced it is.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
On 2012-12-21 12:12, Max Samukha <maxsamukha gmail.com> wrote:On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:But Walter has said that for exactly this purpose, bytecode is useful. What he's said is that in the case proposed (using bytecode instead of source code for CTFE), bytecode offers absolutely no advantage over source. Now can we move on? It's been said so many times now, and we all know Walter is not a pushover. If nobody can present irrefutable, solid, peer-reviewed, and definite proof that bytecode has significant advantages over source code for the purpose of CTFE, such an implementation will never be done, and the world will be better off for it. -- SimenOn 12/21/2012 2:13 AM, Max Samukha wrote:It is not about bytecode vs. source code. It is about a common platform-independent intermediate representation for multiple languages. JS is such a representation in the browsers, and it is widely used. Is it entirely pointless? I am not convinced it is.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?
Dec 21 2012
On Friday, 21 December 2012 at 17:08:28 UTC, Simen Kjaeraas wrote:On 2012-12-21 12:12, Max Samukha <maxsamukha gmail.com> wrote:Really? He sounded like the whole world should repent for using IRs. Maybe I misunderstood.On Friday, 21 December 2012 at 10:30:21 UTC, Walter Bright wrote:But Walter has said that for exactly this purpose, bytecode is useful.On 12/21/2012 2:13 AM, Max Samukha wrote:It is not about bytecode vs. source code. It is about a common platform-independent intermediate representation for multiple languages. JS is such a representation in the browsers, and it is widely used. Is it entirely pointless? I am not convinced it is.What Walter is wrong about is that bytecode is entirely pointless.I'll bite. What is its advantage over source code?What he's said is that in the case proposed (using bytecode instead of source code for CTFE), bytecode offers absolutely no advantage over source. Now can we move on? It's been said so many times now, and we all know Walter is not a pushover. If nobody can present irrefutable, solid, peer-reviewed, and definite proof that bytecode has significant advantages over source code for the purpose of CTFE, such an implementation will never be done, and the world will be better off for it.I am not arguing that.
Dec 21 2012
On 12/19/12 4:10 AM, Rob T wrote:Do you have a theory or insight that can explain why a situation like the Java bytecode VM came to be and why it persists, despite your suggestion that it is not required or of enough advantage to justify using it (may as well use Java source directly)?I think the important claim here is that an AST and a bytecode have the same "power". Clearly, parsing source code into AST form has a cost, which is understood by everyone in this discussion. Andrei
Dec 19 2012
On Tuesday, 18 December 2012 at 18:11:37 UTC, Walter Bright wrote:An interesting datapoint in regards to bytecode is Javascript. Note that Javascript is not distributed in bytecode form. There is no Javascript VM. It is distributed as source code. Sometimes, that source code is compressed and obfuscated, nevertheless it is still source code. How the end system chooses to execute the js is up to that end system, and indeed there are a great variety of methods in use. Javascript proves that bytecode is not required for "write once, run everywhere", which was one of the pitches for bytecode. What is required for w.o.r.e. is a specification for the source code that precludes undefined and implementation defined behavior. Note also that Typescript compiles to Javascript. I suspect there are other languages that do so, too.True; however, JavaScript's case is similar to C's. Many compilers make use of C as a high-level assembler, and JavaScript, like it or not, is the C of the Internet. -- Paulo
Dec 18 2012