digitalmars.D - Labels as values and threaded-code interpretation
- =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= (99/99) May 31 2013 Hi,
- Manu (11/110) May 31 2013 GCC has a non-standard extension to do this, I've used it to get great
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (7/116) Jun 01 2013 Yes, it's basically essential for high-perf interpreters. It's a feature...
- bearophile (33/41) Jun 01 2013 This was discussed more than once, and I think it's a valuable
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (9/48) Jun 01 2013 I don't think there's any question as to the usefulness (and
- Walter Bright (5/13) Jun 01 2013 To be pedantic, C and C++ don't have that feature. Some compilers add it...
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (8/24) Jun 01 2013 I don't particularly care about its safety (its insanely unsafe). It's
- Paulo Pinto (6/20) Jun 01 2013 I always have fun in public forums making people aware that what they
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (6/27) Jun 02 2013 I am perfectly aware that it's a GNU C extension.
- Brad Roberts (3/16) Jun 02 2013 While you're technically correct, every major compiler in the unix world...
- Paulo Pinto (4/27) Jun 02 2013 If your world is only UNIX then yeah, there are still lots of non UNIX
- Brad Roberts (5/33) Jun 02 2013 I agree. I said unix primarily to mean most except for msvc.
- Walter Bright (2/6) Jun 02 2013 The curious question is why it never gets into the newer C and C++ Stand...
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (8/17) Jun 02 2013 That one's easy: It makes 'too many' assumptions about the capabilities
- Brad Roberts (7/13) Jun 02 2013 I can only surmise since I've never been a part of those processes, but:
- Jacob Carlborg (6/8) Jun 03 2013 It's like inline assembly. I think basically every compiler supports it
- Manu (4/31) Jun 02 2013 us.
- bearophile (8/15) Jun 01 2013 Regarding the syntax, why do you use "&&"? Isn't a single "&"
- Timon Gehr (8/22) Jun 01 2013 D (like C) uses a different namespace for labels and symbols that are
- bearophile (5/12) Jun 01 2013 On a related topic I wrote this:
- Andrej Mitrovic (4/10) Jun 01 2013 Perhaps for experimenting purposes (before resorting to language
- =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= (6/16) Jun 01 2013 You need a way to jump to an arbitrary address too.
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (8/22) Jun 01 2013 I just used the GNU C syntax because I was familiar with it. I don't
- bearophile (6/6) Jun 01 2013 As example the very small but fast virtual machine written in GNU
- Diggory (8/15) Jun 01 2013 Would be cool if there was a platform independent way in D to
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (7/22) Jun 01 2013 Probably a bit too domain-specific (not to say it isn't useful).
- Timon Gehr (3/59) Jun 01 2013 (This must be supported at compile time!)
- Adam D. Ruppe (4/5) Jun 01 2013 Me too. Last time this came up I said no since there's some
- Rob T (8/13) Jun 01 2013 I wonder if this sort of thing would help solve another
- Carl Sturtivant (3/7) Jun 02 2013 Here's some evidence that the answer to your question is "yes".
- Dmitry Olshansky (10/23) Jun 02 2013 Same here.
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (6/29) Jun 02 2013 I'm not sure that can support threaded-code interpretation.
- Dmitry Olshansky (27/60) Jun 02 2013 Why not:
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (8/70) Jun 02 2013 The problem here is assuming the interpreter state can be global. Once
- Dmitry Olshansky (12/87) Jun 02 2013 One pointer to a struct MyInterpreterState. Think of it as 'this'
- =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (14/102) Jun 02 2013 That's one load and store for every single step in the interpreter.
- Dmitry Olshansky (8/28) Jun 02 2013 See my reply to Walter. Again I personally have not done any
- Dmitry Olshansky (12/21) Jun 02 2013 If the instruction set is RISC-y then you are in trouble anyway (writing...
- Walter Bright (4/6) Jun 02 2013 This should not be the case with register allocation using live ranges.
- Dmitry Olshansky (11/18) Jun 02 2013 The problem is that if every goto *get_me_new_opcode compile may have no...
- Walter Bright (2/9) Jun 02 2013 Thanks for the interesting article.
- nazriel (11/24) Jun 02 2013 It would also solve the problem with IASM and being unable to
Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions The idea is to be able to write code like this: ---- import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x => !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = &&handle_imm; break; case add: lbl = &&handle_add; break; case sub: lbl = &&handle_sub; break; // ... case ret: lbl = &&handle_ret; break; } } insn.lbl = lbl; } // Start interpreting the entry instruction. auto state = new State; state.pc = code[0]; goto *state.pc.lbl; // Interpreter logic follows... // The whole point is to avoid unnecessary function // calls, so we use a mixin template for this stuff. enum advance = "state.pc = state.pc.next;" ~ "goto *state.pc.lbl;"; handle_imm: { state.regs[state.pc.args[0]] = state.pc.args[1]; mixin(advance); } handle_add: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]]; mixin(advance); } handle_sub: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]]; mixin(advance); } // ... handle_ret: { return state.regs[state.pc.args[0]]; } assert(false); } ---- Notice that there are no function calls going on. Just plain jumps all the way through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
May 31 2013
GCC has a non-standard extension to do this, I've used it to get great performance wins writing emulators. I love this feature, love to see it in D! On 1 Jun 2013 15:30, "Alex R=C3=B8nne Petersen" <alex lycus.org> wrote:Hi, I'm sure this has been brought up before, but I feel I need to bring it u=pagain (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/**gcc/Labels-as-Values.html<http://gcc.gnu.=org/onlinedocs/gcc/Labels-as-Values.html>This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a labe=laddress directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th =Qat: http://www.erlang.org/doc/**installation_guide/INSTALL-** WIN32.html#Frequently-Asked-**Questions<http://www.erlang.org/doc/install=ation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions>The idea is to be able to write code like this: ---- import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x =3D> !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl =3D &&handle_imm; break; case add: lbl =3D &&handle_add; break; case sub: lbl =3D &&handle_sub; break; // ... case ret: lbl =3D &&handle_ret; break; } } insn.lbl =3D lbl; } // Start interpreting the entry instruction. auto state =3D new State; state.pc =3D code[0]; goto *state.pc.lbl; // Interpreter logic follows... // The whole point is to avoid unnecessary function // calls, so we use a mixin template for this stuff. enum advance =3D "state.pc =3D state.pc.next;" ~ "goto *state.pc.lbl;"; handle_imm: { state.regs[state.pc.args[0]] =3D state.pc.args[1]; mixin(advance); } handle_add: { state.regs[state.pc.args[0]] =3D state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]]; mixin(advance); } handle_sub: { state.regs[state.pc.args[0]] =3D state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]]; mixin(advance); } // ... handle_ret: { return state.regs[state.pc.args[0]]; } assert(false); } ---- Notice that there are no function calls going on. Just plain jumps all th=eway through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to suppor=tit too. -- Alex R=C3=B8nne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
May 31 2013
On 01-06-2013 07:41, Manu wrote:GCC has a non-standard extension to do this, I've used it to get great performance wins writing emulators. I love this feature, love to see it in D!Yes, it's basically essential for high-perf interpreters. It's a feature that a systems language like D must have.On 1 Jun 2013 15:30, "Alex Rønne Petersen" <alex lycus.org <mailto:alex lycus.org>> wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/__gcc/Labels-as-Values.html <http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html> This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/__installation_guide/INSTALL-__WIN32.html#Frequently-Asked-__Questions <http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions> The idea is to be able to write code like this: ---- import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x => !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = &&handle_imm; break; case add: lbl = &&handle_add; break; case sub: lbl = &&handle_sub; break; // ... case ret: lbl = &&handle_ret; break; } } insn.lbl = lbl; } // Start interpreting the entry instruction. auto state = new State; state.pc = code[0]; goto *state.pc.lbl; // Interpreter logic follows... // The whole point is to avoid unnecessary function // calls, so we use a mixin template for this stuff. enum advance = "state.pc = state.pc.next;" ~ "goto *state.pc.lbl;"; handle_imm: { state.regs[state.pc.args[0]] = state.pc.args[1]; mixin(advance); } handle_add: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]]; mixin(advance); } handle_sub: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]]; mixin(advance); } // ... handle_ret: { return state.regs[state.pc.args[0]]; } assert(false); } ---- Notice that there are no function calls going on. Just plain jumps all the way through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. -- Alex Rønne Petersen alex alexrp.com <mailto:alex alexrp.com> / alex lycus.org <mailto:alex lycus.org> http://alexrp.com / http://lycus.org-- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
Alex Rønne Petersen:I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension.This was discussed more than once, and I think it's a valuable improvement for certain kinds of low level D programming. Walter says that he has not added this feature to D because it's useful only to write interpreters and the like. I have found it useful in GCC to also write very fast finite state machines to analyse genomic strings. But even if Walter is right, writing interpreters (and emulators) is an important purpose for a system language as D. You can't write them efficient in scripting Languages like C/D/C++ (and few others, as later probably Rust) are the only few natural languages to write them. "Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too.Even if DMD back-end will never implement it, I think it's important to define it formally and officially in the D language and add the relative syntax to the front-end (plus a standard version identifier that allows to write programs that have both a routine that uses computed gotos and one that doesn't use them). This has the big advantage that all future compilers that will want to implement it will use the same nice standard syntax. If D doesn't define this syntax, then maybe GDC will add it, and maybe later LDC will add it, and later maybe other future compilers will add it, and we can't be sure they will share the same syntax. Another advantage of having this syntax in D, is that even if a compiler back-end doesn't support computed gotos, the program compiles without syntax errors when you use conditional compilation to disable the piece of code that uses computed gotos. Bye, bearophile
Jun 01 2013
On 01-06-2013 09:59, bearophile wrote:Alex Rønne Petersen:I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension.This was discussed more than once, and I think it's a valuable improvement for certain kinds of low level D programming. Walter says that he has not added this feature to D because it's useful only to write interpreters and the like. I have found it useful in GCC to also write very fast finite state machines to analyse genomic strings. But even if Walter is right, writing interpreters (and emulators) is an important purpose for a system language as D. You can't good at all to write them. Languages like C/D/C++ (and few others, as later probably Rust) are the only few natural languages to write them. "Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.Yes, good points.Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too.Even if DMD back-end will never implement it, I think it's important to define it formally and officially in the D language and add the relative syntax to the front-end (plus a standard version identifier that allows to write programs that have both a routine that uses computed gotos and one that doesn't use them). This has the big advantage that all future compilers that will want to implement it will use the same nice standard syntax. If D doesn't define this syntax, then maybe GDC will add it, and maybe later LDC will add it, and later maybe other future compilers will add it, and we can't be sure they will share the same syntax. Another advantage of having this syntax in D, is that even if a compiler back-end doesn't support computed gotos, the program compiles without syntax errors when you use conditional compilation to disable the piece of code that uses computed gotos.Bye, bearophile-- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:On 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
Jun 01 2013
On 02-06-2013 06:49, Walter Bright wrote:On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:I know, I just don't always remember to type out "GNU C" instead of "C".On 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.I don't particularly care about its safety (its insanely unsafe). It's all about performance with a feature like this. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
Am 02.06.2013 06:49, schrieb Walter Bright:On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour. Not much people seem to care to read language standards. :) -- PauloOn 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
Jun 01 2013
On 02-06-2013 08:44, Paulo Pinto wrote:Am 02.06.2013 06:49, schrieb Walter Bright:I am perfectly aware that it's a GNU C extension. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.orgOn 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour. Not much people seem to care to read language standards. :) -- PauloOn 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
Jun 02 2013
On 6/1/13 9:49 PM, Walter Bright wrote:On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.On 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
Jun 02 2013
Am 02.06.2013 09:25, schrieb Brad Roberts:On 6/1/13 9:49 PM, Walter Bright wrote:If your world is only UNIX then yeah, there are still lots of non UNIX os out there if you look outside the desktop computers. Of course, one could always question if they matter at all.On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.On 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
Jun 02 2013
On 6/2/13 12:33 AM, Paulo Pinto wrote:Am 02.06.2013 09:25, schrieb Brad Roberts:I agree. I said unix primarily to mean most except for msvc. Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs.On 6/1/13 9:49 PM, Walter Bright wrote:If your world is only UNIX then yeah, there are still lots of non UNIX os out there if you look outside the desktop computers. Of course, one could always question if they matter at all.On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.On 01-06-2013 09:59, bearophile wrote:To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere."Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
Jun 02 2013
On 6/2/2013 12:49 AM, Brad Roberts wrote:Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs.The curious question is why it never gets into the newer C and C++ Standards.
Jun 02 2013
On 02-06-2013 19:44, Walter Bright wrote:On 6/2/2013 12:49 AM, Brad Roberts wrote:That one's easy: It makes 'too many' assumptions about the capabilities of the target machine. Sort of like assuming signed integers overflow. In practice, though, any relevant target can support computed gotos. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.orgMany if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs.The curious question is why it never gets into the newer C and C++ Standards.
Jun 02 2013
On 6/2/13 10:44 AM, Walter Bright wrote:On 6/2/2013 12:49 AM, Brad Roberts wrote:I can only surmise since I've never been a part of those processes, but: 1) since the support is already there for the people that want it, why go through the hassle? 2) going through the hassle just to get it more broadly supported is a lot of hassle. 3) people are generally fundamentally allergic to hassle. Which also lines up with why people are asking for it to be added to d, because it's not already available, so it's worth the hassle.Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs.The curious question is why it never gets into the newer C and C++ Standards.
Jun 02 2013
On 2013-06-02 19:44, Walter Bright wrote:The curious question is why it never gets into the newer C and C++ Standards.It's like inline assembly. I think basically every compiler supports it but it's not in the standard. At least there's a compiler for each platform supporting it. -- /Jacob Carlborg
Jun 03 2013
On 2 June 2013 17:25, Brad Roberts <braddr puremagic.com> wrote:On 6/1/13 9:49 PM, Walter Bright wrote:oOn 6/1/2013 7:35 PM, Alex R=C3=B8nne Petersen wrote:On 01-06-2013 09:59, bearophile wrote:"Recently" the Python C interpreter was modified and speed up thanks t=us.While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard stat=To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.MSVC doesn't support it, which is really annoying.
Jun 02 2013
Alex Rønne Petersen:final switch (insn.op) { case imm: lbl = &&handle_imm; break; case add: lbl = &&handle_add; break; case sub: lbl = &&handle_sub; break; // ... case ret: lbl = &&handle_ret; break;Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? case imm: lbl = &handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile
Jun 01 2013
On 06/01/2013 11:43 AM, bearophile wrote:Alex Rønne Petersen:D (like C) uses a different namespace for labels and symbols that are not labels. This compiles today: void main(){ int foo; foo: auto b = &foo; }final switch (insn.op) { case imm: lbl = &&handle_imm; break; case add: lbl = &&handle_add; break; case sub: lbl = &&handle_sub; break; // ... case ret: lbl = &&handle_ret; break;Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? case imm: lbl = &handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile
Jun 01 2013
Timon Gehr:D (like C) uses a different namespace for labels and symbols that are not labels. This compiles today: void main(){ int foo; foo: auto b = &foo; }On a related topic I wrote this: http://d.puremagic.com/issues/show_bug.cgi?id=4902 Bye, bearophile
Jun 01 2013
On 6/1/13, Timon Gehr <timon.gehr gmx.ch> wrote:D (like C) uses a different namespace for labels and symbols that are not labels.Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.:void main(){ int foo; foo: auto b = __traits(gotoAddr, foo); }And if it's successful, we add language support.
Jun 01 2013
On 02-06-2013 05:18, Andrej Mitrovic wrote:On 6/1/13, Timon Gehr <timon.gehr gmx.ch> wrote:You need a way to jump to an arbitrary address too. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.orgD (like C) uses a different namespace for labels and symbols that are not labels.Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.:void main(){ int foo; foo: auto b = __traits(gotoAddr, foo); }And if it's successful, we add language support.
Jun 01 2013
On 01-06-2013 11:43, bearophile wrote:Alex Rønne Petersen:I just used the GNU C syntax because I was familiar with it. I don't particularly care how it ends up looking in D. But Timon makes a good point about namespaces. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.orgfinal switch (insn.op) { case imm: lbl = &&handle_imm; break; case add: lbl = &&handle_add; break; case sub: lbl = &&handle_sub; break; // ... case ret: lbl = &&handle_ret; break;Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? case imm: lbl = &handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile
Jun 01 2013
As example the very small but fast virtual machine written in GNU C++ from the International Contest on Functional Programming 2006: http://codepad.org/iibBeWKw It's faster than the same code without computed gotos. Bye, bearophile
Jun 01 2013
On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote:As example the very small but fast virtual machine written in GNU C++ from the International Contest on Functional Programming 2006: http://codepad.org/iibBeWKw It's faster than the same code without computed gotos. Bye, bearophileWould be cool if there was a platform independent way in D to construct a sequence of "call" instructions in memory. Then it would be possible to write a JIT compiler in the exact same style as that example. It would be relatively easy to add to phobos although you'd have to be careful about DEP. Hmm, perhaps I will try writing something like this...
Jun 01 2013
On 01-06-2013 15:17, Diggory wrote:On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote:At that point, you just want a simple in-memory assembler. Think asmjit.As example the very small but fast virtual machine written in GNU C++ from the International Contest on Functional Programming 2006: http://codepad.org/iibBeWKw It's faster than the same code without computed gotos. Bye, bearophileWould be cool if there was a platform independent way in D to construct a sequence of "call" instructions in memory. Then it would be possible to write a JIT compiler in the exact same style as that example.It would be relatively easy to add to phobos although you'd have to be careful about DEP. Hmm, perhaps I will try writing something like this...Probably a bit too domain-specific (not to say it isn't useful). -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions The idea is to be able to write code like this: ---- import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x => !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = &&handle_imm; break; case add: lbl = &&handle_add; break; case sub: lbl = &&handle_sub; break; // ... case ret: lbl = &&handle_ret; break; } } insn.lbl = lbl; } ...(This must be supported at compile time!)
Jun 01 2013
On Saturday, 1 June 2013 at 16:13:18 UTC, Timon Gehr wrote:I'd also like to see this.Me too. Last time this came up I said no since there's some inline asm hacks you can do but turns out those hacks suck in capability and appearance compared to just having this feature.
Jun 01 2013
On Saturday, 1 June 2013 at 16:23:11 UTC, Adam D. Ruppe wrote:On Saturday, 1 June 2013 at 16:13:18 UTC, Timon Gehr wrote:I wonder if this sort of thing would help solve another implementation problem, which is implementing very fast coroutines without the overhead seen with similar features such as fibers. Some discussion here: http://forum.dlang.org/thread/pcexsqbwefjueyylyyoz forum.dlang.org --rtI'd also like to see this.Me too. Last time this came up I said no since there's some inline asm hacks you can do but turns out those hacks suck in capability and appearance compared to just having this feature.
Jun 01 2013
On Saturday, 1 June 2013 at 17:44:02 UTC, Rob T wrote:I wonder if this sort of thing would help solve another implementation problem, which is implementing very fast coroutines without the overhead seen with similar features such as fibers.Here's some evidence that the answer to your question is "yes". http://www.cs.arizona.edu/icon/jcon/gde97.pdf‎
Jun 02 2013
01-Jun-2013 20:13, Timon Gehr пишет:On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; -- Dmitry OlshanskyHi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.
Jun 02 2013
On 02-06-2013 10:52, Dmitry Olshansky wrote:01-Jun-2013 20:13, Timon Gehr пишет:I'm not sure that can support threaded-code interpretation. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.orgOn 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.
Jun 02 2013
02-Jun-2013 20:48, Alex Rønne Petersen пишет:On 02-06-2013 10:52, Dmitry Olshansky wrote:Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer. -- Dmitry Olshansky01-Jun-2013 20:13, Timon Gehr пишет:I'm not sure that can support threaded-code interpretation.On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.
Jun 02 2013
On 02-06-2013 19:43, Dmitry Olshansky wrote:02-Jun-2013 20:48, Alex Rønne Petersen пишет:The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.orgOn 02-06-2013 10:52, Dmitry Olshansky wrote:Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer.01-Jun-2013 20:13, Timon Gehr пишет:I'm not sure that can support threaded-code interpretation.On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.
Jun 02 2013
02-Jun-2013 21:49, Alex Rønne Petersen пишет:On 02-06-2013 19:43, Dmitry Olshansky wrote:One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :) Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). And you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another. BTW Globals are actually quite expansive to access anyway, this was a mock example. -- Dmitry Olshansky02-Jun-2013 20:48, Alex Rønne Petersen пишет:The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call.On 02-06-2013 10:52, Dmitry Olshansky wrote:Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer.01-Jun-2013 20:13, Timon Gehr пишет:I'm not sure that can support threaded-code interpretation.On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.
Jun 02 2013
On 02-06-2013 19:58, Dmitry Olshansky wrote:02-Jun-2013 21:49, Alex Rønne Petersen пишет:That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running.On 02-06-2013 19:43, Dmitry Olshansky wrote:One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :)02-Jun-2013 20:48, Alex Rønne Petersen пишет:The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call.On 02-06-2013 10:52, Dmitry Olshansky wrote:Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer.01-Jun-2013 20:13, Timon Gehr пишет:I'm not sure that can support threaded-code interpretation.On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.I'd also like to see this.Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one).I'd frankly be surprised if that was the case. But without any investigation, it's just word against word.And you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another.It depends on what you mean by context. Can you elaborate?BTW Globals are actually quite expansive to access anyway, this was a mock example.Sure. Either way, evidence suggests that this style of interpreter has performance advantages in practice. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 02 2013
02-Jun-2013 22:25, Alex Rønne Petersen пишет:On 02-06-2013 19:58, Dmitry Olshansky wrote:See my reply to Walter. Again I personally have not done any measurements but what that post by LuaJIt author makes a lot of sense to me.That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running.The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call.One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :)Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one).I'd frankly be surprised if that was the case. But without any investigation, it's just word against word.Context is what a virtual thread represents in the global world. Any outside hooks into it like argv in C's main are there, function tables (=extensions) and whatnot. -- Dmitry OlshanskyAnd you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another.It depends on what you mean by context. Can you elaborate?
Jun 02 2013
02-Jun-2013 22:25, Alex Rønne Petersen пишет:On 02-06-2013 19:58, Dmitry Olshansky wrote:If the instruction set is RISC-y then you are in trouble anyway (writing interpreter not in ASM). One way out is fusing multiple simple instructions into macro-instructions and interpreting those.One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :)That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running.Either way, evidence suggests that this style of interpreter has performance advantages in practice.Agreed. Though this day I'd go for what I call "bastardized JIT". In essense just produce a chunk of "call XYZ" instructions, comnditionals would have to have a test + jmp. And then there is 'ret' at the end of this chunk ~4 instructions to have to deal with. What you get is branch prediction done on the hardware level and no instruction counter to worry about. -- Dmitry Olshansky
Jun 02 2013
On 6/2/2013 10:58 AM, Dmitry Olshansky wrote:Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one).This should not be the case with register allocation using live ranges. You can also help with this by organizing declarations so the variables have the shortest scope.
Jun 02 2013
02-Jun-2013 22:54, Walter Bright пишет:On 6/2/2013 10:58 AM, Dmitry Olshansky wrote:The problem is that if every goto *get_me_new_opcode compile may have no idea of where it will through it with computed labels. Hence some conservativeness of what registers contain and reloads and in effect all labels end up as isolated "functions". Plus a big stack frame is liability. And then there is a fair share of obstacles in making the kind of threaded interpreter code optimized properly, this post I find interesting: http://article.gmane.org/gmane.comp.lang.lua.general/75426Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one).This should not be the case with register allocation using live ranges.You can also help with this by organizing declarations so the variables have the shortest scope.Then splitting opcodes as functions separates their timespan even more. -- Dmitry Olshansky
Jun 02 2013
On 6/2/2013 1:10 PM, Dmitry Olshansky wrote:The problem is that if every goto *get_me_new_opcode compile may have no idea of where it will through it with computed labels. Hence some conservativeness of what registers contain and reloads and in effect all labels end up as isolated "functions". Plus a big stack frame is liability. And then there is a fair share of obstacles in making the kind of threaded interpreter code optimized properly, this post I find interesting: http://article.gmane.org/gmane.comp.lang.lua.general/75426Thanks for the interesting article.
Jun 02 2013
On Saturday, 1 June 2013 at 05:29:28 UTC, Alex Rønne Petersen wrote:Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-QuestionsIt would also solve the problem with IASM and being unable to address the labels in iasm I noticed the problem when I tried to address the db'ed bytes. http://dpaste.dzfl.pl/36bbd7d3 Commented out //mov dword ptr RAX, meh; ??? illustrates the problem. Jumping to labels in IASM blocks seems to work ( http://dpaste.dzfl.pl/d9e47f70 ). I guess we could have two chickens backed on the same fire ;)
Jun 02 2013