D.gnu - DMD Intermediate Representation

resistor AT mac DOT com <resistor_member pathlink.com> writes:
I've been trying to hook the DMD frontend up to LLVM (www.llvm.org), but
I've been having some trouble.  The LLVM IR (Intermediate Representation)
is very well documented, but I'm having a rough time figuring out how DMD
holds its IR.  Since at least three people (David, Ben, and Walter) seem
to understand it, I thought I'd ask for guidance.

What's the best way to traverse the DMD IR once I've run the three
semantic phases?  As far as I can tell, it's all held in the SymbolTable
as a bunch of Symbols.  Is there a good way to traverse that and
reconstruct it into another IR?

-Owen
Aug 22 2004
Andy Friesen <andy ikagames.com> writes:
Have you checked out DLI? It's very old and badly out of sync with the latest DMD frontend, but it does sport a working x86 backend. You can still grab the last version from <http://opend.org> -- andy
Aug 23 2004
David Friedman <d3rdclsmail earthlink.net> writes:
There isn't a generic visitor interface.  Instead, there are several 
methods, each of which is responsible for emitting code/data and then 
calling the same method on child objects.  Start by implementing 
Module::genobjfile and loop over the 'members' array, calling each 
Dsymbol object's toObjFile method.  From there, you will need to 
implement these methods:

Dsymbol (and descendants) ::toObjFile -- Emits code and data for objects 
that generally have a symbol name and storage in memory.  Containers 
like ClassDeclaration also have a 'members' array with child Dsymbols.  
Most of these are descendants of the Declaration class.

Statement (and descendants) ::toIR -- Emits instructions.  Usually, you 
just call toObjFile, toIR, toElem, etc. on the statement's fields and 
string the results together in the IR.

Expression (and descendants) ::toElem -- Returns a back end 
representation of the numeric constants, variable references, and 
operations that make up expression trees.  This was very simple for GCC 
because the back end already had the code to convert expression trees to 
ordered instructions.  If LLVM doesn't do this, I think you could 
generate the instructions here, since LLVM's IR is in SSA form.

Type (and descendants) ::toCtype -- Returns the back end representation 
of the type.  Note that a lot of classes don't override this -- you just 
need to do a switch on the 'ty' field in Type::toCtype.

Dsymbol (and descendants) ::toSymbol -- Returns the back end reference 
to the object.  For example, FuncDeclaration::toSymbol could return an 
llvm::Function.  These are already implemented in tocsym.c, but you will 
probably rewrite them to create LLVM objects.
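
The traversal described above can be sketched roughly as follows. This is 
a minimal illustration only: the classes are simplified stand-ins that 
borrow the front end's names, the (out, depth) parameters on toObjFile 
and the emitMembers/emitDemoModule helpers are assumptions made for the 
example, and the "emitted" output is just indented text rather than 
anything a real back end would produce.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the front end's Dsymbol; the real class carries
// far more state, and its toObjFile has a different signature.
struct Dsymbol {
    std::string ident;
    explicit Dsymbol(std::string id) : ident(std::move(id)) {}
    virtual ~Dsymbol() {}
    // Each symbol emits itself, then recurses into anything it contains.
    virtual void toObjFile(std::ostream& out, int depth) const = 0;
};

typedef std::vector<std::unique_ptr<Dsymbol>> Dsymbols;

static std::string indent(int depth) { return std::string(depth * 2, ' '); }

// Hypothetical helper: the "loop over 'members'" step shared by containers.
static void emitMembers(const Dsymbols& members, std::ostream& out, int depth) {
    for (const auto& m : members) {
        assert(m && "members must not hold null entries");
        m->toObjFile(out, depth);
    }
}

struct VarDeclaration : Dsymbol {
    explicit VarDeclaration(std::string id) : Dsymbol(std::move(id)) {}
    void toObjFile(std::ostream& out, int depth) const override {
        out << indent(depth) << "data " << ident << '\n';
    }
};

struct FuncDeclaration : Dsymbol {
    explicit FuncDeclaration(std::string id) : Dsymbol(std::move(id)) {}
    void toObjFile(std::ostream& out, int depth) const override {
        out << indent(depth) << "function " << ident << '\n';
    }
};

// A container symbol: emits its own entry, then its child symbols.
struct ClassDeclaration : Dsymbol {
    Dsymbols members;
    explicit ClassDeclaration(std::string id) : Dsymbol(std::move(id)) {}
    void toObjFile(std::ostream& out, int depth) const override {
        out << indent(depth) << "class " << ident << '\n';
        emitMembers(members, out, depth + 1);
    }
};

// Entry point of the walk: one call per module, fanning out to members.
struct Module {
    std::string name;
    Dsymbols members;
    void genobjfile(std::ostream& out) const {
        out << "module " << name << '\n';
        emitMembers(members, out, 1);
    }
};

// Builds a tiny module by hand and returns what the walk emits.
std::string emitDemoModule() {
    std::unique_ptr<ClassDeclaration> widget(new ClassDeclaration("Widget"));
    widget->members.emplace_back(new FuncDeclaration("draw"));
    Module mod;
    mod.name = "demo";
    mod.members.emplace_back(new VarDeclaration("counter"));
    mod.members.push_back(std::move(widget));
    std::ostringstream out;
    mod.genobjfile(out);
    return out.str();
}
```

The point of the sketch is only the shape of the recursion: there is no 
separate visitor, and each class decides what to emit and which children 
to descend into.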

David
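
For the toCtype and toElem points in this post, a rough sketch of what 
"switch on 'ty'" and "generate the instructions in toElem" might look 
like. Everything here is assumed for illustration: the TY tags, the 
InstrList helper, the BinExp class, and the string-based output are 
stand-ins, not the front end's or LLVM's actual API. The emitted lines 
merely imitate LLVM's textual IR.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type tags; the real 'ty' enumeration has many more entries.
enum TY { Tint32, Tfloat64, Tbool };

struct Type {
    TY ty;
    // One switch on 'ty' covers the basic types without per-class overrides.
    std::string toCtype() const {
        switch (ty) {
            case Tint32:   return "i32";
            case Tfloat64: return "double";
            case Tbool:    return "i1";
        }
        assert(false && "unhandled type tag");
        return "";
    }
};

// Hypothetical helper: keeps instructions in order and names fresh values.
struct InstrList {
    std::vector<std::string> code;
    int next = 0;
    std::string fresh() { return "%t" + std::to_string(next++); }
};

struct Expression {
    Type type;
    explicit Expression(Type t) : type(t) {}
    virtual ~Expression() {}
    // Returns the name (or literal) that holds this expression's value.
    virtual std::string toElem(InstrList& out) const = 0;
};

struct IntegerExp : Expression {
    long value;
    IntegerExp(Type t, long v) : Expression(t), value(v) {}
    std::string toElem(InstrList&) const override { return std::to_string(value); }
};

struct VarExp : Expression {
    std::string name;
    VarExp(Type t, std::string n) : Expression(t), name(std::move(n)) {}
    std::string toElem(InstrList&) const override { return "%" + name; }
};

// Binary operation: lower both operands first, then emit one instruction
// whose result gets a fresh name, so every value is assigned exactly once.
struct BinExp : Expression {
    std::string op;
    std::unique_ptr<Expression> e1, e2;
    BinExp(Type t, std::string o, Expression* a, Expression* b)
        : Expression(t), op(std::move(o)), e1(a), e2(b) {}
    std::string toElem(InstrList& out) const override {
        std::string lhs = e1->toElem(out);
        std::string rhs = e2->toElem(out);
        std::string result = out.fresh();
        out.code.push_back(result + " = " + op + " " + type.toCtype() +
                           " " + lhs + ", " + rhs);
        return result;
    }
};

// Lowers (a + 2) * b and returns the ordered instruction list.
std::vector<std::string> lowerDemo() {
    Type i32 = {Tint32};
    BinExp product(i32, "mul",
                   new BinExp(i32, "add", new VarExp(i32, "a"),
                              new IntegerExp(i32, 2)),
                   new VarExp(i32, "b"));
    InstrList out;
    product.toElem(out);
    return out.code;
}
```

Under these assumptions, lowerDemo() yields an add followed by a mul that 
consumes the add's result, which is the "ordered instructions" form the 
post refers to.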


Aug 23 2004
Owen Anderson <resistor mac.com> writes:
Awesome.  Thanks for all the help.  I'm starting out by removing all 
backend calls and replacing them with debug printf's so I can keep track 
of what needs replacing.
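
One possible shape for those debug stubs, as a sketch only: the 
BACKEND_STUB macro, noteStub, stubHits, and the two placeholder function 
names are all made up for this example and do not correspond to actual 
DMD entry points.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical registry of how often each removed back-end call was reached.
inline std::map<std::string, int>& stubHits() {
    static std::map<std::string, int> hits;
    return hits;
}

// Records the hit and prints where it came from, so the log doubles as a
// to-do list of calls that still need an LLVM replacement.
inline void noteStub(const char* name, const char* file, int line) {
    assert(name != nullptr);
    ++stubHits()[name];
    std::fprintf(stderr, "TODO backend: %s (%s:%d)\n", name, file, line);
}

#define BACKEND_STUB(name) noteStub(name, __FILE__, __LINE__)

// Placeholder bodies standing in for functions that used to call the
// original back end (names are illustrative).
void emitFunctionBody() { BACKEND_STUB("emitFunctionBody"); }
void emitStaticData()   { BACKEND_STUB("emitStaticData"); }
```

Counting the hits as well as printing them makes it easy to see at the 
end of a run which stubs matter most for a given test program.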

Question:  How does your GDC code still have these includes -

#include	"cc.h"
#include	"el.h"
#include	"oper.h"
#include	"global.h"
#include	"code.h"
#include	"type.h"
#include	"dt.h"

Mine complains about them not existing, so I assumed they were backend 
headers.

-Owen

Aug 24 2004
David Friedman <d3rdclsmail earthlink.net> writes:
GDC doesn't use the original todt.c and tocsym.c, so it doesn't need 
those headers.  Instead of recreating the DMD back end types (elem, 
dt_t, etc.), I just typedef'd them to be GCC nodes (except for the 
Symbol struct).
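
The typedef approach might look something like the sketch below. It is an 
illustration under stated assumptions, not GDC's actual source: 
BackendNode stands in for the host back end's node type (GCC's tree nodes 
play that role in GDC), and its fields, the Symbol layout, and NodePool 
are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-in for the host back end's node type; fields are illustrative.
struct BackendNode {
    std::string op;
    std::vector<const BackendNode*> operands;
};

// The names the front-end glue code expects simply alias the host type,
// so existing signatures written in terms of elem or dt_t keep compiling.
typedef BackendNode elem;
typedef BackendNode dt_t;

// Symbol stays a struct of its own and points at a back-end node.
struct Symbol {
    std::string name;
    const BackendNode* decl;
};

static_assert(std::is_same<elem, dt_t>::value,
              "both names alias one node type");

// Hypothetical owner for nodes so the example has no manual cleanup.
struct NodePool {
    std::vector<std::unique_ptr<BackendNode>> nodes;
    elem* make(std::string op, std::vector<const BackendNode*> operands = {}) {
        assert(!op.empty());
        nodes.push_back(std::unique_ptr<BackendNode>(
            new BackendNode{std::move(op), std::move(operands)}));
        return nodes.back().get();
    }
};
```

With aliases like these, the glue code never needs the original back end 
headers; only the code that builds nodes has to know the host type.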

David

Aug 25 2004