digitalmars.D - Identifier-name compression.
- Stefan Koch (9/9) May 21 2016 Hi,
- Stefan Koch (8/18) May 21 2016 I thought about this a bit more and I am more and more convinced
- Walter Bright (3/4) May 21 2016 It won't be reproducible from run to run, and worse, if you use separate...
- Stefan Koch (6/10) May 21 2016 Please elaborate: why wouldn't it be reproducible from run to run?
- Walter Bright (4/12) May 21 2016 Because it is the address of the symbol, and modern operating systems ra...
- Stefan Koch (3/7) May 21 2016 Of course the table would have to be built by the compiler and
- Walter Bright (2/4) May 21 2016 You'd have to build your own linker, too.
- Stefan Koch (2/7) May 21 2016 Not if dmd is used to build the executable.
- Walter Bright (3/10) May 21 2016 Since such a dmd would have to be able to read .o files created by C/C++...
- Stefan Koch (7/19) May 21 2016 If extern(C) or extern(C++) is used we can't do our mangling
- Stefan Koch (8/12) May 21 2016 There will not be duplicates since you would not compile the same
- Walter Bright (2/4) May 21 2016 I've used such for temporaries, but they caused problems and people comp...
- Stefan Koch (5/11) May 21 2016 I see.
Hi,

I just had a nice idea. However, due to my lack of obj-file-format knowledge I don't know how feasible it is.

As far as I can see, identifiers are already in a hashed format while inside the symbol table of the compiler. The idea would be to save a hash table from id to clear-text name or compressed clear-text name inside the object file, and simply mangle the id of the identifier rather than the identifier itself.
May 21 2016
On Saturday, 21 May 2016 at 22:40:44 UTC, Stefan Koch wrote:
> Hi, I just had a nice idea. However, due to my lack of obj-file-format knowledge I don't know how feasible it is. As far as I can see, identifiers are already in a hashed format while inside the symbol table of the compiler. The idea would be to save a hash table from id to clear-text name or compressed clear-text name inside the object file, and simply mangle the id of the identifier rather than the identifier itself.

I thought about this a bit more and I am more and more convinced that it can actually work, since the symbol id per module will be unique. So basically it would go like

_modulename_length%modulename_SymbolID

This way processing time will not be affected, and in the best case it will even be reduced.
May 21 2016
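The scheme proposed in the post above can be sketched roughly as follows. This is only an illustration in Python, not dmd code: the `SymbolTable` class, its method names, and the reading of `_modulename_length%modulename_SymbolID` as `_<length><module>_<id>` are all assumptions made for the example.

```python
class SymbolTable:
    """Hypothetical per-module table mapping numeric IDs to clear-text names."""

    def __init__(self, module_name):
        self.module_name = module_name
        self.names = []  # list index == symbol ID
        self.ids = {}    # clear-text name -> symbol ID

    def intern(self, name):
        # IDs are handed out in insertion order, so feeding the same names
        # in the same order always produces the same IDs.
        if name not in self.ids:
            self.ids[name] = len(self.names)
            self.names.append(name)
        return self.ids[name]

    def mangle(self, name):
        # Assumed layout: "_" + len(module) + module + "_" + symbol ID.
        return "_%d%s_%d" % (
            len(self.module_name), self.module_name, self.intern(name))

    def lookup(self, mangled):
        # Translate a mangled ID back to its clear-text name.
        prefix = "_%d%s_" % (len(self.module_name), self.module_name)
        if not mangled.startswith(prefix):
            raise ValueError("symbol belongs to another module")
        return self.names[int(mangled[len(prefix):])]


table = SymbolTable("std.range")
short = table.mangle("std.range.chain!(int[], int[]).chain")
print(short, "->", table.lookup(short))
```

The mangled form stays short no matter how long the clear-text name is, but it is only meaningful together with the table that maps the ID back.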
On 5/21/2016 3:50 PM, Stefan Koch wrote:
> [...]

It won't be reproducible from run to run, and worse, if you use separate compilation, duplicates are inevitable.
May 21 2016
On Saturday, 21 May 2016 at 22:59:48 UTC, Walter Bright wrote:
> On 5/21/2016 3:50 PM, Stefan Koch wrote:
>> [...]
> It won't be reproducible from run to run, and worse, if you use separate compilation, duplicates are inevitable.

Please elaborate: why wouldn't it be reproducible from run to run? Aren't symbols always inserted in the same order, so that the same source file will always produce the same mangling? And at link time the id-to-identifier translation table would be consulted?
May 21 2016
On 5/21/2016 4:02 PM, Stefan Koch wrote:
> On Saturday, 21 May 2016 at 22:59:48 UTC, Walter Bright wrote:
>> On 5/21/2016 3:50 PM, Stefan Koch wrote:
>>> [...]
>> It won't be reproducible from run to run, and worse, if you use separate compilation, duplicates are inevitable.
> Please elaborate: why wouldn't it be reproducible from run to run?

Because it is the address of the symbol, and modern operating systems randomize the addresses of a loaded program from run to run.

> And at link time the id-to-identifier translation table would be consulted?

There's no such table.
May 21 2016
On Saturday, 21 May 2016 at 23:20:53 UTC, Walter Bright wrote:
> On 5/21/2016 4:02 PM, Stefan Koch wrote:
>> And at link time the id-to-identifier translation table would be consulted?
> There's no such table.

Of course the table would have to be built by the compiler and inserted as data into the object file.
May 21 2016
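As a rough illustration of "inserted as data into the object file", the table could be flattened into a single byte blob that a compiler might place in a data section. The layout below (a little-endian count, then length-prefixed UTF-8 names) and the function names are hypothetical and do not correspond to any existing object-file format or dmd feature.

```python
import struct


def pack_table(names):
    """Serialize a list of names (list index == symbol ID) into one blob."""
    blob = struct.pack("<I", len(names))  # number of entries
    for name in names:
        data = name.encode("utf-8")
        blob += struct.pack("<I", len(data)) + data  # length prefix + bytes
    return blob


def unpack_table(blob):
    """Inverse of pack_table: recover the list of names from the blob."""
    (count,) = struct.unpack_from("<I", blob, 0)
    offset = 4
    names = []
    for _ in range(count):
        (length,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        names.append(blob[offset:offset + length].decode("utf-8"))
        offset += length
    return names


blob = pack_table(["app.main.main", "app.main.helper"])
print(len(blob), unpack_table(blob))
```

Whatever tool resolves the IDs at link time would have to understand this blob, which is the point the following replies argue about.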
On 5/21/2016 4:30 PM, Stefan Koch wrote:
> Of course the table would have to be built by the compiler and inserted as data into the object file.

You'd have to build your own linker, too.
May 21 2016
On Saturday, 21 May 2016 at 23:43:48 UTC, Walter Bright wrote:
> On 5/21/2016 4:30 PM, Stefan Koch wrote:
>> Of course the table would have to be built by the compiler and inserted as data into the object file.
> You'd have to build your own linker, too.

Not if dmd is used to build the executable.
May 21 2016
On 5/21/2016 4:45 PM, Stefan Koch wrote:
> On Saturday, 21 May 2016 at 23:43:48 UTC, Walter Bright wrote:
>> On 5/21/2016 4:30 PM, Stefan Koch wrote:
>>> Of course the table would have to be built by the compiler and inserted as data into the object file.
>> You'd have to build your own linker, too.
> Not if dmd is used to build the executable.

Since such a dmd would have to be able to read .o files created by C/C++, it would be the same thing as building our own linker.
May 21 2016
On Saturday, 21 May 2016 at 23:52:59 UTC, Walter Bright wrote:
> On 5/21/2016 4:45 PM, Stefan Koch wrote:
>> On Saturday, 21 May 2016 at 23:43:48 UTC, Walter Bright wrote:
>>> On 5/21/2016 4:30 PM, Stefan Koch wrote:
>>>> Of course the table would have to be built by the compiler and inserted as data into the object file.
>>> You'd have to build your own linker, too.
>> Not if dmd is used to build the executable.
> Since such a dmd would have to be able to read .o files created by C/C++, it would be the same thing as building our own linker.

If extern(C) or extern(C++) is used we can't do our mangling scheme anyway. So any function that is supposed to be called by C or C++ will still be mangled in a compatible way. That way we can get away with using dmd as a pre-linker and doing the rest of the job with the system linker.
May 21 2016
On Saturday, 21 May 2016 at 22:59:48 UTC, Walter Bright wrote:
> On 5/21/2016 3:50 PM, Stefan Koch wrote:
>> [...]
> It won't be reproducible from run to run, and worse, if you use separate compilation, duplicates are inevitable.

There will not be duplicates, since you would not compile the same module twice, and if you do, it is trivial to remove them. In fact you would have the same duplicates with every mangling scheme. A symbol can be uniquely identified by the module it is defined in and a numerical id. If your module names clash you cannot compile anyway... At least I hope so.
May 21 2016
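The uniqueness argument in the post above can be sketched as follows, again in Python and under stated assumptions: `compile_module` and `link` are hypothetical stand-ins for separate compilation and a link step, and each "object file" is modelled as a plain dictionary from mangled ID to clear-text name.

```python
def compile_module(module_name, symbols):
    """Pretend to compile one module in isolation; IDs restart at 0 each time."""
    return {
        "_%d%s_%d" % (len(module_name), module_name, i): name
        for i, name in enumerate(symbols)
    }


def link(*object_tables):
    """Merge per-object tables, dropping exact duplicates and rejecting clashes."""
    merged = {}
    for table in object_tables:
        for mangled, name in table.items():
            if mangled in merged and merged[mangled] != name:
                raise ValueError("duplicate symbol: " + mangled)
            merged[mangled] = name
    return merged


main_obj = compile_module("app.main", ["main", "helper"])
util_obj = compile_module("app.util", ["helper", "log"])

# Both modules use IDs 0 and 1, but the module prefix keeps them apart.
print(sorted(link(main_obj, util_obj)))
```

The sketch only illustrates the claim as stated in the post; it does not settle whether the separate-compilation concern raised earlier in the thread applies to dmd in practice.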
On 5/21/2016 4:08 PM, Stefan Koch wrote:
> A symbol can be uniquely identified by the module it is defined in and a numerical id.

I've used such for temporaries, but they caused problems and people complained.
May 21 2016
On Saturday, 21 May 2016 at 23:22:22 UTC, Walter Bright wrote:
> On 5/21/2016 4:08 PM, Stefan Koch wrote:
>> A symbol can be uniquely identified by the module it is defined in and a numerical id.
> I've used such for temporaries, but they caused problems and people complained.

I see. But realistically, compression of the symbol name is no different. In fact the hypothetical id-to-name table would just be the external dictionary of a compressor.
May 21 2016
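The comparison with a compressor's external dictionary can be made concrete with zlib's preset-dictionary support (the `zdict` argument in Python's `zlib` module). This is a small sketch, not a proposal for dmd: the symbol names and the helper functions `compress_name` and `decompress_name` are made up for the example.

```python
import zlib

# Hypothetical "external dictionary": names the reader is assumed to know already.
DICTIONARY = (b"std.range.chain "
              b"std.algorithm.iteration.map "
              b"std.algorithm.iteration.filter")


def compress_name(name, dictionary=DICTIONARY):
    # A preset dictionary lets the compressor refer to text the reader already has.
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(name) + comp.flush()


def decompress_name(data, dictionary=DICTIONARY):
    # The same dictionary must be supplied again, or decompression fails.
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(data) + decomp.flush()


name = b"std.algorithm.iteration.filter"
packed = compress_name(name)
print(len(name), len(zlib.compress(name, 9)), len(packed))
assert decompress_name(packed) == name
```

As with the id-to-name table, the short form is useless without the dictionary, so whoever reads the symbol needs access to exactly the same external data.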