digitalmars.D - AddressSanitizer and the GC
- Johan Engelen (41/41) Jul 21 2017 Hi all,
- Walter Bright (15/15) Jul 21 2017 On 7/21/2017 12:06 PM, Johan Engelen wrote:
- Johan Engelen (19/38) Jul 21 2017 ARM's toolchain has something that does this:
- Johan Engelen (4/19) Jul 21 2017 Quick extra note: the function pointer solution is not friendly
- Walter Bright (8/11) Jul 21 2017 This code should be version'd with:
- Walter Bright (6/7) Jul 21 2017 Offhand, I don't know. It's been maybe 15 years since I worked on it, an...
- Walter Bright (3/5) Jul 21 2017 If anyone is game, it would be cool to do an LLVM compile of DMD so
- Johan Engelen (5/10) Jul 21 2017 Yep, I've got fuzzing to work too, so... expect nice blogs
- Walter Bright (2/10) Jul 21 2017 Pretty dazz! Please continue!
Hi all, I've added AddressSanitizer [1] support to LDC, and it is able to nicely catch bugs like this one: ``` void foo(int* arr) { arr[10] = 1; } void main() { int[10] a; foo(&a[0]); } ``` It works for malloc'ed memory, but not yet for GC'd memory. Our GC pool memory is malloc'ed, so it is "safe" to access from ASan's current point of view. To make it work with our GC, we have to mark all GC pool memory (malloc'ed) as poisoned, such that access to that memory will trigger an ASan error. Then when a user asks memory from the pool (e.g. `new int`), we should allocate a little more GC'd memory than what the user requests (red zone), and then mark only the user size as unpoisoned. By allocating a little more, we make sure that user allocated memory blocks are separated by at least an amount of red zone that helps detect read/write memory overflows. Pieces of the solution: a. Call __asan_poison_memory_region on all memory internally allocated by the GC b. ptr = GC.alloc(user_request_size+redzone) c. Call __asan_unpoison_memory_region on ptr[0..user_request_size] I've already been toying with several options but would like your opinion: what is the best way for me to add this to the GC? 1. What are the best places to modify the conservative GC with the above pieces? 2. Should I create a new GC type "asan"? We now have "conservative" (default) and "manual". Adding an option to the default GC is made hard because of linking problems (need to link with asan library, but I can define weak function stubs), but I am worried about slow down in non-asan mode (which is the 99.9999999% use case). Thanks, Johan [1] https://github.com/google/sanitizers/wiki/AddressSanitizer
Jul 21 2017
On 7/21/2017 12:06 PM, Johan Engelen wrote: < [...] Thanks for your work on this. It is important to support such valuable tools. One way making such things pretty much cost-free in the olden days was to trick the linker into writing NOPs over the function calls by using specially crafted fixup records. This worked with dumb linkers, but today's linkers do much more semantic inference into what's going on, I'm afraid this may be too brittle. Unless there is a linker feature to support it? Are some linkers able to NOP out a function call if it sees that the function body consists solely of a RET? Another option is to use a pointer to the asan functions, like: if (asan_fp) (*asan_fp)(args...) The asan_fp can then be set to NULL and there's only the overhead of a compare/branch rather than a function call. Well, also the load of a global variable.
Jul 21 2017
On Friday, 21 July 2017 at 20:53:18 UTC, Walter Bright wrote:On 7/21/2017 12:06 PM, Johan Engelen wrote: < [...] Thanks for your work on this. It is important to support such valuable tools. One way making such things pretty much cost-free in the olden days was to trick the linker into writing NOPs over the function calls by using specially crafted fixup records. This worked with dumb linkers, but today's linkers do much more semantic inference into what's going on, I'm afraid this may be too brittle. Unless there is a linker feature to support it? Are some linkers able to NOP out a function call if it sees that the function body consists solely of a RET?ARM's toolchain has something that does this: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14171.html Can't find something like it for other toolchains.Another option is to use a pointer to the asan functions, like: if (asan_fp) (*asan_fp)(args...) The asan_fp can then be set to NULL and there's only the overhead of a compare/branch rather than a function call. Well, also the load of a global variable.I can mark the if-clause as unlikely, so the performance difference would be small (allocation is slow already, of course). Let me try that for now. We can decide later, let's first find an implementation that works and test that the whole idea is valid and I didn't miss anything important. Can you advise me on where best to add the asan calls? Thanks, Johan PS. The interface is easy, I don't need much info at the call sites: // Poisons memory region [addr, addr+size) void __asan_poison_memory_region(const(void*) addr, size_t size); // Unpoisons memory region [addr, addr+size) void __asan_unpoison_memory_region(const(void*) addr, size_t size);
Jul 21 2017
On Friday, 21 July 2017 at 21:37:39 UTC, Johan Engelen wrote:On Friday, 21 July 2017 at 20:53:18 UTC, Walter Bright wrote:Quick extra note: the function pointer solution is not friendly to LTO, whereas a weak-linking solution would easily be optimized-out fully with LTO. So that's something to consider too.On 7/21/2017 12:06 PM, Johan Engelen wrote: < [...] Another option is to use a pointer to the asan functions, like: if (asan_fp) (*asan_fp)(args...) The asan_fp can then be set to NULL and there's only the overhead of a compare/branch rather than a function call. Well, also the load of a global variable.I can mark the if-clause as unlikely, so the performance difference would be small (allocation is slow already, of course).
Jul 21 2017
On 7/21/2017 2:39 PM, Johan Engelen wrote:Quick extra note: the function pointer solution is not friendly to LTO, whereas a weak-linking solution would easily be optimized-out fully with LTO. So that's something to consider too.This code should be version'd with: version (AddressSanitizer) so that: 1. it will be easy to try different methods 2. all the different places it is used will be easy to find 3. if it does make things too slow, it can be easily 'removed' at the option of whoever built druntime
Jul 21 2017
On 7/21/2017 2:37 PM, Johan Engelen wrote:Can you advise me on where best to add the asan calls?Offhand, I don't know. It's been maybe 15 years since I worked on it, and a lot of people have moved things about since then. Dmitry Olshansky has worked on it very recently, in fact he is working on it at the moment. He might be a good person to ask and can likely give you a better answer than I can.
Jul 21 2017
On 7/21/2017 12:06 PM, Johan Engelen wrote:I've added AddressSanitizer [1] support to LDC, and it is able to nicely catch bugs like this one:If anyone is game, it would be cool to do an LLVM compile of DMD so AddressSanitizer can look for bugs in DMD itself.
Jul 21 2017
On Friday, 21 July 2017 at 20:56:02 UTC, Walter Bright wrote:On 7/21/2017 12:06 PM, Johan Engelen wrote:Yep, I've got fuzzing to work too, so... expect nice blogs "soon". ;-) (btw, the work is because of wanting fuzz testing at Weka, so that's my stress test for this work)I've added AddressSanitizer [1] support to LDC, and it is able to nicely catch bugs like this one:If anyone is game, it would be cool to do an LLVM compile of DMD so AddressSanitizer can look for bugs in DMD itself.
Jul 21 2017
On 7/21/2017 2:14 PM, Johan Engelen wrote:On Friday, 21 July 2017 at 20:56:02 UTC, Walter Bright wrote:Pretty dazz! Please continue!If anyone is game, it would be cool to do an LLVM compile of DMD so AddressSanitizer can look for bugs in DMD itself.Yep, I've got fuzzing to work too, so... expect nice blogs "soon". ;-) (btw, the work is because of wanting fuzz testing at Weka, so that's my stress test for this work)
Jul 21 2017