digitalmars.D.learn - Executable memory

reply "Alan" <alanpotteiger gmail.com> writes:
Hello! Sorry if I appear to be posting a lot of questions (if you 
saw my LLVM one, thanks again for the help). I'm trying to throw 
some things together and learn a lot.

So I've been researching compilers and virtual machines recently. 
I've managed to implement some fairly good front ends, and for the 
past few weeks I've been looking into various options for back 
ends. Something that really interests me is virtual machine type 
systems.

I looked into things like the JVM and Google's V8 JavaScript 
engine and read that they directly execute native code, which is 
the reason for their speed (at least V8's).
So I did some research and found tons of code snippets and stuff 
for C/C++ (of course), and from that I managed to write this in D:
import std.stdio, core.memory;

void main()
{
	uint* opcodes = cast(uint*)GC.malloc(2);
	opcodes[0] = 0xCC;
	/* System call of some sort */
	/* Function pointer */
	void* delegate() func;
	func.ptr = cast(void*)opcodes;
	/* Call function */
	func();
	GC.free(opcodes);
}

This reserves some memory, places some CPU instructions in it, 
makes a function pointer to that block of memory, calls it, and 
frees the memory. The only problem is that most processors and 
OSes block direct execution of data memory like this; to my 
understanding there is some way to mark this block of memory as 
executable (without doing so there is a segmentation fault). I've 
seen ways to do this in Linux and Windows in C/C++ but I have no 
clue where to start with this in D. If anyone has any ideas for at 
least Linux that would be great, thanks a lot everyone!
Oct 04 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 4 October 2013 at 19:58:54 UTC, Alan wrote:
 fault) I've seen ways to do this in Linux and Windows in C/C++ 
 but I have no clue where to start with this in D.
You can often almost copy+paste code from C into D and get it 
working. My guess is the C examples use mmap on Linux and 
VirtualAlloc on Windows. You can call those same functions from D 
as well. The most you might have to do is copy some declarations 
from C, but these are in druntime so you'll be ok.

For example, a working Linux program would be like this:

import core.sys.posix.sys.mman;

void main()
{
	// the operating system mmap call lets us set the executable flag
	uint* opcodes = cast(uint*) mmap(null, 4096, PROT_EXEC | PROT_WRITE, MAP_PRIVATE | MAP_ANON, 0, 0);
	scope(exit) munmap(opcodes, 4096); // free when we're done - scope(exit) rox
	opcodes[0] = 0xCC;
	// you did a delegate before, but you really should use a function
	// because delegates expect more state that won't really be here
	void* function() func = cast(void* function()) opcodes;
	func();
}

and you get the breakpoint trap when you run it, which is what 
opcode 0xcc is supposed to do.

And a working Windows program would look like this:

import core.sys.windows.windows;

void main()
{
	// VirtualAlloc and mmap are very similar functions...
	uint* opcodes = cast(uint*) VirtualAlloc(null, 4096, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	// with MEM_RELEASE the size argument must be 0
	scope(exit) VirtualFree(opcodes, 0, MEM_RELEASE);
	// and the rest is the same
	opcodes[0] = 0xCC;
	void* function() func = cast(void* function()) opcodes;
	func();
}
Oct 04 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
BTW I'm not sure if GC.malloc supports the executable flag or 
not. A quick search at druntime's source doesn't turn up 
anything, but maybe I missed it.

Regardless though, the operating system functions definitely work 
and knowing about them is useful anyway since it makes using C 
examples easier too.
Oct 04 2013
next sibling parent reply "Alan" <alanpotteiger gmail.com> writes:
Interesting... I was not aware of those functions in the D 
runtime, thanks for the help! Just some simple conditional 
compile statements will probably do the job!
Oct 04 2013
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 4 October 2013 at 20:26:35 UTC, Alan wrote:
 Interesting... I was not aware of those functions in the D 
 runtime
Technically, they're part of the operating system. If druntime 
didn't provide them, you could also just add

// copy pasted from msdn
extern(Windows) LPVOID VirtualAlloc(
	LPVOID lpAddress,
	SIZE_T dwSize,
	DWORD flAllocationType,
	DWORD flProtect
);

or

// copy pasted from the man page
extern(C) void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

to your files (importing the necessary header so the types are 
defined, or defining them yourself too, might take some digging 
through the C header files though) and go ahead and call them 
that way.
 Just some simple conditional compile statements will probably 
 do the job!
yup. Also don't forget error checking: VirtualAlloc returns null on failure, while mmap returns MAP_FAILED (that is, cast(void*) -1) rather than null, so check for the right value (since they are C functions, they won't be throwing exceptions!)
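
For anyone following along, here is a minimal sketch of how the conditional compilation and the error checking could be combined. The names allocExecutable and freeExecutable are made up for this example and are not part of druntime; it also assumes the system allows pages that are writable and executable at the same time, which hardened setups may refuse.

```d
version (Posix) import core.sys.posix.sys.mman;
version (Windows) import core.sys.windows.windows;

// Hypothetical helper (not a druntime function): returns `size` bytes of
// readable, writable and executable memory, or null if the OS refuses.
ubyte[] allocExecutable(size_t size)
{
	version (Posix)
	{
		void* p = mmap(null, size, PROT_READ | PROT_WRITE | PROT_EXEC,
		               MAP_PRIVATE | MAP_ANON, -1, 0);
		if (p == MAP_FAILED) // mmap reports failure with MAP_FAILED, not null
			return null;
	}
	else version (Windows)
	{
		void* p = VirtualAlloc(null, size, MEM_COMMIT | MEM_RESERVE,
		                       PAGE_EXECUTE_READWRITE);
		if (p is null) // VirtualAlloc reports failure with null
			return null;
	}
	else
		static assert(0, "unsupported platform");

	return (cast(ubyte*) p)[0 .. size];
}

// Hypothetical counterpart that gives the memory back to the OS.
void freeExecutable(ubyte[] mem)
{
	if (mem is null)
		return;
	version (Posix)
		munmap(mem.ptr, mem.length);
	else version (Windows)
		VirtualFree(mem.ptr, 0, MEM_RELEASE); // size must be 0 with MEM_RELEASE
}

void main()
{
	auto mem = allocExecutable(4096);
	assert(mem !is null, "could not allocate executable memory");
	scope (exit) freeExecutable(mem);

	mem[0] = 0xC3; // x86 `ret`, stored only to show the page is writable
}
```

Returning a slice instead of a raw pointer keeps the length together with the address, so the same value can later be handed to the free function without repeating the size.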
Oct 04 2013
prev sibling parent reply "Alan" <alanpotteiger gmail.com> writes:
Great! So that's exactly what happens! Does anyone have an 
example of how to maybe print a character to the string with a 
system call? I'm on a 64bit intel pentium and running ubuntu 
linux. Any help is appreciated, thanks again Adam.
Oct 04 2013
next sibling parent "Alan" <alanpotteiger gmail.com> writes:
On Friday, 4 October 2013 at 20:50:09 UTC, Alan wrote:
 Great! So that's exactly what happens! Does anyone have an 
 example of how to maybe print a character to the string with a 
 system call? I'm on a 64bit intel pentium and running ubuntu 
 linux. Any help is appreciated, thanks again Adam.
Sorry, "how to maybe print a character to the string" makes no sense, I don't know why I said that. I meant "how to print a character to stdout" or something similar.
Oct 04 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 4 October 2013 at 20:50:09 UTC, Alan wrote:
 Does anyone have an example of how to maybe print a character 
 to the string with a  system call?
yeah on Linux the assembly is:

string a = "hello!";
auto sptr = a.ptr;
auto slen = a.length;
size_t fd = 1; // stdout

version(D_InlineAsm_X86)
asm { // 32 bit
	mov ECX, sptr;
	mov EDX, slen;
	mov EBX, fd;
	mov EAX, 4; // sys_write
	int 0x80;
}
else version(D_InlineAsm_X86_64)
asm { // 64 bit
	mov RSI, sptr;
	mov RDX, slen;
	mov RDI, fd;
	mov RAX, 1; // sys_write
	syscall;
}

I gotta run, so I'll leave translating that into machine code as 
an exercise for the reader (you could compile it in D then 
objdump it), at least until I get back to the computer :)
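
As a rough take on that exercise, the 64-bit sequence can be hand-encoded into bytes and copied into an executable page. This is only a sketch under stated assumptions: buildWriteStub is a made-up name, the encoding uses the 32-bit forms of mov for the small constants (they zero-extend into the full register), and the run step is limited to x86-64 Linux with a system that permits writable and executable mappings.

```d
version (Posix) import core.sys.posix.sys.mman;

// Hypothetical helper: emits x86-64 Linux machine code equivalent to
//   mov eax, 1; mov edi, 1; mov rsi, msg; mov edx, len; syscall; ret
ubyte[] buildWriteStub(const(char)* msg, uint len)
{
	ubyte[] code = [0xB8, 1, 0, 0, 0,  // mov eax, 1   (sys_write)
	                0xBF, 1, 0, 0, 0,  // mov edi, 1   (stdout)
	                0x48, 0xBE];       // mov rsi, imm64 (address follows)
	auto addr = cast(ulong) msg;
	foreach (i; 0 .. 8)                // 64-bit address, little endian
		code ~= cast(ubyte)(addr >> (8 * i));

	code ~= cast(ubyte) 0xBA;          // mov edx, imm32 (length follows)
	foreach (i; 0 .. 4)                // 32-bit length, little endian
		code ~= cast(ubyte)(len >> (8 * i));

	static immutable ubyte[] tail = [0x0F, 0x05, 0xC3]; // syscall; ret
	code ~= tail;
	return code;
}

void main()
{
	version (linux) version (X86_64)
	{
		static immutable msg = "hello!\n";
		auto code = buildWriteStub(msg.ptr, cast(uint) msg.length);

		void* mem = mmap(null, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
		                 MAP_PRIVATE | MAP_ANON, -1, 0);
		if (mem == MAP_FAILED)
			return;
		scope (exit) munmap(mem, 4096);

		// copy the generated bytes into the executable page and call them
		(cast(ubyte*) mem)[0 .. code.length] = code[];
		alias Fn = extern(C) void function();
		(cast(Fn) mem)();
	}
}
```

Because the string address is embedded as an absolute 64-bit immediate, the stub does not depend on where it ends up in memory, which sidesteps relocation problems that copied compiler output can run into.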
Oct 04 2013
next sibling parent "Alan" <alanpotteiger gmail.com> writes:
On Friday, 4 October 2013 at 21:10:30 UTC, Adam D. Ruppe wrote:
Alright, thanks for the help Adam!
Oct 04 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
I'm just playing at this point and I'm pretty sure these hacks 
won't quite work or might even be kinda useless... but one way to 
avoid the hassle of making the machine code yourself is to get 
the compiler to do it.

So we'll write our function (just 32 bit here, the 64 bit didn't 
work and I'm not sure why, could be because of this 
http://stackoverflow.com/a/6313264/1457000 ) in D then copy it 
into the magic memory:

void sayHello() {
	asm {
		naked;
	}
	static immutable hello = "hello!\n";
	auto sptr = hello.ptr;
	auto slen = hello.length;
	version(D_InlineAsm_X86)
	asm { // 32 bit
		mov ECX, sptr;
		mov EDX, slen;
		mov EBX, 1; // stdout
		mov EAX, 4; // sys_write
		int 0x80;
	}

	asm {
		ret;
	}
}
// I'm using this with the assumption that the two functions will be
// right next to each other in memory, thus sayHello's code goes from
// &sayHello .. &sayHelloEnd. idk if that is really true but it seems
// to work for me
void sayHelloEnd(){}

import core.sys.posix.sys.mman;

void main()
{
	// NOTE: you did uint* before, now i'm doing ubyte* since we're
	// really working in bytes, not ints
	auto mem = mmap(null, 4096, PROT_EXEC | PROT_WRITE, MAP_PRIVATE | MAP_ANON, 0, 0);
	assert(mem != MAP_FAILED); // mmap reports failure with MAP_FAILED, not null
	ubyte[] opcodes = (cast(ubyte*) mem)[0 .. 4096];
	scope(exit)
		munmap(opcodes.ptr, 4096);

	// copy the code from our sayHello function that dmd compiled
	// for us into the executable memory area
	auto f = cast(ubyte*) &sayHello;
	auto fe = cast(ubyte*) &sayHelloEnd;
	auto len = fe - f;
	import std.stdio;
	writeln("length: ", len); // shouldn't be very large
	opcodes[0 .. len] = f[0 .. len]; // copy it over
	void* function() func = cast(void* function()) opcodes.ptr;
	func(); // and run it

	writeln("ending function normally!");

	// then write over it to prove we still can...
	opcodes[0] = 0xcc;
	func(); // this should trap
}




Perhaps you could string together a bunch of little functions 
written in D and inline asm to build your executable code most 
easily using tricks like this. Assuming it continues to work in 
more complex situations!
Oct 04 2013
next sibling parent "Alan" <alanpotteiger gmail.com> writes:
On Friday, 4 October 2013 at 22:00:36 UTC, Adam D. Ruppe wrote:
 I'm just playing at this point and I'm pretty sure these hacks 
 won't quite work or might even be kinda useless... but one way 
 to avoid the hassle of making the machine code yourself is to 
 get the compiler to do it.

Hmm interesting... I'll have to work with it some! I get a 
segmentation fault with this code though :| It outputs:

length: 24
Segmentation fault (core dumped)

Thanks for the help though!
Oct 04 2013