www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Distributed Memory implementation

reply tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:
I, due to a need, will start implementation of distributed memory 
system.

Idea is that:

Let's say you have allocated 1 GiB space in memory. This memory 
is blocked into 4 KiB.

After some reservation, and free operations, now only the blocks 
0, 12, and 13 are free to be allocated.

Problem is that those memory blocks are not consecutive.

With the implementation of distributed memory, those blocks will 
be appended into a struct or object (however it is implemented), 
and that struct or object will allow access to that 12 KiB of 
space (3 x 4 KiB blocks) like it is allocated in memory 
consecutively.

Is there anything like this in Phobos, or shall I start my own 
implementation?
Jan 17 2016
next sibling parent reply Nemanja Boric <4burgos gmail.com> writes:
On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 I, due to a need, will start implementation of distributed 
 memory system.

 Idea is that:

 Let's say you have allocated 1 GiB space in memory. This memory 
 is blocked into 4 KiB.

 After some reservation, and free operations, now only the 
 blocks 0, 12, and 13 are free to be allocated.

 Problem is that those memory blocks are not consecutive.

 With the implementation of distributed memory, those blocks 
 will be appended into a struct or object (however it is 
 implemented), and that struct or object will allow access to 
 that 12 KiB of space (3 x 4 KiB blocks) like it is allocated in 
 memory consecutively.

 Is there anything like this in Phobos, or shall I start my own 
 implementation?
Check https://dlang.org/phobos/std_experimental_allocator.html
Jan 18 2016
parent reply tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:
On Monday, 18 January 2016 at 08:12:03 UTC, Nemanja Boric wrote:
 On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 [...]
Check https://dlang.org/phobos/std_experimental_allocator.html
Which part of this module provide the functionality of using non-consecutive memory(distributed) blocks like they are consecutive?
Jan 18 2016
parent reply Nemanja Boric <4burgos gmail.com> writes:
On Monday, 18 January 2016 at 09:28:59 UTC, tcak wrote:
 On Monday, 18 January 2016 at 08:12:03 UTC, Nemanja Boric wrote:
 On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 [...]
Check https://dlang.org/phobos/std_experimental_allocator.html
Which part of this module provide the functionality of using non-consecutive memory(distributed) blocks like they are consecutive?
IIRC, none of them, sorry, but if you're going to implement it, my guess it that it should be compatible with `std.experimental.allocator`.
Jan 18 2016
parent Xinok <xinok live.com> writes:
On Monday, 18 January 2016 at 11:46:36 UTC, Nemanja Boric wrote:
 On Monday, 18 January 2016 at 09:28:59 UTC, tcak wrote:
 On Monday, 18 January 2016 at 08:12:03 UTC, Nemanja Boric 
 wrote:
 Check https://dlang.org/phobos/std_experimental_allocator.html
Which part of this module provide the functionality of using non-consecutive memory(distributed) blocks like they are consecutive?
IIRC, none of them, sorry, but if you're going to implement it, my guess it that it should be compatible with `std.experimental.allocator`.
Allocators work with void[] arrays which require memory to be contiguous. This idea of fragmented memory simply isn't compatible with allocators. Such a library could provide a range interface but it would have to be wrapped in a container of some sort. It could potentially be a candidate for std.container though.
Jan 18 2016
prev sibling next sibling parent reply Adrian Matoga <epi atari8.info> writes:
On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 I, due to a need, will start implementation of distributed 
 memory system.

 Idea is that:

 Let's say you have allocated 1 GiB space in memory. This memory 
 is blocked into 4 KiB.

 After some reservation, and free operations, now only the 
 blocks 0, 12, and 13 are free to be allocated.

 Problem is that those memory blocks are not consecutive.

 With the implementation of distributed memory, those blocks 
 will be appended into a struct or object (however it is 
 implemented), and that struct or object will allow access to 
 that 12 KiB of space (3 x 4 KiB blocks) like it is allocated in 
 memory consecutively.

 Is there anything like this in Phobos, or shall I start my own 
 implementation?
Just a note about terminology: I'd rather call it "fragmented memory" or something alike. The term "distributed memory" usually refers to something really different [1]. Your idea seems interesting, but IMHO a compacting GC should be the preferred solution for heap fragmentation. [1] https://en.wikipedia.org/wiki/Distributed_memory
Jan 18 2016
next sibling parent tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:
On Monday, 18 January 2016 at 09:56:17 UTC, Adrian Matoga wrote:
 On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 I, due to a need, will start implementation of distributed 
 memory system.

 Idea is that:

 Let's say you have allocated 1 GiB space in memory. This 
 memory is blocked into 4 KiB.

 After some reservation, and free operations, now only the 
 blocks 0, 12, and 13 are free to be allocated.

 Problem is that those memory blocks are not consecutive.

 With the implementation of distributed memory, those blocks 
 will be appended into a struct or object (however it is 
 implemented), and that struct or object will allow access to 
 that 12 KiB of space (3 x 4 KiB blocks) like it is allocated 
 in memory consecutively.

 Is there anything like this in Phobos, or shall I start my own 
 implementation?
Just a note about terminology: I'd rather call it "fragmented memory" or something alike. The term "distributed memory" usually refers to something really different [1]. Your idea seems interesting, but IMHO a compacting GC should be the preferred solution for heap fragmentation. [1] https://en.wikipedia.org/wiki/Distributed_memory
I was trying to remember the Windows program that is about disk clusters "De...". Couldn't remember. It was Defragmentation. Yes, Fragmented Memory is much better. It is definitely not that distributed memory thing. In my use case, a server program reserves a part of shared memory to client application, then when the client program is done with that memory, it releases them. So, making it dependent on GC might not be good. But I am sure, many use cases can be found that fits to GC.
Jan 18 2016
prev sibling parent Xinok <xinok live.com> writes:
On Monday, 18 January 2016 at 09:56:17 UTC, Adrian Matoga wrote:
 ...
 Your idea seems interesting, but IMHO a compacting GC should be 
 the preferred solution for heap fragmentation.
Implementing a compacting GC in D would be exceedingly difficult, if not impossible, because of raw pointers, unions, non-GC memory, linked C/C++ code, and possibly other reasons. Other languages, such as Java, have a much simpler model that can be dealt with. There's simply too many caveats in D and any such benefits would be minimal at best.
Jan 18 2016
prev sibling next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 Is there anything like this in Phobos, or shall I start my own 
 implementation?
It isn't really clear to me what you are trying to do. IIRC the C++ deque is usually implemented as an array of pointers to fixed sized memory blocks. Is that what you are trying to achieve?
Jan 19 2016
next sibling parent reply tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:
On Tuesday, 19 January 2016 at 10:09:01 UTC, Ola Fosheim Grøstad 
wrote:
 On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 Is there anything like this in Phobos, or shall I start my own 
 implementation?
It isn't really clear to me what you are trying to do. IIRC the C++ deque is usually implemented as an array of pointers to fixed sized memory blocks. Is that what you are trying to achieve?
First of all, I have started implementation, and a part of implementation is completed. For your question, I can explain it as follows: Example I have 3 memory blocks. Memory block 1: Ptr = 0x1000, Len = 100 Memory block 2: Ptr = 0x2000, Len = 5 Memory block 3: Ptr = 0x3000, Len = 150 Into the class FragmentedMemory, you append those three memory blocks. Then it provides you an interface like those three memory blocks are consecutive. Let's say: fragmem[ 125 ] = 5; Because the index 125 is in memory block 3, the value is written to memoryBlock3[ 20 ]; (125 = 100 + 5 + 20 ) Reading is in the same way. I think you can think about use cases on your side. Currently, set, get, append operations are completed. I will implement bulk memory copy and assign operations as well.
Jan 19 2016
parent Dragos Carp <dragoscarp gmail.com> writes:
On Tuesday, 19 January 2016 at 11:56:38 UTC, tcak wrote:
 On Tuesday, 19 January 2016 at 10:09:01 UTC, Ola Fosheim 
 Grøstad wrote:
 On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 Is there anything like this in Phobos, or shall I start my 
 own implementation?
It isn't really clear to me what you are trying to do. IIRC the C++ deque is usually implemented as an array of pointers to fixed sized memory blocks. Is that what you are trying to achieve?
First of all, I have started implementation, and a part of implementation is completed. For your question, I can explain it as follows: Example I have 3 memory blocks. Memory block 1: Ptr = 0x1000, Len = 100 Memory block 2: Ptr = 0x2000, Len = 5 Memory block 3: Ptr = 0x3000, Len = 150 Into the class FragmentedMemory, you append those three memory blocks. Then it provides you an interface like those three memory blocks are consecutive. Let's say: fragmem[ 125 ] = 5; Because the index 125 is in memory block 3, the value is written to memoryBlock3[ 20 ]; (125 = 100 + 5 + 20 ) Reading is in the same way. I think you can think about use cases on your side. Currently, set, get, append operations are completed. I will implement bulk memory copy and assign operations as well.
If the number of blocks is known at compile time use chain [1], otherwise use joiner [2]. At the moment joiner does not provide opIndex, opSlice, length and such, but this can be fixed. [2] http://dlang.org/phobos/std_algorithm_iteration.html#joiner
Jan 19 2016
prev sibling parent Chris Wright <dhasenan gmail.com> writes:
On Tue, 19 Jan 2016 10:09:01 +0000, Ola Fosheim Grøstad wrote:

 On Monday, 18 January 2016 at 05:59:15 UTC, tcak wrote:
 Is there anything like this in Phobos, or shall I start my own
 implementation?
It isn't really clear to me what you are trying to do. IIRC the C++ deque is usually implemented as an array of pointers to fixed sized memory blocks. Is that what you are trying to achieve?
It's a rope-style data structure, but for void[] rather than strings. https://en.wikipedia.org/wiki/Rope_(data_structure) An interesting property of this is that, while you can issue pointers to memory within this rope, it takes caution to ensure that a pointer remains within the rope. Instead of just incrementing the pointer, you would need to call a function on the rope to ensure that it promotes the pointer to the next memory segment as appropriate. This also means you can't store a multibyte object in the rope and use it in place. (Consider: the first block is one byte, the second is seven bytes; you store a ulong in the rope.) You must always copy it out, use it, and optionally store it back. A friendly API would have convenience methods for mutating structured data within the rope. This is why just using joiner isn't that friendly, even if joiner did have opSlice, opIndex, etc.
Jan 19 2016
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Mon, 18 Jan 2016 05:59:15 +0000
schrieb tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com>:

 I, due to a need, will start implementation of distributed memory 
 system.
 
 Idea is that:
 
 Let's say you have allocated 1 GiB space in memory. This memory 
 is blocked into 4 KiB.
 
 After some reservation, and free operations, now only the blocks 
 0, 12, and 13 are free to be allocated.
 
 Problem is that those memory blocks are not consecutive.
Well ... there is a way that is a bit arcane. If you don't need a lot of such remappings and you use page sized blocks anyways, you could remap the scattered virtual memory to a new consecutive range and be done. Basically you reserve a new uncommitted (e.g. not backed by physical memory) virtual memory range, remap your scattered blocks into that range in the desired order and then release the original blocks. I used this technique in a circular buffer implementation, mirroring the start of the buffer to the end, thus avoiding the wrapping problems that normally occur when accessing data close to the end in a traditional circular buffer. Caveats: You'd need to allocate/free virtual memory directly in multiples of the allocation granularity (which is the same as the page size on POSIX) and you put some stress on the virtual memory system. A process may have at most 2^16 memory mappings at a time on my Linux system. -- Marco
Jan 19 2016
parent tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:
On Wednesday, 20 January 2016 at 07:58:20 UTC, Marco Leise wrote:
 Am Mon, 18 Jan 2016 05:59:15 +0000
 schrieb tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com>:

 I, due to a need, will start implementation of distributed 
 memory system.
 
 Idea is that:
 
 Let's say you have allocated 1 GiB space in memory. This 
 memory is blocked into 4 KiB.
 
 After some reservation, and free operations, now only the 
 blocks 0, 12, and 13 are free to be allocated.
 
 Problem is that those memory blocks are not consecutive.
Well ... there is a way that is a bit arcane. If you don't need a lot of such remappings and you use page sized blocks anyways, you could remap the scattered virtual memory to a new consecutive range and be done. Basically you reserve a new uncommitted (e.g. not backed by physical memory) virtual memory range, remap your scattered blocks into that range in the desired order and then release the original blocks. I used this technique in a circular buffer implementation, mirroring the start of the buffer to the end, thus avoiding the wrapping problems that normally occur when accessing data close to the end in a traditional circular buffer. Caveats: You'd need to allocate/free virtual memory directly in multiples of the allocation granularity (which is the same as the page size on POSIX) and you put some stress on the virtual memory system. A process may have at most 2^16 memory mappings at a time on my Linux system.
What you are saying is true, and before starting the implementation, it was in my mind, but it doesn't fit to shared memory structure. If very big part of memory is already used, there won't be any more space to allocate a consecutive memory to bring the content of non-consecutive memory blocks. Anyway, I am done with the implementation. At least implemented as "shared" class. I am aware that, because of calculations, it won't be as fast as linear memory access, but solve the problem at least. I am writing unittest right now.
Jan 20 2016