digitalmars.D - Small Buffer Optimization for string and friends
- Andrei Alexandrescu (18/18) Apr 07 2012 Walter and I discussed today about using the small string optimization
- Daniel Murphy (9/21) Apr 07 2012 This sounds like it would be a great addition to phobos.
- Andrei Alexandrescu (9/14) Apr 08 2012 Probably not.
- Walter Bright (2/5) Apr 18 2018 I don't know of a good generic way to do precise collection with unions.
- Andrej Mitrovic (5/6) Apr 08 2012 I just hope that doesn't cause:
- Andrei Alexandrescu (9/15) Apr 08 2012 Walter and I agree that relying on sheer D for instead of
- Vladimir Panteleev (6/12) Apr 08 2012 Don't use the first byte. Use the last byte.
- Manu (2/14) Apr 08 2012 What is the plan for 32bit?
- Andrei Alexandrescu (4/17) Apr 08 2012 We can experiment with making strings shorter than 8 chars in-situ. The
- Manu (10/36) Apr 08 2012 29 bits? ...not 31?
- Andrei Alexandrescu (9/18) Apr 08 2012 Essentially it will use either the first or the last bit of the
- deadalnix (2/27) Apr 10 2012 As it is a flag, why not limit the string size to 2GB instead of 512MB ?
- Vladimir Panteleev (10/24) Apr 08 2012 Erm... never mind, I thought the pointer was the first field.
- Andrei Alexandrescu (5/28) Apr 08 2012 It's amazing how fast shift works :o).
- Walter Bright (4/5) Apr 08 2012 That could get expensive. You cannot just point into the small string pa...
- Andrei Alexandrescu (5/10) Apr 08 2012 As I mentioned, the first call to .ptr changes representation, thus
- Michel Fortin (12/23) Apr 08 2012 This only works if the code has write access to that variable.
- Andrei Alexandrescu (5/16) Apr 08 2012 I think there's a bit of a confusion. Could you please clarify with an
- Andrei Alexandrescu (11/27) Apr 08 2012 I think I understand:
- Michel Fortin (20/42) Apr 08 2012 Exactly.
- Manu (15/33) Apr 08 2012 The only way I can see this working is if the marker happens to be the t...
- Andrei Alexandrescu (7/14) Apr 08 2012 That does happen, but much more rarely than one might think.
- Walter Bright (3/9) Apr 19 2018 Warp (a fast C preprocessor)
- Manu (3/13) Apr 19 2018 Cool story... 6 years later! :P
- Andrei Alexandrescu (3/14) Apr 08 2012 Hehe. Good idea. On big endian machines the last byte is the ticket!
- Andrei Alexandrescu (2/17) Apr 08 2012 s/big/little/
- Patrick Schluter (6/20) Apr 15 2018 If the length has multi purpose it would be even better to
- Jacob Carlborg (4/7) Apr 08 2012 Just don't make the same mistake as with AA.
- Andrei Alexandrescu (4/9) Apr 08 2012 The mistake with AAs was done long ago, but it was forced as AAs
- Jacob Carlborg (5/16) Apr 08 2012 I'm referring to the new template implementation of AAs that got
- H. S. Teoh (9/21) Apr 08 2012 [...]
- H. S. Teoh (49/51) Apr 08 2012 [...]
- Tove (3/74) Apr 08 2012 doesn't this work?
- H. S. Teoh (21/50) Apr 08 2012 [...]
- Andrei Alexandrescu (5/8) Apr 08 2012 Thanks for the update! Let me reiterate this is important work for many
- H. S. Teoh (10/20) Apr 08 2012 [...]
- Andrei Alexandrescu (3/11) Apr 08 2012 I wonder how frequently such code is used.
- Michel Fortin (41/52) Apr 08 2012 Small buffer optimization is a very good thing to have indeed. But… ho...
- Andrei Alexandrescu (4/9) Apr 08 2012 Taking .ptr will engender a copy. A small regression will be that
- Michel Fortin (20/30) Apr 08 2012 You know, many people have been wary of hidden memory allocations in
- Andrei Alexandrescu (9/36) Apr 08 2012 Well, the optimization makes for fewer allocations total. In fact .ptr
- Somedude (11/40) Apr 08 2012 You've raised some very good points. I also think this won't do.
- Don (3/6) Apr 09 2012 vote -= real.infinity.
- Timon Gehr (7/13) Apr 09 2012 Why does this even compile?
- Manu (13/20) Apr 09 2012 How do you figure?
- Andrei Alexandrescu (9/20) Apr 09 2012 I agree. So we have the counterarguments:
- Jakob Ovrum (13/17) Apr 09 2012 It's important to note that this pattern is probably most common
- Andrej Mitrovic (10/13) Apr 09 2012 Yup. E.g. WinAPI text drawing functions take a wchar* and a length. I
- Jakob Ovrum (16/25) Apr 09 2012 You're right, I just confirmed the optimization is still in place
- Andrej Mitrovic (4/8) Apr 09 2012 Or add a compile-time argument:
- Manu (15/35) Apr 09 2012 Indeed. I don't think the small array optimisation would benefit us in a...
- Nick Sabalausky (9/16) Apr 09 2012 So in other words, we need the @forceinline that some people have strong...
- Artur Skawina (13/36) Apr 10 2012 Obviously, yes, but should wait until enough attribute support is in pla...
- Marco Leise (5/7) Apr 10 2012 If you refer to the proposed user attributes, they wont change the opera...
- Artur Skawina (7/14) Apr 10 2012 I'm saying that introducing new function attributes like @inline to the ...
- Marco Leise (5/12) Apr 10 2012 I had to read up on your older posts again. So you are not expecting com...
- Andrei Alexandrescu (3/9) Apr 09 2012 Why?
- Steven Schveighoffer (9/25) Apr 09 2012 No, this would suck.
- Per Nordlöw (6/7) Apr 15 2018 Have anybody put together code that implements this idea in a
- Per Nordlöw (181/184) Apr 17 2018 I put together SSOString at
Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.

It turns out statistically a lot of strings are small. According to a variety of systems we use at Facebook, the small buffer optimization is king - it just works great in all cases. In D that means better speed, better locality, and less garbage.

For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:

1. What happened to the new hash project? We need to take that to completion.

2. Is anyone willing to start the effort of migrating built-in slices into templates?

Thanks,

Andrei
Apr 07 2012
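The 16-byte layout being proposed can be pictured roughly as follows, assuming a little-endian 64-bit target and assuming the discriminator bit is carved out of the length word; the struct and field names below are illustrative only, not part of any actual proposal:

    // Illustrative only: one possible way to overlay a small-string
    // representation on the usual (length, ptr) slice layout.
    struct SmallString
    {
        union
        {
            struct // heap-backed representation
            {
                size_t lengthAndFlag; // lowest bit 0 => heap-backed
                immutable(char)* data;
            }
            struct // in-situ representation (little-endian assumption)
            {
                ubyte tag;    // lowest bit 1 => small, remaining bits = length
                char[15] buf; // up to 15 chars stored inline, no allocation
            }
        }

        bool isSmall() const { return (tag & 1) != 0; }

        size_t length() const
        {
            // Small strings keep their length in the tag; large strings
            // shift the flag bit away.
            return isSmall() ? (tag >> 1) : (lengthAndFlag >> 1);
        }
    }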
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:jlr9ak$28bv$1 digitalmars.com...Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. It turns out statistically a lot of strings are small. According to a variety of systems we use at Facebook, the small buffer optimization is king - it just works great in all cases. In D that means better speed, better locality, and less garbage.This sounds like it would be a great addition to phobos.For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to.- This has been a disaster for AAs - Is it worth doing for 32 bit? - Would generate false pointers - Run-time check on every array access? - Why should this be in the language/compiler instead of phobos? April 1st was last week!!
Apr 07 2012
On 4/8/12 1:33 AM, Daniel Murphy wrote:

> - This has been a disaster for AAs

Making them "magic" has been the problem. We must revert that.

> - Is it worth doing for 32 bit?

Probably not.

> - Would generate false pointers

Fair point but we're also moving to precise collection :o).

> - Run-time check on every array access?

Cost is negligible in most cases according to the extensive measurements we've done.

> - Why should this be in the language/compiler instead of phobos?

People use built-in immutable(char)[] as strings. We want to benefit existing uses.

Andrei
Apr 08 2012
On 4/8/2012 7:29 AM, Andrei Alexandrescu wrote:On 4/8/12 1:33 AM, Daniel Murphy wrote:I don't know of a good generic way to do precise collection with unions.- Would generate false pointersFair point but we're also moving to precise collection :o).
Apr 18 2018
On Wed., 18 Apr. 2018, 1:00 pm Walter Bright via Digitalmars-d, < digitalmars-d puremagic.com> wrote:On 4/8/2012 7:29 AM, Andrei Alexandrescu wrote:I wonder if precise collectors could leverage runtime support for ambiguous cases? opPreciseCollect() which might return an array of pointers contained in T, which would allow runtime logic to determine how the union should be interpreted... Or maybe the function should receive a delegate which the function should call on each embedded pointer. I'm sure some standardised runtime support function can help out in these cases...On 4/8/12 1:33 AM, Daniel Murphy wrote:I don't know of a good generic way to do precise collection with unions.- Would generate false pointersFair point but we're also moving to precise collection :o).
Apr 18 2018
On Wed., 18 Apr. 2018, 8:36 pm Manu, <turkeyman gmail.com> wrote:On Wed., 18 Apr. 2018, 1:00 pm Walter Bright via Digitalmars-d, < digitalmars-d puremagic.com> wrote:This would be useful too for applications that use bit-packed or encoded/implied pointers...On 4/8/2012 7:29 AM, Andrei Alexandrescu wrote:I wonder if precise collectors could leverage runtime support for ambiguous cases? opPreciseCollect() which might return an array of pointers contained in T, which would allow runtime logic to determine how the union should be interpreted... Or maybe the function should receive a delegate which the function should call on each embedded pointer. I'm sure some standardised runtime support function can help out in these cases...On 4/8/12 1:33 AM, Daniel Murphy wrote:I don't know of a good generic way to do precise collection with unions.- Would generate false pointersFair point but we're also moving to precise collection :o).
Apr 18 2018
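Such a hook is purely hypothetical - nothing like opPreciseCollect exists in druntime - but the idea Manu sketches might look along these lines for the small-string union discussed in this thread:

    // Hypothetical: a type containing a union tells the collector which
    // interpretation is live by visiting its pointers through a delegate.
    struct SmallString
    {
        union
        {
            struct { size_t lengthAndFlag; immutable(char)* data; }
            struct { ubyte tag; char[15] buf; }
        }

        // Not a real druntime interface - an illustration of the suggestion.
        void opPreciseCollect(scope void delegate(void*) markPointer) const
        {
            if ((tag & 1) == 0)              // heap-backed: data is a real pointer
                markPointer(cast(void*) data);
            // in-situ: nothing to mark, the bytes are stored inline
        }
    }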
On 4/8/12, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

> essentially making them templates

I just hope that doesn't cause:

1) Awful template errors
2) Slower build times
3) More ICEs
Apr 08 2012
On 4/8/12 4:30 AM, Andrej Mitrovic wrote:On 4/8/12, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:Walter and I agree that relying on sheer D for instead of compiler-generated magic (in this case, bitmaps etc) is the better solution. Implementing a generic marker as a template using introspection is very simple - I predict a few dozen lines. Once that is finished there will obviously be no template errors. I don't know whether build times would be impacted. There will be fewer ICEs because there will be less reliance on the compiler. Andreiessentially making them templatesI just hope that doesn't cause: 1) Awful template errors 2) Slower build times 3) More ICEs
Apr 08 2012
On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 08 2012
On 8 April 2012 12:46, Vladimir Panteleev <vladimir thecybershadow.net>wrote:On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:What is the plan for 32bit?Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 08 2012
On 4/8/12 4:54 AM, Manu wrote:On 8 April 2012 12:46, Vladimir Panteleev <vladimir thecybershadow.net <mailto:vladimir thecybershadow.net>> wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. What is the plan for 32bit?We can experiment with making strings shorter than 8 chars in-situ. The drawback will be that length will be limited to 29 bits, i.e. 512MB. Andrei
Apr 08 2012
On 8 April 2012 17:52, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>wrote:On 4/8/12 4:54 AM, Manu wrote:29 bits? ...not 31? How does this implementation actually work? On 32/64 bits, and little/big endian? I can only imagine it working with a carefully placed 1 bit. bit-0 of the size on little endian, and bit-31 of the size on big endian. That should only halve the address range (leaving 31 bits)... where did the other 2 bits go? I also hope this only affects slices of chars? It will ignore this behaviour for anything other than char arrays right?On 8 April 2012 12:46, Vladimir Panteleev <vladimir thecybershadow.net <mailto:vladimir **thecybershadow.net <vladimir thecybershadow.net>>> wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. What is the plan for 32bit?We can experiment with making strings shorter than 8 chars in-situ. The drawback will be that length will be limited to 29 bits, i.e. 512MB.
Apr 08 2012
On 4/8/12 10:03 AM, Manu wrote:

> 29 bits? ...not 31?

Sorry, 31 indeed.

> How does this implementation actually work? On 32/64 bits, and
> little/big endian? I can only imagine it working with a carefully
> placed 1 bit. bit-0 of the size on little endian, and bit-31 of the
> size on big endian. That should only halve the address range (leaving
> 31 bits)... where did the other 2 bits go?

Essentially it will use either the first or the last bit of the representation as discriminator. That bit is most likely "taken" from the length representation. Shifting and masking can easily account for it when computing length of large strings.

> I also hope this only affects slices of chars? It will ignore this
> behaviour for anything other than char arrays right?

It works for any arrays of sufficiently small immutable data type (e.g. immutable(byte)[]), but the most advantage is reaped for string.

Andrei
Apr 08 2012
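The masking and shifting Andrei mentions might look roughly like this, assuming the discriminator ends up in the lowest bit of the length word (whether it is the first or the last bit is exactly what is still open here):

    // Sketch: the low bit of the length word distinguishes the two forms,
    // so a "large" string keeps lengthWord >> 1 bits of real length
    // (31 usable bits on a 32-bit target, 63 on a 64-bit one).
    enum size_t smallFlag = 1;

    bool isSmall(size_t lengthWord)
    {
        return (lengthWord & smallFlag) != 0;
    }

    size_t largeLength(size_t lengthWord)
    {
        return lengthWord >> 1; // shift the flag bit away
    }

    size_t makeLargeLength(size_t realLength)
    {
        return realLength << 1; // flag bit left clear => heap-backed
    }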
On 08/04/2012 16:52, Andrei Alexandrescu wrote:

> On 4/8/12 4:54 AM, Manu wrote:
>
>> On 8 April 2012 12:46, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
>>
>>> On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:
>>>
>>>> Walter and I discussed today about using the small string
>>>> optimization in string and other arrays of immutable small objects.
>>>> On 64 bit machines, string occupies 16 bytes. We could use the first
>>>> byte as discriminator, which means that all strings under 16 chars
>>>> need no memory allocation at all.
>>>
>>> Don't use the first byte. Use the last byte. The last byte is the
>>> highest-order byte of the length. Limiting arrays to 18.37 exabytes,
>>> as opposed to 18.45 exabytes, is a much nicer limitation than making
>>> assumptions about the memory layout.
>>
>> What is the plan for 32bit?
>
> We can experiment with making strings shorter than 8 chars in-situ. The
> drawback will be that length will be limited to 29 bits, i.e. 512MB.
>
> Andrei

As it is a flag, why not limit the string size to 2GB instead of 512MB?
Apr 10 2012
On Sunday, 8 April 2012 at 09:46:28 UTC, Vladimir Panteleev wrote:On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:Erm... never mind, I thought the pointer was the first field. Even so, how would you use the lowest-order byte of the length as a discriminator? Unless you allocated bytes 2-8 for the length (which would be an unaligned read, or a shift every time...) "making assumptions about the memory layout" now seems like the only solution. Also, what will be .ptr for such arrays? It can't point inside the string, because the type is immutable and the array is on the stack. Duplication can be expensive as well.Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 08 2012
On 4/8/12 4:55 AM, Vladimir Panteleev wrote:On Sunday, 8 April 2012 at 09:46:28 UTC, Vladimir Panteleev wrote:Me too, actually.On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:Erm... never mind, I thought the pointer was the first field.Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.Even so, how would you use the lowest-order byte of the length as a discriminator? Unless you allocated bytes 2-8 for the length (which would be an unaligned read, or a shift every time...)It's amazing how fast shift works :o)."making assumptions about the memory layout" now seems like the only solution. Also, what will be .ptr for such arrays? It can't point inside the string, because the type is immutable and the array is on the stack. Duplication can be expensive as well.Once anyone asks for .ptr a conservative copy will be made. Andrei
Apr 08 2012
On 4/8/2012 7:53 AM, Andrei Alexandrescu wrote:Once anyone asks for .ptr a conservative copy will be made.That could get expensive. You cannot just point into the small string part, because that may only exist temporarily on the stack. There are some pathological cases for this.
Apr 08 2012
On 4/8/12 12:05 PM, Walter Bright wrote:On 4/8/2012 7:53 AM, Andrei Alexandrescu wrote:As I mentioned, the first call to .ptr changes representation, thus making the allocation that the optimization had saved. Things are not worse off than before. AndreiOnce anyone asks for .ptr a conservative copy will be made.That could get expensive. You cannot just point into the small string part, because that may only exist temporarily on the stack. There are some pathological cases for this.
Apr 08 2012
On 2012-04-08 17:14:37 +0000, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:On 4/8/12 12:05 PM, Walter Bright wrote:This only works if the code has write access to that variable. Also, if the variable is a temporary copy such as a by-value function parameter, only the representation of that temporary copy will be affected. Every time you call a function which calls .ptr a new copy will be allocated on the heap. If such a function call is part of a loop, it'll degrade performance noticeably. -- Michel Fortin michel.fortin michelf.com http://michelf.com/On 4/8/2012 7:53 AM, Andrei Alexandrescu wrote:As I mentioned, the first call to .ptr changes representation, thus making the allocation that the optimization had saved. Things are not worse off than before.Once anyone asks for .ptr a conservative copy will be made.That could get expensive. You cannot just point into the small string part, because that may only exist temporarily on the stack. There are some pathological cases for this.
Apr 08 2012
On 4/8/12 1:16 PM, Michel Fortin wrote:On 2012-04-08 17:14:37 +0000, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I think there's a bit of a confusion. Could you please clarify with an example? Thanks, AndreiAs I mentioned, the first call to .ptr changes representation, thus making the allocation that the optimization had saved. Things are not worse off than before.This only works if the code has write access to that variable. Also, if the variable is a temporary copy such as a by-value function parameter, only the representation of that temporary copy will be affected. Every time you call a function which calls .ptr a new copy will be allocated on the heap. If such a function call is part of a loop, it'll degrade performance noticeably.
Apr 08 2012
On 4/8/12 1:26 PM, Andrei Alexandrescu wrote:

> On 4/8/12 1:16 PM, Michel Fortin wrote:
>
>> On 2012-04-08 17:14:37 +0000, Andrei Alexandrescu
>> <SeeWebsiteForEmail erdani.org> said:
>>
>>> As I mentioned, the first call to .ptr changes representation, thus
>>> making the allocation that the optimization had saved. Things are not
>>> worse off than before.
>>
>> This only works if the code has write access to that variable.
>>
>> Also, if the variable is a temporary copy such as a by-value function
>> parameter, only the representation of that temporary copy will be
>> affected. Every time you call a function which calls .ptr a new copy
>> will be allocated on the heap. If such a function call is part of a
>> loop, it'll degrade performance noticeably.
>
> I think there's a bit of a confusion. Could you please clarify with an
> example?

I think I understand:

    string s;
    foreach (lots) fun(s);

    void fun(string s) { ... s.ptr ... }

I agree in this case there would be one allocation per call.

Andrei
Apr 08 2012
On 2012-04-08 18:30:49 +0000, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:

> On 4/8/12 1:26 PM, Andrei Alexandrescu wrote:
>
>> On 4/8/12 1:16 PM, Michel Fortin wrote:
>>
>>> Also, if the variable is a temporary copy such as a by-value function
>>> parameter, only the representation of that temporary copy will be
>>> affected. Every time you call a function which calls .ptr a new copy
>>> will be allocated on the heap. If such a function call is part of a
>>> loop, it'll degrade performance noticeably.
>>
>> I think there's a bit of a confusion. Could you please clarify with an
>> example?
>
> I think I understand:
>
>     string s;
>     foreach (lots) fun(s);
>     void fun(string s) { ... s.ptr ... }

Exactly.

> I agree in this case there would be one allocation per call.

Basically, it's the same problem as receiving a const(char)[] as a function parameter and .idup-ing it in the function. The recommendation is for such functions to accept only immutable(char)[] and let the caller find a way to send an immutable string. That way, if the caller is repeatedly using the same string, it can reuse the efficient representation.

The same principles apply here: if your function needs to use .ptr, it should request a type on which you can call .ptr efficiently and let the caller find a way to pass that kind of string.

To me, your small buffer optimization proposal looks like an argument in favor of creating a separate string type free from the constraints of offering at all times bare access to the bytes. If only that type could be the default...

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 08 2012
On 8 April 2012 12:54, Manu <turkeyman gmail.com> wrote:On 8 April 2012 12:46, Vladimir Panteleev <vladimir thecybershadow.net>wrote:The only way I can see this working is if the marker happens to be the top bit of the size (limiting arrays to 2gb on 32bit systems, which is probably fine), and if set, the next 7 bits are the size, leaving 7 bytes for a string... 7 bytes is pushing it, 15 bytes is very useful, 7 is borderline... That all said, I'll ultimately end out with my own string type anyway which is multiples of, and aligned to 16 bytes, which will support SSE string opcodes. I wonder if these considerations can be factored into the built-in string? Is it realistic that anyone can actually use raw d-string's in an app that performs a lot of string manipulation? I bet most people end up with a custom string class anyway... Who's written a string-heavy app without their own string helper class? I ended up with a string class within about half an hour of trying to work with D strings (initially just to support stack strings, then it grew).On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:What is the plan for 32bit?Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 08 2012
On 4/8/12 9:26 AM, Manu wrote:Is it realistic that anyone can actually use raw d-string's in an app that performs a lot of string manipulation?Yes.I bet most people end up with a custom string class anyway...That does happen, but much more rarely than one might think.Who's written a string-heavy app without their own string helper class? I ended up with a string class within about half an hour of trying to work with D strings (initially just to support stack strings, then it grew).A lot of people write string-heavy apps with the built-in strings. "Heavy" actually describes a continuum. No matter how you put it, improving the performance of built-in strings is beneficial. Andrei
Apr 08 2012
On 4/8/2012 7:26 AM, Manu wrote:Is it realistic that anyone can actually use raw d-string's in an app that performs a lot of string manipulation? I bet most people end up with a custom string class anyway... Who's written a string-heavy app without their own string helper class? I ended up with a string class within about half an hour of trying to work with D strings (initially just to support stack strings, then it grew).Warp (a fast C preprocessor) https://github.com/facebookarchive/warp
Apr 19 2018
On 19 April 2018 at 02:22, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 4/8/2012 7:26 AM, Manu wrote:Cool story... 6 years later! :PIs it realistic that anyone can actually use raw d-string's in an app that performs a lot of string manipulation? I bet most people end up with a custom string class anyway... Who's written a string-heavy app without their own string helper class? I ended up with a string class within about half an hour of trying to work with D strings (initially just to support stack strings, then it grew).Warp (a fast C preprocessor) https://github.com/facebookarchive/warp
Apr 19 2018
On 4/8/12 4:46 AM, Vladimir Panteleev wrote:On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:Hehe. Good idea. On big endian machines the last byte is the ticket! AndreiWalter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 08 2012
On 4/8/12 9:49 AM, Andrei Alexandrescu wrote:On 4/8/12 4:46 AM, Vladimir Panteleev wrote:s/big/little/On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:Hehe. Good idea. On big endian machines the last byte is the ticket!Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 08 2012
On Sunday, 8 April 2012 at 09:46:28 UTC, Vladimir Panteleev wrote:On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:If the length has multi purpose it would be even better to reserve more than just one bit. For all practical purpose 48 bits or 56 bits are more than enough to handle all possible lengths. This would liberate 8 or even 16 bits that can be used for other purposes.Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all.Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
Apr 15 2018
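One possible shape of that idea, with made-up field widths: std.bitmanip.bitfields can split a 64-bit length word into a 48-bit length plus spare flag bits, which is more than enough for any realistic array:

    // Hypothetical split of the 64-bit length word: 48 bits of length
    // (256 TB, far beyond any practical array) plus spare flag bits.
    struct LengthWord
    {
        import std.bitmanip : bitfields;
        mixin(bitfields!(
            ulong,  "length",      48,
            ubyte,  "smallLength",  5,   // in-situ length, 0..15 fits easily
            bool,   "isSmall",      1,
            ushort, "reserved",    10));
    }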
On 2012-04-08 07:56, Andrei Alexandrescu wrote:For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:Just don't make the same mistake as with AA. -- /Jacob Carlborg
Apr 08 2012
On 4/8/12 5:45 AM, Jacob Carlborg wrote:On 2012-04-08 07:56, Andrei Alexandrescu wrote:The mistake with AAs was done long ago, but it was forced as AAs predated templates. AndreiFor this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:Just don't make the same mistake as with AA.
Apr 08 2012
On 2012-04-08 16:54, Andrei Alexandrescu wrote:On 4/8/12 5:45 AM, Jacob Carlborg wrote:I'm referring to the new template implementation of AAs that got reverted due everything breaking, if I recall correctly. -- /Jacob CarlborgOn 2012-04-08 07:56, Andrei Alexandrescu wrote:The mistake with AAs was done long ago, but it was forced as AAs predated templates. AndreiFor this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:Just don't make the same mistake as with AA.
Apr 08 2012
On Sun, Apr 08, 2012 at 05:35:50PM +0200, Jacob Carlborg wrote:On 2012-04-08 16:54, Andrei Alexandrescu wrote:[...]On 4/8/12 5:45 AM, Jacob Carlborg wrote:[...] Huh? When was this? T -- "I suspect the best way to deal with procrastination is to put off the procrastination itself until later. I've been meaning to try this, but haven't gotten around to it yet. " -- swrI'm referring to the new template implementation of AAs that got reverted due everything breaking, if I recall correctly.Just don't make the same mistake as with AA.The mistake with AAs was done long ago, but it was forced as AAs predated templates. Andrei
Apr 08 2012
On Sun, Apr 08, 2012 at 12:56:38AM -0500, Andrei Alexandrescu wrote:
[...]
> 1. What happened to the new hash project? We need to take that to
> completion.
[...]

Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github:

    https://github.com/quickfur/New-AA-implementation

The major outstanding issues are:

- Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this:

      alias someType T;
      inout(T) myFunc(inout(T) arg, ...) {
          int[inout(T)] aa;
          ...
      }

  The problem is that inout gets carried over into the AA template, which breaks because it instantiates into something that has:

      struct Slot {
          hash_t hash;
          inout(T) key;   // <-- this causes a compile error
          Value value;
      }

  Ideally, AA keys should all be stored as immutable inside the AA, and automatically converted to/from the qualified type the user specified.

- Template bloat: the current code uses template member functions, and will instantiate a new function for every implicit conversion of input key types. This also depends on IFTI, which has some quirks (compiler bugs) that make the code ugly (e.g., strings and arrays not treated equally by the compiler, requiring hacks to make implicit conversion work). Timon has suggested an alternative way of handling implicit conversions, which I think is better, but I need to take some time to actually implement it.

- Static initialization of AA's (AA literals that compile directly into object code). This should be possible in principle, but I've run into what may be a CTFE bug that prevents it from working.

- A not-so-major issue is to finish the toHash() implementations for all native types (currently it works for some common key types, but coverage is still incomplete). Once this is done, we can finally get rid of getHash from TypeInfo; UFCS will let us simply write x.toHash() for pretty much any type x.

Once these issues are resolved, there remains the major task of actually integrating this code with druntime/dmd. A lot of work is expected on the dmd end, because of the current amount of hacks in dmd to make AA's work.

T

-- 
Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
Apr 08 2012
On Sunday, 8 April 2012 at 13:53:07 UTC, H. S. Teoh wrote:On Sun, Apr 08, 2012 at 12:56:38AM -0500, Andrei Alexandrescu wrote: [...]doesn't this work? immutable std.traits.Unqual!(inout(T)) key;1. What happened to the new hash project? We need to take that to completion.[...] Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementation The major outstanding issues are: - Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this: alias someType T; inout(T) myFunc(inout(T) arg, ...) { int[inout(T)] aa; ... } The problem is that inout gets carried over into the AA template, which breaks because it instantiates into something that has: struct Slot { hash_t hash; inout(T) key; // <-- this causes a compile error Value value; } Ideally, AA keys should all be stored as immutable inside the AA, and automatically converted to/from the qualified type the user specified. - Template bloat: the current code uses template member functions, and will instantiate a new function for every implicit conversion of input key types. This also depends on IFTI, which has some quirks (compiler bugs) that make the code ugly (e.g., strings and arrays not treated equally by the compiler, requiring hacks to make implicit conversion work). Timon has suggested an alternative way of handling implicit conversions, which I think is better, but I need to take some time to actually implement it. - Static initialization of AA's (AA literals that compile directly into object code). This should be possible in principle, but I've run into what may be a CTFE bug that prevents it from working. - A not-so-major issue is to finish the toHash() implementations for all native types (currently it works for some common key types, but coverage is still incomplete). Once this is done, we can finally get rid of getHash from TypeInfo; UFCS will let us simply write x.toHash() for pretty much any type x. Once these issues are resolved, there remains the major task of actually integrating this code with druntime/dmd. A lot of work is expected on the dmd end, because of the current amount of hacks in dmd to make AA's work. T
Apr 08 2012
On Sun, Apr 08, 2012 at 04:11:10PM +0200, Tove wrote:

> On Sunday, 8 April 2012 at 13:53:07 UTC, H. S. Teoh wrote:
>
>> On Sun, Apr 08, 2012 at 12:56:38AM -0500, Andrei Alexandrescu wrote:
>> [...]
>>> 1. What happened to the new hash project? We need to take that to
>>> completion.
>> [...]
>
> [...]
>
>> The major outstanding issues are:
>>
>> - Qualified keys not fully working: the current code has a few corner
>>   cases that don't work with shared/immutable/inout keys. One major
>>   roadblock is how to implement this:
>>
>>       alias someType T;
>>       inout(T) myFunc(inout(T) arg, ...) {
>>           int[inout(T)] aa;
>>           ...
>>       }
>>
>>   The problem is that inout gets carried over into the AA template,
>>   which breaks because it instantiates into something that has:
>>
>>       struct Slot {
>>           hash_t hash;
>>           inout(T) key;   // <-- this causes a compile error
>>           Value value;
>>       }
>
> doesn't this work?
>
>     immutable std.traits.Unqual!(inout(T)) key;

I suppose so, but the problem is more with the return type of various AA functions. If the user writes int[inout(T)] aa, then he would expect that aa.keys should return inout(T)[]. But the problem here is that this is impossible to express in the current system, because inout has a different meaning when you write it in the AA method, than its meaning in the context of the calling function. For example, this doesn't quite work:

    struct AA {
        ...
        inout(Key) keys() {
            ...
            // Problem: inout here doesn't mean what it
            // means in the context of the caller
        }
    }

T

-- 
Tech-savvy: euphemism for nerdy.
Apr 08 2012
On 4/8/12 8:54 AM, H. S. Teoh wrote:Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementationThanks for the update! Let me reiterate this is important work for many reasons, so it would be great if you got around to pushing it through. Do you think someone else in the community could help you? Andrei
Apr 08 2012
On Sun, Apr 08, 2012 at 10:00:37AM -0500, Andrei Alexandrescu wrote:On 4/8/12 8:54 AM, H. S. Teoh wrote:I'll try my best to work on it when I have free time.Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementationThanks for the update! Let me reiterate this is important work for many reasons, so it would be great if you got around to pushing it through.Do you think someone else in the community could help you?[...] Well, I put the code up on github for a reason. :-) I'll only be too glad to have someone else contribute. The code itself isn't all that complicated; it's basically just a facsimile of aaA.d with many bugs fixed due to the fact that we now have direct access to key/value types. T -- "Life is all a great joke, but only the brave ever get the point." -- Kenneth Rexroth
Apr 08 2012
On 4/8/12 8:54 AM, H. S. Teoh wrote:- Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this: alias someType T; inout(T) myFunc(inout(T) arg, ...) { int[inout(T)] aa; ... }I wonder how frequently such code is used. Andrei
Apr 08 2012
On 2012-04-08 05:56:38 +0000, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:

> Walter and I discussed today about using the small string optimization
> in string and other arrays of immutable small objects. On 64 bit
> machines, string occupies 16 bytes. We could use the first byte as
> discriminator, which means that all strings under 16 chars need no
> memory allocation at all.
>
> It turns out statistically a lot of strings are small. According to a
> variety of systems we use at Facebook, the small buffer optimization is
> king - it just works great in all cases. In D that means better speed,
> better locality, and less garbage.

Small buffer optimization is a very good thing to have indeed. But… how can you preserve existing semantics?

For instance, let's say you have this:

    string s = "abcd";

which is easily representable as a small string. Do you use the small buffer optimization in the assignment? That seems like a definitive yes. But as soon as you take a pointer to that string, you break the immutability guaranty:

    immutable(char)[] s = "abcd";
    immutable(char)* p = s.ptr;
    s = "defg"; // assigns to where?

There's also the issue of this code being legal currently:

    immutable(char)* getPtr(string s) { return s.ptr; }

If you pass a small string to getPtr, it'll be copied to the local stack frame and you'll be returning a pointer to that local copy.

You could mitigate this by throwing an error when trying to get the pointer to a small string, but then you have to disallow taking the pointer of a const(char)[] pointing to it:

    const(char)* getPtr2(const(char)[] s) { return s.ptr; }

    const(char)* getAbcdPtr()
    {
        string s = "abcd";

        // s implicitly converted to regular const(char)[] pointing to
        // local stack frame
        const(char)* c = getPtr2(s);

        // c points to the storage of s, which is the local stack frame
        return c;
    }

So it's sad, but I am of the opinion that the only way to implement small buffer optimization is to have a higher-level abstraction, a distinct type for such small strings.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 08 2012
On 4/8/12 9:59 AM, Michel Fortin wrote:But as soon as you take a pointer to that string, you break the immutability guaranty: immutable(char)[] s = "abcd"; immutable(char)* p = s.ptr; s = "defg"; // assigns to where?Taking .ptr will engender a copy. A small regression will be that address of individual chars cannot be taken. Andrei
Apr 08 2012
On 2012-04-08 15:06:13 +0000, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:On 4/8/12 9:59 AM, Michel Fortin wrote:You know, many people have been wary of hidden memory allocations in the past. That's not going to make them happy. I'm not complaining, but I think .ptr should return null in those cases. Let people use toStringz when they need a C string, and let people deal with the ugly details themselves if they're using .ptr to bypass array bound checking. Because if someone used .ptr somewhere to bypass bound checking and instead he gets a memory allocation at each loop iteration… it won't be pretty. And what about implicit conversions to const(char)[]? That too will require a copy, because otherwise it could point to the local stack frame where your immutable(char)[] resides. That said, maybe copies of small-string optimized immutable(char)[] could be small-string optimized const(char)[]. That'd not have any side effect since no one can have a mutable pointer/slice to the const copy anyway. -- Michel Fortin michel.fortin michelf.com http://michelf.com/But as soon as you take a pointer to that string, you break the immutability guaranty: immutable(char)[] s = "abcd"; immutable(char)* p = s.ptr; s = "defg"; // assigns to where?Taking .ptr will engender a copy. A small regression will be that address of individual chars cannot be taken.
Apr 08 2012
On 4/8/12 10:48 AM, Michel Fortin wrote:

> On 2012-04-08 15:06:13 +0000, Andrei Alexandrescu
> <SeeWebsiteForEmail erdani.org> said:
>
>> On 4/8/12 9:59 AM, Michel Fortin wrote:
>>
>>> But as soon as you take a pointer to that string, you break the
>>> immutability guaranty:
>>>
>>>     immutable(char)[] s = "abcd";
>>>     immutable(char)* p = s.ptr;
>>>     s = "defg"; // assigns to where?
>>
>> Taking .ptr will engender a copy. A small regression will be that
>> address of individual chars cannot be taken.
>
> You know, many people have been wary of hidden memory allocations in
> the past.

Well, the optimization makes for fewer allocations total. In fact .ptr does the allocation that was formerly mandatory.

> That's not going to make them happy. I'm not complaining, but I think
> .ptr should return null in those cases.

That would be too large a regression I think. And it's not detectable during compilation.

> Let people use toStringz when they need a C string, and let people deal
> with the ugly details themselves if they're using .ptr to bypass array
> bound checking. Because if someone used .ptr somewhere to bypass bound
> checking and instead he gets a memory allocation at each loop
> iteration… it won't be pretty.

Only one allocation. First invocation of .ptr effectively changes representation.

> And what about implicit conversions to const(char)[]? That too will
> require a copy, because otherwise it could point to the local stack
> frame where your immutable(char)[] resides. That said, maybe copies of
> small-string optimized immutable(char)[] could be small-string
> optimized const(char)[]. That'd not have any side effect since no one
> can have a mutable pointer/slice to the const copy anyway.

I think casting to const(char)[] should work without allocation.

Andrei
Apr 08 2012
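What Andrei describes - the first .ptr call performing the one allocation the optimization had deferred, and later calls being free - might look roughly like this, reusing the hypothetical SmallString layout sketched near the top of the thread and assuming the property has write access to the variable (the very caveat Michel raises):

    // Sketch: .ptr on an in-situ string allocates once, then flips the
    // representation so subsequent calls just return the stored pointer.
    // (SmallString is the hypothetical union layout sketched earlier.)
    immutable(char)* ptr(ref SmallString s)
    {
        if (s.isSmall)
        {
            auto copy = s.buf[0 .. s.length].idup; // one GC allocation
            s.lengthAndFlag = copy.length << 1;    // flag bit cleared => heap
            s.data = copy.ptr;
        }
        return s.data;
    }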
On 08/04/2012 17:48, Michel Fortin wrote:

> On 2012-04-08 15:06:13 +0000, Andrei Alexandrescu
> <SeeWebsiteForEmail erdani.org> said:
>
>> On 4/8/12 9:59 AM, Michel Fortin wrote:
>>
>>> But as soon as you take a pointer to that string, you break the
>>> immutability guaranty:
>>>
>>>     immutable(char)[] s = "abcd";
>>>     immutable(char)* p = s.ptr;
>>>     s = "defg"; // assigns to where?
>>
>> Taking .ptr will engender a copy. A small regression will be that
>> address of individual chars cannot be taken.
>
> You know, many people have been wary of hidden memory allocations in
> the past. That's not going to make them happy. I'm not complaining, but
> I think .ptr should return null in those cases. Let people use
> toStringz when they need a C string, and let people deal with the ugly
> details themselves if they're using .ptr to bypass array bound
> checking. Because if someone used .ptr somewhere to bypass bound
> checking and instead he gets a memory allocation at each loop
> iteration… it won't be pretty.
>
> And what about implicit conversions to const(char)[]? That too will
> require a copy, because otherwise it could point to the local stack
> frame where your immutable(char)[] resides. That said, maybe copies of
> small-string optimized immutable(char)[] could be small-string
> optimized const(char)[]. That'd not have any side effect since no one
> can have a mutable pointer/slice to the const copy anyway.

You've raised some very good points. I also think this won't do.

It is quite clear with your examples (especially the temporary copy as a by-value function parameter) that the small strings have some very specific semantics that are completely different from normal string semantics. Therefore I also have the feeling that they should be another type, sstring or something. sstrings could be promoted to full string when necessary, but the whole thing would probably look very very ugly pretty soon.

This to me smells like the idea of small string optimization should be put to rest for a while, requiring a little more thought.
Apr 08 2012
On 08.04.2012 07:56, Andrei Alexandrescu wrote:For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:vote -= real.infinity. That would kill D.
Apr 09 2012
On 04/09/2012 10:24 AM, Don wrote:

> On 08.04.2012 07:56, Andrei Alexandrescu wrote:
>
>> For this to happen, we need to start an effort of migrating built-in
>> arrays into runtime, essentially making them templates that the
>> compiler lowers to. So I have two questions:
>
> vote -= real.infinity.
>
> That would kill D.

Why does this even compile?

    void main(){
        long vote;
        vote -= real.infinity;
        assert(vote == long.min);
    }
Apr 09 2012
On 9 April 2012 11:24, Don <nospam nospam.com> wrote:On 08.04.2012 07:56, Andrei Alexandrescu wrote: For this to happen, we need to start an effort of migrating built-inHow do you figure? After thinking on it a bit, I'm becoming a little worried about this move for 2 rarely considered reasons: Using lowering to a template, debug(/unoptimised) performance will probably get a lot slower, which is really annoying. And debugging/stepping might become considerably more annoying too, if every time I press F11 (step in) over a function call that happens to receive an arg from an array, the debugger then steps into the array templates index operator... We'd be no better off than with STL, unless the language has clever ways of hiding this magic from the debugger too, and optimising/inlining the index even in debug builds...? But this is the built-in array, and not a library we can optionally not use.arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:vote -= real.infinity. That would kill D.
Apr 09 2012
On 4/9/12 4:21 AM, Manu wrote:

> After thinking on it a bit, I'm becoming a little worried about this
> move for 2 rarely considered reasons:
>
> Using lowering to a template, debug(/unoptimised) performance will
> probably get a lot slower, which is really annoying. And
> debugging/stepping might become considerably more annoying too, if
> every time I press F11 (step in) over a function call that happens to
> receive an arg from an array, the debugger then steps into the array
> templates index operator... We'd be no better off than with STL, unless
> the language has clever ways of hiding this magic from the debugger
> too, and optimising/inlining the index even in debug builds...? But
> this is the built-in array, and not a library we can optionally not
> use.

I agree. So we have the counterarguments:

1. Lowering would treat array primitives as sheer D code, subject to refusal of inlining. That means worse performance.

2. Unless the compiler takes special measures, source-level debuggers will trace through core, uninteresting code for array operations.

3. There are patterns that attempt to optimize by e.g. using .ptr, but end up pessimizing code because they trigger multiple memory allocations.

Andrei
Apr 09 2012
On Monday, 9 April 2012 at 14:55:16 UTC, Andrei Alexandrescu wrote:

> 3. There are patterns that attempt to optimize by e.g. using .ptr, but
> end up pessimizing code because they trigger multiple memory
> allocations.
>
> Andrei

It's important to note that this pattern is probably most common in glue code to C libraries, not bounds-checking related optimizations. There are countless C library functions which receive the equivalent of an array by taking a pointer and a length, and implicit allocation on `foo.ptr` is completely unacceptable in these cases.

It's also common to avoid the `toStringz` function for strings you know are zero-terminated, using `.ptr` directly instead, as the toStringz function unconditionally appends a zero these days (and for good reasons, its previous optimization was extremely optimistic about its input).
Apr 09 2012
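A typical instance of the glue-code pattern Jakob describes, assuming a POSIX system: the C side takes a pointer and a length, so neither a copy nor a trailing zero is wanted, and a hidden allocation inside .ptr would turn each call into a heap allocation:

    import core.sys.posix.unistd : write;

    void print(const(char)[] s)
    {
        // The C function takes (fd, pointer, length); no copy and no
        // trailing '\0' are needed, so an implicit allocation hidden
        // behind .ptr would make every call allocate.
        write(1, s.ptr, s.length);
    }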
On 4/9/12, Jakob Ovrum <jakobovrum gmail.com> wrote:It's also common to avoid the `toStringz` function for strings you know are zero-terminated, using `.ptr` directly instead.Yup. E.g. WinAPI text drawing functions take a wchar* and a length. I don't have to call toUTF16z but just pass a pointer, or even a pointer to a specific element via &arr[index] (after calling std.utf.stride, of course).the toStringz function unconditionally appends a zero these daysThe one taking (const(char)[] s) does this, but not the other overload taking (string s). Whether or not that's safe I don't really know. I've had an argument over this on github, but I don't know if it was about toStringz or maybe toUTF16z. I haven't got the link to the discussion.
Apr 09 2012
On Monday, 9 April 2012 at 15:37:37 UTC, Andrej Mitrovic wrote:On 4/9/12, Jakob Ovrum <jakobovrum gmail.com> wrote: The one taking (const(char)[] s) does this, but not the other overload taking (string s). Whether or not that's safe I don't really know. I've had an argument over this on github, but I don't know if it was about toStringz or maybe toUTF16z. I haven't got the link to the discussion.You're right, I just confirmed the optimization is still in place for the `string` version. The documentation is identical for both functions. I think this is a mistake. It assumes that the string is either a compiler-generated literal or a GC allocated string, while the documentation does not mention such assumptions. With all the focus on manual memory management and pluggable allocators going on, I think the optimization must be removed or the documentation for the `string` overload changed. This optimization can always be put back in without narrowing the scope of the `string` overload once the above conditions can be reliably checked. Another option is to add a known-bug section to the `string` overload informing users that the function may fail on custom-allocated strings.
Apr 09 2012
On 4/9/12, Jakob Ovrum <jakobovrum gmail.com> wrote:With all the focus on manual memory management and pluggable allocators going on, I think the optimization must be removed or the documentation for the `string` overload changed.Or add a compile-time argument: toStringz(bool ForceAllocate = true)(string s) Or split the unsafe version into another function.
Apr 09 2012
On 9 April 2012 17:55, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>wrote:On 4/9/12 4:21 AM, Manu wrote:Indeed. I don't think the small array optimisation would benefit us in any way that could even come close to balancing the loss. I don't think it would benefit us much at all regardless, since we've already proven the need for a string class anyway, and we can make such small-string optimisations already. I am very wary of removing a fundamental language primitive, that is, the basic ability to express an array with absolutely no frills. This idea only really benefits utf-8 strings, so why not just make a real d-string class in phobos and be done with it? People complain about UTF issues with raw strings, and this is what you intend to do anyway to implement the optimisation. d-string could perhaps be made to appear as a first-class language feature via the 'string' keyword, and lower to the library in the way you describe. Just don't blanket this across all arrays in general.After thinking on it a bit, I'm becoming a little worried about this move for 2 rarely considered reasons: Using lowering to a template, debug(/unoptimised) performance will probably get a lot slower, which is really annoying. And debugging/stepping might become considerably more annoying too, if every time I press F11 (step in) over a function call that happens to receive an arg from an array, the debugger then steps into the array templates index operator... We'd be no better off than with STL, unless the language has clever ways of hiding this magic from the debugger too, and optimising/inlining the index even in debug builds...? But this is the built-in array, and not a library we can optionally not use.I agree. So we have the counterarguments: 1. Lowering would treat array primitives as sheer D code, subject to refusal of inlining. That means worse performance. 2. Unless the compiler takes special measures, source-level debuggers will trace through core, uninteresting code for array operations. 3. There are patterns that attempt to optimize by e.g. using .ptr, but end up pessimizing code because they trigger multiple memory allocations.
Apr 09 2012
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:jlut8k$j4o$1 digitalmars.com...I agree. So we have the counterarguments: 1. Lowering would treat array primitives as sheer D code, subject to refusal of inlining. That means worse performance.So in other words, we need the forceinline that some people have strongly requested? Make it work even without -inline, and then add -noinline for any rare cases where someone might need to forcefully disable forceinline. Shouldn't that take care of it?2. Unless the compiler takes special measures, source-level debuggers will trace through core, uninteresting code for array operations.Would forceinline fix this?3. There are patterns that attempt to optimize by e.g. using .ptr, but end up pessimizing code because they trigger multiple memory allocations.Someone's suggestion of just making .ptr null instead of doing implicit allocations was an interesting idea.
Apr 09 2012
On 04/10/12 07:01, Nick Sabalausky wrote:

> "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
> news:jlut8k$j4o$1 digitalmars.com...
>
>> I agree. So we have the counterarguments:
>>
>> 1. Lowering would treat array primitives as sheer D code, subject to
>> refusal of inlining. That means worse performance.
>
> So in other words, we need the @forceinline that some people have
> strongly requested? Make it work even without -inline, and then add
> -noinline for any rare cases where someone might need to forcefully
> disable @forceinline. Shouldn't that take care of it?

Obviously, yes, but should wait until enough attribute support is in place and not be just a @inline hack (no point in naming it @forceinline - there's no other kind of @inline).

>> 2. Unless the compiler takes special measures, source-level debuggers
>> will trace through core, uninteresting code for array operations.
>
> Would @forceinline fix this?

No, but the compiler could just omit the debuginfo for the lowerings, unless requested with a flag.

>> 3. There are patterns that attempt to optimize by e.g. using .ptr, but
>> end up pessimizing code because they trigger multiple memory
>> allocations.
>
> Someone's suggestion of just making .ptr null instead of doing implicit
> allocations was an interesting idea.

I hope that wasn't a serious suggestion.

What might be possible is passing strings by reference and atomically changing the representation when (if) a pointer is needed. That would have problems too (ABI change, can't have RO strings, can't keep the small strings in registers etc). Doing the small buffer optimization for a new type may be possible, but trying to add it to the current char[] arrays is probably not a good idea.

artur
Apr 10 2012
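Artur's "change the representation when (if) a pointer is needed" option, sketched very roughly below with the atomic publication step deliberately left out; everything here is an assumption for illustration, not how built-in arrays actually work. Note that the rewrite requires the variable itself to be mutable, which is part of why read-only strings become a problem.

// Hypothetical: the first .ptr request rewrites the variable in place,
// moving the payload to the GC heap.
struct SboString
{
    private bool small = true;
    private ubyte len;
    private char[14] buf;
    private string heap;

    immutable(char)* ptr()
    {
        if (small)
        {
            heap = buf[0 .. len].idup; // representation change: in-situ -> heap
            small = false;
        }
        return heap.ptr;
    }
}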
On Tue, 10 Apr 2012 10:50:24 +0200, Artur Skawina <art.08.09 gmail.com> wrote:

> Obviously, yes, but it should wait until enough attribute support is in place and not be just an inline hack.

If you refer to the proposed user attributes, they won't change the operation of the compiler. Only your own program code will know how to use them. inline, @safe, @property, final, nothrow, ... on the other hand are keywords that directly map to flags and hard-wired logic in the compiler. Correct me if I'm wrong.

-- Marco
Apr 10 2012
On 04/10/12 19:25, Marco Leise wrote:

> On Tue, 10 Apr 2012 10:50:24 +0200, Artur Skawina <art.08.09 gmail.com> wrote:

>> Obviously, yes, but it should wait until enough attribute support is in place and not be just an inline hack.

> If you refer to the proposed user attributes, they won't change the operation of the compiler. Only your own program code will know how to use them. inline, @safe, @property, final, nothrow, ... on the other hand are keywords that directly map to flags and hard-wired logic in the compiler. Correct me if I'm wrong.

I'm saying that introducing new function attributes like inline to the language, when there's a real possibility of "generic" attributes being invented in the near future, may not be a good idea. Any generic scheme should also work for inline and the many other attrs that I've mentioned before - there's no reason to artificially limit the support to *just* user attributes.

artur
Apr 10 2012
On Tue, 10 Apr 2012 20:52:56 +0200, Artur Skawina <art.08.09 gmail.com> wrote:

> I'm saying that introducing new function attributes like inline to the language, when there's a real possibility of "generic" attributes being invented in the near future, may not be a good idea. Any generic scheme should also work for inline and the many other attrs that I've mentioned before - there's no reason to artificially limit the support to *just* user attributes.
>
> artur

I had to read up on your older posts again. So you are not expecting compiler hooks that allow changing the language semantics and code gen through user attributes, but a common syntax, especially for bundling multiple compiler/user attributes like "@attr(safe, nothrow, userattr(abc), inline, ...) my_attr_alias", in the event that there will be a lot of platform-specific and other pragmas/attributes/keywords like in GCC in the future? Then I tend to agree.

-- Marco
Apr 10 2012
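For context: the bundled "@attr(safe, nothrow, ...) my_attr_alias" form discussed above is hypothetical and never landed in that shape. What D later gained is user-defined attributes, which, as Marco says, carry data for introspection but do not steer code generation; built-in attributes and pragma(inline) remain separate compiler-recognized constructs. A small example of the UDA feature that did land:

// User-defined attribute: an ordinary type attached to a declaration.
struct userattr { string note; }

@userattr("abc") @safe nothrow int f(int x) { return x + 1; }

// Introspection sees only the UDA; @safe and nothrow are not part of it.
static assert(__traits(getAttributes, f).length == 1);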
On 4/9/12 3:24 AM, Don wrote:

> On 08.04.2012 07:56, Andrei Alexandrescu wrote:

>> For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:

> vote -= real.infinity. That would kill D.

Why?

Andrei
Apr 09 2012
On Sun, 08 Apr 2012 01:56:38 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

> Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects.
>
> On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. It turns out statistically a lot of strings are small. According to a variety of systems we use at Facebook, the small buffer optimization is king - it just works great in all cases. In D that means better speed, better locality, and less garbage.
>
> For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions:
>
> 1. What happened to the new hash project? We need to take that to completion.
>
> 2. Is anyone willing to start the effort of migrating built-in slices into templates?

No, this would suck. A better solution - make an *actual* string type that does this, and fixes all the shitty problems that we have from shoehorning arrays into UTF strings. Then alias that type to string. I'm so sick of phobos trying to pretend char[] is not an array, and this would just be another mark against D.

-Steve
Apr 09 2012
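A minimal sketch of the direction Steve is pointing at, assuming a hypothetical library type `String` (none of this is Phobos API): the wrapper owns its representation, so it is free to use SSO internally and to expose UTF-aware primitives, while char[] stays an ordinary array.

import std.utf : byDchar;

struct String
{
    private immutable(char)[] data; // internal layout is free to change (e.g. SSO)

    auto byCodePoint() const { return data.byDchar; }        // UTF-aware iteration
    size_t lengthInCodeUnits() const { return data.length; } // raw code-unit count
}

// The suggestion: expose it under the familiar name, e.g.
//     alias string = String;
// (hypothetical; today `string` is an alias for immutable(char)[])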
On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:

> Andrei

Has anybody put together code that implements this idea in a library? That is, small strings up to length 15 bytes unioned with a `string`?
Apr 15 2018
On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote:

> Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects.

I put together SSOString at https://github.com/nordlow/phobos-next/blob/967eb1088fbfab8be5ccd811b66e7b5171b46acf/src/sso_string.d that uses small-string optimization on top of a normal D string (slice). I'm satisfied with everything except that -dip1000 doesn't forbid `f` from compiling. I also don't understand why `x[0]` cannot be returned by ref in the function `g`. Comments are welcome.

Contents of sso_string.d follows:

module sso_string;

/** Small-size-optimized string.
 *
 * Store on the stack if constructed with <= `smallCapacity` number of
 * characters, otherwise on the GC heap.
 */
struct SSOString
{
    private alias E = immutable(char); // immutable element type
    private alias ME = char;           // mutable element type

pure nothrow:

    /** Construct from `elements`, with potential GC-allocation (iff
     * `elements.length > smallCapacity`).
     */
    this()(scope ME[] elements) @trusted // template-lazy
    {
        if (elements.length <= smallCapacity)
        {
            small.data[0 .. elements.length] = elements;
            small.length = cast(typeof(small.length))(2*elements.length);
        }
        else
        {
            large = elements.idup; // GC-allocate
            raw.length *= 2;       // shift up
            raw.length |= 1;       // tag as large
        }
    }

@nogc:

    // TODO add @nogc overload to construct from mutable static array <= smallCapacity

    /** Construct from `elements` without any kind of heap allocation. */
    this()(immutable(E)[] elements) @trusted // template-lazy
    {
        if (elements.length <= smallCapacity)
        {
            small.data[0 .. elements.length] = elements;
            small.length = cast(typeof(small.length))(2*elements.length);
        }
        else
        {
            large = elements; // @nogc
            raw.length *= 2;  // shift up
            raw.length |= 1;  // tag as large
        }
    }

    @property size_t length() const @trusted
    {
        if (isLarge)
        {
            return large.length/2; // skip first bit
        }
        else
        {
            return small.length/2; // skip first bit
        }
    }

    scope ref inout(E) opIndex(size_t index) inout return @trusted
    {
        return opSlice()[index]; // automatic range checking
    }

    scope inout(E)[] opSlice() inout return @trusted
    {
        if (isLarge)
        {
            union RawLarge
            {
                Raw raw;
                Large large;
            }
            RawLarge copy = void;
            copy.large = cast(Large)large;
            copy.raw.length /= 2; // adjust length
            return copy.large;
        }
        else
        {
            return small.data[0 .. small.length/2]; // scoped
        }
    }

    private @property bool isLarge() const @trusted
    {
        return large.length & 1; // first bit discriminates small from large
    }

private:
    struct Raw // same memory layout as `E[]`
    {
        size_t length; // can be bit-fiddled without GC allocation
        E* ptr;
    }

    alias Large = E[];

    enum smallCapacity = Large.sizeof - Small.length.sizeof;
    static assert(smallCapacity > 0, "No room for small elements for E being " ~ E.stringof);

    version(LittleEndian) // see: http://forum.dlang.org/posting/zifyahfohbwavwkwbgmw
    {
        struct Small
        {
            ubyte length;
            E[smallCapacity] data;
        }
    }
    else
    {
        static assert(0, "BigEndian support and test");
    }

    union
    {
        Raw raw;
        Large large;
        Small small;
    }
}

///
@safe pure nothrow @nogc unittest
{
    import container_traits : mustAddGCRange;

    alias S = SSOString;

    static assert(S.sizeof == 2*size_t.sizeof); // two words
    static assert(S.smallCapacity == 15);
    static assert(mustAddGCRange!S); // `Large large.ptr` must be scanned

    auto s0 = S.init;
    assert(s0.length == 0);
    assert(!s0.isLarge);
    assert(s0[] == []);

    const s7 = S("0123456");
    static assert(is(typeof(s7[]) == string));
    assert(!s7.isLarge);
    assert(s7.length == 7);
    assert(s7[] == "0123456");
    // TODO assert(s7[0 .. 4] == "0123");

    const s15 = S("012345678901234");
    static assert(is(typeof(s15[]) == string));
    assert(!s15.isLarge);
    assert(s15.length == 15);
    assert(s15[] == "012345678901234");

    const s16 = S("0123456789abcdef");
    static assert(is(typeof(s16[]) == string));
    assert(s16.isLarge);
    assert(s16.length == 16);
    assert(s16[] == "0123456789abcdef");
    assert(s16[0] == '0');
    assert(s16[10] == 'a');
    assert(s16[15] == 'f');

    // TODO static assert(!__traits(compiles, { auto _ = S((char[]).init); }));

    string f() @safe pure nothrow @nogc
    {
        S x;
        return x[]; // TODO should fail with -dip1000
    }

    // TODO activate
    // ref char g() @safe pure nothrow @nogc
    // {
    //     S x;
    //     return x[0]; // TODO should fail with -dip1000
    // }
}
Apr 17 2018