digitalmars.D - Thread Attributes
- Jonathan Marler (94/94) Jul 10 2014 I had an idea this morning and wanted to post it to see what
- Jacob Carlborg (6/29) Jul 10 2014 I'm not sure I understand but if the variable can only be accessed from
- Jonathan Marler (28/28) Jul 11 2014 I'm not sure how AST macros would assist in thread safety the way
- Jacob Carlborg (38/56) Jul 13 2014 Looking at the first example:
- Jonathan Marler (69/91) Jul 14 2014 Ah I see now. It looks like AST macros are going to open up alot
- Jacob Carlborg (21/54) Jul 15 2014 Yeah, probably. I think the problem is that threads are not so tightly
- Timon Gehr (2/3) Jul 11 2014 How do you make sure there is at most one thread of each kind?
- Jonathan Marler (26/29) Jul 12 2014 Good question. First, since the language doesn't support starting
I had an idea this morning and wanted to post it to see what people think.

I know we have a lot of attributes already, but I'm wondering if people think adding a thread attribute could be useful. Something that says a variable, function, or class/struct can only be accessed by code that has been tagged with the same thread name. Something like this:

    // This variable is allocated as a true shared global
    // with a fixed location in memory since it can only
    // be accessed by one thread.
    thread:main int mainThreadGlobal;

    thread:main int main(string[] args)
    {
        // Start the worker thread at some point
    }

    thread:worker void workerLoop()
    {
        // do some work, cannot access mainThreadGlobal
    }

With this information the compiler could help verify thread safety at compile time. This idea is far from refined as I just thought of it this morning, but I had a couple of thoughts.

One problem I foresaw was how to handle passing callback functions into other libraries. If the callback function is tagged with a thread name, then maybe you have to call that function on the same thread that the callback is tagged with? For example:

FindLibrary:

    // No thread attribute because it is a library function
    void find(string haystack, string needle, void function(int offset))
    {
        // logic...
    }

MyProgram:

    thread:worker uint[] offsets;

    thread:worker void foundOffset(int offset)
    {
        offsets ~= offset;
    }

    thread:worker void callFind(string data)
    {
        find(data, "importantstring", &foundOffset);
    }

    void main(string[] args)
    {
        // start thread for callFind
    }

Now let's say that we didn't tag callFind with the thread:worker attribute. The compiler would need to know the source code of the find function, but it could throw an error when it sees that find calls a callback function that is tagged with a specific thread. Or, if you didn't know the source code of the find function, you could assume that it calls the callback function on its own thread and just throw an error whenever you pass a callback function into another function that isn't tagged with the same thread you are currently executing on.

When something is not tagged with a thread (which would likely include any kind of library/API function, or any code in a single-threaded application), then no checking is done. But the thread attribute would guarantee that any function tagged with that attribute can only be called by a function tagged with the same attribute.

The other thought I had was handling synchronization. Let's say you have a function that you don't mind being called by other threads, but you still want it to be synchronized. You could add a synchronized attribute that takes an object:

    Object mySyncObject;

    sync(mySyncObject) void doSomething()
    {
        // I don't need to synchronize on mySyncObject here because the compiler will verify that
        // anyone who calls this function will have already synchronized on it.
        // Therefore I can assume I am already synchronized on mySyncObject without actually doing it...yay! :)
    }

So what do people think? Like I said, I just thought of this and haven't had time to think about more corner cases, so feel free to nitpick :)
Jul 10 2014
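For comparison with the proposed sync(mySyncObject) attribute, here is a minimal sketch of how the same arrangement looks in D as it stands, assuming the lock is taken by the caller with a `synchronized` statement. The names `counter` and `runWorkers` are made up for illustration; the point is that the "caller already holds the lock" rule on `doSomething` is only a comment, and nothing checks it.

```d
import core.thread : Thread;

__gshared Object mySyncObject;   // the lock object from the post
__gshared int counter;           // illustrative data protected by mySyncObject

// Convention only: the caller must already hold mySyncObject.
// Nothing in the language verifies this; that is the gap sync(...) would close.
void doSomething()
{
    ++counter;
}

// Hypothetical driver: starts `threadCount` threads that each call
// doSomething `perThread` times while holding the lock, and returns
// how much the counter grew.
int runWorkers(int threadCount, int perThread)
{
    if (mySyncObject is null)
        mySyncObject = new Object;
    immutable before = counter;

    Thread[] threads;
    foreach (i; 0 .. threadCount)
    {
        threads ~= new Thread({
            foreach (j; 0 .. perThread)
            {
                synchronized (mySyncObject)
                {
                    doSomething();
                }
            }
        });
    }
    foreach (t; threads) t.start();
    foreach (t; threads) t.join();
    return counter - before;
}

void main()
{
    assert(runWorkers(4, 1000) == 4000);
}
```

If one of the callers forgot the `synchronized` block, this would still compile, which is exactly the kind of mistake the attribute is meant to catch.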
On 10/07/14 20:12, Jonathan Marler wrote:
> I had an idea this morning and wanted to post it to see what people think.
>
> I know we have a lot of attributes already, but I'm wondering if people
> think adding a thread attribute could be useful. Something that says a
> variable, function, or class/struct can only be accessed by code that has
> been tagged with the same thread name. Something like this:
>
>     // This variable is allocated as a true shared global
>     // with a fixed location in memory since it can only
>     // be accessed by one thread.
>     thread:main int mainThreadGlobal;
>
>     thread:main int main(string[] args)
>     {
>         // Start the worker thread at some point
>     }
>
>     thread:worker void workerLoop()
>     {
>         // do some work, cannot access mainThreadGlobal
>     }

I'm not sure I understand, but if the variable can only be accessed from a single thread, why not make it thread local?

> [SNIP]
> So what do people think? Like I said, I just thought of this and haven't
> had time to think about more corner cases, so feel free to nitpick :)

BTW, both of these features sound like a job for AST macros.

-- 
/Jacob Carlborg
Jul 10 2014
I'm not sure how AST macros would assist in thread safety the way that this feature would. Maybe you could elaborate?

To explain a little more: when you put a thread:name or sync(object) attribute on something, the compiler will guarantee that no safe D code will ever use that code or data unless it is either on the given thread or can guarantee at compile time that it has synchronized on the given object.

You mentioned making the variable thread local. So if I'm understanding, you're saying just make it a regular global variable. However, the point is that if you tell the compiler that it can only be accessed by a single thread, then it doesn't need to be thread local. Real global variables are preferred over thread-local ones for performance/memory reasons: their address is known at compile time and you don't need to allocate a new instance for every thread. The only reason for thread-local variables is to alleviate problems with multithreaded applications, but using an attribute like this would allow someone to have the benefit of a real global variable without exposing it to other threads, fixing the synchronization issue.

D has its own way of handling multithreaded applications, but I still have applications that use the old idioms to get lightning performance and minimize memory usage. A feature like this could solve a lot of the problems that come with the old idioms.

There are many times that I write a function and I have to make a mental note (or a comment) that this function should only ever be called by a certain thread, or that this function should only be called by code that has locked on a certain object. It would be wonderful if the compiler could guarantee that for me.
Jul 11 2014
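The difference between a thread-local variable and a "real" global discussed above can be seen in a few lines of current D. This is only a sketch with made-up names (`perThread`, `oneInstance`, `writeFromWorker`): module-level variables are thread-local by default, while `__gshared` gives a single instance with no compiler checking at all.

```d
import core.thread : Thread;

int perThread = 1;              // thread-local: D's default for module-level variables
__gshared int oneInstance = 1;  // one instance at a fixed address, unchecked by the compiler

// Writes 42 to both variables from a second thread.
void writeFromWorker()
{
    auto worker = new Thread({
        perThread = 42;    // changes the worker's own copy only
        oneInstance = 42;  // changes the one and only instance
    });
    worker.start();
    worker.join();
}

void main()
{
    writeFromWorker();
    assert(perThread == 1);     // the main thread's copy is untouched
    assert(oneInstance == 42);  // the write from the worker is visible here
}
```

The proposal amounts to getting the storage of `oneInstance` while keeping the compiler involved in deciding who may touch it.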
On 2014-07-11 19:07, Jonathan Marler wrote:
> I'm not sure how AST macros would assist in thread safety the way that
> this feature would. Maybe you could elaborate?

Looking at the first example:

    thread:main int mainThreadGlobal;

    thread:main int main(string[] args)
    {
        // Start the worker thread at some point
    }

    thread:worker void workerLoop()
    {
        // do some work, cannot access mainThreadGlobal
    }

This would be implemented as a declaration macro [1], something like this:

    macro thread (Context context, Ast!(string) name, Declaration decl)
    {
        if (decl.isVariable)
            decl.attributes ~= Thread(name);

        else if (decl.isCallable)
        {
            foreach (var ; decl.accessedVariables)
            {
                if (auto attr = var.getAttribute!(Thread))
                    if (attr.name != name)
                        context.compiler.error("Cannot access variable with thread name " ~ attr.name ~ " from callable with thread name " ~ name);
            }
        }

        return decl;
    }

Usage:

    thread("main") int mainThreadGlobal;
    thread("worker") void workerLoop ();

> To explain a little more: when you put a thread:name or sync(object)
> attribute on something, the compiler will guarantee that no safe D code
> will ever use that code or data unless it is either on the given thread or
> can guarantee at compile time that it has synchronized on the given object.
>
> You mentioned making the variable thread local. So if I'm understanding,
> you're saying just make it a regular global variable. However, the point is
> that if you tell the compiler that it can only be accessed by a single
> thread, then it doesn't need to be thread local. Real global variables are
> preferred over thread-local ones for performance/memory reasons: their
> address is known at compile time and you don't need to allocate a new
> instance for every thread. The only reason for thread-local variables is to
> alleviate problems with multithreaded applications, but using an attribute
> like this would allow someone to have the benefit of a real global variable
> without exposing it to other threads, fixing the synchronization issue.

Makes sense.

[1] http://wiki.dlang.org/DIP50#Declaration_macros

-- 
/Jacob Carlborg
Jul 13 2014
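AST macros are a proposal (DIP50), not something the compiler accepts. A small part of the macro's idea can be approximated with user-defined attributes, which D does have. The sketch below is an assumption-laden illustration: `ThreadTag`, `threadTagOf`, `mayAccess`, and `runMain` are invented names, and unlike the macro nothing here discovers which variables a function accesses, so each pairing has to be spelled out by hand.

```d
import std.traits : getUDAs;

// Hypothetical attribute type standing in for thread:name.
struct ThreadTag
{
    string name;
}

@ThreadTag("main") int mainThreadGlobal;

@ThreadTag("main") void runMain() {}
@ThreadTag("worker") void workerLoop() {}

// Compile-time lookup of the tag attached to a symbol.
enum string threadTagOf(alias sym) = getUDAs!(sym, ThreadTag)[0].name;

// True when a callable and a variable carry the same thread tag.
enum bool mayAccess(alias callable, alias variable) =
    threadTagOf!callable == threadTagOf!variable;

// These checks run during compilation; a false condition stops the build.
static assert(mayAccess!(runMain, mainThreadGlobal));
static assert(!mayAccess!(workerLoop, mainThreadGlobal));

void main()
{
}
```

Writing `static assert(mayAccess!(workerLoop, mainThreadGlobal));` would fail the build, which is roughly the error the macro would report on its own.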
On Sunday, 13 July 2014 at 10:45:29 UTC, Jacob Carlborg wrote:
> This would be implemented as a declaration macro [1], something like this:
>
>     macro thread (Context context, Ast!(string) name, Declaration decl)
>     {
>         if (decl.isVariable)
>             decl.attributes ~= Thread(name);
>
>         else if (decl.isCallable)
>         {
>             foreach (var ; decl.accessedVariables)
>             {
>                 if (auto attr = var.getAttribute!(Thread))
>                     if (attr.name != name)
>                         context.compiler.error("Cannot access variable with thread name " ~ attr.name ~ " from callable with thread name " ~ name);
>             }
>         }
>
>         return decl;
>     }
>
> Usage:
>
>     thread("main") int mainThreadGlobal;
>     thread("worker") void workerLoop ();

Ah, I see now. It looks like AST macros are going to open up a lot of new paradigms. I'm excited to see how they progress and what they can do. It doesn't get us all the way there in this example, but it's a very good alternative without having to add anything to the compiler.

Your macro would help the developer ensure that particular variables and functions are only touched by the appropriate code. If the feature only did this, I would use a different name such as "restrict" or something. However, in order for the compiler to perform thread optimizations based on this information, it would have to be a part of the language.

It also doesn't handle the synchronized case. For that you would need knowledge of the execution paths of your functions to determine what parts are locked and what parts are not (unless of course AST macros could support that; I would be very pleasantly surprised if they did).

This AST macro is very intriguing though. It's like you're writing code that can analyze your code as you develop it. This is a really cool idea.

As I've had more time to think on this, here's one of the potential consequences I've thought of.

*********** Deprecating usage of the __gshared hack ***********

Here are some cases in which a developer may think __gshared usage is justified:

1. In a single-threaded application.
2. When the developer knows that it's only possible for one thread to access that global at a time.
3. When the developer has decided that using some type of locking scheme to access the globals is the correct design.

In a single-threaded application you could add the thread attribute with the same name to every single function and variable, but this would be very unnecessary. Instead the developer could use a pragma to tell the compiler to make sure it is single threaded, but since adding this feature would require the compiler to know when threads are started anyway, the compiler could determine that an application was single threaded on its own. The pragma would just be a sanity check so that the developer would be notified when their code has changed to break the initial design.

In the second case, if you know that only one thread will ever access the global variable(s), then you may be willing to take the risk of making the variable(s) __gshared and just remember to make sure you don't break your own rule by using it on another thread. This feature would allow the compiler to verify this at compile time, taking pressure off the developer.

In the third case, the developer has designed the code to use the global(s) so long as they lock on the appropriate object first. This is a huge risk because the safety of the code is up to the developer remembering to check that every access of the global(s) has locked on the appropriate object. Adding the sync(object) attribute would allow the compiler to verify this at compile time for the developer, and the compiler would again have no need to make the globals thread local.

There's one more odd corner case I thought of. Suppose you know that only one thread will access the global data, but this thread could change over the course of the program's life (it is impossible to know at compile time which thread it is). In this case you would use the sync(object) design pattern and just have the appropriate thread lock on the object whenever it is started. Instead of locking before every access, you would lock the object as soon as the thread is started and unlock it when the thread dies. It would also be beneficial if the thread could throw an exception if the object is already locked. This is a different way to use locked objects: instead of using them to synchronize small accesses to shared data, the lock is used to create a "slot" so that only one thread can be in the slot at a time.

***********************************************************************************
Jul 14 2014
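The "slot" pattern described above (lock once when the thread starts, throw if the lock is already taken) can be sketched with druntime's `Mutex.tryLock`. The helper names `claimSlot` and `secondThreadIsRefused` are invented for this illustration. One caveat, noted in the code: `core.sync.mutex.Mutex` is recursive, so only a different thread is turned away.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;
import std.exception : enforce;

// Hypothetical helper: claims the slot for the calling thread or throws.
// Mutex is recursive, so a repeated claim by the SAME thread succeeds;
// only a different thread is rejected.
void claimSlot(Mutex slot)
{
    enforce(slot.tryLock(), "slot is already occupied by another thread");
}

// Returns true if a second thread is refused while the caller holds the slot.
bool secondThreadIsRefused()
{
    auto slot = new Mutex;
    claimSlot(slot);              // the calling thread enters the slot
    scope (exit) slot.unlock();   // and leaves it when this function returns

    bool refused = false;
    auto intruder = new Thread({
        try
        {
            claimSlot(slot);      // expected to throw: the slot is taken
            slot.unlock();        // only reached if the claim succeeded
        }
        catch (Exception e)
        {
            refused = true;
        }
    });
    intruder.start();
    intruder.join();
    return refused;
}

void main()
{
    assert(secondThreadIsRefused());
}
```

This is a run-time check only; the post is asking for the compiler to prove the same property before the program runs.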
On 14/07/14 23:26, Jonathan Marler wrote:
> Ah, I see now. It looks like AST macros are going to open up a lot of new
> paradigms. I'm excited to see how they progress and what they can do. It
> doesn't get us all the way there in this example, but it's a very good
> alternative without having to add anything to the compiler.
>
> Your macro would help the developer ensure that particular variables and
> functions are only touched by the appropriate code. If the feature only did
> this, I would use a different name such as "restrict" or something. However,
> in order for the compiler to perform thread optimizations based on this
> information, it would have to be a part of the language.

Yeah, probably. I think the problem is that threads are not so tightly coupled with the language. They're mostly implemented in the runtime. Perhaps it's possible if you could attach a thread name globally, to indicate the current thread.

When the thread macro is used on a function, it would check if a new thread is created. If it is, it would attach a name to the current thread variable, somewhere in the AST. When the thread macro is used on a variable, it would check the context and get all callers (if possible). It would get the thread name of the caller and see if it matches the current global thread name. Otherwise it would issue an error.

I have no idea if this is possible; if it is, it sounds quite complicated. It would be easy to verify at runtime at least.

> It also doesn't handle the synchronized case. For that you would need
> knowledge of the execution paths of your functions to determine what parts
> are locked and what parts are not (unless of course AST macros could support
> that; I would be very pleasantly surprised if they did).

If you could do something similar to the above and get the caller of a function with the sync macro/attribute, the AST of a synced object (mySyncObject) would have some way to indicate if it's currently synced or not.

> This AST macro is very intriguing though. It's like you're writing code that
> can analyze your code as you develop it. This is a really cool idea.
>
> As I've had more time to think on this, here's one of the potential
> consequences I've thought of.
>
> *********** Deprecating usage of the __gshared hack ***********
>
> Here are some cases in which a developer may think __gshared usage is
> justified:
>
> 1. In a single-threaded application.
> 2. When the developer knows that it's only possible for one thread to
>    access that global at a time.
> 3. When the developer has decided that using some type of locking scheme
>    to access the globals is the correct design.
>
> In a single-threaded application you could add the thread attribute with
> the same name to every single function and variable, but this would be very
> unnecessary. Instead the developer could use a pragma to tell the compiler
> to make sure it is single threaded, but since adding this feature would
> require the compiler to know when threads are started anyway, the compiler
> could determine that an application was single threaded on its own. The
> pragma would just be a sanity check so that the developer would be notified
> when their code has changed to break the initial design.

That would be a lot easier to do in the compiler. Or just remove it from the runtime.

-- 
/Jacob Carlborg
Jul 15 2014
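The run-time verification mentioned above could look roughly like the sketch below, assuming each thread records its own name in a thread-local variable when it starts. All names here (`currentThreadTag`, `requireThread`, `touchMainGlobal`, `workerIsRefused`) are hypothetical, and the check costs a string comparison on every guarded access, which a compile-time attribute would avoid.

```d
import core.thread : Thread;
import std.exception : enforce;

// Hypothetical bookkeeping: thread-local, so every thread has its own tag.
string currentThreadTag;

// A single global instance, as the proposal wants.
__gshared int mainThreadGlobal;

// Throws unless the calling thread carries the expected tag.
void requireThread(string tag)
{
    enforce(currentThreadTag == tag,
        "expected thread '" ~ tag ~ "', running on '" ~ currentThreadTag ~ "'");
}

// Stands in for code tagged thread:main.
void touchMainGlobal()
{
    requireThread("main");
    ++mainThreadGlobal;
}

// Returns true if a thread tagged "worker" is refused access.
bool workerIsRefused()
{
    bool refused = false;
    auto worker = new Thread({
        currentThreadTag = "worker";
        try
        {
            touchMainGlobal();
        }
        catch (Exception e)
        {
            refused = true;
        }
    });
    worker.start();
    worker.join();
    return refused;
}

void main()
{
    currentThreadTag = "main";
    immutable before = mainThreadGlobal;

    touchMainGlobal();                       // allowed on the main thread
    assert(mainThreadGlobal == before + 1);

    assert(workerIsRefused());               // rejected on the worker
    assert(mainThreadGlobal == before + 1);  // and the global is unchanged
}
```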
On 07/10/2014 08:12 PM, Jonathan Marler wrote:
> So what do people think?

How do you make sure there is at most one thread of each kind?
Jul 11 2014
On Friday, 11 July 2014 at 18:56:07 UTC, Timon Gehr wrote:
> On 07/10/2014 08:12 PM, Jonathan Marler wrote:
>> So what do people think?
>
> How do you make sure there is at most one thread of each kind?

Good question. First, since the language doesn't support starting threads itself (like Go) but instead uses a library, the compiler would likely need to be modified to semantically understand whenever a line of code is starting a thread (I'm assuming it doesn't already). If this feature were interesting enough, I'm sure Walter would have an opinion on the right way to accomplish this.

Then how do you make sure that every named thread is only started once? The ideal situation would be to verify this at compile time. This is possible in some situations. If it is not possible to verify this at compile time, then the compiler could generate a synchronized global pointer to every named thread to prevent each one from getting started more than once. However, one thought that comes to mind is that if the developer cannot change the code so that it can be verified at compile time that the thread is only started once, then maybe their code is poorly designed or they are using this feature incorrectly.

This is just a random thought I had after writing this, but maybe if you could somehow tell the compiler that a section of code will only ever be executed once, it would help in this analysis. executeonce. The main function would obviously only be executed once, so any function that executes once would need to be called at most once, and you can directly trace where it is called from the main thread. However, I'm not sure how useful this feature would be in the general case...I would have to think on it more.
Jul 12 2014
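The run-time fallback mentioned above (a synchronized global per named thread that prevents a second start) could be sketched with an atomic flag instead of a pointer. This is an illustration under that assumption, not what a compiler would necessarily generate; `startOnce` and `workerLoop` are made-up names, and `cas` is the compare-and-swap from `core.atomic`.

```d
import core.atomic : cas;
import core.thread : Thread;
import std.exception : enforce;

// Hypothetical guard: starts `fn` on a new thread the first time it is
// called for a given flag and throws on every later attempt.
Thread startOnce(ref shared bool started, void function() fn)
{
    // Atomically flips the flag from false to true; fails if it is already true.
    enforce(cas(&started, false, true), "this named thread was already started");
    auto t = new Thread(fn);
    t.start();
    return t;
}

void workerLoop()
{
    // do some work
}

void main()
{
    shared bool workerStarted;   // one flag per named thread

    startOnce(workerStarted, &workerLoop).join();

    bool refused = false;
    try
    {
        startOnce(workerStarted, &workerLoop);
    }
    catch (Exception e)
    {
        refused = true;
    }
    assert(refused);   // the second start is rejected
}
```

Note that the flag is never reset here, so "at most once" means once per program run, even after the first thread has finished.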