digitalmars.D - Common Issue in Shared Code
- Andrew Wiley (84/84) Nov 20 2011
About a month or so ago, I started trying to convert a codebase I've been working on into a multithreaded system, and I've been hitting this sort of thing over and over:

--------
// used as a field and as a local variable all over the codebase
struct Data {
    int a,b,c;
    int total() {
        return a + b + c;
    }
}

// has a Data as one of its members but never escapes a pointer to it
class Bob {
private:
    Data _dat;
public:
    int currentTotal() {
        return _dat.total();
    }
}
--------

Now, as part of my multithreaded refactor, I need to make Bob synchronized, but that means the Data field inside it is shared, which means I can no longer call the total() method in currentTotal().

To fix this, I could make Data synchronized as well, but Data is used all over the codebase, most of the time as a local variable inside a function. In my particular case, I see this a lot with a struct that represents a location, which is just 2 bytes in my codebase, so adding a monitor would more than double the size, and the locking overhead would be completely unnecessary.

If I don't want to make it synchronized, I could just cast away shared everywhere I use it as a field, which looks ugly and is confusing when I look at the codebase.

If I don't want to cast away shared, I could just make Data shared and assume that the owner will make sure it's not shared improperly, but at this point I've disabled all help the type system could provide me.

Firstly, according to TDPL:

--------
For synchronized methods:
"Maybe not very intuitively, the temporary nature of synchronized entails the rule that no address of a field can escape a synchronized address. If that happened, some other portion of the code could access some data beyond the temporary protection conferred by method-level synchronization."

For synchronized classes:
• All numeric types are not shared (they have no tail) so they can be manipulated normally.
• Array fields declared with type T[] receive type shared(T)[]; that is, the head (the slice limits) is not shared and the tail (the contents of the array) remains shared.
• Pointer fields declared with type T* receive type shared(T)*; that is, the head (the pointer itself) is not shared and the tail (the pointed-to data) remains shared.
• Class fields declared with type T receive type shared(T). Classes are automatically by-reference, so they're "all tail."

These rules apply on top of the no-escape rule described in the previous section. One direct consequence is that operations affecting direct fields of the object can be freely reordered and optimized inside the method, as if sharing has been temporarily suspended for them - which is exactly what synchronized does.
--------

At a first glance, it seems like the first rule should apply for structs (which would mean it should address "value types"), but it can't because a struct could contain a reference to another object, and that reference should be transitively shared. Typing a struct as shared if it contains a reference and unshared otherwise would just be confusing, but this use case is one that the language does not currently address in a satisfying way.

When I flag a type as shared, all instances of it are forced to become shared, but the compiler assumes that the programmer has properly synchronized things such that sharing instances of the type is safe. Why, then, can I not force the compiler to assume I've properly synchronized things for a field of a class? In this case, the effect would be the opposite - the field wouldn't be flagged as shared, but supposing we had such a keyword, it would act as a much more limited version of the "shared" keyword because I'm only forcing the compiler to assume I've done things properly within the context of a class. The keyword would have to be restricted such that it could only be applied to private fields, and the compiler would continue to enforce (as much as is reasonable) that the address of the field does not escape.

I believe that this case of data sharing will appear and frustrate programmers in almost any multithreaded program, and that finding a satisfying solution to allow the language to provide as many guarantees as possible is worthwhile.

Any thoughts?
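
For concreteness, the cast-away-shared workaround mentioned above might look roughly like the sketch below. It assumes Bob is declared as a synchronized class, so the monitor is held inside every method. The initial field values, the update() method, and main() are hypothetical additions for illustration only. The cast is sound only under the assumption that no pointer to _dat ever escapes the method.

```d
// Plain value type, still usable unsynchronized as a local variable.
struct Data
{
    int a, b, c;

    int total()
    {
        return a + b + c;
    }
}

// Methods of a synchronized class lock the object's monitor, and the
// fields are typed as shared inside them.
synchronized class Bob
{
    // Illustrative initial values (not from the original post).
    private Data _dat = Data(1, 2, 3);

    int currentTotal()
    {
        // Assumption: the monitor is held and the pointer does not escape,
        // so viewing the field as unshared is safe for this call.
        return (cast(Data*) &_dat).total();
    }

    // Hypothetical mutator, using the same cast to write the field.
    void update(int a, int b, int c)
    {
        *(cast(Data*) &_dat) = Data(a, b, c);
    }
}

void main()
{
    auto bob = new shared(Bob);
    assert(bob.currentTotal() == 6);

    bob.update(10, 20, 30);
    assert(bob.currentTotal() == 60);
}
```

Data itself stays free of monitors and shared qualifiers, but each cast is an unchecked promise to the compiler, which is exactly the ugliness described above.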
Nov 20 2011