digitalmars.D - Dependent typing and Refinement Typing for D
- sighoya (Dec 20 2019)
- Ola Fosheim Grøstad (Dec 20 2019)
- sighoya (Dec 21 2019)
- Ola Fosheim Grøstad (Dec 21 2019)
- sighoya (Dec 21 2019)
- Ola Fosheim Grøstad (Dec 21 2019)
- sighoya (Dec 21 2019)
- Ola Fosheim Grøstad (Dec 21 2019)
- sighoya (Dec 22 2019)
- Ola Fosheim Grøstad (Dec 22 2019)
- sighoya (Dec 21 2019)
- sighoya (Dec 21 2019)
- bpr (Dec 21 2019)
- sighoya (Dec 22 2019)
Reviving bearophile's thread https://forum.dlang.org/thread/gjkzgykpgwqryfxonnsy

> https://github.com/tomprimozic/type-systems/tree/master/refined_types
>
>> The implementation of a refined type-checker is actually very straightforward and turned out to be much simpler than I expected. Essentially, program expressions and contracts on function parameters and return types are converted into a series of mathematical formulas and logical statements, the validity of which is then assessed using the external automated theorem prover Z3.
>
> It seems possible to integrate Z3 with D contract programming.
>
> Bye,
> bearophile

I find it easier if we do it the way D already does things at compile time. With gradual typing, the checks are done at runtime, and you don't need to propagate the runtime restrictions in the if clause up the call chain, as non-gradual dependent typing does with compile-time checking.

The idea is that you can apply any List and any Integer to "get(List list, int index) if index < list.length": the compiler inserts a runtime check for the condition in the if clause and throws an IndexOutOfBoundException when it fails. If the compiler is clever enough, it can remove some of these runtime assertions when the enclosing function already guarantees them, but combined with overload resolution that might be problematic and require some sort of SAT solving; alternatively, we do it only partially and give errors for the negative cases instead.

The point is that even with non-gradual dependent typing you have to make a runtime assertion at some point; you never get runtime dependent typing at zero cost. Furthermore, gradual dependent typing fits easily into D's existing ecosystem.

I would use it purely for argument constraints, not for result constraints, since the latter incur unnecessary runtime checking except in unit tests, where they make sense.
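A rough sketch of the intended lowering in today's D (the surface `if` clause is the proposal's hypothetical syntax, and `ConstraintViolationException` is the proposal's made-up name, not an existing druntime type):

```d
import std.format : format;

// Hypothetical exception type from the proposal, not part of druntime.
class ConstraintViolationException : Exception
{
    this(string msg) { super(msg); }
}

// Proposed surface syntax (not valid D today):
//     ElemType get(List list, int index) if index < list.length
// The compiler would lower this to roughly:
int get(int[] list, int index)
{
    if (!(index >= 0 && index < list.length))
        throw new ConstraintViolationException(
            format("violated: index < list.length (index=%s, length=%s)",
                   index, list.length));
    return list[index];
}
```

The check runs at the boundary of the call, so the function body itself can assume the constraint holds.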
Dec 20 2019
On Friday, 20 December 2019 at 22:06:03 UTC, sighoya wrote:

> I find it easier if we do it the way D already does things at compile time. With gradual typing, the checks are done at runtime, and you don't need to propagate the runtime restrictions in the if clause up the call chain, as non-gradual dependent typing does with compile-time checking.

I understand what you mean, but I don't really think you mean gradual typing. That would be more common in dynamic languages, where you may choose to gradually apply more and more constraints. So you can start with a runtime-checked prototype and end up with a statically checked final program. Which is kinda cool, but not possible for D.

> If the compiler is clever enough, it can remove some of these runtime assertions when the enclosing function already guarantees them, but combined with overload resolution that might be problematic and require some sort of SAT solving; alternatively, we do it only partially and give errors for the negative cases instead.

You need more machinery and annotations than just a SAT solver. I think it is better for D to instead explore what it already does, that is, more thorough data flow analysis or abstract interpretation. Essentially, add flow typing and refinement typing (of some sort).

> I would use it purely for argument constraints, not for result constraints, since the latter incur unnecessary runtime checking except in unit tests, where they make sense.

Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such a constraint. Generally speaking, you need heavy inter-procedural analysis?
Dec 20 2019
> I understand what you mean, but I don't really think you mean gradual typing. That would be more common in dynamic languages, where you may choose to gradually apply more and more constraints.

I mentioned exactly gradual typing, and D already has dynamic typing with regard to generic constraints, though D's duck typing happens at compile time. Gradual typing at runtime for runtime values would be the dual of compile-time duck typing for compile-time values.

> So you can start with a runtime-checked prototype and end up with a statically checked final program.

No, I would only use runtime checking. Even general dependent typing requires runtime assertions at some point, aka path-dependent typing a la Scala.

> Which is kinda cool, but not possible for D.

Why?

> You need more machinery and annotations than just a SAT solver.

Why? You only need SAT solving in cases where you want to propagate compiler-inserted runtime assertions upwards, or for overload resolution. It doesn't have to be perfect, though, and we could help with some kind of annotation, i.e. manually specifying the overload order.

> I think it is better for D to instead explore what it already does, that is, more thorough data flow analysis or abstract interpretation. Essentially, add flow typing and refinement typing (of some sort).

This is non-gradual dependent typing (general/classic dependent typing). First, it is very compile-time consuming because of the tracking required: e.g. proving that a list is extended by m additional members requires tracking the number of push calls in scope. Second, it is generally undecidable, so you need to insert proofs manually, which you don't need for gradual dependent typing, because all runtime variables become constants at runtime.

> Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such a constraint.

No, return values depending on input values would be evaluated at runtime before the function is called. The point is the following: it seems to add additional runtime checking, but you do this anyway in lists and vectors to throw IndexOutOfBoundException or NullPointerException. Instead of doing it manually, the compiler can automate this exception handling by inserting (at compile time) runtime asserts before the function is called. Instead of an NPE or IndexOutOfBound you get a ConstraintViolationException.
Dec 21 2019
On Saturday, 21 December 2019 at 12:52:47 UTC, sighoya wrote:

> No, I would only use runtime checking. Even general dependent typing requires runtime assertions at some point, aka path-dependent typing a la Scala.

I don't know Scala, how does that work?

> Why? You only need SAT solving in cases where you want to propagate compiler-inserted runtime assertions upwards, or for overload resolution. It doesn't have to be perfect, though, and we could help with some kind of annotation, i.e. manually specifying the overload order.

It will be very limited, considering you can call functions in assertions. So you need to emit verification conditions to get anywhere, and that takes quite a bit of machinery.

> First, it is very compile-time consuming because of the tracking required: e.g. proving that a list is extended by m additional members requires tracking the number of push calls in scope.

Verification is time consuming; dataflow/abstract interpretation is more reasonable (but less powerful).

> Second, it is generally undecidable, so you need to insert proofs manually, which you don't need for gradual dependent typing, because all runtime variables become constants at runtime.

Yes, it will not always succeed. However, you might be able to cache results from previous compiles or make it optional (i.e. just emit dynamic checks where applicable).

> No, return values depending on input values would be evaluated at runtime before the function is called.

After?

> The point is the following: it seems to add additional runtime checking, but you do this anyway in lists and vectors to throw IndexOutOfBoundException or NullPointerException. Instead of doing it manually, the compiler can automate this exception handling by inserting runtime asserts before the function is called.

Contracts are supposed to have their asserts evaluated at the calling point; that is just not implemented in D, IIRC.
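For comparison, a per-call precondition in today's D contract syntax (the expression form available since DMD 2.081; checked via assert in non-release builds):

```d
// An `in` contract in today's D: the precondition is asserted on entry
// in non-release builds; a violation raises an AssertError.
int get(int[] list, size_t index)
in (index < list.length)
{
    return list[index];
}
```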
Dec 21 2019
> Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such a constraint.

If you are suggesting taking values from the function body into the return value expression, then no, we shouldn't allow that. If we have a constant value expression for the return value in the function signature, then we would indeed need to check all possible paths leading to this value; therefore I would only use it for unit tests, since unit tests could then be executed at compile time (because the inputs are constants). Another option would be to add a runtime assertion before returning a value inside the function body.
Dec 21 2019
On Saturday, 21 December 2019 at 13:13:26 UTC, sighoya wrote:

> If you are suggesting taking values from the function body into the return value expression, then no, we shouldn't allow that.

I think it would be easier to follow what the scope of your idea is if we looked at some pseudo code examples?

> If we have a constant value expression for the return value in the function signature, then we would indeed need to check all possible paths leading to this value; therefore I would only use it for unit tests, since unit tests could then be executed at compile time (because the inputs are constants).

Yes, that is a good point actually. Verification is kinda like unit tests at compile time. :-)
Dec 21 2019
> I think it would be easier to follow what the scope of your idea is if we looked at some pseudo code examples?

Good point:

    ElemType method(Array array)
    {
        uint index = readUInt(...);
        return array(index);
    }

with

    ElemType array(uint index) if index < array.length

becomes:

    ElemType method(Array array)
    {
        uint index = readUInt(...);
        if (index < array.length)
        {
            return array(index);
        }
        else
        {
            throw new ContradictionException(
                "Violation of constraint index < array.length where index=$index and array.length=$array.length");
        }
    }

If we have

    ElemType method(Array array, uint index) if index < array.length
    {
        return array(index);
    }

instead, then the compiler could check whether the above constraint implies the constraint for array(index), but that may be complicated; maybe we add some hint to the compiler that it is trivial to check:

    @constraintsForMethod
    ElemType method(Array array, uint index) if index < array.length
    {
        return array(index) impliedBy(constraintsForMethod);
    }

such that no ContradictionException can be thrown inside "method", but before each call of "method" at the call site, unless the call is implied by the constraint of the enclosing function, in which case the runtime assertion propagates further up.

Overload resolution:

    method1(uint32 i, uint32 j, uint32 k) if i^2 < j^2 < k^2
    method2(int64 i, int64 j, int64 k) if i^2 < j^2 < k^2

may be hard to grok for the compiler, and likewise for Z3; a better approach would be to reject it and require the user to give an annotation:

    @MoreSpecificThan(method2)
    method(uint32 i, uint32 j, uint32 k) if i^2 < j^2 < k^2

    method(int64 i, int64 j, int64 k) if i^2 < j^2 < k^2

Dispatching of "method" is then done at runtime as follows:

- If i, j, k are applicable to exactly one of the methods, that method is chosen.
- If i, j, k are applicable to none of them, a ContradictionException is thrown.
- If i, j, k are applicable to more than one, the most specific one is chosen by following the annotations given to them.

Unit test:

    void method():a if 2 < a < 10
    {
        testObject = mock(...);
        return functionToTest(testObject); // will be run at compile time in order to check the signature of method
    }

> Verification is time consuming; dataflow/abstract interpretation is more reasonable (but less powerful).

What's the difference? Can you give an example?
Dec 21 2019
On Saturday, 21 December 2019 at 14:51:39 UTC, sighoya wrote:

> instead, then the compiler could check whether the above constraint implies the constraint for array(index), but that may be complicated; maybe we add some hint to the compiler that it is trivial to check:

Ok, I believe I understand, but how is that dynamic check different from the in/out contracts that D has? Well, except that in-contracts should be inlined before the function is called.

> such that no ContradictionException can be thrown inside "method", but before each call of "method" at the call site, unless the call is implied by the constraint of the enclosing function, in which case the runtime assertion propagates further up.

Understood.

> If i, j, k are applicable to exactly one of the methods, that method is chosen. If i, j, k are applicable to none of them, a ContradictionException is thrown. If i, j, k are applicable to more than one, the most specific one is chosen by following the annotations given to them.

Do you want to have overloading on dependent/refined types? That would be interesting, but then it has to be resolved at compile time...

> What's the difference? Can you give an example?

Verification tries to prove assertions by transforming the code into something that can be handled by an SMT/SAT solver. It is often done by over-approximation (gross oversimplification: putting a line between two points instead of calculating the exact positions of the points).

Flow analysis/abstract interpretation tries to collect information over the various paths in the code, usually using very specific measures that are dealt with in an approximate manner, like keeping track of variables that are negative/positive/unknown, or of value ranges. It can move both backwards and forwards.
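D's front end already does a small, forward-only version of this kind of range tracking via value range propagation:

```d
// Value range propagation in today's D: the compiler tracks per-expression
// value ranges and allows narrowing conversions it can prove safe.
void vrpExample(int x)
{
    ubyte b = x & 0x3F; // ok: range of (x & 0x3F) is 0..63, fits in ubyte
    // ubyte c = x;     // error: int's full range doesn't fit in ubyte
}
```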
Dec 21 2019
> Ok, I believe I understand, but how is that dynamic check different from the in/out contracts that D has? Well, except that in-contracts should be inlined before the function is called.

In addition to contracts, we provide overloading over these conditions and optional upwards propagation, both of which require either SAT/SMT/flow typing or manual annotations.

> Do you want to have overloading on dependent/refined types?

Absolutely.

> That would be interesting, but then it has to be resolved at compile time...

Yes, as I said, together with constraint propagation it requires SAT/SMT/flow typing or manual annotations; I prefer the latter for simplicity. It doesn't have to be perfect; even the upcoming overload resolution of C++20 isn't perfect.

> Verification tries to prove assertions by transforming the code into something that can be handled by an SMT/SAT solver. It is often done by over-approximation (gross oversimplification: putting a line between two points instead of calculating the exact positions of the points).

Hmm, I have to inform myself again; I thought it was equivalent to running the simplex algorithm (in the linear case), but it doesn't seem to have an objective.

> Flow analysis/abstract interpretation tries to collect information over the various paths in the code, usually using very specific measures that are dealt with in an approximate manner, like keeping track of variables that are negative/positive/unknown, or of value ranges. It can move both backwards and forwards.

Understood, but what about more general cases? If we know x+y <= z in the if branch of a function f, and in this branch we call a function g requiring x^2+y^2 = z^2, I think we need SAT/SMT. In simple cases, however, we can just fall back to simple static refinement types like:

    enum {1,2,3} x = ...
    enum {"hello","world","!"} x = ...
    enum(n,m) {"repetition{$n,$m}"} x = ...
    enum Range(n,k,m) {n..k..m} x = ...

It would be nice if D supported these kinds of things. As long as we constrain a single variable, solving compatibility is (mostly) easy.

I wasn't aware of D's contracts; perhaps we should extend their semantics for overload resolution and constraint propagation in a backward compatible manner. I think this is possible.
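Something close to the `enum Range(n,k,m)` idea can be approximated in today's D with a wrapper struct whose invariant and constructor contract carry the refinement (a sketch; `Ranged` is a made-up name for illustration):

```d
// Sketch of a range-refined integer: the refinement n <= value <= m is
// checked at construction and, via the invariant, around member calls
// (in non-release builds).
struct Ranged(int n, int m)
{
    private int value;

    // Disable default construction, since .init could violate the invariant.
    @disable this();

    invariant (n <= value && value <= m);

    this(int v)
    in (n <= v && v <= m)
    {
        value = v;
    }

    int get() const { return value; }
}
```

The check is dynamic rather than static, but the refinement travels with the type, which is the part the proposal wants.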
Dec 22 2019
On Sunday, 22 December 2019 at 13:25:31 UTC, sighoya wrote:

> Yes, as I said, together with constraint propagation it requires SAT/SMT/flow typing or manual annotations; I prefer the latter for simplicity. It doesn't have to be perfect; even the upcoming overload resolution of C++20 isn't perfect.

Hm, C++ has only added deduction guides, nothing to do with dependent types? Isn't that mostly a convenience feature?

> Hmm, I have to inform myself again; I thought it was equivalent to running the simplex algorithm (in the linear case), but it doesn't seem to have an objective.

In order to speed things up, SMT solvers use over-approximation, so that they don't have to trace the full behaviour of a function, but can instead deduce, say, that a function always stays below a line (or something else). It is an optimization. They model types/sorts like integers, so how you formulate the problem can affect performance, and indeed whether the problem is solved at all. That makes automation a bit tricky.

> Understood, but what about more general cases? If we know x+y <= z in the if branch of a function f, and in this branch we call a function g requiring x^2+y^2 = z^2, I think we need SAT/SMT.

Dataflow is usually not precise, so if you want precise results to overload on, then you need SAT/SMT.
Dec 22 2019
> Yes, it will not always succeed. However, you might be able to cache results from previous compiles or make it optional (i.e. just emit dynamic checks where applicable).

That is an option, but release builds will be slow; runtime checking isn't good in this case, as you would be proving argument-to-return-value relationships each time a function is called. I know that general dependent typing is right from an idiomatic point of view, but it has the following drawbacks:

1.) Function signatures become much more complicated than gradually dependent-typed function signatures, in that all return values are related to their input arguments.

2.) You can't characterize all properties of a function/type in the signature; you don't know them all immediately, the compiler can't infer them for you, and adding some properties later might break your code.

3.) Signatures become more sensitive to changes (unstable); changing them causes a cascade of errors propagating up the stack.

4.) It doesn't fit D, as D is already lazy (and unsafe) with compile-time duck typing; e.g. you can create generic entities which can never be valid for any type parameter. Furthermore, type errors are thrown at specialization time, deep in the call stack. Doing it differently for runtime values would be an idiomatic break with how D handles this.

Theoretically, general dependent typing could replace all kinds of unit testing, but to do so you would need to write the whole function body as the return value in the function signature, which is unreadable, and it is undecidable for the compiler to infer all such properties. It would further break the idiom of encapsulation, which is important for any kind of commercial use. So practically, you still need to unit test.
Dec 21 2019
Generally speaking, gradual dependent typing is fine for enterprises: if they don't want to get accustomed to it, they can simply choose not to propagate additional constraints up the stack, whereas general dependent typing forces them to, giving them a second Rust.
Dec 21 2019
On Friday, 20 December 2019 at 22:06:03 UTC, sighoya wrote:

> . . .

I realize this is a D forum, but it seems highly unlikely to me that you'll be able to get anywhere trying to add this stuff to D. If you don't know it, there is a research language called F* (Fstar) that integrates dependent types, refinement types, the Z3 prover, and more. It has a low-level subset called Low* and a backend which can emit verified C, and it has been used to implement verified cryptographic libraries.
Dec 21 2019
On Sunday, 22 December 2019 at 00:52:56 UTC, bpr wrote:

> I realize this is a D forum, but it seems highly unlikely to me that you'll be able to get anywhere trying to add this stuff to D.

The point is that D already offers part of the kind of dependent typing I'm searching for; we just need to extend it in some ways to make it more usable. Further, some kinds of refinement types would be nice, like:

    enum Range(n,k,m) {n..k..m}

D already offers some limited versions of these things, e.g. static arrays, which can be seen as a static refinement of dynamic arrays.

> If you don't know it, there is a research language called F* (Fstar) that integrates dependent types, refinement types, the Z3 prover, and more.

Yes, it has general (non-gradual) dependent typing, which I think is too much for a pragmatic language like D. And really, who knows F*? It will stay a research language only. Idris and Agda are candidates as well, but they don't support refinement types directly; instead, such types have to be defined over some weird successor machinery. I think F*, Idris and Agda are too alien because of their ML-like syntax and their dogmatism about functional programming.

> It has a low-level subset called Low* and a backend which can emit verified C, and it has been used to implement verified cryptographic libraries.

Interesting, I may look into it.
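The static-array observation holds in plain D today: the length is part of the type, so a length mismatch is a compile-time error rather than a runtime one:

```d
// A static array's length is part of its type, making it a
// compile-time refinement of a dynamic array.
void takesThree(int[3] v) {}

void demo()
{
    int[3] a = [1, 2, 3];
    takesThree(a);       // ok: the type system guarantees the length
    // int[2] b = [1, 2];
    // takesThree(b);    // compile error: int[2] is not int[3]
}
```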
Dec 22 2019