digitalmars.D - [GSoC Proposal] Statically Checked Measurement Units
- Cristi Cobzarenco (128/128) Mar 28 2011 First, let me apologize for this very late entry, it's the end of univer...
- David Nadlinger (37/40) Mar 28 2011 This is by no means a late proposal – the application period has not
- Cristi Cobzarenco (88/128) Mar 28 2011 - I too was playing around with a units project before GSoC, that is why...
- David Nadlinger (46/107) Mar 29 2011 I am in a slight dilemma, because although I would love to share my work...
- Cristi Cobzarenco (29/146) Mar 29 2011 Surely, .mangleof returns unique strings? Thanks for your offer, but in ...
- David Nadlinger (62/67) Mar 29 2011 Yes, .mangleof returns unique strings for types. The stringof property
- Jens Mueller (4/10) Mar 28 2011 T.stringof where T is some type gives you the name of the type at
- Andrei Alexandrescu (29/36) Mar 28 2011 [snip]
- Cristi Cobzarenco (33/72) Mar 28 2011 Thanks for your answer!
- Jonathan M Davis (9/12) Mar 28 2011 Well, I can't say what's possible before we actually have a proposed uni...
- spir (8/13) Mar 28 2011 I would implement something like Categorical in a language that has no e...
- Andrei Alexandrescu (4/16) Mar 28 2011 A categorical type may not have a name for each value (userid, cityid,
- Don (6/33) Mar 29 2011 This is one of those features that gets proposed frequently in multiple
- Cristi Cobzarenco (66/99) Mar 29 2011 To Don:
- Don (23/57) Mar 29 2011 I'm a physicist and most of my programming involves quantities which
- Cristi Cobzarenco (63/119) Mar 29 2011 To David:
- David Nadlinger (59/84) Mar 29 2011 To be honest, I still don't see how you are able to get away without
- Cristi Cobzarenco (43/134) Mar 29 2011 Well they don't _have_ to be the same type as long they're convertible t...
- David Nadlinger (3/6) Mar 29 2011 But how would you make them _implicitly_ convertible then?
- Cristi Cobzarenco (13/20) Mar 30 2011 By making the operators on quantity templates:
- David Nadlinger (23/24) Mar 30 2011 opAssign isn't taken into consideration when initializing variables or
- Andrej Mitrovic (27/27) Mar 30 2011 Maybe OT, but here's some hackish wizardry you can do with classes:
- Cristi Cobzarenco (8/33) Mar 30 2011 Yeah, you're right (case (1) also works with a template ctor as well - i...
- Cristi Cobzarenco (6/33) Mar 30 2011 Hmmm, the only problem with this is that we would have to require the
- Cristi Cobzarenco (5/45) Apr 01 2011 Ok, my proposal is up, I'm looking forward to feedback.
- Andrei Alexandrescu (4/13) Mar 29 2011 Many of my bugs involving numeric code is that I mix scalars with units,...
- spir (49/90) Mar 29 2011 Have you considered
- Andrei Alexandrescu (18/51) Mar 29 2011 At work we use C++ enums for categorical types to great effect. The way
- spir (8/57) Mar 29 2011 Waow, this is a great explanation of expected benefits of units, I guess...
- Andrei Alexandrescu (5/8) Mar 29 2011 Typedefs would not allow defining categorical types (e.g. no
First, let me apologize for this very late entry; it's the end of university and it's been a very busy period. I hope you will still consider it. Note that this email is best read using a fixed font.

PS: I'm really sorry if this is the wrong mailing list to post to, and I hope you'll forgive me if that's the case.

======= Google Summer of Code Proposal: Statically Checked Units =======

Abstract
-------------
Measurement units allow one to statically check the correctness of assignments and expressions at virtually no performance cost and with very little extra effort. When it comes to physics the advantages are obvious – if you try to assign a force to a variable measuring distance, you've most certainly got a formula wrong somewhere along the way. Also, showing a sensor measurement in gallons on a litre display that keeps track of the remaining fuel of a plane (a big no-no) is easily avoidable with this technique. What this translates to is that one more of the many hidden assumptions in source code is made visible: units naturally complement other contract checking techniques, like assertions, invariants and the like. After all, the unit that a value is measured in is part of the contract.

The scope of measurement units is not limited to physics calculations, however, and if the feature is properly implemented and is very easy to use, creating very domain-specific units helps a great deal with checking code for correctness at compile time. Static typing doesn't cut it sometimes: imagine two variables counting different things – Gadgets and Widgets. While both values should be ints, one of them should probably not be assignable to the other. Or imagine a website that calculates the number of downloads per second, but uses a timer that counts milliseconds. When one thinks about it this way, there are a great many cases where units can help prevent common errors.
Given D's focus on contract-based design and language features supporting it, I think statically checked measurement units fit very naturally into the standard library, and the language's metaprogramming features would make it very clean to implement (as opposed to a similar effort in C++). I think a […] provided by Boost.Units:

1. Defining unit systems, like Boost.Units requires, is extra effort, so units for counting Widgets or Gadgets would be awkward to use and we would lose the safety checks there.
2. […] them.
3. The sort of silent conversion that Boost.Units performs is undesirable in many cases since it is a recipe for precision disasters: imagine accidentally assigning a variable measured in billions of years to one measured in picoseconds. Boost.Units would silently convert one to the other, since they measure the same dimension. This probably results in the value being set to +INF, and even if it doesn't, one very rarely actually intends to perform this conversion.
4. Setting numerical ids to units and dimensions is cumbersome.

Thus, the requirements for the unit system would be:

1. One line definition of new units.
2. Simple, yet safe and explicit conversion between units.
3. Zero runtime overhead.
4. Minimal extra coding effort to use units.

Interface Overview
---------------------------
A Boost-type approach to the library interface would be:

    struct Metre : SomeUnitBaseType!(...) {}
    struct Second : SomeUnitBaseType!(...) {}
    typedef DerivedUnit!(Metre,1,Second,-1) MetresPerSecond;
    typedef DerivedUnit!(Metre,2) MetresSquared;

    Metre metre, metres;
    Second second, seconds;
    MetresPerSecond metresPerSecond;
    MetresSquared metreSquared, metresSquared;

    void f() {
      Quantity!(metre) dist1 = 3.0 * metres;
      Quantity!(metreSquared) area = dist1 * dist1;
      Quantity!(metresPerSecond) speed = dist1 / (2.0*seconds);
    }

This is very cumbersome and fails on the one line requirement. I propose using types for base units and mixins to define derived units.
One can use the typenames of the units in arithmetic operations this way:

    struct metre {}
    struct second {}

    void f() {
      Quantity!("metre") dist1 = quantity!(3.0, "metre");
      Quantity!("metre^2") area = dist1 * dist1;
      Quantity!("metre/second") speed = dist1 / quantity!(2.0, "second");
    }

Conversion between units can be done by specifying a single factor with a proper unit:

    template conversion( alias unit : "kilometre/metre" ) {
      immutable Quantity!(unit) conversion = quantity!(0.001,unit);
    }

    void f() {
      Quantity!("metre") d1 = quantity!(123.0,"metre");
      // convert calls conversion! with the right argument
      Quantity!("kilometre") d2 = convert!(d1,"kilometre");
    }

Also, notice that this approach imposes no restriction on the types that define units, therefore our Widget/Gadget counters could be defined without any extra work:

    class Widget { /* complicated class definition */ }
    class Gadget { /* complicated class definition */ }

    Quantity!("Widget",int) nWidgets;
    Quantity!("Gadget",int) nGadgets;

About Me
------------
I am an undergraduate student at the University of Edinburgh in Scotland doing a degree in Computer Science and Artificial Intelligence, originally from Romania where I finished a specialised Computer Science high school (Colegiul National de Informatica "Tudor Vianu"). My first language is C++, which I started learning when I was 9, and as a result I have a very good understanding of template metaprogramming. I also know Haskell and Python well, which helps me draw from multiple paradigms when designing a system. I started learning D about a year ago and I instantly fell in love with it. The fact that it does away with all the annoying C backwards compatibility, improves on the features that make C++ unique (the template system, performance, low level memory access etc.) and adds modern language features (garbage collection, lambda functions etc.) makes me very optimistic about the project.
In terms of working experience, other than a myriad of personal projects, I worked for my former high school for a summer, implementing an automated testing system, and last year I led a team that won a software development competition organised by our computing society. This year I took part in the Scottish Game Jam, where my team ended up in 8th place.

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
Mar 28 2011
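The core of the proposal above (any type can act as a unit, mismatched assignments are rejected at compile time, conversions are explicit) can be sketched in a few lines of present-day D. This is only an illustration under assumptions: `Quantity` here takes the unit as a type rather than a string, and `quantity`, `convert` and the explicit `factor` argument are hypothetical stand-ins, not the interface the proposal would end up with.

```d
import std.stdio : writeln;

// Hypothetical unit tags. Any type can serve as a unit; no unit system is needed.
struct Metre {}
struct Kilometre {}
class Widget { /* complicated class definition */ }
class Gadget { /* complicated class definition */ }

// Hypothetical wrapper: a value of type T tagged with a unit type.
struct Quantity(Unit, T = double)
{
    T value;

    // Addition and subtraction are only defined for operands of the same unit.
    Quantity opBinary(string op)(Quantity rhs) const
        if (op == "+" || op == "-")
    {
        return Quantity(mixin("value " ~ op ~ " rhs.value"));
    }
}

// Hypothetical helper, analogous to quantity!(3.0, "metre") in the proposal.
auto quantity(Unit, T)(T value)
{
    return Quantity!(Unit, T)(value);
}

// Explicit conversion: the caller supplies the factor (number of To per From).
Quantity!(To, T) convert(To, From, T)(Quantity!(From, T) q, T factor)
{
    return Quantity!(To, T)(q.value * factor);
}

void main()
{
    auto nWidgets = quantity!Widget(3);
    auto nGadgets = quantity!Gadget(4);
    nWidgets = nWidgets + quantity!Widget(2);
    writeln(nWidgets.value); // prints 5

    // Mixing units is rejected at compile time, not at runtime.
    static assert(!__traits(compiles, nWidgets = nGadgets));
    static assert(!__traits(compiles, nWidgets + nGadgets));

    // Conversions never happen silently; they have to be requested.
    auto km = quantity!Kilometre(1.5);
    auto m = convert!Metre(km, 1000.0);
    static assert(is(typeof(m) == Quantity!(Metre, double)));
    assert(m.value == 1500.0);

    // The unit tag occupies no storage, so there is no runtime overhead.
    static assert(Quantity!(Metre, double).sizeof == double.sizeof);
}
```

Multiplication and division, which have to combine unit tags into derived units, are deliberately left out of this sketch.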
On 3/28/11 5:43 PM, Cristi Cobzarenco wrote:
> First, let me apologize for this very late entry, it's the end of university and it's been a very busy period, I hope you will still consider it.

This is by no means a late proposal – the application period has not even formally opened yet.

I was somewhat surprised to see your post, because I had been playing around with a dimensional algebra/unit system implementation for a while before the whole GSoC thing started. I even have an unfinished project proposal for it lying around, but as I thought that I was rather alone with my fascination for unit systems, I decided to finish the Thrift one first. Well, seems like I was wrong… :)

A few things that came to my mind while reading your proposal:

- The need for numerical ids is not inherent to a dimension-aware model like Boost.Units – it's just needed because there is no way to get a strict total order of types in C++. You will need to find a solution for this in D as well, because you want to make sure that »1.0 * second * metre« is of the same type as »1.0 * metre * second«.

- I think that whether disallowing implicit conversion of e.g. millimeters to meters is a good idea has to be decided once the rest of the API has been implemented and one can actually test how the API »feels« – although I'd probably disallow them in my design by default as well, I'm not too sure if this actually works out in practice, or if it is just cumbersome to use. Also, I'd like to note that whether to allow this kind of implicit conversion doesn't necessarily depend on whether a system has the notion of dimensions.

- Not that I would be too fond of the Boost.Units design, but »convenience« functions for constructing units from strings like in your second example could be implemented for it just as well.

- You have probably already thought of this and omitted it for the sake of brevity, but I don't quite see how your current, entirely string-based proposal would work when units and/or conversion factors are defined in a module different from the one the implementations of Quantity/convert/… are in. Contrary to C++, D doesn't have ADL, so I am not sure how a custom base unit would be in scope for Quantity to find, or how a custom conversion template could be found from convert().

Anyway, I am not sure whether I should submit my own units proposal as well, but if you should end up working on this project, I'd be happy to discuss any design or implementation issue you'd like to.

David
Mar 28 2011
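The scoping concern raised in the last point can be reproduced in a single file. In the sketch below, `Resolve` is a hypothetical stand-in for the lookup a string-based `Quantity` would have to perform, and the function-local `Widget` plays the role of a unit declared in a module the library knows nothing about. A string mixin inside a template is resolved in the scope where the template is declared, not where it is instantiated, so the user's type is not found.

```d
// Hypothetical library-side template: resolves a type from its name with a
// string mixin. The lookup happens in the scope where Resolve is declared.
template Resolve(string name)
{
    mixin("alias Type = " ~ name ~ ";");
}

struct Metre {} // declared next to Resolve, so the name is visible to it

void main()
{
    // Stand-in for a unit defined in someone else's scope, e.g. another module.
    struct Widget {}

    // Names visible from the template's own scope resolve fine.
    static assert(is(Resolve!"Metre".Type == Metre));

    // The library template cannot see Widget, so this lookup does not compile.
    static assert(!__traits(compiles, Resolve!"Widget".Type));

    // A mixin expanded at the use site does see it.
    mixin("alias W = Widget;");
    static assert(is(W == Widget));
}
```

One consequence is that any string-to-type translation would have to be mixed in at the call site, or the unit types would have to be passed in as template arguments rather than as names.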
- I too was playing around with a units project before GSoC, that is why I thought doing this project was a good idea. The way I was doing it without numerical IDs was simply by having more complicated algorithms for equality, multiplication etc. For example, equality would be implemented as:

    template UnitTuple(P...) {
      alias P Units;
    }
    template contains( Unit, UT ) {
      /* do a linear search for Unit in UT.Units (since UT is a UnitTuple) - O(n) */
    }
    template includes( UT1, UT2 ) {
      /* check for each Unit in UT1 that it is also in UT2 (using contains) - O(n^2) */
    }
    template equals( UT1, UT2 ) {
      immutable bool equals = includes!(UT1,UT2) && includes!(UT2, UT1);
    }

Granted, this means that each check takes O(n^2) where n is the number of different units, but it might be worth it - or not. On the small tests I've done it didn't seem to increase compile time significantly, but more research needs to be done. I think that as long as there aren't values with _a lot_ of units (like ten), the extra compile time shouldn't be noticeable. The biggest problem I have with adding IDs is that one will have to manage systems afterwards or have to deal with collisions. Neither one is very nice.

- You're right, you don't need dimensions for implicit conversions, of course. And you're also right about possibly making the decision later about […] popular, only has explicit conversions, and I was trying to steer more towards that model.

- I seem not to have been too clear about the way I would like to use strings. The names of the units in the strings have to be the type names that determine the units. Then one needs a function that would convert a string like "Meter/Second" to Division!(Meter, Second); I'm not sure how you would do that in C++. Maybe I'm wrong, but I can't see it.

- I hope it is by now clear that my proposal is not, in fact, string based at all. The strings are just there to be able to write derived units in infix notation, something Boost solves by using dummy objects with overloaded operators. The lack of ADL is a problem which I completely missed; I have immersed myself in C++ completely lately and I've gotten used to specializing templates in different scopes. These are the solutions I can come up with, but I will have to think some more:

1. There is an intrusive way of solving this, by making the conversion factors static members of the unit types, but this would not allow, for example, having a Widget/Gadget counter the way I intended.
2. […] factors, that one manually uses. That actually is not bad at all. The only problem was that I was hoping that conversion between derived units could automatically be done using the conversion factors of the fundamental units: (meter/second) -> (kilometer/hour) knowing meter->kilometer and second->hour.

Again, I will have to think some more about the latter point. And I'll do some more tests on the performance of doing linear searches. Is there a way to get the name of a type (as a string) at compile time (not the mangled name you get at runtime)? I wasn't able to find any way to do this. My original idea was actually to use the fully qualified typenames to create the ordering.

Thanks a lot for your feedback, it's been very helpful, especially in pointing out the lack of ADL. Hope to hear from you again.

On 28 March 2011 20:57, David Nadlinger <see klickverbot.at> wrote:
> […]

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
Mar 28 2011
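The order-insensitive comparison outlined in the message above can be filled in with today's `std.meta`. This is a sketch under assumptions rather than the author's prototype: the tuples are passed as `alias` parameters (a `UnitTuple` instance is not itself a type), exponents are ignored, and each unit is assumed to appear at most once per tuple.

```d
import std.meta : staticIndexOf;

struct Metre {}
struct Second {}
struct Kilogram {}

// Wraps a list of units so that two lists can be passed as two arguments.
template UnitTuple(P...)
{
    alias Units = P;
}

// Linear search for Unit in UT.Units: O(n).
enum contains(Unit, alias UT) = staticIndexOf!(Unit, UT.Units) != -1;

// Every unit of UT1 must also appear in UT2: O(n^2).
template includes(alias UT1, alias UT2)
{
    static if (UT1.Units.length == 0)
        enum includes = true;
    else
        enum includes = contains!(UT1.Units[0], UT2)
            && includes!(UnitTuple!(UT1.Units[1 .. $]), UT2);
}

// Two tuples are equal as sets when each includes the other.
enum equals(alias UT1, alias UT2) = includes!(UT1, UT2) && includes!(UT2, UT1);

alias MS = UnitTuple!(Metre, Second);
alias SM = UnitTuple!(Second, Metre);
alias MK = UnitTuple!(Metre, Kilogram);

static assert(equals!(MS, SM));                  // order does not matter
static assert(!equals!(MS, MK));                 // different members
static assert(!equals!(MS, UnitTuple!(Metre)));  // different sizes

void main() {}
```

With this approach `MS` and `SM` still name two distinct template instances, so they compare equal through `equals` but are not the same type.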
I am in a slight dilemma, because although I would love to share my work and ideas with you, right now this would automatically weaken my own units proposal in comparison to yours. However, as this would be grossly against the open source spirit, and the point of GSoC certainly can't be to encourage that, I'll just do it anyway.

Regarding IDs: As I wrote in my previous post, the only point of the unit IDs in Boost.Units is to provide a strict total order over the set of units. If you can achieve it without that (see below), you won't need any artificial numbers which you have to manage.

But why would you need to be able to sort the base units in the first place? The answer is simple: To define a single type representation for each possible unit, i.e. to implement type canonicalization. To illustrate this point, consider the following (pseudocode) example:

    auto force = 5.0 * newton;
    auto distance = 3.0 * meter;
    Quantity!(Newton, Meter) torque = force * distance;
    torque = distance * force;

Both of the assignments to »torque« should obviously work, because the types of »force * distance« and »distance * force« are semantically the same. In a naïve implementation, however, the actual types would be different because the pairs of base units and exponents would be arranged in a different order, so at least one of the assignments would lead to a type mismatch – because a tuple of units is, well, a tuple and not an (unordered) set. And this is exactly where the strictly ordered IDs enter the scheme. By using them to sort the base unit/exponent pairs, you can guarantee that semantically equivalent quantities always end up with the same »physical« type.

Luckily, there is no need to require the user to manually assign sortable, unique IDs to each base type because we can access the mangled names of types at compile time, which fulfill these requirements. There are probably other feasible approaches as well, but using them worked out well for me (you can't rely on .stringof to give unique strings). When implementing the type sorting code, you will probably run into some difficulties and/or CTFE bugs; feel free to contact me for related questions (as I have already wasted enough time on this to get a working solution…^^).

Regarding strings: I might not have expressed my doubts clearly, but I didn't assume that your proposed system would use strings as internal representation at all. What I meant is that I don't see a way how, given »Quantity!("Widgets/Gadgets")«, to get the Widget and Gadget types in scope inside Quantity. Incidentally, this is exactly the reason for which you can't use arbitrary functions/types in the »string lambdas« from std.algorithm.

David

On 3/28/11 9:43 PM, Cristi Cobzarenco wrote:
> […]
Mar 29 2011
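The canonicalization step described in the message above, sorting the base units by their mangled names so that every ordering collapses to one type, can be sketched with `std.meta.staticSort`. The names `Unit`, `Canonical` and `byMangledName` are hypothetical, exponents are left out to keep the example short, and this is not claimed to be how either prototype implements it.

```d
import std.meta : staticSort;

struct Metre {}
struct Second {}
struct Newton {}

// Strict total order on types, derived from their mangled names.
enum byMangledName(A, B) = A.mangleof < B.mangleof;

// Hypothetical compound unit: a product of base units, stored in the order given.
struct Unit(BaseUnits...) {}

// Canonical form: the same base units, sorted by mangled name.
alias Canonical(BaseUnits...) = Unit!(staticSort!(byMangledName, BaseUnits));

// Naive representation: argument order leaks into the type.
static assert(!is(Unit!(Newton, Metre) == Unit!(Metre, Newton)));

// Canonical representation: both orders yield the very same type.
static assert(is(Canonical!(Newton, Metre) == Canonical!(Metre, Newton)));
static assert(is(Canonical!(Second, Newton, Metre) == Canonical!(Metre, Second, Newton)));

void main() {}
```

Because the results are identical types rather than merely equivalent ones, ordinary assignment works without any user-visible conversion machinery.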
Surely, .mangleof returns unique strings? Thanks for your offer, but in my prototype I already have sorting and operators working. You're right, again, about the scope of the types. I have a few ideas on how to work around that, but I don't like any of them too much; I'll play around with them and tell you more.

Thanks a lot for your feedback, I feel this collaboration will help D in the end, no matter whose proposal gets accepted (if any). I am a bit confused regarding your GSoC proposal, aren't you a mentor?

On 29 March 2011 13:51, David Nadlinger <see klickverbot.at> wrote:
> […]

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
Mar 29 2011
On 3/29/11 2:33 PM, Cristi Cobzarenco wrote:
> Surely, .mangleof returns unique strings?

Yes, .mangleof returns unique strings for types. The stringof property which was suggested by other people here on the NG, however, is not unique.

> […]
> Thanks a lot for your feedback, I feel this collaboration will help D in the end, no matter whose proposal gets accepted (if any). I am a bit confused regarding your GSoC proposal, aren't you a mentor?

No, I'm just hoping to participate in this GSoC as a student as well.

To clarify the situation: Having experienced how incredibly useful dimensional analysis is in many areas of science, I have long been interested in possible ways of using unit systems in programming to gain additional type safety. Earlier this year, before a possible application to GSoC was even brought up in the D community, I started to work on a D implementation of a unit system. I finished a working prototype, but didn't have the time yet to implement a flexible unit conversion scheme and, more importantly, extend the documentation and examples so I could put it up for discussion at the D NG.

Then, it was announced that Digital Mars would participate in this Google Summer of Code, and surprisingly it didn't take long until someone added a unit system to the ideas page. As I was considering applying to GSoC anyway, this seemed like a natural fit. However, Andrei also put up the idea of a D implementation of Apache Thrift, which caught my attention as I have been waiting for the opportunity to have an in-depth look at it for quite some time now.

As I am equally interested in both topics, and students are allowed to submit a large number of proposals (20?), I decided to just write project proposals for both of them and let Walter/Andrei/… choose which one they like better, if any. I decided to start with the Thrift one, and planned to submit my units proposal later in the application period.
After publishing my first draft here at the NG, I also contacted Andrei for his opinion on whether it would make sense to submit a second proposal, given that he seemed quite interested in the Thrift idea. Now, back to topic: I am absolutely sure that collaborating on this project will lead to better results (I mean, that's how open source software works after all), but there is a problem: By the GSoC rules, it's not possible for students to work in teams on a single project. The dilemma I hinted at is that if we start working together right now, we'll probably end up with two almost identical proposals/applications for the same project, which doesn't really seem desirable. Also, I'm increasingly doubtful that an units library would be a good fit for a Summer of Code project in the first place, which is also why I finished my other proposal first: As Don said, I think that while it certainly is a nice demonstration of the metaprogramming capabilities/type system expressiveness of a language, it might not be too useful for the »general public«, compared to other features. Don't get me wrong here, I'm personally very enthusiastic about the idea, and I can imagine many possible ways in which a flexible unit system could be used to avoid bugs or to clarify interfaces. But: The concept isn't new at all – for example, during my research I stumbled over papers dedicated to units in programming languages dating back to 1985 –, but I have yet to see units actually being used in production code. My second concern is the extent of the project: After spending two weekends on it, I have a working prototype of a units library, and, if I understood you correctly, you have one as well. They surely both lack some features and a lot of polish and documentation, but I think it would probably take neither of us three full months of work to get them into a state suitable for inclusion in the Phobos review queue. 
For these reasons, I really started to wonder if it wouldn't be the better idea to just merge our projects and work on getting the result into shape independent of GSoC when I saw your proposal – even more so since our design/implementation ideas have shown to be quite similar. I don't want to discourage you from applying at all, and I will probably still submit a proposal for it nevertheless, but I think this should be discussed. David
Mar 29 2011
Cristi Cobzarenco wrote:
> Again I will have to think some more about the latter point. And I'll do some more tests on the performance of doing linear searches. Is there a way to get the name of a type (as a string) at compile time (not the mangled name you get at runtime)? I wasn't able to find any way to do this. My original idea was actually to use the fully qualified typenames to create the ordering.

T.stringof, where T is some type, gives you the name of the type at compile time.

Jens
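For readers comparing the two properties mentioned in this thread, here is a minimal sketch. `Meter` and `Second` are placeholder types declared only for this example. Both properties are evaluated at compile time, so both can be used inside `static assert` or to drive template logic.

```d
struct Meter {}
struct Second {}

// .stringof yields the plain identifier as a compile-time string.
// Two types with the same name in different modules give the same result.
static assert(Meter.stringof == "Meter");

// .mangleof is also a compile-time string; it encodes the fully
// qualified name, so it differs for distinct types.
static assert(Meter.mangleof != Second.mangleof);

// Strings compare lexicographically at compile time, which is enough
// to impose an ordering on types.
enum bool meterSortsFirst = Meter.mangleof < Second.mangleof;
```

Because `.mangleof` includes the module path while `.stringof` does not, an ordering built on `.mangleof` avoids collisions between identically named unit types.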
Mar 28 2011
On 3/28/11 10:43 AM, Cristi Cobzarenco wrote:
> First, let me apologize for this very late entry, it's the end of university and it's been a very busy period, I hope you will still consider it. Note this email is best read using a fixed font. PS: I'm really sorry if this is the wrong mailing list to post and I hope you'll forgive me if that's the case.
> ======= Google Summer of Code Proposal: Statically Checked Units =======
> [snip]

This is a good place to discuss pre-submission proposals. To submit, go to http://d-programming-language.org/gsoc2011.html later today.

This is a strong draft proposal that I am likely to back up when complete. A few notes:

* There is a good overview of existing work, which puts the proponent in the right position to make the best choices for an implementation in D.
* Human-readable strings as means to generate types is a fertile direction. One issue is canonicalization, e.g. "meters^2" would be a different type from "meters ^ 2" (and btw that should mimic D's operators, so it should use "^^"), and both are different from the semantically equivalent "meters*meters". I think this is avoidable by a function that brings all strings to a canonical form. This needs to be discussed in the proposal.
* The approach to quantities of discrete objects (widgets, gadgets and I hope to see examples with degrees, radians etc.) is very promising. I'm also looking forward to a "Categorical" type - an integer-based quantity that describes a bounded enumeration of objects, for example "CityID". Categorical measures are not supposed to support arithmetic; they simply identify distinct objects in an unrelated space.
* In the final proposal the scope of the library should be clarified (e.g. what kinds of units and related idioms will be supported, and what kinds you chose not to support and why).
* At best the proposal could define and project a relationship with std.datetime, which defines a few units itself. I wonder whether it's possible to simplify std.datetime by using the future units library.

Thanks for your interest, and good luck!

Andrei
Mar 28 2011
Thanks for your answer!

- I agree that using strings to represent units is not a particularly good idea. Since many people have noted related things, I seem not to have been particularly clear about the way I intend to use strings. Let me try to explain it in detail.

There is a type that determines the unit:
    struct Meter {}
Then every quantity is parametrised with two aliases:
    Quantity!(UnitList, ValueType)
UnitList represents a list of pairs (UnitType, Exponent), where UnitType is a typename (like Meter) and Exponent is a static rational type. Therefore, the following would be a valid quantity type:
    Quantity!( UnitList!( UnitPair!(Meter,1) ), double )
The strings are parsed at compile time and converted (using mixins) into the UnitList. For example:
    ParseUnitString!("meters/second")
    -> UnitListDivision!( UnitList!( UnitPair!(Meter,1) ), UnitList!( UnitPair!(Second,1) ) )
    -> UnitList!( UnitPair!(Meter,1), UnitPair!(Second,-1) )
Therefore there is no need to convert all strings to a canonical form; they are all converted to an alias tuple (UnitList). To check whether two UnitLists are the same, one can check double inclusion. What do you think, does this make sense?

- The Categorical type sounds like a great idea. I think they could be passed on as a ValueType to a quantity:
    typedef Quantity!(City, BoundedInt!(0,100)) CityID;
And BoundedInt is just a type implicitly convertible to and from int, that supports assignment and equality and throws on an out-of-bounds assignment. What do you think?
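The double-inclusion check on unit lists described in this post can be sketched roughly as follows. This is not the proposal's actual code: `UnitPair`, `UnitList`, `isIncludedIn` and `sameUnits` are written from scratch for illustration, exponents are plain `int` instead of a static rational, each unit is assumed to appear at most once per list, and the helpers come from `std.meta` in current Phobos.

```d
import std.meta : allSatisfy, staticIndexOf;

struct Meter {}
struct Second {}

/// One (unit, exponent) entry; the exponent is an int here for simplicity.
struct UnitPair(U, int e) { alias Unit = U; enum exponent = e; }

/// An ordered list of UnitPair types.
struct UnitList(Pairs...) { alias pairs = Pairs; }

/// True when every pair of list A also occurs somewhere in list B.
template isIncludedIn(A, B)
{
    enum contains(P) = staticIndexOf!(P, B.pairs) != -1;
    enum isIncludedIn = allSatisfy!(contains, A.pairs);
}

/// Double inclusion: A is within B and B is within A.
enum sameUnits(A, B) = isIncludedIn!(A, B) && isIncludedIn!(B, A);

alias MetersPerSecond = UnitList!(UnitPair!(Meter, 1), UnitPair!(Second, -1));
alias PerSecondMeters = UnitList!(UnitPair!(Second, -1), UnitPair!(Meter, 1));

// The two lists are distinct D types, yet they describe the same unit.
static assert(!is(MetersPerSecond == PerSecondMeters));
static assert(sameUnits!(MetersPerSecond, PerSecondMeters));
```

Note that the check establishes semantic equivalence only; the two lists remain different types, which is the point taken up later in the thread.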
Mar 28 2011
On 2011-03-28 12:53, Andrei Alexandrescu wrote:
> * At best the proposal could define and project a relationship with std.datetime, which defines a few units itself. Wonder whether it's possible to simplify std.datetime by using the future units library.

Well, I can't say what's possible before we actually have a proposed units module, but I doubt that there's much in std.datetime which could be simplified by having a units library. The units portion of it is a fairly small piece. There are functions templatized on time units, but that's all so generic that it's not exactly much code. So, it'll be interesting to see how a units module might relate to it, but I question that it would really do much to simplify it.

- Jonathan M Davis
Mar 28 2011
On 03/28/2011 10:13 PM, Cristi Cobzarenco wrote:
> - The Categorical type sounds like a great idea. I think they could be passed on as a ValueType to a quantity: typedef Quantity!(City, BoundedInt!(0,100)) CityID; And BoundedInt is just a type implicitly-convertible to and from int, that supports assignment and equality and throws on an out-of-bounds assignment.

I would implement something like Categorical in a language that has no enum. But I do not see the point in D. What advantage does it bring?

Denis
-- _________________ vita es estrany spir.wikidot.com
Mar 28 2011
On 3/28/11 6:09 PM, spir wrote:
> On 03/28/2011 10:13 PM, Cristi Cobzarenco wrote:
>> - The Categorical type sounds like a great idea. I think they could be passed on as a ValueType to a quantity: typedef Quantity!(City, BoundedInt!(0,100)) CityID; And BoundedInt is just a type implicitly-convertible to and from int, that supports assignment and equality and throws on an out-of-bounds assignment.
>
> I would implement something like Categorical in a language that has no enum. But I do not see the point in D. What advantage does it bring? Denis

A categorical type may not have a name for each value (userid, cityid, countryid...)

Andrei
Mar 28 2011
Cristi Cobzarenco wrote:
> First, let me apologize for this very late entry, it's the end of university and it's been a very busy period, I hope you will still consider it. Note this email is best read using a fixed font. PS: I'm really sorry if this is the wrong mailing list to post and I hope you'll forgive me if that's the case.
> ======= Google Summer of Code Proposal: Statically Checked Units =======
> Abstract
> -------------
> Measurement units allow to statically check the correctness of assignments and expressions at virtually no performance cost and very little extra effort. When it comes to physics the advantages are obvious - if you try to assign a force a variable measuring distance, you've most certainly got a formula wrong somewhere along the way. Also, showing a sensor measurement in gallons on a litre display that keeps track of the remaining fuel of a plane (a big no-no) is easily avoidable with this technique. What this translates is that one more of the many hidden assumptions in source code is made visible: units naturally complement other contract checking techniques, like assertions, invariants and the like. After all the unit that a value is measured in is part of the contract.

This is one of those features that gets proposed frequently in multiple languages. It's a great example for metaprogramming. But, are there examples of this idea being seriously *used* in production code in ANY language? (For example, does anybody actually use Boost.Units?)
Mar 29 2011
To Don: That is a very good point and I agree that one shouldn't implement features just because they're popular. There don't seem to be many (if any projects) […] But I think the reason Boost.Units isn't used hasn't got much to do with the idea as much as it does with the implementation. Using units in Boost is very cumbersome. Adding new units (measuring different dimensions) on the fly is virtually impossible. I think that Boost.Units misses the point of units. They should be a natural extension of the type system of a language, not something so limited to the area of natural sciences.

D is a new language and we should be pushing the envelope; just because the Boost library failed (if it did, it may very well kick off later) doesn't mean we shouldn't do it. Since it is such a new feature, I think we should talk about its potential rather than its acceptance. […] and people are still trying to figure out exactly how to use it. I feel that […] on some conventions and good practices. As I said in the abstract, I think the feature fits snugly with other mechanisms in D and seems to be a natural part of a contract-based design, so D programmers should have a predisposition (that C++ programmers might not have) to adopting such a feature.

I really hope this doesn't come off as rude; as I said, you make a very good point, one that needs answering. I guess what I'm saying can be summed up as: it is a new feature; there have been mistakes; it has a lot of potential and we can make it better. I'd be curious to hear what you think.

To spir: Calling the string representation a small domain-specific language is perfect. It is just that, a way of writing arithmetic expressions between types - something we couldn't do inside D grammar. It's much like the lambda definitions in std.functional. I too am queasy about using strings to represent code, but I think that small DSLs that save effort and improve readability are one place where it's OK. Parsing the expressions at compile time will be fun; thankfully one only needs a stack to do that (Dijkstra's shunting-yard algorithm), which is very easy to implement in functional-style metaprogramming land.

To David: Using T.stringof, we can define a total order on types, based on their typenames. I'm still thinking about conversion.

To Andrei and Jens: std.datetime won't be simplified _that_ much, but it will probably require some work so that it uses the same unit system as the future units library. I would, of course, take care of this as well.
Mar 29 2011
Cristi Cobzarenco wrote:
> To Don: That is a very good point and I agree that one shouldn't implement features just because they're popular. There don't seem to be many (if […] But, I think the reason Boost.Units isn't use hasn't got much to do with the idea as much as it does with the implementation. Using units in Boost is very cumbersome. Adding new units (measuring different dimensions) on-the-fly is virtually impossible. I think that that Boost.Units misses the point of units. They should be a natural extension of the type system of a language, not something so limited to the area of natural sciences. D is a new language and we should be pushing the envelope, just because the Boost failed (if it did, it may very well kick-off later) doesn't mean we shouldn't do it. Since it is such a new feature, I think we should talk about its potential rather than its acceptance. […] and people are still trying to figure out exactly how to use it. I feel […] agree on some conventions and good practices. As I said in the abstract, I think the feature fits snugly with other mechanisms in D and seems to be a natural part of a contract-based design, so D programmers should have a predisposition (that C++ programmers might not have) of adopting such a feature. I really hope this doesn't come off as rude; as I said, you make a very good point, one that needs answering. I guess what I'm saying can be summed up as: it is a new feature; there have been mistakes; it has a lot of potential and we can make it better. I'd be curious to hear what you think.

I'm a physicist and most of my programming involves quantities which have units. Yet, I can't really imagine myself using a units library. A few observations from my own code:

* For each dimension, choose a unit, and use it throughout the code. For example, my code always uses mm because it's a natural size for the work I do. Mixing (say) cm and m is always a design mistake. Scaling should happen only at input and output, not in internal calculations. (So my feeling is that the value of a units library would come from keeping track of dimension rather than scale.)
* Most errors involving units can, in my experience, easily be flushed out with a couple of unit tests. This is particularly true of scale errors. The important use cases would be situations where that isn't true.
* Arrays are very important. Although an example may have force = mass * acceleration, in real code mass won't be a double, it'll be an array of doubles.

> Since it is such a new feature, I think we should talk about its potential rather than its acceptance.

I'm really glad you've said that. It's important to be clear that doing a perfect job on this project does not necessarily mean that we end up with a widely used library. You might be right that the implementations have held back widespread use -- I just see a significant risk that we end up with an elegant, well written library that never gets used. If the author is aware of that risk, it's OK. If not, it would be a very depressing thing to discover after the project was completed.
Mar 29 2011
To David: OK, right now I have two working versions, one sorting by .mangleof and one performing a double-inclusion test on the tuples. Both work; I can't see any performance increase in the .mangleof one, but if .mangleof returns unique strings, I say we use it this way.

Regarding my little string DSL, I have 3 solutions right now:

1. Drop the DSL altogether. Right now my system would work perfectly fine with Boost-like tuples (a list of units alternating with exponents):
    Quantity!(Metre,1,Second,-1) speed = distance/time;
While less readable, this doesn't have the disadvantages of the following two.
2. Use a mixin template to declare the expression parser in the current scope:
    mixin DeclareExprQuantity!();
    struct Metre {}
    struct Second {}
    struct Kg {}
    void f() { ExprQuantity!("Metre/Second * Kg^-1") q = speed / mass; }
This works and is readable, but it uses C-preprocessor-like behaviour (read: black voodoo) - a library declaring something in your scope isn't very nice.
3. Abandon using types as units and just use strings all the way. This doesn't guarantee unit name uniqueness, and a misspelled unit name is a new unit. One could use an algorithm to convert all strings to a canonical form (like Andrei suggested) and then use string equality for unit equality.

What do you think? I'm personally quite divided:

1. I like that this is simple and it works. It makes writing derived units unnatural though.
2. I actually like this one, despite the obvious ugliness. It's just one extra line at the beginning of your code and you can then use arithmetic operations and use type-uniqueness to guarantee unit-uniqueness.
3. This is a bit dangerous. It works very well as long as there isn't more than one system of units. I still like it a bit.

The only completely clean alternative would be the abominable:
    Quantity!( mixin(Expr!("Metre/Second")) ) q;

To Don:

* Choosing one unit and using it is still a very good idea. As I said, there are to be no implicit conversions, so this system would ensure you don't, by mistake, break this convention. Also, if somebody else uses your library, maybe they assume everything is in meters when in fact you use millimeters. Sure, they should check the documentation, but it's better if they get a nice error message "Inferred unit Meter doesn't match expected Millimeter", or something like that.
* True, scale errors can be figured out easily; multiplying something with an acceleration instead of a velocity, or forgetting to multiply acceleration by a timestep, isn't as easily checked. Multiplying instead of dividing in a formula, or forgetting to divide by a normalisation constant, are other things you may forget, and they are caught instantly by unit checking.
* Arrays & vectors are very important, I agree. The Quantity! type is parametrised both by a unit and a value type; therefore, if one wants a vector whose components are of the same unit, using "Quantity!(Metre, Vector!(double))" would work. Vector!(Quantity!(Metre,double)) would also work. As long as the value type has arithmetic operations defined, everything works out. Same goes for arrays.

There is a risk that it never gets used, sure. But I think that units will become commonplace some time in the future, so while it won't get wide acceptance very soon, at some point people will be looking on Wikipedia for "Languages supporting measurement units" and it will be good for D to show up there.
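The remark that a quantity works with any value type that defines arithmetic can be made concrete with a small sketch. This is a deliberately reduced illustration, not the proposed library: `Quantity` here takes a single unit type rather than a unit list, only `+` and `-` are shown, and `Vec2` is a hypothetical stand-in for the `Vector!(double)` mentioned above.

```d
struct Metre {}
struct Second {}

/// Hypothetical 2-component vector used as a value type.
struct Vec2
{
    double x, y;

    Vec2 opBinary(string op)(Vec2 rhs) const
        if (op == "+" || op == "-")
    {
        return Vec2(mixin("x " ~ op ~ " rhs.x"), mixin("y " ~ op ~ " rhs.y"));
    }
}

/// A value tagged with a unit type. The unit exists only at compile time.
struct Quantity(Unit, Value)
{
    Value value;

    // Addition and subtraction accept only the very same Quantity type,
    // so mixing units is rejected by the compiler.
    Quantity opBinary(string op)(Quantity rhs) const
        if (op == "+" || op == "-")
    {
        return Quantity(mixin("value " ~ op ~ " rhs.value"));
    }
}

alias Length = Quantity!(Metre, double);
alias Position = Quantity!(Metre, Vec2);
alias Duration = Quantity!(Second, double);

// Same unit: compiles and computes, for scalars and for vectors alike.
static assert((Length(2.0) + Length(3.0)).value == 5.0);
static assert((Position(Vec2(1, 2)) + Position(Vec2(3, 4))).value == Vec2(4, 6));

// Different units: adding a length to a duration does not compile.
static assert(!__traits(compiles, Length(1.0) + Duration(1.0)));
```

The unit parameter adds no fields, so a `Quantity` has the same size as the value it wraps; the check happens entirely during compilation.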
Mar 29 2011
On 3/29/11 3:49 PM, Cristi Cobzarenco wrote:
> To David: Ok, right now, I got two working versions, one sorting by .mangleof and one performing a double-inclusion test on the tuples. Both work, I can't see any performance increase in the .mangleof one, but if .mangleof returns unique string, I say we use it this way.

To be honest, I still don't see how you are able to get away without canonicalization in the first place; would you mind elaborating on how you solve the issue of different orderings of an expression yielding different types? This is not about the algorithm to determine whether two types are semantically equivalent, where your algorithm would work fine as well, but about the actual D types. If you don't sort them, Quantity!(BaseUnitExp!(Meter, 1), BaseUnitExp!(Second, -2)) and Quantity!(BaseUnitExp!(Second, -2), BaseUnitExp!(Meter, 1)) would be different types, which is not desirable for obvious reasons.

> Regarding my string little DSL. I have 3 solutions right now: 1. Drop the DSL altogether, right now my system would work perfectly fine with boost-like tuples (a list of units alternating with exponents): Quantity!(Metre,1,Second,-1) speed = distance/time; While less readable, this doesn't have the disadvantages of the following 2. 2. Use a mixin template to declare the expression parser in the current scope: mixin DeclareExprQuantity!(); struct Metre {} struct Second {} struct Kg {} void f() { ExprQuantity!("Metre/Second * Kg^-1") q = speed / mass; } This works, is readable, but it uses C-preprocessor like behaviour (read: black vodoo) - a library declaring something in your scope isn't very nice. […] The only completely clean alternative would be the abominable: Quantity!( mixin(Expr!("Metre/Second")) ) q;

Get out of my head! Immediately! ;) Just kidding, incidentally I considered exactly the same options when designing my current prototype. My current approach would be a mix between 1 and 2: I don't think the Boost approach of using »dummy« instances of units is any less readable than your proposed one when you don't deal with a lot of units. For example, consider
    enum widgetCount = quantity!("Widget")(2);
vs.
    enum widgetCount = 2 * widgets;
This could also be extended to type definitions to avoid having to manually write the template instantiation:
    Quantity!("meter / second", float) speed;
vs.
    typeof(1.f * meter / second) speed;

There are situations, though, where using unit strings could considerably improve readability, namely when using lots of units with exponents. In these cases, a mixin could be used to bring all the types in scope for the »parsing template«, similar to the one you suggested (a user of the library could use an additional mixin identifier to clarify the code, e.g. »mixin UnitStringParser U; […] U.unit!"m/s"«).

But a more attractive solution would exploit the fact that you would mostly use units with a lot of exponents when working with a »closed« unit system without the need for ad-hoc extensions, like the SI system. This would allow you to use unit symbols instead of the full names, and the symbols wouldn't need to be globally unique and wouldn't pollute the namespace (directly defining a type »m« to express meters would probably render the module unusable without static imports). It would essentially work by instantiating a parser template with all the named units. Thus, the parser would know all the types and could query them for additional properties like short names/symbols, etc. In code:
---
module units.si;
[…]
alias UnitSystem!(Meter, Second, …) Si;
---
module client;
import units.si;
auto inductance = 5.0 * Si.u!"m^2 kg/(s^2 A^2)";
---
This could also be combined with the mixin parser approach like this:
---
import units.si;
mixin UnitStringParser!(Si) U;
---
But to reiterate my point, I don't think a way to parse unit strings is terribly important, at least not if it isn't coupled with other things like the ability to add shorthand symbols.

David
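The canonicalization point raised at the start of this post (two orderings of the same base units should yield one D type) can be sketched by sorting the template arguments before instantiating the real type. This is only an illustration under stated assumptions: it relies on `staticSort` from `std.meta` in current Phobos, and `QuantityImpl` and `byMangle` are hypothetical names; neither prototype discussed here is known to be written this way.

```d
import std.meta : staticSort;

struct Meter {}
struct Second {}

/// One base unit raised to an integer exponent.
struct BaseUnitExp(U, int e) { alias Unit = U; enum exponent = e; }

/// The actual quantity type; it expects its arguments already sorted.
struct QuantityImpl(Pairs...) { double value; }

/// Ordering predicate: compare the unique compile-time mangled names.
enum byMangle(A, B) = A.mangleof < B.mangleof;

/// Public front end: sorts the arguments, then instantiates QuantityImpl,
/// so every ordering of the same pairs maps to one type.
template Quantity(Pairs...)
{
    alias Quantity = QuantityImpl!(staticSort!(byMangle, Pairs));
}

alias A = Quantity!(BaseUnitExp!(Meter, 1), BaseUnitExp!(Second, -2));
alias B = Quantity!(BaseUnitExp!(Second, -2), BaseUnitExp!(Meter, 1));

// Both spellings name the very same D type, so no conversion is needed.
static assert(is(A == B));
```

With this arrangement the question of implicit conversion between differently ordered quantities does not arise, because there is only one type to begin with.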
Mar 29 2011
Well, they don't _have_ to be the same type as long as they're convertible to one another, and one can make sure they're convertible based on the result of the double-inclusion test. It does make more sense for them to be the same type, I agree, so I'll be sticking to the .mangleof version.

Dummy objects are fine; the only problem is that one has to define the extra objects (and, when one wants to count objects, one needs to define a different type). I was also considering shorthand symbols, since they seem like a natural addition. I'll have to think a bit more about how exactly to do that while avoiding collisions.

Regarding whether this is appropriate for GSoC: it doesn't take me 12 weeks to write a prototype, sure. But making a library fit to be part of a language's standard library takes a lot of laborious testing, going through use cases and making sure it is highly usable. Also, there should be an effort to make sure other libraries make use of units when appropriate (like std.datetime, as Andrei suggested), etc. As we all know, it takes 90% of the time to code 10% of the code. I think all of this extra polish, reliability and usability is very important and takes the extra 11 weeks. It's not the most glorified kind of work, but I really think it's worth it.

On 29 March 2011 23:40, David Nadlinger <see klickverbot.at> wrote:
> On 3/29/11 3:49 PM, Cristi Cobzarenco wrote:
>> To David:
>> Ok, right now, I got two working versions, one sorting by .mangleof and one performing a double-inclusion test on the tuples. Both work, I can't see any performance increase in the .mangleof one, but if .mangleof returns unique string, I say we use it this way.
>
> To be honest, I still don't see how you are able to get away without canonicalization in the first place; would you mind elaborating on how you solve the issue of different orderings of an expression yielding different types? This is not about the algorithm to determine whether two types are semantically equivalent, where your algorithm would work fine as well, but about the actual D types. If you don't sort them, Quantity!(BaseUnitExp!(Meter, 1), BaseUnitExp!(Second, -2)) and Quantity!(BaseUnitExp!(Second, -2), BaseUnitExp!(Meter, 1)) would be different types, which is not desirable for obvious reasons.
>
>> Regarding my string little DSL. I have 3 solutions right now:
>> 1. Drop the DSL altogether, right now my system would work perfectly fine with boost-like tuples (a list of units alternating with exponents):
>>    Quantity!(Metre,1,Second,-1) speed = distance/time;
>>    While less readable, this doesn't have the disadvantages of the following 2.
>> 2. Use a mixin template to declare the expression parser in the current scope:
>>    mixin DeclareExprQuantity!();
>>    struct Metre {}
>>    struct Second {}
>>    struct Kg {}
>>    void f() {
>>      ExprQuantity!("Metre/Second * Kg^-1") q = speed / mass;
>>    }
>>    This works, is readable, but it uses C-preprocessor like behaviour (read: black voodoo) - a library declaring something in your scope isn't very nice.
>> […]
>> The only completely clean alternative would be the abominable:
>> Quantity!( mixin(Expr!("Metre/Second")) ) q;
>
> Get out of my head! Immediately! ;) Just kidding - incidentally I considered exactly the same options when designing my current prototype. My current approach would be a mix between 1 and 2: I don't think the Boost approach of using »dummy« instances of units is any less readable than your proposed one when you don't deal with a lot of units. For example, consider
>
> enum widgetCount = quantity!("Widget")(2);
> vs.
> enum widgetCount = 2 * widgets;
>
> This could also be extended to type definitions to avoid having to manually write the template instantiation:
>
> Quantity!("meter / second", float) speed;
> vs.
> typeof(1.f * meter / second) speed;
>
> There are situations, though, where using unit strings could considerably improve readability, namely when using lots of units with exponents. In these cases, a mixin could be used to bring all the types in scope for the »parsing template«, similar to the one you suggested. A user of the library could also use an additional mixin identifier to clarify the code (e.g. »mixin UnitStringParser U; […] U.unit!"m/s"«).
>
> But a more attractive solution would exploit the fact that you would mostly use units with a lot of exponents when working with a »closed« unit system without the need for ad-hoc extensions, like the SI system. This would allow you to use unit symbols instead of the full names, and the symbols wouldn't need to be globally unique and wouldn't pollute the namespace (directly defining a type »m« to express meters would probably render the module unusable without static imports).
>
> It would essentially work by instantiating a parser template with all the named units. Thus, the parser would know all the types and could query them for additional properties like short names/symbols, etc. In code:
>
> ---
> module units.si;
> […]
> alias UnitSystem!(Meter, Second, …) Si;
> ---
> module client;
> import units.si;
> auto inductance = 5.0 * Si.u!"m^2 kg/(s^2 A^2)";
> ---
>
> This could also be combined with the mixin parser approach like this:
>
> ---
> import units.si;
> mixin UnitStringParser!(Si) U;
> ---
>
> But to reiterate my point, I don't think a way to parse unit strings is terribly important, at least not if it isn't coupled with other things like the ability to add shorthand symbols.
>
> David

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
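For illustration, a minimal sketch of the .mangleof-based canonicalization discussed in this post, assuming std.meta.staticSort from current Phobos. BaseUnitExp, Meter and Second follow the names in the quoted text; QuantityImpl and byMangle are hypothetical helpers introduced only for this example and are not taken from either prototype.

```d
import std.meta : staticSort;

struct Meter {}
struct Second {}

// Hypothetical pairing of a base unit with its exponent.
struct BaseUnitExp(U, int exponent) {}

// Order two types by their mangled names, which are unique per type.
enum byMangle(A, B) = A.mangleof < B.mangleof;

// The struct that actually holds the value. It is only ever
// instantiated with an already sorted argument list.
struct QuantityImpl(Exps...)
{
    double value;
}

// Public name: sorts its arguments first, so the order in which
// the base units are written no longer matters.
template Quantity(Exps...)
{
    alias Quantity = QuantityImpl!(staticSort!(byMangle, Exps));
}

// Both spellings now resolve to one and the same D type.
static assert(is(
    Quantity!(BaseUnitExp!(Meter, 1), BaseUnitExp!(Second, -2)) ==
    Quantity!(BaseUnitExp!(Second, -2), BaseUnitExp!(Meter, 1))));

void main()
{
    auto g = Quantity!(BaseUnitExp!(Second, -2), BaseUnitExp!(Meter, 1))(9.81);
    Quantity!(BaseUnitExp!(Meter, 1), BaseUnitExp!(Second, -2)) h = g; // same type
    assert(h.value == 9.81);
}
```

Because the sort happens inside the Quantity template, no conversion operators are needed for values whose units were merely written in a different order.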
Mar 29 2011
On 3/30/11 12:20 AM, Cristi Cobzarenco wrote:
> Well they don't _have_ to be the same type as long they're convertible to one another, and one can make sure they're convertible based on the result of the double-inclusion.

But how would you make them _implicitly_ convertible then?

David
Mar 29 2011
By making the operators on Quantity templates:

ref Quantity opAssign(U)( Quantity!(U) u2 ) {
  static assert( SameUnit!(U,Unit) );
  this.value = u2.value;
  return this;
}

The same goes for addition, subtraction and equality. Multiplication and division will have to have a different return type. Seems right to me, am I missing something?

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
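A hedged sketch of how the double-inclusion test and the templated opAssign above could fit together, assuming std.meta.allSatisfy and std.meta.staticIndexOf from Phobos. SameUnit and Quantity follow the names in the post; Unit, BaseUnitExp and includedIn are hypothetical stand-ins for whatever the prototype actually uses, and the test ignores repeated entries.

```d
import std.meta : allSatisfy, staticIndexOf;

struct Metre {}
struct Second {}

// Hypothetical wrappers: a base unit with an exponent,
// and an unordered list of such pairs.
struct BaseUnitExp(U, int exponent) {}
struct Unit(Exps...) { alias exps = Exps; }

// True if every entry of A.exps also occurs in B.exps.
template includedIn(A, B)
{
    enum occursInB(E) = staticIndexOf!(E, B.exps) != -1;
    enum includedIn = allSatisfy!(occursInB, A.exps);
}

// Double inclusion: A is contained in B and B is contained in A.
enum SameUnit(A, B) = includedIn!(A, B) && includedIn!(B, A);

struct Quantity(U)
{
    double value;

    // Accept any Quantity whose unit passes the double-inclusion test.
    void opAssign(U2)(Quantity!U2 rhs)
    {
        static assert(SameUnit!(U, U2), "unit mismatch");
        value = rhs.value;
    }
}

alias Speed1 = Quantity!(Unit!(BaseUnitExp!(Metre, 1), BaseUnitExp!(Second, -1)));
alias Speed2 = Quantity!(Unit!(BaseUnitExp!(Second, -1), BaseUnitExp!(Metre, 1)));

void main()
{
    // Without sorting, the two spellings are distinct D types.
    static assert(!is(Speed1 == Speed2));

    Speed1 a;
    auto b = Speed2(3.0);
    a = b; // assignment goes through the templated opAssign
    assert(a.value == 3.0);
}
```

The two aliases stay distinct types here, so every place where one is expected and the other supplied has to be covered by an operator such as this one.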
Mar 30 2011
On 3/30/11 11:21 AM, Cristi Cobzarenco wrote:
> Seems right to me, am I missing something?

opAssign isn't taken into consideration when initializing variables or passing values to functions. An example probably says more than a thousand words:

---
struct Test {
  ref Test opAssign(int i) {
    value = i;
    return this;
  }
  int value;
}

void foo(Test t) {}

void main() {
  // Neither of the following two lines compiles, IIRC:
  Test t = 4; // (1)
  foo(4); // (2)
}
---

You can make case (1) work by defining a static opCall taking an int, which will be called due to property syntax, but I can't think of any solution for (2).

David
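The distinction between the two cases can be checked with a small variation of the example above. This is only a sketch: it swaps the static opCall mentioned in the post for an ordinary constructor, which also covers case (1), and uses __traits(compiles) to confirm that case (2) is still rejected.

```d
struct Test
{
    int value;

    // A constructor covers initialization, which opAssign does not.
    this(int i) { value = i; }

    // opAssign only handles assignment to an already existing variable.
    void opAssign(int i) { value = i; }
}

void foo(Test t) {}

void main()
{
    Test t = 4; // case (1): initialization, uses the constructor
    assert(t.value == 4);

    t = 7;      // plain assignment, uses opAssign
    assert(t.value == 7);

    // case (2): a bare int is not converted when passed to a function
    static assert(!__traits(compiles, foo(4)));
    foo(Test(4)); // explicit construction is required
}
```

For a units library this means that two distinct Quantity types can be made assignable to each other, but not interchangeable as function arguments.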
Mar 30 2011
Maybe OT, but here's some hackish wizardry you can do with classes:

import std.typecons : tuple;

class Test {
  int value;
  this(int x) { value = x; }
}

ref Test foo(Test t ...) {
  return tuple(t)[0];
}

void main() {
  auto result = foo(4);
  assert(result.value == 4);
}

The Tuple is used to trick DMD into escaping the local `t` reference. This won't work with structs. And `t` should be constructed on the stack, but it seems the destructor gets called only after the exit from main. The docs do say that construction of classes in variadic arguments depends on the implementation.

I asked on the newsgroups whether Typesafe Variadic Functions automatically calling a constructor was a good thing at all. Read about this feature here under "Typesafe Variadic Functions":
http://www.digitalmars.com/d/2.0/function.html
Mar 30 2011
Yeah, you're right. (Case (1) also works with a template ctor; in C++ this would allow for implicit conversions as well, which is why I thought about using it this way.) As I said, I had already abandoned this approach and decided on .mangleof sorting anyway, for elegance.

I think my proposal write-up is almost ready; I will submit it today or tomorrow.

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
Mar 30 2011
Hmmm, the only problem with this is that we would have to require the library users to do this to their functions. Thanks for the suggestion, but I'll stick with .mangleof sorting.

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
Mar 30 2011
Ok, my proposal is up, I'm looking forward to feedback. *fingers crossed*

--
(Cristi Cobzarenco)
Profile: http://www.google.com/profiles/cristi.cobzarenco
Apr 01 2011
On 03/29/2011 07:36 AM, Don wrote:
> I'm a physicist and most of my programming involves quantities which have units. Yet, I can't really imagine myself using a units library. A few observations from my own code:
> * For each dimension, choose a unit, and use it throughout the code. For example, my code always uses mm because it's a natural size for the work I do. Mixing (say) cm and m is always a design mistake. Scaling should happen only at input and output, not in internal calculations. (So my feeling is that the value of a units library would come from keeping track of dimension rather than scale.)

Many of my bugs in numeric code come from mixing scalars with units, not from mixing units of different scale.

Andrei
Mar 29 2011
On 03/29/2011 03:49 PM, Cristi Cobzarenco wrote:
> To David:
> Ok, right now, I got two working versions, one sorting by .mangleof and one performing a double-inclusion test on the tuples. Both work, I can't see any performance increase in the .mangleof one, but if .mangleof returns unique string, I say we use it this way.
>
> Regarding my string little DSL. I have 3 solutions right now:
> 1. Drop the DSL altogether, right now my system would work perfectly fine with boost-like tuples (a list of units alternating with exponents):
>    Quantity!(Metre,1,Second,-1) speed = distance/time;
>    While less readable, this doesn't have the disadvantages of the following 2.
> 2. Use a mixin template to declare the expression parser in the current scope:
>    mixin DeclareExprQuantity!();
>    struct Metre {}
>    struct Second {}
>    struct Kg {}
>    void f() {
>      ExprQuantity!("Metre/Second * Kg^-1") q = speed / mass;
>    }
>    This works, is readable, but it uses C-preprocessor like behaviour (read: black voodoo) - a library declaring something in your scope isn't very nice.
> 3. Abandon using types as units and just use strings all the way. This doesn't guarantee unit name uniqueness and a misspelled unit name is a new unit. One could use an algorithm to convert all strings to a canonical form (like Andrei suggested) and then use string equality for unit equality.
>
> What do you think, I'm personally quite divided:
> 1. I like that this is simple and it works. It makes writing derived units unnatural though.
> 2. I actually like this one, despite the obvious ugliness. It's just one extra line at the beginning of your code and you can then use arithmetic operations and use type-uniqueness to guarantee unit-uniqueness.
> 3. This is a bit dangerous. It works very well as long as there isn't more than one system of units. I still like it a bit.

Have you considered

0. Derived units are declared?

After all, relative to the size of an app and the amount of work it represents, declaring the derived units that are actually used is a very small burden. This means that instead of:

struct meter {}
struct second {}
auto dist = Quantity!"meter"(3.0);
auto time = Quantity!"second"(2.0);
auto speed = Quantity!"meter/second"(dist/time);
auto surface = Quantity!"meter2"(dist*dist);

one would write:

struct meter {}
struct second {}
alias FractionUnit!(meter,second) meterPerSecond;
alias PowerUnit!(meter,2) squareMeter;
auto dist = Quantity!meter(3.0);
auto time = Quantity!second(2.0);
auto speed = Quantity!meterPerSecond(dist/time);
auto surface = Quantity!squareMeter(dist*dist);

This means you use struct templates as unit-id factories, for the user's convenience. The constructor would then generate the metadata needed for unit-type checking, stored on the struct itself (this is far easier with such struct templates than by parsing a string).

In addition to the 2 struct templates above, there should be
struct ProductUnit(Units...) {...}
(accepting n base units); and I guess that's all, isn't it?

The only drawback is that very complicated derived units need to be constructed step by step. But this can also be seen as an advantage. An alternative may be to have a single, but more sophisticated and more difficult to use, struct template.

I find several advantages to this approach:
* Simplicity (also of implementation, I guess).
* Unit identifiers are structs all along (both in code and in semantics).
* No string mixin black voodoo.

I guess even if this is not ideal, you could start with something similar, because it looks easier and cleaner (to me).

A similar system may be used for units of different scales in the same dimension:
alias ScaleUnit!(mm,1_000_000) km;

By the way, have you considered unit-less (pseudo-)magnitudes (I mean ratios, including %)? I would have one declared and exported as a constant. Then:
alias ScaleUnit!(voidUnit,0.001) perthousand;

> To Don:
> * Choosing one unit and using it is still a very good idea. As I said, there are to be no implicit conversions, so this system would ensure you don't, by mistake, break this convention. Also, if somebody else uses your library, maybe they assume everything is in meters when in fact you use millimeters. Sure, they should check the documentation, but it's better if they get a nice error message "Inferred unit Meter doesn't match expected Millimeter", or something like that.

I agree with this.

Denis
--
_________________
vita es estrany
spir.wikidot.com
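A rough sketch of the "declared derived units" idea, under a strong simplifying assumption: instead of empty unit structs, each unit is encoded as a pair of integer exponents for length and time, so FractionUnit, PowerUnit and ProductUnit reduce to arithmetic on those exponents. The Unit encoding and the binary form of ProductUnit are assumptions made for this example; the post itself leaves the representation open.

```d
// A unit is identified by its exponents for length and time.
// A real library would cover more dimensions than these two.
struct Unit(int l, int t)
{
    enum lenExp = l;
    enum timeExp = t;
}

alias meter = Unit!(1, 0);
alias second = Unit!(0, 1);

// Unit factories: derived units are computed from declared ones.
alias ProductUnit(A, B) = Unit!(A.lenExp + B.lenExp, A.timeExp + B.timeExp);
alias FractionUnit(A, B) = Unit!(A.lenExp - B.lenExp, A.timeExp - B.timeExp);
alias PowerUnit(A, int n) = Unit!(A.lenExp * n, A.timeExp * n);

alias meterPerSecond = FractionUnit!(meter, second);
alias squareMeter = PowerUnit!(meter, 2);

struct Quantity(U)
{
    double value;

    // Addition and subtraction require identical units and keep them.
    Quantity opBinary(string op)(Quantity rhs) const
        if (op == "+" || op == "-")
    {
        return Quantity(mixin("value " ~ op ~ " rhs.value"));
    }

    // Multiplication and division produce a derived unit type.
    auto opBinary(string op, U2)(Quantity!U2 rhs) const
        if (op == "*" || op == "/")
    {
        static if (op == "*")
            return Quantity!(ProductUnit!(U, U2))(value * rhs.value);
        else
            return Quantity!(FractionUnit!(U, U2))(value / rhs.value);
    }
}

void main()
{
    auto dist = Quantity!meter(3.0);
    auto time = Quantity!second(2.0);

    Quantity!meterPerSecond speed = dist / time;
    Quantity!squareMeter surface = dist * dist;

    assert(speed.value == 1.5);
    assert(surface.value == 9.0);

    // Adding a length to a time is rejected at compile time.
    static assert(!__traits(compiles, dist + time));
}
```

With this encoding the ordering problem does not arise, because every way of deriving meter per second ends in the same Unit!(1, -1) instance. The price is that the set of base dimensions is fixed up front.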
Mar 29 2011
On 03/29/2011 02:06 AM, Don wrote:
> Cristi Cobzarenco wrote:
>> First, let me apologize for this very late entry, it's the end of university and it's been a very busy period, I hope you will still consider it.
>>
>> Note this email is best read using a fixed font.
>>
>> PS: I'm really sorry if this is the wrong mailing list to post and I hope you'll forgive me if that's the case.
>>
>> ======= Google Summer of Code Proposal: Statically Checked Units =======
>>
>> Abstract
>> -------------
>> Measurement units allow one to statically check the correctness of assignments and expressions at virtually no performance cost and with very little extra effort. When it comes to physics the advantages are obvious: if you try to assign a force to a variable measuring distance, you've most certainly got a formula wrong somewhere along the way. Also, showing a sensor measurement in gallons on a litre display that keeps track of the remaining fuel of a plane (a big no-no) is easily avoidable with this technique. What this translates to is that one more of the many hidden assumptions in source code is made visible: units naturally complement other contract checking techniques, like assertions, invariants and the like. After all, the unit that a value is measured in is part of the contract.
>
> This is one of those features that gets proposed frequently in multiple languages. It's a great example for metaprogramming. But, are there examples of this idea being seriously *used* in production code in ANY language? (For example, does anybody actually use Boost.Units?)

At work we use C++ enums for categorical types to great effect. The way it works is:

enum UserId { min = 0, max = 1 << 31 };
enum AppId { min = 0, max = 1 << 31 };

Then we express data in terms of UserId and AppId instead of an integral type, and we cast to it when we read it off the wire or the database. The beauty of it is that you can never pass by mistake an AppId instead of a UserId or vice versa, or even a raw int as one, without explicitly stating intent.

It's saved us a lot of bugs (I know because I found some when converting raw ints to enums) and presumably potential bugs. If we used quantities, a similar benefit would probably emerge from using dimensional analysis.

I know that in my machine learning code it's very difficult to spot bugs because "it's all numbers". If I used a sort of a double "enum" that could only be a probability, I'm sure I'd save myself a ton of bugs.

Andrei
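The same trick carries over to D, where a named enum does not accept a plain int or another enum type without a cast. The sketch below is an illustration only: banUser is a hypothetical function, and the members are called first and last here rather than min and max so that they do not collide with D's built-in enum properties.

```d
// Distinct integer-backed types. D converts them implicitly *to* int,
// but never from int or from one another.
enum UserId : int { first = 0, last = int.max }
enum AppId : int { first = 0, last = int.max }

// Hypothetical consumer that must only ever see user ids.
void banUser(UserId id) {}

void main()
{
    int raw = 42;                 // e.g. read off the wire
    auto user = cast(UserId) raw; // explicit statement of intent
    auto app = cast(AppId) raw;

    banUser(user);

    // Passing the wrong category or a raw int does not compile.
    static assert(!__traits(compiles, banUser(app)));
    static assert(!__traits(compiles, banUser(raw)));
}
```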
Mar 29 2011
On 03/29/2011 04:45 PM, Andrei Alexandrescu wrote:
> If we used quantities, a similar benefit would probably emerge from using dimensional analysis.

Wow, this is a great explanation of the expected benefits of units, I guess. Also, isn't this precisely the power of true typedefs?

Denis
--
_________________
vita es estrany
spir.wikidot.com
Mar 29 2011
On 3/29/11 2:17 PM, spir wrote:
> On 03/29/2011 04:45 PM, Andrei Alexandrescu wrote:
>
> Waow, this is a great explanation of expected benefits of units, I guess. Also, isn't this precisely the power of true typedefs?

Typedefs would not allow defining categorical types (e.g. no arithmetic). Fortunately there are already means in the language for defining such types.

Andrei
Mar 29 2011