digitalmars.D - More on invariants

bearophile (196/196) Mar 17 2010 D is the first language that I've used that has contract programming, so...
bearophile <bearophileHUGS lycos.com> writes:
D is the first language that I've used that has contract programming, so I've
studied some about this topic.

In a recent post I have discussed about some possible (silly) new parts for the
contract programming; this post is just about class/struct invariants and loop
invariants.

Some of the following things come from the nice slides:
http://www.cse.yorku.ca/course/3311/slides/08-Assertions.pdf

--------------------

Loop Invariant:

A loop invariant expresses a relation that is true when the loop is entered, or
entered once more. 

I've found a small example in Eiffel:

Inserting an element into a sorted singly linked list

p = pointer to the current item
pp = pointer to the list item precedent to the item pointed by p


row := matElem.row
column := matElem.column
from
  p := rowList   row
invariant
  predecessor_relation : (pp = void and p = rowList   row) or
                         (pp /= void and pp.next = p)
  predecessor_before_column: pp = void or pp.column < column

  -- forall k : rowList   row .. pp :: k.column < column
  data_less_than : column_limit(rowList   row, pp, agent less_than(?, column))
until
  p = void or p.column >= column
loop
  pp := p
  p := p.next
end




I have translated it to D2 (I have removed the semantics of matrix, and added
driving code to show it works, I have not written unittests here):


import std.stdio: write, writeln;

struct Node {
    immutable int data; // the cargo can't be changed
    Node* next;
    this(const int d, Node* next_ptr=null) {
        data = d;
        next = next_ptr;
    }
}


void printList(const(Node)* p) {
    write("List: ");
    for (; p != null; p = p.next)
        write(p.data, " ");
    writeln();
}


void prependNode(ref Node* list_header, const int d)
    // list_header is a reference of a mutable pointer that's
    // the header pointer of a const list
    out {
        assert(list_header != null);
    } body {
        Node* ptr = new Node(d, list_header);
        list_header = ptr;
    }


void insertSorted(ref Node* list_header, Node* nptr)
    // nptr is a const pointer to a mutable node,
    // I don't know how to express this
    in {
        assert(nptr != null);
        assert(list_header != nptr);
        //assert(nptr.next == null); // probably too much restrictive

        // the list is sorted
        debug { // this is slow, so it's inside a debug
            if (list_header != nptr) {
                Node* p = list_header; // pointer to the current Node
                Node* pp; // pointer to the precedent Node

                while (p != null) {
                    if (pp != null)
                        assert(pp.data <= p.data);
                    pp = p;
                    p = p.next;
                }
            }
        }
    } out {
        assert(list_header != null);

        // the list is sorted and contains the given node
        debug {
            Node* p = list_header; // pointer to the current Node
            Node* pp; // pointer to the precedent Node

            while (p != null) {
                if (pp != null)
                    assert(pp.data <= p.data);
                pp = p;
                p = p.next;
            }
        }

    } body {
        Node* p = list_header; // pointer to the current Node
        Node* pp; // pointer to the precedent Node

        while (p != null && p.data < nptr.data) {
            /*invariant*/ { // loop invariant --------------------
                // predecessor relation
                assert((pp == null && p == list_header) || (pp != null &&
pp.next == p));

                // predecessor before data
                assert(pp == null || pp.data < nptr.data);

                // current before data
                assert(p == null || p.data < nptr.data); // extra

                // data less than
                // forall k : rowList   row .. pp :: k.column < column
                debug { // slow
                    if (pp != null) {
                        Node* pk = list_header;
                        while (pk != pp) {
                            assert(pk.data < nptr.data);
                            pk = pk.next;
                        }
                    }
                }
            } // end loop invariant --------------------

            pp = p;
            p = p.next;
        }

        assert((pp == null && p == list_header) || (pp != null && pp.next ==
p));
        assert(pp == null || pp.data < nptr.data);

        if (pp != null)
            pp.next = nptr;
        else
            list_header = nptr;
        nptr.next = p;
    }


void main() {
    Node* lh;
    foreach (x; [20, 15, 12, 11, 11, 9, 3, 0])
        prependNode(lh, x);
    printList(lh); // 0 3 9 11 11 12 15 20

    insertSorted(lh, new Node(13));
    printList(lh); // 0 3 9 11 11 12 13 15 20

    insertSorted(lh, new Node(-1));
    printList(lh); // -1 0 3 9 11 11 12 13 15 20

    insertSorted(lh, new Node(30));
    printList(lh); // -1 0 3 9 11 11 12 13 15 20 30
}


I have added those debug{} because they are heavy tests that change the
complexity of the code.

That code inside the debug{} is too much long and it looks like it can contain
more bugs than the code it guards.


A loop invariant syntax can be useful in D because:
- It self-documents its purpose;
- You can't forget to create a new scope, this is a little safer;
- In future it can be enforced to be  readonly (that is, the programmer can be
sure the code inside the invariant can't modify the variabiles in outside the
scope of the invariant, but can read them);
- There can be only one loop invariant, and it must be before the body of the
loop. This helps better separate the asserts from the body of the loop, and
keeps things tidy.


A possible syntax for D2 for loop invariant, no new keywords necessary, and the
syntax is essentially a cross between precodition and class invariant:

while (...)
invariant { } // only allowed at the top of the loop, only 1 optional invariant
body {} // body of the loop

for (...)
invariant { }
body {}



Using that syntax the loop becomes:


while (p != null && p.data < nptr.data)
    invariant {
        // predecessor relation
        assert((pp == null && p == list_header) || (pp != null && pp.next ==
p));

        // predecessor before data
        assert(pp == null || pp.data < nptr.data);

        // current before data
        assert(p == null || p.data < nptr.data); // extra
        
        //...
    } body {	
        pp = p;
        p = p.next;
    }



Note: this loop invariant syntax is an additive change, it can't break older D2
code, so it can be added later.

Note2: While reading Eiffel docs I have read: "Use comments if you cannot write
an executable assertion". So it says that in some situations inside the
conditions/invariants it's better to have a comment readable by just the
programmer, than nothing.

------------------

Class Invariant:

This is a short example, a class written in what I think is Eiffel-like
pseudolanguage, it shows some things/features:


class CITIZEN feature
    name, sex, age : VALUE
    spouse : CITIZEN
    children, parents : SET[CITIZEN]
    single : BOOLEAN is ensure Result iff (spouse = Void) end
    divorce is
        require  not single
        ensure  single and (old spouse).single
    end
invariant
        single or spouse.spouse = Current
        parents.count = 2
        for_all  c member_of children  it_holds
             (exists p member_of c.parents it_holds p = Current)
end


With some care all this can be written in with class invariant in D too.

--------------------

Things missing in D Contract programming:

Very useful and probably easy to do:
- class invariant has to be const (readonly instance).
- pre/post conditions can't be able to modify input arguments.

Useful:
- pre/post conditions can't be able to modify variables of outer/global scope
( readonly).
- no loop invariant.
- compact syntax (*)

Less useful:
- no loop variant.


(*) This comes from my experiments and from reading Eiffel docs. The code
inside the conditions/invariants has to be short and readable. So it's better
it to be quite high-level. Python is very good in this, it has list comps and
other things that shorten the code a lot.
In Eiffel you can't put generic code in conditions and invariants, probably to
keep them short and almost bug-free. If you want to perform more complex tests
Eiffel allows you to call another function from there, that can be debugged
normally. So in that D example code it's better to move the debug code to
external functions.

Bye,
bearophile
Mar 17 2010
D Programming

C/C++ Programming

Other

digitalmars.D - More on invariants