www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Cantor tuple function, BigInt, and integer overflow

Hello all,

I've been playing around trying to implement the Cantor tuple function (see 
attached code), which is essentially a way to map a sequence x1, x2, ..., xN of 
non-negative integers to a single unique number.  See:
https://en.wikipedia.org/wiki/Pairing_function#Cantor_pairing_function

The inspiration came from a need to check pairs of numbers for uniqueness.

The current implementation is a bit stupid and naive (it falls into the same 
trap as Andrei identifies in TDPL for the naive recursive Fibonacci function), 
but that's a deliberate short-term choice just to get something working-ish.

The challenge here is that as the sequence length increases, we very quickly
get 
to the point where the number calculated will overflow the bounds of any
inbuilt 
integer type.  Indeed, even where you're just dealing with a pair of numbers, 
you risk overflow if the values are close to the maximum of their given type. 
For this reason I've made use of BigInt internally and as a return value.

However, it occurs to me that it would be nice to avoid BigInt unless it's 
necessary, both for speed/memory reasons and for convenience.  I'm wondering if 
anyone has any advice on how to do this safely, i.e. to definitively check that 
whether overflow is going to happen and if it is, to make use of a larger-scale 
type.

I'd also like to be able to implement the cantorPair function so that it can 
take any integer input _including_ BigInt; however, implementing a templated 
version runs into the difficulty of clashing with the BigInt version.  One 
approach I tried involved a private function taking only BigInt input, with a 
templated version that takes arbitrary input types and converts them.  This ran 
into trouble since I could not find a ready way to check if an input was itself 
a BigInt.  So again, any advice here is welcome. :-)

Incidentally, a few notes on BigInt:

     -- it would be nice if there was a toULong as well as a toLong function

     -- the following fails:

                BigInt x = BigInt(y);

        ... if y is itself a BigInt.  Worth adding to BigInt's constructor to
        correct this?

Advice, thoughts, etc., welcome.  This seemed like something that could be 
useful for Phobos if cleaned up and well implemented .... ?
Dec 20 2012