
digitalmars.D - Value-based function overloading

reply Nod <Nod_member pathlink.com> writes:
I know a good idea when I see one, and I believe this topic is good enough to
warrant a proper discussion. I will present a short summary and a proposed
integration into D. Yeah, I know it's a bit "up there"... Maybe for 3.0? :)

/****************************************
* Summary:
***/
As the name suggests, value-based function overloading does function call
resolution based on the *values* of the function parameters in addition to the
type-based overloading already found in D.

Potential benefits: clearer, more concise code; easier refactoring; separation
of the special cases from the main algorithm; easy overriding of buggy
libraries.

Potential dangers are: compiler complexity, it can be misused to create
unreadable code (but what can't?).

For reference, these are interesting:
Antti(first):  http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/23643
David Medlock: http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/23651
David Medlock: http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/23686
Kevin Bealer:  http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/23745

Maybe not interesting, but explains some benefits:
Me: http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/23732

/****************************************
* Proposed integration with D
***/

Syntax:
retType funcName(paramType paramName in paramRange) { ... }

Example:
void foo(char ch in '\n') { ... }
void foo(int i in 0..10) { ... }
void foo(Bar b in null) { ... }
void foo(char[] str in new Regex("/fooi/i")) { ... }

Nothing complex: a new keyword, in, which tests for membership in something,
followed by a list (array literal?) of values. The last example would require
overloading an operator such as opIn. This is only a draft, though.
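For what it's worth, D already lets the right-hand operand overload the in
operator (via opIn_r), so the regex example could ride on that machinery. A
minimal sketch of the idea; the Regex class and its matching logic below are
invented for illustration, not an existing library:

// hypothetical class overloading the "in" operator, so that
// (str in pattern) performs a membership test
class Regex
{
    char[] pattern;

    this(char[] pattern) { this.pattern = pattern; }

    // called for: str in regexObject
    bool opIn_r(char[] str)
    {
        // real matching logic would go here; stubbed for the sketch
        return str == pattern;
    }
}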

Implementation:
If all data is literal, and the ranges are checkable at compile time, the full
overload resolution can happen at compile time. If not, the compiler needs to
generate the if-else[1] construct needed to select the correct function.
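To make that lowering concrete, here is a hand-written sketch of roughly what
a compiler might emit for one constrained and one unconstrained overload of
foo; the mangled helper names are invented for illustration:

import std.stdio;

// the two value overloads, lowered to ordinary functions
void foo_in_0_10(int i) { writefln("special case: ", i); }
void foo_general(int i) { writefln("general case: ", i); }

// generated dispatcher for: void foo(int i in 0..10)
//                           void foo(int i)
void foo(int i)
{
    // whether 0..10 is half-open or inclusive is debated later in the
    // thread; the half-open reading is assumed here
    if (i >= 0 && i < 10)
        foo_in_0_10(i);
    else
        foo_general(i);
}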

Ambiguities are illegal. If one value range overlaps another, a compile-time
error is generated.

Example:
void foo(int a in 0..10) { ... }
void foo(int a in 5..15) { ... }  // illegal, range 5..15 overlaps 0..10

Multiple parameters can resolve each other's ambiguities; as such, two
overloads are only ambiguous if all parameters common to both overloads are
ambiguous.

Example:
1) void foo(int a in 0..10, char b in 'x') { ... }
2) void foo(int a in 5..15, char b in 'y') { ... }  // legal
3) void foo(int a in 0..5,  char b in 'y') { ... }  // legal, see below
4) void foo(int a in 0..5,  char b in 'x') { ... }  // illegal, ambiguous to 1)

Only two overloads are considered at any one time, which is why number three
above is legal: while a is ambiguous with 1) and b is ambiguous with 2),
*both* are never ambiguous with the same overload at the same time.

If ranges are not comparable at compile time, such as when using objects with
overloaded opCmp, the compiler must defer ambiguity checking until run-time.
This only needs to be done in debug builds.

It is not necessary for the overloaded functions to cover all possible value
ranges. If a value does not fit any of the overloaded functions, either a
compile-time error is emitted (if the data is compile-time checkable), or an
OverloadException is thrown at run-time.

Example:
1) void foo(int a in 1..4) { ... }
2) void foo(int a in 6..9) { ... }

int main()
{
    int x = 5;
    foo(2);  // calls 1)
    foo(5);  // complains at compile-time
    foo(x);  // throws at run-time
    return 0;
}

/****************************************
* In closing...
***/
I think one can compare this technique to forward references: while one can
certainly manage without it (heck, we've done so for >10 years), it would be
a great relief to have.

So, what do you all think?
-Nod-


[1] For brevity, I will only mention if-else constructs, but when the value
ranges allow it, a switch-case construct could be used, or some combination of
the two.
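A sketch of that switch-based lowering for overloads on discrete literal
values (the helper names are again invented; actual ranges would still fall
back to if-else chains):

void foo_newline(char ch) { /* the '\n' overload */ }
void foo_tab(char ch)     { /* the '\t' overload */ }
void foo_general(char ch) { /* the unconstrained overload */ }

// generated dispatcher for: void foo(char ch in '\n')
//                           void foo(char ch in '\t')
//                           void foo(char ch)
void foo(char ch)
{
    switch (ch)
    {
        case '\n': foo_newline(ch); break;
        case '\t': foo_tab(ch);     break;
        default:   foo_general(ch); break;
    }
}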
May 17 2005
next sibling parent reply "G.Vidal" <gyvidal wanadoo.fr> writes:
I like it.

But I'd prefer a match() statement similar to Caml's:

void foo (int e) {
	match (e) {
		-1: 
			dothing(); 

		[-1..10[: // -1 included to ten excluded
			dothis();
			break;

		[10..15]: // ten included to fifteen included
			dothat(); 
			break;
		default: 
			do();
	}
}

That would be transformed by the compiler into this, which is uglier
and more difficult to understand:

void foo (int e) {
	if (e == -1) dothing();
	if ((e >= 0) && (e < 10)) {
		dothis();
		goto end; // break
	}
	if ((e >= 10) && (e <= 15)) {
		dothat();
		goto end; // break;
	}
	do();
	end:
}

We can also imagine this for string regexps:

match (string) {
	RegExp (stuff..) {
		blabla
		break;
	}
	RegExp (other stuff) {
		blibli;
		break;
	}
	default: baba...
}

But the _real power_ is for recursive array handling.

This function counts the number of elements > 0 in an array (recursion is
unnecessary here, but it's just an illustration):

int positive (int[] array) {
	int count (int[] a, int t) {
		match (a) {
			[]: return t; // end of array

			e::next : // e = first element of a, next = a[1..$]
				if (e > 0) return count (next, t+1);
				else       return count (next, t);
		}
	}
	return count (array, 0);
}
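For comparison, the head/tail recursion above can already be written in
today's D with plain indexing and slicing, which is roughly what the proposed
match would boil down to (a sketch of the desugaring, not of the proposal
itself):

// count the elements > 0, deconstructing the array by hand
int positive(int[] array)
{
    int count(int[] a, int t)
    {
        if (a.length == 0)       // the []: case
            return t;
        int e = a[0];            // e    = first element
        int[] next = a[1 .. $];  // next = the rest of the array
        return count(next, e > 0 ? t + 1 : t);
    }
    return count(array, 0);
}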


  
			
May 18 2005
next sibling parent reply Matthias Becker <Matthias_member pathlink.com> writes:
But I'd prefer a match() statement similar to Caml's:
[snip]
But the _real power_ is for recursive array:

This function count the number of elements > 0 in an array (recursivity is
useless here, but it's just an illustration)

int positive (int[] array) {
	int count (int[] a, int t) {
		match (array) {
			[]: return t; // end of array
			
			e::next : // e = first element of a, next = array.ptr +1
				if (e > 0) count (next, t+1);
				else       count (next, t);
		}
	}
	return count (array, 0);
}
Well, if the only deconstructable operation is ::, this pattern matching
isn't anywhere near as powerful as Caml's (or rather ML's) pattern matching.
May 18 2005
parent "G.Vidal" <gyvidal wanadoo.fr> writes:
On Wed, 18 May 2005 11:00:11 +0000, Matthias Becker wrote:

 Well, if the only deconstructable operation is ::, this pattern matching
 isn't anywhere near as powerful as Caml's (or rather ML's) pattern matching.
Of course that was a mere illustration. All the other operations should be implemented, but as I was talking to people who potentially don't know pattern matching... well anyway
May 18 2005
prev sibling parent reply Nod <Nod_member pathlink.com> writes:
In article <pan.2005.05.18.09.22.26.707525 wanadoo.fr>, G.Vidal says...
I like it.

But I'd prefer a match() statement similar to Caml's:
<snip>
Hey, I like the [x..y[ inclusive/exclusive ranges. The match statement looks
scary, though :)

-Nod-
May 18 2005
next sibling parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
Nod wrote on Wed, 18 May 2005 19:57:36 +0000 (UTC):
 In article <pan.2005.05.18.09.22.26.707525 wanadoo.fr>, G.Vidal says...
<snip>
 Hey, I like the [x..y[ inclusive/exclusive ranges.
<snip>

That syntax is a nightmare. Try to write a parser that generates a simple
token tree and compare the complexity with and without exclusive-range
support.

Is there any situation where exclusive ranges are necessary and can't be
replaced by inclusive ranges?

Thomas
May 18 2005
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Thomas Kuehne wrote:
<snip>
 Is there any situation where exclusive ranges are necessary and can't be
 replaced by inclusive ranges?
hmm, I can suggest off the top of my head the following:

assume initially that both ends are inclusive, unless these ends are
explicitly used somewhere else by themselves

foo( 1..5 ) //initially inclusive
{ .. }

foo( 5 ) //the previous declaration becomes exclusive

but this could easily lead to duplicate code and/or redundancy..

foo( int i )
{
    //complex code
}

foo( 1..5 )
{
    //something else
}

foo( 5 )
{
    //this is not a special case, it's just for exclusion, so we
    //will probably copy-paste the complex code from the general
    //function, or call something like
    // default foo( 5 );
    //but even if we do that, the redundancy is still there
}
May 18 2005
parent Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
Hasan Aljudy wrote on Wed, 18 May 2005 17:18:20 -0600:
 Thomas Kuehne wrote:
<snip>
 Is there any situation where exclusive ranges are necessary and can't be
 replaced by inclusive ranges?
hmm, I can suggest off the top of my head the following: assume initially that both ends are inclusive, unless these ends are explicitly used somewhere else by themselves
The borders should be inclusive, otherwise you run into problems with
type.max/type.min.
 foo( 1..5 ) //initially inclusive
 { .. }
 foo( 5 ) //the previous declaration becomes exclusive
inclusive, and without changing the border type:

foo(1..4)
foo(5)

Why do you need to change the border evaluation depending on another
declaration?

Thomas
May 18 2005
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Thu, 19 May 2005 02:40:16 +0200, Thomas Kuehne wrote:

<snip>
 Is there any situation where exclusive ranges are necessary and can't be
 replaced by inclusive ranges?
real x;
. . .
switch (x)
{
    case 0.0 ... 0.9999999999:
        . . .
        break;
    case 1.0 ... 1.9999999999:
        . . .
}

when what I really need is ...

if (x >= 0.0 && x < 1.0)
{
    . . .
}
else if (x >= 1.0 && x < 2.0)
{
    . . .
}

-- 
Derek
Melbourne, Australia
19/05/2005 9:23:37 AM
May 18 2005
parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
Derek Parnell wrote on Thu, 19 May 2005 09:25:53 +1000:
 On Thu, 19 May 2005 02:40:16 +0200, Thomas Kuehne wrote:
<snip>
 Is there any situation where exclusive ranges are necessary and can't be
 replaced by inclusive ranges?
real x;
. . .
switch (x) {
    case 0.0 ... 0.9999999999: . . . break;
    case 1.0 ... 1.9999999999: . . .
}
<snip>
Floating types can't be used as switch conditionals.

Thomas
May 18 2005
parent reply Derek Parnell <derek psych.ward> writes:
On Thu, 19 May 2005 03:37:18 +0200, Thomas Kuehne wrote:

 Derek Parnell wrote on Thu, 19 May 2005 09:25:53 +1000:
[snip]
 Is there any situation where exclusive ranges are necessary and can't be
 replaced by inclusive ranges?
<snip>
Floating types can't be used as switch conditionals.
Hey! Neither can ranges ;-)

We are brainstorming here, right? If we can consider ranges then I think we
can consider switches using any type.

-- 
Derek
Melbourne, Australia
19/05/2005 10:25:43 AM
May 18 2005
parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
Derek Parnell wrote on Thu, 19 May 2005 10:28:08 +1000:
 On Thu, 19 May 2005 03:37:18 +0200, Thomas Kuehne wrote:
<snip>
 Floating types can't be used as switch conditionals.
Hey! Neither can ranges ;-) We are brainstorming here, right? If we can consider ranges then I think we can consider switches using any type.
The problem with floating types is their rounding behaviour. A rewrite of
your sample in inclusive ranges:

real x;
...
switch(x){
    case 0.0 ... 1-real.epsilon:
    ....
    case 1.0 ... 2-real.epsilon:
    ....
}

Thomas
May 19 2005
next sibling parent "Uwe Salomon" <post uwesalomon.de> writes:
 when what I really need is ...
 
    if (x >= 0.0 && x < 1.0) { . . . }
    else if (x >= 1.0 && x < 2.0) { . . . }
<snip>
The problem with floating types is their rounding behaviour. A rewrite of your sample in inclusive ranges:

real x;
...
switch(x){
    case 0.0 ... 1-real.epsilon:
    ....
    case 1.0 ... 2-real.epsilon:
    ....
}
His sample could already be wrong. When comparing floating points for
equality, you usually mean something like this:

if (x >= (0.0 - delta) && x < (1.0 - delta))

where delta will often be greater than epsilon. This is certainly a problem
when dealing with floating points... I remember a program where this issue
cost me almost a day for some nested if clauses (I needed some time to
figure out where to add/subtract the delta).

Ciao
uwe
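A small sketch of that delta idea: fold the tolerance into one comparison
helper instead of repeating it in every condition. The helper name and the
sample delta are invented for illustration; a suitable delta depends on the
problem domain:

// treat x as lying in [lo, hi), shifted by a domain-specific tolerance,
// following the (x >= lo - delta && x < hi - delta) pattern above
bool inHalfOpen(real x, real lo, real hi, real delta)
{
    return x >= lo - delta && x < hi - delta;
}

// usage: inHalfOpen(x, 0.0, 1.0, 1e-9) instead of hand-written bounds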
May 18 2005
prev sibling parent reply James Dunne <james.jdunne gmail.com> writes:
In article <d8ogr7.3b2.thomas-dloop laermschleuder.kuehne.cn>, Thomas Kuehne
says...
Derek Parnell wrote on Thu, 19 May 2005 10:28:08 +1000:
<snip>
The problem with floating types is their rounding behaviour. A rewrite of your sample in inclusive ranges:

real x;
...
switch(x){
    case 0.0 ... 1-real.epsilon:
    ....
    case 1.0 ... 2-real.epsilon:
    ....
}

Thomas
I like your example, Thomas; it's clear to the programmer. However, the
epsilon is specific to whichever floating point type you're finding ranges
of: real's epsilon won't work for double and float. I'm sure you were aware
of this, and that you meant it as a simple example demonstrating the
technique. But it got me thinking...

A syntax of inclusive and exclusive bounds should be proposed to fix the
range problems of floating point types, one that lets the compiler choose
the correct epsilon, or even better, generate simple floating point compare
operations for range checking. Basing my initial proposal on mathematical
range syntax:

case (1.0 .. 2.0): // x > 1.0 && x < 2.0
case [1.0 .. 2.0): // x >= 1.0 && x < 2.0
case (1.0 .. 2.0]: // x > 1.0 && x <= 2.0
case [1.0 .. 2.0]: // x >= 1.0 && x <= 2.0

I'm sure the compiler can detect range conflicts with this
inclusive/exclusive range syntax with a few simple heuristics or rules.

NB: I might have the meaning of () and [] reversed here, but I'm pretty sure
this is mathematically correct.

Regards,
James Dunne
May 19 2005
parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
James Dunne wrote on Thu, 19 May 2005 09:33:02 +0000 (UTC):
 In article <d8ogr7.3b2.thomas-dloop laermschleuder.kuehne.cn>, Thomas Kuehne
 says...
<snip>
 Basing my initial proposal on mathematical range syntax:
 
 case (1.0 .. 2.0): // x > 1.0 && x < 2.0
 case [1.0 .. 2.0): // x >= 1.0 && x < 2.0
 case (1.0 .. 2.0]: // x > 1.0 && x <= 2.0
 case [1.0 .. 2.0]: // x >= 1.0 && x <= 2.0
Any syntax that uses un-paired braces raises huge complexity issues for any
code parser.

switch(x){
    case 0.0 ... 1-typeof(x).epsilon:
    ....
    case 1.0 ... 2-typeof(x).epsilon:
    ....
}

missing: int/byte/short/long/char/dchar/wchar.epsilon==1

A shortcut (e.g. $E) for typeof(x).epsilon in case statements would be nice.

Thomas
May 19 2005
parent reply James Dunne <james.jdunne gmail.com> writes:
In article <d8p0gb.3k5.thomas-dloop laermschleuder.kuehne.cn>, Thomas Kuehne
says...
James Dunne schrieb am Thu, 19 May 2005 09:33:02 +0000 (UTC):
 In article <d8ogr7.3b2.thomas-dloop laermschleuder.kuehne.cn>, Thomas Kuehne
 says...
<snip>
Any syntax that uses un-paired braces raises huge complexity issues for any code parser.
I agree. Perhaps escaping them with \ would alleviate the parser's
complexity issues with unmatched braces? I don't think there is a \ token in
D, so creating \[, \(, \], \) tokens would be trivial for this task. Would
that really look that bad? It'd be easy to parse:

case \(1.0 .. 2.0\): // x > 1.0 && x < 2.0
case \[1.0 .. 2.0\): // x >= 1.0 && x < 2.0
case \(1.0 .. 2.0\]: // x > 1.0 && x <= 2.0
case \[1.0 .. 2.0\]: // x >= 1.0 && x <= 2.0

Any other character not used in D token parsing would be an equally likely
candidate in the single-character realm.

There'd also have to be the default range, where both boundaries are
inclusive, when you don't specify explicit range syntax.
	switch(x){
		case 0.0 ... 1-typeof(x).epsilon:
		....
		case 1.0 ... 2-typeof(x).epsilon:
		....
	}

missing: int/byte/short/long/char/dchar/wchar.epsilon==1


A shortcut(e.g. $E) for typeof(x).epsilon in case statements would be
nice.

Thomas
I'm still a bit skeptical on the epsilon idea. That $E would have to be
aware of your (x) type in typeof(x).epsilon. Perhaps $E(float), $E(double)
would work for what you're proposing?

Regards,
James Dunne
May 19 2005
parent Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
James Dunne wrote on Thu, 19 May 2005 10:41:11 +0000 (UTC):
 In article <d8p0gb.3k5.thomas-dloop laermschleuder.kuehne.cn>, Thomas Kuehne
 says...
<snip>
 A shortcut (e.g. $E) for typeof(x).epsilon in case statements would be
 nice.
I'm still a bit skeptical on the epsilon idea. That $E would have to be aware of your (x) type in typeof(x).epsilon. Perhaps $E(float), $E(double) would work for what you're proposing?
$E knows the type due to "switch(x)".
May 19 2005
prev sibling parent reply Matthias Becker <Matthias_member pathlink.com> writes:
int positive (int[] array) {
	int count (int[] a, int t) {
		match (array) {
			[]: return t; // end of array
			
			e::next : // e = first element of a, next = array.ptr +1
				if (e > 0) count (next, t+1);
				else       count (next, t);
		}
	}
	return count (array, 0);
}
The match statement looks scary though :)
:: would be an operator to create an array: 1 :: [5, 9, 42] would create the
array [1, 5, 9, 42]. This operator would have a backward mode to deconstruct
an array into its first element and the remaining elements, which is how it
is used in this match statement. In ML, :: is a constructor that builds
lists, and each constructor can also be used to deconstruct an object.
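In current-D terms, the construction side of :: is just concatenation, and
the deconstruction side is an index plus a slice. A sketch (array literal
expressions are avoided here, since the thread itself questions whether they
are allowed):

void demo()
{
    int[] rest;
    rest ~= 5; rest ~= 9; rest ~= 42;

    // construction: 1 :: [5, 9, 42]  ==>  [1, 5, 9, 42]
    int[] whole;
    whole ~= 1;
    whole ~= rest;

    // deconstruction: what "e :: next" would bind against whole
    int   e    = whole[0];       // 1
    int[] next = whole[1 .. $];  // [5, 9, 42]
}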
May 19 2005
next sibling parent reply "G.Vidal" <gyvidal wanadoo.fr> writes:
In D, the '~' operator would be more appropriate to construct arrays.

( Is this allowed, by the way: int[] e = a ~ [1,2,3];
 with a being int or int[] ? )

The '::' should be reserved for deconstruction only

match (a) { 
	foo::bar : // here, put code to be executed if a is not an
                   // empty array, foo would then be a variable with 
                   // the first element of the array a, and bar 
                   // would be the array a[1..$] (with its first 
                   // element removed)
                   break; // optional

        [1,2,3]::bar : // here put code to be executed if a begins with
                       // the subarray [1,2,3], then bar would be
                       // the array a[3..$] (with the subarray removed)

        [] : // here put code if a is an "empty" array.
}

I like pattern matching. This looks OK, don't you think? The equivalent
with if, else and == would be scarier.

What do you think? Feedback please!

See ya, gracias





On Thu, 19 May 2005 10:12:02 +0000, Matthias Becker wrote:

<snip>
:: would be an operator to create an array: 1 :: [5, 9, 42] would create the array [1, 5, 9, 42]. This operator would have a backward mode to deconstruct an array into its first element and the remaining elements.
May 19 2005
parent reply "G.Vidal" <gyvidal wanadoo.fr> writes:
And we could also imagine something more general:

	a::b::[3,4]::next

Match would occur if the 3rd and 4th elements of a were '3' and '4';
a and b would contain the first and second values, and next would be a[4..$].

Etc.....



On Thu, 19 May 2005 13:01:55 +0200, G.Vidal wrote:

 In D, the '~' operator would be more appropriate to construct arrays.
<snip>
May 19 2005
parent Matthias Becker <Matthias_member pathlink.com> writes:
And we could also imagine something more general:

	a::b::[3,4]::next

 Match would occur if the 3rd and 4th elements of a were '3' and '4';
 a and b would contain the first and second values, and next would be a[4..$].

Etc.....
Do you use pattern matching only to deconstruct arrays (or lists in Caml)?
Normally you can use it for anything, and that's why pattern matching is
interesting. Some SML:

Or simpler other things, like a binary tree with related functions...
May 19 2005
prev sibling parent Nod <Nod_member pathlink.com> writes:
In article <d6hopi$1geo$1 digitaldaemon.com>, Matthias Becker says...
<snip>
:: would be an operator to create an array: 1 :: [5, 9, 42] would create the array [1, 5, 9, 42]. This operator would have a backward mode to deconstruct an array into its first element and the remaining elements. In ML :: is a constructor to construct lists; each constructor can be used to deconstruct an object.
Yeah. Now it's all crystal clear. :)

-Nod-
May 19 2005
prev sibling next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Nod wrote:
 I know a good idea when I see one, and I believe this topic is good enough to
 warrant a proper discussion.
<snip>
I like the idea. One downside is some runtime overhead, but I think it's not
really an issue; it would probably be equivalent to just an if statement.

Maybe there should be some keyword that must be used when declaring the
function and/or using it, so that whoever reads the code can see that there
are other parts of the function somewhere else. I say other "parts", because
this is not really function overloading; it's just separating the function's
parts into separate functions, so that

function(int i)
{
    if( i == 0 )
    {
        ....
    }
    else if( i >= 1 && i <= 5 )
    {
        ....
    }
    else
    {
        ....
    }
}

becomes

function( int 0 )
{
    ....
}
function( int 1..5 )
{
    ....
}
function( int i )
{
    ....
}

This might be confusing for some, so there probably must be some rules to
make it clearer for the reader (since the original motivation for the idea
is clarity and maintainability); at least, all the function parts must be in
the same module.

Also, maybe the part functions should be nested inside the whole function,
with special cases above the general case, since they are handled logically
as an if..else? That may be a clearer structure than a switch:

/keyword/ function(int i)
{
    function( 0 )
    {
        ....
    }

    function( 1..5 )
    {
        ....
    }

    function( i )
    {
        ....
    }
}

And the function may contain nothing except for the "parts".
May 18 2005
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Hasan Aljudy wrote:
 Nod wrote:
<snip>
Whoops, my bad; I didn't read the whole original post. The OP already outlined the specs for the feature.
May 18 2005
parent Nod <Nod_member pathlink.com> writes:
In article <d6fkhg$2i61$2 digitaldaemon.com>, Hasan Aljudy says...
 <snip>
Whoops, my bad; I didn't read the whole original post. The OP already outlined the specs for the feature.
Not at all, I posted a *draft* of the technique. Opinions/ideas are highly welcome!
May 18 2005
prev sibling parent reply Nod <Nod_member pathlink.com> writes:
In article <d6fkde$2i61$1 digitaldaemon.com>, Hasan Aljudy says...
 <snip>
I like the idea. One downside is some runtime overhead, but I think it's not really an issue; it would probably be equivalent to just an if statement.
Correct, that is what I think too. And since one would not be able to order
the if/else statement by frequency, codegen may be slightly less optimized
than a hand-coded one. But speed is rarely *that* important anyway. The
gained clarity will make up for it.
Maybe there should be some keyword that must be used when declaring the 
function and/or using it, so that whoever reads the code can see that 
there are other parts of the function somewhere else.
<snip>
at least all the function parts must be in the same module.
Yes, this may be confusing. However, I believe that putting too many
restrictions on the technique will also limit its value. In this way, you
would not be able to overload any function whose source you cannot modify.
While that wouldn't be a very clear thing to do, sometimes you really need
to, and when you do, you would have to resort to an even worse kludge, like
a wrapper function.
Also, maybe the part functions should be nested inside the whole 
function, and special cases should be above the general case, because 
they are handled logically as an if..else?
<snip>
This is again an interesting idea, and you are right, it would indeed make
things very clear. But it would also be somewhat limiting. For instance,
when splitting a function (that is, turning it into two value-overloaded
ones), you would have to add both the outer function and a new inner one.
This is not far from the original case of adding to an if/else statement
*and* adding a new function. Too much editing, imho.

Don't get me wrong though, there *will* have to be limits to the technique;
I just don't think boxing it into keyword/function/module "jails" gives
enough freedom to the programmer. But this may just be me being too
liberal :)
May 18 2005
parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Nod wrote:
 In article <d6fkde$2i61$1 digitaldaemon.com>, Hasan Aljudy says...
 
<snip>
<snip>
Yes, this may be confusing. However, I believe that putting too many restrictions on the technique will also limit its value. In this way, you would not be able to overload any function to which you do not have write access to the source. While that wouldn't be a very clear thing to do, sometimes you really need to, and when you do, you would have to use an even worse kludge like a wrapper function.
 <snip>
This is again an interesting idea, and you are right, it would indeed make things very clear. But this would also be somewhat limiting. For instance, when splitting a function, that is turning it into two value-overloaded ones, you would have to add both the outer function and a new inner. This is not far from the original case of adding to an if/else statement *and* adding a new function. Too much editing imho. Don't get me wrong though, there *will* have to be limits to the technique, I just don't think boxing it into keyword/function/module "jails" is giving enough freedom for the programmer. But this may just be me being too liberal :)
I see your point. I still think the general function should have a keyword
associated with it to indicate that it has value-based overloaders (if
that's what you'd call it); otherwise it can cause serious confusion and
headaches when debugging, where you look at a function, expect certain
results, but see that the actual results are totally different from what
you expected. Then you're not so sure which values are overloaded and which
are not, and unless you have a really good IDE, you're gonna spend (read:
waste) some time trying to figure out whether a specific value is
overloaded or not. So I think some rules must be set to eliminate such
problems.

You can easily override library functions like this:

file1.d
----------------------
import library;

//we will override "foo" in the library
void foo( 0 )
{
    //do something special ..
}

void foo( int i )
{
    library.foo( i );
}
--------------------

file2.d
-------------------
//don't import the library, import your overloads
import file1;

foo(0); //value overloaded
foo(4); //default library function
-------------------

Anyway, I think in most cases users aren't exactly aware of the details of
library implementations, so I dunno, but I don't think that case would come
up very often. However, the case of debugging code and not being sure which
values are overloaded and which are not is very, very likely to come up; in
fact, I think it comes up all the time.
May 18 2005
parent Nod <Nod_member pathlink.com> writes:
In article <d6ghoo$for$1 digitaldaemon.com>, Hasan Aljudy says...
Nod wrote:
 In article <d6fkde$2i61$1 digitaldaemon.com>, Hasan Aljudy says...
 
<snip>
<snip>
<snip>
I see your point. I still think the general function should have a keyword associated with it to indicate that it has value-based overloaders (if that's what you'd call it),
That's the working name. :)
otherwise it can cause serious confusion and headaches when 
debugging, where you look at a function, expect certain results, but see 
that the actual results are totally different from what you expected.
<snip>
Anyway, I think in most cases users aren't exactly aware of the details 
of library implementations, so I dunno, but I don't think that case 
would come up very often.
Hmm yes, I guess it's a call to be made based on design philosophy, and a weighting of programmer freedom vs. potential for trouble.
However, the case of debugging code and not being sure which values are 
 overloaded and which are not is very, very likely to come up; in fact I 
think it comes up all the time.
Yes, I can see your concern, but this seems like a somewhat hypothetical
problem. Overloads based on type have existed for quite some time, and it
really isn't hard to figure out which one is being called. For one, you
simply have to step into the function instead of around it when debugging.
And you can use watches to see the parameter types/values that are being
passed to the function.

But yes, I acknowledge that the problem will presumably be somewhat
exacerbated with value-based overloading, due to more widespread use and
the fact that values change more easily. A keyword may ease things a bit, I
give you that. But is it strictly necessary? I believe good programming
practices can go a long way: while not forcing all overloads into a single
physical area, it would be good manners to keep the overloads together. It
would be hard to read otherwise.

-Nod-
May 18 2005
prev sibling parent reply Kyle Furlong <ky220 umail.ucsb.edu> writes:
Nod wrote:
 I know a good idea when I see one, and I believe this topic is good enough
 to warrant a proper discussion.
<snip>
Hear, hear.
May 18 2005
parent reply Kyle Furlong <ky220 umail.ucsb.edu> writes:
Kyle Furlong wrote:
 Nod wrote:
<snip>
Hear, hear.
One thing that just occurred to me is: can't you achieve the same goal using asserts and invariants? Isn't that what they are designed for?
May 18 2005
parent Nod <Nod_member pathlink.com> writes:
In article <d6fvvg$2vq1$2 digitaldaemon.com>, Kyle Furlong says...
Kyle Furlong wrote:
 Nod wrote:
 <snip>
Hear, hear.
One thing that just occurred to me is: can't you achieve the same goal using asserts and invariants? Isn't that what they are designed for?
No, asserts and invariants are only for verifying that the function/program
is operating on sane values. Value-based function overloading, on the other
hand, is a form of implicit flow control. With asserts and invariants, the
compiler inserts *checks* on function invocation, which throw exceptions if
the values are out of range. With value-based overloading, *flow
instructions* are inserted, so that if the values are out of range (so to
speak) another overload is called.

The two do not exclude one another, and neither supersedes the other; I
rather believe that they complement each other.

-Nod-
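In code, the distinction looks like this: a contract only rejects, while a
value overload redirects. A sketch putting D's existing in-contract next to
the hand-rolled dispatch discussed earlier (the fooSpecial/fooGeneral
helpers are invented for illustration):

// contract: out-of-range values are an error, full stop
void fooChecked(int i)
in
{
    assert(i >= 0 && i < 10);
}
body
{
    // main algorithm
}

void fooSpecial(int i) { /* special-case overload */ }
void fooGeneral(int i) { /* general overload */ }

// value-based overloading: out-of-range values select another overload
void fooDispatch(int i)
{
    if (i >= 0 && i < 10)
        fooSpecial(i);
    else
        fooGeneral(i);
}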
May 18 2005