www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Find on sorted range slower?

reply "Tofu Ninja" <emmons0 purdue.edu> writes:
void main()
{
	auto a = new int[100*1024*1024];
	for(int i = 0; i < 100*1024*1024; i++)
	{
		a[i] = i;
	}

	enum f = 100*1024*1000;

	StopWatch sw;
	{
		sw.start();
		auto temp = assumeSorted(a).find(f);
		sw.stop();
	}
	auto t1 = sw.peek();

	sw.reset();

	{
		sw.start();
		auto temp = a.find(f);
		sw.stop();
	}
	auto t2 = sw.peek();

	writeln("Sorted\t", t1.length);
	writeln("Regular\t", t2.length);
	writeln("Ratio\t", float(t1.length)/ float(t2.length));
}


I am getting the assumeSorted version to be about 3x slower than 
the regular find, that seems very counter intuitive. Anyone know 
why this would be, seems like a very odd thing to happen. I 
expected the assumeSorted to be faster, expect it to do a binary 
search, instead if a linear one.
Aug 06 2015
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 7 August 2015 at 00:35:58 UTC, Tofu Ninja wrote:
 void main()
 {
 	auto a = new int[100*1024*1024];
 	for(int i = 0; i < 100*1024*1024; i++)
 	{
 		a[i] = i;
 	}

 	enum f = 100*1024*1000;

 	StopWatch sw;
 	{
 		sw.start();
 		auto temp = assumeSorted(a).find(f);
 		sw.stop();
 	}
 	auto t1 = sw.peek();

 	sw.reset();

 	{
 		sw.start();
 		auto temp = a.find(f);
 		sw.stop();
 	}
 	auto t2 = sw.peek();

 	writeln("Sorted\t", t1.length);
 	writeln("Regular\t", t2.length);
 	writeln("Ratio\t", float(t1.length)/ float(t2.length));
 }


 I am getting the assumeSorted version to be about 3x slower 
 than the regular find, that seems very counter intuitive. 
 Anyone know why this would be, seems like a very odd thing to 
 happen. I expected the assumeSorted to be faster, expect it to 
 do a binary search, instead if a linear one.
As usual, which compiler, which compiler version, which compilation flags?
Aug 06 2015
parent "Tofu Ninja" <emmons0 purdue.edu> writes:
On Friday, 7 August 2015 at 01:26:51 UTC, John Colvin wrote:
 As usual, which compiler, which compiler version, which 
 compilation flags?
dmd v2.067.1 -O -release -w -inline -boundscheck=off
Aug 06 2015
prev sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
Do you want to see SortedRange 1700 times faster? ;)

On 08/06/2015 05:35 PM, Tofu Ninja wrote:> void main()

          auto temp = assumeSorted(a).find(f);
SortedRange does not have a find() member. What happens is, it goes to find() algorithm. Replace that line with auto temp = assumeSorted(a).equalRange(f); writefln("found: %s", temp.front); New result: found: 102400000 found: 102400000 Sorted 83235 Regular 141392157 Ratio 0.000588682 Ali
Aug 06 2015
parent reply "Tofu Ninja" <emmons0 purdue.edu> writes:
On Friday, 7 August 2015 at 04:23:21 UTC, Ali Çehreli wrote:
 Do you want to see SortedRange 1700 times faster? ;)

 On 08/06/2015 05:35 PM, Tofu Ninja wrote:> void main()

          auto temp = assumeSorted(a).find(f);
SortedRange does not have a find() member. What happens is, it goes to find() algorithm. Replace that line with auto temp = assumeSorted(a).equalRange(f); writefln("found: %s", temp.front); New result: found: 102400000 found: 102400000 Sorted 83235 Regular 141392157 Ratio 0.000588682 Ali
Hmm.... I feel like I have read on the forums like a million times that certain parts of std.algorithm was specialized for sorted ranges, I guess that was just talk... Still does not really explain why find is slower on the assumeSorted version.
Aug 06 2015
parent reply "Tofu Ninja" <emmons0 purdue.edu> writes:
On Friday, 7 August 2015 at 05:01:41 UTC, Tofu Ninja wrote:
 On Friday, 7 August 2015 at 04:23:21 UTC, Ali Çehreli wrote:
 Do you want to see SortedRange 1700 times faster? ;)

 On 08/06/2015 05:35 PM, Tofu Ninja wrote:> void main()

          auto temp = assumeSorted(a).find(f);
SortedRange does not have a find() member. What happens is, it goes to find() algorithm. Replace that line with auto temp = assumeSorted(a).equalRange(f); writefln("found: %s", temp.front); New result: found: 102400000 found: 102400000 Sorted 83235 Regular 141392157 Ratio 0.000588682 Ali
Hmm.... I feel like I have read on the forums like a million times that certain parts of std.algorithm was specialized for sorted ranges, I guess that was just talk... Still does not really explain why find is slower on the assumeSorted version.
HAHAH wow, this is hilarious, I just checked, nothing in std.algo takes advantage of sorted ranges, sort doesn't even take advantage of it! You pass a sorted range into sort and it will just resort it! Wow....
Aug 06 2015
parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 7 August 2015 at 05:21:32 UTC, Tofu Ninja wrote:
 HAHAH wow, this is hilarious, I just checked, nothing in 
 std.algo takes advantage of sorted ranges, sort doesn't even 
 take advantage of it! You pass a sorted range into sort and it 
 will just resort it! Wow....
Who fixes this? I can look into it... is there an issue for this?
Aug 07 2015
next sibling parent "Andrea Fontana" <nospam example.com> writes:
On Friday, 7 August 2015 at 08:18:04 UTC, Nordlöw wrote:
 On Friday, 7 August 2015 at 05:21:32 UTC, Tofu Ninja wrote:
 HAHAH wow, this is hilarious, I just checked, nothing in 
 std.algo takes advantage of sorted ranges, sort doesn't even 
 take advantage of it! You pass a sorted range into sort and it 
 will just resort it! Wow....
Who fixes this? I can look into it... is there an issue for this?
https://issues.dlang.org/show_bug.cgi?id=11667
Aug 07 2015
prev sibling parent reply "Tofu Ninja" <emmons0 purdue.edu> writes:
On Friday, 7 August 2015 at 08:18:04 UTC, Nordlöw wrote:
 On Friday, 7 August 2015 at 05:21:32 UTC, Tofu Ninja wrote:
 HAHAH wow, this is hilarious, I just checked, nothing in 
 std.algo takes advantage of sorted ranges, sort doesn't even 
 take advantage of it! You pass a sorted range into sort and it 
 will just resort it! Wow....
Who fixes this? I can look into it... is there an issue for this?
I have no idea, but it is pretty silly. Sort/isSorted on a sorted range should be a nop. Find and friends, should do doing some kind of binary search. Max and min should be O(1). Some of the functions that return a sub range or a mutated range could probably be returning sorted ranges as well if its input is a sorted range, remove, strip and split at least could.
Aug 07 2015
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 08/07/2015 11:03 AM, Tofu Ninja wrote:
 On Friday, 7 August 2015 at 08:18:04 UTC, Nordlöw wrote:
 On Friday, 7 August 2015 at 05:21:32 UTC, Tofu Ninja wrote:
 HAHAH wow, this is hilarious, I just checked, nothing in std.algo
 takes advantage of sorted ranges, sort doesn't even take advantage of
 it! You pass a sorted range into sort and it will just resort it!
 Wow....
Who fixes this? I can look into it... is there an issue for this?
I have no idea, but it is pretty silly. Sort/isSorted on a sorted range should be a nop. Find and friends, should do doing some kind of binary search. Max and min should be O(1). Some of the functions that return a sub range or a mutated range could probably be returning sorted ranges as well if its input is a sorted range, remove, strip and split at least could.
Binary search is not always faster than linear search.
Aug 07 2015
parent "Tofu Ninja" <emmons0 purdue.edu> writes:
On Friday, 7 August 2015 at 10:01:39 UTC, Timon Gehr wrote:
 On 08/07/2015 11:03 AM, Tofu Ninja wrote:
 On Friday, 7 August 2015 at 08:18:04 UTC, Nordlöw wrote:
 On Friday, 7 August 2015 at 05:21:32 UTC, Tofu Ninja wrote:
 HAHAH wow, this is hilarious, I just checked, nothing in 
 std.algo
 takes advantage of sorted ranges, sort doesn't even take 
 advantage of
 it! You pass a sorted range into sort and it will just 
 resort it!
 Wow....
Who fixes this? I can look into it... is there an issue for this?
I have no idea, but it is pretty silly. Sort/isSorted on a sorted range should be a nop. Find and friends, should do doing some kind of binary search. Max and min should be O(1). Some of the functions that return a sub range or a mutated range could probably be returning sorted ranges as well if its input is a sorted range, remove, strip and split at least could.
Binary search is not always faster than linear search.
It will be for the majority of sorted ranges. Though if other searches are needed, find and friends could take an extra arg SearchPolicy for sorted ranges that defaults to binary search.
Aug 07 2015