digitalmars.D - [WIP] Native SQLite Database reader (works at CTFE)
- Stefan Koch (9/9) Feb 18 2016 Hello,
- Márcio Martins (2/11) Feb 18 2016 Mother of god! Looking forward to it!
- Andrei Alexandrescu (2/10) Feb 18 2016 That's very exciting! Are you planning a DConf talk on the topic? -- And...
- Stefan Koch (3/5) Feb 19 2016 I would love to give a talk on this.
- Andrei Alexandrescu (2/6) Feb 19 2016 Submission deadline is Feb 29. -- Andrei
- Era Scarecrow (5/8) Feb 18 2016 Interesting... I'd almost want a page with the ddoc information
- Stefan Koch (5/9) Feb 19 2016 Well I can do that :)
- Stefan Koch (2/13) Feb 19 2016 ... oh crap, I misspelled algorithm ...
- Stefan Koch (11/11) Feb 21 2016 just a small update on the API
- Chris Wright (2/6) Feb 21 2016 That means never using an index.
- Stefan Koch (2/3) Feb 21 2016 How so ?
- Chris Wright (24/27) Feb 21 2016 Or rather, parsing a query to use an index becomes exponential unless yo...
- Stefan Koch (8/36) Feb 21 2016 I don't parse anything.
- Any (3/4) Feb 21 2016 That means you're doing a full table scan. When the table gets
- Stefan Koch (3/7) Feb 21 2016 When the table gets large enough you probably don't worry about
- Chris Wright (3/12) Feb 21 2016 So you intend this to work *only* at compile time? Or would you supply a...
- Stefan Koch (5/18) Feb 21 2016 I intend to have a D-style SQLite compatible Database solution.
- Stefan Koch (5/5) Feb 22 2016 Another small update.
- Stefan Koch (14/19) Feb 26 2016 After rewriting the engine to allocate less gc memory, gdc is
- Stefan Koch (4/7) Feb 26 2016 dong => doing.
- Laeeth Isharc (5/25) Feb 27 2016 Great project, Stefan. Any idea what kind of maximum database
- Stefan Koch (3/7) Feb 27 2016 The limits will be almost the same as in sqlite.
- Laeeth Isharc (2/11) Feb 28 2016 nice. and thanks. Laeeth.
- Chris Wright (2/3) Feb 21 2016 Right, which prevents you from using indexes.
- Era Scarecrow (10/18) Feb 21 2016 Looks good! Although took me a little bit to notice the Lambda :P
- Stefan Koch (6/16) Feb 21 2016 The Intention is not to be another SQLite but rather to provide a
- WebFreak001 (4/13) Feb 21 2016 Awesome! Really looking forward to it. It will probably change
- Stefan Koch (3/3) Feb 27 2016 Another update :
Hello, it is not quite ready to post in Announce, but I would like to inform you that I am planning to open-source my native SQLite database driver (well, currently it just reads them). However, it works fully at CTFE. So, if you want to, you can extract source code out of an SQL database and make a mixin with it :D, given you string-imported the database file.
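The mixin idea could look roughly like this. A sketch only, assuming a hypothetical API: none of these names (Database, tables, select, front, as) are the published interface, since the driver isn't released yet.

```d
// Hypothetical sketch of the idea above: the database file is
// string-imported, parsed at CTFE, and a column's text becomes code.
enum dbBytes = import("snippets.sqlite");   // requires -J<dir> on the command line
enum generated = Database(dbBytes)          // parse the file at compile time
    .tables("snippets")
    .select("code")
    .front
    .as!string;                             // one row's text column
mixin(generated);                           // paste the extracted code in
```

The point is that because the reader runs at CTFE, the query result is a compile-time string and is therefore valid input to mixin().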
Feb 18 2016
On Thursday, 18 February 2016 at 21:09:15 UTC, Stefan Koch wrote:
> It is not quite ready to post in Announce, but I would like to inform you that I am planning to open-source my native SQLite database driver (well, currently it just reads them). However, it works fully at CTFE. <snip>

Mother of god! Looking forward to it!
Feb 18 2016
On 02/18/2016 04:09 PM, Stefan Koch wrote:
> I am planning to open-source my native SQLite database driver (well, currently it just reads them). However, it works fully at CTFE. <snip>

That's very exciting! Are you planning a DConf talk on the topic? -- Andrei
Feb 18 2016
On Friday, 19 February 2016 at 01:28:11 UTC, Andrei Alexandrescu wrote:
> That's very exciting! Are you planning a DConf talk on the topic? -- Andrei

I would love to give a talk on this.
Feb 19 2016
On 2/19/16 11:03 AM, Stefan Koch wrote:
> On Friday, 19 February 2016 at 01:28:11 UTC, Andrei Alexandrescu wrote:
>> That's very exciting! Are you planning a DConf talk on the topic? -- Andrei
> I would love to give a talk on this.

Submission deadline is Feb 29. -- Andrei
Feb 19 2016
On Thursday, 18 February 2016 at 21:09:15 UTC, Stefan Koch wrote:
> I am planning to open-source my native SQLite database driver (well, currently it just reads them). However, it works fully at CTFE.

Interesting... I'd almost want a page with the ddoc information to glance over the API, and a couple of code snippets of how you'd see it working. <snip> I'll look forward to seeing this when it's out :)
Feb 18 2016
On Friday, 19 February 2016 at 06:40:08 UTC, Era Scarecrow wrote:
> Interesting... I'd almost want a page with the ddoc information to glance over the API, and a couple of code snippets of how you'd see it working. <snip>

Well, I can do that :) http://dpaste.dzfl.pl/d8ef67f4b22b Here is a taste of the internal API. Please feel free to post improvements or suggestions.
Feb 19 2016
On Friday, 19 February 2016 at 16:20:50 UTC, Stefan Koch wrote:
> Well, I can do that :) http://dpaste.dzfl.pl/d8ef67f4b22b Here is a taste of the internal API. Please feel free to post improvements or suggestions.

... oh crap, I misspelled algorithm ...
Feb 19 2016
just a small update on the API

It could look something like this:

auto table = db.tables("personal");
auto results = table.select("name", "surname")
    .where!("age", "sex",
        (age, sex) => sex.as!Sex == Sex.female && age.as!uint < 40);
auto names = results[0].as!string;
auto surnames = results[1].as!string;

auto tbl_names = db.tables("master_table_copy")
    .select("name")
    .where!((type) => type.as!string == "table")("type")
    .as!string;

please comment!
Feb 21 2016
On Sun, 21 Feb 2016 16:05:49 +0000, Stefan Koch wrote:
> auto results = table.select("name","surname").where!("age","sex",
>     (age, sex) => sex.as!Sex == Sex.female, age.as!uint < 40));

That means never using an index.
Feb 21 2016
On Sunday, 21 February 2016 at 18:08:30 UTC, Chris Wright wrote:
> That means never using an index.

How so?
Feb 21 2016
On Sun, 21 Feb 2016 18:12:27 +0000, Stefan Koch wrote:
> On Sunday, 21 February 2016 at 18:08:30 UTC, Chris Wright wrote:
>> That means never using an index.
> How so?

Or rather, parsing a query to use an index becomes exponential unless you parse x86 instructions. So in order to evaluate the lambda, you have to invoke it. You can't override <= separately from <. You only have an opCmp invocation and return value. So with a query filter like:

query.where!("age", (age) => age >= 18);

you know it's being compared to 18, but you don't know which comparison. So the first time, you have opCmp return -1, and the lambda yields false. Try again, have opCmp return 0, and the lambda yields true. Try a third time, have opCmp return 1, and it yields true. That's three invocations for one variable.

What about two variables?

query.where!("age,parental_supervision", (age, parentalSupervision) => age >= 18 || parentalSupervision == true);

Now I have to try three different values for the 'age' expression and two for the 'parentalSupervision' variable, and there's a combinatorial expansion. That yields a giant truth table, and that's probably good enough. You look at which columns have an index and, of those, which have the best correlation with the lambda's result. That even lets you reduce the amount of memory you use -- you don't need to hold the whole truth table, just a series of counters. But you're still dealing with O(2**n) evaluations of the lambda.
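The probing idea above can be sketched in a few lines. This is an illustration of the technique, not anything from the driver: a proxy type whose opCmp returns a preset value, so each invocation of the lambda reveals one row of the truth table.

```d
import std.stdio;

// A probe value whose comparisons are driven by a preset opCmp result:
// pretend the column is below (-1), equal to (0), or above (1) whatever
// constant the lambda compares it against.
struct Probe
{
    int cmpResult;
    int opCmp(int rhs) const { return cmpResult; }
}

// The filter under inspection: (age) => age >= 18.
// `age >= 18` is rewritten by the compiler as `age.opCmp(18) >= 0`,
// so we see the comparison happen but not which operator it was.
alias filter = (Probe age) => age >= 18;

void main()
{
    // One lambda invocation per possible opCmp outcome.
    // The resulting truth table (false, true, true) is consistent
    // with ">=", narrowing down which comparison the lambda performs.
    foreach (r; [-1, 0, 1])
        writeln("opCmp = ", r, " -> ", filter(Probe(r)));
}
```

With two probed variables the outcomes multiply (3 × 2 in Chris's example), which is where the combinatorial expansion comes from.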
Feb 21 2016
On Sunday, 21 February 2016 at 18:32:29 UTC, Chris Wright wrote:
> Or rather, parsing a query to use an index becomes exponential unless you parse x86 instructions.

D doesn't let you access the contents of a lambda expression.

> So in order to evaluate the lambda, you have to invoke it. <snip> But you're still dealing with O(2**n) evaluations of the lambda.

I don't parse anything. The string "age" gets evaluated once, so I know which columns your lambda is referring to; then the lambda gets evaluated exactly n times, where n is the number of rows. In your example above you should have written:

query.where!("age", "parentalSupervision", (age, pa) => age.as!uint >= 18 || pa == true)
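The scheme described above (column names as strings, one lambda call per row) can be sketched as follows. Every name here is hypothetical, not the actual driver API; it only shows the evaluation model.

```d
import std.conv : to;
import std.stdio;

// A toy row: column name -> stored text, as SQLite stores typeless cells.
struct Row { string[string] cols; }

// The named columns are looked up per row, and the lambda is evaluated
// exactly once per row -- a full table scan, as discussed above.
Row[] where(alias pred)(Row[] rows, string[] columns...)
{
    Row[] matches;
    foreach (row; rows)
    {
        string a = row.cols[columns[0]];
        string b = row.cols[columns[1]];
        if (pred(a, b))      // one invocation per row, n total
            matches ~= row;
    }
    return matches;
}

void main()
{
    auto rows = [
        Row(["age": "17", "supervised": "true"]),
        Row(["age": "25", "supervised": "false"]),
        Row(["age": "16", "supervised": "false"]),
    ];
    auto hits = rows.where!((age, sup) =>
        age.to!uint >= 18 || sup == "true")("age", "supervised");
    writeln(hits.length);    // two rows pass the filter
}
```

Since the lambda is only ever invoked, never inspected, no index can be consulted; the cost is always O(n) in the row count.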
Feb 21 2016
On Sunday, 21 February 2016 at 19:21:31 UTC, Stefan Koch wrote:
> where n is the number of rows.

That means you're doing a full table scan. When the table gets large enough, this gets problematic.
Feb 21 2016
On Sunday, 21 February 2016 at 19:55:38 UTC, Any wrote:
> That means you're doing a full table scan. When the table gets large enough, this gets problematic.

When the table gets large enough, you probably don't worry about CTFE'ing it anymore.
Feb 21 2016
On Sun, 21 Feb 2016 21:15:01 +0000, Stefan Koch wrote:
> When the table gets large enough, you probably don't worry about CTFE'ing it anymore.

So you intend this to work *only* at compile time? Or would you supply a different API for querying at runtime?
Feb 21 2016
On Monday, 22 February 2016 at 03:33:27 UTC, Chris Wright wrote:
> So you intend this to work *only* at compile time? Or would you supply a different API for querying at runtime?

I intend to have a D-style SQLite-compatible database solution. However, I will not reinvent MySQL or Postgres. Performance is important to me; scaling to bigger data, however, is secondary at this point.
Feb 21 2016
Another small update. So far ldc has the lead in my performance test: ldc2 produces significantly faster results than gdc or dmd. I will have a look at the IR and see how I can "port" the optimisations it does to the source code.
Feb 22 2016
On Monday, 22 February 2016 at 14:11:46 UTC, Stefan Koch wrote:
> So far ldc has the lead in my performance test: ldc2 produces significantly faster results than gdc or dmd.

After rewriting the engine to allocate less GC memory, gdc is faster than ldc again. In a slightly unfair benchmark I had a 3.5x performance advantage over sqlite (unfair, because I don't parse SQL and just extract the columns raw), yet at the end of the day this is how SQLite is meant to be used: as a convenient data storage.

More news: write support is coming! After that is finished, let's see by how much I can beat sqlite dong bulk inserts.
Feb 26 2016
On Friday, 26 February 2016 at 09:53:46 UTC, Stefan Koch wrote:
> Write support is coming! After that is finished, let's see by how much I can beat sqlite dong bulk inserts.

dong => doing. Anyone who wants to help benchmarking, please write me: uplink DOT coder AT gmail C O M
Feb 26 2016
On Monday, 22 February 2016 at 03:39:57 UTC, Stefan Koch wrote:
> I intend to have a D-style SQLite-compatible database solution. However, I will not reinvent MySQL or Postgres. Performance is important to me; scaling to bigger data, however, is secondary at this point.

Great project, Stefan. Any idea what kind of maximum database size will be feasible? I realise it is early days so far and not your main focus. Laeeth
Feb 27 2016
On Sunday, 28 February 2016 at 01:45:48 UTC, Laeeth Isharc wrote:
> Great project, Stefan. Any idea what kind of maximum database size will be feasible?

The limits will be almost the same as in sqlite, so a few 100 TB.
Feb 27 2016
On Sunday, 28 February 2016 at 01:55:50 UTC, Stefan Koch wrote:
> The limits will be almost the same as in sqlite, so a few 100 TB.

nice. and thanks. Laeeth.
Feb 28 2016
On Sun, 21 Feb 2016 19:21:31 +0000, Stefan Koch wrote:
> I don't parse anything.

Right, which prevents you from using indexes.
Feb 21 2016
On Sunday, 21 February 2016 at 16:05:49 UTC, Stefan Koch wrote:
> auto table = db.tables("personal");
> auto results = table.select("name","surname").where!("age","sex",
>     (age, sex) => sex.as!Sex == Sex.female, age.as!uint < 40));
> auto names = results[0].as!string;
> auto surnames = results[0].as!string;

Looks good! Although it took me a little bit to notice the lambda :P The result from names, surnames: once you access the entry, it consumes it? That seems wrong. Hmmm, is there a short example of how the left/right/natural joins would be done? I almost worry if there will be heavy use of lambdas for that as well. Also, the heavy template/explicit type use of _as_ makes me wonder if those can be avoided somehow. Then again, SQLite might not care enough, from what I recall, so it may be unavoidable.
Feb 21 2016
On Monday, 22 February 2016 at 02:52:41 UTC, Era Scarecrow wrote:
> Also, the heavy template/explicit type use of _as_ makes me wonder if those can be avoided somehow. <snip>

The intention is not to be another SQLite but rather to provide a compatible D-style solution. As for the conversions, they could be avoided. However, SQLite does not guarantee a type for a column... and you cannot use custom types when you don't convert explicitly.
Feb 21 2016
On Thursday, 18 February 2016 at 21:09:15 UTC, Stefan Koch wrote:
> I am planning to open-source my native SQLite database driver (well, currently it just reads them). However, it works fully at CTFE. <snip>

Awesome! Really looking forward to it. It will probably change the way I store data forever. I currently do it in json files, which suck for simply storing data :D
Feb 21 2016
Another update: performance is now 3 times better than with my initial version. Tip of the day: constantly re-evaluate your design decisions.
Feb 27 2016