digitalmars.D - std.database
- Erik Smith (33/33) Mar 01 2016 I'm back to actively working on a std.database specification &
- jmh530 (6/8) Mar 01 2016 Minor typo in the fluent style select section of readme.md.
- Erik Smith (15/15) Mar 01 2016 Typo fixed - thanks. Incidentally, I'm not 100% content with
- Stefan Koch (6/11) Mar 01 2016 Hi Erik,
- Erik Smith (23/23) Mar 01 2016 Hi Stefan,
- Rikki Cattermole (13/13) Mar 01 2016 Okay I've found a problem.
- Erik Smith (15/29) Mar 01 2016 Yes agree that the poly Database is broken - it isn't reference
- Rikki Cattermole (14/45) Mar 01 2016 Currently its use after free. E.g. destructor gets called but there is
- Erik Smith (12/70) Mar 01 2016 I will look at your managed approach to understand it better.
- landaire (3/17) Mar 02 2016 How is this a UAF? Isn't the struct copied?
- landaire (3/4) Mar 02 2016 Ah I think I misunderstood. You mean in the database, not the
- Rikki Cattermole (2/5) Mar 02 2016 Correct. My code was just to showcase the problem.
- Piotrek (20/30) Mar 02 2016 My quick comments:
- Erik Smith (26/40) Mar 02 2016 I agree that std.dbc would be more accurate, although I like
- Piotrek (12/21) Mar 03 2016 I agree with you we need database manipulation in Phobos. However
- Erik Smith (14/24) Mar 03 2016 Your process proposal (DIP73) was helpful and gives me a better
- Kagamin (4/8) Mar 04 2016 BTW in the oracle driver you use that ODBC idiom of passing
- Erik Smith (3/7) Mar 04 2016 That will be fixed in the next push.
- Stefan Koch (5/16) Mar 02 2016 Looks good.
- Kagamin (5/9) Mar 02 2016 I believe that section is about PHP forums being locked to mysql,
- Chris Wright (25/29) Mar 02 2016 It looks like you're trying to write a LevelDB analogue that implements
- Piotrek (31/60) Mar 03 2016 The thing is I strongly encourage to not reserve std.database for
- Chris Wright (79/143) Mar 03 2016 You were a bit vague before. I interpreted you as saying "just offer a
- Piotrek (46/135) Mar 03 2016 I agree I could be better in describing the concept. But I just
- Chris Wright (6/8) Mar 04 2016 Which is all I'm saying. Something in std.database sounds like it should...
- Piotrek (8/11) Mar 04 2016 The one written from scratch specially for D (I'm talking in
- Dejan Lekic (9/13) Mar 03 2016 I suggest you call the package stdx.db - it is not (and may not
- Erik Smith (10/18) Mar 03 2016 I totally agree that it must be collaborative and community
- Kagamin (3/3) Mar 03 2016 Also member names: methods are named after verbs, you use nouns.
- Erik Smith (16/19) Mar 03 2016 Those are actually internal methods not intended for the
- Kagamin (9/13) Mar 03 2016 db.execute("select from t").range();
- Kagamin (4/4) Mar 03 2016 Other options:
- Kagamin (4/4) Mar 03 2016 On the other hand execute can simply return the reader with extra
- Erik Smith (11/31) Mar 03 2016 More good options (the 3rd one is there). Also at the value
- Kagamin (13/29) Mar 04 2016 Can you elaborate?
- Erik Smith (11/39) Mar 04 2016 Actually I like this and I think it can work. I'm trying to keep
- Kagamin (10/25) Mar 04 2016 Another idea from previous attempts: rename `execute` to `query`.
- Erik Smith (17/44) Mar 04 2016 On further thought, execute() should definitely return something
- Bubbasaur (7/13) Mar 04 2016 Why not only:
- Erik Smith (7/22) Mar 04 2016 Agree connection execute is simpler and it is supported, although
- Sebastiaan Koppe (10/14) Mar 04 2016 You'll want to have some types in there. As in
- Erik Smith (18/33) Mar 04 2016 I think some basic object serialization capabilities would be
- Sebastiaan Koppe (26/29) Mar 05 2016 I did a project a while back using mysql-native (see
- John Colvin (4/17) Mar 03 2016 std.experimental, ugly or not, is what is in phobos.
- Saurabh Das (6/11) Mar 05 2016 A little late to the party, nevertheless: Thanks for doing this,
- Jacob Carlborg (5/9) Mar 05 2016 Yeah, that's really important. Unfortunately it looks like a
- Erik Smith (21/31) Mar 05 2016 I'm definitely going to start working in async capability (or
- Jacob Carlborg (5/23) Mar 05 2016 Just for the record, there's already a native Postgres client that is
- Kagamin (6/10) Mar 06 2016 Huh? Non-blocking operation is not reflected in interface. Only
- Erik Smith (10/18) Mar 06 2016 I think a basic non-blocking interface should be similar to the
- Erik Smith (15/15) Mar 10 2016 I've made a few updates based on some of the feedback in this
I'm back to actively working on a std.database specification & implementation. It's still unstable, minimally tested, and there is plenty of work to do, but I wanted to share an update on my progress.

The main focus of this project is to bring a standard interface for database clients. This is similar to the purpose of JDBC (Java) and DBI (Perl). While there is some existing work in place (ddbc, etc.c.odbc.sql, vibe.d, and other newer projects), my goal, after a lengthy period of community review/revision, is to achieve Phobos inclusion for the interface standard and also some of the implementations. There is debate on whether the implementations belong there, but I have made the implementation Phobos compatible (by using templates) while this issue is sorted out.

The initial focus of this project is to first cover the synchronous interface details and get it to a refined state. Asynchronous I/O is also important, and it will take some time to work that into the model, with a number of complicated aspects to consider. Also, this is not an ORM tool, although it does provide a foundational layer for building one. This is also aimed at SQL databases, although I will consider how no-sql targets might fit in.

There are now implementations for 4 initial targets (sqlite, mysql, Oracle, and ODBC) and they are in a basic working state with a common test suite. For usage examples and other limited project details, see the github page:

https://github.com/cruisercoder/dstddb

Dub link coming soon (dealing with dub zombie name) along with more progress updates. Please feel free to comment or question on the design as it evolves.

erik
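A sketch of the fluent style from the project README (the module path, createDatabase, and the chained accessors are taken from the repository and may change as the design evolves):

```d
import std.database.sqlite; // module layout assumed from the repository

void main() {
    // create a database, open a connection, run a statement, walk the rows
    auto db = createDatabase("file:///demo.sqlite");
    foreach (row; db.connection().statement("select name, score from score").range()) {
        // column accessors are illustrative; see the README for the real ones
    }
}
```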
Mar 01 2016
On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:
> Please feel free to comment or question on the design as it evolves.

Minor typo in the fluent style select section of readme.md. The line

createDatabase("file:///demo.sqlite");

should be

createDatabase("file:///demo.sqlite")
Mar 01 2016
Typo fixed - thanks. Incidentally, I'm not 100% content with createDatabase. With the library being template based, the types are no longer as easy to work with directly:

auto database = Database!DefaultPolicy();

alias can't be used because it instantiates. The template argument can be defaulted, but the best I can do is this:

auto database = Database!()();

It would be nice if the compiler invoked the default type without the extra !(). I settled on using a no-parameter template function named createDatabase. I would have liked just database() to match the other member function style in the package, but I think that might be injecting a name that is too common as a variable name. Project minutiae.

erik
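The workaround described here, a factory function that hides the defaulted template instantiation, can be sketched like this (DefaultPolicy is a stand-in name; only the pattern matters):

```d
struct DefaultPolicy {}

struct Database(Policy = DefaultPolicy) {
    // ... client state would live here ...
}

// a no-parameter factory spares callers the Database!()() noise
auto createDatabase(Policy = DefaultPolicy)() {
    return Database!Policy();
}

void main() {
    auto db = createDatabase();                   // default policy
    auto custom = createDatabase!DefaultPolicy(); // explicit policy
}
```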
Mar 01 2016
On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:
> I'm back to actively working on a std.database specification & implementation. It's still unstable, minimally tested, and there is plenty of work to do, but I wanted to share an update on my progress.
> [...]

Hi Erik,

As soon as sqlite-d is at the point that it is usable, I will see if I can support the API proposed by you. Have you ever used it in production or semi-production code? What is the rationale for your design?
Mar 01 2016
Hi Stefan,

It might be a challenge for CTFE compatibility in the API, but it would be interesting to see how much of it is workable. There do need to be some optional API elements in cases where a feature is not supported by the underlying database client (array binding, for example), and these compile-time policies could also be used to define a restricted subset if needed for your design.

I don't have any D code in production unfortunately, but I have a similar design in C++ that has been in production for years.

As far as the design rationale, the primary aspect of the design is to ensure deterministic resource lifetimes for each of the resources commonly exposed by clients (connections, statements, rowsets, buffer allocations, etc.), which is why structs are used. This could be done with simple non-copyable structs for a more limited design. I went with value-type structs (using reference counting), which provide better composability overall. This can be seen in the call-chaining examples, which I think significantly increase ease of use. It also covers use cases in application code where, for example, lazy results are returned through code layers and need to retain their underlying connections.

Thanks for the questions and the interest.

erik
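The value-type, reference-counted handle approach described here can be sketched with std.typecons.RefCounted; this illustrates the pattern only, not the project's actual implementation:

```d
import std.typecons : RefCounted;
import std.stdio : writeln;

// the payload owns the underlying client resource; it is destroyed
// deterministically when the last copy of the handle goes away
struct ConnectionPayload {
    int handle = -1; // stand-in for a native client connection handle
    ~this() { if (handle != -1) writeln("closing connection ", handle); }
}

// the public type is a cheap-to-copy value; copies share one payload
struct Connection {
    RefCounted!ConnectionPayload data;
}

void main() {
    Connection c1;
    c1.data.handle = 7;
    auto c2 = c1;  // copy shares the payload; no double-free
}   // payload destructor runs exactly once, when both copies leave scope
```

This is what makes the call-chaining style safe: a lazy result range can hold a copy of its connection handle, keeping the connection alive for exactly as long as it is needed.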
Mar 01 2016
Okay, I've found a problem. Here is some code demonstrating it:

http://dpaste.dzfl.pl/022c9e610a18

Now take a look again at Database:

https://github.com/cruisercoder/dstddb/blob/master/src/std/database/poly/database.d#L37

Do you see the problem? The solution is simple. The client database type can be a struct or a class; it doesn't matter too much (assuming you're going the way of ranges). But the intermediary representation must be on the heap and should probably use the constructor, not a static create method, to get it. This way people can use other memory management solutions and construct it however they like.
Mar 01 2016
Yes, I agree that the poly Database is broken - it isn't reference counted and I will fix that. Your sample code had me wondering if I am missing something else, but I can't see another issue yet. I think the use of classes would definitely lead to problems with resources being freed out of order or too late.

As far as memory management options, my plan is to work allocators into the design, and that would seem to provide a lot of options. I'm having a problem at the moment with MallocAllocator's shared interface. I'm not sure why it's shared, since malloc/free are thread safe, and I can't seem to cast away the shared. I'm sure there is a reason.

erik

On Wednesday, 2 March 2016 at 03:07:54 UTC, Rikki Cattermole wrote:
> Okay I've found a problem. Here is some code demonstrating it.
> http://dpaste.dzfl.pl/022c9e610a18
> Now take a look again at Database
> https://github.com/cruisercoder/dstddb/blob/master/src/std/database/poly/database.d#L37
> Do you see the problem? The solution is simple. The client database type can be a struct or a class. It doesn't matter too much (assuming you're going the way of ranges). But the intermediary representation must be on the heap and should probably use the constructor and not a static create method to get it. This way people can use other memory management solutions and construct it however they like.
Mar 01 2016
On 02/03/16 4:48 PM, Erik Smith wrote:
> Yes agree that the poly Database is broken - it isn't reference counted and I will fix that.

My point was, you shouldn't handle that.

> Your sample code had me wondering if I am missing something else, but I can't see another issue yet. I think the use of classes would definitely lead to problems with resources being freed out of order or too late.

Currently it's use after free. E.g. the destructor gets called but there is still a copy around. At the very least it should be an explicit call. If classes lead to problems, so will structs.

> As far as memory management options, my plan is to work allocators into the design and that would seem to provide a lot of options. I'm having a problem at the moment with MallocAllocator's shared interface. I'm not sure why it's shared since malloc/free are thread safe and I can't seem to cast away the shared. I'm sure there is a reason.

Use IAllocator. Don't touch the structs unless you want pain. Oh, and btw, final class is your friend.

As an FYI, here is my managed memory concept:
https://github.com/rikkimax/alphaPhobos/blob/master/source/std/experimental/memory/managed.d
It's not finished, but it mostly works. Of course I would want to go the more OOP way; sure, that's more allocations, but overall I think there are enough wins that it's worth it. So if you feel it doesn't fit well with your goal, say so :)

> erik
>
> On Wednesday, 2 March 2016 at 03:07:54 UTC, Rikki Cattermole wrote:
>> Okay I've found a problem. Here is some code demonstrating it.
>> http://dpaste.dzfl.pl/022c9e610a18
>> Now take a look again at Database
>> https://github.com/cruisercoder/dstddb/blob/master/src/std/database/poly/database.d#L37
>> Do you see the problem? The solution is simple. The client database type can be a struct or a class. It doesn't matter too much (assuming you're going the way of ranges). But the intermediary representation must be on the heap and should probably use the constructor and not a static create method to get it. This way people can use other memory management solutions and construct it however they like.
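The IAllocator suggestion refers to std.experimental.allocator; a minimal sketch of allocating through the type-erased default allocator (the Statement struct is a made-up stand-in) looks like:

```d
import std.experimental.allocator : theAllocator, make, dispose;

struct Statement { int handle; }

void main() {
    // theAllocator is an IAllocator; callers can swap in malloc-backed,
    // region, or GC allocators without changing this code
    auto stmt = theAllocator.make!Statement(42);
    scope (exit) theAllocator.dispose(stmt);
    assert(stmt.handle == 42);
}
```

Because the interface is type-erased, a library accepting an IAllocator needs no template plumbing to support user-supplied memory management.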
Mar 01 2016
I will look at your managed approach to understand it better. One drawback I can think of is that destruction might not occur immediately. This can be an issue for shared libraries when the GC hasn't run by the time the library call returns to the host application.

Maybe it's a C++ bias, but the RefCounted approach seems like such a natural fit for this use case, and so far it's working well. There is the race condition issue, but I think there is an allocator solution for that coming.

erik

On Wednesday, 2 March 2016 at 04:11:10 UTC, Rikki Cattermole wrote:
> On 02/03/16 4:48 PM, Erik Smith wrote:
>> Yes agree that the poly Database is broken - it isn't reference counted and I will fix that.
> My point was, you shouldn't handle that.
>> Your sample code had me wondering if I am missing something else, but I can't see another issue yet. I think the use of classes would definitely lead to problems with resources being freed out of order or too late.
> Currently its use after free. E.g. destructor gets called but there is still a copy around. At the very least it should be an explicit call. If classes lead to problems, so will structs.
>> As far as memory management options, my plan is to work allocators into the design and that would seem to provide a lot of options. I'm having a problem at the moment with MallocAllocator's shared interface. I'm not sure why it's shared since malloc/free are thread safe and I can't seem to cast away the shared. I'm sure there is a reason.
> Use IAllocator. Don't touch the structs unless you want pain. Oh and btw final class is your friend. As an FYI here is my managed memory concept https://github.com/rikkimax/alphaPhobos/blob/master/source/std/experimental/memory/managed.d its not finished but it mostly works. Of course I would want to go the more OOP way, sure thats more allocations but over all I think there are enough wins that its worth it. So if you feel it doesn't fit well with your goal, say so :)
>> erik
>> On Wednesday, 2 March 2016 at 03:07:54 UTC, Rikki Cattermole wrote:
>>> Okay I've found a problem. Here is some code demonstrating it.
>>> http://dpaste.dzfl.pl/022c9e610a18
>>> Now take a look again at Database
>>> https://github.com/cruisercoder/dstddb/blob/master/src/std/database/poly/database.d#L37
>>> Do you see the problem? The solution is simple. The client database type can be a struct or a class. It doesn't matter too much (assuming you're going the way of ranges). But the intermediary representation must be on the heap and should probably use the constructor and not a static create method to get it. This way people can use other memory management solutions and construct it however they like.
Mar 01 2016
On Wednesday, 2 March 2016 at 03:07:54 UTC, Rikki Cattermole wrote:
> Okay I've found a problem. Here is some code demonstrating it.
> http://dpaste.dzfl.pl/022c9e610a18
> Now take a look again at Database
> https://github.com/cruisercoder/dstddb/blob/master/src/std/database/poly/database.d#L37
> Do you see the problem? The solution is simple. The client database type can be a struct or a class. It doesn't matter too much (assuming you're going the way of ranges). But the intermediary representation must be on the heap and should probably use the constructor and not a static create method to get it. This way people can use other memory management solutions and construct it however they like.

How is this a UAF? Isn't the struct copied?
Mar 02 2016
On Wednesday, 2 March 2016 at 18:28:34 UTC, landaire wrote:
> How is this a UAF? Isn't the struct copied?

Ah, I think I misunderstood. You mean in the database, not the dpaste?
Mar 02 2016
On 03/03/16 8:36 AM, landaire wrote:
> On Wednesday, 2 March 2016 at 18:28:34 UTC, landaire wrote:
>> How is this a UAF? Isn't the struct copied?
> Ah I think I misunderstood. You mean in the database, not the dpaste?

Correct. My code was just to showcase the problem.
Mar 02 2016
On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:
> The main focus of this project is to bring a standard interface for database clients. This is similar to the purpose of JDBC (java) and DBI (perl). While there is some existing work in place (ddbc, etc.c.odbc.sql, vibe.d and other newer projects), my goal, after a lengthy period of community review/revision, is to achieve Phobos inclusion for the interface standard and also some of the implementations. There is debate on whether the implementations belong there, but I have made the implementation Phobos compatible (by using templates) while this issue is sorted out.

My quick comments:

1. In my opinion it should not be called std.database, but let's say "std.dbc". This is because it looks like a wrapper over db clients. Moreover, one may say it's almost impossible to make a common and effective interface which would work with all databases; e.g. someone on Wikipedia argues ODBC is obsolete (check "ODBC today").

2. I'm not against such functionality in Phobos. However, your project seems to be a duplication of the DDBC project. And it was not proposed for inclusion so often.

3. What I call a D Database API could be described as:
- Database object
- DbCollection object (equivalent to array)
- ranges + std.algorithm

And here is my implementation (experimental) of this API:
https://github.com/PiotrekDlang/AirLock/tree/master/docs/database/design.md
https://github.com/PiotrekDlang/AirLock/tree/master/src

Piotrek
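The outline in point 3 can be pictured like this; everything here (Database, DbCollection, Person) is a hypothetical stand-in, with an in-memory array playing the role of storage, just to show how "ranges + std.algorithm" would feel:

```d
import std.algorithm.iteration : filter, map;
import std.array : array;

struct Person { string name; int age; }

// a DbCollection exposes its items as a range; a real engine would
// stream them from storage instead of holding an array
struct DbCollection(T) {
    T[] items;
    auto opSlice() { return items; }
}

// the Database object owns the collections
struct Database {
    DbCollection!Person people;
}

void main() {
    auto db = Database(DbCollection!Person(
        [Person("Ann", 34), Person("Bob", 19)]));
    // querying is plain range composition
    auto adults = db.people[].filter!(p => p.age > 21).map!(p => p.name).array;
    assert(adults == ["Ann"]);
}
```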
Mar 02 2016
On Wednesday, 2 March 2016 at 15:41:56 UTC, Piotrek wrote:
> 1. In my opinion it should not be called std.database, but let's say "std.dbc". This is because it looks like a wrapper over db clients. Moreover one may say it's almost impossible to make a common and effective interface which would work with all databases. e.g. someone on Wikipedia argues ODBC is obsolete (check "ODBC today")

I agree that std.dbc would be more accurate, although I like std.database better because it is more recognizable. Another option could be std.database.client.

ODBC is less important now, as there are fewer desktop applications that connect directly to databases. There is at least one example related to D: the ODBC driver for Presto, which is used in Excel. Server-side code is generally written using the underlying native client directly. However, ODBC support is useful in that it represents a catch-all covering all databases for which a native client is not yet supported. At least for the SQL databases, a standard interface is definitely achievable, and I believe that having it would allow D to get more traction as an application platform.

> 2. I'm not against such a functionality in Phobos. However your project seems to be a duplication of the DDBC project. And it was not proposed for inclusion so often.

There are a number of areas where this design is an improvement over DDBC: ease of use, better resource management (no scope, no GC), and Phobos compatibility, to name a few. There is a lot more that needs to be added to make it standards grade.

> 3. What I call a D Database API could be described as:
> - Database object
> - DbCollection object (equivalent to array)
> - ranges + std.algorithm

My design is indeed range based: stmt.range() returns a forward range proxy for the query results (the accessor name might change, however). Your engine project is interesting. I think there is common ground in the interfaces of the two projects, particularly in how the interface for the results might work. I will look more closely at the details to see what might be workable.

erik
Mar 02 2016
On Wednesday, 2 March 2016 at 17:13:32 UTC, Erik Smith wrote:
> There are a number of areas where this design is an improvement over DDBC: ease-of-use, better resource management (no scope, no GC), phobos compatibility, to name a few. There is a lot more that needs to be added to make it standards grade.

I agree with you that we need database manipulation in Phobos. However, modules like db, gui, xml or similar are too much work for one developer. And as you can see, from time to time someone appears with their own vision. That's why, a long time ago, I suggested DIP73 (http://wiki.dlang.org/DIP73) so that the collaborative work would be controlled by the D community (or the D Foundation). But I am aware that there is no agreement nor resources for that.

> Your engine project is interesting. I think there is common ground in the interfaces in the two projects, particularly in how the interface for the results might work. I will look more closely at the details to see what might be workable.
> erik

I agree that we (as a community) should work on common and effective APIs. Maybe when the D Foundation is big enough...

Piotrek
Mar 03 2016
On Thursday, 3 March 2016 at 16:08:03 UTC, Piotrek wrote:
> I agree with you we need database manipulation in Phobos. However modules like db, gui, xml or similar are too much work for one developer. And as you can see from time to time someone appears with their own vision. That's why, long time ago, I suggested DIP73 (http://wiki.dlang.org/DIP73) so the collaborative work would be controlled by the D community (or the D foundation). But I am aware that there is no agreement nor resources for that.
> I agree that we (as a community) should work on common and effective APIs. Maybe when D foundation is big enough...

Your process proposal (DIP73) was helpful and gives me a better perspective on the standardization process. Thanks for referencing that, Piotrek. You are right that areas like this are too much work for one developer. The only leverage I might have is a lot of familiarity with many of the native C client interfaces and experience implementing higher-level interfaces on top. That, and there is also a lot of existing work to draw on that can inform the design.

Also, my sense is that while there is less process for standardization in D at present, the rate at which progress can occur should be much higher (compared to ISO C++, for example). The last thing I want, however, is to get something accepted into std.experimental that is highly contentious or of subpar quality.
Mar 03 2016
On Thursday, 3 March 2016 at 17:46:02 UTC, Erik Smith wrote:
> The only leverage I might have is a lot of familiarity working with many of the native C client interfaces and experience implementing higher level interfaces on top. That, and there is also a lot of existing work to draw on that can inform the design.

BTW in the oracle driver you use that ODBC idiom of passing strings as pointer+length pairs. Why don't you use it in the ODBC driver?
Mar 04 2016
On Friday, 4 March 2016 at 16:54:33 UTC, Kagamin wrote:
> BTW in the oracle driver you use that ODBC idiom of passing strings as pointer+length pairs. Why don't you use it in ODBC driver?

That will be fixed in the next push.

erik
Mar 04 2016
On Wednesday, 2 March 2016 at 15:41:56 UTC, Piotrek wrote:
> On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:
>> [...]
> My quick comments:
> 1. In my opinion it should not be called std.database, but let's say "std.dbc". This is because it looks like a wrapper over db clients. Moreover one may say it's almost impossible to make a common and effective interface which would work with all databases. e.g. someone on Wikipedia argues ODBC is obsolete (check "ODBC today")
> [...]

Looks good. At least at first glance it seems pretty similar to SQLite. I'd like to take this API for sqlite-d, because it seems to fit quite naturally.
Mar 02 2016
On Wednesday, 2 March 2016 at 15:41:56 UTC, Piotrek wrote:
> Moreover one may say it's almost impossible to make a common and effective interface which would work with all databases. e.g. someone on Wikipedia argues ODBC is obsolete (check "ODBC today")

I believe that section is about PHP forums being locked to mysql, and performance is not the reason for that. It's mostly because mysql support is built in, works for most things, and is provided by hosters.
Mar 02 2016
On Wed, 02 Mar 2016 15:41:56 +0000, Piotrek wrote:
> 3. What I call a D Database API could be described as:
> - Database object
> - DbCollection object (equivalent to array)
> - ranges + std.algorithm

It looks like you're trying to write a LevelDB analogue that implements an array rather than a key/value collection -- and that's a decent start. If you're trying to connect to a SQL database or a document database, as I'd expect for something called "std.database", it's a pretty terrible API:

* No index scans, lookups, or range queries.
* No support for complex queries.
* No support for joins.
* No support for projections.
* No support for transactions.
* If you add support for transactions, you'll crash all the time because the transactions got too large, thanks to the full-table-scan mentality.
* In your implementation, updates must bring every affected row over the wire, then send back the modified row.
* Updates affect an entire row. If one process updates one field in a row and another one updates a different field, one of those writes gets clobbered.
* The API assumes a total ordering for each DbCollection. This is not valid.
* If there are multiple rows that compare as equal, there's no way to update only one of them in your implementation.
* In your implementation, updating one row is a ϴ(N) operation. It still costs ϴ(N) when the row you want to update is the first one in the collection.
Mar 02 2016
On Thursday, 3 March 2016 at 01:49:22 UTC, Chris Wright wrote:
> If you're trying to connect to a SQL database or a document database, as I'd expect for something called "std.database"

The thing is, I strongly encourage not reserving std.database for external database clients and, what is even more limiting, for SQL ones only.

> , it's a pretty terrible API:
> * No index scans, lookups, or range queries.

Indexes can be supported by strings and CTFE, can't they? e.g.

filter!q{item.elements.length < 10 && item.model == "Sport"}

> * No support for complex queries.

Not sure what you mean by complex queries. Also, I think the API allows arbitrarily complex queries.

> * No support for joins.

Can be done by attributes or other linking functionality between DbCollections.

> * No support for projections.

You mean something like referring to part of the item's fields? I see no problem here.

> * No support for transactions.
> * If you add support for transactions, you'll crash all the time because the transactions got too large, thanks to the full table scan mentality.

Isn't it just the "index support" case?

> * In your implementation, updates must bring every affected row over the wire, then send back the modified row.

In my implementation there is no wire (that's why I call it embedded). However, I thought we were talking about the API and not a particular implementation. I don't see how this API excludes RPC. Query strings (e.g. SQL) can be provided in the old-fashioned way.

> * Updates affect an entire row. If one process updates one field in a row and another one updates a different field, one of those writes gets clobbered.

I think this is just a "must have" for any db engine. I don't see how it applies to the proposed API, other than that any implementation of a db engine has to handle it properly. When I say DbCollection should behave similarly to an ordinary array, I don't mean it should be an ordinary array.

> * The API assumes a total ordering for each DbCollection. This is not valid.

I don't know what you mean here. An example would be good.

> * If there are multiple rows that compare as equal, there's no way to update only one of them in your implementation.
> * In your implementation, updating one row is a ϴ(N) operation. It still costs ϴ(N) when the row you want to update is the first one in the collection.

I'm still not sure if you are referring to my implementation or the hypothetical API. To be clear: my current implementation is still a proof of concept and surely *unfinished*. And in case you refer to my implementation, I plan to support O(1), O(log n) and O(n) access patterns with their "rights and duties".

Cheers,
Piotrek
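The filter!q{...} idea works today with std.algorithm, whose string predicates are compiled into lambdas at compile time; note that the element is conventionally named "a" in the string, and the Item type here is made up for illustration:

```d
import std.algorithm.iteration : filter;
import std.array : array;

struct Item { int[] elements; string model; }

void main() {
    auto items = [
        Item([1, 2, 3], "Sport"),
        Item(new int[20], "Sport"),
        Item([4, 5], "Family"),
    ];
    // the q{} token string becomes a compile-time predicate; "a" is the element
    auto hits = items.filter!q{ a.elements.length < 10 && a.model == "Sport" }.array;
    assert(hits.length == 1 && hits[0].elements == [1, 2, 3]);
}
```

Turning such a predicate into an index scan on a storage engine, rather than a full pass over the range, is the hard part Chris alludes to.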
Mar 03 2016
On Thu, 03 Mar 2016 15:50:04 +0000, Piotrek wrote:On Thursday, 3 March 2016 at 01:49:22 UTC, Chris Wright wrote:You were a bit vague before. I interpreted you as saying "just offer a range and an array-like API, and then you can use it with std.algorithm". But if you meant to offer an API that is similar to std.algorithm and also array-like, that's more feasible. You're still left with the task of transpiling D to SQL. This model does not work with CouchDB. You must avoid using std.algorithm and std.range functions assiduously because they would offer terrible performance.If you're trying to connect to a SQL database or a document database, as I'd expect for something called "std.database"The thing is I strongly encourage to not reserve std.database for external database clients and even what is more limiting to SQL ones only., it's a pretty terrible API: * No index scans, lookups, or range queries.Indexes can be supported by strings and CTFE, can't they? e.g. filter!q{item.elements.length < 10 && item.model == "Sport"}Aggregates, especially with joins. Computed fields.* No support for complex queries.Not sure what you mean by complex queries. Also I think the API allows arbitrary complex queries.With attributes, you need users to define aggregate types instead of just using Row and the like. That's ORM territory. At a previous job I maintained an internal BI site that exposed 50-100 different queries, each with their own set of result fields. We didn't want to use ORM there; it would have been cumbersome and inappropriate. Also, that assumes that you will always want a join when querying a table. I maintained an application once, using ORM, in which we sometimes wanted an eager join and sometimes wanted a lazy one. This posed a nontrivial performance impact. 
I'm not sure ORM would be a candidate for phobos.* No support for joins.Can be done by attributes or other linking functionality between DbCollections.Let me point you to the existence of the TEXT and BLOB datatypes. They can each hold 2**32 bytes of data in MySQL. I'm not splitting those off into a separate table to port my legacy database to your API. I'm not dragging in multiple megabytes of data in every query. If you're going full ORM, you can add lazy fields. That adds complexity. It's also inefficient when I know in advance that I need those fields.* No support for projections.You mean something like referring to part of the item's fields? I see no problem here.You didn't mention transactions at all in the initial outline. After that, yes, in large portion index support addresses this. DB-side aggregation also helps.* No support for transactions. * If you add support for transactions, you'll crash all the time because the transactions got too large, thanks to the full table scan mentality.Isn't it just the "index support" case?I'm running a website and decide that, with the latest changes, existing users need to get the new user email. So I write: UPDATE users SET sent_join_email = FALSE; -- ok; 1,377,212 rows affected Or I'm using your database system. If it uses std.algorithm, I have to iterate through the users list, pulling each row into my process's memory from the database server, and then I have to write everything back to the database server. Depending on the implementation, it's using a database cursor or issuing a new query for every K results. If it's using a database cursor, those might not be valid across transaction boundaries. I'm not sure. If they aren't, you get a large transaction, which causes problems. 
If your database system instead offers a string-based API similar to std.algorithm, you might be able to turn this into a single query, but it's going to be a lot of work for you.* In your implementation, updates must bring every affected row over the wire, then send back the modified row.In my implementation there is no wire (that's why I call it embedded). However I thought we talk about API and not particular implementation. I don't see how this API excludes RPC. Query strings (e.g. SQL) can be provided in old fashioned way.Without transactions, MySQL supports writing to two different columns in two different queries without those writes clobbering each other. That's handling it properly.* Updates affect an entire row. If one process updates one field in a row and another one updates a different field, one of those writes gets clobbered.I think this is just a "must have" for any db engine. I don't see how it applies to the proposed API other than any implementation of db engine has to handle it properly.When I say DbCollection should behave similar to an ordinal array I don't mean it should be an ordinal array.opIndex(size_t offset) assumes the database supports a one-to-one mapping between offsets and rows. SQLite, for one, does not guarantee query result ordering if the query does not include an ORDER BY clause. So offering an opIndex(size_t offset) operation is a problem -- either you load the entire table into memory in advance, or you might get the same row returned for every index. Alternatively, you must examine the table and generate a sufficiently unique ordering for it. Any mutation to the table's contents can change this ordering. Some collection iterators throw an exception if you modify the collection during iteration. But, to avoid silent errors you can't defend against, you have to dump the whole table into a transaction or have the database somehow tell you when someone else has modified it. 
Speaking of modifications, let's say I write a simple loop like: for (size_t i = 0; i < dbTable.length; i++) { writeln(dbTable[i].id); } This can go wrong in about four ways: * Someone inserts a row whose index is less than i. This prints out the same id twice. * Someone deletes a row whose index is less than i. This skips a different row. * Someone deletes a row when i == dbTable.length - 1, just before opIndex executes. I get an index out of bounds error. * This is issuing two queries per iteration. It's going to take probably fifty times longer than using a cursor. This is a terrible usage pattern, but by offering opIndex and length operations, you are recommending it.* The API assumes a total ordering for each DbCollection. This is not valid.I don't know what you mean here. Example would be good.I specifically said "in your implementation" for these two because I was referring to your implementation rather than your proposal in general. The rest refers to the basic idea.* If there are multiple rows that compare as equals, there's no way to update only one of them in your implementation. * In your implementation, updating one row is a ϴ(N) operation. It still costs ϴ(N) when the row you want to update is the first one in the collection.I'm still not sure if you are referring to my implementation or a hypothetical API. To be clear: my current implementation is still proof of concept and surely *unfinished*. And in case you refer to my implementation I plan to support O(1), O(log n) and O(n) access patterns with their "rights and duties".
Mar 03 2016
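[Editor's note] The string-predicate idea quoted in the exchange above (filter!q{...}) already works today against in-memory ranges; whether a database layer could translate such strings into index scans is the open question. A minimal sketch, with a hypothetical Item type invented for illustration:

```d
import std.algorithm.iteration : filter;
import std.array : array;

// Hypothetical item type, only to exercise the quoted predicate style.
struct Item { string model; int[] elements; }

void main()
{
    auto items = [
        Item("Sport", [1, 2, 3]),
        Item("Touring", new int[12]),
        Item("Sport", new int[20]),
    ];

    // The string predicate becomes an ordinary lambda at compile time
    // (the implicit parameter is named "a"); a database backend would
    // instead have to parse it to drive an index lookup.
    auto hits = items.filter!q{ a.elements.length < 10 && a.model == "Sport" }
                     .array;
    assert(hits.length == 1);
    assert(hits[0].elements == [1, 2, 3]);
}
```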
On Thursday, 3 March 2016 at 18:48:08 UTC, Chris Wright wrote:You were a bit vague before. I interpreted you as saying "just offer a range and an array-like API, and then you can use it with std.algorithm". But if you meant to offer an API that is similar to std.algorithm and also array-like, that's more feasible.I agree I could be better in describing the concept. But I just sketched the idea.You're still left with the task of transpiling D to SQL.If someone wants to use SQL in its *full power*, no D API nor any other language will suffice, mainly because it will always be a translation layer. The only thing we can do is provide aids like those suggested by Andrei (sql parser, value binding, etc).This model does not work with CouchDB.I don't know CouchDB so I can't comment.You must avoid using std.algorithm and std.range functions assiduously because they would offer terrible performance.For big data in db, plain vanilla std.algorithm won't be sufficient. I agree.Regarding computed fields and other database vendor specific features you are right. But on the other hand aggregations and joins can be represented as objects and proxies of objects.Aggregates, especially with joins. Computed fields.* No support for complex queries.Not sure what you mean by complex queries. Also I think the API allows arbitrarily complex queries.I don't like ORM with respect to SQL. But a quasi object database which can look similar to ORM is not a problem for me.With attributes, you need users to define aggregate types instead of just using Row and the like. That's ORM territory.* No support for joins.Can be done by attributes or other linking functionality between DbCollections.At a previous job I maintained an internal BI site that exposed 50-100 different queries, each with their own set of result fields. We didn't want to use ORM there; it would have been cumbersome and inappropriate.I can see your point. 
But the problem can be solved by not using SQL.Also, that assumes that you will always want a join when querying a table. I maintained an application once, using ORM, in which we sometimes wanted an eager join and sometimes wanted a lazy one. This posed a nontrivial performance impact.Something like DbProxy would handle lazy "joins".I'm not sure ORM would be a candidate for phobos.As I don't plan to use a (traditional) ORM I'm not involved. However if other people would find it worthy I don't object.This is something a DbProxy would handle. Eventually: struct OriginalObject { int id; string bigString; } struct StrippedObject { int id; } then auto collA = db.collection!OriginalObject("Big"); auto collB = db.collection!StrippedObject("Big"); In the second line the string is not fetched.Let me point you to the existence of the TEXT and BLOB datatypes. They can each hold 2**32 bytes of data in MySQL.* No support for projections.You mean something like referring to part of the item's fields? I see no problem here.I'm not splitting those off into a separate table to port my legacy database to your API. I'm not dragging in multiple megabytes of data in every query. If you're going full ORM, you can add lazy fields. That adds complexity. It's also inefficient when I know in advance that I need those fields.For client-server approach I agree with the above. For embedded design (as in my project) this is not the case.I'm running a website and decide that, with the latest changes, existing users need to get the new user email. So I write: UPDATE users SET sent_join_email = FALSE; -- ok; 1,377,212 rows affected Or I'm using your database system. If it uses std.algorithm, I have to iterate through the users list, pulling each row into my process's memory from the database server, and then I have to write everything back to the database server. Depending on the implementation, it's using a database cursor or issuing a new query for every K results. 
If it's using a database cursor, those might not be valid across transaction boundaries. I'm not sure. If they aren't, you get a large transaction, which causes problems. If your database system instead offers a string-based API similar to std.algorithm, you might be able to turn this into a single query, but it's going to be a lot of work for you.* In your implementation, updates must bring every affected row over the wire, then send back the modified row.In my implementation there is no wire (that's why I call it embedded). However I thought we talk about API and not particular implementation. I don't see how this API excludes RPC. Query strings (e.g. SQL) can be provided in old fashioned way.Designing a good locking mechanism will be a challenging task, of that I'm sure :)Without transactions, MySQL supports writing to two different columns in two different queries without those writes clobbering each other. That's handling it properly.* Updates affect an entire row. If one process updates one field in a row and another one updates a different field, one of those writes gets clobbered.I think this is just a "must have" for any db engine. I don't see how it applies to the proposed API other than any implementation of db engine has to handle it properly.[...]When I say DbCollection should behave similar to an ordinal array I don't mean it should be an ordinal array.opIndex(size_t offset) assumes the database supports a one-to-one mapping between offsets and rows.* The API assumes a total ordering for each DbCollection. This is not valid.I don't know what you mean here. Example would be good.This is a terrible usage pattern, but by offering opIndex and length operations, you are recommending it.I don't recommend it. I just added it for evaluation. I'm aware it only works when the collection is not mutated. I think the same goes for all shared collections (also those in memory). Finally, IMO any DB API will be biased toward one solution. Cheers Piotrek
Mar 03 2016
On Fri, 04 Mar 2016 06:16:59 +0000, Piotrek wrote:For client-server approach I agree with the above. For embedded design (as in my project) this is not a case.Which is all I'm saying. Something in std.database sounds like it should allow you to interact with databases, like JDBC. With embedded databases, there's a lot of variety out there, probably a decent selection of tradeoffs, so I'm not sure any one would be appropriate to phobos.
Mar 04 2016
On Friday, 4 March 2016 at 16:41:35 UTC, Chris Wright wrote:With embedded databases, there's a lot of variety out there, probably a decent selection of tradeoffs, so I'm not sure any one would be appropriate to phobos.The one written from scratch specially for D (I'm talking in general, nothing particular in mind). This is my idea of the optimal solution. And I can be wrong of course. And there is a need for a module with interface to other databases, something like Erik's project. Piotrek
Mar 04 2016
On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:I'm back to actively working on a std.database specification & implementation. It's still unstable, minimally tested, and there is plenty of work to do, but I wanted to share an update on my progress.I suggest you call the package stdx.db - it is not (and may not become) a standard package, so `std` is out of the question. If it is supposed to be *proposed* as a standard package, then `stdx` is good because that is what some people have used in the past (while others used the ugly std.experimental. for the same purpose). I humbly believe that this effort **must** be collaborative, as such a package is doomed to fail if done wrong.
Mar 03 2016
I suggest you call the package stdx.db - it is not (and may not become) a standard package, so `std` is out of question. If it is supposed to be *proposed* as standard package, then `stdx` is good because that is what some people have used in the past (while others used the ugly std.experimental. for the same purpose). I humbly believe that this effort **must** be collaborative as such package is doomed to fail if done wrong.I totally agree that it must be collaborative and community driven. I failed to add the proposed qualifier in this thread - sorry about that. Right now I'm just trying to put together enough of a substantive design to be worthy of discussion. While I'm presenting a design the way I think it should work, I'm definitely asking for feedback and especially opposition to any aspect of it. My package name choice is just an example, the package name would have to be changed to whatever the "D committee" thinks is appropriate. erik
Mar 03 2016
Also member names: methods are named after verbs, you use nouns. Method `next` is ambiguous: is the first row the next row? `fetch` or `fetchRow` would be better.
Mar 03 2016
On Thursday, 3 March 2016 at 15:07:43 UTC, Kagamin wrote:Also member names: methods are named after verbs, you use nouns. Method `next` is ambiguous: is the first row the next row? `fetch` or `fetchRow` would be better.Those are actually internal methods not intended for the interface user. They might get exposed in some way to the implementer, so I'll have to consider this issue. However, if I apply what you are suggesting to the interface methods, it would look like this: auto r = db.connection().statement("select from t").range(); // nouns auto r = db.getConnection().getStatement("select from t").getRange(); // verbs Maybe there is a more creative way to change the names rather than prepending with get (let me know if you have suggestions), but the nouns seem cleaner to me. I'll have to look at more phobos code to get a better sense of the conventions. Also range will be replaced by an opSlice operator. erik
Mar 03 2016
On Thursday, 3 March 2016 at 15:53:28 UTC, Erik Smith wrote:auto r = db.connection().statement("select from t").range(); // nounsdb.execute("select from t").range(); `range` is probably ok. Or auto connection = db.createConnection(); connection.execute("select from t").range();auto r = db.getConnection().getStatement("select from t").getRange(); // verbsGetters must be actually nouns, but these are not getters, but factory methods: createConnection, createStatement, probably low level and/or private.
Mar 03 2016
Other options: db.execute("select from t").reader; //row range db.execute("select from t").get!long; //scalar db.execute("select from t"); //non-query
Mar 03 2016
On the other hand execute can simply return the reader with extra getter for scalar result. Just don't do stuff until it's iterated. Is it possible? auto rows = db.execute("select * from t");
Mar 03 2016
On Thursday, 3 March 2016 at 17:03:58 UTC, Kagamin wrote:On Thursday, 3 March 2016 at 15:53:28 UTC, Erik Smith wrote:Good point. I will track this as a design option to debate.auto r = db.connection().statement("select from t").range(); // nounsdb.execute("select from t").range(); `range` is probably ok. Or auto connection = db.createConnection(); connection.execute("select from t").range();auto r = db.getConnection().getStatement("select from t").getRange(); // verbsGetters must be actually nouns, but these are not getters, but factory methods: createConnection, createStatement, probably low level and/or private.db.execute("select from t").reader; //row range db.execute("select from t").get!long; //scalar db.execute("select from t"); //non-queryMore good options (the 3rd one is there). Also at the value access level there are several options to consider: v.get!long, v.as!long, v.to!long, etc.On the other hand execute can simply return the reader with extra getter for scalar result. Just don't do stuff until it's iterated. Is it possible? auto rows = db.execute("select * from t");I'm hedging a bit on this because there are other capabilities that need to be introduced that might present a problem. Another issue is that this might conflict with the notion of a container and the use of opSlice. Great feedback though and I'm tracking it. erik
Mar 03 2016
On Thursday, 3 March 2016 at 18:08:26 UTC, Erik Smith wrote:to!long looks ok.db.execute("select from t").reader; //row range db.execute("select from t").get!long; //scalar db.execute("select from t"); //non-queryMore good options (the 3rd one is there). Also at the value access level there are several options to consider: v.get!long, v.as!long, v.to!long, etc.Can you elaborate? On the other hand these helper methods are built on top of abstraction API and you have to duplicate them in all drivers. Maybe better have them as extension methods in a single module that will work with all drivers? They are effectively a sort of minimal frontend already.On the other hand execute can simply return the reader with extra getter for scalar result. Just don't do stuff until it's iterated. Is it possible? auto rows = db.execute("select * from t");I'm hedging a bit on this because there are other capabilities that need to be introduced that might present a problem.Another issue is that this might conflict with the notion of a container and the use of opSlice. Great feedback though and I'm tracking it.The result set is even iterator/stream, i.e. conceptually has even less container properties than ranges themselves. Why would you think of it as a container? I think it's ok as input range. Anyway, this `execute` method is a frontend, i.e. it's replaceable without touching the driver.
Mar 04 2016
On Friday, 4 March 2016 at 11:57:49 UTC, Kagamin wrote:On Thursday, 3 March 2016 at 18:08:26 UTC, Erik Smith wrote:Actually I like this and I think it can work. I'm trying to keep a single execute function name for both row/no-row queries. I can still return the range proxy for no-row queries that is either empty or throws on access.to!long looks ok.db.execute("select from t").reader; //row range db.execute("select from t").get!long; //scalar db.execute("select from t"); //non-queryMore good options (the 3rd one is there). Also at the value access level there are several options to consider: v.get!long, v.as!long, v.to!long, etc.Can you elaborate?On the other hand execute can simply return the reader with extra getter for scalar result. Just don't do stuff until it's iterated. Is it possible? auto rows = db.execute("select * from t");I'm hedging a bit on this because there are other capabilities that need to be introduced that might present a problem.On the other hand these helper methods are built on top of abstraction API and you have to duplicate them in all drivers. Maybe better have them as extension methods in a single module that will work with all drivers? They are effectively a sort of minimal frontend already.I'm just waiting for the code to settle a bit in a basic working state before I refactor into a two layer design.The result set is even iterator/stream, i.e. conceptually has even less container properties than ranges themselves. Why would you think of it as a container? I think it's ok as input range. Anyway, this `execute` method is a frontend, i.e. it's replaceable without touching the driver.The range itself is an InputRange. The statement is acting as a container only in that it is the source of the range and holds the data. I agree that it is confusing to use the term container although it seems to fit the definition of one.
Mar 04 2016
On Friday, 4 March 2016 at 14:44:48 UTC, Erik Smith wrote:Actually I like this and I think it can work. I'm trying to keep a single execute function name for both row/no-row queries. I can still return the range proxy for no-row queries that is either empty or throws on access.Yes, that's the idea.I'm just waiting for the code to settle a bit in a basic working state before I refactor into a two layer design.Another idea from previous attempts: rename `execute` to `query`. foreach(row; db.query("select * from t")) { ... } And name for package: std.sql`execute` should not return a statement, but a result set reader, that would be a range. Yeah, the result set is indeed a range conceptually: it has a definite number of ordered rows, a beginning and an end.The result set is even iterator/stream, i.e. conceptually has even less container properties than ranges themselves. Why would you think of it as a container? I think it's ok as input range. Anyway, this `execute` method is a frontend, i.e. it's replaceable without touching the driver.The range itself is an InputRange. The statement is acting as a container only in that it is the source of the range and holds the data. I agree that it is confusing to use the term container although it seems to fit the definition of one.
Mar 04 2016
On Friday, 4 March 2016 at 16:43:00 UTC, Kagamin wrote:On Friday, 4 March 2016 at 14:44:48 UTC, Erik Smith wrote:On further thought, execute() should definitely return something, but I think it needs to return a Result (alternative name RowSet), the actual container, rather than the range. This more cleanly separates out the post-execute state management into a separate object from Statement. The user will often want other metadata (column names/types) describing the returned table, and RowSet provides convenient access to it. This also suggests that execute should be explicit & once-only, which I think is better. Here is an updated example to illustrate: auto db = createDatabase("file:///testdb"); auto rowSet = db.connection().statement("select name,score from score").execute; foreach (r; rowSet) writeln(r[0].as!string,",",r[1].as!int); I'll track query as an alternative name for execute() and std.sql as an alternative for std.database.Actually I like this and I think it can work. I'm trying to keep a single execute function name for both row/no-row queries. I can still return the range proxy for no-row queries that is either empty or throws on access.Yes, that's the idea.I'm just waiting for the code to settle a bit in a basic working state before I refactor into a two layer design.Another idea from previous attempts: rename `execute` to `query`. foreach(row; db.query("select * from t")) { ... } And name for package: std.sql`execute` should not return a statement, but a result set reader, that would be a range. Yeah, the result set is indeed a range conceptually: it has a definite number of ordered rows, a beginning and an end.The result set is even iterator/stream, i.e. conceptually has even less container properties than ranges themselves. Why would you think of it as a container? I think it's ok as input range. Anyway, this `execute` method is a frontend, i.e. it's replaceable without touching the driver.The range itself is an InputRange. 
The statement is acting as a container only in that it is the source of the range and holds the data. I agree that it is confusing to use the term container although it seems to fit the definition of one.
Mar 04 2016
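[Editor's note] The container-vs-range distinction debated above can be sketched with a toy in-memory model. All names here (RowSet, Row, Value, as) are assumptions modeled on the thread's examples, not the actual std.database proposal; a real driver would fetch rows lazily from the server:

```d
import std.conv : to;
import std.range.primitives : isInputRange;

// Hypothetical cell type: as!T converts via std.conv, as in the thread's
// r[0].as!string example.
struct Value
{
    string data;
    T as(T)() { return data.to!T; }
}

// Hypothetical row with indexed column access.
struct Row
{
    Value[] cells;
    Value opIndex(size_t i) { return cells[i]; }
}

// The "container" returned by execute(); iterating it yields rows,
// so it is simultaneously a valid InputRange.
struct RowSet
{
    Row[] rows; // a real driver would buffer/fetch these lazily
    bool empty() const { return rows.length == 0; }
    Row front() { return rows[0]; }
    void popFront() { rows = rows[1 .. $]; }
}

static assert(isInputRange!RowSet);

void main()
{
    auto rs = RowSet([Row([Value("alice"), Value("42")])]);
    foreach (r; rs)
    {
        assert(r[0].as!string == "alice");
        assert(r[1].as!int == 42);
    }
}
```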
On Friday, 4 March 2016 at 18:42:45 UTC, Erik Smith wrote:auto db = createDatabase("file:///testdb"); auto rowSet = db.connection().statement("select name,score from score").execute; foreach (r; rowSet) writeln(r[0].as!string,",",r[1].as!int); I'll track query as an alternative name for execute() and std.sql as alternative for std.database.Why not only: auto rowSet = db.connection().query("select name,score from score"); Or instead of query maybe "returnRows" or just "rows", I think the other way is too verbose. Bubba.
Mar 04 2016
On Friday, 4 March 2016 at 19:27:47 UTC, Bubbasaur wrote:On Friday, 4 March 2016 at 18:42:45 UTC, Erik Smith wrote:Agreed, connection execute is simpler and it is supported, although I need to add the bind version to connection as well. The statement should only get involved when you need more specific control over the input binding. I'll add returnRows/rows to the execute/query alternative name list. erikauto db = createDatabase("file:///testdb"); auto rowSet = db.connection().statement("select name,score from score").execute; foreach (r; rowSet) writeln(r[0].as!string,",",r[1].as!int); I'll track query as an alternative name for execute() and std.sql as alternative for std.database.Why not only: auto rowSet = db.connection().query("select name,score from score"); Or instead of query maybe "returnRows" or just "rows", I think the other way is too verbose. Bubba.
Mar 04 2016
On Friday, 4 March 2016 at 18:42:45 UTC, Erik Smith wrote:auto db = createDatabase("file:///testdb"); auto rowSet = db.connection().statement("select name,score from score").execute; foreach (r; rowSet) writeln(r[0].as!string,",",r[1].as!int);You'll want to have some types in there. As in struct S { string name; int age; } auto rowSet = db.connection().statement("select name,score from score").executeAs!S; foreach (r; rowSet) writeln(r.name,",",r.age);
Mar 04 2016
On Friday, 4 March 2016 at 22:44:24 UTC, Sebastiaan Koppe wrote:On Friday, 4 March 2016 at 18:42:45 UTC, Erik Smith wrote:I think some basic object serialization capabilities would be great although I'm not sure how the bare names can be accessed like that through the rowSet. How would that work? I can see this: S s; auto rowSet = db.connection().statement("select name,score from score").into(s); writeln(s.name,",",s.age); Two other options (not really serialization): // 1 string name; int age; auto rowSet = db.connection().execute("select name,age from score").into(name, age); foreach (r; rowSet) writeln(name,",",age); // 2 foreach (r; rowSet) writeln(r["name"],",",r["age"]);auto db = createDatabase("file:///testdb"); auto rowSet = db.connection().statement("select name,score from score").execute; foreach (r; rowSet) writeln(r[0].as!string,",",r[1].as!int);You'll want to have some types in there. As in struct S { string name; int age; } auto rowSet = db.connection().statement("select name,score from score").executeAs!S; foreach (r; rowSet) writeln(r.name,",",r.age);
Mar 04 2016
On Friday, 4 March 2016 at 23:55:55 UTC, Erik Smith wrote:I think some basic object serialization capabilities would be great although I'm not sure how the bare names can be accessed like that through the rowSet How would that work?I did a project a while back using mysql-native (see code.dlang.org) and it has a `toStruct` function on its Row type. Maybe have a look there. To access the bare names, RowSet would have to be templated on the struct. Follows code from said project: template executeAs(Type) { struct TypedSQLSequence { ResultSequence rs; alias rs this; @property Type front() { Row r = rs.front; Type temp; r.toStruct!Type(temp); return temp; } } auto executeAs(ref mysql.connection.ResultSequence rs) { return TypedSQLSequence(rs); } }
Mar 05 2016
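[Editor's note] The toStruct-style mapping mentioned above can be sketched generically with compile-time reflection. This is a standalone illustration over a hypothetical row represented as string[string]; mysql-native's actual Row type works differently:

```d
import std.conv : to;

struct S { string name; int age; }

// Hypothetical row shape for the sketch: column name -> string value.
alias Row = string[string];

// Map a row onto any plain struct by matching field names to columns.
T toStruct(T)(Row r)
{
    T result;
    // result.tupleof unrolls at compile time, one iteration per field.
    foreach (i, ref field; result.tupleof)
    {
        enum fieldName = __traits(identifier, T.tupleof[i]);
        field = r[fieldName].to!(typeof(field));
    }
    return result;
}

void main()
{
    Row r = ["name": "bob", "age": "27"];
    auto s = r.toStruct!S;
    assert(s.name == "bob");
    assert(s.age == 27);
}
```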
On Thursday, 3 March 2016 at 11:16:03 UTC, Dejan Lekic wrote:On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:std.experimental, ugly or not, is what is in phobos. See std.experimental.allocator, std.experimental.logger and std.experimental.ndsliceI'm back to actively working on a std.database specification & implementation. It's still unstable, minimally tested, and there is plenty of work to do, but I wanted to share an update on my progress.I suggest you call the package stdx.db - it is not (and may not become) a standard package, so `std` is out of question. If it is supposed to be *proposed* as standard package, then `stdx` is good because that is what some people have used in the past (while others used the ugly std.experimental. for the same purpose). I humbly believe that this effort **must** be collaborative as such package is doomed to fail if done wrong.
Mar 03 2016
On Tuesday, 1 March 2016 at 21:00:30 UTC, Erik Smith wrote:I'm back to actively working on a std.database specification & implementation. It's still unstable, minimally tested, and there is plenty of work to do, but I wanted to share an update on my progress. [...]A little late to the party, nevertheless: Thanks for doing this, it will be super-helpful! My only feature request will be: please make it work with minimal effort with Vibe.D! :) ~ sd
Mar 05 2016
On 2016-03-05 11:23, Saurabh Das wrote:A little late to the party, nevertheless: Thanks for doing this, it will be super-helpful! My only feature request will be: please make it work with minimal effort with Vibe.D! :)Yeah, that's really important. Unfortunately it looks like a synchronous interface is prioritized :(. -- /Jacob Carlborg
Mar 05 2016
On Saturday, 5 March 2016 at 13:13:09 UTC, Jacob Carlborg wrote:On 2016-03-05 11:23, Saurabh Das wrote:I'm definitely going to start working in async capability (or more accurately, non-blocking) into the interface. Both models are essential and there are strong use cases for both, but I know there is a lot of interest in NBIO for vibe.d compatibility. This is challenging and, as many are aware, the C clients for SQL databases have traditionally been synchronous only. One suggested approach is to fork the driver source and change the I/O calls to be non-blocking. Another is to implement a non-blocking driver directly against the wire protocol (a "native" driver). These are limited options, but to the extent this approach is viable in some cases (mysql-native for example), they could be adapted to a standard interface. The good news is that some databases do have (or are working on) non-blocking support in their C clients. The webscalesql mysql fork, for example, has a non-blocking client that is in production and their client works with regular mysql databases. That fork should eventually be merged back into mysql (or may have been already). Postgres also provides non-blocking support. These are two cases that I'm targeting for initial reference implementations.A little late to the party, nevertheless: Thanks for doing this, it will be super-helpful! My only feature request will be: please make it work with minimal effort with Vibe.D! :)Yeah, that's really important. Unfortunately it looks like a synchronous interface is prioritized :(.
Mar 05 2016
On 2016-03-05 19:00, Erik Smith wrote:I'm definitely going to start working in async capability (or more accurately, non-blocking) into the interface. Both models are essential and there are strong use cases for both, but I know there is a lot of interest in NBIO for vibe.d compatibility. This is challenging and, as many are aware, the C clients for SQL databases have traditionally been synchronous only. One suggested approach is to fork the driver source and change the I/O calls to be non-blocking. Another is to implement a non-blocking driver directly against the wire protocol (a "native" driver). These are limited options, but to the extent this approach is viable in some cases (mysql-native for example), they could be adapted to a standard interface. The good news is that some databases do have (or are working on) non-blocking support in their C clients. The webscalesql mysql fork, for example, has a non-blocking client that is in production and their client works with regular mysql databases. That fork should eventually be merged back into mysql (or may have been already). Postgres also provides non-blocking support. These are two cases that I'm targeting for initial reference implementations.Just for the record, there's already a native Postgres client that is non-blocking and vibe.d compatible. https://github.com/pszturmaj/ddb -- /Jacob Carlborg
Mar 05 2016
On Saturday, 5 March 2016 at 18:00:56 UTC, Erik Smith wrote:I'm definitely going to start working in async capability (or more accurately, non-blocking) into the interface.Huh? Non-blocking operation is not reflected in interface. Only asynchronous requires special interface. Vibe interfaces are synchronous.Postgres also provides non-blocking support. These are two cases that I'm targeting for initial reference implementations.I've looked into OCI headers, they say something about asynchronous operations, do you know something about that?
Mar 06 2016
On Sunday, 6 March 2016 at 08:53:05 UTC, Kagamin wrote:On Saturday, 5 March 2016 at 18:00:56 UTC, Erik Smith wrote:I think a basic non-blocking interface should be similar to the synchronous one, but also exposes a file/socket descriptor to pass to the event system and also provide status method for whether the R/W operation is waiting.I'm definitely going to start working in async capability (or more accurately, non-blocking) into the interface.Huh? Non-blocking operation is not reflected in interface. Only asynchronous requires special interface. Vibe interfaces are synchronous.I've looked into OCI headers, they say something about asynchronous operations, do you know something about that?I need to look into this more, but the OCI non-blocking model is problematic (a missing file descriptor maybe). It can probably be made to work since Node now has a driver directly from Oracle, but I'm not sure if it has the same scaling profile as other non-blocking clients.
Mar 06 2016
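[Editor's note] The non-blocking interface idea sketched above (expose a descriptor for the event loop plus a wait-status query) might look roughly like the following. Every name here (NonBlockingConnection, IOStatus, socketHandle, tick, MockConnection) is an assumption for illustration, not part of any existing driver:

```d
// Status the event loop asks for: is the driver done, or waiting on I/O?
enum IOStatus { ready, waitRead, waitWrite }

interface NonBlockingConnection
{
    int socketHandle();   // descriptor to register with the event system
    IOStatus status();    // which readiness condition the driver needs next
    bool tick();          // advance the driver's state machine;
                          // returns true when the operation has completed
}

// Mock driver that "completes" after two read-ready events.
class MockConnection : NonBlockingConnection
{
    private int pendingReads = 2;
    int socketHandle() { return 3; } // dummy descriptor
    IOStatus status() { return pendingReads > 0 ? IOStatus.waitRead
                                                : IOStatus.ready; }
    bool tick()
    {
        if (pendingReads > 0) --pendingReads;
        return pendingReads == 0;
    }
}

void main()
{
    NonBlockingConnection c = new MockConnection;
    // A real event loop would block on c.socketHandle() between ticks.
    while (!c.tick()) { }
    assert(c.status() == IOStatus.ready);
}
```

A vibe.d integration would hand socketHandle() to its event core and resume the fiber when the socket becomes readable/writable, re-invoking tick() until it reports completion.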
I've made a few updates based on some of the feedback in this thread. - execute() renamed to query() - query with input binds directly from connection - query() returns result for chaining The design is still early stage. I've got a growing list of design options which I'll bring into discussion once the interface becomes more stable. Other updates: - postgres reference implementation added (synchronous only at moment) - DUB package now available as dstddb - DMD "cyclic structs" bug fix may land soon (PR appeared) with design improvements to follow. erik
Mar 10 2016