
digitalmars.D - A few holes (imho) in the ecosystem of libraries

reply Guillaume Piolat <first.last spam.org> writes:
As an enthusiastic D practitioner, I could start more libraries, but 
I'm sure a lot of us already have a full plate of work to eat 
each day :)

Still, creating base libraries for D is fun/rewarding and I'm 
sure you'll agree.
But if it's not useful to others, there is a risk of just creating 
debt.
Here are the DOMAIN-SPECIFIC libraries I'd be interested to see. 
What are yours?


1. A font library that "does it all":
    * support TrueType/OpenType
    * rasterizing glyphs, glyph cache
    * or output the bezier curve instead to give to a Canvas 
rasterizer
       * including support for the many annoying TrueType 
extensions
       * because the dg2D rasterizer is faster than doing the 
glyph cache thing!
    * get font metrics
    * support a registry with font, parsed from system, but also 
added manually
    * to avoid fragmentation: -betterC/ nogc compatible, no 
exceptions, stuff like that
    * small, defined API
    arsd font is almost there, and printed:font is almost there, 
and many others do a part of this, but nothing quite does each of 
these things.


2. An I/O abstraction really suitable for parsing/emitting
    * just ubyte[]
    * no exceptions, instead provide a way to parse anyway (past 
stream end) and get the error later, so that we don't need to 
check for stream end (like stb does)
    * parse base types with provided endianness
    * good names, not so easy to come across
    * abstract over (say) fread, ftell, fwrite, fseek, for 
operations over files/memory ranges/lazy streams/growable memory 
ranges
    * helpers to skip bytes, ensure remaining bytes, substream, 
backtrack, etc.
    * not overly templated, struct-based, even manual delegation 
for wrapped I/O
    I have done a variation of that badly for both audio-formats 
and gamut, and it's annoyingly similar.
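
To make the "abstract over fread/ftell/fseek" and "manual delegation" bullets concrete, here is a minimal sketch of what such a struct-based abstraction might look like. Every name in it (`IOCallbacks`, `MemoryStream`, `memRead`, ...) is hypothetical and not taken from audio-formats, gamut, or any existing library; it only shows a struct of function pointers with an in-memory backend, without classes, templates, or exceptions.

```d
// Hypothetical names throughout; not the API of any existing library.
alias IOHandle = void*;

// A plain struct of function pointers: manual delegation instead of classes.
struct IOCallbacks
{
    // Reads up to `len` bytes into `dst`; returns how many bytes were read.
    size_t function(IOHandle h, ubyte* dst, size_t len) nothrow @nogc read;
    // Moves the cursor to an absolute offset; returns false if out of range.
    bool function(IOHandle h, size_t offset) nothrow @nogc seek;
    // Returns the current absolute offset.
    size_t function(IOHandle h) nothrow @nogc tell;
}

// In-memory backend: a slice plus a cursor.
struct MemoryStream
{
    const(ubyte)[] data;
    size_t pos;
}

size_t memRead(IOHandle h, ubyte* dst, size_t len) nothrow @nogc
{
    auto m = cast(MemoryStream*) h;
    size_t avail = m.data.length - m.pos;
    size_t n = len < avail ? len : avail;   // short read at end of data
    foreach (i; 0 .. n)
        dst[i] = m.data[m.pos + i];
    m.pos += n;
    return n;
}

bool memSeek(IOHandle h, size_t offset) nothrow @nogc
{
    auto m = cast(MemoryStream*) h;
    if (offset > m.data.length) return false;
    m.pos = offset;
    return true;
}

size_t memTell(IOHandle h) nothrow @nogc
{
    return (cast(MemoryStream*) h).pos;
}

// Bundles the three memory functions into one callback table.
IOCallbacks memoryCallbacks() nothrow @nogc
{
    IOCallbacks cb;
    cb.read = &memRead;
    cb.seek = &memSeek;
    cb.tell = &memTell;
    return cb;
}
```

A file backend would presumably fill the same three slots with thin wrappers around fread, fseek and ftell, so parsing code only ever sees an `IOCallbacks` and an `IOHandle`.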


What do YOU think is missing in the library department?
May 15 2023
next sibling parent Adam D Ruppe <destructionator gmail.com> writes:
On Monday, 15 May 2023 at 13:04:57 UTC, Guillaume Piolat wrote:
    * support a registry with font, parsed from system, but also 
 added manually
simpledisplay.d has a thing to pull out of the operating system, and then you can get the bytes to feed back into a stb ttf/freetype kind of thing, or just use it directly with the OS draw functions. I like it quite a bit, since writing that (iirc 2019ish, with the get ttf bytes added in Sept 2021), I've used it in many places. Doesn't do a custom added registry.... integrating that so the OS functions can read it might take some effort but if you just wanted loading that'd be a fairly easy add-on interface. If someone were to write such a thing, I'd adopt it under my umbrella.
 2. A I/O abstraction really suitable for parsing/emitting
    * just ubyte[]
    * no exceptions, instead provide a way to parse anyway (past 
 stream end) and get the error later, so that we don't need to 
 check for stream end (like stb does)
    * parse base types with provided endianness
    * good names, not so easy to come across
    * abstract over (say) fread, ftell, fwrite, fseek, for 
 operations over files/memory
 ranges/lazy streams/growable memory ranges
    * helpers to skip bytes, ensure remaining bytes, substream, 
 backtrack, etc.
    * not overly templated, struct-based, even manual delegation 
 for wrapped I/O
    I have done a variation of that badly for both audio-formats 
 and gamut, and it's annoyingly similar.
Yeah, I've done this a few times myself too and it is one of the things I'm putting in my new arsd.core lib (which was originally slated for release today, but I'm pretty far behind right now).

This is what I have so far: https://github.com/adamdruppe/arsd/blob/7958a5eb939fadd55c3ae01c08bc60027d977430/core.d#L4889

The unittest shows how you might use it:

    auto fiber = new Fiber(() {
        position = 1;
        int a = stream.get!int(ByteOrder.littleEndian);
        assert(a == 10);
        position = 2;
        ubyte b = stream.get!ubyte;
        assert(b == 33);
        position = 3;
    });
    fiber.call();
    assert(position == 1);
    stream.feedData([10, 0, 0, 0]);
    assert(position == 2);
    stream.feedData([33]);
    assert(position == 3);

(The same interface can also fread() instead of getting fed data from outside with other subclass implementations, but I specifically wanted it to be fiber compatible like this, so a task can be put on hold waiting for new data, e.g., from a network socket or ipc pipe.)

I'm fairly happy with it but haven't used it in a real program yet, so that might change. But the templated get/put things give you the types and then the virtual feed/flush convert that to just a byte array for transport.

Then my declarativeloader.d can be adapted to use this instead of its current fgetc and friends impl to be more generic to load up structs. (The only thing I've used it for right now is reading Java's .class files, but that covers a lot of common ground.)

Day job has been keeping me busy during the day and kid during the night, so I'm behind schedule on the arsd 11 tag, but I'm sure I'll be caught up and ready to tag in a couple more weeks so we can start really playing with this stuff.
May 15 2023
prev sibling next sibling parent reply Andrew <andrewlalisofficial gmail.com> writes:
On Monday, 15 May 2023 at 13:04:57 UTC, Guillaume Piolat wrote:
 2. A I/O abstraction really suitable for parsing/emitting
Regarding your second point, I noticed this was limiting me recently, so I've been working in the evenings on a proof-of-concept for what this could look like if added to Phobos in the same style as ranges: https://forum.dlang.org/thread/okoaghstghyxapqcwbtr forum.dlang.org My implementation of course allows for input and output streams of any type, but does offer quite a few extras for specifically `ubyte` streams, and in my opinion, it's a much simpler, more opaque interface that'll be easier for programmers to grasp. To anyone reading, I'd appreciate if you could spare some time to check it out and provide feedback in that thread.
 What do YOU think is missing in the library department?
We're missing a lot of "one solution" libraries: there are many small libraries that probably do most of what you want, but we don't have any large libraries backed by open source communities that offer a production-quality experience, like how Java has so many Apache projects under it. Some concrete examples might be:
- A common interface for HTTP request sending and receiving. It would be great if D had a single unifying interface for dealing with HTTP in particular, as it's pretty much the most common form of traffic for user-facing apps nowadays. Something like Java's Servlets.
- A single most-popular UI framework. Most other languages have identified and "chosen" a single UI framework to be the "one" that pretty much any UI is created with.
- An out-of-the-box crypto library. Botan is getting close, but we really need better documentation on that.
- And finally, better documentation! I don't mean any offense, but ddoc is quite ugly compared to the documentation that modern javascript and python projects have, and even javadoc is easier to read than ddoc, in my opinion.
Oh, and one other thing I noticed was the lack of an up-to-date coverage parser and analyzer library, so I wrote my own script for that recently. But maybe I should make something for that.
May 15 2023
parent reply Guillaume Piolat <first.last spam.org> writes:
On Monday, 15 May 2023 at 13:38:12 UTC, Andrew wrote:
 On Monday, 15 May 2023 at 13:04:57 UTC, Guillaume Piolat wrote:
 2. A I/O abstraction really suitable for parsing/emitting
Regarding your second point, I noticed this was limiting me recently, so I've been working in the evenings on a proof-of-concept for what this could look like if added to Phobos in the same style as ranges: https://forum.dlang.org/thread/okoaghstghyxapqcwbtr forum.dlang.org
I'll take an example. My current TGA parsing looks ugly, like this:

    bool getImageInfo(IO* io, IOHandle _handle)
    {
        // Taken right from stb_image.h
        bool err;
        _dataOffset = _io.read_ubyte(_handle, &err);
        if (err)
            return false;
        _cmapType = _io.read_ubyte(_handle, &err);
        if (err || _cmapType > 1)
            return false; // only RGB or indexed allowed
        _imageType = _io.read_ubyte(_handle, &err);
        if (err)
            return false;
        /// ...more parsing...
    }

What I would really want is:

    bool getImageInfo(IO* io)
    {
        _dataOffset = _io.read_ubyte();
        _cmapType = _io.read_ubyte();
        _imageType = _io.read_ubyte();
        return !_io.error;
    }

I can't use exceptions, since downstream users will use that on WebASM, with -betterC not having classes, etc.
And I cannot use your library for similar reasons: it uses the runtime, it uses exceptions, it has generic names, etc. We have a lot of such libraries, but we have to keep in mind that people use D because they are performance-oriented and they may have druntime constraints. Ideally the D runtime would be pay-as-you-go and infinitely portable, but that's not the current state of things.
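
The "parse anyway and check the error later" style described above can be sketched with a sticky error flag. This is only an illustration under assumed names: `ByteReader`, `TGAHeaderStart` and `read_ushort_LE` are hypothetical, and the three fields are simply the ones from the TGA example.

```d
// Hypothetical sketch: a reader over ubyte[] that never throws.
struct ByteReader
{
    const(ubyte)[] data;  // whole input
    size_t pos;           // read cursor
    bool error;           // sticky: set on the first failed read, never cleared

nothrow @nogc:

    // Returns the next byte, or 0 (and sets `error`) when reading past the end.
    ubyte read_ubyte()
    {
        if (pos >= data.length)
        {
            error = true;
            return 0;
        }
        return data[pos++];
    }

    // Little-endian 16-bit read built on read_ubyte; same failure behavior.
    ushort read_ushort_LE()
    {
        ubyte lo = read_ubyte();
        ubyte hi = read_ubyte();
        return cast(ushort)(lo | (hi << 8));
    }
}

// First three bytes of the header, as in the example above.
struct TGAHeaderStart
{
    ubyte dataOffset, cmapType, imageType;
}

bool getImageInfo(ref ByteReader io, out TGAHeaderStart h) nothrow @nogc
{
    // No per-read checks: past the end, reads yield 0 and set io.error.
    h.dataOffset = io.read_ubyte();
    h.cmapType   = io.read_ubyte();
    h.imageType  = io.read_ubyte();
    if (io.error)
        return false;
    return h.cmapType <= 1; // only RGB or indexed allowed
}
```

The trade-off is that values read after a failure are meaningless defaults, so the caller has to check `error` before trusting anything that was parsed.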
May 15 2023
parent reply Andrew <andrewlalisofficial gmail.com> writes:
On Monday, 15 May 2023 at 13:53:15 UTC, Guillaume Piolat wrote:
 Can't use exceptions since downstream user will use that on 
 WebASM, -betterC not having class etc.
 And I cannot use your library for similar reasons, it uses the 
 runtime, it uses exceptions, it has generic names etc. We have 
 a lot of such libraries, but we have to keep in mind people use 
 D because they are performance-oriented and they may have 
 druntime constraints. Ideally the D runtime would be 
 pay-as-you-go and infinitely portable, but that's not the 
 current state of things.
Yeah, please bear in mind that my goal is to make my library ultimately compatible with those not using the D runtime, but for prototyping I have indeed made use of exceptions and `std.traits` just to get something working. But the idea is to be able to achieve your desired functionality like this:
```d
bool getImageInfo(S)(S* stream) if (isByteInputStream!S)
{
    ubyte[3] buf;
    int bytes = stream.read(buf[]);
    if (bytes != 3) return false;
    _dataOffset = buf[0];
    _cmapType = buf[1];
    _imageType = buf[2];
    return true;
}
```
Or I am also working on a concept of "data streams" which automatically convert between byte arrays and scalar types in an endian-aware manner:
```d
void getImageInfo(S)(S* stream) if (isByteInputStream!S)
{
    auto dIn = dataInputStreamFor(stream, Endianness.BigEndian);
    _dataOffset = dIn.read!ubyte();
    _cmapType = dIn.read!ubyte();
    _imageType = dIn.read!ubyte();
}
```
But I've only been working on it for a few days, so I think your input would be greatly appreciated, if you could list on that thread some more examples of what you need from a binary IO library.
May 15 2023
next sibling parent reply Guillaume Piolat <first.last spam.org> writes:
On Monday, 15 May 2023 at 14:04:39 UTC, Andrew wrote:
 if you could list on that thread some more examples of what you 
 need from a binary IO library.
Off the top of my head:
- My first post details what I see as must-haves, but that's just my opinion, and obviously the maintainer gets to propose their own design.
- If you reserve common words like `read`/`write` as identifiers for your library, it **will** be annoying downstream because it clashes with other careless namespace landgrabs from Phobos, like std.file.read, std.file.write, std.stdio.read, std.stdio.write. This is what happens in practice, and the simple fact that you use very common words for non-trivial operations is a big deterrent. I had endless problems with Phobos min/max, and ended up avoiding those functions altogether. And "read" or "write" do not say at all what your function does! This kind of naming ends up worse than C library names with prefixes.
Also, templated libraries often lack separation between API and internals; this is poison to versioning.
May 15 2023
parent Andrew <andrewlalisofficial gmail.com> writes:
On Monday, 15 May 2023 at 14:23:24 UTC, Guillaume Piolat wrote:
 - if you reserve common words like `read`/`write` as identifier 
 for your library, it **will** be annoying downstream because it 
 clashes with other careless namespace landgrab from Phobos. 
 Like std.file.read, std.file.write std.stdio.read, 
 std.stdio.write. This is what happens in practice, and the 
 simple fact you use very common words for not-trivial operation 
 is a big deterrent. I had endless problems with Phobos min/max, 
 and ended avoiding those functions altogether. And "read" or 
 "write" do not say at all what your function does! This kind of 
 naming end up worse than C library names with prefixes.
 Also templated libraries often have lack of separation between 
 API and internals, this is poison to versioning.
Okay, this is good information to work with. It might make more sense to work under a `readFromStream` and `writeToStream` function name instead of generic `read` and `write`. If only D's module system could resolve such naming issues for us... Also, could you explain a little bit more about what you mean by "lack of separation between API and internals"? Do you mean exposing structs that should be defined inline or privately in the module? And in your other reply you mentioned:
 you propose the fread interface (return number of read 
 elements), I've tried that in the image library and it's not as 
 nice as just returning a default value, such as zero on 
 failure, and fail out of band.
and this is definitely something I'll add; thanks for suggesting it.
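
On the `read`/`write` clash discussed above: one user-side mitigation that D does offer is the renamed selective import, which brings a Phobos function in under a different name. The sketch below assumes a hypothetical stream function called `read`, defined locally here to stand in for one imported from a stream library; `std.file.read`, `std.file.write`, `std.file.remove`, `std.file.tempDir` and `std.path.buildPath` are the real Phobos functions. It does not remove the burden from downstream users, it only shows what they would have to write.

```d
// Renamed selective imports: std.file's functions are visible only as
// fileRead / fileWrite / fileRemove in this module.
import std.file : fileRead = read, fileWrite = write, fileRemove = remove, tempDir;
import std.path : buildPath;

// Hypothetical library function that deliberately uses the common name.
// Copies as many bytes as fit into `dest` and returns that count.
size_t read(const(ubyte)[] source, ubyte[] dest) nothrow @nogc
{
    size_t n = source.length < dest.length ? source.length : dest.length;
    dest[0 .. n] = source[0 .. n];
    return n;
}

unittest
{
    // Round-trip three bytes through a temporary file.
    auto path = buildPath(tempDir(), "stream_naming_demo.bin");
    ubyte[3] payload = [1, 2, 3];
    fileWrite(path, payload[]);
    scope (exit) fileRemove(path);

    auto fromDisk = cast(const(ubyte)[]) fileRead(path);
    ubyte[2] buf;
    // Unambiguous: `read` here can only be the stream function.
    assert(read(fromDisk, buf[]) == 2);
    assert(buf[0] == 1 && buf[1] == 2);
}
```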
May 15 2023
prev sibling parent Guillaume Piolat <first.last spam.org> writes:
On Monday, 15 May 2023 at 14:04:39 UTC, Andrew wrote:
 ```d
 bool getImageInfo(S)(S* stream) if (isByteInputStream!S)
 {
     ubyte[3] buf;
     int bytes = stream.read(buf[]);
     if (bytes != 3) return false;
     _dataOffset = buf[0];
     _cmapType = buf[1];
     _imageType = buf[2];
     return true;
 }
 ```
- I was using ranges for parsing initially, but realized all my ranges were instantiated with ubyte at one point.
- You propose the fread interface (return the number of elements read). I've tried that in the image library, and it's not as nice as just returning a default value, such as zero, on failure, and failing out of band.
May 15 2023
prev sibling next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 16/05/2023 1:04 AM, Guillaume Piolat wrote:

 
 A font library that "does it all":
 
   * support TrueType/OpenType
   * rasterizing glyphs, glyph cache
   * or output the bezier curve instead to give to a Canvas rasterizer
       o including support for the many annoying Truetype extensions
       o because the dg2D rasterizer is faster than doing the glyph cache
         thing!
   * get font metrics
   * support a registry with font, parsed from system, but also added
     manually
   * to avoid fragmentation: -betterC/ nogc compatible, no exceptions,
     stuff like that
   * small, defined API arsd font it almost there, and printed:font is
     almost there, and many others do a part of this, but nothing quite
     does each of these things.
I need everything but the bezier curve (plan is my canvas library would be GPU accelerated). So if it was possible to use another library that I could wrap, that would be great. If I have to do it all myself whatever I come up with is going to have something like 4+ D shared libraries as dependencies. Joy!
May 15 2023
prev sibling parent reply Bradley Chatha <sealabjaster gmail.com> writes:
On Monday, 15 May 2023 at 13:04:57 UTC, Guillaume Piolat wrote:
 What do YOU think is missing in the library department?
Libraries for modern cloud-based development: cloud platforms (AWS, GCP, etc.); popular SaaS (e.g. for Slack bots), and a common foundation for such libraries to actually build off of. I don't really have much hope right now for such things to occur spontaneously, which is why I've decided to go my own way and, in a very ideal sense, have my own ecosystem, for myself. Whether it lives up to reality is another question; however, I'm optimistic, since I recently managed to get my async library into a functioning (though not super usable yet) state, where I can start creating more libraries that directly integrate with it, and eventually applications. Go, and Python land: all the battery included ecosystems/libraries/standard libraries that already exist and integrate well (or at least, well enough) with one another, instead of being entirely separate things that I have to patch together, and that don't even (and likely can't be) integrate with whatever I'm using for async stuff.
May 15 2023
parent Andrew <andrewlalisofficial gmail.com> writes:
On Monday, 15 May 2023 at 16:22:20 UTC, Bradley Chatha wrote:

 Go, and Python land: all the battery included 
 ecosystems/libraries/standard libraries that already exist and 
 integrate well (or at least, well enough) with one another, 
 instead of being entirely separate things that I have to patch 
 together, and that don't even (and likely can't be) integrate 
 with whatever I'm using for async stuff.
Yeah, if we could somehow motivate people to revitalize and refactor Phobos so that it enables that sort of batteries-included libraries to happen, that would be awesome. I feel that so much emphasis in D is placed on the language design, that we forget that there's a lot more that probably could benefit from some organization.
May 15 2023