digitalmars.D - Serialization library candidate review request

reply GrimMaple <grimmaple95 gmail.com> writes:
Hello everyone! When my https://github.com/dlang/phobos/pull/8662 
predictably failed, I moved on to create a common serialization 
library instead that could later on be included in phobos. So I 
would like to request a review from you to get it to a decent 
state. If interested, you can find the code 
[here](https://github.com/GrimMaple/mud/tree/v0.3.0/source/mud/serialization),
and if you want to use it with dub you can use `"mud": "~>0.3.0"`.

The code consists of two files: package.d, which eases writing 
serializers, and json.d, an example serializer to and from JSON. 
Please consider this example to see how it works in general:
```d
     struct Foo
     {
         // Works with or without brackets
         @serializable int a = 123;
         @serializable() double floating = 123;
         @serializable("flag") bool check = true;
     }

     // Will produce `{"a":123,"floating":123,"flag":true}`
     serializeToJSONString(Foo());
```

If you're interested in making a custom serializer, here's a 
rough idea of how it would be implemented with this library:
```d
foreach(alias prop; readableSerializables!T)
{
     enum name = getSerializableName!prop;
     // Work your magic here
}
```

I tried to make it as universal as I could, but any suggestions 
are welcome for a discussion.
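
To show the mechanism in isolation, here is a minimal, self-contained sketch of opt-in, UDA-driven serialization. It is not mud's implementation: the `serializable` struct, `keyOf`, and `toToyJSON` below are hypothetical stand-ins, and the toy encoder only handles fields that `std.conv.to` happens to render as valid JSON (numbers and booleans).

```d
import std.conv : to;
import std.traits : FieldNameTuple, getUDAs, hasUDA;

// Hypothetical stand-in for the library's attribute (an assumption, not mud's definition).
struct serializable { string name; }

// Output key for a field: the UDA's explicit name if given, else the field's identifier.
template keyOf(T, string field)
{
    alias udas = getUDAs!(__traits(getMember, T, field), serializable);
    // A bare `@serializable` is the type itself; `@serializable(...)` is a value.
    static if (is(typeof(udas[0]) == serializable))
        enum keyOf = udas[0].name.length ? udas[0].name : field;
    else
        enum keyOf = field;
}

// Toy encoder: emits only the fields that carry the UDA.
string toToyJSON(T)(T value)
{
    string result = "{";
    static foreach (field; FieldNameTuple!T)
    {
        static if (hasUDA!(__traits(getMember, T, field), serializable))
        {
            if (result.length > 1) result ~= ",";
            result ~= `"` ~ keyOf!(T, field) ~ `":`
                ~ to!string(__traits(getMember, value, field));
        }
    }
    return result ~ "}";
}

struct Foo
{
    @serializable int a = 123;
    @serializable() double floating = 123;
    @serializable("flag") bool check = true;
    int skipped = 7; // no UDA, so it is left out
}

void main()
{
    assert(toToyJSON(Foo()) == `{"a":123,"floating":123,"flag":true}`);
}
```

The unmarked `skipped` field never reaches the output, which is the whole point of the opt-in model.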
Aug 27 2023
next sibling parent reply monkyyy <crazymonkyyy gmail.com> writes:
On Sunday, 27 August 2023 at 19:39:05 UTC, GrimMaple wrote:
 Hello everyone! When my 
 https://github.com/dlang/phobos/pull/8662 predictably failed, I 
 moved on to create a common serialization library instead that 
 could later on be included in phobos. So I would like to 
 request a review from you to get it to a decent state. If 
 interested, you can find the code 
 [here](https://github.com/GrimMaple/mud/tree/v0.3.0/source/mud/serialization),
and if you want to use it with dub you can use `"mud": "~>0.3.0"`.

 The code consists of two files: package.d, which eases writing 
 serializers, and json.d, an example serializer to and from JSON. 
 Please consider this example to see how it works in general:
 ```d
     struct Foo
     {
         // Works with or without brackets
         @serializable int a = 123;
         @serializable() double floating = 123;
         @serializable("flag") bool check = true;
     }

     // Will produce `{"a":123,"floating":123,"flag":true}`
     serializeToJSONString(Foo());
 ```

 If you're interested in making a custom serializer, here's a 
 rough idea of how it would be implemented with this library:
 ```d
 foreach(alias prop; readableSerializables!T)
 {
     enum name = getSerializableName!prop;
     // Work your magic here
 }
 ```

 I tried to make it as universal as I could, but any suggestions 
 are welcome for a discussion.
Isn't the hard part enumerating over every base type to make it have sane encoding/decoding string behavior?

I don't understand why you're adding a UDA rather than assuming all values are serializable.

I wouldn't make JSON the base case, but rather "dlang object notation" or something that spits binary into a file, or ideally both.
Aug 27 2023
parent reply GrimMaple <grimmaple95 gmail.com> writes:
On Monday, 28 August 2023 at 01:01:57 UTC, monkyyy wrote:
 I don't understand why you're adding a UDA rather than assuming 
 all values are serializable.
Because in practice, you might want to leave some of the stuff unserialized. It's generally better (IMO) to specifically mark serializable fields. It also allows serializing / deserializing properties and getters / setters.
 I wouldn't make JSON the base case, but rather "dlang object 
 notation" or something that spits binary into a file, or ideally both.
JSON is what I use and need first-hand, so that is what I made. Binary serialization comes with added layers of complexity (read ordering).
Aug 28 2023
next sibling parent Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Monday, 28 August 2023 at 07:59:17 UTC, GrimMaple wrote:

 Binary serialization comes with added layers of complexity 
 (read ordering).
FWIW, we use sbin for that; it works great. https://code.dlang.org/packages/sbin

-- Bastiaan.
Aug 28 2023
prev sibling next sibling parent reply bauss <jacobbauss gmail.com> writes:
On Monday, 28 August 2023 at 07:59:17 UTC, GrimMaple wrote:
 On Monday, 28 August 2023 at 01:01:57 UTC, monkyyy wrote:
 I don't understand why you're adding a UDA rather than assuming 
 all values are serializable.
 Because in practice, you might want to leave some of the stuff unserialized. It's generally better (IMO) to specifically mark serializable fields. It also allows serializing / deserializing properties and getters / setters.
I disagree. It's better to make fields serializable by default and have an attribute to ignore serialization.
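
For comparison with the opt-in approach, here is a minimal sketch of the opt-out model described here: every field is emitted unless it carries an ignore marker. The `dontSerialize` struct and `toToyJSON` are hypothetical names used only for illustration (nothing in the thread says mud provides them), and the toy encoder handles only numbers and booleans.

```d
import std.conv : to;
import std.traits : FieldNameTuple, hasUDA;

// Hypothetical ignore marker (an assumption, not an existing mud attribute).
struct dontSerialize {}

// Toy encoder: emits every field except those marked with the ignore UDA.
string toToyJSON(T)(T value)
{
    string result = "{";
    static foreach (field; FieldNameTuple!T)
    {
        static if (!hasUDA!(__traits(getMember, T, field), dontSerialize))
        {
            if (result.length > 1) result ~= ",";
            result ~= `"` ~ field ~ `":`
                ~ to!string(__traits(getMember, value, field));
        }
    }
    return result ~ "}";
}

struct Session
{
    int id = 7;
    bool active = true;
    @dontSerialize int cachedHash = 42; // excluded explicitly
}

void main()
{
    assert(toToyJSON(Session()) == `{"id":7,"active":true}`);
}
```

The only structural difference from an opt-in encoder is the negated `hasUDA` check; the trade-off discussed in this thread is about which default is safer, not about implementation effort.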
Aug 28 2023
parent reply GrimMaple <grimmaple95 gmail.com> writes:
On Monday, 28 August 2023 at 18:06:13 UTC, bauss wrote:
 On Monday, 28 August 2023 at 07:59:17 UTC, GrimMaple wrote:
 On Monday, 28 August 2023 at 01:01:57 UTC, monkyyy wrote:
 I don't understand why you're adding a UDA rather than assuming 
 all values are serializable.
 Because in practice, you might want to leave some of the stuff unserialized. It's generally better (IMO) to specifically mark serializable fields. It also allows serializing / deserializing properties and getters / setters.
 I disagree. It's better to make fields serializable by default and have an attribute to ignore serialization.
I disagree with your disagreement, because then you can serialize __everything__ and get garbage:
```d
auto f = File("myfile.txt", "rt");
serializeJSON(f); // ???
```
The attitude of "everything is serializable" leads to tons of rules about __how__ everything is serialized, whereas a UDA creates a very simple ruleset.

It's not that I don't understand the convenience of being able to serialize something without having to add `@serializable` to everything. It's the inconveniences that this logic brings. Consider phobos, for example. Someone will have to go around placing `@dontSerialize` on things so they produce any reasonable result (I might actually add a compile-time warning/error if the type being serialized doesn't have anything to serialize); otherwise you're going to get garbage in your serialization out of the box.

Worst-case scenario, it could be made so that applying `@serializable` to the struct/class __itself__ makes everything inside it `@serializable`. E.g.:
```d
@serializable struct Test
{
    int a;
}
assert(serializeToJSONString(Test()) == `{"a":0}`);
```
Aug 28 2023
parent reply bauss <jacobbauss gmail.com> writes:
On Monday, 28 August 2023 at 18:19:25 UTC, GrimMaple wrote:
 On Monday, 28 August 2023 at 18:06:13 UTC, bauss wrote:
 On Monday, 28 August 2023 at 07:59:17 UTC, GrimMaple wrote:
 On Monday, 28 August 2023 at 01:01:57 UTC, monkyyy wrote:
 I don't understand why you're adding a UDA rather than assuming 
 all values are serializable.
 Because in practice, you might want to leave some of the stuff unserialized. It's generally better (IMO) to specifically mark serializable fields. It also allows serializing / deserializing properties and getters / setters.
 I disagree. It's better to make fields serializable by default and have an attribute to ignore serialization.
 I disagree with your disagreement, because then you can serialize __everything__ and get garbage:
 ```d
 auto f = File("myfile.txt", "rt");
 serializeJSON(f); // ???
 ```
 The attitude of "everything is serializable" leads to tons of rules about __how__ everything is serialized, whereas a UDA creates a very simple ruleset.

 It's not that I don't understand the convenience of being able to serialize something without having to add `@serializable` to everything. It's the inconveniences that this logic brings. Consider phobos, for example. Someone will have to go around placing `@dontSerialize` on things so they produce any reasonable result (I might actually add a compile-time warning/error if the type being serialized doesn't have anything to serialize); otherwise you're going to get garbage in your serialization out of the box.

 Worst-case scenario, it could be made so that applying `@serializable` to the struct/class __itself__ makes everything inside it `@serializable`. E.g.:
 ```d
 @serializable struct Test
 {
     int a;
 }
 assert(serializeToJSONString(Test()) == `{"a":0}`);
 ```
What if you want to serialize types that come from other libraries? You are forced to copy the structures yourself in order to serialize them.

Whereas serialization by default lets you just serialize the types regardless.

Your approach only works with your own types, not types from anywhere else, so serializing types from other libraries etc. is going to be tedious.
Aug 28 2023
parent GrimMaple <grimmaple95 gmail.com> writes:
On Monday, 28 August 2023 at 18:27:10 UTC, bauss wrote:
 What if you want to serialize types that come from other 
 libraries? You are forced to copy the structures yourself in 
 order to serialize them.

 Whereas serialization by default lets you just serialize the 
 types regardless.

 Your approach only works with your own types, not types from 
 anywhere else, so serializing types from other libraries etc. 
 is going to be tedious.
Not every type is serializable by default, as with my File example above. Yes, this won't work with external libraries, because external libraries aren't designed to have serializable structs. However, adding custom serializers for unsupported types can be easy via getter serialization, e.g.:
```d
struct A
{
    Unsupported a;
    @serializable("a") Supported b() { /* magic */ }
}
```
Before you can serialize something, you need to define rules for how serialization is going to be performed. If a type wasn't designed to be serializable (and you can't really prove whether it was), then serializing it is going to produce garbage anyway.
Aug 28 2023
prev sibling parent Martyn <martyn.developer googlemail.com> writes:
https://code.dlang.org/packages/asdf

I have used this package in a past project and it works well for 
both serialization and deserialization. It makes use of UDAs, etc.

Just throwing it out there in case you are not aware -- there 
could be some niceties in there to help with the development of 
mud. asdf seems to be focused purely on JSON, though.

Having mud support many formats would be great! XML, 
BSON, etc.
Looking at the code, I guess that's the direction you are heading. 
Nice!


On Monday, 28 August 2023 at 07:59:17 UTC, GrimMaple wrote:

 Because in practice, you might want to leave some of the stuff 
 unserialized. It's generally better (IMO) to specifically mark 
 serializable fields. It also allows serializing / 
 deserializing properties and getters / setters.
I personally am not bothered which route mud takes regarding "serialized by default". However, I wonder if it is worth reviewing most JSON libraries, not just in D but in various popular languages (C++, Go, etc.). If 90% serialize everything by default (generally speaking, in that language), I don't think D should try to be different.

asdf above, for example, exports all public members by default, and can exclude members using `@serdeIgnore`.

exclude using [JsonIgnore]
Aug 30 2023
prev sibling next sibling parent Mathias LANG <geod24 gmail.com> writes:
On Sunday, 27 August 2023 at 19:39:05 UTC, GrimMaple wrote:
 I tried to make it as universal as I could, but any suggestions 
 are welcome for a discussion.
TL;DR:
- Build the escape hatch first, then idioms;
- Composability is key;
- Use dedicated data structures for serialization, don't mix with business code;
- Sane defaults based on the language if possible, UDAs for exceptions;

Long version:

1) From experience, the biggest challenge is the code you do not control. You need an escape hatch when you can't edit the code but need something specific, e.g. you have a struct with a deeply nested component that you want serialized in a certain way.

2) That escape hatch should compose well with the regular case, so that you only have to specialize the tiny bit you need, and don't need to copy some of the logic of the tool.

3) Once you have this in place, you can start adding idioms / patterns you want to expose. Each pattern you expose will make assumptions that will reduce the formats you can support (e.g. what if the format can only have 2 levels of nesting?), so a lot of it will be judgement calls. You don't want to mix business code with serialization code, because things will get messy very quickly. In your example, the `@Serializable` attribute is the wrong approach: not serializing something in a struct you are giving to the serializer should be the exception rather than the rule.

4) Whenever possible, use the language rather than a UDA. For a value that is optional, give it an initializer (because it doesn't make sense for something that is required to have a default value).

I wrote a deserialization library to read YAML configuration files and validate them based on those principles (https://github.com/dlang-community/configy). I'm not 100% happy with it, but starting from the escape hatch and building on top of that allowed me to always fulfill my needs.

Note that if something like this makes its way into Phobos, that would be the right time to add `core.attributes` as well. Things like `@Name("newname")` are no-brainers. We might or might not want to decide on some convention, e.g. do we assume `public_` is serialized as `public`, or do we always need a UDA?
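
Points 1 and 2 can be illustrated with a small sketch: a serializer that first asks a caller-supplied hook whether it wants to handle a type, and otherwise falls back to the generic rule, so only the one awkward type needs special code. This is not how configy or mud work; `toToy`, `MyHooks`, `NoHooks`, `Point`, and `Shape` are hypothetical names, and `Point` merely plays the role of a third-party type that cannot be edited.

```d
import std.conv : to;
import std.traits : FieldNameTuple;

// Generic encoder with an escape hatch: `Hooks.encode` wins whenever it accepts the value.
string toToy(alias Hooks, T)(T value)
{
    static if (__traits(compiles, Hooks.encode(value)))
        return Hooks.encode(value);            // caller-provided override
    else static if (is(T == struct))
    {
        string result = "{";
        static foreach (field; FieldNameTuple!T)
        {
            if (result.length > 1) result ~= ",";
            // Recurse with the same hooks so overrides apply at any depth.
            result ~= `"` ~ field ~ `":`
                ~ toToy!Hooks(__traits(getMember, value, field));
        }
        return result ~ "}";
    }
    else
        return to!string(value);               // numbers and booleans only in this toy
}

struct Point { int x; int y; }          // pretend this comes from another library
struct Shape { int id; Point origin; }  // our own type, nesting the foreign one

struct NoHooks {}                       // no overrides at all

struct MyHooks
{
    // Only the awkward type is specialized; everything else uses the default rule.
    static string encode(Point p)
    {
        return `"` ~ to!string(p.x) ~ ";" ~ to!string(p.y) ~ `"`;
    }
}

void main()
{
    auto s = Shape(1, Point(2, 3));
    assert(toToy!NoHooks(s) == `{"id":1,"origin":{"x":2,"y":3}}`);
    assert(toToy!MyHooks(s) == `{"id":1,"origin":"2;3"}`);
}
```

Because the hook is consulted at every level of the recursion, the override for the nested `Point` composes with the default handling of `Shape` without copying any of the generic logic.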
Aug 28 2023
prev sibling next sibling parent reply user1234 <user1234 12.de> writes:
On Sunday, 27 August 2023 at 19:39:05 UTC, GrimMaple wrote:
 [...]
 I tried to make it as universal as I could, but any suggestions 
 are welcome for a discussion.
I had a quick look yesterday. One thing I noticed is that virtual setters/getters don't seem to be supported.
Aug 28 2023
parent reply GrimMaple <grimmaple95 gmail.com> writes:
On Monday, 28 August 2023 at 19:07:39 UTC, user1234 wrote:
 On Sunday, 27 August 2023 at 19:39:05 UTC, GrimMaple wrote:
 [...]
 I tried to make it as universal as I could, but any 
 suggestions are welcome for a discussion.
 I had a quick look yesterday. One thing I noticed is that virtual setters/getters don't seem to be supported.
I took a quick look and I don't think I fully understood what you meant. Meanwhile, I made a small snippet based on how I understood your point:
```d
@safe unittest
{
    static class A
    {
        @serializable int b() @safe { return 0; }
    }
    static class B : A
    {
        override int b() @safe { return 1; }
    }
    B b = new B();
    assert(serializeToJSONString(cast(A)b) == `{"b":1}`);
}
```
I don't know if it's what you expected, but this unittest passes.
Aug 28 2023
parent user1234 <user1234 12.de> writes:
On Monday, 28 August 2023 at 19:18:17 UTC, GrimMaple wrote:
 On Monday, 28 August 2023 at 19:07:39 UTC, user1234 wrote:
 On Sunday, 27 August 2023 at 19:39:05 UTC, GrimMaple wrote:
 [...]
 I tried to make it as universal as I could, but any 
 suggestions are welcome for a discussion.
 I had a quick look yesterday. One thing I noticed is that virtual setters/getters don't seem to be supported.
 I took a quick look and I don't think I fully understood what you meant. Meanwhile, I made a small snippet based on how I understood your point:
 ```d
 @safe unittest
 {
     static class A
     {
         @serializable int b() @safe { return 0; }
     }
     static class B : A
     {
         override int b() @safe { return 1; }
     }
     B b = new B();
     assert(serializeToJSONString(cast(A)b) == `{"b":1}`);
 }
 ```
 I don't know if it's what you expected, but this unittest passes.
```d
@safe unittest
{
    static class A
    {
        @serializable int b() @safe { return 0; }
    }
    static class B : A
    {
        override int b() @safe { return 1; }
    }
    A o = new B();
    // for o.b, serialize 1, not 0
}
```
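
The expectation in that comment follows from ordinary virtual dispatch, which can be checked without any serialization library. In this sketch `toToyJSON` is a hypothetical stand-in that simply calls the getter through a base-class reference; it is not mud's `serializeToJSONString`.

```d
import std.conv : to;

class A
{
    int b() { return 0; }
}

class B : A
{
    override int b() { return 1; }
}

// Hypothetical stand-in: reads the getter through the static type `A`.
string toToyJSON(A obj)
{
    return `{"b":` ~ to!string(obj.b()) ~ `}`;
}

void main()
{
    A o = new B();
    // The call is resolved at run time, so B's override is used.
    assert(toToyJSON(o) == `{"b":1}`);
}
```

A serializer that invokes the getter on the instance, rather than copying a value chosen from the static type, therefore picks up the overridden result automatically.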
Aug 29 2023
prev sibling parent Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
On Sunday, 27 August 2023 at 19:39:05 UTC, GrimMaple wrote:
 Hello everyone! When my 
 https://github.com/dlang/phobos/pull/8662 predictably failed, I 
 moved on to create a common serialization library instead that 
 could later on be included in phobos. So I would like to 
 request a review from you to get it to a decent state. If 
 interested, you can find the code 
 [here](https://github.com/GrimMaple/mud/tree/v0.3.0/source/mud/serialization),
and if you want to use it with dub you can use `"mud": "~>0.3.0"`.
I've not been very good with follow-through, but I thought it more important for phobos to provide standard attributes than serialization itself. https://github.com/JesseKPhillips/DIPs/blob/serialize/attribute/DIPs/1NNN-jkp.md

I actually started trying to implement those ideas, though I turned out not to have enough time.

---

I'm of the opinion that serialize-by-default is better, not because it is good for defining contracts, but because it is a good way to get info on a class and it lowers the barrier to entry. Outside of old-school XML serialization, nobody is making libraries opt-in.
Aug 29 2023