digitalmars.D.learn - Code review: JSON unmarshaller
- Tyler Jameson Little (16/16) Oct 15 2012 https://gist.github.com/3894337
- Jacob Carlborg (14/20) Oct 15 2012 I'm not sure what your goal with this marshaller is but I would say it's...
- Tyler Jameson Little (20/32) Oct 15 2012 I'm basically trying to reproduce other JSON marshallers, like
- Jacob Carlborg (67/94) Oct 15 2012 Most other languages are not as complicated as D, it's basically only C
- Tyler Jameson Little (28/28) Oct 17 2012 I could make my marshaller/unmarshaller only update objects in
- Jacob Carlborg (27/53) Oct 17 2012 Orange works with the archive at a lower level. For example, the archive...
- Tyler Jameson Little (13/13) Oct 17 2012 Here's the updated code. It's got a marshaller and unmarshaller:
- Adam D. Ruppe (6/8) Oct 17 2012 That seems weird, I've done something similar with non-static
- Jacob Carlborg (5/17) Oct 17 2012 What do you mean with "static structs/classes"? Are you talking about
- Jacob Carlborg (4/8) Oct 17 2012 I'll try and see if I can find some time to give feedback on this.
- Kagamin (2/4) Oct 17 2012 Can it serialize Variant?
- Jacob Carlborg (5/6) Oct 18 2012 No, but I'm working on it. Actually, it can serialize it, but not
- Dan (35/43) Oct 18 2012 I like Go's JSON convenience as well. There is a nice feature
- Ali Çehreli (13/22) Oct 16 2012 You have mentioned needing an allMembers that excluded functions in one
- Tyler Jameson Little (23/42) Oct 17 2012 The mentioned solution doesn't account for shared fields from a
- Jacob Carlborg (7/17) Oct 17 2012 Just do something like this:
https://gist.github.com/3894337

This is my first non-trivial D code, and I'd eventually like to get this into Phobos as part of std.json. I haven't written the marshaller yet, but that shouldn't be too hard. I wanted to get some feedback on whether this code is up to the quality standards of Phobos.

I used a lot of templates, so I hope I didn't commit any cardinal sins, especially in terms of readability. I did my best in grokking std.traits, but I may have missed some subtleties about what the templates are actually testing.

I used asserts and contracts to validate input, so the following would throw an AssertError:

    int x = unmarshalJSON!int(`"5"`);

I wasn't sure if this is bad style, since AssertError is in core.exception. If this is considered bad style in D, I can create a JSONMarshalException and throw that instead.
Oct 15 2012
On 2012-10-15 21:03, Tyler Jameson Little wrote:
> https://gist.github.com/3894337
>
> This is my first non-trivial D code, and I'd eventually like to get this into Phobos as part of std.json. I haven't written the marshaller yet, but that shouldn't be too hard. I wanted to get some feedback on whether this code is up to the quality standards of Phobos.

I'm not sure what your goal with this marshaller is, but I would say it's a lot harder than you think if you want to have a complete serialization library. A couple of things that make it harder to create a fully working serialization library:

* Pointers
* Array slices
* Serializing through base class references
* const/immutable fields
* Any reference type (not really hard but it's more work)

Have a look at Orange for a basically fully working serialization library: https://github.com/jacob-carlborg/orange

-- /Jacob Carlborg
Oct 15 2012
> I'm not sure what your goal with this marshaller is but I would say it's a lot harder than you think if you want to have a complete serialization library. A couple of things making it harder to create a fully working serialization library:

I'm basically trying to reproduce other JSON marshallers, like Go's, but using compile-time reflection. Go uses runtime reflection, which D notably does not support. I like the idea of compile-time reflection better anyway. There are a few things that would make it easier (like a __traits call like allMembers that excludes functions).

I use a lot of JSON, so a JSON marshaller/unmarshaller is going to save a lot of time, and make my code a lot cleaner.

> * Pointers

I've done this, but haven't fully tested it. Basic pointers work.

> * Array slices

I think this is handled.

> * Serializing through base class references

Doesn't __traits(allMembers, T) give everything from all super classes?

> * const/immutable fields

Hmm, not sure how to handle this. These have to be set in the constructor, right?

> * Any reference type (not really hard but it's more work)

Are you talking about aliases? What other kind of reference types are there in structs/classes? I'm assuming this will have more to do with marshalling as opposed to unmarshalling.

> Have a look at for a basically fully working serialization library Orange: https://github.com/jacob-carlborg/orange

Hmm, looks interesting. This looks like it only supports XML, which I don't use, but I'm sure you've already solved a lot of the corner cases. Thanks, I'll take a look!
Oct 15 2012
On 2012-10-15 22:35, Tyler Jameson Little wrote:
> I'm basically trying to reproduce other JSON marshallers, like Go's, but using compile-time reflection. Go uses runtime reflection, which D notably does not support. I like the idea of compile-time reflection better anyway. There are a few things that would make it easier (like a __traits call like allMembers that excludes functions).

Most other languages are not as complicated as D; it's basically only C and C++ that are. Implementing a marshaller in Ruby would be dead simple: no pointers, no array slices (in the same way as D), and support for full runtime reflection.

> I use a lot of JSON, so a JSON marshaller/unmarshaller is going to save a lot of time, and make my code a lot cleaner.

Most of these points are when unmarshalling. I haven't actually looked if your marshaller can handle these cases, but looking at the small amount of code I would guess no.

>> * Pointers
> I've done this, but haven't fully tested it. Basic pointers work.

Are they correctly set up when unmarshaling? Example:

    int a = 3; // global/TLS

    class Foo
    {
        int b = 4;
        int* c;
        int* d;
    }

    auto foo = new Foo;
    foo.c = &a;
    foo.d = &foo.b;

When unmarshaling, will "foo.d" point to "foo.b"?

>> * Array slices
> I think this is handled.

This is basically the same as pointers:

    class Foo
    {
        int[] a;
        int[] b;
    }

    auto foo = new Foo;
    foo.a = [3, 4, 5, 6];
    foo.b = foo.a[1 .. 3];

When unmarshaling, will "foo.b" point to "foo.a"?

>> * Serializing through base class references
> Doesn't __traits(allMembers, T) give everything from all super classes?

__traits only works at compile time.

    class A
    {
        int a;
    }

    class B : A
    {
        int b;
    }

    A b = new B;

The static type of "b" is "A", so all information about "B" is lost at compile time. You either need to provide a way to register all subclasses that should be marshaled through a base class reference, or you need to implement proper runtime reflection.

>> * const/immutable fields
> Hmm, not sure to handle this. These have to be set in the constructor, right?

You shouldn't call the constructor when unmarshaling. That's another problem. Do you want to limit your marshaller to only work with classes that have a default constructor or none? You need to create the class instances without calling the constructor. Then you could provide a method that will be called before/after unmarshaling.

Have a look at this post: http://www.digitalmars.com/d/archives/digitalmars/D/Deserializing_const_fields_175774.html

>> * Any reference type (not really hard but it's more work)
> Are you talking about aliases? What other kind of reference types are there in structs/classes? I'm assuming this will have more to do with marshalling as opposed to unmarshalling.

Yes, you don't want to marshal the same object twice. Reference types in D are: objects, pointers, associative arrays and arrays. These are the ones I can think of for now.

>> Have a look at for a basically fully working serialization library Orange: https://github.com/jacob-carlborg/orange
> Hmm, looks interesting. This looks like it only supports XML, which I don't use, but I'm sure you've already solved a lot of the corner cases. Thanks, I'll take a look!

I have solved a lot of corner cases but there are a few left. I have a branch for handling const/immutable fields, but it needs more testing before merging it with the master branch. I'm also not really happy about the deserializing of arrays. It's quite slow. Apparently it also breaks as soon as you turn on some kind of optimization when compiling.

The goal of Orange was to be able to serialize basically everything found in D, and also to support multiple archive types, i.e. XML, JSON, binary and so on.

-- /Jacob Carlborg
Oct 15 2012
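The point about base class references can be made concrete with a small sketch. It is not code from the gist or from Orange: `registry`, `register`, `describe` and `describeDynamic` are hypothetical names, and the "marshalled" form is just a debug string. The idea is that compile-time reflection only sees the static type, so each subclass registers a function under its runtime class name, which is looked up through `typeid`. Inherited fields are left out to keep the sketch short.

```d
import std.conv : to;
import std.exception : enforce;

class A { int a = 1; }
class B : A { int b = 2; }

// Hypothetical registry keyed by the fully qualified runtime class name.
alias Describer = string function(Object);
Describer[string] registry;

// Compile-time reflection: only sees the fields declared directly in T.
string describe(T)(T obj)
{
    string result = T.stringof ~ "{";
    foreach (i, value; obj.tupleof)
        result ~= " " ~ __traits(identifier, T.tupleof[i]) ~ "=" ~ value.to!string;
    return result ~ " }";
}

// Stores a function that still knows the full static type T.
void register(T : Object)()
{
    registry[typeid(T).name] = function string(Object o) {
        return describe(cast(T) o);
    };
}

// Runtime dispatch on the dynamic type of the object.
string describeDynamic(Object o)
{
    auto fn = typeid(o).name in registry;
    enforce(fn !is null, "class not registered: " ~ typeid(o).name);
    return (*fn)(o);
}

unittest
{
    register!A();
    register!B();
    A viaBase = new B;
    // The static type is A, so the template only sees A's field.
    assert(describe(viaBase) == "A{ a=1 }");
    // The registry recovers the dynamic type B.
    assert(describeDynamic(viaBase) == "B{ b=2 }");
}
```

The block can be exercised with `rdmd -unittest -main`. The cost of this design is that every subclass has to be registered by hand before it is marshalled through a base reference.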
I could make my marshaller/unmarshaller only update objects in place. I think this is more useful and would remove the overlap between Orange and the JSON library. We could then write a JSON archiver for Orange and include it in std.json as well.

The call to unmarshal would look like:

    bool unmarshalJSON(T)(JSONValue val, out T ret);

The following restrictions would apply:

* T must be fully instantiated (all pointers are valid [not null])
* T must not be recursive (results in infinite recursion, and hence stack overflow)

And the marshaller:

    JSONValue marshalJSON(T)(in T val);

For marshalling, the restrictions are:

* Slices are handled as if they were an array (copy all values)
* Same as unmarshaller, except null pointers will be treated as JSON null

I really like Go's JSON marshaller/unmarshaller, so I'm trying to model after that one. It allows updating an object in place, which was already a goal.

There should probably be some standard D serialization format. In working with a structure trained on data (for machine learning, natural language processing, etc), a complete serialization solution makes sense. But for simple data passing, JSON makes a lot of sense.

What do you think, do you think there's a place in Phobos for a simple JSON marshaller/unmarshaller?

I'll have some updated code soon, and I'll post back when that's done, in case you'd like to have a look.
Oct 17 2012
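To illustrate what "updating an object in place" could look like, here is a minimal sketch, not the code from the gist. It assumes a recent Phobos (`JSONType`, which replaced the older `JSON_TYPE` enum), handles only flat structs with integral and string fields, and takes the target by `ref` rather than `out`, because an `out` parameter is reset to `T.init` on entry and would discard the existing values.

```d
import std.conv : to;
import std.json;
import std.traits : isIntegral;

// Sketch: overwrite only the fields that are present in the JSON object.
// Returns false on a type mismatch; integer overflow throws ConvException.
bool unmarshalJSON(T)(JSONValue val, ref T ret)
    if (is(T == struct))
{
    if (val.type != JSONType.object)
        return false;

    foreach (i, ref field; ret.tupleof)
    {
        enum name = __traits(identifier, T.tupleof[i]);
        alias F = typeof(field);

        auto p = name in val.object;
        if (p is null)
            continue; // key missing: leave the field untouched

        static if (isIntegral!F)
        {
            if (p.type != JSONType.integer)
                return false;
            field = p.integer.to!F;
        }
        else static if (is(F == string))
        {
            if (p.type != JSONType.string)
                return false;
            field = p.str;
        }
        else
            static assert(0, "unsupported field type: " ~ F.stringof);
    }
    return true;
}

struct Point { int x; int y; string label; }

unittest
{
    auto p = Point(1, 2, "old");
    assert(unmarshalJSON(parseJSON(`{"x": 10, "label": "new"}`), p));
    assert(p == Point(10, 2, "new")); // y kept its previous value
    assert(!unmarshalJSON(parseJSON(`{"x": "ten"}`), p));
}
```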
On 2012-10-17 19:39, Tyler Jameson Little wrote:
> I could make my marshaller/unmarshaller only update objects in place. I think this is more useful and would remove the overlap between orange and the JSON library. We could then write a JSON archiver for orange and include it in std.json as well.
>
> The call to unmarshal would look like:
>
>     bool unmarshalJSON(T)(JSONValue val, out T ret);

Orange works with the archive at a lower level. For example, the archive doesn't really have to know how to (un)archive an object or struct. The serializer will break down the object into its fields and ask the archive to (un)archive the individual fields. The only thing the archive needs to know is that "here starts an object, from now on all (un)archived values will be part of the object until I say otherwise".

> The following restrictions would apply:
>
> * T must be fully instantiated (all pointers are valid [not null])

That seems to be an unnecessary restriction.

> * T must not be recursive (results in infinite recursion, and hence stack overflow)

I think the serializer in Orange can handle this. That would mean the archive doesn't need to handle this.

> And the marshaller:
>
>     JSONValue marshalJSON(T)(in T val);
>
> For marshalling, the restrictions are:
>
> * Slices are handled as if they were an array (copy all values)

So you mean:

    int[] a = [3, 4, 5, 6];
    int[] b = a[1 .. $ - 1];

That "a" and "b" would be marshaled as two distinct arrays? In Orange, I think the serializer will handle this and the archive doesn't need to care. I tried to put as much of the code as possible in the serializer so the archives don't need to bother with these kinds of things.

> * Same as unmarshaller, except null pointers will be treated as JSON null

If you can marshal a null pointer, how can you not unmarshal it?

> I really like Go's JSON marshaller/unmarshaller, so I'm trying to model after that one. It allows updating an object in place, which was already a goal.
>
> There should probably be some standard D serialization format. In working with a structure trained on data (for machine learning, natural language processing, etc), a complete serialization solution makes sense. But for simple data passing, JSON makes a lot of sense.

Absolutely, there is a need for both, see below.

> What do you think, do you think there's a place in Phobos for a simple JSON marshaller/unmarshaller?

Absolutely. I think there is a need for several types and variants of serialization. Sometimes you need to have a fully capable serialization library that can handle all types, custom serialization of third party types and so on. In other cases you don't really care and just want to dump some data to disk or whatever.

> I'll have some updated code soon, and I'll post back when that's done, in case you'd like to have a look.

-- /Jacob Carlborg
Oct 17 2012
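The slice question above comes down to whether two arrays share memory. As a rough illustration (the helper name `isSliceOf` is made up for this sketch and is not part of Orange or the gist), a serializer could detect the aliasing by comparing pointers, and copying the values loses that relationship:

```d
// Returns true if `inner` lies entirely within the memory of `outer`.
bool isSliceOf(T)(const(T)[] inner, const(T)[] outer)
{
    return inner.ptr >= outer.ptr
        && inner.ptr + inner.length <= outer.ptr + outer.length;
}

unittest
{
    int[] a = [3, 4, 5, 6];
    int[] b = a[1 .. $ - 1]; // shares memory with a

    assert(isSliceOf(b, a));
    b[0] = 40;
    assert(a[1] == 40);      // a write through b is visible through a

    // "Copy all values" produces an independent array:
    int[] copy = b.dup;
    assert(!isSliceOf(copy, a));
    copy[0] = 99;
    assert(a[1] == 40);      // a is unaffected
}
```

A format that wants to preserve the relationship would have to record something like an offset and length into the parent array instead of the element values. Plain JSON has no native way to express that, which is presumably why the simple marshaller copies.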
Here's the updated code. It's got a marshaller and unmarshaller: https://gist.github.com/3894337 It's about 650 lines. If you have time, I'd be very interested in getting some feedback (or from anyone else who sees this post of course). The main problem I'm having right now is that classes/structs have to be static. I'm not 100% sure why the compiler cannot see non-static classes/structs at compile time. Do you happen to know why? It seems like a template should work in either case, assuming I'm understanding D templates correctly. I didn't find any clear documentation for static outer classes, only static inner classes. It's not the same as static Java classes, which cannot be instantiated (if memory serves).
Oct 17 2012
On Wednesday, 17 October 2012 at 19:44:47 UTC, Tyler Jameson Little wrote:
> The main problem I'm having right now is that classes/structs have to be static.

That seems weird, I've done something similar with non-static structs before. Maybe it will help if you use __traits(getMember, obj, name) instead of mixin.
Oct 17 2012
On 2012-10-17 21:44, Tyler Jameson Little wrote:
> Here's the updated code. It's got a marshaller and unmarshaller: https://gist.github.com/3894337
>
> It's about 650 lines. If you have time, I'd be very interested in getting some feedback (or from anyone else who sees this post of course).
>
> The main problem I'm having right now is that classes/structs have to be static. I'm not 100% sure why the compiler cannot see non-static classes/structs at compile time. Do you happen to know why? It seems like a template should work in either case, assuming I'm understanding D templates correctly.

What do you mean by "static structs/classes"? Are you talking about nested classes and structs?

> I didn't find any clear documentation for static outer classes, only static inner classes. It's not the same as static Java classes, which cannot be instantiated (if memory serves).

-- /Jacob Carlborg
Oct 17 2012
On 2012-10-17 21:44, Tyler Jameson Little wrote:
> Here's the updated code. It's got a marshaller and unmarshaller: https://gist.github.com/3894337
>
> It's about 650 lines. If you have time, I'd be very interested in getting some feedback (or from anyone else who sees this post of course).

I'll try and see if I can find some time to give feedback on this.

-- /Jacob Carlborg
Oct 17 2012
On Tuesday, 16 October 2012 at 06:37:55 UTC, Jacob Carlborg wrote:
> The goal of Orange was to be able serialize basically everything found in D.

Can it serialize Variant?
Oct 17 2012
On 2012-10-17 22:03, Kagamin wrote:
> Can it serialize Variant?

No, but I'm working on it. Actually, it can serialize it, but not deserialize it.

-- /Jacob Carlborg
Oct 18 2012
On Monday, 15 October 2012 at 20:35:34 UTC, Tyler Jameson Little wrote:
> I'm basically trying to reproduce other JSON marshallers, like Go's, but using compile-time reflection. Go uses runtime reflection, which D notably does not support. I like the idea of compile-time reflection better anyway. There are a few things that would make it easier (like a __traits call like allMembers that excludes functions).
>
> I use a lot of JSON, so a JSON marshaller/unmarshaller is going to save a lot of time, and make my code a lot cleaner.

I like Go's JSON convenience as well. There is a nice feature where you can add attributes to the members that are then available at runtime and therefore used by the serializer. So you could have:

------
type AcquiredRetired struct {
    Acquired tvm.Date `bson:"a"`
    Retired  tvm.Date `bson:"r"`
}
------

Here it specifies a shortened key for bson, but you can do the same for json. The size benefit can be significant. A design choice they made is to only serialize members that are capitalized, which in Go means they are visible outside the package.

There is a nice JSON serialize/deserialize library in vibe.d.

When I throw this struct at your marshalJSON I get compile errors.

----------
import std.stdio;

struct X {
    class D {
        string b = "B";
    }
    string a = "A";
    D d;
}

void main() {
    auto c = new X();
    auto o = marshalJSON(c);
    writeln(o);
}
----------

Thanks
Dan
Oct 18 2012
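For comparison with the Go struct tags above: later versions of D gained user-defined attributes, which allow a similar per-field key override at compile time. The following is only a sketch under that assumption. `jsonName` and `keyFor` are hypothetical names invented here, not part of std.json, the gist, or vibe.d, and plain strings stand in for the date type.

```d
import std.traits : getUDAs, hasUDA;

// Hypothetical attribute carrying the serialized key for a field.
struct jsonName { string name; }

struct AcquiredRetired
{
    @jsonName("a") string acquired;
    @jsonName("r") string retired;
    int untagged;
}

// Key to use for the i-th field of T: the attribute if present,
// otherwise the field's own identifier.
string keyFor(T, size_t i)()
{
    static if (hasUDA!(T.tupleof[i], jsonName))
        return getUDAs!(T.tupleof[i], jsonName)[0].name;
    else
        return __traits(identifier, T.tupleof[i]);
}

unittest
{
    assert(keyFor!(AcquiredRetired, 0)() == "a");
    assert(keyFor!(AcquiredRetired, 1)() == "r");
    assert(keyFor!(AcquiredRetired, 2)() == "untagged");
}
```

Unlike Go's tags, which are strings parsed at runtime, the lookup here is resolved entirely at compile time.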
On 10/15/2012 12:03 PM, Tyler Jameson Little wrote:
> I did my best in grokking std.traits, but I may have missed some subtleties about what the templates are actually testing.

You have mentioned needing an allMembers that excluded functions in one of your other posts. The following thread was exactly about that. I can never remember the solution, but I found it again: :)

http://www.digitalmars.com/d/archives/digitalmars/D/learn/Getting_only_the_data_members_of_a_type_34086.html

(Although JSON is not mentioned in there, that thread has been inspired by marshalling as well. :))

> I used asserts and contracts to validate input, so the following would throw an AssertError:
>
>     int x = unmarshalJSON!int(`"5"`);

std.exception.enforce is the right choice in that case. You don't want the checks to disappear when asserts are turned off.

> I wasn't sure if this is bad style, since AssertError is in core.exception. If this is considered bad style in D, I can create a JSONMarshalException and throw that instead.

That makes sense too. There is enforceEx() to throw a specific type of exception.

Ali
Oct 16 2012
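A minimal sketch of the enforce-based validation suggested above. It uses the `enforce` overload that takes a lazily constructed `Throwable`, since `enforceEx` has been removed from current Phobos. `JSONMarshalException` is the name proposed in the thread, while `unmarshalInteger` is a hypothetical helper for this example. `JSONType` assumes a recent Phobos.

```d
import std.conv : to;
import std.exception : assertThrown, enforce;
import std.json;

// Dedicated exception type, separate from std.json's JSONException.
class JSONMarshalException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Unlike assert, this check stays active in release builds.
T unmarshalInteger(T)(JSONValue val)
{
    enforce(val.type == JSONType.integer,
        new JSONMarshalException(
            "expected a JSON integer, got " ~ val.type.to!string));
    return val.integer.to!T;
}

unittest
{
    assert(unmarshalInteger!int(parseJSON("5")) == 5);
    // A JSON string is rejected with the dedicated exception type.
    assertThrown!JSONMarshalException(unmarshalInteger!int(parseJSON(`"5"`)));
}
```

The second argument to `enforce` is lazy, so the exception object is only allocated when the check fails.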
> You have mentioned needing an allMembers that excluded functions in one of your other posts. The following thread was exactly about that. I can never remember the solution, but I found it again: :)
>
> http://www.digitalmars.com/d/archives/digitalmars/D/learn/Getting_only_the_data_members_of_a_type_34086.html

The mentioned solution doesn't account for shared fields from a super class:

    class A { int a; }
    class S : A { int b; }

    foreach (i, type; typeof(S.tupleof)) {
        enum name = S.tupleof[i].stringof[4..$];
        writef("(%s) %s\n", type.stringof, name);
    }

This will print:

    (int) b

My implementation is ugly, but it works for this case:

    (ret.b) b
    (ret.a) a

I could use std.traits.BaseClassTuple, but then I'd have to filter out common fields, and that sounds like a lot of work, especially since there's no practical difference.

> > I used asserts and contracts to validate input, so the following would throw an AssertError:
> >
> >     int x = unmarshalJSON!int(`"5"`);
>
> std.exception.enforce is the right choice in that case. You don't want the checks to disappear when asserts are turned off.
>
> > I wasn't sure if this is bad style, since AssertError is in core.exception. If this is considered bad style in D, I can create a JSONMarshalException and throw that instead.
>
> That makes sense too. There is enforceEx() to throw a specific type of exception.

Good point. I'll probably make a JSONMarshalException, which is separate from JSONException in std.json so the library clearly indicates which part failed.

Thanks for the link, it was an interesting read! Maybe I'll have to dig around in std.traits and maybe add some missing stuff. With mixin() (I'd forgotten about it) I was able to get rid of all __traits calls except for allMembers.
Oct 17 2012
On 2012-10-17 20:36, Tyler Jameson Little wrote:
> The mentioned solution doesn't account for shared fields from a super class:
>
>     class A { int a; }
>     class S : A { int b; }
>
>     foreach (i, type; typeof(S.tupleof)) {
>         enum name = S.tupleof[i].stringof[4..$];
>         writef("(%s) %s\n", type.stringof, name);
>     }
>
> This will print:
>
>     (int) b

Just do something like this:

    alias BaseTypeOf!(S) BaseType;
    BaseType t = type;

And run the same loop again.

-- /Jacob Carlborg
Oct 17 2012
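The "run the same loop again for the base type" idea can be sketched with std.traits.BaseClassesTuple, which lists every base class of a class type, ending with Object. This is one possible way to do it, not the code from the gist or from Orange, and `fieldNames` is a name made up for the example.

```d
import std.meta : AliasSeq;
import std.traits : BaseClassesTuple;

class A { int a; }
class S : A { int b; }

// Collects "type name" for the fields of T and of all its base classes.
// .tupleof only covers fields declared directly in each class, so the
// loop visits T first and then every base class in turn.
string[] fieldNames(T)()
    if (is(T == class))
{
    string[] names;
    foreach (U; AliasSeq!(T, BaseClassesTuple!T))
        foreach (i, F; typeof(U.tupleof))
            names ~= F.stringof ~ " " ~ __traits(identifier, U.tupleof[i]);
    return names;
}

unittest
{
    assert(fieldNames!S() == ["int b", "int a"]);
    assert(fieldNames!A() == ["int a"]);
}
```

Because each class contributes only the fields it declares itself, no filtering of duplicates is needed.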