digitalmars.D - std.jgrandson
- Andrei Alexandrescu (24/24) Aug 03 2014 We need a better json library at Facebook. I'd discussed with Sönke the
- Johannes Pfau (20/53) Aug 03 2014 API looks great but I'd like to see some simple serialize/deserialize
- ponce (4/18) Aug 03 2014 That's what https://github.com/Orvid/JSONSerialization does.
- Sönke Ludwig (3/18) Aug 03 2014 The default mode for vibe.data.serialization also doesn't need any UDAs,...
- Andrei Alexandrescu (11/29) Aug 03 2014 Nice.
- Dicebot (12/17) Aug 03 2014 Before going this route one needs to have a good vision how it
- Sönke Ludwig (8/22) Aug 03 2014 Do you have a specific case in mind where the data format doesn't fit
- Dicebot (9/17) Aug 03 2014 For example we use special binary serialization format for
- Jacob Carlborg (8/19) Aug 04 2014 I suggest only provide functions for serializing primitive types.
- Daniel Murphy (4/5) Aug 04 2014 This is exactly what I need in most projects. Basic types, arrays, AAs,...
- Jacob Carlborg (7/9) Aug 04 2014 I was more thinking only types that cannot be broken down into
- Daniel Murphy (25/30) Aug 05 2014 I guess I meant types that have an obvious mapping to json types.
- Andrea Fontana (5/39) Aug 05 2014 If I'm right, json has just one numeric type. No difference
- Daniel Murphy (3/7) Aug 05 2014 Maybe, but std.json has three numeric types.
- Andrei Alexandrescu (6/15) Aug 05 2014 I searched around a bit and it seems different libraries have different
- Dicebot (4/22) Aug 05 2014 There is certain benefit in using same primitive types for JSON
- Sean Kelly (20/25) Aug 05 2014 The original point of JSON was that it auto-converts to
- Andrei Alexandrescu (2/28) Aug 05 2014 All good points. Proceed with implementation! :o) -- Andrei
- Dicebot (3/4) Aug 05 2014 Any news about std.allocator ? ;)
- Andrei Alexandrescu (4/7) Aug 05 2014 It looks like I need to go all out and write a garbage collector, design...
- "Marc Schütz" <schuetzm gmx.net> (7/15) Aug 05 2014 A few months ago, you posted a video of a talk where you
- H. S. Teoh via Digitalmars-d (11/31) Aug 05 2014 Would it make sense to wrap a JSON number in an opaque type that
- Andrea Fontana (6/12) Aug 06 2014 IMO we should store original json number value as string and then
- Jacob Carlborg (6/28) Aug 05 2014 I'm not saying that is a bad idea or that I don't want to be able to do
- Daniel Murphy (4/7) Aug 05 2014 I know, but I don't really care if it's part of a generic serialization
- Jacob Carlborg (5/7) Aug 06 2014 Yeah, that's the problem. But where do you draw the line. Should arrays
- Daniel Murphy (9/13) Aug 06 2014 Yes. Allow T, where T is any of
- Jacob Carlborg (4/10) Aug 06 2014 BTW, why not classes? It's basically the same implementation as for stru...
- Daniel Murphy (4/6) Aug 06 2014 I guess I've just never needed to do it with classes. A lot of the time...
- Sean Kelly (4/11) Aug 06 2014 We could do something like Jackson. I wouldn't want it as the
- Dicebot (4/22) Aug 04 2014 Do you consider structs primitive types? This is probably #1 use
- Jacob Carlborg (7/10) Aug 04 2014 No, only types that cannot be broken down into smaller pieces,
- Dicebot (21/29) Aug 04 2014 That is exactly the problem - if `structToJson` won't be
- Jacob Carlborg (8/27) Aug 04 2014 I see. I need to think a bit about this.
- Sönke Ludwig (15/27) Aug 05 2014 On the other hand, a simplistic solution will inevitably result in
- Dicebot (4/8) Aug 05 2014 Simple option is to define required serializer traits and make
- Jacob Carlborg (21/32) Aug 05 2014 I have a very flexible trait like system in place. This allows to
- bearophile (10/11) Aug 03 2014 Good.
- Andrei Alexandrescu (2/11) Aug 03 2014 Yah, the latter is in the code. It's a ddoc problem. -- Andrei
- Sönke Ludwig (56/56) Aug 03 2014 A few thoughts based on my experience with vibe.data.json:
- Andrei Alexandrescu (35/90) Aug 03 2014 Nonono. I think there's a confusion. The input strings are not UTF
- Dicebot (6/20) Aug 03 2014 I support this opinion. opDispatch looks cool with JSON objects
- Sönke Ludwig (29/62) Aug 03 2014 Ah okay, *phew* ;) But in that case I'd actually think about leaving off...
- Andrei Alexandrescu (19/53) Aug 03 2014 Yah, that's awesome.
- Wyatt (11/22) Aug 04 2014 I suspect that depends on the circumstances. I've been using
- Andrei Alexandrescu (34/34) Aug 03 2014 On 8/3/14, 2:38 AM, Sönke Ludwig wrote:
- Johannes Pfau (17/37) Aug 03 2014 I think for the lowest level interface we could avoid allocation
- Andrei Alexandrescu (8/44) Aug 03 2014 That works but not e.g. for File.byLine which reuses its internal
- Johannes Pfau (9/15) Aug 03 2014 https://github.com/D-Programming-Language/phobos/blob/master/std/variant...
- Andrei Alexandrescu (12/27) Aug 03 2014 That could be translated to a comparison of pointers to functions.
- Sönke Ludwig (15/48) Aug 03 2014 This may be the crux w.r.t. the vibe.data.json implementation. My
- w0rp (9/22) Aug 03 2014 My issue with it is that if you ask for a key in an object which
- Sönke Ludwig (8/26) Aug 03 2014 Yes, this is what I meant with the JavaScript part of API. In addition
- "Marc Schütz" <schuetzm gmx.net> (6/17) Aug 04 2014 There is a parallel discussion about the concept of associative
- Andrei Alexandrescu (6/32) Aug 03 2014 What would be your estimated time of finishing?
- Sönke Ludwig (6/7) Aug 05 2014 My rough estimate would be that about two weeks of calendar time should
- w0rp (16/41) Aug 03 2014 I like it. Here's what I think about it.
- Daniel Gibson (10/16) Aug 03 2014 Is the name supposed to stay or just a working title?
- Andrei Alexandrescu (3/12) Aug 03 2014 Just a working title, but of course if it were wildly successful... but
- Sean Kelly (15/15) Aug 03 2014 I don't want to pay for anything I don't use. No allocations
- Andrei Alexandrescu (14/29) Aug 03 2014 What to do about arrays and objects, which would naturally allocate
- Dmitry Olshansky (25/31) Aug 03 2014 SAX-style would imply that array is "parsed" by calling 6 user-defined
- Dmitry Olshansky (4/9) Aug 03 2014 Aw. Stray brace..
- Sean Kelly (18/47) Aug 03 2014 This is tricky with a range. With an event-based parser I'd have
- Jacob Carlborg (9/13) Aug 04 2014 Have a look at Token.Kind in the top of the module [1]. The enum
- Jacob Carlborg (7/10) Aug 04 2014 I think it should only provide very primitive functions to
- Orvid King (15/39) Aug 03 2014 If your looking for serialization from statically known type layouts
- Andrea Fontana (29/54) Aug 04 2014 On my bson library I found very useful to have some methods to
- Andrei Alexandrescu (7/31) Aug 04 2014 Cool. Is it unlikely that a value contains an actual slash? If so would
- Andrea Fontana (17/72) Aug 05 2014 I wrote assume just to use proposed syntax :)
- Andrei Alexandrescu (5/9) Aug 05 2014 One one side enters vibe.data.json with the deltas prompted by
- Jacob Carlborg (18/24) Aug 04 2014 * Could you please put it on Github to get syntax highlighting
- Andrei Alexandrescu (15/35) Aug 04 2014 Quick workaround: http://dpaste.dzfl.pl/65f4dcc36ab8
- Jacob Carlborg (12/19) Aug 04 2014 That's why it's easier with Github ;) I can comment directly on a line.
- Andrei Alexandrescu (4/8) Aug 04 2014 "Favorite foods and colors are not to be disputed." 51,300 results on
We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Here are a few differences compared to vibe.d's library. I think these are desirable to have in that library as well: * Parsing strings is decoupled into tokenization (which is lazy and only needs an input range) and parsing proper. Tokenization is lazy, which allows users to create their own advanced (e.g. partial/lazy) parsing if needed. The parser itself is eager. * There's no decoding of strings. * The representation is built on Algebraic, with the advantages that it benefits from all of its primitives. Implementation is also very compact because Algebraic obviates a bunch of boilerplate. Subsequent improvements to Algebraic will also reflect themselves into improvements to std.jgrandson. * The JSON value (called std.jgrandson.Value) has no named member variables or methods except for __payload. This is so there's no clash between dynamic properties exposed via opDispatch. Well that's about it. What would it take for this to become a Phobos proposal? Destroy. Andrei
Aug 03 2014
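To make the tokenization/parsing split concrete, here is a toy lazy tokenizer written for this discussion. It is a sketch of the idea only: the JsonTokens name and the token granularity are invented for illustration and are not the actual std.jgrandson API. Tokens are raw undecoded slices of the input, matching the "no decoding of strings" point above, and well-formed input is assumed.

```d
import std.algorithm.searching : canFind;
import std.stdio;

// A lazy input range of raw JSON token slices. No allocation, no
// decoding: each token is a slice of the original string.
struct JsonTokens
{
    string input;
    string front;   // current token as a raw slice of the input

    this(string s) { input = s; popFront(); }

    @property bool empty() const { return front is null; }

    void popFront()
    {
        // skip whitespace between tokens
        while (input.length && " \t\r\n".canFind(input[0]))
            input = input[1 .. $];
        if (!input.length) { front = null; return; }

        switch (input[0])
        {
        case '{', '}', '[', ']', ':', ',':  // single-char punctuation
            front = input[0 .. 1];
            input = input[1 .. $];
            break;
        case '"':   // string: slice up to the closing quote, undecoded
        {
            size_t i = 1;
            while (i < input.length && input[i] != '"')
                i += (input[i] == '\\') ? 2 : 1;
            front = input[0 .. i + 1];
            input = input[i + 1 .. $];
            break;
        }
        default:    // number / true / false / null: slice the bare word
        {
            size_t i = 0;
            while (i < input.length && !"{}[]:, \t\r\n".canFind(input[i]))
                ++i;
            front = input[0 .. i];
            input = input[i .. $];
        }
        }
    }
}

void main()
{
    // A consumer that stops early never pays for the rest of the input.
    foreach (tok; JsonTokens(`{"a": 1, "b": [true, null]}`))
        writeln(tok);
}
```

An eager parser would then be a separate layer consuming this range, which is the decoupling the post describes.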
Am Sun, 03 Aug 2014 00:16:04 -0700 schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Here are a few differences compared to vibe.d's library. I think these are desirable to have in that library as well: * Parsing strings is decoupled into tokenization (which is lazy and only needs an input range) and parsing proper. Tokenization is lazy, which allows users to create their own advanced (e.g. partial/lazy) parsing if needed. The parser itself is eager. * There's no decoding of strings. * The representation is built on Algebraic, with the advantages that it benefits from all of its primitives. Implementation is also very compact because Algebraic obviates a bunch of boilerplate. Subsequent improvements to Algebraic will also reflect themselves into improvements to std.jgrandson. * The JSON value (called std.jgrandson.Value) has no named member variables or methods except for __payload. This is so there's no clash between dynamic properties exposed via opDispatch. Well that's about it. What would it take for this to become a Phobos proposal? Destroy. AndreiAPI looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJson vibe uses UDAs to customize the serialization output. That's actually not json specific and therefore shouldn't be part of this module. But a simple deserializeJson which simply fills in all fields of a struct given a TokenStream is very useful and can be done without allocations (so it's much faster than going through the DOM). 
Nitpicks: * I'd make Token only store strings, then convert to double/number only when requested. If a user is simply skipping some tokens these conversions are unnecessary overhead. * parseString really shouldn't use appender. Make it somehow possible to supply a buffer to TokenStream and use that. (This way there's no memory allocation. If a user want to keep the string he has to .dup it). A BufferedRange concept might even be better, because you can read in blocks and reuse buffers.
Aug 03 2014
API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJson vibe uses UDAs to customize the serialization output. That's actually not json specific and therefore shouldn't be part of this module. But a simple deserializeJson which simply fills in all fields of a struct given a TokenStream is very useful and can be done without allocations (so it's much faster than going through the DOM).That's what https://github.com/Orvid/JSONSerialization does. Also msgpack-d https://github.com/msgpack/msgpack-d, whose defaults need no UDAs. That makes the typical use case very fast to write.
Aug 03 2014
Am 03.08.2014 10:25, schrieb ponce:The default mode for vibe.data.serialization also doesn't need any UDAs, but it's still often useful to be able to make customizations.API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJson vibe uses UDAs to customize the serialization output. That's actually not json specific and therefore shouldn't be part of this module. But a simple deserializeJson which simply fills in all fields of a struct given a TokenStream is very useful and can be done without allocations (so it's much faster than going through the DOM).That's what https://github.com/Orvid/JSONSerialization does. Also msgpack-d https://github.com/msgpack/msgpack-d, whose defaults need no UDAs. That makes the typical use case very fast to write.
Aug 03 2014
On 8/3/14, 1:02 AM, Johannes Pfau wrote:API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJsonAgreed.vibe uses UDAs to customize the serialization output. That's actually not json specific and therefore shouldn't be part of this module. But a simple deserializeJson which simply fills in all fields of a struct given a TokenStream is very useful and can be done without allocations (so it's much faster than going through the DOM).Nice.Nitpicks: * I'd make Token only store strings, then convert to double/number only when requested. If a user is simply skipping some tokens these conversions are unnecessary overhead.Well... this is tricky. If the input has immutable characters, they can be stored because it can be assumed they'll live forever. If they're mutable or const, that assumption doesn't hold so every number must allocate. At that point it's probably cheaper to just convert to double. One thing is I didn't treat integers specially, but I did notice some json parsers do make that distinction.* parseString really shouldn't use appender. Make it somehow possible to supply a buffer to TokenStream and use that. (This way there's no memory allocation. If a user want to keep the string he has to .dup it). A BufferedRange concept might even be better, because you can read in blocks and reuse buffers.Good suggestion, thanks. Andrei
Aug 03 2014
On Sunday, 3 August 2014 at 08:04:40 UTC, Johannes Pfau wrote:API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJsonBefore going this route one needs to have a good vision how it may interact with imaginary std.serialization to avoid later deprecation. At the same time I have recently started to think that dedicated serialization module that decouples aggregate iteration from data storage format is in most cases impractical for performance reasons - different serialization methods imply very different efficient iteration strategies. Probably it is better to define serialization compile-time traits instead and require each `std.data.*` provider to implement those on its own in the most effective fashion.
Aug 03 2014
Am 03.08.2014 20:44, schrieb Dicebot:On Sunday, 3 August 2014 at 08:04:40 UTC, Johannes Pfau wrote:Do you have a specific case in mind where the data format doesn't fit the process used by vibe.data.serialization? The data format iteration part *is* abstracted away there in basically a kind of traits structure (the "Serializer"). When serializing, the data always gets written in the order defined by the input value, while during deserialization the serializer defines how aggregates are iterated. This seems to fit all of the data formats that I had in mind.API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJsonBefore going this route one needs to have a good vision how it may interact with imaginary std.serialization to avoid later deprecation. At the same time I have recently started to think that dedicated serialization module that decouples aggregate iteration from data storage format is in most cases impractical for performance reasons - different serialization methods imply very different efficient iteration strategies. Probably it is better to define serialization compile-time traits instead and require each `std.data.*` provider to implement those on its own in the most effective fashion.
Aug 03 2014
On Sunday, 3 August 2014 at 19:36:43 UTC, Sönke Ludwig wrote:Do you have a specific case in mind where the data format doesn't fit the process used by vibe.data.serialization? The data format iteration part *is* abstracted away there in basically a kind of traits structure (the "Serializer"). When serializing, the data always gets written in the order defined by the input value, while during deserialization the serializer defines how aggregates are iterated. This seems to fit all of the data formats that I had in mind.For example we use special binary serialization format for structs where serialized content is actually a valid D struct - after updating internal array pointers one can simply do `cast(S*) buffer.ptr` and work with it normally. Doing this efficiently requires breadth-first traversal and keeping track of one upper level to update the pointers. This does not fit very well with classical depth-first recursive traversal usually required by JSON-structure formats.
Aug 03 2014
On Sunday, 3 August 2014 at 18:44:37 UTC, Dicebot wrote:Before going this route one needs to have a good vision how it may interact with imaginary std.serialization to avoid later deprecation.I suggest only provide functions for serializing primitive types. A separate serialization module/package with a JSON archive type would use this module as its backend.At the same time I have recently started to think that dedicated serialization module that decouples aggregate iteration from data storage format is in most cases impractical for performance reasons - different serialization methods imply very different efficient iteration strategies. Probably it is better to define serialization compile-time traits instead and require each `std.data.*` provider to implement those on its own in the most effective fashion.I'm not sure I agree with that. In my work on std.serialization I have not seen this to be a problem. What problems have you found? -- /Jacob Carlborg
Aug 04 2014
"Jacob Carlborg" wrote in message news:bjecckhwlmkwkeqegwqa forum.dlang.org...I suggest only provide functions for serializing primitive types.This is exactly what I need in most projects. Basic types, arrays, AAs, and structs are usually enough.
Aug 04 2014
On Monday, 4 August 2014 at 09:10:46 UTC, Daniel Murphy wrote:This is exactly what I need in most projects. Basic types, arrays, AAs, and structs are usually enough.I was more thinking only types that cannot be broken down into smaller pieces, i.e. integer, floating point, bool and string. The serializer would break down the other types into smaller pieces. -- /Jacob Carlborg
Aug 04 2014
"Jacob Carlborg" wrote in message news:kvuaxyxjwmpqrorlozrz forum.dlang.org...I guess I meant types that have an obvious mapping to json types. int/long -> json integer bool -> json bool string -> json string float/real -> json float (close enough) T[] -> json array T[string] -> json object struct -> json object This is usually enough for config and data files. Being able to do this is just awesome: struct AppConfig { string somePath; bool someOption; string[] someList; string[string] someMap; } void main() { auto config = "config.json".readText().parseJSON().fromJson!AppConfig(); } Being able to serialize whole graphs into json is something I need much less often.This is exactly what I need in most projects. Basic types, arrays, AAs, and structs are usually enough.I was more thinking only types that cannot be broken down in to smaller pieces, i.e. integer, floating point, bool and string. The serializer would break down the other types in to smaller pieces.
Aug 05 2014
On Tuesday, 5 August 2014 at 12:40:25 UTC, Daniel Murphy wrote:"Jacob Carlborg" wrote in message news:kvuaxyxjwmpqrorlozrz forum.dlang.org...If I'm right, json has just one numeric type. No difference between integers / float and no limits. So probably the mapping is: float/double/real/int/long => numberI guess I meant types that have an obvious mapping to json types. int/long -> json integer bool -> json bool string -> json string float/real -> json float (close enough) T[] -> json array T[string] -> json object struct -> json object This is usually enough for config and data files. Being able to do this is just awesome: struct AppConfig { string somePath; bool someOption; string[] someList; string[string] someMap; } void main() { auto config = "config.json".readText().parseJSON().fromJson!AppConfig(); } Being able to serialize whole graphs into json is something I need much less often.This is exactly what I need in most projects. Basic types, arrays, AAs, and structs are usually enough.I was more thinking only types that cannot be broken down in to smaller pieces, i.e. integer, floating point, bool and string. The serializer would break down the other types in to smaller pieces.
Aug 05 2014
"Andrea Fontana" wrote in message news:takluoqmlmmooxlovqya forum.dlang.org...If I'm right, json has just one numeric type. No difference between integers / float and no limits. So probably the mapping is: float/double/real/int/long => numberMaybe, but std.json has three numeric types.
Aug 05 2014
On 8/5/14, 8:23 AM, Daniel Murphy wrote:"Andrea Fontana" wrote in message news:takluoqmlmmooxlovqya forum.dlang.org...I searched around a bit and it seems different libraries have different takes to this numeric matter. A simple reading of the spec suggests that floating point data is the only numeric type. However, many implementations choose to distinguish between floating point and integrals. AndreiIf I'm right, json has just one numeric type. No difference between integers / float and no limits. So probably the mapping is: float/double/real/int/long => numberMaybe, but std.json has three numeric types.
Aug 05 2014
On Tuesday, 5 August 2014 at 17:17:56 UTC, Andrei Alexandrescu wrote:On 8/5/14, 8:23 AM, Daniel Murphy wrote:There is certain benefit in using same primitive types for JSON as ones defined by BSON spec."Andrea Fontana" wrote in message news:takluoqmlmmooxlovqya forum.dlang.org...I searched around a bit and it seems different libraries have different takes to this numeric matter. A simple reading of the spec suggests that floating point data is the only numeric type. However, many implementations choose to distinguish between floating point and integrals.If I'm right, json has just one numeric type. No difference between integers / float and no limits. So probably the mapping is: float/double/real/int/long => numberMaybe, but std.json has three numeric types.
Aug 05 2014
On Tuesday, 5 August 2014 at 17:17:56 UTC, Andrei Alexandrescu wrote:I searched around a bit and it seems different libraries have different takes to this numeric matter. A simple reading of the spec suggests that floating point data is the only numeric type. However, many implementations choose to distinguish between floating point and integrals.The original point of JSON was that it auto-converts to Javascript data. And since Javascript only has one numeric type, of course JSON does too. But I think it's important that a JSON package for a language maps naturally to the types available in that language. D provides both floating point and integer types, each with their own costs and benefits, and so the JSON package should as well. It ends up being a lot easier to deal with than remembering to round from JSON.number or whatever when assigning to an int. In fact, JSON doesn't even impose any precision restrictions on its numeric type, so one could argue that we should be using BigInt and BigFloat. But this would stink most of the time, so... On an unrelated note, while the default encoding for strings is UTF-8, the RFC absolutely allows for UTF-16 surrogate pairs, and this must be supported. Any strings you get from Internet Explorer will be encoded as UTF-16 surrogate pairs regardless of content, presumably since Windows uses 16 bit wide chars for unicode.
Aug 05 2014
On 8/5/14, 10:48 AM, Sean Kelly wrote:On Tuesday, 5 August 2014 at 17:17:56 UTC, Andrei Alexandrescu wrote:All good points. Proceed with implementation! :o) -- AndreiI searched around a bit and it seems different libraries have different takes to this numeric matter. A simple reading of the spec suggests that floating point data is the only numeric type. However, many implementations choose to distinguish between floating point and integrals.The original point of JSON was that it auto-converts to Javascript data. And since Javascript only has one numeric type, of course JSON does too. But I think it's important that a JSON package for a language maps naturally to the types available in that language. D provides both floating point and integer types, each with their own costs and benefits, and so the JSON package should as well. It ends up being a lot easier to deal with than remembering to round from JSON.number or whatever when assigning to an int. In fact, JSON doesn't even impose any precision restrictions on its numeric type, so one could argue that we should be using BigInt and BigFloat. But this would stink most of the time, so... On an unrelated note, while the default encoding for strings is UTF-8, the RFC absolutely allows for UTF-16 surrogate pairs, and this must be supported. Any strings you get from Internet Explorer will be encoded as UTF-16 surrogate pairs regardless of content, presumably since Windows uses 16 bit wide chars for unicode.
Aug 05 2014
On Tuesday, 5 August 2014 at 17:58:08 UTC, Andrei Alexandrescu wrote:All good points. Proceed with implementation! :o) -- AndreiAny news about std.allocator ? ;)
Aug 05 2014
On 8/5/14, 10:58 AM, Dicebot wrote:On Tuesday, 5 August 2014 at 17:58:08 UTC, Andrei Alexandrescu wrote:It looks like I need to go all out and write a garbage collector, design and implementation and all. AndreiAll good points. Proceed with implementation! :o) -- AndreiAny news about std.allocator ? ;)
Aug 05 2014
On Tuesday, 5 August 2014 at 18:12:54 UTC, Andrei Alexandrescu wrote:On 8/5/14, 10:58 AM, Dicebot wrote:A few months ago, you posted a video of a talk where you presented code from a garbage collector (it used templated mark functions to get precise tracing). I remember you said that this code was in use somewhere (I guess at FB?). Can this be used as a basis?On Tuesday, 5 August 2014 at 17:58:08 UTC, Andrei Alexandrescu wrote:It looks like I need to go all out and write a garbage collector, design and implementation and all.All good points. Proceed with implementation! :o) -- AndreiAny news about std.allocator ? ;)
Aug 05 2014
On Tue, Aug 05, 2014 at 10:58:08AM -0700, Andrei Alexandrescu via Digitalmars-d wrote:On 8/5/14, 10:48 AM, Sean Kelly wrote:[...]Would it make sense to wrap a JSON number in an opaque type that implicitly casts to the target built-in type?The original point of JSON was that it auto-converts to Javascript data. And since Javascript only has one numeric type, of course JSON does too. But I think it's important that a JSON package for a language maps naturally to the types available in that language. D provides both floating point and integer types, each with their own costs and benefits, and so the JSON package should as well. It ends up being a lot easier to deal with than remembering to round from JSON.number or whatever when assigning to an int. In fact, JSON doesn't even impose any precision restrictions on its numeric type, so one could argue that we should be using BigInt and BigFloat. But this would stink most of the time, so...[...] Wait, I thought surrogate pairs only apply to characters past U+FFFF? Is it even possible to encode BMP characters with surrogate pairs?? Or do you mean UTF-16? T -- Music critic: "That's an imitation fugue!"On an unrelated note, while the default encoding for strings is UTF-8, the RFC absolutely allows for UTF-16 surrogate pairs, and this must be supported. Any strings you get from Internet Explorer will be encoded as UTF-16 surrogate pairs regardless of content, presumably since Windows uses 16 bit wide chars for unicode.
Aug 05 2014
On Tuesday, 5 August 2014 at 18:11:21 UTC, H. S. Teoh via Digitalmars-d wrote:On Tue, Aug 05, 2014 at 10:58:08AM -0700, Andrei Alexandrescu via Digitalmars-d wrote:On 8/5/14, 10:48 AM, Sean Kelly wrote:[...]Would it make sense to wrap a JSON number in an opaque type that implicitly casts to the target built-in type?IMO we should store the original json number value as a string and then try to convert to what the user asks for. As said, it could be a big int, or a big floating point value without any limit.
Aug 06 2014
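Andrea's store-as-string idea (and H. S. Teoh's opaque-wrapper question upthread) could be sketched like this; JsonNumber and get are hypothetical illustration names, not a proposed Phobos API. The raw lexeme is kept untouched and conversion happens only when the caller commits to a type:

```d
import std.bigint : BigInt;
import std.conv : to;
import std.stdio;

// Keep the number exactly as it appeared in the JSON text; convert
// only on request, so no precision is decided up front.
struct JsonNumber
{
    string raw;   // the untouched lexeme, e.g. "12345678901234567890"

    T get(T)() const { return raw.to!T; }
}

void main()
{
    auto n = JsonNumber("42");
    assert(n.get!int == 42);
    assert(n.get!double == 42.0);

    // A huge value stays intact until the caller picks a wide enough
    // type, e.g. BigInt.
    auto big = JsonNumber("12345678901234567890");
    assert(big.get!BigInt == BigInt("12345678901234567890"));

    writeln(n.get!int + 1);   // 43
}
```

A variant closer to Teoh's suggestion would add opCast overloads so the wrapper converts implicitly at the use site; the storage idea is the same.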
On 2014-08-05 14:40, Daniel Murphy wrote:I guess I meant types that have an obvious mapping to json types. int/long -> json integer bool -> json bool string -> json string float/real -> json float (close enough) T[] -> json array T[string] -> json object struct -> json object This is usually enough for config and data files. Being able to do this is just awesome: struct AppConfig { string somePath; bool someOption; string[] someList; string[string] someMap; } void main() { auto config = "config.json".readText().parseJSON().fromJson!AppConfig(); }I'm not saying that is a bad idea or that I don't want to be able to do this. I just prefer this to be handled by a generic serialization module. Which can of course handle the simple cases, like above, as well. -- /Jacob Carlborg
Aug 05 2014
"Jacob Carlborg" wrote in message news:lrqvfa$2has$1 digitalmars.com...I'm not saying that is a bad idea or that I don't want to be able to do this. I just prefer this to be handled by a generic serialization module. Which can of course handle the simple cases, like above, as well.I know, but I don't really care if it's part of a generic serialization library or not. I just want it there. Chances are tying it to a future generic serialization library is going to make it take longer.
Aug 05 2014
On 2014-08-05 18:42, Daniel Murphy wrote:Chances are tying it to a future generic serialization library is going to make it take longer.Yeah, that's the problem. But where do you draw the line? Should arrays of structs be supported? -- /Jacob Carlborg
Aug 06 2014
"Jacob Carlborg" wrote in message news:lrsrek$19mf$1 digitalmars.com...Yes. Allow T, where T is any of int, float, long, etc bool struct { T... } T[string] T[] Sure, you _can_ make a struct containing an array that contains itself, but you probably won't.Chances are tying it to a future generic serialization library is going to make it take longer.Yeah, that's the problem. But where do you draw the line. Should arrays of structs be supported?
Aug 06 2014
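Daniel's whitelist can be written down as a recursive compile-time trait; isJsonMappable is a hypothetical name, shown only to make the rule precise. (A self-referential struct like `struct S { S[] kids; }` would need an extra recursion guard, which this sketch omits.)

```d
import std.meta : allSatisfy;
import std.range : ElementType;
import std.traits : Fields, isAssociativeArray, isBoolean, isDynamicArray,
    isFloatingPoint, isIntegral, isSomeString, KeyType, ValueType;

// Recursive compile-time whitelist of JSON-mappable types,
// mirroring the list in the post above.
template isJsonMappable(T)
{
    static if (isIntegral!T || isFloatingPoint!T || isBoolean!T
        || isSomeString!T)
        enum isJsonMappable = true;                  // leaf types
    else static if (isAssociativeArray!T)            // T[string] -> object
        enum isJsonMappable = isSomeString!(KeyType!T)
            && isJsonMappable!(ValueType!T);
    else static if (isDynamicArray!T)                // T[] -> array
        enum isJsonMappable = isJsonMappable!(ElementType!T);
    else static if (is(T == struct))                 // struct -> object
        enum isJsonMappable = allSatisfy!(isJsonMappable, Fields!T);
    else
        enum isJsonMappable = false;                 // classes, pointers, ...
}

struct AppConfig
{
    string somePath;
    bool someOption;
    string[] someList;
    string[string] someMap;
}

static assert(isJsonMappable!AppConfig);
static assert(isJsonMappable!(int[string]));
static assert(!isJsonMappable!(int*));   // no obvious JSON mapping

void main() {}
```

A fromJson-style function could then be constrained with `if (isJsonMappable!T)` to reject unsupported types at compile time.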
On 2014-08-06 13:36, Daniel Murphy wrote:Yes. Allow T, where T is any of int, float, long, etc bool struct { T... } T[string] T[]BTW, why not classes? It's basically the same implementation as for structs. -- /Jacob Carlborg
Aug 06 2014
"Jacob Carlborg" wrote in message news:lrtf8l$22d3$1 digitalmars.com...BTW, why not classes? It's basically the same implementation as for structs.I guess I've just never needed to do it with classes. A lot of the time when I use classes I use inheritance, and this simple translation doesn't work out so well then...
Aug 06 2014
On Wednesday, 6 August 2014 at 15:28:06 UTC, Daniel Murphy wrote:"Jacob Carlborg" wrote in message news:lrtf8l$22d3$1 digitalmars.com...We could do something like Jackson. I wouldn't want it as the primary interface for a JSON package, but for serializing classes it's a pretty easy design to work with from a user perspective.BTW, why not classes? It's basically the same implementation as for structs.I guess I've just never needed to do it with classes. A lot of the time when I use classes I use inheritance, and this simple translation doesn't work out so will then...
Aug 06 2014
On Monday, 4 August 2014 at 07:34:19 UTC, Jacob Carlborg wrote:On Sunday, 3 August 2014 at 18:44:37 UTC, Dicebot wrote:case for JSON conversion.Before going this route one needs to have a good vision of how it may interact with imaginary std.serialization to avoid later deprecation.I suggest only providing functions for serializing primitive types. A separate serialization module/package with a JSON archive type would use this module as its backend.http://forum.dlang.org/post/mzweposldwqdtmqoltiy forum.dlang.orgAt the same time I have recently started to think that a dedicated serialization module that decouples aggregate iteration from data storage format is in most cases impractical for performance reasons - different serialization methods imply very different efficient iteration strategies. Probably it is better to define serialization compile-time traits instead and require each `std.data.*` provider to implement those on its own in the most effective fashion.I'm not sure I agree with that. In my work on std.serialization I have not seen this to be a problem. What problems have you found?
Aug 04 2014
On Monday, 4 August 2014 at 14:02:22 UTC, Dicebot wrote:use case for JSON conversion.No, only types that cannot be broken down into smaller pieces, i.e. integral, floating points, bool and strings.http://forum.dlang.org/post/mzweposldwqdtmqoltiy forum.dlang.orgI don't understand exactly how that binary serialization works. I think I would need a code example. -- /Jacob Carlborg
Aug 04 2014
On Monday, 4 August 2014 at 14:18:41 UTC, Jacob Carlborg wrote:On Monday, 4 August 2014 at 14:02:22 UTC, Dicebot wrote:That is exactly the problem - if `structToJson` won't be provided, complaints are inevitable, it is too basic a feature to wait for std.serialization :(use case for JSON conversion.No, only types that cannot be broken down into smaller pieces, i.e. integral, floating points, bool and strings.Simplified serialization algorithm: 1) write (cast(void*) &struct)[0..struct.sizeof] to target buffer 2) write any array content to the same buffer after the struct 3.1) if an array contains structs, recursion 3.2) go back to the buffer[0..struct.sizeof] slice and update array fields to store an index into the same buffer instead of the actual ptr Simplified deserialization algorithm: 1) recursively traverse the struct and replace array index offsets with real slices into the buffer (I don't want to bother with getting copyright permissions to publish actual code) I am pretty sure that this is not the only optimized serialization approach out there that does not fit in a content-insensitive primitive-based traversal scheme. And we want Phobos stuff to be blazingly fast, which can lead to a situation where a new data module will circumvent the std.serialization API to get more performance.
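To make the description concrete, here is a hedged sketch of the offset-patching scheme as described, for a single struct with one non-recursive array field (hypothetical code, not the actual proprietary implementation; alignment, versioning and @safe-ty are deliberately ignored):

```d
// Hypothetical sketch of the offset-patching serialization described above.
struct Record { int id; int[] data; }

ubyte[] serialize(Record r)
{
    // step 1: copy the raw struct bytes into the target buffer
    ubyte[] buf = (cast(ubyte*) &r)[0 .. Record.sizeof].dup;
    immutable offset = buf.length;
    // step 2: append the array content after the struct
    buf ~= cast(ubyte[]) r.data;
    // step 3.2: patch the array field to store a buffer offset, not a ptr
    auto patched = cast(Record*) buf.ptr;
    patched.data = (cast(int*) offset)[0 .. r.data.length];
    return buf;
}

Record deserialize(ubyte[] buf)
{
    auto r = cast(Record*) buf.ptr;
    // replace the stored offset with a real slice into the buffer
    immutable offset = cast(size_t) r.data.ptr;
    r.data = cast(int[]) buf[offset .. offset + r.data.length * int.sizeof];
    return *r;
}

unittest
{
    auto buf = serialize(Record(42, [1, 2, 3]));
    auto r = deserialize(buf);
    assert(r.id == 42 && r.data == [1, 2, 3]);
}
```

Note that deserialization here is a couple of pointer patches over the received buffer, which is the point of the scheme: no per-field traversal through a generic serializer API.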
Aug 04 2014
On 2014-08-04 16:55, Dicebot wrote:That is exactly the problem - if `structToJson` won't be provided, complaints are inevitable, it is too basic a feature to wait for std.serialization :(Hmm, yeah, that's a problem.Simplified serialization algorithm: 1) write (cast(void*) &struct)[0..struct.sizeof] to target buffer 2) write any array content to the same buffer after the struct 3.1) if an array contains structs, recursion 3.2) go back to the buffer[0..struct.sizeof] slice and update array fields to store an index into the same buffer instead of the actual ptr Simplified deserialization algorithm: 1) recursively traverse the struct and replace array index offsets with real slices into the bufferI see. I need to think a bit about this.(I don't want to bother with getting copyright permissions to publish actual code)Fair enough. The above was quite descriptive.I am pretty sure that this is not the only optimized serialization approach out there that does not fit in a content-insensitive primitive-based traversal scheme. And we want Phobos stuff to be blazingly fast, which can lead to a situation where a new data module will circumvent the std.serialization API to get more performance.I don't like the idea of having to reimplement serialization for each data type that can be generalized.
Aug 04 2014
On 04.08.2014 20:38, Jacob Carlborg wrote:On 2014-08-04 16:55, Dicebot wrote:On the other hand, a simplistic solution will inevitably result in people needing more. And when at some point a serialization module is in Phobos, there will be duplicate functionality in the library.That is exactly the problem - if `structToJson` won't be provided, complaints are inevitable, it is too basic a feature to wait for std.serialization :(Hmm, yeah, that's a problem.I think we could also simply keep the generic default recursive descent behavior, but allow serializers to customize the process using some kind of trait. This could even be added later in a backwards compatible fashion if necessary. BTW, how is the progress for Orange w.r.t. the conversion to a more template+allocation-less approach, is a new std proposal within the next DMD release cycle realistic? I quite like most of how vibe.data.serialization turned out, but it can't do any alias detection/deduplication (and I have no concrete plans to add support for that), which is why I currently wouldn't consider it as a potential Phobos candidate.
Aug 05 2014
On Tuesday, 5 August 2014 at 09:54:42 UTC, Sönke Ludwig wrote:I think we could also simply keep the generic default recursive descent behavior, but allow serializers to customize the process using some kind of trait. This could even be added later in a backwards compatible fashion if necessary.A simple option is to define the required serializer traits and make both the std.serialization default and any custom data-specific ones conform to it.
Aug 05 2014
On 2014-08-05 11:54, Sönke Ludwig wrote:I think we could also simply keep the generic default recursive descent behavior, but allow serializers to customize the process using some kind of trait. This could even be added later in a backwards compatible fashion if necessary.I have a very flexible trait-like system in place. This allows configuring the serializer based on the given archiver and user customizations, to avoid having the serializer do unnecessary work which the archiver cannot handle.BTW, how is the progress for Orange w.r.t. the conversion to a more template+allocation-less approachSlowly. I think the range support in the serializer is basically complete. But the deserializer isn't done yet. I would also like to provide, at least, one additional archiver type besides XML. BTW std.xml doesn't make it any easier to rangify the serializer. I've been focusing on D/Objective-C lately, which I think is in a more complete state than std.serialization. I would really like to get it done and create a pull request so I can get back to std.serialization. But I always get stuck after a merge with something breaking. With the summer and vacations I haven't been able to work that much on D at all.Is a new std proposal within the next DMD release cycle realistic?Probably not.I quite like most of how vibe.data.serialization turned out, but it can't do any alias detection/deduplication (and I have no concrete plans to add support for that), which is why I currently wouldn't consider it as a potential Phobos candidate.I'm quite satisfied with the feature support and flexibility of Orange/std.serialization. With the new trait-like system it will be even more flexible. -- /Jacob Carlborg
Aug 05 2014
Andrei Alexandrescu:* The representation is built on Algebraic,Good. But here I'd like a little more readable type: alias Payload = std.variant.VariantN!(16LU, typeof(null), bool, double, string, Value[], Value[string]).VariantN; Like: alias Payload = std.variant.Algebraic!(typeof(null), bool, double, string, Value[], Value[string]); Bye, bearophile
Aug 03 2014
On 8/3/14, 1:19 AM, bearophile wrote:Andrei Alexandrescu:Yah, the latter is in the code. It's a ddoc problem. -- Andrei* The representation is built on Algebraic,Good. But here I'd like a little more readable type: alias Payload = std.variant.VariantN!(16LU, typeof(null), bool, double, string, Value[], Value[string]).VariantN; Like: alias Payload = std.variant.Algebraic!(typeof(null), bool, double, string, Value[], Value[string]);
Aug 03 2014
A few thoughts based on my experience with vibe.data.json: 1. No decoding of strings appears to mean that "Value" also always contains encoded strings. This seems to be a leaky and also error-prone abstraction. For the token stream, performance should be top priority, so it's okay to not decode there, but "Value" is a high level abstraction of a JSON value, so it should really hide all implementation details of the storage format. 2. Algebraic is a good choice for its generic handling of operations on the contained types (which isn't exposed here, though). However, a tagged union type in my experience has quite some advantages for usability. Since adding a type tag possibly affects the interface in a non-backwards compatible way, this should be evaluated early on. 2.b) I'm currently working on a generic tagged union type that also enables operations between values in a natural generic way. This has the big advantage of not having to manually define operators like in "Value", which is error prone and often limited (I've had to make many fixes and additions in this part of the code over time). 3. Use of "opDispatch" for an open set of members has been criticized for vibe.data.json before and I agree with that criticism. The only advantage is saving a few keystrokes (json.key instead of json["key"]), but I came to the conclusion that the right approach to work with JSON values in D is to always directly deserialize when/if possible anyway, which mostly makes this a moot point. This approach has a lot of advantages, e.g. reduction of allocations, performance of field access and avoiding typos when accessing fields. Especially the last point is interesting, because opDispatch based field access gives the false impression that a static field is accessed. 
The decision to minimize the number of static fields within "Value" reduces the chance of accidentally accessing a static field instead of hitting opDispatch, but there are still *some* static fields/methods and any later addition of a symbol must now be considered a breaking change. 3.b) Bad interaction of UFCS and opDispatch: Functions like "remove" and "assume" certainly look like they could be used with UFCS, but opDispatch destroys that possibility. 4. I know the stance on this is often "The D module system has enough facilities to disambiguate" (which is not really a valid argument, but rather just the lack of a counter argument, IMO), but I highly dislike the choice to leave off any mention of "JSON" or "Json" in the global symbol names. Using the module either requires always using a renamed import or a manual alias, or the resulting source code will always leave the reader wondering what kind of data is actually handled. Handling multiple "value" types in a single piece of code, which is not uncommon (e.g. JSON + BSON/ini value/...) would always require explicit disambiguation. I'd certainly include the "JSON" or "Json" part in the names. 5. Whatever happens, *please* let's aim for a module name of std.data.json (similar to std.digest.*), so that any data formats added later are nicely organized. All existing data format support (XML + CSV) doesn't follow contemporary Phobos style, so they will need to be deprecated at some point anyway, freeing the way for a clean and non-breaking transition to a more organized module hierarchy. 6. (Possibly compile time optional) support for keeping track of line/column numbers is often important for better error messages, so that would be good to have included as part of the parser and in the "Token" type. Sönke
Aug 03 2014
On 8/3/14, 2:38 AM, Sönke Ludwig wrote:A few thoughts based on my experience with vibe.data.json: 1. No decoding of strings appears to mean that "Value" also always contains encoded strings. This seems to be a leaky and also error-prone abstraction. For the token stream, performance should be top priority, so it's okay to not decode there, but "Value" is a high level abstraction of a JSON value, so it should really hide all implementation details of the storage format.Nonono. I think there's a confusion. The input strings are not UTF decoded for the simple reason that there's no need (all tokenization decisions are taken on the basis of ASCII characters/code units). The backslash-prefixed characters are indeed decoded. An optimization I didn't implement yet is to use slices of the input wherever possible (when the input is string, immutable(byte)[], or immutable(ubyte)[]). That will reduce allocations considerably.2. Algebraic is a good choice for its generic handling of operations on the contained types (which isn't exposed here, though). However, a tagged union type in my experience has quite some advantages for usability. Since adding a type tag possibly affects the interface in a non-backwards compatible way, this should be evaluated early on.There's a public opCast(Payload) that gives the end user access to the Payload inside a Value. I forgot to add documentation to it. What advantages are there to a tagged union? (FWIW: to me Algebraic and Variant are also tagged unions, just that the tags are not 0, 1, ..., n. That can be easily fixed for Algebraic by defining operations to access the index of the currently-stored type.)2.b) I'm currently working on a generic tagged union type that also enables operations between values in a natural generic way. 
This has the big advantage of not having to manually define operators like in "Value", which is error prone and often limited (I've had to make many fixes and additions in this part of the code over time).I did notice that vibe.json has quite a repetitive implementation, so reducing it would be great. The way I see it, good work on tagged unions must be either integrated within std.variant (either by modifying Variant/Algebraic or by adding new types to it). I am very strongly opposed to adding a tagged union type only for JSON purposes, which I'd consider essentially a usability bug in std.variant, the opposite of dogfooding, etc.3. Use of "opDispatch" for an open set of members has been criticized for vibe.data.json before and I agree with that criticism. The only advantage is saving a few keystrokes (json.key instead of json["key"]), but I came to the conclusion that the right approach to work with JSON values in D is to always directly deserialize when/if possible anyway, which mostly makes this is a moot point.Interesting. Well if experience with opDispatch is negative then it should probably not be used here, or only offered on an opt-in basis.This approach has a lot of advantages, e.g. reduction of allocations, performance of field access and avoiding typos when accessing fields. Especially the last point is interesting, because opDispatch based field access gives the false impression that a static field is accessed.Good point.The decision to minimize the number of static fields within "Value" reduces the chance of accidentally accessing a static field instead of hitting opDispatch, but there are still *some* static fields/methods and any later addition of a symbol must now be considered a breaking change.Right now the idea is that the only named member is __payload. Well then there's opXxxx as well. 
The idea is/was to add all other functionality as free functions.3.b) Bad interaction of UFCS and opDispatch: Functions like "remove" and "assume" certainly look like they could be used with UFCS, but opDispatch destroys that possibility.Yah, agreed. The bummer is people coming from Python won't be able to continue using the same style without opDispatch.4. I know the stance on this is often "The D module system has enough facilities to disambiguate" (which is not really a valid argument, but rather just the lack of a counter argument, IMO), but I highly dislike the choice to leave off any mention of "JSON" or "Json" in the global symbol names. Using the module either requires to always use a renamed import or a manual alias, or the resulting source code will always leave the reader wondering what kind of data is actually handled. Handling multiple "value" types in a single piece of code, which is not uncommon (e.g. JSON + BSON/ini value/...) would always require explicit disambiguation. I'd certainly include the "JSON" or "Json" part in the names.Good point, I agree.5. Whatever happens, *please* let's aim for a module name of std.data.json (similar to std.digest.*), so that any data formats added later are nicely organized. All existing data format support (XML + CSV) doesn't follow contemporary Phobos style, so they will need to be deprecated at some point anyway, freeing the way for a clean an non-breaking transition to a more organized module hierarchy.I agree.6. (Possibly compile time optional) support for keeping track of line/column numbers is often important for better error messages, so that would be good to have included as part of the parser and in the "Token" type.Yah, saw that in vibe.d but forgot about it. Thanks, Andrei
Aug 03 2014
On Sunday, 3 August 2014 at 15:14:43 UTC, Andrei Alexandrescu wrote:I support this opinion. opDispatch looks cool with JSON objects when you implement it but it results in many subtle quirks when you consider something like range traits for example - most annoying to encounter and debug. It is not worth the gain.3. Use of "opDispatch" for an open set of members has been criticized for vibe.data.json before and I agree with that criticism. The only advantage is saving a few keystrokes (json.key instead of json["key"]), but I came to the conclusion that the right approach to work with JSON values in D is to always directly deserialize when/if possible anyway, which mostly makes this is a moot point.Interesting. Well if experience with opDispatch is negative then it should probably not be used here, or only offered on an opt-in basis.
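The kind of quirk being criticized is easy to demonstrate with a stripped-down sketch (a hypothetical `Json` type, not vibe.data.json or jsvar): a typo in an opDispatch "field" name still compiles and only misbehaves at run time:

```d
// Hypothetical stripped-down Json type illustrating the opDispatch hazard.
struct Json
{
    string[string] fields;

    // json.key forwards to json.fields["key"]
    string opDispatch(string name)() const
    {
        return fields.get(name, "undefined");
    }
}

unittest
{
    Json j;
    j.fields["color"] = "red";
    assert(j.color == "red");        // looks like static field access
    assert(j.colour == "undefined"); // typo compiles, fails only at run time
}
```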
Aug 03 2014
On 03.08.2014 17:14, Andrei Alexandrescu wrote:On 8/3/14, 2:38 AM, Sönke Ludwig wrote:Ah okay, *phew* ;) But in that case I'd actually think about leaving off the backslash decoding in the low level parser, so that slices could be used for immutable inputs in all cases - maybe with a name of "rawString" for the stored data and an additional "string" property that decodes on the fly. This may come in handy when the first comparative benchmarks together with rapidjson and the like are done.A few thoughts based on my experience with vibe.data.json: 1. No decoding of strings appears to mean that "Value" also always contains encoded strings. This seems to be a leaky and also error-prone abstraction. For the token stream, performance should be top priority, so it's okay to not decode there, but "Value" is a high level abstraction of a JSON value, so it should really hide all implementation details of the storage format.Nonono. I think there's a confusion. The input strings are not UTF decoded for the simple reason that there's no need (all tokenization decisions are taken on the basis of ASCII characters/code units). The backslash-prefixed characters are indeed decoded. An optimization I didn't implement yet is to use slices of the input wherever possible (when the input is string, immutable(byte)[], or immutable(ubyte)[]). That will reduce allocations considerably.I see. Suppose that opDispatch would be dropped, would anything speak against "alias this"ing _payload to avoid the need for the manually defined operators?2. Algebraic is a good choice for its generic handling of operations on the contained types (which isn't exposed here, though). However, a tagged union type in my experience has quite some advantages for usability. Since adding a type tag possibly affects the interface in a non-backwards compatible way, this should be evaluated early on.There's a public opCast(Payload) that gives the end user access to the Payload inside a Value. 
I forgot to add documentation to it.What advantages are to a tagged union? (FWIW: to me Algebraic and Variant are also tagged unions, just that the tags are not 0, 1, ..., n. That can be easily fixed for Algebraic by defining operations to access the index of the currently-stored type.)The two major points are probably that it's possible to use "final switch" on the type tag if it's an enum, and the type id can be easily stored in both integer and string form (which is not as conveniently possible with a TypeInfo).(...) The way I see it, good work on tagged unions must be either integrated within std.variant (either by modifying Variant/Algebraic or by adding new types to it). I am very strongly opposed to adding a tagged union type only for JSON purposes, which I'd consider essentially a usability bug in std.variant, the opposite of dogfooding, etc.Definitely agree there. An enum based tagged union design also currently has the unfortunate property that the order of enum values and that of the accepted types must be defined consistently, or bad things will happen. Supporting UDAs on enum values would be a possible direction to fix this: enum JsonType { variantType!string string, variantType!(JsonValue[]) array, variantType!(JsonValue[string]) object } alias JsonValue = TaggedUnion!JsonType; But then there are obviously still issues with cyclic type references. So, anyway, this is something that still requires some thought. It could also be designed in a way that is backwards compatible with a pure "Algebraic", so it shouldn't be a blocker for the current design.
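As a rough illustration of the "final switch" point, here is a sketch of a hand-written tagged union with an explicit enum tag (all names hypothetical, ordering and safety concerns ignored): the compiler then verifies exhaustiveness, which a function-pointer discriminator cannot offer.

```d
// Hypothetical tagged union with an explicit enum tag.
enum JsonType { null_, boolean, number, text, array, object }

struct JsonValue
{
    JsonType type;
    union
    {
        bool boolean;
        double number;
        string text;
        JsonValue[] array;
        JsonValue[string] object;
    }
}

string typeName(JsonValue v)
{
    // final switch: adding a JsonType member becomes a compile-time error
    // here until every use site is updated.
    final switch (v.type)
    {
        case JsonType.null_:   return "null";
        case JsonType.boolean: return "boolean";
        case JsonType.number:  return "number";
        case JsonType.text:    return "string";
        case JsonType.array:   return "array";
        case JsonType.object:  return "object";
    }
}

unittest
{
    JsonValue v;
    v.type = JsonType.number;
    v.number = 3.14;
    assert(typeName(v) == "number");
}
```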
Aug 03 2014
On 8/3/14, 11:03 AM, Sönke Ludwig wrote:On 03.08.2014 17:14, Andrei Alexandrescu wrote:[snip]Ah okay, *phew* ;) But in that case I'd actually think about leaving off the backslash decoding in the low level parser, so that slices could be used for immutable inputs in all cases - maybe with a name of "rawString" for the stored data and an additional "string" property that decodes on the fly. This may come in handy when the first comparative benchmarks together with rapidjson and the like are done.Yah, that's awesome.Correct. In fact the conversion was there but I removed it for the sake of opDispatch.There's a public opCast(Payload) that gives the end user access to the Payload inside a Value. I forgot to add documentation to it.I see. Suppose that opDispatch would be dropped, would anything speak against "alias this"ing _payload to avoid the need for the manually defined operators?So I just tried this: http://dpaste.dzfl.pl/eeadac68fac0. Sadly, the cast doesn't take. Without the cast the enum does compile, but not the switch. I submitted https://issues.dlang.org/show_bug.cgi?id=13247.What advantages are there to a tagged union? (FWIW: to me Algebraic and Variant are also tagged unions, just that the tags are not 0, 1, ..., n. That can be easily fixed for Algebraic by defining operations to access the index of the currently-stored type.)The two major points are probably that it's possible to use "final switch" on the type tag if it's an enum, and the type id can be easily stored in both integer and string form (which is not as conveniently possible with a TypeInfo).I think here pointers to functions "win" because getting a string (or anything else for that matter) is an indirect call away. std.variant has been among the first artifacts I wrote for D. It's a topic I've been dabbling in for a long time in a C++ context (http://goo.gl/zqUwFx), with always almost-satisfactory results. 
I told myself if I get to implement things in D properly, then this language has good potential. Replacing the integral tag I'd always used with a pointer to function is, I think, net progress. Things turned out fine, save for the switch matter.An enum based tagged union design also currently has the unfortunate property that the order of enum values and that of the accepted types must be defined consistently, or bad things will happen. Supporting UDAs on enum values would be a possible direction to fix this: enum JsonType { variantType!string string, variantType!(JsonValue[]) array, variantType!(JsonValue[string]) object } alias JsonValue = TaggedUnion!JsonType; But then there are obviously still issues with cyclic type references. So, anyway, this is something that still requires some thought. It could also be designed in a way that is backwards compatible with a pure "Algebraic", so it shouldn't be a blocker for the current design.I think something can be designed along these lines if necessary. Andrei
Aug 03 2014
On Sunday, 3 August 2014 at 15:14:43 UTC, Andrei Alexandrescu wrote:On 8/3/14, 2:38 AM, Sönke Ludwig wrote:I suspect that depends on the circumstances. I've been using this style (with Adam's jsvar), and I find it quite nice for decomposing my TOML parse trees to Variant-like structures that go several levels deep. It makes reading (and, consequently, reasoning about) them much easier for me. That said, I think the ideal would be that nesting Variant[] should work predictably such that users can just write a one-line opDispatch if they want it to behave that way. -Wyatt3. Use of "opDispatch" for an open set of members has been criticized for vibe.data.json before and I agree with that criticism. The only advantage is saving a few keystrokes (json.key instead of json["key"]), but I came to the conclusion that the right approach to work with JSON values in D is to always directly deserialize when/if possible anyway, which mostly makes this is a moot point.Interesting. Well if experience with opDispatch is negative then it should probably not be used here, or only offered on an opt-in basis.
Aug 04 2014
On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. Following our email exchange I decided to work on this because (a) you mentioned more work is needed and your schedule was unclear, (b) we need this at FB sooner rather than later, (c) there were a few things I thought can be improved in vibe.data.json. I hope that taking std.jgrandson to proof spurs things into action. Would you want to merge some of std.jgrandson's deltas into a new proposal std.data.json based on vibe.data.json? Here's a few things that I consider necessary: 1. Commit to a schedule. I can't abandon stuff in wait for the perfect design that may or may not come someday. 2. Avoid UTF decoding. 3. Offer a lazy token stream as a basis for a non-lazy parser. A lazy general parser would be considerably more difficult to write and would only serve a small niche. On the other hand, a lazy tokenizer is easy to write and make efficient, and serve as a basis for user-defined specialized lazy parsers if the user wants so. 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same. 5. Build on std.variant through and through. Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. Exposing the representation such that user code benefits of the Algebraic's primitives may be desirable. 6. Address w0rp's issue with undefined. In fact std.Algebraic does have an uninitialized state :o). Sönke, what do you think? Andrei
Aug 03 2014
On Sun, 03 Aug 2014 08:34:20 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. [...] 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same.I think for the lowest level interface we could avoid allocation completely: The tokenizer could always return slices to the raw string, even if a string contains backslash-encoded sequences or if the token is a number. Simply expose that as token.rawValue. Then add a function, Token.decodeString() and token.decodeNumber() to actually decode the numbers. decodeString could additionally support decoding into a buffer. If the input is not sliceable, read the input into an internal buffer first and slice that buffer. The main use case for this is if you simply stream lots of data and you only want to parse very little of it and skip over most content. Then you don't need to decode the strings. This is also true if you only write a JSON formatter: No need to decode and encode the strings.5. Build on std.variant through and through. Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. Exposing the representation such that user code benefits from the Algebraic's primitives may be desirable.Variant uses TypeInfo internally, right? I think as long as it uses TypeInfo it can't replace all use-cases for a standard tagged union.
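A sketch of what such a token interface could look like (`Token`, `rawValue` and `decodeString` are hypothetical names from the suggestion above; \uXXXX handling and buffer-reuse are omitted):

```d
// Hypothetical token that keeps the raw input slice and decodes on demand.
struct Token
{
    string rawValue; // slice of the input, backslash sequences untouched

    string decodeString() const
    {
        import std.array : appender;

        auto app = appender!string();
        for (size_t i = 0; i < rawValue.length; ++i)
        {
            if (rawValue[i] == '\\' && i + 1 < rawValue.length)
            {
                ++i;
                switch (rawValue[i])
                {
                    case 'n':  app.put('\n'); break;
                    case 't':  app.put('\t'); break;
                    case '"':  app.put('"');  break;
                    case '\\': app.put('\\'); break;
                    default:   app.put(rawValue[i]); // \uXXXX etc. omitted
                }
            }
            else
                app.put(rawValue[i]);
        }
        return app.data;
    }
}

unittest
{
    auto t = Token(`hello\nworld`);
    assert(t.rawValue == `hello\nworld`);       // still the raw slice
    assert(t.decodeString() == "hello\nworld"); // decoded only on demand
}
```

A streaming consumer that merely skips over most content never calls decodeString, so it never allocates for those tokens.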
Aug 03 2014
On 8/3/14, 8:51 AM, Johannes Pfau wrote:On Sun, 03 Aug 2014 08:34:20 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:That works but not e.g. for File.byLine which reuses its internal buffer. But it's a neat idea for arrays of immutable bytes.On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. [...] 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same.I think for the lowest level interface we could avoid allocation completely: The tokenizer could always return slices to the raw string, even if a string contains backslash-encoded sequences or if the token is a number. Simply expose that as token.rawValue. Then add a function, Token.decodeString() and token.decodeNumber() to actually decode the numbers. decodeString could additionally support decoding into a buffer.If the input is not sliceable, read the input into an internal buffer first and slice that buffer.At that point the cost of decoding becomes negligible.The main use case for this is if you simply stream lots of data and you only want to parse very little of it and skip over most content. Then you don't need to decode the strings.Awesome.This is also true if you only write a JSON formatter: No need to decode and encode the strings.But wouldn't that still need to encode \n, \r, \t, \v?No. Andrei5. Build on std.variant through and through. Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. 
Exposing the representation such that user code benefits from the Algebraic's primitives may be desirable.Variant uses TypeInfo internally, right?
Aug 03 2014
Am Sun, 03 Aug 2014 09:17:57 -0700 schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:On 8/3/14, 8:51 AM, Johannes Pfau wrote:https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L210 https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L371 https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L696 Also the handler function concept will always have more overhead than a simple tagged union. It is certainly useful if you want to store any type, but if you only want a limited set of types there are more efficient implementations.Variant uses TypeInfo internally, right?No.
Aug 03 2014
On 8/3/14, 11:08 AM, Johannes Pfau wrote:Am Sun, 03 Aug 2014 09:17:57 -0700 schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:That's a query for the TypeInfo.On 8/3/14, 8:51 AM, Johannes Pfau wrote:https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L210Variant uses TypeInfo internally, right?No.https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L371That could be translated to a comparison of pointers to functions.https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L696That, too, could be translated to a comparison of pointers to functions. That's a confusion; let me clarify. What Variant does is to use pointers to functions instead of integers. The space overhead (one word) is generally the same due to alignment issues.Also the handler function concept will always have more overhead than a simple tagged union. It is certainly useful if you want to store any type, but if you only want a limited set of types there are more efficient implementations.I'm not sure at all actually. The way I see it a pointer to a function offers most everything an integer does, plus universal functionality by actually calling the function. What it doesn't offer is ordering of small integers, but that can be easily arranged at a small cost. Andrei
Aug 03 2014
Am 03.08.2014 17:34, schrieb Andrei Alexandrescu:On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. Following our email exchange I decided to work on this because (a) you mentioned more work is needed and your schedule was unclear, (b) we need this at FB sooner rather than later, (c) there were a few things I thought can be improved in vibe.data.json. I hope that taking std.jgrandson to proof spurs things into action. Would you want to merge some of std.jgrandson's deltas into a new proposal std.data.json based on vibe.data.json? Here's a few things that I consider necessary: 1. Commit to a schedule. I can't abandon stuff in wait for the perfect design that may or may not come someday.This may be the crux w.r.t. the vibe.data.json implementation. My schedule will be very crowded this month, so I could only really start to work on it beginning of September. But apart from the mentioned points, I think your implementation is already the closest thing to what I have in mind, so I'm all for going the clean slate route (I'll have to do a lot in terms of deprecation work in vibe.d anyway).2. Avoid UTF decoding. 3. Offer a lazy token stream as a basis for a non-lazy parser. A lazy general parser would be considerably more difficult to write and would only serve a small niche. On the other hand, a lazy tokenizer is easy to write and make efficient, and serve as a basis for user-defined specialized lazy parsers if the user wants so. 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same. 5. Build on std.variant through and through. 
Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. Exposing the representation such that user code benefits from Algebraic's primitives may be desirable. 6. Address w0rp's issue with undefined. In fact std.Algebraic does have an uninitialized state :o). Sönke, what do you think?My requirements would be the same, except for 6. The "undefined" state in the vibe.d version was necessary due to early API decisions and it's more or less a prominent part of it (specifically because the API was designed to behave similar to JavaScript). In hindsight, I'd definitely avoid that. However, I don't think its existence (also in the form of Algebraic.init) is an issue per se, as long as such values are properly handled when converting the runtime value back to a JSON string (i.e. skipped or treated as null values).
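As a rough illustration of point 5, a JSON value built directly on Algebraic could look like this (a sketch using std.variant's This placeholder for self-reference; the actual std.jgrandson payload differs in detail):

```d
import std.variant : Algebraic, This;

// One possible Algebraic-based JSON payload (an assumed definition, not
// the library's actual one). JSON null maps to typeof(null), so an
// explicit null state exists alongside the Algebraic's own .init state.
alias Json = Algebraic!(typeof(null), bool, double, string,
                        This[], This[string]);

unittest
{
    Json j = 3.14;
    assert(j.get!double == 3.14);

    j = "text";
    assert(j.get!string == "text");

    j = null; // explicit JSON null, distinct from an uninitialized Json
    assert(j.peek!(typeof(null)) !is null);
}
```

This is the crux of the undefined-vs-null discussion: Algebraic.init is a separate uninitialized state, and the serializer has to decide whether to skip such values or emit them as null.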
Aug 03 2014
On Sunday, 3 August 2014 at 18:37:48 UTC, Sönke Ludwig wrote:Am 03.08.2014 17:34, schrieb Andrei Alexandrescu:My issue with it is that if you ask for a key in an object which doesn't exist, you get an 'undefined' value back, just like JavaScript. I'd rather that be propagated as a RangeError, which is more consistent with associative arrays in the language and probably more correct. A minor issue is being able to create a Json object which isn't a valid Json object by itself. I'd rather the initial value was just 'null', which would match how pointers and class instances behave in the language.6. Address w0rp's issue with undefined. In fact std.Algebraic does have an uninitialized state :o).My requirements would be the same, except for 6. The "undefined" state in the vibe.d version was necessary due to early API decisions and it's more or less a prominent part of it (specifically because the API was designed to behave similar to JavaScript). In hindsight, I'd definitely avoid that. However, I don't think its existence (also in the form of Algebraic.init) is an issue per se, as long as such values are properly handled when converting the runtime value back to a JSON string (i.e. skipped or treated as null values).
Aug 03 2014
Am 03.08.2014 20:57, schrieb w0rp:On Sunday, 3 August 2014 at 18:37:48 UTC, Sönke Ludwig wrote:Yes, this is what I meant by the JavaScript part of the API. In addition to opIndex(), there should of course also be a .get(key, default_value) style accessor and the "in" operator.The "undefined" state in the vibe.d version was necessary due to early API decisions and it's more or less a prominent part of it (specifically because the API was designed to behave similar to JavaScript). In hindsight, I'd definitely avoid that. However, I don't think its existence (also in the form of Algebraic.init) is an issue per se, as long as such values are properly handled when converting the runtime value back to a JSON string (i.e. skipped or treated as null values).My issue with it is that if you ask for a key in an object which doesn't exist, you get an 'undefined' value back, just like JavaScript. I'd rather that be propagated as a RangeError, which is more consistent with associative arrays in the language and probably more correct.A minor issue is being able to create a Json object which isn't a valid Json object by itself. I'd rather the initial value was just 'null', which would match how pointers and class instances behave in the language.This is what I meant by it not being an issue by itself. But having such a special value of course has its pros and cons, and I could personally definitely also live with JSON values being initialized to JSON "null", if somebody hacks Algebraic to support that kind of use case.
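The three lookup styles mentioned (a throwing opIndex, a .get(key, default_value) accessor, and the "in" operator) can be sketched on a toy object type; JsonObject and its string-only payload are invented for illustration:

```d
import std.exception : enforce;

// Toy object wrapper demonstrating AA-consistent lookup semantics.
// A real JSON type would hold full JSON values, not just strings.
struct JsonObject
{
    string[string] fields;

    // AA-style: a missing key is an error, not a silent 'undefined'.
    string opIndex(string key)
    {
        auto p = key in fields;
        enforce(p !is null, "no such key: " ~ key);
        return *p;
    }

    // Explicit fallback, mirroring built-in AAs' get().
    string get(string key, string defaultValue)
    {
        auto p = key in fields;
        return p ? *p : defaultValue;
    }

    // Presence probe via the "in" operator.
    bool opBinaryRight(string op : "in")(string key)
    {
        return (key in fields) !is null;
    }
}

unittest
{
    auto o = JsonObject(["a": "1"]);
    assert(o["a"] == "1");
    assert(o.get("b", "fallback") == "fallback");
    assert("a" in o && "b" !in o);
}
```

(The sketch throws an Exception via enforce; w0rp's suggestion of RangeError would make opIndex match built-in associative arrays even more closely.)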
Aug 03 2014
On Sunday, 3 August 2014 at 19:54:12 UTC, Sönke Ludwig wrote:Am 03.08.2014 20:57, schrieb w0rp:There is a parallel discussion about the concept of associative ranges: http://forum.dlang.org/thread/jheurakujksdlrjaoncs forum.dlang.org Maybe you could also have a look there, because JSON seems to be a good candidate for an associative range.My issue with is is that if you ask for a key in an object which doesn't exist, you get an 'undefined' value back, just like JavaScript. I'd rather that be propagated as a RangeError, which is more consistent with associative arrays in the language and probably more correct.Yes, this is what I meant with the JavaScript part of API. In addition to opIndex(), there should of course also be a .get(key, default_value) style accessor and the "in" operator.
Aug 04 2014
On 8/3/14, 11:37 AM, Sönke Ludwig wrote:Am 03.08.2014 17:34, schrieb Andrei Alexandrescu:What would be your estimated time of finishing? Would anyone want to take vibe.data.json and std.jgrandson, put them in a crucible, and have std.data.json emerge from it in a timely manner? My understanding is that everyone involved would be cool with that. AndreiOn 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. Following our email exchange I decided to work on this because (a) you mentioned more work is needed and your schedule was unclear, (b) we need this at FB sooner rather than later, (c) there were a few things I thought can be improved in vibe.data.json. I hope that taking std.jgrandson to proof spurs things into action. Would you want to merge some of std.jgrandson's deltas into a new proposal std.data.json based on vibe.data.json? Here's a few things that I consider necessary: 1. Commit to a schedule. I can't abandon stuff in wait for the perfect design that may or may not come someday.This may be the crux w.r.t. the vibe.data.json implementation. My schedule will be very crowded this month, so I could only really start to work on it beginning of September. But apart from the mentioned points, I think your implementation is already the closest thing to what I have in mind, so I'm all for going the clean slate route (I'll have to do a lot in terms of deprecation work in vibe.d anyway).
Aug 03 2014
Am 03.08.2014 21:53, schrieb Andrei Alexandrescu:What would be your estimated time of finishing?My rough estimate would be that about two weeks of calendar time should suffice for a first candidate, since the functionality and the design are already mostly there. However, it seems that VariantN will need some work, too (currently using opAdd results in an error for an Algebraic defined for JSON usage).
Aug 05 2014
On Sunday, 3 August 2014 at 07:16:05 UTC, Andrei Alexandrescu wrote:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Here are a few differences compared to vibe.d's library. I think these are desirable to have in that library as well: * Parsing strings is decoupled into tokenization (which is lazy and only needs an input range) and parsing proper. Tokenization is lazy, which allows users to create their own advanced (e.g. partial/lazy) parsing if needed. The parser itself is eager. * There's no decoding of strings. * The representation is built on Algebraic, with the advantages that it benefits from all of its primitives. Implementation is also very compact because Algebraic obviates a bunch of boilerplate. Subsequent improvements to Algebraic will also reflect themselves into improvements to std.jgrandson. * The JSON value (called std.jgrandson.Value) has no named member variables or methods except for __payload. This is so there's no clash between dynamic properties exposed via opDispatch. Well that's about it. What would it take for this to become a Phobos proposal? Destroy. AndreiI like it. Here's what I think about it. * When I wrote my JSON library, the thing I wanted most was constructors and opAssign functions for creating JSON values easily. JSON x = "some string"; You have this, so it's great. * You didn't include an 'undefined' value like vibe.d, which is a very minor detail, but something I dislike. This is good. * I'd just name Value either 'JSON' or 'JSONValue.' So you can just import the module without using aliases. * opDispatch is kind of "meh" for JSON objects. It works until you hit a name clash with a UFCS function. I don't mind typing the extra three characters. 
That's all I could think of really.
Aug 03 2014
Am 03.08.2014 09:16, schrieb Andrei Alexandrescu:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.htmlIs the name supposed to stay or just a working title? "std.j*grandson*" (being the successor of "std.j*son*") is of course a funny play on words, but it's not really obvious at first sight what it does. i.e. if someone skims the std. modules in the documentation, looking for json, he'd probably not think that this is the new json module. std.json2 or something like that would be more obvious. Cheers, Daniel
Aug 03 2014
On 8/3/14, 9:49 AM, Daniel Gibson wrote:Am 03.08.2014 09:16, schrieb Andrei Alexandrescu:Just a working title, but of course if it were wildly successful... but then again it's not. -- AndreiWe need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.htmlIs the name supposed to stay or just a working title?
Aug 03 2014
I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input. So the lowest layer should allow me to iterate across symbols in some way. When I've done this in the past it was SAX-style (ie. a callback per type) but with the range interface that shouldn't be necessary. The parser shouldn't decode or convert anything unless I ask it to. Most of the time I only care about specific values, and paying for conversions on everything is wasted process time. I suggest splitting number into float and integer types. In a language like D where these are distinct internal types, it can be valuable to know this up front. Is there support for output? I see the makeArray and makeObject routines... Ideally, there should be a way to serialize JSON against an OutputRange with optional formatting.
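Serializing against an OutputRange, as requested above, might look roughly like this (writeJsonString is a hypothetical name; only a handful of escapes are handled to keep the sketch short):

```d
import std.array : appender;

// Write a JSON string literal to any OutputRange, escaping as we go.
// No intermediate string is allocated by this function itself.
void writeJsonString(Out)(ref Out sink, string s)
{
    sink.put('"');
    foreach (ch; s)
    {
        switch (ch)
        {
            case '"':  sink.put(`\"`); break;
            case '\\': sink.put(`\\`); break;
            case '\n': sink.put(`\n`); break;
            case '\t': sink.put(`\t`); break;
            default:   sink.put(ch);
        }
    }
    sink.put('"');
}

unittest
{
    auto sink = appender!string();
    sink.writeJsonString("a\"b\nc");
    assert(sink.data == `"a\"b\nc"`);
}
```

A full implementation would also cover \b, \f, \r and \uXXXX control-character escapes; the OutputRange parameter is what lets the caller choose between a growing buffer, a file, or a socket.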
Aug 03 2014
On 8/3/14, 10:19 AM, Sean Kelly wrote:I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input.What to do about arrays and objects, which would naturally allocate arrays and associative arrays respectively? What about strings with backslash-encoded characters? No allocation works for tokenization, but parsing is a whole different matter.So the lowest layer should allow me to iterate across symbols in some way.Yah, that would be the tokenizer.When I've done this in the past it was SAX-style (ie. a callback per type) but with the range interface that shouldn't be necessary. The parser shouldn't decode or convert anything unless I ask it to. Most of the time I only care about specific values, and paying for conversions on everything is wasted process time.That's tricky. Once you scan for 2 specific characters you may as well scan for a couple more, the added cost is negligible. In contrast, scanning once for finding termination and then again for decoding purposes will definitely be a lot more expensive.I suggest splitting number into float and integer types. In a language like D where these are distinct internal types, it can be valuable to know this up front.Yah, that kept on sticking like a sore thumb throughout.Is there support for output? I see the makeArray and makeObject routines... Ideally, there should be a way to serialize JSON against an OutputRange with optional formatting.Not yet, and yah those should be in. Andrei
Aug 03 2014
03-Aug-2014 21:40, Andrei Alexandrescu пишет:On 8/3/14, 10:19 AM, Sean Kelly wrote:SAX-style would imply that array is "parsed" by calling 6 user-defined callbacks inside of a parser: startArray, endArray, startObject, endObject, id and value. A simplified pseudo-code of JSON-parser inner loop is then: if(cur == '[') startArray(); else if(cur == '{'){ startObject(); else if(cur == '}') endObject(); else if(cur == ']') endArray(); else{ if(expectObjectKey){ id(parseAsIdentifier()); } else value(parseAsValue()); } This is as barebones as it can get and is very fast in practice esp. in context of searching/extracting/matching specific sub-tries of JSON documents. -- Dmitry OlshanskyI don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input.What to do about arrays and objects, which would naturally allocate arrays and associative arrays respectively? What about strings with backslash-encoded characters?
Aug 03 2014
03-Aug-2014 23:54, Dmitry Olshansky пишет:03-Aug-2014 21:40, Andrei Alexandrescu пишет: A simplified pseudo-code of JSON-parser inner loop is then: if(cur == '[') startArray(); else if(cur == '{'){Aw. Stray brace.. -- Dmitry Olshansky
Aug 03 2014
On Sunday, 3 August 2014 at 17:40:48 UTC, Andrei Alexandrescu wrote:On 8/3/14, 10:19 AM, Sean Kelly wrote:This is tricky with a range. With an event-based parser I'd have events for object and array begin / end, but with a range you end up having an element that's a token, which is pretty weird. For encoded characters (and you need to make sure you handle surrogate pairs in your decoder) I'd still provide some means of decoding on demand. If nothing else, decode lazily when the user asks for the string value. That way the user isn't paying to decode strings he isn't interested in.I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input.What to do about arrays and objects, which would naturally allocate arrays and associative arrays respectively? What about strings with backslash-encoded characters?No allocation works for tokenization, but parsing is a whole different matter.But that will halt on comma and colon and such, correct? That's a tad lower than I'd want, though I guess it would be easy enough to build a parser on top of it.So the lowest layer should allow me to iterate across symbols in some way.Yah, that would be the tokenizer.I think I'm getting a bit confused. For the JSON parser I wrote, the parser performs full validation but leaves the content as-is, then provides a routine to decode values from their string representation if the user wishes to. I'm not sure where scanning figures in here.When I've done this in the past it was SAX-style (ie. a callback per type) but with the range interface that shouldn't be necessary. The parser shouldn't decode or convert anything unless I ask it to. Most of the time I only care about specific values, and paying for conversions on everything is wasted process time.That's tricky. Once you scan for 2 specific characters you may as well scan for a couple more, the added cost is negligible. 
In contrast, scanning once for finding termination and then again for decoding purposes will definitely be a lot more expensive.Andrei
Aug 03 2014
On Sunday, 3 August 2014 at 20:40:47 UTC, Sean Kelly wrote:This is tricky with a range. With an event-based parser I'd have events for object and array begin / end, but with a range you end up having an element that's a token, which is pretty weird.Have a look at Token.Kind at the top of the module [1]. The enum has objectStart, objectEnd, arrayStart and arrayEnd. By just looking at that, it seems it already works very similar to an event parser, but with a range API. This is exactly like the XML pull parser in Tango. [1] http://erdani.com/d/jgrandson.d -- /Jacob Carlborg
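Consuming such an objectStart/objectEnd token range pull-style, without building a document tree, could look like this (Kind and Token here are simplified stand-ins for the module's actual types):

```d
// Simplified stand-ins for the tokenizer's output.
enum Kind { objectStart, objectEnd, arrayStart, arrayEnd, str, number }

struct Token { Kind kind; string text; }

// Compute the maximum nesting depth by walking the token stream,
// never materializing arrays or associative arrays.
size_t maxDepth(Token[] tokens)
{
    size_t depth, maxSeen;
    foreach (t; tokens)
    {
        final switch (t.kind)
        {
            case Kind.objectStart, Kind.arrayStart:
                if (++depth > maxSeen) maxSeen = depth;
                break;
            case Kind.objectEnd, Kind.arrayEnd:
                --depth;
                break;
            case Kind.str, Kind.number:
                break;
        }
    }
    return maxSeen;
}

unittest
{
    // { "a": [1] } as a token stream
    auto toks = [Token(Kind.objectStart), Token(Kind.str, "a"),
                 Token(Kind.arrayStart), Token(Kind.number, "1"),
                 Token(Kind.arrayEnd), Token(Kind.objectEnd)];
    assert(maxDepth(toks) == 2);
}
```

This is the pull-parser shape Jacob is describing: the begin/end tokens carry the same information as SAX callbacks, but the consumer drives the iteration.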
Aug 04 2014
On Sunday, 3 August 2014 at 17:19:04 UTC, Sean Kelly wrote:Is there support for output? I see the makeArray and makeObject routines... Ideally, there should be a way to serialize JSON against an OutputRange with optional formatting.I think it should only provide very primitive functions to serialize basic data types. Then Phobos should provide a separate module/package for generic serialization where JSON is an archive type using this module as its backend. -- /Jacob Carlborg
Aug 04 2014
On 8/3/2014 2:16 AM, Andrei Alexandrescu wrote:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Here are a few differences compared to vibe.d's library. I think these are desirable to have in that library as well: * Parsing strings is decoupled into tokenization (which is lazy and only needs an input range) and parsing proper. Tokenization is lazy, which allows users to create their own advanced (e.g. partial/lazy) parsing if needed. The parser itself is eager. * There's no decoding of strings. * The representation is built on Algebraic, with the advantages that it benefits from all of its primitives. Implementation is also very compact because Algebraic obviates a bunch of boilerplate. Subsequent improvements to Algebraic will also reflect themselves into improvements to std.jgrandson. * The JSON value (called std.jgrandson.Value) has no named member variables or methods except for __payload. This is so there's no clash between dynamic properties exposed via opDispatch. Well that's about it. What would it take for this to become a Phobos proposal? Destroy. AndreiIf you're looking for serialization from statically known type layouts, then I believe my JSON (de)serialization code (https://github.com/Orvid/JSONSerialization) might actually be of interest to you, as it uses no intermediate representation, nor does it allocate when it converts an object to JSON. As far as I know, even when only compiled with DMD, it's among the fastest JSON (de)serialization libraries. Unless it needs to convert a floating point number to a string, in which case I suppose you could certainly use a local buffer to write to, but at the moment it just converts it to a normal string that gets written to the output range. 
It also supports (de)serializing what I called at the time "dynamic types". std.variant itself isn't actually supported, though; that code is only there because I needed it for something else and wasn't using std.variant at the time.
Aug 03 2014
On Sunday, 3 August 2014 at 07:16:05 UTC, Andrei Alexandrescu wrote:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Here are a few differences compared to vibe.d's library. I think these are desirable to have in that library as well: * Parsing strings is decoupled into tokenization (which is lazy and only needs an input range) and parsing proper. Tokenization is lazy, which allows users to create their own advanced (e.g. partial/lazy) parsing if needed. The parser itself is eager. * There's no decoding of strings. * The representation is built on Algebraic, with the advantages that it benefits from all of its primitives. Implementation is also very compact because Algebraic obviates a bunch of boilerplate. Subsequent improvements to Algebraic will also reflect themselves into improvements to std.jgrandson. * The JSON value (called std.jgrandson.Value) has no named member variables or methods except for __payload. This is so there's no clash between dynamic properties exposed via opDispatch. Well that's about it. What would it take for this to become a Phobos proposal? Destroy. AndreiOn my bson library I found very useful to have some methods to know if a field exists or not, and to get a "defaulted" value. Something like: auto assume(T)(Value v, T default = T.init); Another good method could be something like xpath to get a deep value: Value v = value["/path/to/sub/object"]; Moreover in my library I actually have three different methods to read a value: T get(T)() // Exception if value is not a T or not valid or value doesn't exist T to(T)() // Try to convert value to T using to!string. 
Exception if it doesn't exist or is not valid BsonField!T as(T)(lazy T default = T.init) // Always returns a value BsonField!T is an "alias this"-ed struct with two fields: T value and bool error(). T value is the aliased field, and error() tells you if value is defaulted (because of an error: the field doesn't exist or can't convert to T) So I can write something like this: int myvalue = json["/that/deep/property"].as!int; or auto myvalue = json["/that/deep/property"].as!int(10); if (myvalue.error) writeln("Property doesn't exist, I'm using default value"); writeln("Property value: ", myvalue); I hope this can be useful...
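The slash-path lookup described above can be sketched over a toy node type (Node and its string leaf are invented for illustration; the real library would walk its own JSON value type):

```d
import std.algorithm.iteration : splitter;

// Toy tree node supporting value["/path/to/sub/object"] style deep access.
struct Node
{
    string leaf;
    Node[string] children;

    Node opIndex(string path)
    {
        Node* cur = &this;
        foreach (part; path.splitter('/'))
        {
            if (part.length == 0) continue; // skip the leading '/'
            auto next = part in cur.children;
            assert(next !is null, "missing path segment: " ~ part);
            cur = next;
        }
        return *cur;
    }
}

unittest
{
    Node root;
    root.children["a"] = Node("", ["b": Node("hello", null)]);
    assert(root["/a/b"].leaf == "hello");
}
```

Splitting on '/' is what makes slash-containing keys problematic, which is exactly the trade-off Andrei raises in his reply: the chained value["path"]["to"]["sub"] form is more precise but creates more temporaries.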
Aug 04 2014
On 8/4/14, 12:47 AM, Andrea Fontana wrote:On my bson library I found very useful to have some methods to know if a field exists or not, and to get a "defaulted" value. Something like: auto assume(T)(Value v, T default = T.init);Nice. Probably "get" would be better, to be in keeping with built-in hashtables.Another good method could be something like xpath to get a deep value: Value v = value["/path/to/sub/object"];Cool. Is it unlikely that a value contains an actual slash? If so would be value["path"]["to"]["sub"]["object"] more precise?Moreover in my library I actually have three different methods to read a value: T get(T)() // Exception if value is not a T or not valid or value doesn't exist T to(T)() // Try to convert value to T using to!string. Exception if doesn't exists or not valid BsonField!T as(T)(lazy T default = T.init) // Always return a value BsonField!T is an "alias this"-ed struct with two fields: T value and bool error(). T value is the aliased field, and error() tells you if value is defaulted (because of an error: field not exists or can't convert to T) So I can write something like this: int myvalue = json["/that/deep/property"].as!int; or auto myvalue = json["/that/deep/property"].as!int(10); if (myvalue.error) writeln("Property doesn't exists, I'm using default value); writeln("Property value: ", myvalue); I hope this can be useful...Sure is, thanks. Listen, would you want to volunteer a std.data.json proposal? Andrei
Aug 04 2014
On Monday, 4 August 2014 at 16:58:12 UTC, Andrei Alexandrescu wrote:On 8/4/14, 12:47 AM, Andrea Fontana wrote:I wrote assume just to use proposed syntax :)On my bson library I found very useful to have some methods to know if a field exists or not, and to get a "defaulted" value. Something like: auto assume(T)(Value v, T default = T.init);Nice. Probably "get" would be better to be in keep with built-in hashtables.Keys with a slash (or dot?) inside are not common at all. I've never seen one in JSON data. In many languages there're libraries to bind json to structs or objects, so usually people don't use strange chars inside keys. If needed you can still use the good old method to read a single field. value["path"]["to"]["object"] was my first choice but I didn't like it. First: it creates a lot of temporary objects. Second: it is easier to implement using a single string (also on assignment) I gave it a try with value["path", "to", "index"] but it's not comfortable if you need to generate your path from code.Another good method could be something like xpath to get a deep value: Value v = value["/path/to/sub/object"];Cool. Is it unlikely that a value contains an actual slash? If so would be value["path"]["to"]["sub"]["object"] more precise?What does it mean? :)Moreover in my library I actually have three different methods to read a value: T get(T)() // Exception if value is not a T or not valid or value doesn't exist T to(T)() // Try to convert value to T using to!string. Exception if doesn't exists or not valid BsonField!T as(T)(lazy T default = T.init) // Always return a value BsonField!T is an "alias this"-ed struct with two fields: T value and bool error(). 
T value is the aliased field, and error() tells you if value is defaulted (because of an error: field not exists or can't convert to T) So I can write something like this: int myvalue = json["/that/deep/property"].as!int; or auto myvalue = json["/that/deep/property"].as!int(10); if (myvalue.error) writeln("Property doesn't exists, I'm using default value); writeln("Property value: ", myvalue); I hope this can be useful...Sure is, thanks. Listen, would you want to volunteer a std.data.json proposal?Andrei
Aug 05 2014
On 8/5/14, 2:08 AM, Andrea Fontana wrote:On one side enters vibe.data.json with the deltas prompted by std.jgrandson plus your talent and determination, and on the other side comes std.data.json with code and documentation that passes the Phobos review process. -- AndreiSure is, thanks. Listen, would you want to volunteer a std.data.json proposal?What does it mean? :)
Aug 05 2014
On Sunday, 3 August 2014 at 07:16:05 UTC, Andrei Alexandrescu wrote:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html* Could you please put it on Github to get syntax highlighting and all the other advantages * It doesn't completely follow the Phobos naming conventions * The indentation is off in some places * The unit tests are a bit lacking for the separate parsing functions * There are methods for getting the strings and numbers, what about booleans? * Shouldn't it be called TokenRange? * Shouldn't this be built using the lexer generator you so strongly have been pushing for? * The unit tests for TokenStream are very dense. I would prefer empty newlines for grouping "assert" and calls to "popFront" belonging together -- /Jacob Carlborg
Aug 04 2014
On 8/4/14, 12:56 AM, Jacob Carlborg wrote:On Sunday, 3 August 2014 at 07:16:05 UTC, Andrei Alexandrescu wrote:Thanks for your comments! A few responses within:We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html* Could you please put it on Github to get syntax highlighting and all the other advantagesQuick workaround: http://dpaste.dzfl.pl/65f4dcc36ab8* It doesn't completely follow the Phobos naming conventionsWhat would be the places?* The indentation is off in some placesXamarin/Mono-D is at fault here :o).* The unit tests is a bit lacking for the separate parsing functionsAgreed.* There are methods for getting the strings and numbers, what about booleans?You mean for Token? Good point. Numbers and strings are somewhat special because they have a payload associated. In contrast Booleans are represented by two distinct tokens. Would be good to add a convenience method.* Shouldn't it be called TokenRange?Yah.* Shouldn't this be built using the lexer generator you so strongly have been pushing for?Of course, and in the beginning I've actually pasted some code from it. Then I regressed to minimizing dependencies.* The unit tests for TokenStream is very dense. I would prefer empty newlines for grouping "assert" and calls to "popFront" belonging togetherDe gustibus et the coloribus non est disputandum :o). Andrei
Aug 04 2014
On 2014-08-04 18:55, Andrei Alexandrescu wrote:What would be the places?That's why it's easier with Github ;) I can comment directly on a line. I just had a quick look but "_true", "_false" and "_null" in Token.Kind. If I recall correctly we add an underscore as a suffix for symbols with the same name as keywords.You mean for Token? Good point.Yes, in Token.Numbers and strings are somewhat special because they have a payload associated. In contrast Booleans are represented by two distinct tokens. Would be good to add a convenience method.Right.De gustibus et the coloribus non est disputandum :o).Please avoid these Latin sentences, I have no idea what they mean. This is an international community, please don't make it more complicated than it already is with language barriers. -- /Jacob Carlborg
Aug 04 2014
On 8/4/14, 11:46 AM, Jacob Carlborg wrote:"Favorite foods and colors are not to be disputed." 51,300 results on google... and please let's end this before it becomes another Epic Debate. -- AndreiDe gustibus et the coloribus non est disputandum :o).Please avoid these Latin sentences, I have no idea what they mean. This is an international community, please don't make it more complicated than it already is with language barriers.
Aug 04 2014