digitalmars.D - std.json API improvement - Request for code review
- Brian Schott (21/21) Sep 12 2010 Now that a few bugs are fixed in DMD (notably 4826), my improvements to
- sybrandy (19/40) Sep 12 2010 Everything I've seen looks good to me, though I haven't tried to execute...
- Brian Schott (27/47) Sep 12 2010 Everything is a JSONValue. JSONValue has a union inside it that actually
- sybrandy (7/33) Sep 13 2010 Cool. It looks very simple and easy...just the way I like it. What you...
- Robert Jacques (15/53) Sep 13 2010 Unfortunately, the above code is horribly broken. Here's how to read a
- sybrandy (8/22) Sep 14 2010 Understood. Even though this probably isn't the way it will end up
- Robert Jacques (5/29) Sep 14 2010 And the really awesome thing about D is that it's trivial to support both...
- Robert Jacques (30/51) Sep 12 2010 Hi Brian,
- Brian Schott (8/40) Sep 12 2010 I just found the phobos mailing list. Why is it on a completely
- Robert Jacques (22/30) Sep 12 2010 No clue. I've only just joined the list recently myself.
- Sean Kelly (2/35) Sep 13 2010 Could all this sit atop a SAX-style API? I'm not likely to ever use an ...
- Robert Jacques (22/61) Sep 13 2010 Well, writing data could be done using output ranges easily enough, so n...
- Sean Kelly (11/43) Sep 14 2010 Random access is definitely nice, it's more the performance cost of all ...
- Robert Jacques (4/68) Sep 14 2010 This seems pretty straight forward. Could you list the advanced-mode
Now that a few bugs are fixed in DMD (notably 4826), my improvements to std.json compile. My primary purpose in this code change is streamlining the process of creating JSON documents. You can now do stuff like this:

    auto json = JSONValue();
    json["numbers"] = [1, 3, 5, 7];
    json["nullValue"] = null;
    json["something"] = false;
    json["vector"] = ["x": 234.231, "y" : 12.8, "z" : 35.0];
    assert("vector" in json);
    json["vector"]["x"] = 42.8;

More example usage: http://www.hackerpilot.org/src/phobos/jsontest.d

The implementation of the actual JSON data structure and parsing / writing is unchanged. Ddoc comments were added so that the documentation page for the module won't be quite so empty.

Implementation: http://www.hackerpilot.org/src/phobos/json.d

I'd like to get this committed back to Phobos if there's a consensus that these changes make sense. Comments welcome.

(Note: You'll need a DMD version built from SVN to use this)

- Brian
Sep 12 2010
On 09/12/2010 05:05 AM, Brian Schott wrote:
> Now that a few bugs are fixed in DMD (notably 4826), my improvements to std.json compile. My primary purpose in this code change is streamlining the process of creating JSON documents. [...]

Everything I've seen looks good to me, though I haven't tried to execute it. The fact that I can directly manipulate a JSONValue looks good to me. However, here's a question: if I have the following JSON document and parse it using parseJSON, will obj1 and obj2 both be JSONValues? My current project deals with this type of situation on a regular basis, so I'm curious about how easy it will be to access the data.

Casey

    {
        "obj1": {
            "obj2": {
                "val1": 1,
                "val2": "a string"
            },
            "val3": [1, 2, 3, 4]
        }
    }
Sep 12 2010
Everything is a JSONValue. JSONValue has a union inside it that actually holds the data. I just wrote a short program that reads your example file. (This could be a bit more efficient if I stored obj2, but I think it's enough to communicate the idea.)

    import std.stdio;
    import std.json;
    import std.file;

    void main(string[] args)
    {
        auto jsonString = readText("../ml.json");
        JSONValue json = parseJSON(jsonString);
        writeln(json["obj1"]["obj2"]["val1"].integer);
        writeln(json["obj1"]["obj2"]["val2"].str);
        foreach(value; json["obj1"]["val3"].array)
            writeln(value.integer);
    }

Output:

    1
    a string
    1
    2
    3
    4

Your question did remind me to document the union members so that the HTML documentation will show how to access the actual data. I've uploaded the new version of the file. The link is the same.

On 09/12/2010 05:19 PM, sybrandy wrote:
> Everything I've seen looks good to me, though I haven't tried to execute it. The fact that I can directly manipulate a JSONValue looks good to me. However, here's a question: if I have the following JSON document and parse it using parseJSON, will obj1 and obj2 both be JSONValues? [...]
Sep 12 2010
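The remark about storing obj2 can be made concrete with a small sketch. It assumes a current Phobos, where the type enum is spelled JSONType (the thread predates that name and uses JSON_TYPE), and it parses the sample document from an inline string instead of the file used above.

```d
import std.json;

void main()
{
    // Same document as in the question, inlined so the example is self-contained.
    auto json = parseJSON(
        `{"obj1": {"obj2": {"val1": 1, "val2": "a string"}, "val3": [1, 2, 3, 4]}}`);

    // Every node is a JSONValue; obj2 is looked up once and reused.
    JSONValue obj2 = json["obj1"]["obj2"];

    assert(json["obj1"].type == JSONType.object);
    assert(obj2.type == JSONType.object);
    assert(obj2["val1"].integer == 1);
    assert(obj2["val2"].str == "a string");
    assert(json["obj1"]["val3"].array.length == 4);
}
```

Keeping the intermediate value avoids repeating the two key lookups for each field that is read.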
On 09/12/2010 10:53 PM, Brian Schott wrote:
> Everything is a JSONValue. JSONValue has a union inside it that actually holds the data. I just wrote a short program that reads your example file. [...]

Cool. It looks very simple and easy...just the way I like it. What you have is actually quite nice as I can navigate down to a low-level element without having to store 200 different intermediate values. Probably not very common, but a nicety for when it's needed.

Thanks!

Casey
Sep 13 2010
On Mon, 13 Sep 2010 18:59:57 -0400, sybrandy <sybrandy gmail.com> wrote:
> On 09/12/2010 10:53 PM, Brian Schott wrote:
>> Everything is a JSONValue. JSONValue has a union inside it that actually holds the data. [...]
>>
>>     writeln(json["obj1"]["obj2"]["val1"].integer);
>>     writeln(json["obj1"]["obj2"]["val2"].str);
>>     foreach(value; json["obj1"]["val3"].array)
>>         writeln(value.integer);
>
> Cool. It looks very simple and easy...just the way I like it. [...]

Unfortunately, the above code is horribly broken. Here's how to read a number correctly:

    real x;
    if(json["vector"]["x"].type == JSON_TYPE.INTEGER) {
        x = json["vector"]["x"].integer;
    } else if(json["vector"]["x"].type == JSON_TYPE.FLOAT) {
        x = json["vector"]["x"].floating;
    } else {
        enforceEx!(JSONException)(false);
    }

You'll notice that before any access you must check to ensure the JSON type is what you think it should be. As noted above, JSON does not differentiate between integers and reals, so you have to test both on access.
Sep 13 2010
> Unfortunately, the above code is horribly broken. Here's how to read a number correctly:
>
>     real x;
>     if(json["vector"]["x"].type == JSON_TYPE.INTEGER) {
>         x = json["vector"]["x"].integer;
>     } else if(json["vector"]["x"].type == JSON_TYPE.FLOAT) {
>         x = json["vector"]["x"].floating;
>     } else {
>         enforceEx!(JSONException)(false);
>     }
>
> You'll notice that before any access you must check to ensure the JSON type is what you think it should be. As noted above, JSON does not differentiate between integers and reals, so you have to test both on access.

Understood. Even though this probably isn't the way it will end up based on some previous discussion, I like the way the indices are used to access the elements. Perhaps this means I'm more of a C guy than a D guy in some respects. Definitely not a Java guy.

As for the integers vs. floats, does the API always treat a number as a float even if it is an integer? If so, then checking for an integer vs. a float may not be a big deal in many cases.

Casey
Sep 14 2010
On Tue, 14 Sep 2010 17:48:25 -0400, sybrandy <sybrandy gmail.com> wrote:
>> Unfortunately, the above code is horribly broken. Here's how to read a number correctly: [...]
>
> Understood. Even though this probably isn't the way it will end up based on some previous discussion, I like the way the indices are used to access the elements. Perhaps this means I'm more of a C guy than a D guy in some respects. Definitely not a Java guy.

And the really awesome thing about D is that it's trivial to support both json.vector.x and json["vector"]["x"] syntaxes.

> As for the integers vs. floats, does the API always treat a number as a float even if it is an integer? If so, then checking for an integer vs. a float may not be a big deal in many cases.
>
> Casey

Nope. The current std.json dynamically checks if a number is an integer or real and stores the data accordingly. So you'd have to do the checks.
Sep 14 2010
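One way the two access syntaxes mentioned above could coexist is a thin wrapper that forwards unknown member names to the index operator through opDispatch. This is only a sketch: the wrapper type Json and its field name are hypothetical, it is not how std.json or Robert's library is written, and it covers reading only.

```d
import std.json;

// Hypothetical wrapper; the name Json is an assumption made for illustration.
struct Json
{
    JSONValue value;

    // json["vector"] style access
    Json opIndex(string key)
    {
        return Json(value[key]);
    }

    // json.vector style access: any unknown member name is forwarded to opIndex
    @property Json opDispatch(string key)()
    {
        return this[key];
    }
}

void main()
{
    auto json = Json(parseJSON(`{"vector": {"x": 42.8}}`));

    // Both spellings reach the same element.
    assert(json.vector.x.value.floating == json["vector"]["x"].value.floating);
}
```

A key that collides with a real member of the wrapper (here, value) would still have to be read with the index syntax.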
On Sun, 12 Sep 2010 05:05:09 -0400, Brian Schott <brian-schott cox.net> wrote:
> Now that a few bugs are fixed in DMD (notably 4826), my improvements to std.json compile. My primary purpose in this code change is streamlining the process of creating JSON documents. [...]

Hi Brian,

This really belongs on the phobos mailing list as JSON isn't ready for public consumption yet (as far as I know). I would suspect that it even has a decent chance of being dropped in favor of serialization + variant.

The implementation has several bugs. First, it doesn't parse Unicode escape sequences correctly (e.g. \u0026). Second, JSON has no integer type. Third, the serializer with certain JSON value inputs will write a JSON file that cannot be read by the parser. It's also missing some key features, like output range and human readable output support.

The design is very C-ish as opposed to D-ish: it's composed of a bunch of free functions / types all containing JSON in their name (e.g. parseJSON). These should all be encapsulated as member functions.

Getting more to the API itself, the reading of a JSON value is a use case that just isn't considered currently. Consider:

    // It's relatively simple to write to a JSON value
    json["vector"]["x"] = 42.8;

    // But reading it...
    real x;
    if(json["vector"]["x"].type == JSON_TYPE.INTEGER) {
        x = json["vector"]["x"].integer;
    } else if(json["vector"]["x"].type == JSON_TYPE.FLOAT) {
        x = json["vector"]["x"].floating;
    } else {
        enforceEx!(JSONException)(false);
    }

By contrast, this is the API on my personal JSON library:

    json.vector.x = 42.8;
    auto x = json.vector.x.number;
Sep 12 2010
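The type check in the reading example can be folded into one helper, which gets close to the .number accessor shown for the personal library. The following is a minimal sketch, not that library's code: the free function number is hypothetical, and it assumes a current Phobos where the enum members are JSONType.integer, JSONType.uinteger and JSONType.float_ rather than the JSON_TYPE names used in this thread.

```d
import std.json;

// Hypothetical helper, not part of std.json: reads any JSON number as a real.
real number(JSONValue v)
{
    switch (v.type)
    {
        case JSONType.integer:  return v.integer;
        case JSONType.uinteger: return v.uinteger;
        case JSONType.float_:   return v.floating;
        default:
            throw new JSONException("expected a JSON number");
    }
}

void main()
{
    auto json = parseJSON(`{"vector": {"x": 42.8, "y": 12}}`);

    // UFCS lets the helper read like a member: json[...].number
    assert(json["vector"]["y"].number == 12);
    real x = json["vector"]["x"].number;
    assert(x > 42.7 && x < 42.9);
}
```

The caller no longer needs to know whether the parser stored a given number as an integer or as a floating-point value.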
I just found the phobos mailing list. Why is it on a completely different server?

Regarding the bugs, my intent was just to improve the usefulness of the existing implementation. I was not aware of any plans to drop this module. (There's no notice to this effect in the documentation the way there is with std.contracts)

What do you recommend for dealing with JSON files until this is sorted out?

On 09/12/2010 08:45 PM, Robert Jacques wrote:
> Hi Brian,
>
> This really belongs on the phobos mailing list as JSON isn't ready for public consumption yet (as far as I know). I would suspect that it even has a decent chance of being dropped in favor of serialization + variant.
>
> The implementation has several bugs. [...]
Sep 12 2010
On Mon, 13 Sep 2010 00:18:40 -0400, Brian Schott <brian-schott cox.net> wrote:
> I just found the phobos mailing list. Why is it on a completely different server?

No clue. I've only just joined the list recently myself.

> Regarding the bugs, my intent was just to improve the usefulness of the existing implementation. I was not aware of any plans to drop this module. (There's no notice to this effect in the documentation the way there is with std.contracts)

There's no plan that I know of regarding std.json. The code was literally taken from a pastebin by Jeremie Pelletier and hasn't been touched (or discussed) since. It was lurking around with its documentation unlinked, like std.perf, but it appears that this is no longer the case (though this wasn't mentioned in the change logs). However, there has been a bunch of talk regarding serialization and so std.json will need to change to accommodate this. And a JSON value is really a specialized version of variant, so if certain bugs/etc. were fixed in variant there'll be no need for a JSON value.

> What do you recommend for dealing with JSON files until this is sorted out?

There are two solutions, as I see it:

1) Move std.json (or an improved version) to user code land (i.e. scrapple) in the short term for people who need it. Long term, fix/improve std.variant and add a "std.serialize" module, probably based on orange.
2) Fix/improve std.json and decide later what to do with it when "std.serialize" arrives.

I've got my own Json library that I'm willing to share, but I need some basic serialization support and I know my compile-time only solution isn't going to be the final solution for phobos, so I've been reluctant to submit it as a replacement.
Sep 12 2010
Robert Jacques Wrote:
> Getting more to the API itself, the reading of a JSON value is a use case that just isn't considered currently. [...]
>
> By contrast, this is the API on my personal JSON library:
>
>     json.vector.x = 42.8;
>     auto x = json.vector.x.number;

Could all this sit atop a SAX-style API? I'm not likely to ever use an API that requires memory allocation for parsing or writing data.
Sep 13 2010
On Mon, 13 Sep 2010 14:30:10 -0400, Sean Kelly <sean invisibleduck.org> wrote:
> Robert Jacques Wrote:
>> By contrast, this is the API on my personal JSON library:
>>
>>     json.vector.x = 42.8;
>>     auto x = json.vector.x.number;
>
> Could all this sit atop a SAX-style API? I'm not likely to ever use an API that requires memory allocation for parsing or writing data.

Well, writing data could be done using output ranges easily enough, so no extra memory writing troubles there. As for parsing, the biggest cost with JSON is the fact that all strings can include escape chars, so things have to be copied instead of sliced. However, there's nothing preventing a SAX style implementation in the format itself. Except that JSON has less extra meta-data than XML, so SAX becomes less informative. Instead of:

    object start vector
    member x
    number 42.8
    object end vector

you have something like

    object start
    member vector
    object start
    member x
    number 42.8
    object end
    object end

For myself, the files are under a MB and random access makes everything much faster to program and debug.
Sep 13 2010
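The second event listing can be reproduced with a short walk over a parsed value. This sketch only illustrates the order of events; it runs over a tree that parseJSON has already built, so it is not the allocation-free parser being asked for. The function name emit and the event strings are assumptions taken from the listing, and JSONType is the enum spelling in current Phobos.

```d
import std.conv : to;
import std.json;

// Appends one line per event, using the wording of the listing above.
void emit(JSONValue v, ref string[] events)
{
    switch (v.type)
    {
        case JSONType.object:
            events ~= "object start";
            foreach (key, member; v.object)
            {
                // The member name arrives as its own event, not on "object start".
                events ~= "member " ~ key;
                emit(member, events);
            }
            events ~= "object end";
            break;
        case JSONType.float_:
            events ~= "number " ~ v.floating.to!string;
            break;
        case JSONType.integer:
            events ~= "number " ~ v.integer.to!string;
            break;
        default:
            // Other value kinds are reported with their serialized text.
            events ~= "value " ~ v.toString();
            break;
    }
}

void main()
{
    string[] events;
    emit(parseJSON(`{"vector": {"x": 42.8}}`), events);
    assert(events == ["object start", "member vector", "object start",
                      "member x", "number 42.8", "object end", "object end"]);
}
```

Because an object's name is delivered by the preceding member event, a consumer has to keep its own stack if it wants to know which object an "object end" closes.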
Robert Jacques Wrote:
> On Mon, 13 Sep 2010 14:30:10 -0400, Sean Kelly <sean invisibleduck.org> wrote:
>> Could all this sit atop a SAX-style API? I'm not likely to ever use an API that requires memory allocation for parsing or writing data.
>
> Well, writing data could be done using output ranges easily enough, so no extra memory writing troubles there. As for parsing, the biggest cost with JSON is the fact that all strings can include escape chars, so things have to be copied instead of sliced.

What I've always done is to not automatically unescape string data but rather provide a function for the user to do it so they can provide the buffer. Alternately, this behavior could be configurable. Escaping output should definitely be configurable though. In fact, I often don't even want numbers to be automatically converted from their string to real/int representation, since it's common for me to want to operate on the value as a string. So even here I like being given the original representation and calling to!int or whatever on my own.

> However, there's nothing preventing a SAX style implementation in the format itself. Except that JSON has less extra meta-data than XML, so SAX becomes less informative. [...]
>
> For myself, the files are under a MB and random access makes everything much faster to program and debug.

Random access is definitely nice; it's more the performance cost of all those allocations that's an issue for me. What I'm basically looking for is a set of events like this:

    alias void delegate(char[]) ParseEvent;
    ParseEvent onObjectEnter, onObjectKey, onObjectLeave;
    ParseEvent onArrayEnter, onArrayLeave;
    ParseEvent onStringValue, onFloatValue, onIntValue;
    ParseEvent onTrueValue, onFalseValue, onNullValue;

With corresponding write events on the output side so if I hooked the parser to the writer the data would all flow through and generate output identical to the input, formatting notwithstanding (though I'd add the option to write numbers as either a string, real, or int). I like the event parameter being a char[] because it allows me to unescape string data in place, etc.

There are some advanced-mode writer options I'd like as well, like the ability to dump a char[] blob directly into the destination string without translation, saving and restoring writer state, etc.

I don't know if anyone besides myself would find all this useful though. These are just some things I've found necessary for the work I do.
Sep 14 2010
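Unescaping in place, as described above, works because unescaped JSON text is never longer than its escaped form, so the result fits in the buffer the event handler was given. The following is a rough sketch under that observation; the function unescapeInPlace is hypothetical, handles only the single-character escapes, and deliberately leaves \uXXXX sequences untouched.

```d
// Hypothetical helper: rewrites simple JSON escapes inside the caller's buffer
// and returns the shortened slice. \uXXXX escapes are copied through unchanged.
char[] unescapeInPlace(char[] s)
{
    size_t w = 0; // write position, never ahead of the read position
    for (size_t r = 0; r < s.length; ++r)
    {
        char c = s[r];
        if (c == '\\' && r + 1 < s.length && s[r + 1] != 'u')
        {
            ++r; // consume the backslash and translate the next character
            switch (s[r])
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
                case 'r': c = '\r'; break;
                case 'b': c = '\b'; break;
                case 'f': c = '\f'; break;
                default:  c = s[r]; break; // covers \" \\ and \/
            }
        }
        s[w++] = c;
    }
    return s[0 .. w];
}

void main()
{
    // The backtick literal keeps the backslashes; .dup is only test setup.
    char[] buf = `say \"hi\"\n`.dup;
    auto text = unescapeInPlace(buf);

    assert(text == "say \"hi\"\n");
    assert(text.ptr is buf.ptr); // the result is a slice of the same buffer
}
```

A handler with the char[] signature from the ParseEvent alias could call such a function on its argument and keep working with the returned slice.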
On Tue, 14 Sep 2010 11:18:14 -0400, Sean Kelly <sean invisibleduck.org> wrote:
> What I'm basically looking for is a set of events like this:
>
>     alias void delegate(char[]) ParseEvent;
>     ParseEvent onObjectEnter, onObjectKey, onObjectLeave;
>     ParseEvent onArrayEnter, onArrayLeave;
>     ParseEvent onStringValue, onFloatValue, onIntValue;
>     ParseEvent onTrueValue, onFalseValue, onNullValue;
>
> With corresponding write events on the output side so if I hooked the parser to the writer the data would all flow through and generate output identical to the input, formatting notwithstanding [...]
>
> There are some advanced-mode writer options I'd like as well, like the ability to dump a char[] blob directly into the destination string without translation, saving and restoring writer state, etc.

This seems pretty straightforward. Could you list the advanced-mode features you'd need?
Sep 14 2010