digitalmars.D.announce - IAP Tools for D
- Jakob Jenkov (21/21) Dec 16 2015 Hi D Community,
- Rikki Cattermole (3/23) Dec 16 2015 If you hop onto IRC #d Freenode, there may be somebody from time to time
- Jakob Jenkov (3/6) Dec 16 2015 Thanks!
- Stefan Koch (3/9) Dec 16 2015 Sounds like an interesting thing. I will lend a hand.
- Jakob Jenkov (5/6) Dec 16 2015 Great! We probably won't get started until January, as we have
- Stefan Koch (3/9) Dec 16 2015 yeah I think so
- belkin (6/12) Dec 19 2015 ION is similar to MessagePack and CBOR,
- Jakob Jenkov (40/43) Dec 19 2015 That depends on what API you use, and how much "meta data" (e.g.
- Paolo Invernizzi (7/13) Dec 20 2015 I suggest to compare also against this [1].
- Jakob Jenkov (3/8) Dec 20 2015 Will do - at some point. Writing proper benchmarks against other
- Jakob Jenkov (34/39) Dec 20 2015 I just had a look at Cap'n Proto. From what I can see in the
- John Carter (33/35) Dec 20 2015 "If a disease has many treatments, it has no cure".
- Jakob Jenkov (25/28) Dec 20 2015 Then why is HTTP 2 moving away from it? And Web Sockets?
- Joakim (5/34) Dec 21 2015 Yep, the whole stateless argument is a complete joke, it has not
- David Nadlinger (8/10) Dec 20 2015 There seems to be a confusion of terminology here. Thrift has a
- Jakob Jenkov (6/16) Dec 20 2015 Thanks for the clarification! I couldn't really make out from the
- Jakob Jenkov (14/17) Dec 19 2015 Oh - one final thing:
- Guillaume Piolat (6/10) Dec 22 2015 Be sure to look at how MsgPack is implemented in D:
Hi D Community,

I am currently working on a cloud project where we intend to reinvent a lot of the old, less-than-optimal technologies. Among the technologies we are working on is a new general purpose network protocol called IAP. IAP comes with a general purpose binary data format called ION (IAP Object Notation).

ION is similar to MessagePack and CBOR, but with a few additions. ION has a table mode which can be used to model tables (like CSV files) efficiently, and which can also be used in larger object graphs. Our early benchmarks of serialized length and performance look promising (tables can be down to 1/5 the size of JSON, and up to 2x the speed of parsing CBOR). ION can be used inside IAP, but also separately with HTTP and in data and log files.

We already have a working toolkit in Java (we have Java backgrounds), but since we find D really interesting, we would like to make a D toolkit too. Since we are rather new to D, would anyone be interested in helping us out a bit with making such a library? We can probably do the coding ourselves, but might need some tips about how to package it nicely as a D library which can be used with Dub etc.
Dec 16 2015
On 16/12/15 10:47 PM, Jakob Jenkov wrote:
> [...]
> Since we are rather new to D, would anyone be interested in helping us out a bit with making such a library? We can probably do the coding ourselves, but might need some tips about how to package it nicely as a D library which can be used with Dub etc.

If you hop onto IRC (#d on Freenode), there may be somebody from time to time who can give you a hand. Or at worst help solve some of your problems.
Dec 16 2015
> If you hop onto IRC (#d on Freenode), there may be somebody from time to time who can give you a hand. Or at worst help solve some of your problems.

Thanks! Oh, I forgot to mention that the IAP Tools for D library will be open source, under the Apache 2 License.
Dec 16 2015
On Wednesday, 16 December 2015 at 10:08:14 UTC, Jakob Jenkov wrote:
>> If you hop onto IRC (#d on Freenode), there may be somebody from time to time who can give you a hand. Or at worst help solve some of your problems.
> Thanks! Oh, I forgot to mention that the IAP Tools for D library will be open source, under the Apache 2 License.

Sounds like an interesting thing. I will lend a hand.
Dec 16 2015
> Sounds like an interesting thing. I will lend a hand.

Great! We probably won't get started until January, as we still have some documentation work to do on the Java library, and some more systematic benchmarks to run etc. We will announce it here again when we get there. A GitHub repo would suffice, right?
Dec 16 2015
On Wednesday, 16 December 2015 at 11:06:21 UTC, Jakob Jenkov wrote:
>> Sounds like an interesting thing. I will lend a hand.
> Great! We probably won't get started until January, as we still have some documentation work to do on the Java library, and some more systematic benchmarks to run etc. We will announce it here again when we get there. A GitHub repo would suffice, right?

Yeah, I think so.
Dec 16 2015
On Wednesday, 16 December 2015 at 09:47:35 UTC, Jakob Jenkov wrote:
> Hi D Community,
> [...]
> ION is similar to MessagePack and CBOR, but with a few additions. ION has a table mode which can be used to model tables (like CSV files) efficiently, and which can also be used in larger object graphs. Our early serialized length + performance benchmarks look promising (tables can be down to 1/5 of JSON, and up to 2 x the speed of parsing CBOR).

How does the performance of ION compare with Protocol Buffers (https://developers.google.com/protocol-buffers/?hl=en) and Apache Thrift (https://thrift.apache.org/)?
Dec 19 2015
> How does the performance of ION compare with Protocol Buffers (https://developers.google.com/protocol-buffers/?hl=en) and Apache Thrift (https://thrift.apache.org/)?

That depends on what API you use, and how much "meta data" (e.g. class names and property names) you write in the serialized ION data. ION is quite flexible about how much metadata you want to include.

If you remove property names and rely only on the sequence of fields, ION can write faster than Google Protocol Buffers. When reading, if you rely only on the sequence of fields, ION is a bit slower than Google Protocol Buffers. All in all I believe performance will be on par with Google Protocol Buffers. We have some benchmarks here:

http://tutorials.jenkov.com/iap/ion-performance-benchmarks.html

We still have a few minor optimizations to do and more benchmarks to run, and perhaps also some validations to add etc., but the benchmarks on this page (for Java) are probably not too far off from the final numbers.

Regarding Apache Avro and Thrift, I looked at them today. It seems that Avro's encoding is similar to ION (and MessagePack and CBOR), although without e.g. tables. According to Thrift's own docs their binary encoding is not compact. For compact encoding it seems they refer to Protobuf.

ION has several advantages over Protobuf as a general purpose data format. ION is self describing, so you can iterate it without a schema. This means that you can do pretty fast arbitrary hierarchical navigation of an ION "file/message".

Protobuf's own docs say that Protobuf is not good for large amounts of raw bytes (e.g. files). ION is capable of modeling raw binary data (e.g. files), JSON, XML and CSV efficiently. You could even convert ION to a restricted XML format, edit it in a text editor, and convert it back to ION (we have not implemented this yet, but we have planned it). We also believe that ION can support cyclic object graphs, but this is also not fully implemented and tested yet.

ION has a very compact encoding of arrays of objects in "Tables", which are similar to CSV files with only 1 header row and N value rows. It is very common to transport arrays of objects over the network, e.g. N search results from a service. Thus ION tables are a major advantage. Tables can also be used inside object graphs where an object has 0..N children (in an array).

We have a comparison of ION to other data formats here:

http://tutorials.jenkov.com/iap/ion-vs-other-formats.html
Dec 19 2015
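[Editor's sketch] The "1 header row and N value rows" idea described in the post above can be illustrated roughly as follows. This is not ION's actual binary encoding: it uses JSON text purely to measure size, and the sample data and the names `to_table` and `from_table` are hypothetical, chosen for illustration only.

```python
import json

# Hypothetical sample data: N search results that all share the same fields.
results = [
    {"id": i, "title": f"Result {i}", "score": 0.5}
    for i in range(100)
]


def to_table(rows):
    """Store the field names once, followed by one value row per object."""
    header = list(rows[0].keys())
    return {
        "header": header,
        "rows": [[row[name] for name in header] for row in rows],
    }


def from_table(table):
    """Rebuild the original list of objects from header + value rows."""
    return [dict(zip(table["header"], values)) for values in table["rows"]]


table = to_table(results)

# Field names are repeated in every object in the first form,
# but appear only once in the table form.
per_object_size = len(json.dumps(results))
table_size = len(json.dumps(table))
print(per_object_size, table_size)
```

The saving comes only from not repeating field names per object; how much a real binary format gains depends on its own encoding of keys and values.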
On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
> [...]
> That depends on what API you use, and how much "meta data" (e.g. class names and property names) you write in the serialized ION data. ION is quite flexible about how much meta you want to include.
> [...]

I suggest also comparing against this [1]. The author, Kenton Varda, was the primary author of Protocol Buffers version 2, which is the version that Google released as open source.

[1] https://capnproto.org

/Paolo
Dec 20 2015
> I suggest also comparing against this [1]. The author, Kenton Varda, was the primary author of Protocol Buffers version 2, which is the version that Google released as open source.
> [1] https://capnproto.org

Will do - at some point. Writing proper benchmarks against other frameworks / encodings takes time though. That's why we have started with MessagePack, CBOR and Google Protocol Buffers.
Dec 20 2015
> I suggest also comparing against this [1]. [...]
> [1] https://capnproto.org

I just had a look at Cap'n Proto. From what I can see in the encoding spec, performance of ION will be comparable.

Cap'n Proto claims to be "infinitely faster" than Google Protocol Buffers, but that is only if you do not pack the CP data - in which case it will transfer slower over the network. CP solves that using packing - but then you are back to serialization / deserialization, and the original promise of being "infinitely faster" is gone.

Cap'n Proto also has the "problem" that its messages require an external schema. To iterate through a Cap'n Proto file / message you must already know what data is in it (the schema). Some see this as an advantage, because it forces you to write a schema for your data structure, and you get slightly faster encoding / decoding time. Others see this as a disadvantage, because you now have to import schemas, or generate code, in order to read a serialized message. You cannot just step through it like you can with e.g. XML or JSON. I tend to be in this camp - although I am not blind to the arguments in favor of external schemas. Speed matters, but so does ease-of-use.

On a network protocol level I tend to disagree with the "distributed object" model. I know Cap'n Proto tries to explain why this model is not a problem with CP. However, fine grained communication between fine grained distributed objects *is* a performance killer in the long run, regardless of whether you "pipeline" requests.

ION is intended to be the message format for our IAP network protocol. IAP will be message oriented, so you can do one-way messaging, request-response, subscriptions (e.g. to a stream), pipelining, routing of messages via intermediate nodes etc.

Anyway, if you really want to use Cap'n Proto (or something else) over IAP (+ION), you can just nest a binary message inside an IAP message, and then parse it any way you like when it comes out.
Dec 20 2015
On Sunday, 20 December 2015 at 17:52:40 UTC, Jakob Jenkov wrote:
> I just had a look at Cap'n Proto. From what I can see in the encoding spec, performance of ION will be comparable.

"If a disease has many treatments, it has no cure". This is certainly true for serialization protocols.

The major advantage I see in Cap'n Proto is that pipelining can do quite a lot to reduce round trip latency. (You don't have to google far to find rants pointing out that latency is often more important than bandwidth in determining throughput.)

I was just reading your IAP web site, when I came across "No Stateful Communication" under the heading "What is Wrong With HTTP?". The designers of HTTP would strongly argue that is a major thing HTTP got right, and is the feature primarily responsible for its huge success.

Certainly in the realm of IoT, HTTP is way too heavy... so in that domain I would reach for http://coap.technology/

The use case I keep challenging my colleagues with is... So one end or the other dies. Or resets. Or fades and comes back. Or changes batteries. These are IoT things. It will happen, and you will be required to recover the whole end to end system automatically without manual intervention. What is your plan?

Too often the answer is... "We don't have a plan, but we will have a wheel restarting the link... umm, then a wheel resending the stuff that was lost in the link buffers when the link went down... and a, errrr, maybe a wheel restarting Everything when we realise the other side has lost its state about our connection."

And in practice the only wheel that works is shutting everything down and restarting everything up. Suddenly "No stateful communication" is looking really really Good. CoAP clearly has thought these issues through.
Dec 20 2015
> The designers of HTTP would strongly argue that is a major thing HTTP got right, and is the feature primarily responsible for its huge success.

Then why is HTTP 2 moving away from it? And Web Sockets? Clearly, having the choice between keeping state and not keeping state is preferable to HTTP taking that choice away from you.

Lots of apps also spend quite some effort to mimic stateful communication on top of HTTP. Sessions? Authentication tokens? Cookies? Caching in the browser? HTML5 Local Storage? No, HTTP did not get "stateless" right.

Your "fix-the-network" problem is definitely valid. At this point we have mostly focused on ION - the binary object / message format for IAP. However, we have a pretty good idea about how IAP will work on a conceptual level.

IAP will have a set of "semantic protocols". Each semantic protocol can address its own area of concern: file exchange, time, RPC, distributed transactions, P2P, streaming etc. You can also define your own semantic protocol to address exactly your specific situation (e.g. the Byzantine Generals Problem - distributed consensus).

Not everything is in place yet - but we will get there step by step.
Dec 20 2015
On Sunday, 20 December 2015 at 21:37:35 UTC, Jakob Jenkov wrote:
>> The designers of HTTP would strongly argue that is a major thing HTTP got right, and is the feature primarily responsible for its huge success.
> Then why is HTTP 2 moving away from it? And Web Sockets? Clearly, having the choice between keeping state and not keeping state is preferable to HTTP taking that choice away from you. Lots of apps also spend quite an effort to mimic stateful communication on top of HTTP. Sessions? Authentication tokens? Cookies? Caching in the browser? HTML5 Local Storage? No, HTTP did not get "stateless" right.

Yep, the whole stateless argument is a complete joke; it has not been true except maybe in the very beginning. HTTP 2 is a huge step forward for this, for its binary encoding, and for other reasons.

> Your "fix-the-network" problem is definitely valid. At this point we have mostly focused on ION - the binary object / message format for IAP. However, we have a pretty good idea about how IAP will work on a conceptual level.
> [...]
> Everything is not yet in place - but we will get there step by step.

Interesting effort, I'll check it out.
Dec 21 2015
On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
> According to Thrift's own docs their binary encoding is not compact. For compact encoding it seems they refer to Protobuf.

There seems to be a confusion of terminology here. Thrift has a "Binary" protocol, which is not compact in the sense that it consists of the data fields more or less blitted into a message. There is also a "Compact" protocol, which is also a binary format, but employs things like variable-length integers to reduce size, similar to Protobuf.

- David
Dec 20 2015
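[Editor's sketch] The variable-length integers mentioned in the post above can be illustrated with the base-128 "varint" scheme documented for Protobuf: 7 payload bits per byte, with the high bit set on every byte except the last. This is a minimal sketch for non-negative integers only; it makes no claim about the exact wire details of Thrift's Compact protocol, and the function names are hypothetical.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            # More groups follow: set the continuation bit.
            out.append(low_bits | 0x80)
        else:
            out.append(low_bits)
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0):
    """Decode one varint starting at offset; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, offset
        shift += 7


# A small value takes 1 byte as a varint, versus 4 bytes when a
# 32-bit integer is copied into the message at fixed width.
print(len(encode_varint(1)), len(struct.pack("<I", 1)))
```

The trade-off is that large values cost more than their fixed-width form (a full 32-bit value needs 5 bytes), so the scheme pays off when most integers are small.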
On Sunday, 20 December 2015 at 19:16:19 UTC, David Nadlinger wrote:
> On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
>> According to Thrift's own docs their binary encoding is not compact. For compact encoding it seems they refer to Protobuf.
> There seems to be a confusion of terminology here. Thrift has a "Binary" protocol, which is not compact in the sense that it consists of the data fields more or less blitted into a message. There is also a "Compact" protocol, which is also a binary format, but employs things like variable-length integers to reduce size, similar to Protobuf.
> - David

Thanks for the clarification! I couldn't really make out from the Thrift website whether they had their own compact protocol or had switched to Protobuf. But now you say that they do have their own compact protocol. Now I know.
Dec 20 2015
> How does the performance of ION compare with Protocol Buffers (https://developers.google.com/protocol-buffers/?hl=en) and Apache Thrift (https://thrift.apache.org/)?

Oh - one final thing:

If you *really* want speed, you should not parse ION into objects before using the data. Since ION is self describing, you can just navigate through it, find the data you need, and ignore the rest. This should be faster than first parsing the data into objects, especially if you parse an array of objects which may end up scattered all over the heap, and thus lead to cache misses.

Accessing these objects directly in the message buffer might save you the ION-to-object parse time, and it might also play better with the L1, L2 and L3 caches. We have not yet benchmarked this, but we will before long. In this mode I expect the read+use time to be faster than Google Protocol Buffers.
Dec 19 2015
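[Editor's sketch] Navigating a self-describing message in place, as described in the post above, can be illustrated with a made-up length-prefixed layout. This is not ION's encoding: the field layout (1-byte key length, key, 4-byte big-endian value length, value) and the names `encode_fields` and `find_field` are assumptions for illustration only.

```python
import struct


def encode_fields(fields):
    """Serialize (name, value_bytes) pairs in a hypothetical length-prefixed layout."""
    out = bytearray()
    for name, value in fields:
        key = name.encode("utf-8")
        out += struct.pack(">B", len(key)) + key
        out += struct.pack(">I", len(value)) + value
    return bytes(out)


def find_field(buffer, wanted):
    """Scan the buffer and return a zero-copy view of one field's value.

    Values of other fields are skipped by their length prefix and never
    copied or turned into objects.
    """
    wanted_key = wanted.encode("utf-8")
    view = memoryview(buffer)
    offset = 0
    while offset < len(view):
        key_len = view[offset]
        offset += 1
        key = view[offset:offset + key_len]
        offset += key_len
        (value_len,) = struct.unpack_from(">I", view, offset)
        offset += 4
        if key == wanted_key:
            return view[offset:offset + value_len]
        offset += value_len  # skip a value we do not need
    return None


message = encode_fields([
    ("title", b"IAP Tools for D"),
    ("body", b"x" * 1000),
    ("license", b"Apache 2"),
])
license_view = find_field(message, "license")
print(bytes(license_view))
```

Because each field carries its own length, the reader can jump over the 1000-byte `body` value without inspecting it; whether this beats full parsing in practice would need benchmarking, as the post itself notes.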
On Wednesday, 16 December 2015 at 09:47:35 UTC, Jakob Jenkov wrote:
> Since we are rather new to D, would anyone be interested in helping us out a bit with making such a library? We can probably do the coding ourselves, but might need some tips about how to package it nicely as a D library which can be used with Dub etc.

Be sure to look at how MsgPack is implemented in D: https://github.com/msgpack/msgpack-d

It has a very easy interface, and is one of the better D libraries out there.
Dec 22 2015