digitalmars.D.announce - IAP Tools for D

Jakob Jenkov <jakob jenkov.com> writes:

Hi D Community,

I am currently working on a cloud project where we intend to 
reinvent a lot of the old, less-than-optimal technologies. Among 
the technologies we are working on is a new general purpose 
network protocol called IAP.

IAP comes with a general purpose binary data format called ION 
(IAP Object Notation). ION is similar to MessagePack and CBOR, 
but with a few additions. ION has a table mode which can be used 
to model tables (like CSV files) efficiently, and which can also 
be used in larger object graphs. Our early serialized length + 
performance benchmarks look promising (tables can be down to 1/5 
of JSON, and up to 2 x the speed of parsing CBOR).

ION can be used both inside IAP and separately, e.g. with HTTP 
and in data and log files.

We already have a working toolkit in Java (we have Java 
backgrounds), but since we really find D interesting, we would 
like to make a D toolkit too.

Since we are rather new to D, would anyone be interested in 
helping us out a bit with making such a library? We can probably do 
the coding ourselves, but we might need some tips on how to package 
it nicely as a D library that can be used with Dub etc.

Dec 16 2015

Rikki Cattermole <alphaglosined gmail.com> writes:

On 16/12/15 10:47 PM, Jakob Jenkov wrote:
 Hi D Community,

 I am currently working on a cloud project where we intend to reinvent a
 lot of the old, less-than-optimal technologies. Among the technologies
 we are working on is a new general purpose network protocol called IAP.

 IAP comes with a general purpose binary data format called ION (IAP
 Object Notation). ION is similar to MessagePack and CBOR, but with a few
 additions. ION has a table mode which can be used to model tables (like
 CSV files) efficiently, and which can also be used in larger object
 graphs. Our early serialized length + performance benchmarks look
 promising (tables can be down to 1/5 of JSON, and up to 2 x the speed of
 parsing CBOR).

 ION can be used both inside IAP, but also separately with HTTP and in
 data and log files.

 We already have a working toolkit in Java (we have Java backgrounds),
 but since we really find D interesting, we would like to make a D
 toolkit too.

 Since we are rather new to D, would anyone be interested in helping us a
 bit out making such a library? We can probably do the coding ourselves,
 but might need some tips about how to pack it nicely into a D library
 which can be used with Dub etc.

If you hop onto IRC (#d on Freenode), there may be somebody from time to time 
who can give you a hand, or at least help solve some of your problems.

Dec 16 2015

Jakob Jenkov <jakob jenkov.com> writes:

 If you hop onto IRC #d Freenode, there maybe somebody from time 
 to time that can give you a hand. Or at worst help solve some 
 of your problems.

Thanks!

Oh, I forgot to mention that the IAP Tools for D library will be 
open source, under the Apache 2 License.

Dec 16 2015

Stefan Koch <uplink.coder googlemail.com> writes:

On Wednesday, 16 December 2015 at 10:08:14 UTC, Jakob Jenkov 
wrote:
 If you hop onto IRC #d Freenode, there maybe somebody from 
 time to time that can give you a hand. Or at worst help solve 
 some of your problems.

 Thanks!

 Oh, I forgot to tell that the IAP Tools for D library will be 
 open source, Apache 2 License.

Sounds like an interesting thing. I will lend a hand.

Dec 16 2015

Jakob Jenkov <jakob jenkov.com> writes:

 Sounds like an interesting thing. I will lend a hand.

Great! We probably won't get started until January, as we have 
some documentation work to do on the Java library still, and some 
more systematic benchmarks to run etc. We will announce it here 
again when we get there.

A GitHub repo would suffice, right?

Dec 16 2015

Stefan Koch <uplink.coder googlemail.com> writes:

On Wednesday, 16 December 2015 at 11:06:21 UTC, Jakob Jenkov 
wrote:
 Sounds like an interesting thing. I will lend a hand.

 Great! We probably won't get started until January, as we have 
 some documentation work to do on the Java library still, and 
 some more systematic benchmarks to run etc. We will announce it 
 here again when we get there.

 A GitHub repo would suffice, right?

yeah I think so

Dec 16 2015

belkin <belkin yahoo.in.com> writes:

On Wednesday, 16 December 2015 at 09:47:35 UTC, Jakob Jenkov 
wrote:
 Hi D Community,

  ION is similar to MessagePack and CBOR,
 but with a few additions. ION has a table mode which can be 
 used to model tables (like CSV files) efficiently, and which 
 can also be used in larger object graphs. Our early serialized 
 length + performance benchmarks look promising (tables can be 
 down to 1/5 of JSON, and up to 2 x the speed of parsing CBOR).

How does the performance of ION compare with Protocol Buffers 
(https://developers.google.com/protocol-buffers/?hl=en) and 
Apache Thrift ( https://thrift.apache.org/)?

Dec 19 2015

Jakob Jenkov <jakob jenkov.com> writes:

 How does the performance of ION compare with Protocol Buffers 
 (https://developers.google.com/protocol-buffers/?hl=en) and 
 Apache Thrift ( https://thrift.apache.org/)?

That depends on what API you use, and how much "meta data" (e.g. 
class names and property names) you write in the serialized ION 
data. ION is quite flexible about how much meta you want to 
include.

If you remove property names and rely only on the sequence of 
fields, ION can write faster than Google Protocol Buffers. When 
reading, if you rely only on the sequence of fields, ION is a bit 
slower than Google Protocol Buffers. All in all I believe 
performance will be on par with Google Protocol Buffers.

We have some benchmarks here:

http://tutorials.jenkov.com/iap/ion-performance-benchmarks.html

We still have a few minor optimizations to do and more 
benchmarks to run, and perhaps some validations to add, 
but the benchmarks on this page (for Java) are probably not too 
far off from the final numbers.

Regarding Apache Avro and Thrift, I looked at them today. It 
seems that Avro's encoding is similar to ION (and MessagePack and 
CBOR), although without e.g. tables. According to Thrift's own 
docs their binary encoding is not compact. For compact encoding 
it seems they refer to Protobuf.

ION has several advantages over Protobuf as a general purpose 
data format. ION is self describing, so you can iterate it 
without a schema. This means that you can do pretty fast 
arbitrary hierarchical navigation of an ION "file/message".

Protobuf's own docs say that Protobuf is not good for large 
amounts of raw bytes (e.g. files). ION is capable of modeling 
both raw binary data (e.g. files), JSON, XML and CSV efficiently. 
You could even convert ION to a restricted XML format, edit it in 
a text editor, and convert it back to ION (we have not 
implemented this yet, but we have planned it). We also believe 
that ION can support cyclic object graphs, but this is also not 
fully implemented and tested yet.

ION has a very compact encoding of arrays of objects in "Tables" 
which are similar to CSV files with only 1 header row, and N 
value rows. It is very common to transport arrays of objects over 
the network, e.g. N search results from a service. Thus ION 
tables are a major advantage. Tables can also be used inside 
object graphs where an object has 0..N children (in an array).
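
A minimal sketch of the table idea described above: property names are written once in a header row, followed by N value rows. This is not ION's actual wire format; JSON text is used here only as a stand-in to make the size difference visible, and the helper names `to_table` and `from_table` are made up for illustration.

```python
import json


def to_table(records):
    """Split a list of same-shaped dicts into one header row and N value rows."""
    if not records:
        return {"header": [], "rows": []}
    header = list(records[0].keys())
    rows = [[record[key] for key in header] for record in records]
    return {"header": header, "rows": rows}


def from_table(table):
    """Rebuild the list of dicts from the header row and the value rows."""
    return [dict(zip(table["header"], row)) for row in table["rows"]]


# Hypothetical search results, as in "N search results from a service".
results = [
    {"id": 1, "title": "D language tour", "score": 0.91},
    {"id": 2, "title": "Dub package guide", "score": 0.87},
    {"id": 3, "title": "MessagePack for D", "score": 0.85},
]

table = to_table(results)
as_objects = json.dumps(results, separators=(",", ":"))
as_table = json.dumps(table, separators=(",", ":"))

# The table form repeats the property names once instead of once per object.
print(len(as_objects), len(as_table))
```

The saving grows with the number of rows, since each additional object adds only its values and not its property names.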

We have a comparison of ION to other data formats here:

http://tutorials.jenkov.com/iap/ion-vs-other-formats.html

Dec 19 2015

Paolo Invernizzi <paolo.invernizzi no.address> writes:

On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
 [...]

 That depends on what API you use, and how much "meta data" 
 (e.g. class names and property names) you write in the 
 serialized ION data. ION is quite flexible about how much meta 
 you want to include.

 [...]

I suggest also comparing against this [1].
The author, Kenton Varda, was the primary author of Protocol 
Buffers version 2, which is the version that Google released open 
source.

[1] https://capnproto.org

/Paolo

Dec 20 2015

Jakob Jenkov <jakob jenkov.com> writes:

 I suggest to compare also against this [1].
 The author, Kenton Varda, was the primary author of Protocol 
 Buffers version 2, which is the version that Google released 
 open source.

 [1] https://capnproto.org

Will do - at some point. Writing proper benchmarks against other 
frameworks / encodings takes time though. That's why we have 
started with MessagePack, CBOR and Google Protocol Buffers.

Dec 20 2015

Jakob Jenkov <jakob jenkov.com> writes:

 I suggest to compare also against this [1].
 The author, Kenton Varda, was the primary author of Protocol 
 Buffers version 2, which is the version that Google released 
 open source.

 [1] https://capnproto.org


I just had a look at Cap'n Proto. From what I can see in the 
encoding spec, performance of ION will be comparable.

Cap'n Proto claims to be "infinitely faster" than Google Protocol 
Buffers, but that is only if you do not pack the CP data - in 
which case it will transfer slower over the network. CP solves 
that using packing - but then you are back to serialization / 
deserialization, and the original promise of being "infinitely 
faster" is gone.

Cap'n Proto also has the "problem" that its messages require an 
external schema. To iterate through a Cap'n Proto file / message 
you must already know what data is in it (the schema).

Some see this as an advantage, because it forces you to write a 
schema for your data structure, and you get slightly faster 
encoding / decoding time.

And others see this as a disadvantage because you now have to 
import schemas, or generate code, in order to read a serialized 
message. You cannot just step through it like you can with e.g. 
XML or JSON. I tend to be in this camp - although I am not blind 
to the arguments in favor of external schemas. Speed matters, but 
so does ease-of-use.

On a network protocol level I tend to disagree with the 
"distributed object" model. I know Cap'n Proto tries to explain 
why this model is not a problem with CP. However, fine grained 
communication between fine grained distributed objects *is* a 
performance killer in the long run, regardless of whether you 
"pipeline" requests.

ION is intended to be the message format for our IAP network 
protocol. IAP will be message oriented, so you can do one-way 
messaging, request-response, subscriptions (e.g. to a stream), 
pipelining, routing of messages via intermediate nodes etc.

Anyways, if you really want to use Cap'n Proto (or something 
else) over IAP (+ION) you can just nest a binary message inside 
an IAP message, and then parse it any way you like when it comes 
out.

Dec 20 2015

John Carter <john.carter taitradio.com> writes:

On Sunday, 20 December 2015 at 17:52:40 UTC, Jakob Jenkov wrote:
 I just had a look at Cap'n Proto. From what I can see in the 
 encoding spec, performance of ION will be comparable.

"If a disease has many treatments, it has no cure".

This is certainly true for serialization protocols.

The major advantage I see in Cap'n Proto is that pipelining can do 
quite a lot to reduce round trip latency. (You don't have to 
google far to find rants pointing out that latency is often more 
important than bandwidth in determining throughput.)
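
A back-of-the-envelope sketch of why that matters, under simplifying assumptions: transfer and processing time are ignored, and the numbers (5 dependent calls, 100 ms round trip) are made up for illustration.

```python
def sequential_latency_ms(calls, rtt_ms):
    """Each dependent call waits for the previous reply: one round trip per call."""
    return calls * rtt_ms


def pipelined_latency_ms(calls, rtt_ms):
    """All calls are sent back to back, so the replies arrive after about one round trip."""
    return rtt_ms if calls > 0 else 0


calls, rtt_ms = 5, 100
print(sequential_latency_ms(calls, rtt_ms))  # 500
print(pipelined_latency_ms(calls, rtt_ms))   # 100
```

No amount of extra bandwidth shortens the sequential case; only removing round trips does.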

I was just reading your IAP web site, when I came across "No 
Stateful Communication" under the heading "What is Wrong With 
HTTP?".

The designers of HTTP would strongly argue that this is a major thing 
HTTP got right, and is the feature primarily responsible for its 
huge success.

Certainly in the realm of IoT HTTP is way too heavy.... so in 
that domain I would reach for
http://coap.technology/

The use case I keep challenging my colleagues with is....

So one end or the other dies. Or resets. Or fades and comes back. 
Or changes batteries.

This is what IoT things do. It will happen, and you will be required 
to recover the whole end-to-end system automatically, without 
manual intervention.

What is your plan?

Too often the answer is... "We don't have a plan but we will have 
a wheel restarting the link.... umm, then a wheel resending the 
stuff that was lost in the link buffers when the link went 
down.... and, errrr, maybe a wheel restarting Everything when we 
realise the other side has lost its state about our connection."

And in practice the only wheel that works is shutting everything 
down and restarting everything up.

Suddenly "No stateful communication" is looking really really 
Good.

CoAP has clearly thought these issues through.

Dec 20 2015

Jakob Jenkov <jakob jenkov.com> writes:

 The designers of HTTP would strongly argue that is a major 
 thing HTTP got right, and is the feature primarily responsible 
 for it huge success.

Then why is HTTP 2 moving away from it? And Web Sockets?
Clearly, having the choice between keeping state and not keeping
state is preferable to HTTP taking that choice away from you.

Lots of apps also spend quite an effort to mimic stateful 
communication on top of HTTP. Sessions? Authentication tokens? 
Cookies? Caching in the browser? HTML5 Local Storage?

No, HTTP did not get "stateless" right.


Your "fix-the-network" problem is definitely valid.

At this point we have mostly focused on ION - the binary object / 
message format for IAP. However, we have a pretty good idea about 
how IAP will work on a conceptual level.

IAP will have a set of "semantic protocols". Each semantic 
protocol can address its own area of concern. File exchange, time, 
RPC, distributed transactions, P2P, streaming etc.

You can also define your own semantic protocol to address exactly 
your specific situation (e.g. the Byzantine Generals Problem - 
distributed consensus).

Everything is not yet in place - but we will get there step by 
step.

Dec 20 2015

Joakim <dlang joakim.fea.st> writes:

On Sunday, 20 December 2015 at 21:37:35 UTC, Jakob Jenkov wrote:
 The designers of HTTP would strongly argue that is a major 
 thing HTTP got right, and is the feature primarily responsible 
 for it huge success.

 Then why is HTTP 2 moving away from it? And Web Sockets?
 Clearly, having the choice between keeping state and not keeping
 state is preferable to HTTP taking that choice away from you.

 Lots of apps also spend quite an effort to mimic stateful 
 communication
 on top of HTTP. Sessions? Authentication tokens? Cookies? 
 Caching
 in the browser? HTML5 Local Storage?

 No, HTTP did not get "stateless" right.

Yep, the whole stateless argument is a complete joke, it has not 
been true except maybe in the very beginning.  HTTP 2 is a huge 
step forward for this, its binary encoding, and other reasons.

 Your "fix-the-network" problem is definitely valid.

 At this point we have mostly focused on ION - the binary object 
 / message format for IAP.
 However, we have a pretty good idea about how IAP will work on 
 a conceptual
 level.

 IAP will have a set of "semantic protocols". Each semantic 
 protocol can address
 its own area of concern. File exchange, time, RPC, distributed 
 transactions,
 P2P, streaming etc.

 You can also define your own semantic protocol to address 
 exactly your specific
 situation (e.g. the Byzantine Generals Problem - distributed 
 consensus).

 Everything is not yet in place - but we will get there step by 
 step.

Interesting effort, I'll check it out.

Dec 21 2015

David Nadlinger <code klickverbot.at> writes:

On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
 According to Thrift's own docs their binary encoding is not 
 compact. For compact encoding it seems they refer to Protobuf.

There seems to be a confusion of terminology here. Thrift has a 
"Binary" protocol, which is not compact in the sense that it 
consists of the data fields more or less blitted into a message. 
There is also a "Compact" protocol, which is also a binary 
format, but employs things like variable-length integers to 
reduce size –  similar to Protobuf.
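
As a sketch of the variable-length integer idea: the base-128 "varint" scheme used by Protobuf stores 7 bits per byte and uses the high bit to signal that more bytes follow, so small values take fewer bytes than a fixed-width field. The function names below are made up for illustration, and Thrift's Compact protocol is not reproduced in detail here.

```python
import struct


def encode_varint(value):
    """Encode a non-negative integer as a little-endian base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of what is left
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data, offset=0):
    """Decode one varint from data; return (value, offset after the varint)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


print(encode_varint(300))            # b'\xac\x02'
print(len(struct.pack("<I", 300)))   # 4, the fixed-width equivalent
```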

  — David

Dec 20 2015

Jakob Jenkov <jakob jenkov.com> writes:

On Sunday, 20 December 2015 at 19:16:19 UTC, David Nadlinger 
wrote:
 On Sunday, 20 December 2015 at 01:16:46 UTC, Jakob Jenkov wrote:
 According to Thrift's own docs their binary encoding is not 
 compact. For compact encoding it seems they refer to Protobuf.

 There seems to be a confusion of terminology here. Thrift has a 
 "Binary" protocol, which is not compact in the sense that it 
 consists of the data fields more or less blitted into a 
 message. There is also a "Compact" protocol, which is also a 
 binary format, but employs things like variable-length integers 
 to reduce size –  similar to Protobuf.

  — David

Thanks for the clarification! I couldn't really make out from the 
Thrift website if they had their own compact protocol, or 
switched to Protobuf. But now you say that they do have their own 
compact protocol. Now I know that.

Dec 20 2015

Jakob Jenkov <jakob jenkov.com> writes:

 How does the performance of ION compare with Protocol Buffers 
 (https://developers.google.com/protocol-buffers/?hl=en) and 
 Apache Thrift ( https://thrift.apache.org/)?

Oh - one final thing:

If you *really* want speed you should not parse ION into objects 
before using the data. Since ION is self describing, you can just 
navigate through it and find the data you need, and ignore the 
rest.

This should be faster than parsing the data into objects 
first. Especially if you parse an array of objects which may end 
up scattered all over the heap, and thus lead to cache misses. 
Accessing these objects directly in the message buffer might save 
you both the ION-to-object parse time, plus it might play better 
with the L1, L2 and L3 caches.
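
A minimal sketch of that access pattern, using a toy length-prefixed layout (`[key length][key][value length][value]`) that is only an assumption for illustration and not ION's real encoding. The reader skips over fields it does not need by jumping ahead by their length, and returns a view into the buffer instead of building objects.

```python
import struct


def encode_fields(fields):
    """Toy encoding: [key_len:1 byte][key][value_len:4 bytes][value] per field."""
    out = bytearray()
    for key, value in fields.items():
        key_bytes = key.encode("utf-8")
        out += struct.pack(">B", len(key_bytes)) + key_bytes
        out += struct.pack(">I", len(value)) + value
    return bytes(out)


def find_field(buffer, wanted_key):
    """Return a zero-copy view of the wanted field's value, or None if absent."""
    view = memoryview(buffer)
    wanted = wanted_key.encode("utf-8")
    offset = 0
    while offset < len(view):
        key_len = view[offset]
        offset += 1
        key = view[offset:offset + key_len]
        offset += key_len
        (value_len,) = struct.unpack_from(">I", view, offset)
        offset += 4
        if key == wanted:
            return view[offset:offset + value_len]
        offset += value_len  # skip the value without decoding it
    return None


message = encode_fields({
    "id": b"42",
    "body": b"x" * 1000,
    "title": b"IAP Tools for D",
})

title = find_field(message, "title")
print(bytes(title).decode("utf-8"))
```

Only the field that is asked for is ever converted; the 1000-byte `body` value is stepped over using its length prefix.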

We have not yet benchmarked this, but we will before long. In 
this mode I expect the read+use time to be faster than Google 
Protocol Buffers.

Dec 19 2015

Guillaume Piolat <first.last gmail.com> writes:

On Wednesday, 16 December 2015 at 09:47:35 UTC, Jakob Jenkov 
wrote:
 Since we are rather new to D, would anyone be interested in 
 helping us a bit out making such a library? We can probably do 
 the coding ourselves, but might need some tips about how to 
 pack it nicely into a D library which can be used with Dub etc.

Be sure to look at how MsgPack is implemented in D:
https://github.com/msgpack/msgpack-d

It has a very easy interface, and is one of the better D libraries 
out there.

Dec 22 2015
