digitalmars.D.announce - New Article: My Experience Porting Python Dateutil's Date Parser to D

Jack Stouffer <jack jackstouffer.com> writes:

Hello everyone,

I have spent the last two weeks porting the date string parsing 
functionality from the popular Python library, dateutil, to D. I 
have written about my experience here: 
http://jackstouffer.com/blog/porting_dateutil.html

The code and docs can be found here: 
https://github.com/JackStouffer/date-parser

reddit: 
https://www.reddit.com/r/programming/comments/49qdpt/my_experience_porting_python_dateutils_date/

Let me know what you think about the article and the code.

Thanks in advance.

Mar 09 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 3/9/2016 1:55 PM, Jack Stouffer wrote:
 Hello everyone,

 I have spent the last two weeks porting the date string parsing functionality
 from the popular Python library, dateutil, to D. I have written about my
 experience here: http://jackstouffer.com/blog/porting_dateutil.html

 The code and docs can be found here:
https://github.com/JackStouffer/date-parser

 reddit:
 https://www.reddit.com/r/programming/comments/49qdpt/my_experience_porting_python_dateutils_date/


 Let me know what you think about the article and the code.

 Thanks in advance.

I haven't read the article yet, but you'll get more interest by putting a 
summary as the first comment on reddit.

Mar 09 2016

"H. S. Teoh via Digitalmars-d-announce" writes:

On Wed, Mar 09, 2016 at 02:12:42PM -0800, Walter Bright via
Digitalmars-d-announce wrote:
On 3/9/2016 1:55 PM, Jack Stouffer wrote:
Hello everyone,

I have spent the last two weeks porting the date string parsing
functionality from the popular Python library, dateutil, to D. I have
written about my experience here:
http://jackstouffer.com/blog/porting_dateutil.html

The code and docs can be found here: https://github.com/JackStouffer/date-parser

reddit:
https://www.reddit.com/r/programming/comments/49qdpt/my_experience_porting_python_dateutils_date/

Let me know what you think about the article and the code.

Thanks in advance.

I haven't read the article yet, but you'll get more interest by
putting a summary as the first comment on reddit.

I read the article. While I'm no Python expert (I do have a little
experience with it, mainly through using SCons as a build system for my
personal projects), I can totally sympathize with the annoyances of
using a dynamically-typed language, as well as dodgy iterator designs
like __next__. (I've not had to deal with __next__ in Python so far, but
*have* worked with C/C++ code that basically iterates that way, and it's
not pretty.)

Totally agree that if you can convert something to D in about a week's
worth of work, it's totally worth it. D is just a much more comfortable
language to work in (to me, anyway -- this is highly subjective,
obviously), and, provided you don't do anything silly, generally gives
you better performance than many of the alternatives out there. Even
when it doesn't perform the best without hand-tweaking, I'd still prefer
it for general use, because of nice sanity features such as built-in
unittests (now that I've gotten used to them, I sorely miss them in
every other language!), sane template syntax, etc.

Nice article.

--
He who does not appreciate the beauty of language is not worthy to bemoan its
flaws.

Mar 09 2016

Ola Fosheim Grøstad writes:

On Wednesday, 9 March 2016 at 22:17:39 UTC, H. S. Teoh wrote:
 system for my personal projects), I can totally sympathize with 
 the annoyances of using a dynamically-typed language, as well 
 as dodgy iterator designs like __next__. (I've not had to deal 
 with __next__ in Python so far, but *have* worked with C/C++ 
 code that basically iterates that way, and it's not pretty.)

What is problematic with __next__ (Py3) and next (Py2)?

It's a pretty straightforward, standard iterator design and quite 
different from the table pointers C++ uses.

Mar 09 2016

Jack Stouffer <jack jackstouffer.com> writes:

On Wednesday, 9 March 2016 at 23:31:04 UTC, Ola Fosheim Grøstad 
wrote:
 On Wednesday, 9 March 2016 at 22:17:39 UTC, H. S. Teoh wrote:
 system for my personal projects), I can totally sympathize 
 with the annoyances of using a dynamically-typed language, as 
 well as dodgy iterator designs like __next__. (I've not had to 
 deal with __next__ in Python so far, but *have* worked with 
 C/C++ code that basically iterates that way, and it's not 
 pretty.)

 What is problematic with __next__ (Py3) and next (Py2)?

 It's a pretty straight forward standard iterator design and 
 quite different from the table pointers C++ uses.

I explain my grievances in the article.

Mar 09 2016

Ola Fosheim Grøstad writes:

On Thursday, 10 March 2016 at 00:29:46 UTC, Jack Stouffer wrote:
 It's a pretty straight forward standard iterator design and 
 quite different from the table pointers C++ uses.

 I explain my grievances in the article.

They didn't make all that much sense to me, so I wondered what 
Teoh's issues were. As in: real issues that have empirical 
significance.

D ranges and Python's iterators are regular iterators, nothing 
special. The oddballs are C++ "iterators", which are pairs of pointers.

Efficiency and semantic issues in iterator implementation go both 
ways, depending on the application area. This is nothing new; 
people have known it for decades.

If you want speed, you have to use a "next"-style iterator 
implementation that writes multiple elements directly to the 
buffer. This is what you do in signal processing.
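
To illustrate the block-at-a-time idea, here is a minimal Python sketch. The `RampSource` class and its `next_block` method are hypothetical names made up for this example, not part of any library; the point is only that one call fills many slots of a buffer the caller allocated once.

```python
from array import array


class RampSource:
    """Hypothetical signal source yielding 0.0, 1.0, 2.0, ... for `total` samples."""

    def __init__(self, total):
        self.total = total
        self.position = 0

    def next_block(self, buffer):
        """Fill `buffer` in place and return how many samples were written."""
        count = min(len(buffer), self.total - self.position)
        for i in range(count):
            buffer[i] = float(self.position + i)
        self.position += count
        return count


source = RampSource(total=10)
buffer = array("d", [0.0] * 4)  # allocated once, reused for every block
blocks = []
while True:
    written = source.next_block(buffer)
    if written == 0:  # a zero-length block signals the end of the source
        break
    blocks.append(list(buffer[:written]))
```

The per-element overhead (one call, one end-of-data check) is paid once per block rather than once per sample, which is the trade-off being described.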

Mar 10 2016

Ola Fosheim Grøstad writes:

Just pointing out the obvious:

For the simple iterators/generators that run on a non-changing 
source you can basically break it up into:

1. iterators without lookahead
2. iterators with lookahead

These are basically the same issues you deal with when 
implementing a lexer.

Python-style iterators/generators are basically the former. That 
comes with one set of advantages, but no lookahead. Lookahead, 
however, frequently has cost penalties. There are many tradeoffs, 
and they become rather clear when you consider factors like:

1. mutating iterators
2. the size of the object
3. copyable iterators
4. concurrency/thread safety
5. progress with high computational cost
6. high computational cost for the value
7. sources with latency
8. skip functionality
9. non-inlineable situations
10. exceptions
11. complex iterators (e.g. interpolation)
etc.

There are massive tradeoffs even when writing iterators for 
really simple data structures like the linked list. It all 
depends on what functionality one is looking for.

There is no best solution. It all depends on the application.

Mar 10 2016

Chris Wright <dhasenan gmail.com> writes:

On Thu, 10 Mar 2016 08:22:58 +0000, Ola Fosheim Grøstad wrote:

 On Thursday, 10 March 2016 at 00:29:46 UTC, Jack Stouffer wrote:
 It's a pretty straight forward standard iterator design and quite
 different from the table pointers C++ uses.

 I explain my grievances in the article.

 
 They didn't make all that much sense to me, so I wondered what Teoh's
 issues were. As in: real issues that have empirical significance.

It's a little easier to write iterators in the Python style: you don't 
have to cache the current value, and you don't have to have a separate 
check for end-of-iteration. It's a little easier to use them in the D 
style: you get more flexibility, can check for emptiness without popping 
an item, and can grab the first item several times.

You can convert one to the other, so there's no theoretical difference in 
what you can accomplish with them. The difference is mainly an annoyance, 
plus a small efficiency concern, because throwing exceptions is a little slow.

The largest practical difference comes when multiple functions are 
interested in viewing the first item in the same range. LL(1) parsers 
need to do this.

Of course, that's just looking at input ranges versus iterators. If you 
look at other types of ranges, there's a lot there that Python is missing.
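
The conversion in both directions can be sketched in Python roughly as follows. `InputRange` and `as_iterator` are hypothetical names chosen for this example; `empty`, `front` and `pop_front` mirror D's input range primitives (`empty`, `front`, `popFront`).

```python
class InputRange:
    """D-style view (empty / front / pop_front) over any Python iterable."""

    _END = object()  # sentinel marking exhaustion

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._front = self._END
        self._advance()

    def _advance(self):
        # The current value is cached here, which is the extra work a
        # Python-style iterator does not need to do.
        self._front = next(self._it, self._END)

    @property
    def empty(self):
        return self._front is self._END

    @property
    def front(self):
        if self.empty:
            raise IndexError("front of an empty range")
        return self._front

    def pop_front(self):
        if self.empty:
            raise IndexError("pop_front on an empty range")
        self._advance()


def as_iterator(rng):
    """The reverse conversion: drive a D-style range as a Python generator."""
    while not rng.empty:
        yield rng.front
        rng.pop_front()


r = InputRange("abc")
first_views = (r.front, r.front)  # viewing the first item twice consumes nothing
r.pop_front()
rest = list(as_iterator(r))
```

Several functions can inspect `r.front` before any of them decides to consume it, which is the property an LL(1) parser relies on.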

Mar 10 2016

Ola Fosheim Grøstad writes:

On Thursday, 10 March 2016 at 17:59:21 UTC, Chris Wright wrote:
 It's a little easier to write iterators in the Python style: 
 you don't have to cache the current value, and you don't have 
 to have a separate check for end-of-iteration. It's a little 
 easier to use them in the D style: you get more flexibility, 
 can check for emptiness without popping an item, and can grab 
 the first item several times.

I don't have any firm opinions on this, but escaping out of the 
loop with an exception means you don't have to check for 
emptiness. So I am not sure why D range-iterators should be 
considered easier.

 You can convert one to the other, so there's no theoretical 
 difference in what you can accomplish with them. It's mainly 
 annoying. A small efficiency concern, because throwing 
 exceptions is a little slow.

Efficiency of exceptions in Python is an implementation issue, 
though. But I agree that the difference isn't all that 
interesting.

 The largest practical difference comes when multiple functions 
 are interested in viewing the first item in the same range. 
 LL(1) parsers need to do this.

Iterators and generators in Python are mostly for for-loops and 
comprehensions. In the rare case where you want lookahead you can 
just write your own or use an adapter.
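
As a rough sketch of such an adapter, one element of lookahead can be had by pulling the first item and chaining it back in front. The `peek` helper is a hypothetical name for this example; `itertools.chain` is from the standard library.

```python
import itertools


def peek(iterator):
    """Return (first_item, iterator that still yields first_item)."""
    first = next(iterator)  # StopIteration propagates if the iterator is empty
    return first, itertools.chain([first], iterator)


tokens = iter(["2016", "-", "03", "-", "09"])
head, tokens = peek(tokens)  # look at the first token without losing it
remaining = list(tokens)
```

The caller has to rebind the iterator to the returned one, so this is less convenient than a range's `front`, but it needs no wrapper class.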

 Of course, that's just looking at input ranges versus 
 iterators. If you look at other types of ranges, there's a lot 
 there that Python is missing.

Is there any work done on range-iterators and streams?

Mar 10 2016

Jack Stouffer <jack jackstouffer.com> writes:

On Wednesday, 9 March 2016 at 22:12:42 UTC, Walter Bright wrote:
 I haven't read the article yet, but you'll get more interest by 
 putting a summary as the first comment on reddit.

Thanks for the advice, I think it caused more people to read it.

Also, I forgot to mention in the article that the unit tests with 
coverage reports enabled run in 110ms. I love fast tests :)

Mar 10 2016

cym13 <cpicard openmailbox.org> writes:

On Thursday, 10 March 2016 at 21:25:16 UTC, Jack Stouffer wrote:
 On Wednesday, 9 March 2016 at 22:12:42 UTC, Walter Bright wrote:
 I haven't read the article yet, but you'll get more interest 
 by putting a summary as the first comment on reddit.

 Thanks for the advice, I think it caused more people to read it.

 Also, I forgot to mention in the article that the unit tests 
 with coverage reports enabled run in 110ms. I love fast tests :)

Did you time the Python tests too? A value by itself doesn't mean 
much to me.

Mar 11 2016

Jack Stouffer <jack jackstouffer.com> writes:

On Wednesday, 9 March 2016 at 21:55:23 UTC, Jack Stouffer wrote:
 The code and docs can be found here: 
 https://github.com/JackStouffer/date-parser

Quick update: all dateutil tests are now passing. It can now 
parse just about any date format you can throw at it :)

Mar 18 2016
