digitalmars.D.announce - text based file formats
- Robert Schadek (24/24) Dec 18 2022 I complained before that D and Phobos need more stuff.
- Adam D Ruppe (4/6) Dec 18 2022 my dom.d doesn't do the sax parser part but has its own
- rikki cattermole (10/13) Dec 18 2022 I've toyed with std.experimental.xml.
- Adrian Matoga (8/13) Dec 20 2022 I frequently find it useful for a text data file parser to call a
- Walter Bright (2/7) Dec 21 2022 Yes, sometimes I think this might be the right answer.
- CM (3/4) Dec 18 2022 Thank you for remembering it. I feel like I'm one of the few who
- Per Nordlöw (3/7) Dec 19 2022 If I were you I would join forces with Ilya and work on getting
- Walter Bright (5/6) Dec 19 2022 Yup!
- Adam D Ruppe (2/3) Dec 19 2022 Maybe std.csv is already good enough?
- Walter Bright (2/6) Dec 19 2022 LOL, learn something every day! I've even written my own, but it isn't v...
- H. S. Teoh (10/17) Dec 19 2022 There's also my little experimental csv parser that was designed to be
- John Colvin (2/20) Dec 20 2022 We use this at work with some light tweaks, it's done a lot of work
- H. S. Teoh (8/17) Dec 20 2022 [...]
- 9il (3/29) Dec 20 2022 It has already been replaced with
- Tejas (4/12) Dec 21 2022 Wow, I didn't even know `mir.csv` was a thing
- John Colvin (3/11) Dec 21 2022 Hah, so it has! Well anyway, it did do a lot of hard work for us
- Walter Bright (2/5) Dec 21 2022 Propose this for Phobos?
- Per Nordlöw (2/4) Dec 22 2022 Great work. Will this module be extracted into a separate package?
- Walter Bright (2/3) Dec 21 2022 Sweet!
- Adam D Ruppe (6/8) Dec 21 2022 Yeah, I wrote a csv module too back in... I think 2010, before
- Walter Bright (2/10) Dec 21 2022 What this all means is Phobos could use a better one!
- Robert Schadek (5/10) Dec 19 2022 As Adam said, std.csv is already there and it's at least from my
- Robert Schadek (1/1) Dec 19 2022 replay -> reply
- bachmeier (7/20) Dec 19 2022 A natural complement to this would be the functionality in
I complained before that D and Phobos need more stuff. I can't do it all by myself, but I can ask for help. So here it goes:

https://github.com/burner/textbasedfileformats

As it says on the tin, text based file formats is a library of SAX and DOM parsers for text based file formats.

I would like to get the following file formats in:

* json (JSON5), there is actually some code in there already
* xml, there is some code already, the old std.experimental.xml code
* yaml, maybe there is something in code.dlang.org to be reused
* toml, maybe there is something in code.dlang.org to be reused
* ini, can likely be parsed by the toml parser
* sdl, I know I know, but D uses it.

There are a few design guidelines I would like to adhere to:

* If it exists in phobos, use phobos
* have the DOM parser based on the sax parser
* no return by ref
* make it safe and pure if possible (and it's likely possible)
* share the std.sumtype type if possible (yaml, toml should work)
* no nogc, this should eventually get into phobos

So stop talking, and start creating PRs.

For the project admin stuff, this will use github. There are milestones for the five formats, so please start creating the issues you want/can work on and start typing.
Dec 18 2022
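To make the "DOM parser based on the sax parser" guideline concrete, here is a minimal sketch for a tiny INI subset (section headers, `key = value` lines, `;` comments). Every name in it (`parseIniSax`, `DomBuilder`, `IniDocument`, `onSection`, `onKeyValue`) is hypothetical and chosen for illustration only; it is not the API of textbasedfileformats.

```d
import std.algorithm.searching : endsWith, findSplit, startsWith;
import std.string : lineSplitter, strip;

/// SAX-style pass: walks the input once and reports events to `handler`.
/// `Handler` is any type providing `onSection` and `onKeyValue`
/// (hypothetical event names, assumed for this sketch).
void parseIniSax(Handler)(string input, ref Handler handler) @safe
{
    foreach (line; input.lineSplitter)
    {
        auto text = line.strip;
        if (text.length == 0 || text.startsWith(";"))
            continue; // blank line or comment
        if (text.startsWith("[") && text.endsWith("]"))
        {
            handler.onSection(text[1 .. $ - 1].strip);
            continue;
        }
        auto parts = text.findSplit("=");
        if (parts[1].length) // only report lines that contain '='
            handler.onKeyValue(parts[0].strip, parts[2].strip);
    }
}

/// DOM for the same subset: section name -> key -> value.
struct IniDocument
{
    string[string][string] sections;
}

/// Event handler that assembles the DOM from the SAX events.
struct DomBuilder
{
    IniDocument doc;
    string current; // keys before the first header land in section ""

    void onSection(string name) @safe
    {
        current = name;
    }

    void onKeyValue(string key, string value) @safe
    {
        doc.sections[current][key] = value;
    }
}

/// The DOM parser is the SAX parser plus a building handler.
IniDocument parseIniDom(string input) @safe
{
    DomBuilder builder;
    parseIniSax(input, builder);
    return builder.doc;
}
```

With this split, a caller that only needs one value can supply its own handler and skip building the document, while `parseIniDom` reuses exactly the same tokenizing code.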
On Sunday, 18 December 2022 at 15:56:38 UTC, Robert Schadek wrote:
> * xml, there is some code already, the old std.experimental.xml code

my dom.d doesn't do the sax parser part but has its own advantages over the other things (including being continually maintained for over a decade, unlike the phobos things)
Dec 18 2022
On 19/12/2022 4:56 AM, Robert Schadek wrote:
> * xml, there is some code already, the old std.experimental.xml code

I've toyed with std.experimental.xml. I'm not convinced that it is a good code base for inclusion.

> * no return by ref

As a bit of a follow up of what we were talking about on BeerConf:

Because these are not data structures, they won't own externally facing memory (that's the GC's job). So these lifetime issues with ref should never be encountered.

> * make it safe and pure if possible (and it's likely possible)

pure is always a worry for me, but yeah safe and ideally nothrow (if they are forgiving, which they absolutely should be, there is no reason to throw an exception until it's time to inspect it).
Dec 18 2022
On Sunday, 18 December 2022 at 16:12:35 UTC, rikki cattermole wrote:
>> * make it safe and pure if possible (and it's likely possible)
> pure is always a worry for me, but yeah safe and ideally nothrow (if they are forgiving, which they absolutely should be, there is no reason to throw an exception until it's time to inspect it).

I frequently find it useful for a text data file parser to call a diagnostic callback instead of assuming some default behavior (whether that's forgiving, printing warnings, throwing or something else). With template callback parameters the parser can throw if the user wants it or stay pure nothrow if no action is required.
Dec 20 2022
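The diagnostic-callback idea can be sketched as follows, assuming a deliberately trivial format (comma-separated unsigned integers) so the attribute inference stays visible. The names `parseUintList`, `ignoreDiagnostic` and `throwDiagnostic` are hypothetical, and overflow is not checked. Because the callback is an alias template parameter, the compiler infers the parser's attributes separately for each instantiation.

```d
import std.conv : to;

/// Parses comma-separated unsigned integers such as "1,22,333".
/// A field that is empty or contains a non-digit is reported through
/// `onDiagnostic(position, message)` and then skipped.
/// No attributes are written here: they are inferred per instantiation.
uint[] parseUintList(alias onDiagnostic)(const(char)[] input)
{
    uint[] result;
    uint value;
    bool hasDigits, bad;

    // The position one past the end acts as a final separator.
    for (size_t i = 0; i <= input.length; ++i)
    {
        if (i == input.length || input[i] == ',')
        {
            if (bad || !hasDigits)
                onDiagnostic(i, "expected an unsigned integer");
            else
                result ~= value;
            value = 0;
            hasDigits = false;
            bad = false;
        }
        else if (input[i] >= '0' && input[i] <= '9')
        {
            value = value * 10 + (input[i] - '0'); // no overflow check
            hasDigits = true;
        }
        else
            bad = true;
    }
    return result;
}

/// Forgiving policy: drop the diagnostic.
void ignoreDiagnostic(size_t position, string message) @safe pure nothrow @nogc
{
}

/// Strict policy: turn the diagnostic into an exception.
void throwDiagnostic(size_t position, string message) @safe pure
{
    throw new Exception(message ~ " before offset " ~ position.to!string);
}

/// Compiles only because the forgiving instantiation is inferred nothrow.
uint[] parseForgiving(const(char)[] input) @safe pure nothrow
{
    return parseUintList!ignoreDiagnostic(input);
}

/// Same parser body, but this instantiation may throw.
uint[] parseStrict(const(char)[] input) @safe pure
{
    return parseUintList!throwDiagnostic(input);
}
```

The parser body is written once; whether it is forgiving or throwing is decided entirely by the caller's choice of callback.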
On 12/20/2022 1:51 PM, Adrian Matoga wrote:
> I frequently find it useful for a text data file parser to call a diagnostic callback instead of assuming some default behavior (whether that's forgiving, printing warnings, throwing or something else). With template callback parameters the parser can throw if the user wants it or stay pure nothrow if no action is required.

Yes, sometimes I think this might be the right answer.
Dec 21 2022
On Sunday, 18 December 2022 at 15:56:38 UTC, Robert Schadek wrote:
> * sdl, I know I know, but D uses it.

Thank you for remembering it. I feel like I'm one of the few who prefer SDL to YAML, JSON, and the like.
Dec 18 2022
On Sunday, 18 December 2022 at 15:56:38 UTC, Robert Schadek wrote:
> So stop talking, and start creating PRs.
>
> For the project admin stuff, this will use github. There are milestones for the five formats, so please start creating the issues you want/can work on and start typing.

If I were you I would join forces with Ilya and work on getting the mir libraries doing text-parsing integrated into Phobos.
Dec 19 2022
On 12/18/2022 7:56 AM, Robert Schadek wrote:
> So stop talking, and start creating PRs.

Yup!

Curious why CSV isn't in the list. I encounter that a lot at tax time.

https://en.wikipedia.org/wiki/Comma-separated_values

Maybe just ask OpenAI?
Dec 19 2022
On Monday, 19 December 2022 at 09:55:47 UTC, Walter Bright wrote:
> Curious why CSV isn't in the list.

Maybe std.csv is already good enough?
Dec 19 2022
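For readers who, like Walter, had not come across it: a minimal sketch of reading typed records with Phobos' `std.csv.csvReader`. The `Expense` layout, the `totalFor` helper and the sample data are made up for illustration; only `csvReader` itself comes from Phobos.

```d
import std.csv : csvReader;

/// Hypothetical record layout: one struct field per CSV column, in order.
struct Expense
{
    string category;
    string payee;
    double amount;
}

/// Sums the `amount` column over all rows whose category matches.
/// Assumes header-less input with the default ',' separator.
double totalFor(string csvText, string category)
{
    double total = 0;
    foreach (record; csvReader!Expense(csvText))
    {
        if (record.category == category)
            total += record.amount;
    }
    return total;
}
```

Fields are converted to the struct's member types as each record is read, and a quoted field such as `"Ink, Inc."` may contain the separator.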
On 12/19/2022 4:35 AM, Adam D Ruppe wrote:
> On Monday, 19 December 2022 at 09:55:47 UTC, Walter Bright wrote:
>> Curious why CSV isn't in the list.
> Maybe std.csv is already good enough?

LOL, learn something every day! I've even written my own, but it isn't very good.
Dec 19 2022
On Mon, Dec 19, 2022 at 04:16:57PM -0800, Walter Bright via Digitalmars-d-announce wrote:
> On 12/19/2022 4:35 AM, Adam D Ruppe wrote:
>> On Monday, 19 December 2022 at 09:55:47 UTC, Walter Bright wrote:
>>> Curious why CSV isn't in the list.
>> Maybe std.csv is already good enough?
> LOL, learn something every day! I've even written my own, but it isn't very good.

There's also my little experimental csv parser that was designed to be as fast as possible:

https://github.com/quickfur/fastcsv

However it can only handle input that fits in memory (using std.mmfile is one possible workaround), has a static limit on field sizes, and does not do validation.

T

--
Debian GNU/Linux: Cray on your desktop.
Dec 19 2022
On Tuesday, 20 December 2022 at 00:40:07 UTC, H. S. Teoh wrote:
> On Mon, Dec 19, 2022 at 04:16:57PM -0800, Walter Bright via Digitalmars-d-announce wrote:
>> On 12/19/2022 4:35 AM, Adam D Ruppe wrote:
>>> On Monday, 19 December 2022 at 09:55:47 UTC, Walter Bright wrote:
>>>> Curious why CSV isn't in the list.
>>> Maybe std.csv is already good enough?
>> LOL, learn something every day! I've even written my own, but it isn't very good.
> There's also my little experimental csv parser that was designed to be as fast as possible:
>
> https://github.com/quickfur/fastcsv
>
> However it can only handle input that fits in memory (using std.mmfile is one possible workaround), has a static limit on field sizes, and does not do validation.
>
> T

We use this at work with some light tweaks, it's done a lot of work
Dec 20 2022
On Tue, Dec 20, 2022 at 07:46:36PM +0000, John Colvin via Digitalmars-d-announce wrote:
> [...]
>> There's also my little experimental csv parser that was designed to be as fast as possible:
>>
>> https://github.com/quickfur/fastcsv
>>
>> However it can only handle input that fits in memory (using std.mmfile is one possible workaround), has a static limit on field sizes, and does not do validation.
> [...]
> We use this at work with some light tweaks, it's done a lot of work

Wow, I never expected it to be actually useful. :-P Good to know it's worth something!

T

--
They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Dec 20 2022
On Tuesday, 20 December 2022 at 19:46:36 UTC, John Colvin wrote:
> On Tuesday, 20 December 2022 at 00:40:07 UTC, H. S. Teoh wrote:
>> On Mon, Dec 19, 2022 at 04:16:57PM -0800, Walter Bright via Digitalmars-d-announce wrote:
>>> On 12/19/2022 4:35 AM, Adam D Ruppe wrote:
>>>> On Monday, 19 December 2022 at 09:55:47 UTC, Walter Bright wrote:
>>>>> Curious why CSV isn't in the list.
>>>> Maybe std.csv is already good enough?
>>> LOL, learn something every day! I've even written my own, but it isn't very good.
>> There's also my little experimental csv parser that was designed to be as fast as possible:
>>
>> https://github.com/quickfur/fastcsv
>>
>> However, it can only handle input that fits in memory (using std.mmfile is one possible workaround), has a static limit on field sizes, and does not do validation.
>>
>> T
> We use this at work with some light tweaks, it's done a lot of work

It has already been replaced with [mir.csv](https://github.com/libmir/mir-ion/blob/master/source/mir/csv.d). Mir is faster, SIMD accelerated, and supports numbers and timestamp recognition.
Dec 20 2022
On Wednesday, 21 December 2022 at 04:19:46 UTC, 9il wrote:
> On Tuesday, 20 December 2022 at 19:46:36 UTC, John Colvin wrote:
>> On Tuesday, 20 December 2022 at 00:40:07 UTC, H. S. Teoh wrote:
>>> [...]
>> We use this at work with some light tweaks, it's done a lot of work
> It has already been replaced with [mir.csv](https://github.com/libmir/mir-ion/blob/master/source/mir/csv.d). Mir is faster, SIMD accelerated, and supports numbers and timestamp recognition.

Wow, I didn't even know `mir.csv` was a thing.

Thank you very much!!!
Dec 21 2022
On Wednesday, 21 December 2022 at 04:19:46 UTC, 9il wrote:
> On Tuesday, 20 December 2022 at 19:46:36 UTC, John Colvin wrote:
>> On Tuesday, 20 December 2022 at 00:40:07 UTC, H. S. Teoh wrote:
>>> [...]
>> We use this at work with some light tweaks, it's done a lot of work
> It has already been replaced with [mir.csv](https://github.com/libmir/mir-ion/blob/master/source/mir/csv.d). Mir is faster, SIMD accelerated, and supports numbers and timestamp recognition.

Hah, so it has! Well anyway, it did do a lot of hard work for us for a long time, so thanks :)
Dec 21 2022
On 12/20/2022 8:19 PM, 9il wrote:
> It has already been replaced with [mir.csv](https://github.com/libmir/mir-ion/blob/master/source/mir/csv.d). Mir is faster, SIMD accelerated, and supports numbers and timestamp recognition.

Propose this for Phobos?
Dec 21 2022
On Wednesday, 21 December 2022 at 04:19:46 UTC, 9il wrote:
> It has already been replaced with [mir.csv](https://github.com/libmir/mir-ion/blob/master/source/mir/csv.d). Mir is faster, SIMD accelerated, and supports numbers and timestamp recognition.

Great work. Will this module be extracted into a separate package?
Dec 22 2022
On 12/20/2022 11:46 AM, John Colvin wrote:
> We use this at work with some light tweaks, it's done a lot of work

Sweet!
Dec 21 2022
On Tuesday, 20 December 2022 at 00:16:57 UTC, Walter Bright wrote:
> LOL, learn something every day! I've even written my own, but it isn't very good.

Yeah, I wrote a csv module too back in... I think 2010, before Phobos had one. It is about 90 lines, still works. Nothing special but I actually kinda like it.

https://github.com/adamdruppe/arsd/blob/master/csv.d
Dec 21 2022
On 12/21/2022 6:27 AM, Adam D Ruppe wrote:
> On Tuesday, 20 December 2022 at 00:16:57 UTC, Walter Bright wrote:
>> LOL, learn something every day! I've even written my own, but it isn't very good.
> Yeah, I wrote a csv module too back in... I think 2010, before Phobos had one. It is about 90 lines, still works. Nothing special but I actually kinda like it.
>
> https://github.com/adamdruppe/arsd/blob/master/csv.d

What this all means is Phobos could use a better one!
Dec 21 2022
> Curious why CSV isn't in the list. I encounter that a lot at tax time.

As Adam said, std.csv is already there and it's, at least from my perspective, okay enough.

That being said, I liked how you quoted me here

On Monday, 19 December 2022 at 09:55:47 UTC, Walter Bright wrote:
> On 12/18/2022 7:56 AM, Robert Schadek wrote:
>> So stop talking, and start creating PRs.
> Yup!

and replay, create a PR that puts it on the list ;-)
Dec 19 2022
On Sunday, 18 December 2022 at 15:56:38 UTC, Robert Schadek wrote:
> I complained before that D and Phobos need more stuff. I can't do it all by myself, but I can ask for help. So here it goes:
>
> https://github.com/burner/textbasedfileformats
>
> As it says on the tin, text based file formats is a library of SAX and DOM parsers for text based file formats.
>
> I would like to get the following file formats in:
>
> * json (JSON5), there is actually some code in there already
> * xml, there is some code already, the old std.experimental.xml code
> * yaml, maybe there is something in code.dlang.org to be reused
> * toml, maybe there is something in code.dlang.org to be reused
> * ini, can likely be parsed by the toml parser
> * sdl, I know I know, but D uses it.

A natural complement to this would be the functionality in https://github.com/eBay/tsv-utils

I've created versions of the filter and select functions that take a string as input and return a string or string[] as output. It's a performant way to query text files. Most important, all the hard work is already done.
Dec 19 2022