digitalmars.D - How to represent multiple files in a forum post?

reply Jonathan Marler <johnnymarler gmail.com> writes:
 timotheecour and I came up with a solution to a common problem:

How to represent multiple files in a forum post?

So we decided to take a stab at creating a standard! (cue links 
to https://xkcd.com/927)

We're calling it "har" (inspired by the name tar). Here's the 
REPO: https://github.com/marler8997/har and here's what it looks 
like:

--- file1.d
module file1;

--- file2.d
module file2;

// some cool stuff

--- main.d
import file1, file2;
void main() { }

--- Makefile
main: main.d file1.d file2.d
     dmd main.d file1.d file2.d

The repo contains the standard in README.md and a reference 
implementation for extracting files from a har file (archiving 
not implemented yet).

One of the great things is when someone creates a post with this 
format, you can simply copy paste it to "stuff.har" and then 
extract it with `har stuff.har`.  No need to create each 
individual file and copy/paste the contents to each one.

Is this going to change the world? No...but seems like a nice 
solution to a minor annoyance :)
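
A minimal sketch of what extraction involves, written in Python purely for illustration (the reference implementation in the repo is separate and may differ in details). It assumes the default `--- ` delimiter, treats every line starting with it as a file header, and ignores anything before the first header. `parse_har` is a hypothetical name, not part of the har tool.

```python
def parse_har(text, delimiter="---"):
    """Split har text into an ordered {filename: contents} mapping.

    Assumption: a header is any line starting with the delimiter
    followed by a single space; the rest of the line is the file name.
    """
    files = {}
    current = None
    prefix = delimiter + " "
    for line in text.splitlines(keepends=True):
        if line.startswith(prefix):
            current = line[len(prefix):].strip()
            files[current] = ""
        elif current is not None:
            files[current] += line
        # Lines before the first header are ignored in this sketch.
    return files


example = """--- file1.d
module file1;

--- main.d
import file1;
void main() { }
"""

for name, body in parse_har(example).items():
    print(name, "->", repr(body))
```

Note that a content line which itself starts with `--- ` would be mistaken for a header here, so this is only good for simple posts.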
Feb 14 2018
next sibling parent reply user1234 <user1234 12.nl> writes:
On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?

 So we decided to take a stab at creating a standard! (cue 
 links to https://xkcd.com/927)

 We're calling it "har" (inspired by the name tar). Here's the 
 REPO: https://github.com/marler8997/har and here's what it 
 looks like:

 --- file1.d
 module file1;

 --- file2.d
 module file2;

 // some cool stuff

 --- main.d
 import file1, file2;
 void main() { }

 --- Makefile
 main: main.d file1.d file2.d
     dmd main.d file1.d file2.d

 The repo contains the standard in README.md and a reference 
 implementation for extracting files from a har file (archiving 
 not implemented yet).

 One of the great things is when someone creates a post with 
 this format, you can simply copy paste it to "stuff.har" and 
 then extract it with `har stuff.har`.  No need to create each 
 individual file and copy/paste the contents to each one.

 Is this going to change the world? No...but seems like a nice 
 solution to a minor annoyance :)
How does it mix with markdown, html, etc.? They'll have to use escapes to be compliant, won't they?
Feb 14 2018
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Wednesday, 14 February 2018 at 18:44:06 UTC, user1234 wrote:
 how does it mix with markdown, html, etc.?
 They'll have to use escapes to be compliant, won't they?
Works great with markdown.

```
--- file1
Contents of file1
--- file2
Contents of file2
```
Feb 14 2018
parent reply user1234 <user1234 12.nl> writes:
On Wednesday, 14 February 2018 at 18:47:31 UTC, Jonathan Marler 
wrote:
 On Wednesday, 14 February 2018 at 18:44:06 UTC, user1234 wrote:
 how does it mix with markdown, html, etc.?
 They'll have to use escapes to be compliant, won't they?
 Works great with markdown.
 ```
 --- file1
 Contents of file1
 --- file2
 Contents of file2
 ```
hyphens are used for titles in some flavors.
Feb 14 2018
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Wednesday, 14 February 2018 at 18:52:35 UTC, user1234 wrote:
 On Wednesday, 14 February 2018 at 18:47:31 UTC, Jonathan Marler 
 wrote:
 On Wednesday, 14 February 2018 at 18:44:06 UTC, user1234 wrote:
 how does it mix with markdown, html, etc.?
 They'll have to use escapes to be compliant, won't they?
 Works great with markdown.
 ```
 --- file1
 Contents of file1
 --- file2
 Contents of file2
 ```
 hyphens are used for titles in some flavors.
2 things mitigate that.

1) Markdown does not preprocess text in between triple backticks.
2) Even if you didn't put your HAR file in between triple backticks:

   Har uses "newline, dash, dash, dash, space, name"
   Markdown uses "newline, dash, dash, dash, dash*, newline"

These don't actually conflict, i.e.

Markdown title
---

--- har file
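
To make the distinction concrete, here is a small Python sketch that classifies a line using the two patterns as described in this post. The regular expressions are my reading of that description, not something taken from the har README or from any Markdown spec, and `classify` is a hypothetical helper.

```python
import re

# Assumed patterns, based only on the description above.
HAR_HEADER = re.compile(r"^--- (\S.*)$")    # dash dash dash, space, name
MARKDOWN_RULE = re.compile(r"^-{3,}\s*$")   # three or more dashes, nothing else


def classify(line):
    """Return 'har', 'markdown' or 'text' for a single line (no newline)."""
    if HAR_HEADER.match(line):
        return "har"
    if MARKDOWN_RULE.match(line):
        return "markdown"
    return "text"


for line in ["Markdown title", "---", "--- har file", "-----"]:
    print(repr(line), "->", classify(line))
```

Because a har header needs a space and a name after the dashes, and a Markdown underline needs nothing but dashes, no single line can match both patterns.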
Feb 14 2018
prev sibling next sibling parent reply Seb <seb wilzba.ch> writes:
On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?
What's wrong with https://gist.github.com?

FYI: I have it on my agenda to work with Vladimir to add run.dlang.io support to DFeed (the software that runs this forum) [1]. Ideally it will be like those runnable snippets on StackOverflow [2].

[1] https://github.com/CyberShadow/DFeed
[2] https://stackoverflow.blog/2014/09/16/introducing-runnable-javascript-css-and-html-code-snippets/
Feb 14 2018
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Wednesday, 14 February 2018 at 18:45:59 UTC, Seb wrote:
 On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
 wrote:
  timotheecour and I came up with a solution to a common 
 problem:

 How to represent multiple files in a forum post?
 What's wrong with https://gist.github.com?
I'm not sure how that allows you to represent multiple files in a forum post, or how it would help you download the files locally to reproduce/test them.
Feb 14 2018
prev sibling next sibling parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?
I've been using:

https://github.com/CyberShadow/misc/blob/master/dir2bug.d

It looks pretty similar to har, but the delimiters use the comment syntax of D and C-like languages.
Feb 14 2018
prev sibling next sibling parent reply John Gabriele <jgabriele fastmail.fm> writes:
On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?

 So we decided to take a stab at creating a standard! (cue 
 links to https://xkcd.com/927)

 We're calling it "har" (inspired by the name tar).
Clever name. Har D har har {ducks!} :)
 Here's the REPO: https://github.com/marler8997/har and here's 
 what it looks like:

 --- file1.d
 module file1;

 --- file2.d
 module file2;

 // some cool stuff

 --- main.d
 import file1, file2;
 void main() { }

 --- Makefile
 main: main.d file1.d file2.d
     dmd main.d file1.d file2.d
This looks handy. Yes, it's easy enough in markdown docs to just put code block markers around them (such as ``` or ~~~).

Can the har file delimiter be more than three characters?

What do you think of allowing trailing dashes (or whatever the delim chars are) after the file/dir name? It would make it easier to see the delimiters for larger har'd files.

    --- file1.d -------------------
    module file1;

    --- file2.d -------------------
    module file2;

(Note that markdown allows extra trailing characters with its ATX-style headers, and Pandoc does likewise with ATX headers as well as its div syntax (delimited by at least three colons), for that very reason --- to make it easier to spot them.)
Feb 14 2018
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Wednesday, 14 February 2018 at 20:16:32 UTC, John Gabriele 
wrote:
 Can the har file delimiter be more than three characters?
Yes. So long as the delimiter is consistent across the whole file, i.e.

-------- file1
-------- file2

(See https://github.com/marler8997/har#custom-delimiters)
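
A rough Python sketch of how a parser could honor a custom delimiter. The assumption here (mine, not the README's) is that the delimiter is whatever precedes the first space on the first line, and that every later header must start with that exact string. `detect_delimiter` and `split_files` are hypothetical names.

```python
def detect_delimiter(text):
    """Return the characters before the first space on the first line."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty archive")
    delimiter, sep, name = lines[0].partition(" ")
    if not sep or not delimiter or not name.strip():
        raise ValueError("first line is not '<delimiter> <name>'")
    return delimiter


def split_files(text):
    """Split an archive using the delimiter found on its first line."""
    prefix = detect_delimiter(text) + " "
    files, current = {}, None
    for line in text.splitlines(keepends=True):
        if line.startswith(prefix):
            current = line[len(prefix):].strip()
            files[current] = ""
        else:
            # The first line is always a header, so current is set here.
            files[current] += line
    return files


archive = "-------- file1\n--- not a header\n-------- file2\nmore\n"
print(split_files(archive))
```

One practical use of a longer delimiter: the contents can then contain lines starting with `--- ` without being split.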
 What do you think of allowing trailing dashes (or whatever the 
 delim chars are) after the file/dir name? It would make it 
 easier to see the delimiters for larger har'd files.

     --- file1.d -------------------
     module file1;

     --- file2.d -------------------
     module file2;

 (Note that markdown allows extra trailing characters with its 
 ATX-style headers, and Pandoc does likewise with ATX headers as 
 well as its div syntax (delimited by at least three colons), 
 for that very reason --- to make it easier to spot them.)
Given the simplicity of the addition and the fact that other standards have found it helps readability...I think you've made a fair case. I'll add a note to the README listing it as a probable addition.
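
Since this is only a probable addition, the following Python sketch is speculative: it shows one way a header line could be parsed if trailing delimiter characters were allowed. The rule it uses (a trailing run of the delimiter's character, separated from the name by a space, is decoration) is an assumption, and `parse_header` is a hypothetical helper.

```python
def parse_header(line, delimiter="---"):
    """Return the file name from a header line, or None if not a header."""
    prefix = delimiter + " "
    if not line.startswith(prefix):
        return None
    name = line[len(prefix):].rstrip()
    # Hypothetical rule: drop a trailing run made only of the delimiter's
    # character when it is set off from the name by a space.
    head, _, tail = name.rpartition(" ")
    if head and tail and set(tail) == {delimiter[0]}:
        name = head.rstrip()
    return name


for line in ["--- file1.d -------------------", "--- file2.d", "module file1;"]:
    print(repr(line), "->", parse_header(line))
```

This assumes the delimiter is a single repeated character; a file whose name really ends in a space followed by dashes could not be represented under this rule.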
Feb 14 2018
prev sibling next sibling parent reply Martin Nowak <code dawg.eu> writes:
On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?
Oh, I thought it already was a standard, but [.har](https://en.wikipedia.org/wiki/.har) is JSON based and very different, so the name is already taken.

Have you done some proper research for existing standards? There is a 50 year old technique to use ASCII file separators (https://stackoverflow.com/a/18782271/2371032). Surely we can find some existing ones. We recently wondered how gcc/llvm test codegen; at least in gcc or gdb I've seen a custom test case format.
Feb 17 2018
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Saturday, 17 February 2018 at 22:11:28 UTC, Martin Nowak wrote:
 On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
 wrote:
  timotheecour and I came up with a solution to a common 
 problem:

 How to represent multiple files in a forum post?
 Oh, I thought it already was a standard, but [.har](https://en.wikipedia.org/wiki/.har) is JSON based and very different, so the name is already taken.
 Have you done some proper research for existing standards? There is a 50 year old technique to use ASCII file separators (https://stackoverflow.com/a/18782271/2371032). Surely we can find some existing ones. We recently wondered how gcc/llvm test codegen; at least in gcc or gdb I've seen a custom test case format.
Those seem to be binary control characters, and they don't let you specify the filename... how does that solve the problem of being able to represent multiple files in a forum post?

If there is an existing standard that's great, I wasn't able to find one. If you find one let me know.
Feb 17 2018
parent reply Martin Nowak <code dawg.eu> writes:
On Sunday, 18 February 2018 at 04:04:48 UTC, Jonathan Marler 
wrote:
 If there is an existing standard that's great, I wasn't able to 
 find one.  If you find one let me know.
Found ptar (https://github.com/jtvaughan/ptar) and shar (https://linux.die.net/man/1/shar); neither is a very good fit, so indeed a custom format might be in order.
Feb 18 2018
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Sunday, 18 February 2018 at 21:40:34 UTC, Martin Nowak wrote:
 On Sunday, 18 February 2018 at 04:04:48 UTC, Jonathan Marler 
 wrote:
 If there is an existing standard that's great, I wasn't able 
 to find one.  If you find one let me know.
 Found ptar (https://github.com/jtvaughan/ptar) and shar (https://linux.die.net/man/1/shar); neither is a very good fit, so indeed a custom format might be in order.
Interesting. Shar definitely doesn't fit the bill (I used it to archive a simple file and got a HUGE shell file). However, PTAR is interesting. Here's the example they provide (https://github.com/jtvaughan/ptar/blob/master/FORMAT.md#example-archive)

======================================================================
PTAR
======================================================================
Metadata Encoding: utf-8
Archive Creation Date: 2013-09-24T22:41:20Z
Path: a.txt
Type: Regular File
File Size: 32
User Name: foo
User ID: 1000
Group Name: bar
Group ID: 1001
Permissions: 0000664
Modification Time: 1380015036
---
These are the contents of a.txt
---
Path: b.txt
Type: Regular File
File Size: 32
User Name: foo
User ID: 1000
Group Name: baz
Group ID: 522
Permissions: 0000664
Modification Time: 1380015048
---
These are the contents of b.txt
---

In HAR you could represent the same 2 files (omitting all the metadata) with:

======================================================================
HAR
======================================================================
--- a.txt
These are the contents of a.txt
--- b.txt
These are the contents of b.txt

The PTAR format might be salvageable if many of the fields were optional; however, it appears that most of them are always required, i.e.
 NOTE: Unless otherwise specified, all of the aforementioned 
 keys are required. Keys that do not apply to a file entry are 
 silently ignored.
If the standard was changed though (along with making the initial file header optional), the example could be slimmed down to this:

Path: a.txt
---
These are the contents of a.txt
---
Path: b.txt
---
These are the contents of b.txt
---

This format looks fine, but it's a big modification of PTAR as it exists.
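
For comparison, writing the HAR form of those two files takes very little code. This is a Python sketch, not the har tool's archiver: `to_har` is a hypothetical function, and refusing contents that contain header-like lines is a simplification of mine (the real format offers custom delimiters for that case).

```python
def to_har(files, delimiter="---"):
    """Render a {filename: contents} mapping as har text."""
    prefix = delimiter + " "
    parts = []
    for name, contents in files.items():
        for line in contents.splitlines():
            if line.startswith(prefix):
                # Simplification: refuse rather than pick a longer delimiter.
                raise ValueError(f"{name!r} contains a line that looks like a header")
        if contents and not contents.endswith("\n"):
            contents += "\n"
        parts.append(f"{prefix}{name}\n{contents}")
    return "".join(parts)


print(to_har({
    "a.txt": "These are the contents of a.txt\n",
    "b.txt": "These are the contents of b.txt\n",
}), end="")
```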
Feb 18 2018
prev sibling next sibling parent reply Sönke Ludwig <sludwig+d outerproduct.org> writes:
On 14.02.2018 at 19:33, Jonathan Marler wrote:
  timotheecour and I came up with a solution to a common problem:
 
 How to represent multiple files in a forum post?
 
Why not multipart/mixed? Since this is NNTP based, wouldn't that be the natural choice? That is, assuming that forum.dlang.org is the target for this, of course.
Feb 18 2018
next sibling parent Timothee Cour <thelastmammoth gmail.com> writes:
On Sun, Feb 18, 2018 at 4:46 PM, Sönke Ludwig via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 14.02.2018 at 19:33, Jonathan Marler wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?
 Why not multipart/mixed? Since this is NNTP based, wouldn't that be the natural choice? That is, assuming that forum.dlang.org is the target for this, of course.
No, it should be usable in other plain-text contexts (e.g. email, a Bugzilla entry, a GitHub entry, etc.).
Feb 18 2018
prev sibling parent Jonathan Marler <johnnymarler gmail.com> writes:
On Sunday, 18 February 2018 at 23:46:05 UTC, Sönke Ludwig wrote:
 On 14.02.2018 at 19:33, Jonathan Marler wrote:
  timotheecour and I came up with a solution to a common 
 problem:
 
 How to represent multiple files in a forum post?
 
 Why not multipart/mixed? Since this is NNTP based, wouldn't that be the natural choice? That is, assuming that forum.dlang.org is the target for this, of course.
Actually, using multipart/mixed was my initial thought! But that format is a bit awkward and verbose for humans to type, which makes sense because it was designed to be generated by programs, not as a human-maintained block of text.

HAR:
======================================================================
--- a.txt
This is a.txt
--- b.txt
This is b.txt
======================================================================

Multipart:
======================================================================
Content-Type: multipart/alternative; boundary=<some-boundary>

--<some-boundary>
Content-Type: text/plain; charset=us-ascii
Filename: a.txt

This is a.txt
--<some-boundary>
Content-Type: text/plain; charset=us-ascii
Filename: b.txt

This is b.txt
======================================================================

HAR was actually born out of Multipart, it's really just a simplified version of it :)
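
To put a rough number on the verbosity, here is a Python sketch that renders the same two files both ways. It uses the standard library's `email.mime` classes to generate a multipart/mixed message, carrying the file name in a `Content-Disposition` header (the usual MIME convention, rather than the `Filename:` line in my hand-written example). `to_multipart` and `to_har` are hypothetical helpers written just for this comparison.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def to_multipart(files):
    """Render {filename: contents} as a multipart/mixed message string."""
    message = MIMEMultipart("mixed")
    for name, contents in files.items():
        part = MIMEText(contents, "plain", "us-ascii")
        part.add_header("Content-Disposition", "attachment", filename=name)
        message.attach(part)
    return message.as_string()


def to_har(files):
    """Render {filename: contents} with the default '--- ' delimiter."""
    return "".join(f"--- {name}\n{contents}" for name, contents in files.items())


files = {"a.txt": "This is a.txt\n", "b.txt": "This is b.txt\n"}
print(len(to_har(files)), "characters as HAR")
print(len(to_multipart(files)), "characters as multipart/mixed")
```

The generated boundary string differs from run to run, so the multipart length is not fixed, but it is always several times the HAR length for files this small.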
Feb 18 2018
prev sibling next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, February 14, 2018 18:33:23 Jonathan Marler via Digitalmars-d 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?

 So we decided to take a stab at creating a standard! (cue links
 to https://xkcd.com/927)

 We're calling it "har" (inspired by the name tar). Here's the
 REPO: https://github.com/marler8997/har and here's what it looks
 like:

 --- file1.d
 module file1;

 --- file2.d
 module file2;

 // some cool stuff

 --- main.d
 import file1, file2;
 void main() { }

 --- Makefile
 main: main.d file1.d file2.d
      dmd main.d file1.d file2.d

 The repo contains the standard in README.md and a reference
 implementation for extracting files from a har file (archiving
 not implemented yet).

 One of the great things is when someone creates a post with this
 format, you can simply copy paste it to "stuff.har" and then
 extract it with `har stuff.har`.  No need to create each
 individual file and copy/paste the contents to each one.

 Is this going to change the world? No...but seems like a nice
 solution to a minor annoyance :)
Okay. Maybe, I'm dumb, but what is the point of all of this? Why would any kind of standard be necessary at all? Most newsgroup posts have snippets of code at most, and those which need to have the contents of one or more files are going to be showing short files where it's a human that's reading them, not a computer. So, as long as the files are delimited in a way that's clear to the reader, why would any kind of standard be needed? I could do

===== file1.d =====
...contents here...
===================
===== file2.d =====
...contents here...
===================

or

----- file1.d -----
...contents here...
___________________
----- file2.d -----
...contents here...
___________________

or

----- file1.d
...contents here...
-----
----- file2.d
...contents here...
-----

or anything else that was delimited fairly clearly, and the programmers reading my post would be able to easily see where one file begins and another ends. If a computer were interpreting the post and doing something with the code, then that wouldn't cut it, because it's not going to just "figure out" what the poster meant, but for a human, that's generally not a problem, and forum posts are meant to be read by humans.

So, what is this proposal supposed to be solving? Was it just bugging you guys that people aren't consistent?

- Jonathan M Davis
Feb 18 2018
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Monday, 19 February 2018 at 01:26:43 UTC, Jonathan M Davis 
wrote:
 Okay. Maybe, I'm dumb, but what is the point of all of this? 
 Why would any kind of standard be necessary at all?
Good question. Having a standard allows computers to interface with the archive as well as humans. It's not hard to create "ad hoc" formats on the fly to represent multiple files, which is why having a standard doesn't immediately come to mind. But with a standard, you can create tools to process that format and understand it.

As an example, we're exploring 2 useful applications, namely, representing multi-file tests in the dmd test suite and multi-file programs in https://run.dlang.io. It's also already been useful to me in copy/pasting complex test cases to forums without having to manage a bunch of individual files. Of course, these are just a few examples of the benefits you get with a standard.

So...is it necessary? No. Is it helpful? I think so. Is it worth it? I think there's a good case for it when you weigh the simplicity of it against the benefit.
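
As an illustration of the kind of tool a fixed format makes possible, here is a Python sketch that unpacks an archive into a directory, e.g. to set up a multi-file test case. It is not the har tool itself: `extract` is a hypothetical function, it only handles the default `--- ` delimiter, and the refusal to write outside the target directory is a precaution I added, not a documented har behavior.

```python
import os
import tempfile


def extract(text, out_dir, delimiter="---"):
    """Write each file in the archive under out_dir; return the names."""
    prefix = delimiter + " "
    root = os.path.realpath(out_dir)
    written, handle = [], None
    try:
        for line in text.splitlines(keepends=True):
            if line.startswith(prefix):
                if handle:
                    handle.close()
                    handle = None
                name = line[len(prefix):].strip()
                target = os.path.realpath(os.path.join(root, name))
                # Reject empty names, absolute paths and "../" escapes.
                if target == root or os.path.commonpath([root, target]) != root:
                    raise ValueError(f"refusing to write outside {out_dir}: {name!r}")
                os.makedirs(os.path.dirname(target), exist_ok=True)
                handle = open(target, "w", newline="")
                written.append(name)
            elif handle:
                handle.write(line)
    finally:
        if handle:
            handle.close()
    return written


with tempfile.TemporaryDirectory() as tmp:
    names = extract("--- src/main.d\nvoid main() { }\n--- Makefile\nall:\n", tmp)
    print(names)
```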
Feb 18 2018
prev sibling parent reply Seb <seb wilzba.ch> writes:
On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
wrote:
  timotheecour and I came up with a solution to a common problem:

 How to represent multiple files in a forum post?
... and now it's available on run.dlang.io:

https://run.dlang.io/is/ZHm2Xe

This means that it can be used for the DTour, Phobos documentation and runnable specification examples. If we later this year integrate run.dlang.io with DFeed, it will also be available here and who knows maybe someday we can also super-power Bugzilla...

It was super-easy to integrate, thanks Jonathan Marler for making such a simple, but very useful tool!
Feb 20 2018
parent Basile B. <b2.temp gmx.com> writes:
On Tuesday, 20 February 2018 at 18:41:08 UTC, Seb wrote:
 On Wednesday, 14 February 2018 at 18:33:23 UTC, Jonathan Marler 
 wrote:
  timotheecour and I came up with a solution to a common 
 problem:

 How to represent multiple files in a forum post?
 ... and now it's available on run.dlang.io: https://run.dlang.io/is/ZHm2Xe
 This means that it can be used for the DTour, Phobos documentation and runnable specification examples. If we later this year integrate run.dlang.io with DFeed, it will also be available here and who knows maybe someday we can also super-power Bugzilla...
 It was super-easy to integrate, thanks Jonathan Marler for making such a simple, but very useful tool!
This is awesome. I wish something similar existed directly in programming languages; it would make testing protection and cross dependencies easier. D has a special token sequence that _could_ be used for it (https://dlang.org/spec/lex.html#special-token-sequence), but anyway, this is off topic.

Nice work (both spec + impl on run.dlang). If you add this sample as an example, maybe use `writeln(__MODULE__);` to make things more obvious.
Feb 20 2018