digitalmars.D - Feature Request - Raw HTML in ddoc comments

reply "Janice Caron" <caron800 googlemail.com> writes:
Hi, I'd love to put raw HTML inside ddoc comments, instead of having
to learn all the ddoc macros, which (...no offense...) aren't quite up
to the job, and are very hard to "debug" when they go wrong.

I'm thinking maybe something like...

/** <?html

    <!-- Actual HTML goes here -->

    <h2>Tutorial</h2>
    <p>Phew! - Now I can write my tutorial, and have tables
    and images and all sorts in it</p>

?> **/ // and now back to our regularly scheduled D...


Then, when the doc is compiled, the contents of an <?html...?> blob
can just be pasted into the generated document, after some small
amount of validation. Of course there should be some caveats, like an
absolute prohibition on Javascript or any other kind of client-side
scripting within such a blob!!!!

Actually, come to think of it, let's make it XHTML, which is much
easier to check for well-formedness.
Feb 21 2008
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 Hi, I'd love to put raw HTML inside ddoc comments,
You can do it now. Just put it in! But I don't recommend it, as then the output is restricted to HTML.
Feb 22 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 22/02/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Janice Caron wrote:
  > Hi, I'd love to put raw HTML inside ddoc comments,

 You can do it now. Just put it in! But I don't recommend it, as then the
  output is restricted to HTML.
Bah! HTML is now the single most portable document format on the planet. It is almost trivial to convert from HTML to any other format. I would think that it would not be too hard to remove the restriction that you mention.

ddoc is OK for documenting functions and classes and the like, but tedious for more involved documentation such as a tutorial. (For example, a single colon anywhere inside a ddoc comment can completely screw up the layout.)

Come to think of it ... I think I'll write a utility which converts from HTML into ddoc macros. Then I can write additional documentation in HTML using a WYSIWYG editor of my choice, convert it to ddoc macros, and then wham it into the source code, without having to worry that I might have made a syntax error somewhere that completely screws up the layout.
Feb 22 2008
next sibling parent Derek Parnell <derek psych.ward> writes:
On Fri, 22 Feb 2008 08:51:49 +0000, Janice Caron wrote:


 ddoc is OK for documenting functions and classes and the like, but
 tedious for more involved documentation such as a tutorial. 
I must disagree. I wrote the entire documentation for Bud just using ddoc and without embedded HTML. It's just a matter of writing the appropriate 'generic' macros and using them. I've attached the 'macro' file I use. To output into a different format, I'd use a different 'macro' file without touching the document source files.

-- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Feb 22 2008
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 Bah! HTML is now the single most portable document format on the
 planet. It is almost trivial to convert from HTML to any other format.
Yeah, but I've seen what happens to HTML that is auto-translated to PDF. It's awful.
 ddoc is OK for documenting functions and classes and the like, but
 tedious for more involved documentation such as a tutorial. (For
 example, single colon anywhere inside a ddoc comment can completely
 screw up the layout).
The entire Digital Mars website is generated with ddoc. It's saved me tremendous time over the earlier versions which were handcoded HTML. I've also done other sites, all 100% ddoc (like http://www.generalatomic.com )
Feb 22 2008
parent reply "dominik" <aha aha.com> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:fpm60l$apn$1 digitalmars.com...
 Janice Caron wrote:
 Bah! HTML is now the single most portable document format on the
 planet. It is almost trivial to convert from HTML to any other format.
Yeah, but I've seen what happens to HTML that is auto-translated to PDF. It's awful.
That's why I advocate XML, specifically DocBook. With the right XSLT you can transform it into anything, including XSL-FO, which translates to PDF with ease. Just pipe it to Xalan with your XSLT and voila. Another option is javadoc - why has no one mentioned javadoc?
Feb 22 2008
next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 22/02/2008, dominik <aha aha.com> wrote:
 That's why I advocate XML, specifically DocBook. With the right XSLT you can
  transform it into anything,
ddoc is clearly going to be the "common format" for D though. So, if you write the tool that converts XSLT to ddoc code, then sure - we can use that too.
  another option is javadoc - why no one mentioned javadoc?
I don't think that really applies here.
Feb 22 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
dominik wrote:
 another option is javadoc - why no one mentioned javadoc? 
A couple reasons: 1) javadoc has an ugly user syntax. param? author? Hideous. I wanted something that would look like natural text in user code, not like another programming language. 2) javadoc cannot be used for documents that are not in source code.
Feb 22 2008
parent "dominik" <aha aha.com> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:fpn44n$bd9$1 digitalmars.com...
 dominik wrote:
 another option is javadoc - why no one mentioned javadoc?
A couple reasons: 1) javadoc has an ugly user syntax. param? author? Hideous. I wanted something that would look like natural text in user code, not like another programming language. 2) javadoc cannot be used for documents that are not in source code.
Ah OK, I didn't know where you were going with it. It makes sense. I've experimented a bit with different documentation schemes, and the best one yet, for me, is to write a document with DocBook, pulling in an external file as the source (if the source needs to be shown) and, if needed, javadoc variables. There just isn't a good scheme, for me, for writing longer documents _within_ source; that would be literate programming, and I haven't found a good scheme yet. For smaller documents, like you have on digitalmars.com, it obviously works.
Feb 22 2008
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Janice Caron, on February 22 at 08:51 you wrote me:
 On 22/02/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Janice Caron wrote:
  > Hi, I'd love to put raw HTML inside ddoc comments,

 You can do it now. Just put it in! But I don't recommend it, as then the
  output is restricted to HTML.
Bah! HTML is now the single most portable document format on the planet. It is almost trivial to convert from HTML to any other format. I would think that it would be not too hard to remove the restriction that you mention.
But HTML sucks for writing documentation! It's a great format for computers but a PITA for humans. I think ddoc should use something more along the lines of reStructuredText[1].

[1] http://docutils.sourceforge.net/docs/user/rst/quickref.html

-- Leandro Lucarella (luca) | Group blog: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
Engineer Juanjo Charlante, is Linux like a jam?
Feb 22 2008
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 22/02/2008, Leandro Lucarella <llucax gmail.com> wrote:
 But HTML sucks for writing documentation!
Opinions differ on that point. I can show you a WYSIWYG HTML editor. Can you show me a WYSIWYG ddoc editor? I can show you text editors that auto-complete HTML. Can you show me a text editor that auto-completes ddoc? For that matter, what I actually /use/ is a text editor which syntax-highlights correctly formatted HTML. Can you show me a text editor which syntax-highlights correctly formatted ddoc? Can you even show me a tool which validates well-formed ddoc, and takes me to the line containing the error if it doesn't?

I don't really want to get into a format war. The point is, I have had many years' experience of working with HTML, I like it, and I'm very comfortable with it. That won't be true for everyone, and it may not be true for you, but that doesn't negate my experience of it. I find it elegant and beautiful.

And - just as relevant - I wrote a tool to convert HTML to ddoc and it took me half an hour, which is /considerably/ less time than it would take me to learn ddoc. The way I see it, that should make everybody happy. I'll be writing my larger docs in HTML (I'll still use raw ddoc for quickly documenting declarations, of course), but nobody but I will have to see that. What will go into the source code will be the ddoc which gets produced by my tool. Everybody wins.
Feb 22 2008
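The thread never shows the HTML-to-ddoc tool mentioned above, so the following is only a minimal sketch of what such a converter might look like, written in Python rather than D. Everything in it is an assumption: the tag-to-macro table is hypothetical, the macro names (including LPAREN and RPAREN) must be defined in whatever macro file the output is compiled against, and tag attributes are silently dropped. Tags outside the table, such as script, are rejected, in the spirit of the "no client-side scripting" caveat from the first post.

```python
from html.parser import HTMLParser

# Hypothetical tag-to-macro table. The macro names are illustrative and must
# exist in whatever macro file the generated text is compiled against.
TAG_TO_MACRO = {"h2": "H2", "p": "P", "b": "B", "i": "I", "ul": "UL", "li": "LI"}

# Literal parentheses would unbalance a $(NAME ...) call, so they are swapped
# for macro calls in a single pass. LPAREN and RPAREN are assumed to be
# defined as "(" and ")" in the macro file.
PAREN_ESCAPES = str.maketrans({"(": "$(LPAREN)", ")": "$(RPAREN)"})


class HtmlToDdoc(HTMLParser):
    """Collects ddoc text while the parser walks the HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Anything not in the table (including <script>) is refused outright.
        if tag not in TAG_TO_MACRO:
            raise ValueError(f"unsupported tag: <{tag}>")
        self.out.append(f"$({TAG_TO_MACRO[tag]} ")

    def handle_endtag(self, tag):
        if tag in TAG_TO_MACRO:
            self.out.append(")")

    def handle_data(self, data):
        self.out.append(data.translate(PAREN_ESCAPES))


def html_to_ddoc(html_text):
    """Convert a small, trusted HTML fragment to ddoc macro calls."""
    parser = HtmlToDdoc()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(html_to_ddoc("<h2>Tutorial</h2><p>Now I can write <b>my</b> tutorial</p>"))
# prints: $(H2 Tutorial)$(P Now I can write $(B my) tutorial)
```

Because the output is ordinary macro text, it can still be retargeted by swapping macro files, which is the property the ddoc advocates in this thread care about.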
next sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
Janice Caron wrote:
 On 22/02/2008, Leandro Lucarella <llucax gmail.com> wrote:
 But HTML sucks for writing documentation!
Opinions differ on that point.
 I can show you text editors that auto-complete HTML. Can you show me a
 text editor that auto-completes ddoc?
Descent.
Feb 22 2008
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Janice Caron, on February 22 at 20:56 you wrote me:
 On 22/02/2008, Leandro Lucarella <llucax gmail.com> wrote:
 But HTML sucks for writing documentation!
Opinions differ on that point. I can show you a WYSIWYG HTML editor. Can you show me a WYSIWYG ddoc editor?
I don't want to use a WYSIWYG HTML editor when writing code.
 I can show you text editors that auto-complete HTML. Can you show me a
 text editor that auto-completes ddoc?
I don't want to auto-complete anything, that's the point :) If you had just followed my link and seen what RST is about, you would have noticed that...
 For that matter, what I actually /use/ is a text editor which
 syntax-highlights correctly formatted HTML. Can you show me a text
 editor which syntax-highlights correctly formats ddoc?
No, and I'm not interested. But there are plenty of tools that highlight doxygen comments, for example...
 Can you even show me a tool which validates well-formed ddoc, and
 takes me to the line containing the error if it doesn't?
Again, no. And not interested. Those are all problems with formats that suck for being written by humans :) I don't want to need any tool for validating anything.
 I don't really want to get into a format war. The point is, I have had
 many years experience of working with HTML, I like it, and I'm very
 comfortable with it.
It's nice to know. I can say the exact opposite though.
 That won't be true for everyone, and it may not
 be true for you, but that doesn't negate my experience of it. I find
 it elegant and beautiful.
I'm glad for you, but your experience doesn't make HTML suck less for humans. All the tools you mention are needed because it *sucks*. If it didn't suck, you wouldn't need tools to make it suck less.
 And - just as relevant - I wrote a tool to convert HTML to ddoc and it
 took me half an hour, which is /considerably/ less time than it would
 take me to learn ddoc. The way I see it, that should make everybody
 happy. I'll be writing my larger docs in HTML (I'll still use raw ddoc
 for quickly documenting declarations, of course), but nobody but I
 will have to see that. What will go into the source code will be the
 ddoc which gets produced by my tool. Everybody wins.
Well, for me this is irrelevant. I was talking about HTML sucking for being written by humans, not about your particular documentation process :)

PS: What we do agree on is that I don't want to start a war on formats either, if only because it is too naive to think that Walter will change DDoc for something else.

-- Leandro Lucarella (luca) | Group blog: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
FINALLY, FABIAN THE LITTLE HORSE IS GOING TO HAVE A GOOD CHRISTMAS -- Crónica TV
Feb 22 2008
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 23/02/2008, Leandro Lucarella <llucax gmail.com> wrote:
  I don't want to use a WYSIWYG HTML editor when writing code.
Nor do I. Perhaps you misunderstood. I was talking about /documentation/, not code. Raw ddoc is absolutely perfect for inline doc comments, and Walter has done a /fantastic/ job in getting a system like that going.

What I'm talking about is real documentation, like, actual instruction manuals - tutorials, fifty or a hundred pages long, or more when printed. Think "std.whatever For Dummies". A whole book. There is no way I want to write a whole book in ddoc, and /for that purpose/, a WYSIWYG editor wouldn't be a bad thing.

Using ddoc as the final output remains a good thing, however, because ddoc format can turn into everything else, including auto-generated Digital Mars web pages.
 I don't want to auto-complete anything, that's the point :)
It's /a/ point, but it's not /the/ point. When I request something, it's usually because /I/ want it, so whether or not /you/ want it is really kind of incidental. Besides which, *I made the tool*. The deed is done, so I'm not requesting anything any more. I already have it. I made it myself. If that tells you anything, it tells you that D is a brilliant language, if only because that was possible.
 But there are plenty of tools that highlight
  doxygen comments for example...
ddoc! It's perfect for that! No complaints there. As I said, I was thinking a few orders of magnitude higher than that.
 I'm glad for you, but your experience doesn't make HTML suck less for
  humans. All the tools you mention are needed because it *sucks*. If it
  doesn't suck, you don't need tools to make it suck less.
That's only true if you look at it from the point of view that you're supposed to interact directly with the source, and that's not necessarily so. You could argue in exactly the same way that Microsoft Word format, or RTF, both suck, because you need special tools (i.e. Microsoft Word or some other word processor) to manipulate them. If you were to try editing an RTF document by editing the raw text file with a plain text editor, I'm sure you would quickly come to the conclusion that RTF sucks. However, that's just not what you do.

But HTML is really just another document format, like a Word document or RTF, and so, from that point of view, the underlying representation matters less. What matters more is its portability, how easy it is to convert it to other formats, and the availability of tools to edit it.
 I was talking about HTML sucking for
  being written by humans, not about your particular documentation process
  :)
I get that, but you leapt into a thread that I started, basically telling me that /I/ shouldn't be doing something, or requesting something, because /you/ don't need it. No disrespect intended, but ... huh?
  PS: What we do agree on is that I don't want to start a war on formats
     either, if only because it is too naive to think that Walter will change
     DDoc for something else.
And nor do I. It's a solved problem. However, it still remains the case that ddoc documentation /may contain raw HTML, but its use is strongly discouraged/. That, I think, is bad. Either don't allow it at all, or fully support it.
Feb 23 2008
next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Janice Caron wrote:
 What I'm talking about is real documentation, like, actual instruction
 manuals - tutorials, fifty or a hundred pages long, or more when
 printed. Think "std.whatever For Dummies". A whole book. There is no
 way I want to write a whole book in ddoc, and /for that purpose/, a
 wysiwyg editor wouldn't be a bad thing.
Why does your 50-to-100-page manual need to be in the same file as your code?
Feb 23 2008
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 23/02/2008, Robert Fraser <fraserofthenight gmail.com> wrote:
 Janice Caron wrote:
  > What I'm talking about is real documentation, like, actual instruction
  > manuals - tutorials, fifty or a hundred pages long, or more when
  > printed. Think "std.whatever For Dummies". A whole book. There is no
  > way I want to write a whole book in ddoc, and /for that purpose/, a
  > wysiwyg editor wouldn't be a bad thing.

 Why does your 50-to-100-page manual need to be in the same file as your
  code?
It doesn't, but (a) since ddoc can be converted to all those other formats, it makes sense to use ddoc as an intermediate format in any case, and (b) there's an in-between. A document two or three pages long is reasonable to include in the embedded comments, and yet would still be complex enough not to want to hand code it.
Feb 23 2008
parent Christopher Wright <dhasenan gmail.com> writes:
Janice Caron wrote:
 On 23/02/2008, Robert Fraser <fraserofthenight gmail.com> wrote:
 Janice Caron wrote:
  > What I'm talking about is real documentation, like, actual instruction
  > manuals - tutorials, fifty or a hundred pages long, or more when
  > printed. Think "std.whatever For Dummies". A whole book. There is no
  > way I want to write a whole book in ddoc, and /for that purpose/, a
  > wysiwyg editor wouldn't be a bad thing.

 Why does your 50-to-100-page manual need to be in the same file as your
  code?
It doesn't, but (a) since ddoc can be converted to all those other formats, it makes sense to use ddoc as an intermediate format in any case, and (b) there's an in-between. A document two or three pages long is reasonable to include in the embedded comments, and yet would still be complex enough not to want to hand code it.
I think LaTeX is a much better solution for writing a manual than ddoc, but I don't know how well tex2html works, if you absolutely must have a fully automated HTML conversion. It should at least come up with well structured HTML, so you just need to apply a stylesheet.
Feb 23 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 Why does your 50-to-100-page manual need to be in the same file as your 
 code?
Ddoc isn't intended to be a full-featured text layout publishing thing. It's meant for documenting declarations in a fairly straightforward manner, and it also turns out to be handy for generating basic web pages (like on the D web site). For writing a book, using a word processor or LaTeX is probably much more appropriate.

Ddoc has easily saved me hundreds and hundreds of hours of work, and it has also been the major factor in boosting the quality and comprehensiveness of the Phobos library documentation. I know it has a lot of shortcomings, but by gawd it's effective. It's one of those fundamentally indispensable tools I can't believe I spent decades stumbling along without. For me it's like the transition from a line editor to a full screen editor.
Feb 29 2008
next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Fri, 29 Feb 2008 14:49:21 -0800, Walter Bright wrote:

 Robert Fraser wrote:
 Why does your 50-to-100-page manual need to be in the same file as your 
 code?
Ddoc isn't intended to be a full-featured text layout publishing thing. It's meant for documenting declarations in a fairly straightforward manner, and it also turns out to be handy
I have great affection for DDoc too. It is a truly effective time saver. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Feb 29 2008
parent reply Ty Tower <tytower hotmail.com.au> writes:
Derek Parnell Wrote:

 On Fri, 29 Feb 2008 14:49:21 -0800, Walter Bright wrote:
 
 Robert Fraser wrote:
 Why does your 50-to-100-page manual need to be in the same file as your 
 code?
Ddoc isn't intended to be a full-featured text layout publishing thing. It's meant for documenting declarations in a fairly straightforward manner, and it also turns out to be handy
I have great affection for DDoc too. It is a truly effective time saver. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
OK so how can the pages be easily extracted into HTML form? Do you have a program or application to do this automatically?
Feb 29 2008
next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Ty Tower wrote:
 OK so how can the pages be easily extracted into HTML form?
 Do you have a program or application to do this automatically?
That's what ddoc does, you run it on your source file and html comes out.
Feb 29 2008
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Fri, 29 Feb 2008 19:24:55 -0500, Ty Tower wrote:

 Derek Parnell Wrote:
 
 On Fri, 29 Feb 2008 14:49:21 -0800, Walter Bright wrote:
 
 Robert Fraser wrote:
 Why does your 50-to-100-page manual need to be in the same file as your 
 code?
Ddoc isn't intended to be a full-featured text layout publishing thing. It's meant for documenting declarations in a fairly straightforward manner, and it also turns out to be handy
I have great affection for DDoc too. It is a truly effective time saver. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
OK so how can the pages be easily extracted into HTML form? Do you have a program or application to do this automatically?
Yes I do, it's called DMD. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Feb 29 2008
parent reply Ty Tower <tytower hotmail.com.au> writes:
 OK so how can the pages be easily extracted into HTML form?
 Do you have a program or application to do this automatically?
Yes I do, it's called DMD. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Would you care to elaborate on this? dmd is the compiler - how does it run ddoc?
Mar 01 2008
next sibling parent BCS <ao pathlink.com> writes:
Reply to ty,

 OK so how can the pages be easily extracted into HTML form? Do you
 have a program or application to do this automatically?
 
Yes I do, it's called DMD.
Would you care to elaborate on this? dmd is the compiler - how does it run ddoc?
give it the -D flag and watch stuff happen
Mar 01 2008
prev sibling parent Derek Parnell <derek psych.ward> writes:
On Sat, 01 Mar 2008 17:04:30 -0500, Ty Tower wrote:

 OK so how can the pages be easily extracted into HTML form?
 Do you have a program or application to do this automatically?
Yes I do, it's called DMD. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Would you care to elaborate on this? dmd is the compiler - how does it run ddoc?
Sure. DMD is the compiler, no doubt about that. However, inside the compiler is a DDoc processor as well.

Try this out ...

Create two text files. The first one "testmac.ddoc" contains macro definitions ...

//----
DDOC=**Start of document**
$(DDOC_HEAD)
$(DDOC_BODY)
**End of document**
DDOC_HEAD=**Note : Generated by a test macro set**
DDOC_BODY=$(BODY)
SECTION=**SECTION: "$0" **
I=**ITALIC"$0"**

and the second one "testmac.d" contains the document ...

Ddoc

$(SECTION Introduction)
In this section I'd like to introduce the next section.

$(SECTION Acknowledgements)
I'd like to thank my $(I parents) and $(I Walter Bright).

$(SECTION Epilogue)
And in conclusion, goodnight.

Then generate the document with the command line ...

    dmd testmac.ddoc testmac.d -Dftestmac.txt

The resulting document will be in "testmac.txt" file. And it should look like ...

**Start of document**
**Note : Generated by a test macro set**
<!-- Generated by Ddoc from testmac.d -->
**SECTION: "Introduction" **
In this section I'd like to introduce the next section.
**SECTION: "Acknowledgements" **
I'd like to thank my **ITALIC"parents"** and **ITALIC"Walter Bright"**.
**SECTION: "Epilogue" **
And in conclusion, goodnight.
**End of document**

Note! NO HTML AT ALL.

-- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 01 2008
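To make the mechanism in the example above concrete, here is a toy macro expander in Python. It is a sketch for illustration, not DMD's algorithm: it understands only $(NAME text) calls and the $0 placeholder, treats undefined macros as empty, and does nothing about recursive definitions. The PLAIN table copies the SECTION and I definitions from the post; the HTML table is a hypothetical second macro set showing how the same source could be retargeted without touching the document.

```python
def expand(text, macros):
    """Expand $(NAME args) calls in text using the given macro table."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("$(", i):
            # Scan to the matching close paren, honoring nested parens.
            depth = 1
            j = i + 2
            while j < len(text) and depth:
                if text[j] == "(":
                    depth += 1
                elif text[j] == ")":
                    depth -= 1
                j += 1
            if depth:
                # Unbalanced call: emit the rest unchanged and stop.
                out.append(text[i:])
                break
            parts = text[i + 2 : j - 1].split(None, 1)
            name = parts[0] if parts else ""
            arg = parts[1] if len(parts) > 1 else ""
            # Substitute the argument for $0, then expand the result so that
            # macro calls nested in the body or the argument are handled too.
            body = macros.get(name, "")
            out.append(expand(body.replace("$0", arg), macros))
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Definitions copied from the testmac.ddoc example in the post.
PLAIN = {"SECTION": '**SECTION: "$0" **', "I": '**ITALIC"$0"**'}
# Hypothetical alternative macro set targeting HTML.
HTML = {"SECTION": "<h2>$0</h2>", "I": "<i>$0</i>"}

source = "I'd like to thank my $(I parents) and $(I Walter Bright)."
print(expand(source, PLAIN))
# prints: I'd like to thank my **ITALIC"parents"** and **ITALIC"Walter Bright"**.
print(expand(source, HTML))
# prints: I'd like to thank my <i>parents</i> and <i>Walter Bright</i>.
```

The expander itself contains no knowledge of either output format; all of it lives in the two tables.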
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 29/02/2008, Walter Bright <newshound1 digitalmars.com> wrote:
  For writing a book, using a word processor or Latex is probably much
  more appropriate.
Since I solved my own problem almost immediately after I first posed it, I'm surprised this thread has gone on for so long. In case anyone hadn't noticed, I have withdrawn my request.
  I know it has a
  lot of shortcomings, but by gawd it's effective. It's one of those
  fundamentally indispensable tools I can't believe I spent decades
  stumbling along without.
No disputing that, but the one remaining issue from this thread which you haven't really commented on is the fact that ddoc accepts inline HTML, but that authors are discouraged from using it. That just seems nasty to me. Either support it, or ban it, but not half-and-half, please. By the way, is there any publicly available tool which will convert ddoc into any form /other/ than HTML? e.g. pdf?
Feb 29 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 No disputing that, but the one remaining issue from this thread which
 you haven't really commented on is the fact that ddoc accepts inline
 HTML, but that authors are discouraged from using it. That just seems
 nasty to me. Either support it, or ban it, but not half-and-half,
 please.
There's no way to ban it, because ddoc is a text macro processing program. It does not know what html is.
 By the way, is there any publicly available tool which will convert
 ddoc into any form /other/ than HTML? e.g. pdf?
Andrei said at one point he had a set of macros that would cause ddoc to produce LaTeX output, which could then be used to generate PDF.
Mar 01 2008
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 There's no way to ban it, because ddoc is a text macro processing
  program. It does not know what html is.
The docs explicitly state, and I quote: "HTML can be embedded into the documentation comments, and it will be passed through to the HTML output unchanged."

This, to my mind, is a bug - both in the documentation, and in the implementation. DMD is at fault here. DMD should /not/ pass HTML through unchanged. It should sanitize the input, in order specifically to prevent it from being interpreted as markup in the output format. The final HTML document which results from the conversion process should not contain any HTML markup which was not generated by ddoc.

In general, any tool which converts SRC -> DST needs to escape all characters which would be considered markup in DST. That's true for every kind of DST, not just for HTML - but it's particularly important for HTML because the only way we have to test whether our ddoc source is correct is to compile it and view the results in a web browser. If HTML is passed through unfiltered, then we will fail to notice that our ddoc source will not produce valid PDF, for example.

Here's one case where it matters. In the documentation for std.xml, the ddoc source says something like

    this function converts &amp;amp; to &amp;

because that was the only way I could get it to render in a web browser as

    this function converts &amp; to &

The question is, will that render correctly when converted to PDF? And if not, what is the destination-independent way of writing something that will render as "&amp;"?

I would call failing to sanitize, a bug. Except I can't, because it's behaving as documented.
Mar 01 2008
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 There's no way to ban it, because ddoc is a text macro processing
  program. It does not know what html is.
The docs explicitly state, and I quote: "HTML can be embedded into the documentation comments, and it will be passed through to the HTML output unchanged." This, to my mind, is a bug - both in the documentation, and in the implementation. DMD is at fault here. DMD should /not/ pass HTML through unchanged. It should sanitize the input, in order specifically to prevent it from being interpreted as markup in the output format.
But DDoc doesn't know anything about the output format. All it knows is some rules for textual transformations based on the macros you give it. It has no way of knowing the full set of sequences that have special meaning for the chosen output format.

Fixing this by escaping all HTML syntax is not really a fix. What if I type some raw LaTeX in my input? It'll probably be treated like text for HTML output, but be interpreted as code if I output LaTeX.

Like Adam said, the only way to really fix it is to teach DDoc what all the special sequences are for a given target, and how to escape them.

--bb
Mar 01 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 01/03/2008, Bill Baxter <dnewsgroup billbaxter.com> wrote:
  > DMD is at fault here. DMD should /not/ pass HTML through unchanged.
  > It should sanitize the input, in order specifically to prevent it
  > from being interpreted as markup in the output format.

 But DDoc doesn't know anything about the output format.
I didn't say the ddoc format was at fault, I said DMD was at fault. It is the translation tool which is at fault. A tool which translates DDOC into HTML must, by definition, know about HTML - just as a tool which translates DDOC into PDF must know about PDF. In this case, the translation tool happens to be DMD.
  Fixing this by escaping all HTML syntax is not really a fix.  What if I
  type some raw LaTeX in my input?  It'll probably be treated like text
  for HTML output, but be interpreted as code if I output LaTeX.
The correct behavior would be for any DDOC->HTML translation tool to escape HTML, and for any DDOC->LaTeX translation tool to escape LaTeX. This has nothing whatsoever to do with DDOC format. The problem is in the translation tools.

To make an analogy, if I write a tool that translates any source format into XML, then I **MUST** escape every < and & and ", because otherwise the resulting document will not be well formed, and will be invalid. This will not have been the fault of the source format, it will have been a bug in my application!
  Like Adam said, the only way to really fix it is to teach DDoc
I don't understand what you mean here. So far as I am aware, DDOC is a markup language, not an application. It is therefore not possible to "teach" it anything. I say, leave DDOC alone - just do a better job of translating it.
Mar 01 2008
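The position argued above, that each translation tool should escape whatever counts as markup in its own target, can be sketched as a per-target escaping step. This Python sketch is an illustration only and is not something DMD does: the function name escape_for and the LaTeX table are assumptions, and the LaTeX table covers only the common special characters. The sample text is the std.xml sentence quoted earlier in the thread.

```python
import html

# Assumed replacement table for common LaTeX special characters.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_for(target, text):
    """Escape plain text so it cannot be read as markup in the target format."""
    if target == "html":
        return html.escape(text, quote=True)
    if target == "latex":
        # Character-by-character mapping, so replacements are never re-escaped.
        return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)
    raise ValueError(f"unknown target: {target}")


sentence = "this function converts &amp; to &"
print(escape_for("html", sentence))
# prints: this function converts &amp;amp; to &amp;
print(escape_for("latex", sentence))
# prints: this function converts \&amp; to \&
```

With a step like this in the translator, the author writes the sentence once as plain text and each target receives its own escaping, which is the destination-independent form asked for earlier in the thread.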
next sibling parent Derek Parnell <derek psych.ward> writes:
On Sat, 1 Mar 2008 21:45:51 +0000, Janice Caron wrote:

 as I am aware, DDOC is a markup language, not an application.
Here, I think, is the source of confusion. DDOC is not a markup language at all. It is a macro expansion tool. The output depends on the definition of the macros and they are defined OUTSIDE of DDOC. They are not part of DDOC. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 01 2008
prev sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 01/03/2008, Bill Baxter <dnewsgroup billbaxter.com> wrote:
  > DMD is at fault here. DMD should /not/ pass HTML through unchanged.
  > It should sanitize the input, in order specifically to prevent it
  > from being interpreted as markup in the output format.

 But DDoc doesn't know anything about the output format.
I didn't say the ddoc format was at fault, I said DMD was at fault. It is the translation tool which is at fault. A tool which translates DDOC into HTML must, by definition, know about HTML - just as a tool which translates DDOC into PDF must know about PDF. In this case, the translation tool happens to be DMD.
  Fixing this by escaping all HTML syntax is not really a fix.  What if I
  type some raw LaTeX in my input?  It'll probably be treated like text
  for HTML output, but be interpreted as code if I output LaTeX.
The correct behavior would be for any DDOC->HTML translation tool to escape HTML, and for any DDOC->LaTeX translation tool to escape LaTeX. This has nothing whatsoever to do with DDOC format. The problem is in the translation tools. To make an analogy, if I write a tool that translates any source format into XML, then I **MUST** escape every < and & and ", because otherwise the resulting document will not be well formed, and will be invalid. This will not have been the fault of the source format, it will have been a bug in my application!
  Like Adam said, the only way to really fix it is to teach DDoc
I don't understand what you mean here. So far as I am aware, DDOC is a markup language, not an application. It is therefore not possible to "teach" it anything. I say, leave DDOC alone - just do a better job of translating it.
By DDoc I mean the system that generates documentation from comments in D code. DDoc is not in and of itself a DDoc comment->HTML translation tool. DDoc is a macro processing system that, with the right set of macros, can be made to generate HTML. But the macro processor itself has no clue what HTML is. That's Walter's whole idea with DDoc -- put _all_ the output-specific knowledge in the macros. It does come with a built-in set of HTML macros, because not being able to generate anything by default would make it pretty lame.

At least that's the theory of DDoc as I understand it. Examination of dmd/src/doc.c reveals a slightly different picture, where some knowledge of HTML specifics *is* actually embedded in the processing logic (see function "highlightText"). I'm not sure what the purpose of that is, but any HTML-specific scanning in the ddoc processor is probably a bug.

"Whether the final presentation form is an HTML web page, a man page, a PDF file, etc. is not specified as part of the D Programming Language." -- http://www.digitalmars.com/d/1.0/ddoc.html

--bb
Mar 01 2008
prev sibling next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Sat, 1 Mar 2008 18:51:22 +0000, Janice Caron wrote:

 On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 There's no way to ban it, because ddoc is a text macro processing
  program. It does not know what html is.
The docs explicitly state, and I quote: "HTML can be embedded into the documentation comments, and it will be passed through to the HTML output unchanged." This, to my mind, is a bug - both in the documentation, and in the implementation. DMD is at fault here. DMD should /not/ pass HTML through unchanged. It should sanitize the input, in order specifically to prevent it from being interpreted as markup in the output format. The final HTML document which results from the conversion process should not contain any HTML markup which was not generated by ddoc. In general, any tool which converts SRC -> DST needs to escape all characters which would be considered markup in DST. That's true for every kind of DST, not just for HTML - but it's particularly important for HTML because the only way we have to test whether our ddoc source is correct is to compile it and view the results in a web browser. If HTML is passed through unfiltered, then we will fail to notice that our ddoc source will not produce valid PDF, for example.
I feel that you don't quite understand what Ddoc is or does. It is not a text to HTML converter. It really does not know anything about HTML, or any other brand of markup language. The only thing Ddoc does is to transform macros embedded in the text into *whatever* the macro has been defined to expand to and to pass all other text *as written*. That's it! The DMD compiler does two independent transformations - it first transforms D Comments and some D language constructs into Ddoc text, then it transforms the Ddoc text based on the macros that it knows about. By default, Walter has defined those macros to expand into HTML, but the macro definitions can be modified by the developer to expand into anything at all.
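A minimal sketch of that last point, assuming the built-in definition of `I` is the HTML italic tag and that a `Macros:` section is allowed to override it. The function name `demo` is hypothetical and only gives the comment something to attach to.

```d
/**
 * This sentence contains $(I emphasised) text.
 *
 * Macros:
 *   I = <em>$0</em>
 */
void demo() {} // hypothetical declaration, only here to carry the comment
```

With the override in place, the generated page should contain `<em>emphasised</em>` instead of the default markup. The comment text itself is untouched; only the macro definition decides what comes out.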
 Here's one case where it matters. In the documentation for std.xml,
 the ddoc source says something like
 
     this function converts &amp;amp; to &amp;
Yes, this is a documentation bug. It should have been written as ...

    this function converts $(AMP)amp$(SC) to $(AMP)$(SC)

AND have the macros 'AMP' and 'SC' defined as ...

    AMP=&
    SC=;

because punctuation characters are frequently special to the various markup languages. However, writing like this can be tedious so maybe Ddoc can be improved to help automate this.
-- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 01 2008
parent "Janice Caron" <caron800 googlemail.com> writes:
On 01/03/2008, Derek Parnell <derek psych.ward> wrote:
 I feel that you don't quite understand what Ddoc is or does.
That is certainly possible. My understanding is that it is a simple markup language, whereby (for example) $(I whatever) means "whatever" in italics.
 It is not a
 text to HTML converter.
I certainly do know that.
 It really does not know anything about HTML, or any
  other brand of markup language.
Of course. Nor should it. Just as HTML knows nothing about RTF, or PDF knows nothing about XML. That absolutely makes sense.
 The only thing Ddoc does is to transform
  macros embedded in the text
Now I'm confused. Is ddoc the /format/, or the /tool/? Because I'm hearing inconsistent things here. I thought ddoc was the name of the /format/.
 The DMD
  compiler does two independent transformations - it first transforms D
  Comments and some D language constructs into Ddoc text,
I thought ddoc comments /were/ ddoc text. Presumably you mean it isolates them from non-ddoc comments, removes the asterisks, and so on?
 then it transforms
 the Ddoc text based on the macros that it knows about.
Ooookaaay. But when DMD produces HTML output from DDOC comments as a result of the -D command line options, where are these macros "that it knows about"? Regardless, wherever it gets them from, the combination of DMD+MACROS together constitute a translation tool, and if that translation tool does not escape what needs to be escaped, then that translation tool is at fault. So maybe I was blaming the wrong component. Certainly (DMD+MACROS) does not translate correctly (because it doesn't escape). I was blaming DMD. Perhaps I should instead have blamed "the macros that it knows about". But whichever element is responsible, that element should do the job properly. If it's going to translate "$(I hello)" into "<i>hello</i>" then it should also translate "<" into "&lt;".
 By default, Walter
 has defined those macros to expand into HTML,
Aha! Then DMD is not at fault at all. But "those macros" are. They /attempt/ to expand ddoc text into HTML, but they don't do the job properly. That is, if I write "&amp;" in the source code, then "&amp;" should be displayed in the destination document, whether that be a web page, a PDF document, or whatever.
 Yes, this is a documentation bug.

  It should have been written as ...

      this function converts $(AMP)amp$(SC) to $(AMP)$(SC)
OK. That part I could do. I could certainly change the source code to say that - if we are agreed that is the correct thing to do.
  AND have a macros 'AMP' and 'SC' defined as ...

  AMP=&
  SC=;
Defined where? In std.xml.d? Also, defined how?
  Because punctuation characters are frequently special to the various markup
  languages. However, writing like this can be tedious so maybe Ddoc can be
  improved to help automate this.
My feeling is that any character I include in the source which is /not/ a DDOC macro should end up in the destination document exactly as it appears in source. If that means escaping should happen along the way, then escaping should happen automatically.
Mar 01 2008
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 In general, any tool which converts SRC -> DST needs to escape all
 characters which would be considered markup in DST.
That is not possible, because ddoc cannot escape every markup in every possible output format past, present, and future. ddoc is output agnostic - the output is controlled by the macro expansion text.
Mar 01 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Janice Caron wrote:
  > In general, any tool which converts SRC -> DST needs to escape all
  > characters which would be considered markup in DST.

 That is not possible, because ddoc cannot escape every markup in every
  possible output format past, present, and future. ddoc is output
  agnostic - the output is controlled by the macro expansion text.
Could you be clearer? I don't know what "macro expansion text" means. Is that the same thing as ddoc source? If so, then you just said "the output is controlled by the input", which is obvious, but incomplete. If not, then what is this "macro expansion text" of which you speak? Coz, if the "macro expansion text" is what decides how to make HTML, then this "macro expansion text" is the thing that should be escaping that which needs to be escaped.
Mar 01 2008
next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Sat, 1 Mar 2008 23:13:53 +0000, Janice Caron wrote:

 On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Janice Caron wrote:
  > In general, any tool which converts SRC -> DST needs to escape all
  > characters which would be considered markup in DST.

 That is not possible, because ddoc cannot escape every markup in every
  possible output format past, present, and future. ddoc is output
  agnostic - the output is controlled by the macro expansion text.
Could you be clearer? I don't know what "macro expansion text" means. Is that the same thing as ddoc source? If so, then you just said "the output is controlled by the input", which is obvious, but incomplete. If not, then what is this "macro expansion text" of which you speak? Coz, if the "macro expansion text" is what decides how to make HTML, then this "macro expansion text" is the thing that should be escaping that which needs to be escaped.
Everything that we've been saying re Ddoc is already documented in the DMD distribution. Read through http://www.digitalmars.com/d/2.0/ddoc.html. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 01 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 Everything that we've been saying re Ddoc is already documented in the DMD
  distribution. Read through http://www.digitalmars.com/d/2.0/ddoc.html.
Obviously, I have already read that document! (I quoted from it a few posts up).
Mar 01 2008
parent reply Derek Parnell <derek psych.ward> writes:
On Sun, 2 Mar 2008 06:56:02 +0000, Janice Caron wrote:

 On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 Everything that we've been saying re Ddoc is already documented in the DMD
  distribution. Read through http://www.digitalmars.com/d/2.0/ddoc.html.
Obviously, I have already read that document! (I quoted from it a few posts up).
From your posts, it is *not* obvious that you have read it, even though some parts were "quoted". It really isn't that hard to understand but you seem to be having a devil of a time doing so. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 01 2008
parent Tomas Lindquist Olsen <tomas famolsen.dk> writes:
Derek Parnell wrote:
 On Sun, 2 Mar 2008 06:56:02 +0000, Janice Caron wrote:
 
 On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 Everything that we've been saying re Ddoc is already documented in the DMD
  distribution. Read through http://www.digitalmars.com/d/2.0/ddoc.html.
Obviously, I have already read that document! (I quoted from it a few posts up).
From your posts, it is *not* obvious that you have read it, even though some parts were "quoted". It really isn't that hard to understand but you seem to be having a devil of a time doing so.
C'mon people. This discussion is no longer about Ddoc. Clearly people just disagree about what Ddoc should really do, regardless of how (well or not) it's documented. IMHO we should just ditch it and put an effort into getting doxygen fixed for D.
Mar 02 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Janice Caron wrote:
  > In general, any tool which converts SRC -> DST needs to escape all
  > characters which would be considered markup in DST.

 That is not possible, because ddoc cannot escape every markup in every
  possible output format past, present, and future. ddoc is output
  agnostic - the output is controlled by the macro expansion text.
I don't know what "macro expansion text" means.
Given the macro:
        FOO= bar $0 def
the source:
        $(FOO abc)
expands to:
        bar abc def

It's simple text replacement. Ddoc can use a default set of macro definitions, or the user can supply one.
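The same replacement can be tried in a single file. This is a small sketch, assuming the definition is placed in a `Macros:` section of the comment itself; the name `fooDemo` is hypothetical.

```d
/**
 * Expansion demo: $(FOO abc)
 *
 * Macros:
 *   FOO = bar $0 def
 */
void fooDemo() {} // hypothetical declaration that owns the comment
```

Compiling the file with `dmd -D -o-` should produce documentation in which the line reads "Expansion demo: bar abc def", since `$0` stands for the whole argument text.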
Mar 01 2008
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 02/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Given the macro:
         FOO= bar $0 def
  the source:
         $(FOO abc)
  expands to:
         bar abc def

  It's simple text replacement. Ddoc can use a default set of macro
  definitions, or the user can supply one.
OK, so this brings us back to the question I asked earlier, which I shall rephrase and ask again. I wish for the generated text to read:

    this function converts &amp; to &

What should the ddoc source be? I've read the docs over at http://www.digitalmars.com/d/2.0/ddoc.html many times, and if there's an answer to that question there, then I must have missed it.
Mar 01 2008
parent reply Derek Parnell <derek psych.ward> writes:
On Sun, 2 Mar 2008 07:02:09 +0000, Janice Caron wrote:

 On 02/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Given the macro:
         FOO= bar $0 def
  the source:
         $(FOO abc)
  expands to:
         bar abc def

  It's simple text replacement. Ddoc can use a default set of macro
  definitions, or the user can supply one.
OK, so this brings us back to the question I asked earlier, which I shall rephrase and ask again. I wish for the generated text to read: this function converts &amp; to & What should the ddoc source be? I've read the docs over at http://www.digitalmars.com/d/2.0/ddoc.html many times, and if there's an answer to that question there, then I must have missed it.
I already told you! That's why I have doubts about you actually reading stuff. So here it is again...

    this function converts $(AMP)amp$(SC) to $(AMP)

    AMP=&
    SC=;

-- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 01 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 I already told you!
But I asked a follow-up question, which you didn't answer.
 That's why I have doubts about you actually reading
  stuff.
Oh, don't doubt me on that one. I'm very studious. And actually, your implication there is kinda insulting, because it sort of assumes I'm too stupid to get it (or else actually lying!). Well, maybe I am too stupid to get it. Maybe that's the explanation.
 So here it is again...

   this function converts $(AMP)amp$(SC) to $(AMP)

 AMP=&
  SC=;
And here's my followup question again. WHERE do I write

    AMP=&
    SC=;

(although in fact I believe it should be "AMP=&amp;", not "AMP=&", because "&amp;", not "&", is what needs to appear in the generated HTML source). WHERE do I write that? Do I write it in the source code of my .d file? Because, my understanding is that the .d source should only contain the macros, not the macro definitions. (In fact, those definitions /must/ be external to the .d file, because you wouldn't want HTML-specific definitions in a destination-independent file). Thank you for your kind understanding of my clear lack of intelligence!
Mar 01 2008
parent reply Derek Parnell <derek psych.ward> writes:
On Sun, 2 Mar 2008 07:51:21 +0000, Janice Caron wrote:

 On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 I already told you!
But I asked a follow-up question, which you didn't answer.
But you didn't actually ask that question. So I didn't answer it.
 That's why I have doubts about you actually reading
  stuff.
Oh, don't doubt me on that one. I'm very studious. And actually, your implication there is kinda insulting, because it sort of assumes I'm too stupid to get it (or else actually lying!). Well, maybe I am too stupid to get it. Maybe that's the explanation.
 So here it is again...

   this function converts $(AMP)amp$(SC) to $(AMP)

 AMP=&
  SC=;
And here's my followup question again. WHERE do I write AMP=& SC=;
This is the first time I've seen this question. But of course you already know the answer because you've read the documentation, which clearly states where macros are defined. But in any case, here is the quote from the documentation...

"
Macro definitions come from the following sources, in the specified order:
 1. Predefined macros.
 2. Definitions from file specified by sc.ini's DDOCFILE setting.
 3. Definitions from *.ddoc files specified on the command line.
 4. Runtime definitions generated by Ddoc.
 5. Definitions from any Macros: sections.
"
 (although in fact I believe it should be "AMP=&amp;", not "AMP=&",
 because "&amp;", not "&", is what needs to appear in the generated
 HTML source).
See, you are getting it. You're right and I'm wrong.
 WHERE do I write that? Do I write it in the source code of my .d file?
 Because, my understanding is that the .d source should only contain
 the macros, not the macro definitions.
Your understanding is not quite accurate I'm afraid.
 (In fact, those definitions
 /must/ be external to the .d file, because otherwise you wouldn't want
 HTML-specific definitions in a destination-independent file).
Well, rather than "must" I'd use "should". You are allowed to put macro definitions in your source code, but it can lead to format-dependent layouts. Here is a quick example that you can compile with the -D switch to see what I mean.

    // ------------
    /******************
     * main is the entry point.
     *
     * Macros:
     *  AMP=&amp;
     *
     * Description:
     *  For demo purposes, this function converts $(AMP)amp; to $(AMP)
     *
     * Params:
     *  pArgs = A list of strings from the command line.
     *
     */
    void main(string[] pArgs)
    {
    }
    // -------------

Now compile this with "dmd -D". I would not recommend doing it this way. I'd put the macro definition in a .ddoc file and put that on the command line too.
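The recommended alternative might look like the sketch below. The file names `demo.d` and `html.ddoc` are assumptions made up for the illustration; the only compiler features relied on are the `-D` switch, `-o-` to suppress object output, and the documented behaviour that .ddoc files named on the command line supply macro definitions.

```d
// File: demo.d
// $(AMP) is used below but deliberately NOT defined in this file.

/******************
 * main is the entry point.
 *
 * Description:
 *  For demo purposes, this function converts $(AMP)amp; to $(AMP)
 *
 * Params:
 *  pArgs = A list of strings from the command line.
 */
void main(string[] pArgs)
{
}

// File: html.ddoc (hypothetical name), containing the single line:
//     AMP = &amp;
//
// Build, passing the macro file next to the source file:
//     dmd -D -o- html.ddoc demo.d
```

Keeping the definition outside the source means the same `demo.d` can be rebuilt for another output format by naming a different macro file on the command line.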
 Thank you for your kind understanding of my clear lack of intelligence!
You're welcome. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Mar 02 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
  But in any case, here is the quote from the documentation...

  "
  Macro definitions come from the following sources, in the specified order:
   1. Predefined macros.
   2. Definitions from file specified by sc.ini's DDOCFILE setting.
   3. Definitions from *.ddoc files specified on the command line.
   4. Runtime definitions generated by Ddoc.
   5. Definitions from any Macros: sections.

 "
You see, the problem is, I have no control whatsoever over Phobos's build process. When Walter does a release, he just types "make", and all of the source files, std.xml.d included, get built with the exact same command line options. I can't change that. And I /certainly/ don't want to go messing around with the makefile or fiddling with what ddoc does. That's /way/ outside my area. Therefore, I cannot influence 1, 2, 3, or 4. The only one I can influence is 5, but if I put the macro definition actually /inside/ std.xml.d, then it has the same effect as not using a macro at all.
 Your understanding is not quite accurate I'm afraid.
I thought not. But am I getting it now?
 Well, rather than "must" I'd use "should". You are allowed to put macro
  definitions in your source code, but it can lead to format-dependent
  layouts.
which is exactly the same as not using macros at all. I mean, I might as well just put

    this function converts &amp;amp; to &amp;

directly into the ddoc comments. (That's how things are right now, by the way). Using macros which were themselves defined in the same source code wouldn't make the result any less format dependent.
  I would not recommend doing it this way. I'd put the macro definition in a
  .ddoc file and put that on the command line too.
Right! And that's what I've been trying to get at all along. I have no control - /zero/ control - over how Phobos web pages are generated from Phobos source files. That is completely out of my hands. The /only/ thing I am able to do is to put ddoc comments inside the source file, and trust the automated build process to churn out the documentation when the next release happens.

So far as I can tell, the only way to make this format-independent would be if those macro definitions were present in wherever the Phobos build process gets its definitions for $(I ...) etc from. To that end, I believe that the only solution available to me would be for Walter to add the following macros to the default macros:

For HTML generation:
    $(AMP) -> &amp;
    $(LT) -> &lt;
    $(GT) -> &gt;
    $(QUOT) -> &quot;
    $(APOS) -> &apos;
For non-HTML generation:
    $(AMP) -> &
    $(LT) -> <
    $(GT) -> >
    $(QUOT) -> "
    $(APOS) -> '

And while we're at it, some macros to emit colon and right-bracket would be handy too, since these get (sometimes erroneously) interpreted as special to ddoc.

I think this affects everyone though - not just people who commit to Phobos. Sure - if you have some degree of control over the build process you can change what macros are used to expand the ddoc, but I believe that the /default/ macros should be sufficient to allow one to write format-independent ddoc comments, and right now, they are not.

Correct me if I'm wrong - but I believe I've got it right this time. :-)
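To illustrate how a comment written against such macros would stay format-independent, here is a rough sketch. The macro names follow the list above; the file names `html.ddoc` and `text.ddoc` and the function `encodeDemo` are hypothetical, and none of these macros is assumed to exist in the default set.

```d
// Hypothetical macro sets, each kept in its own .ddoc file.
//
// html.ddoc:
//     AMP = &amp;
//     LT  = &lt;
//     GT  = &gt;
//
// text.ddoc:
//     AMP = &
//     LT  = <
//     GT  = >

/**
 * Replaces $(LT) with $(AMP)lt;, $(GT) with $(AMP)gt;
 * and $(AMP) with $(AMP)amp;.
 */
string encodeDemo(string s)
{
    return s; // placeholder body; only the comment matters for this sketch
}
```

Built with the HTML set, a browser should display the sentence as "Replaces < with &lt;, ..."; built with the plain-text set, the raw output itself should read the same way. The source comment contains no format-specific characters in either case.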
Mar 02 2008
next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Janice,

 On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 
 But in any case, here is the quote from the documentation...
 
 "
 Macro definitions come from the following sources, in the specified
 order:
 1. Predefined macros.
 2. Definitions from file specified by sc.ini's DDOCFILE setting.
this can be done by altering the sc.ini file on /your/ computer and then running dmd (you can affect this)
 3. Definitions from *.ddoc files specified on the command line.
this is done by giving dmd a command line flag when you run it.
 4. Runtime definitions generated by Ddoc.
 5. Definitions from any Macros: sections.
 "
(read the bit at the end before you jump on me for this) If you are trying to affect the output of DMD on a project where you are unwilling to change that project's makefile then I guess you might be out of luck.
 
You see, the problem is, I have no control whatsoever over Phobos's build process. When Walter does a release, he just types "make", and all of the source files, std.xml.d included, get built with the exact same command line options.
I haven't followed the details of this thread but... If you are trying to rebuild the phobos docs with a different set of macros and you want to do this without dinking with the phobos sources or makefile, how would you propose changing the macros?
 I can't change that. And I /certainly/
 don't want to go messing around with the makefile or fiddling with
 what ddoc does. That's /way/ outside my area. Therefore, I cannot
 influence 1, 2, 3, or 4.
 
 The only one I can influence is 5, but if I put the macro definition
 actually /inside/ std.xml.d, then it has the same effect as not using
 a macro at all.
 
 Your understanding is not quite accurate I'm afraid.
 
I thought not. But am I getting it now?
 Well, rather than "must" I'd use "should". You are allowed to put
 macro definitions in your source code, but it can lead to
 format-dependent layouts.
 
which is exactly the same as not using macros at all. I mean, I might as well just put this function converts &amp;amp; to &amp; directly into the ddoc comments. (That's how things are right now, by the way). Using macros which were themselves defined in the same source code wouldn't make the result any less format dependent.
(for lack of a better place in this post to put it):

The problem with escaping raw output is that DMD would need to include something akin to a full lex/yacc and more to do this. This is because absolutely anything could be raw output. I could use DDoc to generate output for a system that has any arbitrary syntax. To make a system that can work with defining the rules that would allow DDoc to find the stuff to escape would be a huge project in and of itself. To take a fun example, say I want to make ddoc generate D (don't ask why, I've made use of crazier things): some strings would need to be escaped ( \n for example) but only if they are not in quotes ("\n") but maybe not all quotes (`\n`).

There are other reasons to not escape; when do you escape? Say I define a macro like this "AB=A $(B $1))". This macro will insert an A followed by B expanding its arg. This might be on top of a language package (in which case the A should be escaped if it has special meaning) or as part of a language package (in which case it should not be escaped).

To start having DDoc escape stuff just opens a huge can of worms that /has no solution/ so it just doesn't bother.

I hope some of that is useful.
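A small sketch of why the escaping rules depend entirely on the target: the two functions below are hypothetical helpers, not part of DMD or Ddoc, and the LaTeX one is deliberately incomplete. They only use `replace` from `std.array`.

```d
import std.array : replace;
import std.stdio : writeln;

/// Escapes the characters that are markup in HTML text content.
string escapeForHtml(string s)
{
    // '&' goes first so the entities added afterwards are not escaped again.
    return s.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;");
}

/// Escapes a few characters that are special in LaTeX (deliberately incomplete).
string escapeForLatex(string s)
{
    return s.replace("&", `\&`)
            .replace("%", `\%`)
            .replace("#", `\#`);
}

void main()
{
    string text = "if a < b & c > d";
    writeln(escapeForHtml(text));  // if a &lt; b &amp; c &gt; d
    writeln(escapeForLatex(text)); // if a < b \& c > d
}
```

The same input needs a different treatment per target, and neither function is right for text that is already meant to be markup in that target, which is the ambiguity described above.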
 I would not recommend doing it this way. I'd put the macro definition
 in a .ddoc file and put that on the command line too.
 
Right! And that's what I've been trying to get at all along. I have no control - /zero/ control - over how Phobos web pages are generated from Phobos source files. That is completely out of my hands. The /only/ thing I am able to do is to put ddoc comments inside the source file, and trust the automated build process to churn out the documentation when the next release happens.
(read the bit at the end before you jump on me for this) Are you trying to modify the docs on the /web site/ and in the /dmd.zip/? That makes a big difference. If Walter is willing to let you do this then he will be willing to give you access to the stuff needed to make the changes he will allow you to make. If he is not willing then it doesn't matter. Talk to Walter directly (his e-mail is around here somewhere).
 So far as I can tell, the only way to make this format-independent
 would be if those macro definitions were present in wherever the
 Phobos build process gets its definitions for $(I ...) etc from.
 
 To that end, I believe that the only solution available to me would be
 for Walter to add the following macros to the default macros:
 
 For HTML generation:
 $(AMP) -> &amp;
 $(LT) -> &lt;
 $(GT) -> &gt;
 $(QUOT) -> &quot;
 $(APOS) -> &apos;
 For non-HTML generation:
 $(AMP) -> &
 $(LT) -> <
 $(GT) -> >
 $(QUOT) -> "
 $(APOS) -> '
 And while we're at it, some macros to emit colon and right-bracket
 would be handy too, since these get (sometimes erroneously)
 interpreted as special to ddoc.
 
 I think this affects everyone though - not just people who commit to
 Phobos. Sure - if you have some degree of control over the build
 process you can change what macros are used to expand the ddoc, but I
 believe that the /default/ macros should be sufficient to allow one to
 write format-independent ddoc comments, and right now, they are not.
 
 Correct me if I'm wrong - but I believe I've got it right this time.
 :-)
 
modifying stuff above; if you are trying to rebuild the docs on your system in a format independent manner... yes you got it right (at least the part I understand and care about). The doc source as it stands now is a "sick" hodgepodge of formatting that needs to be cleaned out. This is known. This needs to be addressed at some point. This also is a result of much of the docs predating the advent of DDoc (let's all say "legacy code" :p ).
Mar 02 2008
next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
BCS wrote:
 Reply to Janice,
 
 On 02/03/2008, Derek Parnell <derek psych.ward> wrote:
 
 (for lack of a better place in this post to put it):
 
 The problem with escaping raw output is that DMD would need to include 
 something akin to a full lex/yacc and more to do this. This is because 
 absolutely anything could be raw output. I could use DDoc to generate 
 output for a system that has any arbitrary syntax. To make a system that 
 can work with defining the rules that would allow DDoc to find the stuff 
 to escape would be a huge project in and of itself. To take a fun
 example, say I want to make ddoc generate D (don't ask why, I've made
 use of crazier things): some strings would need to be escaped ( \n for
 example) but only if they are not in quotes ("\n") but maybe not all
 quotes (`\n`).
 There are other reasons to not escape; when do you escape? Say I define
 a macro like this "AB=A $(B $1))". This macro will insert an A followed
 by B expanding its arg. This might be on top of a language package (in
 which case the A should be escaped if it has special meaning) or as
 part of a language package (in which case it should not be escaped).
 
 To start having DDoc escape stuff just opens a huge can of worms that
 /has no solution/ so it just doesn't bother.
 
 I hope some of that is useful.
That's a nice example of why simple macro substitution can never be powerful enough to really solve the markup translation problem. There is a solution, but it requires DDoc learning some new tricks and moving beyond the pure macro processor idea, which I'm not sure Walter is willing to give up. --bb
Mar 02 2008
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
BCS wrote:
 This also is a result of much of the docs predating the
 advent of DDoc (lets all say "legacy code" :p ).
That's so true. I've been slowly cleaning it up.
Mar 02 2008
prev sibling parent Robert Fraser <fraserofthenight gmail.com> writes:
Janice Caron wrote:
 You see, the problem is, I have no control whatsoever over Phobos's
 build process. When Walter does a release, he just types "make", and
 all of the source files, std.xml.d included, get built with the exact
 same command line options. I can't change that. And I /certainly/
 don't want to go messing around with the makefile or fiddling with
 what ddoc does. That's /way/ outside my area. Therefore, I cannot
 influence 1, 2, 3, or 4.
This, then, appears to be a problem with your relationship with Walter or more likely your laziness/insistence on not using the *FOUR* different mechanisms provided to you. The solution to this problem is not to suggest a completely different documentation style. While I'm sure everyone is appreciative of your HTML-to-Ddoc converter, many people (including Walter, it seems) are perfectly happy with Ddoc, and you've failed to make a convincing argument showing something that HTML can do that Ddoc can't.
Mar 02 2008
prev sibling parent reply Roberto Mariottini <rmariottini mail.com> writes:
Walter Bright wrote:
 Janice Caron wrote:
 On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
 Janice Caron wrote:
  > In general, any tool which converts SRC -> DST needs to escape all
  > characters which would be considered markup in DST.

 That is not possible, because ddoc cannot escape every markup in every
  possible output format past, present, and future. ddoc is output
  agnostic - the output is controlled by the macro expansion text.
I don't know what "macro expansion text" means.
Given the macro: FOO= bar $0 def the source: $(FOO abc) expands to: bar abc def It's simple text replacement. Ddoc can use a default set of macro definitions, or the user can supply one.
I don't think this will make Ddoc sufficiently portable across output formats. The source code will be filled with macros to escape character sequences in any supported output format, and become impossible to maintain.

For example, now Ddoc generates HTML, so the user defines the macro $(LT) to generate a literal '<' on the output, $(AMP) to generate a literal '&' and so on. Then the user wants to generate TeX output (provided that the base set of macros to generate TeX output is available), so she has to add other macros to escape TeX output, such as $(BACKSLASH), $(LBRACE) and so on. Then the user needs to generate RTF, so she adds a whole other set of macros, then another set for troff, and so on.

In the end the source code is filled with $(AMP), $(BACKSLASH), $(DOT), $(LBRACE) that add nothing to the formatting of the output document, but are used only to escape output.

Ddoc must provide a way to define escaping of characters for every particular output format. I don't know how, but some way must be found.

Otherwise programmers will stop writing documentation.

Ciao

--
Roberto Mariottini, http://www.mariottini.net/roberto/
SuperbCalc, a free tape calculator: http://www.mariottini.net/roberto/superbcalc/
Mar 03 2008
parent reply Derek Parnell <derek psych.ward> writes:
On Mon, 03 Mar 2008 09:30:18 +0100, Roberto Mariottini wrote:


 Ddoc must provide a way to define escaping of characters for every 
 particular output format. I don't know how, but some way must be found.
 
 Otherwise programmers will stop writing documentation.
I was thinking something along the lines of: whenever dmd, while generating DDoc code, sees *any* punctuation character, it automatically converts it into a DDoc macro for that character. Then the macro expansion can take care of the 'escape' situation.

So, if dmd sees ...

    this function turns &amp; into &

it generates ...

    this function turns $(D_AMP)amp$(D_SC) into $(D_AMP)

then uses the 'current' macro set to do the expansion. So if the current macro set is for HTML we have ...

    D_AMP=&amp;
    D_SC=;

so the end result is ...

    this function turns &amp;amp; into &amp;

OK, so this means that there are a lot more macros to define, but it should make it a lot more portable. I'm not sure if it covers all cases but I think it will cover most.

--
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
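The two-step idea above could be sketched roughly as follows, in Python for illustration only. `D_AMP` and `D_SC` are the macro names from the post; the `CHAR_MACROS` table, the LaTeX macro set, and the helper names `to_macros` and `render` are hypothetical, and nothing here reflects how dmd actually works.

```python
# Which punctuation characters get turned into which macro (hypothetical table;
# a real one would cover every punctuation character).
CHAR_MACROS = {"&": "D_AMP", ";": "D_SC"}

# One macro set per output format. The HTML values are the ones from the post;
# the LaTeX set is an assumption added to show a second target.
HTML_MACROS = {"D_AMP": "&amp;", "D_SC": ";"}
LATEX_MACROS = {"D_AMP": r"\&", "D_SC": ";"}

def to_macros(text):
    """Step 1: replace each listed punctuation character with a macro call."""
    return "".join(
        f"$({CHAR_MACROS[ch]})" if ch in CHAR_MACROS else ch for ch in text
    )

def render(text, macro_set):
    """Step 2: expand the generated macro calls using the current macro set."""
    out = to_macros(text)
    for name, value in macro_set.items():
        out = out.replace(f"$({name})", value)
    return out

source = "this function turns &amp; into &"
print(to_macros(source))             # this function turns $(D_AMP)amp$(D_SC) into $(D_AMP)
print(render(source, HTML_MACROS))   # this function turns &amp;amp; into &amp;
print(render(source, LATEX_MACROS))  # this function turns \&amp; into \&
```

The doc comment itself stays free of escape macros; only the macro set changes per target. Sequences of several characters that need escaping together are not handled by a per-character table like this one.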
Mar 03 2008
parent BCS <ao pathlink.com> writes:
Reply to Derek,

 On Mon, 03 Mar 2008 09:30:18 +0100, Roberto Mariottini wrote:
 
 Ddoc must provide a way to define escaping of characters for every
 particular output format. I don't know how, but some way must be
 found.
 
 Otherwise programmers will stop writing documentation.
 
I was thinking something along the lines of: whenever dmd, while generating DDoc code, sees *any* punctuation character, it automatically converts it into a DDoc macro for that character. Then the macro expansion can take care of the 'escape' situation.

So, if dmd sees ...

    this function turns &amp; into &

it generates ...

    this function turns $(D_AMP)amp$(D_SC) into $(D_AMP)

then uses the 'current' macro set to do the expansion. So if the current macro set is for HTML we have ...

    D_AMP=&amp;
    D_SC=;

so the end result is ...

    this function turns &amp;amp; into &amp;

OK, so this means that there are a lot more macros to define, but it should make it a lot more portable. I'm not sure if it covers all cases but I think it will cover most.
Hmmm. Food for thought. What about multi-char sequences?
Mar 03 2008
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Janice Caron wrote:
 On 01/03/2008, Walter Bright <newshound1 digitalmars.com> wrote:
For what it matters, I think Janice is right in the sense that we have a problem with DDoc here. But let me put it in my terms: I think that there are two issues under discussion here:

  What DDoc can be.
  What DDoc currently is (this one might not seem an issue, but it is).

== What DDoc can be?:

Option 1: DDoc is simply a macro expansion system. Nothing is said about what the output format means, so it can be HTML, PDF, whatever. This is actually pretty bad: because there is no standard for what the output means, tools cannot make assumptions about it, and thus cannot work with it in a standard way. For example, an IDE could macro-process the DDoc but then it wouldn't know how to render it.

Option 2: DDoc is a macro expansion system, but specifies how the output text should be interpreted. It can say it should be interpreted as HTML, for example (or a subset of it). Now tools can interpret it, but you can't write output for other targets (or at least shouldn't).

Option 3: DDoc is a markup language, besides a macro expansion system. This allows conversion for different targets, such as HTML or PDF, but in my opinion seems overkill. Why invent another markup language, if we already have HTML and others (like Wiki formatting, etc.)?

To me option 2 seems the ideal.

== What DDoc currently is?:

Well, the problem here is that this is ambiguous. Walter stated that: "ddoc is a text macro processing program. It does not know what html is.". That would be option 1. However, not only does this suck, but dmd has functionality that generates HTML output from DDoc, which makes HTML output somewhat of a de facto standard. For example, the Eclipse IDEs (Descent and Mmrnhrmm) assume HTML output when they render DDoc in popups. If you try to put something else there, it will look garbled. So I would argue that DDoc is more like Option 2 than 1.

Janice Caron wrote:
 Here's one case where it matters. In the documentation for std.xml,
 the ddoc source says something like

     this function converts &amp;amp; to &amp;

 because that was the only way I could get it to render in a web browser as
     this function converts &amp; to &

 The question is, will that render correctly when converted to PDF? And
 if not, what is the destination-independent way of writing something
 that will render as "&amp;"?

 I would call failing to sanitize, a bug. Except I can't, because it's
 behaving as documented.
Following on from what I said above, you could only have a destination-independent way of writing things if DDoc was option 3 - a proper markup language. Which it isn't. Should it be? Well, that's another discussion.

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Mar 05 2008
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Mar 01, 2008 at 06:51:22PM +0000, Janice Caron wrote:
 DMD should /not/ pass HTML
 through unchanged. It should sanitize the input, in order specifically
 to prevent it from being interpreted as markup in the output format.
Maybe it should have some kind of conversion macros for individual characters. These would be applied after the normal parsing of the file, at the step right before the output file is actually generated.

For HTML, you would define a macro file that says

    '&' => '&amp;'
    '<' => '&lt;'

And so on. For LaTeX, you might say

    '&' => '\&'

Etc.

This would be somewhat tedious to write, but common output formats could have the appropriate files written once and reused, and it is flexible enough that you could easily expand it to other output formats with different rules.

--
Adam D. Ruppe
http://arsdnet.net
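A rough sketch of such per-format character tables, in Python purely for illustration. The mappings are the ones listed in the post; the table names and the `escape` helper are hypothetical, and the sketch assumes the conversion is applied only to literal text from the doc comment, not to markup that the macros themselves emit.

```python
# Per-format character conversion tables, using the mappings from the post.
HTML_ESCAPES = {"&": "&amp;", "<": "&lt;"}
LATEX_ESCAPES = {"&": r"\&"}

def escape(text, table):
    """Convert each character listed in the table; leave all others untouched.

    str.translate works in a single pass over the input, so the '&' that
    appears inside a replacement such as '&lt;' is never escaped a second time.
    """
    return text.translate(str.maketrans(table))

literal = "a < b & c"
print(escape(literal, HTML_ESCAPES))   # a &lt; b &amp; c
print(escape(literal, LATEX_ESCAPES))  # a < b \& c
```

Supporting another output format then means writing one more table, with no change to the documentation source.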
Mar 01 2008
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Janice Caron, el 23 de febrero a las 08:56 me escribiste:
 On 23/02/2008, Leandro Lucarella <llucax gmail.com> wrote:
  I don't want to use a WYSIWYG HTML editor when writing code.
Nor do I. Perhaps you misunderstood. I was talking about /documentation/, not code. Raw ddoc is absolutely perfect for inline doc comments, and Walter has done a /fantastic/ job in getting a system like that going. What I'm talking about is real documentation, like, actual instruction manuals - tutorials, fifty or a hundred pages long, or more when printed. Think "std.whatever For Dummies". A whole book. There is no way I want to write a whole book in ddoc, and /for that purpose/, a wysiwyg editor wouldn't be a bad thing.
OK, all my other rant was about using HTML for documenting the code. For higher-level documentation I agree a WYSIWYG editor could be desirable (I still think that the RST format makes writing documentation much easier, but I don't expect to convince you if you haven't taken a look at it yet).
 I don't want to auto-complete anything, that's the point :)
It's /a/ point, but it's not /the/ point. When I request something, it's usually because /I/ want it, so whether or not /you/ want it is really kind of incidental.
But this point remains. *You* don't want to autocomplete anything either. If you do, it's because you are using a complex tool, and you have to use another tool to lower the complexity :)
 I'm glad for you, but your experience doesn't make HTML suck less for
  humans. All the tools you mention are needed because it *sucks*. If it
  doesn't suck, you don't need tools to make it suck less.
That's only true if you look at it from the point of view that you're supposed to interact directly with the source, and that's not necessarily so. You could argue in exactly the same way that Microsoft Word format, or RTF, both suck, because you need special tools (i.e. Microsoft Word or some other word processor) to manipulate them.
No, I said that HTML sucks for being written by humans, and under those circumstances I will tell you over and over again that yes, RTF and MS Word suck much more. Fortunately nobody writes those formats by hand, but many people write HTML by hand, and use tools just to ease that task (as you said, with auto-completion and such).
 If
 you were to try editing an RTF document by editing the raw text file
 with a plain text editor, I'm sure you would quickly come to the
 conclusion that RTF sucks.
Of course.
 However, that's just not what you do. But
 HTML is really just another document format, like Word document or
 RTF, and so, from that point of view, the underlying representation
 matters less. What matters more is its portability, how easy it is to
 convert it to other formats, and the availability of tools to edit it.
But you want to write HTML by hand!!! That's my point :) I don't write .o by hand either, I let the compiler do it by processing some weird format called D, and that's perfectly fine for me :)
 I was talking about HTML sucking for
  being written by humans, not about your particular documentation process
  :)
I get that, but you leapt into a thread that I started, basically telling me that /I/ shouldn't be doing something, or requesting something, because /you/ don't need it. No disrespect intended, but ... huh?
No, I'm opposed to that feature because if HTML is allowed (which apparently it is), people will start using it and the documentation comments will become unreadable for humans. It's not that "I don't use it, I don't want anyone else to use it". I just think it will hurt D as a language, encouraging bad practices, and making the possibility of exporting ddoc to other formats harder and trickier.
  PS: What we do agree on is that I don't want to start a war on formats
     either, just because it is too naive to think that Walter will change
     DDoc for something else.
And nor do I. It's a solved problem. However, it still remains the case that ddoc documentation /may contain raw HTML, but its use is strongly discouraged/. That, I think, is bad. Either don't allow it at all, or fully support it.
We agree on this :)

--
Leandro Lucarella (luca) | Collective blog: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Children's dreams are especially important during their formative years; if a child does not dream, it means he will be an idiot his whole life.
    -- Ricardo Vaporeso
Feb 23 2008
prev sibling parent Alexander Panek <alexander.panek brainsware.org> writes:
Janice Caron wrote:
 What I'm talking about is real documentation, like, actual instruction
 manuals - tutorials, fifty or a hundred pages long, or more when
 printed. Think "std.whatever For Dummies". A whole book. There is no
 way I want to write a whole book in ddoc, and /for that purpose/, a
 wysiwyg editor wouldn't be a bad thing.
Don't tell me you write, or intend to write, books in HTML. Please.
Feb 23 2008
prev sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Janice Caron wrote:
 Then, when the doc is compiled, the contents of an <?html...?> blob
 can just be pasted into the generated document, after some small
 amount of validation.
I'm for it... unless you're suggesting getting rid of tags & the default look, in which case I'm against it. Keep in mind that doc comments are used by more than just doc generators (IDEs, static analysis tools...), and should be simple to write in the common case of documenting a function, so tags and other non-formatting metadata should remain intact or otherwise be integrated into any new system.
Feb 22 2008
parent "Janice Caron" <caron800 googlemail.com> writes:
On 22/02/2008, Robert Fraser <fraserofthenight gmail.com> wrote:
 I'm for it... unless you're suggesting getting rid of tags & the default
  look, in which case I'm against it. Keep in mind that doc comments are
  used by more than just doc generators (IDEs, static analysis tools...),
  and should be simple to write in the common case of documenting a
  function, so tags and other non-formatting metadata should remain intact
  or otherwise be integrated into any new system.
Well, there's good news and there's bad news.

The good news is that I've written an application called ddocker which converts HTML to ddoc, so all you have to do is write your HTML using your editor of choice (so you get syntax highlighting, auto-completion, wysiwyg, or whatever HTML editing methodology you prefer), run ddocker on it, and paste the resulting output into D source code, which will then compile as error-free ddoc. It took me about half an hour to write the program, which I consider to be a much better use of my time than learning all the ddoc formatting codes. (I don't have time to learn a whole new language).

The bad news is that I discovered two bugs in std.xml. They're small, but enough to stop ddocker from working. I have now fixed those bugs in std.xml, but obviously the rest of you won't see the fixes until D2.012, which means you won't be able to build ddocker until then. (It took me longer to fix std.xml than it did to write ddocker).

Oh, and Walter says that ddoc code can already contain HTML, but that its use is strongly discouraged for various good reasons. Perhaps, in the future, after std.xml is stable and ddocker has a good track record, it might be possible to have HTML-within-ddoc auto-compile into ddoc (and then, ultimately, back to HTML or PDF or whatever as the final destination). But for now, ddocker is a good compromise.

Anyway - time well spent, I'd say. I've got a nice little app, and I've fixed some bugs. Life is good.
Feb 22 2008