digitalmars.D.bugs - Fix for endless loop with HTML files

Anders F Björklund (11/11) Nov 09 2004 The problem is when it encounters a '<'

David Friedman (4/20) Nov 13 2004 Thanks! I don't think "< CODE >" is valid HTML, so this patch should be...

Thomas Kuehne (14/19) Nov 13 2004 I've run the attached file (with "< code>" and "")through the va...

Anders F Björklund (10/33) Nov 14 2004 If you validate as XHTML, instead of the old HTML 4.0,

Anders F Björklund <afb algonet.se> writes:

The problem is when it encounters a '<'
character, that is *not* followed by a
valid tag start, according to istagstart()

Such as: < CODE >, and similar constructs.

Then it fails to advance the pointer (!),
and keeps on scanning the '<' character
forever. Inserting an "else p++;" works...

You still need <code> and </code> in order
for D to actually parse the code, but at
least it comes back again with this patch.

--anders

Nov 09 2004
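
The failure mode described above can be sketched in C. This is a minimal illustration only, not the compiler's actual source: count_tags() and is_tag_start() are hypothetical names, and is_tag_start() is an assumed stand-in for istagstart(), whose real rules may differ. The point is the final "else p++;" branch, which guarantees the pointer moves on every iteration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for istagstart(): assume a tag must begin with a
 * letter or '/' directly after '<'. The real check may differ. */
static int is_tag_start(unsigned char c)
{
    return isalpha(c) || c == '/';
}

/* Counts the '<' characters in src that are followed by a valid tag start. */
size_t count_tags(const char *src)
{
    assert(src != NULL);

    size_t tags = 0;
    const char *p = src;

    while (*p != '\0') {
        if (*p == '<') {
            if (is_tag_start((unsigned char)p[1])) {
                tags++;
                /* Skip to the closing '>' (or to the end of input). */
                while (*p != '\0' && *p != '>')
                    p++;
                if (*p == '>')
                    p++;
            } else {
                /* The fix: without this branch p never moves past a
                 * stray '<', and the loop spins forever. */
                p++;
            }
        } else {
            p++;
        }
    }
    return tags;
}
```

With the else branch in place, input such as "< CODE >" is scanned as plain character data and the loop terminates; removing that branch reproduces the hang.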

David Friedman <d3rdclsmail_a_ _t_earthlink_d_._t_net> writes:

Anders F Björklund wrote:
 The problem is when it encounters a '<'
 character, that is *not* followed by a
 valid tag start, according to istagstart()
 
 Such as: < CODE >, and similar constructs.
 
 Then it fails to advance the pointer (!),
 and keeps on scanning the '<' character
 forever. Inserting a "else p++;" works...
 
 You still need <code> and </code> in order
 for D to actually parse the code, but at
 least it comes back again with this patch.
 
 --anders
 

Thanks!  I don't think "< CODE >" is valid HTML, so this patch should be 
enough.

David

Nov 13 2004

"Thomas Kuehne" <thomas-dloop kuehne.cn> writes:

David Friedman wrote:
 You still need <code> and </code> in order
 for D to actually parse the code, but at
 least it comes back again with this patch.

 Thanks!  I don't think "< CODE >" is valid HTML, so this patch should be
 enough.


I've run the attached file (with "< code>" and "</ code>") through the validator at
http://validator.w3.org/check

and got no warnings or errors.

Thomas


[Attachment: html-4.1-tagspace.html]

Nov 13 2004

Anders F Björklund <afb algonet.se> writes:

Thomas Kuehne wrote:

 I've run the attached file (with "< code>" and "</ code>") through the
 validator at http://validator.w3.org/check
 
 and got no warnings or errors.

If you validate as XHTML, instead of the old HTML 4.0,
you get the following validation errors (from the same W3C validator):

1.
 Line 9, column 2: character "<" is the first character of a delimiter
 but occurred as data
 
 < code >
 
 If you wish to include the "<" character in your output, you should
 escape it as "&lt;". Another possibility is that you forgot to close
 quotes in a previous tag.

2.
 Line 9, column 2: character data is not allowed here
 
 < code >
 
 You have used character data somewhere it is not permitted to appear.
 
 Mistakes that can cause this error include putting text directly in the
 body of the document without wrapping it in a container element (such as
 a <p>aragraph</p>) or forgetting to quote an attribute value (where
 characters such as "%" and "/" are common, but cannot appear without
 surrounding quotes).

In general, XHTML and UTF-8 are now recommended
instead of the old HTML 4.01 and ISO-8859-1...


Anyway, nobody writes < code > in any real stuff.
It's just that it's nice if D doesn't HANG on it. :-)

--anders

Nov 14 2004
