From 28aa78ac9465788c745b1abb9b59a2723282c4fa Mon Sep 17 00:00:00 2001
From: Jorge Arellano Cid
Date: Wed, 11 May 2016 10:02:49 -0300
Subject: Fixed handling of BODY and HTML tags. Also improved their html-bug
 messages.

BODY and HTML have optional open and close, making them tricky to handle.
Even more when considering Tag soup pages with multiple body or html
sections, and corner cases.

This patch tackles the problems by leaving the first HTML and BODY stack
elements open, until EOF. There's also better html-bug detection and
messages, and more accurate comments in the code.

Beware: it may look simple, but it's not!
---
 src/html_common.hh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'src/html_common.hh')

diff --git a/src/html_common.hh b/src/html_common.hh
index 68ed0d08..6d0d8c62 100644
--- a/src/html_common.hh
+++ b/src/html_common.hh
@@ -177,9 +177,11 @@ public: //BUG: for now everything is public
    bool PrevWasCR; /* Flag to help parsing of "\r\n" in PRE tags */
    bool PrevWasOpenTag; /* Flag to help deferred parsing of white space */
    bool InVisitedLink; /* used to 'contrast_visited_colors' */
-   bool ReqTagClose; /* Flag to help handling bad-formed HTML */
+   bool ReqTagClose; /* Flag to close the stack's top tag */
    bool TagSoup; /* Flag to enable the parser's cleanup functions */
    bool loadCssFromStash; /* current stash content should be loaded as CSS */
+   bool PrevWasBodyClose; /* set when </body> is found */
+   bool PrevWasHtmlClose; /* set when </html> is found */
 
    /* element counters: used for validation purposes.
     * ATM they're used as three state flags {0,1,>1} */
-- 
cgit v1.2.3