4 files changed, 96 insertions, 110 deletions
diff --git a/doc/Cache.txt b/doc/Cache.txt
index ac1ecf87..413773ef 100644
--- a/doc/Cache.txt
+++ b/doc/Cache.txt
@@ -1,5 +1,5 @@
  June 2000, --Jcid
- Last update: Oct 2004
+ Last update: Jul 2009
 
                               -------
                                CACHE
@@ -12,28 +12,21 @@ rendering and networking.
 calls  the  cache  or  the  dpi routines depending on the type of
 request.
 
-   Every  URL  must be requested using a_Capi_open_url, no matter
-if  it is a http, file, dpi or whatever type of request. The capi
-asks  the  dpi  module  for dpi URLs and the Cache for everything
-else.
+   Every  URL  must be requested using a_Capi_open_url, which
+sends the request to the cache if the data is cached, to dillo's
+http module for http: URLs, and through dillo's DPI system for
+other URLs.
 
    Here we'll document non dpi requests.
 
-   The  cache,  at its turn, sends the requested-data from memory
-(if  cached),  or opens a new network connection (if not cached).
-
-   This  means  that  no  mattering whether the answer comes from
-memory  or  the  net,  the  client  requests  it through the capi
-wrapper, in a single uniform way.
-
 
                          ----------------
                          CACHE PHILOSOPHY
                          ----------------
 
-   Dillo's  cache  is  very  simple, every single resource that's
-retrieved  (URL)  is  kept  in  memory. NOTHING is saved. This is
-mainly for three reasons:
+   Dillo's  cache  is  very  simple; every single resource that's
+retrieved  (URL)  is  kept  in  memory. NOTHING is saved to disk.
+This is mainly for three reasons:
 
    - Dillo encourages personal privacy and it assures there'll be
 no recorded tracks of the sites you visited.
@@ -42,7 +35,7 @@ no recorded tracks of the sites you visited.
 serve as caches.
 
    -  If  you still want to have cached stuff, you can install an
-external cache server (as WWWOFFLE), and benefit from it.
+external cache server (such as WWWOFFLE), and benefit from it.
 
 
                          ---------------
@@ -51,15 +44,14 @@ external cache server (as WWWOFFLE), and benefit from it.
 
    Currently, dillo's cache code is spread in different sources:
 mainly  in  cache.[ch],  dicache.[ch]  and  it  uses  some  other
-functions from mime.c, Url.c and web.c.
+functions from mime.c and web.cc.
 
-   Cache.c  is  the  principal  source,  and  it also is the main
+   Cache.c  is  the  principal  source,  and  it also is the one
 responsible  for  processing  cache-clients  (held  in  a queue).
-Dicache.c  is  the  "decompressed  image  cache" and it holds the
-original    data   and   its   corresponding   decompressed   RGB
-representation (more on this subject in Images.txt).
+Dicache.c  is  the interface to the decompressed RGB representations
+of currently-displayed images held in DW's imgbuf.
 
-   Url.c,  mime.c  and  web.c  are  used  for secondary tasks; as
+   mime.c  and  web.cc  are  used  for secondary tasks such as
 assigning the right "viewer" or "decoder" for a given URL.
 
 
@@ -67,7 +59,7 @@ assigning the right "viewer" or "decoder" for a given URL.
 A bit of history
 ----------------
 
-   Some  time  ago,  the  cache  functions,  URL  retrieving  and
+   Some  time  ago,  the  cache  functions,  URL  retrieval  and
 external  protocols  were  a whole mess of mixed code, and it was
 getting  REALLY hard to fix, improve or extend the functionality.
 The  main  idea  of  this  "layering" is to make code-portions as
@@ -76,32 +68,34 @@ improved or replaced without affecting the rest of the browser.
 
    An  interesting  part of the process is that, as resources are
 retrieved,  the  client  (dillo  in  this  case) doesn't know the
-Content-Type  of the resource at request-time. It only gets known
-when  the  resource  header  is retrieved (think of http), and it
-happens  when  the  cache  has the control so, the cache sets the
-proper  viewer for it! (unless the Callback function is specified
-with the URL request).
+Content-Type  of the resource at request-time. It only becomes known
+when  the  resource  header  is retrieved (think of http). This
+happens  when  the  cache  has control, so the cache sets the
+proper  viewer for it (unless the Callback function was already
+specified with the URL request).
 
    You'll find a good example in http.c.
 
-   Note:  Files  don't have a header, but the file handler inside
-dillo  tries  to  determine the Content-Type and sends it back in
-HTTP form!
+   Note: All resources received by the cache have HTTP-style headers.
+   The file/data/ftp DPIs generate these headers when sending their
+   non-HTTP resources. Most importantly, a Content-Type header is
+   generated based on file extension or file contents.
 
 
 -------------
 Cache clients
 -------------
 
-   Cache clients MUST use a_Cache_open_url to request an URL. The
+   Cache clients MUST use a_Capi_open_url to request a URL. The
 client structure and the callback-function prototype are defined,
 in cache.h, as follows:
 
 struct _CacheClient {
-   gint Key;                /* Primary Key for this client */
-   const char *Url;         /* Pointer to a cache entry Url */
-   guchar *Buf;             /* Pointer to cache-data */
-   guint BufSize;           /* Valid size of cache-data */
+   int Key;                 /* Primary Key for this client */
+   const DilloUrl *Url;     /* Pointer to a cache entry Url */
+   int Version;             /* Dicache version of this Url (0 if not used) */
+   void *Buf;               /* Pointer to cache-data */
+   uint_t BufSize;          /* Valid size of cache-data */
    CA_Callback_t Callback;  /* Client function */
    void *CbData;            /* Client function data */
    void *Web;               /* Pointer to the Web structure of our client */
@@ -124,28 +118,15 @@ Key-functions descriptions
 --------------------------
 
 ································································
-int a_Cache_open_url(const char *Url, CA_Callback_t Call, void *CbData)
+int a_Cache_open_url(void *Web, CA_Callback_t Call, void *CbData)
 
-   if Url is not cached
+   if Web->url is not cached
       Create a cache-entry for that URL
       Send client to cache queue
-      Initiate a new connection
    else
       Feed our client with cached data
 
 ································································
-ChainFunction_t a_Url_get_ccc_funct(const char *Url)
-
-   Scan the Url handlers for a handler that matches
-   If found
-      Return the CCC function for it
-   else
-      Return NULL
-
-   *  Ex:  If  Url is an http request, a_Http_ccc is the matching
-handler.
-
-································································
 
 ----------------------
 Redirections mechanism
@@ -177,9 +158,9 @@ Notes
 to document it in more detail later (source is commented).
    Currently  I  have  a drawing to understand it; hope the ASCII
 translation serves the same as the original.
-   If  you're  planning to understand the cache process troughly,
-write  me  a  note,  just  to assign a higher priority on further
-improving of this doc.
+   If  you're  planning to understand the cache process thoroughly,
+write  me  a  note and I will assign higher priority to further
+improvement of this doc.
    Hope this helps!
 
 
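Reviewer note on the a_Cache_open_url pseudocode in the Cache.txt hunks above: the "not cached: create entry and queue the client; cached: feed the client" flow can be pictured with a toy model. This is only a sketch under stated assumptions, not dillo's code. Every Toy*/toy_* name, the fixed-size tables and the string-based URL key are hypothetical and invented for illustration; the real function takes a Web structure and a CA_Callback_t.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8   /* arbitrary capacity for this sketch */

/* Hypothetical stand-ins; dillo's real types are declared in cache.h. */
typedef void (*ToyCallback)(const char *data, void *cb_data);
typedef struct { const char *url; const char *data; } ToyEntry;
typedef struct { int key; const char *url; ToyCallback cb; void *cb_data; } ToyClient;

static ToyEntry toy_entries[TOY_MAX];
static ToyClient toy_queue[TOY_MAX];
static int toy_num_entries, toy_num_queued, toy_next_key = 1;

/* Linear search over the toy entry table. */
static ToyEntry *toy_entry_search(const char *url)
{
   for (int i = 0; i < toy_num_entries; i++)
      if (strcmp(toy_entries[i].url, url) == 0)
         return &toy_entries[i];
   return NULL;
}

/* Example client callback: remembers the data pointer it was fed. */
void toy_store_cb(const char *data, void *cb_data)
{
   *(const char **)cb_data = data;
}

/* Returns a client key (> 0), or 0 when the toy tables are full.
 * The toy keeps the caller's pointers; it copies nothing. */
int toy_cache_open_url(const char *url, ToyCallback cb, void *cb_data)
{
   ToyEntry *e = toy_entry_search(url);

   if (e != NULL && e->data != NULL) {
      cb(e->data, cb_data);            /* cached: feed the client now */
      return toy_next_key++;
   }
   if (toy_num_queued == TOY_MAX || (e == NULL && toy_num_entries == TOY_MAX))
      return 0;
   if (e == NULL) {                    /* not cached: create an entry */
      toy_entries[toy_num_entries].url = url;
      toy_entries[toy_num_entries].data = NULL;
      toy_num_entries++;
   }
   /* Send the client to the queue; it is fed when data arrives. */
   toy_queue[toy_num_queued].key = toy_next_key;
   toy_queue[toy_num_queued].url = url;
   toy_queue[toy_num_queued].cb = cb;
   toy_queue[toy_num_queued].cb_data = cb_data;
   toy_num_queued++;
   return toy_next_key++;
}

/* Simulates data arriving for url: store it and feed waiting clients. */
void toy_cache_feed(const char *url, const char *data)
{
   ToyEntry *e = toy_entry_search(url);
   int kept = 0;

   if (e == NULL)
      return;
   e->data = data;
   for (int i = 0; i < toy_num_queued; i++) {
      if (strcmp(toy_queue[i].url, url) == 0)
         toy_queue[i].cb(data, toy_queue[i].cb_data);
      else
         toy_queue[kept++] = toy_queue[i];   /* keep other clients queued */
   }
   toy_num_queued = kept;
}
```

In this model a first request for a URL only queues the client, and a later toy_cache_feed call delivers the data. A second request for the same URL is answered straight from the table, which mirrors the "Feed our client with cached data" branch.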
diff --git a/doc/Dillo.txt b/doc/Dillo.txt
index 47f89780..a2c80afe 100644
--- a/doc/Dillo.txt
+++ b/doc/Dillo.txt
@@ -31,17 +31,17 @@ neccesary data structures and mechanisms for graphical rendering.
 engine  that  handles file descriptor activity, the cache acts as
 the  main  abstraction  layer  between  rendering and networking.
    Every  URL,  whether  cached  or  not, must be retrieved using
-a_Cache_open_url   (Described   briefly   in   Cache.txt,  source
-contained in cache.c).
-   IO is described in IO.txt (recommended), source in IO/.
+a_Capi_open_url   (Described   briefly   in   Cache.txt,  source
+contained in capi.c).
+   IO is described in IO.txt (recommended), source in src/IO/.
 
    3.-  The  HTML  parser: A streamed parser that joins the Dillo
 Widget  and  the  Cache  functionality  to make browsing possible
-(Described in HtmlParser.txt, source mainly inside html.c).
+(Described in HtmlParser.txt, source mainly inside html.cc).
 
    4.-  Image  processing  code:  The  part  that  handles  image
-retrieving,  decoding,  caching  and  displaying.  (Described  in
-Images.txt.   Sources:  image.c,  dw_image.c,  dicache.c,  gif.c,
+retrieval,  decoding,  caching  and  displaying.  (Described  in
+Images.txt.   Sources:  image.c,  dw/image.cc,  dicache.c,  gif.c,
 jpeg.c and png.c)
 
    5.- The dpi framework: a gateway to interface the browser with
@@ -55,34 +55,27 @@ Dpi spec: http://www.dillo.org/dpi1.html
 
 (A short description of the internal function calling process)
 
-   When  the  user requests a new URL, a_Interface_entry_open_url
+   When  the  user requests a new URL, a_UIcmd_open_url
 is  queried to do the job; it calls a_Nav_push (The highest level
 URL  dispatcher); a_Nav_push updates current browsing history and
 calls  Nav_open_url.  Nav_open_url closes all open connections by
-calling  a_Interface_stop  and  a_Interface_stop,  and then calls
-a_Capi_open_url wich calls a_Cache_open_url (or the dpi module if
+calling  a_Bw_stop_clients,  and then calls
+a_Capi_open_url which calls a_Cache_open_url (or the dpi module if
 this gateway is used).
 
-   If  Cache_search  hits  (due to a cached url :), the client is
+   If  Cache_entry_search  hits  (due to a cached url :), the client is
 fed  with cached data, but if the URL isn't cached yet, a new CCC
-(Concomitant  Control Chain) is created and commited to fetch the
-URL.  Note  that  a_Cache_open_url will return the requested URL,
-whether cached or not.
+(Concomitant  Control Chain) is created and committed to fetch the
+URL.
 
    The  next  CCC  link  is dynamically assigned by examining the
-URL's protocol. It can be:
+URL's protocol. It can be a_Http_ccc or a_Dpi_ccc.
 
-   a_Http_ccc
-   a_File_ccc
-   a_About_ccc
-   a_Plugin_ccc (not implemented yet)
-
-
-   If  we  have a HTTP URL, a_Http_ccc will succeed, and the http
+   If  we  have an HTTP URL, a_Http_ccc will succeed, and the http
 module  will  be linked; it will create the proper HTTP query and
 link the IO module to submit and deliver the answer.
 
-   Note  that  as the Content-type of the URL is not always known
+   Note  that  as the Content-Type of the URL is not always known
 in  advance, the answering branch decides where to dispatch it to
 upon HTTP header arrival.
 
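Reviewer note on "The next CCC link is dynamically assigned by examining the URL's protocol" in the Dillo.txt hunk above: a minimal sketch of that choice, assuming the rule stated in the updated Cache.txt text (http: URLs go to the http module, everything else goes through DPI). ToyLink and toy_pick_ccc_link are hypothetical names; the real dispatch hands back a_Http_ccc or a_Dpi_ccc and is not shown in this patch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical labels for the two possible next links. */
typedef enum { TOY_LINK_HTTP, TOY_LINK_DPI } ToyLink;

/* Picks the next link from the URL scheme, case-insensitively.
 * Only a literal "http:" prefix selects the http module; any other
 * scheme (file:, ftp:, https:, ...) falls through to DPI here. */
ToyLink toy_pick_ccc_link(const char *url)
{
   static const char scheme[] = "http:";

   for (size_t i = 0; scheme[i] != '\0'; i++) {
      /* A shorter url hits its '\0' here and returns before overrunning. */
      if (tolower((unsigned char)url[i]) != scheme[i])
         return TOY_LINK_DPI;
   }
   return TOY_LINK_HTTP;
}
```

Treating https: as a DPI URL follows from the "other URLs" wording in the patch; it is an inference from that sentence, not something the patch states explicitly.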
diff --git a/doc/HtmlParser.txt b/doc/HtmlParser.txt
index ec64164d..891eafd6 100644
--- a/doc/HtmlParser.txt
+++ b/doc/HtmlParser.txt
@@ -1,5 +1,5 @@
  October 2001, --Jcid
- Last update: Dec 2004
+ Last update: Jul 2009
 
                         ---------------
                         THE HTML PARSER
@@ -11,7 +11,7 @@ and  plain  text  also.  It  has  parsing 'modes' that define its
 behaviour while working:
 
    typedef enum {
-     DILLO_HTML_PARSE_MODE_INIT,
+     DILLO_HTML_PARSE_MODE_INIT = 0,
      DILLO_HTML_PARSE_MODE_STASH,
      DILLO_HTML_PARSE_MODE_STASH_AND_BODY,
      DILLO_HTML_PARSE_MODE_BODY,
@@ -22,12 +22,12 @@ behaviour while working:
 
    The  parser  works  upon a token-grained basis, i.e., the data
 stream is parsed into tokens and the parser is fed with them. The
-process  is  simple:  whenever  the  cache  has new data, it gets
+process  is  simple:  whenever  the  cache  has new data, it is
 passed to Html_write, which groups data into tokens and calls the
-appropriate functions for the token type (TAG, SPACE or WORD).
+appropriate functions for the token type (tag, space, or word).
 
    Note:   when  in  DILLO_HTML_PARSE_MODE_VERBATIM,  the  parser
-doesn't  try  to  split  the  data stream into tokens anymore, it
+doesn't  try  to  split  the  data stream into tokens anymore; it
 simply collects until the closing tag.
 
 ------
@@ -65,14 +65,14 @@ TOKENS
     As  it's  a  common  mistake  for human authors to mistype or
     forget  one  of  the  quote  marks of an attribute value; the
     parser   solves  the  problem  with  a  look-ahead  technique
-    (otherwise  the  parser  could  skip significative amounts of
-    well written HTML).
+    (otherwise  the  parser  could  skip significant amounts of
+    properly-written HTML).
 
 
 
   * WORD                       --> Html_process_word
 
-    A  word is anything that doesn't start with SPACE, and that's
+    A  word is anything that doesn't start with SPACE, that's
     outside  of  a  tag, up to the first SPACE or tag start.
 
     SPACE = ' ' | \n | \r | \t | \f | \v
@@ -84,26 +84,34 @@ THE PARSING STACK
 
   The parsing state of the document is kept in a stack:
 
-  struct _DilloHtml {
+  class DilloHtml {
      [...]
-     DilloHtmlState *stack;
-     gint stack_top; /* Index to the top of the stack [0 based] */
-     gint stack_max;
+     lout::misc::SimpleVector<DilloHtmlState> *stack;
      [...]
   };
 
   struct _DilloHtmlState {
-    char *tag;
-    DwStyle *style, *table_cell_style;
-    DilloHtmlParseMode parse_mode;
-    DilloHtmlTableMode table_mode;
-    gint list_level;
-    gint list_number;
-    DwWidget *page, *table;
-    gint32  current_bg_color;
+     CssPropertyList *table_cell_props;
+     DilloHtmlParseMode parse_mode;
+     DilloHtmlTableMode table_mode;
+     bool cell_text_align_set;
+     DilloHtmlListMode list_type;
+     int list_number;
+
+     /* TagInfo index for the tag that's being processed */
+     int tag_idx;
+
+     dw::core::Widget *textblock, *table;
+
+     /* This is used to align list items (especially in enumerated lists) */
+     dw::core::Widget *ref_list_item;
+  
+     /* This is used for list items etc; if it is set to TRUE, breaks
+        have to be "handed over" (see Html_add_indented and
+        Html_eventually_pop_dw). */
+     bool hand_over_break;
   };
 
-
   Basically,  when a TAG is processed, a new state is pushed into
 the  'stack'  and  its  'style'  is  set  to  reflect the desired
 appearance (details in DwStyle.txt).
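Reviewer note on the token rules in the HtmlParser.txt hunks above (tag, space, word, with SPACE = ' ' | \n | \r | \t | \f | \v): a minimal sketch of how a buffer could be split by those rules. The names ToyTokenType, toy_is_space and toy_next_token are hypothetical. The sketch deliberately ends a tag at the first '>' and so omits the quote look-ahead the document describes; it also does not model Html_write waiting for more data when a tag is incomplete.

```c
#include <stddef.h>
#include <string.h>

typedef enum { TOY_TAG, TOY_SPACE, TOY_WORD } ToyTokenType;

/* SPACE as defined in the document: ' ', \n, \r, \t, \f, \v. */
static int toy_is_space(char c)
{
   return c != '\0' && strchr(" \n\r\t\f\v", c) != NULL;
}

/* Classifies the token starting at buf[0], stores its type in *type
 * and returns its length. Returns 0 at the end of the input. */
size_t toy_next_token(const char *buf, ToyTokenType *type)
{
   size_t n = 0;

   if (buf[0] == '<') {
      /* Simplification: the tag runs to the first '>' (or end of input). */
      *type = TOY_TAG;
      while (buf[n] != '\0' && buf[n] != '>')
         n++;
      if (buf[n] == '>')
         n++;
   } else if (toy_is_space(buf[0])) {
      *type = TOY_SPACE;
      while (toy_is_space(buf[n]))
         n++;
   } else {
      /* A word runs up to the first SPACE or tag start. */
      *type = TOY_WORD;
      while (buf[n] != '\0' && buf[n] != '<' && !toy_is_space(buf[n]))
         n++;
   }
   return n;
}
```

A caller would advance through the buffer by the returned length and dispatch on the type, much as the document describes Html_write calling a per-token function.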
diff --git a/doc/README b/doc/README
index b3fbd136..8b0a8d63 100644
--- a/doc/README
+++ b/doc/README
@@ -1,15 +1,21 @@
-README: Last update Oct 2008  --jcid
+README: Last update Jul 2009
 
- This documents need a review. They were current with dillo1 and
-now, with dillo2, most of them are obsolete, specially Dw*txt, but
-Dw2 is fully documented in html using doxygen.
+These documents cover dillo's internals.
+For user help, see http://www.dillo.org/dillo2-help.html
 
- The other documents will be reviewed when I have some time. They
-will give you an overview of what's going on but take them with a
-pinch of salt.
+--------------------------------------------------------------------------
 
- Of course I'd like to have these as doxygen files too!
-If somebody wants to make this convertion, please let me know
+These documents need a review.
+*.txt were current with Dillo1, but many have since become more or
+      less out-of-date.
+*.doc are doxygen source for the Dillo Widget (dw) component, and
+      were written for Dillo2.
+
+They will give you an overview of what's going on, but take them
+with a pinch of salt.
+
+ Of course I'd like to have *.txt as doxygen files too!
+If somebody wants to make this conversion, please let me know
 to assign higher priority to updating these docs.
 
 --
@@ -31,16 +37,14 @@ Jorge.-
    Cookies.txt         Explains how to enable cookies       Current
    Dpid.txt            Dillo plugin daemon                  Current
  --------------------------------------------------------------------------
- [This documents cover dillo's internal working. They're NOT a user manual]
- --------------------------------------------------------------------------
 
 
- * Ah!, there's a small program (srch) within the src/ dir. It searches
+ * BTW, there's a small program (srch) within the src/ dir. It searches
  tokens within the whole code (*.[ch]). It has proven very useful.
  Ex:  ./srch a_Image_write
       ./srch todo:
 
- * Please submit your patches with 'diff -pru'.
+ * Please submit your patches with 'hg diff'.
 
 
  Happy coding!