-rw-r--r--  doc/Cache.txt       89
-rw-r--r--  doc/Dillo.txt       37
-rw-r--r--  doc/HtmlParser.txt  50
-rw-r--r--  doc/README          30
4 files changed, 96 insertions, 110 deletions
diff --git a/doc/Cache.txt b/doc/Cache.txt
index ac1ecf87..413773ef 100644
--- a/doc/Cache.txt
+++ b/doc/Cache.txt
@@ -1,5 +1,5 @@
June 2000, --Jcid
- Last update: Oct 2004
+ Last update: Jul 09
-------
CACHE
@@ -12,28 +12,21 @@ rendering and networking.
calls the cache or the dpi routines depending on the type of
request.
- Every URL must be requested using a_Capi_open_url, no matter
-if it is a http, file, dpi or whatever type of request. The capi
-asks the dpi module for dpi URLs and the Cache for everything
-else.
+ Every URL must be requested using a_Capi_open_url, which
+sends the request to the cache if the data is cached, to dillo's
+http module for http: URLs, and through dillo's DPI system for
+other URLs.
Here we'll document non dpi requests.
- The cache, at its turn, sends the requested-data from memory
-(if cached), or opens a new network connection (if not cached).
-
- This means that no mattering whether the answer comes from
-memory or the net, the client requests it through the capi
-wrapper, in a single uniform way.
-
----------------
CACHE PHILOSOPHY
----------------
- Dillo's cache is very simple, every single resource that's
-retrieved (URL) is kept in memory. NOTHING is saved. This is
-mainly for three reasons:
+ Dillo's cache is very simple; every single resource that's
+retrieved (URL) is kept in memory. NOTHING is saved to disk.
+This is mainly for three reasons:
- Dillo encourages personal privacy and it assures there'll be
no recorded tracks of the sites you visited.
@@ -42,7 +35,7 @@ no recorded tracks of the sites you visited.
serve as caches.
- If you still want to have cached stuff, you can install an
-external cache server (as WWWOFFLE), and benefit from it.
+external cache server (such as WWWOFFLE), and benefit from it.
---------------
@@ -51,15 +44,14 @@ external cache server (as WWWOFFLE), and benefit from it.
Currently, dillo's cache code is spread in different sources:
mainly in cache.[ch], dicache.[ch] and it uses some other
-functions from mime.c, Url.c and web.c.
+functions from mime.c and web.cc.
- Cache.c is the principal source, and it also is the main
+ Cache.c is the principal source, and it also is the one
responsible for processing cache-clients (held in a queue).
-Dicache.c is the "decompressed image cache" and it holds the
-original data and its corresponding decompressed RGB
-representation (more on this subject in Images.txt).
+Dicache.c is the interface to the decompressed RGB representations
+of currently-displayed images held in DW's imgbuf.
- Url.c, mime.c and web.c are used for secondary tasks; as
+ mime.c and web.cc are used for secondary tasks such as
assigning the right "viewer" or "decoder" for a given URL.
@@ -67,7 +59,7 @@ assigning the right "viewer" or "decoder" for a given URL.
A bit of history
----------------
- Some time ago, the cache functions, URL retrieving and
+ Some time ago, the cache functions, URL retrieval and
external protocols were a whole mess of mixed code, and it was
getting REALLY hard to fix, improve or extend the functionality.
The main idea of this "layering" is to make code-portions as
@@ -76,32 +68,34 @@ improved or replaced without affecting the rest of the browser.
An interesting part of the process is that, as resources are
retrieved, the client (dillo in this case) doesn't know the
-Content-Type of the resource at request-time. It only gets known
-when the resource header is retrieved (think of http), and it
-happens when the cache has the control so, the cache sets the
-proper viewer for it! (unless the Callback function is specified
-with the URL request).
+Content-Type of the resource at request-time. It only becomes known
+when the resource header is retrieved (think of http). This
+happens when the cache has control, so the cache sets the
+proper viewer for it (unless the Callback function was already
+specified with the URL request).
You'll find a good example in http.c.
- Note: Files don't have a header, but the file handler inside
-dillo tries to determine the Content-Type and sends it back in
-HTTP form!
+ Note: All resources received by the cache have HTTP-style headers.
+ The file/data/ftp DPIs generate these headers when sending their
+ non-HTTP resources. Most importantly, a Content-Type header is
+ generated based on file extension or file contents.
-------------
Cache clients
-------------
- Cache clients MUST use a_Cache_open_url to request an URL. The
+ Cache clients MUST use a_Capi_open_url to request an URL. The
client structure and the callback-function prototype are defined,
in cache.h, as follows:
struct _CacheClient {
- gint Key; /* Primary Key for this client */
- const char *Url; /* Pointer to a cache entry Url */
- guchar *Buf; /* Pointer to cache-data */
- guint BufSize; /* Valid size of cache-data */
+ int Key; /* Primary Key for this client */
+ const DilloUrl *Url; /* Pointer to a cache entry Url */
+ int Version; /* Dicache version of this Url (0 if not used) */
+ void *Buf; /* Pointer to cache-data */
+ uint_t BufSize; /* Valid size of cache-data */
CA_Callback_t Callback; /* Client function */
void *CbData; /* Client function data */
void *Web; /* Pointer to the Web structure of our client */
@@ -124,28 +118,15 @@ Key-functions descriptions
--------------------------
································································
-int a_Cache_open_url(const char *Url, CA_Callback_t Call, void *CbData)
+int a_Cache_open_url(void *Web, CA_Callback_t Call, void *CbData)
- if Url is not cached
+ if Web->url is not cached
Create a cache-entry for that URL
Send client to cache queue
- Initiate a new connection
else
Feed our client with cached data
································································
-ChainFunction_t a_Url_get_ccc_funct(const char *Url)
-
- Scan the Url handlers for a handler that matches
- If found
- Return the CCC function for it
- else
- Return NULL
-
- * Ex: If Url is an http request, a_Http_ccc is the matching
-handler.
-
-································································
----------------------
Redirections mechanism
@@ -177,9 +158,9 @@ Notes
to document it in more detail later (source is commented).
Currently I have a drawing to understand it; hope the ASCII
translation serves the same as the original.
- If you're planning to understand the cache process troughly,
-write me a note, just to assign a higher priority on further
-improving of this doc.
+ If you're planning to understand the cache process thoroughly,
+write me a note and I will assign higher priority to further
+improvement of this doc.
Hope this helps!
diff --git a/doc/Dillo.txt b/doc/Dillo.txt
index 47f89780..a2c80afe 100644
--- a/doc/Dillo.txt
+++ b/doc/Dillo.txt
@@ -31,17 +31,17 @@ neccesary data structures and mechanisms for graphical rendering.
engine that handles file descriptor activity, the cache acts as
the main abstraction layer between rendering and networking.
Every URL, whether cached or not, must be retrieved using
-a_Cache_open_url (Described briefly in Cache.txt, source
-contained in cache.c).
- IO is described in IO.txt (recommended), source in IO/.
+a_Capi_open_url (Described briefly in Cache.txt, source
+contained in capi.c).
+ IO is described in IO.txt (recommended), source in src/IO/.
3.- The HTML parser: A streamed parser that joins the Dillo
Widget and the Cache functionality to make browsing possible
-(Described in HtmlParser.txt, source mainly inside html.c).
+(Described in HtmlParser.txt, source mainly inside html.cc).
4.- Image processing code: The part that handles image
-retrieving, decoding, caching and displaying. (Described in
-Images.txt. Sources: image.c, dw_image.c, dicache.c, gif.c,
+retrieval, decoding, caching and displaying. (Described in
+Images.txt. Sources: image.c, dw/image.cc, dicache.c, gif.c,
jpeg.c and png.c)
5.- The dpi framework: a gateway to interface the browser with
@@ -55,34 +55,27 @@ Dpi spec: http://www.dillo.org/dpi1.html
(A short description of the internal function calling process)
- When the user requests a new URL, a_Interface_entry_open_url
+ When the user requests a new URL, a_UIcmd_open_url
is queried to do the job; it calls a_Nav_push (The highest level
URL dispatcher); a_Nav_push updates current browsing history and
calls Nav_open_url. Nav_open_url closes all open connections by
-calling a_Interface_stop and a_Interface_stop, and then calls
-a_Capi_open_url wich calls a_Cache_open_url (or the dpi module if
+calling a_Bw_stop_clients, and then calls
+a_Capi_open_url which calls a_Cache_open_url (or the dpi module if
this gateway is used).
- If Cache_search hits (due to a cached url :), the client is
+ If Cache_entry_search hits (due to a cached url :), the client is
fed with cached data, but if the URL isn't cached yet, a new CCC
-(Concomitant Control Chain) is created and commited to fetch the
-URL. Note that a_Cache_open_url will return the requested URL,
-whether cached or not.
+(Concomitant Control Chain) is created and committed to fetch the
+URL.
The next CCC link is dynamically assigned by examining the
-URL's protocol. It can be:
+URL's protocol. It can be a_Http_ccc or a_Dpi_ccc.
- a_Http_ccc
- a_File_ccc
- a_About_ccc
- a_Plugin_ccc (not implemented yet)
-
-
- If we have a HTTP URL, a_Http_ccc will succeed, and the http
+ If we have an HTTP URL, a_Http_ccc will succeed, and the http
module will be linked; it will create the proper HTTP query and
link the IO module to submit and deliver the answer.
- Note that as the Content-type of the URL is not always known
+ Note that as the Content-Type of the URL is not always known
in advance, the answering branch decides where to dispatch it to
upon HTTP header arrival.
diff --git a/doc/HtmlParser.txt b/doc/HtmlParser.txt
index ec64164d..891eafd6 100644
--- a/doc/HtmlParser.txt
+++ b/doc/HtmlParser.txt
@@ -1,5 +1,5 @@
October 2001, --Jcid
- Last update: Dec 2004
+ Last update: Jul 2009
---------------
THE HTML PARSER
@@ -11,7 +11,7 @@ and plain text also. It has parsing 'modes' that define its
behaviour while working:
typedef enum {
- DILLO_HTML_PARSE_MODE_INIT,
+ DILLO_HTML_PARSE_MODE_INIT = 0,
DILLO_HTML_PARSE_MODE_STASH,
DILLO_HTML_PARSE_MODE_STASH_AND_BODY,
DILLO_HTML_PARSE_MODE_BODY,
@@ -22,12 +22,12 @@ behaviour while working:
The parser works upon a token-grained basis, i.e., the data
stream is parsed into tokens and the parser is fed with them. The
-process is simple: whenever the cache has new data, it gets
+process is simple: whenever the cache has new data, it is
passed to Html_write, which groups data into tokens and calls the
-appropriate functions for the token type (TAG, SPACE or WORD).
+appropriate functions for the token type (tag, space, or word).
Note: when in DILLO_HTML_PARSE_MODE_VERBATIM, the parser
-doesn't try to split the data stream into tokens anymore, it
+doesn't try to split the data stream into tokens anymore; it
simply collects until the closing tag.
------
@@ -65,14 +65,14 @@ TOKENS
As it's a common mistake for human authors to mistype or
forget one of the quote marks of an attribute value; the
parser solves the problem with a look-ahead technique
- (otherwise the parser could skip significative amounts of
- well written HTML).
+ (otherwise the parser could skip significant amounts of
+ properly-written HTML).
* WORD --> Html_process_word
- A word is anything that doesn't start with SPACE, and that's
+ A word is anything that doesn't start with SPACE, that's
outside of a tag, up to the first SPACE or tag start.
SPACE = ' ' | \n | \r | \t | \f | \v
@@ -84,26 +84,34 @@ THE PARSING STACK
The parsing state of the document is kept in a stack:
- struct _DilloHtml {
+ class DilloHtml {
[...]
- DilloHtmlState *stack;
- gint stack_top; /* Index to the top of the stack [0 based] */
- gint stack_max;
+ lout::misc::SimpleVector<DilloHtmlState> *stack;
[...]
};
struct _DilloHtmlState {
- char *tag;
- DwStyle *style, *table_cell_style;
- DilloHtmlParseMode parse_mode;
- DilloHtmlTableMode table_mode;
- gint list_level;
- gint list_number;
- DwWidget *page, *table;
- gint32 current_bg_color;
+ CssPropertyList *table_cell_props;
+ DilloHtmlParseMode parse_mode;
+ DilloHtmlTableMode table_mode;
+ bool cell_text_align_set;
+ DilloHtmlListMode list_type;
+ int list_number;
+
+ /* TagInfo index for the tag that's being processed */
+ int tag_idx;
+
+ dw::core::Widget *textblock, *table;
+
+ /* This is used to align list items (especially in enumerated lists) */
+ dw::core::Widget *ref_list_item;
+
+ /* This is used for list items etc; if it is set to TRUE, breaks
+ have to be "handed over" (see Html_add_indented and
+ Html_eventually_pop_dw). */
+ bool hand_over_break;
};
-
Basically, when a TAG is processed, a new state is pushed into
the 'stack' and its 'style' is set to reflect the desired
appearance (details in DwStyle.txt).
diff --git a/doc/README b/doc/README
index b3fbd136..8b0a8d63 100644
--- a/doc/README
+++ b/doc/README
@@ -1,15 +1,21 @@
-README: Last update Oct 2008 --jcid
+README: Last update Jul 2009
- This documents need a review. They were current with dillo1 and
-now, with dillo2, most of them are obsolete, specially Dw*txt, but
-Dw2 is fully documented in html using doxygen.
+These documents cover dillo's internals.
+For user help, see http://www.dillo.org/dillo2-help.html
- The other documents will be reviewed when I have some time. They
-will give you an overview of what's going on but take them with a
-pinch of salt.
+--------------------------------------------------------------------------
- Of course I'd like to have these as doxygen files too!
-If somebody wants to make this convertion, please let me know
+These documents need a review.
+*.txt were current with Dillo1, but many have since become more or
+ less out-of-date.
+*.doc are doxygen source for the Dillo Widget (dw) component, and
+ were written for Dillo2.
+
+They will give you an overview of what's going on, but take them
+with a pinch of salt.
+
+ Of course I'd like to have *.txt as doxygen files too!
+If somebody wants to make this conversion, please let me know
to assign higher priority to updating these docs.
--
@@ -31,16 +37,14 @@ Jorge.-
Cookies.txt Explains how to enable cookies Current
Dpid.txt Dillo plugin daemon Current
--------------------------------------------------------------------------
- [This documents cover dillo's internal working. They're NOT a user manual]
- --------------------------------------------------------------------------
- * Ah!, there's a small program (srch) within the src/ dir. It searches
+ * BTW, there's a small program (srch) within the src/ dir. It searches
tokens within the whole code (*.[ch]). It has proven very useful.
Ex: ./srch a_Image_write
./srch todo:
- * Please submit your patches with 'diff -pru'.
+ * Please submit your patches with 'hg diff'.
Happy coding!
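
Reviewer's sketch (not part of the patch): the a_Cache_open_url pseudocode
in the doc/Cache.txt hunk reads "if not cached: create a cache entry and
send the client to the cache queue; else feed the client with cached data".
A minimal, self-contained C illustration of that control flow might look
like the block below. Every name in it (the sketch_* functions, SketchEntry,
SketchClient, the fixed-size tables) is hypothetical and chosen only for
illustration; none of it is dillo's real code or API. It also assumes that
an entry which exists but has no data yet is treated like an uncached one,
which the pseudocode does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT dillo's real types. */
typedef void (*SketchCallback)(const void *buf, size_t size, void *cb_data);

typedef struct {
    const char *url;   /* Key of the entry (caller keeps the string alive) */
    const void *data;  /* NULL while nothing has been retrieved yet */
    size_t size;       /* Valid size of data */
} SketchEntry;

typedef struct {
    int key;                  /* Primary key for this client */
    const char *url;          /* URL the client is waiting for */
    SketchCallback callback;  /* Client function */
    void *cb_data;            /* Client function data */
} SketchClient;

enum { SKETCH_MAX = 16 };     /* Arbitrary capacity for the sketch */

static SketchEntry sketch_entries[SKETCH_MAX];
static int sketch_n_entries;
static SketchClient sketch_queue[SKETCH_MAX];
static int sketch_n_queued;
static int sketch_next_key = 1;

/* Linear search for an entry with the given URL; NULL if absent. */
static SketchEntry *sketch_entry_search(const char *url)
{
    for (int i = 0; i < sketch_n_entries; i++)
        if (strcmp(sketch_entries[i].url, url) == 0)
            return &sketch_entries[i];
    return NULL;
}

/* Create an empty entry for the URL; NULL if the table is full. */
static SketchEntry *sketch_entry_add(const char *url)
{
    if (sketch_n_entries == SKETCH_MAX)
        return NULL;
    SketchEntry *e = &sketch_entries[sketch_n_entries++];
    e->url = url;
    e->data = NULL;
    e->size = 0;
    return e;
}

/* Stands in for a finished retrieval: attach data to the URL's entry. */
int sketch_cache_store(const char *url, const void *data, size_t size)
{
    SketchEntry *e = sketch_entry_search(url);
    if (e == NULL && (e = sketch_entry_add(url)) == NULL)
        return 0;
    e->data = data;
    e->size = size;
    return 1;
}

/* Number of clients currently waiting in the queue. */
int sketch_queued_clients(void)
{
    return sketch_n_queued;
}

/* Returns a positive client key, or 0 if the sketch's tables are full. */
int sketch_open_url(const char *url, SketchCallback cb, void *cb_data)
{
    assert(url != NULL && cb != NULL);

    SketchEntry *e = sketch_entry_search(url);
    if (e == NULL && (e = sketch_entry_add(url)) == NULL)
        return 0;                       /* Could not create a cache entry */

    if (e->data == NULL) {
        /* Not cached (yet): send the client to the cache queue. */
        if (sketch_n_queued == SKETCH_MAX)
            return 0;
        SketchClient *c = &sketch_queue[sketch_n_queued++];
        c->key = sketch_next_key++;
        c->url = url;
        c->callback = cb;
        c->cb_data = cb_data;
        return c->key;
    }

    /* Cached: feed our client with the cached data right away. */
    cb(e->data, e->size, cb_data);
    return sketch_next_key++;
}

/* Example client: adds the received size to the counter in cb_data. */
void sketch_count_bytes(const void *buf, size_t size, void *cb_data)
{
    (void)buf;
    *(size_t *)cb_data += size;
}
```

Calling sketch_open_url for a URL that has never been stored leaves the
client in the queue and does not invoke its callback; calling it after
sketch_cache_store has attached data to that URL invokes the callback
immediately. How queued clients are later fed as data arrives is left out
of the sketch on purpose, since the hunk does not describe it.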