diff options
Diffstat (limited to 'doc/IO.txt')
-rw-r--r-- | doc/IO.txt | 468 |
1 files changed, 0 insertions, 468 deletions
diff --git a/doc/IO.txt b/doc/IO.txt deleted file mode 100644 index cd62a4f5..00000000 --- a/doc/IO.txt +++ /dev/null @@ -1,468 +0,0 @@ - -This is the updated base of a paper I wrote with Horst. -It provides a good introduction to Dillo's internals. -(Highly recommended if you plan to patch or develop in Dillo) - -It may not be exactly accurate (it's quite old), but it explains -the theory behind in some detail, so it's more than recommended -reading material. ---Jcid - - ------------------------------------------------------ -Parallel network programming of the Dillo web browser ------------------------------------------------------ - - Jorge Arellano-Cid <jcid@dillo.org> - Horst H. von Brand <vonbrand@inf.utfsm.cl> - - --------- -Abstract --------- - - Network programs face several delay sources when sending or -retrieving data. This is particularly problematic in programs -which interact directly with the user, most notably web browsers. -We present a hybrid approach using threads communicated through -pipes and signal driven I/O, which allows a non-blocking main -thread and overlapping waiting times. - - ------------- -Introduction ------------- - - The Dillo project didn't start from scratch but mainly working -on the code base of gzilla (a light web browser written by Raph -Levien). As the project went by, the code of the whole source was -standardized, and the networking engine was replaced with a new, -faster design. Later, the parser was changed, the streamed -handling of data and its error control was put under the control -of the CCC (Concomitant Control Chain), and the old widget system -was replaced with a new one (Dw). The source code is currently -regarded as "very stable beta", and is available at -<http://www.dillo.org>. Dillo is a project licensed under the GNU -General Public License. - - This paper covers basic design aspects of the hybrid approach -that the Dillo web browser uses to solve several latency -problems. After introducing the main delay-sources, the main -points of the hybrid design will be addressed. - - -------------- -Delay sources -------------- - - Network programs face several delay-sources while sending or -retrieving data. In the particular case of a web browser, they -are found in: - - DNS querying: - The time required to solve a name. - - Initiating the TCP connection: - The three way handshake of the TCP protocol. - - Sending the query: - The time spent uploading queries to the remote server. - - Retrieving data: - The time spent expecting and receiving the query answer. - - Closing the TCP connection: - The four packet-sending closing sequence of the TCP protocol. - - In a WAN context, every single item of this list has an -associated delay that is non deterministic and often measured in -seconds. If we add several connections per browsed page (each one -requiring at least the 4 last steps), the total latency can be -considerable. - - ------------------------------------ -The traditional (blocking) approach ------------------------------------ - - The main problems with the blocking approach are: - - When issuing an operation that can't be completed - immediately, the process is put to sleep waiting for - completion, and the program doesn't do any other - processing in the meantime. - - When waiting for a specific socket operation to complete, - packets that belong to other connections may be arriving, - and have to wait for service. - - Web browsers handle many small transactions, - if waiting times are not overlapped - the latency perceived by the user can be very annoying. - - If the user interface is just put to sleep during network - operations, the program becomes unresponsive, confusing - and perhaps alarming the user. - - Not overlapping waiting times and processing makes - graphical rendering (which is arguably the central function - of a browser) unnecessarily slow. - - ---------------------- -Dillo's hybrid design ---------------------- - - Dillo uses threads and signal driven I/O extensively to -overlap waiting times and computation. Handling the user -interface in a thread that never blocks gives a good interactive -``feel.'' The use of GTK+, a sophisticated widget framework for -graphical user interfaces, helped very much to accomplish this -goal. All the interface, rendering and I/O engine was built upon -its facilities. - - The design is said to be ``hybrid'' because it uses threads -for DNS querying and reading local files, and signal driven I/O -for TCP connections. The threaded DNS scheme is potentially -concurrent (this depends on underlying hardware), while the I/O -handling (both local files and remote connections) is -definitively parallel. - - To simplify the structure of the browser, local files are -encapsulated into HTTP streams and presented to the rest of the -browser as such, in exactly the same way a remote connection is -handled. To create this illusion, a thread is launched. This -thread opens a pipe to the browser, it then synthesizes an -appropriate HTTP header, sends it together with the file to the -browser proper. In this way, all the browser sees is a handle, -the data on it can come from a remote connection or from a local -file. - - To handle a remote connection is more complex. In this case, -the browser asks the cache manager for the URL. The name in the -URL has to be resolved through the DNS engine, a socket TCP -connection must be established, the HTTP request has to be sent, -and finally the result retrieved. Each of the steps mentioned -could give rise to errors, which have to be handled and somehow -communicated to the rest of the program. For performance reasons, -it is critical that responses are cached locally, so the remote -connection doesn't directly hand over the data to the browser; -the response is passed to the cache manager which then relays it -to the rest of the browser. The DNS engine caches DNS responses, -and either answers them from the cache or by querying the DNS. -Querying is done in a separate thread, so that the rest of the -browser isn't blocked by long waits here. - - The activities mentioned do not happen strictly in the order -stated above. It is even possible that several URLs are being -handled at the same time, in order to overlap waiting and -downloading. The functions called directly from the user -interface have to return quickly to maintain interactive -response. Sometimes they return connection handlers that haven't -been completely set up yet. As stated, I/O is signal-driven, when -one of the descriptors is ready for data transfer (reading or -writing), it wakes up the I/O engine. - - Data transfer between threads inside the browser is handled by -pipes, shared memory is little used. This almost obviates the -need for explicit synchronization, which is one of the main areas -of complexity and bugs in concurrent programs. Dillo handles its -threads in a way that its developers can think of it as running -on a single thread of control. This is accomplished by making the -DNS engine call-backs happen within the main thread, and by -isolating file loading with pipes. - - Using threads in this way has three big advantages: - - The browser doesn't block when one of its child threads - blocks. In particular, the user interface is responsive - even while resolving a name or downloading a file. - - Developers don't need to deal with complex concurrent - concerns. Concurrency is hard to handle, and few developers - are adept at this. This gives access a much larger pool of - potential developers, something which can be critical - in an open-source development project. - - By making the code mostly sequential, debugging the code - with traditional tools like gdb is possible. Debugging - parallel programs is very hard, and appropriate tools are - hard to come by. - - Because of simplicity and portability concerns, DNS querying -is done in a separate thread. The standard C library doesn't -provide a function for making DNS queries that don't block. The -alternative is to implement a new, custom DNS querying function -that doesn't block. This is certainly a complex task, integrating -this mechanism into the thread structure of the program is much -simpler. - - Using a thread and a pipe to read a local file adds a -buffering step to the process (and a certain latency), but it has -a couple of significative advantages: - - By handling local files in the same way as remote - connections, a significant amount of code is reused. - - A preprocessing step of the file data can be added easily, - if needed. In fact, the file is encapsulated into an HTTP - data stream. - - ------------ -DNS queries ------------ - - Dillo handles DNS queries with threads, letting a child thread -wait until the DNS server answers the request. When the answer -arrives, a call-back function is called, and the program resumes -what it was doing at DNS-request time. The interesting thing is -that the call-back happens in the main thread, while the child -thread simply exits when done. This is implemented through a -server-channel design. - - -The server channel ------------------- - - There is one thread for each channel, and each channel can -have multiple clients. When the program requests an IP address, -the server first looks for a cached match; if it hits, the client -call-back is invoked immediately, but if not, the client is put -into a queue, a thread is spawned to query the DNS, and a GTK+ -idle client is set to poll the channel 5~times per second for -completion, and when it finally succeeds, every client of that -channel is serviced. - - This scheme allows all the further processing to continue on -the same thread it began: the main thread. - - ------------------------- -Handling TCP connections ------------------------- - - Establishing a TCP connection requires the well known -three-way handshake packet-sending sequence. Depending on network -traffic and several other issues, significant delay can occur at -this phase. - - Dillo handles the connection by a non blocking socket scheme. -Basically, a socket file descriptor of AF_INET type is requested -and set to non-blocking I/O. When the DNS server has resolved the -name, the socket connection process begins by calling connect(2); - {We use the Unix convention of identifying the manual section - where the concept is described, in this case - section 2 (system calls).} -which returns immediately with an EINPROGRESS error. - - After the connection reaches the EINPROGRESS ``state,'' the -socket waits in background until connection succeeds (or fails), -when that happens, a callback function is awaked to perform the -following steps: set the I/O engine to send the query and expect -its answer (both in background). - - The advantage of this scheme is that every required step is -quickly done without blocking the browser. Finally, the socket -will generate a signal whenever I/O is possible. - - ----------------- -Handling queries ----------------- - - In the case of a HTTP URL, queries typically translate into a -short transmission (the HTTP query) and a lengthy retrieval -process. Queries are not always short though, specially when -requesting forms (all the form data is attached within the -query), and also when requesting CGI programs. - - Regardless of query length, query sending is handled in -background. The thread that was initiated at TCP connecting time -has all the transmission framework already set up; at this point, -packet sending is just a matter of waiting for the -write signal (G_IO_OUT) to come and then sending the data. When -the socket gets ready for transmission, the data is sent using -IO_write. - - --------------- -Receiving data --------------- - - Although conceptually similar to sending queries, retrieving -data is very different as the data received can easily exceed the -size of the query by many orders of magnitude (for example when -downloading images or files). This is one of the main sources of -latency, the retrieval can take several seconds or even minutes -when downloading large files. - - The data retrieving process for a single file, that began by -setting up the expecting framework at TCP connecting time, simply -waits for the read signal (G_IO_IN). When it happens, the -low-level I/O engine gets called, the data is read into -pre-allocated buffers and the appropriate call-backs are -performed. Technically, whenever a G_IO_IN event is generated, -data is received from the socket file descriptor, by using the -IO_read function. This iterative process finishes upon EOF (or on -an error condition). - - ----------------------- -Closing the connection ----------------------- - - Closing a TCP connection requires four data segments, not an -impressive amount but twice the round trip time, which can be -substantial. When data retrieval finishes, socket closing is -triggered. There's nothing but a IO_close_fd call on the socket's -file descriptor. This process was originally designed to split -the four segment close into two partial closes, one when query -sending is done and the other when all data is in. This scheme is -not currently used because the write close also stops the reading -part. - - -The low-level I/O engine ------------------------- - - Dillo I/O is carried out in the background. This is achieved -by using low level file descriptors and signals. Anytime a file -descriptor shows activity, a signal is raised and the signal -handler takes care of the I/O. - - The low-level I/O engine ("I/O engine" from here on) was -designed as an internal abstraction layer for background file -descriptor activity. It is intended to be used by the cache -module only; higher level routines should ask the cache for its -URLs. Every operation that is meant to be carried out in -background should be handled by the I/O engine. In the case of -TCP sockets, they are created and submitted to the I/O engine for -any further processing. - - The submitting process (client) must fill a request structure -and let the I/O engine handle the file descriptor activity, until -it receives a call-back for finally processing the data. This is -better understood by examining the request structure: - - typedef struct { - gint Key; /* Primary Key (for klist) */ - gint Op; /* IORead | IOWrite | IOWrites */ - gint FD; /* Current File Descriptor */ - gint Flags; /* Flag array */ - glong Status; /* Number of bytes read, or -errno code */ - - void *Buf; /* Buffer place */ - size_t BufSize; /* Buffer length */ - void *BufStart; /* PRIVATE: only used inside IO.c! */ - - void *ExtData; /* External data reference (not used by IO.c) */ - void *Info; /* CCC Info structure for this IO */ - GIOChannel *GioCh; /* IO channel */ - guint watch_id; /* glib's event source id */ - } IOData_t; - - To request an I/O operation, this structure must be filled and -passed to the I/O engine. - - 'Op' and 'Buf' and 'BufSize' MUST be provided. - - 'ExtData' MAY be provided. - - 'Status', 'FD' and 'GioCh' are set by I/O engine internal -routines. - - When there is new data in the file descriptor, 'IO_callback' -gets called (by glib). Only after the I/O engine finishes -processing the data are the upper layers notified. - - -The I/O engine transfer buffer ------------------------------- - - The 'Buf' and 'BufSize' fields of the request structure -provide the transfer buffer for each operation. This buffer must -be set by the client (to increase performance by avoiding copying -data). - - On reads, the client specifies the amount and where to place -the retrieved data; on writes, it specifies the amount and source -of the data segment that is to be sent. Although this scheme -increases complexity, it has proven very fast and powerful. For -instance, when the size of a document is known in advance, a -buffer for all the data can be allocated at once, eliminating the -need for multiple memory reallocations. Even more, if the size is -known and the data transfer is taking the form of multiple small -chunks of data, the client only needs to update 'Buf' and -BufSize' to point to the next byte in its large preallocated -reception buffer (by adding the chunk size to 'Buf'). On the -other hand, if the size of the transfer isn't known in advance, -the reception buffer can remain untouched until the connection -closes, but the client must then accomplish the usual buffer -copying and reallocation. - - The I/O engine also lets the client specify a full length -transfer buffer when sending data. It doesn't matter (from the -client's point of view) if the data fits in a single packet or -not, it's the I/O engine's job to divide it into smaller chunks -if needed and to perform the operation accordingly. - - ------------------------------------------- -Handling multiple simultaneous connections ------------------------------------------- - - The previous sections describe the internal work for a single -connection, the I/O engine handles several of them in parallel. -This is the normal downloading behavior of a web page. Normally, -after retrieving the main document (HTML code), several -references to other files (typically images) and sometimes even -to other sites (mostly advertising today) are found inside the -page. In order to parse and complete the page rendering, those -other documents must be fetched and displayed, so it is not -uncommon to have multiple downloading connections (every one -requiring the whole fetching process) happening at the same time. - - Even though socket activity can reach a hectic pace, the -browser never blocks. Note also that the I/O engine is the one -that directs the execution flow of the program by triggering a -call-back chain whenever a file descriptor operation succeeds or -fails. - - A key point for this multiple call-back chained I/O engine is -that every single function in the chain must be guaranteed to -return quickly. Otherwise, the whole system blocks until it -returns. - - ------------ -Conclusions ------------ - - Dillo is currently in very stable beta state. It already shows -impressive performance, and its interactive ``feel'' is much -better than that of other web browsers. - - The modular structure of Dillo, and its reliance on GTK1 allow -it to be very small. Not every feature of HTML-4.01 has been -implemented yet, but no significant problems are foreseen in -doing this. - - The fact that Dillo's central I/O engine is written using -advanced features of POSIX and TCP/IP networking makes its -performance possible, but on the other hand this also means that -only a fraction of the interested hackers are able to work on it. - - A simple code base is critical when trying to attract hackers -to work on a project like this one. Using the GTK+ framework -helped both in creating the graphical user interface and in -handling the concurrency inside the browser. By having threads -communicate through pipes the need for explicit synchronization -is almost completely eliminated, and with it most of the -complexity of concurrent programming disappears. - - A clean, strictly applied layering approach based on clear -abstractions is vital in each programming project. A good, -supportive framework is of much help here. - - |