emailrelay/doc/developer.txt
Graeme Walker 216dd32ebf v1.8
2008-03-29 12:00:00 +00:00


E-MailRelay Internals
=====================
Module structure
----------------
There are four main C++ libraries in the E-MailRelay code: "glib" provides
low-level classes for file-system abstraction, date and time representation,
string utility functions, logging, command line parsing etc., "gnet" provides
network classes using the Berkeley socket and Winsock APIs, "gsmtp" contains SMTP
and message-store classes, and "gpop" contains POP3 classes. All four libraries
are portable between POSIX-like systems (eg. Linux) and Windows.
Under Windows there is an additional library for event handling. Windows has
historically built network event processing on top of the GUI event system which
means that the "gnet" library has to be able to create GUI windows in order to
process network events. The extra GUI and event classes are put into a separate
library in the "src/win32" directory, using the namespace "GGui".
There is also a separate configuration GUI program which uses the "glib" library
together with TrollTech's Qt.
Event model
-----------
The E-MailRelay server uses non-blocking socket i/o, with a select() event loop.
This event model means that the server can handle multiple clients
simultaneously from a single thread and the only significant blocking occurs
when external programs are executed (see "--filter" and "--verifier").
See *C10K Problem* [http://www.kegel.com/c10k.html] for a discussion of
different network event models.
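As a rough illustration of the single-threaded select() model described above,
the following sketch waits on a set of descriptors and reports which ones are
readable. The function name pollReadable() is hypothetical and is not part of
the "gnet" library, and the code assumes a POSIX-like system.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of "gnet"): waits up to timeout_ms for any
// of the given descriptors to become readable and returns the ready ones.
std::vector<int> pollReadable( const std::vector<int> & fds , int timeout_ms )
{
    fd_set read_set ;
    FD_ZERO( &read_set ) ;
    int max_fd = -1 ;
    for( std::size_t i = 0U ; i < fds.size() ; i++ )
    {
        FD_SET( fds[i] , &read_set ) ;
        if( fds[i] > max_fd ) max_fd = fds[i] ;
    }

    struct timeval timeout ;
    timeout.tv_sec = timeout_ms / 1000 ;
    timeout.tv_usec = ( timeout_ms % 1000 ) * 1000 ;

    // One select() call covers every descriptor, so one thread is enough.
    std::vector<int> ready ;
    if( select( max_fd+1 , &read_set , NULL , NULL , &timeout ) > 0 )
    {
        for( std::size_t i = 0U ; i < fds.size() ; i++ )
        {
            if( FD_ISSET( fds[i] , &read_set ) )
                ready.push_back( fds[i] ) ;
        }
    }
    return ready ;
}
```

A server loop built this way would call the helper repeatedly and hand each
ready descriptor to its handler, so a slow client does not hold up the others.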
At higher levels the C++ slot/signal design pattern is used to propagate events
between objects (not to be confused with operating system signals). The
slot/signal implementation has been simplified compared to Qt or boost by not
supporting signal multicasting, so each signal connects to no more than one
slot. For historical reasons the slot/signal pattern is not used in the lowest
layers of the network library.
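The single-slot restriction can be sketched as follows. This is not the "glib"
implementation: the class name "SingleSlotSignal" is hypothetical, std::function
is used only for brevity, and the choice to have a second connect() replace the
first connection is an assumption.

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch of a signal that connects to at most one slot.
class SingleSlotSignal
{
public:
    // Connects the slot, replacing any existing connection (an assumption).
    void connect( std::function<void()> slot ) { m_slot = slot ; }

    // Removes the connection, if any.
    void disconnect() { m_slot = nullptr ; }

    bool connected() const { return static_cast<bool>( m_slot ) ; }

    // Calls the connected slot, or does nothing if unconnected.
    void emit() { if( m_slot ) m_slot() ; }

private:
    std::function<void()> m_slot ;
} ;
```

Because there is never more than one slot there is no list of receivers to
manage and no question of the order in which receivers are called.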
Event handling and exceptions
-----------------------------
The use of non-blocking i/o in the network library means that most processing
operates within the context of an i/o event or timeout callback so the top level
of the call stack is nearly always the event loop code. This can make using C++
exceptions a bit awkward compared to a multi-threaded approach because it is not
possible to put a single catch block around a particular high-level feature.
The event loop delivers all asynchronous events to the abstract "EventHandler"
and "AbstractTimer" interfaces. If these callbacks throw exceptions then the
event loop will catch them and deliver them back to the same interface through
the virtual functions onException() and onTimerException() respectively. If
exceptions are thrown out of _these_ callbacks then the event loop code lets
them propagate back to main(), typically terminating the program.
The two callback interfaces are brought together by having a concrete "Timer"
class that requires an "EventHandler" object to be associated with each timer.
The "Timer" class routes any exceptions thrown out of the timeout callback to
the designated "EventHandler" interface so that both i/o and timeout exceptions
are delivered to the same place.
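The routing of exceptions described above might look something like the
following sketch. The class and function names are simplified stand-ins, not
the real "gnet" declarations; in particular dispatch(), doTimeout(),
"FailingHandler" and "FailingTimer" are hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Simplified stand-in for the abstract event-handler interface.
class EventHandler
{
public:
    virtual ~EventHandler() {}
    virtual void readEvent() = 0 ;
    virtual void onException( std::exception & e ) = 0 ;
} ;

// What an event loop might do for each ready descriptor.
void dispatch( EventHandler & handler )
{
    try
    {
        handler.readEvent() ;
    }
    catch( std::exception & e )
    {
        handler.onException( e ) ; // anything thrown here goes to the caller
    }
}

// A timer that reports callback failures to its associated handler.
class Timer
{
public:
    explicit Timer( EventHandler & handler ) : m_handler(handler) {}
    virtual ~Timer() {}
    void doTimeout()
    {
        try { onTimeout() ; }
        catch( std::exception & e ) { m_handler.onException( e ) ; }
    }
protected:
    virtual void onTimeout() = 0 ;
private:
    EventHandler & m_handler ;
} ;

// Demonstration classes only: both callbacks fail.
class FailingHandler : public EventHandler
{
public:
    std::string m_seen ;
    void readEvent() { throw std::runtime_error( "read failed" ) ; }
    void onException( std::exception & e ) { m_seen = e.what() ; }
} ;

class FailingTimer : public Timer
{
public:
    explicit FailingTimer( EventHandler & handler ) : Timer(handler) {}
protected:
    void onTimeout() { throw std::runtime_error( "timeout failed" ) ; }
} ;
```

Both the i/o failure and the timeout failure end up in the same onException()
override, which is the point of tying each timer to an event handler.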
In common with other event-driven frameworks this leads to a programming
model where objects are instantiated on the heap and the objects delete
themselves when they receive certain events from the framework. In the
"gnet" library the "ServerPeer" and "HeapClient" classes do this lifetime
management; instances of these classes delete themselves when the associated
network connection goes away or when an exception is thrown out of their
event-handling code.
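A minimal sketch of the self-deleting idiom is shown below. The "Peer" class is
hypothetical and much simpler than "ServerPeer" or "HeapClient"; the counter is
there only so that the object's lifetime can be observed.

```cpp
#include <cassert>

// Hypothetical sketch: a heap-only object that deletes itself.
class Peer
{
public:
    // The private destructor means heap allocation is the only option.
    static Peer * create( int & live_count ) { return new Peer( live_count ) ; }

    // Called by the framework when the connection goes away.
    void onDisconnect() { delete this ; } // no member access after this

private:
    explicit Peer( int & live_count ) : m_live_count(live_count) { m_live_count++ ; }
    ~Peer() { m_live_count-- ; }
    Peer( const Peer & ) ; // not implemented
    void operator=( const Peer & ) ; // not implemented
    int & m_live_count ;
} ;
```

The caller must not touch the pointer once the object has been told that its
connection has gone away, since the object no longer exists.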
Core class structure
--------------------
The message-store functionality uses three abstract interfaces: "MessageStore",
"NewMessage" and "StoredMessage". The "NewMessage" interface is used to create
messages within the store, and the "StoredMessage" interface is used for
reading and extracting messages from the store. The concrete implementation
classes based on these interfaces are respectively "FileStore", "NewFile" and
"StoredFile".
Protocol classes such as "GSmtp::ServerProtocol" receive network and timer
events from their container and use an abstract "Sender" interface to send
network data. This means that the protocols can be largely independent of the
network and event loop framework.
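The decoupling provided by a "Sender" interface can be sketched like this. The
"ToyProtocol" and "BufferSender" classes are hypothetical, and the toy protocol
only understands a single command; it is not "GSmtp::ServerProtocol".

```cpp
#include <cassert>
#include <string>
#include <vector>

// The protocol's only view of the network.
class Sender
{
public:
    virtual ~Sender() {}
    virtual void send( const std::string & line ) = 0 ;
} ;

// A toy protocol that knows nothing about sockets or event loops.
class ToyProtocol
{
public:
    explicit ToyProtocol( Sender & sender ) : m_sender(sender) {}
    void init() { m_sender.send( "220 ready" ) ; }
    void apply( const std::string & line )
    {
        if( line == "QUIT" )
            m_sender.send( "221 closing" ) ;
        else
            m_sender.send( "500 command not recognised" ) ;
    }
private:
    Sender & m_sender ;
} ;

// A sender that just records its output, eg. for testing.
class BufferSender : public Sender
{
public:
    std::vector<std::string> m_lines ;
    void send( const std::string & line ) { m_lines.push_back( line ) ; }
} ;
```

With this arrangement the protocol can be exercised without any network code
by feeding it lines and inspecting what the recording sender received.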
The interaction between the SMTP server protocol class and the message store is
mediated by the "ProtocolMessage" interface. Two main implementations of this
interface are available: one for normal spooling ("ProtocolMessageStore"), and
another for immediate forwarding ("ProtocolMessageForward"). The "Decorator"
pattern is used whereby the forwarding class uses an instance of the storage
class to do the message storing and pre-processing, while adding in an instance
of the "GSmtp::Client" class to do the forwarding.
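The decorator arrangement can be illustrated with the following sketch. The
class names "StoringMessage" and "ForwardingMessage" and the process() method
are hypothetical, and plain string containers stand in for the message store
and for the SMTP client.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the "ProtocolMessage" interface.
class ProtocolMessage
{
public:
    virtual ~ProtocolMessage() {}
    virtual void process( const std::string & content ) = 0 ;
} ;

// Normal spooling: the container stands in for the message store.
class StoringMessage : public ProtocolMessage
{
public:
    explicit StoringMessage( std::vector<std::string> & spool ) : m_spool(spool) {}
    void process( const std::string & content ) { m_spool.push_back( content ) ; }
private:
    std::vector<std::string> & m_spool ;
} ;

// Immediate forwarding: decorates a storing object and then forwards.
class ForwardingMessage : public ProtocolMessage
{
public:
    ForwardingMessage( ProtocolMessage & store , std::vector<std::string> & sent ) :
        m_store(store) , m_sent(sent) {}
    void process( const std::string & content )
    {
        m_store.process( content ) ;  // delegate the storing
        m_sent.push_back( content ) ; // stands in for the SMTP client
    }
private:
    ProtocolMessage & m_store ;
    std::vector<std::string> & m_sent ;
} ;
```

The server protocol sees only the common interface, so it does not need to
know whether a message is being spooled or forwarded immediately.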
Message pre-processing (see "--filter") is implemented via an abstract
"Processor" interface. Concrete implementations are provided for doing nothing,
running an external executable program and talking to an external network server.
The protocol, processor and message-store interfaces are brought together by the
high-level "GSmtp::Server" and "GSmtp::Client" classes. Dependency injection is
used to create the concrete instances of the "ProtocolMessage" and "Processor"
interfaces.
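As a sketch of how an injected "Processor" might be used, consider the code
below. The process() signature, the "NonEmptyProcessor" and "Receiver" classes
and the use of constructor injection are all assumptions made for the sake of
a short example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the abstract "Processor" interface.
class Processor
{
public:
    virtual ~Processor() {}
    // Returns false to reject the message; may edit the content.
    virtual bool process( std::string & content ) = 0 ;
} ;

// The do-nothing implementation accepts everything.
class NullProcessor : public Processor
{
public:
    bool process( std::string & ) { return true ; }
} ;

// Stands in for an external program that rejects empty messages.
class NonEmptyProcessor : public Processor
{
public:
    bool process( std::string & content ) { return !content.empty() ; }
} ;

// The receiving class depends only on the interface it is given.
class Receiver
{
public:
    explicit Receiver( Processor & processor ) : m_processor(processor) {}
    bool receive( std::string content )
    {
        if( !m_processor.process( content ) ) return false ;
        m_accepted.push_back( content ) ;
        return true ;
    }
    std::vector<std::string> m_accepted ;
private:
    Processor & m_processor ;
} ;
```

Swapping one concrete processor for another changes the behaviour of the
receiving class without any change to its code.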
Simplified class diagrams for the *GNet* [gnet-classes.png] and
*GSmtp* [gsmtp-classes.png] namespaces are available.
Windows service
---------------
To get E-MailRelay to run as a Windows service there is a service wrapper
program called "emailrelay-service.exe". This program registers itself as
a service when run with the "--install" command-line switch. When the service
runs the wrapper starts the actual E-MailRelay server by looking for a batch
file called "emailrelay-start.bat" in the same directory as the service wrapper
executable. It reads the contents of this batch file in order to construct the
E-MailRelay command-line, adding "--no-daemon" and "--hidden" switches if they
are not there already.
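The step of adding missing switches could be sketched as below. The function
names are hypothetical and the simple substring test is an assumption; the
real wrapper may parse the batch file quite differently.

```cpp
#include <cassert>
#include <string>

// Appends the switch unless it already appears in the command-line.
// The substring test is a simplification for illustration.
std::string addSwitch( const std::string & command_line , const std::string & sw )
{
    if( command_line.find( sw ) != std::string::npos )
        return command_line ;
    return command_line + " " + sw ;
}

// Builds the command-line used when running as a service.
std::string serviceCommandLine( const std::string & batch_file_line )
{
    return addSwitch( addSwitch( batch_file_line , "--no-daemon" ) , "--hidden" ) ;
}
```

A command-line that already contains one of the switches is left with a single
copy of it, and only the missing switch is appended.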
The service name and display name can be added to the wrapper's "--install"
command-line, and it is the service name that is used to derive the name of the
"start" batch file. This allows more than one server to be run as services,
using different server command-line switches on each one.
Diagrams
--------
Class diagrams:
* *GNet namespace* [gnet-classes.png]
* *GSmtp namespace* [gsmtp-classes.png]
State transition diagrams:
* *GNet::Client* [gnet-client.png]
* *GSmtp::ServerProtocol* [gsmtp-serverprotocol.png]
Sequence diagrams:
* *Proxy mode forwarding* [sequence-3.png]
Configuration GUI
-----------------
The optional configuration GUI program "emailrelay-gui" uses TrollTech Qt v4
for its user interface components. The GUI can run as a stand-alone
configuration helper or as part of a self-extracting installation program called
"emailrelay-setup".
The packing scheme used to assemble a self-extracting archive is a simple
concatenation of the "stub" executable followed by a table of contents for the
payload files, followed by the payload files themselves (possibly compressed by
"zlib"), and ending with a twelve-byte ascii representation of the offset of
the table of contents.
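The role of the trailing offset can be shown with a toy version of the packing
scheme. The functions pack() and tocOffset() are hypothetical, the table of
contents is treated as an opaque string, and the zero-padded decimal form of
the offset is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <iomanip>
#include <sstream>
#include <string>

// Builds a toy archive: stub, then table of contents, then payload, then
// the table-of-contents offset as twelve ascii characters.
std::string pack( const std::string & stub , const std::string & toc ,
    const std::string & payload )
{
    std::ostringstream offset ;
    offset << std::setw(12) << std::setfill('0') << stub.size() ;
    return stub + toc + payload + offset.str() ;
}

// Returns the table-of-contents offset, or -1 if there is no valid trailer.
long tocOffset( const std::string & archive )
{
    const std::size_t trailer_size = 12U ;
    if( archive.size() < trailer_size ) return -1L ;
    const std::string trailer = archive.substr( archive.size() - trailer_size ) ;
    char * end = NULL ;
    const long offset = std::strtol( trailer.c_str() , &end , 10 ) ;
    const bool all_parsed = end == trailer.c_str() + trailer_size ;
    const bool in_range = offset >= 0L &&
        static_cast<std::size_t>(offset) <= archive.size() - trailer_size ;
    return ( all_parsed && in_range ) ? offset : -1L ;
}
```

Reading from the end of the file means that the stub does not need to know its
own size; a file that does not end in a valid offset is treated here as having
no payload.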
On Windows there are two levels of packing: the "setup" program has a stub
executable written in "C" that prints an "extracting..." message to the standard
output, with a payload comprising another packed executable and a small number of
"C++" runtime library files. The inner packed executable has the emailrelay GUI
program as its stub and all the other installable files, including the main
emailrelay executable, as its payload.
When the GUI runs it checks whether it has a payload of packed files. If it
has then it runs as an installer; if it does not then it runs as a configuration
helper. Refer to the comments in "src/gui/guimain.cpp" for more details.
The code works exactly the same on Windows, Mac OS X and unix-like operating
systems. However, on unix-like operating systems "make install", possibly run
via some package manager, is the preferred way to install files so the "setup"
program is never normally built or distributed. On Mac OS X the default packing
scheme works well enough, but there is also provision for having a separate
payload file within the Mac bundle rather than appending the payload to the stub
executable.
The user interface is structured as a "wizard" having a dialog box with the
forward and back buttons at the bottom and a single Qt layout object for the
main area. A stack of Qt widgets representing the various pages of the wizard
are installed into the main layout object in turn as the user navigates from
one page to the next.
Once the wizard is completed it asks each page to dump its state as a set of
key-value pairs into a stringstream (see "src/gui/pages.cpp"). These key-value
pairs are processed by an installer class into a list of action objects (in the
"Command" design pattern) and then the action objects are run in turn. In order
to display the progress of the installation each action object is run within a
timer callback so that the Qt framework gets a chance to update the GUI between
each one.
During development the user interface pages and the installer can be tested
separately since the interface between them is a simple text stream containing
key-value pairs.
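The flow from key-value pairs to action objects might be sketched as follows.
Everything here is hypothetical: the "key=value" line format, the key names
"dir-spool" and "start-on-boot", and the "RecordingAction" class, which only
records what a real action would do.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

typedef std::map<std::string,std::string> PairMap ;

// Reads "key=value" lines (an assumed format) into a map.
PairMap readPairs( std::istream & in )
{
    PairMap pairs ;
    std::string line ;
    while( std::getline( in , line ) )
    {
        const std::string::size_type pos = line.find( '=' ) ;
        if( pos != std::string::npos )
            pairs[line.substr(0U,pos)] = line.substr( pos+1U ) ;
    }
    return pairs ;
}

// The "Command" interface: each action can describe and run itself.
class Action
{
public:
    virtual ~Action() {}
    virtual std::string text() const = 0 ;
    virtual void run() = 0 ;
} ;

// A toy action that records what it would have done.
class RecordingAction : public Action
{
public:
    RecordingAction( const std::string & text , std::vector<std::string> & log ) :
        m_text(text) , m_log(log) {}
    std::string text() const { return m_text ; }
    void run() { m_log.push_back( m_text ) ; }
private:
    std::string m_text ;
    std::vector<std::string> & m_log ;
} ;

typedef std::vector<std::shared_ptr<Action> > ActionList ;

// Turns the key-value pairs into an ordered list of actions.
ActionList makeActions( const PairMap & pairs , std::vector<std::string> & log )
{
    ActionList actions ;
    PairMap::const_iterator p = pairs.find( "dir-spool" ) ;
    if( p != pairs.end() )
        actions.push_back( std::make_shared<RecordingAction>(
            "create directory " + p->second , log ) ) ;
    p = pairs.find( "start-on-boot" ) ;
    if( p != pairs.end() && p->second == "y" )
        actions.push_back( std::make_shared<RecordingAction>(
            "install startup link" , log ) ) ;
    return actions ;
}
```

Building the whole list before running anything means that the actions can be
displayed, counted for a progress indicator, and then run one at a time.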
Source control
--------------
The source code is stored in the SourceForge "svn" repository. A working
copy can be checked out as follows:
$ svn co https://emailrelay.svn.sourceforge.net/svnroot/emailrelay
Directory structure
-------------------
# src
Parent directory for source code.
# src/glib
A low-level class library, including classes for file-system abstraction,
date and time, string utility functions, logging, command line parsing etc.
# src/gnet
A network library using Berkeley sockets or Winsock.
# src/gsmtp
An SMTP library.
# src/gpop
A POP3 library.
# src/win32
Additional classes for Windows event processing.
# src/main
Application-level classes for E-MailRelay.
# src/gui
Installation and configuration GUI program using Qt v4.
# lib
Parent directory for ISO C++ fixups for various compilers.
# test
Test scripts and utilities.
Source file names
-----------------
Generally the source file names follow the name of the principal class
(often including the namespace) but all in lowercase. Any underscores in the
name indicate a choice of implementation, so class "G::Foo" might have two
implementations in the files "gfoo_main.cpp" and "gfoo_alternate.cpp".
The choice is normally made by the makefile.
Portability
-----------
The E-MailRelay code is written in ISO C++, although avoiding less-widely
supported language features such as "mutable", templated methods and "export".
The header files "gdef.h" in "src/glib", and "gnet.h" in "src/gnet" are intended
to be used to fix up compiler portability issues such as missing standard types,
non-standard system headers etc. Conditional compilation directives ("#if"
etc.) are confined to these headers as far as possible in order to improve
readability.
Deficiencies in the standard header files provided by older compilers are fixed
up by files in the "lib" directory tree. For example, the msvc6.0 compiler
sometimes does not put its names into the "std" namespace, even though the
std-namespace headers are used. This can be worked round by additional "using"
declarations in the "lib/msvc6.0" headers. These work-rounds are kept out of
the "src" tree because they are not necessary for more modern compilers.
Windows/unix portability is generally addressed by providing a common class
declaration with two implementations. Where necessary a "pimple" (or "Bridge")
pattern is used to hide the system-specific parts of the declaration.
A good example is the "G::Directory" class used for iterating through files in
a directory. The header file "src/glib/gdirectory.h" is common to both systems,
but two implementations are provided in "gdirectory_unix.cpp" and
"gdirectory_win32.cpp". The unix implementation uses opendir() and glob(),
while the windows implementation uses FindFirstFile().
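The idea of a common declaration with a hidden, system-specific part can be
sketched in a single listing. The "FileLister" class is hypothetical and is not
"G::Directory"; the two halves would normally live in a shared header and in
one of the per-system source files respectively.

```cpp
#include <cassert>
#include <string>

// --- what would go in the common header ---
class FileLister
{
public:
    explicit FileLister( const std::string & path ) ;
    ~FileLister() ;
    std::string path() const ;
private:
    FileLister( const FileLister & ) ; // not implemented
    void operator=( const FileLister & ) ; // not implemented
    class Imp ; // defined only in the implementation file
    Imp * m_imp ;
} ;

// --- what would go in one of the per-system source files ---
class FileLister::Imp
{
public:
    explicit Imp( const std::string & path ) : m_path(path) {}
    std::string m_path ; // a real version would hold system handles here
} ;

FileLister::FileLister( const std::string & path ) : m_imp( new Imp(path) ) {}
FileLister::~FileLister() { delete m_imp ; }
std::string FileLister::path() const { return m_imp->m_path ; }
```

Code that includes the header never sees the system-specific types, so no
conditional compilation is needed at the point of use.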
Sometimes only small parts of the implementation are system-specific. In
these cases there are three source files per header. For example, "gsocket.cpp",
"gsocket_win32.cpp" and "gsocket_unix.cpp" in the "src/gnet" directory.
Compile-time features
---------------------
Compile-time features can be selected with switches passed to the "configure"
script. These include the following:
* Debug-level logging ("--enable-debug")
* IPv6 (Linux only) ("--enable-ipv6")
* Configuration GUI ("--enable-gui")
The "--enable-fhs" switch alters the compiled-in default directories to conform
to the Filesystem Hierarchy Standard (FHS). This is recommended for most modern
Linux distributions.
Some functionality can be disabled at compile-time in order to reduce the size
of the executable, typically when building for embedded systems:
* Disable POP3 protocol, "--disable-pop"
* Disable authentication, "--disable-auth" (requires "--disable-pop")
* Disable administration interface, "--disable-admin"
* Disable execution of external programs, "--disable-exec"
The "--enable-small-config" switch can be used to change the command-line
parsing code to use a configuration file instead, resulting in a smaller
executable. This also removes a lot of the configuration checking code, so it is
not recommended unless size is critical. (The format of the configuration file
is similar to the command-line using the long-form switches without the
double-dash and using '=' to separate the switch from the switch value.)
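The correspondence between a configuration file line and a long-form switch
can be sketched as below. The function names are hypothetical, the handling of
switches without a value is an assumption, and the code is not taken from the
E-MailRelay configuration file parser.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Splits one configuration file line into a switch name and its value.
// A line without '=' is treated as a switch with no value (an assumption).
std::pair<std::string,std::string> parseConfigLine( const std::string & line )
{
    const std::string::size_type pos = line.find( '=' ) ;
    if( pos == std::string::npos )
        return std::make_pair( line , std::string() ) ;
    return std::make_pair( line.substr(0U,pos) , line.substr(pos+1U) ) ;
}

// Rebuilds the long-form command-line switch, for illustration.
std::string toSwitch( const std::string & line )
{
    const std::pair<std::string,std::string> p = parseConfigLine( line ) ;
    return p.second.empty() ?
        ( "--" + p.first ) :
        ( "--" + p.first + " " + p.second ) ;
}
```

For example, under these assumptions a line reading "no-daemon" corresponds to
the "--no-daemon" switch, and a line of the form "filter=<program>" corresponds
to "--filter" followed by the program path.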
Use "./configure --help" to see a complete list of options and refer to
"acinclude.m4" for more detailed comments.
Patterns
--------
Gang-of-four Design Patterns (ISBN 0-201-63361-2):
+ Factory method
- GNet::EventLoop::create()
- GNet::Server::newPeer()
- GSmtp::MessageStore::newMessage()
+ Iterator
- G::DirectoryIterator
- GNet::EventHandlerList::begin()/end()
- GSmtp::MessageStore::iterator()
+ Singleton
- G::LogOutput
- GGui::ApplicationInstance
- GNet::EventLoop
- GNet::TimerList
+ Facade
- G::File
- GNet::Address
+ Decorator
- GSmtp::ProtocolMessage
+ Command
- Installer
Lakos' Large Scale C++ Software Design patterns (ISBN 0-201-63362-0):
+ Insulation; fully insulating concrete class (Meyers' Effective C++ Item 34, pimple pattern)
- G::DirectoryIterator
- GNet::Address
- GNet::Resolver
- GSmtp::ProtocolMessage
+ Insulation; protocol class
- GNet::EventHandler
- GSmtp::NewMessage
- GSmtp::StoredMessage
- GSmtp::ProtocolMessage
- GSmtp::ServerProtocol::Sender
- GSmtp::ClientProtocol::Sender
Meyers' More Effective C++ patterns (ISBN 0-201-63371-X):
+ Reference counting (Item 29)
- GSmtp::MessageStore::Iterator
- G::Slot0
+ Lazy evaluation (Item 17)
- GNet::EventHandlerList::list()
- G::Date::weekday()
Other patterns:
+ Finite state machine
- GSmtp::ServerProtocol
+ Slot/signal
- G::Slot0
- G::Signal0
+ Exception-safe assignment using swap
- G::Slot0
+ Dependency injection
- GSmtp::Server::newProtocolMessage()
Idioms
------
The "<<=" operator defined in "src/glib/gmemory.h" is used idiomatically
to reassign a std::auto_ptr<> since reset() is not always available.
Copyright (C) 2001-2008 Graeme Walker <graeme_walker@users.sourceforge.net>. All rights reserved.