emailrelay/doc/developer.txt
Graeme Walker 6d9b04341c v1.9
2013-12-08 12:00:00 +00:00

451 lines
18 KiB
Plaintext

E-MailRelay Internals
=====================
Module structure
----------------
The main C++ libraries in the E-MailRelay code are: "glib", providing low-level
classes for file-system abstraction, date and time representation, string
utility functions, logging, command line parsing etc.; "gssl", which is a thin
abstraction over OpenSSL; "gnet", which provides network classes using the
Berkley socket and Winsock APIs; "gauth", which implements various
authentication mechanisms; "gsmtp", containing SMTP and message-store classes;
and "gpop", which contains POP3 classes. All of these libraries are portable
between POSIX-like systems (eg. Linux) and Windows.
Under Windows there is an additional library for event handling. Windows has
historically built network event processing on top of the GUI event system which
means that the "gnet" library has to be able to create GUI windows in order to
process network events. The extra GUI and event classes are put into a separate
library in the "src/win32" directory, using the namespace "GGui".
There is also a separate configuration GUI program which uses the "glib" library
together with TrollTech's Qt.
Event model
-----------
The E-MailRelay server uses non-blocking socket i/o, with a select() event loop.
This event model means that the server can handle multiple clients
simultaneously from a single thread and the only significant blocking occurs
when external programs are executed (see "--filter" and "--verifier").
In some ways this makes the implementation more complicated than the equivalent
multi-threaded approach since (for example) it is not possible to wait for a
complete line of input to be received from a remote SMTP client because there
might be other connection that need servicing.
See *C10K Problem* [http://www.kegel.com/c10k.html] for a discussion of
different network event models.
At higher levels the C++ slot/signal design pattern is used to propagate events
between objects (not to be confused with operating system signals). The
slot/signal implementation has been simplified compared to Qt or boost by not
supporting signal multicasting, so each signal connects to no more than one
slot. For historical reasons the slot/signal pattern is not used in the lowest
layers of the network library.
Event handling and exceptions
-----------------------------
The use of non-blocking i/o in the network library means that most processing
operates within the context of an i/o event or timeout callback so the top level
of the call stack is nearly always the event loop code. This can make using C++
exceptions a bit awkward compared to a multi-threaded approach because it is not
possible to put a single catch block around a particular high-level feature.
The event loop delivers all asynchronous events to the abstract "EventHandler"
and "AbstractTimer" interfaces. If these callbacks throw exceptions then the
event loop will catch them and deliver them back to the same interface through
the virtual functions onException() and onTimerException() respectively. If
exceptions are thrown out of _these_ callbacks then the event loop code lets
them propagate back to main(), typically terminating the program.
The two callback interfaces are brought together by having a concrete "Timer"
class that requires an "EventHandler" object to be associated with each timer.
The "Timer" class routes any exceptions thrown out of the timeout callback to
the designated "EventHandler" interface so that both i/o and timeout exceptions
are delivered to the same place.
In common with other event-driven frameworks this leads to a programming
model where objects are instantiated on the heap and the objects delete
themselves when they receive certain events from the framework. In the
"gnet" library the "ServerPeer" and "HeapClient" classes do this lifetime
management; instances of these classes delete themselves when the associated
network connection goes away or when an exception is thrown out their
event-handling code.
Core class structure
--------------------
The message-store functionality uses three abstract interfaces: "MessageStore",
"NewMessage" and "StoredMessage". The "NewMessage" interface is used to create
messages within the store, and the "StoredMessage" interface is used for
reading and extracting messages from the store. The concrete implementation
classes based on these interfaces are respectively "FileStore", "NewFile" and
"StoredFile".
Protocol classes such as "GSmtp::ServerProtocol" receive network and timer
events from their container and use an abstract "Sender" interface to send
network data. This means that the protocols can be largely independent of the
network and event loop framework.
The interaction between the SMTP server protocol class and the message store is
mediated by the "ProtocolMessage" interface. Two main implementations of this
interface are available: one for normal spooling ("ProtocolMessageStore"), and
another for immediate forwarding ("ProtocolMessageForward"). The "Decorator"
pattern is used whereby the forwarding class uses an instance of the storage
class to do the message storing and pre-processing, while adding in an instance
of the "GSmtp::Client" class to do the forwarding.
Message pre-processing (see "--filter") is implemented via an abstract
"Processor" interface. Concrete implementations are provided for doing nothing,
running an external executable program and talking to an external network server.
The protocol, processor and message-store interfaces are brought together by the
high-level "GSmtp::Server" and "GSmtp::Client" classes. Dependency injection is
used to create the concrete instances of the "ProtocolMessage" and "Processor"
interfaces.
Simplified class diagrams for the *GNet* [gnet-classes.png] and
*GSmtp* [gsmtp-classes.png] namespaces are available.
Windows service
---------------
To get E-MailRelay to run as a Windows service there is a service wrapper
program called "emailrelay-service.exe". This program registers itself as
a service when run with the "--install" commandline option. When the service
runs the wrapper starts the actual E-MailRelay server by looking for a batch
file called "emailrelay-start.bat" in the same directory as service wrapper
executable. It reads the contents of this batch file in order to construct the
E-MailRelay command-line, adding "--no-daemon" and "--hidden" options if they
are not there already.
The service name and display name can be added to the wrapper's "--install"
command-line, and it is the service name that is used to derive the name of the
"start" batch file. This allows more than one server to be run as services,
using different server command-line options on each one.
Diagrams
--------
Class diagrams:
* *GNet namespace* [gnet-classes.png]
* *GSmtp namespace* [gsmtp-classes.png]
State transition diagrams:
* *GNet::Client* [gnet-client.png]
* *GSmtp::ServerProtocol* [gsmtp-serverprotocol.png]
Sequence diagrams:
* *Proxy mode forwarding* [sequence-3.png]
E-MailRelay GUI
---------------
The optional GUI program "emailrelay-gui" uses TrollTech Qt v4 for its user
interface components. The GUI can run as a stand-alone configuration helper
("--as-configure") or as part of a self-extracting installation
("--as-install"). When it runs it checks whether it has a payload of packed
files. If it has then it runs as an installer; if it does not then it runs as a
configuration helper. Refer to the comments in "src/gui/guimain.cpp" for more
details.
The user interface is structured as a "wizard" having a dialog box with the
forward and back buttons at the bottom and a single Qt layout object for the
main area. A stack of Qt widgets representing the various pages of the wizard
are installed into the main layout object in turn as the user navigates from
one page to the next.
Once the wizard is completed it asks each page to dump its state as a set of
key-value pairs into a stringstream (see "src/gui/pages.cpp"). These key-value
pairs are processed by an installer class into a list of action objects (in the
"Command" design pattern) and then the action objects are run in turn. In order
to display the progress of the installation each action object is run within a
timer callback so that the Qt framework gets a chance to update the GUI between
each one.
During development the user interface pages and the installer can be tested
separately since the interface between them is a simple text stream containing
key-value pairs.
When run in "--as-configure" mode the GUI normally ends up simply editing
the "emailrelay.conf" file and/or the "emailrelay.auth" secrets file.
When run in "--as-install" mode the GUI expects to unpack all the E-MailRelay
files from a "payload" archive into target directories.
The packing scheme used for a payload archive is a simple concatenation of the
"stub" executable followed by a table of contents for the payload files,
followed by the payload files themselves (possibly compressed by "zlib"), and
ending with an twelve-byte ascii representation of the offset of the table of
contents.
The payload may be appended to the GUI executable to make a self-extracting
installer, usually then called "emailrelay-setup".
However, on Windows two levels of packing are required: the "emailrelay-setup"
program has a stub executable written in "C" that prints an "extracting..."
message to the standard output, with a payload comprising another packed
executable and a small number of "C++" runtime library files. The inner packed
executable has the emailrelay GUI program as its stub and all the other
installable files, including the main emailrelay executable, as its payload.
On unix-like operating systems it is more natural to use some sort of package
derived from the "make install" process rather than an installer program, so
"emailrelay-setup" ia typically never built.
On Mac OS X the payload is better stored in the installer's application
bundle rather than simply appended to the binary. This is desribed in the next
section.
Windows build
-------------
On Windows E-MailRelay can be built using MinGW or Visual Studio. Makefiles and
Visual Studio 2012 project files are provided in the "src" directory.
For a MinGW build of the E-MailRelay server without TLS/SSL support it should
be sufficient to run make from the "src" directory:
C:\emailrelay\src> PATH=c:\mingw\bin;%PATH%
C:\emailrelay\src> mingw32-make -f mingw.mak
To add TLS/SSL support first install ActiveState perl and make sure that your
MinGW installation contains the MSYS subsystem. Unpack the OpenSSL tarball and
build it as follows:
C:\openssl> PATH=c:\perl\bin;c:\mingw\msys\1.0\bin;c:\mingw\bin;%PATH%
C:\openssl> bash -l
$ ./config
$ mingw32-make
Then edit the E-MailRelay "src/mingw-common.mak" file to enable openssl and run
"mingw32-make" as above.
If building the E-MailRelay GUI then it is best to use MinGW with Qt 5 static
libraries. Start by installing zlib source (eg. to c:/zlib) and then build Qt
using the following "configure" options:
-prefix /c/qt/qt5-static
-I c:/zlib
-L c:/zlib
-static
-release
-force-debug-info
-separate-debug-info
-opensource
-confirm-license
-no-c++11
-fully-process
-no-largefile
-accessibility
-no-sql-sqlite
-no-sql-sqlite2
-no-javascript-jit
-no-qml-debug
-platform win32-g++
-no-sse2
-no-sse3
-no-ssse3
-no-sse4.1
-no-sse4.2
-no-avx
-no-avx2
-no-neon
-no-mips_dsp
-no-mips_dspr2
-no-pkg-config
-system-zlib
-no-gif
-qt-libpng
-no-libjpeg
-no-openssl
-qt-pcre
-qt-xcb
-qt-xkbcommon
-gui
-widgets
-no-rpath
-no-optimized-qmake
-no-nis
-no-cups
-no-iconv
-no-icu
-strip
-no-pch
-no-dbus
-no-xcb
-no-eglfs
-no-directfb
-no-linuxfb
-no-kms
-no-opengl
-no-system-proxies
-no-glib
Start the Qt build by running "mingw32-make" from the "qtbase" directory and
finish off with "mingw32-make install".
Edit the E-MailRelay GUI makefile "src/gui/mingw.mak" so that the E-MailRelay
build uses similar compiler options to the Qt examples and then then build with:
$ mingw32-make -f mingw.mak
Mac OS X packaging
------------------
On Mac OS X the standard "configure ; make ; make install" procedure works best
if the "configure" script is given Mac-like directories on its command-line. The
script "bin/configure-mac.sh" can be used to do this.
The "make" step in "src/main" on Mac OS X additionally builds a simple wrapper
program "emailrelay-start" from "start.cpp" that runs the E-MailRelay server
using a command-line assembled from the "emailrelay.conf" file. This is then
used from the "E-MailRelay-Start" application bundle to provide a Mac-friendly
way of running the server.
Similary the "make" in "src/gui" builds a wrapper program "emailrelay-start-gui"
from "guistart.cpp" that runs "emailrelay-gui.real" and this is used from the
"E-MailRelay-Configure" application bundle.
The self-extracting installer scheme (described above) does work on Mac OS X,
but it is more sensible to use a Mac application bundle (E-MailRelay-Install) to
hold the payload and then wrap the whold thing up in a ".dmg" disk image. The
GUI code supports this by looking for a separate file called "payload" and using
that in preference to any payload archive it finds appended to its own executable.
The format of the "payload" file in the application bundle is the same as is used
in a self-extracting installer.
The makefile in the "src/gui" directory provides the "image" target to create
the E-MailRelay-Install application bundle and the disk image.
Source control
--------------
The source code is stored in the SourceForge "svn" repository. A working
copy can be checked out as follows:
$ svn co https://svn.code.sf.net/p/emailrelay/code/trunk
Directory structure
-------------------
# src
Parent directory for source code.
# src/glib
A low-level class library, including classes for file-system abstraction,
date and time, string utility functions, logging, command line parsing etc.
# src/gnet
A network library using Berkley sockets or Winsock.
# src/gssl
A library for using OpenSSL.
# src/gauth
A library for SASL and PAM authentication.
# src/gsmtp
An SMTP library.
# src/gpop
A POP3 library.
# src/win32
Additional classes for windows event processing.
# src/main
Application-level classes for E-MailRelay.
# src/gui
Installation and configuration GUI program using Qt v4.
# lib
Parent directory for ISO C++ fixups for various older compilers.
# test
Test scripts and utilities.
Source file names
-----------------
Generally the source file names are follow the name of the principal class,
(often including the namespace) but all in lowercase. Any underscores in the
name indicate a choice of implementation, so class "G::Foo" might have two
implementations in the files "gfoo_main.cpp" and "gfoo_alternate.cpp".
The choice is normally made by the makefile.
Portability
-----------
The E-MailRelay code is written in ISO C++, although avoiding less-widely
supported language features such as "mutable", templated methods and "export".
The header files "gdef.h" in "src/glib", and "gnet.h" in "src/gnet" are intended
to be used to fix up compiler portability issues such as missing standard types,
non-standard system headers etc. Conditional compilation directives ("#if"
etc.) are confined to these headers as far as possible in order to improve
readability.
Deficiencies in the standard headers files provided by older compilers are fixed
up by files in the "lib" directory tree. For example, the msvc6.0 compiler
sometimes does not put its names into the "std" namespace, even though the
std-namespace headers are used. This can be worked round by additional "using"
declarations in the "lib/msvc6.0" headers. These work-rounds are kept out of
the "src" tree because they are not necessary for more modern compilers.
Windows/unix portability is generally addressed by providing a common class
declaration with two implementations. Where necessary a "pimple" (or "Bridge")
pattern is used to hide the system-specific parts of the declaration.
A good example is the "G::Directory" class used for iterating through files in
a directory. The header file "src/glib/gdirectory.h" is common to both systems,
but two implementations are provided in "gdirectory_unix.cpp" and
"gdirectory_win32.cpp". The unix implementation uses opendir() and glob(),
while the windows implementation uses FindFirstFile().
Sometimes only small parts of the implementation are system-specific. In
these cases there are three source files per header. For example, "gsocket.cpp",
"gsocket_win32.cpp" and "gsocket_unix.cpp" in the "src/gnet" directory.
Compile-time features
---------------------
Compile-time features can be selected with options passed to the "configure"
script. These include the following:
* Debug-level logging ("--enable-debug")
* IPv6 (Linux only) ("--enable-ipv6")
* Configuration GUI ("--enable-gui")
* PAM support ("--with-pam")
Some functionality can be disabled at compile-time in order to reduce the size
of the executable, typically when building for embedded systems:
* Disable POP3 protocol, "--disable-pop"
* Disable authentication, "--disable-auth" (requires "--disable-pop")
* Disable administration interface, "--disable-admin"
* Disable execution of external programs, "--disable-exec"
The "--enable-small-config" option can be used to change the command-line
parsing code to use a configuration file instead, resulting in a smaller
executable. This also removes a lot of the configuration checking code, so it is
not recommended unless size is critical. (The format of the configuration file
is similar to the command-line using the long-form options without the
double-dash and using '=' to separate the option from the option value.)
Use "./configure --help" to see a complete list of options and refer to
"acinclude.m4" for more detailed comments.
Idioms
------
The "<<=" operator defined in "src/glib/gmemory.h" is used idiomatically
to reassign a std::auto_ptr<> since reset() is not always available.
Copyright (C) 2001-2013 Graeme Walker <graeme_walker@users.sourceforge.net>. All rights reserved.