O_DIRECT causes problems on (at least) a wheezy VM, and there are mixed
reports about its performance impact. This patch makes it a
compile-time choice which should remain until it's been benchmarked.
Previously, the behaviour was to unlink any control socket sat where we
wanted to open ours. This would make us lose control of running servers
if we happened to collide accidentally. With this patch, the new
process will abort() if there is a control socket squatting on the
path we want, and unlink it when it closes.
This means that an unclean shutdown will leave a dangling, unattached
control socket which will block a restart, but that's a better option
than intentionally cutting off running servers.
This simplifies building the log output because it means we don't have
to malloc a buffer to append a newline, and we keep the atomic write
property we're after. It also takes advantage of the C constant string
concatenation which we already require to work to prepend the thread and
pid data.
When we receive a migration, if rebinding to the new listen address and
port fails for a reason which might be fixable, rather than killing the
server we retry once a second. Also in this patch: non-overlapping log
messages and a fix for the client going away halfway through a sendfile
loop.
If the sender disconnects its socket before sending the disconnect
message, the destination should restart the migration process. This
patch makes sure that happens.
This adds a test for destination behaviour, in that if a source crashes
after sending an entrust message but before the destination can reply,
the destination must allow the source to reconnect and retry the mirror.
Now that we have 3 mutexes lying around, it's important that we check
and free these if necessary if error() is called in any thread that can
hold them. To do this, we now have flexthread.c, which defines a
flexthread_mutex struct. This is a wrapper around a pthread_mutex_t and
a pthread_t. The idea is that in the error handler, the thread can
check whether it holds the mutex and can free it if and only if it does.
This is important because pthread fast mutexes can be freed by *any*
thread, not just the thread which holds them.
Note: it is only ever safe for a thread to check if it holds the mutex
itself. It is *never* safe to check if another thread holds a mutex
without first locking that mutex, which makes the whole operation rather
pointless.