Commit Graph

31 Commits

Author SHA1 Message Date
Alex Young
fd935ce4c9 Simplify the migration handover protocol
The three-way hand-off has a problem: there's no way to arrange for the
state of the migration to be unambiguous in case of failure.  If the
final "disconnect" message is lost (as in, the destination never
receives it whether it is sent by the sender or not), the destination
has no option but to quit with an error status and let a human sort it
out.  However, at that point we can either arrange to have a .INCOMPLETE
file still on disc or not - and it doesn't matter which we choose, we
can still end up with data loss by picking a specific calamity to have
befallen the sender.

Given this, it makes sense to fall back to a simpler protocol: just send
all the data, then send a "disconnect" message.  This has the same
downside that we need a human to sort out specific failure cases, but
combined with --unlink before sending "disconnect" (see next patch) it
will always be possible for a human to disambiguate, whether the
destination quit with an error status or not.
2012-07-23 10:22:25 +01:00
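The ordering argument in this commit message can be sketched as a small model. This is an illustration only, not code from the repository. The step names, `run_sender` and `copy_to_trust` are hypothetical, and the model assumes that `--unlink` removes the source file only after all data has been sent. It also ignores whether the destination has durably written what it received.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sender steps, in the order the simplified protocol runs them. */
enum sender_step {
    STEP_SEND_DATA,
    STEP_UNLINK_SOURCE,
    STEP_SEND_DISCONNECT,
    STEP_NONE               /* no failure injected */
};

struct sender_outcome {
    bool data_sent;
    bool source_unlinked;
    bool disconnect_sent;
};

/* Run the sender, stopping just before the step named by fail_at. */
struct sender_outcome run_sender(enum sender_step fail_at)
{
    struct sender_outcome o = { false, false, false };

    if (fail_at == STEP_SEND_DATA) { return o; }
    o.data_sent = true;
    if (fail_at == STEP_UNLINK_SOURCE) { return o; }
    o.source_unlinked = true;
    if (fail_at == STEP_SEND_DISCONNECT) { return o; }
    o.disconnect_sent = true;

    /* The ordering guarantees both earlier steps happened by this point. */
    assert(o.data_sent && o.source_unlinked);
    return o;
}

enum trusted_copy { TRUST_SOURCE, TRUST_DESTINATION };

/* What a human would conclude from whether the source file still exists
 * (assumption: the file is gone only if the unlink step ran). */
enum trusted_copy copy_to_trust(const struct sender_outcome *o)
{
    return o->source_unlinked ? TRUST_DESTINATION : TRUST_SOURCE;
}
```

In this model a lost "disconnect" still leaves an unambiguous answer: the source file is already gone, so the destination holds the complete copy.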
Alex Young
314c0c2a2a Added the flexnbd break command to stop mirroring 2012-07-17 16:30:49 +01:00
Alex Young
cef2dcaad2 Rename struct mirror_status to struct mirror 2012-07-12 14:54:48 +01:00
Alex Young
f3f017a87d Free all possibly held mutexes in error handlers
Now that we have 3 mutexes lying around, it's important that we check
and free these if necessary if error() is called in any thread that can
hold them.  To do this, we now have flexthread.c, which defines a
flexthread_mutex struct.  This is a wrapper around a pthread_mutex_t and
a pthread_t.  The idea is that in the error handler, the thread can
check whether it holds the mutex and can free it if and only if it does.
This is important because pthread fast mutexes can be freed by *any*
thread, not just the thread which holds them.

Note: it is only ever safe for a thread to check if it holds the mutex
itself.  It is *never* safe to check if another thread holds a mutex
without first locking that mutex, which makes the whole operation rather
pointless.
2012-07-11 09:43:16 +01:00
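A minimal sketch of the idea described in this commit message, assuming a wrapper that records which thread took the lock. The field layout and the function names below are hypothetical and are not taken from flexthread.c. Only the calling thread's own ownership is ever checked, as the note in the message requires.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical layout: a pthread_mutex_t plus the pthread_t of its holder. */
struct flexthread_mutex {
    pthread_mutex_t mutex;
    pthread_t       holder;  /* meaningful only while held is true */
    bool            held;
};

void flexthread_mutex_init(struct flexthread_mutex *ftm)
{
    int rc = pthread_mutex_init(&ftm->mutex, NULL);
    assert(rc == 0);
    (void)rc;
    ftm->held = false;
}

void flexthread_mutex_lock(struct flexthread_mutex *ftm)
{
    int rc = pthread_mutex_lock(&ftm->mutex);
    assert(rc == 0);
    (void)rc;
    /* Record ownership only once the lock is actually held. */
    ftm->holder = pthread_self();
    ftm->held = true;
}

void flexthread_mutex_unlock(struct flexthread_mutex *ftm)
{
    /* Clear ownership before releasing, while we still hold the lock. */
    ftm->held = false;
    int rc = pthread_mutex_unlock(&ftm->mutex);
    assert(rc == 0);
    (void)rc;
}

/* Only answers "do *I* hold it?"; asking about another thread is not safe. */
bool flexthread_mutex_held_by_me(const struct flexthread_mutex *ftm)
{
    return ftm->held && pthread_equal(ftm->holder, pthread_self());
}

/* For use in an error handler: release the mutex if and only if the
 * calling thread holds it. Returns true if it was released. */
bool flexthread_mutex_unlock_if_held(struct flexthread_mutex *ftm)
{
    if (!flexthread_mutex_held_by_me(ftm)) {
        return false;
    }
    flexthread_mutex_unlock(ftm);
    return true;
}
```

An error handler would call `flexthread_mutex_unlock_if_held()` once for each mutex the thread might hold.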
Alex Young
94b4fa887c Add mboxes 2012-06-27 15:45:33 +01:00
Alex Young
2078d17053 connect failure scenarios 2012-06-22 10:05:41 +01:00
Alex Young
f37a217cb9 Add listen mode 2012-06-21 18:01:50 +01:00
Alex Young
79ba1cf728 Make max_nbd_clients configurable per struct server 2012-06-21 17:22:34 +01:00
Alex Young
a3dc670939 Squash valgrind errors by making sure client threads get joined on termination 2012-06-21 17:11:12 +01:00
Alex Young
7d1c15b07a Fix two bugs in mirroring.
First, leaving off the source address caused a segfault in the
command-sending process because there was no NULL check on the ARGV
entry.

Second, while the migration thread sent a signal to the server to close
on successful completion, it didn't wait until the close actually
happened before releasing the IO lock.  This meant that any client
thread waiting on that IO lock could have a read or a write queued up
which could succeed despite the server shutdown.  This would have meant
data loss as the guest would see a successful write to the wrong instance
of the file.  This patch adds a noddy serve_wait_for_close() function
which the mirror_runner calls to ensure that any clients will reject
operations they're waiting to complete.

This patch also adds a simple scenario test for migration, and fixes
TempFileWriter#read_original.
2012-06-13 13:44:21 +01:00
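The second fix amounts to waiting for the close to be observed before letting go of the IO lock. One way to express such a wait is a flag guarded by a mutex and a condition variable, sketched below. The `close_gate` type and its functions are hypothetical. The actual serve_wait_for_close() is described as "noddy" and may well be implemented differently, for example by polling.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical helper: records whether the server has finished closing. */
struct close_gate {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool            closed;
};

void close_gate_init(struct close_gate *g)
{
    int rc = pthread_mutex_init(&g->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&g->changed, NULL);
    assert(rc == 0);
    (void)rc;
    g->closed = false;
}

/* Called by the server once it has really shut down. */
void close_gate_mark_closed(struct close_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->closed = true;
    pthread_cond_broadcast(&g->changed);
    pthread_mutex_unlock(&g->lock);
}

/* Called by the mirroring thread before it releases the IO lock. */
void close_gate_wait(struct close_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->closed) {
        /* Loop guards against spurious wakeups. */
        pthread_cond_wait(&g->changed, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Clients check this after acquiring the IO lock and reject the request
 * if the server has already closed. */
bool close_gate_is_closed(struct close_gate *g)
{
    pthread_mutex_lock(&g->lock);
    bool closed = g->closed;
    pthread_mutex_unlock(&g->lock);
    return closed;
}
```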
Alex Young
b986f6b63e Take _GNU_SOURCE out of source and put it in CFLAGS 2012-06-13 09:59:08 +01:00
Alex Young
c7525f87dc Removed proxying completely and fixed the pthread_join bug revealed in the process 2012-06-12 15:08:07 +01:00
Alex Young
25fc0969cf Make the compiler stricter and tidy up code to make the subsequent errors and warnings go away 2012-06-11 13:57:03 +01:00
Matthew Bloch
e8b5fae7ab Merge, just renaming old error macros. 2012-06-09 02:37:23 +01:00
Matthew Bloch
b546539ab8 Rewrote error & log functions to be more general, use longjmp to get out of
trouble and into predictable cleanup functions (one for each of serve,
client & control contexts).  We use 'fatal' to mean 'kill the thread' and
'error' to mean 'don't kill the thread', assuming some recovery action,
except I don't use error anywhere yet.
2012-06-09 02:25:12 +01:00
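The longjmp approach in this commit message can be illustrated roughly as follows. This is a sketch under assumptions, not the repository's macros: `struct error_context`, `run_thread_body` and the single `cleanup` function are hypothetical, and the real code has one cleanup function for each of the serve, client and control contexts.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical per-thread context. It is owned by the caller, so it is not
 * a local of the function that calls setjmp(). */
struct error_context {
    jmp_buf     jmp;
    const char *message;
    int         cleanups_run;
};

void error_context_init(struct error_context *ctx)
{
    ctx->message = NULL;
    ctx->cleanups_run = 0;
}

/* 'fatal' abandons the current work and jumps to the cleanup path. */
void fatal(struct error_context *ctx, const char *message)
{
    ctx->message = message;
    longjmp(ctx->jmp, 1);
}

/* Stand-in for real work that may fail deep inside nested calls. */
void do_work(struct error_context *ctx, int should_fail)
{
    if (should_fail) {
        fatal(ctx, "simulated failure");
    }
}

/* Stand-in for closing sockets, releasing locks and so on. */
static void cleanup(struct error_context *ctx)
{
    ctx->cleanups_run++;
}

/* Returns 0 on success, -1 if fatal() was called. Cleanup runs either way. */
int run_thread_body(struct error_context *ctx, int should_fail)
{
    assert(ctx != NULL);

    if (setjmp(ctx->jmp) == 0) {
        do_work(ctx, should_fail);
        cleanup(ctx);
        return 0;
    }

    /* Reached only via longjmp() from fatal(). */
    cleanup(ctx);
    return -1;
}
```

Keeping the context outside the function that calls `setjmp()` avoids the rule that non-volatile locals modified between `setjmp()` and `longjmp()` have indeterminate values afterwards.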
Alex Young
b7096ef908 Audit client connections on acl update 2012-06-08 18:03:41 +01:00
Alex Young
35ca93b42c Lock around acl updates 2012-06-08 11:02:40 +01:00
Alex Young
f7e1a098b1 Move updating the acl object into serve.c
* * *
Replacing the server acl sends an acl_updated signal
2012-06-08 10:32:33 +01:00
Alex Young
2d9d00b636 Pull ACLs into their own struct 2012-06-07 17:47:43 +01:00
Alex Young
1cd8f4660f Merge of doom 2012-06-07 14:40:55 +01:00
Alex Young
a90f84972b Add stop signals to client threads 2012-06-07 11:44:19 +01:00
Matthew Bloch
5710431780 Refactored write_not_zeroes to use struct bitset_mapping instead of
repeating all that code (has not fixed earlier bug yet, but lots of
repetition cut).
2012-06-07 11:17:02 +01:00
Alex Young
cfa9f9c71f Fix the sense of client_serve_request 2012-06-06 14:25:35 +01:00
Alex Young
e8b47d5855 Remove the accept lock as being unneeded 2012-06-06 14:07:55 +01:00
Alex Young
1fc76ad77f Merge 2012-06-06 13:44:49 +01:00
Alex Young
1b289a0e87 Change io lock and unlock to server error on failure 2012-06-06 13:29:13 +01:00
Alex Young
339e766339 Use self_pipe for close_signal 2012-06-06 12:41:03 +01:00
nick
14c9468b68 Automated merge with ssh://dev/flexnbd-c 2012-06-06 12:35:18 +01:00
Alex Young
40279bc9ca Split client-specific code into client.{c,h} 2012-06-06 11:27:52 +01:00
Alex Young
d22471d195 Fix a #define symbol 2012-06-06 10:55:50 +01:00
Alex Young
a80c5ce6b5 Moved sockaddr_address_data to serve.c and renamed params.h to serve.h 2012-06-06 10:45:07 +01:00