flexnbd-c

Author	SHA1	Message	Date
Alex Young	22bea81445	Don't open the control socket until after the server socket is bound. This makes it easier for the tests (and the supervisor) to guarantee that they can connect to the server socket. This patch also moves freeing the mirror supervisor into the server thread.	2012-10-09 17:35:20 +01:00
Alex Young	1fa8ba82a5	Merge	2012-10-04 14:51:54 +01:00
Alex Young	f3e0d61323	Quit with an error status on SIGTERM during migration. This prevents the supervisor from thinking that the migration completed successfully. In order to do this, I've introduced a new lock around the start (and finish) of the migration so that we avoid a race between the signal handler in the server_accept loop and the control thread mirror startup. Without that, we'd risk successfully starting a migration after the SIGTERM handler fired, which would be Bad.	2012-10-04 14:41:55 +01:00
nick	ccbfce1075	Whitespace	2012-09-20 13:37:48 +01:00
Alex Young	33f95e1986	Add the --unlink option to mirror. This deletes the local file before tearing down the mirror connection, allowing us to avoid an ambiguous recovery situation.	2012-07-23 13:39:27 +01:00
Alex Young	fd935ce4c9	Simplify the migration handover protocol. The three-way hand-off has a problem: there's no way to arrange for the state of the migration to be unambiguous in case of failure. If the final "disconnect" message is lost (that is, the destination never receives it, whether or not the sender sent it), the destination has no option but to quit with an error status and let a human sort it out. However, at that point we can either arrange to have a .INCOMPLETE file still on disc or not, and whichever we choose, there is some specific calamity that could befall the sender and leave us with data loss. Given this, it makes sense to fall back to a simpler protocol: just send all the data, then send a "disconnect" message. This has the same downside that we need a human to sort out specific failure cases, but combined with --unlink before sending "disconnect" (see next patch) it will always be possible for a human to disambiguate, whether the destination quit with an error status or not.	2012-07-23 10:22:25 +01:00
Alex Young	10625e402b	Move the mirror commit state mbox to struct control. The mirror_super signals the commit state to the control thread via an mbox. That mbox was owned by mirror_super, but the problem with that is that mirror_super can free the mbox before the control client has been scheduled to receive the message. If it's owned by the control object, that can't happen.	2012-07-15 21:57:36 +01:00
Alex Young	b20fbc6a66	Don't retry a mirror which failed on the first attempt. If the mirror attempt failed and we were able to report an error to the user, it makes no sense to attempt a retry. We don't have a way to abort a mirror attempt yet, so if the user got a setting wrong and it's failing for that reason, the only recourse they'd have would be to restart the server.	2012-07-15 20:07:17 +01:00
Alex Young	a10adf007c	Switch the mirror commit_signal to an mbox. At the moment, a first-pass failed migration will retry. This is wrong; it should abort. However, to make that happen the mirror supervisor needs to know the commit state of the mirror thread. With a self_pipe mirror commit signal, that information wasn't there.	2012-07-15 19:46:35 +01:00
Alex Young	5794913fdf	Delete the MS_FINALISE mirror state. It's not being used for anything.	2012-07-15 18:40:50 +01:00
Alex Young	e77234c6b1	Close the mirror client socket on rejection. If the mirror attempt connects OK but is rejected (say, for reporting the wrong size), the client socket needs to be closed. Until it is, the destination end can't close its own socket and accept another connection attempt.	2012-07-15 18:30:20 +01:00
Alex Young	c6e6952def	Open files with O_DIRECT dependent on a compile-time DIRECT_IO #define. O_DIRECT causes problems on (at least) a wheezy VM, and there are mixed reports about its performance impact. This patch makes it a compile-time choice which should remain until it's been benchmarked.	2012-07-14 10:07:58 +01:00
Alex Young	cef2dcaad2	Rename struct mirror_status to struct mirror	2012-07-12 14:54:48 +01:00
Alex Young	f3cebcdcd5	Test a source crashing after an entrust. This adds a test for destination behaviour, in that if a source crashes after sending an entrust message but before the destination can reply, the destination must allow the source to reconnect and retry the mirror.	2012-07-11 15:19:50 +01:00
Alex Young	f3f017a87d	Free all possibly held mutexes in error handlers. Now that we have 3 mutexes lying around, it's important that we check and free these if necessary if error() is called in any thread that can hold them. To do this, we now have flexthread.c, which defines a flexthread_mutex struct. This is a wrapper around a pthread_mutex_t and a pthread_t. The idea is that in the error handler, the thread can check whether it holds the mutex and can free it if and only if it does. This is important because pthread fast mutexes can be freed by any thread, not just the thread which holds them. Note: it is only ever safe for a thread to check if it holds the mutex itself. It is never safe to check if another thread holds a mutex without first locking that mutex, which makes the whole operation rather pointless.	2012-07-11 09:43:16 +01:00
Alex Young	9850f5d0a4	Test that timing out a write causes a disconnect and a reconnect	2012-06-28 14:45:53 +01:00
Alex Young	94b4fa887c	Add mboxes	2012-06-27 15:45:33 +01:00

17 Commits
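
Commit f3f017a87d describes flexthread_mutex as a wrapper around a pthread_mutex_t and a pthread_t, which lets an error handler release a mutex if and only if the calling thread holds it. The following is a minimal sketch of that idea, not the code in flexthread.c. The function names (ft_mutex_init, ft_mutex_lock, ft_mutex_unlock, ft_mutex_held_by_me, ft_mutex_unlock_if_held) and the has_holder flag are assumptions made for illustration; only the struct name and its two wrapped types come from the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical layout. The commit message only says the struct wraps a
 * pthread_mutex_t and a pthread_t; has_holder is an assumption, added
 * because there is no portable "no thread" value for pthread_t. */
struct flexthread_mutex {
    pthread_mutex_t mutex;
    pthread_t holder;
    bool has_holder;
};

int ft_mutex_init(struct flexthread_mutex *ftm)
{
    ftm->has_holder = false;
    return pthread_mutex_init(&ftm->mutex, NULL);
}

/* Answers "do I hold this mutex?" for the calling thread only.
 * If the caller holds the mutex, it wrote both fields itself, so the
 * answer is reliable. Asking about any other thread is not supported.
 * Strictly speaking, reading the fields while another thread updates
 * them is a data race; this sketch ignores that for brevity. */
bool ft_mutex_held_by_me(struct flexthread_mutex *ftm)
{
    return ftm->has_holder && pthread_equal(ftm->holder, pthread_self());
}

int ft_mutex_lock(struct flexthread_mutex *ftm)
{
    /* Relocking a default mutex from its holder would deadlock. */
    assert(!ft_mutex_held_by_me(ftm));
    int rc = pthread_mutex_lock(&ftm->mutex);
    if (rc == 0) {
        /* Record ownership only once the lock is actually held. */
        ftm->holder = pthread_self();
        ftm->has_holder = true;
    }
    return rc;
}

int ft_mutex_unlock(struct flexthread_mutex *ftm)
{
    /* Clear ownership while still holding the lock. */
    ftm->has_holder = false;
    return pthread_mutex_unlock(&ftm->mutex);
}

/* Intended for error handlers: release the mutex if and only if the
 * calling thread holds it. Returns true if an unlock was performed. */
bool ft_mutex_unlock_if_held(struct flexthread_mutex *ftm)
{
    if (!ft_mutex_held_by_me(ftm)) {
        return false;
    }
    ft_mutex_unlock(ftm);
    return true;
}
```

An error handler written against this sketch would call ft_mutex_unlock_if_held() once for each mutex the thread might be holding. A thread that never took the lock leaves it untouched, which is the property the commit message asks for.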