flexnbd-c

Author	SHA1	Message	Date
nick	53cbe14556	mirror: lengthen the request timeout to 60 seconds. This is complicated slightly by a need to keep the tests fast, so we introduce an environment variable that can override the constant	2013-10-30 22:45:12 +00:00
Alex Young	161d2fccf1	Rename serve->has_control to serve->success. This makes the use of this variable to signal an unexpected SIGTERM while migrating less confusing.	2012-10-09 17:20:39 +01:00
Alex Young	f3e0d61323	Quit with an error status on SIGTERM during migration. This prevents the supervisor from thinking that the migration completed successfully. In order to do this, I've introduced a new lock around the start (and finish) of the migration so that we avoid a race between the signal handler in the server_accept loop and the control thread mirror startup. Without that, we'd risk successfully starting a migration after the SIGTERM handler fired, which would be Bad.	2012-10-04 14:41:55 +01:00
Alex Young	fd935ce4c9	Simplify the migration handover protocol. The three-way hand-off has a problem: there's no way to arrange for the state of the migration to be unambiguous in case of failure. If the final "disconnect" message is lost (as in, the destination never receives it whether it is sent by the sender or not), the destination has no option but to quit with an error status and let a human sort it out. However, at that point we can either arrange to have a .INCOMPLETE file still on disc or not - and it doesn't matter which we choose, we can still end up with data loss by picking a specific calamity to have befallen the sender. Given this, it makes sense to fall back to a simpler protocol: just send all the data, then send a "disconnect" message. This has the same downside that we need a human to sort out specific failure cases, but combined with --unlink before sending "disconnect" (see next patch) it will always be possible for a human to disambiguate, whether the destination quit with an error status or not.	2012-07-23 10:22:25 +01:00
Alex Young	4790912750	Remove listen mode. Changing behaviour so that instead of rebinding after a successful migration and continuing as an ordinary server, we simply quit with a 0 exit code and let our caller restart us as a server if they want to. This means that everything in listen.c, listen.h, and anything making reference to a rebind address is unneeded.	2012-07-23 09:48:50 +01:00
Alex Young	314c0c2a2a	Added the `flexnbd break` command to stop mirroring	2012-07-17 16:30:49 +01:00
Alex Young	10b46beeea	Retry failed rebind attempts. When we receive a migration, if rebinding to the new listen address and port fails for a reason which might be fixable, rather than killing the server we retry once a second. Also in this patch: non-overlapping log messages and a fix for the client going away halfway through a sendfile loop.	2012-07-12 14:14:46 +01:00
Alex Young	71b7708964	Minor tidy	2012-07-12 10:22:31 +01:00
Alex Young	eb90308b6e	Handle a failed disconnect correctly. If the sender disconnects its socket before sending the disconnect message, the destination should restart the migration process. This patch makes sure that happens.	2012-07-12 09:39:39 +01:00
Alex Young	84dd052465	Fix a test broken by stdout/stderr reshuffle	2012-07-11 10:12:10 +01:00
Alex Young	f3f017a87d	Free all possibly held mutexes in error handlers. Now that we have 3 mutexes lying around, it's important that we check and free these if necessary if error() is called in any thread that can hold them. To do this, we now have flexthread.c, which defines a flexthread_mutex struct. This is a wrapper around a pthread_mutex_t and a pthread_t. The idea is that in the error handler, the thread can check whether it holds the mutex and can free it if and only if it does. This is important because pthread fast mutexes can be freed by any thread, not just the thread which holds them. Note: it is only ever safe for a thread to check if it holds the mutex itself. It is never safe to check if another thread holds a mutex without first locking that mutex, which makes the whole operation rather pointless.	2012-07-11 09:43:16 +01:00
Alex Young	17fe6d3023	Test that a blocked entrust causes a retry	2012-07-03 18:00:31 +01:00
Alex Young	061512f3dc	Test that a write reply with the wrong magic will force a retry	2012-07-03 17:01:39 +01:00
Alex Young	a767d4bc8c	Test the source handles a dest crash after write correctly	2012-07-03 14:52:27 +01:00
Alex Young	9e67f228f0	Rename a test class	2012-07-03 13:35:47 +01:00
Alex Young	2283b99834	Split acceptance tests into separate files	2012-07-03 13:33:52 +01:00
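
Commit 53cbe14556 describes a 60 second request timeout that an environment variable can override so the tests stay fast. The log does not name the variable or the constant, so the sketch below uses made-up names (`EXAMPLE_MIRROR_TIMEOUT_SECS`, `MIRROR_REQUEST_TIMEOUT_SECS`, `parse_timeout`, `mirror_request_timeout`) and only illustrates one plausible way such an override could be read. It is not taken from the flexnbd source.

```c
#include <errno.h>
#include <stdlib.h>

/* Default from the commit message: 60 seconds. The macro name is assumed. */
#define MIRROR_REQUEST_TIMEOUT_SECS 60

/* Hypothetical variable name; the log does not say what flexnbd calls it. */
#define MIRROR_TIMEOUT_ENV "EXAMPLE_MIRROR_TIMEOUT_SECS"

/* Parse an override string. Anything that is not a whole positive decimal
 * number falls back to the compiled-in default. */
long parse_timeout(const char *raw)
{
    char *end = NULL;
    long value;

    if (raw == NULL || *raw == '\0') {
        return MIRROR_REQUEST_TIMEOUT_SECS;
    }

    errno = 0;
    value = strtol(raw, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0) {
        return MIRROR_REQUEST_TIMEOUT_SECS;
    }
    return value;
}

/* Timeout in seconds: the environment override if valid, else the default. */
long mirror_request_timeout(void)
{
    return parse_timeout(getenv(MIRROR_TIMEOUT_ENV));
}
```

Keeping the parsing separate from the `getenv` call means a test can exercise the fallback rules without touching the process environment.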

16 Commits
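
Commit f3f017a87d describes `flexthread_mutex` as a wrapper around a `pthread_mutex_t` and a `pthread_t`, so that an error handler can release a mutex only if the calling thread holds it. The following is a minimal sketch of that idea, not the code in flexthread.c. The function names and the extra `held` flag are assumptions: the flag is added here because `pthread_t` has no portable "no thread" value.

```c
#include <pthread.h>

/* A mutex that remembers which thread currently holds it. */
struct flexthread_mutex {
    pthread_mutex_t mutex;
    pthread_t holder;   /* valid only while `held` is non-zero */
    int held;           /* assumed extra field, see lead-in */
};

int flexthread_mutex_init(struct flexthread_mutex *fm)
{
    fm->held = 0;
    return pthread_mutex_init(&fm->mutex, NULL);
}

int flexthread_mutex_lock(struct flexthread_mutex *fm)
{
    int rc = pthread_mutex_lock(&fm->mutex);
    if (rc == 0) {
        /* Record ownership only once the lock is actually ours. */
        fm->holder = pthread_self();
        fm->held = 1;
    }
    return rc;
}

int flexthread_mutex_unlock(struct flexthread_mutex *fm)
{
    /* Clear ownership while we still hold the lock. */
    fm->held = 0;
    return pthread_mutex_unlock(&fm->mutex);
}

/* As the commit message notes, this answer is only trustworthy when a
 * thread asks about itself: if the caller holds the lock, the fields
 * cannot change underneath it. */
int flexthread_mutex_held_by_me(struct flexthread_mutex *fm)
{
    return fm->held && pthread_equal(fm->holder, pthread_self());
}

/* Intended for error handlers: unlock if and only if the caller holds
 * the mutex. Returns 1 if it unlocked, 0 otherwise. */
int flexthread_mutex_unlock_if_held(struct flexthread_mutex *fm)
{
    if (flexthread_mutex_held_by_me(fm)) {
        flexthread_mutex_unlock(fm);
        return 1;
    }
    return 0;
}
```

An error handler in this sketch would call `flexthread_mutex_unlock_if_held` once for each mutex the failing thread might hold, which avoids unlocking a mutex that belongs to another thread.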