flexnbd-c

Author	SHA1	Message	Date
nick	65d4f581b9	mirror: Clean up bps calculation slightly	2013-10-24 15:11:55 +01:00
nick	97a923afdf	mirror: Don't start migrating until the allocation map is built There is a fun race that can happen if we begin migrating while the allocation map is still building. We call bitset_enable_stream() when the migration begins, which causes the builder to start putting events into the stream. This is bad all by itself, as it slows the migration down for no reason, but the stream is a limited-size queue and there are situations (migration fails and is restarted) where we can end up with the queue full and nobody able to empty it, freezing the whole thing.	2013-10-23 15:58:47 +01:00
nick	5c1b119f83	serve: Fix calculation of server_mirror_bytes_remaining Previously, we didn't count the number of bytes represented by events in the stream; we just counted each pending event as one byte. Whoops.	2013-09-23 17:09:55 +01:00
nick	0f0697a0aa	serve: Remove an unused (and incorrect, in any case) function	2013-09-23 16:47:32 +01:00
nick	e98c2f2f05	serve: Fix the sense of allow/forbid_new_clients We need a migration test where more clients connect after the gong	2013-09-23 16:46:43 +01:00
nick	0ae249009c	serve/mirror: Move some code tracking migration speed into serve The rationale is that status will need this information	2013-09-23 13:49:01 +01:00
nick	a6c175ed1d	serve: Allow number of clients currently being used to be counted	2013-09-23 13:37:13 +01:00
nick	94654419c5	serve: Add a comment clarifying that a behaviour is safe	2013-09-23 10:53:55 +01:00
nick	e161121c7a	flexnbd: Remove unused ".INCOMPLETE" file code The original idea was that we'd create a .incomplete file at the destination for mirroring, but that code was removed some time ago. This is all dead, now	2013-09-23 10:38:18 +01:00
nick	150e506780	flexnbd: Remove the server I/O lock as it no longer has any consumers	2013-09-23 10:29:06 +01:00
nick	eb80c0d235	mirror: Remove server I/O lock and dirty map Given our bitset_stream events, we no longer need to worry about keeping track of the dirty map. This also lets us rip out the server I/O lock from mirroring. It's possible that we can remove the lock from client.c as well at this point, but I need to have a bit more of a think about possible races	2013-09-19 15:18:30 +01:00
nick	77a66c85a0	serve: Move bitset freeing to after closing the mirror and clients	2013-09-17 17:30:33 +01:00
nick	0172eb1cba	flexnbd: Some comments and a minor fix in client.c to do with the event stream	2013-09-13 15:17:15 +01:00
nick	54a41aacdf	bitset: Add a bitset_free() function	2013-09-11 14:41:59 +01:00
nick	487bef1f40	flexnbd: Disconnect clients at the start of a mirror last pass Currently, we prevent clients from processing requests by taking the server I/O lock. This leads to requests hanging for a long time before being terminated when the migration completes, which is not ideal. With this change, at the start of the final pass, existing clients are closed and any new connections will be closed immediately (so no NBD server handshake will be seen). This is part of the work required to remove the server I/O lock completely.	2013-09-10 16:03:26 +01:00
nick	c6764b0de1	mirror: abandon signals are now honoured outside of the remote end being readable / writable	2013-08-12 15:30:21 +01:00
nick	9f34752842	flexnbd: Make the killswitch runtime-selectable We're not actually using it in production right now because it doesn't shut its sockets down cleanly enough. This is a better option than reverting the functionality or keeping production downgraded until we sort out a handler that cleanly closes the sockets.	2013-07-03 09:56:35 +01:00
nick	6842864e74	Automated merge with file:///home/lupine/Development/bigv-repos/flexnbd-c-sockutil	2013-02-15 16:53:18 +00:00
nick	9826dc6c65	Automated merge with ssh://dev/flexnbd-c	2013-02-15 13:36:15 +00:00
nick	dfa7e1a21b	serve: Don't die horribly in the event of EINTR being returned by select()	2013-02-14 16:38:45 +00:00
nick	ac560bd907	serve: Refactor some socket utility code into its own module. We'll be using this in proxy mode later	2013-02-13 13:43:52 +00:00
nick	8c04564645	flexnbd: Avoid a SIGSEGV when the allocation map fails to build. In the event of a fiemap ioctl failing (when the file is on a tmpfs, for instance), we would free() serve->allocation_map, but it would remain not NULL, leading to segfaults in client.c when responding to write requests. Keeping the free() behaviour is more hassle than it's worth, as there are synchronization problems with setting serve->allocation_map to NULL, so we just omit the free() instead to avoid the segfault. This is safe because we never consult the map until allocation_map_built is set to true, and we never do that when the builder thread fails.	2013-02-08 16:17:16 +00:00
Alex Young	dcef6d29e5	Allocate the bitset in the foreground thread. This prevents the possibility of a race in dereferencing it in the client threads.	2012-10-09 17:54:00 +01:00
Alex Young	22bea81445	Don't open the control socket until after the server socket is bound This makes it easier for the tests (and supervisor) to guarantee to be able to connect to the server socket. Also this patch moves freeing the mirror supervisor into the server thread.	2012-10-09 17:35:20 +01:00
Alex Young	161d2fccf1	Rename serve->has_control to serve->success. This makes the use of this variable to signal an unexpected SIGTERM while migrating less confusing.	2012-10-09 17:20:39 +01:00
Alex Young	a039ceffcb	Merge	2012-10-08 16:02:37 +01:00
Alex Young	062ecca1fd	Backed out changeset c25e7d82e56e This causes test failures under valgrind, and we don't need the reordering with a background allocation map builder.	2012-10-08 16:01:25 +01:00
Alex Young	cf62b10adf	Nullcheck before dereferencing. Also bracketing, replacing a lost comment, and some variable naming.	2012-10-08 14:54:10 +01:00
Matthew Bloch	a49cf14927	Block allocation map is now built in a separate thread, and does not delay server startup (sparse write avoidance doesn't happen until it is finished). Added mutex to bitset functions, which were already being called from multiple threads. Rewrote allocation map builder to request file information in multiple chunks, to avoid uninterruptible wait and dynamic memory allocation.	2012-10-07 21:55:01 +01:00
Alex Young	1fa8ba82a5	Merge	2012-10-04 14:51:54 +01:00
Alex Young	f3e0d61323	Quit with an error status on SIGTERM during migration This prevents the supervisor from thinking that the migration completed successfully. In order to do this, I've introduced a new lock around the start (and finish) of the migration so that we avoid a race between the signal handler in the server_accept loop and the control thread mirror startup. Without that, we'd risk successfully starting a migration after the SIGTERM handler fired, which would be Bad.	2012-10-04 14:41:55 +01:00
nick	32cae67a75	flexnbd: Move building the allocation map to before server socket bind() Building the allocation map takes time, which scales with the size of the disc being presented. By building that map in the space between bind() and accept(), we leave the process in a useless state after the only good signal we have for "we are ready" and the state where it is actually ready. This was breaking migrations of large files.	2012-09-25 11:47:44 +01:00
nick	ccbfce1075	Whitespace	2012-09-20 13:37:48 +01:00
Alex Young	8b43321ef2	Fix for deadlocks when writing while migrating	2012-09-13 12:21:43 +01:00
Alex Young	c3c621f750	Don't free a client which hasn't finished yet.	2012-08-23 17:51:19 +01:00
Alex Young	c5dfe16f35	Don't close the same file descriptor more than once.	2012-08-23 16:01:37 +01:00
Alex Young	fd935ce4c9	Simplify the migration handover protocol The three-way hand-off has a problem: there's no way to arrange for the state of the migration to be unambiguous in case of failure. If the final "disconnect" message is lost (as in, the destination never receives it whether it is sent by the sender or not), the destination has no option but to quit with an error status and let a human sort it out. However, at that point we can either arrange to have a .INCOMPLETE file still on disc or not - and it doesn't matter which we choose, we can still end up with dataloss by picking a specific calamity to have befallen the sender. Given this, it makes sense to fall back to a simpler protocol: just send all the data, then send a "disconnect" message. This has the same downside that we need a human to sort out specific failure cases, but combined with --unlink before sending "disconnect" (see next patch) it will always be possible for a human to disambiguate, whether the destination quit with an error status or not.	2012-07-23 10:22:25 +01:00
Alex Young	314c0c2a2a	Added the `flexnbd break` command to stop mirroring	2012-07-17 16:30:49 +01:00
Alex Young	1caa3d4e27	Make an EADDRINUSE on server bind fatal. This is important because if we try to rebind after a migration and someone else is in the way, any clients trying to reconnect to us will instead be connecting to the squatter.	2012-07-16 12:34:39 +01:00
Alex Young	8814894874	Test setting an ACL	2012-07-16 11:38:01 +01:00
Alex Young	10b46beeea	Retry failed rebind attempts When we receive a migration, if rebinding to the new listen address and port fails for a reason which might be fixable, rather than killing the server we retry once a second. Also in this patch: non-overlapping log messages and a fix for the client going away halfway through a sendfile loop.	2012-07-12 14:14:46 +01:00
Alex Young	f3f017a87d	Free all possibly held mutexes in error handlers Now that we have 3 mutexes lying around, it's important that we check and free these if necessary if error() is called in any thread that can hold them. To do this, we now have flexthread.c, which defines a flexthread_mutex struct. This is a wrapper around a pthread_mutex_t and a pthread_t. The idea is that in the error handler, the thread can check whether it holds the mutex and can free it if and only if it does. This is important because pthread fast mutexes can be freed by any thread, not just the thread which holds them. Note: it is only ever safe for a thread to check if it holds the mutex itself. It is never safe to check if another thread holds a mutex without first locking that mutex, which makes the whole operation rather pointless.	2012-07-11 09:43:16 +01:00
Alex Young	ac3e6692a8	make sure that an invalid flexnbd signal fd can't break the serve accept loop	2012-06-27 16:17:51 +01:00
Alex Young	94b4fa887c	Add mboxes	2012-06-27 15:45:33 +01:00
Alex Young	2078d17053	connect failure scenarios	2012-06-22 10:05:41 +01:00
Alex Young	f37a217cb9	Add listen mode	2012-06-21 18:01:50 +01:00
Alex Young	79ba1cf728	Make max_nbd_clients configurable per struct server	2012-06-21 17:22:34 +01:00
Alex Young	a3dc670939	Squash valgrind errors by making sure client threads get joined on termination	2012-06-21 17:11:12 +01:00
Alex Young	bafc3d3687	Make sure filename_incomplete gets freed	2012-06-21 15:58:32 +01:00
Alex Young	cc22f50fe6	Avoid a use-after-free in serve.c	2012-06-21 14:15:58 +01:00
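
Commit 5c1b119f83 above describes an accounting bug: each pending stream event was counted as one byte rather than as the number of bytes it covers. The sketch below illustrates the difference. It is not the flexnbd code; the `stream_event` struct and both function names are hypothetical, and the only thing taken from the commit message is the idea that the remaining-bytes figure should sum event lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record: a dirtied byte range [from, from + len). */
struct stream_event {
    uint64_t from;
    uint64_t len;
};

/* The accounting the commit message calls wrong: one "byte" per event. */
uint64_t pending_bytes_by_count(const struct stream_event *events, size_t count)
{
    (void)events; /* lengths are ignored, which is the bug */
    return (uint64_t)count;
}

/* The corrected idea: add up the length that each pending event covers. */
uint64_t pending_bytes_by_length(const struct stream_event *events, size_t count)
{
    assert(events != NULL || count == 0);

    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += events[i].len;
    }
    return total;
}
```

With two pending events of 4096 and 512 bytes, the first function reports 2 and the second reports 4608, which is the kind of gap that would make a migration look far closer to completion than it is.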

91 Commits
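
Commit dfa7e1a21b above makes the accept loop tolerate `select()` returning `EINTR`. A common way to get that behaviour is a small retry wrapper, sketched below. The function name `select_retry_eintr` is hypothetical and the log does not show how flexnbd handles the case internally, so treat this as one possible approach rather than the project's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Call select() and retry whenever a signal interrupts it.
 * The caller's timeout is copied before every attempt because select()
 * may modify the struct it is given; as a consequence, the full timeout
 * restarts after each interruption. The fd sets are passed through
 * unchanged because select() leaves them unmodified when it fails.
 */
int select_retry_eintr(int nfds, fd_set *readfds, fd_set *writefds,
                       fd_set *exceptfds, const struct timeval *timeout)
{
    /* An fd_set cannot describe descriptors at or above FD_SETSIZE. */
    assert(nfds >= 0 && nfds <= FD_SETSIZE);

    for (;;) {
        struct timeval tv;
        struct timeval *tvp = NULL; /* NULL means block indefinitely */

        if (timeout != NULL) {
            tv = *timeout;
            tvp = &tv;
        }

        int rc = select(nfds, readfds, writefds, exceptfds, tvp);
        if (rc != -1 || errno != EINTR) {
            return rc; /* success, timeout, or an error other than EINTR */
        }
        /* Interrupted by a signal: loop and try again. */
    }
}
```

A caller that must react to the signal itself (for example, to begin shutdown) would check its own flag before looping instead of retrying unconditionally.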