flexnbd-c

Author	SHA1	Message	Date
nick	8d56316548	client: Start checking for exceptions on the client socket	2014-02-11 14:32:12 +00:00
nick	27f2cc7083	Some debug and whitespace tweaks	2014-02-11 14:31:58 +00:00
nick	8084a41ad2	flexnbd client: Catch a few cases where the killswitch wasn't disarmed	2014-01-28 11:45:27 +00:00
nick	afcc07a181	Fix stop signal logic broken by the killswitch	2014-01-22 12:16:09 +00:00
nick	91d9531a60	flexnbd serve: Make the killswitch per-client-thread. This is a bit tricky, but calling shutdown() on a socket in a signal handler is safe, and (at least in linux) appears to cause any read() or write() calls blocked on that socket to return, even with SA_RESTART. I'm not confident enough about the rest of flexnbd's syscall error handling to turn SA_RESTART off for this signal...	2014-01-22 11:49:21 +00:00
nick	f4793c7059	bitset: Rename bitset_mapping to bitset	2013-09-23 16:58:40 +01:00
nick	9a3106f946	flexnbd: Remove the server I/O lock from around NBD requests. NBD doesn't actually guarantee what happens if you have two concurrent writes to overlapping areas of the disc, and this mutex was causing us a near-deadlock when the TCP connection died uncleanly, partway through a request. So now we don't bother. This actually removes the last user of the server I/O mutex, so we can remove it completely from the codebase in a future commit.	2013-09-23 10:22:48 +01:00
nick	eb80c0d235	mirror: Remove server I/O lock and dirty map. Given our bitset_stream events, we no longer need to worry about keeping track of the dirty map. This also lets us rip out the server I/O lock from mirroring. It's possible that we can remove the lock from client.c as well at this point, but I need to have a bit more of a think about possible races.	2013-09-19 15:18:30 +01:00
nick	0172eb1cba	flexnbd: Some comments and a minor fix in client.c to do with the event stream	2013-09-13 15:17:15 +01:00
nick	efdd613968	listen: Turn off CLIENT_MAX_WAIT_SECS. The idea behind this feature was to avoid the client thread in a listen server getting stuck forever if the mirroring thread in the source died. However, it breaks any sane implementation of max_Bps in that thread, and there are lingering concerns over how it might operate under normal conditions anyway. Specifically, if iterating over the bitmap takes a long time, or even just reading the requisite 8MB from the disc in order to send it, then the 5-second timeout could be hit, causing mirroring to fail unnecessarily.	2013-08-14 16:09:55 +01:00
nick	64702d992d	Minor fixes here and there	2013-08-09 17:02:33 +01:00
nick	bed8959d47	bitset: Fix large runs	2013-07-24 17:42:08 +01:00
nick	253cee5a10	flexnbd: Acknowledge new return type of bitset_run_count	2013-07-24 15:08:29 +01:00
nick	7de22a385e	flexnbd: clients should be MADV_RANDOM, rather than MADV_SEQUENTIAL	2013-07-24 14:18:23 +01:00
nick	9f34752842	flexnbd: Make the killswitch runtime-selectable. We're not actually using it in production right now because it doesn't shut its sockets down cleanly enough. This is a better option than reverting the functionality or keeping production downgraded until we sort out a handler that cleanly closes the sockets.	2013-07-03 09:56:35 +01:00
nick	f7e5353355	serve: Add a killswitch that causes the server to uncleanly exit on hang. We define a hang as 120 seconds for now; that should be OK (famous last words). When I say unclean, I mean it; the control socket is left hanging around too. This is a workaround for the fact that the client can hang the whole server by sending a write request header specifying > 0 bytes, then uncleanly going away. On the server side, we acquire the IO mutex, and then try to read > 0 bytes from the socket; the data never arrives, and when the client reconnects, its requests never get a response (since we're waiting on that mutex). Getting rid of that mutex (which isn't actually needed, except for migration) would be better.	2013-06-06 14:16:20 +01:00
nick	a5a7d45355	flexnbd: Add more madvise() hints, both for mirroring out and normal operation. This is hopefully going to reduce flexnbd rss	2013-05-28 14:16:49 +01:00
nick	d9b3aab972	flexnbd: Pass MS_INVALIDATE to our msync calls. It's not necessary on Linux, but may be needed elsewhere	2013-04-30 11:04:17 +01:00
nick	6842864e74	Automated merge with file:///home/lupine/Development/bigv-repos/flexnbd-c-sockutil	2013-02-15 16:53:18 +00:00
nick	9b67d30608	serve: Make some error conditions non-fatal, test them. We don't want flexnbd serve to fall over and die if the client sends an invalid request.	2013-02-15 16:51:28 +00:00
nick	8c04564645	flexnbd: Avoid a SIGSEGV when the allocation map fails to build. In the event of a fiemap ioctl failing (when the file is on a tmpfs, for instance), we would free() serve->allocation_map, but it would remain not NULL, leading to segfaults in client.c when responding to write requests. Keeping the free() behaviour is more hassle than it's worth, as there are synchronization problems with setting serve->allocation_map to NULL, so we just omit the free() instead to avoid the segfault. This is safe because we never consult the map until allocation_map_built is set to true, and we never do that when the builder thread fails.	2013-02-08 16:17:16 +00:00
nick	0b3a71bb03	flexnbd: Allocate the right amount of memory for a struct client	2013-02-05 13:27:48 +00:00
Alex Young	ed70dacf2f	Don't skip parts of a file when calling fiemap. A mis-incremented offset in the fiemap-processing code meant that non-sparse portions of files were missed.	2012-11-20 17:24:19 +00:00
Alex Young	cf62b10adf	Nullcheck before dereferencing. Also bracketing, replacing a lost comment, and some variable naming.	2012-10-08 14:54:10 +01:00
Matthew Bloch	a49cf14927	Block allocation map is now built in a separate thread, and does not delay server startup (sparse write avoidance doesn't happen until it is finished). Added mutex to bitset functions, which were already being called from multiple threads. Rewrote allocation map builder to request file information in multiple chunks, to avoid uninterruptible wait and dynamic memory allocation.	2012-10-07 21:55:01 +01:00
nick	ccbfce1075	Whitespace	2012-09-20 13:37:48 +01:00
Alex Young	c3c621f750	Don't free a client which hasn't finished yet.	2012-08-23 17:51:19 +01:00
Alex Young	c5dfe16f35	Don't close the same file descriptor more than once.	2012-08-23 16:01:37 +01:00
Alex Young	33f95e1986	Add the --unlink option to mirror. This deletes the local file before tearing down the mirror connection, allowing us to avoid an ambiguous recovery situation.	2012-07-23 13:39:27 +01:00
Alex Young	fd935ce4c9	Simplify the migration handover protocol. The three-way hand-off has a problem: there's no way to arrange for the state of the migration to be unambiguous in case of failure. If the final "disconnect" message is lost (as in, the destination never receives it whether it is sent by the sender or not), the destination has no option but to quit with an error status and let a human sort it out. However, at that point we can either arrange to have a .INCOMPLETE file still on disc or not - and it doesn't matter which we choose, we can still end up with data loss by picking a specific calamity to have befallen the sender. Given this, it makes sense to fall back to a simpler protocol: just send all the data, then send a "disconnect" message. This has the same downside that we need a human to sort out specific failure cases, but combined with --unlink before sending "disconnect" (see next patch) it will always be possible for a human to disambiguate, whether the destination quit with an error status or not.	2012-07-23 10:22:25 +01:00
Alex Young	d0b39cce08	Flush bad write data from the client socket. If the client makes a write that's out of range, by the time we get to validate the message at the server end the client has already stuffed the socket with data we can't use, so we have to flush it. This patch also fixes a potential problem in the acceptance tests where the error field was being returned as an array rather than a value.	2012-07-15 23:19:12 +01:00
Alex Young	f9baa95b0f	Raise the log level of a write-request-out-of-range. Without this, the error you get is a "Bad magic", when the next read loop tries to read write data as a request. This should be flushed from the socket (although when is an open question), but upping the log level at least gives us a more informative output.	2012-07-14 17:27:13 +01:00
Alex Young	1ce1003d3d	Error when reading sent data fails. If the client cuts off part-way through the write, it should cause an error, not a fatal. Previously this happened if the open file had a fiemap, but not if there was no allocation map. This patch fixes that, along with an associated valgrind error.	2012-07-14 12:10:12 +01:00
Alex Young	c6e6952def	Open files with O_DIRECT dependent on a compile-time DIRECT_IO #define. O_DIRECT causes problems on (at least) a wheezy VM, and there are mixed reports about its performance impact. This patch makes it a compile-time choice which should remain until it's been benchmarked.	2012-07-14 10:07:58 +01:00
Alex Young	40101e49f3	Silence a vfprintf valgrind error. Turns out that %lld causes valgrind to find an uninitialised variable problem inside vfprintf. Avoid it here by s/%lld/%d/.	2012-07-13 11:57:46 +01:00
Alex Young	2e4e592c08	Enable writing after the 2G boundary. This patch fixes a bug in readwrite.c which truncated the 'from' field in nbd requests. It was casting them down from an off64_t to an int.	2012-07-12 18:01:10 +01:00
Alex Young	10b46beeea	Retry failed rebind attempts. When we receive a migration, if rebinding to the new listen address and port fails for a reason which might be fixable, rather than killing the server we retry once a second. Also in this patch: non-overlapping log messages and a fix for the client going away halfway through a sendfile loop.	2012-07-12 14:14:46 +01:00
Alex Young	eb90308b6e	Handle a failed disconnect correctly. If the sender disconnects its socket before sending the disconnect message, the destination should restart the migration process. This patch makes sure that happens.	2012-07-12 09:39:39 +01:00
Alex Young	f3f017a87d	Free all possibly held mutexes in error handlers. Now that we have 3 mutexes lying around, it's important that we check and free these if necessary if error() is called in any thread that can hold them. To do this, we now have flexthread.c, which defines a flexthread_mutex struct. This is a wrapper around a pthread_mutex_t and a pthread_t. The idea is that in the error handler, the thread can check whether it holds the mutex and can free it if and only if it does. This is important because pthread fast mutexes can be freed by any thread, not just the thread which holds them. Note: it is only ever safe for a thread to check if it holds the mutex itself. It is never safe to check if another thread holds a mutex without first locking that mutex, which makes the whole operation rather pointless.	2012-07-11 09:43:16 +01:00
Alex Young	d16aebf36e	Test that a disconnect after the write request but before the data is an error	2012-07-03 15:25:39 +01:00
Alex Young	c9fdd5a60e	Handle ECONNRESET during a read request	2012-06-28 11:46:02 +01:00
Alex Young	94b4fa887c	Add mboxes	2012-06-27 15:45:33 +01:00
Alex Young	2078d17053	connect failure scenarios	2012-06-22 10:05:41 +01:00
Alex Young	f37a217cb9	Add listen mode	2012-06-21 18:01:50 +01:00
Alex Young	e21beb1866	Add the REQUEST_ENTRUST nbd request type	2012-06-21 17:12:06 +01:00
Alex Young	4e8a9670e5	Merge	2012-06-21 11:37:18 +01:00
Alex Young	ed3090d6d5	Tweak struct initialisation to squash a valgrind error	2012-06-21 10:29:06 +01:00
Alex Young	c7525f87dc	Removed proxying completely and fixed the pthread_join bug revealed in the process	2012-06-12 15:08:07 +01:00
Alex Young	2a71b4e7a4	Fix broken error checking around pthread functions	2012-06-11 16:08:19 +01:00
Alex Young	710d8254d4	Make sure all ifs are braced	2012-06-11 14:34:17 +01:00

65 Commits