
404 Commits

Author SHA1 Message Date
Alex Young
ed70dacf2f Don't skip parts of a file when calling fiemap
A mis-incremented offset in the fiemap-processing code meant that
non-sparse portions of files were missed.
2012-11-20 17:24:19 +00:00
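
The class of bug described in this commit can be illustrated with a small sketch. It does not call the real FIEMAP ioctl: `query_extents` is a hypothetical stand-in that hands back a limited batch of extents, and `struct extent`, `count_allocated` and the 16-entry buffer are likewise assumptions made for illustration, not flexnbd code. The point it shows is that the next query must resume where the last returned extent ends, otherwise allocated regions are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: a run of allocated bytes in a file. */
struct extent {
    uint64_t start;
    uint64_t length;
};

/* Hypothetical stand-in for a FIEMAP query. Copies up to `max` extents that
 * end after `offset` from `all` into `out` and returns how many were copied.
 * `all` must be sorted by start and non-overlapping. */
size_t query_extents(const struct extent *all, size_t n_all, uint64_t offset,
                     struct extent *out, size_t max)
{
    size_t found = 0;
    for (size_t i = 0; i < n_all && found < max; i++) {
        if (all[i].start + all[i].length > offset) {
            out[found++] = all[i];
        }
    }
    return found;
}

/* Sums the allocated bytes of a file, fetching extents in batches of at most
 * `batch` entries. All extents are assumed to lie within file_size. */
uint64_t count_allocated(const struct extent *all, size_t n_all,
                         uint64_t file_size, size_t batch)
{
    struct extent buf[16];
    uint64_t offset = 0;
    uint64_t total = 0;

    assert(batch > 0 && batch <= 16);

    while (offset < file_size) {
        size_t n = query_extents(all, n_all, offset, buf, batch);
        if (n == 0) {
            break; /* nothing allocated beyond offset */
        }
        for (size_t i = 0; i < n; i++) {
            total += buf[i].length;
        }
        /* Resume exactly where the last returned extent ends. Advancing by
         * a fixed step here instead is the kind of mistake that skips data. */
        offset = buf[n - 1].start + buf[n - 1].length;
    }
    return total;
}
```

With this shape the result does not depend on the batch size, which is a convenient property to check in a test.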
Alex Young
4f650d85c2 Fix the error message for flexnbd write --help 2012-11-20 15:09:48 +00:00
Alex Young
dcef6d29e5 Allocate the bitset in the foreground thread.
This prevents the possibility of a race in dereferencing it in the
client threads.
2012-10-09 17:54:00 +01:00
Alex Young
22bea81445 Don't open the control socket until after the server socket is bound
This makes it easier for the tests (and supervisor) to guarantee to be
able to connect to the server socket.

Also this patch moves freeing the mirror supervisor into the server
thread.
2012-10-09 17:35:20 +01:00
Alex Young
83eb31aba4 Merge 2012-10-09 17:28:41 +01:00
Alex Young
161d2fccf1 Rename serve->has_control to serve->success.
This makes the use of this variable to signal an unexpected SIGTERM
while migrating less confusing.
2012-10-09 17:20:39 +01:00
mbloch
029ebb5ef4 Fixed build_allocation_map in ioutil.c to correctly traverse fiemaps where
there are more than 1000 extents in a 100MB file chunk.
2012-10-08 18:11:21 +01:00
Alex Young
a039ceffcb Merge 2012-10-08 16:02:37 +01:00
Alex Young
062ecca1fd Backed out changeset c25e7d82e56e
This causes test failures under valgrind, and we don't need the
reordering with a background allocation map builder.
2012-10-08 16:01:25 +01:00
Alex Young
cf62b10adf Nullcheck *before* dereferencing.
Also bracketing, replacing a lost comment, and some variable naming.
2012-10-08 14:54:10 +01:00
Matthew Bloch
a49cf14927 Block allocation map is now built in a separate thread, and does not delay
server startup (sparse write avoidance doesn't happen until it is finished).
Added mutex to bitset functions, which were already being called from
multiple threads.  Rewrote allocation map builder to request file
information in multiple chunks, to avoid uninterruptible wait and dynamic
memory allocation.
2012-10-07 21:55:01 +01:00
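
A minimal sketch of what "adding a mutex to the bitset functions" might look like, assuming a pthread mutex and one lock per bitset. The type and function names (`locked_bitset`, `locked_bitset_set`, and so on) are hypothetical and are not taken from the flexnbd source.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bitset whose operations are serialised by a single mutex. */
struct locked_bitset {
    pthread_mutex_t lock;
    size_t nbits;
    uint8_t *bits;
};

/* Allocates a zeroed bitset of nbits bits. Returns NULL on failure. */
struct locked_bitset *locked_bitset_new(size_t nbits)
{
    struct locked_bitset *set;

    if (nbits == 0) {
        return NULL;
    }
    set = calloc(1, sizeof(*set));
    if (set == NULL) {
        return NULL;
    }
    set->bits = calloc((nbits + 7) / 8, 1);
    if (set->bits == NULL) {
        free(set);
        return NULL;
    }
    set->nbits = nbits;
    pthread_mutex_init(&set->lock, NULL);
    return set;
}

/* Sets one bit while holding the lock. */
void locked_bitset_set(struct locked_bitset *set, size_t idx)
{
    assert(set != NULL && idx < set->nbits);
    pthread_mutex_lock(&set->lock);
    set->bits[idx / 8] |= (uint8_t)(1u << (idx % 8));
    pthread_mutex_unlock(&set->lock);
}

/* Returns 1 if the bit is set, 0 otherwise, reading under the lock. */
int locked_bitset_is_set(struct locked_bitset *set, size_t idx)
{
    int result;

    assert(set != NULL && idx < set->nbits);
    pthread_mutex_lock(&set->lock);
    result = (set->bits[idx / 8] >> (idx % 8)) & 1;
    pthread_mutex_unlock(&set->lock);
    return result;
}

/* Destroys the lock and releases the storage. */
void locked_bitset_free(struct locked_bitset *set)
{
    if (set == NULL) {
        return;
    }
    pthread_mutex_destroy(&set->lock);
    free(set->bits);
    free(set);
}
```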
Matthew Bloch
7b13964c39 Update Rakefile to support locally-installed libcheck, removed efence, pushed
-l arguments to end of link command line.
2012-10-07 02:09:34 +01:00
Alex Young
1fa8ba82a5 Merge 2012-10-04 14:51:54 +01:00
Alex Young
f3e0d61323 Quit with an error status on SIGTERM during migration
This prevents the supervisor from thinking that the migration completed
successfully.

In order to do this, I've introduced a new lock around the start (and
finish) of the migration so that we avoid a race between the signal
handler in the server_accept loop and the control thread mirror startup.
Without that, we'd risk successfully starting a migration after the
SIGTERM handler fired, which would be Bad.
2012-10-04 14:41:55 +01:00
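
The race described above can be sketched as a small "gate" shared by the accept loop and the control thread. Everything here is an assumption for illustration: `struct migration_gate` and the `gate_*` functions are hypothetical names, and the sketch assumes the accept loop calls `gate_request_shutdown` after it has noticed the SIGTERM, not from inside the signal handler, since `pthread_mutex_lock` is not async-signal-safe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state; names are illustrative, not flexnbd's. */
struct migration_gate {
    pthread_mutex_t lock;
    bool shutdown_requested;
    bool migration_running;
};

void gate_init(struct migration_gate *g)
{
    assert(g != NULL);
    pthread_mutex_init(&g->lock, NULL);
    g->shutdown_requested = false;
    g->migration_running = false;
}

/* Called by the accept loop once it has seen a SIGTERM. Returns true if a
 * migration was running, in which case the caller should exit with an
 * error status. */
bool gate_request_shutdown(struct migration_gate *g)
{
    bool was_running;

    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    g->shutdown_requested = true;
    was_running = g->migration_running;
    pthread_mutex_unlock(&g->lock);
    return was_running;
}

/* Called by the control thread. Refuses to start once shutdown has been
 * requested, or while another migration is already running. */
bool gate_try_start_migration(struct migration_gate *g)
{
    bool started = false;

    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    if (!g->shutdown_requested && !g->migration_running) {
        g->migration_running = true;
        started = true;
    }
    pthread_mutex_unlock(&g->lock);
    return started;
}

/* Called when the migration ends, whether it succeeded or failed. */
void gate_finish_migration(struct migration_gate *g)
{
    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    g->migration_running = false;
    pthread_mutex_unlock(&g->lock);
}
```

Because both the shutdown flag and the running flag are read and written under the same lock, a migration cannot begin after shutdown has been requested.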
nick
32cae67a75 flexnbd: Move building the allocation map to before server socket bind()
Building the allocation map takes time, which scales with the size of the disc
being presented. By building that map in the space between bind() and accept(),
we leave the process in a useless state between the only good signal we have for
"we are ready" and the state where it is actually ready. This was breaking
migrations of large files.
2012-09-25 11:47:44 +01:00
nick
ccbfce1075 Whitespace 2012-09-20 13:37:48 +01:00
Alex Young
ddc57e76d1 Remove an unneeded sanity check from the tests 2012-09-13 15:13:20 +01:00
Alex Young
1d9c88d4ca Add the write-during-migration test to the acceptance test run 2012-09-13 14:41:50 +01:00
Alex Young
8b43321ef2 Fix for deadlocks when writing while migrating 2012-09-13 12:21:43 +01:00
nick
13328910c8 Add a test case that tickles a deadlock bug when migrating active source discs 2012-09-12 17:13:33 +01:00
Alex Young
50001cd6e7 Merge 2012-09-12 15:43:15 +01:00
Alex Young
ccf5baa956 Add a -dbg package to the debian build 2012-09-12 15:42:58 +01:00
nick
ee652a2965 Fix some races in the acceptance tests 2012-09-11 16:21:35 +01:00
nick
e724d83bec Ensure fiemap ioctl calls are synchronous. 2012-09-11 15:37:13 +01:00
Alex Young
239136064a Add default empty LDFLAGS 2012-08-24 09:32:33 +01:00
Alex Young
c3c621f750 Don't free a client which hasn't finished yet. 2012-08-23 17:51:19 +01:00
Alex Young
c5dfe16f35 Don't close the same file descriptor more than once. 2012-08-23 16:01:37 +01:00
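
One common way to avoid closing a descriptor twice is to invalidate the variable as soon as it has been closed. The helper below is a hypothetical sketch of that idiom, not the change made in this commit. Double closes matter in a threaded server because the descriptor number may already have been reused by another thread by the time the second `close()` runs.

```c
#include <assert.h>
#include <unistd.h>

/* Closes *fd at most once and marks it closed by storing -1.
 * Returns the result of close(), or 0 if it was already closed. */
int close_once(int *fd)
{
    int result = 0;

    assert(fd != NULL);
    if (*fd >= 0) {
        result = close(*fd);
        *fd = -1;
    }
    return result;
}
```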
Alex Young
b1a4db2727 Further merge fail fix
The reversal of the control protocol lines for the mirror command wasn't
complete.
2012-07-24 14:19:53 +01:00
nick
2c0f86c018 Fix a merge fail 2012-07-24 09:21:40 +01:00
Alex Young
53eca40fad Fix tests broken by entrust removal
Missed check_readwrite and check_flexnbd
2012-07-23 15:45:39 +01:00
Alex Young
33f95e1986 Add the --unlink option to mirror
This deletes the local file before tearing down the mirror connection,
allowing us to avoid an ambiguous recovery situation.
2012-07-23 13:39:27 +01:00
Alex Young
fd935ce4c9 Simplify the migration handover protocol
The three-way hand-off has a problem: there's no way to arrange for the
state of the migration to be unambiguous in case of failure.  If the
final "disconnect" message is lost (as in, the destination never
receives it whether it is sent by the sender or not), the destination
has no option but to quit with an error status and let a human sort it
out.  However, at that point we can either arrange to have a .INCOMPLETE
file still on disc or not - and it doesn't matter which we choose, we
can still end up with data loss by picking a specific calamity to have
befallen the sender.

Given this, it makes sense to fall back to a simpler protocol: just send
all the data, then send a "disconnect" message.  This has the same
downside that we need a human to sort out specific failure cases, but
combined with --unlink before sending "disconnect" (see next patch) it
will always be possible for a human to disambiguate, whether the
destination quit with an error status or not.
2012-07-23 10:22:25 +01:00
Alex Young
f6f4266fd6 Update the README for new listen behaviour
Get rid of references to rebind addresses and update the usage examples.
2012-07-23 10:10:47 +01:00
Alex Young
4790912750 Remove listen mode
Changing behaviour so that instead of rebinding after a successful
migration and continuing as an ordinary server, we simply quit with a
0 exit code and let our caller restart us as a server if they want to.
This means that everything in listen.c, listen.h, and anything making
reference to a rebind address is unneeded.
2012-07-23 09:48:50 +01:00
Alex Young
77f4ac29c6 Include strerror(errno) in stat debug output 2012-07-20 09:51:53 +01:00
Alex Young
b0f1a027c6 Add .INCOMPLETE file marker to flexnbd listen
We drop a marker onto the filesystem to say when we know the image we're
serving is not yet ready.
2012-07-19 17:34:20 +01:00
Alex Young
76bbdb4889 Force gzipping the man page 2012-07-19 17:22:25 +01:00
Alex Young
314c0c2a2a Added the flexnbd break command to stop mirroring 2012-07-17 16:30:49 +01:00
Alex Young
1caa3d4e27 Make an EADDRINUSE on server bind fatal.
This is important because if we try to rebind after a migration and
someone else is in the way, any clients trying to reconnect to us will
instead be connecting to the squatter.
2012-07-16 12:34:39 +01:00
Alex Young
2e20e7197a Add the pid to the status output
This will be needed if we daemonise flexnbd.
2012-07-16 11:50:59 +01:00
Alex Young
8814894874 Test setting an ACL 2012-07-16 11:38:01 +01:00
Alex Young
66ff06fe0e Block a second mirror attempt
If a second mirror command is run while the first is still going,
flexnbd needs to prevent the second because we only have one dirty map.
Also, the shutdown becomes Complicated if we allow more than one mirror
at a time.
2012-07-16 11:21:56 +01:00
Alex Young
db30ea0c48 Better error handling for remotes 2012-07-16 11:04:45 +01:00
Alex Young
9a81af5f8f Added tag 0.0.2 for changeset 99b403167181 2012-07-16 10:49:03 +01:00
Alex Young
484a29b3f6 Add README.txt to the deb task code files (tag: 0.0.2) 2012-07-16 10:29:06 +01:00
Alex Young
d0b39cce08 Flush bad write data from the client socket.
If the client makes a write that's out of range, by the time we get to
validate the message at the server end the client has already stuffed
the socket with data we can't use, so we have to flush it.

This patch also fixes a potential problem in the acceptance tests where
the error field was being returned as an array rather than a value.
2012-07-15 23:19:12 +01:00
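
Flushing the unusable payload amounts to reading and discarding the number of bytes the rejected request announced. A minimal sketch, assuming a blocking descriptor; `discard_bytes` is a hypothetical helper name and the 4096-byte scratch buffer is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads and discards exactly `len` bytes from fd.
 * Returns 0 on success, -1 on end-of-file or a read error. */
int discard_bytes(int fd, size_t len)
{
    char scratch[4096];

    assert(fd >= 0);
    while (len > 0) {
        size_t want = len < sizeof(scratch) ? len : sizeof(scratch);
        ssize_t got = read(fd, scratch, want);

        if (got < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, try again */
            }
            return -1;
        }
        if (got == 0) {
            return -1; /* peer closed before sending everything */
        }
        len -= (size_t)got;
    }
    return 0;
}
```

After a successful call the stream is positioned at the start of the next request, so the connection can carry on after the error reply.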
Alex Young
f5850e5aaf Switch from expecting a reconnection to *not* doing so
If we're aborting mirror operations early, a couple of specs need to
change sense.
2012-07-15 22:07:00 +01:00
Alex Young
10625e402b Move the mirror commit state mbox to struct control
The mirror_super signals the commit state to the control thread via an
mbox, and this mbox is moved to control.  It was owned by mirror_super,
but the problem with that is that mirror_super can free the mbox before
the control client has been scheduled to receive the message.  If it's
owned by the control object, that can't happen.
2012-07-15 21:57:36 +01:00
Alex Young
b20fbc6a66 Don't retry a mirror which failed on the first attempt
If the mirror attempt failed and we were able to report an error to the
user, it makes no sense to attempt a retry.  We don't have a way to
abort a mirror attempt yet, so if the user got a setting wrong and it's
failing for that reason, the only recourse they'd have would be to
restart the server.
2012-07-15 20:07:17 +01:00
Alex Young
a10adf007c Switch the mirror commit_signal to an mbox
At the moment, a first-pass failed migration will retry. This is wrong,
it should abort.  However, to make that happen the mirror supervisor
needs to know the commit state of the mirror thread.  With a self_pipe
mirror commit signal that information wasn't there.
2012-07-15 19:46:35 +01:00
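
The difference between a self-pipe and an mbox is that the mbox carries a value as well as waking the receiver. The following is a minimal single-slot sketch built on a pthread mutex and condition variables. It is an assumption about the general shape of such a primitive, not flexnbd's `mbox` implementation, and it passes an `int` where real code might pass a pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical single-slot mailbox. */
struct mbox {
    pthread_mutex_t lock;
    pthread_cond_t filled;  /* signalled when a value is posted */
    pthread_cond_t emptied; /* signalled when a value is taken */
    bool full;
    int value;
};

void mbox_init(struct mbox *m)
{
    assert(m != NULL);
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->filled, NULL);
    pthread_cond_init(&m->emptied, NULL);
    m->full = false;
    m->value = 0;
}

/* Stores a value, waiting first if the previous one has not been taken. */
void mbox_post(struct mbox *m, int value)
{
    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    while (m->full) {
        pthread_cond_wait(&m->emptied, &m->lock);
    }
    m->value = value;
    m->full = true;
    pthread_cond_signal(&m->filled);
    pthread_mutex_unlock(&m->lock);
}

/* Waits until a value is available, then removes and returns it. */
int mbox_receive(struct mbox *m)
{
    int value;

    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    while (!m->full) {
        pthread_cond_wait(&m->filled, &m->lock);
    }
    value = m->value;
    m->full = false;
    pthread_cond_signal(&m->emptied);
    pthread_mutex_unlock(&m->lock);
    return value;
}
```

A supervisor could then receive a hypothetical commit-state code from the mirror thread and decide whether to abort or retry, which a bare wake-up signal cannot express.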