Commit Graph

54 Commits

Author SHA1 Message Date
nick
9d9ae40953 Increase loglevel of some allocation map messages 2013-10-30 16:40:32 +00:00
nick
65d4f581b9 mirror: Clean up bps calculation slightly 2013-10-24 15:11:55 +01:00
nick
77c71ccf09 mirror: Ensure the bitset is actually disabled on mirror error 2013-10-23 16:18:00 +01:00
nick
97a923afdf mirror: Don't start migrating until the allocation map is built
There is a fun race that can happen if we begin migrating while the
allocation map is still building. We call bitset_enable_stream()
when the migration begins, which causes the builder to start putting
events into the stream. This is bad all by itself, as it slows the
migration down for no reason, but the stream is a limited-size queue
and there are situations (migration fails and is restarted) where we
can end up with the queue full and nobody able to empty it, freezing
the whole thing.
2013-10-23 15:58:47 +01:00
nick
335261869d mirror: Don't count bytes transferred for the purposes of keeping the stream empty as part of our bwlimit
This prevents a fairly nasty situation occurring where the rate of change on the disc is high enough that
just servicing it generates enough traffic to keep us over the bwlimit threshold indefinitely. That would
cause us to sleep during the only windows we'd ordinarily have to advance the offset.
2013-10-23 15:26:28 +01:00
nick
8cf9cae8c0 mirror: Don't sleep if our stream is filling up 2013-10-23 14:38:27 +01:00
nick
b177faacd6 mirror: Reduce the mirror convergence window to 5 seonds, from 60
Also remove some obsolete constants
2013-09-24 14:42:21 +01:00
nick
22f92c5df0 Fix a potential compiler warning on 32-bit 2013-09-23 17:15:47 +01:00
nick
78fc65c515 bitset: Rename bitset_stream_on/off as bitset_enable/disable_stream 2013-09-23 17:10:14 +01:00
nick
ebe6c4a8ab mirror: Remove dead code. We still rely on all_dirty in one place. 2013-09-23 14:20:05 +01:00
nick
0ae249009c serve/mirror: Move some code tracking migration speed into serve
The rationale is that status will need this information
2013-09-23 13:49:01 +01:00
nick
150e506780 flexnbd: Remove the server I/O lock as it no longer has any consumers 2013-09-23 10:29:06 +01:00
nick
6553907972 mirror: Fix mirroring, break status
This removes the concept of 'passes' completely from mirror.c,
although it leaves the relevant bits in mirror.h to keep status from
failing - although its current code is now Wrong. FIXME.

We also now get the previous test passing, meaning mirroring works
again.
2013-09-20 17:08:14 +01:00
nick
eb80c0d235 mirror: Remove server I/O lock and dirty map
Given our bitset_stream events, we no longer need to worry about
keeping track of the dirty map. This also lets us rip out the
server I/O lock from mirroring.

It's possible that we can remove the lock from client.c as well at
this point, but I need to have a bit more of a think about possible
races
2013-09-19 15:18:30 +01:00
nick
54a41aacdf bitset: Add a bitset_free() function 2013-09-11 14:41:59 +01:00
nick
487bef1f40 flexnbd: Disconnect clients at the start of a mirror last pass
Currently, we prevent clients from processing requests by taking
the server I/O lock. This leads to requests hanging for a long
time before being terminated when the migration completes, which
is not ideal. With this change, at the start of the final pass,
existing clients are closed and any new connections will be closed
immediately (so no NBD server handshake will be seen).

This is part of the work required to remove remove the server I/O
lock completely.
2013-09-10 16:03:26 +01:00
nick
0494295705 mirror: Ensure the mirror client socket is closed after a fail, and before a retry 2013-08-27 15:54:59 +01:00
nick
14fde0f2a1 mirror: Remove overly-verbose log line 2013-08-21 14:41:19 +01:00
nick
e13d1d8fb4 mirror: honour max_bytes_per_second - naive scheme
If we're above max_bytes_per_second once we've finished a transfer
(8MB chunks, worst-case) then we delay the next transfer until
all_dirty_bytes / duration < max_bytes_per_second - checking once
per second.

If this isn't good enough, we can improve it - leaky bucket is one
option. To begin with, though, we'll mostly be using this to set
max_bps to either 0 or 100MB/sec or so. So it should be fine.
2013-08-14 16:24:50 +01:00
nick
d0022402ae mirror: Start our timeout watcher from the first, not second, transfer 2013-08-14 15:29:24 +01:00
nick
cc468b0b17 control/mirror: Use uint64_t and strtoull to get max_Bps into the mirror 2013-08-13 12:30:18 +01:00
nick
45355666f7 mirror: And another abandon fix 2013-08-12 16:14:53 +01:00
nick
8a294e5ee0 mirror: fix abandon 2013-08-12 15:54:49 +01:00
nick
c6764b0de1 mirror: abandon signals are now honoured outside of the remote end being readable / writable 2013-08-12 15:30:21 +01:00
nick
c2df38c9d3 mirror: Use libev to provide an event loop inside the mirror thread
We're doing this so we can implement bandwidth controls sanely.
2013-08-09 17:02:10 +01:00
nick
f590f8ed3c status: Add migration_speed ( bytes per second ) and migration_duration( seconds ) to the migration output 2013-07-26 11:50:01 +01:00
nick
a5870b8e9b Remove a stray debugging statement 2013-07-25 10:14:14 +01:00
nick
bed8959d47 bitset: Fix large runs 2013-07-24 17:42:08 +01:00
nick
253cee5a10 flexnbd: Acknowledge new return type of bitset_run_count 2013-07-24 15:08:29 +01:00
nick
5c5636b053 flexnbd mirror: If the final run would be longer than the file size, truncate to file size
This fixes migrations of images that are not exactly divisible by 4096
2013-07-23 11:00:51 +01:00
nick
f4bfc70a4b flexnbd status: Add current pass clean/dirty byte statistics 2013-07-08 13:51:15 +01:00
nick
f556f298b1 flexnbd status: Add current migration pass to the status output if we're migrating 2013-07-08 09:58:31 +01:00
nick
26c7f1b1c4 mirror: munmap() our range on cleanup 2013-05-30 11:09:24 +01:00
nick
055836c8cb mirror: Don't undo the MADV_SEQUENTIAL hinting over the course of a migration 2013-05-30 11:06:15 +01:00
nick
76cf2dc7b9 mirror: Only say we're unlinking the file if we actually are 2013-05-30 11:05:26 +01:00
nick
a5a7d45355 flexnbd: Add more madvise() hints, both for mirroring out and normal operation.
This is hopefully going to reduce flexnbd rss
2013-05-28 14:16:49 +01:00
Alex Young
f002b8ca1f madvise after mirroring to control the RSS 2012-12-28 11:38:54 +00:00
Alex Young
22bea81445 Don't open the control socket until after the server socket is bound
This makes it easier for the tests (and supervisor) to guarantee to be
able to connect to the server socket.

Also this patch moves freeing the mirror supervisor into the server
thread.
2012-10-09 17:35:20 +01:00
Alex Young
1fa8ba82a5 Merge 2012-10-04 14:51:54 +01:00
Alex Young
f3e0d61323 Quit with an error status on SIGTERM during migration
This prevents the supervisor from thinking that the migration completed
successfully.

In order to do this, I've introduced a new lock around the start (and
finish) of the migration so that we avoid a race between the signal
handler in the server_accept loop and the control thread mirror startup.
Without that, we'd risk successfully starting a migration after the
SIGTERM handler fired, which would be Bad.
2012-10-04 14:41:55 +01:00
nick
ccbfce1075 Whitespace 2012-09-20 13:37:48 +01:00
Alex Young
33f95e1986 Add the --unlink option to mirror
This deletes the local file before tearing down the mirror connection,
allowing us to avoid an ambiguous recovery situation.
2012-07-23 13:39:27 +01:00
Alex Young
fd935ce4c9 Simplify the migration handover protocol
The three-way hand-off has a problem: there's no way to arrange for the
state of the migration to be unambiguous in case of failure.  If the
final "disconnect" message is lost (as in, the destination never
receives it whether it is sent by the sender or not), the destination
has no option but to quit with an error status and let a human sort it
out.  However, at that point we can either arrange to have a .INCOMPLETE
file still on disc or not - and it doesn't matter which we choose, we
can still end up with dataloss by picking a specific calamity to have
befallen the sender.

Given this, it makes sense to fall back to a simpler protocol: just send
all the data, then send a "disconnect" message.  This has the same
downside that we need a human to sort out specific failure cases, but
combined with --unlink before sending "disconnect" (see next patch) it
will always be possible for a human to disambiguate, whether the
destination quit with an error status or not.
2012-07-23 10:22:25 +01:00
Alex Young
10625e402b Move the mirror commit state mbox to struct control
The mirror_super signals the commit state to the control thread via an
mbox, and this mbox is moved to control.  It was owned by mirror_super,
but the problem with that is that mirror_super can free the mbox before
the control client has been scheduled to receive the message.  If it's
owned by the control object, that can't happen.
2012-07-15 21:57:36 +01:00
Alex Young
b20fbc6a66 Don't retry a mirror which failed on the first attempt
If the mirror attempt failed and we were able to report an error to the
user, it makes no sense to attempt a retry.  We don't have a way to
abort a mirror attempt yet, so if the user got a setting wrong and it's
failing for that reason, the only recourse they'd have would be to
restart the server.
2012-07-15 20:07:17 +01:00
Alex Young
a10adf007c Switch the mirror commit_signal to an mbox
At the moment, a first-pass failed migration will retry. This is wrong,
it should abort.  However, to make that happen the mirror supervisor
needs to know the commit state of the mirror thread.  With a self_pipe
mirror commit signal that information wasn't there.
2012-07-15 19:46:35 +01:00
Alex Young
5794913fdf Delete the MS_FINALISE mirror state
It's not being used for anything.
2012-07-15 18:40:50 +01:00
Alex Young
e77234c6b1 Close the mirror client socket on rejection
If the mirror attempt connects ok, but is rejected (say, for reporting
the wrong size), the client socket needs to be closed.  The destination
end can't close its socket and accept another connection attempt unless
it does.
2012-07-15 18:30:20 +01:00
Alex Young
c6e6952def Open files with O_DIRECT dependent on a compile-time DIRECT_IO #define.
O_DIRECT causes problems on (at least) a wheezy VM, and there are mixed
reports about its performance impact.  This patch makes it a
compile-time choice which should remain until it's been benchmarked.
2012-07-14 10:07:58 +01:00
Alex Young
cef2dcaad2 Rename struct mirror_status to struct mirror 2012-07-12 14:54:48 +01:00