The original idea was that we'd create a .incomplete file at the destination
for mirroring, but that code was removed some time ago. This is all dead now.
NBD doesn't actually guarantee what happens if you have two
concurrent writes to overlapping areas of the disc, and this
mutex was causing us a near-deadlock when the TCP connection
died uncleanly, partway through a request. So now we don't
bother. This actually removes the last user of the server I/O
mutex, so we can remove it completely from the codebase in a
future commit.
This removes the concept of 'passes' completely from mirror.c. It
leaves the relevant bits in mirror.h to keep status from failing,
but the current status code is now wrong. FIXME.
We also now get the previous test passing, meaning mirroring works
again.
Given our bitset_stream events, we no longer need to worry about
keeping track of the dirty map. This also lets us rip out the
server I/O lock from mirroring.
It's possible that we can remove the lock from client.c as well at
this point, but I need to have a bit more of a think about possible
races.
Currently, we prevent clients from processing requests by taking
the server I/O lock. This leads to requests hanging for a long
time before being terminated when the migration completes, which
is not ideal. With this change, at the start of the final pass,
existing clients are closed and any new connections will be closed
immediately (so no NBD server handshake will be seen).
This is part of the work required to remove the server I/O
lock completely.
If we're above max_bytes_per_second once we've finished a transfer
(8MB chunks, worst-case) then we delay the next transfer until
all_dirty_bytes / duration < max_bytes_per_second - checking once
per second.
If this isn't good enough, we can improve it - leaky bucket is one
option. To begin with, though, we'll mostly be using this to set
max_bps to either 0 or 100MB/sec or so. So it should be fine.
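A rough sketch of the delay rule described above, not the actual mirror.c
code. The function names are made up for illustration, and treating a limit
of 0 as "no limit" is an assumption. The real loop would sleep for one
second between checks; here the wait is only counted so the arithmetic can
be followed.

```c
#include <stdint.h>

/* Hypothetical helper: non-zero if the average rate so far is at or above
 * the limit, i.e. the next transfer should be delayed.
 * ASSUMPTION: max_bytes_per_second == 0 means "unlimited". */
static int transfer_over_limit(uint64_t all_dirty_bytes,
                               uint64_t duration_secs,
                               uint64_t max_bytes_per_second)
{
    if (max_bytes_per_second == 0) {
        return 0;
    }
    if (duration_secs == 0) {
        duration_secs = 1; /* avoid dividing by zero on a very fast start */
    }
    return (all_dirty_bytes / duration_secs) >= max_bytes_per_second;
}

/* Hypothetical helper: how many one-second checks pass before
 * all_dirty_bytes / duration drops below the limit. */
static uint64_t seconds_to_delay(uint64_t all_dirty_bytes,
                                 uint64_t duration_secs,
                                 uint64_t max_bytes_per_second)
{
    uint64_t waited = 0;
    while (transfer_over_limit(all_dirty_bytes, duration_secs + waited,
                               max_bytes_per_second)) {
        waited++; /* a real implementation would sleep(1) here */
    }
    return waited;
}
```

For example, after one 8MB chunk sent in one second with a 4MB/sec limit,
the average only drops below the limit once three seconds have elapsed, so
the next transfer is held back for two checks.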
The idea behind this feature was to avoid the client thread in a listen
server getting stuck forever if the mirroring thread in the source died.
However, it breaks any sane implementation of max_Bps in that thread,
and there are lingering concerns over how it might operate under normal
conditions anyway.
Specifically, if iterating over the bitmap takes a long time, or even just
reading the requisite 8MB from the disc in order to send it, then the
5-second timeout could be hit, causing mirroring to fail unnecessarily.
It's not actually honoured yet, and ideally, you'd also be able to set it as
part of the initial setup: "flexnbd mirror ... -m 4G". remote_argv for the
mirror case would need to become x=y z=w format first, though.
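A minimal sketch of what splitting one x=y argument might look like, assuming
each remote_argv entry is a writable NUL-terminated string. The function name
and the "max_bps" key used in the example are hypothetical, not existing
flexnbd names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "key=value" in place at the first '='.
 * On success *key and *value point into arg and 0 is returned.
 * Returns -1 if there is no '=' or the key is empty. */
static int split_key_value(char *arg, char **key, char **value)
{
    char *eq = strchr(arg, '=');
    if (eq == NULL || eq == arg) {
        return -1;
    }
    *eq = '\0';      /* terminate the key */
    *key = arg;
    *value = eq + 1; /* may be an empty string */
    return 0;
}
```

Given "max_bps=4G" this yields the key "max_bps" and the value "4G";
interpreting the "4G" suffix is left to the caller.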