209 Commits
0.0.1 ... 0.0.2

Author SHA1 Message Date
Alex Young
484a29b3f6 Add README.txt to the deb task code files 2012-07-16 10:29:06 +01:00
Alex Young
d0b39cce08 Flush bad write data from the client socket.
If the client makes a write that's out of range, by the time we get to
validate the message at the server end the client has already stuffed
the socket with data we can't use, so we have to flush it.

This patch also fixes a potential problem in the acceptance tests where
the error field was being returned as an array rather than a value.
2012-07-15 23:19:12 +01:00
Alex Young
f5850e5aaf Switch from expecting a reconnection to *not* doing so
If we're aborting mirror operations early, a couple of specs need to
change sense.
2012-07-15 22:07:00 +01:00
Alex Young
10625e402b Move the mirror commit state mbox to struct control
The mirror_super signals the commit state to the control thread via an
mbox, and this mbox is moved to control.  It was owned by mirror_super,
but the problem with that is that mirror_super can free the mbox before
the control client has been scheduled to receive the message.  If it's
owned by the control object, that can't happen.
2012-07-15 21:57:36 +01:00
Alex Young
b20fbc6a66 Don't retry a mirror which failed on the first attempt
If the mirror attempt failed and we were able to report an error to the
user, it makes no sense to attempt a retry.  We don't have a way to
abort a mirror attempt yet, so if the user got a setting wrong and it's
failing for that reason, the only recourse they'd have would be to
restart the server.
2012-07-15 20:07:17 +01:00
Alex Young
a10adf007c Switch the mirror commit_signal to an mbox
At the moment, a first-pass failed migration will retry. This is wrong,
it should abort.  However, to make that happen the mirror supervisor
needs to know the commit state of the mirror thread.  With a self_pipe
mirror commit signal that information wasn't there.
2012-07-15 19:46:35 +01:00
Alex Young
5794913fdf Delete the MS_FINALISE mirror state
It's not being used for anything.
2012-07-15 18:40:50 +01:00
Alex Young
e77234c6b1 Close the mirror client socket on rejection
If the mirror attempt connects ok, but is rejected (say, for reporting
the wrong size), the client socket needs to be closed.  The destination
end can't close its socket and accept another connection attempt unless
it does.
2012-07-15 18:30:20 +01:00
Alex Young
e0a61e91e6 Simplify acceptance test launching
Get rid of checking for --verbose, since it's always there now
2012-07-15 17:14:22 +01:00
Alex Young
f7379e3278 Tweak help output for the --bind option
Each option's parameter should be unique - they're instances, not
classes
2012-07-14 21:43:27 +01:00
Alex Young
a1ea2ba4c5 Add a rake task to build the man page
Also tweak the debian .install to put it in the right place.
2012-07-14 18:47:25 +01:00
Alex Young
54a1409dce Added a README.txt and a man page
Spoiler: they're the same thing. Added a `rake man` task to build the
man page.  Depends on asciidoc.
2012-07-14 18:36:02 +01:00
Alex Young
f9baa95b0f Raise the log level of a write-request-out-of-range
Without this, the error you get is a "Bad magic", when the next read
loop tries to read write data as a request.  This should be flushed from
the socket (although *when* is an open question), but upping the log
level at least gives us a more informative output.
2012-07-14 17:27:13 +01:00
Alex Young
69ad6d6b7a Only copy constants from C to Ruby once
This avoids unnecessary duplicate constant warnings for C constants that
are defined in two legs of an #ifdef.
2012-07-14 17:25:26 +01:00
Alex Young
b734a468c1 Make the --verbose flag universal
Previously, the --verbose flag was only present in debug builds. Now
it's present whether you define DEBUG or not.  What changes is the
amount of information printed to stderr: DEBUG sets the --verbose log
level to 0 (debug), while DEBUG unset sets it to 1 (info).  This makes
driving the binary slightly simpler as you don't have to detect whether
it's a debug build by scanning for "--verbose" in the help output.
2012-07-14 12:27:16 +01:00
Alex Young
768b30c4eb Clobber a dangling fprintf 2012-07-14 12:11:25 +01:00
Alex Young
1ce1003d3d Error when reading sent data fails
If the client cuts off part-way through the write, it should cause an
error, not a fatal.  Previously this happened if the open file had a
fiemap, but not if there was no allocation map.  This patch fixes that,
along with an associated valgrind error.
2012-07-14 12:10:12 +01:00
Alex Young
c6e6952def Open files with O_DIRECT dependent on a compile-time DIRECT_IO #define.
O_DIRECT causes problems on (at least) a wheezy VM, and there are mixed
reports about its performance impact.  This patch makes it a
compile-time choice which should remain until it's been benchmarked.
2012-07-14 10:07:58 +01:00
Alex Young
03c06a689d Append the CFLAGS environment variable to the build flags.
This is going to be used for the DIRECT_IO flag.
2012-07-14 10:05:35 +01:00
Alex Young
e4d2b9a667 Make test sockets less dependent on environment
It seems that ruby in a default wheezy VM can't handle a source address
of nil.
2012-07-14 10:04:55 +01:00
Alex Young
2ea5a2e38a Unlink the control socket on clean shutdown
Previously, the behaviour was to unlink any control socket sat where we
wanted to open ours.  This would make us lose control of running servers
if we happened to collide accidentally.  With this patch, the new
process will abort() if there is a control socket squatting on the
path we want, and unlink it when it closes.

This means that an unclean shutdown will leave a dangling, unattached
control socket which will block a restart, but that's a better option
than intentionally cutting off running servers.
2012-07-13 14:09:52 +01:00
Alex Young
a838714571 Tweak the fuzz script to work with the new test layout 2012-07-13 13:13:04 +01:00
Alex Young
fd8ee5b8c3 Tweak the parse_acl declaration
Array lengths don't make sense in function declarations.
2012-07-13 12:37:21 +01:00
Alex Young
15109c72d1 Add a newline to log messages at macro expansion
This simplifies building the log output because it means we don't have
to malloc a buffer to append a newline, and we keep the atomic write
property we're after.  It also takes advantage of the C constant string
concatenation which we already require to work to prepend the thread and
pid data.
2012-07-13 12:18:19 +01:00
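The technique described in the commit above relies on the compiler joining adjacent string literals. A minimal sketch of the idea, assuming a hypothetical `LOG_LINE` macro and a fixed `"[log] "` prefix (neither is taken from the flexnbd source), might look like this:

```c
#include <stdio.h>

/* Hypothetical macro, not the project's own.  The prefix, the caller's
 * format string and the trailing "\n" are adjacent string literals, so the
 * compiler joins them into one format string.  No buffer has to be
 * allocated just to append the newline. */
#define LOG_LINE(buf, size, fmt, ...) \
    snprintf((buf), (size), "[log] " fmt "\n", __VA_ARGS__)
```

Because the newline is part of the single format string, the whole line can be produced by one formatted write, which is the property the commit message says it wants to keep. Note that `fmt` must be a string literal for this to compile.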
Alex Young
9f4da5def0 Switch to use nbd_r2h_reply in read_reply()
Use a wrapper function to simplify the reply field reading.
2012-07-13 12:13:55 +01:00
Alex Young
40101e49f3 Silence a vfprintf valgrind error
Turns out that %lld causes valgrind to find an uninitialised variable
problem inside vfprintf.  Avoid it here by s/%lld/%d/.
2012-07-13 11:57:46 +01:00
Alex Young
2a50b64a43 Free the flexnbd switch mutex 2012-07-13 11:31:22 +01:00
Alex Young
00e912d0a6 Add a 'just in case' error case to acl checking 2012-07-13 10:16:44 +01:00
Alex Young
2f24d02a8f Remove unused variables
use_connect_from in control_mirror() and success in mode_serve() are no
longer used.
2012-07-13 09:34:18 +01:00
Alex Young
2e4e592c08 Enable writing after the 2G boundary
This patch fixes a bug in readwrite.c which truncated the 'from' field
in nbd requests.  It was casting them down from an off64_t to an int.
2012-07-12 18:01:10 +01:00
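The class of bug fixed in the commit above can be illustrated with a small sketch. This is not the code from readwrite.c: the function names are hypothetical, and fixed-width types stand in for `off64_t` and `int` so that the behaviour does not depend on the platform.

```c
#include <stdint.h>

/* Hypothetical stand-in for the buggy path: the 64-bit 'from' offset is
 * narrowed to 32 bits before use.  An unsigned target type is used here so
 * the narrowing is well defined (it keeps the low 32 bits). */
static int64_t offset_after_narrowing(int64_t from)
{
    uint32_t low_bits = (uint32_t)from;
    return (int64_t)low_bits;
}

/* The fixed path keeps the offset in a 64-bit type throughout. */
static int64_t offset_preserved(int64_t from)
{
    return from;
}
```

With an offset of 4 GiB + 512, the narrowed version yields 512, so a write aimed past the boundary would land near the start of the file instead.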
Alex Young
cef2dcaad2 Rename struct mirror_status to struct mirror 2012-07-12 14:54:48 +01:00
Alex Young
c6a084ce82 Add a --quiet command-line option
--quiet will suppress all log lines except FATAL.  Conceptually it's
exclusive with --verbose, but this isn't checked - last one wins.
2012-07-12 14:45:55 +01:00
Alex Young
10b46beeea Retry failed rebind attempts
When we receive a migration, if rebinding to the new listen address and
port fails for a reason which might be fixable, rather than killing the
server we retry once a second.  Also in this patch: non-overlapping log
messages and a fix for the client going away halfway through a sendfile
loop.
2012-07-12 14:14:46 +01:00
Alex Young
9002341e77 Fix the broken --rebind-port command-line option. 2012-07-12 10:45:19 +01:00
Alex Young
71b7708964 Minor tidy 2012-07-12 10:22:31 +01:00
Alex Young
eb90308b6e Handle a failed disconnect correctly
If the sender disconnects its socket before sending the disconnect
message, the destination should restart the migration process.  This
patch makes sure that happens.
2012-07-12 09:39:39 +01:00
Alex Young
f3cebcdcd5 Test a source crashing after an entrust.
This adds a test for destination behaviour, in that if a source crashes
after sending an entrust message but before the destination can reply,
the destination must allow the source to reconnect and retry the mirror.
2012-07-11 15:19:50 +01:00
Alex Young
84dd052465 Fix a test broken by stdout/stderr reshuffle 2012-07-11 10:12:10 +01:00
Alex Young
f3f017a87d Free all possibly held mutexes in error handlers
Now that we have 3 mutexes lying around, it's important that we check
and free these if necessary if error() is called in any thread that can
hold them.  To do this, we now have flexthread.c, which defines a
flexthread_mutex struct.  This is a wrapper around a pthread_mutex_t and
a pthread_t.  The idea is that in the error handler, the thread can
check whether it holds the mutex and can free it if and only if it does.
This is important because pthread fast mutexes can be freed by *any*
thread, not just the thread which holds them.

Note: it is only ever safe for a thread to check if it holds the mutex
itself.  It is *never* safe to check if another thread holds a mutex
without first locking that mutex, which makes the whole operation rather
pointless.
2012-07-11 09:43:16 +01:00
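The ownership check described in the commit above could be sketched roughly as follows. The commit names its struct `flexthread_mutex`; the struct layout, the extra `held` flag and all function names here are assumptions made for illustration, not the contents of flexthread.c.

```c
#include <pthread.h>

/* Hypothetical wrapper: a mutex plus a record of which thread holds it. */
struct owned_mutex {
    pthread_mutex_t mutex;
    pthread_t holder;   /* only meaningful while held is non-zero */
    int held;
};

static int owned_mutex_init(struct owned_mutex *m)
{
    m->held = 0;
    return pthread_mutex_init(&m->mutex, NULL);
}

static int owned_mutex_lock(struct owned_mutex *m)
{
    int rc = pthread_mutex_lock(&m->mutex);
    if (rc == 0) {
        /* Record ownership only after the lock has been acquired. */
        m->holder = pthread_self();
        m->held = 1;
    }
    return rc;
}

/* Only ever call this to ask about the calling thread itself. */
static int owned_mutex_held_by_self(const struct owned_mutex *m)
{
    return m->held && pthread_equal(m->holder, pthread_self());
}

/* Intended for error handlers: returns 1 if the mutex was released,
 * 0 if the calling thread did not hold it. */
static int owned_mutex_unlock_if_held(struct owned_mutex *m)
{
    if (!owned_mutex_held_by_self(m)) {
        return 0;
    }
    m->held = 0;
    pthread_mutex_unlock(&m->mutex);
    return 1;
}
```

An error handler would call `owned_mutex_unlock_if_held()` on each mutex the thread might hold. A thread that does not hold the mutex leaves it alone, which avoids releasing a lock that belongs to another thread.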
Alex Young
17fe6d3023 Test that a blocked entrust causes a retry 2012-07-03 18:00:31 +01:00
Alex Young
061512f3dc Test that a write reply with the wrong magic will force a retry 2012-07-03 17:01:39 +01:00
Alex Young
5c66d35677 Test that closing the socket immediately after sending write data causes an error 2012-07-03 15:33:00 +01:00
Alex Young
d16aebf36e Test that a disconnect after the write request but before the data is an error 2012-07-03 15:25:39 +01:00
Alex Young
a767d4bc8c Test the source handles a dest crash after write correctly 2012-07-03 14:52:27 +01:00
Alex Young
64ebbe7688 Refactor FakeSource from a module to a class 2012-07-03 14:39:05 +01:00
Alex Young
ded4914c84 Simplified FlexNBD::FakeDest 2012-07-03 14:23:20 +01:00
Alex Young
9e67f228f0 Rename a test class 2012-07-03 13:35:47 +01:00
Alex Young
2283b99834 Split acceptance tests into separate files 2012-07-03 13:33:52 +01:00
Alex Young
988b2ec014 Moved acceptance tests into tests/acceptance 2012-07-03 10:59:31 +01:00
Alex Young
c0c9c6f076 Moved unit tests into tests/unit 2012-07-03 10:53:08 +01:00
Alex Young
e817129c47 Changes to error severity in readwrite.c made a test fail, this patch fixes it 2012-07-02 18:10:02 +01:00
Alex Young
cc2e67d4bb Test that an invalid write gets an error response 2012-07-02 15:37:52 +01:00
Alex Young
ea4642a878 Check that a mirror write returning an error will cause a reconnect and retry 2012-07-02 15:04:45 +01:00
Alex Young
99f8c24b01 Tweak a timeout to prevent an intermittent test failure 2012-07-02 13:00:30 +01:00
Alex Young
9850f5d0a4 Test that timing out a write causes a disconnect and a reconnect 2012-06-28 14:45:53 +01:00
Alex Young
4de4cee3d0 Test for acl rejection 2012-06-28 13:29:22 +01:00
Alex Young
c9fdd5a60e Handle ECONNRESET during a read request 2012-06-28 11:46:02 +01:00
Alex Young
9b717d6391 Factor common code out of fake destinations 2012-06-28 11:34:36 +01:00
Alex Young
192471ee82 Factor common code out of the test fake sources
* * *
More fake source refactoring
2012-06-27 17:28:24 +01:00
Alex Young
137a764cc7 Add a test for a second client connecting during a mirror 2012-06-27 16:32:01 +01:00
Alex Young
cea9d97086 Missing file 2012-06-27 16:19:13 +01:00
Alex Young
04a10179a0 check_acl correctly sets log_level 2012-06-27 16:18:38 +01:00
Alex Young
ac3e6692a8 make sure that an invalid flexnbd signal fd can't break the serve accept loop 2012-06-27 16:17:51 +01:00
Alex Young
94b4fa887c Add mboxes 2012-06-27 15:45:33 +01:00
Alex Young
2078d17053 connect failure scenarios 2012-06-22 10:05:41 +01:00
Alex Young
80f298f6cd Make non-fatal errors return properly 2012-06-21 18:01:56 +01:00
Alex Young
f37a217cb9 Add listen mode 2012-06-21 18:01:50 +01:00
Alex Young
79ba1cf728 Make max_nbd_clients configurable per struct server 2012-06-21 17:22:34 +01:00
Alex Young
e21beb1866 Add the REQUEST_ENTRUST nbd request type 2012-06-21 17:12:06 +01:00
Alex Young
a3dc670939 Squash valgrind errors by making sure client threads get joined on termination 2012-06-21 17:11:12 +01:00
Alex Young
bafc3d3687 Make sure filename_incomplete gets freed 2012-06-21 15:58:32 +01:00
Alex Young
322eae137b Add a missed free() 2012-06-21 15:55:48 +01:00
Alex Young
43e95dc4db Make sure all the lines we read get freed (including the trailing blank) 2012-06-21 15:31:28 +01:00
Alex Young
cc22f50fe6 Avoid a use-after-free in serve.c 2012-06-21 14:15:58 +01:00
Alex Young
c054403208 Trim the length bitset_run_count looks at not to exceed the bits array 2012-06-21 12:05:01 +01:00
Alex Young
80fff4e0e6 Squash a valgrind error caused by debug output 2012-06-21 11:55:21 +01:00
Alex Young
4e8a9670e5 Merge 2012-06-21 11:37:18 +01:00
Alex Young
e3c04ade29 Added early-exit on any valgrind error 2012-06-21 11:37:00 +01:00
Alex Young
ed3090d6d5 Tweak struct initialisation to squash a valgrind error 2012-06-21 10:29:06 +01:00
Alex Young
50b0db7bf6 Reject mirroring if the remote size doesn't match the local size 2012-06-13 15:51:37 +01:00
Alex Young
c9ece5a63f Tidy mirror_runner somewhat 2012-06-13 15:45:59 +01:00
Alex Young
c2b6fac92d Fix an argv array reference (root cause of a bug from the last commit) 2012-06-13 13:52:15 +01:00
Alex Young
7d1c15b07a Fix two bugs in mirroring.
First, leaving off the source address caused a segfault in the
command-sending process because there was no NULL check on the ARGV
entry.

Second, while the migration thread sent a signal to the server to close
on successful completion, it didn't wait until the close actually
happened before releasing the IO lock.  This meant that any client
thread waiting on that IO lock could have a read or a write queued up
which could succeed despite the server shutdown.  This would have meant
dataloss as the guest would see a successful write to the wrong instance
of the file.  This patch adds a noddy serve_wait_for_close() function
which the mirror_runner calls to ensure that any clients will reject
operations they're waiting to complete.

This patch also adds a simple scenario test for migration, and fixes
TempFileWriter#read_original.
2012-06-13 13:44:21 +01:00
Alex Young
b986f6b63e Take _GNU_SOURCE out of source and put it in CFLAGS 2012-06-13 09:59:08 +01:00
Alex Young
c7525f87dc Removed proxying completely and fixed the pthread_join bug revealed in the process 2012-06-12 15:08:07 +01:00
Alex Young
2a71b4e7a4 Fix broken error checking around pthread functions 2012-06-11 16:08:19 +01:00
Alex Young
5996c8f7ba Simplify a FATAL_IF_NEGATIVE 2012-06-11 15:31:59 +01:00
Alex Young
4c52bcd870 Make the error and fatal functions swallow semicolons properly 2012-06-11 15:26:42 +01:00
Alex Young
13a6a403a4 Make the error and fatal macros swallow semicolons properly 2012-06-11 15:23:06 +01:00
Alex Young
83b8b9eaac Add general-purpose ERROR/FATAL_IF and ERROR/FATAL_UNLESS macros 2012-06-11 15:20:05 +01:00
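The three commits above concern macros that "swallow semicolons". The usual C idiom for this is to wrap the macro body in `do { ... } while (0)`. The sketch below shows that idiom with a hypothetical `REPORT_IF` macro; it is not the project's `ERROR_IF` or `FATAL_IF` definition, which this page does not show.

```c
#include <stdio.h>

static int error_count = 0;

/* Hypothetical macro.  The do/while(0) wrapper makes the expansion a single
 * statement that needs a trailing semicolon, so it behaves like a function
 * call even inside an unbraced if/else. */
#define REPORT_IF(cond, msg)                            \
    do {                                                \
        if (cond) {                                     \
            fprintf(stderr, "error: %s\n", (msg));      \
            error_count++;                              \
        }                                               \
    } while (0)

static int classify(int value)
{
    /* With a bare { ... } body the semicolon after the macro call would
     * end the if statement and orphan the else. */
    if (value < 0)
        REPORT_IF(1, "negative value");
    else
        return 1;
    return 0;
}
```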
Alex Young
c6182b9edf Merge 2012-06-11 14:59:52 +01:00
Alex Young
e2d3161a4a Set default log level to warn to shut the tests up 2012-06-11 14:59:26 +01:00
nick
8513144354 Automated merge with ssh://dev/flexnbd-c 2012-06-11 14:40:53 +01:00
nick
5ab9e10019 test: make check_serve bind() its outgoing socket to a known IP for these tests 2012-06-11 14:40:41 +01:00
Alex Young
710d8254d4 Make sure all ifs are braced 2012-06-11 14:34:17 +01:00
Alex Young
25fc0969cf Make the compiler stricter and tidy up code to make the subsequent errors and warnings go away 2012-06-11 13:57:03 +01:00
Alex Young
8825f86726 Merge 2012-06-11 13:49:56 +01:00
Alex Young
b5427d13db Explicitly check for which fd is acceptable in server_accept 2012-06-11 13:49:35 +01:00
nick
893db71d7c Whitespace 2012-06-11 13:05:22 +01:00
nick
224bdcbf87 Fix handling ACLs where > 1 entry exists 2012-06-11 12:56:45 +01:00
nick
0b90517035 tests: Get rid of a warning 2012-06-11 10:08:24 +01:00
nick
0441ef9d74 tests: Get check_serve working after the merge of doom 2012-06-11 10:04:31 +01:00
Matthew Bloch
e8b5fae7ab Merge, just renaming old error macros. 2012-06-09 02:37:23 +01:00
Matthew Bloch
b546539ab8 Rewrote error & log functions to be more general, use longjmp to get out of
trouble and into predictable cleanup functions (one for each of serve,
client & control contexts).  We use 'fatal' to mean 'kill the thread' and
'error' to mean 'don't kill the thread', assuming some recovery action,
except I don't use error anywhere yet.
2012-06-09 02:25:12 +01:00
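The pattern of using longjmp to reach a predictable cleanup point, as described in the commit above, can be sketched in a few lines. This is a single-threaded illustration with hypothetical names; the real code has one cleanup function per context (serve, client and control), and this page does not show how it stores its jump buffers.

```c
#include <setjmp.h>

static jmp_buf cleanup_point;   /* hypothetical: one jump target per context */
static int cleanup_ran = 0;

/* Stand-in for a 'fatal' call: abandon the current work and jump to cleanup. */
static void fatal_sketch(void)
{
    longjmp(cleanup_point, 1);
}

static void risky_work(int should_fail)
{
    if (should_fail) {
        fatal_sketch();
    }
}

/* Returns 0 if the work completed, -1 if it bailed out via longjmp. */
static int run_with_cleanup(int should_fail)
{
    if (setjmp(cleanup_point) == 0) {
        risky_work(should_fail);
        return 0;
    }
    /* Reached only through longjmp: release resources here. */
    cleanup_ran++;
    return -1;
}
```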
Matthew Bloch
8691533d88 Added hopeful default path to find rake_utils, turned undefined function
warnings into errors, and added expensive header scanning to .c->.o rule to
ensure changes to .h files cause recompiles as you'd expect.
2012-06-09 02:17:34 +01:00
Alex Young
b7096ef908 Audit client connections on acl update 2012-06-08 18:03:41 +01:00
Alex Young
35ca93b42c Lock around acl updates 2012-06-08 11:02:40 +01:00
Alex Young
f7e1a098b1 Move updating the acl object into serve.c
* * *
Replacing the server acl sends an acl_updated signal
2012-06-08 10:32:33 +01:00
Alex Young
5fb0cd4cca Fix O_NONBLOCK setting on self_pipes 2012-06-08 10:11:06 +01:00
Alex Young
2d9d00b636 Pull ACLs into their own struct 2012-06-07 17:47:43 +01:00
Alex Young
601e5b475a Tidy the NULLCHECK macro to swallow semicolons properly 2012-06-07 16:00:38 +01:00
Alex Young
c628435f77 Fix an invalid define symbol 2012-06-07 15:59:13 +01:00
Alex Young
1cd8f4660f Merge of doom 2012-06-07 14:40:55 +01:00
Alex Young
5930f25034 Use client stop signals for thread stopping 2012-06-07 14:25:30 +01:00
Matthew Bloch
40f0f9fab6 Big bit of debug output in write_not_zeroes (disabled). 2012-06-07 12:28:21 +01:00
Matthew Bloch
d763ab4e74 Fixed bug in bitset_run_count which was causing data corruption writing
around sparse boundaries.
2012-06-07 12:27:46 +01:00
Matthew Bloch
3810a8210f Added some record-keeping / printing to fuzzer to assist with backtracking. 2012-06-07 12:25:56 +01:00
Alex Young
a90f84972b Add stop signals to client threads 2012-06-07 11:44:19 +01:00
Matthew Bloch
5710431780 Refactored write_not_zeroes to use struct bitset_mapping instead of
repeating all that code (has not fixed earlier bug yet, but lots of
repetition cut).
2012-06-07 11:17:02 +01:00
Matthew Bloch
08f3d42b34 Improved fuzz test to find an actual code bug (previous bug was in the test
<g>).
2012-06-07 02:06:08 +01:00
Matthew Bloch
9fc3c061f8 Fixed arguments to debug function. 2012-06-07 01:15:29 +01:00
Matthew Bloch
8cf1a515dd Missing break; in switch statement (verbose was setting default deny!) 2012-06-07 00:01:11 +01:00
Alex Young
cfa9f9c71f Fix the sense of client_serve_request 2012-06-06 14:25:35 +01:00
Alex Young
e8b47d5855 Remove the accept lock as being unneeded 2012-06-06 14:07:55 +01:00
Alex Young
1fc76ad77f Merge 2012-06-06 13:44:49 +01:00
Alex Young
16001eb9eb Move checking for a closed client out of server_lock_io and into client_serve_request 2012-06-06 13:44:38 +01:00
nick
648f768ff6 tests: fix the Ruby flexnbd wrapper for mirror 2012-06-06 13:33:24 +01:00
Alex Young
1b289a0e87 Change io lock and unlock to server error on failure 2012-06-06 13:29:13 +01:00
Alex Young
9dbc0a31a8 Better error message 2012-06-06 13:19:24 +01:00
Alex Young
339e766339 Use self_pipe for close_signal 2012-06-06 12:41:03 +01:00
nick
14c9468b68 Automated merge with ssh://dev/flexnbd-c 2012-06-06 12:35:18 +01:00
nick
7544a59da1 mirror: Add --bind to our mirror mode.
Mirroring doesn't actually work yet, of course.
2012-06-06 12:35:01 +01:00
nick
f4a403842d flexnbd: Fix specifying -d as --default-deny on the command line 2012-06-06 12:07:40 +01:00
Alex Young
457987664a Renamed struct client_params to struct client 2012-06-06 11:33:17 +01:00
Alex Young
40279bc9ca Split client-specific code into client.{c,h} 2012-06-06 11:27:52 +01:00
Alex Young
d22471d195 Fix a \#define symbol 2012-06-06 10:55:50 +01:00
Alex Young
a80c5ce6b5 Moved sockaddr_address_data to serve.c and renamed params.h to serve.h 2012-06-06 10:45:07 +01:00
Alex Young
cc97dd4842 Rename control to control_fd and struct mode_serve_params to struct server 2012-06-06 10:35:50 +01:00
Alex Young
a0990b824c Merge 2012-06-06 10:24:33 +01:00
Alex Young
d7fa05d42c Backed out changeset 0cbb8e9cf515 because it breaks deb packaging. 2012-06-06 10:24:04 +01:00
Alex Young
78b1879cab Merge 2012-06-06 10:19:59 +01:00
Alex Young
059be22c27 Rename int server to int server_fd in mode_serve_params 2012-06-06 10:19:45 +01:00
nick
15513c03df Remove a duplicated line due to the last merge 2012-06-06 10:05:12 +01:00
nick
682f3c70ef Automated merge with ssh://dev/flexnbd-c 2012-06-06 10:03:46 +01:00
nick
3e0628e2fc flexnbd: Re-add --sock to flexnbd mirror 2012-06-06 09:55:47 +01:00
nick
8a2fd06c31 flexnbd: Add --bind to flexnbd read and flexnbd write 2012-06-06 09:55:08 +01:00
Matthew Bloch
60cb089e45 Added fuzzer which currently exposes ugly bug with unaligned writes. 2012-06-06 01:28:54 +01:00
Matthew Bloch
d981dde8d1 Fixed FlexNBD#serve parameters, added detection of non-starting server. 2012-06-06 01:28:30 +01:00
Matthew Bloch
2245385117 Added msync() call after every write - not sure whether it's necessary yet. 2012-06-06 01:27:37 +01:00
Matthew Bloch
29151b8a78 Isolated missing library code to pkg:deb task - couldn't locate library code
(must be available from Debian, or bundled).
2012-06-05 23:46:28 +01:00
Alex Young
d87d7a826f Rename the 'debug' cli option 'verbose' and switch default-deny from 'D' to 'd' 2012-06-01 16:58:32 +01:00
Alex Young
8511cacb03 Make sure the -d short option is honoured 2012-06-01 16:47:34 +01:00
Alex Young
29937cdcf9 Merge 2012-06-01 16:25:41 +01:00
Alex Young
1ddb3bb609 Add a self_pipe set of convenience functions 2012-06-01 16:25:27 +01:00
Alex Young
91ab715659 Indentation fix 2012-06-01 16:24:50 +01:00
nick
b985e97098 Automated merge with ssh://dev/flexnbd-c 2012-06-01 14:51:43 +01:00
nick
04d67b3bab acls: Add a default-deny option, which allows you to specify what an empty ACL means.
When this option is specified, an empty ACL means "reject all clients". Without it,
an empty ACL means "accept all clients"
2012-06-01 14:48:34 +01:00
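The rule in the commit above fits in one small function. The sketch below is an assumption-laden illustration: the function name is hypothetical, and entries are compared as exact strings, which is simpler than whatever address matching the real ACL code performs.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the client may connect, 0 otherwise. */
static int acl_permits(const char *const *entries, int n_entries,
                       const char *client_ip, int default_deny)
{
    int i;

    if (n_entries == 0) {
        /* Empty ACL: the default-deny option alone decides the outcome. */
        return !default_deny;
    }
    for (i = 0; i < n_entries; i++) {
        if (strcmp(entries[i], client_ip) == 0) {
            return 1;
        }
    }
    /* Non-empty ACL: only listed clients are permitted. */
    return 0;
}
```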
Alex Young
9dbb107bf8 Use nbdtypes to write the nbd hello message 2012-05-31 20:33:42 +01:00
Alex Young
17ed766c74 Null-terminated strings strike again 2012-05-31 18:04:57 +01:00
Alex Young
185a840e03 Factor out the bulk of client_serve_request, and add convenience converters in src/nbdtypes.c 2012-05-31 17:44:11 +01:00
Alex Young
949d7d6a72 Don't check for the INCOMPLETE file on read 2012-05-31 14:11:57 +01:00
Alex Young
1aec12613c Ditch a couple of unneeded variables to silence gcc warnings 2012-05-31 14:09:35 +01:00
Alex Young
b90b73fba6 build and default rake tasks, because I keep trying to type them 2012-05-31 14:01:49 +01:00
Alex Young
49c4ef7c56 Add .orig merge files to .hgignore 2012-05-31 13:55:35 +01:00
Alex Young
81fe41f016 Merge 2012-05-31 13:53:21 +01:00
Alex Young
074efd9fa4 Add a no-op debug() define for non-debug builds and make valgrind optional in nbd_scenarios 2012-05-31 13:53:04 +01:00
Alex Young
c2d1414bff Merge 2012-05-31 13:32:56 +01:00
Alex Young
623a398767 Add a --debug flag for DEBUG builds
If you compile with:

  DEBUG=true rake build

then all the commands get a --debug flag as an option which will make
the server dump crazy amounts of data to stderr.
2012-05-31 13:31:22 +01:00
Alex Young
268bebd408 Run the nbd_scenario tests under valgrind 2012-05-31 13:23:12 +01:00
nick
71e755906b Make the Rakefile take note of DEBUG= 2012-05-31 12:12:32 +01:00
nick
e863bffe3d Set TCP_NODELAY on our socket. This decreases average NBD read request RTT from 0.3ms to 0.001ms 2012-05-31 11:33:31 +01:00
mbloch
c6dd4fbd89 Merge 2012-05-30 20:14:14 +01:00
mbloch
cd976d1c2c Fixed short copies of struct sockaddr (it's shorter than sockaddr_in6!)
which was giving duff results when comparing IPv6 ACL entries.
2012-05-30 20:13:56 +01:00
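The problem named in the commit above is that copying `sizeof(struct sockaddr)` bytes cuts off the tail of an IPv6 address. One common way to avoid it is to choose the copy length from the address family. The helper names below are hypothetical and the sketch is not the project's fix.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: the real size of the address behind a sockaddr. */
static size_t sockaddr_len_for_family(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return sizeof(struct sockaddr_in);
    case AF_INET6:
        return sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}

/* Copies the whole address into storage large enough for any family.
 * Returns 0 on success, -1 for an unsupported family. */
static int copy_sockaddr(struct sockaddr_storage *dst,
                         const struct sockaddr *src)
{
    size_t len = sockaddr_len_for_family(src);

    if (len == 0) {
        return -1;
    }
    memset(dst, 0, sizeof *dst);
    memcpy(dst, src, len);
    return 0;
}
```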
Alex Young
42599fe01e Make sure we build arch-specific packages 2012-05-30 18:11:32 +01:00
Alex Young
f21dd9e888 Basic debian packaging
Add a build dependency on rake_utils, but we get simple debian packages
out of it.
2012-05-30 17:35:07 +01:00
Alex Young
15c3133458 Simplify option definition with some handy macros
Alex Young
fe08084144 Added tag 0.0.1 for changeset 27409c2c1313 2012-05-30 17:11:10 +01:00
Alex Young
0102217019 Merge 2012-05-30 15:39:55 +01:00
Alex Young
0c62e66a70 Added getopt_long command-line handling.
All parameters now have switches.  The one gotcha is the parameter which
was overloaded - s_length_or_filename to params_readwrite - is only
pretending to be a length at the moment. If you pass a filename it'll
still work, but the help messages don't mention that.  I'll split the
parameter into two in a later commit.
2012-05-30 15:19:40 +01:00
Alex Young
a01621dc1e Added .h files to the Rakefile 2012-05-30 15:06:06 +01:00
mbloch
6d8afd1035 Fixed bug where ACL was accidentally deleted when being set from control
socket.
2012-05-30 13:03:02 +01:00
nick
46ceb85aec Fix the usage message 2012-05-30 11:28:32 +01:00
Alex Young
7832958522 Rearranged the project to have src/ and build/ directories
This simplifies keeping everything clean.
2012-05-30 09:51:20 +01:00
mbloch
cf2400fedd Fixed race in tests. 2012-05-29 17:01:54 +01:00
Matthew Bloch
21ccd17ea5 Added .INCOMPLETE hack to aid with marking finished transfers. 2012-05-29 11:24:24 +01:00
Matthew Bloch
ab0dfb5eca Added mirror write barrier / final pass stuff & clean exit afterwards.
Plenty of code documentation.
2012-05-29 04:03:28 +01:00
mbloch
dcb1633b8b Lots of errors spotted by Alex fixed, added mutexes to accept & I/O, added
"remote" commands to set ACL, start mirror etc.
2012-05-29 00:59:12 +01:00
Matthew Bloch
c54d4a68ba Added another write/read test, fixed bugs in splice() usage and IPv6
socket handling.
2012-05-27 14:40:16 +01:00
Matthew Bloch
5a5041a751 First few external tests with test/unit, some minor tidying of internal data
structures.
2012-05-24 01:39:35 +01:00
mbloch
d5d6e0f55d Pulled some duplicated code out of control.c into
read_lines_until_blankline.
2012-05-23 14:03:30 +01:00
Matthew Bloch
9c26f7f36f Split control-socket functions into separate file. 2012-05-23 00:42:14 +01:00
Matthew Bloch
811e4ab2cd Fixed mirroring to work (error reporting suspect though). 2012-05-22 00:22:06 +01:00
Matthew Bloch
7eaf5c3fd3 Initial, untested mirror implementation and resolved some type confusion
around struct ip_and_mask pointers (no idea how it worked before).  Added a
header for readwrite.h used in mirror implementation.
2012-05-21 04:03:17 +01:00
Matthew Bloch
cd6e878673 More valgrind-found bugs, extracted open_and_mmap from main code. 2012-05-21 04:00:45 +01:00
Matthew Bloch
43239feb38 Fixed some uninitialised variables courtesy of valgrind. 2012-05-21 03:59:43 +01:00
Matthew Bloch
f7ce2c0ea5 Mostly finished bitset tests, fixed test build to include utilities, remove
efence as valgrind far preferable.
2012-05-21 03:17:32 +01:00
Matthew Bloch
c94b6f365c Tweaks to bitset.h, established a C test framework. 2012-05-20 14:38:46 +01:00
Matthew Bloch
8a38cf48eb Fixed segfaulting access control, allowed change to acl via control socket. 2012-05-19 12:48:03 +01:00
Matthew Bloch
580b821f61 Added dummy control socket answering / changed serve_accept_loop to use
select() to avoid a separate listening thread.
2012-05-18 23:39:16 +01:00
mbloch
b533e4e31c Added control socket, doesn't do anything yet. 2012-05-18 18:44:34 +01:00
Matthew Bloch
f5d8e740f8 Added .hgignore file 2012-05-18 13:25:54 +01:00
Matthew Bloch
ca53d6f270 Stopped NBD writes from committing all-zero blocks to disc (tentative, needs
further testing).
2012-05-18 13:24:35 +01:00
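Skipping all-zero blocks, as the commit above describes, depends on being able to tell whether a buffer contains only zero bytes. A minimal sketch of such a check follows; the function name is hypothetical and the project's own approach may differ.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in buf is zero, else 0.
 * An empty buffer counts as all-zero. */
static int block_is_all_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

A write path could call this on each block and leave a hole in the file instead of writing zeroes, which is one way to preserve sparsity.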
Matthew Bloch
0432fef8f5 Split code out into separate compilation units (first pass, anyway). 2012-05-17 20:14:22 +01:00
Matthew Bloch
aec90e5244 Non-functioning commit, half-way through adding sparse bitmap feature. 2012-05-17 11:54:25 +01:00
Matthew Bloch
f688d416a5 Added write mode. 2012-05-16 11:58:41 +01:00
Matthew Bloch
b1aa942b3d Added working read via splice syscall. 2012-05-16 03:20:09 +01:00
mbloch
c796a526d0 Added Rakefile 2012-05-16 01:27:14 +01:00
mbloch
c6099f78ea Silly bug fixes, added ACL support, added parser for read/write requests. 2012-05-15 18:40:58 +01:00
Matthew Bloch
94c2d44d7d Some debugging, got it to serve. 2012-05-15 03:16:19 +01:00
100 changed files with 11523 additions and 423 deletions

9
.hgignore Normal file

@@ -0,0 +1,9 @@
.o$
~$
^flexnbd$
^build/
^pkg/
\.orig$
.*\.swp$
cscope.out$
valgrind.out$

370
README.txt Normal file

@@ -0,0 +1,370 @@
FLEXNBD(1)
==========
:doctype: manpage
NAME
----
flexnbd - A fast NBD server
SYNOPSIS
--------
*flexnbd* 'COMMAND' ['OPTIONS']
DESCRIPTION
-----------
Flexnbd is a fast NBD server which supports live migration. Live
migration is performed by writing the data to a new server. A failed
migration will be invisible to any connected clients.
Flexnbd tries quite hard to preserve sparsity of files it is serving,
even across migrations.
COMMANDS
--------
serve
~~~~~
$ flexnbd serve --addr <ADDR> --port <PORT> --file <FILE>
[--sock <SOCK>] [--default-deny] [global option]* [acl entry]*
Serve a file. If any ACL entries are given (which should be IP
addresses), only those clients listed will be permitted to connect.
flexnbd will continue to serve until a SIGINT, SIGQUIT, or a successful
migration.
Options
^^^^^^^
*--addr, -l ADDR*:
The address to listen on. Required.
*--port, -p PORT*:
The port to listen on. Required.
*--file, -f FILE*:
The file to serve. Must already exist. Required.
*--sock, -s SOCK*:
Path to a control socket to open. You will need this if you want to
migrate, get the current status, or manipulate the access control
list.
*--default-deny, -d*:
How to interpret an empty ACL. If --default-deny is given, an
empty ACL will let no clients connect. If it is not given, an
empty ACL will let any client connect.
listen
~~~~~~
$ flexnbd listen --addr <ADDR> --port <PORT> --file <FILE>
[--rebind-addr <REBIND-ADDR>] [--rebind-port <REBIND-PORT>]
[--sock <SOCK>] [--default-deny] [global option]* [acl entry]*
Listen for an inbound migration, then serve it as normal once it has
completed.
flexnbd will wait for a successful migration, and then switch into
'serve' mode. The file to write the inbound migration data to must
already exist before you run 'flexnbd listen'.
Only one sender may connect to send data, and the server is not
available to clients while the migration is taking place.
If the sender disconnects part-way through the migration, the
destination will expect it to reconnect and retry the whole migration.
It isn't safe to assume that a partial migration can be resumed because
the destination has no knowledge of whether a client has made a write to
the source in the interim.
To support transparently replacing an existing server, flexnbd can
switch addresses once it has received a successful migration.
Options
^^^^^^^
As for 'serve', with these additions:
*--rebind-addr, -L REBIND_ADDR*:
The address to rebind to once migration has completed.
*--rebind-port, -P REBIND_PORT*:
The port to rebind to once migration has completed.
Either, both, or neither of --rebind-port and --rebind-addr may be given.
If rebinding fails, flexnbd will retry every second until it succeeds.
mirror
~~~~~~
$ flexnbd mirror --addr <ADDR> --port <PORT> --sock <SOCK>
[--bind <BIND-ADDR>] [global option]*
Start a migration from the server with control socket SOCK to the server
listening at ADDR:PORT.
Migration can be a slow process. Rather than blocking until the
migration completes, 'flexnbd mirror' exits with a message of "Migration
started" once it has confirmation that the local server was able to
connect to ADDR:PORT and got an NBD header back. To check on the
progress of a running migration, use 'flexnbd status'.
If the destination unexpectedly disconnects part-way through the
migration, the source will attempt to reconnect and start the migration
again. It is not safe to resume the migration from where it left off
because the source can't see that the backing store behind the
destination is intact, or even on the same machine.
Note: files smaller than 4096 bytes cannot be migrated.
Options
^^^^^^^
*--addr, -l ADDR*:
The address of the remote server to migrate to. Required.
*--port, -p PORT*:
The port of the remote server to migrate to. Required.
*--sock, -s SOCK*:
The control socket of the local server to migrate from. Required.
*--bind, -b BIND-ADDR*:
The local address to bind to. You may need this if the remote server
is using an access control list.
acl
~~~
$ flexnbd acl --sock <SOCK> [acl entry]+ [global option]*
Set the access control list of the server with the control socket SOCK
to the given access control list entries.
ACL entries are given as IP addresses.
Options
^^^^^^^
*--sock, -s SOCK*:
The control socket of the server whose ACL to replace.
status
~~~~~~
$ flexnbd status --sock <SOCK> [global option]*
Get the current status of the server with control socket SOCK.
The status will be printed to STDOUT. It is a space-separated list of
key=value pairs. The space character will never appear in a key or
value. Currently reported values are:
*is_mirroring*:
'true' if this server is sending migration data, 'false' otherwise.
*has_control*:
'false' if this server was started in 'listen' mode and has not yet
received a successful migration. 'true' otherwise.
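Because neither keys nor values contain spaces, the status line is easy to pick apart. The helper below is a minimal sketch; `status_get` is a hypothetical name and is not part of flexnbd.

```c
#include <stddef.h>
#include <string.h>

/* Look up key in a line of the form "k1=v1 k2=v2\n" and copy its value
 * into out (truncating if needed). Returns 1 if the key was found,
 * 0 otherwise. */
static int status_get(const char *line, const char *key,
                      char *out, size_t out_len)
{
    size_t key_len = strlen(key);
    const char *p = line;

    if (out_len == 0) {
        return 0;
    }
    while (*p != '\0') {
        /* A token runs up to the next space or newline. */
        size_t tok_len = strcspn(p, " \n");

        if (tok_len > key_len && p[key_len] == '=' &&
            strncmp(p, key, key_len) == 0) {
            size_t val_len = tok_len - key_len - 1;

            if (val_len >= out_len) {
                val_len = out_len - 1;
            }
            memcpy(out, p + key_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        p += tok_len;
        p += strspn(p, " \n");   /* skip the separator(s) */
    }
    return 0;
}
```

Given the line `is_mirroring=false has_control=true`, looking up `has_control` yields `true`, and looking up a key that is not reported returns 0.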
read
~~~~
$ flexnbd read --addr <ADDR> --port <PORT> --from <OFFSET>
--size <SIZE> [--bind BIND-ADDR] [global option]*
Connect to the server at ADDR:PORT, and read SIZE bytes starting at
OFFSET in a single NBD query. The returned data will be echoed to
STDOUT. In case of a remote ACL, set the local source address to
BIND-ADDR.
Options
^^^^^^^
*--addr, -l ADDR*:
The address of the remote server. Required.
*--port, -p PORT*:
The port of the remote server. Required.
*--from, -F OFFSET*:
The byte offset to start reading from. Required. Maximum 2^62.
*--size, -S SIZE*:
The number of bytes to read. Required. Maximum 2^30.
*--bind, -b BIND-ADDR*:
The local address to bind to. You may need this if the remote server
is using an access control list.
write
~~~~~
$ cat ... | flexnbd write --addr <ADDR> --port <PORT> --from <OFFSET>
--size <SIZE> [--bind BIND-ADDR] [global option]*
Connect to the server at ADDR:PORT, and write SIZE bytes from STDIN
starting at OFFSET in a single NBD query. In case of a remote ACL, set
the local source address to BIND-ADDR.
Options
^^^^^^^
*--addr, -l ADDR*:
The address of the remote server. Required.
*--port, -p PORT*:
The port of the remote server. Required.
*--from, -F OFFSET*:
The byte offset to start writing from. Required. Maximum 2^62.
*--size, -S SIZE*:
The number of bytes to write. Required. Maximum 2^30.
*--bind, -b BIND-ADDR*:
The local address to bind to. You may need this if the remote server
is using an access control list.
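The documented limits for 'read' and 'write' (offset up to 2^62, size up to 2^30) can be checked before a request is sent. This is a sketch only: `parse_bounded` and the two macros are hypothetical names, and treating the stated maximum as inclusive is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OFFSET (UINT64_C(1) << 62)  /* documented --from maximum */
#define MAX_SIZE   (UINT64_C(1) << 30)  /* documented --size maximum */

/* Parse text as an unsigned decimal number no greater than max.
 * Returns 1 and stores the value in *out on success, 0 on any error. */
static int parse_bounded(const char *text, uint64_t max, uint64_t *out)
{
    char *end = NULL;
    unsigned long long v;

    /* strtoull accepts a leading minus sign and wraps, so reject it. */
    if (strchr(text, '-') != NULL) {
        return 0;
    }
    errno = 0;
    v = strtoull(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') {
        return 0;                       /* overflow, empty, or junk */
    }
    if (v > max) {
        return 0;
    }
    *out = (uint64_t)v;
    return 1;
}
```

A caller would use `parse_bounded(arg, MAX_OFFSET, &from)` for --from and `parse_bounded(arg, MAX_SIZE, &size)` for --size, and report a usage error if either returns 0.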
help
~~~~
$ flexnbd help [command] [global option]*
Without 'command', show the list of available commands. With 'command',
show help for that command.
GLOBAL OPTIONS
--------------
*--help, -h* :
Show command or global help.
*--verbose, -v* :
Output all available log information to STDERR.
*--quiet, -q* :
Output as little log information as possible to STDERR.
LOGGING
-------
Log output is sent to STDERR. If --quiet is set, no output will be seen
unless the program terminates abnormally. If neither --quiet nor
--verbose is set, no output will be seen unless something goes wrong
with a specific request. If --verbose is given, every available log
message will be seen (which, for a debug build, is many). It is not an
error to set both --verbose and --quiet. The last one wins.
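The "last one wins" rule amounts to scanning the arguments in order and letting each flag overwrite the previous choice. The sketch below uses a plain loop for brevity and is an assumption about the behaviour, not a copy of flexnbd's option handling; the enum and function names are hypothetical.

```c
#include <string.h>

enum verbosity { VERBOSITY_QUIET, VERBOSITY_NORMAL, VERBOSITY_VERBOSE };

/* Scan argv left to right; each --quiet/-q or --verbose/-v overrides
 * whatever was chosen before it, so the last flag seen wins. */
static enum verbosity verbosity_from_args(int argc, char **argv)
{
    enum verbosity v = VERBOSITY_NORMAL;
    int i;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--quiet") == 0 ||
            strcmp(argv[i], "-q") == 0) {
            v = VERBOSITY_QUIET;
        } else if (strcmp(argv[i], "--verbose") == 0 ||
                   strcmp(argv[i], "-v") == 0) {
            v = VERBOSITY_VERBOSE;
        }
    }
    return v;
}
```

With the arguments `--verbose --quiet` the result is quiet; with `--quiet --verbose` it is verbose; with neither flag it stays at the normal level.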
The log line format is:
<LEVEL>:<PID> <THREAD> <SOURCEFILE>:<SOURCELINE>: <MSG>
*LEVEL*:
This will be one of 'D', 'I', 'W', 'E', 'F' in increasing order of
severity. If flexnbd is started with the --quiet flag, only 'F' will be
seen. If it is started with the --verbose flag, any from 'I' upwards
will be seen. Only if you have a debug build and start it with
--verbose will you see 'D' entries.
*PID*:
This is the process ID.
*THREAD*:
There are several pthreads per flexnbd process: a main thread, a serve
thread, a thread per client, and possibly a pair of mirror threads and a
control thread. This field identifies which thread was responsible for
the log line.
*SOURCEFILE:SOURCELINE*:
Identifies where in the source code this log line can be found.
*MSG*:
A short message describing what's happening, how it's being done, or
if you're very lucky *why* it's going on.
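A log line in the format above can be split into its fields with a single sscanf call. This is a sketch under assumptions: the `log_line` struct and `parse_log_line` are hypothetical names, the field widths are arbitrary, and it assumes the THREAD and SOURCEFILE fields contain no whitespace.

```c
#include <stdio.h>

/* Hypothetical holder for the fields of one log line. */
struct log_line {
    char level;        /* one of D, I, W, E, F */
    int  pid;
    char thread[64];
    char file[128];
    int  line;
    char msg[256];
};

/* Split "<LEVEL>:<PID> <THREAD> <SOURCEFILE>:<SOURCELINE>: <MSG>".
 * Returns 1 if all six fields were read, 0 otherwise. */
static int parse_log_line(const char *text, struct log_line *out)
{
    int n = sscanf(text, "%c:%d %63s %127[^:]:%d: %255[^\n]",
                   &out->level, &out->pid, out->thread,
                   out->file, &out->line, out->msg);

    return n == 6;
}
```

For an invented sample line such as `E:1234 5678 serve.c:42: accept failed`, the fields come out as level `E`, PID 1234, thread `5678`, file `serve.c`, line 42, and message `accept failed`. Filtering a log for 'E' and 'F' entries then only needs a check on the level field.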
EXAMPLES
--------
Serving a file
~~~~~~~~~~~~~~
The simplest case is serving a file on a chosen port:
$ cp /etc/passwd /tmp
$ flexnbd serve --file /tmp/passwd --addr 0.0.0.0 --port 4777 &
$ flexnbd read --addr 127.0.0.1 --port 4777 --from 0 --size 7
root:x:
$
Reading server status
~~~~~~~~~~~~~~~~~~~~~
In order to read a server's status, we need it to open a control socket.
$ flexnbd serve --file /tmp/passwd --addr 0.0.0.0 --port 4777 \
--sock /tmp/flexnbd.sock
$ flexnbd status --sock /tmp/flexnbd.sock
is_mirroring=false has_control=true
$
Note that the status output is newline-terminated.
Migrating
~~~~~~~~~
To migrate, we need to provide a destination file of the right size.
$ dd if=/dev/random of=/tmp/data bs=1M count=1
$ truncate -s 1M /tmp/data.copy
$ flexnbd serve --file /tmp/data --addr 0.0.0.0 --port 4778 \
--sock /tmp/flex-source.sock &
$ flexnbd listen --file /tmp/data.copy --addr 0.0.0.0 --port 4779 \
--sock /tmp/flex-dest.sock &
$
Now we check the status of each server, to check that they are both in
the right state:
$ flexnbd status --sock /tmp/flex-source.sock
is_mirroring=false has_control=true
$ flexnbd status --sock /tmp/flex-dest.sock
is_mirroring=false has_control=false
$
With this knowledge in hand, we can start the migration:
$ flexnbd mirror --addr 127.0.0.1 --port 4779 \
--sock /tmp/flex-source.sock
Migration started
[1] + 9648 done build/flexnbd serve --addr 0.0.0.0 --port 4778
$
Note that because the file is so small in this case, we see the source
server quit soon after we start the migration.
We can check the status of the destination server, to ensure that it
took control:
$ flexnbd status --sock /tmp/flex-dest.sock
is_mirroring=false has_control=true
BUGS
----
Should be reported to alex@bytemark.co.uk.
AUTHOR
------
Written by Alex Young <alex@bytemark.co.uk>.
Original concept and core code by Matthew Bloch
<matthew@bytemark.co.uk>.
COPYING
-------
Copyright (c) 2012 Bytemark Hosting Ltd. Free use of this software is
granted under the terms of the GNU General Public License version 3 or
later.

274
Rakefile Normal file

@@ -0,0 +1,274 @@
$: << '../rake_utils/lib'
require 'rake_utils/debian'
include RakeUtils::DSL
CC=ENV['CC'] || "gcc"
DEBUG = ENV.has_key?('DEBUG') &&
%w|yes y ok 1 true t|.include?(ENV['DEBUG'])
ALL_SOURCES =FileList['src/*']
SOURCES = ALL_SOURCES.select { |c| c =~ /\.c$/ }
OBJECTS = SOURCES.pathmap( "%{^src,build}X.o" )
TEST_SOURCES = FileList['tests/unit/*.c']
TEST_OBJECTS = TEST_SOURCES.pathmap( "%{^tests/unit,build/tests}X.o" )
LIBS = %w( pthread )
CCFLAGS = %w(
-D_GNU_SOURCE=1
-Wall
-Wextra
-Werror-implicit-function-declaration
-Wstrict-prototypes
-Wno-missing-field-initializers
) + # Added -Wno-missing-field-initializers to shut GCC up over {0} struct initialisers
[ENV['CFLAGS']]
LDFLAGS = []
LIBCHECK = "/usr/lib/libcheck.a"
TEST_MODULES = Dir["tests/unit/check_*.c"].map { |n|
File.basename( n )[%r{check_(.+)\.c},1] }
if DEBUG
LDFLAGS << ["-g"]
CCFLAGS << ["-g -DDEBUG"]
end
desc "Build the binary and man page"
task :build => ['build/flexnbd', 'build/flexnbd.1.gz']
task :default => :build
desc "Build just the binary"
task :flexnbd => "build/flexnbd"
def check(m)
"build/tests/check_#{m}"
end
file "README.txt"
file "build/flexnbd.1.gz" => "README.txt" do
FileUtils.mkdir_p( "build" )
sh "a2x --destination-dir build --format manpage README.txt"
sh "gzip build/flexnbd.1"
end
desc "Build just the man page"
task :man => "build/flexnbd.1.gz"
namespace "test" do
desc "Run all tests"
task 'run' => ["unit", "scenarios"]
desc "Build C tests"
task 'build' => TEST_MODULES.map { |n| check n}
TEST_MODULES.each do |m|
desc "Run tests for #{m}"
task "check_#{m}" => check(m) do
sh check m
end
end
desc "Run C tests"
task 'unit' => 'build' do
TEST_MODULES.each do |n|
ENV['EF_DISABLE_BANNER'] = '1'
sh check n
end
end
desc "Run NBD test scenarios"
task 'scenarios' => 'flexnbd' do
sh "cd tests/acceptance; ruby nbd_scenarios"
end
end
def gcc_compile( target, source )
FileUtils.mkdir_p File.dirname( target )
sh "#{CC} -Isrc -c #{CCFLAGS.join(' ')} -o #{target} #{source} "
end
def gcc_link(target, objects)
FileUtils.mkdir_p File.dirname( target )
sh "#{CC} #{LDFLAGS.join(' ')} "+
LIBS.map { |l| "-l#{l}" }.join(" ")+
" -Isrc " +
" -o #{target} "+
objects.join(" ")
end
def headers(c)
`#{CC} -Isrc -MM #{c}`.gsub("\\\n", " ").split(" ")[2..-1]
end
rule 'build/flexnbd' => OBJECTS do |t|
gcc_link(t.name, t.sources)
end
file check("client") =>
%w{build/tests/check_client.o
build/self_pipe.o
build/nbdtypes.o
build/listen.o
build/flexnbd.o
build/flexthread.o
build/control.o
build/readwrite.o
build/parse.o
build/client.o
build/serve.o
build/acl.o
build/ioutil.o
build/mbox.o
build/mirror.o
build/status.o
build/util.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check("acl") =>
%w{build/tests/check_acl.o
build/parse.o
build/acl.o
build/util.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check( "util" ) =>
%w{build/tests/check_util.o
build/util.o
build/self_pipe.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check("serve") =>
%w{build/tests/check_serve.o
build/self_pipe.o
build/nbdtypes.o
build/control.o
build/readwrite.o
build/parse.o
build/client.o
build/flexthread.o
build/serve.o
build/flexnbd.o
build/mirror.o
build/status.o
build/listen.o
build/acl.o
build/mbox.o
build/ioutil.o
build/util.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check("readwrite") =>
%w{build/tests/check_readwrite.o
build/readwrite.o
build/client.o
build/self_pipe.o
build/serve.o
build/parse.o
build/acl.o
build/flexthread.o
build/control.o
build/flexnbd.o
build/mirror.o
build/status.o
build/listen.o
build/nbdtypes.o
build/mbox.o
build/ioutil.o
build/util.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check("listen") =>
%w{build/tests/check_listen.o
build/listen.o
build/flexnbd.o
build/status.o
build/flexthread.o
build/mbox.o
build/mirror.o
build/self_pipe.o
build/nbdtypes.o
build/control.o
build/readwrite.o
build/parse.o
build/client.o
build/serve.o
build/acl.o
build/ioutil.o
build/util.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check("flexnbd") =>
%w{build/tests/check_flexnbd.o
build/flexnbd.o
build/ioutil.o
build/util.o
build/control.o
build/listen.o
build/mbox.o
build/flexthread.o
build/status.o
build/self_pipe.o
build/client.o
build/acl.o
build/parse.o
build/nbdtypes.o
build/readwrite.o
build/mirror.o
build/serve.o} do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
file check("control") =>
%w{build/tests/check_control.o} + OBJECTS - ["build/main.o"] do |t|
gcc_link t.name, t.prerequisites + [LIBCHECK]
end
(TEST_MODULES- %w{control flexnbd acl client serve readwrite listen util}).each do |m|
tgt = "build/tests/check_#{m}.o"
maybe_obj_name = "build/#{m}.o"
# Take it out in case we're testing util.o or ioutil.o
deps = ["build/ioutil.o", "build/util.o"] - [maybe_obj_name]
# Add it back in if it's something we need to compile
deps << maybe_obj_name if OBJECTS.include?( maybe_obj_name )
file check( m ) => deps + [tgt] do |t|
gcc_link(t.name, deps + [tgt, LIBCHECK])
end
end
OBJECTS.zip( SOURCES ).each do |o,c|
file o => [c]+headers(c) do |t| gcc_compile( o, c ) end
end
TEST_OBJECTS.zip( TEST_SOURCES ).each do |o,c|
file o => [c] + headers(c) do |t| gcc_compile( o, c ) end
end
desc "Remove all build targets, binaries and temporary files"
task :clean do
sh "rm -rf *~ build"
end
namespace :pkg do
deb do |t|
t.code_files = ALL_SOURCES + ["Rakefile", "README.txt"]
t.pkg_name = "flexnbd"
t.generate_changelog!
end
end

198
debian/changelog vendored Normal file

@@ -0,0 +1,198 @@
flexnbd (0.0.1-33) unstable; urgency=low
* Added tag 0.0.1 for changeset 27409c2c1313 [r33]
-- Alex Young <alex@bytemark.co.uk> Wed, 30 May 2012 17:11:10 +0100
flexnbd (0.0.1-31) unstable; urgency=low
* Fixed bug where ACL was accidentally deleted when being set from control [r31]
-- mbloch <mbloch> Wed, 30 May 2012 13:03:02 +0100
flexnbd (0.0.1-30) unstable; urgency=low
* Fix the usage message [r30]
-- nick <nick@bytemark.co.uk> Wed, 30 May 2012 11:28:32 +0100
flexnbd (0.0.1-29) unstable; urgency=low
* Fixed race in tests. [r29]
-- mbloch <mbloch> Tue, 29 May 2012 17:01:54 +0100
flexnbd (0.0.1-28) unstable; urgency=low
* Added getopt_long command-line handling. [r28]
-- Alex Young <alex@bytemark.co.uk> Wed, 30 May 2012 15:19:40 +0100
flexnbd (0.0.1-27) unstable; urgency=low
* Added .h files to the Rakefile [r27]
-- Alex Young <alex@bytemark.co.uk> Wed, 30 May 2012 15:06:06 +0100
flexnbd (0.0.1-26) unstable; urgency=low
* Rearranged the project to have src/ and build/ directories [r26]
-- Alex Young <alex@bytemark.co.uk> Wed, 30 May 2012 09:51:20 +0100
flexnbd (0.0.1-25) unstable; urgency=low
* Added .INCOMPLETE hack to aid with marking finished transfers. [r25]
-- Matthew Bloch <matthew@bytemark.co.uk> Tue, 29 May 2012 11:24:24 +0100
flexnbd (0.0.1-24) unstable; urgency=low
* Added mirror write barrier / final pass stuff & clean exit afterwards. [r24]
-- Matthew Bloch <matthew@bytemark.co.uk> Tue, 29 May 2012 04:03:28 +0100
flexnbd (0.0.1-23) unstable; urgency=low
* Lots of errors spotted by Alex fixed, added mutexes to accept & I/O, added [r23]
-- mbloch <mbloch> Tue, 29 May 2012 00:59:12 +0100
flexnbd (0.0.1-22) unstable; urgency=low
* Added another write/read test, fixed bugs in splice() usage and IPv6 [r22]
-- Matthew Bloch <matthew@bytemark.co.uk> Sun, 27 May 2012 14:40:16 +0100
flexnbd (0.0.1-21) unstable; urgency=low
* First few external tests with test/unit, some minor tidying of internal data [r21]
-- Matthew Bloch <matthew@bytemark.co.uk> Thu, 24 May 2012 01:39:35 +0100
flexnbd (0.0.1-20) unstable; urgency=low
* Pulled some duplicated code out of control.c into [r20]
-- mbloch <mbloch> Wed, 23 May 2012 14:03:30 +0100
flexnbd (0.0.1-19) unstable; urgency=low
* Split control-socket functions into separate file. [r19]
-- Matthew Bloch <matthew@bytemark.co.uk> Wed, 23 May 2012 00:42:14 +0100
flexnbd (0.0.1-18) unstable; urgency=low
* Fixed mirroring to work (error reporting suspect though). [r18]
-- Matthew Bloch <matthew@bytemark.co.uk> Tue, 22 May 2012 00:22:06 +0100
flexnbd (0.0.1-17) unstable; urgency=low
* Initial, untested mirror implementation and resolved some type confusion [r17]
-- Matthew Bloch <matthew@bytemark.co.uk> Mon, 21 May 2012 04:03:17 +0100
flexnbd (0.0.1-16) unstable; urgency=low
* More valgrind-found bugs, extracted open_and_mmap from main code. [r16]
-- Matthew Bloch <matthew@bytemark.co.uk> Mon, 21 May 2012 04:00:45 +0100
flexnbd (0.0.1-15) unstable; urgency=low
* Fixed some uninitialised variables courtesy of valgrind. [r15]
-- Matthew Bloch <matthew@bytemark.co.uk> Mon, 21 May 2012 03:59:43 +0100
flexnbd (0.0.1-14) unstable; urgency=low
* Mostly finished bitset tests, fixed test build to include utilities, remove [r14]
-- Matthew Bloch <matthew@bytemark.co.uk> Mon, 21 May 2012 03:17:32 +0100
flexnbd (0.0.1-13) unstable; urgency=low
* Tweaks to bitset.h, established a C test framework. [r13]
-- Matthew Bloch <matthew@bytemark.co.uk> Sun, 20 May 2012 14:38:46 +0100
flexnbd (0.0.1-12) unstable; urgency=low
* Fixed segfaulting access control, allowed change to acl via control socket. [r12]
-- Matthew Bloch <matthew@bytemark.co.uk> Sat, 19 May 2012 12:48:03 +0100
flexnbd (0.0.1-11) unstable; urgency=low
* Added dummy control socket answering / changed serve_accept_loop to use [r11]
-- Matthew Bloch <matthew@bytemark.co.uk> Fri, 18 May 2012 23:39:16 +0100
flexnbd (0.0.1-10) unstable; urgency=low
* Added control socket, doesn't do anything yet. [r10]
-- mbloch <mbloch> Fri, 18 May 2012 18:44:34 +0100
flexnbd (0.0.1-9) unstable; urgency=low
* Added .hgignore file [r9]
-- Matthew Bloch <matthew@bytemark.co.uk> Fri, 18 May 2012 13:25:54 +0100
flexnbd (0.0.1-8) unstable; urgency=low
* Stopped NBD writes from committing all-zero blocks to disc (tentative, needs [r8]
-- Matthew Bloch <matthew@bytemark.co.uk> Fri, 18 May 2012 13:24:35 +0100
flexnbd (0.0.1-7) unstable; urgency=low
* Split code out into separate compilation units (first pass, anyway). [r7]
-- Matthew Bloch <matthew@bytemark.co.uk> Thu, 17 May 2012 20:14:22 +0100
flexnbd (0.0.1-6) unstable; urgency=low
* Non-functioning commit, half-way through adding sparse bitmap feature. [r6]
-- Matthew Bloch <matthew@bytemark.co.uk> Thu, 17 May 2012 11:54:25 +0100
flexnbd (0.0.1-5) unstable; urgency=low
* Added write mode. [r5]
-- Matthew Bloch <matthew@bytemark.co.uk> Wed, 16 May 2012 11:58:41 +0100
flexnbd (0.0.1-4) unstable; urgency=low
* Added working read via splice syscall. [r4]
-- Matthew Bloch <matthew@bytemark.co.uk> Wed, 16 May 2012 03:20:09 +0100
flexnbd (0.0.1-3) unstable; urgency=low
* Added Rakefile [r3]
-- mbloch <mbloch> Wed, 16 May 2012 01:27:14 +0100
flexnbd (0.0.1-2) unstable; urgency=low
* Silly bug fixes, added ACL support, added parser for read/write requests. [r2]
-- mbloch <mbloch> Tue, 15 May 2012 18:40:58 +0100
flexnbd (0.0.1-1) unstable; urgency=low
* Some debugging, got it to serve. [r1]
-- Matthew Bloch <matthew@bytemark.co.uk> Tue, 15 May 2012 03:16:19 +0100
flexnbd (0.0.1-0) unstable; urgency=low
* It compiles :) [r0]
-- Matthew Bloch <matthew@bytemark.co.uk> Tue, 15 May 2012 02:42:03 +0100

1
debian/compat vendored Normal file

@@ -0,0 +1 @@
7

14
debian/control vendored Normal file

@@ -0,0 +1,14 @@
Source: flexnbd
Section: unknown
Priority: extra
Maintainer: Alex Young <alex@bytemark.co.uk>
Build-Depends: cdbs, debhelper (>= 7), ruby, rake, gcc
Standards-Version: 3.8.1
Homepage: http://bigv.io/
Package: flexnbd
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Description: FlexNBD server
An NBD server offering push-mirroring and intelligent sparse file handling

53
debian/copyright vendored Normal file

@@ -0,0 +1,53 @@
This work was packaged for Debian by:
Alex Young <alex@bytemark.co.uk> on Wed, 30 May 2012 16:46:58 +0100
It was downloaded from:
<url://example.com>
Upstream Author(s):
<put author's name and email here>
<likewise for another author>
Copyright:
<Copyright (C) YYYY Firstname Lastname>
<likewise for another author>
License:
### SELECT: ###
This package is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
### OR ###
This package is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.
##########
This package is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>
On Debian systems, the complete text of the GNU General
Public License version 2 can be found in "/usr/share/common-licenses/GPL-2".
The Debian packaging is:
Copyright (C) 2012 Alex Young <alex@bytemark.co.uk>
you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
# Please also look if there are files or directories which have a
# different copyright/license attached and list them here.

2
debian/flexnbd.install vendored Normal file

@@ -0,0 +1,2 @@
build/flexnbd usr/bin
build/flexnbd.1.gz usr/share/man/man1

14
debian/rules vendored Executable file

@@ -0,0 +1,14 @@
#!/usr/bin/make -f
# -*- makefile -*-
# Uncomment this to turn on verbose mode.
#export DH_VERBOSE=1
%:
dh $@
override_dh_auto_build:
rake build
override_dh_auto_clean:
rake clean

1
debian/source/format vendored Normal file

@@ -0,0 +1 @@
3.0 (native)

423
flexnbd.c

@@ -1,423 +0,0 @@
#define _LARGEFILE64_SOURCE
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <malloc.h>
#include <errno.h>
#include <endian.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
/* http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-09/2332.html */
#define INIT_PASSWD "NBDMAGIC"
#define INIT_MAGIC 0x0000420281861253
#define REQUEST_MAGIC 0x25609513
#define REPLY_MAGIC 0x67446698
#define REQUEST_READ 0
#define REQUEST_WRITE 1
#define REQUEST_DISCONNECT 2
#include <linux/types.h>
struct nbd_init {
char passwd[8];
__be64 magic;
__be64 size;
char reserved[128];
};
struct nbd_request {
__be32 magic;
__be32 type; /* == READ || == WRITE */
char handle[8];
__be64 from;
__be32 len;
} __attribute__((packed));
struct nbd_reply {
__be32 magic;
__be32 error; /* 0 = ok, else error */
char handle[8]; /* handle you got from request */
};
void syntax()
{
fprintf(stderr,
"Syntax: flexnbd serve <IP address> <port> <file> [ip addresses ...]\n"
" flexnbd read <IP address> <port> <offset> <length> > data\n"
" flexnbd write <IP address> <port> <offset> [length] < data\n"
" flexnbd mirror <IP address> <port> <target IP> <target port>\n"
);
exit(1);
}
static pthread_t server_thread_id;
void error(int consult_errno, int close_socket, const char* format, ...)
{
va_list argptr;
fprintf(stderr, "*** ");
va_start(argptr, format);
vfprintf(stderr, format, argptr);
va_end(argptr);
if (consult_errno) {
fprintf(stderr, " (errno=%d, %s)", errno, strerror(errno));
}
if (close_socket)
close(close_socket);
fprintf(stderr, "\n");
if (pthread_equal(pthread_self(), server_thread_id))
pthread_exit((void*) 1);
else
exit(1);
}
#define CLIENT_ERROR(msg, ...) \
error(0, client->socket, msg, ##__VA_ARGS__)
#define CLIENT_ERROR_ON_FAILURE(test, msg, ...) \
if (test < 0) { error(1, client->socket, msg, ##__VA_ARGS__); }
#define SERVER_ERROR(msg, ...) \
error(0, 0, msg, ##__VA_ARGS__)
#define SERVER_ERROR_ON_FAILURE(test, msg, ...) \
if (test < 0) { error(1, 0, msg, ##__VA_ARGS__); }
void* xmalloc(size_t size)
{
void* p = malloc(size);
if (p == NULL)
SERVER_ERROR("couldn't malloc %d bytes", size);
return p;
}
struct ip_and_mask {
/* FIXME */
};
struct mode_serve_params {
union { struct sockaddr generic;
struct sockaddr_in v4;
struct sockaddr_in6 v6; } bind_to;
struct ip_and_mask** acl;
char* filename;
int tcp_backlog;
int server;
int threads;
};
struct client_params {
int socket;
char* filename;
int fileno;
off64_t size;
char* mapped;
};
union mode_params {
struct mode_serve_params serve;
};
int writeloop(int filedes, const void *buffer, size_t size)
{
size_t written=0;
while (written < size) {
size_t result = write(filedes, buffer+written, size-written);
if (result == -1)
return -1;
written += result;
}
return 0;
}
int readloop(int filedes, void *buffer, size_t size)
{
size_t readden=0;
while (readden < size) {
size_t result = read(filedes, buffer+readden, size-readden);
printf("read size=%d readden=%d result=%d\n", size, readden, result);
if (result == -1)
return -1;
readden += result;
}
return 0;
}
int sendfileloop(int out_fd, int in_fd, off64_t *offset, size_t count)
{
size_t sent=0;
while (sent < count) {
size_t result = sendfile64(out_fd, in_fd, offset+sent, count-sent);
if (result == -1)
return -1;
sent += result;
}
return 0;
}
int client_serve_request(struct client_params* client)
{
off64_t offset;
struct nbd_request request;
struct nbd_reply reply;
CLIENT_ERROR_ON_FAILURE(
readloop(client->socket, &request, sizeof(request)),
"Failed to read request"
);
reply.magic = htobe32(REPLY_MAGIC);
reply.error = htobe32(0);
memcpy(reply.handle, request.handle, 8);
if (be32toh(request.magic) != REQUEST_MAGIC)
CLIENT_ERROR("Bad magic %08x", be32toh(request.magic));
switch (be32toh(request.type))
{
case REQUEST_READ:
case REQUEST_WRITE:
/* check it's not out of range */
if (be64toh(request.from) < 0 ||
be64toh(request.from)+be64toh(request.len) > client->size) {
reply.error = htobe32(1);
write(client->socket, &reply, sizeof(reply));
return 0;
}
case REQUEST_DISCONNECT:
return 1;
default:
CLIENT_ERROR("Unknown request %08x", be32toh(request.type));
}
switch (be32toh(request.type))
{
case REQUEST_READ:
write(client->socket, &reply, sizeof(reply));
offset = be64toh(request.from);
CLIENT_ERROR_ON_FAILURE(
sendfileloop(
client->socket,
client->fileno,
&offset,
be64toh(request.len)
),
"sendfile failed from=%ld, len=%ld",
offset,
be64toh(request.len)
);
break;
case REQUEST_WRITE:
CLIENT_ERROR_ON_FAILURE(
readloop(
client->socket,
client->mapped + be64toh(request.from),
be64toh(request.len)
),
"read failed from=%ld, len=%d",
be64toh(request.from),
be64toh(request.len)
);
write(client->socket, &reply, sizeof(reply));
break;
}
return 0;
}
void client_open_file(struct client_params* client)
{
client->fileno = open(client->filename, O_RDWR|O_DIRECT|O_SYNC);
CLIENT_ERROR_ON_FAILURE(client->fileno, "Couldn't open %s",
client->filename);
client->size = lseek64(client->fileno, 0, SEEK_END);
CLIENT_ERROR_ON_FAILURE(client->fileno, "Couldn't seek to end of %s",
client->filename);
client->mapped = mmap64(NULL, client->size, PROT_READ|PROT_WRITE,
MAP_SHARED, client->fileno, 0);
CLIENT_ERROR_ON_FAILURE((long) client->mapped, "Couldn't map file %s",
client->filename);
}
void client_send_hello(struct client_params* client)
{
struct nbd_init init;
memcpy(init.passwd, INIT_PASSWD, sizeof(INIT_PASSWD));
init.magic = htobe64(INIT_MAGIC);
init.size = htobe64(client->size);
memset(init.reserved, 0, 128);
CLIENT_ERROR_ON_FAILURE(
writeloop(client->socket, &init, sizeof(init)),
"Couldn't send hello"
);
}
void* client_serve(void* client_uncast)
{
struct client_params* client = (struct client_params*) client_uncast;
client_open_file(client);
client_send_hello(client);
while (client_serve_request(client) == 0)
CLIENT_ERROR_ON_FAILURE(
close(client->socket),
"Couldn't close socket %d",
client->socket
);
free(client);
return NULL;
}
/* FIXME */
int is_included_in_acl(struct ip_and_mask** list, struct sockaddr* test)
{
return 1;
}
void serve_open_socket(struct mode_serve_params* params)
{
params->server = socket(PF_INET, SOCK_STREAM, 0);
SERVER_ERROR_ON_FAILURE(params->server,
"Couldn't create server socket");
SERVER_ERROR_ON_FAILURE(
bind(params->server, &params->bind_to.generic,
sizeof(params->bind_to.generic)),
"Couldn't bind server to IP address"
);
SERVER_ERROR_ON_FAILURE(
listen(params->server, params->tcp_backlog),
"Couldn't listen on server socket"
);
}
void serve_accept_loop(struct mode_serve_params* params)
{
while (1) {
pthread_t client_thread;
struct sockaddr client_address;
struct client_params* client_params;
socklen_t socket_length;
int client_socket = accept(params->server, &client_address,
&socket_length);
SERVER_ERROR_ON_FAILURE(client_socket, "accept() failed");
if (params->acl &&
!is_included_in_acl(params->acl, &client_address)) {
write(client_socket, "Access control error", 20);
close(client_socket);
continue;
}
client_params = xmalloc(sizeof(struct client_params));
client_params->socket = client_socket;
client_params->filename = params->filename;
client_thread = pthread_create(&client_thread, NULL,
client_serve, client_params);
SERVER_ERROR_ON_FAILURE(client_thread,
"Failed to create client thread");
/* FIXME: keep track of them? */
/* FIXME: maybe shouldn't be fatal? */
}
}
void serve(struct mode_serve_params* params)
{
serve_open_socket(params);
serve_accept_loop(params);
}
void params_serve(
struct mode_serve_params* out,
char* s_ip_address,
char* s_port,
char* s_file
)
{
out->tcp_backlog = 10; /* does this need to be settable? */
out->acl = NULL; /* ignore for now */
if (s_ip_address == NULL)
SERVER_ERROR("No IP address supplied");
if (s_port == NULL)
SERVER_ERROR("No port number supplied");
if (s_file == NULL)
SERVER_ERROR("No filename supplied");
if (s_ip_address[0] == '0' && s_ip_address[1] == '\0') {
out->bind_to.v4.sin_family = AF_INET;
out->bind_to.v4.sin_addr.s_addr = INADDR_ANY;
}
else if (inet_pton(AF_INET, s_ip_address, &out->bind_to.v4) == 0) {
}
else if (inet_pton(AF_INET6, s_ip_address, &out->bind_to.v6) == 0) {
}
else {
SERVER_ERROR("Couldn't understand address '%%' "
"(use 0 if you don't care)", s_ip_address);
}
out->bind_to.v4.sin_port = atoi(s_port);
if (out->bind_to.v4.sin_port < 0 || out->bind_to.v4.sin_port > 65535)
SERVER_ERROR("Port number must be >= 0 and <= 65535");
out->bind_to.v4.sin_port = htobe16(out->bind_to.v4.sin_port);
out->filename = s_file;
}
void mode(char* mode, int argc, char **argv)
{
union mode_params params;
if (strcmp(mode, "serve") == 0) {
if (argc >= 3) {
params_serve(&params.serve, argv[0], argv[1], argv[2]);
serve(&params.serve);
}
else {
syntax();
}
}
else {
syntax();
}
exit(0);
}
int main(int argc, char** argv)
{
server_thread_id = pthread_self();
if (argc < 2)
syntax();
mode(argv[1], argc-2, argv+2);
return 0;
}

108
src/acl.c Normal file

@@ -0,0 +1,108 @@
#include <stdlib.h>
#include "util.h"
#include "parse.h"
#include "acl.h"
struct acl * acl_create( int len, char ** lines, int default_deny )
{
struct acl * acl;
acl = (struct acl *)xmalloc( sizeof( struct acl ) );
acl->len = parse_acl( &acl->entries, len, lines );
acl->default_deny = default_deny;
return acl;
}
static int testmasks[9] = { 0,128,192,224,240,248,252,254,255 };
/** Test whether AF_INET or AF_INET6 sockaddr is included in the given access
* control list, returning 1 if it is, and 0 if not.
*/
static int is_included_in_acl(int list_length, struct ip_and_mask (*list)[], union mysockaddr* test)
{
NULLCHECK( test );
int i;
for (i=0; i < list_length; i++) {
struct ip_and_mask *entry = &(*list)[i];
int testbits;
unsigned char *raw_address1, *raw_address2;
debug("checking acl entry %d (%d/%d)", i, test->generic.sa_family, entry->ip.family);
if (test->generic.sa_family != entry->ip.family) {
continue;
}
if (test->generic.sa_family == AF_INET) {
debug("it's an AF_INET");
raw_address1 = (unsigned char*) &test->v4.sin_addr;
raw_address2 = (unsigned char*) &entry->ip.v4.sin_addr;
}
else if (test->generic.sa_family == AF_INET6) {
debug("it's an AF_INET6");
raw_address1 = (unsigned char*) &test->v6.sin6_addr;
raw_address2 = (unsigned char*) &entry->ip.v6.sin6_addr;
}
else {
fatal( "Can't check an ACL for this address type." );
}
debug("testbits=%d", entry->mask);
for (testbits = entry->mask; testbits > 0; testbits -= 8) {
debug("testbits=%d, c1=%02x, c2=%02x", testbits, raw_address1[0], raw_address2[0]);
if (testbits >= 8) {
if (raw_address1[0] != raw_address2[0]) { goto no_match; }
}
else {
if ((raw_address1[0] & testmasks[testbits%8]) !=
(raw_address2[0] & testmasks[testbits%8]) ) {
goto no_match;
}
}
raw_address1++;
raw_address2++;
}
return 1;
no_match: ;
debug("no match");
}
return 0;
}
int acl_includes( struct acl * acl, union mysockaddr * addr )
{
NULLCHECK( acl );
if ( 0 == acl->len ) {
return !( acl->default_deny );
}
else {
return is_included_in_acl( acl->len, acl->entries, addr );
}
}
int acl_default_deny( struct acl * acl )
{
NULLCHECK( acl );
return acl->default_deny;
}
void acl_destroy( struct acl * acl )
{
free( acl->entries );
acl->len = 0;
acl->entries = NULL;
free( acl );
}
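
The prefix comparison in is_included_in_acl can be sketched on its own, outside the sockaddr handling. This is a minimal illustration, not the project's code: `top_bits_mask` and `prefix_matches` are hypothetical names, and the mask is computed with a shift rather than looked up in the `testmasks` table (the two give the same values for 0..8 bits).

```c
#include <assert.h>
#include <stddef.h>

/* Mask keeping the top n bits of a byte, for n in 0..8.
 * Yields 0,128,192,224,240,248,252,254,255 like the testmasks table. */
static unsigned char top_bits_mask(int n)
{
    assert(n >= 0 && n <= 8);
    return (unsigned char)(0xff << (8 - n));
}

/* Return 1 if the first `bits` bits of a and b are equal, 0 otherwise.
 * Both buffers must hold at least (bits + 7) / 8 bytes. */
static int prefix_matches(const unsigned char *a, const unsigned char *b, int bits)
{
    assert(bits >= 0);
    /* Whole bytes must match exactly. */
    for (; bits >= 8; bits -= 8, a++, b++) {
        if (*a != *b) { return 0; }
    }
    /* A trailing partial byte is compared under the mask only. */
    if (bits > 0) {
        unsigned char mask = top_bits_mask(bits);
        if ((*a & mask) != (*b & mask)) { return 0; }
    }
    return 1;
}
```

With the raw bytes of 192.168.1.10 and 192.168.1.200, a /24 prefix matches and a /25 prefix does not, because the two last octets differ in their top bit.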

37
src/acl.h Normal file

@@ -0,0 +1,37 @@
#ifndef ACL_H
#define ACL_H
#include "parse.h"
struct acl {
int len;
int default_deny;
struct ip_and_mask (*entries)[];
};
/** Allocate a new acl structure, parsing the given lines to sockaddr
* structures in the process. After allocation, acl->len might not
* equal len. In that case, there was an error in parsing and acl->len
* will be the index of the failed entry in lines.
*
* default_deny controls the behaviour of an empty list: if true, all
* requests will be denied. If false, all requests will be accepted.
*/
struct acl * acl_create( int len, char **lines, int default_deny );
/** Check to see whether an address is allowed by an acl.
* See acl_create for how the default_deny setting affects this.
*/
int acl_includes( struct acl *, union mysockaddr *);
/** Get the default_deny status */
int acl_default_deny( struct acl * );
/** Free the acl structure and the internal acl entries table.
*/
void acl_destroy( struct acl * );
#endif

194
src/bitset.h Normal file

@@ -0,0 +1,194 @@
#ifndef BITSET_H
#define BITSET_H
#include "util.h"
#include <inttypes.h>
#include <string.h>
#include <pthread.h>
static inline char char_with_bit_set(int num) { return 1<<(num%8); }
/** Return 1 if the bit at ''idx'' in array ''b'' is set */
static inline int bit_is_set(char* b, int idx) {
return (b[idx/8] & char_with_bit_set(idx)) != 0;
}
/** Return 1 if the bit at ''idx'' in array ''b'' is clear */
static inline int bit_is_clear(char* b, int idx) {
return !bit_is_set(b, idx);
}
/** Tests whether the bit at ''idx'' in array ''b'' has value ''value'' */
static inline int bit_has_value(char* b, int idx, int value) {
if (value) { return bit_is_set(b, idx); }
else { return bit_is_clear(b, idx); }
}
/** Sets the bit ''idx'' in array ''b'' */
static inline void bit_set(char* b, int idx) {
b[idx/8] |= char_with_bit_set(idx);
//__sync_fetch_and_or(b+(idx/8), char_with_bit_set(idx));
}
/** Clears the bit ''idx'' in array ''b'' */
static inline void bit_clear(char* b, int idx) {
b[idx/8] &= ~char_with_bit_set(idx);
//__sync_fetch_and_nand(b+(idx/8), char_with_bit_set(idx));
}
/** Sets ''len'' bits in array ''b'' starting at offset ''from'' */
static inline void bit_set_range(char* b, int from, int len) {
for (; from%8 != 0 && len > 0; len--) { bit_set(b, from++); }
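if (len >= 8) {
/* whole bytes: fill them in one go, then skip past them so the
 * trailing loop only handles the leftover bits */
memset(b+(from/8), 255, len/8);
from += (len/8)*8;
len %= 8;
}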
for (; len > 0; len--) { bit_set(b, from++); }
}
/** Clears ''len'' bits in array ''b'' starting at offset ''from'' */
static inline void bit_clear_range(char* b, int from, int len) {
for (; from%8 != 0 && len > 0; len--) { bit_clear(b, from++); }
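if (len >= 8) {
/* whole bytes: clear them in one go, then skip past them so the
 * trailing loop only handles the leftover bits */
memset(b+(from/8), 0, len/8);
from += (len/8)*8;
len %= 8;
}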
for (; len > 0; len--) { bit_clear(b, from++); }
}
/** Counts the number of contiguous bits in array ''b'', starting at ''from''
* up to a maximum number of bits ''len''. Returns the number of contiguous
* bits that are the same as the first one specified.
*/
static inline int bit_run_count(char* b, int from, int len) {
int count;
int first_value = bit_is_set(b, from);
for (count=0; len > 0 && bit_has_value(b, from+count, first_value); count++, len--)
;
/* FIXME: debug this later */
/*for (; (from+count) % 64 != 0 && len > 0; len--)
if (bit_has_value(b, from+count, first_value))
count++;
else
return count;
for (; len >= 64; len-=64) {
if (*((uint64_t*)(b + ((from+count)/8))) == UINT64_MAX)
count += 64;
else
break;
}
for (; len > 0; len--)
if (bit_is_set(b, from+count))
count++;*/
return count;
}
/** An application of a bitset - a bitset mapping represents a file of ''size''
* broken down into ''resolution''-sized chunks. The bit set is assumed to
* represent one bit per chunk.
*/
struct bitset_mapping {
uint64_t size;
int resolution;
char bits[];
};
/** Allocate a bitset_mapping for a file of the given size, and chunks of the
* given resolution.
*/
static inline struct bitset_mapping* bitset_alloc(
uint64_t size,
int resolution
)
{
struct bitset_mapping *bitset = xmalloc(
sizeof(struct bitset_mapping)+
(size+resolution-1)/resolution
);
bitset->size = size;
bitset->resolution = resolution;
return bitset;
}
#define INT_FIRST_AND_LAST \
int first = from/set->resolution, \
last = (from+len-1)/set->resolution, \
bitlen = last-first+1
/** Set the bits in a bitset which correspond to the given bytes in the larger
* file.
*/
static inline void bitset_set_range(
struct bitset_mapping* set,
uint64_t from,
uint64_t len)
{
INT_FIRST_AND_LAST;
bit_set_range(set->bits, first, bitlen);
}
/** Set every bit in the bitset. */
static inline void bitset_set(
struct bitset_mapping* set
)
{
bitset_set_range(set, 0, set->size);
}
/** Clear the bits in a bitset which correspond to the given bytes in the
* larger file.
*/
static inline void bitset_clear_range(
struct bitset_mapping* set,
uint64_t from,
uint64_t len)
{
INT_FIRST_AND_LAST;
bit_clear_range(set->bits, first, bitlen);
}
/** Clear every bit in the bitset. */
static inline void bitset_clear(
struct bitset_mapping *set
)
{
bitset_clear_range(set, 0, set->size);
}
/** Counts the number of contiguous bytes that are represented as a run in
* the bit field.
*/
static inline int bitset_run_count(
struct bitset_mapping* set,
uint64_t from,
uint64_t len)
{
/* now fix in case len goes past the end of the memory we have
* control of */
len = len+from>set->size ? set->size-from : len;
INT_FIRST_AND_LAST;
return (bit_run_count(set->bits, first, bitlen) * set->resolution) -
(from % set->resolution);
}
/** Tests whether the bit field is clear for the given file offset.
*/
static inline int bitset_is_clear_at(
struct bitset_mapping* set,
uint64_t at
)
{
return bit_is_clear(set->bits, at/set->resolution);
}
/** Tests whether the bit field is set for the given file offset.
*/
static inline int bitset_is_set_at(
struct bitset_mapping* set,
uint64_t at
)
{
return bit_is_set(set->bits, at/set->resolution);
}
#endif
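
The byte-range to chunk-range arithmetic behind INT_FIRST_AND_LAST can be tried in isolation. The sketch below is an illustration under assumptions: `struct chunk_range` and `chunks_for_bytes` are hypothetical names that do not appear in bitset.h, and 64-bit values are used throughout instead of the macro's `int`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: index of the first chunk and number of chunks. */
struct chunk_range {
    uint64_t first;
    uint64_t count;
};

/* Map the byte range [from, from+len) onto chunks of `resolution` bytes.
 * The last chunk is the one holding byte from+len-1, so a range that
 * only touches part of a chunk still covers that whole chunk. */
static struct chunk_range chunks_for_bytes(uint64_t from, uint64_t len,
                                           uint64_t resolution)
{
    assert(len > 0 && resolution > 0);
    struct chunk_range r;
    uint64_t last = (from + len - 1) / resolution;
    r.first = from / resolution;
    r.count = last - r.first + 1;
    return r;
}
```

For example, with a resolution of 4096, a 200-byte range starting at offset 4000 straddles a chunk boundary and so maps to two chunks, while a 4096-byte range starting at 4096 maps to exactly one.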

581
src/client.c Normal file

@@ -0,0 +1,581 @@
#include "client.h"
#include "serve.h"
#include "util.h"
#include "ioutil.h"
#include "bitset.h"
#include "nbdtypes.h"
#include "self_pipe.h"
#include <sys/mman.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
struct client *client_create( struct server *serve, int socket )
{
NULLCHECK( serve );
struct client *c;
c = xmalloc( sizeof( struct client ) );
c->stopped = 0;
c->socket = socket;
c->serve = serve;
c->stop_signal = self_pipe_create();
c->entrusted = 0;
debug( "Alloced client %p (%d, %d)", c, c->stop_signal->read_fd, c->stop_signal->write_fd );
return c;
}
void client_signal_stop( struct client *c)
{
NULLCHECK( c);
debug("client %p: signal stop (%d, %d)", c,c->stop_signal->read_fd, c->stop_signal->write_fd );
self_pipe_signal( c->stop_signal );
}
void client_destroy( struct client *client )
{
NULLCHECK( client );
debug( "Destroying stop signal for client %p", client );
self_pipe_destroy( client->stop_signal );
free( client );
}
/**
* So waiting on client->socket is len bytes of data, and we must write it all
* to client->mapped. However while doing so we must consult the bitmap
* client->block_allocation_map, which is a bitmap where one bit represents
* block_allocation_resolution bytes. Where a bit isn't set, there are no
* disc blocks allocated for that portion of the file, and we'd like to keep
* it that way.
*
* If the bitmap shows that every block in our prospective write is already
* allocated, we can proceed as normal and make one call to writeloop.
*
*/
void write_not_zeroes(struct client* client, uint64_t from, int len)
{
NULLCHECK( client );
struct bitset_mapping *map = client->serve->allocation_map;
while (len > 0) {
/* so we have to calculate how much of our input to consider
* next based on the bitmap of allocated blocks. This will be
* at a coarser resolution than the actual write, which may
* not fall on a block boundary at either end. So we look up
* how many blocks our write covers, then cut off the start
* and end to get the exact number of bytes.
*/
int run = bitset_run_count(map, from, len);
debug("write_not_zeroes: from=%ld, len=%d, run=%d", from, len, run);
if (run > len) {
run = len;
debug("(run adjusted to %d)", run);
}
if (0) /* useful but expensive */
{
uint64_t i;
fprintf(stderr, "full map resolution=%d: ", map->resolution);
for (i=0; i<client->serve->size; i+=map->resolution) {
int here = (from >= i && from < i+map->resolution);
if (here) { fprintf(stderr, ">"); }
fprintf(stderr, bitset_is_set_at(map, i) ? "1" : "0");
if (here) { fprintf(stderr, "<"); }
}
fprintf(stderr, "\n");
}
#define DO_READ(dst, len) ERROR_IF_NEGATIVE( \
readloop( \
client->socket, \
(dst), \
(len) \
), \
"read failed %ld+%d", from, (len) \
)
if (bitset_is_set_at(map, from)) {
debug("writing the lot: from=%ld, run=%d", from, run);
/* already allocated, just write it all */
DO_READ(client->mapped + from, run);
server_dirty(client->serve, from, run);
len -= run;
from += run;
}
else {
char zerobuffer[block_allocation_resolution];
/* not allocated, read in block_allocation_resolution */
while (run > 0) {
int blockrun = block_allocation_resolution -
(from % block_allocation_resolution);
if (blockrun > run)
blockrun = run;
DO_READ(zerobuffer, blockrun);
/* This reads the buffer twice in the worst case
* but we're leaning on memcmp failing early
* and memcpy being fast, rather than try to
* hand-optimize something specific.
*/
if (zerobuffer[0] != 0 ||
memcmp(zerobuffer, zerobuffer + 1, blockrun - 1)) {
memcpy(client->mapped+from, zerobuffer, blockrun);
bitset_set_range(map, from, blockrun);
server_dirty(client->serve, from, blockrun);
/* at this point we could choose to
* short-cut the rest of the write for
* faster I/O but by continuing to do it
* the slow way we preserve as much
* sparseness as possible.
*/
}
len -= blockrun;
run -= blockrun;
from += blockrun;
}
}
}
}
int fd_read_request( int fd, struct nbd_request_raw *out_request)
{
return readloop(fd, out_request, sizeof(struct nbd_request_raw));
}
/* Returns 1 if *request was filled with a valid request which we should
* try to honour. 0 otherwise. */
int client_read_request( struct client * client , struct nbd_request *out_request, int * disconnected )
{
NULLCHECK( client );
NULLCHECK( out_request );
struct nbd_request_raw request_raw;
fd_set fds;
struct timeval tv = {CLIENT_MAX_WAIT_SECS, 0};
struct timeval * ptv;
int fd_count;
/* We want a timeout if this is an inbound migration, but not
* otherwise
*/
ptv = server_is_in_control( client->serve ) ? NULL : &tv;
FD_ZERO(&fds);
FD_SET(client->socket, &fds);
self_pipe_fd_set( client->stop_signal, &fds );
fd_count = select(FD_SETSIZE, &fds, NULL, NULL, ptv);
if ( fd_count == 0 ) {
/* This "can't ever happen" */
if ( NULL == ptv ) { fatal( "No FDs selected, and no timeout!" ); }
else { error("Timed out waiting for I/O"); }
}
else if ( fd_count < 0 ) { fatal( "Select failed" ); }
if ( self_pipe_fd_isset( client->stop_signal, &fds ) ){
debug("Client received stop signal.");
return 0;
}
if (fd_read_request(client->socket, &request_raw) == -1) {
*disconnected = 1;
switch( errno ){
case 0:
debug( "EOF while reading request" );
return 0;
case ECONNRESET:
debug( "Connection reset while"
" reading request" );
return 0;
default:
/* FIXME: I've seen this happen, but I
* couldn't reproduce it so I'm leaving
* it here with a better debug output in
* the hope it'll spontaneously happen
* again. It should *probably* be an
* error() call, but I want to be sure.
* */
fatal("Error reading request: %d, %s",
errno,
strerror( errno ));
}
}
nbd_r2h_request( &request_raw, out_request );
return 1;
}
int fd_write_reply( int fd, char *handle, int error )
{
struct nbd_reply reply;
struct nbd_reply_raw reply_raw;
reply.magic = REPLY_MAGIC;
reply.error = error;
memcpy( reply.handle, handle, 8 );
nbd_h2r_reply( &reply, &reply_raw );
if( -1 == write( fd, &reply_raw, sizeof( reply_raw ) ) ) {
switch( errno ) {
case ECONNRESET:
error( "Connection reset while writing reply" );
break;
case EBADF:
fatal( "Tried to write to an invalid file descriptor" );
break;
case EPIPE:
error( "Remote end closed" );
break;
default:
fatal( "Unhandled error while writing: %d", errno );
}
}
return 1;
}
/* Writes a reply to request *request, with error, to the client's
* socket.
* Returns 1; we don't check for errors on the write.
* TODO: Check for errors on the write.
*/
int client_write_reply( struct client * client, struct nbd_request *request, int error )
{
return fd_write_reply( client->socket, request->handle, error);
}
void client_write_init( struct client * client, uint64_t size )
{
struct nbd_init init = {{0}};
struct nbd_init_raw init_raw = {{0}};
memcpy( init.passwd, INIT_PASSWD, sizeof( INIT_PASSWD ) );
init.magic = INIT_MAGIC;
init.size = size;
memset( init.reserved, 0, 128 );
nbd_h2r_init( &init, &init_raw );
ERROR_IF_NEGATIVE(
writeloop(client->socket, &init_raw, sizeof(init_raw)),
"Couldn't send hello"
);
}
/* Remove len bytes from the client socket. This is needed when the
* client sends a write we can't honour - we need to get rid of the
* bytes they've already written before we can look for another request.
*/
void client_flush( struct client * client, size_t len )
{
int devnull = open("/dev/null", O_WRONLY);
FATAL_IF_NEGATIVE( devnull,
"Couldn't open /dev/null: %s", strerror(errno));
int pipes[2];
FATAL_IF_NEGATIVE( pipe( pipes ),
"Couldn't create pipe: %s", strerror(errno));
const unsigned int flags = SPLICE_F_MORE | SPLICE_F_MOVE;
size_t spliced = 0;
while ( spliced < len ) {
ssize_t received = splice(
client->socket, NULL,
pipes[1], NULL,
len-spliced, flags );
FATAL_IF_NEGATIVE( received,
"splice error: %s",
strerror(errno));
ssize_t junked = 0;
while( junked < received ) {
ssize_t junk;
junk = splice(
pipes[0], NULL,
devnull, NULL,
received, flags );
FATAL_IF_NEGATIVE( junk,
"splice error: %s",
strerror(errno));
junked += junk;
}
spliced += received;
}
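debug("Flushed %zu bytes", len);
/* close both ends of the pipe as well, or we leak two fds per flush */
close( pipes[0] );
close( pipes[1] );
close( devnull );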
}
/* Check to see if the client's request needs a reply constructing.
* Returns 1 if we do, 0 otherwise.
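* client->disconnect is set to 1 if the client asked to disconnect, in
* which case we drop the connection, and to 0 if the client sent an
* out-of-range write, which we reject and flush without disconnecting.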
* FIXME: after an ENTRUST, there's no way to distinguish between a
* DISCONNECT and any bad request.
*/
int client_request_needs_reply( struct client * client,
struct nbd_request request )
{
debug("request type %d", request.type);
if (request.magic != REQUEST_MAGIC) {
fatal("Bad magic %08x", request.magic);
}
switch (request.type)
{
case REQUEST_READ:
ERROR_IF( client->entrusted,
"Received a read request "
"after an entrust message.");
break;
case REQUEST_WRITE:
ERROR_IF( client->entrusted,
"Received a write request "
"after an entrust message.");
/* check it's not out of range */
if ( request.from+request.len > client->serve->size) {
warn("write request %ld+%d out of range",
request.from,
request.len
);
client_write_reply( client, &request, 1 );
client_flush( client, request.len );
client->disconnect = 0;
return 0;
}
break;
case REQUEST_ENTRUST:
/* Yes, we need to reply to an entrust, but we take no
* further action */
debug("request entrust");
break;
case REQUEST_DISCONNECT:
debug("request disconnect");
client->disconnect = 1;
return 0;
default:
fatal("Unknown request %08x", request.type);
}
return 1;
}
void client_reply_to_entrust( struct client * client, struct nbd_request request )
{
/* An entrust needs a response, but has no data. */
debug( "request entrust" );
client_write_reply( client, &request, 0 );
/* We set this after trying to send the reply, so we know the
* reply got away safely.
*/
client->entrusted = 1;
}
void client_reply_to_read( struct client* client, struct nbd_request request )
{
off64_t offset;
debug("request read %ld+%d", request.from, request.len);
client_write_reply( client, &request, 0);
offset = request.from;
/* If we get cut off partway through this sendfile, we don't
* want to kill the server. This should be an error.
*/
ERROR_IF_NEGATIVE(
sendfileloop(
client->socket,
client->fileno,
&offset,
request.len),
"sendfile failed from=%ld, len=%d",
offset,
request.len);
}
void client_reply_to_write( struct client* client, struct nbd_request request )
{
debug("request write %ld+%d", request.from, request.len);
if (client->serve->allocation_map) {
write_not_zeroes( client, request.from, request.len );
}
else {
debug("No allocation map, writing directly.");
/* If we get cut off partway through reading this data, we
* don't want to kill the server. This should be an error.
* */
ERROR_IF_NEGATIVE(
readloop( client->socket,
client->mapped + request.from,
request.len),
"reading write data failed from=%ld, len=%d",
request.from,
request.len
);
server_dirty(client->serve, request.from, request.len);
}
if (1) /* not sure whether this is necessary... */
{
/* multiple of 4K page size */
uint64_t from_rounded = request.from & ~((uint64_t)0xfff);
uint64_t len_rounded = request.len + (request.from - from_rounded);
FATAL_IF_NEGATIVE(
msync( client->mapped + from_rounded,
len_rounded,
MS_SYNC),
"msync failed %ld %d", request.from, request.len
);
}
client_write_reply( client, &request, 0);
}
void client_reply( struct client* client, struct nbd_request request )
{
switch (request.type) {
case REQUEST_READ:
client_reply_to_read( client, request );
break;
case REQUEST_WRITE:
client_reply_to_write( client, request );
break;
case REQUEST_ENTRUST:
client_reply_to_entrust( client, request );
break;
}
}
/* Returns 0 if we should continue trying to serve requests */
int client_serve_request(struct client* client)
{
struct nbd_request request = {0};
int failure = 1;
int disconnected = 0;
if ( !client_read_request( client, &request, &disconnected ) ) { return failure; }
if ( disconnected ) { return failure; }
if ( !client_request_needs_reply( client, request ) ) {
return client->disconnect;
}
server_lock_io( client->serve );
{
if ( !server_is_closed( client->serve ) ) {
client_reply( client, request );
failure = 0;
}
}
server_unlock_io( client->serve );
return failure;
}
void client_send_hello(struct client* client)
{
client_write_init( client, client->serve->size );
}
void client_cleanup(struct client* client,
int fatal __attribute__ ((unused)) )
{
info("client cleanup for client %p", client);
if (client->socket) { close(client->socket); }
if (client->mapped) {
munmap(client->mapped, client->serve->size);
}
if (client->fileno) { close(client->fileno); }
if ( server_io_locked( client->serve ) ) { server_unlock_io( client->serve ); }
if ( server_acl_locked( client->serve ) ) { server_unlock_acl( client->serve ); }
}
void* client_serve(void* client_uncast)
{
struct client* client = (struct client*) client_uncast;
error_set_handler((cleanup_handler*) client_cleanup, client);
info("client: mmaping file");
FATAL_IF_NEGATIVE(
open_and_mmap(
client->serve->filename,
&client->fileno,
NULL,
(void**) &client->mapped
),
"Couldn't open/mmap file %s: %s", client->serve->filename, strerror( errno )
);
debug("client: sending hello");
client_send_hello(client);
debug("client: serving requests");
while (client_serve_request(client) == 0)
;
debug("client: stopped serving requests");
client->stopped = 1;
if ( client->entrusted ) {
if ( client->disconnect ){
debug("client: control arrived" );
server_control_arrived( client->serve );
}
else {
warn( "client: control transfer failed." );
}
}
FATAL_IF_NEGATIVE(
close(client->socket),
"Couldn't close socket %d",
client->socket
);
debug("Cleaning client %p up normally in thread %p", client, pthread_self());
client_cleanup(client, 0);
debug("Client thread done" );
return NULL;
}
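
The page rounding done before msync in client_reply_to_write depends on clearing the low 12 bits of the offset, which needs the bitwise complement (`~`) and not the logical not (`!`): `!0xfff` evaluates to 0, so masking with it always yields offset 0. A minimal sketch of the intended arithmetic, assuming 4K pages; `SKETCH_PAGE_MASK`, `round_down_4k` and `span_from_page_start` are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Low 12 bits select the offset within a 4K page (assumed page size). */
#define SKETCH_PAGE_MASK ((uint64_t)0xfff)

/* Round a file offset down to the start of its 4K page. */
static uint64_t round_down_4k(uint64_t offset)
{
    return offset & ~SKETCH_PAGE_MASK;
}

/* Length to sync so that a range of `len` bytes starting at `from`
 * is still fully covered once the start is moved back to the page
 * boundary. */
static uint64_t span_from_page_start(uint64_t from, uint64_t len)
{
    uint64_t start = round_down_4k(from);
    assert(start <= from);
    return len + (from - start);
}
```

The widening cast matters when the mask is written as a plain `int` constant next to a 64-bit offset; building the mask as a `uint64_t` before complementing it avoids relying on sign extension.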

44
src/client.h Normal file

@@ -0,0 +1,44 @@
#ifndef CLIENT_H
#define CLIENT_H
/** CLIENT_MAX_WAIT_SECS
* This is the length of time an inbound migration will wait for a fresh
* write before assuming the source has Gone Away. Note: it is *not*
* the time from one write to the next, it is the gap between the end of
* one write and the start of the next.
*/
#define CLIENT_MAX_WAIT_SECS 5
struct client {
/* When we call pthread_join, if the thread is already dead
* we can get an ESRCH. Since we have no other way to tell
* if that ESRCH is from a dead thread or a thread that never
* existed, we use a `stopped` flag to indicate a thread which
* did exist, but went away. Only check this after a
* pthread_join call.
*/
int stopped;
int socket;
int fileno;
char* mapped;
struct self_pipe * stop_signal;
struct server* serve; /* FIXME: remove above duplication */
/* Have we seen a REQUEST_ENTRUST message? */
int entrusted;
/* Have we seen a REQUEST_DISCONNECT message? */
int disconnect;
};
void* client_serve(void* client_uncast);
struct client * client_create( struct server * serve, int socket );
void client_destroy( struct client * client );
void client_signal_stop( struct client * client );
#endif

499
src/control.c Normal file

@@ -0,0 +1,499 @@
/* FlexNBD server (C) Bytemark Hosting 2012
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/** The control server responds on a UNIX socket and services our "remote"
* commands which are used for changing the access control list, initiating
* a mirror process, or asking for status. The protocol is pretty simple -
* after connecting the client sends a series of LF-terminated lines, followed
* by a blank line (i.e. double LF). The first line is taken to be the command
* name to invoke, and the remaining lines before the double LF are its
* arguments.
*
* These commands can be invoked remotely from the command line, with the
* client code to be found in remote.c
*/
#include "control.h"
#include "mirror.h"
#include "serve.h"
#include "util.h"
#include "ioutil.h"
#include "parse.h"
#include "readwrite.h"
#include "bitset.h"
#include "self_pipe.h"
#include "acl.h"
#include "status.h"
#include "mbox.h"
#include <stdlib.h>
#include <string.h>
#include <sys/un.h>
#include <unistd.h>
struct control * control_create(
struct flexnbd * flexnbd,
const char * csn)
{
struct control * control = xmalloc( sizeof( struct control ) );
NULLCHECK( csn );
control->flexnbd = flexnbd;
control->socket_name = csn;
control->close_signal = self_pipe_create();
control->mirror_state_mbox = mbox_create();
return control;
}
void control_signal_close( struct control * control)
{
NULLCHECK( control );
self_pipe_signal( control->close_signal );
}
void control_destroy( struct control * control )
{
NULLCHECK( control );
mbox_destroy( control->mirror_state_mbox );
self_pipe_destroy( control->close_signal );
free( control );
}
struct control_client * control_client_create(
struct flexnbd * flexnbd,
int client_fd ,
struct mbox * state_mbox )
{
NULLCHECK( flexnbd );
struct control_client * control_client =
xmalloc( sizeof( struct control_client ) );
control_client->socket = client_fd;
control_client->flexnbd = flexnbd;
control_client->mirror_state_mbox = state_mbox;
return control_client;
}
void control_client_destroy( struct control_client * client )
{
NULLCHECK( client );
free( client );
}
void control_respond(struct control_client * client);
void control_handle_client( struct control * control, int client_fd )
{
NULLCHECK( control );
NULLCHECK( control->flexnbd );
struct control_client * control_client =
control_client_create(
control->flexnbd,
client_fd ,
control->mirror_state_mbox);
/* We intentionally don't spawn a thread for the client here.
* This is to avoid having more than one thread potentially
* waiting on the migration commit status.
*/
control_respond( control_client );
}
void control_accept_client( struct control * control )
{
int client_fd;
union mysockaddr client_address;
socklen_t addrlen = sizeof( union mysockaddr );
client_fd = accept( control->control_fd, &client_address.generic, &addrlen );
FATAL_IF( -1 == client_fd, "control accept failed" );
control_handle_client( control, client_fd );
}
int control_accept( struct control * control )
{
NULLCHECK( control );
fd_set fds;
FD_ZERO( &fds );
FD_SET( control->control_fd, &fds );
self_pipe_fd_set( control->close_signal, &fds );
debug("Control thread selecting");
FATAL_UNLESS( 0 < select( FD_SETSIZE, &fds, NULL, NULL, NULL ),
"Control select failed." );
if ( self_pipe_fd_isset( control->close_signal, &fds ) ){
return 0;
}
if ( FD_ISSET( control->control_fd, &fds ) ) {
control_accept_client( control );
}
return 1;
}
void control_accept_loop( struct control * control )
{
while( control_accept( control ) );
}
int open_control_socket( const char * socket_name )
{
struct sockaddr_un bind_address;
int control_fd;
if (!socket_name) {
fatal( "Tried to open a control socket without a socket name" );
}
control_fd = socket(AF_UNIX, SOCK_STREAM, 0);
FATAL_IF_NEGATIVE(control_fd ,
"Couldn't create control socket");
memset(&bind_address, 0, sizeof(struct sockaddr_un));
bind_address.sun_family = AF_UNIX;
strncpy(bind_address.sun_path, socket_name, sizeof(bind_address.sun_path)-1);
//unlink(socket_name); /* ignore failure */
FATAL_IF_NEGATIVE(
bind(control_fd , &bind_address, sizeof(bind_address)),
"Couldn't bind control socket to %s: %s",
socket_name, strerror( errno )
);
FATAL_IF_NEGATIVE(
listen(control_fd , 5),
"Couldn't listen on control socket"
);
return control_fd;
}
void control_listen(struct control* control)
{
NULLCHECK( control );
control->control_fd = open_control_socket( control->socket_name );
}
void control_serve( struct control * control )
{
NULLCHECK( control );
control_listen( control );
while( control_accept( control ) );
}
void control_cleanup(
struct control * control,
int fatal __attribute__((unused)) )
{
NULLCHECK( control );
unlink( control->socket_name );
close( control->control_fd );
}
void * control_runner( void * control_uncast )
{
debug("Control thread");
NULLCHECK( control_uncast );
struct control * control = (struct control *)control_uncast;
error_set_handler( (cleanup_handler*)control_cleanup, control );
control_serve( control );
control_cleanup( control, 0 );
return NULL;
}
#define write_socket(msg) write(client_fd, (msg "\n"), strlen((msg))+1)
void control_write_mirror_response( enum mirror_state mirror_state, int client_fd )
{
switch (mirror_state) {
case MS_INIT:
case MS_UNKNOWN:
write_socket( "1: Mirror failed to initialise" );
fatal( "Impossible mirror state: %d", mirror_state );
case MS_FAIL_CONNECT:
write_socket( "1: Mirror failed to connect");
break;
case MS_FAIL_REJECTED:
write_socket( "1: Mirror was rejected" );
break;
case MS_FAIL_NO_HELLO:
write_socket( "1: Remote server failed to respond");
break;
case MS_FAIL_SIZE_MISMATCH:
write_socket( "1: Remote size does not match local size" );
break;
case MS_GO:
case MS_DONE: /* Yes, I know we know better, but it's simpler this way */
write_socket( "0: Mirror started" );
break;
default:
fatal( "Unhandled mirror state: %d", mirror_state );
}
}
#undef write_socket
/* Call this in the thread where you want to receive the mirror state */
enum mirror_state control_client_mirror_wait(
struct control_client* client)
{
NULLCHECK( client );
NULLCHECK( client->mirror_state_mbox );
struct mbox * mbox = client->mirror_state_mbox;
enum mirror_state mirror_state;
enum mirror_state * contents;
contents = (enum mirror_state*)mbox_receive( mbox );
NULLCHECK( contents );
mirror_state = *contents;
free( contents );
return mirror_state;
}
#define write_socket(msg) write(client->socket, (msg "\n"), strlen((msg))+1)
/** Command parser to start mirror process from socket input */
int control_mirror(struct control_client* client, int linesc, char** lines)
{
NULLCHECK( client );
struct flexnbd * flexnbd = client->flexnbd;
union mysockaddr *connect_to = xmalloc( sizeof( union mysockaddr ) );
union mysockaddr *connect_from = NULL;
uint64_t max_Bps = 0;
int action_at_finish;
int raw_port;
if (linesc < 2) {
write_socket("1: mirror takes at least two parameters");
return -1;
}
if (parse_ip_to_sockaddr(&connect_to->generic, lines[0]) == 0) {
write_socket("1: bad IP address");
return -1;
}
raw_port = atoi(lines[1]);
if (raw_port < 0 || raw_port > 65535) {
write_socket("1: bad IP port number");
return -1;
}
connect_to->v4.sin_port = htobe16(raw_port);
if (linesc > 2) {
connect_from = xmalloc( sizeof( union mysockaddr ) );
if (parse_ip_to_sockaddr(&connect_from->generic, lines[2]) == 0) {
write_socket("1: bad bind address");
return -1;
}
}
/* lines[2] is the bind address, so the rate limit is lines[3] and
 * the finish action is lines[4] */
if (linesc > 3) { max_Bps = atoi(lines[3]); }
action_at_finish = ACTION_EXIT;
if (linesc > 4) {
if (strcmp("exit", lines[4]) == 0) {
action_at_finish = ACTION_EXIT;
}
else if (strcmp("nothing", lines[4]) == 0) {
action_at_finish = ACTION_NOTHING;
}
else {
write_socket("1: action must be 'exit' or 'nothing'");
return -1;
}
}
if (linesc > 5) {
write_socket("1: unrecognised parameters to mirror");
return -1;
}
/* In theory, we should never have to worry about the switch
* lock here, since we should never be able to start more than
* one mirror at a time. This is enforced by only accepting a
* single client at a time on the control socket.
*/
flexnbd_lock_switch( flexnbd );
{
struct server * serve = flexnbd_server(flexnbd);
serve->mirror_super = mirror_super_create(
serve->filename,
connect_to,
connect_from,
max_Bps ,
action_at_finish,
client->mirror_state_mbox );
serve->mirror = serve->mirror_super->mirror;
FATAL_IF( 0 != pthread_create(
&serve->mirror_super->thread,
NULL,
mirror_super_runner,
serve
),
"Failed to create mirror thread"
);
debug("Control thread mirror super waiting");
enum mirror_state state =
control_client_mirror_wait( client );
debug("Control thread writing response");
control_write_mirror_response( state, client->socket );
}
debug( "Control thread unlocking switch" );
flexnbd_unlock_switch( flexnbd );
debug( "Control thread going away." );
return 0;
}
#undef write_socket
/** Command parser to alter access control list from socket input */
int control_acl(struct control_client* client, int linesc, char** lines)
{
NULLCHECK( client );
NULLCHECK( client->flexnbd );
struct flexnbd * flexnbd = client->flexnbd;
int default_deny = flexnbd_default_deny( flexnbd );
struct acl * new_acl = acl_create( linesc, lines, default_deny );
if (new_acl->len != linesc) {
write(client->socket, "1: bad spec: ", 13);
write(client->socket, lines[new_acl->len],
strlen(lines[new_acl->len]));
write(client->socket, "\n", 1);
acl_destroy( new_acl );
}
else {
flexnbd_replace_acl( flexnbd, new_acl );
write( client->socket, "0: updated", 10);
}
return 0;
}
/** FIXME: add some useful statistics */
int control_status(
struct control_client* client,
int linesc __attribute__ ((unused)),
char** lines __attribute__((unused))
)
{
NULLCHECK( client );
NULLCHECK( client->flexnbd );
struct status * status = flexnbd_status_create( client->flexnbd );
write( client->socket, "0: ", 3 );
status_write( status, client->socket );
status_destroy( status );
return 0;
}
void control_client_cleanup(struct control_client* client,
int fatal __attribute__ ((unused)) )
{
if (client->socket) { close(client->socket); }
/* This is wrongness */
if ( server_io_locked( client->flexnbd->serve ) ) { server_unlock_io( client->flexnbd->serve ); }
if ( server_acl_locked( client->flexnbd->serve ) ) { server_unlock_acl( client->flexnbd->serve ); }
if ( flexnbd_switch_locked( client->flexnbd ) ) { flexnbd_unlock_switch( client->flexnbd ); }
control_client_destroy( client );
}
/** Master command parser for control socket connections, delegates quickly */
void control_respond(struct control_client * client)
{
char **lines = NULL;
error_set_handler((cleanup_handler*) control_client_cleanup, client);
int i, linesc;
linesc = read_lines_until_blankline(client->socket, 256, &lines);
if (linesc < 1)
{
write(client->socket, "9: missing command\n", 19);
/* ignore failure */
}
else if (strcmp(lines[0], "acl") == 0) {
info("acl command received" );
if (control_acl(client, linesc-1, lines+1) < 0) {
debug("acl command failed");
}
}
else if (strcmp(lines[0], "mirror") == 0) {
info("mirror command received" );
if (control_mirror(client, linesc-1, lines+1) < 0) {
debug("mirror command failed");
}
}
else if (strcmp(lines[0], "status") == 0) {
info("status command received" );
if (control_status(client, linesc-1, lines+1) < 0) {
debug("status command failed");
}
}
else {
write(client->socket, "10: unknown command\n", 20);
}
for (i=0; i<linesc; i++) {
free(lines[i]);
}
free(lines);
control_client_cleanup(client, 0);
debug("control command handled" );
}
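
Review note: control_respond() above reads LF-terminated lines until a blank line, treats the first line as the command name, and replies with "<code>: <message>". A minimal client-side sketch of that framing follows. The helper names format_control_request and parse_control_status are hypothetical and not part of this changeset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join argc lines with LF and finish with a blank
 * line, which is what read_lines_until_blankline() waits for.
 * Returns the request length, or -1 if buf is too small. */
int format_control_request(char *buf, size_t bufsize, int argc, const char **argv)
{
	size_t used = 0;
	int i;

	assert(buf != NULL && argv != NULL);
	for (i = 0; i < argc; i++) {
		size_t len = strlen(argv[i]);
		/* Room for this line's LF, the closing LF and the NUL. */
		if (used + len + 3 > bufsize) { return -1; }
		memcpy(buf + used, argv[i], len);
		used += len;
		buf[used++] = '\n';
	}
	if (used + 2 > bufsize) { return -1; }
	buf[used++] = '\n';
	buf[used] = '\0';
	return (int) used;
}

/* Hypothetical helper: split a reply such as "1: bad spec: foo" into
 * its numeric code and message. Returns -1 if the reply is malformed. */
int parse_control_status(const char *reply, const char **message)
{
	char *end = NULL;
	long code;

	assert(reply != NULL && message != NULL);
	code = strtol(reply, &end, 10);
	if (end == reply || code < 0 || end[0] != ':' || end[1] != ' ') {
		return -1;
	}
	*message = end + 2;
	return (int) code;
}
```

Keeping the framing in one place on the client side makes it harder to forget the closing blank line, without which the server keeps waiting for more input.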

54
src/control.h Normal file

@@ -0,0 +1,54 @@
#ifndef CONTROL_H
#define CONTROL_H
#include "parse.h"
#include "mirror.h"
#include "flexnbd.h"
#include "mbox.h"
struct control {
struct flexnbd * flexnbd;
int control_fd;
const char * socket_name;
pthread_t thread;
struct self_pipe * close_signal;
/* This is owned by the control object, and used by a
* mirror_super to communicate the state of a mirror attempt as
* early as feasible. It can't be owned by the mirror_super
* object because the mirror_super object can be freed at any
* time (including while the control_client is waiting on it),
* whereas the control object lasts for the lifetime of the
* process (and we can only have a mirror thread if the control
* thread has started it).
*/
struct mbox * mirror_state_mbox;
};
struct control_client{
int socket;
struct flexnbd * flexnbd;
/* Passed in on creation. We know it's all right to do this
* because we know there's only ever one control_client.
*/
struct mbox * mirror_state_mbox;
};
struct control * control_create(
struct flexnbd *,
const char * control_socket_name );
void control_signal_close( struct control * );
void control_destroy( struct control * );
void * control_runner( void * );
void accept_control_connection(struct server* params, int client_fd, union mysockaddr* client_address);
void serve_open_control_socket(struct server* params);
#endif

315
src/flexnbd.c Normal file

@@ -0,0 +1,315 @@
/* FlexNBD server (C) Bytemark Hosting 2012
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/** main() function for parsing and dispatching commands. Each mode has
* a corresponding structure which is filled in and passed to a do_ function
* elsewhere in the program.
*/
#include "flexnbd.h"
#include "serve.h"
#include "listen.h"
#include "util.h"
#include "control.h"
#include "status.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/signalfd.h>
#include <fcntl.h>
#include <unistd.h>
#include <signal.h>
#include <getopt.h>
#include "acl.h"
int flexnbd_build_signal_fd(void)
{
sigset_t mask;
int sfd;
sigemptyset( &mask );
sigaddset( &mask, SIGTERM );
sigaddset( &mask, SIGQUIT );
sigaddset( &mask, SIGINT );
FATAL_UNLESS( 0 == pthread_sigmask( SIG_BLOCK, &mask, NULL ),
"Signal blocking failed" );
sfd = signalfd( -1, &mask, 0 );
FATAL_IF( -1 == sfd, "Failed to get a signal fd" );
return sfd;
}
void flexnbd_create_shared(
struct flexnbd * flexnbd,
const char * s_ctrl_sock)
{
NULLCHECK( flexnbd );
if ( s_ctrl_sock ){
flexnbd->control =
control_create( flexnbd, s_ctrl_sock );
}
else {
flexnbd->control = NULL;
}
flexnbd->signal_fd = flexnbd_build_signal_fd();
flexnbd->switch_mutex = flexthread_mutex_create();
}
struct flexnbd * flexnbd_create_serving(
char* s_ip_address,
char* s_port,
char* s_file,
char *s_ctrl_sock,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients)
{
struct flexnbd * flexnbd = xmalloc( sizeof( struct flexnbd ) );
flexnbd->serve = server_create(
flexnbd,
s_ip_address,
s_port,
s_file,
default_deny,
acl_entries,
s_acl_entries,
max_nbd_clients,
1);
flexnbd_create_shared( flexnbd, s_ctrl_sock );
return flexnbd;
}
struct flexnbd * flexnbd_create_listening(
char* s_ip_address,
char* s_rebind_ip_address,
char* s_port,
char* s_rebind_port,
char* s_file,
char *s_ctrl_sock,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients )
{
struct flexnbd * flexnbd = xmalloc( sizeof( struct flexnbd ) );
flexnbd->listen = listen_create(
flexnbd,
s_ip_address,
s_rebind_ip_address,
s_port,
s_rebind_port,
s_file,
default_deny,
acl_entries,
s_acl_entries,
max_nbd_clients);
flexnbd->serve = flexnbd->listen->init_serve;
flexnbd_create_shared( flexnbd, s_ctrl_sock );
return flexnbd;
}
void flexnbd_spawn_control(struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
NULLCHECK( flexnbd->control );
pthread_t * control_thread = &flexnbd->control->thread;
FATAL_UNLESS( 0 == pthread_create(
control_thread,
NULL,
control_runner,
flexnbd->control ),
"Couldn't create the control thread" );
}
void flexnbd_stop_control( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
NULLCHECK( flexnbd->control );
control_signal_close( flexnbd->control );
FATAL_UNLESS( 0 == pthread_join( flexnbd->control->thread, NULL ),
"Failed joining the control thread" );
}
int flexnbd_signal_fd( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
return flexnbd->signal_fd;
}
void flexnbd_destroy( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
if ( flexnbd->control ) {
control_destroy( flexnbd->control );
}
if ( flexnbd->listen ) {
listen_destroy( flexnbd->listen );
}
flexthread_mutex_destroy( flexnbd->switch_mutex );
close( flexnbd->signal_fd );
free( flexnbd );
}
/* THOU SHALT NOT DEREFERENCE flexnbd->serve OUTSIDE A SWITCH LOCK
*/
void flexnbd_lock_switch( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
flexthread_mutex_lock( flexnbd->switch_mutex );
}
void flexnbd_unlock_switch( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
flexthread_mutex_unlock( flexnbd->switch_mutex );
}
int flexnbd_switch_locked( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
return flexthread_mutex_held( flexnbd->switch_mutex );
}
struct server * flexnbd_server( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
return flexnbd->serve;
}
void flexnbd_replace_acl( struct flexnbd * flexnbd, struct acl * acl )
{
NULLCHECK( flexnbd );
flexnbd_lock_switch( flexnbd );
{
server_replace_acl( flexnbd_server(flexnbd), acl );
}
flexnbd_unlock_switch( flexnbd );
}
struct status * flexnbd_status_create( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
struct status * status;
flexnbd_lock_switch( flexnbd );
{
status = status_create( flexnbd_server( flexnbd ) );
}
flexnbd_unlock_switch( flexnbd );
return status;
}
/** THOU SHALT *ONLY* CALL THIS FROM INSIDE A SWITCH LOCK
*/
void flexnbd_set_server( struct flexnbd * flexnbd, struct server * serve )
{
NULLCHECK( flexnbd );
flexnbd->serve = serve;
}
/* Calls the given callback to exchange server objects, then sets
* flexnbd->serve so everything else can see it. */
void flexnbd_switch( struct flexnbd * flexnbd, struct server *(listen_cb)(struct listen *) )
{
NULLCHECK( flexnbd );
NULLCHECK( flexnbd->listen );
flexnbd_lock_switch( flexnbd );
{
struct server * new_server = listen_cb( flexnbd->listen );
NULLCHECK( new_server );
flexnbd_set_server( flexnbd, new_server );
}
flexnbd_unlock_switch( flexnbd );
}
/* Get the default_deny of the current server object. This takes the
* switch_lock to avoid nastiness if the server switches and gets freed
* in the dereference chain.
* This means that this function must not be called if the switch lock
* is already held.
*/
int flexnbd_default_deny( struct flexnbd * flexnbd )
{
int result;
NULLCHECK( flexnbd );
flexnbd_lock_switch( flexnbd );
{
result = server_default_deny( flexnbd->serve );
}
flexnbd_unlock_switch( flexnbd );
return result;
}
int flexnbd_serve( struct flexnbd * flexnbd )
{
NULLCHECK( flexnbd );
int success;
if ( flexnbd->control ){
debug( "Spawning control thread" );
flexnbd_spawn_control( flexnbd );
}
if ( flexnbd->listen ){
success = do_listen( flexnbd->listen );
}
else {
do_serve( flexnbd->serve );
/* We can't tell here what the intent was. We can
* legitimately exit either in control or not.
*/
success = 1;
}
if ( flexnbd->control ) {
debug( "Stopping control thread" );
flexnbd_stop_control( flexnbd );
debug("Control thread stopped");
}
return success;
}

77
src/flexnbd.h Normal file

@@ -0,0 +1,77 @@
#ifndef FLEXNBD_H
#define FLEXNBD_H
#include "acl.h"
#include "mirror.h"
#include "serve.h"
#include "listen.h"
#include "self_pipe.h"
#include "mbox.h"
#include "control.h"
#include "flexthread.h"
/* Carries the "globals". */
struct flexnbd {
/* We always have a serve pointer, but it should never be
* dereferenced outside a flexnbd_switch_lock/unlock pair.
*/
struct server * serve;
/* We only have a listen object if the process was started in
* listen mode.
*/
struct listen * listen;
/* We only have a control object if a control socket name was
* passed on the command line.
*/
struct control * control;
/* switch_mutex is the lock around dereferencing the serve
* pointer.
*/
struct flexthread_mutex * switch_mutex;
/* File descriptor for a signalfd(2) signal stream. */
int signal_fd;
};
struct flexnbd * flexnbd_create(void);
struct flexnbd * flexnbd_create_serving(
char* s_ip_address,
char* s_port,
char* s_file,
char *s_ctrl_sock,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients);
struct flexnbd * flexnbd_create_listening(
char* s_ip_address,
char* s_rebind_ip_address,
char* s_port,
char* s_rebind_port,
char* s_file,
char *s_ctrl_sock,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients );
void flexnbd_destroy( struct flexnbd * );
enum mirror_state;
enum mirror_state flexnbd_get_mirror_state( struct flexnbd * );
void flexnbd_lock_switch( struct flexnbd * );
void flexnbd_unlock_switch( struct flexnbd * );
int flexnbd_switch_locked( struct flexnbd * );
int flexnbd_default_deny( struct flexnbd * );
void flexnbd_set_server( struct flexnbd * flexnbd, struct server * serve );
void flexnbd_switch( struct flexnbd * flexnbd, struct server *(listen_cb)(struct listen *) );
int flexnbd_signal_fd( struct flexnbd * flexnbd );
int flexnbd_serve( struct flexnbd * flexnbd );
struct server * flexnbd_server( struct flexnbd * flexnbd );
void flexnbd_replace_acl( struct flexnbd * flexnbd, struct acl * acl );
struct status * flexnbd_status_create( struct flexnbd * flexnbd );
#endif

75
src/flexthread.c Normal file

@@ -0,0 +1,75 @@
#include "flexthread.h"
#include "util.h"
#include <pthread.h>
struct flexthread_mutex * flexthread_mutex_create(void)
{
struct flexthread_mutex * ftm =
xmalloc( sizeof( struct flexthread_mutex ) );
FATAL_UNLESS( 0 == pthread_mutex_init( &ftm->mutex, NULL ),
"Mutex initialisation failed" );
return ftm;
}
void flexthread_mutex_destroy( struct flexthread_mutex * ftm )
{
NULLCHECK( ftm );
if( flexthread_mutex_held( ftm ) ) {
flexthread_mutex_unlock( ftm );
}
else if ( (pthread_t)NULL != ftm->holder ) {
/* This "should never happen": if we can try to destroy
* a mutex currently held by another thread, there's a
* logic bug somewhere. I know the test here is racy,
* but there's not a lot we can do about it at this
* point.
*/
fatal( "Attempted to destroy a flexthread_mutex"\
" held by another thread!" );
}
FATAL_UNLESS( 0 == pthread_mutex_destroy( &ftm->mutex ),
"Mutex destroy failed" );
free( ftm );
}
int flexthread_mutex_lock( struct flexthread_mutex * ftm )
{
NULLCHECK( ftm );
int failure = pthread_mutex_lock( &ftm->mutex );
if ( 0 == failure ) {
ftm->holder = pthread_self();
}
return failure;
}
int flexthread_mutex_unlock( struct flexthread_mutex * ftm )
{
NULLCHECK( ftm );
pthread_t orig = ftm->holder;
ftm->holder = (pthread_t)NULL;
int failure = pthread_mutex_unlock( &ftm->mutex );
if ( 0 != failure ) {
ftm->holder = orig;
}
return failure;
}
int flexthread_mutex_held( struct flexthread_mutex * ftm )
{
NULLCHECK( ftm );
return pthread_self() == ftm->holder;
}

29
src/flexthread.h Normal file

@@ -0,0 +1,29 @@
#ifndef FLEXTHREAD_H
#define FLEXTHREAD_H
#include <pthread.h>
/* Define a mutex wrapper object. This wrapper allows us to easily
* track whether or not we currently hold the wrapped mutex. If we hold
* the mutex when we destroy it, then we first release it.
*
* These are specifically for the case where an ERROR_* handler gets
* called when we might (or might not) have a mutex held. The
* flexthread_mutex_held() function will tell you if your thread
* currently holds the given mutex. It's not safe to make any other
* comparisons.
*/
struct flexthread_mutex {
pthread_mutex_t mutex;
pthread_t holder;
};
struct flexthread_mutex * flexthread_mutex_create(void);
void flexthread_mutex_destroy( struct flexthread_mutex * );
int flexthread_mutex_lock( struct flexthread_mutex * );
int flexthread_mutex_unlock( struct flexthread_mutex * );
int flexthread_mutex_held( struct flexthread_mutex * );
#endif
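
Review note: flexthread_mutex records the holder by casting NULL to pthread_t and comparing with ==, which works on Linux but is not guaranteed for an opaque pthread_t. A sketch of the same idea using an explicit flag and pthread_equal() follows. The tracked_mutex names are hypothetical, and the same caveat applies as in the header comment: only the question "does my thread hold this?" is safe to ask.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical variant of struct flexthread_mutex. */
struct tracked_mutex {
	pthread_mutex_t mutex;
	pthread_t holder;   /* only meaningful while has_holder is set */
	int has_holder;
};

int tracked_mutex_init(struct tracked_mutex *tm)
{
	assert(tm != NULL);
	tm->has_holder = 0;
	return pthread_mutex_init(&tm->mutex, NULL);
}

/* True only if the calling thread currently holds the mutex. */
int tracked_mutex_held(struct tracked_mutex *tm)
{
	assert(tm != NULL);
	return tm->has_holder && pthread_equal(tm->holder, pthread_self());
}

int tracked_mutex_lock(struct tracked_mutex *tm)
{
	int failure;

	assert(tm != NULL);
	failure = pthread_mutex_lock(&tm->mutex);
	if (0 == failure) {
		tm->holder = pthread_self();
		tm->has_holder = 1;
	}
	return failure;
}

/* Refuses with EPERM unless the caller is the recorded holder. */
int tracked_mutex_unlock(struct tracked_mutex *tm)
{
	assert(tm != NULL);
	if (!tracked_mutex_held(tm)) { return EPERM; }
	tm->has_holder = 0;
	return pthread_mutex_unlock(&tm->mutex);
}
```

An error handler can then call tracked_mutex_held() and unlock only when it returns true, which is the pattern control_client_cleanup() and listen_cleanup() rely on.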

282
src/ioutil.c Normal file

@@ -0,0 +1,282 @@
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <linux/fs.h>
#include <linux/fiemap.h>
#include <stdlib.h>
#include <string.h> /* memcpy, memset, strdup */
#include <unistd.h>
#include <fcntl.h>
#include "util.h"
#include "bitset.h"
struct bitset_mapping* build_allocation_map(int fd, uint64_t size, int resolution)
{
unsigned int i;
struct bitset_mapping* allocation_map = bitset_alloc(size, resolution);
struct fiemap *fiemap_count = NULL, *fiemap = NULL;
fiemap_count = (struct fiemap*) xmalloc(sizeof(struct fiemap));
fiemap_count->fm_start = 0;
fiemap_count->fm_length = size;
fiemap_count->fm_flags = 0;
fiemap_count->fm_extent_count = 0;
fiemap_count->fm_mapped_extents = 0;
/* Find out how many extents there are */
if (ioctl(fd, FS_IOC_FIEMAP, fiemap_count) < 0) {
debug( "Couldn't get fiemap_count, returning no allocation_map" );
goto no_map;
}
/* Resize fiemap to allow us to read in the extents */
fiemap = (struct fiemap*)xmalloc(
sizeof(struct fiemap) + (
sizeof(struct fiemap_extent) *
fiemap_count->fm_mapped_extents
)
);
/* realloc makes valgrind complain a lot */
memcpy(fiemap, fiemap_count, sizeof(struct fiemap));
free( fiemap_count );
fiemap_count = NULL; /* stops no_map from freeing it a second time */
fiemap->fm_extent_count = fiemap->fm_mapped_extents;
fiemap->fm_mapped_extents = 0;
if (ioctl(fd, FS_IOC_FIEMAP, fiemap) < 0) {
debug( "Couldn't get fiemap, returning no allocation_map" );
goto no_map;
}
for (i=0;i<fiemap->fm_mapped_extents;i++) {
bitset_set_range(
allocation_map,
fiemap->fm_extents[i].fe_logical,
fiemap->fm_extents[i].fe_length
);
}
/* This is pointlessly verbose for real discs, it's here as a
* reference for pulling data out of the allocation map */
if ( 0 ) {
for (i=0; i<(size/resolution); i++) {
debug("map[%d] = %d%d%d%d%d%d%d%d",
i,
(allocation_map->bits[i] & 1) == 1,
(allocation_map->bits[i] & 2) == 2,
(allocation_map->bits[i] & 4) == 4,
(allocation_map->bits[i] & 8) == 8,
(allocation_map->bits[i] & 16) == 16,
(allocation_map->bits[i] & 32) == 32,
(allocation_map->bits[i] & 64) == 64,
(allocation_map->bits[i] & 128) == 128
);
}
}
free(fiemap);
debug("Successfully built allocation map");
return allocation_map;
no_map:
free( allocation_map );
if ( NULL != fiemap ) { free( fiemap ); }
if ( NULL != fiemap_count ) { free( fiemap_count ); }
return NULL;
}
int open_and_mmap(const char* filename, int* out_fd, off64_t *out_size, void **out_map)
{
off64_t size;
/* O_DIRECT seems to be intermittently supported. Leaving it as
* a compile-time option for now. */
#ifdef DIRECT_IO
*out_fd = open(filename, O_RDWR | O_DIRECT | O_SYNC );
#else
*out_fd = open(filename, O_RDWR | O_SYNC );
#endif
if (*out_fd < 0) {
warn("open(%s) failed: does it exist?", filename);
return *out_fd;
}
size = lseek64(*out_fd, 0, SEEK_END);
if (size < 0) {
warn("lseek64() failed");
return size;
}
if (out_size) {
*out_size = size;
}
if (out_map) {
*out_map = mmap64(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED,
*out_fd, 0);
if (*out_map == MAP_FAILED) {
warn("mmap64() failed");
return -1;
}
}
debug("opened %s size %ld on fd %d @ %p", filename, size, *out_fd, out_map ? *out_map : NULL);
return 0;
}
int writeloop(int filedes, const void *buffer, size_t size)
{
size_t written=0;
while (written < size) {
ssize_t result = write(filedes, buffer+written, size-written);
if (result == -1) { return -1; }
written += result;
}
return 0;
}
int readloop(int filedes, void *buffer, size_t size)
{
size_t readden=0;
while (readden < size) {
ssize_t result = read(filedes, buffer+readden, size-readden);
if (result == 0 /* EOF */ || result == -1 /* error */) {
return -1;
}
readden += result;
}
return 0;
}
int sendfileloop(int out_fd, int in_fd, off64_t *offset, size_t count)
{
size_t sent=0;
while (sent < count) {
ssize_t result = sendfile64(out_fd, in_fd, offset, count-sent);
debug("sendfile64(out_fd=%d, in_fd=%d, offset=%p, count-sent=%ld) = %ld", out_fd, in_fd, offset, count-sent, result);
if (result == -1) { return -1; }
sent += result;
debug("sent=%ld, count=%ld", sent, count);
}
debug("exiting sendfileloop");
return 0;
}
#include <errno.h>
ssize_t spliceloop(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags2)
{
const unsigned int flags = SPLICE_F_MORE|SPLICE_F_MOVE|flags2;
size_t spliced=0;
//debug("spliceloop(%d, %ld, %d, %ld, %ld)", fd_in, off_in ? *off_in : 0, fd_out, off_out ? *off_out : 0, len);
while (spliced < len) {
ssize_t result = splice(fd_in, off_in, fd_out, off_out, len-spliced, flags);
if (result < 0) {
//debug("result=%ld (%s), spliced=%ld, len=%ld", result, strerror(errno), spliced, len);
if (errno == EAGAIN && (flags & SPLICE_F_NONBLOCK) ) {
return spliced;
}
else {
return -1;
}
} else {
spliced += result;
//debug("result=%ld (%s), spliced=%ld, len=%ld", result, strerror(errno), spliced, len);
}
}
return spliced;
}
int splice_via_pipe_loop(int fd_in, int fd_out, size_t len)
{
int pipefd[2]; /* read end, write end */
size_t spliced=0;
if (pipe(pipefd) == -1) {
return -1;
}
while (spliced < len) {
ssize_t run = len-spliced;
ssize_t s2, s1 = spliceloop(fd_in, NULL, pipefd[1], NULL, run, SPLICE_F_NONBLOCK);
/*if (run > 65535)
run = 65535;*/
if (s1 < 0) { break; }
s2 = spliceloop(pipefd[0], NULL, fd_out, NULL, s1, 0);
if (s2 < 0) { break; }
spliced += s2;
}
close(pipefd[0]);
close(pipefd[1]);
return spliced < len ? -1 : 0;
}
/* Reads single bytes from fd until either an EOF or a newline appears.
* If an EOF occurs before a newline, returns -1. The line is lost.
* Inserts the read bytes (without the newline) into buf, followed by a
* trailing NUL byte.
* Returns the number of read bytes: the length of the line without the
* newline, plus the trailing null.
*/
int read_until_newline(int fd, char* buf, int bufsize)
{
int cur;
for (cur=0; cur < bufsize; cur++) {
int result = read(fd, buf+cur, 1);
if (result <= 0) { return -1; }
if (buf[cur] == 10) {
buf[cur] = '\0';
break;
}
}
return cur+1;
}
int read_lines_until_blankline(int fd, int max_line_length, char ***lines)
{
int lines_count = 0;
char line[max_line_length+1];
*lines = NULL;
memset(line, 0, max_line_length+1);
while (1) {
int readden = read_until_newline(fd, line, max_line_length);
/* readden will be:
* 1 for an empty line
* -1 for an eof
* -1 for a read error
*/
if (readden <= 1) { return lines_count; }
*lines = xrealloc(*lines, (lines_count+1) * sizeof(char*));
(*lines)[lines_count] = strdup(line);
if ((*lines)[lines_count][0] == 0) {
return lines_count;
}
lines_count++;
}
}
int fd_is_closed( int fd_in )
{
int errno_old = errno;
int result = fcntl( fd_in, F_GETFL ) < 0;
errno = errno_old;
return result;
}

66
src/ioutil.h Normal file

@@ -0,0 +1,66 @@
#ifndef __IOUTIL_H
#define __IOUTIL_H
#include "serve.h"
struct bitset_mapping; /* don't need whole of bitset.h here */
/** Returns a bit field representing which blocks are allocated in file
* descriptor ''fd''. You must supply the size, and the resolution at which
* you want the bits to represent allocated blocks. If the OS represents
* allocated blocks at a finer resolution than you've asked for, any block
* or part block will count as "allocated" with the corresponding bit set.
*/
struct bitset_mapping* build_allocation_map(int fd, off64_t size, int resolution);
/** Repeat a write() operation that succeeds partially until ''size'' bytes
* are written, or an error is returned, when it returns -1 as usual.
*/
int writeloop(int filedes, const void *buffer, size_t size);
/** Repeat a read() operation that succeeds partially until ''size'' bytes
* are read, or an error or EOF occurs, when it returns -1.
*/
int readloop(int filedes, void *buffer, size_t size);
/** Repeat a sendfile() operation that succeeds partially until ''count'' bytes
* are sent, or an error is returned, when it returns -1 as usual.
*/
int sendfileloop(int out_fd, int in_fd, off64_t *offset, size_t count);
/** Repeat a splice() operation until we have 'len' bytes. */
ssize_t spliceloop(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags2);
/** Copy ''len'' bytes from ''fd_in'' to ''fd_out'' by creating a temporary
* pipe and using the Linux splice call repeatedly until it has transferred
* all the data. Returns -1 on error.
*/
int splice_via_pipe_loop(int fd_in, int fd_out, size_t len);
/** Fill up to ''bufsize'' characters starting at ''buf'' with data from ''fd''
* until an LF character is received, which is written to the buffer as a zero
* byte. Returns -1 on error, or the number of bytes written to the buffer.
*/
int read_until_newline(int fd, char* buf, int bufsize);
/** Read a number of lines using read_until_newline, until an empty line is
* received (i.e. the sequence LF LF). The data is read from ''fd'' and
* lines must be a maximum of ''max_line_length''. The set of lines is
* returned as an array of zero-terminated strings; you must pass an address
* ''lines'' in which you want the address of this array returned.
*/
int read_lines_until_blankline(int fd, int max_line_length, char ***lines);
/** Open the given ''filename'', determine its size, and mmap it in its
* entirety. The file descriptor is stored in ''out_fd'', the size in
* ''out_size'' and the address of the mmap in ''out_map''. If anything goes
* wrong, returns -1 setting errno, otherwise 0.
*/
int open_and_mmap( const char* filename, int* out_fd, off64_t *out_size, void **out_map);
/** Check to see whether the given file descriptor is closed.
*/
int fd_is_closed( int fd_in );
#endif
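
Review note: writeloop() and readloop() retry short transfers but treat every -1 as fatal, including EINTR. A minimal sketch that also retries interrupted calls follows. The names write_all and read_all are hypothetical, and retrying on EINTR is an assumption about the desired behaviour, not something this changeset does.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical variant of writeloop(): returns 0 once size bytes are
 * written, -1 on any error other than an interrupted call. */
int write_all(int fd, const void *buffer, size_t size)
{
	const char *cursor = buffer; /* void* arithmetic is a GNU extension */
	size_t written = 0;

	assert(buffer != NULL || size == 0);
	while (written < size) {
		ssize_t result = write(fd, cursor + written, size - written);
		if (result == -1) {
			if (errno == EINTR) { continue; }
			return -1;
		}
		written += (size_t) result;
	}
	return 0;
}

/* Hypothetical variant of readloop(): returns 0 once size bytes are
 * read, -1 on error or if EOF arrives first. */
int read_all(int fd, void *buffer, size_t size)
{
	char *cursor = buffer;
	size_t got = 0;

	assert(buffer != NULL || size == 0);
	while (got < size) {
		ssize_t result = read(fd, cursor + got, size - got);
		if (result == 0) { return -1; } /* EOF before size bytes */
		if (result == -1) {
			if (errno == EINTR) { continue; }
			return -1;
		}
		got += (size_t) result;
	}
	return 0;
}
```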

120
src/listen.c Normal file

@@ -0,0 +1,120 @@
#include "listen.h"
#include "serve.h"
#include "util.h"
#include "flexnbd.h"
#include <stdlib.h>
struct listen * listen_create(
struct flexnbd * flexnbd,
char* s_ip_address,
char* s_rebind_ip_address,
char* s_port,
char* s_rebind_port,
char* s_file,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients )
{
NULLCHECK( flexnbd );
struct listen * listen;
listen = (struct listen *)xmalloc( sizeof( struct listen ) );
listen->flexnbd = flexnbd;
listen->init_serve = server_create(
flexnbd,
s_ip_address,
s_port,
s_file,
default_deny,
acl_entries,
s_acl_entries,
1, 0);
listen->main_serve = server_create(
flexnbd,
s_rebind_ip_address ? s_rebind_ip_address : s_ip_address,
s_rebind_port ? s_rebind_port : s_port,
s_file,
default_deny,
acl_entries,
s_acl_entries,
max_nbd_clients, 1);
return listen;
}
void listen_destroy( struct listen * listen )
{
NULLCHECK( listen );
free( listen );
}
struct server *listen_switch( struct listen * listen )
{
NULLCHECK( listen );
/* TODO: Copy acl from init_serve to main_serve */
/* TODO: rename underlying file from foo.INCOMPLETE to foo */
server_destroy( listen->init_serve );
listen->init_serve = NULL;
info( "Switched to the main server, serving." );
return listen->main_serve;
}
void listen_cleanup( struct listen * listen )
{
NULLCHECK( listen );
if ( flexnbd_switch_locked( listen->flexnbd ) ) {
flexnbd_unlock_switch( listen->flexnbd );
}
}
int do_listen( struct listen * listen )
{
NULLCHECK( listen );
int have_control = 0;
flexnbd_lock_switch( listen->flexnbd );
{
flexnbd_set_server( listen->flexnbd, listen->init_serve );
}
flexnbd_unlock_switch( listen->flexnbd );
/* WATCH FOR RACES HERE: flexnbd->serve is set, but the server
* isn't running yet and the switch lock is released.
*/
have_control = do_serve( listen->init_serve );
if( have_control ) {
info( "Taking control.");
flexnbd_switch( listen->flexnbd, listen_switch );
/* WATCH FOR RACES HERE: the server hasn't been
* restarted before we release the flexnbd switch lock.
* do_serve doesn't return, so there's not a lot of
* choice about that.
*/
do_serve( listen->main_serve );
}
else {
warn("Failed to take control, giving up.");
server_destroy( listen->init_serve );
listen->init_serve = NULL;
}
/* TODO: here we must signal the control thread to stop before
* it tries to */
server_destroy( listen->main_serve );
listen->main_serve = NULL;
debug("Listen done, cleaning up");
listen_cleanup( listen );
return have_control;
}

28
src/listen.h Normal file

@@ -0,0 +1,28 @@
#ifndef LISTEN_H
#define LISTEN_H
#include "flexnbd.h"
#include "serve.h"
struct listen {
struct flexnbd * flexnbd;
struct server * init_serve;
struct server * main_serve;
};
struct listen * listen_create(
struct flexnbd * flexnbd,
char* s_ip_address,
char* s_rebind_ip_address,
char* s_port,
char* s_rebind_port,
char* s_file,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients );
void listen_destroy( struct listen* );
int do_listen( struct listen * );
#endif

19
src/main.c Normal file

@@ -0,0 +1,19 @@
#include "util.h"
#include "mode.h"
#include <signal.h>
int main(int argc, char** argv)
{
signal(SIGPIPE, SIG_IGN); /* calls to splice() unhelpfully throw this */
error_init();
if (argc < 2) {
exit_err( help_help_text );
}
mode(argv[1], argc-1, argv+1); /* never returns */
return 0;
}

77
src/mbox.c Normal file

@@ -0,0 +1,77 @@
#include "mbox.h"
#include "util.h"
#include <pthread.h>
struct mbox * mbox_create( void )
{
struct mbox * mbox = xmalloc( sizeof( struct mbox ) );
FATAL_UNLESS( 0 == pthread_cond_init( &mbox->filled_cond, NULL ),
"Failed to initialise a condition variable" );
FATAL_UNLESS( 0 == pthread_cond_init( &mbox->emptied_cond, NULL ),
"Failed to initialise a condition variable" );
FATAL_UNLESS( 0 == pthread_mutex_init( &mbox->mutex, NULL ),
"Failed to initialise a mutex" );
return mbox;
}
void mbox_post( struct mbox * mbox, void * contents )
{
pthread_mutex_lock( &mbox->mutex );
{
while (mbox->full){ /* loop: condition waits can wake spuriously */
pthread_cond_wait( &mbox->emptied_cond, &mbox->mutex );
}
mbox->contents = contents;
mbox->full = 1;
while( 0 != pthread_cond_signal( &mbox->filled_cond ) );
}
pthread_mutex_unlock( &mbox->mutex );
}
void * mbox_contents( struct mbox * mbox )
{
return mbox->contents;
}
int mbox_is_full( struct mbox * mbox )
{
return mbox->full;
}
void * mbox_receive( struct mbox * mbox )
{
NULLCHECK( mbox );
void * result;
pthread_mutex_lock( &mbox->mutex );
{
while ( !mbox->full ) { /* loop: condition waits can wake spuriously */
pthread_cond_wait( &mbox->filled_cond, &mbox->mutex );
}
mbox->full = 0;
result = mbox->contents;
mbox->contents = NULL;
while( 0 != pthread_cond_signal( &mbox->emptied_cond));
}
pthread_mutex_unlock( &mbox->mutex );
return result;
}
void mbox_destroy( struct mbox * mbox )
{
NULLCHECK( mbox );
while( 0 != pthread_cond_destroy( &mbox->emptied_cond ) );
while( 0 != pthread_cond_destroy( &mbox->filled_cond ) );
while( 0 != pthread_mutex_destroy( &mbox->mutex ) );
free( mbox );
}

55
src/mbox.h Normal file

@@ -0,0 +1,55 @@
#ifndef MBOX_H
#define MBOX_H
/** mbox
* A thread sync object. Put a void * into the mbox in one thread, and
* get it out in another. The receiving thread will block if there's
* nothing in the mbox, and the sending thread will block if there is.
* The mbox doesn't assume any responsibility for the pointer it's
* passed - you must free it yourself if it's malloced.
*/
#include <pthread.h>
struct mbox {
void * contents;
/** Marker to tell us if there's content in the box.
* Keeping this separate allows us to use NULL for the contents.
*/
int full;
/** This gets signaled by mbox_post, and waited on by
* mbox_receive */
pthread_cond_t filled_cond;
/** This is signaled by mbox_receive, and waited on by mbox_post */
pthread_cond_t emptied_cond;
pthread_mutex_t mutex;
};
/* Create an mbox. */
struct mbox * mbox_create(void);
/* Put something in the mbox, blocking if it's already full.
* That something can be NULL if you want.
*/
void mbox_post( struct mbox *, void *);
/* See what's in the mbox. This isn't thread-safe. */
void * mbox_contents( struct mbox *);
/* See if anything has been put into the mbox. This isn't thread-safe.
* */
int mbox_is_full( struct mbox *);
/* Get the contents from the mbox, blocking if there's nothing there. */
void * mbox_receive( struct mbox *);
/* Free the mbox and destroy the associated pthread bits. */
void mbox_destroy( struct mbox *);
#endif
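
Review note: the mbox described above is a one-slot mailbox guarded by a mutex and two condition variables. A minimal sketch of the post and receive sides follows, with each wait wrapped in a loop that rechecks the full flag, since pthread_cond_wait() is allowed to return spuriously. The slot names are hypothetical and stand in for struct mbox.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbox. */
struct slot {
	void *contents;
	int full;                 /* separate flag so NULL can be posted */
	pthread_mutex_t mutex;
	pthread_cond_t filled;    /* signalled by slot_post */
	pthread_cond_t emptied;   /* signalled by slot_receive */
};

void slot_init(struct slot *s)
{
	assert(s != NULL);
	s->contents = NULL;
	s->full = 0;
	pthread_mutex_init(&s->mutex, NULL);
	pthread_cond_init(&s->filled, NULL);
	pthread_cond_init(&s->emptied, NULL);
}

/* Blocks while the slot is occupied, then stores contents. */
void slot_post(struct slot *s, void *contents)
{
	assert(s != NULL);
	pthread_mutex_lock(&s->mutex);
	while (s->full) {
		pthread_cond_wait(&s->emptied, &s->mutex);
	}
	s->contents = contents;
	s->full = 1;
	pthread_cond_signal(&s->filled);
	pthread_mutex_unlock(&s->mutex);
}

/* Blocks while the slot is empty, then takes the contents. */
void *slot_receive(struct slot *s)
{
	void *result;

	assert(s != NULL);
	pthread_mutex_lock(&s->mutex);
	while (!s->full) {
		pthread_cond_wait(&s->filled, &s->mutex);
	}
	result = s->contents;
	s->contents = NULL;
	s->full = 0;
	pthread_cond_signal(&s->emptied);
	pthread_mutex_unlock(&s->mutex);
	return result;
}
```

In the mirror code the poster would be the mirror thread reporting its commit state and the receiver would be the control client waiting to write a response.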

614
src/mirror.c Normal file

@@ -0,0 +1,614 @@
/* FlexNBD server (C) Bytemark Hosting 2012
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "mirror.h"
#include "serve.h"
#include "util.h"
#include "ioutil.h"
#include "parse.h"
#include "readwrite.h"
#include "bitset.h"
#include "self_pipe.h"
#include "status.h"
#include <stdlib.h>
#include <string.h>
#include <sys/un.h>
#include <unistd.h>
struct mirror * mirror_alloc(
union mysockaddr * connect_to,
union mysockaddr * connect_from,
int max_Bps,
int action_at_finish,
struct mbox * commit_signal)
{
struct mirror * mirror;
mirror = xmalloc(sizeof(struct mirror));
mirror->connect_to = connect_to;
mirror->connect_from = connect_from;
mirror->max_bytes_per_second = max_Bps;
mirror->action_at_finish = action_at_finish;
mirror->commit_signal = commit_signal;
mirror->commit_state = MS_UNKNOWN;
return mirror;
}
void mirror_set_state_f( struct mirror * mirror, enum mirror_state state )
{
NULLCHECK( mirror );
mirror->commit_state = state;
}
#define mirror_set_state( mirror, state ) do{\
debug( "Mirror state => " #state );\
mirror_set_state_f( mirror, state );\
} while(0)
enum mirror_state mirror_get_state( struct mirror * mirror )
{
NULLCHECK( mirror );
return mirror->commit_state;
}
void mirror_init( struct mirror * mirror, const char * filename )
{
int map_fd;
off64_t size;
NULLCHECK( mirror );
NULLCHECK( filename );
FATAL_IF_NEGATIVE(
open_and_mmap(
filename,
&map_fd,
&size,
(void**) &mirror->mapped
),
"Failed to open and mmap %s",
filename
);
mirror->dirty_map = bitset_alloc(size, 4096);
}
/* Call this before a mirror attempt. */
void mirror_reset( struct mirror * mirror )
{
NULLCHECK( mirror );
NULLCHECK( mirror->dirty_map );
mirror_set_state( mirror, MS_INIT );
bitset_set(mirror->dirty_map);
}
struct mirror * mirror_create(
const char * filename,
union mysockaddr * connect_to,
union mysockaddr * connect_from,
int max_Bps,
int action_at_finish,
struct mbox * commit_signal)
{
/* FIXME: shouldn't map_fd get closed? */
struct mirror * mirror;
mirror = mirror_alloc( connect_to,
connect_from,
max_Bps,
action_at_finish,
commit_signal);
mirror_init( mirror, filename );
mirror_reset( mirror );
return mirror;
}
void mirror_destroy( struct mirror *mirror )
{
NULLCHECK( mirror );
free(mirror->connect_to);
free(mirror->connect_from);
free(mirror->dirty_map);
free(mirror);
}
/** The mirror code splits NBD writes so that none is longer than this
* many bytes (8 MiB). */
static const int mirror_longest_write = 8<<20;
/** If, during a mirror pass, we have sent fewer than this number of bytes
* (100 MiB), we go on to freeze the I/O and finish the mirror off. This is
* just a guess.
*/
static const unsigned int mirror_last_pass_after_bytes_written = 100<<20;
/** The largest number of full passes we'll do - the last one will always
* cause the I/O to freeze, however many bytes are left to copy.
*/
static const int mirror_maximum_passes = 7;
/* A single mirror pass over the disc, optionally locking IO around the
* transfer.
*/
int mirror_pass(struct server * serve, int should_lock, uint64_t *written)
{
uint64_t current = 0;
int success = 1;
struct bitset_mapping *map = serve->mirror->dirty_map;
*written = 0;
while (current < serve->size) {
int run = bitset_run_count(map, current, mirror_longest_write);
debug("mirror current=%llu, run=%d", (unsigned long long) current, run);
/* FIXME: we could avoid sending sparse areas of the
* disc here, and probably save a lot of bandwidth and
* time (if we know the destination starts off zeroed).
*/
if (bitset_is_set_at(map, current)) {
/* We've found a dirty area, send it */
debug("^^^ writing");
/* We need to stop the main thread from working
* because it might corrupt the dirty map. This
* is likely to slow things down but will be
* safe.
*/
if (should_lock) { server_lock_io( serve ); }
{
debug("in lock block");
/** FIXME: do something useful with bytes/second */
/** FIXME: error handling code here won't unlock */
socket_nbd_write( serve->mirror->client,
current,
run,
0,
serve->mirror->mapped + current,
MS_REQUEST_LIMIT_SECS);
/* now mark it clean */
bitset_clear_range(map, current, run);
debug("leaving lock block");
}
if (should_lock) { server_unlock_io( serve ); }
*written += run;
}
current += run;
if (serve->mirror->signal_abandon) {
debug("Abandon message received" );
success = 0;
break;
}
}
return success;
}
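/* Illustrative sketch, not part of flexnbd: the loop in mirror_pass()
 * walks the dirty map in runs of identical state, sends the dirty runs
 * and marks them clean. The toy below mimics that walk with one byte per
 * block. The names example_map, example_run_count and example_pass are
 * hypothetical; the real bitset API in bitset.h may differ.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the dirty map: one byte per block, 1 = dirty. */
struct example_map {
    unsigned char *blocks;
    size_t         count;
};

/* Length of the run of blocks starting at `from` that share the same
 * dirty/clean state as blocks[from], capped at `max_len`. Always >= 1. */
static size_t example_run_count(const struct example_map *map,
                                size_t from, size_t max_len)
{
    assert(map != NULL && from < map->count);
    size_t len = 1;
    while (from + len < map->count && len < max_len &&
           map->blocks[from + len] == map->blocks[from]) {
        len++;
    }
    return len;
}

/* Walk the whole map once. Each dirty run is "sent" (here: only counted)
 * and then marked clean. Returns the number of dirty blocks sent. */
static size_t example_pass(struct example_map *map, size_t max_run)
{
    assert(map != NULL);
    size_t current = 0;
    size_t sent = 0;
    while (current < map->count) {
        size_t run = example_run_count(map, current, max_run);
        if (map->blocks[current]) {
            for (size_t i = 0; i < run; i++) {
                map->blocks[current + i] = 0;   /* now clean */
            }
            sent += run;
        }
        current += run;
    }
    return sent;
}
```

/* A second call to example_pass() on an untouched map sends nothing,
 * which is the property the multi-pass mirror relies on to converge.
 */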
void mirror_give_control( struct mirror * mirror )
{
debug( "mirror: entrusting and disconnecting" );
/* TODO: set up an error handler to clean up properly on ERROR.
*/
/* A transfer of control is expressed as a 3-way handshake.
* First, we send a REQUEST_ENTRUST. If this fails to be
* received, this thread will simply block until the server is
* restarted. If the remote end doesn't understand it, it'll
* disconnect us, and an ERROR *should* bomb this thread.
* FIXME: make the ERROR work.
* If we get an explicit error back from the remote end, then
* again, this thread will bomb out.
* On receiving a valid response, we send a REQUEST_DISCONNECT,
* and we quit without checking for a response. This is the
* remote server's signal to assume control of the file. The
* reason we don't check for a response is the state we end up
* in if the final message goes astray: if we lose the
* REQUEST_DISCONNECT, the sender has quit and the receiver
* hasn't had a signal to take over yet, so the data is safe.
* If we were to wait for a response to the REQUEST_DISCONNECT,
* the sender and receiver would *both* be servicing write
* requests while the response was in flight, and if the
* response went astray we'd have two servers claiming
* responsibility for the same data.
*
* The meaning of these is as follows:
* The entrust signifies that all the data has been sent, and
* the client is currently paused but not disconnected.
* The disconnect signifies that the client has been
* safely prevented from making any more writes.
*
* Since we lock IO and close the server in mirror_on_exit before
* releasing, we don't actually need to take any action between the
* two here.
*/
socket_nbd_entrust( mirror->client );
socket_nbd_disconnect( mirror->client );
}
/* THIS FUNCTION MUST ONLY BE CALLED WITH THE SERVER'S IO LOCKED. */
void mirror_on_exit( struct server * serve )
{
/* Send an explicit entrust and disconnect. After this
* point we cannot allow any reads or writes to the local file.
* We do this *before* trying to shut down the server so that if
* the transfer of control fails, we haven't stopped the server
* and already-connected clients don't get needlessly
* disconnected.
*/
debug( "mirror_give_control");
mirror_give_control( serve->mirror );
/* If we're still here, the transfer of control went ok, and the
* remote is listening (or will be shortly). We can shut the
* server down.
*
* It doesn't matter if we get new client connections before
* now, the IO lock will stop them from doing anything.
*/
debug("serve_signal_close");
serve_signal_close( serve );
/* We have to wait until the server is closed before unlocking
* IO. This is because the client threads check to see if the
* server is still open before reading or writing inside their
* own locks. If we don't wait for the close, there's no way to
* guarantee the server thread will win the race and we risk the
* clients seeing a "successful" write to a dead disc image.
*/
debug("serve_wait_for_close");
serve_wait_for_close( serve );
info("Mirror sent.");
}
void mirror_cleanup( struct server * serve,
int fatal __attribute__((unused)))
{
NULLCHECK( serve );
struct mirror * mirror = serve->mirror;
NULLCHECK( mirror );
info( "Cleaning up mirror thread");
if( mirror->client > 0 ){
close( mirror->client );
}
mirror->client = -1;
if( server_io_locked( serve ) ){ server_unlock_io( serve ); }
}
int mirror_connect( struct mirror * mirror, off64_t local_size )
{
struct sockaddr * connect_from = NULL;
int connected = 0;
if ( mirror->connect_from ) {
connect_from = &mirror->connect_from->generic;
}
NULLCHECK( mirror->connect_to );
mirror->client = socket_connect(&mirror->connect_to->generic, connect_from);
if ( 0 < mirror->client ) {
fd_set fds;
struct timeval tv = { MS_HELLO_TIME_SECS, 0};
FD_ZERO( &fds );
FD_SET( mirror->client, &fds );
FATAL_UNLESS( 0 <= select( FD_SETSIZE, &fds, NULL, NULL, &tv ),
"Select failed." );
if( FD_ISSET( mirror->client, &fds ) ){
off64_t remote_size;
if ( socket_nbd_read_hello( mirror->client, &remote_size ) ) {
if( remote_size == local_size ){
connected = 1;
mirror_set_state( mirror, MS_GO );
}
else {
warn("Remote size (%lld) doesn't match local (%lld)",
(long long) remote_size, (long long) local_size );
mirror_set_state( mirror, MS_FAIL_SIZE_MISMATCH );
}
}
else {
warn( "Mirror attempt rejected." );
mirror_set_state( mirror, MS_FAIL_REJECTED );
}
}
else {
warn( "No NBD Hello received." );
mirror_set_state( mirror, MS_FAIL_NO_HELLO );
}
if ( !connected ) { close( mirror->client ); }
}
else {
warn( "Mirror failed to connect.");
mirror_set_state( mirror, MS_FAIL_CONNECT );
}
return connected;
}
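/* Illustrative sketch, not part of flexnbd: mirror_connect() waits a
 * bounded time for the NBD hello by calling select() with a timeout.
 * The helper below isolates that pattern. example_wait_readable is a
 * hypothetical name, and unlike the code above it passes fd + 1 rather
 * than FD_SETSIZE as the first argument to select().
 */

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if `fd` becomes readable within `timeout_secs` seconds,
 * 0 if the timeout expires first, and -1 if select() itself fails. */
static int example_wait_readable(int fd, int timeout_secs)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set fds;
    struct timeval tv = { timeout_secs, 0 };

    FD_ZERO(&fds);
    FD_SET(fd, &fds);

    int rc = select(fd + 1, &fds, NULL, NULL, &tv);
    if (rc < 0) { return -1; }

    return FD_ISSET(fd, &fds) ? 1 : 0;
}
```

/* Usage: with a pipe, the read end is not readable until something has
 * been written to the write end, so a zero timeout returns 0 at first.
 */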
void mirror_run( struct server *serve )
{
NULLCHECK( serve );
NULLCHECK( serve->mirror );
int pass;
uint64_t written;
info("Starting mirror" );
for (pass=0; pass < mirror_maximum_passes-1; pass++) {
debug("mirror start pass=%d", pass);
if ( !mirror_pass( serve, 1, &written ) ){
debug("Failed mirror pass state is %d", mirror_get_state( serve->mirror ) );
debug("pass failed, giving up");
return;
}
/* If this pass wrote little enough, stop and go to the final locked pass */
if (written < mirror_last_pass_after_bytes_written) { break; }
}
server_lock_io( serve );
{
if ( mirror_pass( serve, 0, &written ) &&
ACTION_EXIT == serve->mirror->action_at_finish) {
debug("exit!");
mirror_on_exit( serve );
info("Server closed, quitting "
"after successful migration");
}
}
server_unlock_io( serve );
}
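/* Illustrative sketch, not part of flexnbd: mirror_run() performs up to
 * mirror_maximum_passes-1 passes that lock IO only around each write,
 * stopping early once a pass writes fewer bytes than the threshold, and
 * then does one final pass with IO locked throughout. The toy below
 * models only the "how many early passes" decision. The callback type
 * and the halving workload are assumptions made for the example.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: performs one pass and returns bytes written. */
typedef uint64_t (*example_pass_fn)(void *ctx);

/* Run passes until one writes fewer than `threshold` bytes, or until
 * `max_passes - 1` passes have run. Returns the number of passes made;
 * the caller is then expected to do the final, fully locked pass. */
static int example_early_passes(example_pass_fn pass, void *ctx,
                                int max_passes, uint64_t threshold)
{
    assert(pass != NULL);
    int done = 0;
    for (int i = 0; i < max_passes - 1; i++) {
        uint64_t written = pass(ctx);
        done++;
        if (written < threshold) { break; }
    }
    return done;
}

/* Toy workload: each pass writes everything outstanding, and half as
 * much becomes dirty again in the meantime. */
static uint64_t example_halving_pass(void *ctx)
{
    uint64_t *outstanding = ctx;
    uint64_t written = *outstanding;
    *outstanding /= 2;
    return written;
}
```

/* With 800 bytes outstanding and a threshold of 100, the passes write
 * 800, 400, 200, 100 and 50 bytes, so the loop stops after five passes.
 */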
void mbox_post_mirror_state( struct mbox * mbox, enum mirror_state st )
{
NULLCHECK( mbox );
enum mirror_state * contents = xmalloc( sizeof( enum mirror_state ) );
*contents = st;
mbox_post( mbox, contents );
}
void mirror_signal_commit( struct mirror * mirror )
{
NULLCHECK( mirror );
mbox_post_mirror_state( mirror->commit_signal,
mirror_get_state( mirror ) );
}
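/* Illustrative sketch, not part of flexnbd: mbox_post_mirror_state()
 * hands a heap-allocated enum to another thread through an mbox, and
 * the receiver frees it. The real mbox lives in mbox.h and is not shown
 * in this diff; the single-slot mailbox below is only an assumption
 * about how such a primitive can be built from a mutex and condition
 * variables. All exm_* names are hypothetical.
 */

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct exm_mbox {
    void           *contents;
    int             full;
    pthread_mutex_t mutex;
    pthread_cond_t  filled;    /* signalled when contents arrive */
    pthread_cond_t  emptied;   /* signalled when contents are taken */
};

static void exm_init(struct exm_mbox *m)
{
    assert(m != NULL);
    m->contents = NULL;
    m->full = 0;
    pthread_mutex_init(&m->mutex, NULL);
    pthread_cond_init(&m->filled, NULL);
    pthread_cond_init(&m->emptied, NULL);
}

/* Block until the slot is free, then store `contents` in it. */
static void exm_post(struct exm_mbox *m, void *contents)
{
    pthread_mutex_lock(&m->mutex);
    while (m->full) { pthread_cond_wait(&m->emptied, &m->mutex); }
    m->contents = contents;
    m->full = 1;
    pthread_cond_signal(&m->filled);
    pthread_mutex_unlock(&m->mutex);
}

/* Block until the slot is full, then take ownership of its contents. */
static void *exm_receive(struct exm_mbox *m)
{
    pthread_mutex_lock(&m->mutex);
    while (!m->full) { pthread_cond_wait(&m->filled, &m->mutex); }
    void *contents = m->contents;
    m->contents = NULL;
    m->full = 0;
    pthread_cond_signal(&m->emptied);
    pthread_mutex_unlock(&m->mutex);
    return contents;
}

enum exm_state { EXM_UNKNOWN, EXM_GO, EXM_FAIL };

/* Copy the state to the heap so it outlives the poster's stack frame. */
static void exm_post_state(struct exm_mbox *m, enum exm_state st)
{
    enum exm_state *contents = malloc(sizeof *contents);
    assert(contents != NULL);
    *contents = st;
    exm_post(m, contents);
}
```

/* The receiver owns the pointer it gets back and must free() it, just
 * as mirror_super_runner() frees commit_state.
 */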
/** Thread launched to drive the mirror process.
* This is needed for two reasons: firstly, it decouples the mirroring
* from the control thread (although that's less valid with mboxes
* passing state back and forth); secondly, it provides an error context
* so that retries can be cleanly handled without a bespoke error
* handling mechanism.
*/
void* mirror_runner(void* serve_params_uncast)
{
/* The supervisor thread relies on there not being any ERROR
* calls until after the mirror_signal_commit() call in this
* function.
* However, *after* that, we should call ERROR_* instead of
* FATAL_* wherever possible.
*/
struct server *serve = (struct server*) serve_params_uncast;
NULLCHECK( serve );
NULLCHECK( serve->mirror );
struct mirror * mirror = serve->mirror;
NULLCHECK( mirror->dirty_map );
error_set_handler( (cleanup_handler *) mirror_cleanup, serve );
info( "Connecting to mirror" );
time_t start_time = time(NULL);
int connected = mirror_connect( mirror, serve->size );
mirror_signal_commit( mirror );
if ( !connected ) { goto abandon_mirror; }
/* After this point, if we see a failure we need to disconnect
* and retry everything from mirror_set_state(_, MS_INIT), but
* *without* signaling the commit or abandoning the mirror.
* */
if ( (time(NULL) - start_time) > MS_CONNECT_TIME_SECS ){
/* If we get here, then we managed to connect but the
* control thread feeding status back to the user will
* have gone away, leaving the user without meaningful
* feedback. In this instance, they have to assume a
* failure, so we can't afford to let the mirror happen.
* We have to set the state to avoid a race.
*/
mirror_set_state( mirror, MS_FAIL_CONNECT );
warn( "Mirror connected, but too slowly" );
goto abandon_mirror;
}
mirror_run( serve );
mirror_set_state( mirror, MS_DONE );
abandon_mirror:
return NULL;
}
struct mirror_super * mirror_super_create(
const char * filename,
union mysockaddr * connect_to,
union mysockaddr * connect_from,
int max_Bps,
int action_at_finish,
struct mbox * state_mbox)
{
struct mirror_super * super = xmalloc( sizeof( struct mirror_super) );
super->mirror = mirror_create(
filename,
connect_to,
connect_from,
max_Bps,
action_at_finish,
mbox_create() ) ;
super->state_mbox = state_mbox;
return super;
}
/* Post the current state of the mirror into super->state_mbox.*/
void mirror_super_signal_committed(
struct mirror_super * super ,
enum mirror_state commit_state )
{
NULLCHECK( super );
NULLCHECK( super->state_mbox );
mbox_post_mirror_state(
super->state_mbox,
commit_state );
}
void mirror_super_destroy( struct mirror_super * super )
{
NULLCHECK( super );
mbox_destroy( super->mirror->commit_signal );
mirror_destroy( super->mirror );
free( super );
}
/* The mirror supervisor thread. Responsible for kicking off retries if
* the mirror thread fails.
* The mirror and mirror_super objects are destroyed by this thread just
* before it returns. The mirror_super_runner thread is never joined.
*/
void * mirror_super_runner( void * serve_uncast )
{
struct server * serve = (struct server *) serve_uncast;
NULLCHECK( serve );
NULLCHECK( serve->mirror );
NULLCHECK( serve->mirror_super );
int first_pass = 1;
int should_retry = 0;
int success = 0;
struct mirror * mirror = serve->mirror;
struct mirror_super * super = serve->mirror_super;
do {
FATAL_IF( 0 != pthread_create(
&mirror->thread,
NULL,
mirror_runner,
serve),
"Failed to create mirror thread");
debug("Supervisor waiting for commit signal");
enum mirror_state * commit_state =
mbox_receive( mirror->commit_signal );
debug( "Supervisor got commit signal" );
if ( first_pass ) {
/* Only retry if the connection attempt was
* successful. Otherwise the user will see an
* error reported while we're still trying to
* retry behind the scenes.
*/
should_retry = *commit_state == MS_GO;
/* Only send this signal the first time */
mirror_super_signal_committed(
super,
*commit_state);
debug("Mirror supervisor committed");
}
/* We only care about the value of the commit signal on
* the first pass, so this is ok
*/
free( commit_state );
debug("Supervisor waiting for mirror thread" );
pthread_join( mirror->thread, NULL );
success = MS_DONE == mirror_get_state( mirror );
if( success ){
info( "Mirror supervisor success, exiting" ); }
else if ( mirror->signal_abandon ) {
info( "Mirror abandoned" );
should_retry = 0;
}
else if (should_retry){
info( "Mirror failed, retrying" );
}
else { info( "Mirror failed before commit, giving up" ); }
first_pass = 0;
if ( should_retry ) {
/* We don't want to hammer the destination too
* hard, so if this is a retry, insert a delay. */
sleep( MS_RETRY_DELAY_SECS );
/* We also have to reset the bitmap to be sure
* we transfer everything */
mirror_reset( mirror );
}
}
while ( should_retry && !success );
serve->mirror = NULL;
serve->mirror_super = NULL;
mirror_super_destroy( super );
debug( "Mirror supervisor done." );
return NULL;
}
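/* Illustrative sketch, not part of flexnbd: the retry policy in
 * mirror_super_runner() can be read as a small state machine. The
 * commit state of the first attempt decides whether retries are allowed
 * at all, a later abandon switches them off, and success ends the loop.
 * The sup_* names below are hypothetical and the enum is a cut-down
 * stand-in for enum mirror_state.
 */

```c
#include <assert.h>
#include <stddef.h>

enum sup_state { SUP_GO, SUP_DONE, SUP_FAIL_CONNECT };

struct sup_policy {
    int first_pass;
    int should_retry;
};

static void sup_policy_init(struct sup_policy *p)
{
    assert(p != NULL);
    p->first_pass = 1;
    p->should_retry = 0;
}

/* Feed in the outcome of one mirror attempt. Returns 1 if another
 * attempt should be launched, 0 if the supervisor should stop. */
static int sup_after_attempt(struct sup_policy *p,
                             enum sup_state commit_state,
                             enum sup_state final_state,
                             int abandoned)
{
    assert(p != NULL);
    if (p->first_pass) {
        /* Retries are only allowed if the first connection succeeded. */
        p->should_retry = (commit_state == SUP_GO);
        p->first_pass = 0;
    }
    int success = (final_state == SUP_DONE);
    if (!success && abandoned) { p->should_retry = 0; }
    return p->should_retry && !success;
}
```

/* A first attempt that fails to connect never retries. A first attempt
 * that connects keeps retrying until it either succeeds or is abandoned.
 */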

107
src/mirror.h Normal file

@@ -0,0 +1,107 @@
#ifndef MIRROR_H
#define MIRROR_H
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>
#include "bitset.h"
#include "self_pipe.h"
enum mirror_state;
#include "serve.h"
#include "mbox.h"
/* MS_CONNECT_TIME_SECS
* The length of time after which the sender will assume a connect() to
* the destination has failed.
*/
#define MS_CONNECT_TIME_SECS 60
/* MS_HELLO_TIME_SECS
* The length of time the sender will wait for the NBD hello message
* after connect() before aborting the connection attempt.
*/
#define MS_HELLO_TIME_SECS 5
/* MS_RETRY_DELAY_SECS
* The delay after a failed migration attempt before launching another
* thread to try again.
*/
#define MS_RETRY_DELAY_SECS 1
/* MS_REQUEST_LIMIT_SECS
* We must receive a reply to a request within this time. For a read
* request, this is the time between the end of the NBD request and the
* start of the NBD reply. For a write request, this is the time
* between the end of the written data and the start of the NBD reply.
*/
#define MS_REQUEST_LIMIT_SECS 4
enum mirror_finish_action {
ACTION_EXIT,
ACTION_NOTHING
};
enum mirror_state {
MS_UNKNOWN,
MS_INIT,
MS_GO,
MS_DONE,
MS_FAIL_CONNECT,
MS_FAIL_REJECTED,
MS_FAIL_NO_HELLO,
MS_FAIL_SIZE_MISMATCH
};
struct mirror {
pthread_t thread;
/* set to 1, then join thread to make mirror terminate early */
int signal_abandon;
union mysockaddr * connect_to;
union mysockaddr * connect_from;
int client;
const char * filename;
off64_t max_bytes_per_second;
enum mirror_finish_action action_at_finish;
char *mapped;
struct bitset_mapping *dirty_map;
enum mirror_state commit_state;
/* commit_signal is sent immediately after attempting to connect
* and checking the remote size, whether successful or not.
*/
struct mbox * commit_signal;
};
struct mirror_super {
struct mirror * mirror;
pthread_t thread;
struct mbox * state_mbox;
};
/* We need these declarations to get around circular dependencies in the
* .h's
*/
struct server;
struct flexnbd;
struct mirror_super * mirror_super_create(
const char * filename,
union mysockaddr * connect_to,
union mysockaddr * connect_from,
int max_Bps,
int action_at_finish,
struct mbox * state_mbox
);
void * mirror_super_runner( void * serve_uncast );
#endif

738
src/mode.c Normal file

@@ -0,0 +1,738 @@
#include "mode.h"
#include "flexnbd.h"
#include <getopt.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
static struct option serve_options[] = {
GETOPT_HELP,
GETOPT_ADDR,
GETOPT_PORT,
GETOPT_FILE,
GETOPT_SOCK,
GETOPT_DENY,
GETOPT_QUIET,
GETOPT_VERBOSE,
{0}
};
static char serve_short_options[] = "hl:p:f:s:d" SOPT_QUIET SOPT_VERBOSE;
static char serve_help_text[] =
"Usage: flexnbd " CMD_SERVE " <options> [<acl address>*]\n\n"
"Serve FILE from ADDR:PORT, with an optional control socket at SOCK.\n\n"
HELP_LINE
"\t--" OPT_ADDR ",-l <ADDR>\tThe address to serve on.\n"
"\t--" OPT_PORT ",-p <PORT>\tThe port to serve on.\n"
"\t--" OPT_FILE ",-f <FILE>\tThe file to serve.\n"
"\t--" OPT_DENY ",-d\tDeny connections by default unless in ACL.\n"
SOCK_LINE
VERBOSE_LINE
QUIET_LINE;
static struct option listen_options[] = {
GETOPT_HELP,
GETOPT_ADDR,
GETOPT_REBIND_ADDR,
GETOPT_PORT,
GETOPT_REBIND_PORT,
GETOPT_FILE,
GETOPT_SOCK,
GETOPT_DENY,
GETOPT_QUIET,
GETOPT_VERBOSE,
{0}
};
static char listen_short_options[] = "hl:L:p:P:f:s:d" SOPT_QUIET SOPT_VERBOSE;
static char listen_help_text[] =
"Usage: flexnbd " CMD_LISTEN " <options> [<acl address>*]\n\n"
"Listen for an incoming migration on ADDR:PORT, "
"then switch to REBIND_ADDR:REBIND_PORT on completion "
"to serve FILE.\n\n"
HELP_LINE
"\t--" OPT_ADDR ",-l <ADDR>\tThe address to listen on.\n"
"\t--" OPT_REBIND_ADDR ",-L <REBIND_ADDR>\tThe address to switch to, if given.\n"
"\t--" OPT_PORT ",-p <PORT>\tThe port to listen on.\n"
"\t--" OPT_REBIND_PORT ",-P <REBIND_PORT>\tThe port to switch to, if given.\n"
"\t--" OPT_FILE ",-f <FILE>\tThe file to serve.\n"
"\t--" OPT_DENY ",-d\tDeny connections by default unless in ACL.\n"
SOCK_LINE
VERBOSE_LINE
QUIET_LINE;
static struct option read_options[] = {
GETOPT_HELP,
GETOPT_ADDR,
GETOPT_PORT,
GETOPT_FROM,
GETOPT_SIZE,
GETOPT_BIND,
GETOPT_QUIET,
GETOPT_VERBOSE,
{0}
};
static char read_short_options[] = "hl:p:F:S:b:" SOPT_QUIET SOPT_VERBOSE;
static char read_help_text[] =
"Usage: flexnbd " CMD_READ " <options>\n\n"
"Read SIZE bytes from a server at ADDR:PORT to stdout, starting at OFFSET.\n\n"
HELP_LINE
"\t--" OPT_ADDR ",-l <ADDR>\tThe address to read from.\n"
"\t--" OPT_PORT ",-p <PORT>\tThe port to read from.\n"
"\t--" OPT_FROM ",-F <OFFSET>\tByte offset to read from.\n"
"\t--" OPT_SIZE ",-S <SIZE>\tBytes to read.\n"
BIND_LINE
VERBOSE_LINE
QUIET_LINE;
static struct option *write_options = read_options;
static char *write_short_options = read_short_options;
static char write_help_text[] =
"Usage: flexnbd " CMD_WRITE" <options>\n\n"
"Write SIZE bytes from stdin to a server at ADDR:PORT, starting at OFFSET.\n\n"
HELP_LINE
"\t--" OPT_ADDR ",-l <ADDR>\tThe address to write to.\n"
"\t--" OPT_PORT ",-p <PORT>\tThe port to write to.\n"
"\t--" OPT_FROM ",-F <OFFSET>\tByte offset to write from.\n"
"\t--" OPT_SIZE ",-S <SIZE>\tBytes to write.\n"
BIND_LINE
VERBOSE_LINE
QUIET_LINE;
static struct option acl_options[] = {
GETOPT_HELP,
GETOPT_SOCK,
GETOPT_QUIET,
GETOPT_VERBOSE,
{0}
};
static char acl_short_options[] = "hs:" SOPT_QUIET SOPT_VERBOSE;
static char acl_help_text[] =
"Usage: flexnbd " CMD_ACL " <options> [<acl address>+]\n\n"
"Set the access control list for a server with control socket SOCK.\n\n"
HELP_LINE
SOCK_LINE
VERBOSE_LINE
QUIET_LINE;
static struct option mirror_options[] = {
GETOPT_HELP,
GETOPT_SOCK,
GETOPT_ADDR,
GETOPT_PORT,
GETOPT_BIND,
GETOPT_QUIET,
GETOPT_VERBOSE,
{0}
};
static char mirror_short_options[] = "hs:l:p:b:" SOPT_QUIET SOPT_VERBOSE;
static char mirror_help_text[] =
"Usage: flexnbd " CMD_MIRROR " <options>\n\n"
"Start mirroring from the server with control socket SOCK to one at ADDR:PORT.\n\n"
HELP_LINE
"\t--" OPT_ADDR ",-l <ADDR>\tThe address to mirror to.\n"
"\t--" OPT_PORT ",-p <PORT>\tThe port to mirror to.\n"
SOCK_LINE
BIND_LINE
VERBOSE_LINE
QUIET_LINE;
static struct option status_options[] = {
GETOPT_HELP,
GETOPT_SOCK,
GETOPT_QUIET,
GETOPT_VERBOSE,
{0}
};
static char status_short_options[] = "hs:" SOPT_QUIET SOPT_VERBOSE;
static char status_help_text[] =
"Usage: flexnbd " CMD_STATUS " <options>\n\n"
"Get the status for a server with control socket SOCK.\n\n"
HELP_LINE
SOCK_LINE
VERBOSE_LINE
QUIET_LINE;
char help_help_text_arr[] =
"Usage: flexnbd <cmd> [cmd options]\n\n"
"Commands:\n"
"\tflexnbd serve\n"
"\tflexnbd listen\n"
"\tflexnbd read\n"
"\tflexnbd write\n"
"\tflexnbd acl\n"
"\tflexnbd mirror\n"
"\tflexnbd status\n"
"\tflexnbd help\n\n"
"See flexnbd help <cmd> for further info\n";
/* Slightly odd array/pointer pair to stop the compiler from complaining
* about symbol sizes
*/
char * help_help_text = help_help_text_arr;
int do_serve(struct server* params);
void do_read(struct mode_readwrite_params* params);
void do_write(struct mode_readwrite_params* params);
void do_remote_command(char* command, char* mode, int argc, char** argv);
void read_serve_param( int c, char **ip_addr, char **ip_port, char **file, char **sock, int *default_deny )
{
switch(c){
case 'h':
fprintf(stdout, "%s\n", serve_help_text );
exit( 0 );
break;
case 'l':
*ip_addr = optarg;
break;
case 'p':
*ip_port = optarg;
break;
case 'f':
*file = optarg;
break;
case 's':
*sock = optarg;
break;
case 'd':
*default_deny = 1;
break;
case 'q':
log_level = 4;
break;
case 'v':
log_level = VERBOSE_LOG_LEVEL;
break;
default:
exit_err( serve_help_text );
break;
}
}
void read_listen_param( int c,
char **ip_addr,
char **rebind_ip_addr,
char **ip_port,
char **rebind_ip_port,
char **file,
char **sock,
int *default_deny )
{
switch(c){
case 'h':
fprintf(stdout, "%s\n", listen_help_text );
exit(0);
break;
case 'l':
*ip_addr = optarg;
break;
case 'L':
*rebind_ip_addr = optarg;
break;
case 'p':
*ip_port = optarg;
break;
case 'P':
*rebind_ip_port = optarg;
break;
case 'f':
*file = optarg;
break;
case 's':
*sock = optarg;
break;
case 'd':
*default_deny = 1;
break;
case 'q':
log_level = 4;
break;
case 'v':
log_level = VERBOSE_LOG_LEVEL;
break;
default:
exit_err( listen_help_text );
break;
}
}
void read_readwrite_param( int c, char **ip_addr, char **ip_port, char **bind_addr, char **from, char **size)
{
switch(c){
case 'h':
fprintf(stdout, "%s\n", read_help_text );
exit( 0 );
break;
case 'l':
*ip_addr = optarg;
break;
case 'p':
*ip_port = optarg;
break;
case 'F':
*from = optarg;
break;
case 'S':
*size = optarg;
break;
case 'b':
*bind_addr = optarg;
break;
case 'q':
log_level = 4;
break;
case 'v':
log_level = VERBOSE_LOG_LEVEL;
break;
default:
exit_err( read_help_text );
break;
}
}
void read_sock_param( int c, char **sock, char *help_text )
{
switch(c){
case 'h':
fprintf( stdout, "%s\n", help_text );
exit( 0 );
break;
case 's':
*sock = optarg;
break;
case 'q':
log_level = 4;
break;
case 'v':
log_level = VERBOSE_LOG_LEVEL;
break;
default:
exit_err( help_text );
break;
}
}
void read_acl_param( int c, char **sock )
{
read_sock_param( c, sock, acl_help_text );
}
void read_mirror_param( int c, char **sock, char **ip_addr, char **ip_port, char **bind_addr )
{
switch( c ){
case 'h':
fprintf( stdout, "%s\n", mirror_help_text );
exit( 0 );
break;
case 's':
*sock = optarg;
break;
case 'l':
*ip_addr = optarg;
break;
case 'p':
*ip_port = optarg;
break;
case 'b':
*bind_addr = optarg;
break;
case 'q':
log_level = 4;
break;
case 'v':
log_level = VERBOSE_LOG_LEVEL;
break;
default:
exit_err( mirror_help_text );
break;
}
}
void read_status_param( int c, char **sock )
{
read_sock_param( c, sock, status_help_text );
}
int mode_serve( int argc, char *argv[] )
{
int c;
char *ip_addr = NULL;
char *ip_port = NULL;
char *file = NULL;
char *sock = NULL;
int default_deny = 0; // not on by default
int err = 0;
struct flexnbd * flexnbd;
while (1) {
c = getopt_long(argc, argv, serve_short_options, serve_options, NULL);
if ( c == -1 ) { break; }
read_serve_param( c, &ip_addr, &ip_port, &file, &sock, &default_deny );
}
if ( NULL == ip_addr || NULL == ip_port ) {
err = 1;
fprintf( stderr, "both --addr and --port are required.\n" );
}
if ( NULL == file ) {
err = 1;
fprintf( stderr, "--file is required\n" );
}
if ( err ) { exit_err( serve_help_text ); }
flexnbd = flexnbd_create_serving( ip_addr, ip_port, file, sock, default_deny, argc - optind, argv + optind, MAX_NBD_CLIENTS );
flexnbd_serve( flexnbd );
flexnbd_destroy( flexnbd );
return 0;
}
int mode_listen( int argc, char *argv[] )
{
int c;
char *ip_addr = NULL;
char *rebind_ip_addr = NULL;
char *ip_port = NULL;
char *rebind_ip_port = NULL;
char *file = NULL;
char *sock = NULL;
int default_deny = 0; // not on by default
int err = 0;
int success;
struct flexnbd * flexnbd;
while (1) {
c = getopt_long(argc, argv, listen_short_options, listen_options, NULL);
if ( c == -1 ) { break; }
read_listen_param( c, &ip_addr, &rebind_ip_addr, &ip_port, &rebind_ip_port,
&file, &sock, &default_deny );
}
if ( NULL == ip_addr || NULL == ip_port ) {
err = 1;
fprintf( stderr, "both --addr and --port are required.\n" );
}
if ( NULL == file ) {
err = 1;
fprintf( stderr, "--file is required\n" );
}
if ( err ) { exit_err( listen_help_text ); }
flexnbd = flexnbd_create_listening(
ip_addr,
rebind_ip_addr,
ip_port,
rebind_ip_port,
file,
sock,
default_deny,
argc - optind,
argv + optind,
MAX_NBD_CLIENTS );
success = flexnbd_serve( flexnbd );
flexnbd_destroy( flexnbd );
return success ? 0 : 1;
}
/* TODO: Separate this function.
* It should be:
* params_read( struct mode_readwrite_params* out,
* char *s_ip_address,
* char *s_port,
* char *s_from,
* char *s_length )
* params_write( struct mode_readwrite_params* out,
* char *s_ip_address,
* char *s_port,
* char *s_from,
* char *s_length,
* char *s_filename )
*/
void params_readwrite(
int write_not_read,
struct mode_readwrite_params* out,
char* s_ip_address,
char* s_port,
char* s_bind_address,
char* s_from,
char* s_length_or_filename
)
{
FATAL_IF_NULL(s_ip_address, "No IP address supplied");
FATAL_IF_NULL(s_port, "No port number supplied");
FATAL_IF_NULL(s_from, "No from supplied");
FATAL_IF_NULL(s_length_or_filename, "No length supplied");
FATAL_IF_ZERO(
parse_ip_to_sockaddr(&out->connect_to.generic, s_ip_address),
"Couldn't parse connection address '%s'",
s_ip_address
);
if (s_bind_address != NULL &&
parse_ip_to_sockaddr(&out->connect_from.generic, s_bind_address) == 0) {
fatal("Couldn't parse bind address '%s'", s_bind_address);
}
parse_port( s_port, &out->connect_to.v4 );
out->from = atol(s_from);
if (write_not_read) {
/* A leading digit means a length; anything else is a filename */
if (s_length_or_filename[0] >= '0' && s_length_or_filename[0] <= '9') {
out->len = atol(s_length_or_filename);
out->data_fd = 0;
}
else {
out->data_fd = open(
s_length_or_filename, O_RDONLY);
FATAL_IF_NEGATIVE(out->data_fd,
"Couldn't open %s", s_length_or_filename);
out->len = lseek64(out->data_fd, 0, SEEK_END);
FATAL_IF_NEGATIVE(out->len,
"Couldn't find length of %s", s_length_or_filename);
FATAL_IF_NEGATIVE(
lseek64(out->data_fd, 0, SEEK_SET),
"Couldn't rewind %s", s_length_or_filename
);
}
}
else {
out->len = atol(s_length_or_filename);
out->data_fd = 1;
}
}
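/* Illustrative sketch, not part of flexnbd: params_readwrite() treats
 * its last argument as a byte count when it starts with a digit and as
 * a filename otherwise. Such a test has to reject characters that sort
 * below '0', for example '/' and '.', which a bare subtraction such as
 * c - '0' < 10 would accept because the result is negative. The helper
 * name example_looks_like_length is hypothetical.
 */

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `arg` starts with a decimal digit, 0 otherwise
 * (including for the empty string). */
static int example_looks_like_length(const char *arg)
{
    assert(arg != NULL);
    /* The cast keeps isdigit() well defined for bytes >= 0x80. */
    return isdigit((unsigned char) arg[0]) != 0;
}
```

/* Paths such as "/tmp/disc.img" or "./image" are therefore opened as
 * files, while "4096" is parsed as a length.
 */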
int mode_read( int argc, char *argv[] )
{
int c;
char *ip_addr = NULL;
char *ip_port = NULL;
char *bind_addr = NULL;
char *from = NULL;
char *size = NULL;
int err = 0;
struct mode_readwrite_params readwrite;
while (1){
c = getopt_long(argc, argv, read_short_options, read_options, NULL);
if ( c == -1 ) { break; }
read_readwrite_param( c, &ip_addr, &ip_port, &bind_addr, &from, &size );
}
if ( NULL == ip_addr || NULL == ip_port ) {
err = 1;
fprintf( stderr, "both --addr and --port are required.\n" );
}
if ( NULL == from || NULL == size ) {
err = 1;
fprintf( stderr, "both --from and --size are required.\n" );
}
if ( err ) { exit_err( read_help_text ); }
memset( &readwrite, 0, sizeof( readwrite ) );
params_readwrite( 0, &readwrite, ip_addr, ip_port, bind_addr, from, size );
do_read( &readwrite );
return 0;
}
int mode_write( int argc, char *argv[] )
{
int c;
char *ip_addr = NULL;
char *ip_port = NULL;
char *bind_addr = NULL;
char *from = NULL;
char *size = NULL;
int err = 0;
struct mode_readwrite_params readwrite;
while (1){
c = getopt_long(argc, argv, write_short_options, write_options, NULL);
if ( c == -1 ) { break; }
read_readwrite_param( c, &ip_addr, &ip_port, &bind_addr, &from, &size );
}
if ( NULL == ip_addr || NULL == ip_port ) {
err = 1;
fprintf( stderr, "both --addr and --port are required.\n" );
}
if ( NULL == from || NULL == size ) {
err = 1;
fprintf( stderr, "both --from and --size are required.\n" );
}
if ( err ) { exit_err( write_help_text ); }
memset( &readwrite, 0, sizeof( readwrite ) );
params_readwrite( 1, &readwrite, ip_addr, ip_port, bind_addr, from, size );
do_write( &readwrite );
return 0;
}
int mode_acl( int argc, char *argv[] )
{
int c;
char *sock = NULL;
while (1) {
c = getopt_long( argc, argv, acl_short_options, acl_options, NULL );
if ( c == -1 ) { break; }
read_acl_param( c, &sock );
}
if ( NULL == sock ){
fprintf( stderr, "--sock is required.\n" );
exit_err( acl_help_text );
}
/* Don't use the CMD_ACL macro here, "acl" is the remote command
* name, not the cli option
*/
do_remote_command( "acl", sock, argc - optind, argv + optind );
return 0;
}
int mode_mirror( int argc, char *argv[] )
{
int c;
char *sock = NULL;
char *remote_argv[4] = {0};
int err = 0;
while (1) {
c = getopt_long( argc, argv, mirror_short_options, mirror_options, NULL);
if ( -1 == c ) { break; }
read_mirror_param( c, &sock, &remote_argv[0], &remote_argv[1], &remote_argv[2] );
}
if ( NULL == sock ){
fprintf( stderr, "--sock is required.\n" );
err = 1;
}
if ( NULL == remote_argv[0] || NULL == remote_argv[1] ) {
fprintf( stderr, "both --addr and --port are required.\n");
err = 1;
}
if ( err ) { exit_err( mirror_help_text ); }
if (remote_argv[2] == NULL) {
do_remote_command( "mirror", sock, 2, remote_argv );
}
else {
do_remote_command( "mirror", sock, 3, remote_argv );
}
return 0;
}
int mode_status( int argc, char *argv[] )
{
int c;
char *sock = NULL;
while (1) {
c = getopt_long( argc, argv, status_short_options, status_options, NULL );
if ( -1 == c ) { break; }
read_status_param( c, &sock );
}
if ( NULL == sock ){
fprintf( stderr, "--sock is required.\n" );
exit_err( status_help_text );
}
do_remote_command( "status", sock, argc - optind, argv + optind );
return 0;
}
int mode_help( int argc, char *argv[] )
{
char *cmd;
char *help_text = NULL;
if ( argc < 1 ){
help_text = help_help_text;
} else {
cmd = argv[0];
if (IS_CMD( CMD_SERVE, cmd ) ) {
help_text = serve_help_text;
} else if ( IS_CMD( CMD_LISTEN, cmd ) ) {
help_text = listen_help_text;
} else if ( IS_CMD( CMD_READ, cmd ) ) {
help_text = read_help_text;
} else if ( IS_CMD( CMD_WRITE, cmd ) ) {
help_text = write_help_text;
} else if ( IS_CMD( CMD_ACL, cmd ) ) {
help_text = acl_help_text;
} else if ( IS_CMD( CMD_MIRROR, cmd ) ) {
help_text = mirror_help_text;
} else if ( IS_CMD( CMD_STATUS, cmd ) ) {
help_text = status_help_text;
} else { exit_err( help_help_text ); }
}
fprintf( stdout, "%s\n", help_text );
return 0;
}
void mode(char* mode, int argc, char **argv)
{
if ( IS_CMD( CMD_SERVE, mode ) ) {
exit( mode_serve( argc, argv ) );
}
else if ( IS_CMD( CMD_LISTEN, mode ) ) {
exit( mode_listen( argc, argv ) );
}
else if ( IS_CMD( CMD_READ, mode ) ) {
mode_read( argc, argv );
}
else if ( IS_CMD( CMD_WRITE, mode ) ) {
mode_write( argc, argv );
}
else if ( IS_CMD( CMD_ACL, mode ) ) {
mode_acl( argc, argv );
}
else if ( IS_CMD( CMD_MIRROR, mode ) ) {
mode_mirror( argc, argv );
}
else if ( IS_CMD( CMD_STATUS, mode ) ) {
mode_status( argc, argv );
}
else if ( IS_CMD( CMD_HELP, mode ) ) {
mode_help( argc-1, argv+1 );
}
else {
mode_help( argc-1, argv+1 );
exit( 1 );
}
exit(0);
}
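/* Illustrative sketch, not part of flexnbd: each mode_*() function
 * follows the same shape. It loops over getopt_long(), stores optarg
 * pointers, checks that the required options were given, and passes the
 * remaining argv entries (from optind onwards) along as ACL addresses.
 * The struct and function below are hypothetical and cover only three
 * of the serve options.
 */

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

struct example_opts {
    const char *addr;
    const char *port;
    int         default_deny;
};

/* Parse argv into `out`. Returns the index of the first non-option
 * argument on success, or -1 if an option is unknown or a required
 * option is missing. */
static int example_parse(int argc, char *argv[], struct example_opts *out)
{
    static struct option longopts[] = {
        { "addr",         required_argument, NULL, 'l' },
        { "port",         required_argument, NULL, 'p' },
        { "default-deny", no_argument,       NULL, 'd' },
        { 0, 0, 0, 0 }
    };
    int c;

    assert(out != NULL);
    optind = 1;   /* restart scanning in case getopt was used before */
    while ((c = getopt_long(argc, argv, "l:p:d", longopts, NULL)) != -1) {
        switch (c) {
        case 'l': out->addr = optarg;    break;
        case 'p': out->port = optarg;    break;
        case 'd': out->default_deny = 1; break;
        default:  return -1;
        }
    }
    if (out->addr == NULL || out->port == NULL) { return -1; }
    return optind;
}
```

/* Long and short spellings can be mixed freely. Whatever follows the
 * options is left in argv for the caller, as with argv + optind above.
 */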

83
src/mode.h Normal file

@@ -0,0 +1,83 @@
#ifndef MODE_H
#define MODE_H
void mode(char* mode, int argc, char **argv);
#include <getopt.h>
#define GETOPT_ARG(x,s) {(x), 1, 0, (s)}
#define GETOPT_FLAG(x,v) {(x), 0, 0, (v)}
#define OPT_HELP "help"
#define OPT_ADDR "addr"
#define OPT_REBIND_ADDR "rebind-addr"
#define OPT_BIND "bind"
#define OPT_PORT "port"
#define OPT_REBIND_PORT "rebind-port"
#define OPT_FILE "file"
#define OPT_SOCK "sock"
#define OPT_FROM "from"
#define OPT_SIZE "size"
#define OPT_DENY "default-deny"
#define CMD_SERVE "serve"
#define CMD_LISTEN "listen"
#define CMD_READ "read"
#define CMD_WRITE "write"
#define CMD_ACL "acl"
#define CMD_MIRROR "mirror"
#define CMD_STATUS "status"
#define CMD_HELP "help"
#define LEN_CMD_MAX 7
#define PATH_LEN_MAX 1024
#define ADDR_LEN_MAX 64
#define IS_CMD(x,c) (strncmp((x),(c),(LEN_CMD_MAX)) == 0)
#define GETOPT_HELP GETOPT_FLAG( OPT_HELP, 'h' )
#define GETOPT_DENY GETOPT_FLAG( OPT_DENY, 'd' )
#define GETOPT_ADDR GETOPT_ARG( OPT_ADDR, 'l' )
#define GETOPT_REBIND_ADDR GETOPT_ARG( OPT_REBIND_ADDR, 'L')
#define GETOPT_PORT GETOPT_ARG( OPT_PORT, 'p' )
#define GETOPT_REBIND_PORT GETOPT_ARG( OPT_REBIND_PORT, 'P')
#define GETOPT_FILE GETOPT_ARG( OPT_FILE, 'f' )
#define GETOPT_SOCK GETOPT_ARG( OPT_SOCK, 's' )
#define GETOPT_FROM GETOPT_ARG( OPT_FROM, 'F' )
#define GETOPT_SIZE GETOPT_ARG( OPT_SIZE, 'S' )
#define GETOPT_BIND GETOPT_ARG( OPT_BIND, 'b' )
#define OPT_VERBOSE "verbose"
#define SOPT_VERBOSE "v"
#define GETOPT_VERBOSE GETOPT_FLAG( OPT_VERBOSE, 'v' )
#define VERBOSE_LINE \
"\t--" OPT_VERBOSE ",-" SOPT_VERBOSE "\t\tOutput debug information.\n"
#ifdef DEBUG
# define VERBOSE_LOG_LEVEL 0
#else
# define VERBOSE_LOG_LEVEL 1
#endif
#define OPT_QUIET "quiet"
#define SOPT_QUIET "q"
#define GETOPT_QUIET GETOPT_FLAG( OPT_QUIET, 'q' )
#define QUIET_LINE \
"\t--" OPT_QUIET ",-" SOPT_QUIET "\t\tOutput only fatal information.\n"
#define HELP_LINE \
"\t--" OPT_HELP ",-h \tThis text.\n"
#define SOCK_LINE \
"\t--" OPT_SOCK ",-s <SOCK>\tPath to the control socket.\n"
#define BIND_LINE \
"\t--" OPT_BIND ",-b <BIND-ADDR>\tBind the local socket to a particular IP address.\n"
char * help_help_text;
#endif

57
src/nbdtypes.c Normal file

@@ -0,0 +1,57 @@
#include "nbdtypes.h"
#include <string.h>
#include <endian.h>
/**
* We intentionally ignore the reserved 128 bytes at the end of the
* init message, since there's nothing we can do with them.
*/
void nbd_r2h_init( struct nbd_init_raw * from, struct nbd_init * to )
{
memcpy( to->passwd, from->passwd, 8 );
to->magic = be64toh( from->magic );
to->size = be64toh( from->size );
}
void nbd_h2r_init( struct nbd_init * from, struct nbd_init_raw * to)
{
memcpy( to->passwd, from->passwd, 8 );
to->magic = htobe64( from->magic );
to->size = htobe64( from->size );
}
void nbd_r2h_request( struct nbd_request_raw *from, struct nbd_request * to )
{
/* raw (big-endian wire format) to host byte order */
to->magic = be32toh( from->magic );
to->type = be32toh( from->type );
memcpy( to->handle, from->handle, 8 );
to->from = be64toh( from->from );
to->len = be32toh( from->len );
}
void nbd_h2r_request( struct nbd_request * from, struct nbd_request_raw * to )
{
/* host byte order to raw (big-endian wire format) */
to->magic = htobe32( from->magic );
to->type = htobe32( from->type );
memcpy( to->handle, from->handle, 8 );
to->from = htobe64( from->from );
to->len = htobe32( from->len );
}
void nbd_r2h_reply( struct nbd_reply_raw * from, struct nbd_reply * to )
{
to->magic = be32toh( from->magic );
to->error = be32toh( from->error );
memcpy( to->handle, from->handle, 8 );
}
void nbd_h2r_reply( struct nbd_reply * from, struct nbd_reply_raw * to )
{
to->magic = htobe32( from->magic );
to->error = htobe32( from->error );
memcpy( to->handle, from->handle, 8 );
}
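
The conversion pattern in this file can be sketched in isolation as follows. This is only an illustration, assuming a glibc-style <endian.h>; the example_* names are hypothetical stand-ins and are not part of nbdtypes.h. It shows that a host-to-raw conversion followed by raw-to-host returns the original values, and that the magic ends up big-endian on the wire.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* asks glibc to expose htobe32() and friends */
#endif
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct nbd_request_raw: fields are big-endian. */
struct example_request_raw {
	uint32_t magic;
	uint32_t type;
	char     handle[8];
	uint64_t from;
	uint32_t len;
} __attribute__((packed));

/* Hypothetical stand-in for struct nbd_request: fields are in host order. */
struct example_request {
	uint32_t magic;
	uint32_t type;
	char     handle[8];
	uint64_t from;
	uint32_t len;
};

/* Host to raw: every multi-byte integer goes through htobeNN(). */
static void example_h2r( const struct example_request *from,
                         struct example_request_raw *to )
{
	to->magic = htobe32( from->magic );
	to->type  = htobe32( from->type );
	memcpy( to->handle, from->handle, 8 );  /* opaque bytes, copied as-is */
	to->from  = htobe64( from->from );
	to->len   = htobe32( from->len );
}

/* Raw to host: the mirror image, using beNNtoh(). */
static void example_r2h( const struct example_request_raw *from,
                         struct example_request *to )
{
	to->magic = be32toh( from->magic );
	to->type  = be32toh( from->type );
	memcpy( to->handle, from->handle, 8 );
	to->from  = be64toh( from->from );
	to->len   = be32toh( from->len );
}

/* Returns 1 if the wire bytes are big-endian and the round trip is lossless. */
static int example_roundtrip_ok( void )
{
	struct example_request host = { 0x25609513, 1, "handle!", 4096, 512 };
	struct example_request_raw wire;
	struct example_request back;
	const unsigned char *bytes = (const unsigned char *) &wire;

	example_h2r( &host, &wire );
	example_r2h( &wire, &back );

	return bytes[0] == 0x25 && bytes[1] == 0x60
	    && bytes[2] == 0x95 && bytes[3] == 0x13
	    && back.magic == host.magic && back.type == host.type
	    && memcmp( back.handle, host.handle, 8 ) == 0
	    && back.from == host.from && back.len == host.len;
}
```

Because a byte swap is its own inverse, htobe32() and be32toh() produce the same result on any given host; choosing the one that matches the direction of the conversion is a matter of readability rather than behaviour.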

77
src/nbdtypes.h Normal file

@@ -0,0 +1,77 @@
#ifndef __NBDTYPES_H
#define __NBDTYPES_H
/* http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-09/2332.html */
#define INIT_PASSWD "NBDMAGIC"
#define INIT_MAGIC 0x0000420281861253
#define REQUEST_MAGIC 0x25609513
#define REPLY_MAGIC 0x67446698
#define REQUEST_READ 0
#define REQUEST_WRITE 1
#define REQUEST_DISCONNECT 2
#define REQUEST_ENTRUST (1<<16)
#include <linux/types.h>
#include <inttypes.h>
/* The _raw types are the types as they appear on the wire. Non-_raw
* types are in host-format.
* Conversion functions are _r2h_ for converting raw to host, and _h2r_
* for converting host to raw.
*/
struct nbd_init_raw {
char passwd[8];
__be64 magic;
__be64 size;
char reserved[128];
};
struct nbd_request_raw {
__be32 magic;
__be32 type; /* == READ || == WRITE */
char handle[8];
__be64 from;
__be32 len;
} __attribute__((packed));
struct nbd_reply_raw {
__be32 magic;
__be32 error; /* 0 = ok, else error */
char handle[8]; /* handle you got from request */
};
struct nbd_init {
char passwd[8];
uint64_t magic;
uint64_t size;
char reserved[128];
};
struct nbd_request {
uint32_t magic;
uint32_t type; /* == READ || == WRITE */
char handle[8];
uint64_t from;
uint32_t len;
} __attribute__((packed));
struct nbd_reply {
uint32_t magic;
uint32_t error; /* 0 = ok, else error */
char handle[8]; /* handle you got from request */
};
void nbd_r2h_init( struct nbd_init_raw * from, struct nbd_init * to );
void nbd_r2h_request( struct nbd_request_raw *from, struct nbd_request * to );
void nbd_r2h_reply( struct nbd_reply_raw * from, struct nbd_reply * to );
void nbd_h2r_init( struct nbd_init * from, struct nbd_init_raw * to);
void nbd_h2r_request( struct nbd_request * from, struct nbd_request_raw * to );
void nbd_h2r_reply( struct nbd_reply * from, struct nbd_reply_raw * to );
#endif

110
src/parse.c Normal file

@@ -0,0 +1,110 @@
#include "parse.h"
#include "util.h"
#include <stdlib.h> /* atoi() */
#define IS_IP_VALID_CHAR(x) ( ((x) >= '0' && (x) <= '9' ) || \
((x) >= 'a' && (x) <= 'f') || \
((x) >= 'A' && (x) <= 'F' ) || \
(x) == ':' || (x) == '.' \
)
/* FIXME: should change this to return negative on error like everything else */
int parse_ip_to_sockaddr(struct sockaddr* out, char* src)
{
NULLCHECK( out );
NULLCHECK( src );
char temp[64];
struct sockaddr_in *v4 = (struct sockaddr_in *) out;
struct sockaddr_in6 *v6 = (struct sockaddr_in6 *) out;
/* allow user to start with [ and end with any other invalid char */
{
int i=0, j=0;
if (src[i] == '[') { i++; }
/* bound on j so that the terminating NUL always fits in temp */
for (; j < (int) sizeof(temp) - 1 && IS_IP_VALID_CHAR(src[i]); i++) {
temp[j++] = src[i];
}
temp[j] = 0;
}
if (temp[0] == '0' && temp[1] == '\0') {
v4->sin_family = AF_INET;
v4->sin_addr.s_addr = INADDR_ANY;
return 1;
}
if (inet_pton(AF_INET, temp, &v4->sin_addr) == 1) {
out->sa_family = AF_INET;
return 1;
}
if (inet_pton(AF_INET6, temp, &v6->sin6_addr) == 1) {
out->sa_family = AF_INET6;
return 1;
}
return 0;
}
int parse_acl(struct ip_and_mask (**out)[], int max, char **entries)
{
struct ip_and_mask* list;
int i;
if (max == 0) {
*out = NULL;
return 0;
}
else {
list = xmalloc(max * sizeof(struct ip_and_mask));
*out = (struct ip_and_mask (*)[])list;
debug("acl alloc: %p", *out);
}
for (i = 0; i < max; i++) {
int j;
struct ip_and_mask* outentry = &list[i];
# define MAX_MASK_BITS (outentry->ip.family == AF_INET ? 32 : 128)
if (parse_ip_to_sockaddr(&outentry->ip.generic, entries[i]) == 0) {
return i;
}
for (j=0; entries[i][j] && entries[i][j] != '/'; j++)
; // increment j!
if (entries[i][j] == '/') {
outentry->mask = atoi(entries[i]+j+1);
if (outentry->mask < 1 || outentry->mask > MAX_MASK_BITS) {
return i;
}
}
else {
outentry->mask = MAX_MASK_BITS;
}
# undef MAX_MASK_BITS
debug("acl ptr[%d]: %p %d",i, outentry, outentry->mask);
}
for (i=0; i < max; i++) {
debug("acl entry %d @ %p has mask %d", i, (void *) &list[i], list[i].mask);
}
return max;
}
void parse_port( char *s_port, struct sockaddr_in *out )
{
NULLCHECK( s_port );
int raw_port;
raw_port = atoi( s_port );
if ( raw_port < 0 || raw_port > 65535 ) {
fatal( "Port number must be >= 0 and <= 65535" );
}
out->sin_port = htobe16( raw_port );
}

25
src/parse.h Normal file

@@ -0,0 +1,25 @@
#ifndef PARSE_H
#define PARSE_H
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
union mysockaddr {
unsigned short family;
struct sockaddr generic;
struct sockaddr_in v4;
struct sockaddr_in6 v6;
};
struct ip_and_mask {
union mysockaddr ip;
int mask;
};
int parse_ip_to_sockaddr(struct sockaddr* out, char* src);
int parse_acl(struct ip_and_mask (**out)[], int max, char **entries);
void parse_port( char *s_port, struct sockaddr_in *out );
#endif

216
src/readwrite.c Normal file

@@ -0,0 +1,216 @@
#include "nbdtypes.h"
#include "ioutil.h"
#include "util.h"
#include "serve.h"
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
int socket_connect(struct sockaddr* to, struct sockaddr* from)
{
int fd = socket(to->sa_family == AF_INET ? PF_INET : PF_INET6, SOCK_STREAM, 0);
if( fd < 0 ){
warn( "Couldn't create client socket");
return -1;
}
if (NULL != from) {
if ( 0 > bind(fd, from, sizeof(struct sockaddr_in6)) ){
warn( "bind() failed");
close( fd );
return -1;
}
}
if ( 0 > connect(fd, to, sizeof(struct sockaddr_in6)) ) {
warn( "connect failed" );
close( fd );
return -1;
}
return fd;
}
int socket_nbd_read_hello(int fd, off64_t * out_size)
{
struct nbd_init init;
if ( 0 > readloop(fd, &init, sizeof(init)) ) {
warn( "Couldn't read init" );
goto fail;
}
if (strncmp(init.passwd, INIT_PASSWD, 8) != 0) {
warn("wrong passwd");
goto fail;
}
if (be64toh(init.magic) != INIT_MAGIC) {
warn("wrong magic (%llx)", (unsigned long long) be64toh(init.magic));
goto fail;
}
if ( NULL != out_size ) {
*out_size = be64toh(init.size);
}
return 1;
fail:
return 0;
}
void fill_request(struct nbd_request *request, int type, off64_t from, int len)
{
request->magic = htobe32(REQUEST_MAGIC);
request->type = htobe32(type);
((int*) request->handle)[0] = rand();
((int*) request->handle)[1] = rand();
request->from = htobe64(from);
request->len = htobe32(len);
}
void read_reply(int fd, struct nbd_request *request, struct nbd_reply *reply)
{
struct nbd_reply_raw reply_raw;
ERROR_IF_NEGATIVE(readloop(fd, &reply_raw, sizeof(struct nbd_reply_raw)),
"Couldn't read reply");
nbd_r2h_reply( &reply_raw, reply );
if (reply->magic != REPLY_MAGIC) {
error("Reply magic incorrect (%x)", reply->magic);
}
if (reply->error != 0) {
error("Server replied with error %d", reply->error);
}
/* handles are opaque binary data, so compare all 8 bytes */
if (memcmp(request->handle, reply->handle, 8) != 0) {
error("Did not reply with correct handle");
}
}
void wait_for_data( int fd, int timeout_secs )
{
fd_set fds;
struct timeval tv = {timeout_secs, 0};
int selected;
FD_ZERO( &fds );
FD_SET( fd, &fds );
selected = select( FD_SETSIZE,
&fds, NULL, NULL,
timeout_secs >=0 ? &tv : NULL );
FATAL_IF( -1 == selected, "Select failed" );
ERROR_IF( 0 == selected, "Timed out waiting for reply" );
}
void socket_nbd_read(int fd, off64_t from, int len, int out_fd, void* out_buf, int timeout_secs)
{
struct nbd_request request;
struct nbd_reply reply;
fill_request(&request, REQUEST_READ, from, len);
FATAL_IF_NEGATIVE(writeloop(fd, &request, sizeof(request)),
"Couldn't write request");
wait_for_data( fd, timeout_secs );
read_reply(fd, &request, &reply);
if (out_buf) {
FATAL_IF_NEGATIVE(readloop(fd, out_buf, len),
"Read failed");
}
else {
FATAL_IF_NEGATIVE(
splice_via_pipe_loop(fd, out_fd, len),
"Splice failed"
);
}
}
void socket_nbd_write(int fd, off64_t from, int len, int in_fd, void* in_buf, int timeout_secs)
{
struct nbd_request request;
struct nbd_reply reply;
fill_request(&request, REQUEST_WRITE, from, len);
ERROR_IF_NEGATIVE(writeloop(fd, &request, sizeof(request)),
"Couldn't write request");
if (in_buf) {
ERROR_IF_NEGATIVE(writeloop(fd, in_buf, len),
"Write failed");
}
else {
ERROR_IF_NEGATIVE(
splice_via_pipe_loop(in_fd, fd, len),
"Splice failed"
);
}
wait_for_data( fd, timeout_secs );
read_reply(fd, &request, &reply);
}
void socket_nbd_entrust( int fd )
{
struct nbd_request request;
struct nbd_reply reply;
fill_request( &request, REQUEST_ENTRUST, 0, 0 );
FATAL_IF_NEGATIVE( writeloop( fd, &request, sizeof( request ) ),
"Couldn't write request");
read_reply( fd, &request, &reply );
}
int socket_nbd_disconnect( int fd )
{
int success = 1;
struct nbd_request request;
fill_request( &request, REQUEST_DISCONNECT, 0, 0 );
/* FIXME: This shouldn't be a FATAL error. We should just drop
* the mirror without affecting the main server.
*/
FATAL_IF_NEGATIVE( writeloop( fd, &request, sizeof( request ) ),
"Failed to write the disconnect request." );
return success;
}
#define CHECK_RANGE(error_type) { \
off64_t size;\
int success = socket_nbd_read_hello(params->client, &size); \
if ( success ) {\
if (params->from < 0 || (params->from + params->len) > size) {\
fatal(error_type \
" request %lld+%lld is out of range given size %lld", \
(long long) params->from, (long long) params->len, (long long) size\
);\
}\
}\
else {\
fatal( error_type " connection failed." );\
}\
}
void do_read(struct mode_readwrite_params* params)
{
params->client = socket_connect(&params->connect_to.generic, &params->connect_from.generic);
FATAL_IF_NEGATIVE( params->client, "Couldn't connect." );
CHECK_RANGE("read");
socket_nbd_read(params->client, params->from, params->len,
params->data_fd, NULL, 10);
close(params->client);
}
void do_write(struct mode_readwrite_params* params)
{
params->client = socket_connect(&params->connect_to.generic, &params->connect_from.generic);
FATAL_IF_NEGATIVE( params->client, "Couldn't connect." );
CHECK_RANGE("write");
socket_nbd_write(params->client, params->from, params->len,
params->data_fd, NULL, 10);
close(params->client);
}

16
src/readwrite.h Normal file

@@ -0,0 +1,16 @@
#ifndef READWRITE_H
#define READWRITE_H
#include <sys/types.h>
#include <sys/socket.h>
int socket_connect(struct sockaddr* to, struct sockaddr* from);
int socket_nbd_read_hello(int fd, off64_t * size);
void socket_nbd_read(int fd, off64_t from, int len, int out_fd, void* out_buf, int timeout_secs);
void socket_nbd_write(int fd, off64_t from, int len, int in_fd, void* in_buf, int timeout_secs);
void socket_nbd_entrust(int fd);
int socket_nbd_disconnect( int fd );
#endif

68
src/remote.c Normal file

@@ -0,0 +1,68 @@
#include "ioutil.h"
#include "util.h"
#include <stdlib.h>
#include <sys/un.h>
static const int max_response=1024;
void print_response( const char * response )
{
char * response_text;
FILE * out;
int exit_status;
NULLCHECK( response );
exit_status = atoi(response);
response_text = strchr( response, ':' );
NULLCHECK( response_text );
/* skip the ':' and the space which follows it */
response_text += 2;
out = exit_status > 0 ? stderr : stdout;
fprintf(out, "%s\n", response_text );
}
void do_remote_command(char* command, char* socket_name, int argc, char** argv)
{
char newline=10;
int i;
debug( "connecting to run remote command %s", command );
int remote = socket(AF_UNIX, SOCK_STREAM, 0);
struct sockaddr_un address;
char response[max_response];
memset(&address, 0, sizeof(address));
FATAL_IF_NEGATIVE(remote, "Couldn't create client socket");
address.sun_family = AF_UNIX;
strncpy(address.sun_path, socket_name, sizeof(address.sun_path) - 1);
FATAL_IF_NEGATIVE(
connect(remote, (struct sockaddr*) &address, sizeof(address)),
"Couldn't connect to %s", socket_name
);
write(remote, command, strlen(command));
write(remote, &newline, 1);
for (i=0; i<argc; i++) {
if ( NULL != argv[i] ) {
write(remote, argv[i], strlen(argv[i]));
}
write(remote, &newline, 1);
}
write(remote, &newline, 1);
FATAL_IF_NEGATIVE(
read_until_newline(remote, response, max_response),
"Couldn't read response from %s", socket_name
);
print_response( response );
close(remote);
exit(atoi(response));
}

148
src/self_pipe.c Normal file

@@ -0,0 +1,148 @@
/**
* self_pipe.c
*
* author: Alex Young <alex@bytemark.co.uk>
*
* Wrapper for the self-pipe trick for select()-based thread
* synchronisation. Get yourself a self_pipe with self_pipe_create(),
* select() on the read end of the pipe with the help of
* self_pipe_fd_set( sig, fds ) and self_pipe_fd_isset( sig, fds ).
* When you've received a signal, clear it with
* self_pipe_signal_clear(sig) so that the buffer doesn't get filled.
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include "util.h"
#include "self_pipe.h"
#define ERR_MSG_PIPE "Couldn't open a pipe for signaling."
#define ERR_MSG_FCNTL "Couldn't set a signaling pipe non-blocking."
#define ERR_MSG_WRITE "Couldn't write to a signaling pipe."
#define ERR_MSG_READ "Couldn't read from a signaling pipe."
void self_pipe_server_error( int err, char *msg )
{
char errbuf[1024] = {0};
strerror_r( err, errbuf, 1024 );
fatal( "%s\t%d (%s)", msg, err, errbuf );
}
/**
* Allocate a struct self_pipe, opening the pipe.
*
* Returns NULL if the pipe couldn't be opened or if we couldn't set it
* non-blocking.
*
* Remember to call self_pipe_destroy when you're done with the return
* value.
*/
struct self_pipe * self_pipe_create(void)
{
struct self_pipe *sig = xmalloc( sizeof( struct self_pipe ) );
int fds[2];
int fcntl_err;
if ( NULL == sig ) { return NULL; }
if ( pipe( fds ) ) {
free( sig );
self_pipe_server_error( errno, ERR_MSG_PIPE );
return NULL;
}
if ( fcntl( fds[0], F_SETFL, O_NONBLOCK ) || fcntl( fds[1], F_SETFL, O_NONBLOCK ) ) {
fcntl_err = errno;
while( close( fds[0] ) == -1 && errno == EINTR );
while( close( fds[1] ) == -1 && errno == EINTR );
free( sig );
self_pipe_server_error( fcntl_err, ERR_MSG_FCNTL );
return NULL;
}
sig->read_fd = fds[0];
sig->write_fd = fds[1];
return sig;
}
/**
* Send a signal to anyone select()ing on this signal.
*
* Returns 1 on success. Can fail if weirdness happened to the write fd
* of the pipe in the self_pipe struct.
*/
int self_pipe_signal( struct self_pipe * sig )
{
NULLCHECK( sig );
FATAL_IF( 1 == sig->write_fd, "Shouldn't be writing to stdout" );
FATAL_IF( 2 == sig->write_fd, "Shouldn't be writing to stderr" );
int written = write( sig->write_fd, "X", 1 );
if ( written != 1 ) {
self_pipe_server_error( errno, ERR_MSG_WRITE );
return 0;
}
return 1;
}
/**
* Clear a received signal from the pipe. Every signal sent must be
* cleared by one (and only one) recipient when they return from select()
* if the signal is to be used more than once.
* Returns the number of bytes read, which will be 1 on success and 0 if
* there was no signal.
*/
int self_pipe_signal_clear( struct self_pipe *sig )
{
char buf[1];
return 1 == read( sig->read_fd, buf, 1 );
}
/**
* Close the pipe and free the self_pipe. Do not try to use the
* self_pipe struct after calling this, the innards are mush.
*/
int self_pipe_destroy( struct self_pipe * sig )
{
NULLCHECK(sig);
while( close( sig->read_fd ) == -1 && errno == EINTR );
while( close( sig->write_fd ) == -1 && errno == EINTR );
/* Just in case anyone *does* try to use this after free,
* we should set the memory locations to an error value
*/
sig->read_fd = -1;
sig->write_fd = -1;
free( sig );
return 1;
}
int self_pipe_fd_set( struct self_pipe * sig, fd_set * fds)
{
FD_SET( sig->read_fd, fds );
return 1;
}
int self_pipe_fd_isset( struct self_pipe * sig, fd_set * fds)
{
return FD_ISSET( sig->read_fd, fds );
}
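
The self-pipe pattern that this file wraps can be sketched without the wrapper as follows. This is an illustration only: the example_* names are hypothetical and are not part of self_pipe.h, and a zero-timeout select() stands in for the blocking select() a real consumer thread would use.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 if select() failed. */
static int example_poll_readable( int fd )
{
	fd_set fds;
	struct timeval tv = { 0, 0 };   /* zero timeout: poll, don't block */

	FD_ZERO( &fds );
	FD_SET( fd, &fds );
	if ( select( fd + 1, &fds, NULL, NULL, &tv ) < 0 ) { return -1; }
	return FD_ISSET( fd, &fds ) ? 1 : 0;
}

/* Walks through signal / select / clear once. Returns 1 if every step
 * behaved as the self-pipe trick expects, 0 otherwise. */
static int example_self_pipe_demo( void )
{
	int fds[2];
	char buf[1];
	int before, sent, during, cleared, after;

	if ( pipe( fds ) != 0 ) { return 0; }
	if ( fcntl( fds[0], F_SETFL, O_NONBLOCK ) ||
	     fcntl( fds[1], F_SETFL, O_NONBLOCK ) ) {
		close( fds[0] );
		close( fds[1] );
		return 0;
	}

	before  = example_poll_readable( fds[0] );      /* nothing signalled yet */
	sent    = ( write( fds[1], "X", 1 ) == 1 );     /* send the signal */
	during  = example_poll_readable( fds[0] );      /* select() now sees it */
	cleared = ( read( fds[0], buf, 1 ) == 1 );      /* clear the signal */
	after   = example_poll_readable( fds[0] );      /* pipe is empty again */

	close( fds[0] );
	close( fds[1] );

	return before == 0 && sent && during == 1 && cleared && after == 0;
}
```

Making both ends non-blocking means a sender can never stall on a full pipe and a late clear can never stall on an empty one, which is why each signal should be cleared by exactly one recipient.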

19
src/self_pipe.h Normal file

@@ -0,0 +1,19 @@
#ifndef SELF_PIPE_H
#define SELF_PIPE_H
#include <sys/select.h>
struct self_pipe {
int read_fd;
int write_fd;
};
struct self_pipe * self_pipe_create(void);
int self_pipe_signal( struct self_pipe * sig );
int self_pipe_signal_clear( struct self_pipe *sig );
int self_pipe_destroy( struct self_pipe * sig );
int self_pipe_fd_set( struct self_pipe * sig, fd_set * fds );
int self_pipe_fd_isset( struct self_pipe *sig, fd_set *fds );
#endif

770
src/serve.c Normal file

@@ -0,0 +1,770 @@
#include "serve.h"
#include "client.h"
#include "nbdtypes.h"
#include "ioutil.h"
#include "util.h"
#include "bitset.h"
#include "control.h"
#include "self_pipe.h"
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/un.h>
#include <fcntl.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/socket.h>
#include <netinet/tcp.h>
static inline void* sockaddr_address_data(struct sockaddr* sockaddr)
{
NULLCHECK( sockaddr );
struct sockaddr_in* in = (struct sockaddr_in*) sockaddr;
struct sockaddr_in6* in6 = (struct sockaddr_in6*) sockaddr;
if (sockaddr->sa_family == AF_INET) {
return &in->sin_addr;
}
if (sockaddr->sa_family == AF_INET6) {
return &in6->sin6_addr;
}
return NULL;
}
struct server * server_create (
struct flexnbd * flexnbd,
char* s_ip_address,
char* s_port,
char* s_file,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients,
int has_control)
{
NULLCHECK( flexnbd );
struct server * out;
out = xmalloc( sizeof( struct server ) );
out->flexnbd = flexnbd;
out->has_control = has_control;
out->max_nbd_clients = max_nbd_clients;
out->nbd_client = xmalloc( max_nbd_clients * sizeof( struct client_tbl_entry ) );
out->tcp_backlog = 10; /* does this need to be settable? */
FATAL_IF_NULL(s_ip_address, "No IP address supplied");
FATAL_IF_NULL(s_port, "No port number supplied");
FATAL_IF_NULL(s_file, "No filename supplied");
NULLCHECK( s_ip_address );
FATAL_IF_ZERO(
parse_ip_to_sockaddr(&out->bind_to.generic, s_ip_address),
"Couldn't parse server address '%s' (use 0 if "
"you want to bind to all IPs)",
s_ip_address
);
out->acl = acl_create( acl_entries, s_acl_entries, default_deny );
if (out->acl && out->acl->len != acl_entries) {
fatal("Bad ACL entry '%s'", s_acl_entries[out->acl->len]);
}
parse_port( s_port, &out->bind_to.v4 );
out->filename = s_file;
out->filename_incomplete = xmalloc(strlen(s_file)+11+1);
strcpy(out->filename_incomplete, s_file);
strcpy(out->filename_incomplete + strlen(s_file), ".INCOMPLETE");
out->l_io = flexthread_mutex_create();
out->l_acl= flexthread_mutex_create();
out->close_signal = self_pipe_create();
out->acl_updated_signal = self_pipe_create();
NULLCHECK( out->close_signal );
NULLCHECK( out->acl_updated_signal );
return out;
}
void server_destroy( struct server * serve )
{
self_pipe_destroy( serve->acl_updated_signal );
serve->acl_updated_signal = NULL;
self_pipe_destroy( serve->close_signal );
serve->close_signal = NULL;
flexthread_mutex_destroy( serve->l_acl );
flexthread_mutex_destroy( serve->l_io );
if ( serve->acl ) {
acl_destroy( serve->acl );
serve->acl = NULL;
}
free( serve->filename_incomplete );
free( serve->nbd_client );
free( serve );
}
void server_dirty(struct server *serve, off64_t from, int len)
{
NULLCHECK( serve );
if (serve->mirror) {
bitset_set_range(serve->mirror->dirty_map, from, len);
}
}
#define SERVER_LOCK( s, f, msg ) \
do { NULLCHECK( s ); \
FATAL_IF( 0 != flexthread_mutex_lock( s->f ), msg ); } while (0)
#define SERVER_UNLOCK( s, f, msg ) \
do { NULLCHECK( s ); \
FATAL_IF( 0 != flexthread_mutex_unlock( s->f ), msg ); } while (0)
void server_lock_io( struct server * serve)
{
debug("IO locking");
SERVER_LOCK( serve, l_io, "Problem with I/O lock" );
}
void server_unlock_io( struct server* serve )
{
debug("IO unlocking");
SERVER_UNLOCK( serve, l_io, "Problem with I/O unlock" );
}
/* This is only to be called from error handlers. */
int server_io_locked( struct server * serve )
{
NULLCHECK( serve );
return flexthread_mutex_held( serve->l_io );
}
void server_lock_acl( struct server *serve )
{
debug("ACL locking");
SERVER_LOCK( serve, l_acl, "Problem with ACL lock" );
}
void server_unlock_acl( struct server *serve )
{
SERVER_UNLOCK( serve, l_acl, "Problem with ACL unlock" );
}
int server_acl_locked( struct server * serve )
{
NULLCHECK( serve );
return flexthread_mutex_held( serve->l_acl );
}
/** Return the actual port the server bound to. This is used because we
* are allowed to pass "0" on the command-line.
*/
int server_port( struct server * server )
{
NULLCHECK( server );
union mysockaddr addr;
socklen_t len = sizeof( addr.v4 );
if ( getsockname( server->server_fd, &addr.generic, &len ) < 0 ) {
fatal( "Failed to get the port number." );
}
return be16toh( addr.v4.sin_port );
}
/* Try to bind to our serving socket, retrying until it works or gives a
* fatal error. */
void serve_bind( struct server * serve )
{
int bind_result;
char s_address[64];
memset( s_address, 0, 64 );
strcpy( s_address, "???" );
inet_ntop( serve->bind_to.generic.sa_family,
sockaddr_address_data( &serve->bind_to.generic),
s_address, 64 );
do {
bind_result = bind(
serve->server_fd,
&serve->bind_to.generic,
sizeof(serve->bind_to));
if ( 0 == bind_result ) {
info( "Bound to %s port %d",
s_address,
ntohs(serve->bind_to.v4.sin_port));
break;
}
else {
warn( "Couldn't bind to %s port %d: %s",
s_address,
ntohs(serve->bind_to.v4.sin_port),
strerror( errno ) );
switch (errno){
/* bind() can give us EACCES,
* EADDRINUSE, EADDRNOTAVAIL, EBADF,
* EINVAL or ENOTSOCK.
*
* Any of these other than EACCES,
* EADDRINUSE or EADDRNOTAVAIL signify
* that there's a logic error somewhere.
*/
case EACCES:
case EADDRINUSE:
case EADDRNOTAVAIL:
debug("retrying");
sleep(1);
continue;
default:
fatal( "Giving up" );
}
}
} while ( 1 );
}
/** Prepares a listening socket for the NBD server, binding etc. */
void serve_open_server_socket(struct server* params)
{
NULLCHECK( params );
int optval=1;
params->server_fd= socket(params->bind_to.generic.sa_family == AF_INET ?
PF_INET : PF_INET6, SOCK_STREAM, 0);
FATAL_IF_NEGATIVE(params->server_fd,
"Couldn't create server socket");
/* We need SO_REUSEADDR so that when we switch from listening to
* serving we don't have to change address if we don't want to.
*
* If this fails, it's not necessarily bad in principle, but at
* this point in the code we can't tell if it's going to be a
* problem. It's also indicative of something odd going on, so
* we barf.
*/
FATAL_IF_NEGATIVE(
setsockopt(params->server_fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)),
"Couldn't set SO_REUSEADDR"
);
/* TCP_NODELAY makes everything not be slow. If we can't set
* this, again, there's something odd going on which we don't
* understand.
*/
FATAL_IF_NEGATIVE(
setsockopt(params->server_fd, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval)),
"Couldn't set TCP_NODELAY"
);
/* If we can't bind, presumably that's because someone else is
* squatting on our ip/port combo, or the ip isn't yet
* configured. Ideally we want to retry this. */
serve_bind(params);
FATAL_IF_NEGATIVE(
listen(params->server_fd, params->tcp_backlog),
"Couldn't listen on server socket"
);
}
int tryjoin_client_thread( struct client_tbl_entry *entry, int (*joinfunc)(pthread_t, void **) )
{
NULLCHECK( entry );
NULLCHECK( joinfunc );
int was_closed = 0;
void * status=NULL;
int join_errno;
if (entry->thread != 0) {
char s_client_address[64];
memset(s_client_address, 0, 64);
strcpy(s_client_address, "???");
inet_ntop( entry->address.generic.sa_family,
sockaddr_address_data(&entry->address.generic),
s_client_address,
64 );
debug( "%s(%p,...)", joinfunc == pthread_join ? "joining" : "tryjoining", entry->thread );
join_errno = joinfunc(entry->thread, &status);
/* join_errno can legitimately be ESRCH if the thread is
* already dead, but the client still needs tidying up. */
if (join_errno != 0 && !entry->client->stopped ) {
FATAL_UNLESS( join_errno == EBUSY,
"Problem with joining thread %p: %s",
entry->thread,
strerror(join_errno) );
}
else {
debug("nbd thread %016x exited (%s) with status %ld",
entry->thread,
s_client_address,
(uint64_t)status);
client_destroy( entry->client );
entry->client = NULL;
entry->thread = 0;
was_closed = 1;
}
}
return was_closed;
}
/**
* Check to see if a client thread has finished, and if so, tidy up
* after it.
* Returns 1 if the thread was cleaned up and the slot freed, 0
* otherwise.
*
* It's important that client_destroy gets called in the same thread
* which signals the client threads to stop. This avoids the
* possibility of sending a stop signal via a signal which has already
* been destroyed. However, it means that stopped client threads,
* including their signal pipes, won't be cleaned up until the next new
* client connection attempt.
*/
int cleanup_client_thread( struct client_tbl_entry * entry )
{
return tryjoin_client_thread( entry, pthread_tryjoin_np );
}
void cleanup_client_threads( struct client_tbl_entry * entries, size_t entries_len )
{
size_t i;
for( i = 0; i < entries_len; i++ ) {
cleanup_client_thread( &entries[i] );
}
}
/**
* Join a client thread after having sent a stop signal to it.
* This function will not return until pthread_join has returned, so
* ensures that the client thread is dead.
*/
int join_client_thread( struct client_tbl_entry *entry )
{
return tryjoin_client_thread( entry, pthread_join );
}
/** We can only accommodate MAX_NBD_CLIENTS connections at once. This function
* goes through the current list, waits for any threads that have finished
* and returns the next slot free (or -1 if there are none).
*/
int cleanup_and_find_client_slot(struct server* params)
{
NULLCHECK( params );
int slot=-1, i;
cleanup_client_threads( params->nbd_client, params->max_nbd_clients );
for ( i = 0; i < params->max_nbd_clients; i++ ) {
if( params->nbd_client[i].thread == 0 && slot == -1 ){
slot = i;
break;
}
}
return slot;
}
/** Check whether the address client_address is allowed or not according
* to the current acl. If params->acl is NULL, the result will be 1,
* otherwise it will be the result of acl_includes().
*/
int server_acl_accepts( struct server *params, union mysockaddr * client_address )
{
NULLCHECK( params );
NULLCHECK( client_address );
struct acl * acl;
int accepted;
server_lock_acl( params );
{
acl = params->acl;
accepted = acl ? acl_includes( acl, client_address ) : 1;
}
server_unlock_acl( params );
return accepted;
}
int server_should_accept_client(
struct server * params,
union mysockaddr * client_address,
char *s_client_address,
size_t s_client_address_len )
{
NULLCHECK( params );
NULLCHECK( client_address );
NULLCHECK( s_client_address );
if (inet_ntop(client_address->generic.sa_family,
sockaddr_address_data(&client_address->generic),
s_client_address, s_client_address_len ) == NULL) {
warn( "Rejecting client %s: Bad client_address", s_client_address );
return 0;
}
if ( !server_acl_accepts( params, client_address ) ) {
warn( "Rejecting client %s: Access control error", s_client_address );
debug( "We %s have an acl, and default_deny is %s",
(params->acl ? "do" : "do not"),
(params->acl && params->acl->default_deny ? "true" : "false") );
return 0;
}
return 1;
}
int spawn_client_thread(
struct client * client_params,
pthread_t *out_thread)
{
int result = pthread_create(out_thread, NULL, client_serve, client_params);
return result;
}
/** Dispatch function for accepting an NBD connection and starting a thread
* to handle it. Rejects the connection if there is an ACL, and the far end's
* address doesn't match, or if there are too many clients already connected.
*/
void accept_nbd_client(
struct server* params,
int client_fd,
union mysockaddr* client_address)
{
NULLCHECK(params);
NULLCHECK(client_address);
struct client* client_params;
int slot;
char s_client_address[64] = {0};
if ( !server_should_accept_client( params, client_address, s_client_address, 64 ) ) {
close( client_fd );
return;
}
slot = cleanup_and_find_client_slot(params);
if (slot < 0) {
warn("too many clients to accept connection");
close(client_fd);
return;
}
debug( "Client %s accepted.", s_client_address );
client_params = client_create( params, client_fd );
params->nbd_client[slot].client = client_params;
memcpy(&params->nbd_client[slot].address, client_address,
sizeof(union mysockaddr));
pthread_t * thread = &params->nbd_client[slot].thread;
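if ( 0 != spawn_client_thread( client_params, thread ) ) {
debug( "Thread creation problem." );
client_destroy( client_params );
/* Free the slot again so that we never try to join a thread
 * which was never started. */
params->nbd_client[slot].client = NULL;
params->nbd_client[slot].thread = 0;
close(client_fd);
return;
}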
debug("nbd thread %p started (%s)", params->nbd_client[slot].thread, s_client_address);
}
void server_audit_clients( struct server * serve)
{
NULLCHECK( serve );
int i;
struct client_tbl_entry * entry;
/* There's an apparent race here. If the acl updates while
* we're traversing the nbd_clients array, the earlier entries
* won't have been audited against the later acl. This isn't a
* problem though, because in order to update the acl
* server_replace_acl must have been called, so the
* server_accept loop will see a second acl_updated signal as
* soon as it hits select, and a second audit will be run.
*/
for( i = 0; i < serve->max_nbd_clients; i++ ) {
entry = &serve->nbd_client[i];
if ( 0 == entry->thread ) { continue; }
if ( server_acl_accepts( serve, &entry->address ) ) { continue; }
client_signal_stop( entry->client );
}
}
int server_is_closed(struct server* serve)
{
NULLCHECK( serve );
return fd_is_closed( serve->server_fd );
}
void server_close_clients( struct server *params )
{
NULLCHECK(params);
info("closing all clients");
int i, j;
struct client_tbl_entry *entry;
for( i = 0; i < params->max_nbd_clients; i++ ) {
entry = &params->nbd_client[i];
if ( entry->thread != 0 ) {
debug( "Stop signaling client %p", entry->client );
client_signal_stop( entry->client );
}
}
for( j = 0; j < params->max_nbd_clients; j++ ) {
join_client_thread( &params->nbd_client[j] );
}
}
/** Replace the current acl with a new one. The old one will be thrown
* away.
*/
void server_replace_acl( struct server *serve, struct acl * new_acl )
{
NULLCHECK(serve);
NULLCHECK(new_acl);
/* We need to lock around updates to the acl in case we try to
* destroy the old acl while checking against it.
*/
server_lock_acl( serve );
{
struct acl * old_acl = serve->acl;
serve->acl = new_acl;
/* We should always have an old_acl, but just in case... */
if ( old_acl ) { acl_destroy( old_acl ); }
}
server_unlock_acl( serve );
self_pipe_signal( serve->acl_updated_signal );
}
/** Accept either an NBD or control socket connection, dispatch appropriately */
int server_accept( struct server * params )
{
NULLCHECK( params );
debug("accept loop starting");
int client_fd;
union mysockaddr client_address;
fd_set fds;
socklen_t socklen=sizeof(client_address);
/* We select on this fd to receive OS signals (only a few of
* which we're interested in, see flexnbd.c). */
int signal_fd = flexnbd_signal_fd( params->flexnbd );
FD_ZERO(&fds);
FD_SET(params->server_fd, &fds);
if( 0 < signal_fd ) { FD_SET(signal_fd, &fds); }
self_pipe_fd_set( params->close_signal, &fds );
self_pipe_fd_set( params->acl_updated_signal, &fds );
FATAL_IF_NEGATIVE(select(FD_SETSIZE, &fds,
NULL, NULL, NULL), "select() failed");
if ( self_pipe_fd_isset( params->close_signal, &fds ) ){
server_close_clients( params );
return 0;
}
if ( 0 < signal_fd && FD_ISSET( signal_fd, &fds ) ){
debug( "Stop signal received." );
server_close_clients( params );
return 0;
}
if ( self_pipe_fd_isset( params->acl_updated_signal, &fds ) ) {
self_pipe_signal_clear( params->acl_updated_signal );
server_audit_clients( params );
}
if ( FD_ISSET( params->server_fd, &fds ) ){
client_fd = accept( params->server_fd, &client_address.generic, &socklen );
debug("Accepted nbd client socket");
accept_nbd_client(params, client_fd, &client_address);
}
return 1;
}
void serve_accept_loop(struct server* params)
{
NULLCHECK( params );
while( server_accept( params ) );
}
/** Initialisation function that sets up the initial allocation map, i.e. so
* we know which blocks of the file are allocated.
*/
void serve_init_allocation_map(struct server* params)
{
NULLCHECK( params );
int fd = open(params->filename, O_RDONLY);
off64_t size;
FATAL_IF_NEGATIVE(fd, "Couldn't open %s", params->filename);
size = lseek64(fd, 0, SEEK_END);
params->size = size;
FATAL_IF_NEGATIVE(size, "Couldn't find size of %s",
params->filename);
params->allocation_map =
build_allocation_map(fd, size, block_allocation_resolution);
close(fd);
}
/* Tell the server to close all the things. */
void serve_signal_close( struct server * serve )
{
NULLCHECK( serve );
info("signalling close");
self_pipe_signal( serve->close_signal );
}
/* Block until the server closes the server_fd.
*/
void serve_wait_for_close( struct server * serve )
{
while( !fd_is_closed( serve->server_fd ) ){
usleep(10000);
}
}
/* We've just had an ENTRUST/DISCONNECT pair, so we need to shut down
* and signal our listener that we can safely take over.
*/
void server_control_arrived( struct server *serve )
{
NULLCHECK( serve );
serve->has_control = 1;
serve_signal_close( serve );
}
/** Closes sockets, frees memory and waits for all client threads to finish */
void serve_cleanup(struct server* params,
int fatal __attribute__ ((unused)) )
{
NULLCHECK( params );
info("cleaning up");
int i;
if (params->server_fd){ close(params->server_fd); }
if (params->allocation_map) {
free(params->allocation_map);
}
if (params->mirror_super) {
/* AWOOGA! RACE! */
pthread_t mirror_t = params->mirror_super->thread;
params->mirror->signal_abandon = 1;
pthread_join( mirror_t, NULL );
}
for (i=0; i < params->max_nbd_clients; i++) {
void* status;
pthread_t thread_id = params->nbd_client[i].thread;
if (thread_id != 0) {
debug("joining thread %p", thread_id);
pthread_join(thread_id, &status);
}
}
if ( server_acl_locked( params ) ) {
server_unlock_acl( params );
}
debug( "Cleanup done");
}
int server_is_in_control( struct server *serve )
{
NULLCHECK( serve );
return serve->has_control;
}
int server_default_deny( struct server * serve )
{
NULLCHECK( serve );
return acl_default_deny( serve->acl );
}
/** Full lifecycle of the server */
int do_serve(struct server* params)
{
NULLCHECK( params );
int has_control;
error_set_handler((cleanup_handler*) serve_cleanup, params);
serve_open_server_socket(params);
serve_init_allocation_map(params);
serve_accept_loop(params);
has_control = params->has_control;
serve_cleanup(params, 0);
return has_control;
}
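The accept loop above relies on self-pipes (close_signal, acl_updated_signal) to wake select() from another thread. The real struct self_pipe implementation is not part of this listing, so the following is only a minimal sketch of the general technique. Every demo_* name is hypothetical and not part of flexnbd.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for flexnbd's struct self_pipe. */
struct demo_pipe {
    int read_fd;   /* watched by select() */
    int write_fd;  /* written to by the signalling thread */
};

/* Create the pipe. Returns 0 on success, -1 if pipe() fails. */
int demo_pipe_init(struct demo_pipe *p)
{
    int fds[2];
    assert(p != NULL);
    if (pipe(fds) < 0) { return -1; }
    p->read_fd = fds[0];
    p->write_fd = fds[1];
    return 0;
}

/* Make the read end readable, waking any select() that watches it.
 * Returns 1 if the byte was written, 0 otherwise. */
int demo_pipe_signal(struct demo_pipe *p)
{
    return write(p->write_fd, "X", 1) == 1;
}

/* Poll with a zero timeout: 1 if signalled, 0 if not, -1 on error. */
int demo_pipe_is_signalled(struct demo_pipe *p)
{
    fd_set fds;
    struct timeval no_wait = { 0, 0 };
    FD_ZERO(&fds);
    FD_SET(p->read_fd, &fds);
    if (select(p->read_fd + 1, &fds, NULL, NULL, &no_wait) < 0) { return -1; }
    return FD_ISSET(p->read_fd, &fds) ? 1 : 0;
}

/* Consume one pending signal byte so the next select() blocks again.
 * Returns 1 if a byte was read, 0 otherwise. */
int demo_pipe_clear(struct demo_pipe *p)
{
    char buf;
    return read(p->read_fd, &buf, 1) == 1;
}
```

In server_accept the same idea appears with a blocking select(): the acl_updated signal is cleared before the audit runs, so an ACL change that lands during the audit leaves the pipe readable and triggers another pass.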

114
src/serve.h Normal file

@@ -0,0 +1,114 @@
#ifndef SERVE_H
#define SERVE_H
#include <sys/types.h>
#include <unistd.h>
#include "flexnbd.h"
#include "parse.h"
#include "acl.h"
static const int block_allocation_resolution = 4096;//128<<10;
struct client_tbl_entry {
pthread_t thread;
union mysockaddr address;
struct client * client;
};
#define MAX_NBD_CLIENTS 16
struct server {
/* The flexnbd wrapper this server is attached to */
struct flexnbd * flexnbd;
/** address/port to bind to */
union mysockaddr bind_to;
/** (static) file name to serve */
char* filename;
/** file name of INCOMPLETE flag */
char* filename_incomplete;
/** TCP backlog for listen() */
int tcp_backlog;
/** (static) file name of UNIX control socket (or NULL if none) */
char* control_socket_name;
/** size of file */
uint64_t size;
/** Claims around any I/O to this file */
struct flexthread_mutex * l_io;
/** to interrupt accept loop and clients, call self_pipe_signal() on close_signal */
struct self_pipe * close_signal;
/** access control list */
struct acl * acl;
/** acl_updated_signal will be signalled after the acl struct
* has been replaced
*/
struct self_pipe * acl_updated_signal;
/* Claimed around any updates to the ACL. */
struct flexthread_mutex * l_acl;
struct mirror* mirror;
struct mirror_super * mirror_super;
int server_fd;
int control_fd;
struct bitset_mapping* allocation_map;
int max_nbd_clients;
struct client_tbl_entry *nbd_client;
/* Marker for whether this server has control over the data in
* the file, or if we're waiting to receive it from an inbound
* migration which hasn't yet finished.
*/
int has_control;
};
struct server * server_create(
struct flexnbd * flexnbd,
char* s_ip_address,
char* s_port,
char* s_file,
int default_deny,
int acl_entries,
char** s_acl_entries,
int max_nbd_clients,
int has_control );
void server_destroy( struct server * );
int server_is_closed(struct server* serve);
void server_dirty(struct server *serve, off64_t from, int len);
void server_lock_io( struct server * serve);
void server_unlock_io( struct server* serve );
void serve_signal_close( struct server *serve );
void serve_wait_for_close( struct server * serve );
void server_replace_acl( struct server *serve, struct acl * acl);
void server_control_arrived( struct server *serve );
int server_is_in_control( struct server *serve );
int server_default_deny( struct server * serve );
int server_io_locked( struct server * serve );
int server_acl_locked( struct server * serve );
void server_lock_acl( struct server *serve );
void server_unlock_acl( struct server *serve );
int do_serve( struct server * );
struct mode_readwrite_params {
union mysockaddr connect_to;
union mysockaddr connect_from;
off64_t from;
off64_t len;
int data_fd;
int client;
};
#endif

34
src/status.c Normal file

@@ -0,0 +1,34 @@
#include "status.h"
#include "serve.h"
#include "util.h"
struct status * status_create( struct server * serve )
{
NULLCHECK( serve );
struct status * status;
status = xmalloc( sizeof( struct status ) );
status->has_control = serve->has_control;
status->is_mirroring = NULL != serve->mirror;
return status;
}
#define BOOL_S(var) (var ? "true" : "false" )
#define PRINT_FIELD( var ) \
do{dprintf( fd, #var "=%s ", BOOL_S( status->var ) );}while(0)
int status_write( struct status * status, int fd )
{
PRINT_FIELD( is_mirroring );
PRINT_FIELD( has_control );
dprintf(fd, "\n");
return 1;
}
void status_destroy( struct status * status )
{
NULLCHECK( status );
free( status );
}
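As a rough illustration of the line that status_write emits, the sketch below formats the same two fields into a caller-supplied buffer instead of writing to a file descriptor. demo_status, DEMO_BOOL_S and demo_status_format are hypothetical names used only for this example; the field order and the trailing space before the newline follow the PRINT_FIELD calls above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical copy of the two fields carried by struct status. */
struct demo_status {
    int has_control;
    int is_mirroring;
};

#define DEMO_BOOL_S(var) ((var) ? "true" : "false")

/* Format the status line into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small. */
int demo_status_format(const struct demo_status *st, char *buf, size_t len)
{
    int n;
    assert(st != NULL && buf != NULL);
    n = snprintf(buf, len, "is_mirroring=%s has_control=%s \n",
                 DEMO_BOOL_S(st->is_mirroring),
                 DEMO_BOOL_S(st->has_control));
    return (n >= 0 && (size_t)n < len) ? n : -1;
}
```

Formatting into a buffer first makes the output easy to check in a unit test, at the cost of choosing a buffer size up front.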

55
src/status.h Normal file

@@ -0,0 +1,55 @@
#ifndef STATUS_H
#define STATUS_H
/* Status reports
*
* The status will be reported by writing to a file descriptor. The
* status report will be on a single line. The status format will be:
*
* A=B C=D
*
* That is, a space-separated list of label,value pairs, each pair
* separated by an '=' character. Neither ' ' nor '=' will appear in
* either labels or values.
*
* Boolean values will appear as the strings "true" and "false".
*
* The following status fields are defined:
*
* has_control:
* This will be false when the server is listening for an incoming
* migration. It will switch to true when the end-of-migration
* handshake is successfully completed.
* If the server is started in "serve" mode, this will never be
* false.
*
* is_mirroring:
* This will be false when the server is started in either "listen"
* or "serve" mode. It will become true when a server in "serve"
* mode starts a migration, and will become false again when the
* migration terminates, successfully or not.
* If the server is currently in "listen" mode, this will never be
* true.
*/
#include "serve.h"
struct status {
int has_control;
int is_mirroring;
};
/** Create a status object for the given server. */
struct status * status_create( struct server * );
/** Output the given status object to the given file descriptor */
int status_write( struct status *, int fd );
/** Free the status object */
void status_destroy( struct status * );
#endif
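The format described in this header (space-separated label=value pairs on one line, with neither ' ' nor '=' inside labels or values) is simple enough to read back with plain string functions. The sketch below shows one way a consumer might pull a single value out of a status line. demo_status_lookup is a hypothetical helper, not part of flexnbd.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy the value for `key` out of a status line
 * such as "is_mirroring=false has_control=true \n".
 * Returns 1 if the key was found and its value fit into `out`,
 * 0 otherwise. */
int demo_status_lookup(const char *line, const char *key,
                       char *out, size_t out_len)
{
    size_t key_len;
    const char *p = line;

    assert(line != NULL && key != NULL && out != NULL);
    key_len = strlen(key);

    while (*p != '\0') {
        size_t field_len;

        /* Skip the separators between fields. */
        p += strspn(p, " \n");
        field_len = strcspn(p, " \n");
        if (field_len == 0) { break; }

        /* A field matches only if it starts with exactly "key=". */
        if (field_len > key_len &&
            strncmp(p, key, key_len) == 0 && p[key_len] == '=') {
            size_t value_len = field_len - key_len - 1;
            if (value_len + 1 > out_len) { return 0; }
            memcpy(out, p + key_len + 1, value_len);
            out[value_len] = '\0';
            return 1;
        }
        p += field_len;
    }
    return 0;
}
```

Because the label must be followed immediately by '=', a lookup for "control" does not match the has_control field.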

67
src/util.c Normal file

@@ -0,0 +1,67 @@
#include <stdarg.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <malloc.h>
#include <unistd.h>
#include "util.h"
pthread_key_t cleanup_handler_key;
int log_level = 2;
void error_init(void)
{
pthread_key_create(&cleanup_handler_key, free);
}
void error_handler(int fatal)
{
DECLARE_ERROR_CONTEXT(context);
if (context) {
longjmp(context->jmp, fatal ? 1 : 2 );
}
else {
if ( fatal ) { abort(); }
else { pthread_exit((void*) 1); }
}
}
void exit_err( const char *msg )
{
fprintf( stderr, "%s\n", msg );
exit( 1 );
}
void mylog(int line_level, const char* format, ...)
{
if (line_level < log_level) { return; }
va_list argptr;
va_start(argptr, format);
vfprintf(stderr, format, argptr);
va_end(argptr);
}
void* xrealloc(void* ptr, size_t size)
{
void* p = realloc(ptr, size);
FATAL_IF_NULL(p, "couldn't x%s %zu bytes", ptr ? "realloc" : "malloc", size);
return p;
}
void* xmalloc(size_t size)
{
void* p = xrealloc(NULL, size);
memset(p, 0, size);
return p;
}
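The xrealloc/xmalloc pair above follows a common pattern: treat allocation failure as fatal so callers never have to check for NULL, and hand back zeroed memory from the malloc variant. A standalone sketch of that pattern, without the FATAL_IF_NULL machinery, might look like this. The demo_* names are hypothetical, and the sketch prints to stderr and calls abort() where the real code goes through the error handler.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reallocate or die: never returns NULL for a non-zero size. */
void *demo_xrealloc(void *ptr, size_t size)
{
    void *p = realloc(ptr, size);
    if (p == NULL && size != 0) {
        /* %zu is the matching conversion for size_t. */
        fprintf(stderr, "couldn't %s %zu bytes\n",
                ptr ? "realloc" : "malloc", size);
        abort();
    }
    return p;
}

/* Allocate zero-filled memory or die. */
void *demo_xmalloc(size_t size)
{
    void *p;
    assert(size > 0);  /* this sketch does not support empty allocations */
    p = demo_xrealloc(NULL, size);
    memset(p, 0, size);
    return p;
}
```

Zeroing in the allocator means freshly created structs start with NULL pointers and zero flags, which is what checks such as `if (params->mirror_super)` in serve_cleanup depend on.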

153
src/util.h Normal file

@@ -0,0 +1,153 @@
#ifndef __UTIL_H
#define __UTIL_H
#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
void* xrealloc(void* ptr, size_t size);
void* xmalloc(size_t size);
typedef void (cleanup_handler)(void* /* context */, int /* is fatal? */);
/* set from 0 - 5 depending on what level of verbosity you want */
extern int log_level;
/* set up the error globals */
void error_init(void);
void exit_err( const char * );
/* error_set_handler must be a macro not a function due to setjmp stack rules */
#include <setjmp.h>
struct error_handler_context {
jmp_buf jmp;
cleanup_handler* handler;
void* data;
};
#define DECLARE_ERROR_CONTEXT(name) \
struct error_handler_context *name = (struct error_handler_context*) \
pthread_getspecific(cleanup_handler_key);
/* clean up with the given function & data when error_handler() is invoked,
* non-fatal errors will also return here (if that's dangerous, use fatal()
* instead of error()).
*
* error handlers are thread-local, so you need to call this when starting a
* new thread.
*/
extern pthread_key_t cleanup_handler_key;
#define error_set_handler(cleanfn, cleandata) \
{ \
DECLARE_ERROR_CONTEXT(old); \
struct error_handler_context *context = \
xmalloc(sizeof(struct error_handler_context)); \
context->handler = (cleanfn); \
context->data = (cleandata); \
\
switch (setjmp(context->jmp)) \
{ \
case 0: /* setup time */ \
if (old) { free(old); }\
if( EINVAL == pthread_setspecific(cleanup_handler_key, context) ) { \
fprintf(stderr, "Tried to set an error handler at %s:%d" \
" without calling error_init().\n", \
__FILE__, __LINE__ );\
abort();\
}\
break; \
case 1: /* fatal error, terminate process */ \
debug( "Fatal error in thread %p", pthread_self() ); \
abort(); \
/* abort() can't return, so we can't fall through */ \
case 2: /* non-fatal error, call cleanup and terminate thread */ \
debug( "Error in thread %p", pthread_self() ); \
context->handler(context->data, 0); \
pthread_exit((void*) 1);\
/* pthread_exit() can't return, so we can't fall through
* */\
default: \
abort(); \
} \
}
/* invoke the error handler - longjmps away, don't use directly */
void error_handler(int fatal);
/* mylog a line at the given level (0 being most verbose) */
void mylog(int line_level, const char* format, ...);
#define levstr(i) (i==0?'D':(i==1?'I':(i==2?'W':(i==3?'E':'F'))))
#define myloglev(level, msg, ...) mylog( level, "%c:%d %p %s:%d: "msg"\n", levstr(level), getpid(),pthread_self(), __FILE__, __LINE__, ##__VA_ARGS__ )
#ifdef DEBUG
# define debug(msg, ...) myloglev(0, msg, ##__VA_ARGS__)
#else
# define debug(msg, ...) /* no-op */
#endif
/* informational message, not expected to be compiled out */
#define info(msg, ...) myloglev(1, msg, ##__VA_ARGS__)
/* messages that might indicate a problem */
#define warn(msg, ...) myloglev(2, msg, ##__VA_ARGS__)
/* log a message and invoke the error handler, which runs the cleanup handler and terminates the current thread */
#define error(msg, ...) do { \
myloglev(3, msg, ##__VA_ARGS__); \
error_handler(0); \
} while(0)
/* log a message and invoke the error handler, which aborts the whole process */
#define fatal(msg, ...) do { \
myloglev(4, msg, ##__VA_ARGS__); \
error_handler(1); \
} while(0)
#define ERROR_IF( test, msg, ... ) do { if ((test)) { error(msg, ##__VA_ARGS__); } } while(0)
#define FATAL_IF( test, msg, ... ) do { if ((test)) { fatal(msg, ##__VA_ARGS__); } } while(0)
#define ERROR_UNLESS( test, msg, ... ) ERROR_IF( !(test), msg, ##__VA_ARGS__ )
#define FATAL_UNLESS( test, msg, ... ) FATAL_IF( !(test), msg, ##__VA_ARGS__ )
#define ERROR_IF_NULL(value, msg, ...) \
ERROR_IF( NULL == value, msg " (errno=%d, %s)", ##__VA_ARGS__, errno, strerror(errno) )
#define FATAL_IF_NULL(value, msg, ...) \
FATAL_IF( NULL == value, msg " (errno=%d, %s)", ##__VA_ARGS__, errno, strerror(errno) )
#define ERROR_IF_NEGATIVE( value, msg, ... ) ERROR_IF( value < 0, msg, ##__VA_ARGS__ )
#define FATAL_IF_NEGATIVE( value, msg, ... ) FATAL_IF( value < 0, msg, ##__VA_ARGS__ )
#define ERROR_IF_ZERO( value, msg, ... ) ERROR_IF( 0 == value, msg, ##__VA_ARGS__ )
#define FATAL_IF_ZERO( value, msg, ... ) FATAL_IF( 0 == value, msg, ##__VA_ARGS__ )
#define ERROR_UNLESS_NULL(value, msg, ...) \
ERROR_UNLESS( NULL == value, msg " (errno=%d, %s)", ##__VA_ARGS__, errno, strerror(errno) )
#define FATAL_UNLESS_NULL(value, msg, ...) \
FATAL_UNLESS( NULL == value, msg " (errno=%d, %s)", ##__VA_ARGS__, errno, strerror(errno) )
#define ERROR_UNLESS_NEGATIVE( value, msg, ... ) ERROR_UNLESS( value < 0, msg, ##__VA_ARGS__ )
#define FATAL_UNLESS_NEGATIVE( value, msg, ... ) FATAL_UNLESS( value < 0, msg, ##__VA_ARGS__ )
#define ERROR_UNLESS_ZERO( value, msg, ... ) ERROR_UNLESS( 0 == value, msg, ##__VA_ARGS__ )
#define FATAL_UNLESS_ZERO( value, msg, ... ) FATAL_UNLESS( 0 == value, msg, ##__VA_ARGS__ )
#define NULLCHECK(value) FATAL_IF_NULL(value, "BUG: " #value " is null")
#endif


@@ -0,0 +1,141 @@
# encoding: utf-8
require 'flexnbd'
require 'file_writer'
class Environment
attr_reader( :blocksize, :filename1, :filename2, :ip,
:port1, :port2, :nbd1, :nbd2, :file1, :file2, :rebind_port1 )
def initialize
@blocksize = 1024
@filename1 = "/tmp/.flexnbd.test.#{$$}.#{Time.now.to_i}.1"
@filename2 = "/tmp/.flexnbd.test.#{$$}.#{Time.now.to_i}.2"
@ip = "127.0.0.1"
@available_ports = [*40000..41000] - listening_ports
@port1 = @available_ports.shift
@rebind_port1 = @available_ports.shift
@port2 = @available_ports.shift
@rebind_port2 = @available_ports.shift
@nbd1 = FlexNBD.new("../../build/flexnbd", @ip, @port1, @ip, @rebind_port1)
@nbd2 = FlexNBD.new("../../build/flexnbd", @ip, @port2, @ip, @rebind_port2)
@fake_pid = nil
end
def serve1(*acl)
@nbd1.serve(@filename1, *acl)
end
def serve2(*acl)
@nbd2.serve(@filename2, *acl)
end
def listen1( *acl )
@nbd1.listen( @filename1, *(acl.empty? ? @acl1: acl) )
end
def listen2( *acl )
@nbd2.listen( @filename2, *acl )
end
def acl1( *acl )
@nbd1.acl( *acl )
end
def acl2( *acl )
@nbd2.acl( *acl )
end
def status1
@nbd1.status.first
end
def status2
@nbd2.status.first
end
def mirror12
@nbd1.mirror( @nbd2.ip, @nbd2.port )
end
def mirror12_unchecked
@nbd1.mirror_unchecked( @nbd2.ip, @nbd2.port, nil, nil, 10 )
end
def writefile1(data)
@file1 = FileWriter.new(@filename1, @blocksize).write(data)
end
def writefile2(data)
@file2 = FileWriter.new(@filename2, @blocksize).write(data)
end
def truncate1( size )
system "truncate -s #{size} #{@filename1}"
end
def listening_ports
`netstat -ltn`.
split("\n").
map { |x| x.split(/\s+/) }[2..-1].
map { |l| l[3].split(":")[-1].to_i }
end
def cleanup
if @fake_pid
begin
Process.waitpid2( @fake_pid )
rescue Errno::ESRCH
end
end
@nbd1.can_die(0)
@nbd1.kill
@nbd2.kill
[@filename1, @filename2].each do |f|
File.unlink(f) if File.exist?(f)
end
end
def run_fake( name, addr, port, rebind_addr = addr, rebind_port = port )
fakedir = File.join( File.dirname( __FILE__ ), "fakes" )
fake = Dir[File.join( fakedir, name ) + "*"].sort.find { |fn|
File.executable?( fn )
}
raise "no fake executable" unless fake
raise "no addr" unless addr
raise "no port" unless port
raise "no rebind_addr" unless rebind_addr
raise "no rebind_port" unless rebind_port
@fake_pid = fork do
exec [fake, addr, port, @nbd1.pid, rebind_addr, rebind_port].map{|x| x.to_s}.join(" ")
end
sleep(0.5)
end
def fake_reports_success
_,status = Process.waitpid2( @fake_pid )
@fake_pid = nil
status.success?
end
end # class Environment


@@ -0,0 +1,29 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Open a server, accept a client, then we expect a single write
# followed by an entrust. Disconnect after the entrust. We expect a
# reconnection followed by a full mirror.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port, src_pid = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
client.write_hello
write_req = client.read_request
data = client.read_data( write_req[:len] )
client.write_reply( write_req[:handle], 0 )
entrust_req = client.read_request
fail "Not an entrust" unless entrust_req[:type] == 65536
client.close
client2 = server.accept
client2.receive_mirror
exit(0)


@@ -0,0 +1,36 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Receive a mirror, and disconnect after sending the entrust reply but
# before it can send the disconnect signal.
#
# This test is currently unused: the sender can't detect that the
# write failed.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port, src_pid = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
client.write_hello
while (req = client.read_request; req[:type] == 1)
client.read_data( req[:len] )
client.write_reply( req[:handle] )
end
system "kill -STOP #{src_pid}"
client.write_reply( req[:handle] )
client.close
system "kill -CONT #{src_pid}"
sleep( 0.25 )
client2 = server.accept( "Timed out waiting for a reconnection" )
client2.close
server.close
$stderr.puts "done"
exit(0)


@@ -0,0 +1,23 @@
#!/usr/bin/env ruby
# Wait for a sender connection, send a correct hello, then disconnect.
# Simulate a server which crashes after sending the hello. We then
# reopen the server socket to check that the sender retries: since the
# command-line has gone away, and can't feed an error back to the
# user, we have to keep trying.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept( "Timed out waiting for a connection" )
client.write_hello
client.close
new_client = server.accept( "Timed out waiting for a reconnection" )
new_client.close
server.close
exit 0


@@ -0,0 +1,25 @@
#!/usr/bin/env ruby
# Wait for a sender connection, send a correct hello, wait for a write
# request, then disconnect. Simulate a server which crashes after
# receiving the write, and before it can send a reply. We then reopen
# the server socket to check that the sender retries: since the
# command-line has gone away, and can't feed an error back to the
# user, we have to keep trying.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept( "Timed out waiting for a connection" )
client.write_hello
client.read_request
client.close
new_client = server.accept( "Timed out waiting for a reconnection" )
new_client.close
server.close
exit 0


@@ -0,0 +1,27 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Open a server, accept a client, then we expect a single write
# followed by an entrust. However, we disconnect after the write so
# the entrust will fail. We expect a reconnection.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port, src_pid = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
client.write_hello
req = client.read_request
data = client.read_data( req[:len] )
Process.kill("STOP", src_pid.to_i)
client.write_reply( req[:handle], 0 )
client.close
Process.kill("CONT", src_pid.to_i)
client2 = server.accept
client2.close
exit(0)


@@ -0,0 +1,34 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Receive a mirror, but respond to the entrust with an error. There's
# currently no code path in flexnbd which can do this, but we could
# add one.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
client.write_hello
loop do
req = client.read_request
if req[:type] == 1
client.read_data( req[:len] )
client.write_reply( req[:handle] )
else
client.write_reply( req[:handle], 1 )
break
end
end
client.close
client2 = server.accept( "Timed out waiting for a reconnection" )
client2.close
server.close
exit(0)


@@ -0,0 +1,22 @@
#!/usr/bin/env ruby
# encoding: utf-8
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
client.write_hello
handle = client.read_request[:handle]
client.write_error( handle )
client2 = server.accept( "Timed out waiting for a reconnection" )
client.close
client2.close
server.close
exit(0)


@@ -0,0 +1,35 @@
#!/usr/bin/env ruby
# Will open a server, accept a single connection, then sleep for 5
# seconds. After that time, the client should have disconnected,
# which we can't effectively check.
#
# We also expect the client *not* to reconnect, since it could feed back
# an error.
#
# This allows the test runner to check that the command-line sees the
# right error message after the timeout time.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept( "Client didn't make a connection" )
# Sleep for one second past the timeout (a bit of slop in case ruby
# doesn't launch things quickly)
sleep(FlexNBD::MS_HELLO_TIME_SECS + 1)
client.close
# Invert the sense of the timeout exception, since we *don't* want a
# connection attempt
begin
server.accept( "Expected timeout" )
fail "Unexpected reconnection"
rescue Timeout::Error
# expected
end
server.close


@@ -0,0 +1,29 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Open a socket, say hello, receive a write, then sleep for >
# MS_REQUEST_LIMIT_SECS seconds. This should tell the source that the
# write has gone MIA, and we expect a reconnect.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client1 = server.accept( "Timed out waiting for a connection" )
client1.write_hello
client1.read_request
t = Thread.start do
client2 = server.accept( "Timed out waiting for a reconnection",
FlexNBD::MS_REQUEST_LIMIT_SECS + 2 )
client2.close
end
sleep( FlexNBD::MS_REQUEST_LIMIT_SECS + 2 )
client1.close
t.join
server.close
exit(0)


@@ -0,0 +1,28 @@
#!/usr/bin/env ruby
# Simulate a destination which sends the wrong magic.
require 'flexnbd/fake_dest'
include FlexNBD
Thread.abort_on_exception = true
addr, port = *ARGV
server = FakeDest.new( addr, port )
client1 = server.accept
# We don't expect a reconnection attempt.
t = Thread.new do
begin
client2 = server.accept( "Timed out waiting for a reconnection",
FlexNBD::MS_RETRY_DELAY_SECS + 1 )
fail "Unexpected reconnection"
rescue Timeout::Error
#expected
end
end
client1.write_hello( :magic => :wrong )
t.join
exit 0


@@ -0,0 +1,38 @@
#!/usr/bin/env ruby
# Simulate a server which has a disc of the wrong size attached: send
# a valid NBD hello with a random size, then check that we see an
# EOF on read.
require 'flexnbd/fake_dest'
include FlexNBD
Thread.abort_on_exception = true
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
t = Thread.new do
# The sender *should not reconnect.* Since this is a first-pass
# mirror attempt, the user will have been told that the mirror failed,
# so it makes no sense to continue. This means we have to invert the
# sense of the exception.
begin
client2 = server.accept( "Timed out waiting for a reconnection",
FlexNBD::MS_RETRY_DELAY_SECS + 1 )
client2.close
fail "Unexpected reconnection."
rescue Timeout::Error
end
end
client.write_hello( :size => :wrong )
t.join
# Now check that the source closed the first socket (yes, this was an
# actual bug)
fail "Didn't close socket" unless client.disconnected?
exit 0


@@ -0,0 +1,24 @@
#!/usr/bin/env ruby
# Accept a connection, then immediately close it. This simulates an ACL rejection.
# We do not expect a reconnection.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
server.accept.close
begin
server.accept
fail "Unexpected reconnection"
rescue Timeout::Error
# expected
end
server.close
exit(0)


@@ -0,0 +1,24 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Accept a connection, write hello, wait for a write request, read the
# data, then write back a reply with a bad magic field. We then
# expect a reconnect.
require 'flexnbd/fake_dest'
include FlexNBD
addr, port = *ARGV
server = FakeDest.new( addr, port )
client = server.accept
client.write_hello
req = client.read_request
client.read_data( req[:len] )
client.write_reply( req[:handle], 0, :magic => :wrong )
client2 = server.accept
client.close
client2.close
exit(0)


@@ -0,0 +1,23 @@
#!/usr/bin/env ruby
# Connects to the destination server, then immediately disconnects,
# simulating a source crash.
#
# It then connects again, to check that the destination is still
# listening.
require 'flexnbd/fake_source'
include FlexNBD
addr, port = *ARGV
FakeSource.new( addr, port, "Failed to connect" ).close
# Sleep to be sure we don't try to connect too soon. That wouldn't
# be a problem for the destination, but it would prevent us from
# determining success or failure here in the case where we try to
# reconnect before the destination has tidied up after the first
# thread went away.
sleep(0.5)
FakeSource.new( addr, port, "Failed to reconnect" ).close
exit 0


@@ -0,0 +1,34 @@
#!/usr/bin/env ruby
# Connect, send a migration, entrust then *immediately* disconnect.
# This simulates a client which fails while the client is blocked.
#
# We attempt to reconnect immediately afterwards to prove that we can
# retry the mirroring.
require 'flexnbd/fake_source'
include FlexNBD
addr, port, srv_pid = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting" )
client.read_hello
client.write_write_request( 0, 8 )
client.write_data( "12345678" )
# Use system "kill" rather than Process.kill because Process.kill
# doesn't seem to work
system "kill -STOP #{srv_pid}"
client.write_entrust_request
client.close
system "kill -CONT #{srv_pid}"
sleep(0.25)
client2 = FakeSource.new( addr, port, "Timed out reconnecting" )
client2.close
exit(0)


@@ -0,0 +1,32 @@
#!/usr/bin/env ruby
# Connect, send a migration, entrust then *immediately* disconnect.
# This simulates a client which fails while the client is blocked.
#
# We attempt to reconnect immediately afterwards to prove that we can
# retry the mirroring.
require 'flexnbd/fake_source'
include FlexNBD
addr, port, srv_pid, rebind_addr, rebind_port = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting" )
client.read_hello
client.write_write_request( 0, 8 )
client.write_data( "12345678" )
client.write_entrust_request
client.read_response
client.close
sleep(0.25)
client2 = FakeSource.new( addr, port, "Timed out reconnecting to mirror" )
client2.send_mirror
sleep(1)
client3 = FakeSource.new( rebind_addr, rebind_port, "Timed out reconnecting to read" )
client3.close
exit(0)


@@ -0,0 +1,24 @@
#!/usr/bin/env ruby
# Connect, read the hello, then immediately disconnect. This
# simulates a sender which dislikes something in the hello message - a
# wrong size, for instance.
# After the disconnect, we reconnect to be sure that the destination
# is still alive.
require 'flexnbd/fake_source'
include FlexNBD
addr, port = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting." )
client.read_hello
client.close
sleep(0.2)
FakeSource.new( addr, port, "Timed out reconnecting." )
exit(0)


@@ -0,0 +1,29 @@
#!/usr/bin/env ruby
# encoding: utf-8
# We connect, pause the server, issue a write request, disconnect,
# then cont the server. This ensures that our disconnect happens
# while the server is trying to read the write data.
require 'flexnbd/fake_source'
include FlexNBD
addr, port, srv_pid = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting" )
client.read_hello
Process.kill( "STOP", srv_pid.to_i )
client.write_write_request( 0, 8 )
client.close
Process.kill( "CONT", srv_pid.to_i )
# This sleep ensures that we don't return control to the test runner
# too soon, giving the flexnbd process time to fall over if it's going
# to.
sleep(0.25)
# ...and can we reconnect?
client2 = FakeSource.new( addr, port, "Timed out connecting" )
client2.close
exit(0)


@@ -0,0 +1,33 @@
#!/usr/bin/env ruby
# encoding: utf-8
# We connect, pause the server, issue a write request, send data,
# disconnect, then cont the server. This ensures that our disconnect
# happens before the server can try to write the reply.
require 'flexnbd/fake_source'
include FlexNBD
addr, port, srv_pid = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting" )
client.read_hello
Process.kill( "STOP", srv_pid.to_i )
client.write_write_request( 0, 8 )
client.write_data( "12345678" )
client.close
Process.kill( "CONT", srv_pid.to_i )
# This sleep ensures that we don't return control to the test runner
# too soon, giving the flexnbd process time to fall over if it's going
# to.
sleep(0.25)
# ...and can we reconnect?
client2 = FakeSource.new( addr, port, "Timed out reconnecting" )
client2.close
exit(0)


@@ -0,0 +1,17 @@
#!/usr/bin/env ruby
# Connect, but get the protocol wrong: don't read the hello, so we
# close and break the sendfile.
require 'flexnbd/fake_source'
include FlexNBD
addr, port, srv_pid, newaddr, newport = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting" )
client.write_read_request( 0, 8 )
client.read_raw( 4 )
client.close
exit(0)


@@ -0,0 +1,22 @@
#!/usr/bin/env ruby
# Connect to the destination, then hang. Connect a second time to the
# destination. This will trigger the destination's thread clearer. We
# can't really see any error state from here, we just try to trigger
# something the test runner can detect.
require 'flexnbd/fake_source'
include FlexNBD
addr, port = *ARGV
client1 = FakeSource.new( addr, port, "Timed out connecting" )
sleep(0.25)
client2 = FakeSource.new( addr, port, "Timed out connecting a second time" )
# This is the expected source crashing after connect
client1.close
# And this is just a tidy-up.
client2.close
exit(0)


@@ -0,0 +1,22 @@
#!/usr/bin/env ruby
# encoding: utf-8
# We connect from a local address which should be blocked, sleep for a
# bit, then try to read from the socket. We should get an instant EOF
# as we've been cut off by the destination.
require 'timeout'
require 'flexnbd/fake_source'
include FlexNBD
addr, port = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting", "127.0.0.6" )
sleep( 0.25 )
client.ensure_disconnected
client.close
exit(0)


@@ -0,0 +1,39 @@
#!/usr/bin/env ruby
# Simulate the hello message going astray, or the source hanging after
# receiving it.
#
# We then connect again, to confirm that the destination is still
# listening for an incoming migration.
addr, port = *ARGV
require "flexnbd/fake_source"
include FlexNBD
client = FakeSource.new( addr, port, "Timed out connecting" )
client.read_hello
# Now we do two things:
# - In the parent process, we sleep for CLIENT_MAX_WAIT_SECS+3, which
# will make the destination give up and close the connection.
# - In the child process, we sleep for CLIENT_MAX_WAIT_SECS+1, then
# reconnect. This should succeed despite the parent process not
# having closed its end yet.
kidpid = fork do
client.close
new_client = nil
sleep( FlexNBD::CLIENT_MAX_WAIT_SECS + 1 )
new_client = FakeSource.new( addr, port, "Timed out reconnecting." )
new_client.read_hello
exit 0
end
# Sleep for longer than the child, to give the flexnbd process a bit
# of slop
sleep( FlexNBD::CLIENT_MAX_WAIT_SECS + 3 )
client.close
_,status = Process.waitpid2( kidpid )
exit status.exitstatus

@@ -0,0 +1,29 @@
#!/usr/bin/env ruby
# Successfully send a migration, but squat on the IP and port which
# the destination wants to rebind to. The destination should retry
# every second, so we release the port, then attempt to connect to
# the new server.
require 'flexnbd/fake_source'
include FlexNBD
addr, port, srv_pid, newaddr, newport = *ARGV
squatter = TCPServer.open( newaddr, newport.to_i )
client = FakeSource.new( addr, port, "Timed out connecting" )
client.send_mirror()
sleep(1)
squatter.close()
sleep(1)
client2 = FakeSource.new( newaddr, newport.to_i, "Timed out reconnecting" )
client2.read_hello
client2.read( 0, 8 )
client2.close
exit( 0 )

@@ -0,0 +1,32 @@
#!/usr/bin/env ruby
# encoding: utf-8
# Connect, read the hello then make a write request with an impossible
# (from,len) pair. We expect an error response, and not to be
# disconnected.
#
# We then expect to be able to issue a successful write: the destination
# has to flush the data in the socket.
require 'flexnbd/fake_source'
include FlexNBD
addr, port = *ARGV
client = FakeSource.new( addr, port, "Timed out connecting" )
hello = client.read_hello
client.write_write_request( hello[:size]+1, 32, "myhandle" )
client.write_data("1"*32)
response = client.read_response
fail "Not an error" if response[:error] == 0
fail "Wrong handle" unless "myhandle" == response[:handle]
client.write_write_request( 0, 32 )
client.write_data( "2"*32 )
success_response = client.read_response
fail "Second write failed" unless success_response[:error] == 0
client.close
exit(0)

@@ -0,0 +1,123 @@
# Noddy test class for writing files to disc in predictable patterns
# in order to test FlexNBD.
#
class FileWriter
def initialize(filename, blocksize)
@fh = File.open(filename, "w+")
@blocksize = blocksize
@pattern = ""
end
# We write in fixed block sizes, given by "blocksize"
# _ means skip a block
# 0 means write a block full of zeroes
# f means write a block with the file offset packed every 4 bytes
#
def write(data)
@pattern += data
data.split("").each do |code|
if code == "_"
@fh.seek(@blocksize, IO::SEEK_CUR)
else
@fh.write(data(code))
end
end
@fh.flush
self
end
# Returns what the data ought to be at the given offset and length
#
def read_original( off, len )
patterns = @pattern.split( "" )
patterns.zip( (0...patterns.length).to_a ).
map { |blk, blk_off|
data(blk, blk_off * @blocksize)
}.join[off...(off+len)]
end
# Read what's actually in the file
#
def read(off, len)
@fh.seek(off, IO::SEEK_SET)
@fh.read(len)
end
def untouched?(offset, len)
read(offset, len) == read_original(offset, len)
end
def close
@fh.close
nil
end
protected
def data(code, at=@fh.tell)
case code
when "0", "_"
"\0" * @blocksize
when "X"
"X" * @blocksize
when "f"
r = ""
(@blocksize/4).times do
r += [at].pack("I")
at += 4
end
r
else
raise "Unknown character '#{code}'"
end
end
end
if __FILE__==$0
require 'tempfile'
require 'test/unit'
class FileWriterTest < Test::Unit::TestCase
def test_read_original_zeros
Tempfile.open("test_read_original_zeros") do |tempfile|
tempfile.close
file = FileWriter.new( tempfile.path, 4096 )
file.write( "0" )
assert_equal file.read( 0, 4096 ), file.read_original( 0, 4096 )
assert( file.untouched?(0,4096) , "Untouched file was touched." )
end
end
def test_read_original_offsets
Tempfile.open("test_read_original_offsets") do |tempfile|
tempfile.close
file = FileWriter.new( tempfile.path, 4096 )
file.write( "f" )
assert_equal file.read( 0, 4096 ), file.read_original( 0, 4096 )
assert( file.untouched?(0,4096) , "Untouched file was touched." )
end
end
def test_file_size
Tempfile.open("test_file_size") do |tempfile|
tempfile.close
file = FileWriter.new( tempfile.path, 4096 )
file.write( "f" )
assert_equal 4096, File.stat( tempfile.path ).size
end
end
def test_read_original_size
Tempfile.open("test_read_original_size") do |tempfile|
tempfile.close
file = FileWriter.new( tempfile.path, 4)
file.write( "f"*4 )
assert_equal 4, file.read_original(0, 4).length
end
end
end
end
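
The "f" pattern described in FileWriter can be illustrated in isolation. The following is a minimal sketch, assuming a hypothetical helper `offset_block` (it is not part of the class above) that builds one block whose 4-byte words each hold their own file offset, in native byte order as `pack("I")` produces.

```ruby
# Hypothetical helper, for illustration only: build one "f" block that
# starts at byte offset `at` and is `blocksize` bytes long.
def offset_block(at, blocksize)
  offsets = (0...(blocksize / 4)).map { |i| at + (4 * i) }
  offsets.pack("I*")   # native-endian unsigned ints, 4 bytes each on common platforms
end

block = offset_block(4096, 16)
p block.bytesize       # 16
p block.unpack("I*")   # [4096, 4100, 4104, 4108]
```

Because each word records where it should live in the file, a block that ends up at the wrong offset is detectable by content alone.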

tests/acceptance/flexnbd.rb Normal file

@@ -0,0 +1,467 @@
require 'socket'
require 'thread'
require 'open3'
require 'timeout'
require 'rexml/document'
require 'rexml/streamlistener'
Thread.abort_on_exception = true
class Executor
attr_reader :pid
def run( cmd )
@pid = fork do exec cmd end
end
end # class Executor
class ValgrindExecutor
attr_reader :pid
def run( cmd )
@pid = fork do exec "valgrind --track-origins=yes #{cmd}" end
end
end # class ValgrindExecutor
class ValgrindKillingExecutor
attr_reader :pid
class Error
attr_accessor :what, :kind, :pid
attr_reader :backtrace
def initialize
@backtrace=[]
@what = ""
@kind = ""
@pid = ""
end
def add_frame
@backtrace << {}
end
def add_fn(fn)
@backtrace.last[:fn] = fn
end
def add_file(file)
@backtrace.last[:file] = file
end
def add_line(line)
@backtrace.last[:line] = line
end
def to_s
([@what + " (#{@kind}) in #{@pid}"] + @backtrace.map{|h| "#{h[:file]}:#{h[:line]} #{h[:fn]}" }).join("\n")
end
end # class Error
class ErrorListener
include REXML::StreamListener
def initialize( killer )
@killer = killer
@error = Error.new
@found = false
end
def text( text )
@text = text
end
def tag_start(tag, attrs)
case tag.to_s
when "error"
@found = true
when "frame"
@error.add_frame
end
end
def tag_end(tag)
case tag.to_s
when "what"
@error.what = @text if @found
@text = ""
when "kind"
@error.kind = @text if @found
when "file"
@error.add_file( @text ) if @found
when "fn"
@error.add_fn( @text ) if @found
when "line"
@error.add_line( @text ) if @found
when "error", "stack"
@killer.call( @error )
when "pid"
@error.pid=@text
end
end
end # class ErrorListener
class DebugErrorListener < ErrorListener
def text( txt )
print txt
super( txt )
end
def tag_start( tag, attrs )
print "<#{tag}>"
super( tag, attrs )
end
def tag_end( tag )
print "</#{tag}>"
super( tag )
end
end
def initialize
@pid = nil
end
def run( cmd )
@io_r, io_w = IO.pipe
@pid = fork do exec( "valgrind --xml=yes --xml-fd=#{io_w.fileno} " + cmd ) end
launch_watch_thread( @pid, @io_r )
@pid
end
def call( err )
Process.kill( "KILL", @pid )
$stderr.puts "*"*72
$stderr.puts "* Valgrind error spotted:"
$stderr.puts err.to_s.split("\n").map{|s| " #{s}"}
$stderr.puts "*"*72
exit(1)
end
private
def pick_listener
ENV['DEBUG'] ? DebugErrorListener : ErrorListener
end
def launch_watch_thread(pid, io_r)
Thread.start do
io_source = REXML::IOSource.new( io_r )
listener = pick_listener.new( self )
REXML::Document.parse_stream( io_source, listener )
end
end
end # class ValgrindKillingExecutor
# Noddy test class to exercise FlexNBD from the outside for testing.
#
class FlexNBD
attr_reader :bin, :ctrl, :pid, :ip, :port, :rebind_ip, :rebind_port
class << self
def counter
Dir['tmp/*'].select{|f| File.file?(f)}.length + 1
end
end
def pick_executor
kls = if ENV['VALGRIND']
if ENV['VALGRIND'] =~ /kill/
ValgrindKillingExecutor
else
ValgrindExecutor
end
else
Executor
end
end
def build_debug_opt
if @do_debug
"--verbose"
else
"--quiet"
end
end
def initialize(bin, ip, port, rebind_ip = ip, rebind_port = port)
@bin = bin
@do_debug = ENV['DEBUG']
@debug = build_debug_opt
raise "#{bin} not executable" unless File.executable?(bin)
@executor = pick_executor.new
@ctrl = "/tmp/.flexnbd.ctrl.#{Time.now.to_i}.#{rand}"
@ip = ip
@port = port
@rebind_ip = rebind_ip
@rebind_port = rebind_port
@kill = []
end
def debug?
!!@do_debug
end
def debug( msg )
$stderr.puts msg if debug?
end
def serve_cmd( file, acl )
"#{bin} serve "\
"--addr #{ip} "\
"--port #{port} "\
"--file #{file} "\
"--sock #{ctrl} "\
"#{@debug} "\
"#{acl.join(' ')}"
end
def listen_cmd( file, acl )
"#{bin} listen "\
"--addr #{ip} "\
"--port #{port} "\
"--file #{file} "\
"--rebind-addr #{rebind_ip} " \
"--rebind-port #{rebind_port} " \
"--sock #{ctrl} "\
"#{@debug} "\
"#{acl.join(' ')}"
end
def read_cmd( offset, length )
"#{bin} read "\
"--addr #{ip} "\
"--port #{port} "\
"--from #{offset} "\
"#{@debug} "\
"--size #{length}"
end
def write_cmd( offset, data )
"#{bin} write "\
"--addr #{ip} "\
"--port #{port} "\
"--from #{offset} "\
"#{@debug} "\
"--size #{data.length}"
end
def mirror_cmd(dest_ip, dest_port)
"#{@bin} mirror "\
"--addr #{dest_ip} "\
"--port #{dest_port} "\
"--sock #{ctrl} "\
"#{@debug} "
end
def status_cmd
"#{@bin} status "\
"--sock #{ctrl} "\
"#{@debug}"
end
def acl_cmd( *acl )
"#{@bin} acl " \
"--sock #{ctrl} "\
"#{@debug} "\
"#{acl.join " "}"
end
def run_serve_cmd(cmd)
File.unlink(ctrl) if File.exist?(ctrl)
debug( cmd )
@pid = @executor.run( cmd )
start_wait_thread( @pid )
while !File.socket?(ctrl)
pid, status = Process.wait2(@pid, Process::WNOHANG)
raise "server did not start (#{cmd})" if pid
sleep 0.1
end
at_exit { kill }
end
private :run_serve_cmd
def serve( file, *acl)
run_serve_cmd( serve_cmd( file, acl ) )
end
def listen(file, *acl)
run_serve_cmd( listen_cmd( file, acl ) )
end
def start_wait_thread( pid )
@wait_thread = Thread.start do
_, status = Process.waitpid2( pid )
if @kill
fail "flexnbd quit with a bad status: #{status.exitstatus}" unless
@kill.include? status.exitstatus
else
$stderr.puts "flexnbd #{self.pid} quit"
fail "flexnbd #{self.pid} quit early with status #{status.to_i}"
end
end
end
def can_die(*status)
status = [0] if status.empty?
@kill += status
end
def kill
# At this point, to a certain degree we don't care what the exit
# status is
can_die(1)
if @pid
begin
Process.kill("INT", @pid)
rescue Errno::ESRCH => e
# already dead. Presumably this means it went away after a
# can_die() call.
end
end
@wait_thread.join if @wait_thread
end
def read(offset, length)
cmd = read_cmd( offset, length )
debug( cmd )
out = IO.popen(cmd) do |fh|
fh.read
end
raise IOError.new "NBD read failed" unless $?.success?
out
end
def write(offset, data)
cmd = write_cmd( offset, data )
debug( cmd )
IO.popen(cmd, "w") do |fh|
fh.write(data)
end
raise IOError.new "NBD write failed" unless $?.success?
nil
end
def join
@wait_thread.join
end
def mirror_unchecked( dest_ip, dest_port, bandwidth=nil, action=nil, timeout=nil )
cmd = mirror_cmd( dest_ip, dest_port)
debug( cmd )
maybe_timeout( cmd, timeout )
end
def maybe_timeout(cmd, timeout=nil )
stdout, stderr = "",""
run = Proc.new do
Open3.popen3( cmd ) do |io_in, io_out, io_err|
io_in.close
stdout.replace io_out.read
stderr.replace io_err.read
end
end
if timeout
Timeout.timeout(timeout, &run)
else
run.call
end
[stdout, stderr]
end
def mirror(dest_ip, dest_port, bandwidth=nil, action=nil)
stdout, stderr = mirror_unchecked( dest_ip, dest_port, bandwidth, action )
raise IOError.new( "Migrate command failed\n" + stderr) unless $?.success?
stdout
end
def acl(*acl)
cmd = acl_cmd( *acl )
debug( cmd )
maybe_timeout( cmd, 2 )
end
def status( timeout = nil )
cmd = status_cmd()
debug( cmd )
o,e = maybe_timeout( cmd, timeout )
[parse_status(o), e]
end
def launched?
!!@pid
end
protected
def control_command(*args)
raise "Server not running" unless @pid
args = args.compact
UNIXSocket.open(@ctrl) do |u|
u.write(args.join("\n") + "\n")
code, message = u.readline.split(": ", 2)
return [code, message]
end
end
def parse_status( status )
hsh = {}
status.split(" ").each do |part|
next if part.strip.empty?
a,b = part.split("=")
b.strip!
b = true if b == "true"
b = false if b == "false"
hsh[a.strip] = b
end
hsh
end
end
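
The status output is handled as space-separated key=value pairs, with "true" and "false" turned into booleans. A minimal standalone sketch of that parsing follows. The function name `parse_status_line` and the sample line are assumptions for illustration; only the `has_control` key appears in the tests in this change.

```ruby
# Hypothetical standalone version of the status parsing, for illustration.
def parse_status_line(status)
  status.split(" ").each_with_object({}) do |part, hsh|
    key, value = part.split("=", 2)
    next if value.nil?            # skip fragments without an "="
    hsh[key] = case value
               when "true"  then true
               when "false" then false
               else value         # everything else stays a string
               end
  end
end

parsed = parse_status_line("has_control=true is_mirroring=false pid=1234")
p parsed["has_control"]   # true
p parsed["pid"]           # "1234"
```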

@@ -0,0 +1,37 @@
# encoding: utf-8
module FlexNBD
# eeevil is his one and only name...
def self.read_constants
parents = []
current = File.expand_path(".")
while current != "/"
parents << current
current = File.expand_path( File.join( current, ".." ) )
end
source_root = parents.find do |dirname|
File.directory?( File.join( dirname, "src" ) )
end
fail "No source root!" unless source_root
headers = Dir[File.join( source_root, "src", "*.h" ) ]
headers.each do |header_filename|
txt_lines = File.readlines( header_filename )
txt_lines.each do |line|
if line =~ /^#\s*define\s+([A-Z0-9_]+)\s+(\d+)\s*$/
# Bodge until I can figure out what to do with #ifdefs
const_set($1, $2.to_i) unless constants.map { |c| c.to_s }.include?( $1 )
end
end
end
end
read_constants()
end # module FlexNBD
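
To see what the `#define` regular expression above does and does not pick up, here is a minimal sketch that applies it to an in-memory header. The helper `scan_defines` and the sample header text (including its values) are assumptions for illustration, not taken from the flexnbd sources.

```ruby
DEFINE_RE = /^#\s*define\s+([A-Z0-9_]+)\s+(\d+)\s*$/

# Hypothetical helper: collect integer #defines, first definition wins.
def scan_defines(text)
  text.each_line.each_with_object({}) do |line, found|
    m = DEFINE_RE.match(line)
    found[m[1]] ||= m[2].to_i if m
  end
end

sample = <<HEADER
#define CLIENT_MAX_WAIT_SECS 5
#  define FOO_BAR 12
#define SOME_NAME "text"
#define CLIENT_MAX_WAIT_SECS 9
HEADER

defines = scan_defines(sample)
p defines["CLIENT_MAX_WAIT_SECS"]   # 5
p defines["FOO_BAR"]                # 12
p defines.key?("SOME_NAME")         # false
```

Only plain decimal values match, so string, hexadecimal, and expression-valued macros are silently skipped, and a name defined twice (for example under different `#ifdef` branches) keeps its first value.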

@@ -0,0 +1,163 @@
# encoding: utf-8
require 'socket'
require 'timeout'
require 'flexnbd/constants'
module FlexNBD
class FakeDest
class Client
def initialize( sock )
@sock = sock
end
def write_hello( opts = {} )
@sock.write( "NBDMAGIC" )
if opts[:magic] == :wrong
write_rand( @sock, 8 )
else
@sock.write( "\x00\x00\x42\x02\x81\x86\x12\x53" )
end
if opts[:size] == :wrong
write_rand( @sock, 8 )
else
@sock.write( "\x00\x00\x00\x00\x00\x00\x10\x00" )
end
@sock.write( "\x00" * 128 )
end
def write_rand( sock, len )
len.times do sock.write( rand(256).chr ) end
end
def read_request()
req = @sock.read(28)
magic_s = req[0 ... 4 ]
type_s = req[4 ... 8 ]
handle_s = req[8 ... 16]
from_s = req[16 ... 24]
len_s = req[24 ... 28]
{
:magic => magic_s,
:type => type_s.unpack("N").first,
:handle => handle_s,
:from => self.class.parse_be64( from_s ),
:len => len_s.unpack( "N").first
}
end
REPLY_MAGIC="\x67\x44\x66\x98"
def write_error( handle )
write_reply( handle, 1 )
end
def disconnected?
begin
Timeout.timeout(2) do
@sock.read(1) == nil
end
rescue Timeout::Error
return false
end
end
def write_reply( handle, err=0, opts={} )
if opts[:magic] == :wrong
write_rand( @sock, 4 )
else
@sock.write( REPLY_MAGIC )
end
@sock.write( [err].pack("N") )
@sock.write( handle )
end
def close
@sock.close
end
def read_data( len )
@sock.read( len )
end
def self.parse_be64(str)
raise "String is the wrong length: 8 bytes expected (#{str.length} received)" unless
str.length == 8
top, bottom = str.unpack("NN")
(top << 32) + bottom
end
def receive_mirror( opts = {} )
write_hello()
loop do
req = read_request
case req[:type]
when 1
read_data( req[:len] )
write_reply( req[:handle] )
when 65536
write_reply( req[:handle], opts[:err] == :entrust ? 1 : 0 )
break
else
raise "Unexpected request: #{req.inspect}"
end
end
disc = read_request
if disc[:type] == 2
close
else
raise "Not a disconnect: #{disc.inspect}"
end
end
end # class Client
def initialize( addr, port )
@sock = TCPServer.new( addr, port )
end
def accept( err_msg = "Timed out waiting for a connection", timeout = 2)
client_sock = nil
begin
Timeout.timeout(timeout) do
client_sock = @sock.accept
end
rescue Timeout::Error
raise Timeout::Error.new(err_msg)
end
Client.new( client_sock )
end
def close
@sock.close
end
end # class FakeDest
end # module FlexNBD
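
The hello that `write_hello` emits is 152 bytes: the 8-byte string "NBDMAGIC", an 8-byte magic number, an 8-byte big-endian size, and 128 bytes of padding. The sketch below builds such a message in memory and parses it back. `parse_hello` and `HELLO_LEN` are hypothetical names used only for this illustration.

```ruby
HELLO_LEN = 152

# Hypothetical parser, for illustration: pull the magic and size fields
# out of a complete hello message.
def parse_hello(bytes)
  raise ArgumentError, "short hello" unless bytes.bytesize == HELLO_LEN
  high, low = bytes[16, 8].unpack("NN")   # 64-bit big-endian size
  { :magic => bytes[0, 8], :size => (high << 32) + low }
end

hello = "NBDMAGIC".b +
        "\x00\x00\x42\x02\x81\x86\x12\x53".b +
        [0, 4096].pack("NN") +
        ("\x00" * 128).b

parsed = parse_hello(hello)
p parsed[:magic]   # "NBDMAGIC"
p parsed[:size]    # 4096
```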

@@ -0,0 +1,136 @@
# encoding: utf-8
require 'socket'
require "timeout"
require 'flexnbd/constants'
module FlexNBD
class FakeSource
def initialize( addr, port, err_msg, source_addr=nil, source_port=0 )
timing_out( 2, err_msg ) do
@sock = if source_addr
TCPSocket.new( addr, port, source_addr, source_port )
else
TCPSocket.new( addr, port )
end
end
end
def close
@sock.close
end
def read_hello()
timing_out( FlexNBD::MS_HELLO_TIME_SECS,
"Timed out waiting for hello." ) do
fail "No hello." unless (hello = @sock.read( 152 )) &&
hello.length==152
magic_s = hello[0..7]
ignore_s= hello[8..15]
size_s = hello[16..23]
size_h, size_l = size_s.unpack("NN")
size = (size_h << 32) + size_l
return { :magic => magic_s, :size => size }
end
end
def send_request( type, handle="myhandle", from=0, len=0 )
fail "Bad handle" unless handle.length == 8
@sock.write( "\x25\x60\x95\x13" )
@sock.write( [type].pack( 'N' ) )
@sock.write( handle )
@sock.write( "\x0"*4 )
@sock.write( [from].pack( 'N' ) )
@sock.write( [len ].pack( 'N' ) )
end
def write_write_request( from, len, handle="myhandle" )
send_request( 1, handle, from, len )
end
def write_entrust_request( handle="myhandle" )
send_request( 65536, handle )
end
def write_disconnect_request( handle="myhandle" )
send_request( 2, handle )
end
def write_read_request( from, len, handle="myhandle" )
send_request( 0, handle, from, len )
end
def write_data( data )
@sock.write( data )
end
# Handy utility
def read( from, len )
timing_out( 2, "Timed out reading" ) do
send_request( 0, "myhandle", from, len )
read_raw( len )
end
end
def read_raw( len )
@sock.read( len )
end
def send_mirror
read_hello()
write_write_request( 0, 8 )
write_data( "12345678" )
read_response()
write_entrust_request()
read_response()
write_disconnect_request()
close()
end
def read_response
magic = @sock.read(4)
error_s = @sock.read(4)
handle = @sock.read(8)
{
:magic => magic,
:error => error_s.unpack("N").first,
:handle => handle
}
end
def ensure_disconnected
Timeout.timeout( 2 ) do
@sock.read(1)
end
end
def timing_out( time, msg )
begin
Timeout.timeout( time ) do
yield
end
rescue Timeout::Error
$stderr.puts msg
exit 1
end
end
end # class FakeSource
end # module FlexNBD
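
The request that `send_request` writes is 28 bytes: a 4-byte magic, a 4-byte type, an 8-byte handle, an 8-byte offset, and a 4-byte length, all big-endian. The sketch below assembles one in memory rather than writing it to a socket. `build_request` is a hypothetical helper, and unlike `send_request` above it also encodes the upper 32 bits of the offset.

```ruby
REQUEST_MAGIC = "\x25\x60\x95\x13".b

# Hypothetical helper, for illustration: return the 28 request bytes.
def build_request(type, handle, from, len)
  raise ArgumentError, "handle must be 8 bytes" unless handle.bytesize == 8
  REQUEST_MAGIC +
    [type].pack("N") +
    handle.b +
    [from >> 32, from & 0xffffffff].pack("NN") +   # 64-bit offset
    [len].pack("N")
end

req = build_request(1, "myhandle", 2**32 + 5, 8)
p req.bytesize              # 28
p req[16, 8].unpack("NN")   # [1, 5]
```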

@@ -0,0 +1,6 @@
#!/usr/bin/ruby
test_files = Dir[File.dirname( __FILE__ ) + "/test*.rb"]
for filename in test_files
require filename
end

@@ -0,0 +1,107 @@
# encoding: utf-8
require 'test/unit'
require 'environment'
class TestDestErrorHandling < Test::Unit::TestCase
def setup
@env = Environment.new
@env.writefile1( "0" * 4 )
@env.listen1
end
def teardown
@env.cleanup
end
def test_hello_blocked_by_disconnect_causes_error_not_fatal
run_fake( "source/close_after_connect" )
assert_no_control
end
def test_hello_goes_astray_causes_timeout_error
run_fake( "source/hang_after_hello" )
assert_no_control
end
def test_disconnect_after_hello_causes_error_not_fatal
run_fake( "source/close_after_hello" )
assert_no_control
end
def test_partial_read_causes_error
run_fake( "source/close_mid_read" )
end
def test_double_connect_during_hello
run_fake( "source/connect_during_hello" )
end
def test_acl_rejection
@env.acl1("127.0.0.1")
run_fake( "source/connect_from_banned_ip")
end
def test_bad_write
run_fake( "source/write_out_of_range" )
end
def test_disconnect_before_write_data_causes_error
run_fake( "source/close_after_write" )
end
def test_disconnect_before_entrust_reply_causes_error
run_fake( "source/close_after_entrust" )
end
def test_disconnect_before_write_reply_causes_error
# Note that this is an odd case: writing the reply doesn't fail.
# The test passes because the next attempt by flexnbd to read a
# request returns EOF.
run_fake( "source/close_after_write_data" )
end
def test_disconnect_after_entrust_reply_causes_error
@env.nbd1.can_die(0)
# This fake runs a failed migration then a succeeding one, so we
# expect the destination to take control.
run_fake( "source/close_after_entrust_reply" )
assert_control
end
def test_cant_rebind_retries
run_fake( "source/successful_transfer" )
end
private
def run_fake( name )
@env.run_fake( name, @env.ip, @env.port1, @env.ip, @env.rebind_port1 )
assert @env.fake_reports_success, "#{name} failed."
end
def status
stat, _ = @env.status1
stat
end
def assert_no_control
assert !status['has_control'], "Thought it had control"
end
def assert_control
assert status['has_control'], "Didn't think it had control"
end
end # class TestDestErrorHandling

@@ -0,0 +1,95 @@
# encoding: utf-8
require 'test/unit'
require 'environment'
class TestHappyPath < Test::Unit::TestCase
def setup
@env = Environment.new
end
def teardown
@env.nbd1.can_die(0)
@env.nbd2.can_die(0)
@env.cleanup
end
def test_read1
@env.writefile1("f"*64)
@env.serve1
[0, 12, 63].each do |num|
assert_equal(
@env.nbd1.read(num*@env.blocksize, @env.blocksize),
@env.file1.read(num*@env.blocksize, @env.blocksize)
)
end
[124, 1200, 10028, 25488].each do |num|
assert_equal(@env.nbd1.read(num, 4), @env.file1.read(num, 4))
end
end
# Check that data written over NBD lands in the file and reads back.
#
def test_writeread1
@env.writefile1("0"*64)
@env.serve1
[0, 12, 63].each do |num|
data = "X"*@env.blocksize
@env.nbd1.write(num*@env.blocksize, data)
assert_equal(data, @env.file1.read(num*@env.blocksize, data.size))
assert_equal(data, @env.nbd1.read(num*@env.blocksize, data.size))
end
end
# Check that we're not overstepping or understepping where our writes end
# up.
#
def test_writeread2
@env.writefile1("0"*1024)
@env.serve1
d0 = "\0"*@env.blocksize
d1 = "X"*@env.blocksize
(0..63).each do |num|
@env.nbd1.write(num*@env.blocksize*2, d1)
end
(0..63).each do |num|
assert_equal(d0, @env.nbd1.read(((2*num)+1)*@env.blocksize, d0.size))
end
end
def test_mirror
@env.writefile1( "f"*4 )
@env.serve1
@env.writefile2( "0"*4 )
@env.listen2
@env.nbd1.can_die
stdout, stderr = @env.mirror12
@env.nbd1.join
assert_equal(@env.file1.read_original( 0, @env.blocksize ),
@env.file2.read( 0, @env.blocksize ) )
assert @env.status2['has_control'], "destination didn't take control"
end
def test_write_to_high_block
# Create a large file, then try to write to somewhere after the 2G boundary
@env.truncate1 "4G"
@env.serve1
@env.nbd1.write( 2**31+2**29, "12345678" )
sleep(1)
assert_equal "12345678", @env.nbd1.read( 2**31+2**29, 8 )
end
end

@@ -0,0 +1,109 @@
# encoding: utf-8
require 'test/unit'
require 'environment'
class TestSourceErrorHandling < Test::Unit::TestCase
def setup
@env = Environment.new
@env.writefile1( "f" * 4 )
@env.serve1
end
def teardown
@env.nbd1.can_die(0)
@env.cleanup
end
def test_failure_to_connect_reported_in_mirror_cmd_response
stdout, stderr = @env.mirror12_unchecked
assert_match( /failed to connect/, stderr )
end
def test_destination_hangs_after_connect_reports_error_at_source
run_fake( "dest/hang_after_connect",
:err => /Remote server failed to respond/ )
end
def test_destination_rejects_connection_reports_error_at_source
run_fake( "dest/reject_acl",
:err => /Mirror was rejected/ )
end
def test_wrong_size_causes_disconnect
run_fake( "dest/hello_wrong_size",
:err => /Remote size does not match local size/ )
end
def test_wrong_magic_causes_disconnect
run_fake( "dest/hello_wrong_magic",
:err => /Mirror was rejected/ )
end
def test_disconnect_after_hello_causes_retry
run_fake( "dest/close_after_hello",
:out => /Mirror started/ )
end
def test_write_times_out_causes_retry
run_fake( "dest/hang_after_write" )
end
def test_rejected_write_causes_retry
run_fake( "dest/error_on_write" )
end
def test_disconnect_before_write_reply_causes_retry
run_fake( "dest/close_after_write" )
end
def test_bad_write_reply_causes_retry
run_fake( "dest/write_wrong_magic" )
end
def test_pre_entrust_disconnect_causes_retry
run_fake( "dest/close_after_writes" )
end
def test_post_entrust_disconnect_causes_retry
@env.nbd1.can_die(0)
run_fake( "dest/close_after_entrust" )
end
def test_entrust_error_causes_retry
run_fake( "dest/error_on_entrust" )
end
private
def run_fake(name, opts = {})
@env.run_fake( name, @env.ip, @env.port2 )
stdout, stderr = @env.mirror12_unchecked
assert_success
assert_match( opts[:err], stderr ) if opts[:err]
assert_match( opts[:out], stdout ) if opts[:out]
return stdout, stderr
end
def assert_success( msg=nil )
assert @env.fake_reports_success, msg || "Fake failed"
end
end # class TestSourceErrorHandling

tests/fuzz Normal file

@@ -0,0 +1,102 @@
#!/usr/bin/ruby
$:.push "acceptance"
require 'flexnbd'
binary = ARGV.shift
test_size = ARGV.shift.to_i
repetitions = ARGV.shift.to_i
repetitions = 50 if repetitions == 0
seed = ARGV.shift.to_i
max_length = test_size > 10000000 ? 10000000 : test_size
CHEAT_AND_ROUND_DOWN = false # set to 'false' to expose bugs
srand(seed)
if test_size < 32768 || repetitions < 1 || !File.executable?(binary)
STDERR.print "Syntax: #{$0} <flexnbd bin> <test size> [repetitions] [seed]\n"
STDERR.print "test_size must be >= 32768 and repetitions must be >= 1\n"
exit 1
end
testname_local = "#{$0}.test.#{$$}.local"
testname_serve = "#{$0}.test.#{$$}.serve"
[testname_local, testname_serve].each do |name|
File.open(name, "w+") { |fh| fh.seek(test_size-1, IO::SEEK_SET); fh.write("\0") }
end
@local = File.open(testname_local, "r+")
@serve = FlexNBD.new(binary, "127.0.0.1", 41234)
@serve.serve(testname_serve)
$record = []
def print_record
$record.each do |offset, length, byte|
STDERR.print " wrote #{byte} to #{offset}+#{length}\n"
end
end
repetitions.times do |n|
begin
if File.size(testname_local) != File.size(testname_serve)
STDERR.print "Before pass #{n}: File sizes are different: local=#{File.size(testname_local)} serve=#{File.size(testname_serve)}\n"
exit 1;
end
md5_local = `md5sum < #{testname_local}`.split(" ").first
md5_serve = `md5sum < #{testname_serve}`.split(" ").first
if md5_local != md5_serve
STDERR.print "Before pass #{n}: MD5 error: local=#{md5_local} serve=#{md5_serve}\n"
print_record
STDERR.print "**** Local contents:\n"
system("hexdump #{testname_local}")
STDERR.print "**** Serve contents:\n"
system("hexdump #{testname_serve}")
exit 1
end
length = rand(max_length/8)
length &= 0xfffff000 if CHEAT_AND_ROUND_DOWN
offset = rand(test_size - length)
offset &= 0xfffff000 if CHEAT_AND_ROUND_DOWN
content = (n%2 == 0) ? ("\0" * length) : ( (n&255).chr * length)
$record << [offset, length, content[0]]
@local.seek(offset, IO::SEEK_SET)
@local.write(content)
@local.fsync
@serve.write(offset, content)
check_read = @serve.read(offset, length)
if check_read != content
STDERR.print "After pass #{n}: Didn't read back what we wrote!\n"
print_record
STDERR.print "*** We wrote these #{content.length} bytes...\n"
IO.popen("hexdump", "w") { |io| io.print(content) }
STDERR.print "*** But we got back these #{check_read.length} bytes...\n"
IO.popen("hexdump", "w") { |io| io.print(check_read) }
exit 1
end
rescue StandardError => ex
STDERR.print "During pass #{n}: Exception: #{ex}\n"
print_record
STDERR.print ex.backtrace.join("\n") + "\n"
exit 2
end
end
File.unlink(testname_local)
File.unlink(testname_serve)
@serve.can_die(0)

tests/unit/check_acl.c Normal file

@@ -0,0 +1,229 @@
#include <check.h>
#include <stdio.h>
#include "acl.h"
#include "util.h"
START_TEST( test_null_acl )
{
struct acl *acl = acl_create( 0,NULL, 0 );
fail_if( NULL == acl, "No acl alloced." );
fail_unless( 0 == acl->len, "Incorrect length" );
}
END_TEST
START_TEST( test_parses_single_line )
{
char *lines[] = {"127.0.0.1"};
struct acl * acl = acl_create( 1, lines, 0 );
fail_unless( 1 == acl->len, "Incorrect length." );
fail_if( NULL == acl->entries, "No entries present." );
}
END_TEST
START_TEST( test_parses_multiple_lines )
{
char *lines[] = {"127.0.0.1", "::1"};
struct acl * acl = acl_create( 2, lines, 0 );
union mysockaddr e0, e1;
parse_ip_to_sockaddr( &e0.generic, lines[0] );
parse_ip_to_sockaddr( &e1.generic, lines[1] );
fail_unless( acl->len == 2, "Multiple lines not parsed" );
struct ip_and_mask *entry;
entry = &(*acl->entries)[0];
fail_unless(entry->ip.family == e0.family, "entry 0 has wrong family!");
entry = &(*acl->entries)[1];
fail_unless(entry->ip.family == e1.family, "entry 1 has wrong family!");
}
END_TEST
START_TEST( test_destroy_doesnt_crash )
{
char *lines[] = {"127.0.0.1"};
struct acl * acl = acl_create( 1, lines, 0 );
acl_destroy( acl );
}
END_TEST
START_TEST( test_includes_single_address )
{
char *lines[] = {"127.0.0.1"};
struct acl * acl = acl_create( 1, lines, 0 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "127.0.0.1" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
}
END_TEST
START_TEST( test_includes_single_address_when_netmask_specified_ipv4 )
{
char *lines[] = {"127.0.0.1/24"};
struct acl * acl = acl_create( 1, lines, 0 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "127.0.0.0" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
parse_ip_to_sockaddr( &x.generic, "127.0.0.1" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
parse_ip_to_sockaddr( &x.generic, "127.0.0.255" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
}
END_TEST
START_TEST( test_includes_single_address_when_netmask_specified_ipv6 )
{
char *lines[] = {"fe80::/10"};
struct acl * acl = acl_create( 1, lines, 0 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "fe80::1" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
parse_ip_to_sockaddr( &x.generic, "fe80::2" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
parse_ip_to_sockaddr( &x.generic, "fe80:ffff:ffff::ffff" );
fail_unless( acl_includes( acl, &x ), "Included address wasn't covered" );
}
END_TEST
START_TEST( test_includes_single_address_when_multiple_entries_exist )
{
char *lines[] = {"127.0.0.1", "::1"};
struct acl * acl = acl_create( 2, lines, 0 );
union mysockaddr e0;
union mysockaddr e1;
parse_ip_to_sockaddr( &e0.generic, "127.0.0.1" );
parse_ip_to_sockaddr( &e1.generic, "::1" );
fail_unless( acl_includes( acl, &e0 ), "Included address 0 wasn't covered" );
fail_unless( acl_includes( acl, &e1 ), "Included address 1 wasn't covered" );
}
END_TEST
START_TEST( test_doesnt_include_other_address )
{
char *lines[] = {"127.0.0.1"};
struct acl * acl = acl_create( 1, lines, 0 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "127.0.0.2" );
fail_if( acl_includes( acl, &x ), "Excluded address was covered." );
}
END_TEST
START_TEST( test_doesnt_include_other_address_when_netmask_specified )
{
char *lines[] = {"127.0.0.1/32"};
struct acl * acl = acl_create( 1, lines, 0 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "127.0.0.2" );
fail_if( acl_includes( acl, &x ), "Excluded address was covered." );
}
END_TEST
START_TEST( test_doesnt_include_other_address_when_multiple_entries_exist )
{
char *lines[] = {"127.0.0.1", "::1"};
struct acl * acl = acl_create( 2, lines, 0 );
union mysockaddr e0;
union mysockaddr e1;
parse_ip_to_sockaddr( &e0.generic, "127.0.0.2" );
parse_ip_to_sockaddr( &e1.generic, "::2" );
fail_if( acl_includes( acl, &e0 ), "Excluded address 0 was covered." );
fail_if( acl_includes( acl, &e1 ), "Excluded address 1 was covered." );
}
END_TEST
START_TEST( test_default_deny_rejects )
{
struct acl * acl = acl_create( 0, NULL, 1 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "127.0.0.1" );
fail_if( acl_includes( acl, &x ), "Default deny accepted." );
}
END_TEST
START_TEST( test_default_accept_rejects )
{
struct acl * acl = acl_create( 0, NULL, 0 );
union mysockaddr x;
parse_ip_to_sockaddr( &x.generic, "127.0.0.1" );
fail_unless( acl_includes( acl, &x ), "Default accept rejected." );
}
END_TEST
Suite* acl_suite(void)
{
Suite *s = suite_create("acl");
TCase *tc_create = tcase_create("create");
TCase *tc_includes = tcase_create("includes");
TCase *tc_destroy = tcase_create("destroy");
tcase_add_test(tc_create, test_null_acl);
tcase_add_test(tc_create, test_parses_single_line);
tcase_add_test(tc_includes, test_parses_multiple_lines);
tcase_add_test(tc_includes, test_includes_single_address);
tcase_add_test(tc_includes, test_includes_single_address_when_netmask_specified_ipv4);
tcase_add_test(tc_includes, test_includes_single_address_when_netmask_specified_ipv6);
tcase_add_test(tc_includes, test_includes_single_address_when_multiple_entries_exist);
tcase_add_test(tc_includes, test_doesnt_include_other_address);
tcase_add_test(tc_includes, test_doesnt_include_other_address_when_netmask_specified);
tcase_add_test(tc_includes, test_doesnt_include_other_address_when_multiple_entries_exist);
tcase_add_test(tc_includes, test_default_deny_rejects);
tcase_add_test(tc_includes, test_default_accept_rejects);
tcase_add_test(tc_destroy, test_destroy_doesnt_crash);
suite_add_tcase(s, tc_create);
suite_add_tcase(s, tc_includes);
suite_add_tcase(s, tc_destroy);
return s;
}
int main(void)
{
#ifdef DEBUG
log_level = 0;
#else
log_level = 2;
#endif
int number_failed;
Suite *s = acl_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
log_level = 0;
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

204
tests/unit/check_bitset.c Normal file

@@ -0,0 +1,204 @@
#include <check.h>
#include <stdint.h> /* uint64_t */
#include <string.h> /* memset */
#include "bitset.h"
START_TEST(test_bit_set)
{
uint64_t num = 0;
char *bits = (char*) &num;
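/* num is 64 bits wide, so print it with %llx rather than %x */
#define TEST_BIT_SET(bit, newvalue) \
bit_set(bits, (bit)); \
fail_unless(num == (newvalue), "num was %llx instead of %llx", \
(unsigned long long) num, (unsigned long long) (newvalue));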
TEST_BIT_SET(0, 1);
TEST_BIT_SET(1, 3);
TEST_BIT_SET(2, 7);
TEST_BIT_SET(7, 0x87);
TEST_BIT_SET(63, 0x8000000000000087);
}
END_TEST
START_TEST(test_bit_clear)
{
uint64_t num = 0xffffffffffffffff;
char *bits = (char*) &num;
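/* num is 64 bits wide, so print it with %llx rather than %x */
#define TEST_BIT_CLEAR(bit, newvalue) \
bit_clear(bits, (bit)); \
fail_unless(num == (newvalue), "num was %llx instead of %llx", \
(unsigned long long) num, (unsigned long long) (newvalue));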
TEST_BIT_CLEAR(0, 0xfffffffffffffffe);
TEST_BIT_CLEAR(1, 0xfffffffffffffffc);
TEST_BIT_CLEAR(2, 0xfffffffffffffff8);
TEST_BIT_CLEAR(7, 0xffffffffffffff78);
TEST_BIT_CLEAR(63,0x7fffffffffffff78);
}
END_TEST
START_TEST(test_bit_tests)
{
uint64_t num = 0x5555555555555555;
char *bits = (char*) &num;
fail_unless(bit_has_value(bits, 0, 1), "bit_has_value malfunction");
fail_unless(bit_has_value(bits, 1, 0), "bit_has_value malfunction");
fail_unless(bit_has_value(bits, 63, 0), "bit_has_value malfunction");
fail_unless(bit_is_set(bits, 0), "bit_is_set malfunction");
fail_unless(bit_is_clear(bits, 1), "bit_is_clear malfunction");
fail_unless(bit_is_set(bits, 62), "bit_is_set malfunction");
fail_unless(bit_is_clear(bits, 63), "bit_is_clear malfunction");
}
END_TEST
START_TEST(test_bit_ranges)
{
char buffer[4160];
uint64_t *longs = (uint64_t*) buffer;
uint64_t i;
memset(buffer, 0, 4160);
for (i=0; i<64; i++) {
/* set the low i bits of the i'th 64-bit word */
bit_set_range(buffer, i*64, i);
fail_unless(
longs[i] == ((uint64_t)1<<i)-1,
"longs[%llu] = %llx SHOULD BE %llx",
(unsigned long long) i,
(unsigned long long) longs[i],
(unsigned long long) (((uint64_t)1<<i)-1)
);
fail_unless(longs[i+1] == 0, "bit_set_range overshot at i=%llu",
(unsigned long long) i);
}
for (i=0; i<64; i++) {
bit_clear_range(buffer, i*64, i);
fail_unless(longs[i] == 0, "bit_clear_range didn't work at i=%llu",
(unsigned long long) i);
}
}
END_TEST
START_TEST(test_bit_runs)
{
char buffer[256];
int i, ptr=0, runs[] = {
56,97,22,12,83,1,45,80,85,51,64,40,63,67,75,64,94,81,79,62
};
memset(buffer,0,256);
for (i=0; i < 20; i += 2) {
ptr += runs[i];
bit_set_range(buffer, ptr, runs[i+1]);
ptr += runs[i+1];
}
ptr = 0;
for (i=0; i < 20; i += 1) {
int run = bit_run_count(buffer, ptr, 2048-ptr);
fail_unless(
run == runs[i],
"run %d should have been %d, was %d",
i, runs[i], run
);
ptr += runs[i];
}
}
END_TEST
START_TEST(test_bitset)
{
struct bitset_mapping* map;
uint64_t *num;
map = bitset_alloc(6400, 100);
num = (uint64_t*) map->bits;
bitset_set_range(map,0,50);
ck_assert_int_eq(1, *num);
bitset_set_range(map,99,1);
ck_assert_int_eq(1, *num);
bitset_set_range(map,100,1);
ck_assert_int_eq(3, *num);
bitset_set_range(map,0,800);
ck_assert_int_eq(255, *num);
bitset_set_range(map,1499,2);
ck_assert_int_eq(0xc0ff, *num);
bitset_clear_range(map,1499,2);
ck_assert_int_eq(255, *num);
*num = 0;
bitset_set_range(map, 1499, 2);
bitset_clear_range(map, 1300, 200);
ck_assert_int_eq(0x8000, *num);
*num = 0;
bitset_set_range(map, 0, 6400);
ck_assert_int_eq(0xffffffffffffffff, *num);
bitset_clear_range(map, 3200, 400);
ck_assert_int_eq(0xfffffff0ffffffff, *num);
}
END_TEST
START_TEST( test_bitset_set )
{
struct bitset_mapping* map;
uint64_t *num;
map = bitset_alloc(64, 1);
num = (uint64_t*) map->bits;
ck_assert_int_eq( 0x0000000000000000, *num );
bitset_set( map );
ck_assert_int_eq( 0xffffffffffffffff, *num );
}
END_TEST
START_TEST( test_bitset_clear )
{
struct bitset_mapping* map;
uint64_t *num;
map = bitset_alloc(64, 1);
num = (uint64_t*) map->bits;
ck_assert_int_eq( 0x0000000000000000, *num );
bitset_set( map );
bitset_clear( map );
ck_assert_int_eq( 0x0000000000000000, *num );
}
END_TEST
Suite* bitset_suite(void)
{
Suite *s = suite_create("bitset");
TCase *tc_bit = tcase_create("bit");
TCase *tc_bitset = tcase_create("bitset");
tcase_add_test(tc_bit, test_bit_set);
tcase_add_test(tc_bit, test_bit_clear);
tcase_add_test(tc_bit, test_bit_tests);
tcase_add_test(tc_bit, test_bit_ranges);
tcase_add_test(tc_bit, test_bit_runs);
tcase_add_test(tc_bitset, test_bitset);
tcase_add_test(tc_bitset, test_bitset_set);
tcase_add_test(tc_bitset, test_bitset_clear);
suite_add_tcase(s, tc_bit);
suite_add_tcase(s, tc_bitset);
return s;
}
int main(void)
{
int number_failed;
Suite *s = bitset_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

122
tests/unit/check_client.c Normal file

@@ -0,0 +1,122 @@
#include <check.h>
#include <stdio.h>
#include "self_pipe.h"
#include "nbdtypes.h"
#include "serve.h"
#include "client.h"
#include <unistd.h>
struct server fake_server = {0};
#define FAKE_SERVER &fake_server
#define FAKE_SOCKET (42)
START_TEST( test_assigns_socket )
{
struct client * c;
c = client_create( FAKE_SERVER, FAKE_SOCKET );
fail_unless( FAKE_SOCKET == c->socket, "Socket wasn't assigned." );
}
END_TEST
START_TEST( test_assigns_server )
{
struct client * c;
/* can't predict the storage size so we can't allocate one on
* the stack
*/
c = client_create( FAKE_SERVER, FAKE_SOCKET );
fail_unless( FAKE_SERVER == c->serve, "Serve wasn't assigned." );
}
END_TEST
START_TEST( test_opens_stop_signal )
{
struct client *c = client_create( FAKE_SERVER, FAKE_SOCKET );
client_signal_stop( c );
fail_unless( 1 == self_pipe_signal_clear( c->stop_signal ),
"No signal was sent." );
}
END_TEST
int fd_is_closed(int);
START_TEST( test_closes_stop_signal )
{
struct client *c = client_create( FAKE_SERVER, FAKE_SOCKET );
int read_fd = c->stop_signal->read_fd;
int write_fd = c->stop_signal->write_fd;
client_destroy( c );
fail_unless( fd_is_closed( read_fd ), "Stop signal wasn't destroyed." );
fail_unless( fd_is_closed( write_fd ), "Stop signal wasn't destroyed." );
}
END_TEST
START_TEST( test_read_request_quits_on_stop_signal )
{
int fds[2];
struct nbd_request nbdr;
pipe( fds );
struct client *c = client_create( FAKE_SERVER, fds[0] );
client_signal_stop( c );
int client_read_request( struct client *, struct nbd_request *);
fail_unless( 0 == client_read_request( c, &nbdr ), "Didn't quit on stop." );
close( fds[0] );
close( fds[1] );
}
END_TEST
Suite *client_suite(void)
{
Suite *s = suite_create("client");
TCase *tc_create = tcase_create("create");
TCase *tc_signal = tcase_create("signal");
TCase *tc_destroy = tcase_create("destroy");
tcase_add_test(tc_create, test_assigns_socket);
tcase_add_test(tc_create, test_assigns_server);
tcase_add_test(tc_signal, test_opens_stop_signal);
tcase_add_test(tc_signal, test_read_request_quits_on_stop_signal);
tcase_add_test( tc_destroy, test_closes_stop_signal );
suite_add_tcase(s, tc_create);
suite_add_tcase(s, tc_signal);
suite_add_tcase(s, tc_destroy);
return s;
}
int main(void)
{
int number_failed;
Suite *s = client_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}


@@ -0,0 +1,42 @@
#include "control.h"
#include "flexnbd.h"
#include <check.h>
START_TEST( test_assigns_sock_name )
{
struct flexnbd flexnbd = {0};
char csn[] = "foobar";
struct control * control = control_create(&flexnbd, csn );
fail_unless( csn == control->socket_name, "Socket name not assigned" );
}
END_TEST
Suite *control_suite(void)
{
Suite *s = suite_create("control");
TCase *tc_create = tcase_create("create");
tcase_add_test(tc_create, test_assigns_sock_name);
suite_add_tcase( s, tc_create );
return s;
}
int main(void)
{
int number_failed;
Suite *s = control_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}


@@ -0,0 +1,47 @@
#include "flexnbd.h"
#include <check.h>
START_TEST( test_listening_assigns_sock )
{
struct flexnbd * flexnbd = flexnbd_create_listening(
"127.0.0.1",
NULL,
"4777",
NULL,
"fakefile",
"fakesock",
0,
0,
NULL,
1 );
fail_if( NULL == flexnbd->control->socket_name, "No socket was copied" );
}
END_TEST
Suite *flexnbd_suite(void)
{
Suite *s = suite_create("flexnbd");
TCase *tc_create = tcase_create("create");
tcase_add_test(tc_create, test_listening_assigns_sock);
suite_add_tcase( s, tc_create );
return s;
}
int main(void)
{
int number_failed;
Suite *s = flexnbd_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}


@@ -0,0 +1,62 @@
#include "flexthread.h"
#include "util.h"
#include <check.h>
START_TEST( test_mutex_create )
{
struct flexthread_mutex * ftm = flexthread_mutex_create();
NULLCHECK( ftm );
flexthread_mutex_destroy( ftm );
}
END_TEST
START_TEST( test_mutex_lock )
{
struct flexthread_mutex * ftm = flexthread_mutex_create();
fail_if( flexthread_mutex_held( ftm ), "Flexthread_mutex is held before lock" );
flexthread_mutex_lock( ftm );
fail_unless( flexthread_mutex_held( ftm ), "Flexthread_mutex is not held inside lock" );
flexthread_mutex_unlock( ftm );
fail_if( flexthread_mutex_held( ftm ), "Flexthread_mutex is held after unlock" );
flexthread_mutex_destroy( ftm );
}
END_TEST
Suite* flexthread_suite(void)
{
Suite *s = suite_create("flexthread");
TCase *tc_create = tcase_create("create");
TCase *tc_destroy = tcase_create("destroy");
tcase_add_test( tc_create, test_mutex_create );
tcase_add_test( tc_create, test_mutex_lock );
suite_add_tcase(s, tc_create);
suite_add_tcase(s, tc_destroy);
return s;
}
int main(void)
{
#ifdef DEBUG
log_level = 0;
#else
log_level = 2;
#endif
int number_failed;
Suite *s = flexthread_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
log_level = 0;
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

134
tests/unit/check_ioutil.c Normal file

@@ -0,0 +1,134 @@
#include "ioutil.h"
#include <unistd.h> /* pipe, write, close */
#include <check.h>
START_TEST( test_read_until_newline_returns_line_length_plus_null )
{
int fds[2];
int nread;
char buf[5] = {0};
pipe(fds);
write( fds[1], "1234\n", 5 );
nread = read_until_newline( fds[0], buf, 5 );
ck_assert_int_eq( 5, nread );
}
END_TEST
START_TEST( test_read_until_newline_inserts_null )
{
int fds[2];
int nread;
char buf[5] = {0};
pipe(fds);
write( fds[1], "1234\n", 5 );
nread = read_until_newline( fds[0], buf, 5 );
ck_assert_int_eq( '\0', buf[4] );
}
END_TEST
START_TEST( test_read_empty_line_inserts_null )
{
int fds[2];
int nread;
char buf[5] = {0};
pipe(fds);
write( fds[1], "\n", 1 );
nread = read_until_newline( fds[0], buf, 1 );
ck_assert_int_eq( '\0', buf[0] );
ck_assert_int_eq( 1, nread );
}
END_TEST
START_TEST( test_read_eof_returns_err )
{
int fds[2];
int nread;
char buf[5] = {0};
pipe( fds );
close( fds[1] );
nread = read_until_newline( fds[0], buf, 5 );
ck_assert_int_eq( -1, nread );
}
END_TEST
START_TEST( test_read_eof_fills_line )
{
int fds[2];
int nread;
char buf[5] = {0};
pipe(fds);
write( fds[1], "1234", 4 );
close( fds[1] );
nread = read_until_newline( fds[0], buf, 5 );
ck_assert_int_eq( -1, nread );
ck_assert_int_eq( '4', buf[3] );
}
END_TEST
START_TEST( test_read_lines_until_blankline )
{
char **lines = NULL;
int fds[2];
int nlines;
pipe( fds );
write( fds[1], "a\nb\nc\n\n", 7 );
nlines = read_lines_until_blankline( fds[0], 256, &lines );
ck_assert_int_eq( 3, nlines );
}
END_TEST
Suite *ioutil_suite(void)
{
Suite *s = suite_create("ioutil");
TCase *tc_read_until_newline = tcase_create("read_until_newline");
TCase *tc_read_lines_until_blankline = tcase_create("read_lines_until_blankline");
tcase_add_test(tc_read_until_newline, test_read_until_newline_returns_line_length_plus_null);
tcase_add_test(tc_read_until_newline, test_read_until_newline_inserts_null);
tcase_add_test(tc_read_until_newline, test_read_empty_line_inserts_null);
tcase_add_test(tc_read_until_newline, test_read_eof_returns_err);
tcase_add_test(tc_read_until_newline, test_read_eof_fills_line );
tcase_add_test(tc_read_lines_until_blankline, test_read_lines_until_blankline );
suite_add_tcase(s, tc_read_until_newline);
suite_add_tcase(s, tc_read_lines_until_blankline);
return s;
}
int main(void)
{
int number_failed;
Suite *s = ioutil_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

57
tests/unit/check_listen.c Normal file

@@ -0,0 +1,57 @@
#include "serve.h"
#include "listen.h"
#include "util.h"
#include "flexnbd.h"
#include <check.h>
#include <string.h>
START_TEST( test_defaults_main_serve_opts )
{
struct flexnbd flexnbd;
struct listen * listen = listen_create( &flexnbd, "127.0.0.1", NULL, "4777", NULL,
"foo", 0, 0, NULL, 1 );
NULLCHECK( listen );
struct server *init_serve = listen->init_serve;
struct server *main_serve = listen->main_serve;
NULLCHECK( init_serve );
NULLCHECK( main_serve );
fail_unless( 0 == memcmp(&init_serve->bind_to,
&main_serve->bind_to,
sizeof( union mysockaddr )),
"Main serve bind_to was not set" );
}
END_TEST
Suite* listen_suite(void)
{
Suite *s = suite_create("listen");
TCase *tc_create = tcase_create("create");
tcase_add_exit_test(tc_create, test_defaults_main_serve_opts, 0);
suite_add_tcase(s, tc_create);
return s;
}
#ifdef DEBUG
# define LOG_LEVEL 0
#else
# define LOG_LEVEL 2
#endif
int main(void)
{
log_level = LOG_LEVEL;
int number_failed;
Suite *s = listen_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

104
tests/unit/check_mbox.c Normal file

@@ -0,0 +1,104 @@
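#ifndef _GNU_SOURCE
# define _GNU_SOURCE /* for pthread_tryjoin_np */
#endif
#include "mbox.h"
#include "util.h"
#include <errno.h> /* EBUSY */
#include <pthread.h>
#include <string.h> /* memset, memcmp */
#include <unistd.h> /* usleep */
#include <check.h>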
START_TEST( test_allocs_cvar )
{
struct mbox * mbox = mbox_create();
fail_if( NULL == mbox, "Nothing allocated" );
pthread_cond_t cond_zero;
/* A freshly inited pthread_cond_t is set to {0}, so it must differ
 * from this reference, which is filled with 'X' bytes.
 */
memset( &cond_zero, 'X', sizeof( cond_zero ) );
fail_if( memcmp( &cond_zero, &mbox->filled_cond, sizeof( cond_zero ) ) == 0 ,
"Condition variable not allocated" );
fail_if( memcmp( &cond_zero, &mbox->emptied_cond, sizeof( cond_zero ) ) == 0 ,
"Condition variable not allocated" );
}
END_TEST
START_TEST( test_post_stores_value )
{
struct mbox * mbox = mbox_create();
void * deadbeef = (void *)0xDEADBEEF;
mbox_post( mbox, deadbeef );
fail_unless( deadbeef == mbox_contents( mbox ),
"Contents were not posted" );
}
END_TEST
void * mbox_receive_runner( void * mbox_uncast )
{
struct mbox * mbox = (struct mbox *)mbox_uncast;
void * contents = NULL;
contents = mbox_receive( mbox );
return contents;
}
START_TEST( test_receive_blocks_until_post )
{
struct mbox * mbox = mbox_create();
pthread_t receiver;
pthread_create( &receiver, NULL, mbox_receive_runner, mbox );
void * deadbeef = (void *)0xDEADBEEF;
void * retval =NULL;
usleep(10000);
fail_unless( EBUSY == pthread_tryjoin_np( receiver, &retval ),
"Receiver thread wasn't blocked");
mbox_post( mbox, deadbeef );
fail_unless( 0 == pthread_join( receiver, &retval ),
"Failed to join the receiver thread" );
fail_unless( retval == deadbeef,
"Return value was wrong" );
}
END_TEST
Suite* mbox_suite(void)
{
Suite *s = suite_create("mbox");
TCase *tc_create = tcase_create("create");
TCase *tc_post = tcase_create("post");
tcase_add_test(tc_create, test_allocs_cvar);
tcase_add_test( tc_post, test_post_stores_value );
tcase_add_test( tc_post, test_receive_blocks_until_post);
suite_add_tcase(s, tc_create);
suite_add_tcase(s, tc_post);
return s;
}
int main(void)
{
#ifdef DEBUG
log_level = 0;
#else
log_level = 2;
#endif
int number_failed;
Suite *s = mbox_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
log_level = 0;
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

242
tests/unit/check_nbdtypes.c Normal file

@@ -0,0 +1,242 @@
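#include <check.h>
#include <endian.h> /* be32toh, be64toh, htobe32, htobe64 */
#include <string.h> /* memcpy, memset, memcmp */
#include "nbdtypes.h"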
START_TEST(test_init_passwd)
{
struct nbd_init_raw init_raw;
struct nbd_init init;
memcpy( init_raw.passwd, INIT_PASSWD, 8 );
nbd_r2h_init( &init_raw, &init );
memset( init_raw.passwd, 0, 8 );
nbd_h2r_init( &init, &init_raw );
fail_unless( memcmp( init.passwd, INIT_PASSWD, 8 ) == 0, "The password was not copied." );
fail_unless( memcmp( init_raw.passwd, INIT_PASSWD, 8 ) == 0, "The password was not copied back." );
}
END_TEST
START_TEST(test_init_magic)
{
struct nbd_init_raw init_raw;
struct nbd_init init;
init_raw.magic = 12345;
nbd_r2h_init( &init_raw, &init );
fail_unless( be64toh( 12345 ) == init.magic, "Magic was not converted." );
init.magic = 67890;
nbd_h2r_init( &init, &init_raw );
fail_unless( htobe64( 67890 ) == init_raw.magic, "Magic was not converted back." );
}
END_TEST
START_TEST(test_init_size)
{
struct nbd_init_raw init_raw;
struct nbd_init init;
init_raw.size = 12345;
nbd_r2h_init( &init_raw, &init );
fail_unless( be64toh( 12345 ) == init.size, "Size was not converted." );
init.size = 67890;
nbd_h2r_init( &init, &init_raw );
fail_unless( htobe64( 67890 ) == init_raw.size, "Size was not converted back." );
}
END_TEST
START_TEST(test_request_magic )
{
struct nbd_request_raw request_raw;
struct nbd_request request;
request_raw.magic = 12345;
nbd_r2h_request( &request_raw, &request );
fail_unless( be32toh( 12345 ) == request.magic, "Magic was not converted." );
request.magic = 67890;
nbd_h2r_request( &request, &request_raw );
fail_unless( htobe32( 67890 ) == request_raw.magic, "Magic was not converted back." );
}
END_TEST
START_TEST(test_request_type )
{
struct nbd_request_raw request_raw;
struct nbd_request request;
request_raw.type = 12345;
nbd_r2h_request( &request_raw, &request );
fail_unless( be32toh( 12345 ) == request.type, "Type was not converted." );
request.type = 67890;
nbd_h2r_request( &request, &request_raw );
fail_unless( htobe32( 67890 ) == request_raw.type, "Type was not converted back." );
}
END_TEST
START_TEST(test_request_handle)
{
struct nbd_request_raw request_raw;
struct nbd_request request;
memcpy( request_raw.handle, "MYHANDLE", 8 );
nbd_r2h_request( &request_raw, &request );
memset( request_raw.handle, 0, 8 );
nbd_h2r_request( &request, &request_raw );
fail_unless( memcmp( request.handle, "MYHANDLE", 8 ) == 0, "The handle was not copied." );
fail_unless( memcmp( request_raw.handle, "MYHANDLE", 8 ) == 0, "The handle was not copied back." );
}
END_TEST
START_TEST(test_request_from )
{
struct nbd_request_raw request_raw;
struct nbd_request request;
request_raw.from = 12345;
nbd_r2h_request( &request_raw, &request );
fail_unless( be64toh( 12345 ) == request.from, "From was not converted." );
request.from = 67890;
nbd_h2r_request( &request, &request_raw );
fail_unless( htobe64( 67890 ) == request_raw.from, "From was not converted back." );
}
END_TEST
START_TEST(test_request_len )
{
struct nbd_request_raw request_raw;
struct nbd_request request;
request_raw.len = 12345;
nbd_r2h_request( &request_raw, &request );
fail_unless( be32toh( 12345 ) == request.len, "Len was not converted." );
request.len = 67890;
nbd_h2r_request( &request, &request_raw );
fail_unless( htobe32( 67890 ) == request_raw.len, "Len was not converted back." );
}
END_TEST
START_TEST(test_reply_magic )
{
struct nbd_reply_raw reply_raw;
struct nbd_reply reply;
reply_raw.magic = 12345;
nbd_r2h_reply( &reply_raw, &reply );
fail_unless( be32toh( 12345 ) == reply.magic, "Magic was not converted." );
reply.magic = 67890;
nbd_h2r_reply( &reply, &reply_raw );
fail_unless( htobe32( 67890 ) == reply_raw.magic, "Magic was not converted back." );
}
END_TEST
START_TEST(test_reply_error )
{
struct nbd_reply_raw reply_raw;
struct nbd_reply reply;
reply_raw.error = 12345;
nbd_r2h_reply( &reply_raw, &reply );
fail_unless( be32toh( 12345 ) == reply.error, "Error was not converted." );
reply.error = 67890;
nbd_h2r_reply( &reply, &reply_raw );
fail_unless( htobe32( 67890 ) == reply_raw.error, "Error was not converted back." );
}
END_TEST
START_TEST(test_reply_handle)
{
struct nbd_reply_raw reply_raw;
struct nbd_reply reply;
memcpy( reply_raw.handle, "MYHANDLE", 8 );
nbd_r2h_reply( &reply_raw, &reply );
memset( reply_raw.handle, 0, 8 );
nbd_h2r_reply( &reply, &reply_raw );
fail_unless( memcmp( reply.handle, "MYHANDLE", 8 ) == 0, "The handle was not copied." );
fail_unless( memcmp( reply_raw.handle, "MYHANDLE", 8 ) == 0, "The handle was not copied back." );
}
END_TEST
START_TEST( test_convert_from )
{
/* Check that we can correctly pull numbers out of an
* nbd_request_raw */
struct nbd_request_raw request_raw;
struct nbd_request request;
/* unsigned so that 0x80 fits whatever the signedness of plain char */
unsigned char readbuf[] = {0x80, 0, 0, 0, 0, 0, 0, 0};
memcpy( &request_raw.from, readbuf, 8 );
nbd_r2h_request( &request_raw, &request );
uint64_t target = 1;
target <<= 63;
fail_unless( target == request.from, "from was wrong" );
}
END_TEST
Suite *nbdtypes_suite(void)
{
Suite *s = suite_create( "nbdtypes" );
TCase *tc_init = tcase_create( "nbd_init" );
TCase *tc_request = tcase_create( "nbd_request" );
TCase *tc_reply = tcase_create( "nbd_reply" );
tcase_add_test( tc_init, test_init_passwd );
tcase_add_test( tc_init, test_init_magic );
tcase_add_test( tc_init, test_init_size );
tcase_add_test( tc_request, test_request_magic );
tcase_add_test( tc_request, test_request_type );
tcase_add_test( tc_request, test_request_handle );
tcase_add_test( tc_request, test_request_from );
tcase_add_test( tc_request, test_request_len );
tcase_add_test( tc_request, test_convert_from );
tcase_add_test( tc_reply, test_reply_magic );
tcase_add_test( tc_reply, test_reply_error );
tcase_add_test( tc_reply, test_reply_handle );
suite_add_tcase( s, tc_init );
suite_add_tcase( s, tc_request );
suite_add_tcase( s, tc_reply );
return s;
}
int main(void)
{
int number_failed;
Suite *s = nbdtypes_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

47
tests/unit/check_parse.c Normal file

@@ -0,0 +1,47 @@
#include "parse.h"
#include "util.h"
#include <check.h>
START_TEST( test_can_parse_ip_address_twice )
{
char ip_address[] = "127.0.0.1";
struct sockaddr saddr;
parse_ip_to_sockaddr( &saddr, ip_address );
parse_ip_to_sockaddr( &saddr, ip_address );
}
END_TEST
Suite* parse_suite(void)
{
Suite *s = suite_create("parse");
TCase *tc_create = tcase_create("ip_to_sockaddr");
tcase_add_test(tc_create, test_can_parse_ip_address_twice);
suite_add_tcase(s, tc_create);
return s;
}
#ifdef DEBUG
# define LOG_LEVEL 0
#else
# define LOG_LEVEL 2
#endif
int main(void)
{
log_level = LOG_LEVEL;
int number_failed;
Suite *s = parse_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}


@@ -0,0 +1,197 @@
#include "readwrite.h"
#include <check.h>
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>
#include <stddef.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include "util.h"
#include "nbdtypes.h"
int fd_read_request( int, struct nbd_request_raw *);
int fd_write_reply( int, char *, int );
int marker;
void error_marker(void * unused __attribute__((unused)),
int fatal __attribute__((unused)))
{
marker = 1;
return;
}
struct respond {
int sock_fds[2]; // server end
int do_fail;
pthread_t thread_id;
pthread_attr_t thread_attr;
struct nbd_request received;
};
void * responder( void *respond_uncast )
{
struct respond * resp = (struct respond *) respond_uncast;
int sock_fd = resp->sock_fds[1];
struct nbd_request_raw request_raw;
char wrong_handle[] = "WHOOPSIE";
if( fd_read_request( sock_fd, &request_raw ) == -1){
fprintf(stderr, "Problem with fd_read_request\n");
} else {
nbd_r2h_request( &request_raw, &resp->received);
if (resp->do_fail){
fd_write_reply( sock_fd, wrong_handle, 0 );
}
else {
fd_write_reply( sock_fd, resp->received.handle, 0 );
}
}
return NULL;
}
struct respond * respond_create( int do_fail )
{
struct respond * respond = (struct respond *)calloc( 1, sizeof( struct respond ) );
socketpair( PF_UNIX, SOCK_STREAM, 0, respond->sock_fds );
respond->do_fail = do_fail;
pthread_attr_init( &respond->thread_attr );
pthread_create( &respond->thread_id, &respond->thread_attr, responder, respond );
return respond;
}
void respond_destroy( struct respond * respond ){
NULLCHECK( respond );
pthread_join( respond->thread_id, NULL );
pthread_attr_destroy( &respond->thread_attr );
close( respond->sock_fds[0] );
close( respond->sock_fds[1] );
free( respond );
}
void * entruster( void * nothing __attribute__((unused)))
{
DECLARE_ERROR_CONTEXT( error_context );
error_set_handler( (cleanup_handler *)error_marker, error_context );
struct respond * respond = respond_create( 1 );
socket_nbd_entrust( respond->sock_fds[0] );
return NULL;
}
START_TEST( test_rejects_mismatched_handle )
{
error_init();
pthread_t entruster_thread;
log_level=5;
marker = 0;
pthread_create( &entruster_thread, NULL, entruster, NULL );
FATAL_UNLESS( 0 == pthread_join( entruster_thread, NULL ), "pthread_join failed");
log_level=2;
fail_unless( marker == 1, "Error handler wasn't called" );
}
END_TEST
START_TEST( test_accepts_matched_handle )
{
struct respond * respond = respond_create( 0 );
socket_nbd_entrust( respond->sock_fds[0] );
respond_destroy( respond );
}
END_TEST
START_TEST( test_entrust_type_sent )
{
struct respond * respond = respond_create( 0 );
socket_nbd_entrust( respond->sock_fds[0] );
fail_unless( respond->received.type == REQUEST_ENTRUST, "Wrong type sent." );
respond_destroy( respond );
}
END_TEST
START_TEST( test_disconnect_doesnt_read_reply )
{
struct respond * respond = respond_create( 1 );
socket_nbd_disconnect( respond->sock_fds[0] );
respond_destroy( respond );
}
END_TEST
Suite* readwrite_suite(void)
{
Suite *s = suite_create("readwrite");
TCase *tc_transfer = tcase_create("entrust");
TCase *tc_disconnect = tcase_create("disconnect");
tcase_add_test(tc_transfer, test_rejects_mismatched_handle);
tcase_add_exit_test(tc_transfer, test_accepts_matched_handle, 0);
tcase_add_test( tc_transfer, test_entrust_type_sent );
/* This test is a little funny. We respond with a dodgy handle
* and check that this *doesn't* cause a message rejection,
* because we want to know that the sender won't even try to
* read the response.
*/
tcase_add_exit_test( tc_disconnect, test_disconnect_doesnt_read_reply,0 );
suite_add_tcase(s, tc_transfer);
suite_add_tcase(s, tc_disconnect);
return s;
}
#ifdef DEBUG
# define LOG_LEVEL 0
#else
# define LOG_LEVEL 2
#endif
int main(void)
{
log_level = LOG_LEVEL;
int number_failed;
Suite *s = readwrite_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
log_level = 0;
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}


@@ -0,0 +1,198 @@
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strerror */
#include <sys/select.h> /* select, fd_set */
#include <unistd.h>
#include <pthread.h>
#include <check.h>
#include <mcheck.h>
#include "self_pipe.h"
START_TEST( test_opens_pipe )
{
struct self_pipe* sig;
char buf[] = " ";
sig = self_pipe_create();
write( sig->write_fd, "1", 1 );
read( sig->read_fd, buf, 1 );
fail_unless( buf[0] == '1', "Pipe does not seem to be open;" );
self_pipe_destroy( sig );
}
END_TEST
void * signal_thread( void * thing )
{
struct self_pipe *sig = (struct self_pipe *)thing;
usleep( 100000 );
self_pipe_signal( sig );
return NULL;
}
pthread_t start_signal_thread( struct self_pipe *sig )
{
pthread_attr_t attr;
pthread_t thread_id;
pthread_attr_init( &attr );
pthread_create( &thread_id, &attr, signal_thread, sig );
pthread_attr_destroy( &attr );
return thread_id;
}
START_TEST( test_signals )
{
struct self_pipe* sig;
fd_set fds;
pthread_t signal_thread_id;
sig = self_pipe_create();
FD_ZERO( &fds );
self_pipe_fd_set( sig, &fds );
signal_thread_id = start_signal_thread( sig );
if ( select( FD_SETSIZE, &fds, NULL, NULL, NULL ) == -1 ) {
fail( "%s", strerror(errno) );
}
self_pipe_signal_clear( sig );
fail_unless( self_pipe_fd_isset( sig, &fds ), "Signalled pipe was not FD_ISSET." );
pthread_join( signal_thread_id, NULL );
self_pipe_destroy( sig );
}
END_TEST
START_TEST( test_clear_returns_immediately )
{
struct self_pipe *sig;
sig = self_pipe_create();
fail_unless( 0 == self_pipe_signal_clear( sig ), "Wrong clear result." );
}
END_TEST
START_TEST( test_destroy_closes_read_pipe )
{
struct self_pipe* sig;
ssize_t read_len;
int orig_read_fd;
sig = self_pipe_create();
orig_read_fd = sig->read_fd;
self_pipe_destroy( sig );
while( (read_len = read( orig_read_fd, "", 0 )) == -1 && errno == EINTR );
switch( read_len ) {
case 0:
fail("The read fd wasn't closed." );
break;
case -1:
switch(errno) {
case EBADF:
/* This is what we want */
break;
case EAGAIN:
fail( "The read fd wasn't closed." );
break;
default:
fail( "%s", strerror( errno ) );
break;
}
break;
default:
fail( "The read fd wasn't closed, and had data in it." );
break;
}
}
END_TEST
START_TEST( test_destroy_closes_write_pipe )
{
struct self_pipe * sig;
ssize_t write_len;
int orig_write_fd;
sig = self_pipe_create();
orig_write_fd = sig->write_fd;
self_pipe_destroy( sig );
while ( ( write_len = write( orig_write_fd, "", 0 ) ) == -1 && errno == EINTR );
switch( write_len ) {
case 0:
fail( "The write fd wasn't closed." );
break;
case -1:
switch( errno ) {
case EPIPE:
case EBADF:
/* This is what we want */
break;
case EAGAIN:
fail("The write fd wasn't closed." );
break;
default:
fail( "%s", strerror( errno ) );
break;
}
break;
default:
/* To get here, the write(_,_,0) would have to
* write some bytes.
*/
fail( "The write fd wasn't closed, and something REALLY WEIRD is going on." );
break;
}
}
END_TEST
Suite *self_pipe_suite(void)
{
Suite *s = suite_create("self_pipe");
TCase *tc_create = tcase_create("create");
TCase *tc_signal = tcase_create("signal");
TCase *tc_destroy = tcase_create("destroy");
tcase_add_test(tc_create, test_opens_pipe);
tcase_add_test(tc_signal, test_signals );
tcase_add_test(tc_signal, test_clear_returns_immediately );
tcase_add_test(tc_destroy, test_destroy_closes_read_pipe );
tcase_add_test(tc_destroy, test_destroy_closes_write_pipe );
/* We don't test that destroy free()'s the self_pipe pointer because
* that'll be caught by valgrind.
*/
suite_add_tcase(s, tc_create);
suite_add_tcase(s, tc_signal);
suite_add_tcase(s, tc_destroy);
return s;
}
int main(void)
{
int number_failed;
Suite *s = self_pipe_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}

264
tests/unit/check_serve.c Normal file

@@ -0,0 +1,264 @@
#include "serve.h"
#include "util.h"
#include "self_pipe.h"
#include "client.h"
#include "flexnbd.h"
#include <stdlib.h>
#include <string.h> /* memset */
#include <unistd.h> /* getpid, unlink, close */
#include <check.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <sys/stat.h>
#include <fcntl.h>
#ifdef DEBUG
# define LOG_LEVEL 0
#else
# define LOG_LEVEL 2
#endif
/* Need these because libcheck is braindead and doesn't
* run teardown after a failing test
*/
#define myfail( msg ) do { teardown(); fail(msg); } while (0)
#define myfail_if( tst, msg ) do { if( tst ) { myfail( msg ); } } while (0)
#define myfail_unless( tst, msg ) myfail_if( !(tst), msg )
char * dummy_file;
char *make_tmpfile(void)
{
FILE *fp;
char *fn_buf;
char leader[] = "/tmp/check_serve";
/* calloc so the 1024 bytes written below are all initialised */
fn_buf = (char *)calloc( 1024, 1 );
snprintf( fn_buf, 1024, "%s%d", leader, getpid() );
fp = fopen( fn_buf, "w" );
fwrite( fn_buf, 1024, 1, fp );
fclose( fp );
return fn_buf;
}
void setup( void )
{
dummy_file = make_tmpfile();
}
void teardown( void )
{
if( dummy_file ){ unlink( dummy_file ); }
free( dummy_file );
dummy_file = NULL;
}
START_TEST( test_replaces_acl )
{
struct flexnbd flexnbd;
flexnbd.signal_fd = -1;
struct server * s = server_create( &flexnbd, "127.0.0.1", "0", dummy_file, 0, 0, NULL, 1, 1 );
struct acl * new_acl = acl_create( 0, NULL, 0 );
server_replace_acl( s, new_acl );
myfail_unless( s->acl == new_acl, "ACL wasn't replaced." );
server_destroy( s );
}
END_TEST
START_TEST( test_signals_acl_updated )
{
struct flexnbd flexnbd;
flexnbd.signal_fd = -1;
struct server * s = server_create( &flexnbd, "127.0.0.1", "0", dummy_file, 0, 0, NULL, 1, 1 );
struct acl * new_acl = acl_create( 0, NULL, 0 );
server_replace_acl( s, new_acl );
myfail_unless( 1 == self_pipe_signal_clear( s->acl_updated_signal ),
"No signal sent." );
server_destroy( s );
}
END_TEST
int connect_client( char *addr, int actual_port, char *source_addr )
{
int client_fd;
struct addrinfo hint;
struct addrinfo *ailist, *aip;
memset( &hint, '\0', sizeof( struct addrinfo ) );
hint.ai_socktype = SOCK_STREAM;
myfail_if( getaddrinfo( addr, NULL, &hint, &ailist ) != 0, "getaddrinfo failed." );
int connected = 0;
for( aip = ailist; aip; aip = aip->ai_next ) {
((struct sockaddr_in *)aip->ai_addr)->sin_port = htons( actual_port );
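client_fd = socket( aip->ai_family, aip->ai_socktype, aip->ai_protocol );
/* Check the descriptor before it is used by bind() or close(). */
if( client_fd == -1) { continue; }
if (source_addr) {
/* The storage must be large enough for an IPv6 address, because
 * that is the length passed to bind() below.
 */
struct sockaddr_storage src;
if( !parse_ip_to_sockaddr((struct sockaddr *)&src, source_addr)) {
close(client_fd);
continue;
}
bind(client_fd, (struct sockaddr *)&src, sizeof(struct sockaddr_in6));
}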
if( connect( client_fd, aip->ai_addr, aip->ai_addrlen) == 0 ) {
connected = 1;
break;
}
close( client_fd );
}
freeaddrinfo( ailist );
myfail_unless( connected, "Didn't connect." );
return client_fd;
}
/* These are "internal" functions we need for the following test. We
* shouldn't need them but there's no other way at the moment. */
void serve_open_server_socket( struct server * );
int server_port( struct server * );
void server_accept( struct server * );
int fd_is_closed( int );
void server_close_clients( struct server * );
START_TEST( test_acl_update_closes_bad_client )
{
/* This is the wrong way round. Rather than pulling the thread
* and socket out of the server structure, we should be testing
* a client socket.
*/
struct flexnbd flexnbd;
flexnbd.signal_fd = -1;
struct server * s = server_create( &flexnbd, "127.0.0.7", "0", dummy_file, 0, 0, NULL, 1, 1 );
struct acl * new_acl = acl_create( 0, NULL, 1 );
struct client * c;
struct client_tbl_entry * entry;
int actual_port;
int client_fd;
int server_fd;
serve_open_server_socket( s );
actual_port = server_port( s );
client_fd = connect_client( "127.0.0.7", actual_port, "127.0.0.1" );
server_accept( s );
entry = &s->nbd_client[0];
c = entry->client;
/* At this point there should be an entry in the nbd_clients
* table and a background thread to run the client loop
*/
myfail_if( entry->thread == 0, "No client thread was started." );
server_fd = c->socket;
myfail_if( fd_is_closed(server_fd),
"Sanity check failed - client socket wasn't open." );
server_replace_acl( s, new_acl );
/* accept again, so that we can react to the acl replacement signal */
server_accept( s );
/* Fail if we time out here */
while( !fd_is_closed( server_fd ) );
close( client_fd );
server_close_clients( s );
server_destroy( s );
}
END_TEST
START_TEST( test_acl_update_leaves_good_client )
{
struct flexnbd flexnbd;
flexnbd.signal_fd = -1;
struct server * s = server_create( &flexnbd, "127.0.0.7", "0", dummy_file, 0, 0, NULL, 1, 1 );
char *lines[] = {"127.0.0.1"};
struct acl * new_acl = acl_create( 1, lines, 1 );
struct client * c;
struct client_tbl_entry * entry;
int actual_port;
int client_fd;
int server_fd;
serve_open_server_socket( s );
actual_port = server_port( s );
client_fd = connect_client( "127.0.0.7", actual_port, "127.0.0.1" );
server_accept( s );
entry = &s->nbd_client[0];
c = entry->client;
/* At this point there should be an entry in the nbd_clients
* table and a background thread to run the client loop
*/
myfail_if( entry->thread == 0, "No client thread was started." );
server_fd = c->socket;
myfail_if( fd_is_closed(server_fd),
"Sanity check failed - client socket wasn't open." );
server_replace_acl( s, new_acl );
server_accept( s );
myfail_if( self_pipe_signal_clear( c->stop_signal ),
"Client was told to stop." );
close( client_fd );
server_close_clients( s );
server_destroy( s );
}
END_TEST
Suite* serve_suite(void)
{
Suite *s = suite_create("serve");
TCase *tc_acl_update = tcase_create("acl_update");
tcase_add_checked_fixture( tc_acl_update, setup, NULL );
tcase_add_test(tc_acl_update, test_replaces_acl);
tcase_add_test(tc_acl_update, test_signals_acl_updated);
tcase_add_exit_test(tc_acl_update, test_acl_update_closes_bad_client, 0);
tcase_add_exit_test(tc_acl_update, test_acl_update_leaves_good_client, 0);
suite_add_tcase(s, tc_acl_update);
return s;
}
int main(void)
{
log_level = LOG_LEVEL;
error_init();
int number_failed;
Suite *s = serve_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}
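
Note on the source-address handling in connect_client(): the length given to bind() has to match storage that can actually hold the address. The following is a minimal sketch of that idea, assuming numeric addresses only. `parse_source_addr` is a hypothetical helper written for illustration; it is not flexnbd's `parse_ip_to_sockaddr`, whose behaviour is not shown in this diff.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of flexnbd): parse a numeric IPv4 or IPv6
 * address into storage that is large enough for either family, and report
 * the length that bind() should be given.
 * Returns 1 on success, 0 if the string is not a numeric address.
 */
int parse_source_addr( const char *ip, struct sockaddr_storage *out, socklen_t *len )
{
    struct sockaddr_in  *v4 = (struct sockaddr_in *)out;
    struct sockaddr_in6 *v6 = (struct sockaddr_in6 *)out;

    assert( ip != NULL && out != NULL && len != NULL );

    memset( out, 0, sizeof( *out ) );
    if ( inet_pton( AF_INET, ip, &v4->sin_addr ) == 1 ) {
        v4->sin_family = AF_INET;
        *len = sizeof( struct sockaddr_in );
        return 1;
    }

    memset( out, 0, sizeof( *out ) );
    if ( inet_pton( AF_INET6, ip, &v6->sin6_addr ) == 1 ) {
        v6->sin6_family = AF_INET6;
        *len = sizeof( struct sockaddr_in6 );
        return 1;
    }

    return 0;
}
```

A caller would then pass `(struct sockaddr *)&storage` and the reported length to bind(), so the length always agrees with the address family that was parsed.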

tests/unit/check_status.c Normal file

@@ -0,0 +1,139 @@
#include "status.h"
#include "serve.h"
#include "ioutil.h"
#include "util.h"
#include <check.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
START_TEST( test_status_create )
{
/* Zero-initialise so status_create() never reads indeterminate fields. */
struct server server = {0};
struct status *status = NULL;
status = status_create( &server );
fail_if( NULL == status, "Status wasn't allocated" );
status_destroy( status );
}
END_TEST
START_TEST( test_gets_has_control )
{
struct server server = {0};
struct status * status;
server.has_control = 1;
status = status_create( &server );
fail_unless( status->has_control == 1, "has_control wasn't copied" );
status_destroy( status );
}
END_TEST
START_TEST( test_gets_is_mirroring )
{
struct server server = {0};
struct status * status;
server.mirror = NULL;
status = status_create( &server );
fail_if( status->is_mirroring, "is_mirroring was set" );
status_destroy( status );
server.mirror = (struct mirror *)xmalloc( sizeof( struct mirror ) );
status = status_create( &server );
fail_unless( status->is_mirroring, "is_mirroring wasn't set" );
status_destroy( status );
free( server.mirror );
}
END_TEST
START_TEST( test_renders_has_control )
{
/* Zero-initialise so the fields this test doesn't set are still defined. */
struct status status = {0};
int fds[2];
pipe(fds);
char buf[1024] = {0};
status.has_control = 1;
status_write( &status, fds[1] );
fail_unless( read_until_newline( fds[0], buf, 1024 ) > 0,
"Couldn't read the result" );
char *found = strstr( buf, "has_control=true" );
fail_if( NULL == found, "has_control=true not found" );
status.has_control = 0;
status_write( &status, fds[1] );
fail_unless( read_until_newline( fds[0], buf, 1024 ) > 0,
"Couldn't read the result" );
found = strstr( buf, "has_control=false" );
fail_if( NULL == found, "has_control=false not found" );
close( fds[0] );
close( fds[1] );
}
END_TEST
START_TEST( test_renders_is_mirroring )
{
struct status status = {0};
int fds[2];
pipe(fds);
char buf[1024] = {0};
status.is_mirroring = 1;
status_write( &status, fds[1] );
fail_unless( read_until_newline( fds[0], buf, 1024 ) > 0,
"Couldn't read the result" );
char *found = strstr( buf, "is_mirroring=true" );
fail_if( NULL == found, "is_mirroring=true not found" );
status.is_mirroring = 0;
status_write( &status, fds[1] );
fail_unless( read_until_newline( fds[0], buf, 1024 ) > 0,
"Couldn't read the result" );
found = strstr( buf, "is_mirroring=false" );
fail_if( NULL == found, "is_mirroring=false not found" );
close( fds[0] );
close( fds[1] );
}
END_TEST
Suite *status_suite(void)
{
Suite *s = suite_create("status");
TCase *tc_create = tcase_create("create");
TCase *tc_render = tcase_create("render");
tcase_add_test(tc_create, test_status_create);
tcase_add_test(tc_create, test_gets_has_control);
tcase_add_test(tc_create, test_gets_is_mirroring);
tcase_add_test(tc_render, test_renders_has_control);
tcase_add_test(tc_render, test_renders_is_mirroring);
suite_add_tcase(s, tc_create);
suite_add_tcase(s, tc_render);
return s;
}
int main(void)
{
int number_failed;
Suite *s = status_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}
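
Note on the render tests: they write a status line into a pipe and search what comes back for a `key=value` pair. The round trip can be sketched in isolation as below. `write_flag`, `read_line` and `rendered_line_contains` are hypothetical stand-ins written for this note; they are assumptions, not the `status_write` or `read_until_newline` functions from this commit, and the `name=true` / `name=false` format is inferred only from the strings the tests look for.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical renderer: writes "name=true\n" or "name=false\n" to fd.
 * Returns 0 on success, -1 on failure.
 */
int write_flag( int fd, const char *name, int value )
{
    char line[128];
    int n = snprintf( line, sizeof( line ), "%s=%s\n", name, value ? "true" : "false" );
    if ( n < 0 || (size_t)n >= sizeof( line ) ) { return -1; }
    return write( fd, line, (size_t)n ) == (ssize_t)n ? 0 : -1;
}

/* Hypothetical reader: reads one byte at a time, up to and including the
 * first newline. buf is always NUL-terminated. Returns the bytes read.
 */
ssize_t read_line( int fd, char *buf, size_t size )
{
    size_t used = 0;
    assert( size > 0 );
    while ( used + 1 < size ) {
        ssize_t r = read( fd, &buf[used], 1 );
        if ( r <= 0 ) { break; }
        if ( buf[used++] == '\n' ) { break; }
    }
    buf[used] = '\0';
    return (ssize_t)used;
}

/* Render a flag into a pipe, read the line back and search it.
 * Both pipe ends are closed before returning.
 */
int rendered_line_contains( const char *name, int value, const char *expected )
{
    int fds[2];
    char buf[128];
    int found;

    assert( name != NULL && expected != NULL );
    if ( pipe( fds ) != 0 ) { return 0; }

    found = write_flag( fds[1], name, value ) == 0
        && read_line( fds[0], buf, sizeof( buf ) ) > 0
        && strstr( buf, expected ) != NULL;

    close( fds[0] );
    close( fds[1] );
    return found;
}
```

Keeping the pipe inside one helper means every path closes both descriptors, which matters when many such checks run in a single test process.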

tests/unit/check_util.c Normal file

@@ -0,0 +1,172 @@
#include "util.h"
#include "self_pipe.h"
#include <check.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
struct cleanup_bucket {
struct self_pipe *called_signal;
};
struct cleanup_bucket bkt;
void bucket_init(void){
if ( bkt.called_signal ) {
self_pipe_destroy( bkt.called_signal );
}
bkt.called_signal = self_pipe_create();
}
void setup(void)
{
bkt.called_signal = NULL;
}
int handler_called(void)
{
return self_pipe_signal_clear( bkt.called_signal );
}
void dummy_cleanup( struct cleanup_bucket * foo, int fatal __attribute__((unused)) )
{
if (NULL != foo){
self_pipe_signal( foo->called_signal );
}
}
void trigger_fatal(void)
{
error_init();
error_set_handler( (cleanup_handler*) dummy_cleanup, &bkt );
log_level = 5;
fatal("Expected fatal error");
}
void trigger_error( void )
{
error_init();
error_set_handler( (cleanup_handler *) dummy_cleanup, &bkt);
log_level = 4;
error("Expected error");
}
START_TEST( test_fatal_kills_process )
{
pid_t pid;
pid = fork();
if ( pid == 0 ) {
trigger_fatal();
/* If we get here, just block so the test timeout fails
* us */
sleep(10);
}
else {
int kidstatus;
int result;
result = waitpid( pid, &kidstatus, 0 );
fail_if( result < 0, "Wait failed." );
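/* Decode the wait status rather than comparing the raw value, which
 * also carries the core-dump flag.
 */
fail_unless( WIFSIGNALED( kidstatus ) && WTERMSIG( kidstatus ) == SIGABRT,
"Kid was not aborted." );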
}
}
END_TEST
void * error_thread( void *nothing __attribute__((unused)) )
{
trigger_error();
return NULL;
}
START_TEST( test_error_doesnt_kill_process )
{
bucket_init();
pthread_attr_t attr;
pthread_t tid;
pthread_attr_init( &attr );
pthread_create( &tid, &attr, error_thread, NULL );
pthread_join( tid, NULL );
}
END_TEST
START_TEST( test_error_calls_handler )
{
bucket_init();
pthread_attr_t attr;
pthread_t tid;
pthread_attr_init( &attr );
pthread_create( &tid, &attr, error_thread, NULL );
pthread_join( tid, NULL );
fail_unless( handler_called(), "Handler wasn't called." );
}
END_TEST
START_TEST( test_fatal_doesnt_call_handler )
{
bucket_init();
pid_t kidpid;
kidpid = fork();
if ( kidpid == 0 ) {
trigger_fatal();
}
else {
int kidstatus;
int result = waitpid( kidpid, &kidstatus, 0 );
fail_if( result < 0, "Wait failed" );
fail_if( handler_called(), "Handler was called.");
}
}
END_TEST
Suite* error_suite(void)
{
Suite *s = suite_create("error");
TCase *tc_process = tcase_create("process");
TCase *tc_handler = tcase_create("handler");
tcase_add_checked_fixture( tc_process, setup, NULL );
tcase_add_test(tc_process, test_fatal_kills_process);
tcase_add_test(tc_process, test_error_doesnt_kill_process);
tcase_add_test(tc_handler, test_error_calls_handler );
tcase_add_test(tc_handler, test_fatal_doesnt_call_handler);
suite_add_tcase(s, tc_process);
suite_add_tcase(s, tc_handler);
return s;
}
int main(void)
{
int number_failed;
Suite *s = error_suite();
SRunner *sr = srunner_create(s);
srunner_run_all(sr, CK_NORMAL);
number_failed = srunner_ntests_failed(sr);
srunner_free(sr);
return (number_failed == 0) ? 0 : 1;
}
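
Note on test_fatal_kills_process: the value filled in by waitpid() is an encoded status, so the terminating signal is best extracted with the POSIX macros. A minimal sketch, assuming a child that simply calls abort(); `child_termination_signal`, `call_abort` and `exit_cleanly` are hypothetical names for illustration and do not exist in flexnbd.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Two sample child bodies for the sketch. */
void call_abort( void )   { abort(); }
void exit_cleanly( void ) { _exit( 0 ); }

/* Run fn in a forked child and wait for it.
 * Returns the number of the signal that killed the child, 0 if it
 * exited normally, or -1 if fork() or waitpid() failed.
 */
int child_termination_signal( void (*fn)( void ) )
{
    int status;
    pid_t pid;

    assert( fn != NULL );

    pid = fork();
    if ( pid < 0 ) { return -1; }
    if ( pid == 0 ) {
        fn();
        /* Only reached if fn returns; never fall back into the caller. */
        _exit( 0 );
    }

    if ( waitpid( pid, &status, 0 ) < 0 ) { return -1; }
    return WIFSIGNALED( status ) ? WTERMSIG( status ) : 0;
}
```

For example, `child_termination_signal( call_abort )` yields SIGABRT whether or not the child dumped core, whereas comparing the raw status against 6 only matches when no core file was written.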