Add parallel transaction support for remote source or target specifications.
The implementation is a bit crude because I don't want to take too many
chances on a codebase that wasn't originally designed to be multi-threaded,
so the master mutex is only released when a thread is waiting for input
on a socket.
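
As a rough illustration of the locking scheme described above (not the
actual cpdup code), the sketch below holds one global mutex for all work
and drops it only around a blocking read. The names master_mutex and
read_unlocked are hypothetical and chosen for this example only.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name: a single global lock that serialises all threads. */
static pthread_mutex_t master_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Read from fd while temporarily giving up the master mutex.
 * The caller must hold master_mutex on entry and holds it again on return,
 * so the rest of the code can keep assuming it runs single-threaded.
 */
static ssize_t
read_unlocked(int fd, void *buf, size_t len)
{
    ssize_t n;

    pthread_mutex_unlock(&master_mutex);   /* let other threads run */
    n = read(fd, buf, len);                /* may block waiting on the peer */
    pthread_mutex_lock(&master_mutex);     /* re-enter the serialised region */
    return n;
}
```

With this pattern only one thread at a time touches shared state, but
several threads can have requests outstanding on the socket at once, which
is where the gain for remote operation would come from.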
* Add the -p<threads> option and compile with -pthread by default. This is
only useful when the source and/or destination is a remote host, where it
greatly improves cpdup's performance. Note that parallel transaction mode
will not work with older cpdup binaries on the remote end.
* Add -l to force stdout and stderr to be line-buffered.
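
For reference, forcing line buffering in C is normally done with setvbuf().
The sketch below shows one way an option like -l could be implemented; the
helper name force_line_buffering is hypothetical and this is not necessarily
how cpdup does it.

```c
#include <stdio.h>

/*
 * Hypothetical helper: switch stdout and stderr to line buffering.
 * Should be called before any output is written to either stream.
 * Returns 0 on success, -1 if either setvbuf() call fails.
 */
static int
force_line_buffering(void)
{
    /* A NULL buffer lets stdio allocate its own; _IOLBF flushes per line. */
    if (setvbuf(stdout, NULL, _IOLBF, 0) != 0)
        return -1;
    if (setvbuf(stderr, NULL, _IOLBF, 0) != 0)
        return -1;
    return 0;
}
```

Line buffering matters mostly when the output is piped or redirected, since
stdout is otherwise fully buffered in that case and progress lines can show
up late.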