MT-write - multi-threaded writing patch
What is MT-write?
MT-write enables tar and unzip to write multiple files in parallel.
The goal is to speed up the process of extracting a tar file to a
filesystem. Solaris' ZFS filesystem in particular is able to benefit from
MT-write's strategy. The amount of performance improvement strongly
depends on the machine and its filesystem. In my experience the
performance improvement with tar is in the area of a factor of two to
three - i.e. the extraction time of a tar file is at least cut in
half. Be aware that this experience is based on experiments on an
UltraSPARC machine with SCSI disks.
Feel free to send me your results of performance experiments -
especially if you have further hints for tuning.
How does MT-write work?
MT-write intercepts the executable's calls to write and friends and
hands them off to worker threads. As a consequence, the executable will
be unaware of any errors that occur. If an error occurs, mtwrite.so
writes an error message to the standard error output of the executable.
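The interception step described above can be illustrated with a minimal
sketch. It is not the mtwrite.so source: it assumes a glibc-style
dlsym(RTLD_NEXT, ...) lookup, covers only write (not its "friends"), and
leaves out the handoff to worker threads entirely. It forwards each call
to the real write and reports failures on standard error. The helper name
real_write and the message text are made up for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for RTLD_NEXT on glibc */
#endif
#include <dlfcn.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*write_fn)(int, const void *, size_t);

/* Look up the next definition of write() in the search order
 * (normally libc's). Hypothetical helper, resolved lazily. */
static write_fn real_write(void)
{
    static write_fn fn;
    if (fn == NULL)
        fn = (write_fn)dlsym(RTLD_NEXT, "write");
    return fn;
}

/* Interposed write(): a preloaded object that defines this symbol is
 * found before libc, so the executable's calls end up here. */
ssize_t write(int fd, const void *buf, size_t count)
{
    write_fn next = real_write();
    if (next == NULL) {
        errno = ENOSYS;
        return -1;
    }

    /* A multi-threaded version would queue the data for a worker
     * thread here; this sketch simply forwards the call. */
    ssize_t n = next(fd, buf, count);

    if (n < 0 && fd != STDERR_FILENO) {
        /* Report the failure on standard error via the real write,
         * then restore errno for the caller. */
        int saved = errno;
        const char *msg = "write sketch: write failed: ";
        const char *why = strerror(saved);
        next(STDERR_FILENO, msg, strlen(msg));
        next(STDERR_FILENO, why, strlen(why));
        next(STDERR_FILENO, "\n", 1);
        errno = saved;
    }
    return n;
}
```

Built as a shared object (with gcc, for example, using -shared -fPIC) and
named in LD_PRELOAD, such an object sits between the executable and libc.
Once the actual write is deferred to a worker thread, the caller has
already received its return value, which is why errors can only be
reported on standard error.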
How big is the benefit?
Well, that really depends. On my dual-processor UltraSPARC running
Solaris 10, using ZFS on two fast 10k RPM SCSI disks set up as a mirror,
the benefit is at least a factor of two to three for
mtstar. I have even experienced a performance advantage of a
factor of eight. On a more regular filesystem (e.g. UFS) the results
differ - I have seen small performance regressions and minor performance
enhancements.
The key to performance tuning with MT-write is finding the appropriate
maximum number of threads for your setup. Initially, I shipped
mtwrite.so with a default maximum of 16 threads. I doubled this value
to 32 for R70317, as I have seen further improvements with more threads.
If you are extracting very big files, mtwrite will consume a lot of
memory. You can limit its hunger by setting
MTWRITE_MEMPERTHREAD. It reflects the maximum amount of memory
available to each worker thread. If there isn't enough memory left,
write()s are issued directly by the write initiator (the calling thread)
instead of copying the data to a worker thread.
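The fallback rule above can be sketched as a small per-thread budget
check. This is only an illustration under assumptions, not the mtwrite.so
implementation: the names worker_budget, budget_copy and budget_release
are hypothetical, the limit is taken to be in bytes, and how
MTWRITE_MEMPERTHREAD is parsed is left out because its unit is not
stated here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-worker memory accounting. */
struct worker_budget {
    size_t limit;   /* maximum bytes this worker may hold (assumed unit) */
    size_t in_use;  /* bytes currently copied and not yet written */
};

/* Try to copy the caller's data for a worker thread.
 * Returns a heap copy if it fits in the remaining budget, or NULL,
 * in which case the caller should issue the write() itself. */
void *budget_copy(struct worker_budget *b, const void *buf, size_t count)
{
    /* in_use never exceeds limit, so the subtraction cannot wrap. */
    if (count == 0 || count > b->limit - b->in_use)
        return NULL;

    void *copy = malloc(count);
    if (copy == NULL)
        return NULL;            /* out of memory: also fall back */

    memcpy(copy, buf, count);
    b->in_use += count;
    return copy;
}

/* Called by the worker after it has written the copied data. */
void budget_release(struct worker_budget *b, void *copy, size_t count)
{
    free(copy);
    b->in_use -= count;
}
```

With a limit of 8 bytes, for example, a 5-byte chunk is copied, a
following 4-byte chunk is refused until the first has been released, and
the refused chunk would be written directly by the calling thread.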
How do I use it?
mtwrite.so is a preloadable shared object designed to be preloaded into
the tar executable, e.g.:
$ env LD_PRELOAD=mtwrite.so tar xf mytarfile.tar
For the programs that I have tested, wrapper scripts with the prefix
"mt" are included for your convenience. So you can simply type:
$ mttar xf mytarfile.tar
Where can I get it?
Here:
MT-write R100528
MT-write R90214
MT-write R80423
MT-write R80417
MT-write R70317
MT-write R70225
MT-write R70210
MT-write R70206
MT-write R70125
LICENSE:
GNU LGPL 2
Thomas Maier-Komor