MT-write - multi-threaded writing patch

What is MT-write?

MT-write enables tar and unzip to write multiple files in parallel. The goal is to speed up the process of extracting an archive to a filesystem. Solaris' ZFS filesystem in particular benefits from MT-write's strategy. The amount of performance improvement depends strongly on the machine and its filesystem. In my experience, tar extraction is faster by a factor of two to three, i.e. the extraction time of a tar file is cut to between a half and a third. Be aware that this experience is based on experiments on an UltraSPARC machine with SCSI disks. Feel free to send me the results of your own performance experiments, especially if you have further hints for tuning.

How does MT-write work?

MT-write intercepts the executable's calls to write and friends and hands them off to worker threads. As a consequence, the executable is unaware of any errors that occur while the data is written. If an error occurs, mtwrite.so writes an error message to the standard error output of the executable.
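
As a rough illustration of the hand-off idea, here is a minimal sketch in C. It is not taken from mtwrite.so: the names handoff_write and handoff_drain are hypothetical, it uses a single worker thread with a FIFO queue so that the order of writes is preserved, and it does not perform the actual interception of write. It only shows why the caller sees success immediately and why errors can only be reported on standard error.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* One queued write: a private copy of the caller's data. */
struct job {
    struct job *next;
    int fd;
    size_t len;
    char data[];
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;
static struct job *head, *tail;
static int busy;     /* 1 while the worker is writing a job */
static int started;  /* 1 once the worker thread exists */

/* Worker: takes jobs in FIFO order and performs the real write(). */
static void *worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (head == NULL)
            pthread_cond_wait(&changed, &lock);
        struct job *j = head;
        head = j->next;
        if (head == NULL)
            tail = NULL;
        busy = 1;
        pthread_mutex_unlock(&lock);

        size_t done = 0;
        while (done < j->len) {
            ssize_t n = write(j->fd, j->data + done, j->len - done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                /* The caller has already returned, so stderr is the only channel left. */
                fprintf(stderr, "handoff: write to fd %d failed: %s\n",
                        j->fd, strerror(errno));
                break;
            }
            done += (size_t)n;
        }
        free(j);

        pthread_mutex_lock(&lock);
        busy = 0;
        pthread_cond_broadcast(&changed);
    }
    return NULL;
}

/* Wait until every queued write has been carried out. */
void handoff_drain(void)
{
    pthread_mutex_lock(&lock);
    while (head != NULL || busy)
        pthread_cond_wait(&changed, &lock);
    pthread_mutex_unlock(&lock);
}

/* Copy the data, queue it, and report success right away. */
ssize_t handoff_write(int fd, const void *buf, size_t len)
{
    struct job *j = malloc(sizeof *j + len);
    if (j == NULL) {
        handoff_drain();             /* keep ordering, then write directly */
        return write(fd, buf, len);
    }
    j->next = NULL;
    j->fd = fd;
    j->len = len;
    if (len > 0)
        memcpy(j->data, buf, len);

    pthread_mutex_lock(&lock);
    if (!started) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0) {
            pthread_mutex_unlock(&lock);
            free(j);
            return write(fd, buf, len);  /* no worker: queue is empty, write directly */
        }
        pthread_detach(t);
        started = 1;
    }
    if (tail != NULL)
        tail->next = j;
    else
        head = j;
    tail = j;
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
    return (ssize_t)len;
}
```

A preloadable library built along these lines would additionally have to wrap write itself (typically by looking up the original with dlsym and RTLD_NEXT) and make sure pending data is written before a file descriptor is closed. How mtwrite.so distributes files across its worker threads is not covered by this sketch.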

How big is the benefit?

That really depends. On my dual-processor UltraSPARC running Solaris 10, using ZFS on two fast 10k RPM SCSI disks set up as a mirror, the benefit for mtstar is at least a factor of two to three. I have even seen a performance advantage of a factor of eight. On a more conventional filesystem (e.g. UFS) the results differ: I have seen both small performance regressions and minor performance improvements.

The key to performance tuning with MT-write is finding the appropriate maximum number of threads for your setup. Initially, I shipped mtwrite.so with a default maximum of 16 threads. I doubled this value to 32 for R70317, as I had seen further improvements with more threads.

If you are extracting very big files, mtwrite will consume a lot of memory. You can limit its hunger by setting the environment variable MTWRITE_MEMPERTHREAD. It specifies the maximum amount of memory available to each worker thread. If there isn't enough memory left, write()s are issued directly by the write initiator (the calling thread) instead of copying the data to a worker thread.
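
The fallback rule can be pictured with a small accounting sketch in C. This is an assumption about how such a limit might be tracked, not mtwrite.so's actual code: struct budget and the budget_* functions are hypothetical names, and the unit in which MTWRITE_MEMPERTHREAD is given is not stated here, so the sketch simply counts bytes.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-worker memory budget, counted in bytes. */
struct budget {
    pthread_mutex_t lock;
    size_t limit;  /* maximum bytes that may be buffered for this worker */
    size_t used;   /* bytes currently buffered and not yet written */
};

void budget_init(struct budget *b, size_t limit)
{
    pthread_mutex_init(&b->lock, NULL);
    b->limit = limit;
    b->used = 0;
}

/* Returns 1 and reserves len bytes if they fit into the budget.
 * Returns 0 if not; the initiator should then call write() itself. */
int budget_reserve(struct budget *b, size_t len)
{
    int ok = 0;
    pthread_mutex_lock(&b->lock);
    if (len <= b->limit - b->used) {
        b->used += len;
        ok = 1;
    }
    pthread_mutex_unlock(&b->lock);
    return ok;
}

/* Called by the worker once a buffered block has been written and freed. */
void budget_release(struct budget *b, size_t len)
{
    pthread_mutex_lock(&b->lock);
    b->used -= len;
    pthread_mutex_unlock(&b->lock);
}
```

With a budget of 100 bytes, a first reservation of 60 bytes succeeds and a second one of 60 bytes is refused until the worker has released the first block. The sketch covers only the accounting; a direct write also has to be ordered correctly with respect to data that is still queued for the same file.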

How do I use it?

mtwrite.so is a preloadable shared object designed to be preloaded into the tar executable, e.g.:
$ env LD_PRELOAD=mtwrite.so tar xf mytarfile.tar
For the programs that I have tested, wrapper scripts with the prefix "mt" are included for your convenience. So you can simply type:
$ mttar xf mytarfile.tar

Where can I get it?

Here:
MT-write R100528
MT-write R90214
MT-write R80423
MT-write R80417
MT-write R70317
MT-write R70225
MT-write R70210
MT-write R70206
MT-write R70125

LICENSE:

GNU LGPL 2
Thomas Maier-Komor