Pixz (pronounced 'pixie') is a parallel, indexing version of XZ
Repository: https://github.com/vasi/pixz
Downloads: https://sourceforge.net/projects/pixz/files/
The existing XZ Utils (http://tukaani.org/xz/) provide great compression in the .xz file format, but they have two significant problems:
* They are single-threaded, while most users nowadays have multi-core computers.
* The .xz files they produce are just one big block of compressed data, rather than a collection of smaller blocks. This makes random access to the original data impossible.
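To illustrate why smaller blocks make random access possible, here is a minimal Python sketch using the standard lzma module. It is not pixz's actual on-disk format: each chunk is written as an independent .xz stream as a stand-in for a block, and BLOCK_SIZE, compress_blocks and read_range are hypothetical names chosen for this example. The point is only that an index of offsets lets a reader decompress the one chunk it needs.

```python
import lzma

BLOCK_SIZE = 64 * 1024  # hypothetical chunk size, chosen only for this demo


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress data as independent chunks and return (blob, index).

    Each index entry is (uncompressed_offset, compressed_offset, compressed_length).
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), block_size):
        packed = lzma.compress(data[start:start + block_size])
        index.append((start, len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_range(blob, index, offset, length, block_size=BLOCK_SIZE):
    """Return data[offset:offset + length], decompressing only overlapping chunks."""
    out = bytearray()
    end = offset + length
    for u_off, c_off, c_len in index:
        # Skip chunks that lie entirely before or after the requested range.
        if u_off + block_size <= offset or u_off >= end:
            continue
        chunk = lzma.decompress(blob[c_off:c_off + c_len])
        out += chunk[max(offset - u_off, 0):end - u_off]
    return bytes(out)


data = bytes(range(256)) * 2048  # 512 KiB of sample input
blob, index = compress_blocks(data)
print(read_range(blob, index, 100_000, 16) == data[100_000:100_016])
```

With one big compressed block, the same read would require decompressing everything before the requested offset.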
With pixz, both these problems are solved. The most useful commands:
$ pixz foo.tar foo.tpxz # Compress and index a tarball, multi-core
$ pixz -l foo.tpxz # Very quickly list the contents of the compressed tarball
$ pixz -d foo.tpxz foo.tar # Decompress it, multi-core
$ pixz -x dir/file < foo.tpxz | tar x # Very quickly extract a file, multi-core.
# Also verifies that contents match index.
$ tar -Ipixz -cf foo.tpxz foo # Create a tarball using pixz for multi-core compression
$ pixz bar bar.xz # Compress a non-tarball, multi-core
$ pixz -d bar.xz bar # Decompress it, multi-core
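The idea behind indexing a tarball can be sketched with Python's standard tarfile module. This is an illustration only, not how pixz stores its index: build_demo_tar and index_tar are hypothetical helpers, and the member names are made up. The sketch records where each member's data starts, so a single file can be read without scanning the whole archive.

```python
import io
import tarfile


def build_demo_tar():
    """Create a small in-memory tarball with two hypothetical members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in [("dir/file", b"hello\n"), ("dir/other", b"world!\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def index_tar(raw):
    """Map each member name to (header_offset, data_offset, size)."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        return {m.name: (m.offset, m.offset_data, m.size) for m in tar}


raw = build_demo_tar()
index = index_tar(raw)
header_off, data_off, size = index["dir/other"]
# Read one member's bytes directly from its recorded offset.
print(raw[data_off:data_off + size])
```

Combined with block-level offsets into the compressed file, an index like this is what makes listing and single-file extraction fast.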
Specifying input and output:
$ pixz < foo.tar > foo.tpxz # Same as 'pixz foo.tar foo.tpxz'
$ pixz -i foo.tar -o foo.tpxz # Ditto. These both work for -x, -d and -l too, e.g.:
$ pixz -x -i foo.tpxz -o foo.tar file1 file2 ... # Extract the files from foo.tpxz into foo.tar
$ pixz foo.tar # Compress it to foo.tpxz, removing the original
$ pixz -d foo.tpxz # Decompress it to foo.tar, removing the original
Other flags:
$ pixz -1 foo.tar # Faster, worse compression
$ pixz -9 foo.tar # Better, slower compression
$ pixz -p 2 foo.tar # Cap the number of threads at 2
$ pixz -t foo.tar # Compress but don't treat it as a tarball (don't index it)
$ pixz -d -t foo.tpxz # Decompress foo.tpxz, don't check that contents match index
$ pixz -l -t foo.tpxz # List the xz blocks instead of files
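The thread cap and compression level flags can be pictured with a short Python sketch: a bounded worker pool compresses fixed-size chunks at a chosen preset. This is an analogy, not pixz's implementation. parallel_compress and its parameters are hypothetical, the mapping of -1/-9 onto liblzma presets is an assumption, and the output here is a concatenation of independent .xz streams rather than blocks inside one indexed stream.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def parallel_compress(data, threads=2, preset=6, block_size=1 << 20):
    """Compress fixed-size chunks on a bounded thread pool, then concatenate."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in input order, so chunks stay in sequence.
        packed = pool.map(lambda c: lzma.compress(c, preset=preset), chunks)
        return b"".join(packed)


data = b"pixz " * 500_000
fast = parallel_compress(data, threads=2, preset=1)
small = parallel_compress(data, threads=2, preset=9)
# lzma.decompress() accepts concatenated streams and joins their contents.
print(lzma.decompress(fast) == data, lzma.decompress(small) == data)
```

Whether extra threads give a real speedup in this sketch depends on the Python interpreter and the input size; the structure, not the timing, is what it is meant to show.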
For even more tuning flags, check the manual page.
Compare to:
plzip
* About equally complex and efficient
* lzip format seems less-used
* Version 1 is theoretically indexable...I think
ChopZip
* Python, much simpler
* More flexible, supports arbitrary compression programs
* Uses streams instead of blocks, not indexable
* Splits input and then combines output, much higher disk usage
pxz
* Simpler code
* Uses OpenMP instead of pthreads
* Uses streams instead of blocks, not indexable
* Uses temp files and doesn't combine them until the whole file is compressed, high disk/memory usage
Comparable tools for other compression algorithms:
pbzip2
* Not indexable
* Appears slow
* bzip2 algorithm is non-ideal
pigz
* Not indexable
dictzip, idzip
* Not parallel
Requirements:
* libarchive 2.8 or later
* liblzma 4.999.9-beta-212 or later (from the xz distribution)