mirror of https://github.com/vasi/pixz synced 2024-10-30 15:21:41 +00:00

pixz


Pixz (pronounced pixie) is a parallel, indexing version of xz.

Repository: https://github.com/vasi/pixz

Downloads: https://github.com/vasi/pixz/releases

pixz vs xz

The existing XZ Utils provide great compression in the .xz file format, but they produce just one big block of compressed data. Pixz instead produces a collection of smaller blocks, which makes random access to the original data possible. This is especially useful for large tarballs.

Differences from xz

  • pixz automatically indexes tarballs during compression (unless the -t argument is used)
  • pixz supports parallel decompression, which xz does not
  • pixz defaults to using all available CPU cores, while xz defaults to using only one core
  • pixz provides -i and -o command line options to specify input and output file
  • pixz does not need the command line option -z (or --compress). Instead, it compresses by default, and decompresses if -d is passed.
  • pixz uses different logic to decide whether to use stdin/stdout. pixz somefile will always output to another file, while pixz with no filenames will always use stdin/stdout. There's no -c argument to explicitly request stdout.
  • Some other flags mean different things for pixz and xz, including -f, -l, -q and -t. Please read the manpages for more detail on these.

Building pixz

For general help with the configuration step of the build, run:

./configure --help

Dependencies

  • pthreads
  • liblzma 4.999.9-beta-212 or later (from the xz distribution)
  • libarchive 2.8 or later
  • AsciiDoc to generate the man page
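
Before running configure, it can help to check whether the libraries above are visible to the build tools. The following is only a sketch: `check_dep` is a hypothetical helper, and it assumes that pkg-config is installed and that the libraries are registered under the module names `liblzma` and `libarchive`. pthreads and AsciiDoc are not covered, because they are not usually discovered through pkg-config.

```shell
# check_dep is a hypothetical helper, not part of pixz.
# $1 = pkg-config module name (assumed), $2 = minimum version
check_dep() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "$1: pkg-config not available, cannot check"
        return 2
    fi
    if pkg-config --atleast-version="$2" "$1"; then
        echo "$1: ok ($(pkg-config --modversion "$1"))"
    else
        echo "$1: missing or older than $2"
        return 1
    fi
}

# Report on both libraries without aborting on the first failure.
check_dep liblzma 4.999.9 || true
check_dep libarchive 2.8 || true
```

If a library is reported missing, installing the distribution's development package for it is usually what configure needs.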

Build from Release Tarball

./configure
make
make install

You may need sudo permissions to run make install.

Build from GitHub

git clone https://github.com/vasi/pixz.git
cd pixz
./autogen.sh
./configure
make
make install

You may need sudo permissions to run make install.

Usage

Single Files

Compress a single file (no tarball, just compression), multi-core:

pixz bar bar.xz

Decompress it, multi-core:

pixz -d bar.xz bar

Tarballs

Compress and index a tarball, multi-core:

pixz foo.tar foo.tpxz

Very quickly list the contents of the compressed tarball:

pixz -l foo.tpxz

Decompress the tarball, multi-core:

pixz -d foo.tpxz foo.tar

Very quickly extract a single file, multi-core, also verifying that the contents match the index:

pixz -x dir/file < foo.tpxz | tar x

Create a tarball using pixz for multi-core compression:

tar -Ipixz -cf foo.tpxz foo/
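
The tarball commands above can be combined into a small end-to-end check. This is a sketch rather than part of the pixz test suite: `roundtrip_demo` and the file names under `demo/` are made up for illustration, and the function simply skips itself when pixz is not installed.

```shell
# roundtrip_demo is a hypothetical helper; all file names are illustrative.
roundtrip_demo() {
    if ! command -v pixz >/dev/null 2>&1; then
        echo "pixz not installed, skipping"
        return 0
    fi
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        mkdir -p demo/dir out
        echo "hello" > demo/dir/file
        echo "other" > demo/other
        tar -cf demo.tar demo                # plain, uncompressed tarball
        pixz demo.tar demo.tpxz              # compress and index it
        pixz -l demo.tpxz                    # list the indexed entries
        # Pull out a single member and unpack it into out/
        pixz -x demo/dir/file < demo.tpxz | tar -x -C out
        # The extracted copy should be byte-identical to the original
        cmp demo/dir/file out/demo/dir/file && echo "round trip ok"
    )
    status=$?
    rm -rf "$work"
    return "$status"
}

roundtrip_demo
```

All work happens in a temporary directory that is removed afterwards, so the sketch does not touch existing archives.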

Specifying Input and Output

These are equivalent (the same forms also work with -x, -d and -l):

pixz foo.tar foo.tpxz
pixz < foo.tar > foo.tpxz
pixz -i foo.tar -o foo.tpxz

Extract the listed files from foo.tpxz into foo.tar:

pixz -x -i foo.tpxz -o foo.tar file1 file2 ...

Compress to foo.tpxz, removing the original:

pixz foo.tar

Decompress to foo.tar, removing the original:

pixz -d foo.tpxz
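
Because the single-argument form removes its input, it can be useful to know the output name before running it. The sketch below encodes only the two renamings shown in the examples above (.tar to .tpxz when compressing, .tpxz to .tar when decompressing). `pixz_default_output` is a hypothetical helper, not something pixz provides, and it deliberately refuses to guess for any other suffix.

```shell
# pixz_default_output is a hypothetical helper, not part of pixz.
# $1 = "compress" or "decompress", $2 = input file name
pixz_default_output() {
    case "$1:$2" in
        compress:*.tar)    printf '%s\n' "${2%.tar}.tpxz" ;;
        decompress:*.tpxz) printf '%s\n' "${2%.tpxz}.tar" ;;
        *)
            # Other suffixes are not covered by the examples; see the man page.
            echo "no documented mapping for '$2'" >&2
            return 1
            ;;
    esac
}

pixz_default_output compress foo.tar      # prints foo.tpxz
pixz_default_output decompress foo.tpxz   # prints foo.tar
```

A wrapper script could use this to check that the predicted output does not already exist before calling pixz.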

Other Flags

Faster, worse compression:

pixz -1 foo.tar

Better, slower compression:

pixz -9 foo.tar

Use exactly 2 threads:

pixz -p 2 foo.tar
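
Since pixz uses all cores by default, -p is mostly useful for leaving some capacity free on a shared machine. The following sketch derives a thread count that keeps one core idle. `pick_threads` is a hypothetical helper, and it assumes that `getconf _NPROCESSORS_ONLN` reports the number of online cores (it does on Linux and macOS), falling back to 1 otherwise.

```shell
# pick_threads is a hypothetical helper, not part of pixz.
pick_threads() {
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    # Fall back to a single thread if the value is empty or not a number.
    case "$cores" in
        ''|*[!0-9]*) cores=1 ;;
    esac
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        echo 1
    fi
}

pick_threads
# Example use (not run here): pixz -p "$(pick_threads)" foo.tar
```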

Compress, but do not treat it as a tarball, i.e. do not index it:

pixz -t foo.tar

Decompress, but do not check that contents match index:

pixz -d -t foo.tpxz

List the xz blocks instead of files:

pixz -l -t foo.tpxz

For even more tuning flags, check the manual page:

man pixz

Comparison to other Tools

plzip

  • about equally complex and efficient
  • lzip format seems less-used
  • version 1 is theoretically indexable, I think

ChopZip

  • written in Python, much simpler
  • more flexible, supports arbitrary compression programs
  • uses streams instead of blocks, not indexable
  • splits input and then combines output, much higher disk usage

pxz

  • simpler code
  • uses OpenMP instead of pthreads
  • uses streams instead of blocks, not indexable
  • uses temporary files and does not combine them until the whole file is compressed, high disk and memory usage

pbzip2

  • not indexable
  • appears slow
  • bzip2 algorithm is non-ideal

pigz

  • not indexable

dictzip, idzip

  • not parallel