Mirror of https://github.com/vasi/pixz (synced 2024-11-16 21:26:04 +00:00)
Commit 00bad4eed5 (parent 378ee786d6): updates README

README (80 lines removed)
@@ -1,80 +0,0 @@
Pixz (pronounced 'pixie') is a parallel, indexing version of XZ

Repository: https://github.com/vasi/pixz
Downloads: https://sourceforge.net/projects/pixz/files/


The existing XZ Utils ( http://tukaani.org/xz/ ) provide great compression in the .xz file format, but they have two significant problems:

  * They are single-threaded, while most users nowadays have multi-core computers.
  * The .xz files they produce are just one big block of compressed data, rather than a collection of smaller blocks. This makes random access to the original data impossible.


With pixz, both these problems are solved. The most useful commands:

    $ pixz foo.tar foo.tpxz               # Compress and index a tarball, multi-core
    $ pixz -l foo.tpxz                    # Very quickly list the contents of the compressed tarball
    $ pixz -d foo.tpxz foo.tar            # Decompress it, multi-core
    $ pixz -x dir/file < foo.tpxz | tar x # Very quickly extract a file, multi-core.
                                          # Also verifies that contents match index.

    $ tar -Ipixz -cf foo.tpxz foo         # Create a tarball using pixz for multi-core compression

    $ pixz bar bar.xz                     # Compress a non-tarball, multi-core
    $ pixz -d bar.xz bar                  # Decompress it, multi-core


Specifying input and output:

    $ pixz < foo.tar > foo.tpxz     # Same as 'pixz foo.tar foo.tpxz'
    $ pixz -i foo.tar -o foo.tpxz   # Ditto. These both work for -x, -d and -l too, e.g.:

    $ pixz -x -i foo.tpxz -o foo.tar file1 file2 ... # Extract the files from foo.tpxz into foo.tar

    $ pixz foo.tar                  # Compress it to foo.tpxz, removing the original
    $ pixz -d foo.tpxz              # Extract it to foo.tar, removing the original


Other flags:

    $ pixz -1 foo.tar         # Faster, worse compression
    $ pixz -9 foo.tar         # Better, slower compression
    $ pixz -p 2 foo.tar       # Cap the number of threads at 2

    $ pixz -t foo.tar         # Compress but don't treat it as a tarball (don't index it)
    $ pixz -d -t foo.tpxz     # Decompress foo, don't check that contents match index
    $ pixz -l -t foo.tpxz     # List the xz blocks instead of files

For even more tuning flags, check the manual page.


Compare to:
  plzip
    * About equally complex, efficient
    * lzip format seems less-used
    * Version 1 is theoretically indexable...I think
  ChopZip
    * Python, much simpler
    * More flexible, supports arbitrary compression programs
    * Uses streams instead of blocks, not indexable
    * Splits input and then combines output, much higher disk usage
  pxz
    * Simpler code
    * Uses OpenMP instead of pthreads
    * Uses streams instead of blocks, not indexable
    * Uses temp files and doesn't combine them until the whole file is compressed, high disk/memory usage

Comparable tools for other compression algorithms:
  pbzip2
    * Not indexable
    * Appears slow
    * bzip2 algorithm is non-ideal
  pigz
    * Not indexable
  dictzip, idzip
    * Not parallel


Requirements:
  * libarchive 2.8 or later
  * liblzma 4.999.9-beta-212 or later (from the xz distribution)

README.md (new file, 182 lines added)
@@ -0,0 +1,182 @@

pixz
====

Pixz (pronounced *pixie*) is a parallel, indexing version of `xz`.

Repository: https://github.com/vasi/pixz

Downloads: https://github.com/vasi/pixz/releases

pixz vs xz
----------

The existing [XZ Utils](http://tukaani.org/xz/) provide great compression in the `.xz` file format,
but they have two significant problems:

- they are single-threaded, while most users nowadays have multi-core computers
- the `.xz` files they produce are just one big block of compressed data, rather than a collection
  of smaller blocks, which makes random access to the original data impossible

With pixz, both these problems are solved.

Building pixz
-------------

To see the options available for the configuration step, run:

```
./configure --help
```

### Dependencies

- pthreads
- liblzma 4.999.9-beta-212 or later (from the xz distribution)
- libarchive 2.8 or later
- AsciiDoc to generate the man page
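
Before running `./configure`, it can help to confirm that the dependencies above are visible on the build machine. The following is only a sketch: `check_deps` is a hypothetical helper, and it assumes that liblzma and libarchive ship pkg-config files under the module names `liblzma` and `libarchive`, and that the man page is generated with AsciiDoc's `a2x` command. It does not compare the liblzma version, and pthreads is left for `./configure` to detect.

```shell
#!/bin/sh
# Pre-flight check for the build dependencies listed above.
# check_deps is a hypothetical helper, not part of pixz. The pkg-config
# module names and the a2x command are assumptions about a typical setup.
check_deps() {
    missing=0
    for mod in liblzma libarchive; do
        if pkg-config --exists "$mod" 2>/dev/null; then
            echo "ok: $mod $(pkg-config --modversion "$mod")"
        else
            echo "missing: $mod"
            missing=1
        fi
    done
    # libarchive must be 2.8 or later.
    if pkg-config --exists libarchive 2>/dev/null &&
       ! pkg-config --atleast-version=2.8 libarchive; then
        echo "too old: libarchive (need 2.8 or later)"
        missing=1
    fi
    # a2x is the AsciiDoc front end assumed to build the man page.
    if command -v a2x >/dev/null 2>&1; then
        echo "ok: a2x"
    else
        echo "missing: a2x (AsciiDoc)"
        missing=1
    fi
    return "$missing"
}
```

Calling `check_deps && ./configure` would then stop early when something is missing.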

### Build from Release Tarball

```
./configure
make
make install
```

You may need `sudo` permissions to run `make install`.

### Build from GitHub

```
git clone https://github.com/vasi/pixz.git
cd pixz
./autogen.sh
./configure
make
make install
```

You may need `sudo` permissions to run `make install`.

Usage
-----

### Single Files

Compress a single file (no tarball, just compression), multi-core:

    pixz bar bar.xz

Decompress it, multi-core:

    pixz -d bar.xz bar
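
The two commands above can be combined into a quick round-trip check. This is a minimal sketch, assuming `pixz` is on the `PATH`; `roundtrip_check` is a hypothetical helper name, and it uses only the two-argument forms shown above.

```shell
#!/bin/sh
# Compress a file with pixz, decompress it again, and confirm the result
# is byte-identical to the input.
# roundtrip_check is a hypothetical helper, not part of pixz.
roundtrip_check() {
    src=$1
    tmp=$(mktemp -d) || return 1
    pixz "$src" "$tmp/packed.xz" &&
        pixz -d "$tmp/packed.xz" "$tmp/unpacked" &&
        cmp -s "$src" "$tmp/unpacked"
    status=$?
    rm -rf "$tmp"
    return "$status"
}
```

For example, `roundtrip_check bar && echo "round trip OK"`. The original file is left untouched because both the input and the output are named explicitly.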

### Tarballs

Compress and index a tarball, multi-core:

    pixz foo.tar foo.tpxz

Very quickly list the contents of the compressed tarball:

    pixz -l foo.tpxz

Decompress the tarball, multi-core:

    pixz -d foo.tpxz foo.tar

Very quickly extract a single file, multi-core; this also verifies that the contents match the index:

    pixz -x dir/file < foo.tpxz | tar x

Create a tarball using pixz for multi-core compression:

    tar -Ipixz -cf foo.tpxz foo/
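
Listing and extraction combine naturally: check the index first, then extract only if the member is there. The sketch below assumes that `pixz -l` prints one member path per line, which is not stated above; `extract_member` is a hypothetical helper name.

```shell
#!/bin/sh
# Extract one member from an indexed tarball, but only if the index lists it.
# extract_member is a hypothetical helper, not part of pixz.
# Assumption: `pixz -l` prints one member path per line.
extract_member() {
    archive=$1   # e.g. foo.tpxz
    member=$2    # e.g. dir/file
    dest=$3      # directory to extract into
    if ! pixz -l "$archive" | grep -qxF -- "$member"; then
        echo "not in archive: $member" >&2
        return 1
    fi
    mkdir -p "$dest" &&
        pixz -x "$member" < "$archive" | tar -xf - -C "$dest"
}
```

For example, `extract_member foo.tpxz dir/file out` would leave the file at `out/dir/file`.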

### Specifying Input and Output

These are equivalent (the same forms also work with `-x`, `-d` and `-l`):

    pixz foo.tar foo.tpxz
    pixz < foo.tar > foo.tpxz
    pixz -i foo.tar -o foo.tpxz

Extract the files from `foo.tpxz` into `foo.tar`:

    pixz -x -i foo.tpxz -o foo.tar file1 file2 ...

Compress to `foo.tpxz`, removing the original:

    pixz foo.tar

Extract to `foo.tar`, removing the original:

    pixz -d foo.tpxz
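
When the original should be kept, one option is to derive the output name yourself and pass both names explicitly. This sketch assumes that only the one-argument form removes the original, as the examples above suggest; `compress_keep` is a hypothetical helper name, and the `.xz` suffix for non-tar files is likewise an assumption.

```shell
#!/bin/sh
# Compress a file next to the original without deleting the original,
# by naming input and output explicitly with -i and -o.
# compress_keep is a hypothetical helper, not part of pixz.
compress_keep() {
    src=$1
    case "$src" in
        *.tar) out="${src%.tar}.tpxz" ;;  # foo.tar -> foo.tpxz
        *)     out="$src.xz" ;;           # assumption: other files get .xz
    esac
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    pixz -i "$src" -o "$out" && echo "$out"
}
```

On success the helper prints the name of the file it created, so it can be used as `archive=$(compress_keep foo.tar)`.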

### Other Flags

Faster, worse compression:

    pixz -1 foo.tar

Better, slower compression:

    pixz -9 foo.tar

Use exactly 2 threads:

    pixz -p 2 foo.tar

Compress, but do not treat it as a tarball, i.e. do not index it:

    pixz -t foo.tar

Decompress, but do not check that contents match index:

    pixz -d -t foo.tpxz

List the xz blocks instead of files:

    pixz -l -t foo.tpxz

For even more tuning flags, check the manual page:

    man pixz
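
To see what the level flags buy on your own data, the compressed size can be measured without writing any output files. This is a sketch that assumes the level flags can be combined with the stdin/stdout form shown earlier; `level_sizes` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the compressed size in bytes at the fastest and the strongest level.
# level_sizes is a hypothetical helper, not part of pixz.
# Assumption: -1 and -9 can be combined with the stdin/stdout form.
level_sizes() {
    src=$1
    for level in 1 9; do
        # Reading from stdin leaves the original file in place.
        size=$(pixz "-$level" < "$src" | wc -c)
        # $((size)) strips the padding some wc implementations add.
        echo "level $level: $((size)) bytes"
    done
}
```

Running `level_sizes foo.tar` prints one line per level. Prefixing each `pixz` call with `time` would show the speed side of the trade-off.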

Comparison to other Tools
-------------------------

### plzip

- about equally complex and efficient
- lzip format seems less-used
- version 1 is theoretically indexable, I think

### ChopZip

- written in Python, much simpler
- more flexible, supports arbitrary compression programs
- uses streams instead of blocks, not indexable
- splits input and then combines output, much higher disk usage

### pxz

- simpler code
- uses OpenMP instead of pthreads
- uses streams instead of blocks, not indexable
- uses temporary files and does not combine them until the whole file is compressed, high disk and
  memory usage

### pbzip2

- not indexable
- appears slow
- bzip2 algorithm is non-ideal

### pigz

- not indexable

### dictzip, idzip

- not parallel