Seeking in a .gz file through gzseek/gzread is slow. Every time you seek
backwards, zlib decompresses the file again from the start up to that
position. This causes massive lags when trying to do simple things in
lnav on a large .gz file.
Use the zlib inflate* functions instead and record the dictionary (the
32KB sliding window) periodically during the first pass over the file.
On a later seek, use inflateSetDictionary to restore the dictionary at
the nearest preceding syncpoint and resume decompression from there.
Use a default period of 1MB of compressed data for syncpoints.
Each syncpoint uses 32KB, an overhead of about 3.1%. For example,
a 1GB .gz file (compressed size) will require us to keep 32MB
of index data in memory. A better method may be to use a fixed
number of syncpoints and divide the file appropriately. This
would keep the memory bounded at the cost of slower file
navigation on large .gz files.
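
The trade-off between the two schemes can be sketched with a little
arithmetic. The helper and macro names below are made up for
illustration and do not exist in the code.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_SIZE    (32u * 1024u)      /* one saved dictionary */
#define DEFAULT_PERIOD (1024u * 1024u)    /* compressed bytes per syncpoint */

/* Index memory when a syncpoint is taken every `period` compressed bytes. */
static uint64_t index_bytes_fixed_period(uint64_t compressed_size,
                                         uint64_t period)
{
    assert(period > 0);
    return (compressed_size / period) * WINDOW_SIZE;
}

/* Compressed bytes between syncpoints when the index is capped at
 * `max_points` entries (memory is then at most max_points * WINDOW_SIZE). */
static uint64_t period_for_fixed_count(uint64_t compressed_size,
                                       uint64_t max_points)
{
    assert(max_points > 0);
    return (compressed_size + max_points - 1) / max_points;  /* round up */
}
```

With the default period, a 1GB file needs 1024 syncpoints, or 32MB. With
a cap of 1024 syncpoints the index never exceeds 32MB, but an 8GB file
would then have 8MB of compressed data between syncpoints, so a seek may
have to inflate up to that much before reaching its target.
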
Use pread to read the data for the stream decompressor and remove
the lock_hack previously employed.
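
As a rough sketch of the pread approach (the helper name is hypothetical,
and this assumes the old lock existed to protect the shared file offset
around a seek+read pair): pread takes an absolute offset and leaves the
descriptor's position alone, so refilling the inflate input buffer does
not disturb other users of the same fd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `len` bytes starting at absolute
 * `offset` without changing the file position of `fd`.
 * Returns the number of bytes read (short only at EOF) or -1 on error. */
static ssize_t read_at(int fd, void *buf, size_t len, off_t offset)
{
    size_t done = 0;

    assert(buf != NULL && offset >= 0);
    while (done < len) {
        ssize_t rc = pread(fd, (char *)buf + done, len - done,
                           offset + (off_t)done);
        if (rc < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            perror("pread");
            return -1;
        }
        if (rc == 0)
            break;                      /* end of file */
        done += (size_t)rc;
    }
    return (ssize_t)done;
}
```

The result would then be handed to the decompressor by pointing
next_in at the buffer and setting avail_in to the returned count.
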
NB. The documentation on these zlib functions is sparse. I followed
the example in zlib/examples/zran.c, but I used the z_stream total_in
and total_out variables instead of keeping my own separately as zran.c
does. Maybe this is incompatible with some very old zlib versions.
I haven't looked.