Seeking with the gzread interface is slow: every seek to a new
location forces the whole file up to that position to be decompressed
again. This causes massive lag when doing simple things in lnav on a
large .gz file.
Use the zlib inflate* functions instead and record the dictionary
periodically while processing the file the first time. Then use
inflateSetDictionary to restore the dictionary at a convenient
location when seeking into the file again later.
Use a default period of 1MB of compressed data between syncpoints.
Each syncpoint stores a 32KB dictionary, a ratio of about 3.1%. For
example, a 1GB .gz file (compressed size) will require us to keep
32MB of index data in memory. A better method may be to use a fixed
number of syncpoints and divide the file evenly among them. This
would keep memory usage bounded at the cost of slower navigation
on large .gz files.
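
Roughly, the scheme looks like the following sketch, which is modeled
on zlib/examples/zran.c and uses the z_stream total_in/total_out
counters. The syncpoint struct and function names are illustrative
rather than lnav's actual code, and inflateGetDictionary() assumes
zlib 1.2.8 or later (zran.c keeps its own copy of the last 32KB of
output instead):

    #include <string.h>
    #include <unistd.h>
    #include <zlib.h>

    #include <vector>

    static const uInt  WINDOW_SIZE = 32768;       /* 32KB dictionary */
    static const uLong SYNC_PERIOD = 1024 * 1024; /* 1MB compressed */

    struct syncpoint {
        uLong         sp_in;       /* compressed offset of the boundary */
        uLong         sp_out;      /* corresponding uncompressed offset */
        int           sp_bits;     /* boundary bits left in prior byte */
        uInt          sp_dict_len; /* actual dictionary length */
        unsigned char sp_window[WINDOW_SIZE];
    };

    /* First pass: inflate the whole file once, recording a syncpoint
     * at the first deflate block boundary after each 1MB of
     * compressed input. */
    std::vector<syncpoint> build_index(int fd)
    {
        std::vector<syncpoint> index;
        unsigned char          in[16384], out[WINDOW_SIZE];
        uLong                  last_in = 0;
        z_stream               strm;
        int                    ret = Z_OK;

        memset(&strm, 0, sizeof(strm));
        inflateInit2(&strm, 47); /* auto-detect gzip/zlib header */
        do {
            ssize_t len = pread(fd, in, sizeof(in), strm.total_in);

            if (len <= 0)
                break;
            strm.next_in  = in;
            strm.avail_in = len;
            do {
                strm.next_out  = out;
                strm.avail_out = sizeof(out);
                /* Z_BLOCK makes inflate() stop at block boundaries. */
                ret = inflate(&strm, Z_BLOCK);
                if (ret != Z_OK && ret != Z_STREAM_END)
                    goto done;
                if ((strm.data_type & 128) && !(strm.data_type & 64) &&
                    strm.total_in - last_in >= SYNC_PERIOD) {
                    syncpoint sp;

                    sp.sp_in   = strm.total_in;
                    sp.sp_out  = strm.total_out;
                    sp.sp_bits = strm.data_type & 7;
                    inflateGetDictionary(&strm, sp.sp_window,
                                         &sp.sp_dict_len);
                    index.push_back(sp);
                    last_in = strm.total_in;
                }
            } while (strm.avail_in > 0 && ret != Z_STREAM_END);
        } while (ret != Z_STREAM_END);
    done:
        inflateEnd(&strm);
        return index;
    }

    /* Later: restore the decompressor at the syncpoint just before
     * the target offset instead of re-inflating from the start. */
    int restore_at(int fd, const syncpoint &sp, z_stream &strm)
    {
        memset(&strm, 0, sizeof(strm));
        inflateInit2(&strm, -15); /* raw deflate; past the header */

        off_t off = sp.sp_in - (sp.sp_bits ? 1 : 0);

        if (sp.sp_bits) {
            unsigned char byte;

            if (pread(fd, &byte, 1, off) != 1)
                return Z_ERRNO;
            /* Re-inject the bits before the byte-unaligned boundary. */
            inflatePrime(&strm, sp.sp_bits, byte >> (8 - sp.sp_bits));
            off++;
        }
        inflateSetDictionary(&strm, sp.sp_window, sp.sp_dict_len);
        /* From here, feed inflate() input pread() from `off',
         * discarding output until the desired uncompressed offset. */
        return Z_OK;
    }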
Use pread to read the data for the stream decompressor, which removes
the need for the lock_hack previously employed.
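
For reference, the pattern (the wrapper name here is made up):

    #include <unistd.h>

    /* Previously, reads had to guard the shared file offset:
     *   lock(); lseek(fd, offset, SEEK_SET); read(fd, buf, len); unlock();
     * pread() takes the offset explicitly and leaves the shared file
     * position untouched, so concurrent readers need no lock. */
    ssize_t read_for_decompressor(int fd, void *buf, size_t len,
                                  off_t offset)
    {
        return pread(fd, buf, len, offset);
    }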
NB. The documentation on these zlib functions is sparse. I followed
the example in zlib/examples/zran.c, but I used the z_stream total_in
and total_out variables instead of keeping my own separately as zran.c
does. Maybe this is incompatible with some very old zlib versions.
I haven't looked.
* grep_proc.cc: When a request is queued with the start line
  == -1, we need to start searching from the highest line
  ever seen and not the last line processed (see the first
  sketch below).
* line_buffer.cc: If a partial line was read, we need to
  avoid returning another line if more data is appended
  to the file (see the second sketch below).
* lnav.cc: Accept file name patterns on the command-line that
don't match any files yet. Initialize the screen before
redirecting stderr to the log file or /dev/null.
* log_format.hh: Add some comments. Start to add back support
for scrubbing.
* logfile_sub_source.cc: Move scrubbing to the format impl.
* textview_curses.hh: Add comments.
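
A sketch of the grep_proc.cc rule above; the names are illustrative,
not lnav's actual members:

    /* A start line of -1 means "open-ended": resume from the highest
     * line ever seen rather than the last line processed, so that a
     * re-queued request neither rescans old lines nor skips new ones. */
    static int resolve_start_line(int requested, int highest_line_seen)
    {
        return (requested == -1) ? highest_line_seen : requested;
    }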
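
And a sketch of the line_buffer.cc fix, assuming a simple
buffer-and-offset reader rather than lnav's actual implementation:

    #include <cstddef>
    #include <string>

    /* For a complete line, `offset' advances past the newline.  For a
     * partial line (no newline yet), `offset' stays at the start of
     * the line so that, when more data is appended, the same line is
     * re-read in full instead of the new bytes coming back as a
     * second line. */
    static bool next_line(const std::string &buf, size_t &offset,
                          std::string &line_out, bool &partial_out)
    {
        if (offset >= buf.size())
            return false;

        size_t nl = buf.find('\n', offset);

        if (nl == std::string::npos) {
            line_out    = buf.substr(offset);
            partial_out = true; /* do NOT advance offset */
        } else {
            line_out    = buf.substr(offset, nl - offset);
            partial_out = false;
            offset      = nl + 1;
        }
        return true;
    }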