mirror of
https://github.com/phiresky/ripgrep-all.git
synced 2024-11-12 19:10:32 +00:00
rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.
rga - ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc
rga is a tool to recursively search for text in many different types of files. It is based on the awesome ripgrep.
Future Work
- photograph adapter based on object classification / detection (YOLO?), for fun. Tried, but very hard to integrate, especially with state-of-the-art approaches.
- 7z adapter (couldn't find a nice-to-use Rust library)
considerations
- matching on mime (magic bytes) instead of filename
- allow per-adapter configuration options
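The magic-bytes idea above can be sketched as follows. This is a minimal illustration, not rga's actual adapter-selection code: the names `DetectedKind`, `detect_kind`, and `detect_from_reader` are hypothetical, and only three well-known signatures (PDF, ZIP, gzip) are checked.

```rust
use std::io::{self, Read};

/// File kinds this sketch can recognize (hypothetical type, not part of rga).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum DetectedKind {
    Pdf,
    Zip,
    Gzip,
    Unknown,
}

/// Classify content by its leading "magic" bytes, ignoring the filename.
pub fn detect_kind(header: &[u8]) -> DetectedKind {
    // Well-known file signatures: "%PDF-", "PK\x03\x04", and 0x1f 0x8b.
    const SIGNATURES: &[(&[u8], DetectedKind)] = &[
        (b"%PDF-", DetectedKind::Pdf),
        (b"PK\x03\x04", DetectedKind::Zip),
        (&[0x1f, 0x8b], DetectedKind::Gzip),
    ];
    SIGNATURES
        .iter()
        .find(|(sig, _)| header.starts_with(sig))
        .map(|(_, kind)| *kind)
        .unwrap_or(DetectedKind::Unknown)
}

/// Read up to 8 bytes from `reader` and classify them.
pub fn detect_from_reader<R: Read>(mut reader: R) -> io::Result<DetectedKind> {
    let mut header = [0u8; 8];
    let mut filled = 0;
    // `read` may return fewer bytes than requested, so loop until the
    // buffer is full or the input ends.
    while filled < header.len() {
        let n = reader.read(&mut header[filled..])?;
        if n == 0 {
            break;
        }
        filled += n;
    }
    Ok(detect_kind(&header[..filled]))
}
```

Passing a `std::fs::File` to `detect_from_reader` would classify a file on disk regardless of its extension, so a PDF saved as `report.bin` would still be detected as `Pdf`. Formats that share a container (for example, Office documents and E-Books stored as ZIP archives) would need further inspection beyond the first few bytes.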
Setup
rga should compile with stable Rust. To install it, run:
apt install build-essential pandoc poppler-utils
cargo install ripgrep_all
rga --help
Some rga adapters run external binaries (such as pandoc and the poppler-utils tools installed above), so make sure those are installed.
Development
To enable debug logging:
export RUST_LOG=debug
export RUST_BACKTRACE=1
Also remember to disable caching with --rga-no-cache, or clear the cache in ~/.cache/rga, when debugging the adapters.
Similar tools
- pdfgrep
- this gist has my proof-of-concept version of a caching extractor that lets ripgrep replace pdfgrep.
- this gist is a more extensive preprocessing script by @ColonolBuendia