mirror of
https://github.com/namuan/dr-doc-search
synced 2024-11-15 18:13:56 +00:00
docs: Update README
parent e3b0303376
commit cec3c36288
@@ -4,6 +4,7 @@ repos:
     hooks:
       - id: check-ast
       - id: check-added-large-files
+        exclude: "^docs/dr-doc-search-github-demo.gif"
       - id: check-merge-conflict
       - id: check-case-conflict
       - id: check-docstring-first
66
README.md
@@ -4,6 +4,11 @@
 [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/dr-doc-search?style=flat-square)](https://pypi.python.org/pypi/dr-doc-search/)
 [![PyPI - License](https://img.shields.io/pypi/l/dr-doc-search?style=flat-square)](https://pypi.python.org/pypi/dr-doc-search/)
 
+Converse with a book (PDF)
+
+![](docs/dr-doc-search-github-demo.gif)
+
+See [tweet](https://twitter.com/deskriders_twt/status/1612088387984588802) for full demo.
 
 ---
 
@@ -15,8 +20,6 @@
 
 ---
 
-Converse with an ebook (PDF)
-
 ## Pre-requisites
 
 - [Tesseract OCR](https://github.com/tesseract-ocr/tesseract)
@@ -30,6 +33,65 @@ pip install dr-doc-search
 
 ## Example Usage
 
+There are two steps to use this application:
+
+**1.** First, you need to create the index and generate embeddings for the PDF file.
+Here I'm using a PDF file generated from this page [Parable of a Monetary Economy
+](http://heteconomist.com/parable-of-a-monetary-economy/)
+
+Before running this, you need to set up your OpenAI API key. You can get it from [OpenAI](https://beta.openai.com/account/api-keys).
+
+```shell
+export OPENAI_API_KEY=<your-openai-api-key>
+```
+
+Then run the following command to start the training process:
+
+```shell
+dr-doc-search --train -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf
+```
+
+The training process generates some temporary files in the `OutputDir/dr-doc-search/<pdf-name>` folder under your home directory.
+Here is what it looks like:
+
+```text
+~/OutputDir/dr-doc-search/parable-of-a-monetary-economy-heteconomist
+$ tree
+.
+├── images
+│   ├── output-1.png
+│   ├── output-10.png
+│   ├── output-11.png
+...
+│   └── output-9.png
+├── index
+│   ├── docsearch.index
+│   └── index.pkl
+├── parable-of-a-monetary-economy-heteconomist.pdf
+└── scanned
+    ├── output-1.txt
+...
+    └── output-9.txt
+```
+
+> **Note:**
+> It is possible to change the base of the output directory by providing the `--app-dir` argument.
+
+**2.** Now that we have the index, we can use it to start asking questions.
+
+```shell
+dr-doc-search -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf --input-question "How did the attempt to reduce the debt result in a decrease in employment?"
+```
+
+Or you can open up a web interface (on port 5006) to ask questions:
+
+```shell
+dr-doc-search --web-app -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf
+```
+
+There are more options, such as choosing the start and end pages of the PDF file.
+See the help for more details:
+
 ```shell
 dr-doc-search --help
 ```
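The README hunk above describes a two-step workflow: train once, then ask questions from the command line or the web app. A minimal Python sketch of scripting that workflow follows. It uses only the flags that appear in the README (`-i`, `--train`, `--input-question`, `--web-app`, `--app-dir`). The helper names `build_command`, `expected_output_dir` and `run` are hypothetical and not part of the project. The output path is inferred from the directory listing in the README, and how `--app-dir` combines with that layout is an assumption.

```python
import os
import shlex
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(
    pdf: Path,
    train: bool = False,
    question: Optional[str] = None,
    web_app: bool = False,
    app_dir: Optional[Path] = None,
) -> List[str]:
    """Assemble an argument list using only flags shown in the README."""
    cmd = ["dr-doc-search", "-i", str(pdf)]
    if train:
        cmd.append("--train")
    if question is not None:
        cmd += ["--input-question", question]
    if web_app:
        cmd.append("--web-app")
    if app_dir is not None:
        cmd += ["--app-dir", str(app_dir)]
    return cmd


def expected_output_dir(pdf: Path, base: Optional[Path] = None) -> Path:
    """Guess the working folder as <base>/dr-doc-search/<pdf-name>.

    Assumption: the default base is ~/OutputDir, as in the README listing,
    and --app-dir replaces that base. The tool itself may behave differently.
    """
    base = base if base is not None else Path.home() / "OutputDir"
    return base / "dr-doc-search" / pdf.stem


def run(cmd: List[str]) -> None:
    """Run one step, failing early if the API key is not exported.

    Assumption: every step needs OPENAI_API_KEY; the README only states
    this for the training step.
    """
    if "OPENAI_API_KEY" not in os.environ:
        raise RuntimeError("Set OPENAI_API_KEY before running dr-doc-search")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    pdf = Path.home() / "Downloads" / "parable-of-a-monetary-economy-heteconomist.pdf"
    # Print the commands instead of running them, so the sketch works
    # without the tool installed.
    print(shlex.join(build_command(pdf, train=True)))
    print(shlex.join(build_command(pdf, question="What is the parable about?")))
    print(expected_output_dir(pdf))
```

Building the argument list separately from running it keeps the sketch testable without the CLI installed; call `run(build_command(...))` to execute a step.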
BIN
docs/dr-doc-search-github-demo.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 30 MiB |
@@ -1,5 +1,6 @@
 """
-Indexes a PDF file and generate OpenAI Embeddings
+Indexes a PDF file and generates OpenAI embeddings.
+Also allows the user to ask questions using the command-line interface or the web app.
 """
 from argparse import ArgumentParser, Namespace, RawDescriptionHelpFormatter
 from pathlib import Path
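The docstring and imports in this hunk suggest the module builds its command-line interface with `argparse`. As a rough illustration only, a parser accepting the flags mentioned in the README could look like the sketch below. This is not the project's actual parser: the `dest` name `input_pdf_path`, the help strings, and the `--app-dir` default are assumptions, and the real script has further options (for example, for start and end pages) that are not reproduced here.

```python
"""
Illustrative parser for the flags shown in the README (not the project's code).
"""
from argparse import ArgumentParser, Namespace, RawDescriptionHelpFormatter
from pathlib import Path
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> Namespace:
    parser = ArgumentParser(
        description=__doc__,
        formatter_class=RawDescriptionHelpFormatter,
    )
    # "-i" is the only spelling shown in the README; the dest name is hypothetical.
    parser.add_argument("-i", dest="input_pdf_path", type=Path, required=True,
                        help="Path to the PDF file")
    parser.add_argument("--train", action="store_true",
                        help="Create the index and generate embeddings")
    parser.add_argument("--input-question", type=str, default=None,
                        help="Question to ask against an existing index")
    parser.add_argument("--web-app", action="store_true",
                        help="Start the web interface instead of answering once")
    # Assumed default, based on the README's ~/OutputDir listing.
    parser.add_argument("--app-dir", type=Path, default=Path.home() / "OutputDir",
                        help="Base directory for generated files")
    return parser.parse_args(argv)


if __name__ == "__main__":
    print(parse_args())
```

Passing `argv` explicitly, as in `parse_args(["-i", "book.pdf", "--train"])`, makes the parser easy to exercise in tests without touching `sys.argv`.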