docs: remove nltk download steps (#1253)

### Summary

Updates the docs to remove the `nltk` download steps from
`unstructured`. As of `unstructured` `0.4.14`, this is handled
automatically in the relevant modules within `unstructured`.
searx-search-suffix
Matt Robinson 1 year ago committed by GitHub
parent 5bc6dc076e
commit 10e73a3723
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -17,10 +17,6 @@ This page is broken into two parts: installation and setup, and then references
- `poppler-utils`
- `tesseract-ocr`
- `libreoffice`
- Run the following to install NLTK dependencies. `unstructured` will handle this automatically
soon.
- `python -c "import nltk; nltk.download('punkt')"`
- `python -c "import nltk; nltk.download('averaged_perceptron_tagger')"`
- If you are parsing PDFs, run the following to install the `detectron2` model, which
`unstructured` uses for layout detection:
- `pip install "detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.6#egg=detectron2"`

Loading…
Cancel
Save