docs: update unstructured install instructions (#8596)

### Summary

Updates the `unstructured` install instructions. For
`unstructured>=0.9.0`, dependencies are broken out by document type and
the base `unstructured` package includes fewer dependencies. `pip
install "unstructured[local-inference]"` has been replaced by `pip
install "unstructured[all-docs]"`, though the `local-inference` extra is
still supported for the time being.
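
A minimal sketch of the rule described above, for anyone scripting installs across versions. The helper name `install_command` is hypothetical and not part of `unstructured`. The assumption that versions before `0.9.0` should use the `local-inference` extra is inferred from the previous instructions, not from the package itself.

```python
import re
from typing import Optional


def install_command(version: str, doc_type: Optional[str] = None) -> str:
    """Build a pip command for a given unstructured version (hypothetical helper)."""
    # Parse the leading numeric components, e.g. "0.9.0" -> (0, 9, 0).
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    parts = tuple(int(p or 0) for p in match.groups())

    if parts >= (0, 9, 0):
        # Dependencies are split by document type; default to every type.
        extra = doc_type or "all-docs"
    else:
        # Assumption: older releases only offer the single large extra.
        extra = "local-inference"
    return f'pip install "unstructured[{extra}]"'


print(install_command("0.9.0"))
print(install_command("0.9.0", "docx"))
```

Passing a `doc_type` such as `docx` keeps the install small when only one document type is parsed.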

### Reviewers

- @rlancemartin
- @eyurtsev
- @hwchase17
Matt Robinson 2023-08-01 17:17:49 -04:00 committed by GitHub
parent 73072d3db8
commit 8961c720b8
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 4 additions and 3 deletions


@@ -18,8 +18,7 @@
 "outputs": [],
 "source": [
 "# # Install package\n",
-"!pip install \"unstructured[local-inference]\"\n",
-"!pip install layoutparser[layoutmodels,tesseract]"
+"!pip install \"unstructured[all-docs]\"\n"
 ]
 },
 {


@@ -11,7 +11,9 @@ ecosystem within LangChain.
 If you are using a loader that runs locally, use the following steps to get `unstructured` and
 its dependencies running locally.
-- Install the Python SDK with `pip install "unstructured[local-inference]"`
+- Install the Python SDK with `pip install unstructured`.
+- You can install document specific dependencies with extras, i.e. `pip install "unstructured[docx]"`.
+- To install the dependencies for all document types, use `pip install "unstructured[all-docs]"`.
 - Install the following system dependencies if they are not already available on your system.
 Depending on what document types you're parsing, you may not need all of these.
 - `libmagic-dev` (filetype detection)