# DeepSparse
This page covers how to use the [DeepSparse](https://github.com/neuralmagic/deepsparse) inference runtime within LangChain.
It is broken into two parts: installation and setup, and then examples of DeepSparse usage.
## Installation and Setup
- Install the Python package with `pip install deepsparse`
- Choose a [SparseZoo model](https://sparsezoo.neuralmagic.com/?useCase=text_generation) or export a supported model to ONNX [using Optimum](https://github.com/neuralmagic/notebooks/blob/main/notebooks/opt-text-generation-deepsparse-quickstart/OPT_Text_Generation_DeepSparse_Quickstart.ipynb)
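
Before moving on, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and the check on `langchain_community` assumes you use the import path shown in the LLMs section.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found by the import system.

    Hypothetical helper for illustration; it does not import the package.
    """
    return importlib.util.find_spec(package) is not None


# Report which of the packages used on this page are available.
for name in ("deepsparse", "langchain_community"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If `deepsparse` is reported as missing, rerun the `pip install deepsparse` step above.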
## LLMs
LangChain includes a DeepSparse LLM wrapper, which you can import with:
```python
from langchain_community.llms import DeepSparse
```
The wrapper provides the same interface for every model; only the `model` identifier changes:
```python
llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none')
print(llm.invoke('def fib():'))
```
Additional parameters can be passed using the `config` parameter:
```python
config = {'max_generated_tokens': 256}
llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none', config=config)
```
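The two examples above can be combined into a small helper that keeps the model stub in one place and adds `config` only when it is needed. This is a sketch, not part of the LangChain API: `deepsparse_kwargs`, `make_llm`, and `run_demo` are hypothetical names, and `DeepSparse` is imported lazily so the helper can be defined even when `deepsparse` is not installed.

```python
from typing import Any, Dict, Optional

# Model stub copied from the examples above.
MODEL_STUB = (
    "zoo:nlg/text_generation/codegen_mono-350m/pytorch/"
    "huggingface/bigpython_bigquery_thepile/base-none"
)


def deepsparse_kwargs(
    model: str, max_generated_tokens: Optional[int] = None
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for DeepSparse(...)."""
    kwargs: Dict[str, Any] = {"model": model}
    if max_generated_tokens is not None:
        # Same `config` dictionary shape as in the example above.
        kwargs["config"] = {"max_generated_tokens": max_generated_tokens}
    return kwargs


def make_llm(**kwargs: Any):
    """Hypothetical helper: construct the wrapper from prepared kwargs."""
    # Imported here so that defining this function does not require deepsparse.
    from langchain_community.llms import DeepSparse

    return DeepSparse(**kwargs)


def run_demo() -> None:
    """Load the model and complete a prompt (requires deepsparse installed)."""
    llm = make_llm(**deepsparse_kwargs(MODEL_STUB, max_generated_tokens=256))
    print(llm.invoke("def fib():"))
```

Calling `run_demo()` loads the model and prints a completion, so run it only in an environment where the installation steps above have been completed.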