`LlamaCppEmbeddings`: adds `verbose` parameter, similar to `llms.LlamaCpp` class (#11038)

## Description

As of now, when instantiating and during inference, `LlamaCppEmbeddings`
outputs (a lot of) verbose when controlled from Langchain binding - it
is a bit annoying when computing the embeddings of long documents, for
instance.

This PR adds `verbose` for `LlamaCppEmbeddings` objects to be able
**not** to print the verbose of the model to `stderr`. It is natively
supported by `llama-cpp-python` and directly passed to the library – the
PR is hence very small.

The value of `verbose` is `True` by default, following the way it is
defined in [`LlamaCpp` (`llamacpp.py`
#L136-L137)](c87e9fb2ce/libs/langchain/langchain/llms/llamacpp.py (L136-L137))

## Issue

_No issue linked_

## Dependencies

_No additional dependency needed_

## To see it in action

```python
from langchain.embeddings import LlamaCppEmbeddings

MODEL_PATH = "<path_to_gguf_file>"

if __name__ == "__main__":
    llm_embeddings = LlamaCppEmbeddings(
        model_path=MODEL_PATH,
        n_gpu_layers=1,
        n_batch=512,
        n_ctx=2048,
        f16_kv=True,
        verbose=False,
    )
```

Co-authored-by: Bagatur <baskaryan@gmail.com>
pull/11006/head^2
Clément Sicard 11 months ago committed by GitHub
parent a00a73ef18
commit 1b48d6cb8c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -54,6 +54,9 @@ class LlamaCppEmbeddings(BaseModel, Embeddings):
n_gpu_layers: Optional[int] = Field(None, alias="n_gpu_layers")
"""Number of layers to be loaded into gpu memory. Default None."""
verbose: bool = Field(True, alias="verbose")
"""Print verbose output to stderr."""
class Config:
"""Configuration for this pydantic object."""
@ -73,6 +76,7 @@ class LlamaCppEmbeddings(BaseModel, Embeddings):
"use_mlock",
"n_threads",
"n_batch",
"verbose",
]
model_params = {k: values[k] for k in model_param_names}
# For backwards compatibility, only include if non-null.

Loading…
Cancel
Save