mirror of https://github.com/hwchase17/langchain
community[patch]: Fix YandexGPT embeddings (#19720)
Fix of YandexGPT embeddings.

The current version uses a single `model_name` for queries and documents, essentially making the `embed_documents` and `embed_query` methods the same. Yandex has a different endpoint (`model_uri`) for encoding documents, see [this](https://yandex.cloud/en/docs/yandexgpt/concepts/embeddings).

The bug may impact retrievers built with `YandexGPTEmbeddings` (for instance, a FAISS database used as a retriever), since they use both `embed_documents` and `embed_query`.

A simple snippet to test the behaviour:

```python
from langchain_community.embeddings.yandex import YandexGPTEmbeddings

embeddings = YandexGPTEmbeddings()
q_emb = embeddings.embed_query('hello world')
doc_emb = embeddings.embed_documents(['hello world', 'hello world'])
q_emb == doc_emb[0]
```

The response is `True` with the current version and `False` with the changes I made.

Twitter: @egor_krash

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
parent 4be7ca7b4c
commit c8391d4ff1
@@ -0,0 +1,24 @@
import os

from langchain_community.embeddings import YandexGPTEmbeddings


def test_init() -> None:
    os.environ["YC_API_KEY"] = "foo"
    models = [
        YandexGPTEmbeddings(folder_id="bar"),
        YandexGPTEmbeddings(
            query_model_uri="emb://bar/text-search-query/latest",
            doc_model_uri="emb://bar/text-search-doc/latest",
        ),
        YandexGPTEmbeddings(
            folder_id="bar",
            query_model_name="text-search-query",
            doc_model_name="text-search-doc",
        ),
    ]
    for embeddings in models:
        assert embeddings.model_uri == "emb://bar/text-search-query/latest"
        assert embeddings.doc_model_uri == "emb://bar/text-search-doc/latest"
        assert embeddings.model_name == "text-search-query"
        assert embeddings.doc_model_name == "text-search-doc"
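
The assertions in this test suggest that, when only `folder_id` and the model names are supplied, the two URIs follow the pattern `emb://<folder_id>/<model_name>/<version>`. The sketch below illustrates that composition only. `build_model_uri` is a hypothetical helper, not part of `langchain_community`, and the default version `"latest"` is an assumption taken from the expected strings in the test.

```python
def build_model_uri(
    folder_id: str, model_name: str, model_version: str = "latest"
) -> str:
    """Hypothetical helper: compose an embedding model URI from its parts.

    The "emb://" scheme and the part order are inferred from the expected
    values asserted in the test, not from the library source.
    """
    return f"emb://{folder_id}/{model_name}/{model_version}"


# Queries and documents use different model names, hence different URIs.
query_uri = build_model_uri("bar", "text-search-query")
doc_uri = build_model_uri("bar", "text-search-doc")
```

With separate URIs for the two roles, `embed_query` and `embed_documents` can target different endpoints, which is the behaviour the commit message describes.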