feat: add batch request support for text-embedding-v3 model (#26375)

PR title: “langchain: add batch request support for text-embedding-v3 model”

PR message:

• Description: This PR introduces batch request support for the
text-embedding-v3 model within LangChain. The new functionality lets
users pass many text inputs at once; inputs are split into
appropriately sized requests under the hood, improving efficiency and
performance for high-volume applications (a usage sketch follows this
list).
• Issue: This PR addresses #<issue_number> (if applicable).
• Dependencies: No new external dependencies are required for this
change.
• Twitter handle: If announced on Twitter, please mention me at
@yourhandle.
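
For context, here is a minimal usage sketch (not part of this PR's diff;
the API key value is a placeholder, and it assumes the dashscope SDK is
installed):

```python
from langchain_community.embeddings import DashScopeEmbeddings

# Placeholder key; requires the `dashscope` SDK and a valid DashScope account.
embeddings = DashScopeEmbeddings(
    model="text-embedding-v3",
    dashscope_api_key="your-api-key",
)

# 10 inputs exceed the text-embedding-v3 per-request limit of 6, so the
# patched embed_with_retry splits them into two requests behind the scenes.
texts = [f"document {i}" for i in range(10)]
vectors = embeddings.embed_documents(texts)
print(len(vectors))  # 10 embeddings, one per input
```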

Add tests and docs:

1. Added unit tests covering the batch request functionality; they run
without requiring network access (a mock-based sketch follows this list).
2. Included an example notebook demonstrating the batch request feature,
located in docs/docs/integrations.
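
A sketch of what such a network-free test might look like (illustrative
only, not the PR's actual test file; the test name is hypothetical, it
assumes the dashscope SDK is importable, and it stubs out the client):

```python
from unittest.mock import MagicMock

from langchain_community.embeddings.dashscope import (
    DashScopeEmbeddings,
    embed_with_retry,
)


def test_text_embedding_v3_batching() -> None:
    embeddings = DashScopeEmbeddings(
        model="text-embedding-v3", dashscope_api_key="fake-key"
    )
    # Stub the DashScope client so no network access is needed.
    fake_resp = MagicMock()
    fake_resp.status_code = 200
    fake_resp.output = {"embeddings": []}
    embeddings.client = MagicMock()
    embeddings.client.call.return_value = fake_resp

    embed_with_retry(embeddings, input=["x"] * 10, model="text-embedding-v3")

    # 10 inputs with a v3 batch size of 6 should yield ceil(10 / 6) = 2 calls.
    assert embeddings.client.call.call_count == 2
```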

Lint and test: All required formatting and linting checks have been
performed using make format and make lint. The changes have been
verified with make test to ensure compatibility.

Additional notes:

• The changes are fully backwards compatible.
• No modifications were made to pyproject.toml, ensuring no new
dependencies were added.
• The update only affects the langchain package and does not involve
other packages.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>

@@ -23,6 +23,8 @@ from tenacity import (
 
 logger = logging.getLogger(__name__)
 
+BATCH_SIZE = {"text-embedding-v1": 25, "text-embedding-v2": 25, "text-embedding-v3": 6}
+
 
 def _create_retry_decorator(embeddings: DashScopeEmbeddings) -> Callable[[Any], Any]:
     multiplier = 1
@@ -49,9 +51,12 @@ def embed_with_retry(embeddings: DashScopeEmbeddings, **kwargs: Any) -> Any:
         i = 0
         input_data = kwargs["input"]
         input_len = len(input_data) if isinstance(input_data, list) else 1
+        batch_size = BATCH_SIZE.get(kwargs["model"], 25)
         while i < input_len:
             kwargs["input"] = (
-                input_data[i : i + 25] if isinstance(input_data, list) else input_data
+                input_data[i : i + batch_size]
+                if isinstance(input_data, list)
+                else input_data
             )
             resp = embeddings.client.call(**kwargs)
             if resp.status_code == 200:
@@ -67,7 +72,7 @@ def embed_with_retry(embeddings: DashScopeEmbeddings, **kwargs: Any) -> Any:
                 f"code: {resp.code} \n message: {resp.message}",
                 response=resp,
             )
-            i += 25
+            i += batch_size
         return result
 
     return _embed_with_retry(**kwargs)
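
Distilled from the diff above, the batching logic reduces to the
following standalone sketch (embed_in_batches and client_call are
stand-ins for illustration, the latter playing the role of
embeddings.client.call; the BATCH_SIZE table and stride mirror the
patch):

```python
BATCH_SIZE = {"text-embedding-v1": 25, "text-embedding-v2": 25, "text-embedding-v3": 6}


def embed_in_batches(texts: list, model: str, client_call) -> list:
    # Unknown models fall back to the previous hard-coded chunk of 25.
    batch_size = BATCH_SIZE.get(model, 25)
    result = []
    i = 0
    while i < len(texts):
        # Each request carries at most batch_size inputs (6 for v3).
        resp = client_call(input=texts[i : i + batch_size], model=model)
        result += resp.output["embeddings"]
        i += batch_size
    return result
```

For text-embedding-v3, a 10-item input therefore produces two requests
(6 + 4), matching the i += batch_size stride in the patch.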