langchain/langchain
wewebber-merlin 8a7c95e555
Retryable exception for empty OpenAI embedding. (#7070)
Description:

The OpenAI "embeddings" API intermittently falls into a failure state
where an embedding is returned as [NaN], rather than the expected 1536
floats. This patch checks for that state (specifically, for an embedding
of length 1) and, if it occurs, raises an APIError, which causes the
chunk to be retried.
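
The check described above can be sketched roughly as follows. This is a
minimal, self-contained illustration and not the patched langchain code:
the names RetryableEmbeddingError, check_response and embed_with_retry are
hypothetical, and the response shape (a "data" list of items that each hold
an "embedding" list) is an assumption made for the example.

```python
import time
from typing import Callable, Optional


class RetryableEmbeddingError(Exception):
    """Hypothetical stand-in for the retryable API error named in the description."""


def check_response(response: dict) -> dict:
    """Raise a retryable error if any returned embedding has length 1.

    Assumes (for illustration only) that the response looks like
    {"data": [{"embedding": [...]}, ...]}.
    """
    if any(len(item["embedding"]) == 1 for item in response["data"]):
        raise RetryableEmbeddingError("API returned an empty embedding")
    return response


def embed_with_retry(
    fetch: Callable[[], dict],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> dict:
    """Call fetch() until its response passes check_response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[RetryableEmbeddingError] = None
    for _ in range(max_attempts):
        try:
            return check_response(fetch())
        except RetryableEmbeddingError as exc:
            # The bad state is intermittent, so the same chunk is simply retried.
            last_error = exc
            time.sleep(delay_seconds)
    assert last_error is not None
    raise last_error
```

Raising an exception type that the caller's retry logic already handles
means the degenerate response is treated like any other transient API
failure, so no separate recovery path is needed.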

Issue:

I have been unable to find an official langchain issue for this problem,
but it is discussed (by another user) at
https://stackoverflow.com/questions/76469415/getting-embeddings-of-length-1-from-langchain-openaiembeddings

Maintainer: @dev2049

Testing: 

Since this is an intermittent OpenAI issue, I have not provided a unit
or integration test. The patched code has, however, been run
successfully over several million tokens.

---------

Co-authored-by: William Webber <william@williamwebber.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 15:23:45 -04:00
agents Remove duplicate lines (#7138) 2023-07-04 20:13:27 -04:00
callbacks support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
chains support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
chat_models support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
client Harrison/split schema dir (#7025) 2023-07-01 13:39:19 -04:00
docstore Update in_memory.py to fix "TypeError: keywords must be strings" (#7202) 2023-07-05 12:48:38 -04:00
document_loaders added Brave Search document_loader (#6989) 2023-07-02 19:01:24 -07:00
embeddings Retryable exception for empty OpenAI embedding. (#7070) 2023-07-05 15:23:45 -04:00
evaluation support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
experimental docstrings document_loaders 1 (#6847) 2023-07-02 12:13:04 -07:00
graphs Support for SPARQL (#7165) 2023-07-05 13:00:16 -04:00
indexes
llms support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
load added docstrings where they missed (#6626) 2023-06-23 15:49:44 -07:00
memory add docstring for in memory class (#7160) 2023-07-04 14:59:17 -07:00
output_parsers Mark some output parsers as serializable (cross-checked w/ JS) (#7083) 2023-07-05 14:53:56 -04:00
prompts Jinja2 validation changed to issue warnings rather than issuing exceptions. (#7161) 2023-07-05 14:04:29 -04:00
retrievers fix: Chroma filter symbols not supporting LIKE and CONTAIN (#7169) 2023-07-05 14:04:18 -04:00
schema support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
tools support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
utilities Add serialized object to retriever start callback (#7074) 2023-07-05 18:04:43 +01:00
vectorstores Remove extra base model (#7213) 2023-07-05 14:02:27 -04:00
__init__.py move base prompt to schema (#6995) 2023-07-02 22:38:59 -04:00
base_language.py Harrison/split schema dir (#7025) 2023-07-01 13:39:19 -04:00
cache.py Page per class-style api reference (#6560) 2023-06-30 09:23:32 -07:00
docker-compose.yaml
document_transformers.py added docstrings where they missed (#6626) 2023-06-23 15:49:44 -07:00
env.py Use LCP Client in Tracer (#5908) 2023-06-08 21:15:14 -07:00
example_generator.py
formatting.py
input.py FileCallbackHandler (#5589) 2023-06-03 16:48:48 -07:00
math_utils.py
model_laboratory.py
py.typed
python.py
requests.py fix: missing parameter in POST/PUT/PATCH HTTP requests (#7194) 2023-07-05 12:47:30 -04:00
serpapi.py
server.py Fix for ModuleNotFoundError while running langchain-server. Issue #5833 (#6077) 2023-06-13 08:37:07 -07:00
sql_database.py updated sql_database.py for returning sorted table names. (#6692) 2023-06-25 12:04:24 -07:00
text_splitter.py Page per class-style api reference (#6560) 2023-06-30 09:23:32 -07:00
utils.py added docstrings where they missed (#6626) 2023-06-23 15:49:44 -07:00