langchain/langchain
Alexander Hoyle 42b892c21b
Avoid IntegrityError for SQLiteCache updates (#1286)
When using a `SQLiteCache`, passing duplicate `(prompt, llm, idx)`
tuples to
[`update_cache()`](c5dd491a21/langchain/llms/base.py (L39))
raises an `IntegrityError`. This can happen when the same prompt
appears more than once within a batch.

This PR changes the SQLAlchemy `session.add()` to a `session.merge()` in
`cache.py`, [following the solution from this SO
thread](https://stackoverflow.com/questions/10322514/dealing-with-duplicate-primary-keys-on-insert-in-sqlalchemy-declarative-style).
I believe this fixes #983, but I'm not entirely sure, since that issue
also involves async.
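
To illustrate the difference between an insert-only write and an insert-or-update write, here is a minimal sketch using only the standard-library `sqlite3` module. It is not the SQLAlchemy code in `cache.py`: the `add` and `merge` helpers are hypothetical stand-ins that mimic, at the SQL level, roughly what `session.add()` and `session.merge()` do. The column types are assumptions; the table and column names are taken from the error message in this PR.

```python
import sqlite3

# In-memory database with a composite primary key on (prompt, llm, idx).
# Column types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE full_llm_cache ("
    "prompt TEXT, llm TEXT, idx INTEGER, response TEXT, "
    "PRIMARY KEY (prompt, llm, idx))"
)


def add(row):
    """Plain INSERT (analogous to session.add()): fails on a duplicate key."""
    conn.execute("INSERT INTO full_llm_cache VALUES (?, ?, ?, ?)", row)


def merge(row):
    """Update the row with this primary key if it exists, else insert it.

    This is a hand-written approximation of what session.merge() achieves.
    """
    prompt, llm, idx, response = row
    cur = conn.execute(
        "UPDATE full_llm_cache SET response = ? "
        "WHERE prompt = ? AND llm = ? AND idx = ?",
        (response, prompt, llm, idx),
    )
    if cur.rowcount == 0:  # no existing row matched, so insert a new one
        add(row)


add(("a", "llm_0", 0, "first"))

# A second plain INSERT with the same key violates the primary key.
try:
    add(("a", "llm_0", 0, "second"))
    add_failed = False
except sqlite3.IntegrityError:
    add_failed = True

# The merge-style write overwrites the existing row instead of failing.
merge(("a", "llm_0", 0, "second"))
rows = conn.execute("SELECT * FROM full_llm_cache").fetchall()
```

With the merge-style write, a repeated key replaces the stored response rather than raising, which is the behavior this PR relies on.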

Here's a minimal example of the error:
```python
from pathlib import Path

import langchain
from langchain.cache import SQLiteCache

llm = langchain.OpenAI(model_name="text-ada-001", openai_api_key=Path("/.openai_api_key").read_text().strip())
langchain.llm_cache = SQLiteCache("test_cache.db")
llm.generate(['a'] * 5)
```
```
>   IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: full_llm_cache.prompt, full_llm_cache.llm, full_llm_cache.idx
    [SQL: INSERT INTO full_llm_cache (prompt, llm, idx, response) VALUES (?, ?, ?, ?)]
    [parameters: ('a', "[('_type', 'openai'), ('best_of', 1), ('frequency_penalty', 0), ('logit_bias', {}), ('max_tokens', 256), ('model_name', 'text-ada-001'), ('n', 1), ('presence_penalty', 0), ('request_timeout', None), ('stop', None), ('temperature', 0.7), ('top_p', 1)]", 0, '\n\nA is for air.\n\nA is for atmosphere.')]
    (Background on this error at: https://sqlalche.me/e/14/gkpj)
```
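
The collision can be seen without touching the database. The sketch below assumes each prompt yields a single generation (so `idx` is always 0, matching `('n', 1)` in the parameters above) and uses a placeholder `llm_string`; it only counts how often each `(prompt, llm, idx)` key would be written for the batch in the example.

```python
from collections import Counter

prompts = ["a"] * 5     # the batch from the minimal example
llm_string = "llm_0"    # placeholder for the serialized LLM parameters

# One cache key per prompt; idx is 0 because we assume one generation each.
keys = [(prompt, llm_string, 0) for prompt in prompts]

# Any key that occurs more than once would trigger the UNIQUE constraint
# on the second plain INSERT.
duplicates = {key: n for key, n in Counter(keys).items() if n > 1}
```

Here every prompt maps to the same key, so the second insert is already a duplicate.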

After the change, we now have the following:
```python
class Output:
    def __init__(self, text):
        self.text = text

# make dummy data
cache = SQLiteCache("test_cache_2.db")
cache.update(prompt="prompt_0", llm_string="llm_0", return_val=[Output("text_0")])
cache.engine.execute("SELECT * FROM full_llm_cache").fetchall()

# output:
# [('prompt_0', 'llm_0', 0, 'text_0')]
```

```python
# update the data; before the change, this would have raised an `IntegrityError`
cache.update(prompt="prompt_0", llm_string="llm_0", return_val=[Output("text_0_new")])
cache.engine.execute("SELECT * FROM full_llm_cache").fetchall()

# output:
# [('prompt_0', 'llm_0', 0, 'text_0_new')]
```
1 year ago
agents searx: remove duplicate param (#1219) 1 year ago
callbacks rfc: callback changes (#1165) 1 year ago
chains Harrison/source docs (#1275) 1 year ago
docstore Harrison/wiki update (#622) 1 year ago
document_loaders add CoNLL-U document loader (#1297) 1 year ago
embeddings Harrison/cohere params (#1278) 1 year ago
evaluation Refactor some loops into list comprehensions (#1185) 1 year ago
graphs catch networkx error (#1201) 1 year ago
indexes Harrion/kg (#1016) 1 year ago
llms Harrison/banana fix (#1311) 1 year ago
prompts ruff ruff (#1203) 1 year ago
tools add ifttt tool (#1244) 1 year ago
utilities Harrison/errors (#1276) 1 year ago
vectorstores Harrison/add documents (#1197) 1 year ago
__init__.py Add Writer, Banana, Modal, StochasticAI (#1270) 1 year ago
cache.py Avoid IntegrityError for SQLiteCache updates (#1286) 1 year ago
docker-compose.yaml add tracing support to langchain (#741) 1 year ago
example_generator.py Harrison/improve cache (#368) 1 year ago
formatting.py initial commit 2 years ago
input.py Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841) 1 year ago
model_laboratory.py Harrison/improve cache (#368) 1 year ago
py.typed Add py.typed marker to package (#121) 2 years ago
python.py Harrison/tools exp (#372) 1 year ago
requests.py LLMRequestsChain (#267) 2 years ago
schema.py add tracing support to langchain (#741) 1 year ago
serpapi.py move serpapi wrapper (#1199) 1 year ago
server.py add tracing support to langchain (#741) 1 year ago
sql_database.py fix sqlite internal tables breaking table_info (#1224) 1 year ago
text_splitter.py fix bug with length function (#1257) 1 year ago
utils.py Harrison/bing wrapper (#656) 1 year ago