You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
langchain/libs/langchain
Chris Papademetrious ee6c922c91
langchain[minor]: enhance `LocalFileStore` to offer `update_atime` parameter that updates access times on read (#20951)
**Description:**
The `LocalFileStore` class can be used to create an on-disk
`CacheBackedEmbeddings` cache. The number of files in these embeddings
caches can grow to be quite large over time (hundreds of thousands) as
embeddings are computed for new versions of content, but the embeddings
for old/deprecated content are not removed.

A *least-recently-used* (LRU) cache policy could be applied to the
`LocalFileStore` directory to delete cache entries that have not been
referenced for some time:

```bash
# delete files that have not been accessed in the last 90 days
find embeddings_cache_dir/ -atime 90 -print0 | xargs -0 rm
```

However, most filesystems in enterprise environments disable access time
modification on read to improve performance. As a result, the access
times of these cache entry files are not updated when their values are
read.

To resolve this, this pull request updates the `LocalFileStore`
constructor to offer an `update_atime` parameter that causes access
times to be updated when a cache entry is read.

For example,

```python
file_store = LocalFileStore(temp_dir, update_atime=True)
```

The default is `False`, which retains the original behavior.

**Testing:**
I updated the LocalFileStore unit tests to test the access time update.
2 months ago
..
langchain langchain[minor]: enhance `LocalFileStore` to offer `update_atime` parameter that updates access times on read (#20951) 2 months ago
scripts langchain[patch]: Update import handling in `adapters` (#21079) 2 months ago
tests langchain[minor]: enhance `LocalFileStore` to offer `update_atime` parameter that updates access times on read (#20951) 2 months ago
.dockerignore (WIP) set up experimental (#7959) 11 months ago
.flake8 (WIP) set up experimental (#7959) 11 months ago
Dockerfile (WIP) set up experimental (#7959) 11 months ago
LICENSE IMPROVEMENT add license file to subproject (#8403) 8 months ago
Makefile update scheduled tests (#20526) 2 months ago
README.md docs: reorg and visual refresh (#19765) 3 months ago
dev.Dockerfile infra: fix how Poetry is installed in the dev container (#20521) 2 months ago
poetry.lock langchain[patch]: Remove jsonpatch from poetry file (#21272) 2 months ago
poetry.toml (WIP) set up experimental (#7959) 11 months ago
pyproject.toml langchain[patch]: Remove jsonpatch from poetry file (#21272) 2 months ago

README.md

🦜🔗 LangChain

Building applications with LLMs through composability

Release Notes lint test Downloads License: MIT Twitter Open in Dev Containers Open in GitHub Codespaces GitHub star chart Dependency Status Open Issues

Looking for the JS/TS version? Check out LangChain.js.

To help you ship LangChain apps to production faster, check out LangSmith. LangSmith is a unified developer platform for building, testing, and monitoring LLM applications. Fill out this form to speak with our sales team.

Quick Install

pip install langchain or pip install langsmith && conda install langchain -c conda-forge

🤔 What is this?

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.

This library aims to assist in the development of those types of applications. Common examples of these applications include:

Question answering with RAG

🧱 Extracting structured output

🤖 Chatbots

📖 Documentation

Please see here for full documentation on:

  • Getting started (installation, setting up the environment, simple examples)
  • How-To examples (demos, integrations, helper functions)
  • Reference (full API docs)
  • Resources (high-level explanation of core concepts)

🚀 What can this help with?

There are five main areas that LangChain is designed to help with. These are, in increasing order of complexity:

📃 Models and Prompts:

This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with chat models and LLMs.

🔗 Chains:

Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.

📚 Retrieval Augmented Generation:

Retrieval Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.

🤖 Agents:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.

🧐 Evaluation:

[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

For more information on these concepts, please see our full documentation.

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see the Contributing Guide.