Harrison Chase 4c02f4bc30 Fix bug in svm.LinearSVC, add support for a relevancy_threshold (#2959 ) (#2981 )	1 year ago

- Modify SVMRetriever class to add an optional relevancy_threshold
- Modify SVMRetriever.get_relevant_documents method to filter out documents with similarity scores below the relevancy threshold
- Normalize the similarities to be between 0 and 1 so the relevancy_threshold makes more sense
- Limit the number of results to the top k documents or the number of relevant documents above the threshold, whichever is smaller

This code will now return the top self.k results (or fewer, if there are not enough results that meet the self.relevancy_threshold criteria).

The svm.LinearSVC implementation in scikit-learn is non-deterministic, which means SVMRetriever.from_texts(["bar", "world", "foo", "hello", "foo bar"]) could return [3 0 5 4 2 1] instead of [0 3 5 4 2 1] with a query of "foo". If you pass in multiple "foo" texts, the order could be different each time. Here, we only care whether 0 is the first element; otherwise it will offset the text and similarities.

Example:

```python
# Import paths assumed from the package layout at the time of this commit.
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import SVMRetriever

retriever = SVMRetriever.from_texts(
    ["foo", "bar", "world", "hello", "foo bar"],
    OpenAIEmbeddings(),
    k=4,
    relevancy_threshold=.25,
)
result = retriever.get_relevant_documents("foo")
```

yields

```python
[Document(page_content='foo', metadata={}), Document(page_content='foo bar', metadata={})]
```

---------

Co-authored-by: Brandon Sandoval <52767641+account00001@users.noreply.github.com>
.github	fix: tests with Dockerfile (#2382 )	1 year ago
docs	Fix docs for parse_with_prompt (#2986 )	1 year ago
langchain	Fix bug in svm.LinearSVC, add support for a relevancy_threshold (#2959 ) (#2981 )	1 year ago
tests	Remove pythonrepl from LLM-MathChain (#2943 )	1 year ago
.dockerignore	fix: tests with Dockerfile (#2382 )	1 year ago
.flake8	change run to use args and kwargs (#367 )	1 year ago
.gitignore	fix: elasticsearch (#2402 )	1 year ago
CITATION.cff	bump version to 0069 (#710 )	1 year ago
Dockerfile	feat: add pytest-vcr for recording HTTP interactions in integration tests (#2445 )	1 year ago
LICENSE	add license (#50 )	2 years ago
Makefile	Add lint_diff command (#2449 )	1 year ago
README.md	Update README.md (#2805 )	1 year ago
poetry.lock	Remove pythonrepl from LLM-MathChain (#2943 )	1 year ago
poetry.toml	fix Poetry 1.4.0+ installation (#1935 )	1 year ago
pyproject.toml	Remove pythonrepl from LLM-MathChain (#2943 )	1 year ago
readthedocs.yml	update rtd config (#1664 )	1 year ago

README.md

🦜️🔗 LangChain

⚡ Building applications with LLMs through composability ⚡

Production Support: As you move your LangChains into production, we'd love to offer more comprehensive support. Please fill out this form and we'll set up a dedicated support Slack channel.

Quick Install

pip install langchain or conda install langchain -c conda-forge

🤔 What is this?

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.

This library aims to assist in the development of those types of applications. Common examples include:

❓ Question Answering over specific documents

Documentation
End-to-end Example: Question Answering over Notion Database

💬 Chatbots

Documentation
End-to-end Example: Chat-LangChain

🤖 Agents

Documentation
End-to-end Example: GPT+WolframAlpha

📖 Documentation

Please see here for full documentation on:

Getting started (installation, setting up the environment, simple examples)
How-To examples (demos, integrations, helper functions)
Reference (full API docs)
Resources (high-level explanation of core concepts)

🚀 What can this help with?

There are six main areas that LangChain is designed to help with. These are, in increasing order of complexity:

📃 LLMs and Prompts:

This includes prompt management, prompt optimization, generic interface for all LLMs, and common utilities for working with LLMs.

🔗 Chains:

Chains go beyond just a single LLM call, and are sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
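
To make the idea of a sequence of calls concrete, here is a minimal plain-Python sketch. It does not use LangChain's chain interface; `fake_llm`, `add_prefix`, and `run_chain` are hypothetical names invented for illustration, and the "LLM" is a stub that upper-cases its input.

```python
from typing import Callable, List

# A step maps text to text; it stands in for an LLM call or any other utility.
Step = Callable[[str], str]


def add_prefix(text: str) -> str:
    """Hypothetical utility step that turns user input into a prompt."""
    return f"Summarize: {text}"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; it only upper-cases the prompt."""
    return prompt.upper()


def run_chain(steps: List[Step], user_input: str) -> str:
    """Run the steps in order, feeding each output into the next step."""
    value = user_input
    for step in steps:
        value = step(value)
    return value


result = run_chain([add_prefix, fake_llm], "hello world")
print(result)  # SUMMARIZE: HELLO WORLD
```

Because every step shares one signature, steps can be reordered or swapped without changing `run_chain`, which is roughly the benefit a standard interface is meant to provide.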

📚 Data Augmented Generation:

Data Augmented Generation involves specific types of chains that first interact with an external datasource to fetch data to use in the generation step. Examples of this include summarization of long pieces of text and question answering over specific data sources.
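
The fetch-then-generate pattern described above can be sketched in plain Python. This is an illustration under stated assumptions, not LangChain's API: `DOCS`, `fetch`, `build_prompt`, and `fake_llm` are hypothetical, the datasource is an in-memory dictionary, and the "model" simply returns the first line of context.

```python
from typing import Dict, List

# Hypothetical in-memory datasource; a real application might query a vector store.
DOCS: Dict[str, str] = {
    "install": "Install with pip install langchain.",
    "agents": "Agents decide which actions to take.",
}


def fetch(query: str) -> List[str]:
    """Return documents whose key appears in the query (naive keyword match)."""
    lowered = query.lower()
    return [text for key, text in DOCS.items() if key in lowered]


def build_prompt(query: str, context: List[str]) -> str:
    """Combine the fetched data with the question for the generation step."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: returns the first line of context."""
    return prompt.splitlines()[1]


question = "How do I install the library?"
answer = fake_llm(build_prompt(question, fetch(question)))
print(answer)  # Install with pip install langchain.
```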

🤖 Agents:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
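
The decide, act, observe loop can be sketched without any LLM at all. In the hedged example below, `fake_policy` is a hypothetical rule-based stand-in for the model's decision step, and `TOOLS`, `run_agent`, and the tool names are likewise invented for illustration rather than taken from LangChain.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools the agent may call; the names are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add_one": lambda x: str(int(x) + 1),
    "double": lambda x: str(int(x) * 2),
}


def fake_policy(goal: int, observation: str) -> Tuple[str, str]:
    """Stand-in for the LLM: pick the next action from the last observation."""
    current = int(observation)
    if current == goal:
        return ("finish", observation)
    if current * 2 <= goal:
        return ("double", observation)
    return ("add_one", observation)


def run_agent(goal: int, start: str, max_steps: int = 20) -> List[str]:
    """Loop: decide on an action, take it, record the observation, repeat."""
    observation = start
    trace: List[str] = []
    for _ in range(max_steps):
        action, action_input = fake_policy(goal, observation)
        if action == "finish":
            break
        observation = TOOLS[action](action_input)
        trace.append(f"{action} -> {observation}")
    return trace


trace = run_agent(goal=10, start="2")
print(trace)  # ['double -> 4', 'double -> 8', 'add_one -> 9', 'add_one -> 10']
```

The `max_steps` cap is a design choice of this sketch: it keeps the loop from running forever if the policy never decides to finish.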

🧠 Memory:

Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
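
As a rough illustration of persisting state between calls, the sketch below keeps a buffer of earlier turns and prepends it to each new prompt. `BufferMemory`, `call_chain`, and `fake_llm` are hypothetical names, not LangChain classes, and the stub model only reports how much history it received.

```python
from typing import List


class BufferMemory:
    """Hypothetical memory: stores prior turns so later calls can see them."""

    def __init__(self) -> None:
        self.turns: List[str] = []

    def load(self) -> str:
        """Return the stored history as one block of text."""
        return "\n".join(self.turns)

    def save(self, user: str, reply: str) -> None:
        """Append the latest exchange to the history."""
        self.turns.append(f"Human: {user}")
        self.turns.append(f"AI: {reply}")


def fake_llm(prompt: str) -> str:
    """Stand-in for a model: counts the history lines it was given."""
    history = prompt.split("\n---\n")[0]
    return f"I can see {len(history.splitlines())} earlier lines."


def call_chain(memory: BufferMemory, user: str) -> str:
    """One call: load history, build the prompt, call the model, save the turn."""
    prompt = f"{memory.load()}\n---\n{user}"
    reply = fake_llm(prompt)
    memory.save(user, reply)
    return reply


memory = BufferMemory()
first = call_chain(memory, "Hi")
second = call_chain(memory, "Still there?")
print(first)   # I can see 0 earlier lines.
print(second)  # I can see 2 earlier lines.
```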

🧐 Evaluation:

[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

For more information on these concepts, please see our full documentation.

💁 Contributing

As an open source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infra, or better documentation.

For detailed information on how to contribute, see here.
