langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
Jona Sassenhagen	0ea7224535	[Minor] Remove tagger from spacy sentencizer (#7534 ) @svlandeg gave me a tip for how to improve a bit on https://github.com/hwchase17/langchain/pull/7442 for some extra speed and memory gains. The tagger isn't needed for sentencization, so can be disabled too.	2023-07-11 16:43:46 -04:00
Kacper Łukawski	1f83b5f47e	Reuse the existing collection if configured properly in Qdrant.from_texts (#7530 ) This PR changes the behavior of `Qdrant.from_texts` so the collection is reused if not requested to recreate it. Previously, calling `Qdrant.from_texts` or `Qdrant.from_documents` resulted in removing the old data which was confusing for many.	2023-07-11 16:24:35 -04:00
Leonid Kuligin	6674b33cf5	Added support for chat_history (#7555 ) #7469 Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-11 15:27:26 -04:00
Felix Brockmeier	406a9dc11f	Add notebook example for Lemon AI NLP Workflow Automation (#7556 ) - Description: Added notebook to LangChain docs that explains how to use Lemon AI NLP Workflow Automation tool with Langchain - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @agola11 - Twitter handle: felixbrockm	2023-07-11 15:15:11 -04:00
Lance Martin	9e067b8cc9	Add env setup (#7550 ) Include setup	2023-07-11 09:48:40 -07:00
Bagatur	3c4338470e	bump 230 (#7544 )	2023-07-11 11:24:08 -04:00
Bagatur	d2137eea9f	fix cpal docs (#7545 )	2023-07-11 11:07:45 -04:00
Boris	9129318466	CPAL (#6255 ) # Causal program-aided language (CPAL) chain ## Motivation This builds on the recent [PAL](https://arxiv.org/abs/2211.10435) to stop LLM hallucination. The problem with the [PAL](https://arxiv.org/abs/2211.10435) approach is that it hallucinates on a math problem with a nested chain of dependence. The innovation here is that this new CPAL approach includes causal structure to fix hallucination. For example, using the below word problem, PAL answers with 5, and CPAL answers with 13. "Tim buys the same number of pets as Cindy and Boris." "Cindy buys the same number of pets as Bill plus Bob." "Boris buys the same number of pets as Ben plus Beth." "Bill buys the same number of pets as Obama." "Bob buys the same number of pets as Obama." "Ben buys the same number of pets as Obama." "Beth buys the same number of pets as Obama." "If Obama buys one pet, how many pets total does everyone buy?" The CPAL chain represents the causal structure of the above narrative as a causal graph or DAG, which it can also plot, as shown below. ![complex-graph](https://github.com/hwchase17/langchain/assets/367522/d938db15-f941-493d-8605-536ad530f576) . The two major sections below are: 1. Technical overview 2. Future application Also see [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 1. Technical overview ### CPAL versus PAL Like [PAL](https://arxiv.org/abs/2211.10435), CPAL intends to reduce large language model (LLM) hallucination. The CPAL chain is different from the PAL chain for a couple of reasons. * CPAL adds a causal structure (or DAG) to link entity actions (or math expressions). * The CPAL math expressions are modeling a chain of cause and effect relations, which can be intervened upon, whereas for the PAL chain math expressions are projected math identities. PAL's generated python code is wrong. It hallucinates when complexity increases. ```python def solution(): """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?""" obama_pets = 1 tim_pets = obama_pets cindy_pets = obama_pets + obama_pets boris_pets = obama_pets + obama_pets total_pets = tim_pets + cindy_pets + boris_pets result = total_pets return result # math result is 5 ``` CPAL's generated python code is correct. ```python story outcome data name code value depends_on 0 obama pass 1.0 [] 1 bill bill.value = obama.value 1.0 [obama] 2 bob bob.value = obama.value 1.0 [obama] 3 ben ben.value = obama.value 1.0 [obama] 4 beth beth.value = obama.value 1.0 [obama] 5 cindy cindy.value = bill.value + bob.value 2.0 [bill, bob] 6 boris boris.value = ben.value + beth.value 2.0 [ben, beth] 7 tim tim.value = cindy.value + boris.value 4.0 [cindy, boris] query data { "question": "how many pets total does everyone buy?", "expression": "SELECT SUM(value) FROM df", "llm_error_msg": "" } # query result is 13 ``` Based on the comments below, CPAL's intended location in the library is `experimental/chains/cpal` and PAL's location is`chains/pal`. ### CPAL vs Graph QA Both the CPAL chain and the Graph QA chain extract entity-action-entity relations into a DAG. The CPAL chain is different from the Graph QA chain for a few reasons. * Graph QA does not connect entities to math expressions * Graph QA does not associate actions in a sequence of dependence. * Graph QA does not decompose the narrative into these three parts: 1. Story plot or causal model 4. Hypothetical question 5. Hypothetical condition ### Evaluation Preliminary evaluation on simple math word problems shows that this CPAL chain generates less hallucination than the PAL chain on answering questions about a causal narrative. Two examples are in [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 2. Future application ### "Describe as Narrative, Test as Code" The thesis here is that the Describe as Narrative, Test as Code approach allows you to represent a causal mental model both as code and as a narrative, giving you the best of both worlds. #### Why describe a causal mental mode as a narrative? The narrative form is quick. At a consensus building meeting, people use narratives to persuade others of their causal mental model, aka. plan. You can share, version control and index a narrative. #### Why test a causal mental model as a code? Code is testable, complex narratives are not. Though fast, narratives are problematic as their complexity increases. The problem is LLMs and humans are prone to hallucination when predicting the outcomes of a narrative. The cost of building a consensus around the validity of a narrative outcome grows as its narrative complexity increases. Code does not require tribal knowledge or social power to validate. Code is composable, complex narratives are not. The answer of one CPAL chain can be the hypothetical conditions of another CPAL Chain. For stochastic simulations, a composable plan can be integrated with the [DoWhy library](https://github.com/py-why/dowhy). Lastly, for the futuristic folk, a composable plan as code allows ordinary community folk to design a plan that can be integrated with a blockchain for funding. An explanation of a dependency planning application is [here.](https://github.com/borisdev/cpal-llm-chain-demo) --- Twitter handle: @boris_dev --------- Co-authored-by: Boris Dev <borisdev@Boriss-MacBook-Air.local>	2023-07-11 10:11:21 -04:00
Alejandra De Luna	2e4047e5e7	feat: support generate as an early stopping method for `OpenAIFunctionsAgent` (#7229 ) This PR proposes an implementation to support `generate` as an `early_stopping_method` for the new `OpenAIFunctionsAgent` class. The motivation behind is to facilitate the user to set a maximum number of actions the agent can take with `max_iterations` and force a final response with this new agent (as with the `Agent` class). The following changes were made: - The `OpenAIFunctionsAgent.return_stopped_response` method was overwritten to support `generate` as an `early_stopping_method` - A boolean `with_functions` parameter was added to the `OpenAIFunctionsAgent.plan` method This way the `OpenAIFunctionsAgent.return_stopped_response` method can call the `OpenAIFunctionsAgent.plan` method with `with_function=False` when the `early_stopping_method` is set to `generate`, making a call to the LLM with no functions and forcing a final response from the `"assistant"`. - Relevant maintainer: @hinthornw - Twitter handle: @aledelunap --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 09:25:02 -04:00
Hashem Alsaket	1dd4236177	Fix HF endpoint returns blank for text-generation (#7386 ) Description: Current `_call` function in the `langchain.llms.HuggingFaceEndpoint` class truncates response when `task=text-generation`. Same error discussed a few days ago on Hugging Face: https://huggingface.co/tiiuae/falcon-40b-instruct/discussions/51 Issue: Fixes #7353 Tag maintainer: @hwchase17 @baskaryan @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 03:06:05 -04:00
Lance Martin	4a94f56258	Minor edits to QA docs (#7507 ) Small clean-ups	2023-07-10 22:15:05 -07:00
Raymond Yuan	5171c3bcca	Refactor vector storage to correctly handle relevancy scores (#6570 ) Description: This pull request aims to support generating the correct generic relevancy scores for different vector stores by refactoring the relevance score functions and their selection in the base class and subclasses of VectorStore. This is especially relevant with VectorStores that require a distance metric upon initialization. Note many of the current implenetations of `_similarity_search_with_relevance_scores` are not technically correct, as they just return `self.similarity_search_with_score(query, k, **kwargs)` without applying the relevant score function Also includes changes associated with: https://github.com/hwchase17/langchain/pull/6564 and https://github.com/hwchase17/langchain/pull/6494 See more indepth discussion in thread in #6494 Issue: https://github.com/hwchase17/langchain/issues/6526 https://github.com/hwchase17/langchain/issues/6481 https://github.com/hwchase17/langchain/issues/6346 Dependencies: None The changes include: - Properly handling score thresholding in FAISS `similarity_search_with_score_by_vector` for the corresponding distance metric. - Refactoring the `_similarity_search_with_relevance_scores` method in the base class and removing it from the subclasses for incorrectly implemented subclasses. - Adding a `_select_relevance_score_fn` method in the base class and implementing it in the subclasses to select the appropriate relevance score function based on the distance strategy. - Updating the `__init__` methods of the subclasses to set the `relevance_score_fn` attribute. - Removing the `_default_relevance_score_fn` function from the FAISS class and using the base class's `_euclidean_relevance_score_fn` instead. - Adding the `DistanceStrategy` enum to the `utils.py` file and updating the imports in the vector store classes. - Updating the tests to import the `DistanceStrategy` enum from the `utils.py` file. --------- Co-authored-by: Hanit <37485638+hanit-com@users.noreply.github.com>	2023-07-10 20:37:03 -07:00
Lance Martin	bd0c6381f5	Minor update to clarify map-reduce custom prompt usage (#7453 ) Update docs for map-reduce custom prompt usage	2023-07-10 16:43:44 -07:00
Lance Martin	28d2b213a4	Update landing page for "question answering over documents" (#7152 ) Improve documentation for a central use-case, qa / chat over documents. This will be merged as an update to `index.mdx` [here](https://python.langchain.com/docs/use_cases/question_answering/). Testing w/ local Docusaurus server: ``` From `docs` directory: mkdir _dist cp -r {docs_skeleton,snippets} _dist cp -r extras/* _dist/docs_skeleton/docs cd _dist/docs_skeleton yarn install yarn start ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 14:15:13 -07:00
William FH	dd648183fa	Rm create_project line (#7486 ) not needed	2023-07-10 10:49:55 -07:00
Leonid Ganeline	5eec74d9a5	docstrings `document_loaders` 3 (#6937 ) - Updated docstrings for `document_loaders` - Mass update `"""Loader that loads` to `"""Loads` @baskaryan - please, review	2023-07-10 08:56:53 -07:00
Stanko Kuveljic	9d13dcd17c	Pinecone: Add V4 support (#7473 )	2023-07-10 08:39:47 -07:00
Adilkhan Sarsen	5debd5043e	Added deeplake use case examples of the new features (#6528 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> 1. Added use cases of the new features 2. Done some code refactoring --------- Co-authored-by: Ivo Stranic <istranic@gmail.com>	2023-07-10 07:04:29 -07:00
Bagatur	9b615022e2	bump 229 (#7467 )	2023-07-10 04:38:55 -04:00
Kazuki Maeda	92b4418c8c	Datadog logs loader (#7356 ) ### Description Created a Loader to get a list of specific logs from Datadog Logs. ### Dependencies `datadog_api_client` is required. ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 04:27:55 -04:00
Yifei Song	7d29bb2c02	Add Xorbits Dataframe as a Document Loader (#7319 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits document loader, which allows langchain to leverage Xorbits to parallelize and distribute the loading of data. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/Xorbitsio Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 04:24:47 -04:00
Sergio Moreno	21a353e9c2	feat: ctransformers support async chain (#6859 ) - Description: Adding async method for CTransformers - Issue: I've found impossible without this code to run Websockets inside a FastAPI micro service and a CTransformers model. - Tag maintainer: Not necessary yet, I don't like to mention directly - Twitter handle: @_semoal	2023-07-10 04:23:41 -04:00
Paul-Emile Brotons	d2cf0d16b3	adding max_marginal_relevance_search method to MongoDBAtlasVectorSearch (#7310 ) Adding a maximal_marginal_relevance method to the MongoDBAtlasVectorSearch vectorstore enhances the user experience by providing more diverse search results Issue: #7304	2023-07-10 04:04:19 -04:00
Bagatur	04cddfba0d	Add lark import error (#7465 )	2023-07-10 03:21:23 -04:00
Matt Robinson	bcab894f4e	feat: Add `UnstructuredTSVLoader` (#7367 ) ### Summary Adds an `UnstructuredTSVLoader` for TSV files. Also updates the doc strings for `UnstructuredCSV` and `UnstructuredExcel` loaders. ### Testing ```python from langchain.document_loaders.tsv import UnstructuredTSVLoader loader = UnstructuredTSVLoader( file_path="example_data/mlb_teams_2012.csv", mode="elements" ) docs = loader.load() ```	2023-07-10 03:07:10 -04:00
Ronald Li	490f4a9ff0	Fixes KeyError in AmazonKendraRetriever initializer (#7464 ) ### Description argument variable client is marked as required in commit `81e5b1ad36` which breaks the default way of initialization providing only index_id. This commit avoid KeyError exception when it is initialized without a client variable ### Dependencies no dependency required	2023-07-10 03:02:36 -04:00
Jona Sassenhagen	7ffc431b3a	Add spacy sentencizer (#7442 ) `SpacyTextSplitter` currently uses spacy's statistics-based `en_core_web_sm` model for sentence splitting. This is a good splitter, but it's also pretty slow, and in this case it's doing a lot of work that's not needed given that the spacy parse is then just thrown away. However, there is also a simple rules-based spacy sentencizer. Using this is at least an order of magnitude faster than using `en_core_web_sm` according to my local tests. Also, spacy sentence tokenization based on `en_core_web_sm` can be sped up in this case by not doing the NER stage. This shaves some cycles too, both when loading the model and when parsing the text. Consequently, this PR adds the option to use the basic spacy sentencizer, and it disables the NER stage for the current approach, which is kept as the default. Lastly, when extracting the tokenized sentences, the `text` attribute is called directly instead of doing the string conversion, which is IMO a bit more idiomatic.	2023-07-10 02:52:05 -04:00
charosen	50a9fcccb0	feat(module): add param ids to ElasticVectorSearch.from_texts method (#7425 ) # add param ids to ElasticVectorSearch.from_texts method. - Description: add param ids to ElasticVectorSearch.from_texts method. - Issue: NA. It seems `add_texts` already supports passing in document ids, but param `ids` is omitted in `from_texts` classmethod, - Dependencies: None, - Tag maintainer: @rlancemartin, @eyurtsev please have a look, thanks ``` # ElasticVectorSearch add_texts def add_texts( self, texts: Iterable[str], metadatas: Optional[List[dict]] = None, refresh_indices: bool = True, ids: Optional[List[str]] = None, kwargs: Any, ) -> List[str]: ... ``` ``` # ElasticVectorSearch from_texts @classmethod def from_texts( cls, texts: List[str], embedding: Embeddings, metadatas: Optional[List[dict]] = None, elasticsearch_url: Optional[str] = None, index_name: Optional[str] = None, refresh_indices: bool = True, kwargs: Any, ) -> ElasticVectorSearch: ``` Co-authored-by: charosen <charosen@bupt.cn>	2023-07-10 02:25:35 -04:00
James Yin	a5fd8873b1	fix: type hint of get_chat_history in BaseConversationalRetrievalChain (#7461 ) The type hint of `get_chat_history` property in `BaseConversationalRetrievalChain` is incorrect. @baskaryan	2023-07-10 02:14:00 -04:00
nikkie	dfc3f83b0f	docs(vectorstores/integrations/chroma): Fix loading and saving (#7437 ) - Description: Fix loading and saving code about Chroma - Issue: the issue #7436 - Dependencies: - - Twitter handle: https://twitter.com/ftnext	2023-07-10 02:05:15 -04:00
Daniel Chalef	c7f7788d0b	Add ZepMemory; improve ZepChatMessageHistory handling of metadata; Fix bugs (#7444 ) Hey @hwchase17 - This PR adds a `ZepMemory` class, improves handling of Zep's message metadata, and makes it easier for folks building custom chains to persist metadata alongside their chat history. We've had plenty confused users unfamiliar with ChatMessageHistory classes and how to wrap the `ZepChatMessageHistory` in a `ConversationBufferMemory`. So we've created the `ZepMemory` class as a light wrapper for `ZepChatMessageHistory`. Details: - add ZepMemory, modify notebook to demo use of ZepMemory - Modify summary to be SystemMessage - add metadata argument to add_message; add Zep metadata to Message.additional_kwargs - support passing in metadata	2023-07-10 01:53:49 -04:00
Saurabh Chaturvedi	8f8e8d701e	Fix info about YouTube (#7447 ) (Unintentionally mean 😅) nit: YouTube wasn't created by Google, this PR fixes the mention in docs.	2023-07-10 01:52:55 -04:00
Leonid Ganeline	560c4dfc98	docstrings: `docstore` and `client` (#6783 ) updated docstrings in `docstore/` and `client/` @baskaryan	2023-07-09 01:34:28 -04:00
Jeroen Van Goey	f5bd88757e	Fix typo (#7416 ) `quesitons` -> `questions`.	2023-07-09 00:54:48 -04:00
Alejandro Garrido Mota	ea9c3cc9c9	Fix syntax erros in documentation (#7409 ) - Description: Tiny documentation fix. In Python, when defining function parameters or providing arguments to a function or class constructor, we do not use the `:` character. - Issue: N/A - Dependencies: N/A, - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @mogaal	2023-07-08 19:52:01 -04:00
Nolan	5da9f9abcb	docs(agents/toolkits): Fix error in document_comparison_toolkit.ipynb (#7417 ) Replace this comment with: - Description: Removes unneeded output warning in documentation at https://python.langchain.com/docs/modules/agents/toolkits/document_comparison_toolkit - Issue: - - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: @finnless	2023-07-08 19:51:08 -04:00
nikkie	2eb4a2ceea	docs(retrievers/get-started): Fix broken state_of_the_union.txt link (#7399 ) Thank you for this awesome library. - Description: Fix broken link in documentation - Issue: - https://python.langchain.com/docs/modules/data_connection/retrievers/#get-started - the URL: https://github.com/hwchase17/langchain/blob/master/docs/modules/state_of_the_union.txt - I think the right one is https://github.com/hwchase17/langchain/blob/master/docs/extras/modules/state_of_the_union.txt - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: -	2023-07-08 11:11:05 -04:00
Delgermurun	e7420789e4	improve description of JinaChat (#7397 ) very small doc string change in the `JinaChat` class.	2023-07-08 10:57:11 -04:00
Bagatur	26c86a197c	bump 228 (#7393 )	2023-07-08 03:05:20 -04:00
SvMax	1d649b127e	Added param to return only a structured json from the get_format_instructions method (#5848 ) I just added a parameter to the method get_format_instructions, to return directly the JSON instructions without the leading instruction sentence. I'm planning to use it to define the structure of a JSON object passed in input, the get_format_instructions(). --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-08 02:57:26 -04:00
Bagatur	362bc301df	fix jina (#7392 )	2023-07-08 02:41:54 -04:00
Delgermurun	a1603fccfb	integrate JinaChat (#6927 ) Integration with https://chat.jina.ai/api. It is OpenAI compatible API. - Twitter handle: [https://twitter.com/JinaAI_](https://twitter.com/JinaAI_) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-08 02:17:04 -04:00
William FH	4ba7396f96	Add single run eval loader (#7390 ) Plus - add evaluation name to make string and embedding validators work with the run evaluator loader. - Rm unused root validator	2023-07-07 23:06:49 -07:00
Roger Yu	633b673b85	Update pinecone.ipynb (#7382 ) Fix typo	2023-07-08 01:48:03 -04:00
Oleg Zabluda	4d697d3f24	Allow passing custom prompts to GraphIndexCreator (#7381 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-08 01:47:53 -04:00
William FH	612a74eb7e	Make Ref Example Threadsafe (#7383 ) Have noticed transient ref example misalignment. I believe this is caused by the logic of assigning an example within the thread executor rather than before.	2023-07-07 21:50:42 -07:00
William FH	4789c99bc2	Add String Distance and Embedding Evaluators (#7123 ) Add a string evaluator and pairwise string evaluator implementation for: - Embedding distance - String distance Update docs	2023-07-07 21:44:31 -07:00
ljeagle	fb6e63dc36	Upgrade the AwaDB from 0.3.5 to 0.3.6 (#7363 )	2023-07-07 20:41:17 -07:00
William FH	c5edbea34a	Load Run Evaluator (#7101 ) Current problems: 1. Evaluating LLMs or Chat models isn't smooth. Even specifying 'generations' as the output inserts a redundant list into the eval template 2. Configuring input / prediction / reference keys in the `get_qa_evaluator` function is confusing. Unless you are using a chain with the default keys, you have to specify all the variables and need to reason about whether the key corresponds to the traced run's inputs, outputs or the examples inputs or outputs. Proposal: - Configure the run evaluator according to a model. Use the model type and input/output keys to assert compatibility where possible. Only need to specify a reference_key for certain evaluators (which is less confusing than specifying input keys) When does this work: - If you have your langchain model available (assumed always for run_on_dataset flow) - If you are evaluating an LLM, Chat model, or chain - If the LLM or chat models are traced by langchain (wouldn't work if you add an incompatible schema via the REST API) When would this fail: - Currently if you directly create an example from an LLM run, the outputs are generations with all the extra metadata present. A simple `example_key` and dumping all to the template could make the evaluations unreliable - Doesn't help if you're not using the low level API - If you want to instantiate the evaluator without instantiating your chain or LLM (maybe common for monitoring, for instance) -> could also load from run or run type though What's ugly: - Personally think it's better to load evaluators one by one since passing a config down is pretty confusing. - Lots of testing needs to be added - Inconsistent in that it makes a separate run and example input mapper instead of the original `RunEvaluatorInputMapper`, which maps a run and example to a single input. Example usage running the for an LLM, Chat Model, and Agent. ``` # Test running for the string evaluators evaluator_names = ["qa", "criteria"] model = ChatOpenAI() configured_evaluators = load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer") run_on_dataset(ds_name, model, run_evaluators=configured_evaluators) ``` <details> <summary>Full code with dataset upload</summary> ``` ## Create dataset from langchain.evaluation.run_evaluators.loading import load_run_evaluators_for_model from langchain.evaluation import load_dataset import pandas as pd lcds = load_dataset("llm-math") df = pd.DataFrame(lcds) from uuid import uuid4 from langsmith import Client client = Client() ds_name = "llm-math - " + str(uuid4())[0:8] ds = client.upload_dataframe(df, name=ds_name, input_keys=["question"], output_keys=["answer"]) ## Define the models we'll test over from langchain.llms import OpenAI from langchain.chat_models import ChatOpenAI from langchain.agents import initialize_agent, AgentType from langchain.tools import tool llm = OpenAI(temperature=0) chat_model = ChatOpenAI(temperature=0) @tool def sum(a: float, b: float) -> float: """Add two numbers""" return a + b def construct_agent(): return initialize_agent( llm=chat_model, tools=[sum], agent=AgentType.OPENAI_MULTI_FUNCTIONS, ) agent = construct_agent() # Test running for the string evaluators evaluator_names = ["qa", "criteria"] models = [llm, chat_model, agent] run_evaluators = [] for model in models: run_evaluators.append(load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer")) # Run on LLM, Chat Model, and Agent from langchain.client.runner_utils import run_on_dataset to_test = [llm, chat_model, construct_agent] for model, configured_evaluators in zip(to_test, run_evaluators): run_on_dataset(ds_name, model, run_evaluators=configured_evaluators, verbose=True) ``` </details> --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-07 19:57:59 -07:00
Bagatur	1ac347b4e3	update databerry-chaindesk redirect (#7378 )	2023-07-07 19:11:46 -04:00

... 4 5 6 7 8 ...

3287 Commits