langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-18 09:25:54 +00:00

Author	SHA1	Message	Date
i-w-a	95ee69a301	langchain[patch]: In HTMLHeaderTextSplitter set default encoding to utf-8 (#16372 ) - Description: The HTMLHeaderTextSplitter class now explicitly specifies utf-8 encoding in the part of the split_text_from_file method that calls the HTMLParser. - Issue: Prevents garbled characters caused by differences in the encoding of HTML files (the problem affects non-English text; I noticed it with Japanese in particular). - Dependencies: No dependencies. - Twitter handle: @i_w__a	2024-01-23 18:20:29 -08:00
Noah Stapp	e135e5257c	community[patch]: Include scores in MongoDB Atlas QA chain results (#14666 ) Adds the ability to return similarity scores when using `RetrievalQA.from_chain_type` with `MongoDBAtlasVectorSearch`. Requires that `return_source_documents=True` is set. Example use: ``` vector_search = MongoDBAtlasVectorSearch.from_documents(...) qa = RetrievalQA.from_chain_type( llm=OpenAI(), chain_type="stuff", retriever=vector_search.as_retriever(search_kwargs={"additional": ["similarity_score"]}), return_source_documents=True ) ... docs = qa({"query": "..."}) docs["source_documents"][0].metadata["score"] # score will be here ``` I've tested this feature locally, using a MongoDB Atlas Cluster with a vector search index.	2024-01-23 18:18:28 -08:00
Serena Ruan	90f5a1c40e	community[minor]: Improve mlflow callback (#15691 ) - Description: Allow passing run_id to MLflowCallbackHandler to resume a run instead of creating a new run. Support recording retriever relevant metrics. Refactor the code to fix some bugs. --------- Signed-off-by: Serena Ruan <serena.rxy@gmail.com>	2024-01-23 18:16:51 -08:00
Facundo Santiago	92e6a641fd	feat: adding paygo api support for Azure ML / Azure AI Studio (#14560 ) - Description: Introducing support for LLMs and Chat models running in Azure AI studio and Azure ML using the new deployment mode pay-as-you-go (model as a service). - Issue: NA - Dependencies: None. - Tag maintainer: @prakharg-msft @gdyre - Twitter handle: @santiagofacundo Examples added: * [docs/docs/integrations/llms/azure_ml.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_endpoint.ipynb) * [docs/docs/integrations/chat/azureml_chat_endpoint.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_chat_endpoint.ipynb) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-23 17:08:51 -08:00
Davide Menini	9ce177580a	community: normalize bedrock embeddings (#15103 ) In this PR I added a post-processing function to normalize the embeddings. This happens only if the new `normalize` flag is `True`. --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com>	2024-01-23 17:05:24 -08:00
baichuan-assistant	20fcd49348	community: Fix Baichuan Chat. (#15207 ) - Description: Baichuan Chat (with both Baichuan-Turbo and Baichuan-Turbo-192K models) has updated their APIs. There are breaking changes. For example, BAICHUAN_SECRET_KEY is removed in the latest API but is still required in Langchain. Baichuan's Langchain integration needs to be updated to the latest version. - Issue: #15206 - Dependencies: None, - Twitter handle: None @hwchase17. Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>	2024-01-23 17:01:57 -08:00
gcheron	cfc225ecb3	community: SQLStrStore/SQLDocStore provide an easy SQL alternative to `InMemoryStore` to persist data remotely in a SQL storage (#15909 ) Description: - Implement `SQLStrStore` and `SQLDocStore` classes that inherit from `BaseStore` to allow persisting data remotely on a SQL server. - SQL is widely used and sometimes we do not want to install a caching solution like Redis. - Multiple issues/comments complain that there is no easy remote and persistent solution that is not in memory (users want to replace InMemoryStore), e.g., https://github.com/langchain-ai/langchain/issues/14267, https://github.com/langchain-ai/langchain/issues/15633, https://github.com/langchain-ai/langchain/issues/14643, https://stackoverflow.com/questions/77385587/persist-parentdocumentretriever-of-langchain - This is particularly painful when wanting to use `ParentDocumentRetriever`. - This implementation is particularly useful when: * it's expensive to construct an InMemoryDocstore/dict * you want to retrieve documents from remote sources * you just want to reuse existing objects - This implementation integrates well with PGVector: when using PGVector, you already have a SQL instance running. `SQLDocStore` is a convenient way of using this instance to store documents associated with vectors. An integration example with ParentDocumentRetriever and PGVector is provided in docs/docs/integrations/stores/sql.ipynb or [here](https://github.com/gcheron/langchain/blob/sql-store/docs/docs/integrations/stores/sql.ipynb). - It persists `str` and `Document` objects but can be easily extended. Issue: Provide an easy SQL alternative to `InMemoryStore`. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-23 16:50:48 -08:00
dudgeon	26b2ad6d5b	Fixed typo on quickstart.ipynb (#16482 ) - Description: Quick typo fix: `inpect` >> `inspect` - Issue: N/A - Dependencies: any dependencies required for this change, - Twitter handle: @geoffdudgeon	2024-01-23 16:50:13 -08:00
Massimiliano Pronesti	e529939c54	feat(llms): support more tasks in HuggingFaceHub LLM and remove deprecated dep (#14406 ) - Description: this PR upgrades the `HuggingFaceHub` LLM: * support more tasks (`translation` and `conversational`) * replaced the deprecated `InferenceApi` with `InferenceClient` * adjusted the overall logic to use the "recommended" model for each task when no model is provided, and vice-versa. - Tag maintainer(s): @baskaryan @hwchase17	2024-01-23 16:48:56 -08:00
Erick Friis	afb25eeec4	cli[patch]: add integration tests to default makefile (#16479 )	2024-01-23 16:09:16 -07:00
Erick Friis	51c8ef6af4	templates: fix azure params in retrieval agent (#16257 ) - FIX templates/retrieval-agent/retireval-agent/chain.py to use the new Syntax for Azure env params - cr --------- Co-authored-by: braun-viathan <p.braun@viathan.de> Co-authored-by: Braun-viathan <121631422+braun-viathan@users.noreply.github.com>	2024-01-23 14:58:06 -07:00
Lance Martin	c3530f1c11	templates: Minor nit on HyDE (#16478 )	2024-01-23 14:23:08 -07:00
Bagatur	ba326b98d0	langchain[patch]: Release 0.1.3 (#16475 )	2024-01-23 11:50:25 -08:00
Bagatur	54149292f8	community[patch]: Release 0.0.15 (#16474 )	2024-01-23 11:50:10 -08:00
Bagatur	ef6a335570	core[patch]: Release 0.1.15 (#16473 )	2024-01-23 11:31:50 -08:00
Erick Friis	1f4ac62dee	cli[patch], google-vertexai[patch]: readme template (#16470 )	2024-01-23 12:08:17 -07:00
Eugene Yurtsev	39d1cbfecf	Docs: Document astream_events API (#16300 ) Document astream events API	2024-01-23 12:32:45 -05:00
Tomaz Bratanic	d0a8082188	Fix neo4j sanitize (#16439 ) Fix the sanitization bug and add an integration test	2024-01-23 10:56:28 -05:00
William FH	5de59f9236	Core[Patch] Parse tool input after on_start (#16430 ) For tracing, if a validation error occurs, currently it is attributed to the previous step of the chain. It would be nice to have the on_start and on_error callbacks called for tools when there is a validation error that occurs to more easily attribute the root-cause	2024-01-23 10:54:47 -05:00
Nuno Campos	226fe645f1	core[patch] Do not try to access attribute of None (#16321 )	2024-01-22 22:10:03 -08:00
Florian MOREL	4b7969efc5	community[minor]: New documents loader for visio files (with extension .vsdx) (#16171 ) Description : New documents loader for visio files (with extension .vsdx) A [visio file](https://fr.wikipedia.org/wiki/Microsoft_Visio) (with extension .vsdx) is associated with Microsoft Visio, a diagram creation software. It stores information about the structure, layout, and graphical elements of a diagram. This format facilitates the creation and sharing of visualizations in areas such as business, engineering, and computer science. A Visio file can contain multiple pages. Some of them may serve as the background for others, and this can occur across multiple layers. This loader extracts the textual content from each page and its associated pages, enabling the extraction of all visible text from each page, similar to what an OCR algorithm would do. Dependencies : xmltodict package	2024-01-22 22:07:03 -08:00
KhoPhi	fb41b68ea1	docs: Update with LCEL examples to Ollama & ChatOllama Integration notebook (#16194 ) - Description: Updated the Chat/Ollama docs notebook with LCEL chain examples - Issue: #15664 I'm a new contributor 😊 - Dependencies: No dependencies - Twitter handle: Comments: - How do I truncate the output of the stream in the notebook if and or when it goes on and on and on for even the basic of prompts? Edit: Looking forward to feedback @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-01-22 22:05:59 -08:00
Michael Gorham	3b0226b2c6	docs: Update redis_chat_message_history.ipynb (#16344 ) ## Problem Spent several hours trying to figure out how to pass `RedisChatMessageHistory` as a `GetSessionHistoryCallable` with a different REDIS hostname. This example kept connecting to `redis://localhost:6379`, but I wanted to connect to a server not hosted locally. ## Cause Assumption the user knows how to implement `BaseChatMessageHistory` and `GetSessionHistoryCallable` ## Solution Update documentation to show how to explicitly set the REDIS hostname using a lambda function much like the MongoDB and SQLite examples.	2024-01-22 21:59:59 -08:00
Ian	c98994c3c9	docs: Improve notebook to show how to use tidb to store history messages (#16420 ) After merging [PR #16304](https://github.com/langchain-ai/langchain/pull/16304), I realized that our notebook example for integrating TiDB with LangChain was too basic. To make it more useful and user-friendly, I plan to create a detailed example. This will show how to use TiDB for saving history messages in LangChain, offering a clearer, more practical guide for our users	2024-01-22 21:58:37 -08:00
Eugene Yurtsev	c88750d54b	Docs: Agent streaming notebooks (#15858 ) Update information about streaming in the agents section. Show how to use astream_events to get token by token streaming.	2024-01-22 21:54:55 -05:00
Eugene Yurtsev	e5672bc944	docs: Re-write custom agent to show how to write a tools agent (#15907 ) Shows how to write a tools agent rather than a functions agent.	2024-01-22 17:28:31 -08:00
Boris Feld	404abf139a	community: Add CometLLM tracing context var (#15765 ) I also added LANGCHAIN_COMET_TRACING to enable the CometLLM tracing integration similar to other tracing integrations. This is easier for end-users to enable it rather than importing the callback and pass it manually. (This is the same content as https://github.com/langchain-ai/langchain/pull/14650 but rebased and squashed as something seems to confuse Github Action).	2024-01-22 15:17:16 -08:00
Nicolò Boschi	a500527030	infra: google-vertexai relax types-requests deps range (#16264 ) - Description: At the moment it's not possible to include in the same project langchain-google-vertexai and boto3 (e.g. use bedrock and vertex in the same application) because of the dependency resolutions conflict. boto3 is still using urllib3 1.x, meanwhile langchain-google-vertexai -> types-requests depends on urllib3 2.x. [the last version of types-requests that allows urllib3 1.x is 2.31.0.6](https://pypi.org/project/types-requests/#description). In this PR I allow the vertexai package to get that version also. - Twitter handle: nicoloboschi	2024-01-22 14:54:41 -08:00
DL	b9e7f6f38a	community[minor]: Bedrock async methods (#12477 ) Description: Added support for asynchronous streaming in the Bedrock class and corresponding tests. Primarily: async def aprepare_output_stream async def _aprepare_input_and_invoke_stream async def _astream async def _acall I've ensured that the code adheres to the project's linting and formatting standards by running make format, make lint, and make test. Issue: #12054, #11589 Dependencies: None Tag maintainer: @baskaryan Twitter handle: @dominic_lovric --------- Co-authored-by: Piyush Jain <piyushjain@duck.com>	2024-01-22 14:44:49 -08:00
Jennifer Melot	d6275e47f2	docs: Updated integration docs structure for tools/arxiv (#16091 ) (#16250 ) - Description: Updated docs for tools/arxiv to use `AgentExecutor` and `invoke` - Issue: #15664 - Dependencies: None - Twitter handle: None	2024-01-22 14:34:22 -08:00
Frank995	5694728816	community[patch]: Implement vector length definition at init time in PGVector for indexing (#16133 ) - Description: allow the user to define the vector length in PGVector when creating the embedding store, which allows for later indexing - Issue: #16132 - Dependencies: None	2024-01-22 14:32:44 -08:00
ChengZi	a950fa0487	docs: add milvus multitenancy doc (#16177 ) - Description: add milvus multitenancy doc, it is an example for this [pr](https://github.com/langchain-ai/langchain/pull/15740) . - Issue: No, - Dependencies: No, - Twitter handle: No Signed-off-by: ChengZi <chen.zhang@zilliz.com>	2024-01-22 14:25:26 -08:00
Chase VanSteenburg	1011b681dc	core[patch]: Fix f-string formatting in error message for configurable_fields (#16411 ) - Description: Simple fix to f-string formatting. Allows more informative ValueError output. - Issue: None needed. - Dependencies: None. - Twitter handle: @FlightP1an	2024-01-22 14:08:44 -08:00
parkererickson-tg	b26a22f307	community[minor]: add TigerGraph support (#16280 ) Description: Add support for querying TigerGraph databases through the InquiryAI service. Issue: N/A Dependencies: N/A Twitter handle: @TigerGraphDB	2024-01-22 14:07:44 -08:00
Christophe Bornet	8da34118bc	docs: Add documentation for Cassandra Document Loader (#16282 )	2024-01-22 14:06:21 -08:00
Alireza Kashani	d1b4ead87c	community[patch]: Update grobid.py (#16298 ) There is a case where "coords" does not exist in the "sentence", so the `split(";")` call raises an error. We can fix that by adding `if sentence.get("coords") is not None:`. The empty "sbboxes" that results from this scenario then raises an error at `sbboxes[0]["page"]`. The PDF from https://pubmed.ncbi.nlm.nih.gov/23970373/ reproduces these errors.	2024-01-22 14:03:58 -08:00
s-g-1	fbe592a5ce	community[patch]: fix typo in pgvecto_rs debug msg (#16318 ) Fixes a typo in the pip install message for the pgvecto_rs community vector store. No issues found mentioning this. No dependents changed.	2024-01-22 14:01:33 -08:00
James Braza	d511366dd3	infra: absolute `EXAMPLE_DIR` path in core unit tests (#16325 ) If you invoke testing from places besides `core/`, this `EXAMPLE_DIR` path won't work. This PR makes `EXAMPLE_DIR` robust against invocation location.	2024-01-22 14:00:23 -08:00
Jonathan Algar	774e543e1f	docs: fix formatting issue in rockset.ipynb (#16328 ) Description: randomly discovered while working on another PR https://github.com/quarto-dev/quarto-cli/discussions/8131#discussioncomment-8027706 @anubhav94N ICYI	2024-01-22 13:59:45 -08:00
Ian	b9f5104e6c	community[minor]: Store Message History to TiDB Database (#16304 ) This pull request integrates the TiDB database into LangChain for storing message history, marking one of several steps towards a comprehensive integration of TiDB with LangChain. A simple usage ```python from datetime import datetime from langchain_community.chat_message_histories import TiDBChatMessageHistory history = TiDBChatMessageHistory( connection_string="mysql+pymysql://<user>:<PASSWORD>@<host>:4000/<db>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true", session_id="code_gen", earliest_time=datetime.utcnow(), # Optional to set earliest_time to load messages after this time point. ) history.add_user_message("hi! How's feature going?") history.add_ai_message("It's almost done") ```	2024-01-22 13:56:56 -08:00
Erick Friis	35ec0bbd3b	cli[patch]: pypi fields (#16410 )	2024-01-22 14:28:30 -07:00
Erick Friis	2ac3a82d85	cli[patch]: new fields in integration template, release 0.0.21 (#16398 )	2024-01-22 14:26:47 -07:00
Erick Friis	cfe95ab085	multiple: update langsmith dep (#16407 )	2024-01-22 14:23:11 -07:00
Sarthak Chaure	dd5b8107b1	Docs: Updated callbacks/index.mdx (#16404 ) The callbacks get-started demo code was updated, replacing the chain.run() command (which is now deprecated) with the updated chain.invoke() command. Solves the following issue: #16379 Twitter/X: @Hazxhx	2024-01-22 16:10:19 -05:00
Omar-aly	873de14cd8	docs: update vectorstores/llm_rails integration doc (#16199 ) Description: - Updated the docs for the vectorstores integration module llm_rails.ipynb Issue: - [Connected to Issue #15664](https://github.com/langchain-ai/langchain/issues/15664) Dependencies: - N/A Co-authored-by: omaraly23 <112936089+omaraly22@users.noreply.github.com>	2024-01-22 11:40:08 -08:00
Eli Lucherini	6b2a57161a	community[patch]: allow additional kwargs in MlflowEmbeddings for compatibility with Cohere API (#15242 ) - Description: add support for kwargs in `MlflowEmbeddings` `embed_documents()` and `embed_query()` so that all the arguments required by Cohere API (and others?) can be passed down to the server. - Issue: #15234 - Dependencies: MLflow with MLflow Deployments (`pip install mlflow[genai]`) Tests Now this code [adapted from the docs](https://python.langchain.com/docs/integrations/providers/mlflow#embeddings-example) for the Cohere API works locally. ```python """ Setup ----- export COHERE_API_KEY=... mlflow deployments start-server --config-path examples/deployments/cohere/config.yaml Run --- python /path/to/this/file.py """ embeddings = MlflowCohereEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings") print(embeddings.embed_query("hello")[:3]) print(embeddings.embed_documents(["hello", "world"])[0][:3]) ``` Output ``` [0.060455322, 0.028793335, -0.025848389] [0.031707764, 0.021057129, -0.009361267] ```	2024-01-22 11:38:11 -08:00
Guillem Orellana Trullols	aad2aa7188	community[patch]: BedrockChat -> Support Titan express as chat model (#15408 ) Titan Express model was not supported as a chat model because LangChain messages were not "translated" to a text prompt. Co-authored-by: Guillem Orellana Trullols <guillem.orellana_trullols@siemens.com>	2024-01-22 11:37:23 -08:00
Piotr Mardziel	1b9001db47	core[patch]: preserve inspect.iscoroutinefunction with @deprecated decorator (#16295 ) Adjusted the `deprecated` decorator to make sure decorated async functions are still recognized as "coroutinefunction" by `inspect`. Before the change, functions such as `LLMChain.acall` which are decorated as deprecated are not recognized as coroutine functions. After the change, they are recognized: ```python import inspect from langchain import LLMChain # Is false before change but true after. inspect.iscoroutinefunction(LLMChain.acall) ```	2024-01-22 11:34:13 -08:00
Katarina Supe	01c2f27ffa	community[patch]: Update Memgraph support (#16360 ) - Description: I removed two queries to the database and left just one whose results were formatted afterward into other type of schema (avoided two calls to DB) - Issue: / - Dependencies: / - Twitter handle: @supe_katarina	2024-01-22 11:33:28 -08:00
Lance Martin	369e90d427	docs: Minor update to Robocorp toolkit docs (#16399 )	2024-01-22 11:33:13 -08:00
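
Commit 1b9001db47 above describes keeping `inspect.iscoroutinefunction` true for async functions wrapped by a deprecation decorator. The following is a minimal sketch of that general technique, not LangChain's actual implementation: the `deprecated` decorator and the `acall`/`call` functions here are hypothetical stand-ins written only with the standard library.

```python
import asyncio
import functools
import inspect
import warnings


def deprecated(func):
    """Hypothetical deprecation decorator (illustration only, not LangChain's)."""
    if inspect.iscoroutinefunction(func):
        # Use an `async def` wrapper for async functions so that
        # inspect.iscoroutinefunction(wrapper) remains True.
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated", DeprecationWarning, stacklevel=2
            )
            return await func(*args, **kwargs)

        return async_wrapper

    # Plain functions get a plain wrapper.
    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is deprecated", DeprecationWarning, stacklevel=2
        )
        return func(*args, **kwargs)

    return sync_wrapper


@deprecated
async def acall(x):
    """Hypothetical async function standing in for a deprecated async method."""
    return x * 2


@deprecated
def call(x):
    """Hypothetical sync counterpart."""
    return x * 2


if __name__ == "__main__":
    print(inspect.iscoroutinefunction(acall))
    print(inspect.iscoroutinefunction(call))
    print(asyncio.run(acall(3)))
```

Branching on `inspect.iscoroutinefunction(func)` at decoration time is the key design choice: a single synchronous wrapper would hide the coroutine nature of the wrapped function from callers that inspect it.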


7330 Commits
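
Commit 9ce177580a above adds a post-processing step that normalizes embeddings when a `normalize` flag is `True`. The sketch below shows one way such a step could look, assuming L2 (unit-length) normalization, which the commit message does not specify. The function names `normalize_embedding` and `postprocess` are hypothetical and are not taken from the Bedrock integration.

```python
import math
from typing import List


def normalize_embedding(embedding: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm (assumed normalization scheme)."""
    norm = math.sqrt(sum(x * x for x in embedding))
    if norm == 0.0:
        # A zero vector cannot be scaled to unit length; return a copy unchanged.
        return list(embedding)
    return [x / norm for x in embedding]


def postprocess(embedding: List[float], normalize: bool = False) -> List[float]:
    """Hypothetical post-processing hook: normalize only when the flag is set."""
    return normalize_embedding(embedding) if normalize else list(embedding)


if __name__ == "__main__":
    print(postprocess([3.0, 4.0], normalize=True))
    print(postprocess([3.0, 4.0], normalize=False))
```

Keeping normalization behind a flag that defaults to `False` leaves existing callers' vectors unchanged, which matches the opt-in behaviour the commit describes.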