langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
sasha	1c9ceff503	community: add metadata to chain logging; (#22122 ) Hey, I'm Sasha. The SDK engineer from [Comet](https://comet.com). This PR updates the CometTracer class. Added metadata to CometTracer. From now on, both chains and spans will send it.	2024-05-24 15:29:40 +00:00
Jirka Lhotka	7c0459faf2	community: Update costs of openai finetuned models (#22124 ) - Description: Update costs of finetuned models and add gpt-3-turbo-0125. Source: https://openai.com/api/pricing/ - Issue: N/A - Dependencies: None	2024-05-24 15:25:17 +00:00
Eugene Yurtsev	d3db83abe3	community[major]: lint for usage of xml library (#22132 ) * Lint for usage of standard xml library * Add forced opt-in for quip client * Actual security issue is with underlying QuipClient not LangChain integration (since the client is doing the parsing), but adding enforcement at the LangChain level.	2024-05-24 15:23:53 +00:00
Tom Aarsen	5b5ea2af30	docs: Add explanation on how to use Hugging Face embeddings (#22118 ) - Description: I've added a tab on embedding text with LangChain using Hugging Face models to here: https://python.langchain.com/v0.2/docs/how_to/embed_text/. HF was mentioned in the running text, but not in the tabs, which I thought was odd. - Issue: N/A - Dependencies: N/A - Twitter handle: No need, this is tiny :) Also, I had a ton of issues with the poetry docs/lint install, so I haven't linted this. Apologies for that. cc @Jofthomas - Tom Aarsen	2024-05-24 11:21:03 -04:00
Bagatur	baa3c975cb	anthropic[patch]: allow tool call mutation (#22130 ) If tool_use blocks and tool_calls with overlapping IDs are present, prefer the values of the tool_calls. Allows for mutating AIMessages just via tool_calls.	2024-05-24 08:18:14 -07:00
Christophe Bornet	c838de5027	doc: Add doc for CassandraByteStore (#22126 ) Preview: https://langchain-git-fork-cbornet-doc-cassandrabytestore-langchain.vercel.app/v0.2/docs/integrations/stores/cassandra/	2024-05-24 10:57:55 -04:00
Vadym Barda	2edb512282	docs: improve how-to docs for message history (#22072 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-05-23 20:12:24 -04:00
Artem	eb7c453b98	docs: update `hub.pull("rlm/map-prompt")` to `hub.pull("rlm/reduce-prompt")` for reduce prompt (#22088 ) PR message: Update `hub.pull("rlm/map-prompt")` to `hub.pull("rlm/reduce-prompt")` in summarization.ipynb Description: Fix typo in prompt hub link from `reduce_prompt = hub.pull("rlm/map-prompt")` to `reduce_prompt = hub.pull("rlm/reduce-prompt")` following next issue Issue: #22014 Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-05-23 23:07:37 +00:00
Leonid Ganeline	2416737c5f	docs: compact the API Reference links (#21285 ) This PR is opinionated. Issue: the `API Reference` sections in the examples hold too much vertical space and make us scroll the page too much. See an [example](https://python.langchain.com/docs/get_started/quickstart/#conversation-retrieval-chain). These sections are important. So, the compacting should not make these sections less noticeable. Change: compacting the `API Reference` sections. See the [same example after change applied](https://langchain-j6nya46lf-langchain.vercel.app/docs/get_started/quickstart/#conversation-retrieval-chain). It is more compact and now looks like references (footnotes). Note: I would also change the section style, so it would be more noticeable (maybe to look like the footnotes. Smaller wider font?) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-23 15:50:23 -07:00
ccurme	0ea1e89b2c	groq: read tool calls from .tool_calls attribute (#22096 )	2024-05-23 18:16:06 -04:00
Bagatur	96c21dfe56	docs: hf feat table tool calling (#22091 )	2024-05-23 15:09:30 -07:00
Eugene Yurtsev	63004a0945	codespell ignore remaining issues (#22097 )	2024-05-23 21:51:39 +00:00
Eugene Yurtsev	2d693c484e	docs: fix some spelling mistakes caught by newest version of code spell (#22090 ) Going to merge this even though it doesn't pass all tests, and open a separate PR for the remaining spelling mistakes.	2024-05-23 16:59:11 -04:00
Bagatur	38783d07c9	infra: api docs quick preview (#22093 )	2024-05-23 13:57:45 -07:00
Pavel Zloi	fe26f937e4	community[minor]: ManticoreSearch engine added to vectorstore (#19117 ) Description: ManticoreSearch engine added to vectorstores Issue: no issue, just a new feature Dependencies: https://pypi.org/project/manticoresearch-dev/ Twitter handle: @EvilFreelancer - Example notebook with test integration: https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-23 13:56:18 -07:00
Erick Friis	95c3e5f85f	cli: model name substitution fix, release 0.0.23 (#22089 )	2024-05-23 13:09:38 -07:00
Kartheek Yakkala	18b8c8628a	docs : Added integrations for tools with langchain_community (#22056 ) - PR title: Docs enhancement - Description: Adding installation instructions for integrations requiring `langchain-community` package since 0.2 - Issue: https://github.com/langchain-ai/langchain/issues/22005	2024-05-23 15:09:34 -04:00
ccurme	152c8cac33	anthropic, openai: cut pre-releases (#22083 )	2024-05-23 15:02:23 -04:00
ccurme	cd07521170	core: bump to 0.2.1rc (#22080 )	2024-05-23 18:36:50 +00:00
Harrison Chase	170cc8aec3	docs: add multi-modal-docs (#21734 ) We don't really have any abstractions around multi-modal... so add a section explaining we don't have any abstractions, and then how-to guides for openai and anthropic (probably need to add for more) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: junefish <junefish@users.noreply.github.com> Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-23 18:33:25 +00:00
ccurme	fbfed65fb1	core, partners: add token usage attribute to AIMessage (#21944 ) ```python class UsageMetadata(TypedDict): """Usage metadata for a message, such as token counts. Attributes: input_tokens: (int) count of input (or prompt) tokens output_tokens: (int) count of output (or completion) tokens total_tokens: (int) total token count """ input_tokens: int output_tokens: int total_tokens: int ``` ```python class AIMessage(BaseMessage): ... usage_metadata: Optional[UsageMetadata] = None """If provided, token usage information associated with the message.""" ... ```	2024-05-23 14:21:58 -04:00
Bagatur	3d26807b92	community[patch]: Release. 0.2.1 (#22073 )	2024-05-23 10:40:32 -07:00
Bagatur	2d968213d7	langchain[patch]: Release 0.2.1 (#22074 )	2024-05-23 10:09:36 -07:00
maang-h	9aba9e3e33	community[patch]: Update the default “API URL” and “MODEL” of sparkllm (#22070 ) - Description: When I was running the sparkllm, I found that the default parameters currently used could no longer run correctly. - original parameters & values: - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat" - spark_llm_domain: "generalv3" ```python # example from langchain_community.chat_models import ChatSparkLLM spark = ChatSparkLLM(spark_app_id="my_app_id", spark_api_key="my_api_key", spark_api_secret="my_api_secret") spark.invoke("hello") ``` ![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414) So I updated them to 3.5 (same as sparkllm official website). After the update, they can be used normally. - new parameters & values: - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat" - spark_llm_domain: "generalv3.5"	2024-05-23 12:25:20 -04:00
junkeon	4fda7bf4f2	upstage[patch] : fix error handling in Layout Analysis parser (#22054 ) This pull request addresses and fixes exception handling in the UpstageLayoutAnalysisParser and enhances the test coverage by adding error exception tests for the document loader. These improvements ensure robust error handling and increase the reliability of the system when dealing with external API calls and JSON responses. ### Changes Made 1. Fix Request Exception Handling: - Issue: The existing implementation of UpstageLayoutAnalysisParser did not properly handle exceptions thrown by the requests library, which could lead to unhandled exceptions and potential crashes. - Solution: Added comprehensive exception handling for requests.RequestException to catch any request-related errors. This includes logging the error details and raising a ValueError with a meaningful error message. 2. Add Error Exception Tests for Document Loader: - New Tests: Introduced new test cases to verify the robustness of the UpstageLayoutAnalysisLoader against various error scenarios. The tests ensure that the loader gracefully handles: - RequestException: Simulates network issues or invalid API requests to ensure appropriate error handling and user feedback. - JSONDecodeError: Simulates scenarios where the API response is not a valid JSON, ensuring the system does not crash and provides clear error messaging.	2024-05-23 11:45:34 -04:00
JuHyung Son	d9eff44400	partner-upstage[patch]: embeddings empty list bug (#22057 ) Fixed an error in `embed_documents` when the input was given as an empty list. And I have revised the document.	2024-05-23 11:44:30 -04:00
Martin Triska	2df8ac402a	community[minor]: Added propagation of document metadata from O365BaseLoader (#20663 ) Description: - Added propagation of document metadata from O365BaseLoader to FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the hood). - This is done by passing dictionary `metadata_dict`: key=filename and value=dictionary containing document's metadata - Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use `mimetype` from it (if available) and pass metadata further into blob loader. Issue: - `O365BaseLoader` under the hood downloads documents to temp folder and then uses `FileSystemBlobLoader` on it. - However metadata about the document in question is lost in this process. In particular: - `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file extension, but that does not work 100% of the time. - `web_url`: this is useful to keep around since in RAG LLM we might want to provide link to the source document. In order to work well with document parsers, we pass the `web_url` as `source` (`web_url` is ignored by parsers, `source` is preserved) Dependencies: None Twitter handle: @martintriska1 Please review @baskaryan --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-05-23 11:42:19 -04:00
Eugene Yurtsev	e5541d1da7	community[patch]: Update doc-string in CloudBlobLoader (#22069 ) Update doc-string	2024-05-23 15:31:41 +00:00
Maxime Perrin	8ba4f77734	docs : Adding correct imports to the integrations callbacks doc (#22059 ) - Description: Adding correct imports to the integrations callbacks doc (langchain-community package) - Issue: #22005 --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-05-23 11:27:36 -04:00
Philippe PRADOS	6dd621d636	community[minor]: Add CloudBlobLoader that supports loading data from cloud buckets (#21957 ) Thank you for contributing to LangChain! - [ ] PR title: "Add CloudBlobLoader" - community: Add CloudBlobLoader - [ ] PR message: Add cloud blob loader - Description: LangChain provides several approaches to read different file formats: Specific loaders (`CSVLoader`) or blob-compatible loaders (`FileSystemBlobLoader`). The only implementation proposed for BlobLoader is `FileSystemBlobLoader`. Many projects retrieve files from cloud storage. We propose a new implementation of `BlobLoader` to read files from the three cloud storage systems. The interface is strictly identical to `FileSystemBlobLoader`. The only difference is the constructor, which takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`, or `gs://my-bucket`. By streamlining the process, this novel implementation eliminates the requirement to pre-download files from cloud storage to local temporary files (which are seldom removed). The code relies on the [CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to interpret cloud URLs. This has been added as an optional dependency. ```Python loader = CloudBlobLoader("s3://mybucket/id") for blob in loader.yield_blobs(): print(blob) ``` - [X] Dependencies: CloudPathLib - [X] Twitter handle: pprados - [X] Add tests and docs: Add unit test, but it's easy to convert to integration test, with some files in a cloud storage (see `test_cloud_blob_loader.py`) - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. Hello from Paris @hwchase17. Can you review this PR? --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-05-23 10:59:55 -04:00
Christophe Bornet	74947ec894	community[minor]: Add Cassandra ByteStore (#22064 )	2024-05-23 10:46:23 -04:00
Christophe Bornet	fea6b99b16	community[minor]: Add async methods to CassandraChatMessageHistory (#21975 )	2024-05-23 10:13:05 -04:00
Eugene Yurtsev	37cfc00310	docs: concepts callbacks fix admonition (#22048 ) Correct the admonition text	2024-05-22 20:33:28 -04:00
Erick Friis	53293dace8	docs: version increases (#22050 )	2024-05-22 16:20:10 -07:00
Sky	12d65f17ff	community[patch]: surrealdb provide functions for MMR (Maximal Marginal Relevance) (#21185 ) This PR contains 4 added functions: - max_marginal_relevance_search_by_vector - amax_marginal_relevance_search_by_vector - max_marginal_relevance_search - amax_marginal_relevance_search I'm no langchain expert, but tried to inspect other vectorstore sources like chroma, to build these functions for SurrealDB. If someone has some changes for me, please let me know. Otherwise I would be happy if these changes are added to the repository, so that I can use the original repo and not my local monkey patched version. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-22 22:53:55 +00:00
Erick Friis	58b6c72375	docs: add astream v2 migration guide links (#21845 ) - docs: v0.2 version sidebar - x - x	2024-05-22 15:48:42 -07:00
Bruno Alvisio	5eabe90494	community[patch]: Adding HEADER to the list of supported locations (#21946 ) Description: adds headers to the list of supported locations when generating the openai function schema	2024-05-22 22:47:56 +00:00
Bagatur	50186da0a1	infra: rm unused # noqa violations (#22049 ) Updating #21137	2024-05-22 15:21:08 -07:00
acho98	45ed5f3f51	community[minor]: Add Clova Embeddings for LangChain Community (#21890 ) - [ ] PR title: "Add Naver ClovaX embedding to LangChain community" - HyperClovaX is a large language model developed by [Naver](https://clova-x.naver.com/welcome). It's a powerful and purpose-trained LLM. - You can visit the embedding service provided by [ClovaX](https://www.ncloud.com/product/aiService/clovaStudio) - You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY, CLOVA_EMB_APP_ID From https://www.ncloud.com/product/aiService/clovaStudio --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-22 22:08:47 +00:00
arpitkumar980	444c2a3d9f	community[patch]: sharepoint loader identity enabled (#21176 ) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-05-22 22:08:31 +00:00
Eugene Yurtsev	8a877120c3	docs: add admonitions to how-to callbacks (#22046 ) Add admonitions with more information.	2024-05-22 22:05:57 +00:00
HuiyuanYan	bf3aefce93	community[patch]: Update tongyi.py to support MultimodalConversation in dashscope. (#21249 ) Add support for multimodal conversation in dashscope. We can now use the multimodal language models "qwen-vl-v1", "qwen-vl-chat-v1", and "qwen-audio-turbo" to process pictures and audio. :) - Description: add multimodal conversation support in dashscope - Issue: - Dependencies: dashscope≥1.18.0 - Twitter handle: none :) - How to use it?: - ```python Tongyi_chat = ChatTongyi( top_p=0.5, dashscope_api_key=api_key, model="qwen-vl-v1" ) response= Tongyi_chat.invoke( input = [ { "role": "user", "content": [ {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"}, {"text": "What is this?"} ] } ] ) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-22 22:04:58 +00:00
mochi	63284ffebf	experimental[patch], docs: refine notebook for MyScale `SelfQueryRetriever` (#22016 ) - Description: upgrade model to `gpt-4o`	2024-05-22 21:49:01 +00:00
MSubik	d948783a4c	community[patch]: standardize init args, update for javelin sdk release. (#21980 ) Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) Updated the Javelin chat model to standardize the initialization argument. Also fixed an existing bug, where code was initialized with incorrect call to the JavelinClient defined in the javelin_sdk, resulting in an initialization error. See related [Javelin Documentation](https://docs.getjavelin.io/docs/javelin-python/quickstart).	2024-05-22 21:47:28 +00:00
Mohammad Mohtashim	16617dd239	community[patch]: AzureSearchVectorStoreRetriever Fixed to account for search_kwargs (#21572 ) - Description: Fixed `AzureSearchVectorStoreRetriever` to account for search_kwargs. More explanation is in the mentioned issue. - Issue: #21492 --------- Co-authored-by: MAC <mac@MACs-MacBook-Pro.local> Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-22 14:46:41 -07:00
Klaudia Lemiec	45351d1bc6	docs: Chroma docstrings update (#22001 ) - Description: Added and updated Chroma docstrings - Issue: https://github.com/langchain-ai/langchain/issues/21983	2024-05-22 21:45:30 +00:00
Jerron Lim	28456c2c33	community[patch]: add args_schema to WikipediaQueryRun (#22019 ) Description: This change adds args_schema (pydantic BaseModel) to WikipediaQueryRun for correct schema formatting on LLM function calls Issue: currently using WikipediaQueryRun with OpenAI function calling returns the following error "TypeError: WikipediaQueryRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"Hunter x Hunter"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-22 21:31:58 +00:00
Mazen Ramadan	3c1d77dd64	community[minor]: Add Scrapfly Loader community integration (#22036 ) Added [Scrapfly](https://scrapfly.io/) Web Loader integration. Scrapfly is a web scraping API that allows extracting web page data into accessible markdown or text datasets. - __Description__: Added Scrapfly web loader for retrieving web page data as markdown or text. - Dependencies: scrapfly-sdk - Twitter: @thealchemi1st --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-22 21:29:13 +00:00
Chad Juliano	9a66c43146	docs: Use Kinetica Sql context API (#21993 ) Update python notebook to use new Kinetica SQL context API.	2024-05-22 14:26:20 -07:00
ccurme	b51a1eba4d	langchain, community: move OpenAIAssistantV2Runnable to community (#22044 )	2024-05-22 21:22:50 +00:00
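
The `usage_metadata` attribute described in commit fbfed65fb1 lends itself to simple token bookkeeping. The following is a minimal sketch, assuming only the three fields listed in that commit message. The `UsageMetadata` TypedDict is re-declared locally so the snippet runs without LangChain installed, and `sum_usage` is a hypothetical helper for illustration, not part of the library.

```python
from typing import Iterable, Optional, TypedDict


class UsageMetadata(TypedDict):
    """Token counts for one message (fields as listed in commit fbfed65fb1)."""

    input_tokens: int
    output_tokens: int
    total_tokens: int


def sum_usage(usages: Iterable[Optional[UsageMetadata]]) -> UsageMetadata:
    """Hypothetical helper: add up token counts, skipping messages without usage."""
    total: UsageMetadata = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in usages:
        if usage is None:  # usage_metadata is Optional, so it may be absent
            continue
        total["input_tokens"] += usage["input_tokens"]
        total["output_tokens"] += usage["output_tokens"]
        total["total_tokens"] += usage["total_tokens"]
    return total


# Example values are made up for illustration.
example = sum_usage(
    [
        {"input_tokens": 3, "output_tokens": 5, "total_tokens": 8},
        None,
        {"input_tokens": 2, "output_tokens": 1, "total_tokens": 3},
    ]
)
print(example)
```

With real messages, the input would presumably be something like `[m.usage_metadata for m in messages]`, which is an assumption based on the attribute name in the commit message.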


9538 Commits