langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-08 07:10:35 +00:00

Author	SHA1	Message	Date
rajib	0f6ef048d2	The openai_info.py does not have gpt-35-turbo which is the underlying Azure Open AI model name (#6321 ) Since this model name is not there in the list MODEL_COST_PER_1K_TOKENS, when we use get_openai_callback(), for gpt 3.5 model in Azure AI, we do not get the cost of the tokens. This will fix this issue #### Who can review? @hwchase17 @agola11 Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-26 02:16:39 -07:00
ArchimedesFTW	fe941cb54a	Change tags(str) to tags(dict) in mlflow_callback.py docs (#6473 ) Fixes #6472 #### Who can review? @agola11	2023-06-26 02:12:23 -07:00
0xcrusher	9187d2f3a9	Fixed caching bug for Multiple Caching types by correctly checking types (#6746 ) - Fixed an issue where some caching types check the wrong types, hence not allowing caching to work Maintainer responsibilities: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	2023-06-26 01:14:32 -07:00
Harrison Chase	e9877ea8b1	Tiktoken override (#6697 )	2023-06-26 00:49:32 -07:00
Gabriel Altay	f9771700e4	prevent DuckDuckGoSearchAPIWrapper from consuming top result (#6727 ) remove the `next` call that checks for None on the results generator	2023-06-25 19:54:15 -07:00
Pau Ramon Revilla	87802c86d9	Added a MHTML document loader (#6311 ) MHTML is a very interesting format since it's used both for emails but also for archived webpages. Some scraping projects want to store pages in disk to process them later, mhtml is perfect for that use case. This is heavily inspired from the beautifulsoup html loader, but extracting the html part from the mhtml file. --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-25 13:12:08 -07:00
Janos Tolgyesi	05eec99269	beautifulsoup get_text kwargs in WebBaseLoader (#6591 ) # beautifulsoup get_text kwargs in WebBaseLoader - Description: this PR introduces an optional `bs_get_text_kwargs` parameter to `WebBaseLoader` constructor. It can be used to pass kwargs to the downstream BeautifulSoup.get_text call. The most common usage might be to pass a custom text separator, as seen also in `BSHTMLLoader`. - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: jtolgyesi	2023-06-25 12:42:27 -07:00
Matt Robinson	be68f6f8ce	feat: Add `UnstructuredRSTLoader` (#6594 ) ### Summary Adds an `UnstructuredRSTLoader` for loading [reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText) file. ### Testing ```python from langchain.document_loaders import UnstructuredRSTLoader loader = UnstructuredRSTLoader( file_path="example_data/README.rst", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @hwchase17 - @rlancemartin - @eyurtsev	2023-06-25 12:41:57 -07:00
Chip Davis	b32cc01c9f	feat: added tqdm progress bar to UnstructuredURLLoader (#6600 ) - Description: Adds a simple progress bar with tqdm when using UnstructuredURLLoader. Exposes new parameter `show_progress_bar`. Very simple PR. - Issue: N/A - Dependencies: N/A - Tag maintainer: @rlancemartin @eyurtsev --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-25 12:41:25 -07:00
Augustine Theodore	afc292e58d	Fix WhatsAppChatLoader : Enable parsing additional formats (#6663 ) - Description: Updated regex to support a new format that was observed when whatsapp chat was exported. - Issue: #6654 - Dependencies: No new dependencies - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-25 12:08:43 -07:00
Sumanth Donthula	3e30a5d967	updated sql_database.py for returning sorted table names. (#6692 ) Added code to get the tables info in sorted order in methods get_usable_table_names and get_table_info. Linked to Issue: #6640	2023-06-25 12:04:24 -07:00
刘方瑞	9d1b3bab76	Fix Typo in LangChain MyScale Integration Doc (#6705 ) - Description: Fix Typo in LangChain MyScale Integration Doc @hwchase17	2023-06-25 11:54:00 -07:00
sudolong	408c8d0178	fix chroma _similarity_search_with_relevance_scores missing `kwargs` … (#6708 ) Issue: https://github.com/hwchase17/langchain/issues/6707	2023-06-25 11:53:42 -07:00
Zander Chase	d89e10d361	Fix Multi Functions Agent Tracing (#6702 ) Confirmed it works now: https://dev.langchain.plus/public/0dc32ce0-55af-432e-b09e-5a1a220842f5/r	2023-06-25 10:39:04 -07:00
Harrison Chase	1742db0c30	bump version to 215 (#6719 )	2023-06-25 08:52:51 -07:00
Ankush Gola	e1b801be36	split up batch llm calls into separate runs (#5804 )	2023-06-24 21:03:31 -07:00
Davis Chase	1da99ce013	bump v214 (#6694 )	2023-06-24 14:23:11 -07:00
Lance Martin	dd36adc0f4	Make bs4 a local import in recursive_url_loader.py (#6693 ) Resolve https://github.com/hwchase17/langchain/issues/6679	2023-06-24 13:54:10 -07:00
Harrison Chase	ef4c7b54ef	bump to version 213 (#6688 )	2023-06-24 11:56:37 -07:00
UmerHA	068142fce2	Add caching to BaseChatModel (issue #1644 ) (#5089 ) # Add caching to BaseChatModel Fixes #1644 (Sidenote: While testing, I noticed we have multiple implementations of Fake LLMs, used for testing. I consolidated them.) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-24 11:45:09 -07:00
Harrison Chase	c289cc891a	Harrison/optional ids opensearch (#6684 ) Co-authored-by: taekimsmar <66041442+taekimsmar@users.noreply.github.com>	2023-06-24 09:19:57 -07:00
Hrag Balian	2518e6c95b	Session deletion method in motorhead memory (#6609 ) Motorhead Memory module didn't support deletion of a session. Added a method to enable deletion. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-23 21:27:42 -07:00
Baichuan Sun	9fbe346860	Amazon API Gateway hosted LLM (#6673 ) This PR adds a new LLM class for the Amazon API Gateway hosted LLM. The PR also includes example notebooks for using the LLM class in an Agent chain. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-23 21:27:25 -07:00
Davis Chase	fa1bb873e2	Fix openapi parameter parsing (#6676 ) Ensure parameters are json serializable, related to #6671	2023-06-23 21:19:12 -07:00
Akash	b7e1c54947	Just corrected a small inconsistency on a doc page (#6603 ) ### Just corrected a small inconsistency on a doc page (not exactly a typo, per se) - Description: There was an inconsistency due to the use of single quotes in one place on the [Sequential Chains](https://python.langchain.com/docs/modules/chains/foundational/sequential_chains) page of the docs, - Issue: NA, - Dependencies: NA, - Tag maintainer: @dev2049, - Twitter handle: kambleakash0	2023-06-23 16:09:29 -07:00
Davis Chase	2da1aab50b	Wiki loader lint (#6670 )	2023-06-23 16:05:42 -07:00
Leonid Ganeline	1c81883d42	added docstrings where they missed (#6626 ) This PR targets the `API Reference` documentation. - Several classes and functions were missing `docstrings`. These docstrings were added. - In several places this ``` except ImportError: raise ValueError( ``` was replaced with ``` except ImportError: raise ImportError( ```	2023-06-23 15:49:44 -07:00
Shashank	3364e5818b	Changed generate_prompt.py (#6644 ) Modified regex for Fix: ValueError: Could not parse output	2023-06-23 15:48:33 -07:00
Davis Chase	f1e1ac2a01	chroma nb close img tag (#6669 )	2023-06-23 15:41:54 -07:00
eLafo	db8b13df4c	adds doc_content_chars_max argument to WikipediaLoader (#6645 ) # Description It adds a new initialization param in `WikipediaLoader` so we can override the `doc_content_chars_max` param used in `WikipediaAPIWrapper` under the hood, e.g: ```python from langchain.document_loaders import WikipediaLoader # doc_content_chars_max is the new init param loader = WikipediaLoader(query="python", doc_content_chars_max=90000) ``` ## Decisions `doc_content_chars_max` default value will be 4000, because it's the current value I have added pycode comments # Issue #6639 # Dependencies None # Twitter handle [@elafo](https://twitter.com/elafo)	2023-06-23 15:22:09 -07:00
Davis Chase	5e5b30b74f	openapi -> openai nit (#6667 )	2023-06-23 15:09:02 -07:00
Jeff Huber	2acf109c4b	update chroma notebook (#6664 ) @rlancemartin I updated the notebook for Chroma to hopefully be a lot easier for users.	2023-06-23 15:03:06 -07:00
Eduard van Valkenburg	48381f1f78	PowerBI: catch outdated token (#6634 ) This adds just a small tweak to catch the error that says the token is expired rather than retrying.	2023-06-23 15:01:08 -07:00
Piyush Jain	b1de927f1b	Kendra retriever api (#6616 ) ## Description Replaces [Kendra Retriever](https://github.com/hwchase17/langchain/blob/master/langchain/retrievers/aws_kendra_index_retriever.py) with an updated version that uses the new [retriever API](https://docs.aws.amazon.com/kendra/latest/dg/searching-retrieve.html) which is better suited for retrieval augmented generation (RAG) systems. Note: This change requires the latest version (1.26.159) of boto3 to work. `pip install -U boto3` to upgrade the boto3 version. cc @hupe1980 cc @dev2049	2023-06-23 14:59:35 -07:00
ChrisLovejoy	4e5d78579b	fix minor typo in vector_db_qa.mdx (#6604 ) - Description: minor typo fixed - doesn't instead of does. No other changes.	2023-06-23 14:57:37 -07:00
Ikko Eltociear Ashimine	73da193a4b	Fix typo in myscale_self_query.ipynb (#6601 )	2023-06-23 14:57:12 -07:00
Saarthak Maini	ba256b23f2	Fix Typo (#6595 ) Resolves #6582	2023-06-23 14:56:54 -07:00
kourosh hakhamaneshi	f6fdabd20b	Fix ray-project/Aviary integration (#6607 ) - Description: The Aviary integration's URL has changed. This PR provides a fix for that change and also makes the input URL optional in the API (since it can be set via env variables). - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2023-06-23 14:49:53 -07:00
northern-64bit	dbe1d029ec	Fix grammar mistake in base.py in planners (#6611 ) Fix a typo in `langchain/experimental/plan_and_execute/planners/base.py`, by changing "Given input, decided what to do." to "Given input, decide what to do." This is in the docstring for functions running LLM chains which shall create a plan, "decided" does not make any sense in this context.	2023-06-23 14:47:10 -07:00
Aaron Pham	082976d8d0	fix(docs): broken link for OpenLLM (#6622 ) This link for the notebook of OpenLLM is not migrated to the new format Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-23 13:59:17 -07:00
Davis Chase	fe828185ed	Dev2049/bump 212 (#6665 )	2023-06-23 13:48:02 -07:00
Hassan Ouda	9e52134d30	ChatVertexAI broken - Fix error with sending context in params (#6652 ) Vertex AI chat is broken right now. That is because context is in params and chat.send_message doesn't accept it as a param. - Closes issue [ChatVertexAI Error: _ChatSessionBase.send_message() got an unexpected keyword argument 'context' #6610](https://github.com/hwchase17/langchain/issues/6610)	2023-06-23 13:38:21 -07:00
Lance Martin	c2b25c17c5	Recursive URL loader (#6455 ) We may want to load all URLs under a root directory. For example, let's look at the [LangChain JS documentation](https://js.langchain.com/docs/). This has many interesting child pages that we may want to read in bulk. Of course, the `WebBaseLoader` can load a list of pages. But the challenge is traversing the tree of child pages and actually assembling that list! We do this using the `RecursiveUrlLoader`. This also gives us the flexibility to exclude some children (e.g., the `api` directory with > 800 child pages).	2023-06-23 13:09:00 -07:00
Lance Martin	be02572d58	Add delete and ensure add_texts performs upsert (w/ ID optional) (#6126 ) ## Goal We want to ensure consistency across vectordbs: 1/ add `delete` by ID method to the base vectorstore class 2/ ensure `add_texts` performs `upsert` with ID optionally passed ## Testing - [x] Pinecone: notebook test w/ `langchain_test` vectorstore. - [x] Chroma: Review by @jeffchuber, notebook test w/ in memory vectorstore. - [x] Supabase: Review by @copple, notebook test w/ `langchain_test` table. - [x] Weaviate: Notebook test w/ `langchain_test` index. - [x] Elastic: Reviewed by @vestal. Notebook test w/ `langchain_test` table. - [ ] Redis: Asked for review from owner of recent `delete` method https://github.com/hwchase17/langchain/pull/6222	2023-06-23 13:03:10 -07:00
Lance Martin	393f469eb3	Create merge loader that combines documents from a set of loaders (#6659 ) Simple utility loader that combines documents from a set of specified loaders.	2023-06-23 13:02:48 -07:00
Davis Chase	6988039975	openapi_openai docstring (#6661 )	2023-06-23 11:38:33 -07:00
Davis Chase	b25933b607	bump 211 (#6660 )	2023-06-23 11:10:48 -07:00
Davis Chase	e013459b18	Openapi to openai (#6658 )	2023-06-23 11:00:34 -07:00
Davis Chase	b062a3f938	bump 210 (#6656 )	2023-06-23 09:37:58 -07:00
Alejandra De Luna	980c865174	fix: remove callbacks arg from Tool and StructuredTool inferred schema (#6483 ) Fixes #5456 This PR removes the `callbacks` argument from a tool's schema when creating a `Tool` or `StructuredTool` with the `from_function` method and `infer_schema` is set to `True`. The `callbacks` argument is now removed in the `create_schema_from_function` and `_get_filtered_args` methods. As suggested by @vowelparrot, this fix provides a straightforward solution that minimally affects the existing implementation. A test was added to verify that this change enables the expected use of `Tool` and `StructuredTool` when using a `CallbackManager` and inferring the tool's schema. - @hwchase17	2023-06-23 01:48:27 -07:00

2969 Commits