langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
georgian	e64bafed3a	Fixes typo in Vectara.similarity_search (#6277 ) Fixes a simple typo. @hwchase17 @dev2049 Co-authored-by: Georgian Sarghi <georgian.sarghi@gmail.com>	2023-06-18 17:48:54 -07:00
Ted	112695e4da	Iterate through filtered file types instead of all listed files (#6258 ) # Iterate through filtered file types instead of all listed files Fixes https://github.com/hwchase17/langchain/issues/6257 https://github.com/hwchase17/langchain/pull/4926 originally added the functionality to filter by file type, storing the filtered files in `_files`. https://github.com/hwchase17/langchain/pull/5220 removed the functionality when adding code to filter trashed files, by using the `files` variable instead of the `_files` variable. This PR simply adds the functionality back by using `_files` again. #### Who can review? @hwchase17 - project lead @eyurtsev	2023-06-18 17:47:58 -07:00
Dhruvil Shah	ba90e3c990	Update web_base.ipynb for guiding purposes (#6248 ) To bypass SSL verification errors during fetching, you can include the `verify=False` parameter. This markdown proves useful, especially for beginners in the field of web scraping. Fixes #6079 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:10 -07:00
Dhruvil Shah	92f05a67a4	Add markdown to specify important arguments (#6246 ) To bypass SSL verification errors during web scraping, you can include the ssl_verify=False parameter along with the headers parameter. This combination of arguments proves useful, especially for beginners in the field of web scraping. Fixes #1829 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:00 -07:00
ikebo	ca7a44d024	add max_context_size property in BaseOpenAI (#6239 ) Hi, I made a small improvement to BaseOpenAI. I added a max_context_size attribute to BaseOpenAI so that we can get the max context size directly instead of only getting the maximum token size of the prompt through the max_tokens_for_prompt method. Who can review? @hwchase17 @agola11 I followed the [Common Tasks](`c7db9febb0/.github/CONTRIBUTING.md`); all tests passed. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:46:35 -07:00
Jan Pawellek	3e3ed8c5c9	Fix LLM types so that they can be loaded from config dicts (#6235 ) LLM configurations can be loaded from a Python dict (or JSON file deserialized as dict) using the [load_llm_from_config](`8e1a7a8646/langchain/llms/loading.py (L12)`) function. However, the type string in the `type_to_cls_dict` lookup dict differs from the type string defined in some LLM classes. This means that the LLM object can be saved, but not loaded again, because the type strings differ.	2023-06-18 17:46:22 -07:00
Shu	46782ad79b	Fixed an unhandled error that was raised when DynamoDB did not have any chat history. (#6141 ) The current version of chat history with DynamoDB doesn't handle the case correctly when a table has no chat history. This change solves this error handling. Fixes https://github.com/hwchase17/langchain/issues/6088 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17	2023-06-18 17:39:19 -07:00
Cameron Vetter	2286204354	Correct AzureSearch Vector Store not applying search_kwargs when searching (#6132 ) Fixes #6131 Simply passes kwargs forward from similarity_search to helper functions so that search_kwargs are applied to search as originally intended. See bug for repro steps. #### Who can review? @hwchase17 @dev2049 Twitter: poshporcupine	2023-06-18 17:39:06 -07:00
Pierre Dulac	395a2a3724	Fix typo in the CAI critique prompt (#6123 ) Very small typo in the Constitutional AI critique default prompt. The negation "If there is no material critique of ..." is used two times, should be used only on the first one. Cheers, Pierre	2023-06-18 17:38:56 -07:00
Hao Chen	38057f0d2e	Fix latest clickhouse vector schema change (#6385 ) Fixes https://github.com/hwchase17/langchain/issues/6208 #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-18 17:34:53 -07:00
Davit Buniatyan	1ab9dc8293	[hotfix] Deep Lake fails on newer version due to hardcode (#6383 ) Hot Fixes for Deep Lake [would highly appreciate expedited review] * deeplake version was hardcoded and since deeplake upgraded the integration fails with confusing error * an additional integration test fixed due to embedding function * Additionally fixed docs for code understanding links after docs upgraded * notebook removal of public parameter to make sure code understanding notebook works #### Who can review? @hwchase17 @dev2049 --------- Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-06-18 17:33:49 -07:00
hp0404	6aa7b04f79	Fix integration tests for Faiss vector store (#6281 ) Fixes #5807 (issue) #### Who can review? Tag maintainers/contributors who might be interested: @dev2049	2023-06-18 17:25:49 -07:00
Chakib Benziane	ddd518a161	searx_search: updated tools and doc (#6276 ) - Allows using the same wrapper to create multiple tools ```python wrapper = SearxSearchWrapper(searx_host="**") github_tool = SearxSearchResults(name="Github", wrapper=wrapper, kwargs = { "engines": ["github"], }) arxiv_tool = SearxSearchResults(name="Arxiv", wrapper=wrapper, kwargs = { "engines": ["arxiv"] }) ``` - Updated link to searx documentation Agents / Tools / Toolkits - @hwchase17	2023-06-18 17:23:12 -07:00
ju-bezdek	e2f36ee608	OpenAI functions dont work with async streaming... #6225 (#6226 ) Related to this https://github.com/hwchase17/langchain/issues/6225 Just copied the implementation from the `generate` function to `agenerate` and tested it. Didn't run any official tests though. Fixes #6225 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17, @agola11	2023-06-18 17:05:16 -07:00
Jan Pawellek	ea6a5b03e0	Fix output final text for HuggingFaceTextGenInference when streaming (#6211 ) The LLM integration [HuggingFaceTextGenInference](https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_text_gen_inference.py) already has streaming support. However, when streaming is enabled, it always returns an empty string as the final output text when the LLM is finished. This is because `text` is instantiated with an empty string and never updated. This PR fixes the collection of the final output text by concatenating new tokens.	2023-06-18 17:01:15 -07:00
Tomaz Bratanic	b3bccabc66	Add option to save/load graph cypher QA (#6219 ) Similar to https://github.com/hwchase17/langchain/pull/5818 Added the functionality to save/load Graph Cypher QA Chain due to a user reporting the following error > raise NotImplementedError("Saving not supported for this chain type.")\nNotImplementedError: Saving not supported for this chain type.\n'	2023-06-18 17:00:27 -07:00
Harrison Chase	495128ba95	Harrison/functions docs improvements (#6389 ) Co-authored-by: Sumanth Donthula <46747610+sumanthdonthula@users.noreply.github.com>	2023-06-18 16:57:33 -07:00
Leonid Ganeline	c7ca350cd3	Fix class promotion (#6187 ) In LangChain, all module classes are enumerated in the `__init__.py` file of the corresponding module. But some classes were missed and were not included in the module `__init__.py` This PR: - added the missed classes to the module `__init__.py` files - `__init__.py:__all__` variable value (a list of the class names) was sorted - `langchain.tools.sql_database.tool.QueryCheckerTool` was renamed to `QuerySQLCheckerTool` because it conflicted with `langchain.tools.spark_sql.tool.QueryCheckerTool` - changes to `pyproject.toml`: - added `pgvector` to `pyproject.toml:extended_testing` - added `pandas` to `pyproject.toml:[tool.poetry.group.test.dependencies]` - commented out `streamlit` in `callbacks/__init__.py`, because `streamlit` now requires Python >=3.7, !=3.9.7 - fixed duplicate names in `tools` - fixed the corresponding unit tests #### Who can review? @hwchase17 @dev2049	2023-06-18 16:55:18 -07:00
Harrison Chase	c0c2fd0782	Harrison/zep mem (#6388 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-06-18 16:53:35 -07:00
Harrison Chase	b7159c15cc	Harrison/metaphor search fix (#6387 ) Co-authored-by: jeffzwang <jeffreyzhiyuanwang@gmail.com>	2023-06-18 16:53:24 -07:00
Harrison Chase	9bf5b0defa	Harrison/myscale self query (#6376 ) Co-authored-by: Fangrui Liu <fangruil@moqi.ai> Co-authored-by: 刘方瑞 <fangrui.liu@outlook.com> Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>	2023-06-18 16:53:10 -07:00
Harrison Chase	bd8d418a95	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-18 16:45:49 -07:00
Harrison Chase	3a75d59c3d	searx - docs	2023-06-18 16:45:42 -07:00
MIDORIBIN	5be465bd86	Fixed PermissionError on windows (#6170 ) Fixed a PermissionError that occurred when downloading PDF files via HTTP in BasePDFLoader on Windows. When downloading PDF files via HTTP, BasePDFLoader uses NamedTemporaryFile, and a file created this way cannot be opened a second time on Windows ([Python Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile)). So we create a temporary directory with TemporaryDirectory and place the downloaded file there. The temporary directory is deleted in the destructor. Fixes #2698 #### Who can review? Tag maintainers/contributors who might be interested: - @eyurtsev - @hwchase17	2023-06-18 16:39:57 -07:00
xleven	4fc7939848	fix link of callbacks on modules page (#6323 ) Since [Callbacks](https://python.langchain.com/docs/modules/callbacks/getting_started/) on [Modules](https://python.langchain.com/docs/modules/) went to a "Page Not Found".	2023-06-18 15:08:12 -07:00
Vijay	2b3b4e0f60	Add the ability to run the map_reduce chains process results step as async (#6181 ) This will add the ability to add an AsyncCallbackManager (handler) for the reducer chain, which would be able to stream the tokens via the `async def on_llm_new_token` callback method Fixes # (issue) [5532](https://github.com/hwchase17/langchain/issues/5532) @hwchase17 @agola11 The following code snippet explains how this change would be used to enable `reduce_llm` with streaming support in a `map_reduce` chain I have tested this change and it works for the streaming use-case of reducer responses. I am happy to share more information if this solution makes sense. ``` AsyncHandler .......................... class StreamingLLMCallbackHandler(AsyncCallbackHandler): """Callback handler for streaming LLM responses.""" def __init__(self, websocket): self.websocket = websocket # This callback method is to be executed in async async def on_llm_new_token(self, token: str, **kwargs: Any) -> None: resp = ChatResponse(sender="bot", message=token, type="stream") await self.websocket.send_json(resp.dict()) Chain .......... 
stream_handler = StreamingLLMCallbackHandler(websocket) stream_manager = AsyncCallbackManager([stream_handler]) streaming_llm = ChatOpenAI( streaming=True, callback_manager=stream_manager, verbose=False, temperature=0, ) main_llm = OpenAI( temperature=0, verbose=False, ) doc_chain = load_qa_chain( llm=main_llm, reduce_llm=streaming_llm, chain_type="map_reduce", callback_manager=manager ) qa_chain = ConversationalRetrievalChain( retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator, callback_manager=manager, ) # Here `acall` will trigger `acombine_docs` on `map_reduce` which should then call `_aprocess_result` which in turn will call `self.combine_document_chain.arun` hence async callback will be awaited result = await qa_chain.acall( {"question": question, "chat_history": chat_history} ) ```	2023-06-18 13:19:56 -07:00
Alvaro Bartolome	e0dea577ee	Extend `ArgillaCallbackHandler` support (#6153 ) Hi again @agola11! 🤗 ## What's in this PR? After playing around with different chains we noticed that some chains were using different `output_key`s and we were just handling some, so we've extended the support to any output, either if it's a Python list or a string. Kudos to @dvsrepo for spotting this! --------- Co-authored-by: Daniel Vila Suero <daniel@argilla.io>	2023-06-18 11:18:33 -07:00
Harrison Chase	a8cb9ee013	Harrison/gdrive enhancements (#6375 ) Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>	2023-06-18 11:07:23 -07:00
rafael	ebfffaa38f	Guardrails output parser: Pass LLM api for reasking (#6089 ) Fixes https://github.com/ShreyaR/guardrails/issues/155 Enables guardrails reasking by specifying an LLM api in the output parser.	2023-06-18 10:50:20 -07:00
Davis Chase	ec850e607f	bump 203 (#6372 )	2023-06-18 09:20:47 -07:00
Lance Martin	370becdfc2	Add self query retriever example with MD header splitting (#6359 ) Flesh out the notebook example for `MarkdownHeaderTextSplitter`	2023-06-17 21:40:20 -07:00
Lance Martin	2c97fbabbd	Update MD header text splitter notebook (#6339 ) Highlight use case for maintaining header groups when splitting.	2023-06-17 13:19:27 -07:00
Harrison Chase	a2bbe3dda4	Harrison/mmr support for opensearch (#6349 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-06-17 12:22:37 -07:00
Davis Chase	2eea5d4cb4	Add ignore vercel preview script (#6320 ) Skip building the docs preview for any branch that doesn't start with `__docs__`. Will eventually update to look at code diff directories, but patching for now.	2023-06-17 11:17:08 -07:00
Harrison Chase	7a48d9ee82	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-17 11:16:19 -07:00
Kenny	e30fdffd1e	Add new openai 0613 model costs (#6110 ) Added costs for gpt-4-32k-0613, gpt-4-0613, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, and gpt-3.5-turbo-16k-0613 to openai_info callback based on this [OpenAI post](https://openai.com/blog/function-calling-and-other-api-updates) @agola11	2023-06-17 11:11:47 -07:00
Dhruvil Shah	2eec687474	update web_base.py to have verify option (#6107 ) We propose an enhancement to the web-based loader initialize method by introducing a "verify" option. This enhancement addresses the issue of SSL verification errors encountered on certain web pages. By providing users with the option to set the verify parameter to False, we offer greater flexibility and control. ### Fixes #6079 #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 11:10:48 -07:00
Harrison Chase	680d6bbbf8	fix titles in documentation	2023-06-17 11:09:11 -07:00
Nuno Campos	e194dc5306	Make lckwargs private (#6344 )	2023-06-17 19:08:25 +01:00
Harrison Chase	8cfb52ddbb	fix spelling	2023-06-17 11:06:54 -07:00
zengbo	5d5298087f	Custom Anthropic API URL (#6221 ) [Feature] Users can customize the Anthropic API URL #### Who can review? Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11	2023-06-17 11:01:29 -07:00
Harrison Chase	61e4a1adf9	Harrison/faiss score (#6341 ) Co-authored-by: Frank Stein <16441059+simonfromla@users.noreply.github.com> Co-authored-by: Sims Juju <sims@Ju.lan>	2023-06-17 11:00:47 -07:00
Harrison Chase	42a28ac1ba	Harrison/error zero tools (#6340 ) Co-authored-by: Juhee Kim <46583939+juppytt@users.noreply.github.com>	2023-06-17 11:00:35 -07:00
Slawomir Gonet	eef62bf4e9	qdrant: search by vector (#6043 ) Added support to `search_by_vector` to Qdrant Vector store. ### Who can review VectorStores / Retrievers / Memory - @dev2049	2023-06-17 09:44:28 -07:00
Mark	b7ba7e8a7b	Allow GoogleDrive to authenticate via application default credentials on Cloud Run/GCE etc without service key (#6035 ) @eyurtsev The existing GoogleDrive implementation always needs a service account to be available at the credentials location. When running on GCP services such as Cloud Run, a service account already exists in the metadata of the service, so no physical key is necessary. This change adds a check to see if it is running in such an environment, and uses that authentication instead. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 09:44:17 -07:00
lonestriker	6f36f0f930	Add oobabooga/text-generation-webui support as a llm (#5997 ) Add oobabooga/text-generation-webui support as an LLM. Currently, supports using text-generation-webui's non-streaming API interface. Allows users who already have text-gen running to use the same models with langchain. #### Before submitting Simple usage, similar to existing LLM supported: ``` from langchain.llms import TextGen llm = TextGen(model_url = "http://localhost:5000") ``` #### Who can review? @hwchase17 - project lead --------- Co-authored-by: Hien Ngo <Hien.Ngo@adia.ae>	2023-06-17 09:42:15 -07:00
Richy Wang	444ca3f669	Improve AnalyticDB Vector Store implementation without affecting user (#6086 ) Hi there: My earlier AnalyticDB VectorStore implementation used two tables to store documents. Using a single table seems to be a better approach, so this commit improves the AnalyticDB VectorStore implementation without affecting user behavior: 1. Streamline the `post_init` behavior by creating a single table with vector indexing. 2. Update the `add_texts` API for document insertion. 3. Optimize `similarity_search_with_score_by_vector` to retrieve results directly from the table. 4. Implement `_similarity_search_with_relevance_scores`. 5. Add `embedding_dimension` parameter to support different dimension embedding functions. Users can continue using the API as before. The test cases added earlier are sufficient to cover this commit.	2023-06-17 09:36:31 -07:00
Ja-sonYun	cdd1d78bf2	make modelname_to_contextsize as a staticmethod (#6040 ) Fixes #6039 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @agola11	2023-06-17 09:13:08 -07:00
Saba Sturua	427551eabf	DocArray as a Retriever (#6031 ) ## DocArray as a Retriever [DocArray](https://github.com/docarray/docarray) is an open-source tool for managing your multi-modal data. It offers flexibility to store and search through your data using various document index backends. This PR introduces `DocArrayRetriever` - which works with any available backend and serves as a retriever for Langchain apps. Also, I added 2 notebooks: DocArray Backends - intro to all 5 currently supported backends, how to initialize, index, and use them as a retriever DocArray Usage - showcasing what additional search parameters you can pass to create versatile retrievers Example: ```python from docarray.index import InMemoryExactNNIndex from docarray import BaseDoc, DocList from docarray.typing import NdArray from langchain.embeddings.openai import OpenAIEmbeddings from langchain.retrievers import DocArrayRetriever # define document schema class MyDoc(BaseDoc): description: str description_embedding: NdArray[1536] embeddings = OpenAIEmbeddings() # create documents descriptions = ["description 1", "description 2"] desc_embeddings = embeddings.embed_documents(texts=descriptions) docs = DocList[MyDoc]( [ MyDoc(description=desc, description_embedding=embedding) for desc, embedding in zip(descriptions, desc_embeddings) ] ) # initialize document index with data db = InMemoryExactNNIndex[MyDoc](docs) # create a retriever retriever = DocArrayRetriever( index=db, embeddings=embeddings, search_field="description_embedding", content_field="description", ) # find the relevant document doc = retriever.get_relevant_documents("action movies") print(doc) ``` #### Who can review? @dev2049 --------- Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	2023-06-17 09:09:33 -07:00
Masafumi Mori	7bb437146d	fix links to prompt templates and example selectors (#6332 ) Fixes # links to prompt templates and example selectors on the [Prompts](https://python.langchain.com/docs/modules/model_io/prompts/) page are invalid. #### Before submitting Just a small note that I tried to run `make docs_clean` and other related commands before PR written [here](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md#build-documentation-locally), it gives me an error: ```bash langchain % make docs_clean Traceback (most recent call last): File "/Users/masafumi/Downloads/langchain/.venv/bin/make", line 5, in <module> from scripts.proto import main ModuleNotFoundError: No module named 'scripts' make: *** [docs_clean] Error 1 # Poetry (version 1.5.1) # Python 3.9.13 ``` I couldn't figure out how to fix this, so I didn't run those command. But links should work. #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 Similar issue #6323 Co-authored-by: masafumimori <m.masafumimori@outlook.com>	2023-06-17 09:07:14 -07:00
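
Note on commit 5be465bd86 (#6170) above: the message describes replacing NamedTemporaryFile with a file placed inside a TemporaryDirectory that is removed in the destructor. The following is only a minimal sketch of that idea using the standard library. The class name `TempPDF`, its methods, and the default file name are hypothetical and are not taken from LangChain's BasePDFLoader.

```python
import os
import tempfile


class TempPDF:
    """Hypothetical helper: stores downloaded bytes in a temporary directory."""

    def __init__(self, data: bytes, name: str = "download.pdf") -> None:
        # A plain file inside a TemporaryDirectory can be closed and reopened
        # freely, unlike a still-open NamedTemporaryFile on Windows.
        self._temp_dir = tempfile.TemporaryDirectory()
        self.file_path = os.path.join(self._temp_dir.name, name)
        with open(self.file_path, "wb") as f:
            f.write(data)

    def read(self) -> bytes:
        # Reopen the file by path, as a PDF parser would.
        with open(self.file_path, "rb") as f:
            return f.read()

    def close(self) -> None:
        # Remove the directory and everything in it.
        self._temp_dir.cleanup()

    def __del__(self) -> None:
        # Clean up when the object is garbage collected, if not closed earlier.
        if hasattr(self, "_temp_dir"):
            self._temp_dir.cleanup()


pdf = TempPDF(b"%PDF-1.4 example bytes")
content = pdf.read()
pdf.close()
```

Calling `close()` explicitly is optional in this sketch; the destructor performs the same cleanup when the object goes away.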


2823 Commits