langchain

Author	SHA1	Message	Date
Philipp Schmid	064be93edf	[Embeddings] Add SageMaker Endpoint Embedding class (#1859 ) # What does this PR do? This PR adds a SageMaker-powered `embeddings` class, similar to the one in `llms`. This is helpful if you want to leverage Hugging Face models on SageMaker for creating your indexes. I added an example to the [docs/modules/indexes/examples/embeddings.ipynb](https://github.com/hwchase17/langchain/compare/master...philschmid:add-sm-embeddings?expand=1#diff-e82629e2894974ec87856aedd769d4bdfe400314b03734f32bee5990bc7e8062) document. The example currently includes some `_### TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging Face model ###_ ` code showing how you can deploy a sentence-transformers model to SageMaker and then run the methods of the embeddings class. @hwchase17 please let me know if/when I should remove the `_### TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging Face model ###_` code. In the description I linked to a detailed blog post on how to deploy Sentence Transformers, so I think we don't need to include those steps here. I also reused the `ContentHandlerBase` from `langchain.llms.sagemaker_endpoint` and changed the output type to `any` since it depends on the implementation.	2023-03-21 21:51:48 -07:00
anupam-tiwari	86822d1cc2	Fixes the import typo in the vector db text generator notebook (#1874 ) Fixes the import typo in the vector db text generator notebook for the chroma library Co-authored-by: Anupam <anupam@10-16-252-145.dynapool.wireless.nyu.edu>	2023-03-21 21:48:26 -07:00
Harrison Chase	a581bce379	remove key (#1863 )	2023-03-21 12:43:41 -07:00
Harrison Chase	2ffc643086	add listen api docs (#1855 )	2023-03-21 09:29:34 -07:00
Harrison Chase	2136dc94bb	bump version to 118 (#1854 )	2023-03-21 09:15:52 -07:00
Matt Tucker	a92344f476	Use regex match for bash process error output test assertion. (#1837 ) I was getting the same issue reported in #1339 by [MacYang555](https://github.com/MacYang555) when running the test suite on my Mac. I implemented the fix they suggested to use a regex match in the output assertion for the scenario under test. Resolves #1339	2023-03-21 09:06:52 -07:00
Tomoko Uchida	b706966ebc	Add setup instruction in Getting Started for Indexing (#1847 ) `VectorstoreIndexCreator` [uses Chroma as the vectorstore by default](`1c22657256/langchain/indexes/vectorstore.py (L49)`). It may be helpful to add a short note for the setup. You can see how the notebook looks here. https://github.com/mocobeta/langchain/blob/feat/add-setup-instruction-to-index-getting-started/docs/modules/indexes/getting_started.ipynb	2023-03-21 09:06:35 -07:00
Harrison Chase	1c22657256	Harrison/faiss merge (#1843 ) Co-authored-by: Ting Su <ting.su.1995@outlook.com>	2023-03-20 22:54:08 -07:00
Harrison Chase	6f02286805	Harrison/subtitles (#1842 ) Co-authored-by: David Ruan <ruanwz@gmail.com> Co-authored-by: David Ruan <david.ruan@analyticservice.net>	2023-03-20 22:53:52 -07:00
Simon Zhou	3674074eb0	Add Qdrant to ecosystem page (#1830 ) Add [Qdrant](https://qdrant.tech/) to [LangChain ecosystem](https://langchain.readthedocs.io/en/latest/ecosystem.html) page.	2023-03-20 22:06:40 -07:00
Wenbin Fang	a7e09d46c5	Add podcast api tool to use NLP to search all podcasts or episodes. (#1833 ) Use the following code to test:
```python
import os

from langchain.llms import OpenAI
from langchain.chains.api import podcast_docs
from langchain.chains import APIChain

# Get api key here: https://openai.com/pricing
os.environ["OPENAI_API_KEY"] = "sk-xxxxx"

# Get api key here: https://www.listennotes.com/api/pricing/
listen_api_key = 'xxx'

llm = OpenAI(temperature=0)
headers = {"X-ListenAPI-Key": listen_api_key}
chain = APIChain.from_llm_and_api_docs(llm, podcast_docs.PODCAST_DOCS, headers=headers, verbose=True)
chain.run("Search for 'silicon valley bank' podcast episodes, audio length is more than 30 minutes, return only 1 results")
```
Known issues: the api response data might be too big, and we'll get such error: `openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 6733 tokens (6477 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.`	2023-03-20 22:04:17 -07:00
Matt Tucker	fa2e546b76	Add workaround for debugpy install issue to contrib docs. (#1835 ) When following the Quick Start instructions in the contributing docs, I was getting a "WheelFileValidationError" on installation of debugpy which was blocking the installation of a number of other deps. Google turned up this [GitHub issue](https://github.com/microsoft/debugpy/issues/1246) indicating a regression in Poetry 1.4.1 and workarounds. This PR updates the contrib docs noting the issue and the workarounds.	2023-03-20 22:03:19 -07:00
Daniel Dror (Dubovski)	c592b12043	Allow passing in encoding to csv_loader (#1836 )	2023-03-20 22:03:00 -07:00
Ikko Eltociear Ashimine	9555bbd5bb	Fix typo in sqlite.ipynb (#1828 ) overriden -> overridden	2023-03-20 16:47:19 -07:00
Harrison Chase	0ca1641b14	release 0.0.117 (#1819 )	2023-03-20 08:04:04 -07:00
Harrison Chase	d5b4393bb2	Harrison/llm math (#1808 ) Co-authored-by: Vadym Barda <vadim.barda@gmail.com>	2023-03-20 07:53:26 -07:00
Bryan Helmig	7b6ff7fe00	Follow up to #1803 to remove dynamic docs route. (#1818 ) The base docs are going to be more stable and familiar for folks. Dynamic route is currently in flux.	2023-03-20 07:52:41 -07:00
Harrison Chase	76c7b1f677	Harrison/wandb (#1764 ) Co-authored-by: Anish Shah <93145909+ash0ts@users.noreply.github.com>	2023-03-20 07:52:27 -07:00
Paul	5aa8ece211	Corrected small typo in error message. (#1791 )	2023-03-20 07:51:35 -07:00
Harrison Chase	f6d24d5740	fix bug with openai token count (#1806 )	2023-03-20 07:51:18 -07:00
Harrison Chase	b1c4480d7c	fix typing (#1807 )	2023-03-20 07:50:49 -07:00
Daniel Chalef	b6ba989f2f	Add request timeout to ChatOpenAI (#1798 ) Add request_timeout field to ChatOpenAI. Defaults to 60s. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-03-19 20:19:42 -07:00
Ankush Gola	04acda55ec	Don't use dynamic api endpoint for Zapier NLA (#1803 ) From Robert "Right now the dynamic/ route for specifically the above endpoints is acting on all providers a user has set up, not just the provider for the supplied API key."	2023-03-19 20:12:33 -07:00
Harrison Chase	8e5c4ac867	bump version to 0.0.116 (#1788 )	2023-03-19 11:01:16 -07:00
Aratako	df8702fead	Small fix: Remove unused variable `summary_message_role` (#1789 ) After the changes in #1783, `summary_message_role` is no longer used in `ConversationSummaryBufferMemory`, so this PR removes it.	2023-03-19 11:01:03 -07:00
Harrison Chase	d5d50c39e6	Harrison/azure embeddings (#1787 ) Co-authored-by: Hemant <4627288+ghaccount@users.noreply.github.com>	2023-03-19 10:42:33 -07:00
Harrison Chase	1f18698b2a	Harrison/token buffer memory (#1786 ) Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>	2023-03-19 10:42:24 -07:00
Harrison Chase	ef4945af6b	Harrison/chat token usage (#1785 )	2023-03-19 10:32:31 -07:00
Harrison Chase	7de2ada3ea	Harrison/add source column (#1784 ) Co-authored-by: Brian Graham <46691715+briangrahamww@users.noreply.github.com> Co-authored-by: briangrahamww <brian.graham@ww.com>	2023-03-19 10:32:13 -07:00
Bernat Felip i Díaz	262d4cb9a8	Use embedding instead of embedding function in ElasticVectorStore (#1692 ) While it might be a bit more restrictive, I find that using the Embedding interface as an input for the vector store creation is better than an embedding function because we can use bulk requests and possibly the retry logic if needed. I have seen that some vector store implementations use Embedding while others use an embedding function, so I don't know what the criteria are for choosing one or the other. In my opinion they should all just use Embedding, or have a much more complex embedding function that accepts multiple texts instead of one at a time. --------- Co-authored-by: Bernat Felip <bernat.felip@rea.ch>	2023-03-19 10:23:38 -07:00
Harrison Chase	951c158106	Harrison/summary message rol (#1783 ) Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>	2023-03-19 10:09:18 -07:00
Bao Nguyen	85e4dd7fc3	Fix wrong prompt in refine chain (#1770 ) I got this during testing ``` ValueError: Missing some input keys: {'existing_answer'} ``` Upon review, the initial prompt should be `QUESTION_PROMPT_SELECTOR`. Co-authored-by: Bao Nguyen <bnguyen@roku.com>	2023-03-19 10:03:45 -07:00
Harrison Chase	b1b4a4065a	change chat default (#1782 ) Resolves https://github.com/hwchase17/langchain/issues/1532, resolves https://github.com/hwchase17/langchain/issues/1652.	2023-03-19 10:01:59 -07:00
Huang Chongdi	08f23c95d9	add encoding parameter to ObsidianLoader (#1752 )	2023-03-19 09:48:31 -07:00
hitoshi44	3cf493b089	Fix Document & Expose StringPromptTemplate as a custom-prompt-template. (#1753 ) Regarding [this issue](https://github.com/hwchase17/langchain/issues/1754), the code in the document [Creating a custom prompt template](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/custom_prompt_template.html) is no longer functional and outdated. To address this, I have made the following changes: 1. Updated the guide in the document to use `StringPromptTemplate` instead of `BasePromptTemplate`. 2. Exposed `StringPromptTemplate` in `prompts/__init__.py` for easier importing.	2023-03-19 09:47:56 -07:00
hitoshi44	e635c86145	Slightly modified the docstring in `BasePromptTemplate` and `StringPromptTemplate`. (#1755 ) Regarding [this issue](https://github.com/hwchase17/langchain/issues/1754), the `BasePromptTemplate` class docstring is a little outdated: the class now requires a new method, `format_prompt`. As such, I have made some modifications to the docstring to bring it up to date. I tried to adhere to the established documentation style, and would appreciate it if you could take a look at this PR.	2023-03-19 09:47:37 -07:00
Harrison Chase	779790167e	Harrison/add warning to openaichat (#1781 )	2023-03-19 09:43:56 -07:00
Nils Durner	3161ced4bc	GPT-4 support (#1778 )	2023-03-19 09:29:44 -07:00
hung_ng__	3d6fcb85dc	Add load json prompt example (#1776 ) Hi, this PR adds a JSON loading example to the prompt serialization examples so that they cover the same cases as loading from YAML.	2023-03-19 09:28:56 -07:00
LeoGrin	3701b2901e	use namespace argument in Pinecone constructor (#1757 ) Fix #1756 Use the `namespace` argument of `Pinecone.from_existing_index` to set the default value of `namespace` for other methods. This leads to more predictable behavior and easier integration in chains. For the test, I've added a line to delete and rebuild the `langchain-demo` index at the beginning of the test. I'm not 100% sure if it's a good idea but it makes the test reproducible.	2023-03-18 19:55:38 -07:00
Ben Gahtan	280cb4160d	Update tool.py (#1760 ) Fixed typo that said the Wikipedia tool was using Wolfram Alpha (instead of Wikipedia)	2023-03-18 19:55:26 -07:00
Kevin	80d8db5f60	Add service account support to Google Drive (#1761 ) Having service account support in the drive document loader would be nice. This is already present in the youtube loader. `cb646082ba/langchain/document_loaders/youtube.py (L76-L78)`	2023-03-18 19:55:17 -07:00
Piyush Jain	1a8790d808	Corrects copyright year (#1762 ) Corrected copyright year.	2023-03-18 19:55:05 -07:00
Eric Zhu	34840f3aee	AzureChatOpenAI for Azure Open AI's ChatGPT API (#1673 ) Add support for Azure OpenAI's ChatGPT API, which uses ChatML markups to format messages instead of objects. Related issues: #1591, #1659	2023-03-18 19:54:20 -07:00
Harrison Chase	8685d53adc	querying tabular data (#1758 )	2023-03-18 11:12:18 -07:00
Harrison Chase	2f6833d433	hotfix (#1742 )	2023-03-17 09:05:08 -07:00
Harrison Chase	dd90fd02d5	Harrison/move docs (#1741 )	2023-03-17 08:49:10 -07:00
Harrison Chase	07766a69f3	move docs (#1740 )	2023-03-17 08:42:28 -07:00
Harrison Chase	aa854988bf	bump version to 114 (#1739 )	2023-03-17 08:26:06 -07:00
Harrison Chase	96ebe98dc2	Harrison/latex splitter (#1738 ) Co-authored-by: Aidan Holland <thehappydinoa@gmail.com> Co-authored-by: Jan de Boer <44832123+Janldeboer@users.noreply.github.com>	2023-03-17 08:10:27 -07:00

1100 Commits