langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
olgavrou	d6320cc2c0	..	2023-09-04 23:47:26 -04:00
olgavrou	7a4387c60d	notebook fix	2023-09-04 23:46:04 -04:00
olgavrou	235dacc74a	Merge branch 'langchain-ai:master' into master	2023-09-05 11:14:08 -04:00
Bagatur	c8d7ee62ba	bump 282 (#10233 )	2023-09-05 07:58:00 -07:00
Predrag Gruevski	e34ad6fefd	Temporarily disable step that seems to be transiently failing. (#10234 )	2023-09-05 10:55:47 -04:00
Nuno Campos	5d8673a3c1	Fix usage of AsyncHtmlLoader with an already running event loop (#10220 )	2023-09-05 07:25:28 -07:00
olgavrou	3a4c895280	Merge pull request #11 from VowpalWabbit/add_notebook add random policy and notebook example	2023-09-05 09:36:20 -04:00
vintro	ac2310a405	add NumberedListOutputParser to output_parser init (#10204 ) `from langchain.output_parsers import NumberedListOutputParser` did not work, needed to add it to the init file	2023-09-05 01:12:41 -07:00
Junlin Zhou	8b95dabfe3	update(llms/TGI): Allow None as temperature value (#10212 ) Text Generation Inference's client permits the use of a None temperature as seen [here](`033230ae66/clients/python/text_generation/client.py (L71C9-L71C20)`). While I haven't dived into TGI's server code and don't know about the implications of using None as a temperature setting, I think we should grant users the option to pass None as a temperature parameter to TGI.	2023-09-05 01:07:57 -07:00
William FH	be152b6a56	Better ls info (#10202 )	2023-09-04 18:21:15 -07:00
olgavrou	9c45d5a27e	restore hash keys	2023-09-04 20:58:05 -04:00
olgavrou	f22fcb8bcd	no cache	2023-09-04 20:52:18 -04:00
olgavrou	8dc5365ee2	no cache key	2023-09-04 20:50:25 -04:00
olgavrou	5b6ebbc825	fixes in notebook	2023-09-04 19:42:43 -04:00
Christophe Bornet	f389c4fcab	Fix S3DirectoryLoader exception (#10193 ) #9304 introduced a critical bug. The S3DirectoryLoader fails completely because boto3 checks the naming of kw arguments and one of the args is badly named (very sorry for that) cc @baskaryan	2023-09-04 15:59:22 -07:00
olgavrou	5c2069890f	policy fixes	2023-09-04 18:46:45 -04:00
olgavrou	736e0dd46e	fix	2023-09-04 18:40:53 -04:00
olgavrou	5b1812f95b	fix linting checks	2023-09-04 18:35:59 -04:00
olgavrou	f1d144cd6c	run notebook and change location	2023-09-04 18:33:05 -04:00
Manuel Soria	dde1992fdd	Adding custom tools to SQL Agent (#10198 ) Changes in: - `create_sql_agent` function so that user can easily add custom tools as complement for the toolkit. - updating sql use case notebook to showcase 2 examples of extra tools. Motivation for these changes is having the possibility of including domain expert knowledge to the agent, which improves accuracy and reduces time/tokens. --------- Co-authored-by: Manuel Soria <manuel.soria@greyscaleai.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-04 15:28:28 -07:00
olgavrou	62cf108700	add random policy and notebook	2023-09-04 18:08:46 -04:00
olgavrou	af4b560b86	fix poetry after merge	2023-09-04 17:28:11 -04:00
ElReyZero	5dbae94e04	OpenAIEmbeddings: Add an optional parameter to skip empty embeddings (#10196 ) ## Description ### Issue This pull request addresses a lingering issue identified in PR #7070. In that previous pull request, an attempt was made to address the problem of empty embeddings when using the `OpenAIEmbeddings` class. While PR #7070 introduced a mechanism to retry requests for embeddings, it didn't fully resolve the issue as empty embeddings still occasionally persisted. ### Problem In certain specific use cases, empty embeddings can be encountered when requesting data from the OpenAI API. In some cases, these empty embeddings can be skipped or removed without affecting the functionality of the application. However, they might not always be resolved through retries, and their presence can adversely affect the functionality of applications relying on the `OpenAIEmbeddings` class. ### Solution To provide a more robust solution for handling empty embeddings, we propose the introduction of an optional parameter, `skip_empty`, in the `OpenAIEmbeddings` class. When set to `True`, this parameter will enable the behavior of automatically skipping empty embeddings, ensuring that problematic empty embeddings do not disrupt the processing flow. The developer will be able to optionally toggle this behavior if needed without disrupting the application flow. ## Changes Made - Added an optional parameter, `skip_empty`, to the `OpenAIEmbeddings` class. - When `skip_empty` is set to `True`, empty embeddings are automatically skipped without causing errors or disruptions. ### Example Usage ```python from langchain.embeddings import OpenAIEmbeddings # Initialize the OpenAIEmbeddings class with skip_empty=True embeddings = OpenAIEmbeddings(openai_api_key="your_api_key", skip_empty=True) # Request embeddings; empty embeddings are automatically skipped. docs is a variable containing the already split text. 
results = embeddings.embed_documents(docs) # Process results without interruption from empty embeddings ```	2023-09-04 14:10:36 -07:00
Lance Martin	8998060d85	Update docs w/ prompt hub (#10197 ) Small updates to docs	2023-09-04 14:09:08 -07:00
olgavrou	00d56fb0fc	merge from upstream	2023-09-04 16:48:59 -04:00
olgavrou	b59e2b5afa	Merge pull request #10 from VowpalWabbit/dot_prods_auto_embed Dot prods auto embed	2023-09-05 05:01:42 -04:00
olgavrou	ae5edefdcd	cleanup	2023-09-04 16:36:29 -04:00
Bagatur	a94dc6ee44	model garden nit (#10194 )	2023-09-04 11:42:35 -07:00
Louis	bb8c095127	Add 'download_dir' argument to VLLM (#9754 ) - Description: Add a 'download_dir' argument to the VLLM model (to change the cache download directory when retrieving a model from the HF hub) - Issue: On some remote machines, I want the cache dir to be in a volume where I have space (models are heavy nowadays). Sometimes the default HF cache dir might not be what we want. - Dependencies: None --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-04 10:53:48 -07:00
Aashish Saini	8bba69ffd0	Fixed some grammatical typos in doc files (#10191 ) Fixed some grammatical typos in doc files CC: @baskaryan, @eyurtsev, @rlancemartin. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Md Nazish Arman <142379599+MdNazishArmanShorthillsAI@users.noreply.github.com> Co-authored-by: KamalSharmaShorthillsAI <142474019+KamalSharmaShorthillsAI@users.noreply.github.com> Co-authored-by: Lakshya <lakshyagupta87@yahoo.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 10:48:08 -07:00
Bagatur	098b4aa465	bump 281 (#10189 )	2023-09-04 08:51:50 -07:00
Aashish Saini	699f58fb83	Fixed Import Error type (#10168 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 08:43:28 -07:00
刘方瑞	de9e545542	MyScale hot fix on type check (#10180 ) The previous PR #9353 has incomplete type checks and deprecation warnings. This PR fixes those type checks and adds a deprecation warning to the MyScale vectorstore	2023-09-04 08:40:58 -07:00
JunXiang	cb928ed3d5	Fix: the duplicate characters wrong results when using `pdfplumber loader` (#10165 ) (Reopen PR #7706, hope this problem can fix.) When using `pdfplumber`, some documents may be parsed incorrectly, resulting in duplicated characters. Taking the [linked](https://bruusgaard.no/wp-content/uploads/2021/05/Datasheet1000-series.pdf) document as an example: ## Before ```python from langchain.document_loaders import PDFPlumberLoader pdf_file = 'file.pdf' loader = PDFPlumberLoader(pdf_file) docs = loader.load() print(docs[0].page_content) ``` Results: ``` 11000000 SSeerriieess PPoorrttaabbllee ssiinnggllee ggaass ddeetteeccttoorrss ffoorr HHyyddrrooggeenn aanndd CCoommbbuussttiibbllee ggaasseess TThhee RRiikkeenn KKeeiikkii GGPP--11000000 iiss aa ccoommppaacctt aanndd lliigghhttwweeiigghhtt ggaass ddeetteeccttoorr wwiitthh hhiigghh sseennssiittiivviittyy ffoorr tthhee ddeetteeccttiioonn ooff hhyyddrrooccaarrbboonnss.. TThhee mmeeaassuurreemmeenntt iiss ppeerrffoorrmmeedd ffoorr tthhiiss ppuurrppoossee bbyy mmeeaannss ooff ccaattaallyyttiicc sseennssoorr.. TThhee GGPP--11000000 hhaass aa bbuuiilltt--iinn ppuummpp wwiitthh ppuummpp bboooosstteerr ffuunnccttiioonn aanndd aa ddiirreecctt sseelleeccttiioonn ffrroomm aa lliisstt ooff 2255 hhyyddrrooccaarrbboonnss ffoorr eexxaacctt aalliiggnnmmeenntt ooff tthhee ttaarrggeett ggaass -- OOnnllyy ccaalliibbrraattiioonn oonn CCHH iiss nneecceessssaarryy.. 44 FFeeaattuurreess TThhee RRiikkeenn KKeeiikkii 110000vvvvttaabbllee ssiinnggllee HHyyddrrooggeenn aanndd CCoommbbuussttiibbllee ggaass ddeetteeccttoorrss.. TThheerree aarree 33 ssttaannddaarrdd mmooddeellss:: GGPP--11000000:: 00--1100%%LLEELL // 00--110000%%LLEELL ›› LLEELL ddeetteeccttoorr NNCC--11000000:: 00--11000000ppppmm // 00--1100000000ppppmm ›› PPPPMM ddeetteeccttoorr DDiirreecctt rreeaaddiinngg ooff tthhee ccoonncceennttrraattiioonn vvaalluueess ooff ccoommbbuussttiibbllee ggaasseess ooff 2255 ggaasseess ((55 NNPP--11000000)).. 
EEaassyy ooppeerraattiioonn ffeeaattuurree ooff cchhaannggiinngg tthhee ggaass nnaammee ddiissppllaayy wwiitthh 11 sswwiittcchh bbuuttttoonn.. LLoonngg ddiissttaannccee ddrraawwiinngg ppoossssiibbllee wwiitthh tthhee ppuummpp bboooosstteerr ffuunnccttiioonn.. VVaarriioouuss ccoommbbuussttiibbllee ggaasseess ccaann bbee mmeeaassuurreedd bbyy tthhee ppppmm oorrddeerr wwiitthh NNCC--11000000.. www.bruusgaard.no postmaster@bruusgaard.no +47 67 54 93 30 Rev: 446-2 ``` We can see that there are a large number of duplicated characters in the text, which can cause issues in subsequent applications. ## After Therefore, based on the [solution](https://github.com/jsvine/pdfplumber/issues/71) provided by the `pdfplumber` source project. I added the `"dedupe_chars()"` method to address this problem. (Just pass the parameter `dedupe` to `True`) ```python from langchain.document_loaders import PDFPlumberLoader pdf_file = 'file.pdf' loader = PDFPlumberLoader(pdf_file, dedupe=True) docs = loader.load() print(docs[0].page_content) ``` Results: ``` 1000 Series Portable single gas detectors for Hydrogen and Combustible gases The Riken Keiki GP-1000 is a compact and lightweight gas detector with high sensitivity for the detection of hydrocarbons. The measurement is performed for this purpose by means of catalytic sensor. The GP-1000 has a built-in pump with pump booster function and a direct selection from a list of 25 hydrocarbons for exact alignment of the target gas - Only calibration on CH is necessary. 4 Features The Riken Keiki 100vvtable single Hydrogen and Combustible gas detectors. There are 3 standard models: GP-1000: 0-10%LEL / 0-100%LEL › LEL detector NC-1000: 0-1000ppm / 0-10000ppm › PPM detector Direct reading of the concentration values of combustible gases of 25 gases (5 NP-1000). Easy operation feature of changing the gas name display with 1 switch button. Long distance drawing possible with the pump booster function. 
Various combustible gases can be measured by the ppm order with NC-1000. www.bruusgaard.no postmaster@bruusgaard.no +47 67 54 93 30 Rev: 446-2 ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-04 08:37:00 -07:00
olgavrou	e10980d445	fix linting error	2023-09-04 08:56:34 -04:00
olgavrou	0f7cde023b	fix linting errors	2023-09-04 08:43:48 -04:00
olgavrou	4e9aecda90	formatting	2023-09-04 08:35:29 -04:00
olgavrou	67dc1a9dd2	cleanup	2023-09-04 07:36:47 -04:00
olgavrou	ca163f0ee6	fixes and tests	2023-09-04 07:10:44 -04:00
olgavrou	b162f1c8e1	dot product of encodings as default auto_embed	2023-09-04 05:50:15 -04:00
Aashish Saini	27944cb611	Fixed Import Error (#10167 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 00:32:09 -07:00
Massimiliano Pronesti	10e0431e48	feat(llms): add model_kwargs to hf tgi (#10139 ) @baskaryan Following what we discussed in #9724 and your suggestion, I've added a `model_kwargs` parameter to hf tgi.	2023-09-04 00:24:13 -07:00
Eugene Yurtsev	e0f6ba08d6	FileSystemBlobLoader: Expand user path (#10133 ) Fix for: https://github.com/langchain-ai/langchain/issues/10019 Verified fix manually	2023-09-04 00:21:33 -07:00
Krish Dholakia	31bbe80758	add additional model support to chatlitellm (#10134 ) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-04 00:16:40 -07:00
IlyaKIS1	de3322609e	Implemented Milvus translator for self-querying (#10162 ) - Implemented the MilvusTranslator for self-querying using Milvus vector store - Made unit tests to test its functionality - Documented the Milvus self-querying	2023-09-04 00:16:18 -07:00
Aashish Saini	7403faa063	Fixed typo in get_started.mdx (#10163 ) Fix typo: 'Whats up' -> 'What's up' Thanks CC: @baskaryan, @eyurtsev, @rlancemartin. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>	2023-09-04 00:09:50 -07:00
Aashish Saini	f6f0b0f975	Fixed typo in bittensor.mdx (#10160 ) Fixed typo in bittensor.mdx --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>	2023-09-03 21:49:33 -07:00
Christophe Bornet	803d0d9656	Add the possibility to configure boto3 in the S3 loaders (#9304 ) - Description: this PR adds the possibility to configure boto3 in the S3 loaders. Any named argument you add will be used to create the Boto3 session. This is useful when the AWS credentials can't be passed as env variables or can't be read from the credentials file. - Issue: N/A - Dependencies: N/A - Tag maintainer: ? - Twitter handle: cbornet_ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 21:06:49 -07:00
Leonid Ganeline	03174c91d0	docs: `MLflow API` and examples (#9547 ) Added docs and links to the API and examples provided by MLflow itself	2023-09-03 20:52:20 -07:00
Xiaoyu Xee	9bcfd58580	Add dashvector self query retriever (#9684 ) ## Description Add `Dashvector` retriever and self-query retriever ## How to use ```python from langchain.vectorstores.dashvector import DashVector vectorstore = DashVector.from_documents(docs, embeddings) retriever = SelfQueryRetriever.from_llm( llm, vectorstore, document_content_description, metadata_field_info, verbose=True ) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 20:51:04 -07:00
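
Commits 699f58fb83 and 27944cb611 above describe replacing `ValueError` with `ImportError` when an optional dependency is missing. The log does not show the changed code, so the following is only a minimal sketch of that pattern; the helper name `import_optional` and the package name `not_a_real_package_xyz` are hypothetical and do not come from the repository.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency, re-raising ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Raise ImportError (not ValueError) so callers can tell that the
        # failure comes from a missing module rather than from bad input.
        raise ImportError(
            f"Could not import the `{module_name}` package. "
            f"Please install it with `pip install {pip_name}`."
        ) from exc


# A standard-library module imports normally.
json_module = import_optional("json", "json")

# A missing module (hypothetical name) surfaces as ImportError with a hint.
message = ""
try:
    import_optional("not_a_real_package_xyz", "not-a-real-package-xyz")
except ImportError as err:
    message = str(err)
```

Chaining with `from exc` keeps the original traceback available, which may help when the import fails for a reason other than the package being absent.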

4434 Commits
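
Commit e0f6ba08d6 above is titled "Expand user path" for the file-system blob loader. The log does not include the diff, so this is a sketch of what expanding a user path usually means in Python, assuming the fix amounts to resolving a leading `~` before the directory is used; the function name `resolve_blob_root` is hypothetical.

```python
from pathlib import Path


def resolve_blob_root(path: str) -> Path:
    """Expand a leading `~` so the path points inside the user's home directory."""
    return Path(path).expanduser()


# A path beginning with `~` is rewritten relative to the home directory.
root = resolve_blob_root("~/documents")

# Absolute paths without `~` pass through unchanged.
unchanged = resolve_blob_root("/tmp/blobs")
```

Without the expansion, a path such as `~/documents` would be treated as a relative directory literally named `~`, which typically does not exist.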