langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Bagatur	a94dc6ee44	model garden nit (#10194 )	2023-09-04 11:42:35 -07:00
Louis	bb8c095127	Add 'download_dir' argument to VLLM (#9754 ) - Description: Add a 'download_dir' argument to the VLLM model (to change the cache download directory when retrieving a model from the HF hub) - Issue: On some remote machines, I want the cache dir to be in a volume where I have space (models are heavy nowadays). Sometimes the default HF cache dir might not be what we want. - Dependencies: None --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-04 10:53:48 -07:00
Aashish Saini	8bba69ffd0	Fixed some grammatical typos in doc files (#10191 ) Fixed some grammatical typos in doc files CC: @baskaryan, @eyurtsev, @rlancemartin. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Md Nazish Arman <142379599+MdNazishArmanShorthillsAI@users.noreply.github.com> Co-authored-by: KamalSharmaShorthillsAI <142474019+KamalSharmaShorthillsAI@users.noreply.github.com> Co-authored-by: Lakshya <lakshyagupta87@yahoo.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 10:48:08 -07:00
Bagatur	098b4aa465	bump 281 (#10189 )	2023-09-04 08:51:50 -07:00
Aashish Saini	699f58fb83	Fixed Import Error type (#10168 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 08:43:28 -07:00
刘方瑞	de9e545542	MyScale hot fix on type check (#10180 ) Previous PR #9353 has incomplete type checks and deprecation warnings. This PR fixes those type checks and adds a deprecation warning to the MyScale vectorstore	2023-09-04 08:40:58 -07:00
JunXiang	cb928ed3d5	Fix: the duplicate characters wrong results when using `pdfplumber loader` (#10165 ) (Reopen PR #7706, hope this problem can fix.) When using `pdfplumber`, some documents may be parsed incorrectly, resulting in duplicated characters. Taking the [linked](https://bruusgaard.no/wp-content/uploads/2021/05/Datasheet1000-series.pdf) document as an example: ## Before ```python from langchain.document_loaders import PDFPlumberLoader pdf_file = 'file.pdf' loader = PDFPlumberLoader(pdf_file) docs = loader.load() print(docs[0].page_content) ``` Results: ``` 11000000 SSeerriieess PPoorrttaabbllee ssiinnggllee ggaass ddeetteeccttoorrss ffoorr HHyyddrrooggeenn aanndd CCoommbbuussttiibbllee ggaasseess TThhee RRiikkeenn KKeeiikkii GGPP--11000000 iiss aa ccoommppaacctt aanndd lliigghhttwweeiigghhtt ggaass ddeetteeccttoorr wwiitthh hhiigghh sseennssiittiivviittyy ffoorr tthhee ddeetteeccttiioonn ooff hhyyddrrooccaarrbboonnss.. TThhee mmeeaassuurreemmeenntt iiss ppeerrffoorrmmeedd ffoorr tthhiiss ppuurrppoossee bbyy mmeeaannss ooff ccaattaallyyttiicc sseennssoorr.. TThhee GGPP--11000000 hhaass aa bbuuiilltt--iinn ppuummpp wwiitthh ppuummpp bboooosstteerr ffuunnccttiioonn aanndd aa ddiirreecctt sseelleeccttiioonn ffrroomm aa lliisstt ooff 2255 hhyyddrrooccaarrbboonnss ffoorr eexxaacctt aalliiggnnmmeenntt ooff tthhee ttaarrggeett ggaass -- OOnnllyy ccaalliibbrraattiioonn oonn CCHH iiss nneecceessssaarryy.. 44 FFeeaattuurreess TThhee RRiikkeenn KKeeiikkii 110000vvvvttaabbllee ssiinnggllee HHyyddrrooggeenn aanndd CCoommbbuussttiibbllee ggaass ddeetteeccttoorrss.. TThheerree aarree 33 ssttaannddaarrdd mmooddeellss:: GGPP--11000000:: 00--1100%%LLEELL // 00--110000%%LLEELL ›› LLEELL ddeetteeccttoorr NNCC--11000000:: 00--11000000ppppmm // 00--1100000000ppppmm ›› PPPPMM ddeetteeccttoorr DDiirreecctt rreeaaddiinngg ooff tthhee ccoonncceennttrraattiioonn vvaalluueess ooff ccoommbbuussttiibbllee ggaasseess ooff 2255 ggaasseess ((55 NNPP--11000000)).. 
EEaassyy ooppeerraattiioonn ffeeaattuurree ooff cchhaannggiinngg tthhee ggaass nnaammee ddiissppllaayy wwiitthh 11 sswwiittcchh bbuuttttoonn.. LLoonngg ddiissttaannccee ddrraawwiinngg ppoossssiibbllee wwiitthh tthhee ppuummpp bboooosstteerr ffuunnccttiioonn.. VVaarriioouuss ccoommbbuussttiibbllee ggaasseess ccaann bbee mmeeaassuurreedd bbyy tthhee ppppmm oorrddeerr wwiitthh NNCC--11000000.. www.bruusgaard.no postmaster@bruusgaard.no +47 67 54 93 30 Rev: 446-2 ``` We can see that there are a large number of duplicated characters in the text, which can cause issues in subsequent applications. ## After Therefore, based on the [solution](https://github.com/jsvine/pdfplumber/issues/71) provided by the `pdfplumber` source project. I added the `"dedupe_chars()"` method to address this problem. (Just pass the parameter `dedupe` to `True`) ```python from langchain.document_loaders import PDFPlumberLoader pdf_file = 'file.pdf' loader = PDFPlumberLoader(pdf_file, dedupe=True) docs = loader.load() print(docs[0].page_content) ``` Results: ``` 1000 Series Portable single gas detectors for Hydrogen and Combustible gases The Riken Keiki GP-1000 is a compact and lightweight gas detector with high sensitivity for the detection of hydrocarbons. The measurement is performed for this purpose by means of catalytic sensor. The GP-1000 has a built-in pump with pump booster function and a direct selection from a list of 25 hydrocarbons for exact alignment of the target gas - Only calibration on CH is necessary. 4 Features The Riken Keiki 100vvtable single Hydrogen and Combustible gas detectors. There are 3 standard models: GP-1000: 0-10%LEL / 0-100%LEL › LEL detector NC-1000: 0-1000ppm / 0-10000ppm › PPM detector Direct reading of the concentration values of combustible gases of 25 gases (5 NP-1000). Easy operation feature of changing the gas name display with 1 switch button. Long distance drawing possible with the pump booster function. 
Various combustible gases can be measured by the ppm order with NC-1000. www.bruusgaard.no postmaster@bruusgaard.no +47 67 54 93 30 Rev: 446-2 ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-04 08:37:00 -07:00
Aashish Saini	27944cb611	Fixed Import Error (#10167 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 00:32:09 -07:00
Massimiliano Pronesti	10e0431e48	feat(llms): add model_kwargs to hf tgi (#10139 ) @baskaryan Following what we discussed in #9724 and your suggestion, I've added a `model_kwargs` parameter to hf tgi.	2023-09-04 00:24:13 -07:00
Eugene Yurtsev	e0f6ba08d6	FileSystemBlobLoader: Expand user path (#10133 ) Fix for: https://github.com/langchain-ai/langchain/issues/10019 Verified fix manually	2023-09-04 00:21:33 -07:00
Krish Dholakia	31bbe80758	add additional model support to chatlitellm (#10134 ) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-04 00:16:40 -07:00
IlyaKIS1	de3322609e	Implemented Milvus translator for self-querying (#10162 ) - Implemented the MilvusTranslator for self-querying using Milvus vector store - Made unit tests to test its functionality - Documented the Milvus self-querying	2023-09-04 00:16:18 -07:00
Aashish Saini	7403faa063	Fixed typo in get_started.mdx (#10163 ) Fix typo: 'Whats up' -> 'What's up' Thanks CC: @baskaryan, @eyurtsev, @rlancemartin. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>	2023-09-04 00:09:50 -07:00
Aashish Saini	f6f0b0f975	Fixed typo in bittensor.mdx (#10160 ) Fixed typo in bittensor.mdx --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>	2023-09-03 21:49:33 -07:00
Christophe Bornet	803d0d9656	Add the possibility to configure boto3 in the S3 loaders (#9304 ) - Description: this PR adds the possibility to configure boto3 in the S3 loaders. Any named argument you add will be used to create the Boto3 session. This is useful when the AWS credentials can't be passed as env variables or can't be read from the credentials file. - Issue: N/A - Dependencies: N/A - Tag maintainer: ? - Twitter handle: cbornet_ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 21:06:49 -07:00
Leonid Ganeline	03174c91d0	docs: `MLflow API` and examples (#9547 ) Added docs and links to the API and examples provided by MLflow itself	2023-09-03 20:52:20 -07:00
Xiaoyu Xee	9bcfd58580	Add dashvector self query retriever (#9684 ) ## Description Add `Dashvector` retriever and self-query retriever ## How to use ```python from langchain.vectorstores.dashvector import DashVector vectorstore = DashVector.from_documents(docs, embeddings) retriever = SelfQueryRetriever.from_llm( llm, vectorstore, document_content_description, metadata_field_info, verbose=True ) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 20:51:04 -07:00
Leonid Ganeline	056e59672b	docs: `DeepLake` example (#9663 ) Updated the `Deep Lake` example. Added a link to an example provided by Activeloop.	2023-09-03 20:42:52 -07:00
Sajal Sharma	0b6993987f	feature: add verbosity to create_qa_with_sources_chain (#9742 ) Adds a verbose parameter to the create_qa_with_sources_chain and create_qa_with_structure_chain functions	2023-09-03 20:42:20 -07:00
Jayson Ng	68f2363f5d	Allow specifying arbitrary keyword arguments in `langchain.llms.VLLM` (#9683 ) Description: add arbitrary keyword arguments for VLLM Issue: https://github.com/langchain-ai/langchain/issues/9682 Dependencies: none Tag maintainer: @hwchase17, @baskaryan	2023-09-03 20:40:06 -07:00
seamusp	43c4c6dfcc	docs: misc modelIO fixes (#9734 ) Various improvements to the Model I/O section of the documentation - Changed "Chat Model" to "chat model" in a few spots for internal consistency - Minor spelling & grammar fixes to improve readability & comprehension	2023-09-03 20:33:20 -07:00
Ackermann Yuriy	c585351bdc	Fixed query/instruction typos (#10158 ) Fixed typos in embedding parameters.	2023-09-03 20:31:37 -07:00
Nino Risteski	433c4a721e	typo in local llms fixed (#9755 ) Hi, I noticed a typo in the local_llms.ipynb file and fixed it. The word 'challenge' was missing its 'a' in the original file. @baskaryan , @eyurtsev Thanks. Co-authored-by: Fliprise <fliprise@Fliprises-MacBook-Pro.local>	2023-09-03 20:29:41 -07:00
Stefano Lottini	c9ff0ab2e9	Cassandra support for LLM cache (exact-match and semantic) (#9772 ) This PR implements two new classes in the cache module: `CassandraCache` and `CassandraSemanticCache`, similar in structure and functionality to their Redis counterpart: providing a cache for the response to a (prompt, llm) pair. Integration tests are included. Moreover, linting and type checks are all passing on my machine. Dependencies: the `pyproject.toml` and `poetry.lock` have the newest version of cassIO (the very same as in the Cassandra vector store metadata PR, submitted as #9280). If I may suggest, this issue and #9280 might be reviewed together (as they bring the same poetry changes along), so I'm tagging @baskaryan who already helped out a little with poetry-related conflicts there. (Thank you!) I'd be happy to add a short notebook if this is deemed necessary (but it seems to me that, contrary e.g. to vector stores, caches are not covered in specific notebooks). Thank you! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 20:27:02 -07:00
seamusp	16945c9922	docs: misc retrievers fixes (#9791 ) Various miscellaneous fixes to most pages in the 'Retrievers' section of the documentation: - "VectorStore" and "vectorstore" changed to "vector store" for consistency - Various spelling, grammar, and formatting improvements for readability Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 20:26:49 -07:00
Terry Tan	8bc452a466	Enhance Google search tool SerpApi response (#10157 ) Enhance the SerpApi response, which has the potential to produce more relevant output. Query: What is the weather in Pomfret? Before: > I should look up the current weather conditions. ... Final Answer: The current weather in Pomfret is 73°F with 1% chance of precipitation and winds at 10 mph. After: > I should look up the current weather conditions. ... Final Answer: The current weather in Pomfret is 62°F, 1% precipitation, 61% humidity, and 4 mph wind. --- Query: Top team in english premier league? Before: > I need to find out which team is currently at the top of the English Premier League ... Final Answer: Liverpool FC is currently at the top of the English Premier League. After: > I need to find out which team is currently at the top of the English Premier League ... Final Answer: Man City is currently at the top of the English Premier League. --- Query: Any upcoming events in Paris? Before: > I should look for events in Paris Action: Search ... Final Answer: Upcoming events in Paris this month include Whit Sunday & Whit Monday (French National Holiday), Makeup in Paris, Paris Jazz Festival, Fete de la Musique, and Salon International de la Maison de. After: > I should look for events in Paris Action: Search ... Final Answer: Upcoming events in Paris include Elektric Park 2023, The Aces, and BEING AS AN OCEAN.	2023-09-03 20:24:19 -07:00
Aashish Saini	fe0e191fb3	Made some Grammatical error fixes (#10156 ) Made some Grammatical error fixes. CC: @baskaryan, @eyurtsev, @rlancemartin. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>	2023-09-03 20:21:46 -07:00
liunux4odoo	7d48c2884e	Update json_loader.py: encoding bug (#9785 ) JSONLoader.load does not specify `encoding` in `self.file_path.read_text()` as it does in `self.file_path.open()`	2023-09-03 16:16:02 -07:00
Geonwoo Kim	e34dde3d15	docs: Fix `CustomLLM` and `Question_answering` docs (#9782 ) ### Description - Update `CustomLLM._call`: Corrected the _call method in CustomLLM to include **kwargs, ensuring consistency with parent class. - Update `Question_answering`: To fix `Page not found` error - https://python.langchain.com/docs/use_cases/code -> https://python.langchain.com/docs/use_cases/code_understanding ### Issue N/A ### Dependencies N/A ### Tag maintainer N/A ### Twitter handle N/A	2023-09-03 16:15:46 -07:00
Aashish Saini	94efede93c	Fixed Typos and grammatical issues in document files (#9789 ) Fixed typos and grammatical issues in document files. @baskaryan , @eyurtsev --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>	2023-09-03 16:09:14 -07:00
Harrison Chase	c0518be1f1	fix syntax (#10155 )	2023-09-03 16:08:43 -07:00
Juhee Kim	50ca44c79f	fix multipart email body retrieval (#9790 ) Description: Gmail message retrieval in GmailGetMessage and GmailSearch returned an empty string when encountering multipart emails. This change correctly extracts the email body for multipart emails. Dependencies: None @hwchase17 @vowelparrot	2023-09-03 16:04:36 -07:00
Cameron Hutchison	7d8bb78e5c	Extraction Chain - Custom Prompt (#9828 ) # Description This change allows you to customize the prompt used in `create_extraction_chain` as well as `create_extraction_chain_pydantic`. It also adds the `verbose` argument to `create_extraction_chain_pydantic` - because `create_extraction_chain` had it already and `create_extraction_chain_pydantic` did not. # Issue N/A # Dependencies N/A # Twitter https://twitter.com/CamAHutchison	2023-09-03 16:01:55 -07:00
mgvalverde	33f43cc1b0	Bugfix/jsonloader metadata (#9793 ) Hi, - Description: - Solves the issue #6478. - Includes some additional rework on the `JSONLoader` class: - Getting metadata is decoupled from `_get_text` - Validating metadata_func is now performed by `_validate_metadata_func`, instead of `_validate_content_key` - Issue: #6478 - Dependencies: NA - Tag maintainer: @hwchase17	2023-09-03 16:01:43 -07:00
Dane Summers	7d1b0fbe79	Adds dataview fields and tags to metadata #9800 (#9801 ) Description: Adds tags and dataview fields to ObsidianLoader doc metadata. - Issue: #9800, #4991 - Dependencies: none - Tag maintainer: My best guess is @hwchase17 looking through the git logs - Twitter handle: I don't use twitter, sorry!	2023-09-03 15:56:48 -07:00
Harrison Chase	ce47124e8f	add numbered list parser (#9837 )	2023-09-03 15:55:31 -07:00
Philippe PRADOS	f59e5d48ed	Google drive integration (lite) (#9999 ) My other [pull-request](https://github.com/langchain-ai/langchain/pull/5135) is too big to be acceptable. I propose another 'lite' version. I update only notebook to propose an integration with the external project [`langchain-googledrive`](https://github.com/pprados/langchain-googledrive). --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 15:54:42 -07:00
Viktor Zhemchuzhnikov	507e46844e	Extend SQLChatMessageHistory (#9849 ) ### Description There is a really nice class for saving chat messages into a database - SQLChatMessageHistory. It leverages SqlAlchemy to be compatible with any supported database (in contrast with PostgresChatMessageHistory, which is basically the same but is limited to Postgres). However, the class is not really customizable in terms of what you can store. I can imagine a lot of use cases, when one will need to save a message date, along with some additional metadata. To solve this, I propose to extract the converting logic from BaseMessage to SQLAlchemy model (and vice versa) into a separate class - message converter. So instead of rewriting the whole SQLChatMessageHistory class, a user will only need to write a custom model and a simple mapping class, and pass its instance as a parameter. I also noticed that there is no documentation on this class, so I added that too, with an example of custom message converter. ### Issue N/A ### Dependencies N/A ### Tag maintainer Not yet ### Twitter handle N/A	2023-09-03 15:49:53 -07:00
Jon Bennion	fed137a8a9	adding new chain for logical fallacy removal from model output in chain (#9887 ) Description: new chain for logical fallacy removal from model output in chain and docs Issue: n/a see above Dependencies: none Tag maintainer: @hinthornw in past from my end but not sure who that would be for maintenance of chains Twitter handle: no twitter feel free to call out my git user if shout out j-space-b Note: created documentation in docs/extras --------- Co-authored-by: Jon Bennion <jb@Jons-MacBook-Pro.local> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 15:44:27 -07:00
Harrison Chase	794ff2dae8	Harrison/hf lru (#10154 ) Co-authored-by: Pascal Bro <git@pascalbrokmeier.de> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:39:25 -07:00
Stanko Kuveljic	4765c09703	Pinecone upsert parallelization (#9859 ) Issue: closes #9855 * consolidates `from_texts` and `add_texts` functions for pinecone upsert * adds two types of batching (one for embeddings and one for index upsert) * adds thread pool size when instantiating pinecone index	2023-09-03 15:37:41 -07:00
Lance Martin	16a27ab244	Add prompt hub for various use-cases (#9879 ) Use prompt hub in our use-case docs and guides.	2023-09-03 15:32:22 -07:00
Lorenzo	00a7c31ffd	Fix: Nested Dicts Handling of Document Metadata (#9880 ) ## Description When the `MultiQueryRetriever` is used to get the list of documents relevant according to a query, inside a vector store, and at least one of these contain metadata with nested dictionaries, a `TypeError: unhashable type: 'dict'` exception is thrown. This is caused by the `unique_union` function which, to guarantee the uniqueness of the returned documents, tries, unsuccessfully, to hash the nested dictionaries and use them as a part of key. ```python unique_documents_dict = { (doc.page_content, tuple(sorted(doc.metadata.items()))): doc for doc in documents } ``` ## Issue #9872 (MultiQueryRetriever (get_relevant_documents) raises TypeError: unhashable type: 'dict' with dic metadata) ## Solution A possible solution is to dump the metadata dict to a string and use it as a part of hashed key. ```python unique_documents_dict = { (doc.page_content, json.dumps(doc.metadata, sort_keys=True)): doc for doc in documents } ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:27:46 -07:00
Leonid Ganeline	a52fe9528e	docs: fixed title in `Bittensor` example (#9893 ) Fixed title in the `Bittensor` example. The old title breaks the sorted order of items in the navbar. Added some formatting.	2023-09-03 15:10:42 -07:00
Davide Menini	b8baead70c	fix (Html2TextTransformer): allow configuration of html2text (#9914 ) Hi, this PR enables configuring the html2text package, instead of being bound to use the hardcoded values. While simply passing `ignore_links` and `ignore_images` to the `transform_documents` method was possible, I preferred passing them to the `__init__` method for 2 reasons: 1. It is more efficient in case of subsequent calls to `transform_documents`. 2. It allows to move the "complexity" to the instantiation, keeping the actual execution simple and general enough. IMO the transformers should all follow this pattern, allowing something like this: ```python # Instantiate transformers transformers = [ TransformerA(foo='bar'), TransformerB(bar='foo'), # others ] # During execution, call them sequentially documents = ... for tr in transformers: documents = tr.transform_documents(documents) ``` Thanks for the reviews! --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com>	2023-09-03 15:10:25 -07:00
seamusp	abd8681341	docs: chains & memory fixes (#9895 ) Various improvements to the Chains & Memory sections of the documentation including formatting, spelling, and grammar fixes to improve readability.	2023-09-03 15:06:20 -07:00
Frédéric Lepied	4dc47bd3ac	time_weighted_retriever: use a timestamp if needed (#9906 ) If last_accessed_at metadata is a float, use it as a timestamp. This makes it possible to support vector stores that do not store datetime objects, like ChromaDb. Fixes: https://github.com/langchain-ai/langchain/issues/3685	2023-09-03 15:05:30 -07:00
Josh White	bc8cceebf7	Extend DynamoDBChatMessageHistory to support composite keys (#9896 ) - Description: Adds two optional parameters to the DynamoDBChatMessageHistory class to enable users to pass in a name for their PrimaryKey, or a Key object itself to enable the use of composite keys, a common DynamoDB paradigm. [AWS DynamoDB Key docs](https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/) - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: Josh White <josh@ctrlstack.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:05:16 -07:00
Programmers Emperor	872d829201	Update __init__.py (#9955 ) Add SQLDatabaseSequentialChain Class to __init__.py so it can be accessed and used - Description: SQLDatabaseSequentialChain is not found when importing the langchain_experimental package; when I open __init__.py in langchain_experimental.sql, I found that SQLDatabaseSequentialChain is imported and added to the __all__ list - Issue: SQLDatabaseSequentialChain is not found in the langchain_experimental package - Dependencies: None - Tag maintainer: None - Twitter handle: None	2023-09-03 15:02:58 -07:00
Lucas Rodrigues Pereira	5c7afe8aae	Fix json parsing error of MULTI_PROMPT_ROUTER_TEMPLATE (#9944 ) The output at times lacks the closing markdown code block. The prompt is changed to explicitly request the closing backticks.	2023-09-03 15:00:50 -07:00
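
The fix described in commit 00a7c31ffd (#9880) can be illustrated with a minimal, self-contained sketch. The `Doc` dataclass below is a hypothetical stand-in for LangChain's document class, and this `unique_union` is a simplified re-creation based only on the snippets quoted in the commit message, not the library's actual code. It shows why a tuple of metadata items fails to hash when the metadata contains a nested dict, and how serializing the metadata with `json.dumps(..., sort_keys=True)` sidesteps that.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a document with text and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_union(documents):
    """Deduplicate documents, keyed on content plus JSON-serialized metadata.

    sort_keys=True makes the serialized form independent of key order,
    including inside nested dictionaries.
    """
    unique = {
        (doc.page_content, json.dumps(doc.metadata, sort_keys=True)): doc
        for doc in documents
    }
    return list(unique.values())


docs = [
    Doc("a", {"source": {"file": "x.pdf", "page": 1}}),
    Doc("a", {"source": {"page": 1, "file": "x.pdf"}}),  # same data, different key order
    Doc("b", {"source": {"file": "y.pdf", "page": 2}}),
]

# The earlier key shape cannot be hashed when a metadata value is a dict.
try:
    hash((docs[0].page_content, tuple(sorted(docs[0].metadata.items()))))
    old_key_hashable = True
except TypeError:
    old_key_hashable = False

unique_docs = unique_union(docs)
```

One caveat of this approach is that `json.dumps` raises a `TypeError` for metadata values that are not JSON-serializable (for example `datetime` objects), so a caller with such metadata would need a `default=` handler or a different key.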


4457 Commits
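
The behavior described in commit 4dc47bd3ac (#9906), treating a float `last_accessed_at` value as a timestamp, might look roughly like the sketch below. The helper name `as_datetime` is hypothetical and this is not the retriever's actual code; it only illustrates the conversion the commit message describes, for vector stores that return floats instead of `datetime` objects.

```python
from datetime import datetime


def as_datetime(last_accessed_at):
    """Return a datetime, converting from a POSIX timestamp if given a float.

    Hypothetical helper: values that are already datetime objects are
    returned unchanged.
    """
    if isinstance(last_accessed_at, float):
        # Interpreted in local time, mirroring datetime.timestamp() on naive values.
        return datetime.fromtimestamp(last_accessed_at)
    return last_accessed_at


stored = datetime(2023, 9, 3, 15, 5, 30)
from_float = as_datetime(stored.timestamp())  # float read back from the store
from_datetime = as_datetime(stored)           # already a datetime
```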