langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Aashish Saini	699f58fb83	Fixed Import Error type (#10168 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 08:43:28 -07:00
刘方瑞	de9e545542	MyScale hot fix on type check (#10180 ) Previous PR #9353 had incomplete type checks and deprecation warnings. This PR fixes those type checks and adds a deprecation warning to the MyScale vectorstore	2023-09-04 08:40:58 -07:00
JunXiang	cb928ed3d5	Fix: the duplicate characters wrong results when using `pdfplumber loader` (#10165 ) (Reopen PR #7706, hope this problem can fix.) When using `pdfplumber`, some documents may be parsed incorrectly, resulting in duplicated characters. Taking the [linked](https://bruusgaard.no/wp-content/uploads/2021/05/Datasheet1000-series.pdf) document as an example: ## Before ```python from langchain.document_loaders import PDFPlumberLoader pdf_file = 'file.pdf' loader = PDFPlumberLoader(pdf_file) docs = loader.load() print(docs[0].page_content) ``` Results: ``` 11000000 SSeerriieess PPoorrttaabbllee ssiinnggllee ggaass ddeetteeccttoorrss ffoorr HHyyddrrooggeenn aanndd CCoommbbuussttiibbllee ggaasseess TThhee RRiikkeenn KKeeiikkii GGPP--11000000 iiss aa ccoommppaacctt aanndd lliigghhttwweeiigghhtt ggaass ddeetteeccttoorr wwiitthh hhiigghh sseennssiittiivviittyy ffoorr tthhee ddeetteeccttiioonn ooff hhyyddrrooccaarrbboonnss.. TThhee mmeeaassuurreemmeenntt iiss ppeerrffoorrmmeedd ffoorr tthhiiss ppuurrppoossee bbyy mmeeaannss ooff ccaattaallyyttiicc sseennssoorr.. TThhee GGPP--11000000 hhaass aa bbuuiilltt--iinn ppuummpp wwiitthh ppuummpp bboooosstteerr ffuunnccttiioonn aanndd aa ddiirreecctt sseelleeccttiioonn ffrroomm aa lliisstt ooff 2255 hhyyddrrooccaarrbboonnss ffoorr eexxaacctt aalliiggnnmmeenntt ooff tthhee ttaarrggeett ggaass -- OOnnllyy ccaalliibbrraattiioonn oonn CCHH iiss nneecceessssaarryy.. 44 FFeeaattuurreess TThhee RRiikkeenn KKeeiikkii 110000vvvvttaabbllee ssiinnggllee HHyyddrrooggeenn aanndd CCoommbbuussttiibbllee ggaass ddeetteeccttoorrss.. TThheerree aarree 33 ssttaannddaarrdd mmooddeellss:: GGPP--11000000:: 00--1100%%LLEELL // 00--110000%%LLEELL ›› LLEELL ddeetteeccttoorr NNCC--11000000:: 00--11000000ppppmm // 00--1100000000ppppmm ›› PPPPMM ddeetteeccttoorr DDiirreecctt rreeaaddiinngg ooff tthhee ccoonncceennttrraattiioonn vvaalluueess ooff ccoommbbuussttiibbllee ggaasseess ooff 2255 ggaasseess ((55 NNPP--11000000)).. 
EEaassyy ooppeerraattiioonn ffeeaattuurree ooff cchhaannggiinngg tthhee ggaass nnaammee ddiissppllaayy wwiitthh 11 sswwiittcchh bbuuttttoonn.. LLoonngg ddiissttaannccee ddrraawwiinngg ppoossssiibbllee wwiitthh tthhee ppuummpp bboooosstteerr ffuunnccttiioonn.. VVaarriioouuss ccoommbbuussttiibbllee ggaasseess ccaann bbee mmeeaassuurreedd bbyy tthhee ppppmm oorrddeerr wwiitthh NNCC--11000000.. www.bruusgaard.no postmaster@bruusgaard.no +47 67 54 93 30 Rev: 446-2 ``` We can see that there are a large number of duplicated characters in the text, which can cause issues in subsequent applications. ## After Therefore, based on the [solution](https://github.com/jsvine/pdfplumber/issues/71) provided by the `pdfplumber` source project. I added the `"dedupe_chars()"` method to address this problem. (Just pass the parameter `dedupe` to `True`) ```python from langchain.document_loaders import PDFPlumberLoader pdf_file = 'file.pdf' loader = PDFPlumberLoader(pdf_file, dedupe=True) docs = loader.load() print(docs[0].page_content) ``` Results: ``` 1000 Series Portable single gas detectors for Hydrogen and Combustible gases The Riken Keiki GP-1000 is a compact and lightweight gas detector with high sensitivity for the detection of hydrocarbons. The measurement is performed for this purpose by means of catalytic sensor. The GP-1000 has a built-in pump with pump booster function and a direct selection from a list of 25 hydrocarbons for exact alignment of the target gas - Only calibration on CH is necessary. 4 Features The Riken Keiki 100vvtable single Hydrogen and Combustible gas detectors. There are 3 standard models: GP-1000: 0-10%LEL / 0-100%LEL › LEL detector NC-1000: 0-1000ppm / 0-10000ppm › PPM detector Direct reading of the concentration values of combustible gases of 25 gases (5 NP-1000). Easy operation feature of changing the gas name display with 1 switch button. Long distance drawing possible with the pump booster function. 
Various combustible gases can be measured by the ppm order with NC-1000. www.bruusgaard.no postmaster@bruusgaard.no +47 67 54 93 30 Rev: 446-2 ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-04 08:37:00 -07:00
Aashish Saini	27944cb611	Fixed Import Error (#10167 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. --------- Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com> Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com> Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com> Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com> Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com> Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com> Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com> Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>	2023-09-04 00:32:09 -07:00
Massimiliano Pronesti	10e0431e48	feat(llms): add model_kwargs to hf tgi (#10139 ) @baskaryan Following what we discussed in #9724 and your suggestion, I've added a `model_kwargs` parameter to hf tgi.	2023-09-04 00:24:13 -07:00
Eugene Yurtsev	e0f6ba08d6	FileSystemBlobLoader: Expand user path (#10133 ) Fix for: https://github.com/langchain-ai/langchain/issues/10019 Verified fix manually	2023-09-04 00:21:33 -07:00
Krish Dholakia	31bbe80758	add additional model support to chatlitellm (#10134 ) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-04 00:16:40 -07:00
IlyaKIS1	de3322609e	Implemented Milvus translator for self-querying (#10162 ) - Implemented the MilvusTranslator for self-querying using Milvus vector store - Made unit tests to test its functionality - Documented the Milvus self-querying	2023-09-04 00:16:18 -07:00
Christophe Bornet	803d0d9656	Add the possibility to configure boto3 in the S3 loaders (#9304 ) - Description: this PR adds the possibility to configure boto3 in the S3 loaders. Any named argument you add will be used to create the Boto3 session. This is useful when the AWS credentials can't be passed as env variables or can't be read from the credentials file. - Issue: N/A - Dependencies: N/A - Tag maintainer: ? - Twitter handle: cbornet_ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 21:06:49 -07:00
Xiaoyu Xee	9bcfd58580	Add dashvector self query retriever (#9684 ) ## Description Add `Dashvector` retriever and self-query retriever ## How to use ```python from langchain.vectorstores.dashvector import DashVector vectorstore = DashVector.from_documents(docs, embeddings) retriever = SelfQueryRetriever.from_llm( llm, vectorstore, document_content_description, metadata_field_info, verbose=True ) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 20:51:04 -07:00
Sajal Sharma	0b6993987f	feature: add verbosity to create_qa_with_sources_chain (#9742 ) Adds a verbose parameter to the create_qa_with_sources_chain and create_qa_with_structure_chain functions	2023-09-03 20:42:20 -07:00
Jayson Ng	68f2363f5d	Allow specifying arbitrary keyword arguments in `langchain.llms.VLLM` (#9683 ) Description: add arbitrary keyword arguments for VLLM Issue: https://github.com/langchain-ai/langchain/issues/9682 Dependencies: none Tag maintainer: @hwchase17, @baskaryan	2023-09-03 20:40:06 -07:00
Ackermann Yuriy	c585351bdc	Fixed query/instruction typos (#10158 ) Fixed typos in embedding parameters.	2023-09-03 20:31:37 -07:00
Stefano Lottini	c9ff0ab2e9	Cassandra support for LLM cache (exact-match and semantic) (#9772 ) This PR implements two new classes in the cache module: `CassandraCache` and `CassandraSemanticCache`, similar in structure and functionality to their Redis counterpart: providing a cache for the response to a (prompt, llm) pair. Integration tests are included. Moreover, linting and type checks are all passing on my machine. Dependencies: the `pyproject.toml` and `poetry.lock` have the newest version of cassIO (the very same as in the Cassandra vector store metadata PR, submitted as #9280). If I may suggest, this issue and #9280 might be reviewed together (as they bring the same poetry changes along), so I'm tagging @baskaryan who already helped out a little with poetry-related conflicts there. (Thank you!) I'd be happy to add a short notebook if this is deemed necessary (but it seems to me that, contrary e.g. to vector stores, caches are not covered in specific notebooks). Thank you! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 20:27:02 -07:00
Terry Tan	8bc452a466	Enhance Google search tool SerpApi response (#10157 ) Enhance the SerpApi response, which has the potential to give more relevant output. Query: What is the weather in Pomfret? Before: > I should look up the current weather conditions. ... Final Answer: The current weather in Pomfret is 73°F with 1% chance of precipitation and winds at 10 mph. After: > I should look up the current weather conditions. ... Final Answer: The current weather in Pomfret is 62°F, 1% precipitation, 61% humidity, and 4 mph wind. --- Query: Top team in english premier league? Before: > I need to find out which team is currently at the top of the English Premier League ... Final Answer: Liverpool FC is currently at the top of the English Premier League. After: > I need to find out which team is currently at the top of the English Premier League ... Final Answer: Man City is currently at the top of the English Premier League. --- Query: Any upcoming events in Paris? Before: > I should look for events in Paris Action: Search ... Final Answer: Upcoming events in Paris this month include Whit Sunday & Whit Monday (French National Holiday), Makeup in Paris, Paris Jazz Festival, Fete de la Musique, and Salon International de la Maison de. After: > I should look for events in Paris Action: Search ... Final Answer: Upcoming events in Paris include Elektric Park 2023, The Aces, and BEING AS AN OCEAN.	2023-09-03 20:24:19 -07:00
liunux4odoo	7d48c2884e	Update json_loader.py: encoding bug (#9785 ) JSONLoader.load does not specify `encoding` in `self.file_path.read_text()` as it does in `self.file_path.open()`	2023-09-03 16:16:02 -07:00
Juhee Kim	50ca44c79f	fix multipart email body retrieval (#9790 ) Description: Gmail message retrieval in GmailGetMessage and GmailSearch returned an empty string when encountering multipart emails. This change correctly extracts the email body for multipart emails. Dependencies: None @hwchase17 @vowelparrot	2023-09-03 16:04:36 -07:00
Cameron Hutchison	7d8bb78e5c	Extraction Chain - Custom Prompt (#9828 ) # Description This change allows you to customize the prompt used in `create_extraction_chain` as well as `create_extraction_chain_pydantic`. It also adds the `verbose` argument to `create_extraction_chain_pydantic` - because `create_extraction_chain` had it already and `create_extraction_chain_pydantic` did not. # Issue N/A # Dependencies N/A # Twitter https://twitter.com/CamAHutchison	2023-09-03 16:01:55 -07:00
mgvalverde	33f43cc1b0	Bugfix/jsonloader metadata (#9793 ) Hi, - Description: - Solves the issue #6478. - Includes some additional rework on the `JSONLoader` class: - Getting metadata is decoupled from `_get_text` - Validation of metadata_func is now performed by `_validate_metadata_func`, instead of `_validate_content_key` - Issue: #6478 - Dependencies: NA - Tag maintainer: @hwchase17	2023-09-03 16:01:43 -07:00
Dane Summers	7d1b0fbe79	Adds dataview fields and tags to metadata #9800 (#9801 ) Description: Adds tags and dataview fields to ObsidianLoader doc metadata. - Issue: #9800, #4991 - Dependencies: none - Tag maintainer: My best guess is @hwchase17 looking through the git logs - Twitter handle: I don't use twitter, sorry!	2023-09-03 15:56:48 -07:00
Harrison Chase	ce47124e8f	add numbered list parser (#9837 )	2023-09-03 15:55:31 -07:00
Viktor Zhemchuzhnikov	507e46844e	Extend SQLChatMessageHistory (#9849 ) ### Description There is a really nice class for saving chat messages into a database - SQLChatMessageHistory. It leverages SqlAlchemy to be compatible with any supported database (in contrast with PostgresChatMessageHistory, which is basically the same but is limited to Postgres). However, the class is not really customizable in terms of what you can store. I can imagine a lot of use cases, when one will need to save a message date, along with some additional metadata. To solve this, I propose to extract the converting logic from BaseMessage to SQLAlchemy model (and vice versa) into a separate class - message converter. So instead of rewriting the whole SQLChatMessageHistory class, a user will only need to write a custom model and a simple mapping class, and pass its instance as a parameter. I also noticed that there is no documentation on this class, so I added that too, with an example of custom message converter. ### Issue N/A ### Dependencies N/A ### Tag maintainer Not yet ### Twitter handle N/A	2023-09-03 15:49:53 -07:00
Jon Bennion	fed137a8a9	adding new chain for logical fallacy removal from model output in chain (#9887 ) Description: new chain for logical fallacy removal from model output in chain and docs Issue: n/a see above Dependencies: none Tag maintainer: @hinthornw in past from my end but not sure who that would be for maintenance of chains Twitter handle: no twitter feel free to call out my git user if shout out j-space-b Note: created documentation in docs/extras --------- Co-authored-by: Jon Bennion <jb@Jons-MacBook-Pro.local> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 15:44:27 -07:00
Harrison Chase	794ff2dae8	Harrison/hf lru (#10154 ) Co-authored-by: Pascal Bro <git@pascalbrokmeier.de> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:39:25 -07:00
Stanko Kuveljic	4765c09703	Pinecone upsert parallelization (#9859 ) Issue: closes #9855 * consolidates `from_texts` and `add_texts` functions for pinecone upsert * adds two types of batching (one for embeddings and one for index upsert) * adds thread pool size when instantiating pinecone index	2023-09-03 15:37:41 -07:00
Lorenzo	00a7c31ffd	Fix: Nested Dicts Handling of Document Metadata (#9880 ) ## Description When the `MultiQueryRetriever` is used to get the list of documents relevant according to a query, inside a vector store, and at least one of these contain metadata with nested dictionaries, a `TypeError: unhashable type: 'dict'` exception is thrown. This is caused by the `unique_union` function which, to guarantee the uniqueness of the returned documents, tries, unsuccessfully, to hash the nested dictionaries and use them as a part of key. ```python unique_documents_dict = { (doc.page_content, tuple(sorted(doc.metadata.items()))): doc for doc in documents } ``` ## Issue #9872 (MultiQueryRetriever (get_relevant_documents) raises TypeError: unhashable type: 'dict' with dic metadata) ## Solution A possible solution is to dump the metadata dict to a string and use it as a part of hashed key. ```python unique_documents_dict = { (doc.page_content, json.dumps(doc.metadata, sort_keys=True)): doc for doc in documents } ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:27:46 -07:00
Davide Menini	b8baead70c	fix (Html2TextTransformer): allow configuration of html2text (#9914 ) Hi, this PR enables configuring the html2text package, instead of being bound to use the hardcoded values. While simply passing `ignore_links` and `ignore_images` to the `transform_documents` method was possible, I preferred passing them to the `__init__` method for 2 reasons: 1. It is more efficient in case of subsequent calls to `transform_documents`. 2. It allows to move the "complexity" to the instantiation, keeping the actual execution simple and general enough. IMO the transformers should all follow this pattern, allowing something like this: ```python # Instantiate transformers transformers = [ TransformerA(foo='bar'), TransformerB(bar='foo'), # others ] # During execution, call them sequentially documents = ... for tr in transformers: documents = tr.transform_documents(documents) ``` Thanks for the reviews! --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com>	2023-09-03 15:10:25 -07:00
Frédéric Lepied	4dc47bd3ac	time_weighted_retriever: use a timestamp if needed (#9906 ) If the last_accessed_at metadata is a float, use it as a timestamp. This makes it possible to support vector stores that do not store datetime objects, such as ChromaDb. Fixes: https://github.com/langchain-ai/langchain/issues/3685	2023-09-03 15:05:30 -07:00
Josh White	bc8cceebf7	Extend DynamoDBChatMessageHistory to support composite keys (#9896 ) - Description: Adds two optional parameters to the DynamoDBChatMessageHistory class to enable users to pass in a name for their PrimaryKey, or a Key object itself to enable the use of composite keys, a common DynamoDB paradigm. [AWS DynamoDB Key docs](https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/) - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: Josh White <josh@ctrlstack.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:05:16 -07:00
Programmers Emperor	872d829201	Update __init__.py (#9955 ) Add the SQLDatabaseSequentialChain class to __init__.py so it can be accessed and used. - Description: SQLDatabaseSequentialChain is not found when importing the langchain_experimental package. On opening the __init__.py of langchain_experimental.sql, I found that SQLDatabaseSequentialChain is imported and added to the __all__ list - Issue: SQLDatabaseSequentialChain is not found in the langchain_experimental package - Dependencies: None - Tag maintainer: None - Twitter handle: None	2023-09-03 15:02:58 -07:00
Lucas Rodrigues Pereira	5c7afe8aae	Fix json parsing error of MULTI_PROMPT_ROUTER_TEMPLATE (#9944 ) The output at times lacks the closing markdown code block. The prompt is changed to explicitly request the closing backticks.	2023-09-03 15:00:50 -07:00
Lance Martin	387813bfb2	Sort by most recent chatIDs (#9946 ) When we `lazy_load` iMessage chats, return chats w/ most recent msg first (matches what is visualized in app).	2023-09-03 15:00:20 -07:00
German Martin	cf5a50469f	TextGen is missing async methods. (#9986 ) Adds the `_acall` and `_astream` methods that were missing; their absence prevented streaming during async executions. @rlancemartin.	2023-09-03 14:57:40 -07:00
Blake (Yung Cher Ho)	f4bed8a04c	Takeoff baseurl support (#10091 ) ## Description This PR introduces a minor change to the TitanTakeoff integration. Instead of specifying a port on localhost, this PR will allow users to specify a baseURL instead. This will allow users to use the integration if they have TitanTakeoff deployed externally (not on localhost). This removes the hardcoded reference to localhost "http://localhost:{port}". ### Info about Titan Takeoff Titan Takeoff is an inference server created by [TitanML](https://www.titanml.co/) that allows you to deploy large language models locally on your hardware in a single command. Most generative model architectures are included, such as Falcon, Llama 2, GPT2, T5 and many more. Read more about Titan Takeoff here: - [Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e) - [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started) ### Dependencies No new dependencies are introduced. However, users will need to install the titan-iris package in their local environment and start the Titan Takeoff inferencing server in order to use the Titan Takeoff integration. Thanks for your help and please let me know if you have any questions. cc: @hwchase17 @baskaryan --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 14:45:59 -07:00
Eddie Cohen	565c021730	Add ne comparator (#10006 ) Description: Adds the not comparator and operator to pinecone, chroma and deeplake. Issue: Not a registered issue but when using a selfqueryretriever with pinecone I got this error + stacktrace when I entered a query that asked to not include specific data: > raised following `error:` > Received unrecognized function ne. Valid functions are [<Operator.AND: 'and'>, <Operator.OR: 'or'>, <Operator.NOT: 'not'>, <Comparator.EQ: 'eq'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>] I noticed that chroma and deeplake also support not equals/not filtering so I added it there as well [pinecone](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language) [chroma](https://docs.trychroma.com/usage-guide#filtering-by-metadata) [deeplake](https://docs.activeloop.ai/enterprise-features/compute-engine/querying-datasets/query-syntax#and-or-not)	2023-09-03 14:45:11 -07:00
Leonid Ganeline	2221194450	`Yahoo Finance News` tool (#10014 ) Added: - the `Yahoo Finance News` tool - Unit tests - An example	2023-09-03 14:43:57 -07:00
Lars von Wedel	6d82503eb1	Add parser and loader for Azure document intelligence service. (#10136 ) Hi, this PR contains a loader / parser for Azure Document Intelligence, which is an ML-based service to ingest arbitrary PDFs / images, even if scanned. The loader generates Documents by pages of the original document. This is my first contribution to LangChain. Unfortunately I could not find the correct place for test cases. Happy to add one if you can point me to the location, but as this is a cloud-based service, a test would require network access and credentials, so it might be of limited help. Dependencies: The needed dependency was already part of pyproject.toml, no change. Twitter: feel free to mention @LarsAC on the announcement	2023-09-03 14:25:39 -07:00
Harrison Chase	4abe85be57	Harrison/string inplace (#10153 ) Co-authored-by: Wrick Talukdar <wrick.talukdar@gmail.com> Co-authored-by: Anjan Biswas <anjanavb@amazon.com> Co-authored-by: Jha <nikjha@amazon.com> Co-authored-by: Lucky-Lance <77819606+Lucky-Lance@users.noreply.github.com> Co-authored-by: 陆徐东 <luxudong@MacBook-Pro.local>	2023-09-03 14:25:29 -07:00
Harrison Chase	f5af756397	fake messages list model (#10152 ) create a fake chat model that you can configure with list of messages	2023-09-03 13:49:43 -07:00
Harrison Chase	9e6cc7b236	make hub push public by default (#10138 )	2023-09-03 13:04:58 -07:00
Bagatur	0e4c5dd176	bump 13 (#10130 )	2023-09-02 10:22:31 -07:00
Bagatur	42582adb66	bump 280 (#10117 )	2023-09-01 17:43:14 -07:00
Bagatur	9e196cb470	rm sqlite3 import (#10115 )	2023-09-01 17:14:06 -07:00
Arpan Pokharel	f8bca156d4	Add where filter in weaviate similarity search with score (#9978 ) - Description: Add where filter in weaviate similarity search with score - Issue: #9853 - Dependencies: - - Tag maintainer: - - Twitter handle: -	2023-09-01 16:09:19 -07:00
Leonid Kuligin	30239b3025	added support for inference from Model Garden (#9367 ) #8850 --------- Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-09-01 15:58:21 -07:00
Benjamin Matson	58d7d86e51	feat: add bedrock chat model (#8017 ) Replace this comment with: - Description: Add Bedrock implementation of Anthropic Claude for Chat - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: @bwmatson --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-01 13:16:57 -07:00
Massimiliano Pronesti	a7c9bd30d4	feat(llms): add missing params to huggingface text-generation (#9724 ) This small PR aims at supporting the following missing parameters in the `HuggingfaceTextGen` LLM: - `return_full_text` - sometimes useful for completion tasks - `do_sample` - quite handy to control the randomness of the model. - `watermark` @hwchase17 @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-01 13:16:27 -07:00
KyrianC	491089754d	EdenAI LLM update. Add models name option (#8963 ) This PR follows the Eden AI (LLM + embeddings) integration. #8633 We added an optional parameter to choose different AI models for providers (like 'text-bison' for provider 'google', 'text-davinci-003' for provider 'openai', etc.). Usage: ```python llm = EdenAI( feature="text", provider="google", params={ "model": "text-bison", # new "temperature": 0.2, "max_tokens": 250, }, ) ``` You can also change the provider + model after initialization ```python llm = EdenAI( feature="text", provider="google", params={ "temperature": 0.2, "max_tokens": 250, }, ) prompt = """ hi """ llm(prompt, providers='openai', model='text-davinci-003') # change provider & model ``` The Jupyter notebook has been updated with an example as well. Ping: @hwchase17, @baskaryan --------- Co-authored-by: RedhaWassim <rwasssim@gmail.com> Co-authored-by: sam <melaine.samy@gmail.com>	2023-09-01 12:11:33 -07:00
maks-operlejn-ds	b5a74fb973	Temporarily remove language selection (#10097 ) Adapting Microsoft Presidio to other languages requires a bit more work, so for now it is a good idea to remove the language selection option, so as not to cause errors and confusion. https://microsoft.github.io/presidio/analyzer/languages/ I will handle different languages after the weekend 😄	2023-09-01 11:30:48 -07:00
Bagatur	71c418725f	index rename delete_mode -> cleanup (#10103 )	2023-09-01 11:12:10 -07:00
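
Note on commit 00a7c31ffd (#9880): the deduplication key described there can be sketched in isolation as below. This is a minimal illustration, not the library's code: `Doc` is a hypothetical stand-in for a LangChain document, and `unique_union` here is a free function written only for this example. It follows the commit's approach of serializing the metadata with `json.dumps(..., sort_keys=True)` so that nested dictionaries no longer break hashing.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a document object (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_union(documents):
    """Deduplicate documents by (content, serialized metadata)."""
    unique_documents_dict = {
        # json.dumps yields a hashable string even for nested dicts;
        # sort_keys=True makes the key independent of key insertion order.
        (doc.page_content, json.dumps(doc.metadata, sort_keys=True)): doc
        for doc in documents
    }
    return list(unique_documents_dict.values())


docs = [
    Doc("a", {"source": {"file": "x.pdf", "page": 1}}),
    Doc("a", {"source": {"page": 1, "file": "x.pdf"}}),  # same metadata, different key order
    Doc("b", {"source": {"file": "y.pdf", "page": 2}}),
]
result = unique_union(docs)
```

One assumption worth stating: this only works when the metadata is JSON-serializable. Values such as `datetime` objects would need a custom `default=` handler passed to `json.dumps`.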

756 Commits