langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
Lance Martin	16a27ab244	Add prompt hub for various use-cases (#9879 ) Use prompt hub in our use-case docs and guides.	2023-09-03 15:32:22 -07:00
Lorenzo	00a7c31ffd	Fix: Nested Dicts Handling of Document Metadata (#9880 ) ## Description When the `MultiQueryRetriever` is used to get the list of documents relevant according to a query, inside a vector store, and at least one of these contain metadata with nested dictionaries, a `TypeError: unhashable type: 'dict'` exception is thrown. This is caused by the `unique_union` function which, to guarantee the uniqueness of the returned documents, tries, unsuccessfully, to hash the nested dictionaries and use them as a part of key. ```python unique_documents_dict = { (doc.page_content, tuple(sorted(doc.metadata.items()))): doc for doc in documents } ``` ## Issue #9872 (MultiQueryRetriever (get_relevant_documents) raises TypeError: unhashable type: 'dict' with dic metadata) ## Solution A possible solution is to dump the metadata dict to a string and use it as a part of hashed key. ```python unique_documents_dict = { (doc.page_content, json.dumps(doc.metadata, sort_keys=True)): doc for doc in documents } ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:27:46 -07:00
Leonid Ganeline	a52fe9528e	docs: fixed title in `Bittensor` example (#9893 ) Fixed title in the `Bittensor` example. The old title brakes the sorted order of items in the navbar. Added some formatting.	2023-09-03 15:10:42 -07:00
Davide Menini	b8baead70c	fix (Html2TextTransformer): allow configuration of html2text (#9914 ) Hi, this PR enables configuring the html2text package, instead of being bound to use the hardcoded values. While simply passing `ignore_links` and `ignore_images` to the `transform_documents` method was possible, I preferred passing them to the `__init__` method for 2 reasons: 1. It is more efficient in case of subsequent calls to `transform_documents`. 2. It allows to move the "complexity" to the instantiation, keeping the actual execution simple and general enough. IMO the transformers should all follow this pattern, allowing something like this: ```python # Instantiate transformers transformers = [ TransformerA(foo='bar'), TransformerB(bar='foo'), # others ] # During execution, call them sequentially documents = ... for tr in transformers: documents = tr.transform_documents(documents) ``` Thanks for the reviews! --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com>	2023-09-03 15:10:25 -07:00
seamusp	abd8681341	docs: chains & memory fixes (#9895 ) Various improvements to the Chains & Memory sections of the documentation including formatting, spelling, and grammar fixes to improve readability.	2023-09-03 15:06:20 -07:00
Frédéric Lepied	4dc47bd3ac	time_weighted_retriever: use a timestamp if needed (#9906 ) If last_accessed_at metadata is a float use it as a timestamp. This allows to support vector stores that do not store datetime objects like ChromaDb. Fixes: https://github.com/langchain-ai/langchain/issues/3685 <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. These live is docs/extras directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17, @rlancemartin. -->	2023-09-03 15:05:30 -07:00
Josh White	bc8cceebf7	Extend DynamoDBChatMessageHistory to support composite keys (#9896 ) - Description: Adds two optional parameters to the DynamoDBChatMessageHistory class to enable users to pass in a name for their PrimaryKey, or a Key object itself to enable the use of composite keys, a common DynamoDB paradigm. [AWS DynamoDB Key docs](https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/) - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: Josh White <josh@ctrlstack.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-03 15:05:16 -07:00
Programmers Emperor	872d829201	Update __init__.py (#9955 ) Add SQLDatabaseSequentialChain Class to __init__.py so it can be accessed and used <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: SQLDatabaseSequentialChain is not found when importing Langchain_experimental package, when I open __init__.py Langchain_expermental.sql, I found that SQLDatabaseSequentialChain is imported and add to __all__ list - Issue: SQLDatabaseSequentialChain is not found in Langchain_experimental package - Dependencies: None, - Tag maintainer: None, - Twitter handle: None, Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. These live is docs/extras directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17, @rlancemartin. -->	2023-09-03 15:02:58 -07:00
Lucas Rodrigues Pereira	5c7afe8aae	Fix json parsing error of MULTI_PROMPT_ROUTER_TEMPLATE (#9944 ) The output at times lacks the closing markdown code block. The prompt is changed to explicitly request the closing backticks. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. These live is docs/extras directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17, @rlancemartin. -->	2023-09-03 15:00:50 -07:00
Lance Martin	387813bfb2	Sort by most recent chatIDs (#9946 ) When we `lazy_load` iMessage chats, return chats w/ most recent msg first (matches what is visualized in app).	2023-09-03 15:00:20 -07:00
German Martin	cf5a50469f	TextGen is missing async methods. (#9986 ) Adding _acall and _astream method that were missing. Preventing streaming during async executions. @rlancemartin.	2023-09-03 14:57:40 -07:00
Blake (Yung Cher Ho)	f4bed8a04c	Takeoff baseurl support (#10091 ) ## Description This PR introduces a minor change to the TitanTakeoff integration. Instead of specifying a port on localhost, this PR will allow users to specify a baseURL instead. This will allow users to use the integration if they have TitanTakeoff deployed externally (not on localhost). This removes the hardcoded reference to localhost "http://localhost:{port}". ### Info about Titan Takeoff Titan Takeoff is an inference server created by [TitanML](https://www.titanml.co/) that allows you to deploy large language models locally on your hardware in a single command. Most generative model architectures are included, such as Falcon, Llama 2, GPT2, T5 and many more. Read more about Titan Takeoff here: - [Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e) - [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started) ### Dependencies No new dependencies are introduced. However, users will need to install the titan-iris package in their local environment and start the Titan Takeoff inferencing server in order to use the Titan Takeoff integration. Thanks for your help and please let me know if you have any questions. cc: @hwchase17 @baskaryan --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-09-03 14:45:59 -07:00
Pu Cao	05664a6f20	docs(text_splitter): update document of character splitter with tiktoken (#10001 ) The current document has not mentioned that splits larger than chunk size would happen. I update the related document and explain why it happens and how to solve it. related issue #1349 #3838 #2140	2023-09-03 14:45:45 -07:00
Eddie Cohen	565c021730	Add ne comparator (#10006 ) Description: Adds the not comparator and operator to pinecone, chroma and deeplake. Issue: Not a registered issue but when using a selfqueryretriever with pinecone I got this error + stacktrace when I entered a query that asked to not include specific data: > raised following `error:` > Received unrecognized function ne. Valid functions are [<Operator.AND: 'and'>, <Operator.OR: 'or'>, <Operator.NOT: 'not'>, <Comparator.EQ: 'eq'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>] I noticed that chroma and deeplake also support not equals/not filtering so I added it there as well [pinecone](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language) [chroma](https://docs.trychroma.com/usage-guide#filtering-by-metadata) [deeplake](https://docs.activeloop.ai/enterprise-features/compute-engine/querying-datasets/query-syntax#and-or-not)	2023-09-03 14:45:11 -07:00
Leonid Ganeline	2221194450	`Yahoo Finance News` tool (#10014 ) Added: - the `Yahoo Finance News` tool - Ut-s - An example	2023-09-03 14:43:57 -07:00
Ismail Pelaseyed	5c3e9c9083	Add example of running Q&A over structured data using the `Airbyte` loaders and `pandas` (#10069 ) - Description: Added example of running Q&A over structured data using the `Airbyte` loaders and `pandas` - Dependencies: any dependencies required for this change, - Tag maintainer: @hwchase17 - Twitter handle: @pelaseyed	2023-09-03 14:32:33 -07:00
Lars von Wedel	6d82503eb1	Add parser and loader for Azure document intelligence service. (#10136 ) Hi, this PR contains loader / parser for Azure Document intelligence which is a ML-based service to ingest arbitrary PDFs / images, even if scanned. The loader generates Documents by pages of the original document. This is my first contribution to LangChain. Unfortunately I could not find the correct place for test cases. Happy to add one if you can point me to the location, but as this is a cloud-based service, a test would require network access and credentials - so might be of limited help. Dependencies: The needed dependency was already part of pyproject.toml, no change. Twitter: feel free to mention @LarsAC on the announcement	2023-09-03 14:25:39 -07:00
Harrison Chase	4abe85be57	Harrison/string inplace (#10153 ) Co-authored-by: Wrick Talukdar <wrick.talukdar@gmail.com> Co-authored-by: Anjan Biswas <anjanavb@amazon.com> Co-authored-by: Jha <nikjha@amazon.com> Co-authored-by: Lucky-Lance <77819606+Lucky-Lance@users.noreply.github.com> Co-authored-by: 陆徐东 <luxudong@MacBook-Pro.local>	2023-09-03 14:25:29 -07:00
Harrison Chase	f5af756397	fake messages list model (#10152 ) create a fake chat model that you can configure with list of messages	2023-09-03 13:49:43 -07:00
Harrison Chase	9e6cc7b236	make hub push public by default (#10138 )	2023-09-03 13:04:58 -07:00
Nino Risteski	0c0a7d19eb	Update openai_multi_functions_agent.ipynb (#10144 ) typo fix	2023-09-03 13:00:48 -07:00
Nino Risteski	f968b86652	Update apis.ipynb (#10145 ) few typo fixes	2023-09-03 13:00:22 -07:00
Guy Korland	765ef3b486	Add FalkorDB to imports (#10151 )	2023-09-03 12:52:28 -07:00
Nino Risteski	746c6ff9c3	Update index.mdx (#10142 ) fixed typos	2023-09-02 22:36:26 -07:00
Nino Risteski	fdebd3e02f	Update chat_vector_db.mdx (#10141 ) typo fix	2023-09-02 22:36:09 -07:00
Bagatur	0e4c5dd176	bump 13 (#10130 )	2023-09-02 10:22:31 -07:00
Bagatur	42582adb66	bump 280 (#10117 )	2023-09-01 17:43:14 -07:00
Bagatur	9e196cb470	rm sqlite3 import (#10115 )	2023-09-01 17:14:06 -07:00
Arpan Pokharel	f8bca156d4	Add where filter in weaviate similarity search with score (#9978 ) - Description: Add where filter in weaviate similarity search with score - Issue: #9853 - Dependencies: - - Tag maintainer: - - Twitter handle: -	2023-09-01 16:09:19 -07:00
Leonid Kuligin	30239b3025	added support for inference from Model Garden (#9367 ) #8850 --------- Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-09-01 15:58:21 -07:00
Leonid Ganeline	54a8df87b9	📖 docs: fixed `integration/llms` navbar (#9277 ) Fixed navbar: - renamed several files, so ToC is sorted correctly - made ToC items consistent: formatted several Titles - added several links - reformatted several docs to a consistent format - renamed several files (removed `_example` suffix) - added renamed files to the `docs/docs_skeleton/vercel.json`	2023-09-01 15:30:37 -07:00
Bagatur	b485c3048b	rm base64 images from docs (#10110 ) Causing problems indexing docs and notebook images don't render after markdown conversion anyways	2023-09-01 15:15:12 -07:00
William FH	f2fc4173c3	Update redirects meta tags (#10109 )	2023-09-01 15:14:34 -07:00
Leonid Ganeline	37e435bd00	docs: `youtube_search` tool example update (#9958 ) Added a link to source package; updated title, description.	2023-09-01 13:32:27 -07:00
Leonid Ganeline	3b8ee74e38	docs: `google-drive-tool` example fix (#10000 ) This notebook was mistakenly placed in the `toolkits` folder and appears within `Agents & Toolkits` menu. But it should be in `Tools`. Moved example into `tools/`; updated title to consistent format.	2023-09-01 13:31:26 -07:00
seamusp	afd96b2460	docs: agents & callbacks fixes (#10066 ) Various improvements to the Agents & Callbacks sections of the documentation including formatting, spelling, and grammar fixes to improve readability.	2023-09-01 13:28:55 -07:00
Benjamin Matson	58d7d86e51	feat: add bedrock chat model (#8017 ) Replace this comment with: - Description: Add Bedrock implementation of Anthropic Claude for Chat - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: @bwmatson --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-01 13:16:57 -07:00
Massimiliano Pronesti	a7c9bd30d4	feat(llms): add missing params to huggingface text-generation (#9724 ) This small PR aims at supporting the following missing parameters in the `HuggingfaceTextGen` LLM: - `return_full_text` - sometimes useful for completion tasks - `do_sample` - quite handy to control the randomness of the model. - `watermark` @hwchase17 @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-09-01 13:16:27 -07:00
KyrianC	491089754d	EdenAI LLM update. Add models name option (#8963 ) This PR follows the Eden AI (LLM + embeddings) integration. #8633 We added an optional parameter to choose different AI models for providers (like 'text-bison' for provider 'google', 'text-davinci-003' for provider 'openai', etc.). Usage: ```python llm = EdenAI( feature="text", provider="google", params={ "model": "text-bison", # new "temperature": 0.2, "max_tokens": 250, }, ) ``` You can also change the provider + model after initialization ```python llm = EdenAI( feature="text", provider="google", params={ "temperature": 0.2, "max_tokens": 250, }, ) prompt = """ hi """ llm(prompt, providers='openai', model='text-davinci-003') # change provider & model ``` The jupyter notebook as been updated with an example well. Ping: @hwchase17, @baskaryan --------- Co-authored-by: RedhaWassim <rwasssim@gmail.com> Co-authored-by: sam <melaine.samy@gmail.com>	2023-09-01 12:11:33 -07:00
maks-operlejn-ds	b5a74fb973	Temporarily remove language selection (#10097 ) Adapting Microsoft Presidio to other languages requires a bit more work, so for now it will be good idea to remove the language option to choose, so as not to cause errors and confusion. https://microsoft.github.io/presidio/analyzer/languages/ I will handle different languages after the weekend 😄	2023-09-01 11:30:48 -07:00
Bagatur	71c418725f	index rename delete_mode -> cleanup (#10103 )	2023-09-01 11:12:10 -07:00
Nuno Campos	427f696fb0	Nc/runnables seqmap tags (#9753 )	2023-09-01 18:53:10 +01:00
Bagatur	b927277809	Bagatur/eden type 2 (#10102 )	2023-09-01 10:27:27 -07:00
Bagatur	d4380339c1	eden tool nb nit (#10101 )	2023-09-01 10:16:39 -07:00
Harrison Chase	d7bf7dc412	add repr for not serializable (#10071 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-09-01 09:18:32 -07:00
Bagatur	355ff09cce	bump 279 (#10098 )	2023-09-01 08:49:26 -07:00
Pihplipe Oegr	3dafbd852e	Add sqlite-vss as a vector database (#10047 ) This adds sqlite-vss as an option for a vector database. Contains the code and a few tests. Tests are passing and the library sqlite-vss is added as optional as explained in the contributing guidelines. I adjusted the code for lint/black/ and mypy. It looks that everything is currently passing. Adding sqlite-vss was mentioned in this issue: https://github.com/langchain-ai/langchain/issues/1019. Also mentioned here in the sqlite-vss repo for the curious: https://github.com/asg017/sqlite-vss/issues/66 Maintainer tag: @baskaryan --------- Co-authored-by: Philippe Oger <philippe.oger@adevinta.com>	2023-09-01 08:36:34 -07:00
KyrianC	c7a5504789	Add EdenAI Tools (#9764 ) This PR follows the Eden AI (LLM + embeddings) integration. #8633 We added different Tools to empower agents with new capabilities : - text: explicit content detection - image: explicit content detection - image: object detection - OCR: invoice parsing - OCR: ID parsing - audio: speech to text - audio: text to speech We plan to add more in the future (like translation, language detection, + others). Usage: ```python llm=EdenAI(feature="text",provider="openai", params={"temperature" : 0.2,"max_tokens" : 250}) tools = [ EdenAiTextModerationTool(providers=["openai"],language="en"), EdenAiObjectDetectionTool(providers=["google","api4ai"]), EdenAiTextToSpeechTool(providers=["amazon"],language="en",voice="MALE"), EdenAiExplicitImageTool(providers=["amazon","google"]), EdenAiSpeechToTextTool(providers=["amazon"]), EdenAiParsingIDTool(providers=["amazon","klippa"],language="en"), EdenAiParsingInvoiceTool(providers=["amazon","google"],language="en"), ] agent_chain = initialize_agent( tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, return_intermediate_steps=True, ) result = agent_chain(""" i have this text : 'i want to slap you' first : i want to know if this text contains explicit content or not . second : if it does contain explicit content i want to know what is the explicit content in this text, third : i want to make the text into speech . if there is URL in the observations , you will always put it in the output (final answer) . """) ``` output: > Entering new AgentExecutor chain... > I need to extract the information from the ID and then convert it to text and then to speech > Action: edenai_identity_parsing > Action Input: "https://www.citizencard.com/images/citizencard-uk-id-card-2023.jpg" > Observation: last_name : > value : ANGELA > given_names : > value : GREENE > birth_place : > birth_date : > value : 2000-11-09 > issuance_date : > expire_date : > document_id : > issuing_state : > address : > age : > country : > document_type : > value : DRIVER LICENSE FRONT > gender : > image_id : > image_signature : > mrz : > nationality : > Thought: I now need to convert the information to text and then to speech > Action: edenai_text_to_speech > Action Input: "Welcome Angela Greene!" > Observation: https://d14uq1pz7dzsdq.cloudfront.net/0c494819-0bbc-4433-bfa4-6e99bd9747ea_.mp3?Expires=1693316851&Signature=YcMoVQgPuIMEOuSpFuvhkFM8JoBMSoGMcZb7MVWdqw7JEf5~67q9dEI90o5todE5mYXB5zSYoib6rGrmfBl4Rn5~yqDwZ~Tmc24K75zpQZIEyt5~ZSnHuXy4IFWGmlIVuGYVGMGKxTGNeCRNUXDhT6TXGZlr4mwa79Ei1YT7KcNyc1dsTrYB96LphnsqOERx4X9J9XriSwxn70X8oUPFfQmLcitr-syDhiwd9Wdpg6J5yHAJjf657u7Z1lFTBMoXGBuw1VYmyno-3TAiPeUcVlQXPueJ-ymZXmwaITmGOfH7HipZngZBziofRAFdhMYbIjYhegu5jS7TxHwRuox32A__&Key-Pair-Id=K1F55BTI9AHGIK > Thought: I now know the final answer > Final Answer: https://d14uq1pz7dzsdq.cloudfront.net/0c494819-0bbc-4433-bfa4-6e99bd9747ea_.mp3?Expires=1693316851&Signature=YcMoVQgPuIMEOuSpFuvhkFM8JoBMSoGMcZb7MVWdqw7JEf5~67q9dEI90o5todE5mYXB5zSYoib6rGrmfBl4Rn5~yqDwZ~Tmc24K75zpQZIEyt5~ZSnHuXy4IFWGmlIVuGYVGMGKxTGNeCRNUXDhT6TXGZlr4mwa79Ei1YT7KcNyc1dsTrYB96LphnsqOERx4X9J9XriSwxn70X8oUPFfQmLcitr-syDhiwd9Wdpg6J5y > > Finished chain. Other examples are available in the jupyter notebook. This PR is made in parallel with EdenAI LLM update #8963 I apologize for the messy PR. While working in implementing Tools we realized there was a few problems we needed to fix on LLM as well. Ping: @hwchase17, @baskaryan --------- Co-authored-by: RedhaWassim <rwasssim@gmail.com>	2023-09-01 08:26:56 -07:00
Bagatur	5f1c67b47c	Mv LCEL docs up a level (#10073 )	2023-09-01 08:20:55 -07:00
Nuno Campos	561ac17248	Add root run wrapping call to RunnableEach() (#9864 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. These live is docs/extras directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17, @rlancemartin. -->	2023-09-01 15:57:33 +01:00

1 2 3 4 5 ...

4416 Commits