This PR improves the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and introduces
several minor improvements around them.
### Init signature
A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.
In this way, instantiation becomes much less unwieldy, e.g.
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
with everything else falling back to defaults.
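For illustration, a minimal sketch of the resulting pattern (this shows the Astra DB token form of `cassio.init`; it equally accepts an existing `session`/`keyspace`):
```python
import cassio
from langchain.globals import set_llm_cache
from langchain_community.cache import CassandraCache

# One-time connection setup through cassio; details depend on your deployment.
cassio.init(token="AstraCS:...", database_id="...")

# No session/keyspace arguments needed anymore.
set_llm_cache(CassandraCache())
```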
A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. To
mitigate this problem, this PR type-checks the first argument to
detect legacy usage.
(And to make this less tricky in the future, most arguments are now
keyword-only.)
If this is considered too harsh, I'd like guidance on how to further
smooth this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so a
repeatable strategy would be ideal. One possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, and remove them in 0.2 - please
advise on this point.
### Other changes
- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: unified the flow across the separate Cassandra
and Astra DB cases; improved, Cassandra-first description; all imports
made explicit, from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
## Patch Summary
community:openai[patch]: standardize init args
## Details
I made changes to the OpenAI Chat API wrapper test in the LangChain
open-source repository.
- **File**: `libs/community/tests/unit_tests/chat_models/test_openai.py`
- **Changes**:
- Updated `max_retries` with Pydantic Field
- Updated the corresponding unit test
- **Related Issues**: #20085
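A minimal sketch of the pattern (the surrounding class is elided; only the relevant field is shown, and the default value is illustrative):
```python
from langchain_core.pydantic_v1 import BaseModel, Field

class ChatOpenAI(BaseModel):
    """Sketch: only the field touched by this change is shown."""

    max_retries: int = Field(default=2)  # default shown is illustrative
```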
---------
Co-authored-by: JuHyung Son <sonju0427@gmail.com>
- [x] **PR title**: "community: updated Browserbase loader"
- [x] **PR message**:
Updates the Browserbase loader with more options and improved docs.
- [x] **Lint and test**: ran `make format`, `make lint` and `make test`.
0.2 is not a breaking release for core (but it is for langchain and
community).
To keep the core+langchain+community packages in sync at 0.2, we will
relax deps throughout the ecosystem to tolerate `langchain-core` 0.2.
## Description
This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.
## Changes
- Moved the Qdrant vector store implementation to `/libs/partners/qdrant`
with integration tests.
- The conditional imports of the client library are now regular imports,
with minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself (see the import sketch below).
- Updated the Qdrant vector store documentation to reflect the changes.
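For downstream users, the migration is mostly an import change; a sketch (the `Qdrant` export from the new package is assumed from this PR):
```python
# Before (community package, now emits a deprecation warning):
from langchain_community.vectorstores import Qdrant

# After (new partner package):
from langchain_qdrant import Qdrant
```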
## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance (not comprehensive).
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
This PR makes some small updates to `KuzuQAChain` for graph QA.
- Updated the Cypher generation prompt (we now support `WHERE EXISTS`) and
generalized it further
- Support different LLMs for Cypher generation and QA (see the sketch
below)
- Updated docs and examples
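A sketch of the split-LLM setup (the `cypher_llm`/`qa_llm` parameter names follow the other graph QA chains and are assumed here; any chat models can be substituted):
```python
import kuzu
from langchain.chains import KuzuQAChain
from langchain_community.graphs import KuzuGraph
from langchain_openai import ChatOpenAI

db = kuzu.Database("test_db")
graph = KuzuGraph(db)

# A stronger model generates Cypher; a cheaper one answers over the results.
chain = KuzuQAChain.from_llm(
    cypher_llm=ChatOpenAI(model="gpt-4", temperature=0),
    qa_llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    graph=graph,
    verbose=True,
)
chain.invoke("Who acted in The Godfather?")
```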
First PR for the `langchain_huggingface` partner package
- Moved some of the Hugging Face-related classes from `community` to the
new partner package
Still needed:
- Documentation
- Tests
- Support for the new `apply_chat_template` in `ChatHuggingFace`
- Confirm the choice of class to support for embeddings with the
sentence-transformers team.
cc: @efriis
---------
Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- [X] **PR title**: "community: Add source metadata to bedrock retriever
response"
- [X] **PR message**:
- **Description:** The Bedrock retrieve API returns extra metadata in the
response, which is currently not returned in the retriever response.
- **Issue:** The change adds the metadata from the Bedrock retrieve API
response to the Bedrock retriever in a backward-compatible way. Renamed
`metadata` to `sourceMetadata`, as the term metadata is already used in
the Document. This is in sync with what we are doing in llama-index as
well.
- **Dependencies:** No
- [X] **Add tests and docs**:
1. Added unit tests
2. Notebook already exists and does not need any change
3. Response from end to end testing, just to ensure backward
compatibility: `[Document(page_content='Exoplanets.',
metadata={'location': {'s3Location': {'uri':
's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647,
'source_metadata': {'x-amz-bedrock-kb-source-uri':
's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year':
1946.0}})]`
- [X] **Lint and test**: ran `make format`, `make lint` and `make test`.
---------
Co-authored-by: Piyush Jain <piyushjain@duck.com>
**Description:** Added a few additional arguments to the whisper parser,
which can be consumed by the underlying API.
The prompt is especially important for fine-tuning transcriptions.
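A sketch of the idea (the exact parameter set is per this PR's diff; treat the names here as illustrative):
```python
from langchain_community.document_loaders.parsers.audio import OpenAIWhisperParser

# A prompt biases the transcription toward expected domain terms.
parser = OpenAIWhisperParser(
    language="en",
    prompt="LangChain, Cassandra, Qdrant",
    temperature=0.0,
)
```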
---------
Co-authored-by: Roi Perlman <roi@fivesigmalabs.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: Adds `NeuralDBClientVectorStore` to LangChain, which is
our enterprise client.
---------
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
**Description:**
This PR introduces chunking logic to the `DeepInfraEmbeddings` class to
handle large batch sizes without exceeding the backend's maximum batch
size. This enhancement ensures that embedding generation processes
large batches by breaking them down into smaller, manageable chunks,
each conforming to the maximum batch size limit.
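A minimal sketch of the chunking idea (constant and helper names are illustrative, not the actual implementation):
```python
from typing import Callable, List

MAX_BATCH_SIZE = 1024  # illustrative backend limit

def embed_in_chunks(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = MAX_BATCH_SIZE,
) -> List[List[float]]:
    """Embed `texts` in chunks no larger than the backend's batch limit."""
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        embeddings.extend(embed_batch(texts[start : start + batch_size]))
    return embeddings
```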
**Issue:**
Fixes #21189
**Dependencies:**
No new dependencies introduced.
- Added a new document transformer: `MarkdownifyTransformer`, which uses
the `markdownify` package with customizable options to convert HTML to
Markdown. It's similar to `Html2TextTransformer`, but has more flexible
options, and I've noticed that `MarkdownifyTransformer` sometimes
performs better than the html2text one, which is why I use markdownify
in my project.
- Added docs and tests
- Usage:
```python
from langchain_community.document_transformers import MarkdownifyTransformer
markdownify = MarkdownifyTransformer()
docs_transform = markdownify.transform_documents(docs)
```
- Example of the better performance I've noticed on a simple task:
```html
<html>
<head><title>Reports on product movement</title></head>
<body>
<p data-block-key="2wst7">The reports on product movement will be useful for forming supplier orders and controlling outcomes.</p>
</body>
</html>
```
**Html2TextTransformer**:
```python
[Document(page_content='The reports on product movement will be useful for forming supplier orders and\ncontrolling outcomes.\n\n')]
# Here we can see 'and\ncontrolling', which has extra '\n' in it
```
**MarkdownifyTransformer**:
```python
[Document(page_content='Reports on product movement\n\nThe reports on product movement will be useful for forming supplier orders and controlling outcomes.')]
```
---------
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.bbrouter>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.local>
Co-authored-by: Sokolov Fedor <f.sokolov@192.168.1.6>
### GPT4AllEmbeddings parameters
---
**Description:**
As of right now, the **Embed4All** class inside _GPT4AllEmbeddings_ is
instantiated with its defaults, which leaves no room to customize the
chosen model and its behavior. Thus:
- GPT4AllEmbeddings can now be instantiated with custom parameters, like
a different model to be used.
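A sketch of the resulting usage (parameter names as added by this PR are assumed; the model file name is an example):
```python
from langchain_community.embeddings import GPT4AllEmbeddings

embeddings = GPT4AllEmbeddings(
    model_name="all-MiniLM-L6-v2.gguf2.f16.gguf",  # choose the Embed4All model
    gpt4all_kwargs={"allow_download": True},       # forwarded to Embed4All
)
vector = embeddings.embed_query("hello world")
```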
---------
Co-authored-by: AlexJauchWalser <alexander.jauch-walser@knime.com>
The `_amake_session()` method does not allow replacing
`self.session_factory` with anything other than an `async_sessionmaker`.
This prohibits advanced uses of `index()`.
In a RAG architecture, it is necessary to import document chunks. To keep
track of the links between chunks and documents, we can use the `index()`
API, which relies on an SQL-type record manager.
In a classic use case, using `SQLRecordManager` and a vector database, it
is impossible to guarantee the consistency of the import. Indeed, if a
crash occurs during the import (a network problem, ...), the SQL database
and the vector database end up inconsistent.
With the
[PR](https://github.com/langchain-ai/langchain-postgres/pull/32) we are
proposing for `langchain-postgres`, it is now possible to guarantee the
consistency of the import of chunks into a vector database. This is
possible only if the outer session is built with the connection.
```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

from langchain.indexes import SQLRecordManager, index
from langchain_community.document_loaders import CSVLoader
from langchain_community.embeddings import FakeEmbeddings
from langchain_core.vectorstores import VectorStore
from langchain_postgres import PGVector


def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_engine(db_url, echo=True)
    embeddings = FakeEmbeddings(size=1536)  # stand-in embeddings for the example
    pgvector: VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )
    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
    )
    record_manager.create_schema()
    with engine.connect() as connection:
        session_maker = scoped_session(sessionmaker(bind=connection))
        # NOTE: update the session factories so the record manager and the
        # vector store share the same outer connection
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        with connection.begin():
            loader = CSVLoader(
                "data/faq/faq.csv",
                source_column="source",
                autodetect_encoding=True,
            )
            result = index(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)
```
The same thing is possible asynchronously, but a bug in
`sql_record_manager.py`
in `_amake_session()` must first be fixed.
```python
async def _amake_session(self) -> AsyncGenerator[AsyncSession, None]:
    """Create a session and close it after use."""
    # FIXME: remove the over-strict check
    #   if not isinstance(self.session_factory, async_sessionmaker): ...
    # and validate the engine instead:
    if not isinstance(self.engine, AsyncEngine):
        raise AssertionError("This method is not supported for sync engines.")
    async with self.session_factory() as session:
        yield session
```
Then, it is possible to do the same thing asynchronously:
```python
import asyncio
from asyncio import current_task

from sqlalchemy.ext.asyncio import (
    async_scoped_session,
    async_sessionmaker,
    create_async_engine,
)

from langchain.indexes import SQLRecordManager, aindex
from langchain_community.document_loaders import CSVLoader
from langchain_community.embeddings import FakeEmbeddings
from langchain_core.vectorstores import VectorStore
from langchain_postgres import PGVector


async def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_async_engine(db_url, echo=True)
    embeddings = FakeEmbeddings(size=1536)  # stand-in embeddings for the example
    pgvector: VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )
    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
        async_mode=True,
    )
    await record_manager.acreate_schema()
    async with engine.connect() as connection:
        session_maker = async_scoped_session(
            async_sessionmaker(bind=connection),
            scopefunc=current_task,
        )
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        async with connection.begin():
            loader = CSVLoader(
                "data/faq/faq.csv",
                source_column="source",
                autodetect_encoding=True,
            )
            result = await aindex(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)


asyncio.run(main())
```
---------
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Sean <sean@upstage.ai>
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: YISH <mokeyish@hotmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jason_Chen <820542443@qq.com>
Co-authored-by: Joan Fontanals <joan.fontanals.martinez@jina.ai>
Co-authored-by: Pavlo Paliychuk <pavlo.paliychuk.ca@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
Co-authored-by: samanhappy <samanhappy@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: merdan <48309329+merdan-9@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Andres Algaba <andresalgaba@gmail.com>
Co-authored-by: davidefantiniIntel <115252273+davidefantiniIntel@users.noreply.github.com>
Co-authored-by: Jingpan Xiong <71321890+klaus-xiong@users.noreply.github.com>
Co-authored-by: kaka <kaka@zbyte-inc.cloud>
Co-authored-by: jingsi <jingsi@leadincloud.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
Co-authored-by: Michael Schock <mjschock@users.noreply.github.com>
Co-authored-by: Anish Chakraborty <anish749@users.noreply.github.com>
Co-authored-by: am-kinetica <85610855+am-kinetica@users.noreply.github.com>
Co-authored-by: Dristy Srivastava <58721149+dristysrivastava@users.noreply.github.com>
Co-authored-by: Matt <matthew.gotteiner@microsoft.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
- **Description:** Fix imported class names from
'playwright.async_api' and 'playwright.sync_api' to match the correct
names in the playwright tool. Change the inline guard_import to a
helper function that calls guard_import, to make the code more readable
in the gmail tool. Upgrade the playwright version to 1.43.0.
- **Issue:** #21354
- **Dependencies:** upgrade playwright version (this is not required for
the bugfix itself, just trying to keep dependencies fresh. I can remove
the playwright version upgrade if you want.)
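A sketch of the helper-function pattern (the helper name is illustrative):
```python
from typing import Any

from langchain_core.utils import guard_import

def import_playwright() -> Any:
    """Import the playwright package, raising a helpful error if missing."""
    return guard_import("playwright")
```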
0.2rc
migrations
- [x] Move memory
- [x] Move remaining retrievers
- [x] graph_qa chains
- [x] some dependency from evaluation code potentially on math utils
- [x] Move openapi chain from `langchain.chains.api.openapi` to
`langchain_community.chains.openapi`
- [x] Migrate `langchain.chains.ernie_functions` to
`langchain_community.chains.ernie_functions`
- [x] migrate `langchain/chains/llm_requests.py` to
`langchain_community.chains.llm_requests`
- [x] Moving `langchain_community.cross_encoders.base:BaseCrossEncoder`
->
`langchain.retrievers.document_compressors.cross_encoder:BaseCrossEncoder`
(namespace not ideal, but it needs to be moved to `langchain` to avoid
circular deps)
- [x] unit tests langchain -- add pytest.mark.community to some unit
tests that will stay in langchain
- [x] unit tests community -- move unit tests that depend on community
to community
- [x] mv integration tests that depend on community to community
- [x] mypy checks
Other todo
- [x] Make deprecation warnings not noisy (need to use `warn_deprecated`
and check that things are implemented properly)
- [x] Update deprecation messages with timeline for code removal (likely
we actually won't be removing things until 0.4 release) -- will give
people more time to transition their code.
- [ ] Add information to deprecation warning to show users how to
migrate their code base using langchain-cli
- [ ] Remove any unnecessary requirements in langchain (e.g., is
SQLAlchemy required?)
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** add `bind_tools` and `with_structured_output` support
to `QianfanChatEndpoint` (see the sketch below)
- [x] **Lint and test**: ran `make format`, `make lint` and `make test`.
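A sketch of the new capability (the model name is a placeholder; the schema class is illustrative):
```python
from langchain_community.chat_models import QianfanChatEndpoint
from langchain_core.pydantic_v1 import BaseModel

class Answer(BaseModel):
    """Structured response schema."""

    text: str
    score: float

llm = QianfanChatEndpoint(model="ERNIE-Bot-4")
structured_llm = llm.with_structured_output(Answer)
result = structured_llm.invoke("How big is the Earth?")
```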
Description: This PR includes a fix so that `loader_source` is fetched
from metadata in the case of GdriveLoaders.
Documentation: NA
Unit Test: NA
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
**HuggingFaceInferenceAPIEmbeddings**: additional headers
- Where: community, embeddings, `huggingface.py`.
- Community: add additional headers when needed by custom HuggingFace
TEI embedding endpoints.
- **Description:** Adding `additional_headers` to be passed to the
requests library if needed
- **Dependencies:** none
- **Tests and docs**:
1. Tested with locally available TEI endpoints with and without
`additional_headers`
2. Example Usage
```python
embeddings = HuggingFaceInferenceAPIEmbeddings(
    api_key=MY_CUSTOM_API_KEY,
    api_url=MY_CUSTOM_TEI_URL,
    additional_headers={
        "Content-Type": "application/json",
    },
)
```
---------
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Relates to [#17048]
Description: Applied the fix to the redis and neo4j files.
The error was: `Cannot override writeable attribute with read-only
property`.
Fixed with the same solution as
[langchain/libs/community/langchain_community/chat_message_histories/elasticsearch.py](d5c412b0a9/libs/community/langchain_community/chat_message_histories/elasticsearch.py (L170-L175))
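For reference, a minimal sketch of the pattern (class name and setter behavior are illustrative, mirroring the linked elasticsearch fix):
```python
from typing import List

from langchain_core.messages import BaseMessage

class SomeChatMessageHistory:
    """Pairing the property with a setter avoids
    'Cannot override writeable attribute with read-only property'."""

    def __init__(self) -> None:
        self._messages: List[BaseMessage] = []

    @property
    def messages(self) -> List[BaseMessage]:
        return self._messages

    @messages.setter
    def messages(self, value: List[BaseMessage]) -> None:
        raise NotImplementedError("Use add_messages instead.")
```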
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
[Standardized model init args
#20085](https://github.com/langchain-ai/langchain/issues/20085)
- Enable the PremAI chat model to be initialized with `model_name` as an
alias for `model` and `api_key` as an alias for `premai_api_key` (see
the sketch below).
- Add initialization test `test_premai_initialization`
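A sketch of the two now-equivalent initializations (model name and key are placeholders):
```python
from langchain_community.chat_models import ChatPremAI

# Standardized args...
chat = ChatPremAI(model="some-model", api_key="pk-...", project_id=8)
# ...and the original names still work as aliases:
chat = ChatPremAI(model_name="some-model", premai_api_key="pk-...", project_id=8)
```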
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** fix: variable names in the root validator were not
allowing credentials to be passed as named parameters when instantiating
the LLM; also added SambaNova's Sambaverse and SambaStudio LLMs to
`__init__.py` for module import
Description: this change adds `args_schema` (a pydantic `BaseModel`) to
`YahooFinanceNewsTool` for correct schema formatting on LLM function
calls.
Issue: currently, using `YahooFinanceNewsTool` with OpenAI function
calling returns the following error:
`TypeError("YahooFinanceNewsTool._run() got an unexpected keyword
argument '__arg1'")`. This happens because the schema sent to the LLM is
`input: "{'__arg1': 'MSFT'}"`, while the method should be called with
the `query` parameter.
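A sketch of the fix (the schema class name is illustrative; the `query` field matches the tool's `_run` signature described above):
```python
from langchain_core.pydantic_v1 import BaseModel, Field

class YahooFinanceNewsInput(BaseModel):
    """Input schema exposed to the LLM instead of the generic __arg1."""

    query: str = Field(description="Company ticker symbol to look up, e.g. MSFT")
```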
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>