Commit Graph

11349 Commits (e1d113ea84a2edcf4a7709fc5be0e972ea74a5d9)
 

Author SHA1 Message Date
Bagatur e1d113ea84
core,openai,grow,fw[patch]: deprecate bind_functions, update chat model api ref (#26584)
5 days ago
ccurme 7c05f71e0f
milvus[patch]: fix vectorstore integration tests (#26583)
Resolves https://github.com/langchain-ai/langchain/issues/26564
5 days ago
Bagatur 145a49cca2
core[patch]: Release 0.3.1 (#26581) 5 days ago
Nuno Campos 5fc44989bf
core[patch]: Fix "argument of type 'NoneType' is not iterable" error in LangChainTracer (#26576)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
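
The first guideline above (import optional dependencies inside a function) can be sketched roughly as follows. This is only an illustration: `import_optional` and `parse_payload` are hypothetical helpers, not LangChain functions, and the standard-library `json` module stands in for a real optional package.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import a module at call time and raise a helpful error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {module_name}. "
            f"Install it with `pip install {pip_name}`."
        ) from exc


def parse_payload(raw: str) -> dict:
    # The import happens here, inside the function, so merely importing the
    # surrounding module never requires the optional package to be installed.
    # `json` is a stand-in for an optional third-party dependency.
    json_module = import_optional("json", "json")
    return json_module.loads(raw)


print(parse_payload('{"a": 1}'))
```

Deferring the import this way means users who never call the function do not need the extra package.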

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 days ago
Erick Friis f4a65236ee
infra: only force reinstall on release (#26580) 5 days ago
Isaac Francisco 06cde06a20
core[minor]: remove beta from RemoveMessage (#26579) 5 days ago
Erick Friis 3e51fdc840
infra: more skip if pull request libs (#26578) 5 days ago
RUO 0a177ec2cc
community: Enhance MongoDBLoader with flexible metadata and optimized field extraction (#23376)
### Description:
This pull request significantly enhances the MongodbLoader class in the
LangChain community package by adding robust metadata customization and
improved field extraction capabilities. The updated class now allows
users to specify additional metadata fields through the metadata_names
parameter, enabling the extraction of both top-level and deeply nested
document attributes as metadata. This flexibility is crucial for users
who need to include detailed contextual information without altering the
database schema.

Moreover, the include_db_collection_in_metadata flag offers optional
inclusion of database and collection names in the metadata, allowing for
even greater customization depending on the user's needs.

The loader's field extraction logic has been refined to handle missing
or nested fields more gracefully. It now employs a safe access mechanism
that avoids the KeyError previously encountered when a specified nested
field was absent in a document. This update ensures that the loader can
handle diverse and complex data structures without failure, making it
more resilient and user-friendly.
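
A minimal sketch of what such safe access to nested fields might look like, assuming dotted paths such as `author.name`. The helper name `get_nested` and its signature are hypothetical and are not taken from the loader's actual implementation.

```python
from collections.abc import Mapping
from typing import Any, Optional


def get_nested(doc: Mapping, dotted_path: str, default: Optional[Any] = None) -> Any:
    """Walk a dotted path; return `default` instead of raising KeyError."""
    current: Any = doc
    for key in dotted_path.split("."):
        # Stop as soon as a level is missing or is not a mapping.
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


doc = {"_id": "1", "author": {"name": "John Doe"}}
name = get_nested(doc, "author.name")
email = get_nested(doc, "author.email", default="")
print(repr(name), repr(email))
```

Returning a default rather than raising lets a single loader pass over documents whose shapes differ.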

### Issue:
This pull request addresses a critical issue where the MongodbLoader
class in the LangChain community package could throw a KeyError when
attempting to access nested fields that may not exist in some documents.
The previous implementation did not handle the absence of specified
nested fields gracefully, leading to runtime errors and interruptions in
data processing workflows.

This enhancement ensures robust error handling by safely accessing
nested document fields, using default values for missing data, thus
preventing KeyError and ensuring smoother operation across various data
structures in MongoDB. This improvement is crucial for users working
with diverse and complex data sets, ensuring the loader can adapt to
documents with varying structures without failing.

### Dependencies: 
Requires motor for asynchronous MongoDB interaction.

### Twitter handle: 
N/A

### Add tests and docs
Tests: Unit tests have been added to verify that the metadata inclusion
toggle works as expected and that the field extraction correctly handles
nested fields.
Docs: An example notebook demonstrating the use of the enhanced
MongodbLoader is included in the docs/docs/integrations directory. This
notebook includes setup instructions, example usage, and outputs.
Notebook link: [colab link](https://colab.research.google.com/drive/1tp7nyUnzZa3dxEFF4Kc3KS7ACuNF6jzH?usp=sharing)

### Lint and test
Before submitting, I ran make format, make lint, and make test as per
the contribution guidelines. All tests pass, and the code style adheres
to the LangChain standards.

```python
import unittest
from unittest.mock import patch, MagicMock
import asyncio
from langchain_community.document_loaders.mongodb import MongodbLoader

class TestMongodbLoader(unittest.TestCase):
    def setUp(self):
        """Setup the MongodbLoader test environment by mocking the motor client 
        and database collection interactions."""
        # Mocking the AsyncIOMotorClient
        self.mock_client = MagicMock()
        self.mock_db = MagicMock()
        self.mock_collection = MagicMock()

        self.mock_client.get_database.return_value = self.mock_db
        self.mock_db.get_collection.return_value = self.mock_collection

        # Initialize the MongodbLoader with test data
        self.loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )

    @patch('langchain_community.document_loaders.mongodb.AsyncIOMotorClient', return_value=MagicMock())
    def test_constructor(self, mock_motor_client):
        """Test if the constructor properly initializes with the correct database and collection names."""
        loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )
        self.assertEqual(loader.db_name, "testdb")
        self.assertEqual(loader.collection_name, "testcol")

    def test_aload(self):
        """Test the aload method to ensure it correctly queries and processes documents."""
        # Setup mock data and responses for the database operations
        self.mock_collection.count_documents.return_value = asyncio.Future()
        self.mock_collection.count_documents.return_value.set_result(1)
        self.mock_collection.find.return_value = [
            {"_id": "1", "content": "Test document content"}
        ]

        # Run the aload method and check responses
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(self.loader.aload())
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0].page_content, "Test document content")

    def test_construct_projection(self):
        """Verify that the projection dictionary is constructed correctly based on field names."""
        self.loader.field_names = ['content', 'author']
        self.loader.metadata_names = ['timestamp']
        expected_projection = {'content': 1, 'author': 1, 'timestamp': 1}
        projection = self.loader._construct_projection()
        self.assertEqual(projection, expected_projection)

if __name__ == '__main__':
    unittest.main()
```
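
For reference, the behaviour that `test_construct_projection` expects could be produced by something like the sketch below. The standalone function `construct_projection` is a hypothetical stand-in written for illustration; the loader's private `_construct_projection` method may be implemented differently.

```python
from typing import Dict, List


def construct_projection(
    field_names: List[str], metadata_names: List[str]
) -> Dict[str, int]:
    """Build a MongoDB-style projection that includes every requested field."""
    projection = {name: 1 for name in field_names}
    # Metadata fields are fetched alongside the content fields.
    projection.update({name: 1 for name in metadata_names})
    return projection


projection = construct_projection(["content", "author"], ["timestamp"])
print(projection)
```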


### Additional Example for Documentation
Sample Data:

```json
[
    {
        "_id": "1",
        "title": "Artificial Intelligence in Medicine",
        "content": "AI is transforming the medical industry by providing personalized medicine solutions.",
        "author": {
            "name": "John Doe",
            "email": "john.doe@example.com"
        },
        "tags": ["AI", "Healthcare", "Innovation"]
    },
    {
        "_id": "2",
        "title": "Data Science in Sports",
        "content": "Data science provides insights into player performance and strategic planning in sports.",
        "author": {
            "name": "Jane Smith",
            "email": "jane.smith@example.com"
        },
        "tags": ["Data Science", "Sports", "Analytics"]
    }
]
```
Example Code:

```python
loader = MongodbLoader(
    connection_string="mongodb://localhost:27017",
    db_name="example_db",
    collection_name="articles",
    filter_criteria={"tags": "AI"},
    field_names=["title", "content"],
    metadata_names=["author.name", "author.email"],
    include_db_collection_in_metadata=True
)

documents = loader.load()

for doc in documents:
    print("Page Content:", doc.page_content)
    print("Metadata:", doc.metadata)
```
Expected Output:

```
Page Content: Artificial Intelligence in Medicine AI is transforming the medical industry by providing personalized medicine solutions.
Metadata: {'author_name': 'John Doe', 'author_email': 'john.doe@example.com', 'database': 'example_db', 'collection': 'articles'}
```

Thank you.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
5 days ago
ccurme 6758894af1
docs: update v0.3 integrations table (#26571) 5 days ago
venkatram-dev 6ba3c715b7
doc_fix_chroma_integration (#26565)
- **Description:** Fix typo in docs:integrations:vectorstores:chroma
https://python.langchain.com/docs/integrations/vectorstores/chroma/
- **Issue:** https://github.com/langchain-ai/langchain/issues/26561

5 days ago
Bagatur d8952b8e8c
langchain[patch]: infer mistral provider in init_chat_model (#26557) 6 days ago
Bagatur 31f61d4d7d
docs: v0.3 nits (#26556) 6 days ago
Bagatur 99abd254fb
docs: clean up init_chat_model (#26551) 6 days ago
Tomaz Bratanic 3bcd641bc1
Add check for prompt based approach in llm graph transformer (#26519) 6 days ago
Bagatur 0bd98c99b3
docs: add sema4 to release table (#26549) 6 days ago
Eugene Yurtsev 8a2f2fc30b
docs: what langchain-cli migrate can do (#26547) 6 days ago
SQpgducray 724a53711b
docs: Fix missing `self` argument in `_get_docs_with_query` method of `CustomSelfQueryRetriever` (#26312)

This commit corrects an issue in the `_get_docs_with_query` method of
the `CustomSelfQueryRetriever` class. The method was incorrectly using
`self.vectorstore.similarity_search_with_score(query, **search_kwargs)`
without including the `self` argument, which is required for proper
method invocation.

The `self` argument is necessary for calling instance methods and
accessing instance attributes. By including `self` in the method call,
we ensure that the method is correctly executed in the context of the
current instance, allowing it to function as intended.

No other changes were made to the method's logic or functionality.
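
As a generic illustration of why `self` matters (not the code from the docs page itself), the sketch below compares a method that declares `self` with one that omits it. The classes `Retriever` and `BrokenRetriever` are hypothetical.

```python
class Retriever:
    def __init__(self, docs):
        self.docs = docs

    def get_docs(self, query):
        # `self` gives the method access to the instance's attributes.
        return [d for d in self.docs if query in d]


class BrokenRetriever:
    def __init__(self, docs):
        self.docs = docs

    def get_docs(query):  # `self` is missing from the signature
        return []


retriever = Retriever(["alpha", "beta"])
matches = retriever.get_docs("al")

try:
    BrokenRetriever(["alpha"]).get_docs("al")
    error = None
except TypeError as exc:
    # Python passes the instance as the first argument, so the call
    # supplies two arguments to a function that accepts only one.
    error = exc

print(matches, type(error).__name__)
```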

Co-authored-by: Erick Friis <erick@langchain.dev>
6 days ago
Eugene Yurtsev c6a78132d6
docs: show how to use langchain-cli for migration (#26535)
Update v0.3 instructions a bit

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 days ago
Bagatur a319a0ff1d
docs: add redirects for tools and lcel (#26541) 6 days ago
Eugene Yurtsev 63c3cc1f1f
ci: updates issue and discussion templates (#26542)
Update issue and discussion templates
6 days ago
ccurme 0154c586d3
docs: update integrations table in 0.3 guide (#26536) 6 days ago
Eugene Yurtsev c2588b334f
unstructured: release 0.1.4 (#26540)
Release to work with langchain 0.3
6 days ago
Eugene Yurtsev 8b985a42e9
milvus: 0.1.6 release (#26538)
Release to work with langchain 0.3
6 days ago
Eugene Yurtsev 5b4206acd8
box: 0.2.0 release (#26539)
Release to work with langchain 0.3
6 days ago
ccurme 0592c29e9b
qdrant[patch]: release 0.1.4 (#26534)
`langchain-qdrant` imports pydantic, and it was already importing pydantic
proper before the 0.3 release:
042e84170b/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py (L5-L8)
6 days ago
Eugene Yurtsev 88891477eb
langchain-cli: release 0.0.31 (#26533)
langchain-cli 0.0.31 release
6 days ago
ccurme 88bc15d69b
standard-tests[patch]: add async test for structured output (#26527) 6 days ago
Erick Friis 1ab181f514
voyageai: release 0.1.2 (#26512) 7 days ago
Erick Friis ee4e11379f
nomic: release 0.1.3, core 0.3 compat but not required (#26511) 7 days ago
Yoshitaka Fujii bd42344b0a
docs: Update concepts.mdx (#26496)
- Fix comments in Python
- Fix repeated sentences
7 days ago
Erick Friis 9f5960a0aa
docs: new algolia index (#26508) 7 days ago
Erick Friis 135afdf4fb
docs: most 0.1 redirects too (#26494)
Takes redirects from the 0.1 docs and factors them into suggested
redirects in the 0.3 docs.
1 week ago
Erick Friis 4131be63af
multiple: 0.3.0 not dev version (#26502) 1 week ago
Bhadresh Savani f66b7ba32d
Update google_search.ipynb (#26420)
Added changes for pip installation
1 week ago
jessicaou 9c6aa3f0b7
broken LangGraph docs link (#26438)
Update broken langgraph link in the README.md file

Co-authored-by: Jess Ou <jessou@jesss-mbp.local.meter>
1 week ago
Nicolas 2240ca2979
docs: Fix Firecrawl v0 version (#26452)
The Firecrawl integration is currently on v0, which is supported until
version 0.0.20.

@rafaelsideguide is working on a PR for v1, but in the meantime we should
fix the docs.
1 week ago
Eugene Yurtsev 77ccb4b1cf
cli[patch]: Update the migration script message (#26490)
Update the migration script message
1 week ago
Bagatur b47f4cfe51
mongodb[minor]: Release 0.2.0 (#26484) 1 week ago
Bagatur 779a008d4e
docs: update v3 versions (#26483) 1 week ago
Bagatur 4e6620ecdd
chroma[patch]: Release 0.1.4 (#26470) 1 week ago
Bagatur 543a80569c
prompty[minor]: Release 0.1.0 (#26481) 1 week ago
ccurme 9c88037dbc
huggingface[patch]: xfail test (#26479) 1 week ago
Bagatur a2bfa41216
azure-dynamic-sessions[minor]: Release 0.2.0 (#26478) 1 week ago
ccurme 8abc7ff55a
experimental: release 0.3 (#26477) 1 week ago
Bagatur 6abb23ca97
exa[minor]: Release 0.2.0 (#26476) 1 week ago
ccurme 900115a568
community: release 0.3 (#26472) 1 week ago
Bagatur 17b397ef93
pinecone[minor]: Release 0.2.0 (#26474) 1 week ago
Erick Friis ca304ae046
robocorp: rm package (now langchain-sema4) (#26471) 1 week ago
Erick Friis 537f6924dc
partners/ollama: release 0.2.0 (#26468) 1 week ago
Erick Friis 995dfc6b05
partners/fireworks: release 0.2.0 (#26467) 1 week ago