Commit Graph

4000 Commits (3e749369efb702ef806373fed3d1399fbc9118d1)

Author SHA1 Message Date
Jamsheed Mistri 3e749369ef
community[minor]: bump version of LayerupSecurity, add support for untrusted_input parameter (#19985)
**Description:** update version of LayerupSecurity package for the
Layerup Security integration. Add untrusted_input parameter.
5 months ago
fubuki8087 f1c3687aa5
community[patch]: Using the right encoding to parse the web page in RecursiveUrlLoader (#20632)
As shown in #13749 , `RecursiveUrlLoader` has encoding issue. This PR is
to solve this.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Jakub Pawłowski b0b1a67771
community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042)
### Description:
When attempting to download PDF files from arXiv, an unexpected 404
error frequently occurs. This error halts the operation, regardless of
whether there are additional documents to process. As a solution, I
suggest implementing a mechanism to ignore and communicate this error
and continue processing the next document from the list.

Proposed Solution: To address the issue of unexpected 404 errors during
PDF downloads from arXiv, I propose implementing the following solution:

- Error Handling: Implement error handling mechanisms to catch and
handle 404 errors gracefully.
- Communication: Inform the user or logging system about the occurrence
of the 404 error.
- Continued Processing: After encountering a 404 error, continue
processing the remaining documents from the list without interruption.

This solution ensures that the application can handle unexpected errors
without terminating the entire operation. It promotes resilience and
robustness in the face of intermittent issues encountered during PDF
downloads from arXiv.

### Issue:
#20909 
### Dependencies:
none

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Erick Friis b9c53e95b7
community: release 0.0.35 (#21104) 5 months ago
Eugene Yurtsev 3c064a757f
core[minor],langchain[patch],community[patch]: Move storage interfaces to core (#20750)
* Move storage interface to core
* Move in memory and file system implementation to core
5 months ago
Charlie Marsh 8f38b7a725
multiple: Remove unnecessary Ruff suppression comments (#21050)
## Summary

I ran `ruff check --extend-select RUF100 -n` to identify `# noqa`
comments that weren't having any effect in Ruff, and then `ruff check
--extend-select RUF100 -n --fix` on select files to remove all of the
unnecessary `# noqa: F401` violations. It's possible that these were
needed at some point in the past, but they're not necessary in Ruff
v0.1.15 (used by LangChain) or in the latest release.

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 748f2ba9ea
core: release 0.1.47 (#21094) 5 months ago
Eugene Yurtsev c8f18a2524
langchain[patch]: Update import handling in `adapters` (#21079) 5 months ago
William FH 5c63ac3dd7
[Patch] Dedent docstring (#20959)
Technically a slight prompt breaking change, but I think positive EV in
that it saves tokens and results in more sane / in-distribution prompts
5 months ago
Eugene Yurtsev 845d8e0025
langchain[patch]: Update handling of deprecation warnings (#21083)
Chains should not be emitting deprecation warnings.
5 months ago
Christophe Bornet 5c77f45b06
community[minor]: Add async methods to CassandraCache and CassandraSemanticCache (#20654) 5 months ago
William FH db14d4326d
[Core] Feat Pretty Print Tool calls (#20997)
Right now, `tool_calls` are not included in the `pretty_print()` output.
Would be nice to show!


![image](https://github.com/langchain-ai/langchain/assets/13333726/6a0ffca3-d02f-4e18-bc76-513eeca2e964)
5 months ago
Kuro Denjiro fa4124b821
community[minor]: add mintbase loader to langchain (#20089)
- [x] **Add Near NFT loader**: "community: Load NFT near block chain
using mintbase graph API"

- [x] **PR message**: 
    - **Description:** a description of the change
    - **Twitter handle:**Kurodenjiro

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Alexander Dicke d7e12750df
community[patch]: allows using `text-generation-inference` /generate route with `HuggingFaceEndpoint` (#20100)
- **Description:** allows to use the /generate route of
`text-generation-inference` with the `HuggingFaceEndpoint`
5 months ago
davidkgp 28b0b0d863
community[patch]: Fix for github issue #17690 (#20117)
…/17690

Thank you for contributing to LangChain!

- [x] **Fix Google Lens knowledge graph issue**: "langchain: community"
- Fix for [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** handled the existence of keys in the json response of
Google Lens
- **Issue:** [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
高远 a7a4630bf4
community[patch]: Modify the text field type and add new exception handling (#20116)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
5 months ago
Rahul Triptahi c172611647
community[patch]: Add classifier_url argument in PebbloSafeLoader and documentation update. (#21030)
Description: Add classifier_url argument in PebbloSafeLoader.
Documentation: Updated PebbloSafeLoader documentation with above change
and new links for pebblo github pages.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Leonid Ganeline 08d08d7c83
docs: langchain docstrings updates (#21032)
Added missed docstings. Formatted docstrings into a consistent format.
5 months ago
Leonid Ganeline 85094cbb3a
docs: community docstring updates (#21040)
Added missed docstrings. Updated docstrings to consistent format.
5 months ago
Rodrigo Nogueira 90f19028e5
community[patch]: Add maritalk streaming (sync and async) (#19203)
Co-authored-by: RosevalJr <rdmalajr@gmail.com>
Co-authored-by: Roseval Donisete Malaquias Junior <roseval@maritaca.ai>
5 months ago
Cahid Arda Öz cc6191cb90
community[minor]: Add support for Upstash Vector (#20824)
## Description

Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!

#17012 was opened to add Upstash Vector to langchain but was closed to
wait for filtering. Now filtering is added to Upstash vector and we open
a new PR. Additionally, [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added and we add this to our vectorstore aswell.

## Dependencies

[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).

## Tests

Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.

There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.

---------

Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline 1a2ff56cd8
core[patch[: docstring update (#21036)
Added missed docstrings. Updated docstrings to consistent format.
5 months ago
Eugene Yurtsev f479a337cc
langchain[patch]: replace deprecated imports with imports from langchain_core (#21033)
* Output of running the migration script.
* Ran only against langchain code itself and not the unit tests.
5 months ago
Eugene Yurtsev 82d4afcac0
langchain[minor]: Code to handle dynamic imports (#20893)
Proposing to centralize code for handling dynamic imports. This allows treating langchain-community as an optional dependency.

---

The proposal is to scan the code base and to replace all existing imports with dynamic imports using this functionality.
5 months ago
Erick Friis 854ae3e1de
mistralai: release 0.1.5, allow client passing in (#21034) 5 months ago
chyroc 3e241956d3
community[minor]: add coze chat model (#20770)
add coze chat model, to call coze.com apis
5 months ago
Eugene Yurtsev 29493bb598
cli[minor]: improve confirmation message with more details (#21027)
Improve confirmation message with more details
5 months ago
Eugene Yurtsev aab78a37f3
cli[patch]: Ignore imports that change the name of the class (#21026)
Not currently handeled by migration script
5 months ago
Massimiliano Pronesti ce89b34fc0
community[patch]: support hybrid search with threshold in Azure AI Search Retriever (#20907)
Support hybrid search with a score threshold -- similar to what we do
for similarity search.
5 months ago
Andrei Panferov b3efa38cc0
community[patch]: GigaChat model selection fix (#20988)
Fixed the error that the model name is never actually put into GigaChat
request payload, always defaulting to `GigaChat-Lite`.

With this fix, model selection through
```python
import os
from langchain.chat_models.gigachat import GigaChat

chat = GigaChat(
    name="GigaChat-Pro", # <- HERE!!!!!
    ...
)
```
should actually work, as intended in
[here](804390ba4b/libs/community/langchain_community/llms/gigachat.py (L36)).

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Patrick McFadin 3331865f6b
community[minor]: add Cassandra Database Toolkit (#20246)
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass paramaters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Expiriemental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.

Includes unit test and two notebooks to demonstrate usage. 

**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin

---------

Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Igor Brai b3e74f2b98
community[minor]: add mojeek search util (#20922)
**Description:** This pull request introduces a new feature to community
tools, enhancing its search capabilities by integrating the Mojeek
search engine
**Dependencies:** None

---------

Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
5 months ago
hmn falahi 4822beb298
Ignore self/cls from required args of class functions in convert_to_openai_tool (#20691)
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:

```python
class MemberTool:
    def search_member(
            self,
            keyword: str,
            *args,
            **kwargs,
    ):
        """Search on members with any keyword like first_name, last_name, email

        Args:
            keyword: Any keyword of member
        """

        headers = dict(authorization=kwargs['token'])
        members = []
        try:
            members = request_(
                method='SEARCH',
                url=f'{service_url}/apiv1/members',
                headers=headers,
                json=dict(query=keyword),
            )

        except Exception as e:
            logger.info(e.__doc__)

        return members

convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, username, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```

#20685

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev 4f4ee8e2cf
cli[patch]: Update migrations file manually (#21021)
We need to replace occurrences in the code of RunnableMap not just the
import,
so for now, we don't replace RunnableMap.
5 months ago
Tomaz Bratanic 67428c4052
community[patch]: Neo4j enhanced schema (#20983)
Scan the database for example values and provide them to an LLM for
better inference of Text2cypher
5 months ago
aditya thomas 8b59bddc03
anthropic[patch]: add tests for secret_str for api key (#20986)
**Description:** Add tests to check API keys are masked
**Issue:** Resolves
https://github.com/langchain-ai/langchain/issues/12165 for Anthropic
models
**Dependencies:** None
5 months ago
Pengcheng Liu 1fad39be1c
community[minor]: Add LarkSuite wiki document loader. (#21016)
**Description:** Add LarkSuite wiki document loader. Refer to [LarkSuite
api document
](https://open.feishu.cn/document/server-docs/docs/wiki-v2/space-node/list)for
details.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
5 months ago
Leonid Ganeline dc7c06bc07
community[minor]: import fix (#20995)
Issue: When the third-party package is not installed, whenever we need
to `pip install <package>` the ImportError is raised.
But sometimes, the `ValueError` or `ModuleNotFoundError` is raised. It
is bad for consistency.
Change: replaced the `ValueError` or `ModuleNotFoundError` with
`ImportError` when we raise an error with the `pip install <package>`
message.
Note: Ideally, we replace all `try: import... except... raise ... `with
helper functions like `import_aim` or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import)
But it would be much bigger refactoring. @baskaryan Please, advice on
this.
5 months ago
Karim Lalani 2ddac9a7c3
experimental[minor]: Add bind_tools and with_structured_output functions to OllamaFunctions (#20881)
Implemented bind_tools for OllamaFunctions.
Made OllamaFunctions sub class of ChatOllama.
Implemented with_structured_output for OllamaFunctions.

integration unit test has been updated.
notebook has been updated.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev d781560722
cli[minor]: Add ipynb support, add text_splitters (#20963) 5 months ago
WilliamEspegren 804390ba4b
community: Spider integration (#20937)
Added the [Spider.cloud](https://spider.cloud) document loader.
[Spider](https://github.com/spider-rs/spider) is the
[fastest](https://github.com/spider-rs/spider/blob/main/benches/BENCHMARKS.md)
and cheapest crawler that returns LLM-ready data.

```
- **Description:** Adds Spider data loader
- **Dependencies:** spider-client
- **Twitter handle:** @WilliamEspegren 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: = <=>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
ccurme 9ec7151317
fireworks: fix integration tests (#20973) 5 months ago
William FH 9fa9f05e5d
Catch System Error in ast parse (#20961)
I can't seem to reproduce, but i got this:

```
SystemError: AST constructor recursion depth mismatch (before=102, after=37)
```

And the operation isn't critical for the actual forward pass so seems
preferable to expand our caught exceptions
5 months ago
YH 2aca7fcdcf
core[patch]: Enhance link extraction with query parameters (#20259)
**Description**: This update enhances the `extract_sub_links` function
within the `langchain_core/utils/html.py` module to include query
parameters in the extracted URLs.

**Issue**: N/A

**Dependencies**: No additional dependencies required for this change.

**Twitter handle**: N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Chip Davis e818c75f8a
infra: test directory loader multithreaded (#20281)
This is a unit test for #20230 which was a fix for using multithreaded
mode with directory loader @eyurtsev
5 months ago
Guilherme Zanotelli f931a9ce60
community[patch]: Pass kwargs to SPARQLStore from RdfGraph (#20385)
This introduces `store_kwargs` which behaves similarly to `graph_kwargs`
on the `RdfGraph` object, which will enable users to pass `headers` and
other arguments to the underlying `SPARQLStore` object. I have also made
a [PR in `rdflib` to support passing
`default_graph`](https://github.com/RDFLib/rdflib/pull/2761).

Example usage:
```python
from langchain_community.graphs import RdfGraph

graph = RdfGraph(
    query_endpoint="http://localhost/sparql",
    standard="rdf",
    store_kwargs=dict(
        default_graph="http://example.com/mygraph"
    )
)
```

<!--If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.-->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Jorge Piedrahita Ortiz 40b2e2916b
community[minor]: Sambanova llm integration (#20955)
- **Description:** Added [Sambanova systems](https://sambanova.ai/)
integration, including sambaverse and sambastudio LLMs
- **Dependencies:**   sseclient-py  (optional)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Rahul Triptahi 955cf186d2
community[patch]: Ingest source, owner and full_path if present in Document's metadata. (#20949)
Description: The PebbloSafeLoader should first check for owner,
full_path and size in metadata before implementing its own logic.
Dependencies: None
Documentation: NA.

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Amine Djeghri 790ea75cf7
community[minor]: add exllamav2 library for GPTQ & EXL2 models (#17817)
Added 3 files : 
- Library : ExLlamaV2 
- Test integration
- Notebook

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Naveen Tatikonda 8bbdb4f6a0
community[patch]: Add OpenSearch as semantic cache (#20254)
### Description
Use OpenSearch vector store as Semantic Cache.

### Twitter Handle
**@OpenSearchProj**

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Co-authored-by: Harish Tatikonda <harishtatikonda@Harishs-MacBook-Air.local>
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-31-155.ec2.internal>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago