Commit Graph

3226 Commits (7f504c1f817885b43f6d081983d2c4e3ac7ad335)

Author SHA1 Message Date
Eugene Yurtsev 6caceb5473
core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743)
Automatic upgrade to transform and atransform

Closes: 

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
6 months ago
Yunmo Koo fee6f983ef
community[minor]: Integration for `Friendli` LLM and `ChatFriendli` ChatModel. (#17913)
## Description
- Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM
and `ChatFriendli` chat model.
- Unit tests and integration tests corresponding to this change are
added.
- Documentations corresponding to this change are added.

## Dependencies
- Optional dependency
[`friendli-client`](https://pypi.org/project/friendli-client/) package
is added only for those who use `Frienldi` or `ChatFriendli` model.

## Twitter handle
- https://twitter.com/friendliai
6 months ago
Smit Parmar aed46cd6f2
community[patch]: Added support for filter out AWS Kendra search by score confidence (#12920)
**Description:** It will add support for filter out kendra search by
score confidence which will make result more accurate.
    For example
   ```
retriever = AmazonKendraRetriever(
        index_id=kendra_index_id, top_k=5, region_name=region,
        score_confidence="HIGH"
    )
```
Result will not include the records which has score confidence "LOW" or "MEDIUM". 
Relevant docs 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html

 **Issue:** the issue # it resolve #11801 
**twitter:** [@SmitCode](https://twitter.com/SmitCode)
6 months ago
Ian 390ef6abe3
community[minor]: Add Initial Support for TiDB Vector Store (#15796)
This pull request introduces initial support for the TiDB vector store.
The current version is basic, laying the foundation for the vector store
integration. While this implementation provides the essential features,
we plan to expand and improve the TiDB vector store support with
additional enhancements in future updates.

Upcoming Enhancements:
* Support for Vector Index Creation: To enhance the efficiency and
performance of the vector store.
* Support for max marginal relevance search. 
* Customized Table Structure Support: Recognizing the need for
flexibility, we plan for more tailored and efficient data store
solutions.

Simple use case exmaple

```python
from typing import List, Tuple
from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
6 months ago
Bagatur 3b1eb1f828
community[patch]: chat hf typing fix (#18693) 6 months ago
Jib d60e93b6ae
langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to map the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
6 months ago
Eugene Yurtsev ca299a8e08
Docs: Add custom parsing documentation and extending langchain (#18331)
* Added extending langchain.mdx -- we'll need to add links as we add
more custom documentation
* Added partial documentation about parsers
6 months ago
Eugene Yurtsev 8c71f92cb2
core: upgrade mypy to recent mypy (#18753)
Testing this works per package on CI
6 months ago
Eugene Yurtsev e188d4ecb0
Add dangerous parameter to requests tool (#18697)
The tools are already documented as dangerous. Not clear whether adding
an opt-in parameter is necessary or not
6 months ago
Erick Friis 1beb84b061
community[patch]: move pdf text tests to integration (#18746) 6 months ago
Christophe Bornet 4a7d73b39d
community: If load() has been overridden, use it in default lazy_load() (#18690) 6 months ago
Christophe Bornet 6cd7607816
community[patch]: Implement lazy_load() for MHTMLLoader (#18648)
Covered by `tests/unit_tests/document_loaders/test_mhtml.py`
6 months ago
axiangcoding 9745b5894d
community[patch]: Chroma use uuid4 instead of uuid1 to generate random ids (#18723)
- **Description:** Chroma use uuid4 instead of uuid1 as random ids. Use
uuid1 may leak mac address, changing to uuid4 will not cause other
effects.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
6 months ago
Guangdong Liu ced5e7bae7
community[patch]: Fix sparkllm authentication problem. (#18651)
- **Description:** fix sparkllm authentication problem.The current
timestamp is in RFC1123 format. The time deviation must be controlled
within 300s. I changed to re-obtain the url every time I ask a question.
https://www.xfyun.cn/doc/spark/general_url_authentication.html#_1-2-%E9%89%B4%E6%9D%83%E5%8F%82%E6%95%B0
6 months ago
Erick Friis 89d32ffbbd
community[patch]: release 0.0.27 (#18708) 6 months ago
Erick Friis c09b520ce4
core[patch]: release 0.1.30 (#18706) 6 months ago
Piyush Jain 2b234a4d96
Support for claude v3 models. (#18630)
Fixes #18513.

## Description
This PR attempts to fix the support for Anthropic Claude v3 models in
BedrockChat LLM. The changes here has updated the payload to use the
`messages` format instead of the formatted text prompt for all models;
`messages` API is backwards compatible with all models in Anthropic, so
this should not break the experience for any models.


## Notes
The PR in the current form does not support the v3 models for the
non-chat Bedrock LLM. This means, that with these changes, users won't
be able to able to use the v3 models with the Bedrock LLM. I can open a
separate PR to tackle this use-case, the intent here was to get this out
quickly, so users can start using and test the chat LLM. The Bedrock LLM
classes have also grown complex with a lot of conditions to support
various providers and models, and is ripe for a refactor to make future
changes more palatable. This refactor is likely to take longer, and
requires more thorough testing from the community. Credit to PRs
[18579](https://github.com/langchain-ai/langchain/pull/18579) and
[18548](https://github.com/langchain-ai/langchain/pull/18548) for some
of the code here.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Sam Khano 1b4dcf22f3
community[minor]: Add DocumentDBVectorSearch VectorStore (#17757)
**Description:**
- Added Amazon DocumentDB Vector Search integration (HNSW index)
- Added integration tests
- Updated AWS documentation with DocumentDB Vector Search instructions
- Added notebook for DocumentDB integration with example usage

---------

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>
6 months ago
Vittorio Rigamonti 51f3902bc4
community[minor]: Adding support for Infinispan as VectorStore (#17861)
**Description:**
This integrates Infinispan as a vectorstore.
Infinispan is an open-source key-value data grid, it can work as single
node as well as distributed.

Vector search is supported since release 15.x 

For more: [Infinispan Home](https://infinispan.org)

Integration tests are provided as well as a demo notebook
6 months ago
Max Jakob cca0167917
elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506)
Follow up on https://github.com/langchain-ai/langchain/pull/17467.

- Update all references to the Elasticsearch classes to use the partners
package.
- Deprecate community classes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Djordje 12b4a4d860
community[patch]: Opensearch delete method added - indexing supported (#18522)
- **Description:** Added delete method for OpenSearchVectorSearch,
therefore indexing supported
    - **Issue:** No
    - **Dependencies:** No
    - **Twitter handle:** stkbmf
6 months ago
Erick Friis 687d27567d
openai[patch]: unit test azure init (#18703) 6 months ago
Christophe Bornet db8db6faae
community: Implement lazy_load() for PlaywrightURLLoader (#18676)
Integration tests:
`tests/integration_tests/document_loaders/test_url_playwright.py`
6 months ago
Aaron Yi c092db862e
community[patch]: make metadata and text optional as expected in DocArray (#18678)
ValidationError: 2 validation errors for DocArrayDoc
text
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
metadata
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
```
In the `_get_doc_cls` method, the `DocArrayDoc` class is defined as
follows:

```python
class DocArrayDoc(BaseDoc):
    text: Optional[str]
    embedding: Optional[NdArray] = Field(**embeddings_params)
    metadata: Optional[dict]
```
6 months ago
Eugene Yurtsev 4c25b49229
community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696)
This is a PR that adds a dangerous load parameter to force users to opt in to use pickle.

This is a PR that's meant to raise user awareness that the pickling module is involved.
6 months ago
Eugene Yurtsev 0e52961562
community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695)
This is a patch for `CVE-2024-2057`:
https://www.cve.org/CVERecord?id=CVE-2024-2057

This affects users that: 

* Use the  `TFIDFRetriever`
* Attempt to de-serialize it from an untrusted source that contains a
malicious payload
6 months ago
Erick Friis 2619420df1
mongodb[patch]: release 0.1.1 (#18692) 6 months ago
Christophe Bornet ea141511d8
core: Move document loader interfaces to core (#17723)
This is needed to be able to move document loaders to partner packages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
6 months ago
Christophe Bornet 5985454269
Merge pull request #18539
* Implement lazy_load() for GitLoader
6 months ago
Christophe Bornet 9a6f7e213b
Merge pull request #18423
* Implement lazy_load() for BSHTMLLoader
6 months ago
Christophe Bornet b3a0c44838
Merge pull request #18673
* Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader
6 months ago
Christophe Bornet 68fc0cf909
Merge pull request #18674
* Implement lazy_load() for TextLoader
6 months ago
Christophe Bornet 5b92f962f1
Merge pull request #18671
* Implement lazy_load() for MastodonTootsLoader
6 months ago
Christophe Bornet 15b1770326
Merge pull request #18421
* Implement lazy_load() for AssemblyAIAudioTranscriptLoader
6 months ago
Christophe Bornet bb284eebe4
Merge pull request #18436
* Implement lazy_load() for ConfluenceLoader
6 months ago
Christophe Bornet 691480f491
Merge pull request #18647
* Implement lazy_load() for UnstructuredBaseLoader
6 months ago
Christophe Bornet 52ac67c5d8
Merge pull request #18654
* Implement lazy_load() for ObsidianLoader
6 months ago
Christophe Bornet b9c0cf9025
Merge pull request #18656
* Implement lazy_load() for PsychicLoader
6 months ago
Christophe Bornet aa7ac57b67
community: Implement lazy_load() for TrelloLoader (#18658)
Covered by `tests/unit_tests/document_loaders/test_trello.py`
6 months ago
Christophe Bornet 302985fea1
community: Implement lazy_load() for SlackDirectoryLoader (#18675)
Integration tests:
`tests/integration_tests/document_loaders/test_slack.py`
6 months ago
Christophe Bornet ed36f9f604
community: Implement lazy_load() for WhatsAppChatLoader (#18677)
Integration test:
`tests/integration_tests/document_loaders/test_whatsapp_chat.py`
6 months ago
Christophe Bornet f414f5cdb9
community[minor]: Implement lazy_load() for WikipediaLoader (#18680)
Integration test:
`tests/integration_tests/document_loaders/test_wikipedia.py`
6 months ago
Bagatur 4cbfeeb1c2
community[patch]: Release 0.0.26 (#18683) 6 months ago
Christophe Bornet 1100f8de7a
community[minor]: Implement lazy_load() for ArxivLoader (#18664)
Integration tests: `tests/integration_tests/utilities/test_arxiv.py` and
`tests/integration_tests/document_loaders/test_arxiv.py`
6 months ago
Christophe Bornet 2d96803ddd
community[minor]: Implement lazy_load() for OutlookMessageLoader (#18668)
Integration test:
`tests/integration_tests/document_loaders/test_email.py`
6 months ago
Christophe Bornet ae167fb5b2
community[minor]: Implement lazy_load() for SitemapLoader (#18667)
Integration tests: `test_sitemap.py` and `test_docusaurus.py`
6 months ago
Christophe Bornet 623dfcc55c
community[minor]: Implement lazy_load() for FacebookChatLoader (#18669)
Integration test:
`tests/integration_tests/document_loaders/test_facebook_chat.py`
6 months ago
Christophe Bornet 20794bb889
community[minor]: Implement lazy_load() for GitbookLoader (#18670)
Integration test:
`tests/integration_tests/document_loaders/test_gitbook.py`
6 months ago
Liang Zhang 81985b31e6
community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607)
- **Description:** Databricks SerDe uses cloudpickle instead of pickle
when serializing a user-defined function transform_input_fn since pickle
does not support functions defined in `__main__`, and cloudpickle
supports this.
- **Dependencies:** cloudpickle>=2.0.0

Added a unit test.
6 months ago
Christophe Bornet 7d6de96186
community[patch]: Implement lazy_load() for CubeSemanticLoader (#18535)
Covered by `test_cube_semantic.py`
6 months ago