Commit Graph

7993 Commits (e00c1ff2b0c324d5d0f1adfe9de9de66113c29f5)
 

Author SHA1 Message Date
Christophe Bornet ea141511d8
core: Move document loader interfaces to core (#17723)
This is needed to be able to move document loaders to partner packages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
6 months ago
aditya thomas 97de498d39
docs: update to the streaming tutorial notebook in the lcel documentation (#18378)
**Description:** Update to the streaming tutorial notebook in the LCEL
documentation
**Issue:** Fixed an import and (minor) changes in documentation language
**Dependencies:** None
6 months ago
Guangdong Liu 32db9e74e4
docs: Fix some issues with sparkllm use cases (#17674) 6 months ago
Christophe Bornet 5985454269
Merge pull request #18539
* Implement lazy_load() for GitLoader
6 months ago
Christophe Bornet 9a6f7e213b
Merge pull request #18423
* Implement lazy_load() for BSHTMLLoader
6 months ago
Christophe Bornet b3a0c44838
Merge pull request #18673
* Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader
6 months ago
Christophe Bornet 68fc0cf909
Merge pull request #18674
* Implement lazy_load() for TextLoader
6 months ago
Christophe Bornet 5b92f962f1
Merge pull request #18671
* Implement lazy_load() for MastodonTootsLoader
6 months ago
Christophe Bornet 15b1770326
Merge pull request #18421
* Implement lazy_load() for AssemblyAIAudioTranscriptLoader
6 months ago
Christophe Bornet bb284eebe4
Merge pull request #18436
* Implement lazy_load() for ConfluenceLoader
6 months ago
Christophe Bornet 691480f491
Merge pull request #18647
* Implement lazy_load() for UnstructuredBaseLoader
6 months ago
Christophe Bornet 52ac67c5d8
Merge pull request #18654
* Implement lazy_load() for ObsidianLoader
6 months ago
Christophe Bornet b9c0cf9025
Merge pull request #18656
* Implement lazy_load() for PsychicLoader
6 months ago
Christophe Bornet aa7ac57b67
community: Implement lazy_load() for TrelloLoader (#18658)
Covered by `tests/unit_tests/document_loaders/test_trello.py`
6 months ago
Christophe Bornet 302985fea1
community: Implement lazy_load() for SlackDirectoryLoader (#18675)
Integration tests:
`tests/integration_tests/document_loaders/test_slack.py`
6 months ago
Christophe Bornet ed36f9f604
community: Implement lazy_load() for WhatsAppChatLoader (#18677)
Integration test:
`tests/integration_tests/document_loaders/test_whatsapp_chat.py`
6 months ago
Christophe Bornet f414f5cdb9
community[minor]: Implement lazy_load() for WikipediaLoader (#18680)
Integration test:
`tests/integration_tests/document_loaders/test_wikipedia.py`
6 months ago
Bagatur 4cbfeeb1c2
community[patch]: Release 0.0.26 (#18683) 6 months ago
Eugene Yurtsev b9f3c7a0c9
Use Case: Extraction set temperature to 0, qualify a statement (#18672)
Minor changes:
1) Set temperature to 0 (important)
2) Better qualify one of the statements with confidence
6 months ago
Eugene Yurtsev a4a6978224
Docs: Revamp Extraction Use Case (#18588)
Revamp the extraction use case documentation

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Christophe Bornet 1100f8de7a
community[minor]: Implement lazy_load() for ArxivLoader (#18664)
Integration tests: `tests/integration_tests/utilities/test_arxiv.py` and
`tests/integration_tests/document_loaders/test_arxiv.py`
6 months ago
Christophe Bornet 2d96803ddd
community[minor]: Implement lazy_load() for OutlookMessageLoader (#18668)
Integration test:
`tests/integration_tests/document_loaders/test_email.py`
6 months ago
Christophe Bornet ae167fb5b2
community[minor]: Implement lazy_load() for SitemapLoader (#18667)
Integration tests: `test_sitemap.py` and `test_docusaurus.py`
6 months ago
Christophe Bornet 623dfcc55c
community[minor]: Implement lazy_load() for FacebookChatLoader (#18669)
Integration test:
`tests/integration_tests/document_loaders/test_facebook_chat.py`
6 months ago
Christophe Bornet 20794bb889
community[minor]: Implement lazy_load() for GitbookLoader (#18670)
Integration test:
`tests/integration_tests/document_loaders/test_gitbook.py`
6 months ago
Liang Zhang 81985b31e6
community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607)
- **Description:** Databricks SerDe uses cloudpickle instead of pickle
when serializing a user-defined function transform_input_fn since pickle
does not support functions defined in `__main__`, and cloudpickle
supports this.
- **Dependencies:** cloudpickle>=2.0.0

Added a unit test.
6 months ago
Erick Friis f3e28289f6
infra: reorder api docs build steps (#18618) 6 months ago
Leonid Ganeline 114d64d4a7
docs: `providers` update (#18527)
Added missed pages. Added links and descriptions. Foratted to the
consistent form.
6 months ago
Christophe Bornet 7d6de96186
community[patch]: Implement lazy_load() for CubeSemanticLoader (#18535)
Covered by `test_cube_semantic.py`
6 months ago
Christophe Bornet a6b5d45e31
community[patch]: Implement lazy_load() for EverNoteLoader (#18538)
Covered by `test_evernote_loader.py`
6 months ago
PSV d7dd3cd248
docs: structured_output (#18608)
- **Description:** Fixed some typos and copy errors in the Beta
Structured Output docs
    - **Issue:** N/A
    - **Dependencies:** Docs only
    - **Twitter handle:** @psvann

Co-authored-by: P.S. Vann <psvann@yahoo.com>
6 months ago
Bagatur 29f1619d61
docs: why lcel nit (#18616) 6 months ago
Max Jakob ee7a7954b9
elasticsearch: add `ElasticsearchRetriever` (#18587)
Implement
[Retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/)
interface for Elasticsearch.

I opted to only expose the `body`, which gives you full flexibility, and
none the other 68 arguments of the [search
method](https://elasticsearch-py.readthedocs.io/en/v8.12.1/api/elasticsearch.html#elasticsearch.Elasticsearch.search).

Added a user agent header for usage tracking in Elastic Cloud.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Jib 8bc347c5fc
mongodb[patch]: include LLM caches in toplevel library import (#18601) 6 months ago
Bagatur 080904689c
docs: text splitters install (#18589) 6 months ago
Sunchao Wang dc81dba6cf
community[patch]: Improve amadeus tool and doc (#18509)
Description:

This pull request addresses two key improvements to the langchain
repository:

**Fix for Crash in Flight Search Interface**:

Previously, the code would crash when encountering a failure scenario in
the flight ticket search interface. This PR resolves this issue by
implementing a fix to handle such scenarios gracefully. Now, the code
handles failures in the flight search interface without crashing,
ensuring smoother operation.

**Documentation Update for Amadeus Toolkit**:

Prior to this update, examples provided in the documentation for the
Amadeus Toolkit were unable to run correctly due to outdated
information. This PR includes an update to the documentation, ensuring
that all examples can now be executed successfully. With this update,
users can effectively utilize the Amadeus Toolkit with accurate and
functioning examples.
These changes aim to enhance the reliability and usability of the
langchain repository by addressing issues related to error handling and
ensuring that documentation remains up-to-date and actionable.

Issue: https://github.com/langchain-ai/langchain/issues/17375

Twitter Handle: SingletonYxx
6 months ago
Christophe Bornet f77f7dc3ec
community[patch]: Fix VectorStoreQATool (#18529)
Fix #18460
6 months ago
Utkarsh Kapil 539a13dbda
docs: minor spelling errors (#18429)
Description: Noticed spelling errors. 'Colab' mispelt as 'Collab'.
https://python.langchain.com/docs/use_cases
Dependencies: n/a
6 months ago
Dounx ad48f55357
community[minor]: add Yuque document loader (#17924)
This pull request support loading documents from Yuque with Langchain.

Yuque is a professional cloud-based knowledge base for team
collaboration in documentation.

Website: https://www.yuque.com
OpenAPI: https://www.yuque.com/yuque/developer/openapi
6 months ago
Kazuki Maeda 60c5d964a8
community[minor]: use jq schema for content_key in json_loader (#18003)
### Description
Changed the value specified for `content_key` in JSONLoader from a
single key to a value based on jq schema.
I created [similar
PR](https://github.com/langchain-ai/langchain/pull/11255) before, but it
has several conflicts because of the architectural change associated
stable version release, so I re-create this PR to fit new architecture.

### Why
For json data like the following, specify `.data[].attributes.message`
for page_content and `.data[].attributes.id` or
`.data[].attributes.attributes. tags`, etc., the `content_key` must also
parse the json structure.

<details>
<summary>sample json data</summary>

```json
{
  "data": [
    {
      "attributes": {
        "message": "message1",
        "tags": [
          "tag1"
        ]
      },
      "id": "1"
    },
    {
      "attributes": {
        "message": "message2",
        "tags": [
          "tag2"
        ]
      },
      "id": "2"
    }
  ]
}
```

</details>

<details>
<summary>sample code</summary>

```python
def metadata_func(record: dict, metadata: dict) -> dict:

    metadata["source"] = None
    metadata["id"] = record.get("id")
    metadata["tags"] = record["attributes"].get("tags")

    return metadata

sample_file = "sample1.json"
loader = JSONLoader(
    file_path=sample_file,
    jq_schema=".data[]",
    content_key=".attributes.message", ## content_key is parsable into jq schema
    is_content_key_jq_parsable=True, ## this is added parameter
    metadata_func=metadata_func
)

data = loader.load()
data
```

</details>

### Dependencies
none

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)
6 months ago
Rodrigo Nogueira f4bb33bbf3
docs: fix link and missing package (#18405)
**Issue:** fix broken links and missing package on colab example
6 months ago
Max Jakob 81e9ab6e3a
docs: Update elasticsearch README (#18497)
Update Elasticsearch README with information on how to start a
deployment.

Also make some cosmetic changes to the [Elasticsearch
docs](https://python.langchain.com/docs/integrations/vectorstores/elasticsearch).

Follow up on https://github.com/langchain-ai/langchain/pull/17467
6 months ago
Hech 6a08134661
community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106) 6 months ago
Mikhail Khludnev d039dcb6ba
nvidia-trt[patch]: add TritonTensorRTLLM(verbose_client=False) (#16848)
- **Description:** adding verbose flag to TritonTensorRTLLM, 
  - **Issue:** nope,
  - **Dependencies:** not any,
  - **Twitter handle:**
6 months ago
Bagatur 1569b19191
docs: query analysis links (#18614) 6 months ago
Asaf Joseph Gardin 27441555d0
ai21[patch]: AI21 Labs Contextual Answers support (#18270)
Description: Added support for AI21 Labs model - Contextual Answers
Dependencies: ai21, ai21-tokenizer
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Erick Friis e169ee8863
anthropic[patch]: handle lists in function calling (#18609) 6 months ago
Erick Friis 1831733c2e
anthropic[patch]: fix argument integration test (#18605) 6 months ago
Leonid Ganeline bd4993141d
docs: `providers` update 5 (#18550)
Added missed sections. Added descriptions.
6 months ago
Yudhajit Sinha 4570b477b9
community[patch]: Invoke callback prior to yielding token (titan_takeoff) (#18560)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/titan_takeoff.
- Issue: #16913 
- Dependencies: None
6 months ago