Commit Graph

5952 Commits

Author SHA1 Message Date
ZhangShenao
f3925d71b9
community: Fix word spelling in Text2vecEmbeddings (#27183)
Fix word spelling in `Text2vecEmbeddings`
2024-10-15 09:28:48 -07:00
Erick Friis
92ae61bcc8
multiple: rely on asyncio_mode auto in tests (#27200) 2024-10-15 16:26:38 +00:00
William FH
0a3e089827
[Anthropic] Shallow Copy (#27105)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 15:50:48 +00:00
Matthew Peveler
c6533616b6
docs: fix community pgvector deprecation warning formatting (#27094)
**Description**: PR fixes some formatting errors in deprecation message
in the `langchain_community.vectorstores.pgvector` module, where it was
missing spaces between a few words, and one word was misspelled.
**Issue**: n/a
**Dependencies**: n/a

Signed-off-by: mpeveler@timescale.com
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 15:45:53 +00:00
Erick Friis
3fa5ce3e5f
community: clear mypy syntax warning in openapi (#27370)
not completely clear the regex is functional
2024-10-15 15:43:53 +00:00
Ahmet Yasin Aytar
443b37403d
community: refactor Arxiv search logic (#27084)
PR message:

Description:
This PR refactors the Arxiv API wrapper by extracting the Arxiv search
logic into a helper function (_fetch_results) to reduce code duplication
and improve maintainability. The helper function is used in methods like
get_summaries_as_docs, run, and lazy_load, streamlining the code and
making it easier to maintain in the future.

Issue:
This is a minor refactor, so no specific issue is being fixed.

Dependencies:
No new dependencies are introduced with this change.

Add tests and docs:
No new integrations were added, so no additional tests or docs are
necessary for this PR.
Lint and test:
I have run make format, make lint, and make test to ensure all checks
pass successfully.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 08:43:03 -07:00
Qiu Qin
57fbc6bdf1
community: Update OCI data science integration (#27083)
This PR updates the integration with OCI data science model deployment
service.

- Update LLM to support streaming and async calls.
- Added chat model.
- Updated tests and docs.
- Updated `libs/community/scripts/check_pydantic.sh` since the use of
`@pre_init` is removed from existing integration.
- Updated `libs/community/extended_testing_deps.txt` as this integration
requires `langchain_openai`.

---------

Co-authored-by: MING KANG <ming.kang@oracle.com>
Co-authored-by: Dmitrii Cherkasov <dmitrii.cherkasov@oracle.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 08:32:54 -07:00
Rafael Miller
fc14f675f1
Community: Updated Firecrawl Document Loader to v1 (#26548)
This PR updates the Firecrawl Document Loader to use the recently
released V1 API of Firecrawl.

**Key Updates:**

**Firecrawl V1 Integration:** Updated the document loader to leverage
the new Firecrawl V1 API for improved performance, reliability, and
developer experience.

**Map Functionality Added:** Introduced the map mode for more flexible
document loading options.

These updates enhance the integration and provide access to the latest
features of Firecrawl.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-10-15 13:13:28 +00:00
Max Tran
8fea07f92e
community: fixed KeyError: 'client' (#27345)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
Updated

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
twitter: @MaxHTran

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
Not needed due to small change

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Max Tran <maxtra@amazon.com>
2024-10-14 20:51:13 +00:00
Martin Triska
8dc4bec947
[community] [Bugfix] base_o365 document loader metadata needs to be JSON serializable (#26322)
In order for indexer to work, all metadata in the documents need to be
JSON serializable. Timestamps are not.

See here:

https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/indexing/api.py#L83-L89

@eyurtsev could you please review? It's a tiny PR :-)
2024-10-14 12:48:31 -04:00
Trayan Azarov
59bbda9ba3
chroma: Deprecating versions 0.5.7 thru 0.5.12 (#27305)
**Description:** Deprecated version of Chroma >=0.5.5 <0.5.12 due to a
serious correctness issue that caused some embeddings for deployments
with multiple collections to be lost (read more on the issue in Chroma
repo)
**Issue:** chroma-core/chroma#2922 (fixed by chroma-core/chroma##2923
and released in
[0.5.13](https://github.com/chroma-core/chroma/releases/tag/0.5.13))
**Dependencies:** N/A
**Twitter handle:** `@t_azarov`
2024-10-14 11:56:05 -04:00
Marcelo Nunes Alves
5647276998
community: Problem with embeddings in new versions of clickhouse. (#26041)
Starting with Clickhouse version 24.8, a different type of configuration
has been introduced in the vectorized data ingestion, and if this
configuration occurs, an error occurs when generating the table. As can
be seen below:

![Screenshot from 2024-09-04
11-48-00](https://github.com/user-attachments/assets/70840a93-1001-490c-921a-26924c51d9eb)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-11 18:54:50 +00:00
Eugene Yurtsev
5b9b8fe80f
core[patch]: Ignore ASYNC110 to upgrade to newest ruff version (#27229)
Ignoring ASYNC110 with explanation
2024-10-09 11:25:58 -04:00
Vittorio Rigamonti
7da2efd9d3
community[minor]: VectorStore Infinispan. Adding TLS and authentication (#23522)
**Description**:
this PR enable VectorStore TLS and authentication (digest, basic) with
HTTP/2 for Infinispan server.
Based on httpx.

Added docker-compose facilities for testing
Added documentation

**Dependencies:**
requires `pip install httpx[http2]` if HTTP2 is needed

**Twitter handle:**
https://twitter.com/infinispan
2024-10-09 10:51:39 -04:00
Diao Zihao
4553573acb
core[patch],langchain[patch],community[patch]: Bump version dependency of tenacity to >=8.1.0,!=8.4.0,<10 (#27201)
This should fixes the compatibility issue with graprag as in

- https://github.com/langchain-ai/langchain/discussions/25595

Here are the release notes for tenacity 9
(https://github.com/jd/tenacity/releases/tag/9.0.0)

---------

Signed-off-by: Zihao Diao <hi@ericdiao.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-09 14:00:45 +00:00
Stefano Lottini
d05fdd97dd
community: Cassandra Vector Store: extend metadata-related methods (#27078)
**Description:** this PR adds a set of methods to deal with metadata
associated to the vector store entries. These, while essential to the
Graph-related extension of the `Cassandra` vector store, are also useful
in themselves. These are (all come in their sync+async versions):

- `[a]delete_by_metadata_filter`
- `[a]replace_metadata`
- `[a]get_by_document_id`
- `[a]metadata_search`

Additionally, a `[a]similarity_search_with_embedding_id_by_vector`
method is introduced to better serve the store's internal working (esp.
related to reranking logic).

**Issue:** no issue number, but now all Document's returned bear their
`.id` consistently (as a consequence of a slight refactoring in how the
raw entries read from DB are made back into `Document` instances).

**Dependencies:** (no new deps: packaging comes through langchain-core
already; `cassio` is now required to be version 0.1.10+)


**Add tests and docs**
Added integration tests for the relevant newly-introduced methods.
(Docs will be updated in a separate PR).

**Lint and test** Lint and (updated) test all pass.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-09 06:41:34 +00:00
Erick Friis
84c05b031d
community: release 0.3.2 (#27214) 2024-10-08 23:33:55 -07:00
Serena Ruan
a7c1ce2b3f
[community] Add timeout control and retry for UC tool execution (#26645)
Add timeout at client side for UCFunctionToolkit and add retry logic.
Users could specify environment variable
`UC_TOOL_CLIENT_EXECUTION_TIMEOUT` to increase the timeout value for
retrying to get the execution response if the status is pending. Default
timeout value is 120s.


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Tested in Databricks:
<img width="1200" alt="image"
src="https://github.com/user-attachments/assets/54ab5dfc-5e57-4941-b7d9-bfe3f8ad3f62">



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: serena-ruan <serena.rxy@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-09 06:31:48 +00:00
Tomaz Bratanic
481bd25d29
community: Fix database connections for neo4j (#27190)
Fixes https://github.com/langchain-ai/langchain/issues/27185

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 23:47:55 +00:00
Erick Friis
cedf4d9462
langchain: release 0.3.3 (#27213) 2024-10-08 16:39:42 -07:00
Erick Friis
7264fb254c
core: release 0.3.10 (#27209) 2024-10-08 16:21:42 -07:00
Bagatur
ce33c4fa40
openai[patch]: default temp=1 for o1 (#27206) 2024-10-08 15:45:21 -07:00
RIdham Golakiya
73ad7f2e7a
langchain_chroma[patch]: updated example for get documents with where clause (#26767)
Example updated for vectorstore ChromaDB.

If we want to apply multiple filters then ChromaDB supports filters like
this:
Reference: [ChromaDB
filters](https://cookbook.chromadb.dev/core/filters/)

Thank you.
2024-10-08 20:21:58 +00:00
Bagatur
e3e9ee8398
core[patch]: utils for adding/subtracting usage metadata (#27203) 2024-10-08 13:15:33 -07:00
ccurme
e3920f2320
community[patch]: fix structured_output in llamacpp integration (#27202)
Resolves https://github.com/langchain-ai/langchain/issues/25318.
2024-10-08 15:16:59 -04:00
Erick Friis
b84e00283f
standard-tests: test that only one chunk sets input_tokens (#27177) 2024-10-08 11:35:32 -07:00
Ajayeswar Reddy
9b7bdf1a26
Fixed typo in llibs/community/langchain_community/storage/sql.py (#27029)
- [ ] **PR title**: docs: fix typo in SQLStore import path

- [ ] **PR message**: 
- **Description:** This PR corrects a typo in the docstrings for the
class SQLStore(BaseStore[str, bytes]). The import path in the docstring
currently reads from langchain_rag.storage import SQLStore, which should
be changed to langchain_community.storage import SQLStore. This typo is
also reflected in the official documentation.
    - **Issue:** N/A
    - **Dependencies:** None
    - **Twitter handle:** N/A

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:51:26 +00:00
Vadym Barda
8d27325dbc
core[patch]: support ValidationError from pydantic v1 in tools (#27194) 2024-10-08 10:19:04 -04:00
Christophe Bornet
16f5fdb38b
core: Add various ruff rules (#26836)
Adds
- ASYNC
- COM
- DJ
- EXE
- FLY
- FURB
- ICN
- INT
- LOG
- NPY
- PD
- Q
- RSE
- SLOT
- T10
- TID
- YTT

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-07 22:30:27 +00:00
Erick Friis
5c826faece
core: update make format to fix all autofixable things (#27174) 2024-10-07 15:20:47 -07:00
Christophe Bornet
d31ec8810a
core: Add ruff rules for error messages (EM) (#26965)
All auto-fixes

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-07 22:12:28 +00:00
Oleksii Pokotylo
37ca468d03
community: AzureSearch: fix reranking for empty lists (#27104)
**Description:** 
  Fix reranking for empty lists 

**Issue:** 
```
ValueError: not enough values to unpack (expected 3, got 0)
    documents, scores, vectors = map(list, zip(*docs))
  File langchain_community/vectorstores/azuresearch.py", line 1680, in _reorder_results_with_maximal_marginal_relevance
```

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
2024-10-07 15:27:09 -04:00
Christophe Bornet
c4ebccfec2
core[minor]: Improve support for id in VectorStore (#26660)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 15:01:08 -04:00
Bharat Ramanathan
931ce8d026
core[patch]: Update AsyncCallbackManager to honor run_inline attribute and prevent context loss (#26885)
## Description

This PR fixes the context loss issue in `AsyncCallbackManager`,
specifically in `on_llm_start` and `on_chat_model_start` methods. It
properly honors the `run_inline` attribute of callback handlers,
preventing race conditions and ordering issues.

Key changes:
1. Separate handlers into inline and non-inline groups.
2. Execute inline handlers sequentially for each prompt.
3. Execute non-inline handlers concurrently across all prompts.
4. Preserve context for stateful handlers.
5. Maintain performance benefits for non-inline handlers.

**These changes are implemented in `AsyncCallbackManager` rather than
`ahandle_event` because the issue occurs at the prompt and message_list
levels, not within individual events.**

## Testing

- Test case implemented in #26857 now passes, verifying execution order
for inline handlers.

## Related Issues

- Fixes issue discussed in #23909 

## Dependencies

No new dependencies are required.

---

@eyurtsev: This PR implements the discussed changes to respect
`run_inline` in `AsyncCallbackManager`. Please review and advise on any
needed changes.

Twitter handle: @parambharat

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 14:59:29 -04:00
João Carlos Ferra de Almeida
780ce00dea
core[minor]: add **kwargs to index and aindex functions for custom vector_field support (#26998)
Added `**kwargs` parameters to the `index` and `aindex` functions in
`libs/core/langchain_core/indexing/api.py`. This allows users to pass
additional arguments to the `add_documents` and `aadd_documents`
methods, enabling the specification of a custom `vector_field`. For
example, users can now use `vector_field="embedding"` when indexing
documents in `OpenSearchVectorStore`

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 14:52:50 -04:00
Jorge Piedrahita Ortiz
14de81b140
community: sambastudio chat model (#27056)
**Description:**: sambastudio chat model integration added, previously
only LLM integration
     included docs and tests

---------

Co-authored-by: luisfucros <luisfucros@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-07 14:31:39 -04:00
Aditya Anand
f70650f67d
core[patch]: correct typo doc-string for astream_events method (#27108)
This commit addresses a typographical error in the documentation for the
async astream_events method. The word 'evens' was incorrectly used in
the introductory sentence for the reference table, which could lead to
confusion for users.\n\n### Changes Made:\n- Corrected 'Below is a table
that illustrates some evens that might be emitted by various chains.' to
'Below is a table that illustrates some events that might be emitted by
various chains.'\n\nThis enhancement improves the clarity of the
documentation and ensures accurate terminology is used throughout the
reference material.\n\nIssue Reference: #27107
2024-10-07 14:12:42 -04:00
Bagatur
38099800cc
docs: fix anthropic max_tokens docstring (#27166) 2024-10-07 16:51:42 +00:00
ogawa
07dd8dd3d7
community[patch]: update gpt-4o cost (#27038)
updated OpenAI cost definition according to the following:
https://openai.com/api/pricing/
2024-10-07 09:06:30 -04:00
Bagatur
06ce5d1d5c
anthropic[patch]: Release 0.2.3 (#27126) 2024-10-04 22:38:03 +00:00
Bagatur
0b8416bd2e
anthropic[patch]: fix input_tokens when cached (#27125) 2024-10-04 22:35:51 +00:00
Bagatur
bd5b335cb4
standard-tests[patch]: fix oai usage metadata test (#27122) 2024-10-04 20:00:48 +00:00
Bagatur
827bdf4f51
fireworks[patch]: Release 0.2.1 (#27120) 2024-10-04 18:59:15 +00:00
Bagatur
98942edcc9
openai[patch]: Release 0.2.2 (#27119) 2024-10-04 11:54:01 -07:00
Bagatur
414fe16071
anthropic[patch]: Release 0.2.2 (#27118) 2024-10-04 11:53:53 -07:00
Bagatur
11df1b2b8d
core[patch]: Release 0.3.9 (#27117) 2024-10-04 18:35:33 +00:00
Scott Hurrey
558fb4d66d
box: Add citation support to langchain_box.retrievers.BoxRetriever when used with Box AI (#27012)
Thank you for contributing to LangChain!

**Description:** Box AI can return responses, but it can also be
configured to return citations. This change allows the developer to
decide if they want the answer, the citations, or both. Regardless of
the combination, this is returned as a single List[Document] object.

**Dependencies:** Updated to the latest Box Python SDK, v1.5.1
**Twitter handle:** BoxPlatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-04 18:32:34 +00:00
Bagatur
1e768a9ec7
anthropic[patch]: correctly handle tool msg with empty list (#27109) 2024-10-04 11:30:50 -07:00
Bagatur
4935a14314
core,integrations[minor]: Dont error on fields in model_kwargs (#27110)
Given the current erroring behavior, every time we've moved a kwarg from
model_kwargs and made it its own field that was a breaking change.
Updating this behavior to support the old instantiations /
serializations.

Assuming build_extra_kwargs was not something that itself is being used
externally and needs to be kept backwards compatible
2024-10-04 11:30:27 -07:00
Bagatur
0495b7f441
anthropic[patch]: add usage_metadata details (#27087)
fixes https://github.com/langchain-ai/langchain/pull/27087
2024-10-04 08:46:49 -07:00
Erick Friis
e8e5d67a8d
openai: fix None token detail (#27091)
happens in Azure
2024-10-04 01:25:38 +00:00
Erick Friis
ab4dab9a0c
core: fix batch race condition in FakeListChatModel (#26924)
fixed #26273
2024-10-03 23:14:31 +00:00
Bagatur
87fc5ce688
core[patch]: exclude model cache from ser (#27086) 2024-10-03 22:00:31 +00:00
Bagatur
c09da53978
openai[patch]: add usage metadata details (#27080) 2024-10-03 14:01:03 -07:00
Bagatur
546dc44da5
core[patch]: add UsageMetadata details (#27072) 2024-10-03 20:36:17 +00:00
Tibor Reiss
47a9199fa6
community[patch]: Fix missing protected_namespaces (#27076)
Fixes #26861

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-03 20:12:11 +00:00
Bagatur
2a54448a0a
langchain[patch]: Release 0.3.2 (#27073) 2024-10-03 18:13:23 +00:00
Bharat Ramanathan
103e573f9b
community[patch]: chore warn deprecate the wandb callback handler (#27062)
- **Description:**: This PR deprecates the wandb callback handler in
favor of the new
[WeaveTracer](https://weave-docs.wandb.ai/guides/integrations/langchain#using-weavetracer)
in W&B
- **Dependencies:** No dependencies, just a deprecation warning.
- **Twitter handle:** @parambharat


@baskaryan
2024-10-03 11:59:20 -04:00
Eugene Yurtsev
635c55c039
core[patch]: Release 0.3.8 (#27046)
0.3.8 release for core
2024-10-02 16:58:38 +00:00
Eugene Yurtsev
74bf620e97
core[patch]: Support injected tool args that are arbitrary types (#27045)
This adds support for inject tool args that are arbitrary types when
used with pydantic 2.

We'll need to add similar logic on the v1 path, and potentially mirror
the config from the original model when we're doing the subset.
2024-10-02 12:50:58 -04:00
Bagatur
099235da01
Revert "huggingface[patch]: make HuggingFaceEndpoint serializable (#2… (#27032)
…7027)"

This reverts commit b5e28d3a6d.
2024-10-01 21:26:38 +00:00
Bagatur
5f2e93ffea
huggingface[patch]: xfail test (#27031) 2024-10-01 21:14:07 +00:00
Bagatur
b5e28d3a6d
huggingface[patch]: make HuggingFaceEndpoint serializable (#27027) 2024-10-01 13:16:10 -07:00
ccurme
9d10151123
core[patch]: fix init of RunnableAssign (#26903)
Example in API ref currently raises ValidationError.

Resolves https://github.com/langchain-ai/langchain/issues/26862
2024-10-01 14:21:54 -04:00
Erick Friis
95a87291fd
community: deprecate community ollama integrations (#26733) 2024-10-01 09:18:07 -07:00
ZhangShenao
e317d457cf
Bug-Fix[Community] Fix FastEmbedEmbeddings (#26764)
#26759 

- Fix https://github.com/langchain-ai/langchain/issues/26759 
- Change `model` param from private to public, which may not be
initiated.
- Add test case
2024-09-30 21:23:08 -04:00
Erick Friis
a8e1577f85
milvus: mv to external repo (#26920) 2024-10-01 00:38:30 +00:00
Erick Friis
35f6393144
unstructured: mv to external repo (#26923) 2024-09-30 17:38:21 -07:00
Erick Friis
7ecd720120
multiple: update docs urls to latest 2 (#26837) 2024-09-30 17:37:07 -07:00
Tomaz Bratanic
446144e7c6
Update neo4j vector procedures (#26775) 2024-09-30 14:45:09 -07:00
federico-pisanu
2538963945
core[patch]: improve index/aindex api when batch_size<n_docs (#25754)
- **Description:** prevent index function to re-index entire source
document even if nothing has changed.
- **Issue:** #22135

I worked on a solution to this issue that is a compromise between being
cheap and being fast.
In the previous code, when batch_size is greater than the number of docs
from a certain source almost the entire source is deleted (all documents
from that source except for the documents in the first batch)
My solution deletes documents from vector store and record manager only
if at least one document has changed for that source.

Hope this can help!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-30 20:57:41 +00:00
Eugene Yurtsev
7fde2791dc
core[patch]: Add kwargs to Runnable (#27008)
Fixes #26685

---------

Co-authored-by: Tibor Reiss <tibor.reiss@gmail.com>
2024-09-30 16:45:29 -04:00
Christophe Bornet
2a6abd3f0a
community[patch]: Add docstring for Links (#25969)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-30 20:33:50 +00:00
Mohammad Mohtashim
e12f570ead
Merge pull request #26794
* [chore]: Agent Observation should be casted to string to avoid errors

* Merge branch 'master' into fix_observation_type_streaming

* [chore]: Using Json.dumps

* [chore]: Exact same logic as  when casting agent oobservation to string
2024-09-30 15:54:51 -04:00
Bagatur
34bd718fe1
core[patch]: Release 0.3.7 (#27004) 2024-09-30 18:52:42 +00:00
Bagatur
248be02259
core[patch]: fix structured prompt template format (#27003)
template_format is an init argument on ChatPromptTemplate but not an
attribute on the object so was getting shoved into
StructuredPrompt.structured_ouptut_kwargs
2024-09-30 11:47:46 -07:00
Bagatur
0078493a80
fireworks[patch]: allow tool_choice with multiple tools (#26999)
https://docs.fireworks.ai/api-reference/post-chatcompletions
2024-09-30 11:28:43 -07:00
Bagatur
c7120d87dd
groq[patch]: support tool_choice=any/required (#27000)
https://console.groq.com/docs/api-reference#chat-create
2024-09-30 11:28:35 -07:00
Christophe Bornet
db8845a62a
core: Add ruff rules for pycodestyle Warning (W) (#26964)
All auto-fixes.
2024-09-30 09:31:43 -04:00
Bagatur
9404e7af9d
openai[patch]: exclude http client (#26891)
httpx clients aren't serializable
2024-09-29 11:16:27 -07:00
Ben Chambers
29bf89db25
community: Add conversions from GVS to networkx (#26906)
These allow converting linked documents (such as those used with
GraphVectorStore) to networkx for rendering and/or in-memory graph
algorithms such as community detection.
2024-09-27 16:48:55 -04:00
Christophe Bornet
7809b31b95
core[patch]: Add ruff rules for flake8-simplify (SIM) (#26848)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-27 20:13:23 +00:00
Eugene Yurtsev
de0b48c41a
docs: Upgrade examples with RunnableWithMessageHistory to langgraph memory (#26855)
This PR updates the documentation examples that used
RunnableWithMessageHistory to show how to achieve the same
implementation with langgraph memory.

Some of the underlying PRs (not all of them):

- docs[patch]: update chatbot tutorial and migration guide (#26780)
- docs[patch]: update chatbot memory how-to (#26790)
- docs[patch]: update chatbot tools how-to (#26816)
- docs: update chat history in rag how-to (#26821)
- docs: update trim messages notebook (#26793)
- docs: clean up imports in how to guide for rag qa with chat history
(#26825)
- docs[patch]: update conversational rag tutorial (#26814)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Vadym Barda <vadym@langchain.dev>
Co-authored-by: mercyspirit <ziying.qiu@gmail.com>
Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: John <43506685+Coniferish@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Subhrajyoty Roy <subhrajyotyroy@gmail.com>
Co-authored-by: Rajendra Kadam <raj.725@outlook.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Devin Gaffney <itsme@devingaffney.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-27 20:04:30 +00:00
Christophe Bornet
f4e738bb40
core: Add ruff rules for PIE (#26939)
All auto-fixes.
2024-09-27 12:08:35 -04:00
ccurme
39987ebd91
openai[patch]: update deprecation target in API ref (#26921) 2024-09-27 08:42:31 -04:00
Subhrajyoty Roy
7f37fd8b80
community[patch]: callback before yield for cloudflare (#26927)
**Description:** Moves yield to after callback for `_stream` function
for the cloudfare workersai model in the community llm package
**Issue:** #16913
2024-09-27 08:42:01 -04:00
Youshin Kim
2d9a09dfa4
Fix typo in mlflow code example in mlflow.py (#26931)
- [x] PR title: Fix typo in code example in mlflow.py
- In libs/community/langchain_community/chat_models/mlflow.py
2024-09-27 12:41:39 +00:00
Subhrajyoty Roy
7037ba0f06
community[patch]: callback before yield for mlx pipeline (#26928)
**Description:** Moves yield to after callback for `_stream` function
for the MLX pipeline model in the community llm package
**Issue:** #16913
2024-09-27 08:41:34 -04:00
Subhrajyoty Roy
adcfecdb67
community[patch]: callback before yield for textgen (#26929)
**Description:** Moves callback to before yield for `_stream` and
`_astream` function for the textgen model in the community llm package
**Issue:** #16913
2024-09-27 08:41:13 -04:00
Subhrajyoty Roy
5f2cc4ecb2
community[patch]: callback before yield for titan takeoff (#26930)
**Description:** Moves yield to after callback for `_stream` function
for the titan takeoff model in the community llm package
**Issue:** #16913
2024-09-27 08:40:22 -04:00
Emmanuel Sciara
c6350d636e
core[fix]: using async rate limiter methods in async code (#26914)
**Description:** Replaced blocking (sync) rate_limiter code in async
methods.

**Issue:** #26913

**Dependencies:** N/A

**Twitter handle:** no need 🤗
2024-09-26 20:44:28 +00:00
Abhi Agarwal
696114e145
community: add sqlite-vec vectorstore (#25003)
**Description**:

Adds a vector store integration with
[sqlite-vec](https://alexgarcia.xyz/sqlite-vec/), the successor to
sqlite-vss that is a single C file with no external dependencies.

Pretty straightforward, just copy-pasted the sqlite-vss integration and
made a few tweaks and added integration tests. Only question is whether
all documentation should be directed away from sqlite-vss if it is
defacto deprecated (cc @asg017).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: philippe-oger <philippe.oger@adevinta.com>
2024-09-26 17:37:10 +00:00
Erick Friis
8bc12df2eb
voyageai: new models (#26907)
Co-authored-by: fzowl <zoltan@voyageai.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
2024-09-26 17:07:10 +00:00
Subhrajyoty Roy
ba467f1a36
community[patch]: callback before yield for gigachat (#26881)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the gigachat model in the community llm package
**Issue:** #16913
2024-09-26 12:47:28 -04:00
Subhrajyoty Roy
11e703a97e
community[patch]: callback before yield for google palm (#26882)
**Description:** Moves yield to after callback for `_stream` function
for the google palm model in the community package
**Issue:** #16913
2024-09-26 12:47:05 -04:00
Julius Stopforth
121e79b1f0
core: Fix IndexError when trim_messages invoked with empty list (#26896)
This prevents `trim_messages` from raising an `IndexError` when invoked
with `include_system=True`, `strategy="last"`, and an empty message
list.

Fixes #26895

Dependencies: none
2024-09-26 11:29:58 -04:00
ccurme
7091a1a798
openai[patch]: increase token limit in azure integration tests (#26901)
`test_json_mode` occasionally runs into this
2024-09-26 14:31:33 +00:00
Erick Friis
2ea5f60cc5
experimental: migrate to external repo (#26879)
security scanners can't distinguish monorepo sources from each other.
this will resolve issues for folks trying to use e.g. langchain-core but
getting security issues from experimental flagged!
2024-09-25 19:02:19 -07:00
Erick Friis
6f3c8313ba
community: bump langchain version (#26876) 2024-09-25 12:58:24 -07:00
Erick Friis
e068407f18
community: bump core versoin (#26875) 2024-09-25 12:57:16 -07:00
Eugene Yurtsev
25cb44c9ee
0.3.1 release community (#26872)
Release for 0.3.1

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-25 19:38:53 +00:00
Erick Friis
9a31ad6f60
langchain: release 0.3.1 (#26868) 2024-09-25 11:43:54 -07:00
Erick Friis
ef2ab26113
core: release 0.3.6 (#26863) 2024-09-25 11:05:53 -07:00
Bagatur
eaffa92c1d
openai[patch]: Release 0.2.1 (#26858) 2024-09-25 15:55:49 +00:00
Rajendra Kadam
51c4393298
community[patch]: Fix validation error in SettingsConfigDict across multiple Langchain modules (#26852)
- **Description:** This pull request addresses the validation error in
`SettingsConfigDict` due to extra fields in the `.env` file. The issue
is prevalent across multiple Langchain modules. This fix ensures that
extra fields in the `.env` file are ignored, preventing validation
errors.
  **Changes include:**
    - Applied fixes to modules using `SettingsConfigDict`.

- **Issue:** NA, similar
https://github.com/langchain-ai/langchain/issues/26850
- **Dependencies:** NA
2024-09-25 10:02:14 -04:00
Christophe Bornet
3a1b9259a7
core: Add ruff rules for comprehensions (C4) (#26829) 2024-09-25 09:34:17 -04:00
Rajendra Kadam
7e5a9c317f
community[minor]: [Pebblo] Enhance PebbloSafeLoader to take anonymize flag (#26812)
- **Description:** The flag is named `anonymize_snippets`. When set to
true, the Pebblo server will anonymize snippets by redacting all
personally identifiable information (PII) from the snippets going into
VectorDB and the generated reports
- **Issue:** NA
- **Dependencies:** NA
- **docs**: Updated
2024-09-25 09:33:06 -04:00
Rajendra Kadam
92003b3724
community[patch]: [SharePointLoader] Fix validation error in _O365Settings due to extra fields in .env file (#26851)
**Description:** Fix validation error in _O365Settings by ignoring extra
fields in .env file
**Issue:** https://github.com/langchain-ai/langchain/issues/26850
**Dependencies:** NA
2024-09-25 09:31:59 -04:00
Subhrajyoty Roy
b61fb98466
community[patch]: callback before yield for friendli (#26842)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the friendli model in the community package
**Issue:** #16913
2024-09-25 09:31:12 -04:00
ccurme
13acf9e6b0
langchain[patch]: add deprecation warnings (#26853) 2024-09-25 09:26:44 -04:00
William FH
82b5b77940
[Core] Add more interops tests (#26841)
To test that the client propagates both ways
2024-09-24 20:18:20 -07:00
William FH
9b6ac41442
[Core] Inherit tracing metadata & tags (#26838) 2024-09-24 19:33:12 -07:00
Erick Friis
425c0f381f
experimental: release 0.3.1 (#26830) 2024-09-24 15:03:05 -07:00
John
6c3ea262c8
partners/unstructured: release 0.1.5 (#26831)
**Description:** update package version to support loading URLs #26670
**Issue:**  #26697
2024-09-24 15:02:53 -07:00
mercyspirit
0414be4b80
experimental[major]: CVE-2024-46946 fix (#26783)
Description: Resolve CVE-2024-46946 by switching out sympify with
parse_expr with a very specific allowed set of operations.

https://nvd.nist.gov/vuln/detail/cve-2024-46946

Sympify uses eval which makes it vulnerable to code execution.
parse_expr is limited to specific expressions.

Bandit results

![image](https://github.com/user-attachments/assets/170a6376-7028-4e70-a7ef-9acfb49c1d8a)

---------

Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-24 21:37:56 +00:00
Subhrajyoty Roy
b1da532522
community[patch]: callback before yield for deepsparse llm (#26822)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the deepsparse model in the community package
**Issue:** #16913
2024-09-24 13:55:52 -04:00
Nuno Campos
de70a64e3a
core: Run LangChainTracer inline (#26797)
- this flag ensures the tracer always runs in the same thread as the run
being traced for both sync and async runs
- pro: less chance for ordering bugs and other oddities
- blocking the event loop is not a concern given all code in the tracer
holds the GIL anyway
2024-09-24 08:31:18 -07:00
Jorge Piedrahita Ortiz
408a930d55
community: Add Sambanova Cloud Chat model community integration (#26333)
**Description:** : Add SambaNova Cloud Chat model community integration
Includes 
- chat model integration (following Standardize ChatModel docstrings)
-  tests
- docs usage notebook (following Standardize ChatModel integration docs)

https://cloud.sambanova.ai/

---------

Co-authored-by: luisfucros <luisfucros@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-24 14:11:32 +00:00
Tom
2b83c7c3ab
community[patch]: Fix tool_calls parsing when streaming from DeepInfra (#26813)
- **Description:** This PR fixes the response parsing logic for
`ChatDeepInfra`, more specifially `_convert_delta_to_message_chunk()`,
which is invoked when streaming via `ChatDeepInfra`.
- **Issue:** Streaming from DeepInfra via `ChatDeepInfra` is currently
broken because the response parsing logic doesn't handle that
`tool_calls` can be `None`. (There is no GitHub issue for this problem
yet.)
- **Dependencies:** –
- **Twitter handle:** –

Keeping this here as a reminder:
> If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-24 13:47:36 +00:00
Subhrajyoty Roy
997d95c8f8
community[patch]: callback before yield for bedrock llm (#26804)
**Description:** Moves yield to after callback for
`_prepare_input_and_invoke_stream` and
`_aprepare_input_and_invoke_stream` for bedrock llm in community
package.
**Issue:** #16913
2024-09-24 12:14:59 +00:00
ccurme
2a4c5713cd
openai[patch]: fix azure integration tests (#26791) 2024-09-23 17:49:15 -04:00
Mohammad Mohtashim
154a5ff7ca
core[patch]: On Chain Start Fix for Chain Class (#26593)
- **Issue:** #26588

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-23 19:30:59 +00:00
ccurme
bba7af903b
core[patch]: set default on Blob (#26787)
Resolves https://github.com/langchain-ai/langchain/issues/26781
2024-09-23 18:55:56 +00:00
ccurme
97b27f0930
langchain[patch]: fix extended tests (#26788)
Broken by addition of `disabled_params`
2024-09-23 18:52:09 +00:00
Bagatur
e1e4f88b3e
openai[patch]: enable Azure structured output, parallel_tool_calls=Fa… (#26599)
…lse, tool_choice=required

response_format=json_schema, tool_choice=required, parallel_tool_calls
are all supported for gpt-4o on azure.
2024-09-22 22:25:22 -07:00
Gabriel Altay
bb40a0fb32
Remove pydantic restricted namespaces from HuggingFaceInferenceAPIEmbedings (#26744)
without this `model_config` importing this package produces warnings
about "model_name" having conflicts with protected namespace "model_".

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-22 08:05:37 -04:00
Gor Hayrapetyan
f97ac92f00
community[patch]: Handle empty PR body in get_pull_request in Github utility (#26739)
**Description:**
When PR body is empty `get_pull_request` method fails with bellow
exception.


**Issue:**
```
TypeError('expected string or buffer')Traceback (most recent call last):


  File ".../.venv/lib/python3.9/site-packages/langchain_core/tools/base.py", line 661, in run
    response = context.run(self._run, *tool_args, **tool_kwargs)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/tools/github/tool.py", line 52, in _run
    return self.api_wrapper.run(self.mode, query)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 816, in run
    return json.dumps(self.get_pull_request(int(query)))


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 495, in get_pull_request
    add_to_dict(response_dict, "body", pull.body)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 487, in add_to_dict
    tokens = get_tokens(value)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 483, in get_tokens
    return len(tiktoken.get_encoding("cl100k_base").encode(text))


  File "....venv/lib/python3.9/site-packages/tiktoken/core.py", line 116, in encode
    if match := _special_token_regex(disallowed_special).search(text):


TypeError: expected string or buffer
```

**Twitter:**  __gorros__
2024-09-22 01:56:24 +00:00
Erick Friis
238a31bbd9
core: release 0.3.5 (#26737) 2024-09-21 00:26:39 +00:00
William FH
55af6fbd02
[LangChainTracer] Omit Chunk (#26602)
in events / new llm token
2024-09-20 17:10:34 -07:00
Anton Dubovik
3e2cb4e8a4
openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767)
Chunking of the input array controlled by `self.chunk_size` is being
ignored when `self.check_embedding_ctx_length` is disabled. Effectively,
the chunk size is assumed to be equal 1 in such a case. This is
suprising.

The PR takes into account `self.chunk_size` passed by the user.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:58:45 -07:00
William FH
864020e592
[Tracer] add project name to run from tracer (#26736) 2024-09-20 16:48:37 -07:00
Nithish Raghunandanan
2d21274bf6
couchbase: Add ttl support to caches & chat_message_history (#26214)
**Description:** Add support to delete documents automatically from the
caches & chat message history by adding a new optional parameter, `ttl`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:44:29 +00:00
Krishna Kulkarni
c6c508ee96
Refining Skip Count Calculation by Filtering Documents with session_id (#26020)
In the previous implementation, `skip_count` was counting all the
documents in the collection. Instead, we want to filter the documents by
`session_id` and calculate `skip_count` by subtracting `history_size`
from the filtered count.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 23:40:56 +00:00
Tibor Reiss
a8b24135a2
fix[experimental]: Fix text splitter with gradient (#26629)
Fixes #26221

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:35:50 +00:00
Alejandro Rodríguez
4ac9a6f52c
core: fix "template" not allowed as prompt param (#26060)
- **Description:**  fix "template" not allowed as prompt param
- **Issue:** #26058
- **Dependencies:** none


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:33:06 +00:00
Christophe Bornet
58f339a67c
community: Fix links in GraphVectorStore pydoc (#25959)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:17:53 +00:00
Christophe Bornet
e49c413977
core: Add docstring for GraphVectorStoreRetriever (#26224)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-20 23:16:37 +00:00
Lucain
a2023a1e96
huggingface; fix huggingface_endpoint.py (initialize clients only with supported kwargs) (#26378)
## Description

By default, `HuggingFaceEndpoint` instantiates both the
`InferenceClient` and the `AsyncInferenceClient` with the
`"server_kwargs"` passed as input. This is an issue as both clients
might not support exactly the same kwargs. This has been highlighted in
https://github.com/huggingface/huggingface_hub/issues/2522 by
@morgandiverrez with the `trust_env` parameter. In order to make
`langchain` integration future-proof, I do think it's wiser to forward
only the supported parameters to each client. Parameters that are not
supported are simply ignored with a warning to the user. From a
`huggingface_hub` maintenance perspective, this allows us much more
flexibility as we are not constrained to support the exact same kwargs
in both clients.

## Issue

https://github.com/huggingface/huggingface_hub/issues/2522

## Dependencies

None

## Twitter 

https://x.com/Wauplin

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:05:24 -07:00
ccurme
f2285376a5
community[patch]: add web loader tests (#26728) 2024-09-20 18:29:54 -04:00
Erick Friis
4a2745064a
core: release 0.3.4 (#26729) 2024-09-20 14:47:15 -07:00
Nuno Campos
345edeb1f0
core: In astream_events propagate cancellation reason to inner task (#26727)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-20 14:42:10 -07:00
Erick Friis
465e43cd43
core: release 0.3.3 (#26713) 2024-09-20 13:54:19 -07:00
Eugene Yurtsev
4fc69d61ad
core[patch]: Fix defusedxml import (#26718)
Fix defusedxml import. Haven't investigated what's actually going on
under the hood -- defusedxml probably does some weird things in __init__
2024-09-20 16:53:24 -04:00
Eugene Yurtsev
79b224f6f3
core/langchain: fix version used in deprecation (#26724)
in core deprecation should be version 0.3.3 instead of 0.3.4
in langchain deprecation should be version 0.3.1 instead of 0.3.4
2024-09-20 16:47:18 -04:00
Eugene Yurtsev
91f4711e53
core[patch],langchain[patch]: deprecate memory and entity abstractions and implementations (#26717)
This PR deprecates the old memory, entity abstractions and implementations
2024-09-20 15:06:25 -04:00
William FH
19ce95d3c9
Avoid copying runs (#26689)
Also, re-unify run trees. Use a single shared client.
2024-09-20 10:57:41 -07:00
Eric
90031b1b3e
support epsilla cloud vector database in langchain (#26065)
Description

- support epsilla cloud in langchain

---------

Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 17:14:23 +00:00
ZhangShenao
baef7639fd
Improvement[text-splitter] Fix import of ExperimentalMarkdownSyntaxTextSplitter (#26703)
#26028 

Export `ExperimentalMarkdownSyntaxTextSplitter` in __init__

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 17:04:31 +00:00
stein1988
91594928c5
fix:fix ChatZhipuAI tool call bug (#26693)
- [ ] **PR title**: "community:fix ChatZhipuAI tool call bug"

- [ ] **Description:** ZhipuAI api response as follows:
{'id': '20240920132549e379a9152a6a4d7c', 'created': 1726809949, 'model':
'glm-4-flash', 'choices': [{'index': 0, 'finish_reason': 'tool_calls',
'delta': {'role': 'assistant', 'tool_calls': [{'id':
'call_20240920132549e379a9152a6a4d7c', 'index': 0, 'type': 'function',
'function': {'name': 'get_datetime_offline', 'arguments': '{}'}}]}}]}
so, tool_calls = dct.get("tool_call", None) in
_convert_delta_to_message_chunk should be "tool_calls"
2024-09-20 13:06:42 +00:00
Bagatur
f7bb3640f1
core[patch]: support js chat model namespaces (#26688) 2024-09-19 16:14:20 -07:00
Bagatur
c453b76579
core[patch]: Release 0.3.2 (#26686) 2024-09-19 14:58:45 -07:00
Piyush Jain
f087ab43fd
core[patch]: Fix load of ChatBedrock (#26679)
Complementary PR to master for #26643.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-19 21:57:20 +00:00
Bagatur
409f35363b
core[patch]: support load from path for default namespaces (#26675) 2024-09-19 14:47:27 -07:00
ccurme
eef18dec44
unstructured[patch]: support loading URLs (#26670)
`unstructured.partition.auto.partition` supports a `url` kwarg, but
`url` in `UnstructuredLoader.__init__` is reserved for the server URL.
Here we add a `web_url` kwarg that is passed to the partition kwargs:
```python
self.unstructured_kwargs["url"] = web_url
```
2024-09-19 11:40:25 -07:00
Erick Friis
311f861547
core, community: move graph vectorstores to community (#26678)
remove beta namespace from core, add to community
2024-09-19 11:38:14 -07:00
Serena Ruan
c77c28e631
[community] Fix WorkspaceClient error with pydantic validation (#26649)
Thank you for contributing to LangChain!

Fix error like
<img width="1167" alt="image"
src="https://github.com/user-attachments/assets/2e219b26-ec7e-48ef-8111-e0ff2f5ac4c0">

After the fix:
<img width="584" alt="image"
src="https://github.com/user-attachments/assets/48f36fe7-628c-48b6-81b2-7fe741e4ca85">


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: serena-ruan <serena.rxy@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 18:25:33 +00:00
ccurme
7d49ee9741
unstructured[patch]: add to integration tests (#26666)
- Add to tests on parsed content;
- Add tests for async + lazy loading;
- Add a test for `strategy="hi_res"`.
2024-09-19 13:43:34 -04:00
ccurme
f91bdd12d2
community[patch]: add to pypdf tests and run in CI (#26663) 2024-09-19 14:45:49 +00:00
Rajendra Kadam
60dc19da30
[community] Added PebbloTextLoader for loading text data in PebbloSafeLoader (#26582)
- **Description:** Added PebbloTextLoader for loading text in
PebbloSafeLoader.
- Since PebbloSafeLoader wraps document loaders, this new loader enables
direct loading of text into Documents using PebbloSafeLoader.
- **Issue:** NA
- **Dependencies:** NA
- [x] **Tests**: Added/Updated tests
2024-09-19 09:59:04 -04:00
Jorge Piedrahita Ortiz
55b641b761
community: fix error in sambastudio embeddings (#26260)
fix error in samba studio embeddings  result unpacking
2024-09-19 09:57:04 -04:00
Jorge Piedrahita Ortiz
37b72023fe
community: remove sambaverse (#26265)
removing Sambaverse llm model and references given is not available
after Sep/10/2024

<img width="1781" alt="image"
src="https://github.com/user-attachments/assets/4dcdb5f7-5264-4a03-b8e5-95c88304e059">
2024-09-19 09:56:30 -04:00
Martin Triska
3fc0ea510e
community : [bugfix] Use document ids as keys in AzureSearch vectorstore (#25486)
# Description
[Vector store base
class](4cdaca67dc/libs/core/langchain_core/vectorstores/base.py (L65))
currently expects `ids` to be passed in and that is what it passes along
to the AzureSearch vector store when attempting to `add_texts()`.
However AzureSearch expects `keys` to be passed in. When they are not
present, AzureSearch `add_embeddings()` makes up new uuids. This is a
problem when trying to run indexing. [Indexing code
expects](b297af5482/libs/core/langchain_core/indexing/api.py (L371))
the documents to be uploaded using provided ids. Currently AzureSearch
ignores `ids` passed from `indexing` and makes up new ones. Later when
`indexer` attempts to delete removed file, it uses the `id` it had
stored when uploading the document, however it was uploaded under
different `id`.

**Twitter handle: @martintriska1**
2024-09-19 09:37:18 -04:00
Tomaz Bratanic
a8561bc303
Fix async parsing for llm graph transformer (#26650) 2024-09-19 09:15:33 -04:00
Erik
4e0a6ebe7d
community: Add warning when page_content is empty (#25955)
Page content sometimes is empty when PyMuPDF can not find text on pages.
For example, this can happen when the text of the PDF is not copyable
"by hand". Then an OCR solution is need - which is not integrated here.

This warning should accurately warn the user that some pages are lost
during this process.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 05:22:09 +00:00
Christophe Bornet
fd21ffe293
core: Add N(naming) ruff rules (#25362)
Public classes/functions are not renamed and rule is ignored for them.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 05:09:39 +00:00
Daniel Cooke
7835c0651f
langchain_chroma: Pass through kwargs to Chroma collection.delete (#25970)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:21:24 +00:00
Tibor Reiss
85caaa773f
docs[community]: Fix raw string in docstring (#26350)
Fixes #26212: replaced the raw string with backslashes. Alternative:
raw-stringif the full docstring.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-19 04:18:56 +00:00
Erick Friis
8fb643a6e8
partners/box: release 0.2.1 (#26644) 2024-09-19 04:02:06 +00:00
Tomaz Bratanic
03b9aca55d
community: Retry retriable errors in Neo4j (#26211)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:01:07 +00:00
Scott Hurrey
acbb4e4701
box: Add searchoptions for BoxRetriever, documentation for BoxRetriever as agent tool (#26181)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


Added search options for BoxRetriever and added documentation to
demonstrate how to use BoxRetriever as an agent tool - @BoxPlatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-18 21:00:06 -07:00
Erick Friis
9909354cd0
core: use ruff.target-version instead (#26634)
tested on one of the replacement cases and seems to work! 
![ScreenShot 2024-09-18 at 02 02
43PM](https://github.com/user-attachments/assets/7170975a-2542-43ed-a203-d4126c6a2c81)
2024-09-18 21:06:14 +00:00
Erick Friis
84b831356c
core: remove [project] tag from pyproject (#26633)
makes core incompatible with uv installs
2024-09-18 20:39:49 +00:00
Christophe Bornet
a47b332841
core: Put Python version as a project requirement so it is considered by ruff (#26608)
Ruff doesn't know about the python version in
`[tool.poetry.dependencies]`. It can get it from
`project.requires-python`.

Notes:
* poetry seems to have issues getting the python constraints from
`requires-python` and using `python` in per dependency constraints. So I
had to duplicate the info. I will open an issue on poetry.
* `inspect.isclass()` doesn't work correctly with `GenericAlias`
(`list[...]`, `dict[..., ...]`) on Python <3.11 so I added some `not
isinstance(type, GenericAlias)` checks:

Python 3.11
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
False
```

Python 3.9
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
True
```

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-18 14:37:57 +00:00
ZhangShenao
c3b3f46cb8
Improvement[Community] Improve api doc of BeautifulSoupTransformer (#26423)
- Add missing args

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:00:07 +00:00
ogawa
e2245fac82
community[patch]: o1-preview and o1-mini costs (#26411)
updated OpenAI cost definitions according to the following:
https://openai.com/api/pricing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:59:46 +00:00
ZhangShenao
1a8e9023de
Improvement[Community] Improve streamlit_callback_handler (#26373)
- add decorator for static methods

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:54:37 +00:00
Bagatur
1a62f9850f
anthropic[patch]: Release 0.2.1 (#26592) 2024-09-17 14:44:21 -07:00
Bagatur
5ced41bf50
anthropic[patch]: fix tool call and tool res image_url handling (#26587)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-17 14:30:07 -07:00
Christophe Bornet
c6bdd6f482
community: Fix references in link extractors docstrings (#26314)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:26:25 +00:00
Christophe Bornet
3a99467ccb
core[patch]: Add ruff rule UP006(use PEP585 annotations) (#26574)
* Added rules `UPD006` now that Pydantic is v2+

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-17 21:22:50 +00:00
wlleiiwang
2ef4c9466f
community: modify document links for tencent vectordb (#26316)
- modify document links for create a tencent vectordb database instance.

Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:11:10 +00:00
Erick Friis
194adc485c
docs: pypi readme image links (#26590) 2024-09-17 20:41:34 +00:00
Bagatur
97b05d70e6
docs: anthropic api ref nit (#26591) 2024-09-17 20:39:53 +00:00
Bagatur
e1d113ea84
core,openai,grow,fw[patch]: deprecate bind_functions, update chat mod… (#26584)
…el api ref
2024-09-17 11:32:39 -07:00
ccurme
7c05f71e0f
milvus[patch]: fix vectorstore integration tests (#26583)
Resolves https://github.com/langchain-ai/langchain/issues/26564
2024-09-17 14:17:05 -04:00
Bagatur
145a49cca2
core[patch]: Release 0.3.1 (#26581) 2024-09-17 17:34:09 +00:00
Nuno Campos
5fc44989bf
core[patch]: Fix "argument of type 'NoneType' is not iterable" error in LangChainTracer (#26576)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 10:29:46 -07:00
Isaac Francisco
06cde06a20
core[minor]: remove beta from RemoveMessage (#26579) 2024-09-17 17:09:58 +00:00
RUO
0a177ec2cc
community: Enhance MongoDBLoader with flexible metadata and optimized field extraction (#23376)
### Description:
This pull request significantly enhances the MongodbLoader class in the
LangChain community package by adding robust metadata customization and
improved field extraction capabilities. The updated class now allows
users to specify additional metadata fields through the metadata_names
parameter, enabling the extraction of both top-level and deeply nested
document attributes as metadata. This flexibility is crucial for users
who need to include detailed contextual information without altering the
database schema.

Moreover, the include_db_collection_in_metadata flag offers optional
inclusion of database and collection names in the metadata, allowing for
even greater customization depending on the user's needs.

The loader's field extraction logic has been refined to handle missing
or nested fields more gracefully. It now employs a safe access mechanism
that avoids the KeyError previously encountered when a specified nested
field was absent in a document. This update ensures that the loader can
handle diverse and complex data structures without failure, making it
more resilient and user-friendly.

### Issue:
This pull request addresses a critical issue where the MongodbLoader
class in the LangChain community package could throw a KeyError when
attempting to access nested fields that may not exist in some documents.
The previous implementation did not handle the absence of specified
nested fields gracefully, leading to runtime errors and interruptions in
data processing workflows.

This enhancement ensures robust error handling by safely accessing
nested document fields, using default values for missing data, thus
preventing KeyError and ensuring smoother operation across various data
structures in MongoDB. This improvement is crucial for users working
with diverse and complex data sets, ensuring the loader can adapt to
documents with varying structures without failing.

### Dependencies: 
Requires motor for asynchronous MongoDB interaction.

### Twitter handle: 
N/A

### Add tests and docs
Tests: Unit tests have been added to verify that the metadata inclusion
toggle works as expected and that the field extraction correctly handles
nested fields.
Docs: An example notebook demonstrating the use of the enhanced
MongodbLoader is included in the docs/docs/integrations directory. This
notebook includes setup instructions, example usage, and outputs.
(Here is the notebook link : [colab
link](https://colab.research.google.com/drive/1tp7nyUnzZa3dxEFF4Kc3KS7ACuNF6jzH?usp=sharing))
Lint and test
Before submitting, I ran make format, make lint, and make test as per
the contribution guidelines. All tests pass, and the code style adheres
to the LangChain standards.

```python
import unittest
from unittest.mock import patch, MagicMock
import asyncio
from langchain_community.document_loaders.mongodb import MongodbLoader

class TestMongodbLoader(unittest.TestCase):
    def setUp(self):
        """Setup the MongodbLoader test environment by mocking the motor client 
        and database collection interactions."""
        # Mocking the AsyncIOMotorClient
        self.mock_client = MagicMock()
        self.mock_db = MagicMock()
        self.mock_collection = MagicMock()

        self.mock_client.get_database.return_value = self.mock_db
        self.mock_db.get_collection.return_value = self.mock_collection

        # Initialize the MongodbLoader with test data
        self.loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )

    @patch('langchain_community.document_loaders.mongodb.AsyncIOMotorClient', return_value=MagicMock())
    def test_constructor(self, mock_motor_client):
        """Test if the constructor properly initializes with the correct database and collection names."""
        loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )
        self.assertEqual(loader.db_name, "testdb")
        self.assertEqual(loader.collection_name, "testcol")

    def test_aload(self):
        """Test the aload method to ensure it correctly queries and processes documents."""
        # Setup mock data and responses for the database operations
        self.mock_collection.count_documents.return_value = asyncio.Future()
        self.mock_collection.count_documents.return_value.set_result(1)
        self.mock_collection.find.return_value = [
            {"_id": "1", "content": "Test document content"}
        ]

        # Run the aload method and check responses
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(self.loader.aload())
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0].page_content, "Test document content")

    def test_construct_projection(self):
        """Verify that the projection dictionary is constructed correctly based on field names."""
        self.loader.field_names = ['content', 'author']
        self.loader.metadata_names = ['timestamp']
        expected_projection = {'content': 1, 'author': 1, 'timestamp': 1}
        projection = self.loader._construct_projection()
        self.assertEqual(projection, expected_projection)

if __name__ == '__main__':
    unittest.main()
```


### Additional Example for Documentation
Sample Data:

```json
[
    {
        "_id": "1",
        "title": "Artificial Intelligence in Medicine",
        "content": "AI is transforming the medical industry by providing personalized medicine solutions.",
        "author": {
            "name": "John Doe",
            "email": "john.doe@example.com"
        },
        "tags": ["AI", "Healthcare", "Innovation"]
    },
    {
        "_id": "2",
        "title": "Data Science in Sports",
        "content": "Data science provides insights into player performance and strategic planning in sports.",
        "author": {
            "name": "Jane Smith",
            "email": "jane.smith@example.com"
        },
        "tags": ["Data Science", "Sports", "Analytics"]
    }
]
```
Example Code:

```python
loader = MongodbLoader(
    connection_string="mongodb://localhost:27017",
    db_name="example_db",
    collection_name="articles",
    filter_criteria={"tags": "AI"},
    field_names=["title", "content"],
    metadata_names=["author.name", "author.email"],
    include_db_collection_in_metadata=True
)

documents = loader.load()

for doc in documents:
    print("Page Content:", doc.page_content)
    print("Metadata:", doc.metadata)
```
Expected Output:

```
Page Content: Artificial Intelligence in Medicine AI is transforming the medical industry by providing personalized medicine solutions.
Metadata: {'author_name': 'John Doe', 'author_email': 'john.doe@example.com', 'database': 'example_db', 'collection': 'articles'}
```

Thank you.

---

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-17 10:23:17 -04:00
Bagatur
d8952b8e8c
langchain[patch]: infer mistral provider in init_chat_model (#26557) 2024-09-17 00:35:54 +00:00
Bagatur
99abd254fb
docs: clean up init_chat_model (#26551) 2024-09-16 22:08:22 +00:00
Tomaz Bratanic
3bcd641bc1
Add check for prompt based approach in llm graph transformer (#26519) 2024-09-16 15:01:09 -07:00
Eugene Yurtsev
c2588b334f
unstructured: release 0.1.4 (#26540)
Release to work with langchain 0.3
2024-09-16 17:38:38 +00:00
Eugene Yurtsev
8b985a42e9
milvus: 0.1.6 release (#26538)
Release to work with langchain 0.3
2024-09-16 13:33:09 -04:00
Eugene Yurtsev
5b4206acd8
box: 0.2.0 release (#26539)
Release to work with langchain 0.3
2024-09-16 13:32:59 -04:00
ccurme
0592c29e9b
qdrant[patch]: release 0.1.4 (#26534)
`langchain-qdrant` imports pydantic but was importing pydantic proper
before 0.3 release:
042e84170b/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py (L5-L8)
2024-09-16 13:04:12 -04:00
Eugene Yurtsev
88891477eb
langchain-cli: release 0.0.31 (#26533)
langchain-cli 0.0.31 release
2024-09-16 12:57:24 -04:00
ccurme
88bc15d69b
standard-tests[patch]: add async test for structured output (#26527) 2024-09-16 11:15:23 -04:00
Erick Friis
1ab181f514
voyageai: release 0.1.2 (#26512) 2024-09-16 03:11:15 +00:00
Erick Friis
ee4e11379f
nomic: release 0.1.3, core 0.3 compat but not required (#26511) 2024-09-15 20:10:25 -07:00