Commit Graph

4616 Commits (9b3a025f9c806a6f8a00030c7058c689536ae5a0)

Author SHA1 Message Date
Eugene Yurtsev ded53297e0
core[patch]: Add unit test for RunnableGenerator for eventstream v2 (#21990)
No unit tests with runnable generator
4 months ago
Nuno Campos fb6108c8f5
core[patch]: In astream_events(version=v2) tap output of root run (#21977)
- if tap_output_iter/aiter is called multiple times for the same run
issue events only once
- if chat model run is tapped don't issue duplicate on_llm_new_token
events
- if first chunk arrives after run has ended do not emit it as a stream
event

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Bagatur 72d4a8eeed
community[patch]: AzureSearch dont overwrite default async (#21989) 4 months ago
ccurme a983465694
docs: set default anthropic model (#21988)
`ChatAnthropic()` raises ValidationError.
4 months ago
ccurme 4be5537837
Revert "anthropic: set default model" (#21987)
Reverts langchain-ai/langchain#21986
4 months ago
ccurme 35439cf3bd
anthropic: set default model (#21986)
Various docs reference `ChatAnthropic()`, but this currently raises
ValidationError.
4 months ago
ccurme 0923136851
langchain: default to Runnable in MultiQueryRetriever (#21770)
- `llm_chain` becomes `Union[LLMChain, Runnable]`
- `.from_llm` creates a runnable

tested by verifying that docs/how_to/MultiQueryRetriever.ipynb runs
unchanged with sync/async invoke (and that it runs if we specifically
instantiate with LLMChain).
4 months ago
Yulong Wang 8e1aeb8ad5
community[patch]: Fix typo in arxiv tool's doc (#21970)
Fix typo in arxiv tool's doc
4 months ago
Robert Caulk 54adcd9e82
community[minor]: add AskNews retriever and AskNews tool (#21581)
We add a tool and retriever for the [AskNews](https://asknews.app)
platform with example notebooks.

The retriever can be invoked with:

```py
from langchain_community.retrievers import AskNewsRetriever

retriever = AskNewsRetriever(k=3)

retriever.invoke("impact of fed policy on the tech sector")
```

To retrieve 3 documents in then news related to fed policy impacts on
the tech sector. The included notebook also includes deeper details
about controlling filters such as category and time, as well as
including the retriever in a chain.

The tool is quite interesting, as it allows the agent to decide how to
obtain the news by forming a query and deciding how far back in time to
look for the news:

```py
from langchain_community.tools.asknews import AskNewsSearch
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI

tool = AskNewsSearch()

instructions = """You are an assistant."""
base_prompt = hub.pull("langchain-ai/openai-functions-template")
prompt = base_prompt.partial(instructions=instructions)
llm = ChatOpenAI(temperature=0)
asknews_tool = AskNewsSearch()
tools = [asknews_tool]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)

agent_executor.invoke({"input": "How is the tech sector being affected by fed policy?"})
```

---------

Co-authored-by: Emre <e@emre.pm>
4 months ago
Jesse S fc79b372cb
community[minor]: add aerospike vectorstore integration (#21735)
Please let me know if you see any possible areas of improvement. I would
very much appreciate your constructive criticism if time allows.

**Description:**
- Added a aerospike vector store integration that utilizes
[Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/)
add-on.
- Added both unit tests and integration tests
- Added a docker compose file for spinning up a test environment
- Added a notebook

 **Dependencies:** any dependencies required for this change
- aerospike-vector-search

 **Twitter handle:** 
- No twitter, you can use my GitHub handle or LinkedIn if you'd like

Thanks!

---------

Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Prince Canuma 3587c60396
community[patch]: Fix MLX LLM Stream (#20575)
Closes #20561

This PR fixes MLX LLM stream `AttributeError`. 

Recently, `mlx-lm` changed the token decoding logic, which affected the
LC+MLX integration.

Additionally, I made minor fixes such as: docs example broken link and
enforcing pipeline arguments (max_tokens, temp and etc) for invoke.
   
- **Issue:** #20561
    
- **Twitter handle:** @Prince_Canuma
4 months ago
Rahul Triptahi 96bd0b0844
community[patch]: Remove redundant pebblo cloud api call (#21589)
Description: removed redundant pebblo cloud api call. Changed classified
`doc` key to `ai_apps_data`.
Documentation: N/A
Unit tests: N/A
4 months ago
Param Singh d07885f8b7
community[patch]: standardized sparkllm init args (#21633)
Related to #20085 
@baskaryan 

Thank you for contributing to LangChain!

community:sparkllm[patch]: standardized init args

updated `spark_api_key` so that aliased to `api_key`. Added integration
test for `sparkllm` to test that it continues to set the same underlying
attribute.

updated temperature with Pydantic Field, added to the integration test.

Ran `make format`,`make test`, `make lint`, `make spell_check`
4 months ago
Dhruv Chawla d4359d3de6
community[patch]: Update UpTrain Callback Handler to support the new UpTrain evaluation schema (#21656)
UpTrain has a new dashboard now that makes it easier to view projects
and evaluations. Using this requires specifying both project_name and
evaluation_name when performing evaluations. I have updated the code to
support it.
4 months ago
Alex Riina c0e3c3a350
openai[patch], community[patch]: add pricing and max context window for GPT-4o (#21673)
# Add pricing and max context window for GPT-4o
- community: add cost per 1k tokens and max context window
- partners: add max context window

**Description:** adds static information about GPT-4o based on
https://openai.com/api/pricing/ and
https://platform.openai.com/docs/models/gpt-4o so that GPT-4o reporting
is accurate.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
缨缨 bd39b2ccdf
community: enable SupabaseVectorStore to support extended table fields (#21762)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: enable SupabaseVectorStore to support
extended table fields"

- [x] **PR message**: 
- Added extension fields to the function _add_vectors so that users can
add other custom fields when insert a record into the database. eg:
    

![image](https://github.com/langchain-ai/langchain/assets/10885578/e1d5ca20-936e-4cab-ba69-8fdd23b8ce8f)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Leonid Ganeline e98a4fd19a
ai21[patch]: configuration fix (#21790)
added "repository" and "Source Code" parameters (these parameters are
missed only in this partner package configuration).
4 months ago
Trayan Azarov f54cbf8ff5
chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817)
**Description**:

- Reference to `Collection` object is set to `None` when deleting a
collection `delete_collection()`
- Added utility method `reset_collection()` to allow recreating the
collection
- Moved collection creation out of `__init__` into
`__ensure_collection()` to be reused by object init and
`reset_collection()`
- `_collection` is now a property to avoid breaking changes

**Issues**: 

- chroma-core/chroma#2213

**Twitter**: @t_azarov
4 months ago
Jens b0b302ec6b
community[patch]: fixed aleph alpha default emedding request (#21826)
- **Description:** In the aleph alpha client the paramater `normalize`
is *not* optional. Setting this to `None` gives an error.
- **Dependencies:** None

Co-authored-by: Jens Lücke <jens.luecke@tngtech.com>
Co-authored-by: Jens <jens.luecke@hu-berlin.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Jorge Piedrahita Ortiz e6207ad4f3
community[patch]: Sambanova integration api update (#21848)
- **Description:**:
        SambaStudio generic endpoint compatibility added
        Improved error description, and handling
        streaming examples added
4 months ago
Michael Reed 7a5e1bcf99
core[patch]: Fix NPE in function_calling._get_python_function_required_args (#21863)
Example error message:
line 206, in _get_python_function_required_args
    if is_function_type and required[0] == "self":
                            ~~~~~~~~^^^
IndexError: list index out of range

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Liuww 332ffed393
community[patch]: Adopting the lighter-weight xinference_client (#21900)
While integrating the xinference_embedding, we observed that the
downloaded dependency package is quite substantial in size. With a focus
on resource optimization and efficiency, if the project requirements are
limited to its vector processing capabilities, we recommend migrating to
the xinference_client package. This package is more streamlined,
significantly reducing the storage space requirements of the project and
maintaining a feature focus, making it particularly suitable for
scenarios that demand lightweight integration. Such an approach not only
boosts deployment efficiency but also enhances the application's
maintainability, rendering it an optimal choice for our current context.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Tomaz Bratanic a43515ca65
experimental[patch]: Pass enum only to openai in llm graph transformer (#21860)
Some models like Groq return bad request if you pass in `enum` parameter
in tool definition
4 months ago
Jiří Spilka 6499897c87
community[patch]: update apify integration to attribute API activity to langchain (#21909)
**Description:** Add `Origin/langchain` to Apify's client's user-agent
to attribute API activity to LangChain (at Apify, we aim to monitor our
integrations to evaluate whether we should invest more in the LangChain
integration regarding functionality and content)

**Issue:** None
**Dependencies:** None
**Twitter handle:** None
4 months ago
Jared Van Bortel 25d1c1c9bb
nomic: implement local embeddings with the inference_mode parameter (#21934)
## Description

This PR implements local and dynamic mode in the Nomic Embed integration
using the inference_mode and device parameters. They work as documented
[here](https://docs.nomic.ai/reference/python-api/embeddings#local-inference).

<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, hwchase17. -->

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
4 months ago
ccurme 0e72ed39a0
infra: fix CI on text-splitters (#21935) 4 months ago
ccurme 4470d3b4a0
partners: bump core in packages implementing ls_params (#21868)
These packages all import `LangSmithParams` which was released in
langchain-core==0.2.0.

N.B. we will need to release `openai` and then bump `langchain-openai`
in `together` and `upstage`.
4 months ago
ccurme 9c76739425
mistral: implement ls_params (#21867) 4 months ago
Tomaz Bratanic d85e46321a
community[patch]: Better error message for neo4j vector when text is null (#21861) 4 months ago
Stefano Lottini f2e75f9500
cli[minor]: fix import path for two Astra DB classes in the migration json data (#21926)
This PR fixes two mistakes in the import paths from community for the
json data aiding the cli migration to 0.2.

It is intended as a quick follow-up to
https://github.com/langchain-ai/langchain/pull/21913 .

@nicoloboschi FYI
4 months ago
WilliamEspegren 30bca57aae
doc list not empty (#21208)
Make sure the doc list is not empty, and set Metadata: true in param, to
enable the user to disable metadata for slightly faster crawls.
4 months ago
David Charles 8da35fba7f
langchain[minor]: add libs/partners to dev.Dockerfile (#21902)
Resolves #21886 by adding "COPY libs/partners ../partners/" to
libs/dev.Dockerfile

Twitter: @kabakongo
4 months ago
TJ 8cd6ed3e1e
community[patch]: Update documentation string in databricks chat model (#21915)
Update typos in documentation string in databricks chat model
4 months ago
Nicolò Boschi dd00aac7ad
cli[minor]: add astradb in the cli migration to 0.2 (#21913)
astradb has a new partner package but the automatic migration cli tool
doesn't take care of migration astradb integrations
4 months ago
Coozywana b6c8b6f944
Fix base.py typo (#21862)
ChatOpenaAI --> ChatOpenAI

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
fzowl d3624eaba1
partners: Remove unnecessary print from voyageai embeddings (#21865)
Thank you for contributing to LangChain!

Remove unnecessary print from voyageai embeddings

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
Bagatur 8b3c5f93f5
docs: lcel how to and cheatsheet (#21851) 4 months ago
Nuno Campos b1e7b40b6a
core: Tap output of sync iterators for astream_events (#21842)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
Eugene Yurtsev 67b6f6c82a
core[patch]: Check if event loop is closed in memory stream (#21841)
Check if event stream is closed in memory loop.

Using try/except here to avoid race condition, but this may incur a
small overhead in versions prios to 3.11
4 months ago
Erick Friis 2d3f4e1a16
experimental: release 0.0.59 (#21835) 4 months ago
Erick Friis 169f525cfb
community: release 0.2.0 (#21834) 4 months ago
Erick Friis e5046cbd72
langchain: release 0.2.0, fix min deps (#21833) 4 months ago
Erick Friis 1b555021f7
text-splitters: release 0.2.0 (#21832) 4 months ago
Erick Friis 0ad8de5eb7
langchain: release 0.2.0 (#21831) 4 months ago
Erick Friis 23310626b3
core: release 0.2.0 (#21829) 4 months ago
Eugene Yurtsev e3f30b4cde
docs: clean up link to bing search (#21825)
Documentation should be inlined, not linking to medium article.
4 months ago
ccurme 181dfef118
core, standard tests, partner packages: add test for model params (#21677)
1. Adds `.get_ls_params` to BaseChatModel which returns
```python
class LangSmithParams(TypedDict, total=False):
    ls_provider: str
    ls_model_name: str
    ls_model_type: Literal["chat"]
    ls_temperature: Optional[float]
    ls_max_tokens: Optional[int]
    ls_stop: Optional[List[str]]
```
by default it will only return
```python
{ls_model_type="chat", ls_stop=stop}
```

2. Add these params to inheritable metadata in
`CallbackManager.configure`

3. Implement `.get_ls_params` and populate all params for Anthropic +
all subclasses of BaseChatOpenAI

Sample trace:
https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r

**OpenAI**:
<img width="984" alt="Screenshot 2024-05-17 at 10 03 35 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/2ef41f74-a9df-4e0e-905d-da74fa82a910">

**Anthropic**:
<img width="978" alt="Screenshot 2024-05-17 at 10 06 07 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/39701c9f-7da5-4f1a-ab14-84e9169d63e7">

**Mistral** (and all others for which params are not yet populated):
<img width="977" alt="Screenshot 2024-05-17 at 10 08 43 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/37d7d894-fec2-4300-986f-49a5f0191b03">
4 months ago
Sen Lin eb7f07ae36
community[patch]: fix typo in ValueError message in load_local function (#21818)
**Description:**
Corrected an error in the `allow_dangerous_deserialization` message
within the `load_local` functions
4 months ago
Jorge Piedrahita Ortiz 700b1c7212
community: sambaverse api update (#21816)
- **Description:** fix sambaverse integration to make it compatible with
sambaverse API update / minor changes in docs
4 months ago
maang-h 9f8d18c028
community[patch]: Fix unintended newline in print statement in exception for BaichuanTextEmbeddings (#21820)
- **Code:** langchain_community/embeddings/baichuan.py:82
- **Description:** When I make an error using 'baichuan embeddings', the
printed error message is wrapped (there is actually no need to wrap)
```python
# example
from langchain_community.embeddings import BaichuanTextEmbeddings

# error key
BAICHUAN_API_KEY = "sk-xxxxxxxxxxxxx"
embeddings = BaichuanTextEmbeddings(baichuan_api_key=BAICHUAN_API_KEY)

text_1 = "今天天气不错"
query_result = embeddings.embed_query(text_1)
```



![unintended
newline](https://github.com/langchain-ai/langchain/assets/55082429/e1178ce8-62bb-405d-a4af-e3b28eabc158)
4 months ago
Bakar Tavadze 3b5ac44e03
langchain-robocorp[minor]: Enable passing additional headers to the action server. (#21809)
Actions can optionally receive secrets via request headers. This PR
enables this functionality.
4 months ago
Asaf Joseph Gardin f3289b898c
partners: Revert AI21 Labs docs scan feature (#21699)
Description: Reverted commit #21614

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Eugene Yurtsev 8607735b80
langchain[patch],community[patch]: Move unit tests that depend on community to community (#21685) 4 months ago
Marco Lamina d0fae6cd54
community: Add token cost for GPT-4o model (#21771)
Adding [token cost for the new GPT-4o
model](https://openai.com/api/pricing/):
* Input cost US$5.00 / 1M tokens
* Output cost US$15.00 / 1M tokens
4 months ago
Massimiliano Pronesti 0c0db7c5db
feat(community): support semantic hybrid score threshold in Azure AI Search (#21527)
Support semantic hybrid search with a score threshold -- similar to what
we do for similarity search and for hybrid search (#20907).
4 months ago
Bagatur 6416d16d39
anthropic[patch]: Release 0.1.13, tool_choice support (#21773) 4 months ago
Stefano Lottini 040597e832
community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765)
This PR improves on the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and also introduces
several minor improvements around these classes.

### Init signature

A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.

In this way, a much less unwieldy instantiation can be done, such as
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
everything else falling back to defaults.

A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. As a way to
mitigate this problem, this PR typechecks its first argument trying to
detect the legacy usage.
(And to make this point less tricky in the future, most arguments are
left to be keyword-only).

If this is considered too harsh, I'd like guidance on how to further
smoothen this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so that a
repeatable strategy would be ideal. A possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, to later remove them with 0.2 - please
advise on this point.

### Other changes

- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: uniform flow with the Cassandra and Astra DB
separate cases; better and Cassandra-first description; all imports made
explicit and from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Kyle Cassidy eca8c4bcc6
Standardized openai init params (#21739)
## Patch Summary
community:openai[patch]: standardize init args

## Details
I made changes to the OpenAI Chat API wrapper test in the Langchain
open-source repository

- **File**: `libs/community/tests/unit_tests/chat_models/test_openai.py`
- **Changes**:
  - Updated `max_retries` with Pydantic Field
  - Updated the corresponding unit test
- **Related Issues**: #20085
  - Updated max_retries with Pydantic Field, updated the unit test.

---------

Co-authored-by: JuHyung Son <sonju0427@gmail.com>
4 months ago
Ethan Yang e44b448ec3
community: update openvino doc with streaming support (#21519)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
ccurme 19e6bf814b
community: fix CI (#21766) 4 months ago
Mish Ushakov d77e60a7f4
community: updated Browserbase loader (#21757)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: updated Browserbase loader"

- [x] **PR message**:
    Updates the Browserbase loader with more options and improved docs.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
4 months ago
Eugene Yurtsev 6ed0aa3239
core[major]: only use function description (#21622)
Do not prefix function signature

---

* Reason for this is that information is already present with tool
calling models.
* This will save on tokens for those models, and makes it more obvious
what the description is!
* The @tool can get more parameters to allow a user to re-introduce the
the signature if we want
4 months ago
William FH 8498b41cda
Finish agent migration doc (#21731) 4 months ago
Cheese 0ead09f84d
community: Implement `bind_tools` for ChatTongyi (#20725)
## Description

Implement `bind_tools` in ChatTongyi. Usage example:

```py
from langchain_core.tools import tool
from langchain_community.chat_models.tongyi import ChatTongyi

@tool
def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int

llm = ChatTongyi(model="qwen-turbo")

llm_with_tools = llm.bind_tools([multiply])

msg = llm_with_tools.invoke("What's 5 times forty two")

print(msg)
```

Streaming is also supported.

## Dependencies

No Dependency is required for this change.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
Bagatur 867adbf27b
docs: add aca-ds (#21746) 4 months ago
Erick Friis 06110e20b9
pinecone: bump min core version (#21742) 4 months ago
Erick Friis bd3e7d50f3
fireworks: bump min core version (#21741) 4 months ago
Erick Friis f5c31078d7
airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738) 4 months ago
Erick Friis 3d33b89fa4
ibm[patch]: release 0.1.7 (#21737) 4 months ago
Erick Friis e41d801369
openai[patch]: fix embedding float precision issue (#21736)
also clean up + comment some of the embedding batching code
4 months ago
JuHyung Son 38c297a025
upstage: Support batch input in embedding request. (#21730)
**Description:** upstage embedding now supports batch input.
4 months ago
Harrison Chase 15be439719
Harrison/move flashrank rerank (#21448)
third party integration, should be in community
4 months ago
Erick Friis aca98fd150
multiple: releases with relaxed core dep (#21724) 4 months ago
Bagatur af284518bc
openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723) 4 months ago
William FH ca768c8353
[Core] Check is async callable (#21714)
To permit proper coercion of objects like the following:


```python
class MyAsyncCallable:
    async def __call__(self, foo):
        return await ...

class MyAsyncGenerator:
    async def __call__(self, foo):
        await ...
        yield 
```
4 months ago
Eugene Yurtsev 5c2cfabec6
core[minor]: Add v2 implementation of astream events (#21638)
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with v1 implementation.

The v2 implementation significantly reduces relevant code that's
associated with the astream events implementation together with
overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this re-write, a number of issues were discovered with
the existing implementation.

## Changes in V2 vs. V1

### on_chat_model_end `output`

The outputs associated with `on_chat_model_end` changed depending on
whether it was within a chain or not.

As a root level runnable the output was: 

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain the output was:

```
            "data": {
                "output": {
                    "generations": [
                        [
                            {
                                "generation_info": None,
                                "message": AIMessageChunk(
                                    content="hello world!", id=AnyStr()
                                ),
                                "text": "hello world!",
                                "type": "ChatGenerationChunk",
                            }
                        ]
                    ],
                    "llm_output": None,
                }
            },
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE** Non chat models (i.e., regular LLMs) are still associated with
the more verbose format.

### Remove some `_stream` events

`on_retriever_stream` and `on_tool_stream` events were removed -- these
were not real events, but created as an artifact of implementing on top
of astream_log.

The same information is already available in the `x_on_end` events.

### Propagating Names

Names of runnables have been updated to be more consistent

```python
  model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
        messages=ConfigurableField(
            id="messages",
            name="Messages",
            description="Messages return by the LLM",
        )
    )
```

Before:
```python
"name": "RunnableConfigurableFields",
```

After:
```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

on_retriever_end will always return `output` which is a list of
documents (rather than a dict containing a key called "documents")

### Retry events

Removed the `on_retry` callback handler. It was incorrectly showing that
the failed function being retried has invoked `on_chain_end`


https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
4 months ago
Rajendra Kadam 54e003268e
langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641)
- **Description:** PebbloRetrievalQA chain introduces identity
enforcement using vector-db metadata filtering
- **Dependencies:** None
- **Issue:** None
- **Documentation:** Adding documentation for PebbloRetrievalQA chain in
a separate PR(https://github.com/langchain-ai/langchain/pull/20746)
- **Unit tests:** New unit-tests added

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
4 months ago
Bagatur 241a6e43a5
docs: update structured how to (#21679) 4 months ago
Jib f369495fa0
mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608) 4 months ago
Eugene Yurtsev e69a9bedf8
core[patch]: Update mypy config (#21684)
Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)
4 months ago
Erick Friis 9973547aef
mongodb: release 0.1.4 (#21678) 4 months ago
Jib a97473c846
mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394) 4 months ago
Eugene Yurtsev 5c64c004cc
core[patch]: Add unit tests with some streaming scenarios (#21668)
Add unit tests that show differences between sync / async versions when
streaming.

The inner on_chain_chunk event is missing if mixing sync and async
functionality. Likely due to missing tap_output_iter implementation on
the sync variant of `_transform_stream_with_config`
4 months ago
Eugene Yurtsev 2ac4d2960c
core[patch]: Add unit test to catch ordering (#21669)
Add unit test to catch ordering issues
4 months ago
Zhao Blake 972d2071c6
core[patch]: Fix typo in VectorStoreExampleSelector doc-string (#21574) 4 months ago
Erick Friis 2a984e8e3f
docs: huggingface package (#21645) 4 months ago
Erick Friis c77d2f2b06
multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646)
0.2 is not a breaking release for core (but it is for langchain and
community)

To keep the core+langchain+community packages in sync at 0.2, we will
relax deps throughout the ecosystem to tolerate `langchain-core` 0.2
4 months ago
Anush edd68e4ad4
qdrant: init package (#21146)
## Description

This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.

## Changes

- Moved the Qdrant vector store implementation `/libs/partners/qdrant`
with integration tests.
- The conditional imports of the client library are now regular with
minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself.
- Updated the Qdrant vector store documentation to reflect the changes.

## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance(Not comprehensive).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
4 months ago
Prashanth Rao 63c3a0e56c
[community][graph]: Update KuzuQAChain and docs (#21218)
This PR makes some small updates for `KuzuQAChain` for graph QA.

- Updated Cypher generation prompt (we now support `WHERE EXISTS`) and
generalize it more
- Support different LLMs for Cypher generation and QA
- Update docs and examples
4 months ago
Tomaz Bratanic 89ff6a3d3b
Add sentiment and confidence levels to diffbotgraphtransformer (#21590)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Erick Friis 9b51ca08bc
huggingface: fix community dep checking (#21628) 4 months ago
Erick Friis 91a2ea5cd6
chroma, mongodb: fix docstrings (#21629) 4 months ago
Jofthomas afd85b60fc
huggingface: init package (#21097)
First Pr for the langchain_huggingface partner Package

- Moved some of the hugging face related class from `community` to the
new `partner package`

Still needed :
- Documentation
- Tests
- Support for the new apply_chat_template in `ChatHuggingFace`
- Confirm choice of class to support for embeddings witht he
sentence-transformer team.

cc : @efriis

---------

Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Tomaz Bratanic 9fce03e7db
community[patch]: Fix neo4j enhanced schema (#21582) 4 months ago
Christophe Bornet 66a4da8ad0
community[patch]: Improve Cassandra VectorStore docsctrings (#21620) 4 months ago
adreo00 40aff1eacc
core[major]: AsyncCallbackManagerForToolRun no longer casts return object to string (#20374)
- **Description:** Stops `AsyncCallbackManagerForToolRun` from
converting the output to str
- **Issue:** #20372
- **Dependencies:** None
4 months ago
Eugene Yurtsev 25fbe356b4
community[patch]: upgrade to recent version of mypy (#21616)
This PR upgrades community to a recent version of mypy. It inserts type:
ignore on all existing failures.
4 months ago
Eugene Yurtsev b923951062
langchain[patch]: CI add lint rule for community imports (#21618)
Add a rule to check for imports from community in global scope
4 months ago
Jorge Piedrahita Ortiz 4378fbbef0
community[patch]: Fix typos in Sambanova integration doc-strings (#21617)
- **Description:** Sambanova integration docstrings updated, bad
formated

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
4 months ago
Christophe Bornet bcf53f93e1
[community]: Add missing docstring param to CassandraLoader (#21611) 4 months ago
Christophe Bornet e6fa4547b1
community[minor]: Add alazy_load to AsyncHtmlLoader (#21536)
Also fixes a bug that `_scrape` was called and was doing a second HTTP
request synchronously.

**Twitter handle:** cbornet_
4 months ago
Wang Guan b53548dcda
langchain[minor]: allow CacheBackedEmbeddings to cache queries (#20073)
Add optional caching of queries to cache backed embeddings
4 months ago
Guangdong Liu a156aace2b
core[patch]:Fix Incorrect listeners parameters for Runnable.with_listeners() and .map() (#20661)
- **Issue:** fix #20509
-  @baskaryan, @eyurtsev


![image](https://github.com/langchain-ai/langchain/assets/48236177/f799a976-b983-4d8b-b373-64392e1fd6c6)
4 months ago
junkeon 480c02bf55
upstage[minor]: add merge_and_split function for document loader (#21603)
- Introduce the `merge_and_split` function in the
`UpstageLayoutAnalysisLoader`.
- The `merge_and_split` function takes a list of documents and a
splitter as inputs.
- This function merges all documents and then divides them using the
`split_documents` method, which is a proprietary function of the
splitter.
- If the provided splitter is `None` (which is the default setting), the
function will simply merge the documents without splitting them.
4 months ago
Leonid Ganeline 500569da48
community[patch]: `vectorstores` import update (#21169)
Issue: we have several helper functions to import third-party libraries
like lancedb.import_lancedb in
[community.vectorstores](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.lancedb.import_lancedb.html#langchain_community.vectorstores.lancedb.import_lancedb).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
4 months ago
ccurme 3003363605
langchain, community: remove cap on sqlalchemy and bump duckdb (#21509) 4 months ago
ccurme 01a3228d8e
standard tests: add test for few-shot examples (#21019) 4 months ago
Chuyuan Qu af875cff57
prompty: adding Microsoft langchain_prompty package (#21346)
Co-authored-by: Micky Liu <wayliu@microsoft.com>
Co-authored-by: wayliums <wayliums@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Matt Florence d3ca2cc8c3
langchain: Fix broken `OpenAIModerationChain` and implement async (#18537)
Thank you for contributing to LangChain!

## PR title
lancghain[patch]: fix `OpenAIModerationChain` and implement async

## PR message
Description: fix `OpenAIModerationChain` and implement async

Issues: 
- https://github.com/langchain-ai/langchain/issues/18533 
- https://github.com/langchain-ai/langchain/issues/13685

Dependencies: none
Twitter handle: mattflo


## Add tests and docs
 
Existing documentation is broken:
https://python.langchain.com/docs/guides/safety/moderation


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Emilia Katari <emilia@outpace.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
4 months ago
ccurme 4170e72a42
openai: fix loads unit test (#21542)
following changes to tests in core here:
https://github.com/langchain-ai/langchain/pull/21342/files
4 months ago
Erick Friis 3db85cbb5b
community: deps (#21508) 5 months ago
Erick Friis 8580e350be
cli: release 0.0.22 (#21507) 5 months ago
Anthony Chu c735849e76
azure-dynamic-sessions: add Python REPL tool (#21264)
Adds a Python REPL that executes code in a code interpreter session
using Azure Container Apps dynamic sessions.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 02701c277f
langchain: core min version (#21506) 5 months ago
Erick Friis 13b01104c9
langchain: drop sqlalchemy max, release 0.2.0rc2 (#21504) 5 months ago
ccurme 375f447e58
community: fix builds with min dependencies (#21495) 5 months ago
Trayan Azarov ba7d53689c
community: Chroma Adding create_collection_if_not_exists flag to Chroma constructor (#21420)
- **Description:** Adds the ability to either `get_or_create` or simply
`get_collection`. This is useful when dealing with read-only Chroma
instances where users are constraint to using `get_collection`. Targeted
at Http/CloudClients mostly.
- **Issue:** chroma-core/chroma#2163
- **Dependencies:** N/A
- **Twitter handle:** `@t_azarov`




| Collection Exists | create_collection_if_not_exists | Outcome | test |

|-------------------|---------------------------------|----------------------------------------------------------------|----------------------------------------------------------|
| True | False | No errors, collection state unchanged |
`test_create_collection_if_not_exist_false_existing` |
| True | True | No errors, collection state unchanged |
`test_create_collection_if_not_exist_true_existing` |
| False | False | Error, `get_collection()` fails |
`test_create_collection_if_not_exist_false_non_existing` |
| False | True | No errors, `get_or_create_collection()` creates the
collection | `test_create_collection_if_not_exist_true_non_existing` |
5 months ago
ccurme 3bb9bec314
bedrock: add unit test for retriever (#21485)
This was implemented in
https://github.com/langchain-ai/langchain/pull/21349 but dropped before
merge.
5 months ago
Renu Rozera 4035a1d234
Add source metadata to bedrock retriever response (#21349)
Thank you for contributing to LangChain!

- [X] **PR title**: "community: Add source metadata to bedrock retriever
response"

- [X] **PR message**: 
- **Description:** Bedrock retrieve API returns extra metadata in the
response which is currently not returned in the retriever response
- **Issue:** The change adds the metadata from bedrock retrieve API
response to the bedrock retriever in a backward compatible way. Renamed
metadata to sourceMetadata as metadata term is being used in the
Document already. This is in sync with what we are doing in llama-index
as well.
    - **Dependencies:** No


- [X] **Add tests and docs**:
  1. Added unit tests
  2. Notebook already exists and does not need any change
3. Response from end to end testing, just to ensure backward
compatibility: `[Document(page_content='Exoplanets.',
metadata={'location': {'s3Location': {'uri':
's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647,
'source_metadata': {'x-amz-bedrock-kb-source-uri':
's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year':
1946.0}})]`


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
5 months ago
Erick Friis f178c67ad0
community: release 0.2.0rc1, bump deps (#21470) 5 months ago
William FH b28be5d407
Pass through Run ID Explicitly (#21469) 5 months ago
Erick Friis 83eecd54fe
experimental: 0.2 relax (#21468) 5 months ago
roiperlman 9992beaff9
community: Add arguments to whisper parser (#20378)
**Description:** Added a few additional arguments to the whisper parser,
which can be consumed by the underlying API.
The prompt is especially important to fine-tune transcriptions.

---------

Co-authored-by: Roi Perlman <roi@fivesigmalabs.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Yash cb31c3611f
Ndb enterprise (#21233)
Description: Adds NeuralDBClientVectorStore to the langchain, which is
our enterprise client.

---------

Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
5 months ago
Oguz Vuruskaner 5b35f077f9
[community][fix](DeepInfraEmbeddings): Implement chunking for large batches (#21189)
**Description:**
This PR introduces chunking logic to the `DeepInfraEmbeddings` class to
handle large batch sizes without exceeding maximum batch size of the
backend. This enhancement ensures that embedding generation processes
large batches by breaking them down into smaller, manageable chunks,
each conforming to the maximum batch size limit.

**Issue:**
Fixes #21189

**Dependencies:**
No new dependencies introduced.
5 months ago
Sokolov Fedor f4ddf64faa
community: Add MarkdownifyTransformer to langchain_community.document_transformers (#21247)
- Added new document_transformer: MarkdonifyTransformer, that uses
`markdonify` package with customizable options to convert HTML to
Markdown. It's similar to Html2TextTransformer, but has more flexible
options and also I've noticed that sometimes MarkdownifyTransformer
performs better than html2text one, so that's why I use markdownify on
my project.
- Added docs and tests

- Usage:
```python
from langchain_community.document_transformers import MarkdownifyTransformer

markdownify = MarkdownifyTransformer()
docs_transform = markdownify.transform_documents(docs)
```

- Example of better performance on simple task, that I've noticed:
```
<html>
<head><title>Reports on product movement</title></head>
<body>
<p data-block-key="2wst7">The reports on product movement will be useful for forming supplier orders and controlling outcomes.</p>
</body>
```
**Html2TextTransformer**: 
```python
[Document(page_content='The reports on product movement will be useful for forming supplier orders and\ncontrolling outcomes.\n\n')]
# Here we can see 'and\ncontrolling', which has extra '\n' in it
```
**MarkdownifyTranformer**:
```python
[Document(page_content='Reports on product movement\n\nThe reports on product movement will be useful for forming supplier orders and controlling outcomes.')]
```

---------

Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.bbrouter>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.local>
Co-authored-by: Sokolov Fedor <f.sokolov@192.168.1.6>
5 months ago
Alex JW d3ce6aad2e
community: Instantiate GPT4AllEmbeddings with parameters (#21238)
### GPT4AllEmbeddings parameters
---

**Description:** 
As of right now the **Embed4All** class inside _GPT4AllEmbeddings_ is
instantiated as it's default which leaves no room to customize the
chosen model and it's behavior. Thus:

- GPT4AllEmbeddings can now be instantiated with custom parameters like
a different model that shall be used.

---------

Co-authored-by: AlexJauchWalser <alexander.jauch-walser@knime.com>
5 months ago
Philippe PRADOS 7be68228da
community[patch]: Make sql record manager fully compatible with async (#20735)
The `_amake_session()` method does not allow modifying the
`self.session_factory` with
anything other than `async_sessionmaker`. This prohibits advanced uses
of `index()`.

In a RAG architecture, it is necessary to import document chunks.
To keep track of the links between chunks and documents, we can use the
`index()` API.
This API proposes to use an SQL-type record manager.

In a classic use case, using `SQLRecordManager` and a vector database,
it is impossible
to guarantee the consistency of the import. Indeed, if a crash occurs
during the import
(problem with the network, ...)
there is an inconsistency between the SQL database and the vector
database.

With the
[PR](https://github.com/langchain-ai/langchain-postgres/pull/32) we are
proposing for `langchain-postgres`,
it is now possible to guarantee the consistency of the import of chunks
into
a vector database.  It's possible only if the outer session is built
with the connection.

```python
def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_engine(db_url, echo=True)
    embeddings = FakeEmbeddings()
    pgvector:VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )

    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
    )
    record_manager.create_schema()

    with engine.connect() as connection:
        session_maker = scoped_session(sessionmaker(bind=connection))
        # NOTE: Update session_factories
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        with connection.begin():
            loader = CSVLoader(
                    "data/faq/faq.csv",
                    source_column="source",
                    autodetect_encoding=True,
                )
            result = index(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)
```
The same thing is possible asynchronously, but a bug in
`sql_record_manager.py`
in `_amake_session()` must first be fixed.

```python
    async def _amake_session(self) -> AsyncGenerator[AsyncSession, None]:
        """Create a session and close it after use."""

        # FIXME: REMOVE if not isinstance(self.session_factory, async_sessionmaker):~~
        if not isinstance(self.engine, AsyncEngine):
            raise AssertionError("This method is not supported for sync engines.")

        async with self.session_factory() as session:
            yield session
``` 

Then, it is possible to do the same thing asynchronously:

```python
async def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_async_engine(db_url, echo=True)
    embeddings = FakeEmbeddings()
    pgvector:VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )
    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
        async_mode=True,
    )
    await record_manager.acreate_schema()

    async with engine.connect() as connection:
        session_maker = async_scoped_session(
            async_sessionmaker(bind=connection),
            scopefunc=current_task)
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        async with connection.begin():
            loader = CSVLoader(
                "data/faq/faq.csv",
                source_column="source",
                autodetect_encoding=True,
            )
            result = await aindex(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)


asyncio.run(main())
```

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Sean <sean@upstage.ai>
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: YISH <mokeyish@hotmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jason_Chen <820542443@qq.com>
Co-authored-by: Joan Fontanals <joan.fontanals.martinez@jina.ai>
Co-authored-by: Pavlo Paliychuk <pavlo.paliychuk.ca@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
Co-authored-by: samanhappy <samanhappy@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: merdan <48309329+merdan-9@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Andres Algaba <andresalgaba@gmail.com>
Co-authored-by: davidefantiniIntel <115252273+davidefantiniIntel@users.noreply.github.com>
Co-authored-by: Jingpan Xiong <71321890+klaus-xiong@users.noreply.github.com>
Co-authored-by: kaka <kaka@zbyte-inc.cloud>
Co-authored-by: jingsi <jingsi@leadincloud.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
Co-authored-by: Michael Schock <mjschock@users.noreply.github.com>
Co-authored-by: Anish Chakraborty <anish749@users.noreply.github.com>
Co-authored-by: am-kinetica <85610855+am-kinetica@users.noreply.github.com>
Co-authored-by: Dristy Srivastava <58721149+dristysrivastava@users.noreply.github.com>
Co-authored-by: Matt <matthew.gotteiner@microsoft.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
5 months ago
Andreas Motl 17e42bbd18
community[patch]: pgvector: Slight refactoring to make code a bit more reusable (#16243)
- **Description:** Improve [pgvector vector store
adapter](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py)
to make it reusable by adapters deriving from that.
  - **Issue:** NA
  - **Dependencies:** NA
  - **References:** https://github.com/crate-workbench/langchain/pull/1
  - **Addressed to:** @eyurtsev, @cbornet


Hi from the CrateDB team,

first of all, thanks a stack for conceiving and maintaining LangChain.
We are currently [preparing a
patch](https://github.com/crate-workbench/langchain/pull/1) for adding
[CrateDB](https://github.com/crate/crate) to the list of community
adapters.

Because CrateDB aims to be compatible with PostgreSQL to some degree,
the vector store subsystem in LangChain derives functionality from the
corresponding implementation for pgvector.

Therefore, in order to make the implementation more reusable, we needed
to rename the private methods `__from` and `__query_collection` to the
less private counterparts `_from` and `_query_collection`, so they can
be overwritten, in order to unlock other adapters deriving from
[pgvector](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py).

With kind regards,
Andreas.
5 months ago
Mehrdad Shokri f103927b88
bugfix(community): fix Playwright import paths. (#21395)
- **Description:** Fix import class name exporeted from
'playwright.async_api' and 'playwright.sync_api' to match the correct
name in playwright tool. Change import from inline guard_import to
helper function that calls guard_import to make code more readable in
gmail tool. Upgrade playwright version to 1.43.0
- **Issue:** #21354
- **Dependencies:** upgrade playwright version(this is not required for
the bugfix itself, just trying to keep dependencies fresh. I can remove
the playwright version upgrade if you want.)
5 months ago
Shailendra Mishra aa966b6161
Replaced bind variable in SQL with formatted string for compatibility with sql syntax. (#21439)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Eugene Yurtsev f92006de3c
multiple: langchain 0.2 in master (#21191)
0.2rc 

migrations

- [x] Move memory
- [x] Move remaining retrievers
- [x] graph_qa chains
- [x] some dependency from evaluation code potentially on math utils
- [x] Move openapi chain from `langchain.chains.api.openapi` to
`langchain_community.chains.openapi`
- [x] Migrate `langchain.chains.ernie_functions` to
`langchain_community.chains.ernie_functions`
- [x] migrate `langchain/chains/llm_requests.py` to
`langchain_community.chains.llm_requests`
- [x] Moving `langchain_community.cross_enoders.base:BaseCrossEncoder`
->
`langchain_community.retrievers.document_compressors.cross_encoder:BaseCrossEncoder`
(namespace not ideal, but it needs to be moved to `langchain` to avoid
circular deps)
- [x] unit tests langchain -- add pytest.mark.community to some unit
tests that will stay in langchain
- [x] unit tests community -- move unit tests that depend on community
to community
- [x] mv integration tests that depend on community to community
- [x] mypy checks

Other todo

- [x] Make deprecation warnings not noisy (need to use warn deprecated
and check that things are implemented properly)
- [x] Update deprecation messages with timeline for code removal (likely
we actually won't be removing things until 0.4 release) -- will give
people more time to transition their code.
- [ ] Add information to deprecation warning to show users how to
migrate their code base using langchain-cli
- [ ] Remove any unnecessary requirements in langchain (e.g., is
SQLALchemy required?)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
ccurme 6b392d6d12
robocorp: release 0.0.6 (#21441) 5 months ago
Tommi Holmgren ee35b9ba56
langchain-robocorp: remove toolkit return content max length (#21436)
Robocorp (action server) toolkit had a limitation that the content
length returned by the tool was always cut to max 5000 chars. This was
from the time when context windows were much more limited.

This PR removes the limitation. Whatever the underlying tool provides
gets sent back to the agent.

As the robocorp toolkit no longer restricts the content, the implication
is that either the Action (tool) developer or the agent developer needs
to be aware of potentially oversized tool responses. Our point of view
is this should be the agent developer's responsibility, them being in
control of the use case and aware of the context window the LLM has.
5 months ago
JuHyung Son 710e57d779
upstage: deprecate UPSTAGE_DOCUMENT_AI_API_KEY (#21363)
Description: We are merging UPSTAGE_DOCUMENT_AI_API_KEY and
UPSTAGE_API_KEY into one, and only UPSTAGE_API_KEY will be used going
forward. And we changed the base class of ChatUpstage to BaseChatOpenAI.

---------

Co-authored-by: Sean <chosh0615@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 6a295d1ec0
upstage: release 0.1.4 (#21432) 5 months ago
Mateusz Szewczyk 7926cc1929
ibm: Fix llm and embeddings "verify" attribute default value (#21429)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain-ibm: Fix llm and embeddings 'verify'
attribute default value"


- [x] **PR message**: 
    - **Description:** fix default value of "verify" attribute
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Dobiichi-Origami 5b00885b49
community: add `bind_tools` and `with_structured_output` support to `QianfanChatEndpoint` (#21412)
…Endpoint`

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** add `bind_tools` and `with_structured_output` support
to `QianfanChatEndpoint`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
5 months ago
Silas Xu aafaf3e193
The predict_and_parse is deprecated, instead pass an output parser directly to LLMChain. (#20130)
The `predict_and_parse` method is deprecated, instead pass an output
parser directly to LLMChain.

- [x] **PR title**: "langchain: update chain_extract.py"


![image](https://github.com/langchain-ai/langchain/assets/40889019/e950d79f-5a0f-4086-86e9-89f627990fe5)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
ccurme 3c31bd0ed0
langchain: update use of predict_and_parse in LLMChainFilter (#21389)
Following https://github.com/langchain-ai/langchain/pull/20130

Removes deprecation warnings in docs here:
https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/

Tested using the same docs notebook + existing integration test.
5 months ago
Erick Friis bbdf0f8801
experimental[patch]: core and langchain dep (#21402) 5 months ago
Erick Friis e4aca0d052
experimental[patch]: release 0.0.58 (#21397) 5 months ago
Leonid Ganeline 791d59a2c8
community: `callbacks` guard_imports (#21173)
Issue: we have several helper functions to import third-party libraries
like import_uptrain in
[community.callbacks](https://api.python.langchain.com/en/latest/callbacks/langchain_community.callbacks.uptrain_callback.import_uptrain.html#langchain_community.callbacks.uptrain_callback.import_uptrain).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
5 months ago
Rahul Triptahi 7994cba18d
[Community][Minor]: Fetch loader_source of GoogleDriveLoader in PebbloSafeLoader. (#21314)
Description: This PR includes fix for loader_source to be fetched from
metadata in case of GdriveLoaders.
Documentation: NA
Unit Test: NA

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Nuno Campos ad0f3c14c2
core: allow mermaid node labels to have any characters (#21385)
- it's only node ids that are limited

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Eugene Yurtsev 6a1d61dbf1
community[patch]: Fix in memory vectorstore to take into account ids when adding docs (#21384)
Should respect `ids` if passed
5 months ago
Miroslav 04e2611fea
Added additional headers for HuggingFaceInferenceAPIEmbeddings endpoint. (#21282)
Thank you for contributing to LangChain!

- [ ] **HuggingFaceInferenceAPIEmbeddings**: "Additional Headers"
  - Where: langchain, community, embeddings. huggingface.py.
- Community: add additional headers when needed by custom HuggingFace
TEI embedding endpoints. HuggingFaceInferenceAPIEmbeddings"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adding the `additional_headers` to be passed to
requests library if needed
    - **Dependencies:** none
 

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. Tested with locally available TEI endpoints with and without
`additional_headers`
  2. Example  Usage
  
```python
embeddings=HuggingFaceInferenceAPIEmbeddings(
                             api_key=MY_CUSTOM_API_KEY,
                             api_url=MY_CUSTOM_TEI_URL,
                             additional_headers={
                                "Content-Type": "application/json"
                               }
)
```

 

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
Guangdong Liu 1fe66f5d39
community(patch) fix MoonshotChat moonshot_api_key is invaild for api key (#21361)
Description: close
https://github.com/langchain-ai/langchain/issues/21237
@baskaryan, @eyurtsev
5 months ago
Tomaz Bratanic 0bf7596839
Add simple node properties to llm graph transformer (#21369)
Add support for simple node properties in llm graph transformer.

Linter and dynamic pydantic classes aren't friends, hence I added two
ignores
5 months ago
ccurme 080af0ec53
langchain: sync -> async methods in OpenAI assistants (#21378) 5 months ago
Tomaz Bratanic ad3fd44a7f
experimental: Fix llm graph transformer bug (#21362) 5 months ago
Erick Friis bb81ae5c8c
together: fix chat model and embedding classes (#21353) 5 months ago
Hassan El Mghari d6ef5fe86a
together: add chat models, use openai base (#21337)
**Description:** Adding chat completions to the Together AI package,
which is our most popular API. Also staying backwards compatible with
the old API so folks can continue to use the completions API as well.
Also moved the embedding API to use the OpenAI library to standardize it
further.

**Twitter handle:** @nutlope

- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Jacob Lee a2d31307bb
Adds confirmation logs after creating a new project (#12618)
@efriis @hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 0fb93cd740
core: release 0.1.52 (#21350) 5 months ago
Wu Enze 32c61b3ece
community[patch]: chat message history mypy fixes #17048 (#20114)
Relates [#17048]
Description : Applied fix to redis and neo4j file.

Error was : `Cannot override writeable attribute with read-only
property`

fix with the same solution of
[[langchain/libs/community/langchain_community/chat_message_histories/elasticsearch.py](d5c412b0a9/libs/community/langchain_community/chat_message_histories/elasticsearch.py (L170-L175))]

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
nrpd25 95cc8e3fc3
premai[patch]:Standardized model init args (#21308)
[Standardized model init args
#20085](https://github.com/langchain-ai/langchain/issues/20085)
- Enable premai chat model to be initialized with `model_name` as an
alias for `model`, `api_key` as an alias for `premai_api_key`.
- Add initialization test `test_premai_initialization`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Nuno Campos 6f17158606
fix: core: Include in json output also fields set outside the constructor (#21342) 5 months ago
Tomaz Bratanic ac14f171ac
Add indexed properties to neo4j enhanced schema (#21335) 5 months ago
scaserini a6cdf6572f
community: add Kendra DocumentRelevanceOverrideConfigurations request parameter (#20695)
- **Description:** add **DocumentRelevanceOverrideConfigurations**
request parameter to Kendra retriever

Co-authored-by: Simone Caserini <simone.caserini@klarna.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Nuno Campos 0345bcf4ef
Fix failing test for serialization (#21344)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Trayan Azarov 93226b1945
community: Updated Chroma version range to include 0.5.0 release (#21224)
- Updated Chroma version range to allow releases in 0.5.x.
- Bumped mypy version as linting was failing
5 months ago
Jorge Piedrahita Ortiz e65652c3e8
community: add SambaNova embeddings integration (#21227)
- **Description:**  SambaNova hosted embeddings integration
5 months ago
Jorge Piedrahita Ortiz df1c10260c
community: minor changes sambanova integration (#21231)
- **Description:** fix: variable names in root validator not allowing
pass credentials as named parameters in llm instancing, also added
sambanova's sambaverse and sambastudio llms to __init__.py for module
import
5 months ago
Jan Soubusta d9a61c0fa9
fix: respect table_name argument when calling from_texts (#21252)
valid for from_documents() as well

fixes #21251
5 months ago
Pedro Lima bebf46c4a2
community: added args_schema to YahooFinanceNewsTool (#21232)
Description: this change adds args_schema (pydantic BaseModel) to
YahooFinanceNewsTool for correct schema formatting on LLM function calls

Issue: currently using YahooFinanceNewsTool with OpenAI function calling
returns the following error "TypeError("YahooFinanceNewsTool._run() got
an unexpected keyword argument '__arg1'")". This happens because the
schema sent to the LLM is "input: "{'__arg1': 'MSFT'}"" while the method
should be called with the "query" parameter.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Mark Cusack 060987d755
community[minor]: Add indexing via locality sensitive hashing to the Yellowbrick vector store (#20856)
- **Description:** Add LSH-based indexing to the Yellowbrick vector
store module
- **Twitter handle:** @markcusack

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
Rashmi Pawar a2fdabdad2
mark NemoEmbeddings as deprecated (#21239)
The NemoEmbeddings is deprecated, instead use
langchain-nvidia-ai-endpoints NVIDIAEmbeddings interface.

cc: @mattf

---------

Co-authored-by: Daniel Glogowski <167348611+dglogo@users.noreply.github.com>
Co-authored-by: andyjessen <62343929+andyjessen@users.noreply.github.com>
Co-authored-by: Chris Germann <88305668+TAAGECH9@users.noreply.github.com>
Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Erick Friis 9e4b24a2d6
langchain: release 0.1.18 (#21338) 5 months ago
Erick Friis 5c000f8d79
community: release 0.0.37 (#21332) 5 months ago
Leonid Ganeline 8c13e8a79b
langchain: `qa_chain` fix (#21279)
Issue: `load_qa_chain` is placed in the __init__.py file. As a result,
it is not listed in the API Reference docs.
BTW `load_qa_chain` is heavily presented in the doc examples, but is
missed in API Ref.
Change: moved code from init.py into a new file. Related: #21266
5 months ago
Erick Friis 7ecf9996f1
community: Revert "community: langkit dependency" (#21333)
Reverts langchain-ai/langchain#21174

Hey team - going to revert this because it doesn't seem necessary for
testing. We should only be adding optional + extended_testing
dependencies for deps that have extended tests.

otherwise it just increases probability of dependency conflicts in the
community lockfile.
5 months ago
Param Singh fee91d43b7
baichuan[patch]:standardize chat init args (#21298)
Thank you for contributing to LangChain!

community:baichuan[patch]: standardize init args

updated `baichuan_api_key` so that aliased to `api_key`. Added test that
it continues to set the same underlying attribute. Test checks for
`SecretStr`

updated `temperature` with Pydantic Field, added unit test. 

Related to https://github.com/langchain-ai/langchain/issues/20085
5 months ago
Christophe Bornet 484a009012
community[minor]: Relax constraints on Cassandra VectorStore constructors (#21209)
If Session and/or keyspace are not provided, they are resolved from
cassio's context. So they are not required.
This change is fully backward compatible.
5 months ago
Leonid Ganeline 6feddfae88
community: langkit dependency (#21174)
Issue: the `langkit` package is not presented in the `pyproject.toml`
but it is a requirement for the `WhyLabsCallbackHandler`
Change: added `langkit`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Erick Friis 811e9cee8b
core: release 0.1.51 (#21328) 5 months ago
Mateusz Szewczyk 682d21c3de
ibm: Add support for ibm-watsonx-ai new major version (#21313)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain-ibm: Add support for ibm-watsonx-ai new
major version"


- [x] **PR message**: 
    - **Description:** Add support for ibm-watsonx-ai new major version
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Chris Papademetrious ee6c922c91
langchain[minor]: enhance `LocalFileStore` to offer `update_atime` parameter that updates access times on read (#20951)
**Description:**
The `LocalFileStore` class can be used to create an on-disk
`CacheBackedEmbeddings` cache. The number of files in these embeddings
caches can grow to be quite large over time (hundreds of thousands) as
embeddings are computed for new versions of content, but the embeddings
for old/deprecated content are not removed.

A *least-recently-used* (LRU) cache policy could be applied to the
`LocalFileStore` directory to delete cache entries that have not been
referenced for some time:

```bash
# delete files that have not been accessed in the last 90 days
find embeddings_cache_dir/ -atime 90 -print0 | xargs -0 rm
```

However, most filesystems in enterprise environments disable access time
modification on read to improve performance. As a result, the access
times of these cache entry files are not updated when their values are
read.

To resolve this, this pull request updates the `LocalFileStore`
constructor to offer an `update_atime` parameter that causes access
times to be updated when a cache entry is read.

For example,

```python
file_store = LocalFileStore(temp_dir, update_atime=True)
```

The default is `False`, which retains the original behavior.

**Testing:**
I updated the LocalFileStore unit tests to test the access time update.
5 months ago
Tomaz Bratanic 5b6d1a907d
Add the extract types to diffbot graph transformer (#21315)
Before you could only extract triples (diffbot calls it facts) from
diffbot to avoid isolated nodes. However, sometimes isolated nodes can
still be useful like for prefiltering, so we want to allow users to
extract them if they want. Default behaviour is unchanged.
5 months ago
aditya thomas b868c78a12
partners[anthropic]: update unit test for key passed in from the environment (#21290)
**Description:** Update unit test for ChatAnthropic
**Issue:** Test for key passed in from the environment should not have
the key initialized in the constructor
**Dependencies:** None
5 months ago
Rohan Aggarwal 8021d2a2ab
community[minor]: Oraclevs integration (#21123)
Thank you for contributing to LangChain!

- Oracle AI Vector Search 
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.


- Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This Pull Requests Adds the following functionalities
Oracle AI Vector Search : Vector Store
Oracle AI Vector Search : Document Loader
Oracle AI Vector Search : Document Splitter
Oracle AI Vector Search : Summary
Oracle AI Vector Search : Oracle Embeddings


- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.


- We have made sure that make format and make lint run clean.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
ccurme c9e9470c5a
langchain: fix deprecation decorators on extraction chains (#21276)
Calling any of these raises
```
ValueError: A pending deprecation cannot have a scheduled removal
```
5 months ago
Wickes Wong ee1adaacaa
langchain[patch]: Fix summary buffer memory with return message flag (#21115)
## Description
Memory return could be set as `str` or `message` by `return_messages`
flag as mentioned in
https://python.langchain.com/docs/modules/memory/#whether-memory-is-a-string-or-a-list-of-messages,
where
`langchain.chains.conversation.memory.ConversationSummaryBufferMemory`
did not implement that.
This commit added `buffer_as_str` and `buffer_as_messages` function, and
`buffer` now affected by `return_messages` flag.

## Example Test Code and Output

```python
# Fix: ConversationSummaryBufferMemory with return_messages flag function
# Test code
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory
from langchain_community.llms.ollama import Ollama

llm = Ollama()

# Create an instance of ConversationSummaryBufferMemory with return_messages set to True
memory = ConversationSummaryBufferMemory(return_messages=True, llm=llm)

# Add user and AI messages to the chat memory
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

# Print the buffer
print("Buffer:")
print(*map(type, memory.buffer), sep="\n")
print(memory.buffer, "\n")

# Print the buffer as a string
print("Buffer as String:")
print(type(memory.buffer_as_str))
print(memory.buffer_as_str, "\n")

# Print the buffer as messages
print("Buffer as Messages:")
print(*map(type, memory.buffer_as_messages), sep="\n")
print(memory.buffer_as_messages, "\n")

# Print the buffer after setting return_messages to False
memory.return_messages = False
print("Buffer after setting return_messages to False:")
print(type(memory.buffer))
print(memory.buffer, "\n")
```

```plaintext
Buffer:
<class 'langchain_core.messages.human.HumanMessage'>
<class 'langchain_core.messages.ai.AIMessage'>
[HumanMessage(content='hi!'), AIMessage(content="what's up?")] 

Buffer as String:
<class 'str'>
Human: hi!
AI: what's up? 

Buffer as Messages:
<class 'langchain_core.messages.human.HumanMessage'>
<class 'langchain_core.messages.ai.AIMessage'>
[HumanMessage(content='hi!'), AIMessage(content="what's up?")] 

Buffer after setting return_messages to False:
<class 'str'>
Human: hi!
AI: what's up? 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Leonid Ganeline 9639457222
community[patch]: `tools` imports (#21156)
Issue: we have several helper functions to import third-party libraries
like tools.gmail.utils.import_google in
[community.tools](https://api.python.langchain.com/en/latest/community_api_reference.html#id37).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
5 months ago
Leonid Ganeline 3ef8b24277
core[patch]: `utils.guard_import` fix (#21133)
Issues (nit): 
1. `utils.guard_import` prints wrong error message when there is an
import `error.` It prints the whole `module_name` but should be only the
first part as the pip package name. E.i. `langchain_core.utils` -> print
not `langchain-core` but `langchain_core.utils`. Also replace '_' with
'-' in the pip package name.
2. it does not handle the `ModuleNotFoundError` which raised if
`guard_import("wrong_module")`

Fixed issues; added ut-s. Controversial: I've reraised
`ModuleNotFoundError` as `ImportError`, since in case of the error, the
proposed action is the same - we need to install a missed package.
5 months ago
Erick Friis 36c2ca3c8b
mistralai: relax tokenizers dep (#21277) 5 months ago
Nuno Campos 6e1e0c7d5c
fix: core: draw_mermaid() would create subgroup for edges with same src and tgt (#21275)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Eugene Yurtsev 26a37dce0a
langchain[patch]: Remove jsonpatch from poetry file (#21272)
jsonpatch is only used in langchain-core not in langchain
5 months ago
Eugene Yurtsev 335bd01e45
langchain[patch]: Update deprecation warning (#21268)
Update deprecation warning
5 months ago
Leonid Ganeline 23a05c3986
langchain: `summarize` chain fix (#21266)
Issue: `load_summarize_chain` is placed in the __init__.py file. As a
result, it doesn't listed in the API Reference docs.
Change: moved code from __init__.py into a new file.
5 months ago
ccurme 6da3d92b42
(all): update removal in deprecation warnings from 0.2 to 0.3 (#21265)
We are pushing out the removal of these to 0.3.

`find . -type f -name "*.py" -exec sed -i ''
's/removal="0\.2/removal="0.3/g' {} +`
5 months ago
Eugene Yurtsev d6e34f9ee5
langchain[patch]: Improve deprecation warnings (#21262)
* Remove spurious derprecation warning
* Make deprecation warnings consistent with 0.1 namespaces that were announced as deprecated
5 months ago
Eugene Yurtsev 487aff7e46
langchain[patch]: Revert 20794 until 0.2 release (#21257)
PR of 2079 was already released as part of 0.1.17rc.


Issue for 0.2 release:
https://github.com/langchain-ai/langchain/issues/21080
5 months ago
Eugene Yurtsev ba4a309d98
langchain[patch]: Revert breaking change until 0.2 release (#21256)
Reverts a minor breaking change until 0.2 release
5 months ago
Eugene Yurtsev 66a1e3f083
langchain[patch]: Fix flaky unit test (#21258)
Should sort the results of the import test since it depends on import order
5 months ago
Eugene Yurtsev 0989c48028
langchain[minor]: Re-add deleted ainetwork tool (#21254)
* Adding __init__.py to turn it into a package in community
* Adding proxy imports that assume that langchain_community is optional
5 months ago
Christophe Bornet 2fbe82f5e6
community[minor]: Relax constraints on CassandraChatMessageHistory constructor (#21241) 5 months ago
Chris Germann 3a8d1d8838
Hotfix RetrievalQA Docs: docs: Fix formatting (#21183)
# Newline Characters breaking formatting 

**Description**: 
As you can see in the image below, the formatting in the documentation
is broken. As far as I can see the two added `\n` characters are
breaking the documentation. Therefore I would propose to remove those

![image](https://github.com/langchain-ai/langchain/assets/88305668/23b6e726-71b2-4812-91ea-3e8600683733)

**Dependencies**:
None

**Twitter Handle**
- epu9byj

---------

Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Bagatur 67a5cc34c6
openai[patch]: Release 0.1.6 (#21236) 5 months ago
Erick Friis c1eb95b967
core: release 0.1.50 (#21230) 5 months ago
Nuno Campos 47ce8d5a57
core: tracer: remove numeric execution order (#21220)
- this hasn't been used in a long time and requires some additional
bookkeeping i'm going to streamline in the next pr
5 months ago
Bagatur 6ac6158a07
openai[patch]: support tool_choice="required" (#21216)
Co-authored-by: ccurme <chester.curme@gmail.com>
5 months ago
xindoo c1aa237bc2
langchain: fix syntax error in code comment for create_tool_calling_agent (#21205)
**PR message**:
- **Description:** Corrected a syntax error in the code comments within
the `create_tool_calling_agent` function in the langchain package.
- **Issue:** N/A
- **Dependencies:** No additional dependencies required.
- **Twitter handle:** N/A
5 months ago
ccurme eb0a2fd53a
mistral: release 0.1.6 (#21214) 5 months ago
ccurme 2d77e5e3a1
(standard tests): add test for basic conversation sequence (#21213) 5 months ago
Maxime Perrin 1ebb5a70ad
partners(mistralai): Removing unused variable in completion request (using tool_calls or content) (#21201)
This PR fixes #21196.

The error was occurring when calling chat completion API with a chat
history. Indeed, the Mistral API does not accept both `content` and
`tool_calls` in the same body.

This PR removes one of theses variables depending on the necessity.

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Christophe Bornet 683fb45c6b
community[patch]: Refactor CassandraDatabase wrapper (#21075)
* Introduce individual `fetch_` methods for easier typing.
* Rework some docstrings to google style
* Move some logic to the tool
* Merge the 2 cassandra utility files
5 months ago
Raghav Dixit 7d451d0041
community[patch]: Update lancedb.py (#21192)
very minor update in LanceDB integration, 'metric' argument was missing.
5 months ago
Bagatur d297d90ad9
core[patch]: Release 0.1.49 (#21211) 5 months ago
Nuno Campos 663747b730
core[patch]: Fixes for convert_messages (#21207)
- support two-tuples of any sequence type (eg. json.loads never produces
tuples)
- support type alias for role key
- if id is passed in in dict form use it
- if tool_calls passed in in dict form use them

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev df49404794
langchain[patch]: Make more memory code handle community dependency as optional (#21199) 5 months ago
ccurme bd5d2c2674
langchain: import InMemoryChatMessageHistory from core (#21198) 5 months ago
Eugene Yurtsev 3cd7fced5f
langchain[patch],community[minor]: Migrate memory implementations to community (#20845)
Migrates memory implementations to community
5 months ago
Eugene Yurtsev b5c3a04e4b
langchain[patch]: chat histories to handle optional community dependence (#21194) 5 months ago
Eugene Yurtsev c9119b0e75
langchain[patch],community[minor]: Move some unit tests from langchain to community, use core for fake models (#21190) 5 months ago
Eugene Yurtsev c306364b06
langchain[patch]: Update more code to use langchain community as an optional dependency (#21170)
More code to use langchain community as an optional dependency
5 months ago
Bagatur 6fa8626e2f
openai[patch]: fix azure open lc serialization, release 0.1.5 (#21159) 5 months ago
Eugene Yurtsev 94a838740e
langchain[patch]: Migrate more code in utils to use optional langchain import (#21166)
Moving is interactive util to avoid circular deps
5 months ago
Eugene Yurtsev 23fdd320bc
langchain[patch]: Migrate more code to use optional community in agents namespace (#21167) 5 months ago
Tomaz Bratanic 9e53fa7d2e
Some more fixes to neo4j enhanced schema (#21139) 5 months ago
Erick Friis 0694538c39
ai21: fix core version (#21168) 5 months ago
Eugene Yurtsev 44602bdc20
langchain[patch],community[minor]: Move load_tools to community (#21158)
Move load tools to community
5 months ago
Eugene Yurtsev 9932f49b3e
langchain[patch]: Migrate llms to use optional community imports (#21101) 5 months ago
Eugene Yurtsev 57e8e70daa
langchain[patch]: Migrate chat models to optional community imports (#21090)
Migrate chat models to optional community imports
5 months ago
Eugene Yurtsev 2914abd747
langchain[patch]: Fix how the serializable test identifies serializable objects (#21165)
dir() will not work if we're using optional imports. The only way to do this is by using contents of __all__
5 months ago
Eugene Yurtsev 23c5d87311
langchain[patch]: Migrate utils to use optional langchain_community (#21163)
Migrate utils to use optional imports from langchain community
5 months ago
Eugene Yurtsev bec3eee3fa
langchain[patch]: Migrate retrievers to use optional langchain community imports (#21155) 5 months ago
Eugene Yurtsev 43110daea5
langchain[patch]: Update some agent tool kits to handle community import as optional (#21157)
A few things that were not caught by the migration script
5 months ago
Eugene Yurtsev 59f10ab3e0
langchain[patch]: Migrate embeddings to optional imports (#21099) 5 months ago
Eugene Yurtsev 2f709d94d7
langchain[patch]: Migrate vectorstores to use optional langchain community imports (#21150) 5 months ago
Eugene Yurtsev 7230e430db
langchain[patch]: Migrate top level files to use optional langchain community (#21152)
Migrate a few top level files to treat langchain community as an optional dependency
5 months ago
Erick Friis daab9789a8
ai21: release 0.1.4 (#21151) 5 months ago
Asaf Joseph Gardin 642975dd9f
partners: AI21 Labs Jamba Support (#20815)
Description: Added support for AI21 new model - Jamba
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Eugene Yurtsev 7a39fe60da
langchain[patch]: Migrate utilities to handle langchain community as optional (#21149) 5 months ago
Eugene Yurtsev b879184595
langchain[patch]: embedddings distance move import of openai embeddings into local scope (#21148) 5 months ago
Eugene Yurtsev 0e5bf16d00
langchain[patch]: Migrate document loaders to use optional langchain community imports (#21095) 5 months ago
Harrison Chase 4d1c21d97d
community[patch]: Fix alternative name in deprecation notice for sql_database (#21144) 5 months ago
East Agile 2a6f78a53f
community[minor]: Rememberizer retriever (#20052)
**Description:**
This pull request introduces a new feature for LangChain: the
integration with the Rememberizer API through a custom retriever.
This enables LangChain applications to allow users to load and sync
their data from Dropbox, Google Drive, Slack, their hard drive into a
vector database that LangChain can query. Queries involve sending text
chunks generated within LangChain and retrieving a collection of
semantically relevant user data for inclusion in LLM prompts.
User knowledge dramatically improved AI applications.
The Rememberizer integration will also allow users to access general
purpose vectorized data such as Reddit channel discussions and US
patents.

**Issue:**
N/A

**Dependencies:**
N/A

**Twitter handle:**
https://twitter.com/Rememberizer
5 months ago
Eugene Yurtsev 1ce1a10f2b
langchain[patch],community[minor]: Move graph index creator (#20795)
Move graph index creator to community
5 months ago
Eugene Yurtsev aa0bc7467c
langchain[patch]: Migrate agents module into optional imports for community (#21088) 5 months ago
Eugene Yurtsev 86ff8a3fb4
langchain[patch]: Update docstore module to use optional imports from community (#21091) 5 months ago
Eugene Yurtsev d640605694
langchain[patch]: Migrate chat loaders to optional community imports (#21089)
Migrate chat loaders to optional community imports
5 months ago
Eugene Yurtsev 2fcab9acd9
langchain[patch]: Upgrade storage to treat langchain community as optional (#21105) 5 months ago
William FH ab55f6996d
[Core] Tracing: update parent run_tree's child_runs (#21049) 5 months ago
aditya thomas 12b1caf295
openai[patch]: add tests for secret_str for keys (#20982)
**Description:** Add tests to check API keys and Active Directory tokens
are masked
**Issue:** Resolves #12165 for OpenAI and Azure OpenAI models
**Dependencies:** None

Also resolves #12473 which may be closed.

Additional contributors @alex4321 (#12473) and @onesolpark (#12542)
5 months ago
Noah 45ddf4d26f
community[patch]: Update comments for lazy_load method (#21063)
- [ ] **PR message**: 
- **Description:** Refactored the lazy_load method to use asynchronous
execution for improved performance. The method now initiates scraping of
all URLs simultaneously using asyncio.gather, enhancing data fetching
efficiency. Each Document object is yielded immediately once its content
becomes available, streamlining the entire process.
    - **Issue:** N/A
- **Dependencies:** Requires the asyncio library for handling
asynchronous tasks, which should already be part of standard Python
libraries in Python 3.7 and above.
    - **Email:** [r73327118@gmail.com](mailto:r73327118@gmail.com)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Liu Xiaodong 3b473d10f2
experimental: clean python repl input(experimental:Added code for PythonREPL) (#20930)
Update python.py(experimental:Added code for PythonREPL)

Added code for PythonREPL, defining a static method 'sanitize_input'
that takes the string 'query' as input and returns a sanitizing string.
The purpose of this method is to remove unwanted characters from the
input string, Specifically:

1. Delete the whitespace at the beginning and end of the string (' \s').
2. Remove the quotation marks (`` ` ``) at the beginning and end of the
string.
3. Remove the keyword "python" at the beginning of the string (case
insensitive) because the user may have typed it.

This method uses regular expressions (regex) to implement sanitizing.

It all started with this code:
from langchain.agents import Tool
from langchain_experimental.utilities import PythonREPL

python_repl = PythonREPL()
repl_tool = Tool(
    name="python_repl",
description="Remove redundant formatting marks at the beginning and end
of source code from input.Use a Python shell to execute python commands.
If you want to see the output of a value, you should print it out with
`print(...)`.",
    func=python_repl.run,
)

When I call the agent to write a piece of code for me and execute it
with the defined code, I must get an error: SyntaxError('invalid
syntax', ('<string>', 1, 1,'In', 1, 2))

After checking, I found that pythonREPL has less formatting of input
code than the soon-to-be deprecated pythonREPL tool, so I added this
step to it, so that no matter what code I ask the agent to write for me,
it can be executed smoothly and get the output result.
I have tried modifying the prompt words to solve this problem before,
but it did not work, and by adding a simple format check, the problem is
well resolved.
<img width="1271" alt="image"
src="https://github.com/langchain-ai/langchain/assets/164149097/c49a685f-d246-4b11-b655-fd952fc2f04c">

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Ismail Hossain Polas 1fdf63fa6c
community[patch]: update package name to bagelML (#19948)
**Description**
This pull request updates the Bagel Network package name from
"betabageldb" to "bagelML" to align with the latest changes made by the
Bagel Network team.

The following modifications have been made:

- Updated all references to the old package name ("betabageldb") with
the new package name ("bagelML") throughout the codebase.
- Modified the documentation, and any relevant scripts to reflect the
package name change.
- Tested the changes to ensure that the functionality remains intact and
no breaking changes were introduced.

By merging this pull request, our project will stay up to date with the
latest Bagel Network package naming convention, ensuring compatibility
and smooth integration with their updated library.

Please review the changes and provide any feedback or suggestions. Thank
you!
5 months ago
Tomaz Bratanic 7860e4c649
experimental[patch]: Add support for non-function calling LLMs in llm graph transformers (#21014) 5 months ago
tianzedavid 5a8909440b
docs: remove repetitive words (#21058)
remove repetitive words

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Tomaz Bratanic c9e96bb5e2
community[patch]: Fix neo4j enhanced schema bugs (#21072) 5 months ago
junkeon 8d2909ee25
upstage[minor]: Update few codes and add upstage loader in pdf section (#21085)
**Description:** Update UpstageLayoutAnalysisParser and Loader and add
upstage loader example in pdf section
**Dependencies:** langchain_community
**Twitter handle:** [@upstageai](https://twitter.com/upstageai)

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Bagatur bef50ded63
openai[patch]: fix special token default behavior (#21131)
By default handle special sequences as regular text
5 months ago
MacanPN 0f7f448603
community[patch]: add delete() method to AzureSearch vector store (#21127)
**Issue:**
Currently `AzureSearch` vector store does not implement `delete` method.
This PR implements it. This also makes it compatible with LangChain
indexer.

**Dependencies:**
None

**Twitter handle:**
@martintriska1

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Erick Friis 14422a4220
langchain: fix core dep (#21128) 5 months ago
Erick Friis 6c938da302
langchain: release 0.1.17 (#21125) 5 months ago
Eugene Yurtsev bf95414758
langchain[minor]: enhance unit test to test imports recursively (#21122) 5 months ago
Eugene Yurtsev e4f51f59a2
langchain[patch]: Migrate tools to treat community imports as optional (#21117)
Migrate tools to treat community imports as optional
5 months ago
Eugene Yurtsev 9e788f09c6
langchain[patch]: Migrate output parsers to support optional community imports (#21103)
Migrate output parsers
5 months ago
Eugene Yurtsev 3853fe9f64
langchain[patch]: Migrate graphs to use optional community imports (#21100)
Migrate graphs to use optional community imports.
5 months ago
Eugene Yurtsev 8658d52587
langchain[patch]: Upgrade prompts to optional imports (#21078)
Upgrades prompts module to use optional imports.

This code was generated with a migration script, but had to be adjusted
manually a bit.

Testing in preparation for applying this code modification across the
rest of the modules in langchain package to reverse the dependency
between langchain community and langchain.
5 months ago
Eugene Yurtsev 9b6d04a187
langchain[patch]: Migrate document transformers (#21098)
Migrate document transformers
5 months ago
Eugene Yurtsev aec13a6123
langchain[patch]: Migrate callbacks module to use optional imports for community (#21086) 5 months ago
Erick Friis 8a62fb0570
community: release 0.0.36 (#21118) 5 months ago
Erick Friis 2407c353be
core: release 0.1.48 (#21113) 5 months ago
Charlie Marsh fd94aa8366
partner[patch]: Upgrade to Ruff v0.4.2 (#21108)
## Summary

No new diagnostics (given that the set of enabled rules hasn't changed),
but gains access to our new parser (much faster) and reduced false
positives all around.
5 months ago
Jamsheed Mistri 3e749369ef
community[minor]: bump version of LayerupSecurity, add support for untrusted_input parameter (#19985)
**Description:** update version of LayerupSecurity package for the
Layerup Security integration. Add untrusted_input parameter.
5 months ago
fubuki8087 f1c3687aa5
community[patch]: Using the right encoding to parse the web page in RecursiveUrlLoader (#20632)
As shown in #13749 , `RecursiveUrlLoader` has encoding issue. This PR is
to solve this.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Jakub Pawłowski b0b1a67771
community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042)
### Description:
When attempting to download PDF files from arXiv, an unexpected 404
error frequently occurs. This error halts the operation, regardless of
whether there are additional documents to process. As a solution, I
suggest implementing a mechanism to ignore and communicate this error
and continue processing the next document from the list.

Proposed Solution: To address the issue of unexpected 404 errors during
PDF downloads from arXiv, I propose implementing the following solution:

- Error Handling: Implement error handling mechanisms to catch and
handle 404 errors gracefully.
- Communication: Inform the user or logging system about the occurrence
of the 404 error.
- Continued Processing: After encountering a 404 error, continue
processing the remaining documents from the list without interruption.

This solution ensures that the application can handle unexpected errors
without terminating the entire operation. It promotes resilience and
robustness in the face of intermittent issues encountered during PDF
downloads from arXiv.

### Issue:
#20909 
### Dependencies:
none

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Erick Friis b9c53e95b7
community: release 0.0.35 (#21104) 5 months ago
Eugene Yurtsev 3c064a757f
core[minor],langchain[patch],community[patch]: Move storage interfaces to core (#20750)
* Move storage interface to core
* Move in memory and file system implementation to core
5 months ago
Charlie Marsh 8f38b7a725
multiple: Remove unnecessary Ruff suppression comments (#21050)
## Summary

I ran `ruff check --extend-select RUF100 -n` to identify `# noqa`
comments that weren't having any effect in Ruff, and then `ruff check
--extend-select RUF100 -n --fix` on select files to remove all of the
unnecessary `# noqa: F401` violations. It's possible that these were
needed at some point in the past, but they're not necessary in Ruff
v0.1.15 (used by LangChain) or in the latest release.

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 748f2ba9ea
core: release 0.1.47 (#21094) 5 months ago
Eugene Yurtsev c8f18a2524
langchain[patch]: Update import handling in `adapters` (#21079) 5 months ago
William FH 5c63ac3dd7
[Patch] Dedent docstring (#20959)
Technically a slight prompt breaking change, but I think positive EV in
that it saves tokens and results in more sane / in-distribution prompts
5 months ago
Eugene Yurtsev 845d8e0025
langchain[patch]: Update handling of deprecation warnings (#21083)
Chains should not be emitting deprecation warnings.
5 months ago
Christophe Bornet 5c77f45b06
community[minor]: Add async methods to CassandraCache and CassandraSemanticCache (#20654) 5 months ago
William FH db14d4326d
[Core] Feat Pretty Print Tool calls (#20997)
Right now, `tool_calls` are not included in the `pretty_print()` output.
Would be nice to show!


![image](https://github.com/langchain-ai/langchain/assets/13333726/6a0ffca3-d02f-4e18-bc76-513eeca2e964)
5 months ago
Kuro Denjiro fa4124b821
community[minor]: add mintbase loader to langchain (#20089)
- [x] **Add Near NFT loader**: "community: Load NFT near block chain
using mintbase graph API"

- [x] **PR message**: 
    - **Description:** a description of the change
    - **Twitter handle:**Kurodenjiro

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Alexander Dicke d7e12750df
community[patch]: allows using `text-generation-inference` /generate route with `HuggingFaceEndpoint` (#20100)
- **Description:** allows to use the /generate route of
`text-generation-inference` with the `HuggingFaceEndpoint`
5 months ago
davidkgp 28b0b0d863
community[patch]: Fix for github issue #17690 (#20117)
…/17690

Thank you for contributing to LangChain!

- [x] **Fix Google Lens knowledge graph issue**: "langchain: community"
- Fix for [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** handled the existence of keys in the json response of
Google Lens
- **Issue:** [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
高远 a7a4630bf4
community[patch]: Modify the text field type and add new exception handling (#20116)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
5 months ago
Rahul Triptahi c172611647
community[patch]: Add classifier_url argument in PebbloSafeLoader and documentation update. (#21030)
Description: Add classifier_url argument in PebbloSafeLoader.
Documentation: Updated PebbloSafeLoader documentation with above change
and new links for pebblo github pages.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Leonid Ganeline 08d08d7c83
docs: langchain docstrings updates (#21032)
Added missed docstings. Formatted docstrings into a consistent format.
5 months ago
Leonid Ganeline 85094cbb3a
docs: community docstring updates (#21040)
Added missed docstrings. Updated docstrings to consistent format.
5 months ago
Rodrigo Nogueira 90f19028e5
community[patch]: Add maritalk streaming (sync and async) (#19203)
Co-authored-by: RosevalJr <rdmalajr@gmail.com>
Co-authored-by: Roseval Donisete Malaquias Junior <roseval@maritaca.ai>
5 months ago
Cahid Arda Öz cc6191cb90
community[minor]: Add support for Upstash Vector (#20824)
## Description

Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!

#17012 was opened to add Upstash Vector to langchain but was closed to
wait for filtering. Now filtering is added to Upstash vector and we open
a new PR. Additionally, [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added and we add this to our vectorstore aswell.

## Dependencies

[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).

## Tests

Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.

There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.

---------

Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline 1a2ff56cd8
core[patch[: docstring update (#21036)
Added missed docstrings. Updated docstrings to consistent format.
5 months ago
Eugene Yurtsev f479a337cc
langchain[patch]: replace deprecated imports with imports from langchain_core (#21033)
* Output of running the migration script.
* Ran only against langchain code itself and not the unit tests.
5 months ago
Eugene Yurtsev 82d4afcac0
langchain[minor]: Code to handle dynamic imports (#20893)
Proposing to centralize code for handling dynamic imports. This allows treating langchain-community as an optional dependency.

---

The proposal is to scan the code base and to replace all existing imports with dynamic imports using this functionality.
5 months ago
Erick Friis 854ae3e1de
mistralai: release 0.1.5, allow client passing in (#21034) 5 months ago
chyroc 3e241956d3
community[minor]: add coze chat model (#20770)
add coze chat model, to call coze.com apis
5 months ago
Eugene Yurtsev 29493bb598
cli[minor]: improve confirmation message with more details (#21027)
Improve confirmation message with more details
5 months ago
Eugene Yurtsev aab78a37f3
cli[patch]: Ignore imports that change the name of the class (#21026)
Not currently handeled by migration script
5 months ago
Massimiliano Pronesti ce89b34fc0
community[patch]: support hybrid search with threshold in Azure AI Search Retriever (#20907)
Support hybrid search with a score threshold -- similar to what we do
for similarity search.
5 months ago
Andrei Panferov b3efa38cc0
community[patch]: GigaChat model selection fix (#20988)
Fixed the error that the model name is never actually put into GigaChat
request payload, always defaulting to `GigaChat-Lite`.

With this fix, model selection through
```python
import os
from langchain.chat_models.gigachat import GigaChat

chat = GigaChat(
    name="GigaChat-Pro", # <- HERE!!!!!
    ...
)
```
should actually work, as intended in
[here](804390ba4b/libs/community/langchain_community/llms/gigachat.py (L36)).

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Patrick McFadin 3331865f6b
community[minor]: add Cassandra Database Toolkit (#20246)
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass paramaters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Expiriemental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.

Includes unit test and two notebooks to demonstrate usage. 

**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin

---------

Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Igor Brai b3e74f2b98
community[minor]: add mojeek search util (#20922)
**Description:** This pull request introduces a new feature to community
tools, enhancing its search capabilities by integrating the Mojeek
search engine
**Dependencies:** None

---------

Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
5 months ago
hmn falahi 4822beb298
Ignore self/cls from required args of class functions in convert_to_openai_tool (#20691)
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:

```python
class MemberTool:
    def search_member(
            self,
            keyword: str,
            *args,
            **kwargs,
    ):
        """Search on members with any keyword like first_name, last_name, email

        Args:
            keyword: Any keyword of member
        """

        headers = dict(authorization=kwargs['token'])
        members = []
        try:
            members = request_(
                method='SEARCH',
                url=f'{service_url}/apiv1/members',
                headers=headers,
                json=dict(query=keyword),
            )

        except Exception as e:
            logger.info(e.__doc__)

        return members

convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, username, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```

#20685

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev 4f4ee8e2cf
cli[patch]: Update migrations file manually (#21021)
We need to replace occurrences in the code of RunnableMap not just the
import,
so for now, we don't replace RunnableMap.
5 months ago
Tomaz Bratanic 67428c4052
community[patch]: Neo4j enhanced schema (#20983)
Scan the database for example values and provide them to an LLM for
better inference of Text2cypher
5 months ago
aditya thomas 8b59bddc03
anthropic[patch]: add tests for secret_str for api key (#20986)
**Description:** Add tests to check API keys are masked
**Issue:** Resolves
https://github.com/langchain-ai/langchain/issues/12165 for Anthropic
models
**Dependencies:** None
5 months ago
Pengcheng Liu 1fad39be1c
community[minor]: Add LarkSuite wiki document loader. (#21016)
**Description:** Add LarkSuite wiki document loader. Refer to [LarkSuite
api document
](https://open.feishu.cn/document/server-docs/docs/wiki-v2/space-node/list)for
details.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
5 months ago
Leonid Ganeline dc7c06bc07
community[minor]: import fix (#20995)
Issue: When the third-party package is not installed, whenever we need
to `pip install <package>` the ImportError is raised.
But sometimes, the `ValueError` or `ModuleNotFoundError` is raised. It
is bad for consistency.
Change: replaced the `ValueError` or `ModuleNotFoundError` with
`ImportError` when we raise an error with the `pip install <package>`
message.
Note: Ideally, we replace all `try: import... except... raise ... `with
helper functions like `import_aim` or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import)
But it would be much bigger refactoring. @baskaryan Please, advice on
this.
5 months ago
Karim Lalani 2ddac9a7c3
experimental[minor]: Add bind_tools and with_structured_output functions to OllamaFunctions (#20881)
Implemented bind_tools for OllamaFunctions.
Made OllamaFunctions sub class of ChatOllama.
Implemented with_structured_output for OllamaFunctions.

integration unit test has been updated.
notebook has been updated.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev d781560722
cli[minor]: Add ipynb support, add text_splitters (#20963) 5 months ago
WilliamEspegren 804390ba4b
community: Spider integration (#20937)
Added the [Spider.cloud](https://spider.cloud) document loader.
[Spider](https://github.com/spider-rs/spider) is the
[fastest](https://github.com/spider-rs/spider/blob/main/benches/BENCHMARKS.md)
and cheapest crawler that returns LLM-ready data.

```
- **Description:** Adds Spider data loader
- **Dependencies:** spider-client
- **Twitter handle:** @WilliamEspegren 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: = <=>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
ccurme 9ec7151317
fireworks: fix integration tests (#20973) 5 months ago
William FH 9fa9f05e5d
Catch System Error in ast parse (#20961)
I can't seem to reproduce, but i got this:

```
SystemError: AST constructor recursion depth mismatch (before=102, after=37)
```

And the operation isn't critical for the actual forward pass so seems
preferable to expand our caught exceptions
5 months ago
YH 2aca7fcdcf
core[patch]: Enhance link extraction with query parameters (#20259)
**Description**: This update enhances the `extract_sub_links` function
within the `langchain_core/utils/html.py` module to include query
parameters in the extracted URLs.

**Issue**: N/A

**Dependencies**: No additional dependencies required for this change.

**Twitter handle**: N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Chip Davis e818c75f8a
infra: test directory loader multithreaded (#20281)
This is a unit test for #20230 which was a fix for using multithreaded
mode with directory loader @eyurtsev
5 months ago
Guilherme Zanotelli f931a9ce60
community[patch]: Pass kwargs to SPARQLStore from RdfGraph (#20385)
This introduces `store_kwargs` which behaves similarly to `graph_kwargs`
on the `RdfGraph` object, which will enable users to pass `headers` and
other arguments to the underlying `SPARQLStore` object. I have also made
a [PR in `rdflib` to support passing
`default_graph`](https://github.com/RDFLib/rdflib/pull/2761).

Example usage:
```python
from langchain_community.graphs import RdfGraph

graph = RdfGraph(
    query_endpoint="http://localhost/sparql",
    standard="rdf",
    store_kwargs=dict(
        default_graph="http://example.com/mygraph"
    )
)
```

<!--If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.-->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Jorge Piedrahita Ortiz 40b2e2916b
community[minor]: Sambanova llm integration (#20955)
- **Description:** Added [Sambanova systems](https://sambanova.ai/)
integration, including sambaverse and sambastudio LLMs
- **Dependencies:**   sseclient-py  (optional)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Rahul Triptahi 955cf186d2
community[patch]: Ingest source, owner and full_path if present in Document's metadata. (#20949)
Description: The PebbloSafeLoader should first check for owner,
full_path and size in metadata before implementing its own logic.
Dependencies: None
Documentation: NA.

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Amine Djeghri 790ea75cf7
community[minor]: add exllamav2 library for GPTQ & EXL2 models (#17817)
Added 3 files : 
- Library : ExLlamaV2 
- Test integration
- Notebook

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Naveen Tatikonda 8bbdb4f6a0
community[patch]: Add OpenSearch as semantic cache (#20254)
### Description
Use OpenSearch vector store as Semantic Cache.

### Twitter Handle
**@OpenSearchProj**

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Co-authored-by: Harish Tatikonda <harishtatikonda@Harishs-MacBook-Air.local>
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-31-155.ec2.internal>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Mayank Solanki 8c085fc697
community[patch]: Added a function `from_existing_collection` in `Qdrant` vector database. (#20779)
Issue: #20514 
The current implementation of `construct_instance` expects a `texts:
List[str]` that will call the embedding function. This might not be
needed when we already have a client with collection and `path, you
don't want to add any text.

This PR adds a class method that returns a qdrant instance with an
existing client.

Here everytime
cb6e5e56c2/libs/community/langchain_community/vectorstores/qdrant.py (L1592)
`construct_instance` is called, this line sends some text for embedding
generation.

---------

Co-authored-by: Anush <anushshetty90@gmail.com>
5 months ago
Leonid Kuligin 893a924b90
core[minor], community[patch], langchain[patch]: move BaseChatLoader to core (#19607)
Thank you for contributing to LangChain!

- [ ] **PR title**: "core: move BaseChatLoader and BaseToolkit from
community"


- [ ] **PR message**: move BaseChatLoader and BaseToolkit

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Erick Friis d4befd0cfb
core: fix batch ordering test (#20952) 5 months ago
Eugene Yurtsev 8ed150b2fe
cli[minor]: Fix bug to account for name changes (#20948)
* Fix bug to account for name changes / aliases
* Generate migration list from langchain to langchain_core
5 months ago
Eugene Yurtsev 2fa0ff1a2d
cli[minor]: update code to generate migrations from langchain to community (#20946)
Updates code that generates migrations from langchain to community
5 months ago
ccurme bf16cefd18
langchain: deprecate create_structured_output_runnable (#20933) 5 months ago
Erick Friis 38eccab3ae
upstage: release 0.1.3 (#20941) 5 months ago
Sean e1c2e2fdfa
upstage: Upstage Groundedness Check parameter update (#20914)
* Groundedness Check takes `str` or `list[Document]` as input.

* Deprecate `GroundednessCheck` due to its naming.
* Added `UpstageGroundednessCheck`. 

* Hotfix for Groundedness Check parameter. 
  The name `query` was misleading and it should be `answer` instead.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
ccurme 84b8e67c9c
mistral: release 0.1.4 (#20940) 5 months ago
ccurme 465fbaa30b
openai: release 0.1.4 (#20939) 5 months ago
Eugene Yurtsev 12c906f6ce
cli[minor]: Improve partner migrations (#20938)
This auto generates partner migrations.

At the moment the migration is from community -> partner.

So one would need to run the migration script twice to go from langchain to partner.
5 months ago
Eugene Yurtsev 5653f36adc
cli[minor]: Add script to generate migrations for partner packages (#20932)
Add script to help generate migrations.

This works well for partner packages. Migrations are generated based on run time rather than static analysis (much simpler to get the correct migrations implemented).

The script for generating migrations from langchain to community still needs work.
5 months ago
ccurme fe1304afc4
openai: add unit test (#20931)
Test a helper function that was added earlier.
5 months ago
Eugene Yurtsev 6598757037
cli[minor]: Add first version of migrate (#20902)
Adds a first version of the migrate script.
5 months ago
Lei Zhang 9281841cfe
community[patch]: fix integrated test case test_recursive_url_loader.py assertions (issue-20919) (#20920)
**Description:** 
Fix integrated test case test_recursive_url_loader.py

Local testing successful

```shell
(venv) lei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python
cachedir: .pytest_cache
rootdir: /Users/zhanglei/Work/github/langchain/libs/community
configfile: pyproject.toml
plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0
asyncio: mode=Mode.AUTO
collected 6 items                                                                                                                                                                   

tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED                                                                 [ 16%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED                                                   [ 33%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader FAILED                                                                  [ 50%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED                                                                      [ 66%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED                                                                        [ 83%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED                                                   [100%]

===================================================================================== FAILURES ======================================================================================
__________________________________________________________________________ test_sync_recursive_url_loader ___________________________________________________________________________

    def test_sync_recursive_url_loader() -> None:
        url = "https://docs.python.org/3.9/"
        loader = RecursiveUrlLoader(
            url, extractor=lambda _: "placeholder", use_async=False, max_depth=2
        )
        docs = loader.load()
>       assert len(docs) == 23
E       AssertionError: assert 24 == 23
E        +  where 24 = len([Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/', 'content_type': 'text/html', 'title': '3.9.18 Documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/py-modindex.html', 'content_type': 'text/html', 'title': 'Python Module Index — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/download.html', 'content_type': 'text/html', 'title': 'Download — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/howto/index.html', 'content_type': 'text/html', 'title': 'Python HOWTOs — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/whatsnew/index.html', 'content_type': 'text/html', 'title': 'Whatâ\x80\x99s New in Python — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/c-api/index.html', 'content_type': 'text/html', 'title': 'Python/C API Reference Manual — Python 3.9.18 documentation', 'language': None}), ...])

tests/integration_tests/document_loaders/test_recursive_url_loader.py:38: AssertionError
================================================================================= warnings summary ==================================================================================
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
  /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor.
    k = self.parse_starttag(i)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================ slowest 5 durations ================================================================================
56.75s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
38.99s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
31.20s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
30.37s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
15.44s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
============================================================================== short test summary info ==============================================================================
FAILED tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader - AssertionError: assert 24 == 23
================================================================ 1 failed, 5 passed, 5 warnings in 172.97s (0:02:52) ================================================================
(venv) zhanglei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python
cachedir: .pytest_cache
rootdir: /Users/zhanglei/Work/github/langchain/libs/community
configfile: pyproject.toml
plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0
asyncio: mode=Mode.AUTO
collected 6 items                                                                                                                                                                   

tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED                                                                 [ 16%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED                                                   [ 33%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader PASSED                                                                  [ 50%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED                                                                      [ 66%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED                                                                        [ 83%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED                                                   [100%]

================================================================================= warnings summary ==================================================================================
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
  /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor.
    k = self.parse_starttag(i)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================ slowest 5 durations ================================================================================
46.99s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
32.43s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
31.23s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
30.75s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
15.89s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
===================================================================== 6 passed, 5 warnings in 157.42s (0:02:37) =====================================================================
(venv) lei@LeideMacBook-Pro community % 
```

**Issue:** https://github.com/langchain-ai/langchain/issues/20919

**Twitter handle:** @coolbeevip
5 months ago
ccurme 7d8d0229fa
remove placeholder error message (#20340) 5 months ago
William FH 4c437ebb9c
Use lstv2 (#20747) 5 months ago
ccurme 891ae37437
langchain: support PineconeVectorStore in self query retriever (#20905)
`langchain_pinecone.Pinecone` is deprecated in favor of
`PineconeVectorStore`, and is currently a subclass of
`PineconeVectorStore`.
```python
@deprecated(since="0.0.3", removal="0.2.0", alternative="PineconeVectorStore")
class Pinecone(PineconeVectorStore):
    """Deprecated. Use PineconeVectorStore instead."""

    pass
```
5 months ago
Matt 28df4750ef
community[patch]: Add initial tests for AzureSearch vector store (#17663)
**Description:** AzureSearch vector store has no tests. This PR adds
initial tests to validate the code can be imported and used.
**Issue:** N/A
**Dependencies:** azure-search-documents and azure-identity are added as
optional dependencies for testing

---------

Co-authored-by: Matt Gotteiner <[email protected]>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Dristy Srivastava 5f1d1666e3
community[patch]: Add support for pebblo server and client version (#20269)
**Description**:
_PebbloSafeLoader_: Add support for pebblo server and client version


**Documentation:** NA
**Unit test:** NA
**Issue:** NA
**Dependencies:**  None

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
am-kinetica b54b19ba1c
community[minor]: Implemented Kinetica Document Loader and added notebooks (#20002)
- [ ] **Kinetica Document Loader**: "community: a class to load
Documents from Kinetica"



- [ ] **Kinetica Document Loader**: 
- **Description:** implemented KineticaLoader in `kinetica_loader.py`
- **Dependencies:** install the Kinetica API using `pip install
gpudb==7.2.0.1 `
5 months ago
Michael Schock 5e60d65917
experimental[patch]: return from HuggingGPT task executor task.run() exception (#20219)
**Description:** Fixes a bug in the HuggingGPT task execution logic
here:

      except Exception as e:
          self.status = "failed"
          self.message = str(e)
      self.status = "completed"
      self.save_product()

where a caught exception effectively just sets `self.message` and can
then throw an exception if, e.g., `self.product` is not defined.

**Issue:** None that I'm aware of.
**Dependencies:** None
**Twitter handle:** https://twitter.com/michaeljschock

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Anish Chakraborty 898362de81
core[patch]: improve comma separated list output parser to handle non-space separated list (#20434)
- **Description:** Changes
`lanchain_core.output_parsers.CommaSeparatedListOutputParser` to handle
`,` as a delimiter alongside the previous implementation which used `, `
as delimiter.
- **Issue:** Started noticing that some results returned by LLMs were
not getting parsed correctly when the output contained `,` instead of `,
`.
  - **Dependencies:** No
  - **Twitter handle:** not active on twitter.


<!---
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
-->
5 months ago
Michael Schock 63a07f52df
experimental[patch]: remove \n from AutoGPT feedback_tool exit check (#20132) 5 months ago
Shengsheng Huang fd1061e7bf
community[patch]: add more data types support to ipex-llm llm integration (#20833)
- **Description**:  
- **add support for more data types**: by default `IpexLLM` will load
the model in int4 format. This PR adds more data types support such as
`sym_in5`, `sym_int8`, etc. Data formats like NF3, NF4, FP4 and FP8 are
only supported on GPU and will be added in future PR.
    - Fix a small issue in saving/loading, update api docs
- **Dependencies**: `ipex-llm` library
- **Document**: In `docs/docs/integrations/llms/ipex_llm.ipynb`, added
instructions for saving/loading low-bit model.
- **Tests**: added new test cases to
`libs/community/tests/integration_tests/llms/test_ipex_llm.py`, added
config params.
- **Contribution maintainer**: @shane-huang
5 months ago
Rahul Triptahi dc921f0823
community[patch]: Add semantic info to metadata, classified by pebblo-server. (#20468)
Description: Add support for Semantic topics and entities.
Classification done by pebblo-server is not used to enhance metadata of
Documents loaded by document loaders.
Dependencies: None
Documentation: Updated.

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Eugene Yurtsev a5028b6356
cli[minor]: Add __version__ (#20903)
Add __version__ to cli
5 months ago
Jingpan Xiong 1202017c56
community[minor]: Add relyt vector database (#20316)
Co-authored-by: kaka <kaka@zbyte-inc.cloud>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: jingsi <jingsi@leadincloud.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
davidefantiniIntel f386f71bb3
community: fix tqdm import (#20263)
Description: Fix tqdm import in QuantizedBiEncoderEmbeddings
5 months ago
Andres Algaba 05ae8ca7d4
community[patch]: deprecate persist method in Chroma (#20855)
Thank you for contributing to LangChain!

- [x] **PR title**

- [x] **PR message**:
- **Description:** Deprecate persist method in Chroma no longer exists
in Chroma 0.4.x
    - **Issue:** #20851 
    - **Dependencies:** None
    - **Twitter handle:** AndresAlgaba1

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
ccurme fdabd3cdf5
mistral, openai: support custom tokenizers in chat models (#20901) 5 months ago
ccurme b8db73233c
core, community: deprecate tool.__call__ (#20900)
Does not update docs.
5 months ago
Tomaz Bratanic 520972fd0f
community[patch]: Support passing graph object to Neo4j integrations (#20876)
For driver connection reusage, we introduce passing the graph object to
neo4j integrations
5 months ago
Lei Zhang 748a6ae609
community[patch]: add HTTP response headers Content-Type to metadata of RecursiveUrlLoader document (#20875)
**Description:** 
The RecursiveUrlLoader loader offers a link_regex parameter that can
filter out URLs. However, this filtering capability is limited, and if
the internal links of the website change, unexpected resources may be
loaded. These resources, such as font files, can cause problems in
subsequent embedding processing.

>
https://blog.langchain.dev/assets/fonts/source-sans-pro-v21-latin-ext_latin-regular.woff2?v=0312715cbf

We can add the Content-Type in the HTTP response headers to the document
metadata so developers can choose which resources to use. This allows
developers to make their own choices.

For example, the following may be a good choice for text knowledge.

- text/plain - simple text file
- text/html - HTML web page
- text/xml - XML format file
- text/json - JSON format data
- application/pdf - PDF file
- application/msword - Word document

and ignore the following

- text/css - CSS stylesheet
- text/javascript - JavaScript script
- application/octet-stream - binary data
- image/jpeg - JPEG image
- image/png - PNG image
- image/gif - GIF image
- image/svg+xml - SVG image
- audio/mpeg - MPEG audio files
- video/mp4 - MP4 video file
- application/font-woff - WOFF font file
- application/font-ttf - TTF font file
- application/zip - ZIP compressed file
- application/octet-stream - binary data

**Twitter handle:** @coolbeevip

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Erick Friis eca3640af7
upstage: release 0.1.2 (#20898) 5 months ago
Joan Fontanals baefbfb14e
community[mionr]: add Jina Reranker in retrievers module (#19406)
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Rerank API
- **Twitter handle:** https://twitter.com/JinaAI_


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Erick Friis 92969d49cb
multiple: remove external repo mds (#20896)
api docs build doesn't tolerate them
5 months ago
Jason_Chen 53bb7dbd29
community[patch]: add BeautifulSoupTransformer remove_unwanted_classnames method (#20467)
Add the remove_unwanted_classnames method to the
BeautifulSoupTransformer class, which can filter more effectively.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
YISH ed26149a29
openai[patch]: Allow disablling safe_len_embeddings(OpenAIEmbeddings) (#19743)
OpenAI API compatible server may not support `safe_len_embedding`, 

use `disable_safe_len_embeddings=True` to disable it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Bagatur 5b83130855
core[minor], langchain[patch], community[patch]: mv StructuredQuery (#20849)
mv StructuredQuery to core
5 months ago
Sean 540f384197
partner: Upstage quick documentation update (#20869)
* Updating the provider docs page. 
The RAG example was meant to be moved to cookbook, but was merged by
mistake.

* Fix bug in Groundedness Check

---------

Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Bagatur ffad3985a1
core[patch]: Release 0.1.46 (#20891) 5 months ago
Mish Ushakov 6ccecf2363
community[minor]: added Browserbase loader (#20478) 5 months ago
Erick Friis 5da9dd1195
mistral: comment batching param (#20868)
Addresses #20523
5 months ago
Ivaylo Bratoev 7c5063ef60
infra: fix how Poetry is installed in the dev container (#20521)
Currently, when a new dev container is created, poetry does not work in
it with the error "No module named 'rapidfuzz'".

Install Poetry outside the project venv so that poetry and project
dependencies do not get mixed. Use pipx to install poetry securely in
its own isolated environment.

Issue: #12237

Twitter handle: https://twitter.com/ibratoev

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
GustavoSept c2d09a5186
experimental[patch]: Makes regex customizable in text_splitter.py (SemanticChunker class) (#20485)
- **Description:** Currently, the regex is static (`r"(?<=[.?!])\s+"`),
which is only useful for certain use cases. The current change only
moves this to be a parameter of split_text(). Which adds flexibility
without making it more complex (as the default regex is still the same).
- **Issue:** Not applicable (I searched, no one seems to have created
this issue yet).
  - **Dependencies:** None.


_If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17._

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
William FH a936f696a6
[Core] Feat: update config CVar in tool.invoke (#20808) 5 months ago
Lei Zhang 2cd907ad7e
text-splitters[patch]: fix MarkdownHeaderTextSplitter fails to parse headers with non-printable characters (#20645)
Description: MarkdownHeaderTextSplitter Fails to Parse Headers with
non-printable characters. more #20643

The following is the official test case. Just replacing `# Foo\n\n` with
`\ufeff# Foo\n\n` will cause the test case to fail.

chunk metadata is empty

```python
def test_md_header_text_splitter_1() -> None:
    """Test markdown splitter by header: Case 1."""

    markdown_document = (
        "\ufeff# Foo\n\n"
        "    ## Bar\n\n"
        "Hi this is Jim\n\n"
        "Hi this is Joe\n\n"
        " ## Baz\n\n"
        " Hi this is Molly"
    )
    headers_to_split_on = [
        ("#", "Header 1"),
        ("##", "Header 2"),
    ]
    markdown_splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=headers_to_split_on,
    )
    output = markdown_splitter.split_text(markdown_document)
    expected_output = [
        Document(
            page_content="Hi this is Jim  \nHi this is Joe",
            metadata={"Header 1": "Foo", "Header 2": "Bar"},
        ),
        Document(
            page_content="Hi this is Molly",
            metadata={"Header 1": "Foo", "Header 2": "Baz"},
        ),
    ]
    assert output == expected_output
```

twitter: @coolbeevip

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
ccurme 481d3855dc
patch: remove usage of llm, chat model __call__ (#20788)
- `llm(prompt)` -> `llm.invoke(prompt)`
- `llm(prompt=prompt` -> `llm.invoke(prompt)` (same with `messages=`)
- `llm(prompt, callbacks=callbacks)` -> `llm.invoke(prompt,
config={"callbacks": callbacks})`
- `llm(prompt, **kwargs)` -> `llm.invoke(prompt, **kwargs)`
5 months ago
Raghav Dixit 9b7fb381a4
community[patch]: LanceDB integration patch update (#20686)
Description : 

- added functionalities - delete, index creation, using existing
connection object etc.
- updated usage 
- Added LaceDB cloud OSS support

make lint_diff , make test checks done
5 months ago
Nikita Pokidyshev 9e983c9500
langchain[patch]: fix agent_token_buffer_memory not working with openai tools (#20708)
- **Description:** fix a bug in the agent_token_buffer_memory
- **Issue:** agent_token_buffer_memory was not working with openai tools
- **Dependencies:** None
- **Twitter handle:** @pokidyshef
5 months ago
Erick Friis 1aef8116de
upstage: release 0.1.1 (#20864) 5 months ago
junkeon c8fd51e8c8
upstage: Add Upstage partner package LA and GC (#20651)
---------

Co-authored-by: Sean <chosh0615@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Sean Cho <sean@upstage.ai>
5 months ago
Alex Lee 243ba71b28
langchain[patch]: add `aprep_output` method to `langchain/chains/base.py` (#20748)
## Description

Add `aprep_output` method to `langchain/chains/base.py`. Some downstream
`ChatMessageHistory` objects that use async connections require an async
way to append to the context.

It turned out that `ainvoke()` was calling `prep_output` which is
synchronous.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Harrison Chase 43c041cda5
support messages in messages out (#20862) 5 months ago
back2nix a1614b88ac
groq[patch]: groq proxy support (#20758)
# Proxy Fix for Groq Class 🐛 🚀

## Description
This PR fixes a bug related to proxy settings in the `Groq` class,
allowing users to connect to LangChain services via a proxy.

## Changes Made
-  FIX support for specifying proxy settings in the `Groq` class.
-  Resolved the bug causing issues with proxy settings.
-  Did not include unit tests and documentation updates.
-  Did not run make format, make lint, and make test to ensure code
quality and functionality because I couldn't get it to run, so I don't
program in Python and couldn't run `ruff`.
-  Ensured that the changes are backwards compatible.
-  No additional dependencies were added to `pyproject.toml`.

### Error Before Fix
```python
Traceback (most recent call last):
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/main.py", line 9, in <module>
    chat = ChatGroq(
           ^^^^^^^^^
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for ChatGroq
__root__
  Invalid `http_client` argument; Expected an instance of `httpx.AsyncClient` but got <class 'httpx.Client'> (type=type_error)
  ```
  
### Example usage after fix
  ```python3
import os

import httpx
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq

chat = ChatGroq(
    temperature=0,
    groq_api_key=os.environ.get("GROQ_API_KEY"),
    model_name="mixtral-8x7b-32768",
    http_client=httpx.Client(
        proxies="socks5://127.0.0.1:1080",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
    http_async_client=httpx.AsyncClient(
        proxies="socks5://127.0.0.1:1080",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)

system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | chat
out = chain.invoke({"text": "Explain the importance of low latency LLMs"})

print(out)
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
volodymyr-memsql 493afe4d8d
community[patch]: add hybrid search to singlestoredb vectorstore (#20793)
Implemented the ability to enable full-text search within the
SingleStore vector store, offering users a versatile range of search
strategies. This enhancement allows users to seamlessly combine
full-text search with vector search, enabling the following search
strategies:

* Search solely by vector similarity.
* Conduct searches exclusively based on text similarity, utilizing
Lucene internally.
* Filter search results by text similarity score, with the option to
specify a threshold, followed by a search based on vector similarity.
* Filter results by vector similarity score before conducting a search
based on text similarity.
* Perform searches using a weighted sum of vector and text similarity
scores.

Additionally, integration tests have been added to comprehensively cover
all scenarios.
Updated notebook with examples.

CC: @baskaryan, @hwchase17

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Tomaz Bratanic 9efab3ed66
community[patch]: Add driver config param for neo4j graph (#20772)
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline 13751c3297
community: `tigergraph` fixes (#20034)
- added guard on the `pyTigerGraph` import
- added a missed example page in the `docs/integrations/graphs/`
- formatted the `docs/integrations/providers/` page to the consistent
format. Added links.
5 months ago
Martin Kolb 0186e4e633
community[patch]: Advanced filtering for HANA Cloud Vector Engine (#20821)
- **Description:**
This PR adds support for advanced filtering to the integration of HANA
Vector Engine.
The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt,
$lte, $between, $in, $nin, $like, $and, $or

  - **Issue:** N/A
  - **Dependencies:** no new dependencies added

Added integration tests to:
`libs/community/tests/integration_tests/vectorstores/test_hanavector.py`

Description of the new capabilities in notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
5 months ago
Alex Sherstinsky 12e5ec6de3
community: Support both Predibase SDK-v1 and SDK-v2 in Predibase-LangChain integration (#20859) 5 months ago
Erick Friis 8c95ac3145
docs, multiple: de-beta with_structured_output (#20850) 5 months ago
Nuno Campos 477eb1745c
Better support for subgraphs in graph viz (#20840) 5 months ago
JeffKatzy 5ab3f9a995
community[patch]: standardize chat init args (#20844)
Thank you for contributing to LangChain!

community:perplexity[patch]: standardize init args

updated pplx_api_key and request_timeout so that aliased to api_key, and
timeout respectively. Added test that both continue to set the same
underlying attributes.

Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Massimiliano Pronesti 8d1167b32f
community[patch]: add support for similarity_score_threshold search in… (#20852)
See
https://github.com/langchain-ai/langchain/issues/20600#issuecomment-2075569338
for details.

@chrislrobert
5 months ago
Eugene Yurtsev d8aa72f51d
core[minor],langchain[patch]: Move base indexing interface and logic to core (#20667)
This PR moves the interface and the logic to core.

The following changes to namespaces:


`indexes` -> `indexing`
`indexes._api` -> `indexing.api`


Testing code is intentionally duplicated for now since it's testing
different
implementations of the record manager (in-memory vs. SQL).

Common logic will need to be pulled out into the test client.


A follow up PR will move the SQL based implementation outside of
LangChain.
5 months ago
ccurme 3bcfbcc871
groq: handle null queue_time (#20839) 5 months ago
Eugene Yurtsev 30e48c9878
core[patch],community[patch]: Move file chat history back to community (#20834)
Marking as patch since we haven't had releases in between. This just reverting part of a PR from yesterday.
5 months ago
ccurme 6debadaa70
groq: bump core (#20838) 5 months ago
Erick Friis 7984206c95
groq: release 0.1.3 (#20836)
Fixes #20811
5 months ago
Nestor Qin 9111d3a636
community[patch]: Fix message formatting for Anthropic models on Amazon Bedrock (#20801)
**Description:**
This PR fixes an issue in message formatting function for Anthropic
models on Amazon Bedrock.

Currently, LangChain BedrockChat model will crash if it uses Anthropic
models and the model return a message in the following type:
- `AIMessageChunk`

Moreover, when use BedrockChat with for building Agent, the following
message types will trigger the same issue too:
- `HumanMessageChunk`
- `FunctionMessage`

**Issue:**
https://github.com/langchain-ai/langchain/issues/18831

**Dependencies:**
No.

**Testing:**
Manually tested. The following code was failing before the patch and
works after.

```
@tool
def square_root(x: str):
    "Useful when you need to calculate the square root of a number"
    return math.sqrt(int(x))

llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={ "temperature": 0.0 },
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", FUNCTION_CALL_PROMPT),
        ("human", "Question: {user_input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

tools = [square_root]
tools_string = format_tool_to_anthropic_function(square_root)

agent = (
        RunnablePassthrough.assign(
            user_input=lambda x: x['user_input'],
            agent_scratchpad=lambda x: format_to_openai_function_messages(
                x["intermediate_steps"]
            )
        )
        | prompt
        | llm
        | AnthropicFunctionsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True)
output = agent_executor.invoke({
    "user_input": "What is the square root of 2?",
    "tools_string": tools_string,
})
```
List of messages returned from Bedrock:
```
<SystemMessage> content='You are a helpful assistant.'
<HumanMessage> content='Question: What is the square root of 2?'
<AIMessageChunk> content="Okay, let's calculate the square root of 2.<scratchpad>\nTo calculate the square root of a number, I can use the square_root tool:\n\n<function_calls>\n  <invoke>\n    <tool_name>square_root</tool_name>\n    <parameters>\n      <__arg1>2</__arg1>\n    </parameters>\n  </invoke>\n</function_calls>\n</scratchpad>\n\n<function_results>\n<search_result>\nThe square root of 2 is approximately 1.414213562373095\n</search_result>\n</function_results>\n\n<answer>\nThe square root of 2 is approximately 1.414213562373095\n</answer>" id='run-92363df7-eff6-4849-bbba-fa16a1b2988c'"
<FunctionMessage> content='1.4142135623730951' name='square_root'
```
5 months ago
ccurme 06b04b80b8
groq: fix warning filter for integration test (#20806) 5 months ago
ccurme 5a3c65a756
standard tests: add xfails (#20659) 5 months ago
Erick Friis ddc2274aea
standard-tests: split tool calling test (#20803)
just making it a bit easier to grok
5 months ago
ccurme 6622829c67
mistral: catch GatedRepoError, release 0.1.3 (#20802)
https://github.com/langchain-ai/langchain/issues/20618

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Eugene Yurtsev a7c347ab35
langchain[patch]: Update evaluation logic that instantiates a default LLM (#20760)
Favor langchain_openai over langchain_community for evaluation logic.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
5 months ago
Eugene Yurtsev 72f720fa38
langchain[major]: Remove default instantations of LLMs from VectorstoreToolkit (#20794)
Remove default instantiation from vectorstore toolkit.
5 months ago
ccurme 42de5168b1
langchain: deprecate LLMChain, RetrievalQA, and ConversationalRetrievalChain (#20751) 5 months ago
Erick Friis 30c7951505
core: use qualname in beta message (#20361) 5 months ago
Aliaksandr Kuzmik 5560cc448c
community[patch]: fix CometTracer bug (#20796)
Hi! My name is Alex, I'm an SDK engineer from
[Comet](https://www.comet.com/site/)

This PR updates the `CometTracer` class.

Fixed an issue when `CometTracer` failed while logging the data to Comet
because this data is not JSON-encodable.

The problem was in some of the `Run` attributes that could contain
non-default types inside, now these attributes are taken not from the
run instance, but from the `run.dict()` return value.
5 months ago
Eugene Yurtsev 1c89e45c14
langchain[major]: breaks some chains to remove hidden defaults (#20759)
Breaks some chains in langchain to remove hidden chat model / llm instantiation.
5 months ago
Eugene Yurtsev ad6b5f84e5
community[patch],core[minor]: Move in memory cache implementation to core (#20753)
This PR moves the InMemoryCache implementation from community to core.
5 months ago
Eugene Yurtsev a2cc9b55ba
core[patch]: Remove autoupgrade to addable dict in Runnable/RunnableLambda/RunnablePassthrough transform (#20677)
Causes an issue for this code

```python
from langchain.chat_models.openai import ChatOpenAI
from langchain.output_parsers.openai_tools import JsonOutputToolsParser
from langchain.schema import SystemMessage

prompt = SystemMessage(content="You are a nice assistant.") + "{question}"

llm = ChatOpenAI(
    model_kwargs={
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "web_search",
                    "description": "Searches the web for the answer to the question.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The question to search for.",
                            },
                        },
                    },
                },
            }
        ],
    },
    streaming=True,
)

parser = JsonOutputToolsParser(first_tool_only=True)

llm_chain = prompt | llm | parser | (lambda x: x)


for chunk in llm_chain.stream({"question": "tell me more about turtles"}):
    print(chunk)

# message = llm_chain.invoke({"question": "tell me more about turtles"})

# print(message)
```

Instead by definition, we'll assume that RunnableLambdas consume the
entire stream and that if the stream isn't addable then it's the last
message of the stream that's in the usable format.

---

If users want to use addable dicts, they can wrap the dict in an
AddableDict class.

---

Likely, need to follow up with the same change for other places in the
code that do the upgrade
5 months ago
Oleksandr Yaremchuk 9428923bab
experimental[minor]: upgrade the prompt injection model (#20783)
- **Description:** In January, Laiyer.ai became part of ProtectAI, which
means the model became owned by ProtectAI. In addition to that,
yesterday, we released a new version of the model addressing issues the
Langchain's community and others mentioned to us about false-positives.
The new model has a better accuracy compared to the previous version,
and we thought the Langchain community would benefit from using the
[latest version of the
model](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2).
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** @alex_yaremchuk
5 months ago