Commit Graph

1005 Commits (175ef0a55d2fc04f6fd02089d40a35a2216237fa)

Author SHA1 Message Date
Harrison Chase a1ade48e8f
update agent docs (#10894) 1 year ago
Bagatur d37ce48e60
sep base url and loaded url in sub link extraction (#10895) 1 year ago
Bagatur 24cb5cd379
bump 298 (#10892) 1 year ago
Bagatur c1f9cc0bc5
recursive loader add status check (#10891) 1 year ago
Matvey Arye 6e02c45ca4
Add integration for Timescale Vector(Postgres) (#10650)
**Description:**
This commit adds a vector store for the Postgres-based vector database
(`TimescaleVector`).

Timescale Vector(https://www.timescale.com/ai) is PostgreSQL++ for AI
applications. It enables you to efficiently store and query billions of
vector embeddings in `PostgreSQL`:
- Enhances `pgvector` with faster and more accurate similarity search on
1B+ vectors via DiskANN inspired indexing algorithm.
- Enables fast time-based vector search via automatic time-based
partitioning and indexing.
- Provides a familiar SQL interface for querying vector embeddings and
relational data.

Timescale Vector scales with you from POC to production:
- Simplifies operations by enabling you to store relational metadata,
vector embeddings, and time-series data in a single database.
- Benefits from rock-solid PostgreSQL foundation with enterprise-grade
feature liked streaming backups and replication, high-availability and
row-level security.
- Enables a worry-free experience with enterprise-grade security and
compliance.

Timescale Vector is available on Timescale, the cloud PostgreSQL
platform. (There is no self-hosted version at this time.) LangChain
users get a 90-day free trial for Timescale Vector.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Avthar Sewrathan <avthar@timescale.com>
1 year ago
Michael Feil 55570e54e1
gradient.ai LLM intregration (#10800)
- **Description:** This PR implements a new LLM API to
https://gradient.ai
- **Issue:** Feature request for LLM #10745 
- **Dependencies**: No additional dependencies are introduced. 
- **Tag maintainer:** I am opening this PR for visibility, once ready
for review I'll tag.

- ```make format && make lint && make test``` is running.
- added a `integration` and `mock unit` test.


Co-authored-by: michaelfeil <me@michaelfeil.eu>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 5097007407
cleanup recursive url session (#10863) 1 year ago
Harrison Chase 777b33b873
fix experimental imports (#10875) 1 year ago
Harrison Chase 808caca607
beef up agent docs (#10866) 1 year ago
Sharath Rajasekar 96023f94d9
Add Javelin integration (#10275)
We are introducing the py integration to Javelin AI Gateway
www.getjavelin.io. Javelin is an enterprise-scale fast llm router &
gateway. Could you please review and let us know if there is anything
missing.

Javelin AI Gateway wraps Embedding, Chat and Completion LLMs. Uses
javelin_sdk under the covers (pip install javelin_sdk).

Author: Sharath Rajasekar, Twitter: @sharathr, @javelinai

Thanks!!
1 year ago
Bagatur 957956ba6d
bump 297 (#10861) 1 year ago
Harrison Chase 1bc3244db9
fix loading of sql chain (#10860)
Closing #6889
1 year ago
Bagatur b05a74b106
fix recursive loader (#10856) 1 year ago
Bagatur de0a02f507
fix extract sublink bug (#10855) 1 year ago
Harrison Chase 7dec2d399b
format intermediate steps (#10794)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
1 year ago
Harrison Chase 386ef1e654
add agent output parsers (#10790) 1 year ago
Mukit Momin 67c5950df3
Amazon Bedrock Support Streaming (#10393)
### Description

- Add support for streaming with `Bedrock` LLM and `BedrockChat` Chat
Model.
- Bedrock as of now supports streaming for the `anthropic.claude-*` and
`amazon.titan-*` models only, hence support for those have been built.
- Also increased the default `max_token_to_sample` for Bedrock
`anthropic` model provider to `256` from `50` to keep in line with the
`Anthropic` defaults.
- Added examples for streaming responses to the bedrock example
notebooks.

**_NOTE:_**: This PR fixes the issues mentioned in #9897 and makes that
PR redundant.
1 year ago
Bagatur 0749a642f5
Stream refac and vertex streaming (#10470)
---------

Co-authored-by: Terry Cruz Melo <tcruz@vozy.co>
Co-authored-by: Terry Cruz Melo <33166112+TerryCM@users.noreply.github.com>
1 year ago
William FH f421af8b80
Criteria Parser Improvements (#10824) 1 year ago
Bagatur 46aa90062b
bump exp 19 (#10851) 1 year ago
Bagatur 775f3edffd
bump 296 (#10842) 1 year ago
Bagatur 96a9c27116
fix recursive loader (#10752)
maintain same base url throughout recursion, yield initial page, fixing
recursion depth tracking
1 year ago
Nuno Campos 276125a33b
Use shallow copy on runnable locals (#10825)
- deep copy prevents storing complex objects in locals
1 year ago
DanielZzz ebe08412ad
fix: chat_models Qianfan not compatiable with SystemMessage (#10642)
- **Description:** QianfanEndpoint bugs for SystemMessages. When the
`SystemMessage` is input as the messages to
`chat_models.QianfanEndpoint`. A `TypeError` will be raised.
  - **Issue:** #10643
  - **Dependencies:** 
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** no
1 year ago
Massimiliano Pronesti f0198354d9
fix(embeddings): number of texts in Azure OpenAIEmbeddings batch (#10707)
This PR addresses the limitation of Azure OpenAI embeddings, which can
handle at maximum 16 texts in a batch. This can be solved setting
`chunk_size=16`. However, I'd love to have this automated, not to force
the user to figure where the issue comes from and how to solve it.

Closes #4575. 

@baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
zhanghexian 0abe996409
add clustered vearch in langchain (#10771)
---------

Co-authored-by: zhanghexian1 <zhanghexian1@jd.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
HeTaoPKU f505320a73
Add Minimax chat model (#10776)
resolve the merging issues for
https://github.com/langchain-ai/langchain/pull/6757

---------

Co-authored-by: 何涛 <taohe@bytedance.com>
1 year ago
Anar c656a6b966
LLMRails (#10796)
### LLMRails Integration
This PR provides integration with LLMRails. Implemented here are:

langchain/vectorstore/llm_rails.py
tests/integration_tests/vectorstores/test_llm_rails.py
docs/extras/integrations/vectorstores/llm-rails.ipynb

---------

Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
mateai 900dbd1cbe
Substring support for similarity_search_with_score (#10746)
**Description:** Possible to filter with substrings in
similarity_search_with_score, for example: filter={'user_id':
{'substring': 'user'}}

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Ansil M B 740eafe41d
Updated return parameter of YouTubeSearchTool (#10743)
**Description:** 
changed return parameter of YouTubeSearchTool
 

1. changed the returning links of youtube videos by adding prefix
"https://www.youtube.com", now this will return the exact links to the
videos
2. updated the returning type from 'string' to 'list', which will be
more suited for further processings

 **Issue:** 
Fixes #10742

 **Dependencies:** 
None


<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** changed return parameter of YouTubeSearchTool
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** None
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 1dae3c383e
Harrison/add submodule to docs (#10803) 1 year ago
Henry (Hezheng) Yin c15bbaac31
misc: add gpt-3.5-turbo-instruct to model_token_mapping (#10808)
A one-line fix to get`max_tokens=-1` working `OpenAI` class for
`gpt-3.5-turbo-instruct` model.

Closes https://github.com/langchain-ai/langchain/issues/10806
1 year ago
Harrison Chase d2bee34d4c
Harrison/add vald (#10807)
Co-authored-by: datelier <57349093+datelier@users.noreply.github.com>
1 year ago
Jacob Lee bbc3fe259b
Start RunnableBranch callback tags with 1 instead of 0 (#10755)
Changes to match `RunnableSequences`

@eyurtsev
1 year ago
Ziyang Liu 931b292126
Add support for HTTP PUT in the open api agent prompt (#10763)
**Description:** This PR adds HTTP PUT support for the langchain openapi
agent toolkit by leveraging existing structure and HTTP put request
wrapper. The PUT method is almost identical to HTTP POST but should be
idempotent and therefore tighter than POST which is not idempotent. Some
APIs may consider to use PUT instead of POST which is unfortunately not
supported with the current toolkit yet.
1 year ago
Mateusz Wosinski a29cd89923
Synthetic data generation (#9759)
### Description

Implements synthetic data generation with the fields and preferences
given by the user. Adds showcase notebook.
Corresponding prompt was proposed for langchain-hub.

### Example

```
output = chain({"fields": {"colors": ["blue", "yellow"]}, "preferences": {"style": "Make it in a style of a weather forecast."}})
print(output)

# {'fields': {'colors': ['blue', 'yellow']},
 'preferences': {'style': 'Make it in a style of a weather forecast.'},
 'text': "Good morning! Today's weather forecast brings a beautiful combination of colors to the sky, with hues of blue and yellow gently blending together like a mesmerizing painting."}
```

### Twitter handle 

@deepsense_ai @matt_wosinski

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur c4a6de3fc9
Revert "Add ChatGLM for llm and chat_model by using ChatGLM API (#9797)" (#10805)
@etveritas reverting for now until this is resolved
https://github.com/langchain-ai/langchain/pull/9797/files#r1330795585,
apologies for merging too eagerly!
1 year ago
Mickaël c86a1a6710
chore: allow using dataclasses_json dependency v0.6.0 (#10775)
**Description:** upgrade the `dataclasses_json` dependency to its latest
version ([no real breaking
change](https://github.com/lidatong/dataclasses-json/releases/tag/v0.6.0)
if used correctly), while allowing previous version to not break other
users' setup
**Issue:** I need to use the latest version of that dependency in my
project, but `langchain` prevents it.

Note: it looks like running `poetry lock --no-update` did some changes
to the lockfiles as it was the first time it was with the
`macosx_11_0_arm64` architecture 🤷

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Bagatur 76dd7480e6
Add batch_size param to Weaviate vector store (#9890)
cc @mcantillon21 @hsm207 @cs0lar
1 year ago
Mateusz Wosinski 720f6dbaac
Add XMLOutputParser (#10051)
**Description**
Adds new output parser, this time enabling the output of LLM to be of an
XML format. Seems to be particularly useful together with Claude model.
Addresses [issue
9820](https://github.com/langchain-ai/langchain/issues/9820).

**Twitter handle**
@deepsense_ai @matt_wosinski
1 year ago
etVERITAS d6df288380
Add ChatGLM for llm and chat_model by using ChatGLM API (#9797)
using sample:
```
endpoint_url = API URL
ChatGLM_llm = ChatGLM(
    endpoint_url=endpoint_url,
    api_key=Your API Key by ChatGLM
)
print(ChatGLM_llm("hello"))
```

```
model = ChatChatGLM(
    chatglm_api_key="api_key",
    chatglm_api_base="api_base_url",
    model_name="model_name"
)
chain = LLMChain(llm=model)
```
Description: The call of ChatGLM has been adapted.
Issue: The call of ChatGLM has been adapted.
Dependencies: Need python package `zhipuai` and `aiostream`
Tag maintainer: @baskaryan
Twitter handle: None

I remove the compatibility test for pydantic version 2, because pydantic
v2 can't not pickle classmethod,but BaseModel use @root_validator is a
classmethod decorator.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase d60145229b
make agent action serializable (#10797)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Maxime Bourliatoux 21b236e5e4
Fixing _InactiveRpcError in MatchingEngine vectorstore (#10056)
- Description: There was an issue with the MatchingEngine VectorStore,
preventing from using it with a public endpoint. In the Google Cloud
library there are two similar methods for private or public endpoints :
`match()` and `find_neighbors()`.
  - Issue: Fixes #8378 
- This uses the `google.cloud.aiplatform` library :
https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/matching_engine/matching_engine_index_endpoint.py
1 year ago
Sam Chou 4f19ba3065
Azure Search: Remove select field restrictions and expand metadata to other fields, also expose kwargs to searches (#9894)
Description: 
If metadata field returned in results, previous behavior unchanged. If
metadata field does not exist in results, expand metadata to any fields
returned outside of content field.

There's precedence for this as well, see the retriever:
https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/azure_cognitive_search.py#L96C46-L96C46

Issue: 
#9765 - Ameliorates hard-coding in case you already indexed to cognitive
search without a metadata field but rather placed metadata in separate
fields.

@hwchase17
1 year ago
Piyush Jain 94cf71ecfa
Updated Neptune graph to use boto (#10121)
## Description
This PR updates the `NeptuneGraph` class to start using the boto API for
connecting to the Neptune service. With boto integration, the graph
class now supports authenticating requests using Sigv4; this is
encapsulated with the boto API, and users only have to ensure they have
the correct AWS credentials setup in their workspace to work with the
graph class.

This PR also introduces a conditional prompt that uses a simpler prompt
when using the `Anthropic` model provider. A simpler prompt have seemed
to work better for generating cypher queries in our testing.

**Note**: This version will require boto3 version 1.28.38 or greater to
work.
1 year ago
Douglas Monsky d5f1969d55
Introducing Enhanced Functionality to WeaviateHybridSearchRetriever: Accepting Additional Keyword Arguments (#10802)
**Description:** 
This commit enriches the `WeaviateHybridSearchRetriever` class by
introducing a new parameter, `hybrid_search_kwargs`, within the
`_get_relevant_documents` method. This parameter accommodates arbitrary
keyword arguments (`**kwargs`) which can be channeled to the inherited
public method, `get_relevant_documents`, originating from the
`BaseRetriever` class.

This modification facilitates more intricate querying capabilities,
allowing users to convey supplementary arguments to the `.with_hybrid()`
method. This expansion not only makes it possible to perform a more
nuanced search targeting specific properties but also grants the ability
to boost the weight of searched properties, to carry out a search with a
custom vector, and to apply the Fusion ranking method. The documentation
has been updated accordingly to delineate these new possibilities in
detail.

In light of the layered approach in which this search operates,
initiating with `query.get()` and then transitioning to
`.with_hybrid()`, several advantageous opportunities are unlocked for
the hybrid component that were previously unattainable.

Here’s a representative example showcasing a query structure that was
formerly unfeasible:

[Specific Properties
Only](https://weaviate.io/developers/weaviate/search/hybrid#selected-properties-only)
"The example below illustrates a BM25 search targeting the keyword
'food' exclusively within the 'question' property, integrated with
vector search results corresponding to 'food'."
```python
response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer"])
    .with_hybrid(
        query="food",
        properties=["question"], # Will now be possible moving forward
        alpha=0.25
    )
    .with_limit(3)
    .do()
)
```
This functionality is now accessible through my alterations, by
conveying `hybrid_search_kwargs={"properties": ["question", "answer"]}`
as an argument to
`WeaviateHybridSearchRetriever.get_relevant_documents()`. For example:

```python
import os
from weaviate import Client
from langchain.retrievers import WeaviateHybridSearchRetriever

client = Client(
        url=os.getenv("WEAVIATE_CLIENT_URL"),
        additional_headers={
            "X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY"),
            "Authorization": f"Bearer {os.getenv('WEAVIATE_API_KEY')}",
        },
    )

index_name = "Document"
text_key = "content"
attributes = ["title", "summary", "header", "url"]

retriever = ExtendedWeaviateHybridSearchRetriever(
        client=client,
        index_name=index_name,
        text_key=text_key,
        attributes=attributes,
    )

# Warning: to utilize properties in this way, each use property must also be in the list `attributes + [text_key]`.
hybrid_search_kwargs = {"properties": ["summary^2", "content"]}
query_text = "Some Query Text"

relevant_docs = retriever.get_relevant_documents(
        query=query_text,
        hybrid_search_kwargs=hybrid_search_kwargs
    )
```
In my experience working with the `weaviate-client` library, I have
found that these supplementary options stand as vital tools for
refining/finetuning searches, notably within multifaceted datasets. As a
final note, this implementation supports both backwards and forward
(within reason) compatiblity. It accommodates any future additional
parameters Weaviate may add to `.with_hybrid()`, without necessitating
further alterations.

**Additional Documentation:**
For a more comprehensive understanding and to explore a myriad of useful
options that are now accessible, please refer to the Weaviate
documentation:
- [Fusion Ranking
Method](https://weaviate.io/developers/weaviate/search/hybrid#fusion-ranking-method)
- [Selected Properties
Only](https://weaviate.io/developers/weaviate/search/hybrid#selected-properties-only)
- [Weight Boost Searched
Properties](https://weaviate.io/developers/weaviate/search/hybrid#weight-boost-searched-properties)
- [With a Custom
Vector](https://weaviate.io/developers/weaviate/search/hybrid#with-a-custom-vector)

**Tag Maintainer:** 
@hwchase17 - I have tagged you based on your frequent contributions to
the pertinent file, `/retrievers/weaviate_hybrid_search.py`. My
apologies if this was not the appropriate choice.

Thank you for considering my contribution, I look forward to your
feedback, and to future collaboration.
1 year ago
Jacob Lee 61cecf8b1b
Fix for versioned OpenAI instruct models (#10788)
Versioned OpenAI instruct models may end with numbers, e.g.
`gpt-3.5-turbo-instruct-0914`.

Fixes https://github.com/langchain-ai/langchainjs/issues/2669 in Python
1 year ago
Cory Zue 62603f2664
make auto-setting the encodings optional, alow explicitly setting it (#10774)
I was trying to use web loaders on some spanish documentation (e.g.
[this site](https://www.fromdoppler.com/es/mailing-tendencias/), but the
auto-encoding introduced in
https://github.com/langchain-ai/langchain/pull/3602 was detected as
"MacRoman" instead of the (correct) "UTF-8".

To address this, I've added the ability to disable the auto-encoding, as
well as the ability to explicitly tell the loader what encoding to use.

- **Description:** Makes auto-setting the encoding optional in
`WebBaseLoader`, and introduces an `encoding` option to explicitly set
it.
  - **Dependencies:** N/A
  - **Tag maintainer:** @hwchase17 
  - **Twitter handle:** @czue
1 year ago
Harrison Chase c68be4eb2b
tool rendering (#10786) 1 year ago
Aashish Saini 1b050b98f5
Corrected some spelling mistakes and grammatical errors (#10791)
Corrected some spelling mistakes and grammatical errors
CC: @baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Ishita Chauhan <136303787+IshitaChauhanShortHillsAI@users.noreply.github.com>
Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: ManpreetShorthillsAI <142380984+ManpreetShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com>
Co-authored-by: Md Nazish Arman <142379599+MdNazishArmanShorthillsAI@users.noreply.github.com>
Co-authored-by: KamalSharmaShorthillsAI <142474019+KamalSharmaShorthillsAI@users.noreply.github.com>
Co-authored-by: Lakshya <lakshyagupta87@yahoo.com>
Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>
Co-authored-by: ishita <chauhanishita5356@gmail.com>
1 year ago
Ahmad Bunni 5272e42b0d
Add namespace to pinecone hybrid search (#10677)
**Description:** 
  
Pinecone hybrid search is now limited to default namespace. There is no
option for the user to provide a namespace to partition an index, which
is one of the most important features of pinecone.
  
**Resource:** 
https://docs.pinecone.io/docs/namespaces

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Bagatur 0d1550da91
Bagatur/bump 295 (#10785) 1 year ago
Vikram Shitole a4e858b111
Sagemaker endpoint capability to inject boto3 client for cross account scenarios (#10728)
- **Description: Allow to inject boto3 client for Cross account access
type of scenarios in using Sagemaker Endpoint **
  - **Issue:#10634 #10184** 
  - **Dependencies: None** 
  - **Tag maintainer:** 
  - **Twitter handle:lethargicoder**

Co-authored-by: Vikram(VS) <vssht@amazon.com>
1 year ago
William FH c8f386db97
Merge metadata + tags in config (#10762)
Think these should be a merge/update rather than overwrite
1 year ago
BarberAlec c898a4d7ba
Update ContextCallbackHandler Docstring & metadata key (#10732)
- **Description:** Updating URL in Context Callback Docstrings and
update metadata key Context CallbackHandler uses to send model names.
- **Issue:** The URL in ContextCallbackHandler is out of date. Model
data being sent to Context should be under the "model" key and not
"llm_model". This allows Context to do more sophisticated analysis.
  - **Dependencies:** None

Tagging @agamble.
1 year ago
Harrison Chase 8b68d1a03b
keep reference to old embeddings base (#10759) 1 year ago
Jacob Lee babf46692d
Allow extra variables when invoking prompt templates (#10765)
Makes chaining easier as many maps have extra properties.

@baskaryan @hwchase17
1 year ago
Bagatur 8515e27d82
bump 294 (#10751) 1 year ago
Jacob Lee 579d14fbc1
Allow 3.5-turbo instruct models in the OpenAI LLM class (#10750)
@baskaryan @hwchase17
1 year ago
Harrison Chase e404fd39dd
add anthropic page (#10666) 1 year ago
Bagatur 5072138893
bump 293 (#10740) 1 year ago
Harrison Chase 12ff780089
move embeddings to schema (#10696) 1 year ago
Jiayi Ni ce61840e3b
ENH: Add `llm_kwargs` for Xinference LLMs (#10354)
- This pr adds `llm_kwargs` to the initialization of Xinference LLMs
(integrated in #8171 ).
- With this enhancement, users can not only provide `generate_configs`
when calling the llms for generation but also during the initialization
process. This allows users to include custom configurations when
utilizing LangChain features like LLMChain.
- It also fixes some format issues for the docstrings.
1 year ago
Eugene Yurtsev 1eefb9052b
RunnableBranch (#10594)
Runnable Branch implementation, no optimization for streaming logic yet
1 year ago
William FH 287c81db89
Catch Base Exception (#10607)
Currently the on_*_error isn't called for CancellationError's. This is
because in python 3.8, the inheritance changed from Exception to
BaseException


https://docs.python.org/3/library/asyncio-exceptions.html#asyncio.CancelledError
1 year ago
Philippe PRADOS 39c1c94272
Fix typing in WebResearchRetriver (#10734)
Hello @hwchase17 

**Issue**:
The class WebResearchRetriever accept only
RecursiveCharacterTextSplitter, but never uses a specification of this
class. I propose to change the type to TextSplitter. Then, the lint can
accept all subtypes.
1 year ago
Nuno Campos 8201cae770
Bug fixes for runnables (#10738)
- tools invoked in async methods would not work due to missing await
- RunnableSequence.stream() was creating an extra root run by mistake,
and it can simplified due to existence of default implementation for
.transform()

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
1 year ago
William FH 6e48092746
Update LangSmith Version (#10722)
And assign dataset ID upon project creation
1 year ago
William FH a3e5507faa
Make eval output parsers more robust (#10658)
Ran through a few hundred generations with some models to fix up the
parsers
1 year ago
William FH c5078fb13c
Add support for showing IO to chain group (#10510)
As well as error propagation
1 year ago
Harrison Chase 2c957de2fc
add checks on basic base modules (#10693) 1 year ago
Harrison Chase 5442d2b1fa
Harrison/stop importing from init (#10690) 1 year ago
Hedeer El Showk 9749f8ebae
database -> db in from_llm (#10667)
**Description:** Renamed argument `database` in
`SQLDatabaseSequentialChain.from_llm()` to `db`,

I realize it's tiny and a bit of a nitpick but for consistency with
SQLDatabaseChain (and all the others actually) I thought it should be
renamed. Also got me while working and using it today.

✔️ Please make sure your PR is passing linting and
testing before submitting. Run `make format`, `make lint` and `make
test` to check this locally.
1 year ago
Joshua Sundance Bailey c4e591a57d
OpenAI function calling docstring and notebook imports (#10663)
This PR is a documentation fix.

Description:
* fixes imports in the code samples in the docstrings of
`create_openai_fn_chain` and `create_structured_output_chain`
* fixes imports in
`docs/extras/modules/chains/how_to/openai_functions.ipynb`
* removes unused imports from the notebook

Issues:
* the docstrings use `from pydantic_v1 import BaseModel, Field` which
this PR changes to `from langchain.pydantic_v1 import BaseModel, Field`
* importing `pydantic` instead of `langchain.pydantic_v1` leads to
errors later in the notebook
1 year ago
Nuno Campos 9cd131a178
Support kwargs in RunnableWithFallbacks (#10682)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
1 year ago
Bagatur 6831a25675
bump 292 (#10649) 1 year ago
Nuno Campos 029b2f6aac
Allow calls to batch() with 0 length arrays (#10627)
This can happen if eg the input to batch is a list generated dynamically, where a 0-length list might be a valid use case
1 year ago
Jacob Lee a50e62e44b
Adds transform and atransform support to runnable sequences (#9583)
Allow runnable sequences to support transform if each individual
runnable inside supports transform/atransform.

@nfcampos
1 year ago
Aashish Saini f9f1340208
Fixed some grammatical and spelling errors (#10595)
Fixed some grammatical and spelling errors
1 year ago
Ackermann Yuriy 5e50b89164
Added embeddings support for ollama (#10124)
- Description: Added support for Ollama embeddings
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: N/A
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
  - Twitter handle: @herrjemand

cc  https://github.com/jmorganca/ollama/issues/436
1 year ago
Bagatur bc6b9331a9
bump 291 (#10604) 1 year ago
Bagatur ecbb1ed8cb
Replicate params fix (#10603) 1 year ago
Bagatur 50bb704da5
bump 290 (#10602) 1 year ago
Bagatur e195b78e1d
Fix replicate model kwargs (#10599) 1 year ago
Bagatur 77a165e0d9
fix replicate output type (#10598) 1 year ago
Bagatur 0786395b56
bump 289 (#10586)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
1 year ago
Bagatur 9dd4cacae2
add replicate stream (#10518)
support direct replicate streaming. cc @cbh123 @tjaffri
1 year ago
Bagatur 7f3f6097e7
Add mmr support to redis retriever (#10556) 1 year ago
Bagatur ccf71e23e8
cache replicate version (#10517)
In subsequent pr will update _call to use replicate.run directly when
not streaming, so version object isn't needed at all

cc @cbh123 @tjaffri
1 year ago
Stefano Lottini 49b65a1b57
CassandraCache and CassandraSemanticCache can handle any "Generation" (#10563)
Hello,
this PR improves coverage for caching by the two Cassandra-related
caches (i.e. exact-match and semantic alike) by switching to the more
general `dumps`/`loads` serdes utilities.

This enables cache usage within e.g. `ChatOpenAI` contexts (which need
to store lists of `ChatGeneration` instead of `Generation`s), which was
not possible as long as the cache classes were relying on the legacy
`_dump_generations_to_json` and `_load_generations_from_json`).

Additionally, a slightly different init signature is introduced for the
cache objects:
- named parameters required for init, to pave the way for easier changes
in the future connect-to-db flow (and tests adjusted accordingly)
- added a `skip_provisioning` optional passthrough parameter for use
cases where the user knows the underlying DB table, etc already exist.

Thank you for a review!
1 year ago
Tomaz Bratanic e1e01d6586
Add Neo4j vector index hybrid search (#10442)
Adding support for Neo4j vector index hybrid search option. In Neo4j,
you can achieve hybrid search by using a combination of vector and
fulltext indexes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
William FH 596f294b01
Update LangSmith Walkthrough (#10564) 1 year ago
stonekim adabdfdfc7
Add Baidu Qianfan endpoint for LLM (#10496)
- Description:
* Baidu AI Cloud's [Qianfan
Platform](https://cloud.baidu.com/doc/WENXINWORKSHOP/index.html) is an
all-in-one platform for large model development and service deployment,
catering to enterprise developers in China. Qianfan Platform offers a
wide range of resources, including the Wenxin Yiyan model (ERNIE-Bot)
and various third-party open-source models.
- Issue: none
- Dependencies: 
    * qianfan
- Tag maintainer: @baskaryan
- Twitter handle:

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Sergey Kozlov 0a0276bcdb
Fix OpenAIFunctionsAgent function call message content retrieving (#10488)
`langchain.agents.openai_functions[_multi]_agent._parse_ai_message()`
incorrectly extracts AI message content, thus LLM response ("thoughts")
is lost and can't be logged or processed by callbacks.

This PR fixes function call message content retrieving.
1 year ago
Michael Kim 2dc3c64386
Adding headers for accessing pdf file url (#10370)
- Description: Set up 'file_headers' params for accessing pdf file url
  - Tag maintainer: @hwchase17 

 make format, make lint, make test

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Renze Yu a34510536d
Improve code example indent (#10490) 1 year ago
Ali Soliman bcf130c07c
Fix Import BedrockChat (#10485)
- Description: Couldn't import BedrockChat from the chat_models
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: N/A
  - Issues: #10468

---------

Co-authored-by: Ali Soliman <alisaws@amazon.nl>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Stefano Lottini 415d38ae62
Cassandra Vector Store, add metadata filtering + improvements (#9280)
This PR addresses a few minor issues with the Cassandra vector store
implementation and extends the store to support Metadata search.

Thanks to the latest cassIO library (>=0.1.0), metadata filtering is
available in the store.

Further,
- the "relevance" score is prevented from being flipped in the [0,1]
interval, thus ensuring that 1 corresponds to the closest vector (this
is related to how the underlying cassIO class returns the cosine
difference);
- bumped the cassIO package version both in the notebooks and the
pyproject.toml;
- adjusted the textfile location for the vector-store example after the
reshuffling of the Langchain repo dir structure;
- added demonstration of metadata filtering in the Cassandra vector
store notebook;
- better docstring for the Cassandra vector store class;
- fixed test flakiness and removed offending out-of-place escape chars
from a test module docstring;

To my knowledge all relevant tests pass and mypy+black+ruff don't
complain. (mypy gives unrelated errors in other modules, which clearly
don't depend on the content of this PR).

Thank you!
Stefano

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 49694f6a3f
explicitly check openllm return type (#10560)
cc @aarnphm
1 year ago
Joshua Sundance Bailey 85e05fa5d6
ArcGISLoader: add keyword arguments, error handling, and better tests (#10558)
* More clarity around how geometry is handled. Not returned by default;
when returned, stored in metadata. This is because it's usually a waste
of tokens, but it should be accessible if needed.
* User can supply layer description to avoid errors when layer
properties are inaccessible due to passthrough access.
* Enhanced testing
* Updated notebook

---------

Co-authored-by: Connor Sutton <connor.sutton@swca.com>
Co-authored-by: connorsutton <135151649+connorsutton@users.noreply.github.com>
1 year ago