Commit Graph

1167 Commits

Author SHA1 Message Date
Michael Landis
a8db594012
fix: short-circuit black and mypy calls when no changes made (#11051)
Both black and mypy expect a list of files or directories as input.
As-is the Makefile computes a list files changed relative to the last
commit; these are passed to black and mypy in the `format_diff` and
`lint_diff` targets. This is done by way of the Makefile variable
`PYTHON_FILES`. This is to save time by skipping running mypy and black
over the whole source tree.

When no changes have been made, this variable is empty, so the call to
black (and mypy) lacks input files. The call exits with error causing
the Makefile target to error out with:

```bash
$ make format_diff
poetry run black
Usage: black [OPTIONS] SRC ...

One of 'SRC' or 'code' is required.
make: *** [format_diff] Error 1
```

This is unexpected and undesirable, as the naive caller (that's me! 😄 )
will think something else is wrong. This commit smooths over this by
short circuiting when `PYTHON_FILES` is empty.
2023-09-28 16:13:07 -07:00
Michael Kim
fbcd8e02f2
Change type annotations from LLMChain to Chain in MultiPromptChain (#11082)
- **Description:** The types of 'destination_chains' and 'default_chain'
in 'MultiPromptChain' were changed from 'LLMChain' to 'Chain'. and
removed variables declared overlapping with the parent class
- **Issue:** When a class that inherits only Chain and not LLMChain,
such as 'SequentialChain' or 'RetrievalQA', is entered in
'destination_chains' and 'default_chain', a pydantic validation error is
raised.
-  -  codes
```
retrieval_chain = ConversationalRetrievalChain(
        retriever=doc_retriever,
        combine_docs_chain=combine_docs_chain,
        question_generator=question_gen_chain,
    )
    
    destination_chains = {
        'retrieval': retrieval_chain,
    }
    
    main_chain = MultiPromptChain(
        router_chain=router_chain,
        destination_chains=destination_chains,
        default_chain=default_chain,
        verbose=True,
    )
```

 `make format`, `make lint` and `make test`
2023-09-28 15:59:25 -07:00
Piyush Jain
32d09bcd1e
Expanded version range for networkx, fixed sample notebook (#11094)
## Description
Expanded the upper bound for `networkx` dependency to allow installation
of latest stable version. Tested the included sample notebook with
version 3.1, and all steps ran successfully.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 15:33:30 -07:00
Piotr Mardziel
b40ecee4b9
FIx eval prompt (#11087)
**Description:** fixes a common typo in some of the eval criteria.
2023-09-28 15:21:15 -07:00
Guy Korland
5564833bd2
Add add_graph_documents support for FalkorDBGraph (#11122)
Adding `add_graph_documents` support for FalkorDBGraph and extending the
`Neo4JGraph` api so it can support `cypher.py`
2023-09-28 15:03:54 -07:00
Tomaz Bratanic
7d25a65b10
add from_existing_graph to neo4j vector (#11124)
This PR adds the option to create a Neo4jvector instance from existing
graph, which embeds existing text in the database and creates relevant
indices.
2023-09-28 15:02:26 -07:00
Noah Stapp
2c952de21a
Add support for MongoDB Atlas $vectorSearch vector search (#11139)
Adds support for the `$vectorSearch` operator for
MongoDBAtlasVectorSearch, which was announced at .Local London
(September 26th, 2023). This change maintains breaks compatibility
support for the existing `$search` operator used by the original
integration (https://github.com/langchain-ai/langchain/pull/5338) due to
incompatibilities in the Atlas search implementations.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 15:01:03 -07:00
Hugues
b599f91e33
LLMonitor Callback handler: fix bug (#11128)
Here is a small bug fix for the LLMonitor callback handler. I've also
added user identification capabilities.
2023-09-28 15:00:38 -07:00
William FH
e9b51513e9
Shared Executor (#11028) 2023-09-28 13:30:58 -07:00
Justin Plock
926e4b6bad
[Feat] Add optional client-side encryption to DynamoDB chat history memory (#11115)
**Description:** Added optional client-side encryption to the Amazon
DynamoDB chat history memory with an AWS KMS Key ID using the [AWS
Database Encryption SDK for
Python](https://docs.aws.amazon.com/database-encryption-sdk/latest/devguide/python.html)
**Issue:** #7886
**Dependencies:**
[dynamodb-encryption-sdk](https://pypi.org/project/dynamodb-encryption-sdk/)
**Tag maintainer:**  @hwchase17 
**Twitter handle:** [@jplock](https://twitter.com/jplock/)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 13:29:46 -07:00
Eugene Yurtsev
4947ac2965
Add langserve version (#11195)
Add langserve version
2023-09-28 16:24:00 -04:00
Joseph McElroy
822fc590d9
[ElasticsearchStore] Improve migration text to ElasticsearchStore (#11158)
We noticed that as we have been moving developers to the new
`ElasticsearchStore` implementation, we want to keep the
ElasticVectorSearch class still available as developers transition
slowly to the new store.

To speed up this process, I updated the blurb giving them a better
recommendation of why they should use ElasticsearchStore.
2023-09-28 12:40:18 -07:00
Naveen Tatikonda
9b0029b9c2
[OpenSearch] Add Self Query Retriever Support to OpenSearch (#11184)
### Description
Add Self Query Retriever Support to OpenSearch

### Maintainers
@rlancemartin, @eyurtsev, @navneet1v

### Twitter Handle
@OpenSearchProj

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-09-28 12:36:52 -07:00
Arthur Telders
0da484be2c
Add source metadata to OutlookMessageLoader (#11183)
Description: Add "source" metadata to OutlookMessageLoader

This pull request adds the "source" metadata to the OutlookMessageLoader
class in the load method. The "source" metadata is required when
indexing with RecordManager in order to sync the index documents with a
source.

Issue: None

Dependencies: None

Twitter handle: @ATelders

Co-authored-by: Arthur Telders <arthur.telders@roquette.com>
2023-09-28 14:58:12 -04:00
Bagatur
3508e582f1
add anthropic scheduled tests and unit tests (#11188) 2023-09-28 11:47:29 -07:00
Eugene Yurtsev
fd96878c4b
Fix anthropic secret key when passed in via init (#11185)
Fixes anthropic secret key when passed via init

https://github.com/langchain-ai/langchain/issues/11182
2023-09-28 14:21:41 -04:00
Bagatur
f201d80d40
temporarily skip embedding empty string test (#11187) 2023-09-28 11:20:00 -07:00
Eugene Yurtsev
b3cf9c8759
LangServe: Update langchain requirement for publishing (#11186)
Update langchain requirement for publishing
2023-09-28 14:11:58 -04:00
mani2348
89ddc7cbb6
Update Bedrock service name to "bedrock-runtime" and model identifiers (#11161)
- **Description:** Bedrock updated boto service name to
"bedrock-runtime" for the InvokeModel and InvokeModelWithResponseStream
APIs. This update also includes new model identifiers for Titan text,
embedding and Anthropic.

Co-authored-by: Mani Kumar Adari <maniadar@amazon.com>
2023-09-28 09:42:56 -07:00
Eugene Yurtsev
de3e25683e
Expose lc_id as a classmethod (#11176)
* Expose LC id as a class method 
* User should not need to know that the last part of the id is the class
name
2023-09-28 17:25:27 +01:00
Nuno Campos
5ca461160b Lint 2023-09-28 17:12:07 +01:00
Nuno Campos
151f27d502 Lint 2023-09-28 16:42:58 +01:00
Eugene Yurtsev
4ba9c16f74 mypy 2023-09-28 11:27:20 -04:00
Eugene Yurtsev
44489e7029
LangServe: Clean up init files (#11174)
Clean up init files
2023-09-28 11:10:42 -04:00
Akio Nishimura
785b9d47b7
Fix stop key of TextGen. (#11109)
The key of stopping strings used in text-generation-webui api is
[`stopping_strings`](https://github.com/oobabooga/text-generation-webui/blob/main/api-examples/api-example.py#L51),
not `stop`.
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-09-28 11:05:24 -04:00
Eugene Yurtsev
d1d7d0cb27 x 2023-09-28 10:56:50 -04:00
Eugene Yurtsev
c86b2b5e42 x 2023-09-28 10:53:30 -04:00
Eugene Yurtsev
fe4f3b8fdf x 2023-09-28 10:51:28 -04:00
Eugene Yurtsev
a5b15e9d0f x 2023-09-28 10:51:17 -04:00
Nuno Campos
5c1f462bb9 Implement better reprs for Runnables 2023-09-28 15:24:51 +01:00
Nan LI
53a9d6115e
Xata chat memory FIX (#11145)
- **Description:** Changed data type from `text` to `json` in xata for
improved performance. Also corrected the `additionalKwargs` key in the
`messages()` function to `additional_kwargs` to adhere to `BaseMessage`
requirements.
- **Issue:** The Chathisroty.messages() will return {} of
`additional_kwargs`, as the name is wrong for `additionalKwargs` .
  - **Dependencies:**  N/A
  - **Tag maintainer:** N/A
  - **Twitter handle:** N/A

My PR is passing linting and testing before submitting.
2023-09-28 09:52:15 -04:00
William FH
8ae9b71e41
Async support for OpenAIFunctionsAgentOutputParser (#11140) 2023-09-28 09:42:59 -04:00
Bagatur
ce08f436db
Expose loads and dumps in load namespace 2023-09-28 09:34:48 -04:00
Nuno Campos
cfa2203c62
Add input/output schemas to runnables (#11063)
This adds `input_schema` and `output_schema` properties to all
runnables, which are Pydantic models for the input and output types
respectively. These are inferred from the structure of the Runnable as
much as possible, the only manual typing needed is
- optionally add type hints to lambdas (which get translated to
input/output schemas)
- optionally add type hint to RunnablePassthrough

These schemas can then be used to create JSON Schema descriptions of
input and output types, see the tests

- [x] Ensure no InputType and OutputType in our classes use abstract
base classes (replace with union of subclasses)
- [x] Implement in BaseChain and LLMChain
- [x] Implement in RunnableBranch
- [x] Implement in RunnableBinding, RunnableMap, RunnablePassthrough,
RunnableEach, RunnableRouter
- [x] Implement in LLM, Prompt, Chat Model, Output Parser, Retriever
- [x] Implement in RunnableLambda from function signature
- [x] Implement in Tool

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-09-28 11:05:15 +01:00
Eugene Yurtsev
b05bb9e136
LangServe (#11046)
Adds LangServe package

* Integrate Runnables with Fast API creating Server and a RemoteRunnable
client
* Support multiple runnables for a given server
* Support sync/async/batch/abatch/stream/astream/astream_log on the
client side (using async implementations on server)
* Adds validation using annotations (relying on pydantic under the hood)
-- this still has some rough edges -- e.g., open api docs do NOT
generate correctly at the moment
* Uses pydantic v1 namespace

Known issues: type translation code doesn't handle a lot of types (e.g.,
TypedDicts)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-09-28 10:52:44 +01:00
Nuno Campos
77ce9ed6f1
Support using async callback handlers with sync callback manager (#10945)
The current behaviour just calls the handler without awaiting the
coroutine, which results in exceptions/warnings, and obviously doesn't
actually execute whatever the callback handler does

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-09-28 10:39:01 +01:00
Bagatur
48a04aed75
bump 304 (#11147) 2023-09-27 19:24:09 -07:00
Jonathan Evans
23065f54c0
Added prompt wrapping for Claude with Bedrock (#11090)
- **Description:** Prompt wrapping requirements have been implemented on
the service side of AWS Bedrock for the Anthropic Claude models to
provide parity between Anthropic's offering and Bedrock's offering. This
overnight change broke most existing implementations of Claude, Bedrock
and Langchain. This PR just steals the the Anthropic LLM implementation
to enforce alias/role wrapping and implements it in the existing
mechanism for building the request body. This has also been tested to
fix the chat_model implementation as well. Happy to answer any further
questions or make changes where necessary to get things patched and up
to PyPi ASAP, TY.
- **Issue:** No issue opened at the moment, though will update when
these roll in.
  - **Dependencies:** None

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-27 19:20:07 -07:00
xiaoyu
b87cc8b31e
add 3 property types in metadata for notiondb loader (#8509)
### Description: 
NotionDB supports a number of common property types. I have found three
common types that are not included in notiondb loader. When programs
loaded them with notiondb, which will cause some metadata information
not to be passed to langchain. Therefore, I added three common types:
- date
- created_time
- last_edit_time.

### Issue: 
no
### Dependencies: 
No dependencies added :)
### Tag maintainer: 
@rlancemartin, @eyurtsev
### Twitter handle: 
@BJTUTC
2023-09-27 17:38:05 -07:00
Harrison Chase
258d67b0ac
Revert "improve the performance of base.py" (#11143)
Reverts langchain-ai/langchain#8610

this is actually an oversight - this merges all dfs into one df. we DO
NOT want to do this - the idea is we work and manipulate multiple dfs
2023-09-27 17:37:29 -07:00
Mohamad Zamini
9306394078
improve the performance of base.py (#8610)
This removes the use of the intermediate df list and directly
concatenates the dataframes if path is a list of strings. The pd.concat
function combines the dataframes efficiently, making it faster and more
memory-efficient compared to appending dataframes to a list.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 17:36:03 -07:00
Mincoolee
05b75f3f13
feat: add support for arxiv identifier in ArxivAPIWrapper() (#9318)
- Description: this PR adds the support for arxiv identifier of the
ArxivAPIWrapper. I modified the `run()` and `load()` functions in
`arxiv.py`, using regex to recognize if the query is in the form of
arxiv identifier (see
[https://info.arxiv.org/help/find/index.html](https://info.arxiv.org/help/find/index.html)).
If so, it will directly search the paper corresponding to the arxiv
identifier. I also modified and added tests in `test_arxiv.py`.
  - Issue: #9047 
  - Dependencies: N/A
  - Tag maintainer: N/A

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 17:35:16 -07:00
William FH
d3c2ca5656
Enhanced pairwise error (#11131) 2023-09-27 16:04:43 -07:00
Taqi Jaffri
b7e9db5e73
Stop sequences in fireworks, plus notebook updates (#11136)
The new Fireworks and FireworksChat implementations are awesome! Added
in this PR https://github.com/langchain-ai/langchain/pull/11117 thank
you @ZixinYang

However, I think stop words were not plumbed correctly. I've made some
simple changes to do that, and also updated the notebook to be a bit
clearer with what's needed to use both new models.


---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-09-27 16:01:05 -07:00
William FH
33da8bd711
Add Exact match and Regex Match Evaluators (#11132) 2023-09-27 14:18:07 -07:00
Harrison Chase
e355606b11
add more import checks (#11033) 2023-09-27 11:17:12 -07:00
Dan Bolser
efb7c459a2
Update base.py (#10843)
Fixing a typo in the example code in the docstring...

You have to start somewhere though right?

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 11:15:58 -07:00
tanujtiwari-at
a79f595543
Support extra tools argument for pandas agent toolkit (#11040)
**Description** 

We support adding new tools in some toolkits already like the [SQLAgent
toolkit](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/agent_toolkits/sql/base.py#L27).

Related
[SO](https://stackoverflow.com/questions/76583163/are-langchain-toolkits-able-to-be-modified-can-we-add-tools-to-a-pandas-datafra)
thread
This replicates the same functionality here, so users can add custom
bespoke tools.
2023-09-27 10:57:04 -07:00
Bagatur
410ac8129d
bump 303 (#11120) 2023-09-27 08:30:33 -07:00
Bagatur
8e4dbae428
Add fireworks chat model (#11117) 2023-09-27 08:22:12 -07:00
Bagatur
657581dbdf Fix ChatFireworks typing 2023-09-27 08:15:40 -07:00
Bagatur
12aad659dd add ChatFireworks to chat_models 2023-09-27 08:11:26 -07:00
Bagatur
872ebdaf90 remove FireworksChat from llms 2023-09-27 08:10:41 -07:00
Bagatur
9451240941 Fix fireworks chat linting issues 2023-09-27 08:09:33 -07:00
Tomáš Dvořák
865a21938c
speed up enforce_stop_tokens helper function (#10984)
**Description:**

As long as `enforce_stop_tokens` returns a first occurrence, we can
speed up the execution by setting the optional `maxsplit` parameter to
1.

Tag maintainer:
@agola11
@hwchase17

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-27 05:29:29 -07:00
Austin Walker
bb41252dab
fix: bump min_unstructured_version for UnstructuredAPIFileLoader (#11025)
**Description:** New metadata fields were added to
`unstructured==0.10.15`, and our hosted api has been updated to reflect
this. When users call `partition_via_api` with an older version of the
library, they'll hit a parsing error related to the new fields.
2023-09-27 05:28:06 -07:00
William FH
75b3893daf
Fix runnable branch callbacks (#11091)
We aren't calling on_chain_end here unless we use the default option
2023-09-27 11:38:56 +01:00
Bagatur
6c5251feb0 poetry 2023-09-26 20:12:49 -07:00
Bagatur
5310184f96 poetry 2023-09-26 20:12:29 -07:00
Cynthia Yang
6dd44ff1c0
Refactor Fireworks and add ChatFireworks (#3) (#10597)
Description 
* Refactor Fireworks within Langchain LLMs.
* Remove FireworksChat within Langchain LLMs.
* Add ChatFireworks (which uses chat completion api) to Langchain chat
models.
* Users have to install `fireworks-ai` and register an api key to use
the api.

Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin @baskaryan
2023-09-26 20:11:55 -07:00
Bagatur
5514ebe859
Don't type chains in output_parsers (#11092)
Can't use TYPE_CHECKING style imports for pydantic params because it will try to instantiate the typed object by default.
2023-09-26 17:49:35 -07:00
CG80499
64385c4eae
Make pairwise comparison chain more like LLM as a judge (#11013)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:**: Adds LLM as a judge as an eval chain
  - **Tag maintainer:** @hwchase17 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-09-26 13:19:04 -07:00
Joseph McElroy
175ef0a55d
[ElasticsearchStore] Enable custom Bulk Args (#11065)
This enables bulk args like `chunk_size` to be passed down from the
ingest methods (from_text, from_documents) to be passed down to the bulk
API.

This helps alleviate issues where bulk importing a large amount of
documents into Elasticsearch was resulting in a timeout.

Contribution Shoutout
- @elastic

- [x] Updated Integration tests

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-26 12:53:50 -07:00
Eugene Yurtsev
d19fd0cfae
LogEntry/LogStream use str instead of uuid for id (#11080)
Cast the UUID to a string
2023-09-26 20:38:51 +01:00
Bagatur
d85339b9f2
extract sublinks exclude by abs path (#11079) 2023-09-26 12:26:27 -07:00
Bagatur
7ee8b2d1bf
exclude dirs in async recursive loading (#11077) 2023-09-26 09:59:04 -07:00
Bagatur
12fb393a43
bump 302 (#11070) 2023-09-26 08:13:01 -07:00
Bagatur
097ecef06b
refactor web base loader (#11057) 2023-09-26 08:11:31 -07:00
Bagatur
487611521d
fix root import (#11072) 2023-09-26 08:11:16 -07:00
Bagatur
a2f7246f0e
skip excluded sublinks before recursion (#11036) 2023-09-26 02:24:54 -07:00
William FH
4aec587979
Update LangSmith Walkthrough (#11043) 2023-09-25 22:32:56 -07:00
Harrison Chase
bea78b3271
make warnings more modular (#11047) 2023-09-25 20:46:43 -07:00
Harrison Chase
c87e9fb2ce
conditional imports (#11017) 2023-09-25 15:46:32 -07:00
Tomaz Bratanic
0625ab7a9e
Filtering graph schema for Cypher generation (#10577)
Sometimes you don't want the LLM to be aware of the whole graph schema,
and want it to ignore parts of the graph when it is constructing Cypher
statements.
2023-09-25 14:14:15 -07:00
Palau
89ef440c14
Kay retriever (#10657)
- **Description**: Adding retrievers for [kay.ai](https://kay.ai) and
SEC filings powered by Kay and Cybersyn. Kay provides context as a
service: it's an API built for RAG.
- **Issue**: N/A
- **Dependencies**: Just added a dep to the
[kay](https://pypi.org/project/kay/) package
- **Tag maintainer**: @baskaryan @hwchase17 Discussed in slack
- **Twtter handle:** [@vishalrohra_](https://twitter.com/vishalrohra_)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-25 13:10:13 -07:00
Harrison Chase
5f13668fa0
Harrison/move vectorstore base (#11030) 2023-09-25 12:44:23 -07:00
Eugene Yurtsev
af5390d416
Add a batch size for cleanup (#10948)
Add pagination to indexing cleanup to deal with large numbers of
documents that need to be deleted.
2023-09-25 14:52:32 -04:00
Eugene Yurtsev
09486ed188
Update Serializable to use classmethods (#10956) 2023-09-25 18:39:30 +01:00
Taqi Jaffri
b7290f01d8
Batching for hf_pipeline (#10795)
The huggingface pipeline in langchain (used for locally hosted models)
does not support batching. If you send in a batch of prompts, it just
processes them serially using the base implementation of _generate:
https://github.com/docugami/langchain/blob/master/libs/langchain/langchain/llms/base.py#L1004C2-L1004C29

This PR adds support for batching in this pipeline, so that GPUs can be
fully saturated. I updated the accompanying notebook to show GPU batch
inference.

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-09-25 18:23:11 +01:00
Bagatur
aa6e6db8c7
bump 301 (#11018) 2023-09-25 08:50:47 -07:00
Nuno Campos
956ee981c0
Fix issue where requests wrapper passes auth kwarg twice (#11010)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Closes #8842
2023-09-25 15:45:04 +01:00
Scotty
88a02076af
fix ChatMessageChunk concat error (#10174)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->

- Description: fix `ChatMessageChunk` concat error 
- Issue: #10173 
- Dependencies: None
- Tag maintainer: @baskaryan, @eyurtsev, @rlancemartin
- Twitter handle: None

---------

Co-authored-by: wangshuai.scotty <wangshuai.scotty@bytedance.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-09-25 11:17:11 +01:00
Naveen Tatikonda
b0f21e2b50
[OpenSearch] Pass ids using from_texts and indexname in add_texts and search (#10969)
### Description
This PR makes the following changes to OpenSearch:
1. Pass optional ids with `from_texts`
2. Pass an optional index name with `add_texts` and `search` instead of
using the same index name that was used during `from_texts`

### Issue
https://github.com/langchain-ai/langchain/issues/10967

### Maintainers
@rlancemartin, @eyurtsev, @navneet1v

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-09-23 16:12:51 -07:00
deanchanter
f945426874
Resolve GHI 10674 (#10977) 2023-09-23 16:11:52 -07:00
Anar
ff732e10f8
LLMRails Embedding (#10959)
LLMRails  Embedding Integration
This PR provides integration with LLMRails. Implemented here are:

langchain/embeddings/llm_rails.py
docs/extras/integrations/text_embedding/llm_rails.ipynb


Hi @hwchase17 after adding our vectorstore integration to langchain with
confirmation of you and @baskaryan, now we want to add our embedding
integration

---------

Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-23 16:11:02 -07:00
Michael Feil
94e31647bd
Support for Gradient.ai embedding (#10968)
Adds support for gradient.ai's embedding model.

This will remain a Draft, as the code will likely be refactored with the
`pip install gradientai` python sdk.
2023-09-23 16:10:23 -07:00
C.J. Jameson
05d5fcfdf8
fix make-coverage local invocation #10941 (#10974)
Fix the invocation of `make coverage` in `libs/langchain`

Fixes #10941
2023-09-23 16:03:53 -07:00
Bagatur
040d436b3f
Add vertex scheduled test (#10958) 2023-09-23 15:51:59 -07:00
Piyush Jain
8602a32b7e
Fixes error with providers that don't have model_id (#10966)
## Description
Fixes error with using the chain for providers that don't have
`model_id` field.


![image](https://github.com/langchain-ai/langchain/assets/289369/a86074cf-6c99-4390-a135-b3af7a4f0827)
2023-09-23 15:34:28 -07:00
Nuno Campos
7b13292e35
Remove python eval from vector sql db chain (#10937)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-09-23 08:51:03 -07:00
Richard Wang
b809c243af
Fix bug in index api (#10614)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

- **Description:** a fix for `index`.
- **Issue:** Not applicable.
- **Dependencies:** None
- **Tag maintainer:** 
- **Twitter handle:** richarddwang

# Problem
Replication code
```python
from pprint import pprint
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import SQLRecordManager, index
from langchain.schema import Document
from langchain.vectorstores import Qdrant
from langchain_setup.qdrant import pprint_qdrant_documents, create_inmemory_empty_qdrant

# Documents
metadata1 = {"source": "fullhell.alchemist"}
doc1_1 = Document(page_content="1-1 I have a dog~", metadata=metadata1)
doc1_2 = Document(page_content="1-2 I have a daugter~", metadata=metadata1)
doc1_3 = Document(page_content="1-3 Ahh! O..Oniichan", metadata=metadata1)
doc2 = Document(page_content="2 Lancer died again.", metadata={"source": "fate.docx"})

# Create empty vectorstore
collection_name = "secret_of_D_disk"
vectorstore: Qdrant = create_inmemory_empty_qdrant()

# Create record Manager
import tempfile
from pathlib import Path

record_manager = SQLRecordManager(
    namespace="qdrant/{collection_name}",
    db_url=f"sqlite:///{Path(tempfile.gettempdir())/collection_name}.sql",
)
record_manager.create_schema()  # 必須

sync_result = index(
    [doc1_1, doc1_2, doc1_2, doc2],
    record_manager,
    vectorstore,
    cleanup="full",
    source_id_key="source",
)
print(sync_result, end="\n\n")
pprint_qdrant_documents(vectorstore)
```
<details>
<summary>Code of helper functions `pprint_qdrant_documents` and
`create_inmemory_empty_qdrant`</summary>

```python
def create_inmemory_empty_qdrant(**from_texts_kwargs):
    # Qdrant requires vector size, which can be only know after applying embedder
    vectorstore = Qdrant.from_texts(["dummy"], location=":memory:", embedding=OpenAIEmbeddings(), **from_texts_kwargs)
    dummy_document_id = vectorstore.client.scroll(vectorstore.collection_name)[0][0].id
    vectorstore.delete([dummy_document_id])
    return vectorstore

def pprint_qdrant_documents(vectorstore, limit: int = 100, **scroll_kwargs):
    document_ids, documents = [], []
    for record in vectorstore.client.scroll(
        vectorstore.collection_name, limit=100, **scroll_kwargs
    )[0]:
        document_ids.append(record.id)
        documents.append(
            Document(
                page_content=record.payload["page_content"],
                metadata=record.payload["metadata"] or {},
            )
        )
    pprint_documents(documents, document_ids=document_ids)

def pprint_document(document: Document = None, document_id=None, return_string=False):
    displayed_text = ""
    if document_id:
        displayed_text += f"Document {document_id}:\n\n"
    displayed_text += f"{document.page_content}\n\n"
    metadata_text = pformat(document.metadata, indent=1)
    if "\n" in metadata_text:
        displayed_text += f"Metadata:\n{metadata_text}"
    else:
        displayed_text += f"Metadata:{metadata_text}"

    if return_string:
        return displayed_text
    else:
        print(displayed_text)


def pprint_documents(documents, document_ids=None):
    if not document_ids:
        document_ids = [i + 1 for i in range(len(documents))]

    displayed_texts = []
    for document_id, document in zip(document_ids, documents):
        displayed_text = pprint_document(
            document_id=document_id, document=document, return_string=True
        )
        displayed_texts.append(displayed_text)
    print(f"\n{'-' * 100}\n".join(displayed_texts))
```
</details>
You will get

```
{'num_added': 3, 'num_updated': 0, 'num_skipped': 0, 'num_deleted': 0}

Document 1b19816e-b802-53c0-ad60-5ff9d9b9b911:

1-2 I have a daugter~

Metadata:{'source': 'fullhell.alchemist'}
----------------------------------------------------------------------------------------------------
Document 3362f9bc-991a-5dd5-b465-c564786ce19c:

1-1 I have a dog~

Metadata:{'source': 'fullhell.alchemist'}
----------------------------------------------------------------------------------------------------
Document a4d50169-2fda-5339-a196-249b5f54a0de:

1-2 I have a daugter~

Metadata:{'source': 'fullhell.alchemist'}
```
This is not correct. We should be able to expect that the vectorsotre
now includes doc1_1, doc1_2, and doc2, but not doc1_1, doc1_2, and
doc1_2.


# Reason
In `index`, the original code is 
```python
uids = []
docs_to_index = []
for doc, hashed_doc, doc_exists in zip(doc_batch, hashed_docs, exists_batch):
    if doc_exists:
        # Must be updated to refresh timestamp.
        record_manager.update([hashed_doc.uid], time_at_least=index_start_dt)
        num_skipped += 1
        continue
    uids.append(hashed_doc.uid)
    docs_to_index.append(doc)
```
In the aforementioned example, `len(doc_batch) == 4`, but
`len(hashed_docs) == len(exists_batch) == 3`. This is because the
deduplication of input documents [doc1_1, doc1_2, doc1_2, doc2] is
[doc1_1, doc1_2, doc2]. So `index` insert doc1_1, doc1_2, doc1_2 with
the uid of doc1_1, doc1_2, doc2.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-09-22 22:41:07 -04:00
Joshua Sundance Bailey
d67b120a41
Make anthropic_api_key a secret str (#10724)
This PR makes `ChatAnthropic.anthropic_api_key` a `pydantic.SecretStr`
to avoid inadvertently exposing API keys when the `ChatAnthropic` object
is represented as a str.
2023-09-22 22:06:20 -04:00
Bagatur
1b65779905
fix integration tests (#10952) 2023-09-22 12:04:38 -07:00
Harrison Chase
9062e36722
Harrison/agents structured (#10911) 2023-09-22 10:21:23 -07:00
C.J. Jameson
b4d2663beb
CONTRIBUTING.md Quick Start: focus on langchain core; clarify docs and experimental are separate (#10906)
follow up to https://github.com/langchain-ai/langchain/pull/7959 ,
explaining better to focus just on langchain core

no dependencies

twitter @cjcjameson
2023-09-22 10:17:08 -07:00
Michael Landis
f30b4697d4
fix: broken link in libs/langchain README (#10920)
**Description**
Fixes broken link to `CONTRIBUTING.md` in `libs/langchain/README.md`.

Because`libs/langchain/README.md` was copied from the top level README,
and because the README contains a link to `.github/CONTRIBUTING.md`, the
copied README's link relative path must be updated. This commit fixes
that link.
2023-09-22 10:14:19 -07:00
Bagatur
3cb460d5d8
bump 300 (#10940) 2023-09-22 09:44:47 -07:00
Nuno Campos
3d5e92e3ef
Accept run name arg for non-chain runs (#10935) 2023-09-22 08:41:25 -07:00
Nuno Campos
aac2d4dcef
In MergerRetriever async call all retrievers in parallel (#10938) 2023-09-22 08:40:16 -07:00
German Martin
66d5a7e7cf
Add async support to multi-query retriever. (#10873)
Added async support to the MultiQueryRetriever class.

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-09-22 08:33:20 -07:00
Leonid Kuligin
9d4b710a48
small fixes to Vertex (#10934)
Fixed tests, updated the required version of the SDK and a few minor
changes after the recent improvement
(https://github.com/langchain-ai/langchain/pull/10910)
2023-09-22 08:18:09 -07:00
wo0d
4e58b78102
Fix chat_history message order (#10869)
Not all databases uses id as default order, so add it explicitly

sqlite uses rawid as default order in select statement:
[https://www.sqlite.org/lang_createtable.html#rowid](https://www.sqlite.org/lang_createtable.html#rowid),
but some other databases like postgresql not behaves like this. since
this class supports multiple db engine. we should have an order.
2023-09-22 11:15:59 -04:00
Roman Shaptala
3d40de75c5
Fix default refine prompt template bug (#10928)
**Description:**
  
Default refine template does not actually use the refine template
defined above, it uses a string with the variable name.
 @baskaryan, @eyurtsev, @hwchase17
2023-09-22 11:04:28 -04:00
Bagatur
cab55e9bc1
add vertex prod features (#10910)
- chat vertex async
- vertex stream
- vertex full generation info
- vertex use server-side stopping
- model garden async
- update docs for all the above

in follow up will add
[] chat vertex full generation info
[] chat vertex retries
[] scheduled tests
2023-09-22 01:44:09 -07:00
Bagatur
dccc20b402
add model feat table (#10921) 2023-09-22 01:10:27 -07:00
William FH
ee8653f62c
Wfh/allow nonparallel (#10914) 2023-09-21 20:21:01 -07:00
Leonid Kuligin
95e1d1fae6
fix in the docstring (#10902)
Description: A fix in the documentation on how to use
`GoogleSearchAPIWrapper`.
2023-09-21 14:30:32 -07:00
Bagatur
af41bc84e6
bump 299 (#10904) 2023-09-21 12:56:52 -07:00
Bagatur
9a858a9107
Bagatur/arxiv kwargs (#10903)
support all arXiv api wrapper kwargs in loader
2023-09-21 12:49:56 -07:00
niklas
e5f420d2bc
Fix typo in URL document loader example (#10585)
- **Description:** Fix typo in URL document loader example
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** not urgent
2023-09-21 11:35:27 -07:00
Nuno Campos
ea26c12b23
Fix Runnable.transform() for false-y inputs (#10893)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-21 11:27:09 -07:00
Nuno Campos
fcb5aba9f0
Add Runnable.astream_log() (#10374)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-21 10:19:55 -07:00
Harrison Chase
a1ade48e8f
update agent docs (#10894) 2023-09-21 09:09:33 -07:00
Bagatur
d37ce48e60
sep base url and loaded url in sub link extraction (#10895) 2023-09-21 08:47:41 -07:00
Bagatur
24cb5cd379
bump 298 (#10892) 2023-09-21 08:26:11 -07:00
Bagatur
c1f9cc0bc5
recursive loader add status check (#10891) 2023-09-21 08:25:43 -07:00
Matvey Arye
6e02c45ca4
Add integration for Timescale Vector(Postgres) (#10650)
**Description:**
This commit adds a vector store for the Postgres-based vector database
(`TimescaleVector`).

Timescale Vector(https://www.timescale.com/ai) is PostgreSQL++ for AI
applications. It enables you to efficiently store and query billions of
vector embeddings in `PostgreSQL`:
- Enhances `pgvector` with faster and more accurate similarity search on
1B+ vectors via DiskANN inspired indexing algorithm.
- Enables fast time-based vector search via automatic time-based
partitioning and indexing.
- Provides a familiar SQL interface for querying vector embeddings and
relational data.

Timescale Vector scales with you from POC to production:
- Simplifies operations by enabling you to store relational metadata,
vector embeddings, and time-series data in a single database.
- Benefits from rock-solid PostgreSQL foundation with enterprise-grade
feature liked streaming backups and replication, high-availability and
row-level security.
- Enables a worry-free experience with enterprise-grade security and
compliance.

Timescale Vector is available on Timescale, the cloud PostgreSQL
platform. (There is no self-hosted version at this time.) LangChain
users get a 90-day free trial for Timescale Vector.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Avthar Sewrathan <avthar@timescale.com>
2023-09-21 07:33:37 -07:00
Michael Feil
55570e54e1
gradient.ai LLM intregration (#10800)
- **Description:** This PR implements a new LLM API to
https://gradient.ai
- **Issue:** Feature request for LLM #10745 
- **Dependencies**: No additional dependencies are introduced. 
- **Tag maintainer:** I am opening this PR for visibility, once ready
for review I'll tag.

- ```make format && make lint && make test``` is running.
- added a `integration` and `mock unit` test.


Co-authored-by: michaelfeil <me@michaelfeil.eu>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-21 07:29:16 -07:00
Bagatur
5097007407
cleanup recursive url session (#10863) 2023-09-21 07:22:13 -07:00
Harrison Chase
777b33b873
fix experimental imports (#10875) 2023-09-20 23:44:17 -07:00
Harrison Chase
808caca607
beef up agent docs (#10866) 2023-09-20 23:09:58 -07:00
Sharath Rajasekar
96023f94d9
Add Javelin integration (#10275)
We are introducing the py integration to Javelin AI Gateway
www.getjavelin.io. Javelin is an enterprise-scale fast llm router &
gateway. Could you please review and let us know if there is anything
missing.

Javelin AI Gateway wraps Embedding, Chat and Completion LLMs. Uses
javelin_sdk under the covers (pip install javelin_sdk).

Author: Sharath Rajasekar, Twitter: @sharathr, @javelinai

Thanks!!
2023-09-20 16:36:39 -07:00
Bagatur
957956ba6d
bump 297 (#10861) 2023-09-20 14:45:49 -07:00
Harrison Chase
1bc3244db9
fix loading of sql chain (#10860)
Closing #6889
2023-09-20 14:37:49 -07:00
Bagatur
b05a74b106
fix recursive loader (#10856) 2023-09-20 13:55:47 -07:00
Bagatur
de0a02f507
fix extract sublink bug (#10855) 2023-09-20 13:30:42 -07:00
Harrison Chase
7dec2d399b
format intermediate steps (#10794)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-09-20 13:02:55 -07:00
Harrison Chase
386ef1e654
add agent output parsers (#10790) 2023-09-20 12:10:09 -07:00
Mukit Momin
67c5950df3
Amazon Bedrock Support Streaming (#10393)
### Description

- Add support for streaming with `Bedrock` LLM and `BedrockChat` Chat
Model.
- Bedrock as of now supports streaming for the `anthropic.claude-*` and
`amazon.titan-*` models only, hence support for those have been built.
- Also increased the default `max_token_to_sample` for Bedrock
`anthropic` model provider to `256` from `50` to keep in line with the
`Anthropic` defaults.
- Added examples for streaming responses to the bedrock example
notebooks.

**_NOTE:_**: This PR fixes the issues mentioned in #9897 and makes that
PR redundant.
2023-09-20 11:55:38 -07:00
Bagatur
0749a642f5
Stream refac and vertex streaming (#10470)
---------

Co-authored-by: Terry Cruz Melo <tcruz@vozy.co>
Co-authored-by: Terry Cruz Melo <33166112+TerryCM@users.noreply.github.com>
2023-09-20 11:49:16 -07:00
William FH
f421af8b80
Criteria Parser Improvements (#10824) 2023-09-20 11:18:33 -07:00
Bagatur
46aa90062b
bump exp 19 (#10851) 2023-09-20 10:17:52 -07:00
Bagatur
775f3edffd
bump 296 (#10842) 2023-09-20 08:31:14 -07:00
Bagatur
96a9c27116
fix recursive loader (#10752)
maintain same base url throughout recursion, yield initial page, fixing
recursion depth tracking
2023-09-20 08:16:54 -07:00
Nuno Campos
276125a33b
Use shallow copy on runnable locals (#10825)
- deep copy prevents storing complex objects in locals
2023-09-20 08:13:06 -07:00
DanielZzz
ebe08412ad
fix: chat_models Qianfan not compatiable with SystemMessage (#10642)
- **Description:** QianfanEndpoint bugs for SystemMessages. When the
`SystemMessage` is input as the messages to
`chat_models.QianfanEndpoint`. A `TypeError` will be raised.
  - **Issue:** #10643
  - **Dependencies:** 
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** no
2023-09-19 22:35:51 -07:00
Massimiliano Pronesti
f0198354d9
fix(embeddings): number of texts in Azure OpenAIEmbeddings batch (#10707)
This PR addresses the limitation of Azure OpenAI embeddings, which can
handle at maximum 16 texts in a batch. This can be solved setting
`chunk_size=16`. However, I'd love to have this automated, not to force
the user to figure where the issue comes from and how to solve it.

Closes #4575. 

@baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-19 21:50:39 -07:00
zhanghexian
0abe996409
add clustered vearch in langchain (#10771)
---------

Co-authored-by: zhanghexian1 <zhanghexian1@jd.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-19 21:22:23 -07:00
HeTaoPKU
f505320a73
Add Minimax chat model (#10776)
resolve the merging issues for
https://github.com/langchain-ai/langchain/pull/6757

---------

Co-authored-by: 何涛 <taohe@bytedance.com>
2023-09-19 20:43:49 -07:00
Anar
c656a6b966
LLMRails (#10796)
### LLMRails Integration
This PR provides integration with LLMRails. Implemented here are:

langchain/vectorstore/llm_rails.py
tests/integration_tests/vectorstores/test_llm_rails.py
docs/extras/integrations/vectorstores/llm-rails.ipynb

---------

Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-19 20:33:33 -07:00
mateai
900dbd1cbe
Substring support for similarity_search_with_score (#10746)
**Description:** Possible to filter with substrings in
similarity_search_with_score, for example: filter={'user_id':
{'substring': 'user'}}

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-19 20:32:44 -07:00
Ansil M B
740eafe41d
Updated return parameter of YouTubeSearchTool (#10743)
**Description:** 
changed return parameter of YouTubeSearchTool
 

1. changed the returning links of youtube videos by adding prefix
"https://www.youtube.com", now this will return the exact links to the
videos
2. updated the returning type from 'string' to 'list', which will be
more suited for further processings

 **Issue:** 
Fixes #10742

 **Dependencies:** 
None


<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** changed return parameter of YouTubeSearchTool
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** None
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-19 17:04:06 -07:00
Harrison Chase
1dae3c383e
Harrison/add submodule to docs (#10803) 2023-09-19 17:03:32 -07:00
Henry (Hezheng) Yin
c15bbaac31
misc: add gpt-3.5-turbo-instruct to model_token_mapping (#10808)
A one-line fix to get`max_tokens=-1` working `OpenAI` class for
`gpt-3.5-turbo-instruct` model.

Closes https://github.com/langchain-ai/langchain/issues/10806
2023-09-19 17:03:16 -07:00
Harrison Chase
d2bee34d4c
Harrison/add vald (#10807)
Co-authored-by: datelier <57349093+datelier@users.noreply.github.com>
2023-09-19 16:42:52 -07:00
Jacob Lee
bbc3fe259b
Start RunnableBranch callback tags with 1 instead of 0 (#10755)
Changes to match `RunnableSequences`

@eyurtsev
2023-09-19 16:38:08 -07:00
Ziyang Liu
931b292126
Add support for HTTP PUT in the open api agent prompt (#10763)
**Description:** This PR adds HTTP PUT support for the langchain openapi
agent toolkit by leveraging existing structure and HTTP put request
wrapper. The PUT method is almost identical to HTTP POST but should be
idempotent and therefore tighter than POST which is not idempotent. Some
APIs may consider to use PUT instead of POST which is unfortunately not
supported with the current toolkit yet.
2023-09-19 16:37:20 -07:00
Mateusz Wosinski
a29cd89923
Synthetic data generation (#9759)
### Description

Implements synthetic data generation with the fields and preferences
given by the user. Adds showcase notebook.
Corresponding prompt was proposed for langchain-hub.

### Example

```
output = chain({"fields": {"colors": ["blue", "yellow"]}, "preferences": {"style": "Make it in a style of a weather forecast."}})
print(output)

# {'fields': {'colors': ['blue', 'yellow']},
 'preferences': {'style': 'Make it in a style of a weather forecast.'},
 'text': "Good morning! Today's weather forecast brings a beautiful combination of colors to the sky, with hues of blue and yellow gently blending together like a mesmerizing painting."}
```

### Twitter handle 

@deepsense_ai @matt_wosinski

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-19 16:29:50 -07:00
Bagatur
c4a6de3fc9
Revert "Add ChatGLM for llm and chat_model by using ChatGLM API (#9797)" (#10805)
@etveritas reverting for now until this is resolved
https://github.com/langchain-ai/langchain/pull/9797/files#r1330795585,
apologies for merging too eagerly!
2023-09-19 16:23:42 -07:00
Mickaël
c86a1a6710
chore: allow using dataclasses_json dependency v0.6.0 (#10775)
**Description:** upgrade the `dataclasses_json` dependency to its latest
version ([no real breaking
change](https://github.com/lidatong/dataclasses-json/releases/tag/v0.6.0)
if used correctly), while allowing previous version to not break other
users' setup
**Issue:** I need to use the latest version of that dependency in my
project, but `langchain` prevents it.

Note: it looks like running `poetry lock --no-update` did some changes
to the lockfiles as it was the first time it was with the
`macosx_11_0_arm64` architecture 🤷

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-19 16:22:35 -07:00