Commit Graph

572 Commits

Author SHA1 Message Date
Taqi Jaffri
5cd244e9b7 CR feedback 2023-08-19 13:48:15 -07:00
Predrag Gruevski
be9bc62f8b
Fix bash test regex for Linux under WSL2. (#9475)
It fails with `Permission denied` and not `not found`. Both seem
reasonable.
2023-08-19 09:27:14 -04:00
Lorenzo
5b3dbf12a5
Uniform valid suffixes and clarify exceptions (#9463)
**Description**:
- Uniformed the current valid suffixes (file formats) for loading agents
from hubs and files (to better handle future additions);
 - Clarified exception messages (also in unit test).
2023-08-18 21:35:53 -07:00
Brendan Collins
9f545825b7
Added Geometry Validation, Geometry Metadata, and WKT instead of Python str() to GeoDataFrame Loader (#9466)
@rlancemartin The current implementation within `Geopandas.GeoDataFrame`
loader uses the python builtin `str()` function on the input geometries.
While this looks very close to WKT (Well known text), Python's str
function doesn't guarantee that.

In the interest of interop., I've changed to the of use `wkt` property
on the Shapely geometries for generating the text representation of the
geometries.

Also, included here:
- validation of the input `page_content_column` as being a GeoSeries.
- geometry `crs` (Coordinate Reference System) / bounds
(xmin/ymin/xmax/ymax) added to Document metadata. Having the CRS is
critical... having the bounds is just helpful!

I think there is a larger question of "Should the geometry live in the
`page_content`, or should the record be better summarized and tuck the
geom into metadata?" ...something for another day and another PR.
2023-08-18 21:35:39 -07:00
Kacper Łukawski
616e728ef9
Enhance qdrant vs using async embed documents (#9462)
This is an extension of #8104. I updated some of the signatures so all
the tests pass.

@danhnn I couldn't commit to your PR, so I created a new one. Thanks for
your contribution!

@baskaryan Could you please merge it?

---------

Co-authored-by: Danh Nguyen <dnncntt@gmail.com>
2023-08-18 18:59:48 -07:00
Matt Robinson
83d2a871eb
fix: apply unstructured preprocess functions (#9473)
### Summary

Fixes a bug from #7850 where post processing functions in Unstructured
loaders were not apply. Adds a assertion to the test to verify the post
processing function was applied and also updates the explanation in the
example notebook.
2023-08-18 18:54:28 -07:00
William FH
292ae8468e
Let you specify run id in trace as chain group (#9484)
I think we'll deprecate this soon anyway but still nice to be able to
fetch the run id
2023-08-18 17:21:53 -07:00
Predrag Gruevski
df8e35fd81
Remove incorrect ABC from two Elasticsearch classes. (#9470)
Neither is an ABC because their own example code instantiates them directly.
2023-08-18 15:01:02 -04:00
Predrag Gruevski
82f28ca9ef
ChatPromptTemplate is not an ABC, it's instantiated directly. (#9468)
Its own `__add__` method constructs `ChatPromptTemplate` objects
directly, it cannot be abstract.

Found while debugging something else with @nfcampos.
2023-08-18 14:37:10 -04:00
vamseeyarla
82fb56b79c
Issue 9401 - SequentialChain runs the same callbacks over and over in async mode (#9452)
Issue: https://github.com/langchain-ai/langchain/issues/9401

In the Async mode, SequentialChain implementation seems to run the same
callbacks over and over since it is re-using the same callbacks object.

Langchain version: 0.0.264, master

The implementation of this aysnc route differs from the sync route and
sync approach follows the right pattern of generating a new callbacks
object instead of re-using the old one and thus avoiding the cascading
run of callbacks at each step.

Async mode:
```
        _run_manager = run_manager or AsyncCallbackManagerForChainRun.get_noop_manager()
        callbacks = _run_manager.get_child()
        ...
        for i, chain in enumerate(self.chains):
            _input = await chain.arun(_input, callbacks=callbacks)
            ...
```

Regular mode:
```
        _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        for i, chain in enumerate(self.chains):
            _input = chain.run(_input, callbacks=_run_manager.get_child(f"step_{i+1}"))
            ...
```

Notice how we are reusing the callbacks object in the Async code which
will have a cascading effect as we run through the chain. It runs the
same callbacks over and over resulting in issues.

Solution:
Define the async function in the same pattern as the regular one and
added tests.
---------

Co-authored-by: vamsee_yarlagadda <vamsee.y@airbnb.com>
2023-08-18 11:26:12 -07:00
William FH
c29fbede59
Wfh/rm num repetitions (#9425)
Makes it hard to do test run comparison views and we'd probably want to
just run multiple runs right now
2023-08-18 10:08:39 -07:00
Predrag Gruevski
eee0d1d0dd
Update repository links in the package metadata. (#9454) 2023-08-18 12:55:43 -04:00
Bagatur
50b8f4dcc7
bump 268 (#9455) 2023-08-18 08:46:39 -07:00
Nuno Campos
354c42afd2 Lint 2023-08-18 15:30:30 +01:00
Nuno Campos
4452314aab Merge branch 'master' into bagatur/locals_in_config 2023-08-18 15:23:05 +01:00
Nuno Campos
d5eb228874
Add kwargs to all other optional runnable methods (#9439)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->
2023-08-18 15:04:26 +01:00
Leonid Ganeline
a3dd4dcadf
📖 docstrings retrievers consistency (#9422)
📜 
- updated the top-level descriptions to a consistent format;
- changed the format of several 100% internal functions from "name" to
"_name". So, these functions are not shown in the Top-level API
Reference page (with lists of classes/functions)
2023-08-18 09:20:39 -04:00
Nuno Campos
9417961b17
Add lock on tee peer cleanup (#9446)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->
2023-08-18 14:20:09 +01:00
Nuno Campos
d3f10d2f4f Update test 2023-08-18 11:36:16 +01:00
Nuno Campos
6ae58da668 Assign defaults in batch calls 2023-08-18 10:53:10 +01:00
Nuno Campos
ddcb4ff5fb Li t 2023-08-18 10:30:42 +01:00
Nuno Campos
1baedc4e18 Move patch_config 2023-08-18 10:28:39 +01:00
Nuno Campos
46f3850794 Lint 2023-08-18 10:25:41 +01:00
Nuno Campos
24a197f96a Merge branch 'master' into bagatur/locals_in_config 2023-08-18 10:12:10 +01:00
Nuno Campos
8ddaaf3d41 Move config helpers 2023-08-18 10:10:35 +01:00
Nuno Campos
a5e7dcec61 Lint 2023-08-18 10:03:28 +01:00
Nuno Campos
c1b1666ec8 Ensure config defaults apply even when a config is passed in 2023-08-18 10:02:29 +01:00
Nuno Campos
7fe474d198 Update snapshots 2023-08-18 10:02:11 +01:00
Jacob Lee
0689628489
Adds streaming for runnable maps (#9283)
@nfcampos @baskaryan

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-08-18 07:46:23 +01:00
Bagatur
ab21af71be wip 2023-08-17 17:28:02 -07:00
Bagatur
6f69b19ff5 wip tests 2023-08-17 16:45:52 -07:00
Bagatur
9e906c39ba nit 2023-08-17 16:22:22 -07:00
Bagatur
6b0a849f59 fix 2023-08-17 16:22:12 -07:00
Bagatur
c447e9a854 cr 2023-08-17 15:29:00 -07:00
Bagatur
bd80cad6db add 2023-08-17 13:52:19 -07:00
Bagatur
8c1a528c71 cr 2023-08-17 13:52:09 -07:00
Bagatur
25cbcd9374 merge 2023-08-17 13:03:28 -07:00
Aashish Saini
ce78877a87
Replaced instances of raising ValueError with raising ImportError. (#9388)
Refactored code to ensure consistent handling of ImportError. Replaced
instances of raising ValueError with raising ImportError.

The choice of raising a ValueError here is somewhat unconventional and
might lead to confusion for anyone reading the code. Typically, when
dealing with import-related errors, the recommended approach is to raise
an ImportError with a descriptive message explaining the issue. This
provides a clearer indication that the problem is related to importing
the required module.

@hwchase17 , @baskaryan , @eyurtsev 

Thanks
Aashish

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-17 12:24:08 -07:00
Bagatur
8c986221e4
make openapi_schema_pydantic opt (#9408) 2023-08-17 11:49:23 -07:00
Eugene Yurtsev
77b359edf5
More missing type annotations (#9406)
This PR fills in more missing type annotations on pydantic models. 

It's OK if it missed some annotations, we just don't want it to get
annotations wrong at this stage.

I'll do a few more passes over the same files!
2023-08-17 12:19:50 -04:00
Bagatur
a69d1b84f4
bump 267 (#9403) 2023-08-17 08:47:13 -07:00
Nuno Campos
c0d67420e5
Use a submodule for pydantic v1 compat (#9371)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->
2023-08-17 16:35:49 +01:00
Bagatur
995ef8a7fc
unpin pydantic (#9356) 2023-08-17 01:55:46 -07:00
Tong Gao
3c8e9a9641
Fix typos in eval_chain.py (#9365)
Fixed two minor typos.
2023-08-17 01:53:46 -07:00
Eugene Yurtsev
2673b3a314
Create pydantic v1 namespace in langchain (#9254)
Create pydantic v1 namespace in langchain experimental
2023-08-16 21:19:31 -07:00
Eugene Yurtsev
4c2de2a7f2
Adding missing types in some pydantic models (#9355)
* Adding missing types in some pydantic models -- this change is
required for making the code work with pydantic v2.
2023-08-16 20:10:34 -07:00
Harrison Chase
1c089cadd7
fix import v2 (#9346) 2023-08-16 17:33:01 -07:00
qqjettkgjzhxmwj
84a97d55e1
Fix typo in llm_router.py (#9322)
Fix typo
2023-08-16 15:56:44 -07:00
Joe Reuter
09aa1eac03
Airbyte loaders: Fix last_state getter (#9314)
This PR fixes the Airbyte loaders when doing incremental syncs. The
notebooks are calling out to access `loader.last_state` to get the
current state of incremental syncs, but this didn't work due to a
refactoring of how the loaders are structured internally in the original
PR.

This PR fixes the issue by adding a `last_state` property that forwards
the state correctly from the CDK adapter.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-16 15:56:33 -07:00
Jakub Kuciński
8bebc9206f
Add improved sources splitting in BaseQAWithSourcesChain (#8716)
## Type:
Improvement

---

## Description:
Running QAWithSourcesChain sometimes raises ValueError as mentioned in
issue #7184:
```
ValueError: too many values to unpack (expected 2)
Traceback:

    response = qa({"question": pregunta}, return_only_outputs=True)
File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 166, in __call__
    raise e
File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 160, in __call__
    self._call(inputs, run_manager=run_manager)
File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\qa_with_sources\base.py", line 132, in _call
    answer, sources = re.split(r"SOURCES:\s", answer)
```
This is due to LLM model generating subsequent question, answer and
sources, that is complement in a similar form as below:
```
<final_answer>
SOURCES: <sources>
QUESTION: <new_or_repeated_question>
FINAL ANSWER: <new_or_repeated_final_answer>
SOURCES: <new_or_repeated_sources>
```
It leads the following line
```
 re.split(r"SOURCES:\s", answer)
```
to return more than 2 elements and result in ValueError. The simple fix
is to split also with "QUESTION:\s" and take the first two elements:
```
answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", answer)[:2]
```

Sometimes LLM might also generate some other texts, like alternative
answers in a form:
```
<final_answer_1>
SOURCES: <sources>

<final_answer_2>
SOURCES: <sources>

<final_answer_3>
SOURCES: <sources>
```
In such cases it is the best to split previously obtained sources with
new line:
```
sources = re.split(r"\n", sources.lstrip())[0]
```



---

## Issue:
Resolves #7184

---

## Maintainer:
@baskaryan
2023-08-16 13:30:15 -07:00
Bagatur
a3c79b1909
Add tiktoken integration dep (#9332) 2023-08-16 12:09:22 -07:00
Bagatur
ba5fbaba70
bump 266 (#9296) 2023-08-16 01:13:19 -07:00
axiangcoding
63601551b1
fix(llms): improve the ernie chat model (#9289)
- Description: improve the ernie chat model.
   - fix missing kwargs to payload
   - new test cases
   - add some debug level log
   - improve description
- Issue: None
- Dependencies: None
- Tag maintainer: @baskaryan
2023-08-16 00:48:42 -07:00
Daniel Chalef
1d55141c50
zep/new ZepVectorStore (#9159)
- new ZepVectorStore class
- ZepVectorStore unit tests
- ZepVectorStore demo notebook
- update zep-python to ~1.0.2

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-16 00:23:07 -07:00
William FH
2519580994
Add Schema Evals (#9228)
Simple eval checks for whether a generation is valid json and whether it
matches an expected dict
2023-08-15 17:17:32 -07:00
Kenny
74a64cfbab
expose output key to create_openai_fn_chain (#9155)
I quick change to allow the output key of create_openai_fn_chain to
optionally be changed.

@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-15 17:01:32 -07:00
Bagatur
afba2be3dc
update openai functions docs (#9278) 2023-08-15 17:00:56 -07:00
Bagatur
9abf60acb6
Bagatur/vectara regression (#9276)
Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
2023-08-15 16:19:46 -07:00
Xiaoyu Xee
b30f449dae
Add dashvector vectorstore (#9163)
## Description
Add `Dashvector` vectorstore for langchain

- [dashvector quick
start](https://help.aliyun.com/document_detail/2510223.html)
- [dashvector package description](https://pypi.org/project/dashvector/)

## How to use
```python
from langchain.vectorstores.dashvector import DashVector

dashvector = DashVector.from_documents(docs, embeddings)
```

---------

Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-15 16:19:30 -07:00
Bagatur
bfbb97b74c
Bagatur/deeplake docs fixes (#9275)
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
2023-08-15 15:56:36 -07:00
Kunj-2206
1b3942ba74
Added BittensorLLM (#9250)
Description: Adding NIBittensorLLM via Validator Endpoint to langchain
llms
Tag maintainer: @Kunj-2206

Maintainer responsibilities:
    Models / Prompts: @hwchase17, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-15 15:40:52 -07:00
Toshish Jawale
852722ea45
Improvements in Nebula LLM (#9226)
- Description: Added improvements in Nebula LLM to perform auto-retry;
more generation parameters supported. Conversation is no longer required
to be passed in the LLM object. Examples are updated.
  - Issue: N/A
  - Dependencies: N/A
  - Tag maintainer: @baskaryan 
  - Twitter handle: symbldotai

---------

Co-authored-by: toshishjawale <toshish@symbl.ai>
2023-08-15 15:33:07 -07:00
Bagatur
358562769a
Bagatur/refac faiss (#9076)
Code cleanup and bug fix in deletion
2023-08-15 15:19:00 -07:00
Bagatur
3eccd72382
pin pydantic (#9274)
don't want default to be v2 yet
2023-08-15 15:02:28 -07:00
Erick Friis
76d09b4ed0
hub push/pull (#9225)
Description: Adds push/pull functions to interact with the hub
Issue: n/a
Dependencies: `langchainhub`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-15 14:11:43 -07:00
Alex Gamble
cf17c58b47
Update documentation for the Context integration with new URL and features (#9259)
Update documentation and URLs for the Langchain Context integration.

We've moved from getcontext.ai to context.ai \o/

Thanks in advance for the review!
2023-08-15 11:38:34 -07:00
Eugene Yurtsev
a091b4bf4c
Update testing workflow to test with both pydantic versions (#9206)
* PR updates test.yml to test with both pydantic versions
* Code should be refactored to make it easier to do testing in matrix
format w/ packages
* Added steps to assert that pydantic version in the environment is as
expected
2023-08-15 13:21:11 -04:00
Bagatur
e0162baa3b
add oai sched tests (#9257) 2023-08-15 09:40:33 -07:00
Joseph McElroy
5e9687a196
Elasticsearch self-query retriever (#9248)
Now with ElasticsearchStore VectorStore merged, i've added support for
the self-query retriever.

I've added a notebook also to demonstrate capability. I've also added
unit tests.

**Credit**
@elastic and @phoey1 on twitter.
2023-08-15 10:53:43 -04:00
Eugene Yurtsev
0470198fb5
Remove packages for pydantic compatibility (#9217)
# Poetry updates

This PR updates LangChains poetry file to remove
any dependencies that aren't pydantic v2 compatible yet.

All packages remain usable under pydantic v1, and can be installed
separately. 

## Bumping the following packages:

* langsmith

## Removing the following packages

not used in extended unit-tests:

* zep-python, anthropic, jina, spacy, steamship, betabageldb

not used at all:

* octoai-sdk

Cleaning up extras w/ for removed packages.

## Snapshots updated

Some snapshots had to be updated due to a change in the data model in
langsmith. RunType used to be Union of Enum and string and was changed
to be string only.
2023-08-15 10:41:25 -04:00
Bagatur
e986afa13a
bump 265 (#9253) 2023-08-15 07:21:32 -07:00
Hech
4b505060bd
fix: max_marginal_relevance_search and docs in Dingo (#9244) 2023-08-15 01:06:06 -07:00
axiangcoding
664ff28cba
feat(llms): support ernie chat (#9114)
Description: support ernie (文心一言) chat model
Related issue: #7990
Dependencies: None
Tag maintainer: @baskaryan
2023-08-15 01:05:46 -07:00
Bharat Ramanathan
08a8363fc6
feat(integration): Add support to serialize protobufs in WandbTracer (#8914)
This PR adds serialization support for protocol bufferes in
`WandbTracer`. This allows code generation chains to be visualized.
Additionally, it also fixes a minor bug where the settings are not
honored when a run is initialized before using the `WandbTracer`

@agola11

---------

Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-15 01:05:12 -07:00
Joshua Sundance Bailey
ef0664728e
ArcGISLoader update (#9240)
Small bug fixes and added metadata based on user feedback. This PR is
from the author of https://github.com/langchain-ai/langchain/pull/8873 .
2023-08-14 23:44:29 -07:00
Joseph McElroy
eac4ddb4bb
Elasticsearch Store Improvements (#8636)
Todo:
- [x] Connection options (cloud, localhost url, es_connection) support
- [x] Logging support
- [x] Customisable field support
- [x] Distance Similarity support 
- [x] Metadata support
  - [x] Metadata Filter support 
- [x] Retrieval Strategies
  - [x] Approx
  - [x] Approx with Hybrid
  - [x] Exact
  - [x] Custom 
  - [x] ELSER (excluding hybrid as we are working on RRF support)
- [x] integration tests 
- [x] Documentation

👋 this is a contribution to improve Elasticsearch integration with
Langchain. Its based loosely on the changes that are in master but with
some notable changes:

## Package name & design improvements
The import name is now `ElasticsearchStore`, to aid discoverability of
the VectorStore.

```py
## Before
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch, ElasticKnnSearch

## Now
from langchain.vectorstores.elasticsearch import ElasticsearchStore
```

## Retrieval Strategy support
Before we had a number of classes, depending on the strategy you wanted.
`ElasticKnnSearch` for approx, `ElasticVectorSearch` for exact / brute
force.

With `ElasticsearchStore` we have retrieval strategies:

### Approx Example
Default strategy for the vast majority of developers who use
Elasticsearch will be inferring the embeddings from outside of
Elasticsearch. Uses KNN functionality of _search.

```py
        texts = ["foo", "bar", "baz"]
       docsearch = ElasticsearchStore.from_texts(
            texts,
            FakeEmbeddings(),
            es_url="http://localhost:9200",
            index_name="sample-index"
        )
        output = docsearch.similarity_search("foo", k=1)
```

### Approx, with hybrid
Developers who want to search, using both the embedding and the text
bm25 match. Its simple to enable.

```py
 texts = ["foo", "bar", "baz"]
       docsearch = ElasticsearchStore.from_texts(
            texts,
            FakeEmbeddings(),
            es_url="http://localhost:9200",
            index_name="sample-index",
            strategy=ElasticsearchStore.ApproxRetrievalStrategy(hybrid=True)
        )
        output = docsearch.similarity_search("foo", k=1)
```

### Approx, with `query_model_id`
Developers who want to infer within Elasticsearch, using the model
loaded in the ml node.

This relies on the developer to setup the pipeline and index if they
wish to embed the text in Elasticsearch. Example of this in the test.

```py
 texts = ["foo", "bar", "baz"]
       docsearch = ElasticsearchStore.from_texts(
            texts,
            FakeEmbeddings(),
            es_url="http://localhost:9200",
            index_name="sample-index",
            strategy=ElasticsearchStore.ApproxRetrievalStrategy(
                query_model_id="sentence-transformers__all-minilm-l6-v2"
            ),
        )
        output = docsearch.similarity_search("foo", k=1)
```

### I want to provide my own custom Elasticsearch Query
You might want to have more control over the query, to perform
multi-phase retrieval such as LTR, linearly boosting on document
parameters like recently updated or geo-distance. You can do this with
`custom_query_fn`

```py
        def my_custom_query(query_body: dict, query: str) -> dict:
            return {"query": {"match": {"text": {"query": "bar"}}}}

        texts = ["foo", "bar", "baz"]
        docsearch = ElasticsearchStore.from_texts(
            texts, FakeEmbeddings(), **elasticsearch_connection, index_name=index_name
        )
        docsearch.similarity_search("foo", k=1, custom_query=my_custom_query)

```

### Exact Example
Developers who have a small dataset in Elasticsearch, dont want the cost
of indexing the dims vs tradeoff on cost at query time. Uses
script_score.

```py
        texts = ["foo", "bar", "baz"]
       docsearch = ElasticsearchStore.from_texts(
            texts,
            FakeEmbeddings(),
            es_url="http://localhost:9200",
            index_name="sample-index",
            strategy=ElasticsearchStore.ExactRetrievalStrategy(),
        )
        output = docsearch.similarity_search("foo", k=1)
```

### ELSER Example
Elastic provides its own sparse vector model called ELSER. With these
changes, its really easy to use. The vector store creates a pipeline and
index thats setup for ELSER. All the developer needs to do is configure,
ingest and query via langchain tooling.

```py
texts = ["foo", "bar", "baz"]
       docsearch = ElasticsearchStore.from_texts(
            texts,
            FakeEmbeddings(),
            es_url="http://localhost:9200",
            index_name="sample-index",
            strategy=ElasticsearchStore.SparseVectorStrategy(),
        )
        output = docsearch.similarity_search("foo", k=1)

```

## Architecture
In future, we can introduce new strategies and allow us to not break bwc
as we evolve the index / query strategy.

## Credit
On release, could you credit @elastic and @phoey1 please? Thank you!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-14 23:42:35 -07:00
Divyansh Garg
9529483c2a
Improve MultiOn client toolkit prompts (#9222)
- Updated prompts for the MultiOn toolkit for better functionality
- Non-blocking but good to have it merged to improve the overall
performance for the toolkit
 
@hinthornw @hwchase17

---------

Co-authored-by: Naman Garg <ngarg3@binghamton.edu>
2023-08-14 17:39:51 -07:00
William FH
c478fc208e
Default On Retry (#9230)
Base callbacks don't have a default on retry event

Fix #8542

---------

Co-authored-by: landonsilla <landon.silla@stepstone.com>
2023-08-14 16:45:17 -07:00
Leonid Ganeline
93dd499997
docstrings: document_loaders consistency 3 (#9216)
Updated docstrings into the consistent format (probably, the last update
for the `document_loaders`.
2023-08-14 16:28:39 -07:00
Kshitij Wadhwa
a69cb95850
track langchain usage for Rockset (#9229)
Add ability to track langchain usage for Rockset. Rockset's new python
client allows setting this. To prevent old clients from failing, it
ignore if setting throws exception (we can't track old versions)

Tested locally with old and new Rockset python client

cc @baskaryan
2023-08-14 16:27:34 -07:00
Leonid Ganeline
7810ea5812
docstrings: chat_models consistency (#9227)
Updated docstrings into the consistent format.
2023-08-14 16:15:56 -07:00
William FH
b0896210c7
Return feedback with failed response if there's an error (#9223)
In Evals
2023-08-14 15:59:16 -07:00
William FH
7124f2ebfa
Parent Doc Retriever (#9214)
2 things:
- Implement the private method rather than the public one so callbacks
are handled properly
- Add search_kwargs (Open to not adding this if we are trying to
deprecate this UX but seems like as a user i'd assume similar args to
the vector store retriever. In fact some may assume this implements the
same interface but I'm not dealing with that here)
-
2023-08-14 15:41:53 -07:00
Harrison Chase
3f601b5809
add async method in (#9204) 2023-08-14 11:04:31 -07:00
Clark
03ea0762a1
fix(jinachat): related to #9197 (#9200)
related to: https://github.com/langchain-ai/langchain/issues/9197

---------

Co-authored-by: qianjun.wqj <qianjun.wqj@alibaba-inc.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-14 11:04:20 -07:00
Eugene Yurtsev
4f1feaca83
Wrap OpenAPI features in conditionals for pydantic v2 compatibility (#9205)
Wrap OpenAPI in conditionals for pydantic v2 compatibility.
2023-08-14 13:40:58 -04:00
Glauco Custódio
89be10f6b4
add ttl to RedisCache (#9068)
Add `ttl` (time to live) to `RedisCache`
2023-08-14 12:59:18 -04:00
Eugene Yurtsev
04bc5f3b18
Conditionally add pydantic v1 to namespace (#9202)
Conditionally add pydantic_v1 to namespace.
2023-08-14 11:26:45 -04:00
shibuiwilliam
feec422bf7
fix logging to logger (#9192)
# What
- fix logging to logger
2023-08-14 08:21:09 -07:00
Bagatur
5935767056
bump lc 246, lce 9 (#9207) 2023-08-14 08:14:37 -07:00
Bagatur
b5a57acf6c
lite llm lint (#9208) 2023-08-14 11:03:06 -04:00
Krish Dholakia
49f1d8477c
Adding ChatLiteLLM model (#9020)
Description: Adding a langchain integration for the LiteLLM library 
Tag maintainer: @hwchase17, @baskaryan
Twitter handle: @krrish_dh / @Berri_AI

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-14 07:43:40 -07:00
Eugene Yurtsev
72f9150a50
Update 2 more pydantic imports (#9203)
Update two more pydantic imports to use v1 explicitly
2023-08-14 10:11:30 -04:00
Eugene Yurtsev
c172f972ea
Create pydantic v1 namespace, add partial compatibility for pydantic v2 (#9123)
First of a few PRs to add full compatibility to both pydantic v1 and v2.

This PR creates pydantic v1 namespace and adds it to sys.modules.

Upcoming changes: 
1. Handle `openapi-schema-pydantic = "^1.2"` and dependent chains/tools
2. bump dependencies to versions that are cross compatible for pydantic
or remove them (see below)
3. Add tests to github workflows to test with pydantic v1 and v2

**Dependencies**

From a quick look (could be wrong since was done manually)

**dependencies pinning pydantic below 2** (some of these can be bumped
to newer versions are provide cross-compatible code)
anthropic
bentoml
confection
fastapi
langsmith
octoai-sdk
openapi-schema-pydantic
qdrant-client
spacy
steamship
thinc
zep-python

Unpinned

marqo (*)
nomic (*)
xinference(*)
2023-08-14 09:37:32 -04:00
Evan Schultz
8189dea0d8
Fixes typing issues in BaseOpenAI (#9183)
## Description: 

Sets default values for `client` and `model` attributes in the
BaseOpenAI class to fix Pylance Typing issue.

  - Issue: #9182.
  - Twitter handle: @evanmschultz
2023-08-13 23:03:28 -07:00
Massimiliano Pronesti
d95eeaedbe
feat(llms): support vLLM's OpenAI-compatible server (#9179)
This PR aims at supporting [vLLM's OpenAI-compatible server
feature](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server),
i.e. allowing to call vLLM's LLMs like if they were OpenAI's.

I've also udpated the related notebook providing an example usage. At
the moment, vLLM only supports the `Completion` API.
2023-08-13 23:03:05 -07:00
Michael Goin
621da3c164
Adds DeepSparse as an LLM (#9184)
Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM
backend. DeepSparse supports running various open-source sparsified
models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for
performance gains on CPUs.

Twitter handles: @mgoin_ @neuralmagic


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-13 22:35:58 -07:00
Bagatur
0fa69d8988
Bagatur/zep python 1.0 (#9186)
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
2023-08-13 21:52:53 -07:00
Eugene Yurtsev
9b24f0b067
Enhance deprecation decorator to modify docs with sphinx directives (#9069)
Enhance deprecation decorator
2023-08-13 15:35:01 -04:00
Bagatur
cdfe2c96c5
bump 263 (#9156) 2023-08-12 12:36:44 -07:00
Leonid Ganeline
19f504790e
docstrings: document_loaders consitency 2 (#9148)
This is Part 2. See #9139 (Part 1).
2023-08-11 16:25:40 -07:00
Harrison Chase
1b58460fe3
update keys for chain (#5164)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 16:25:13 -07:00
胡亮
7edf4ca396
Support multi gpu inference for HuggingFaceEmbeddings (#4732)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 15:55:44 -07:00
UmerHA
8aab39e3ce
Added SmartGPT workflow (issue #4463) (#4816)
# Added SmartGPT workflow by providing SmartLLM wrapper around LLMs
Edit:
As @hwchase17 suggested, this should be a chain, not an LLM. I have
adapted the PR.

It is used like this:
```
from langchain.prompts import PromptTemplate
from langchain.chains import SmartLLMChain
from langchain.chat_models import ChatOpenAI

hard_question = "I have a 12 liter jug and a 6 liter jug. I want to measure 6 liters. How do I do it?"
hard_question_prompt = PromptTemplate.from_template(hard_question)

llm = ChatOpenAI(model_name="gpt-4")
prompt = PromptTemplate.from_template(hard_question)
chain = SmartLLMChain(llm=llm, prompt=prompt, verbose=True)

chain.run({})
```


Original text: 
Added SmartLLM wrapper around LLMs to allow for SmartGPT workflow (as in
https://youtu.be/wVzuvf9D9BU). SmartLLM can be used wherever LLM can be
used. E.g:

```
smart_llm = SmartLLM(llm=OpenAI())
smart_llm("What would be a good company name for a company that makes colorful socks?")
```
or
```
smart_llm = SmartLLM(llm=OpenAI())
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
chain = LLMChain(llm=smart_llm, prompt=prompt)
chain.run("colorful socks")
```

SmartGPT consists of 3 steps:

1. Ideate - generate n possible solutions ("ideas") to user prompt
2. Critique - find flaws in every idea & select best one
3. Resolve - improve upon best idea & return it

Fixes #4463

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @agola11

Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
RicChilligerDude#7589

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 15:44:27 -07:00
Lucas Pickup
1d3735a84c
Ensure deployment_id is set to provided deployment, required for Azure OpenAI. (#5002)
# Ensure deployment_id is set to provided deployment, required for Azure
OpenAI.
---------

Co-authored-by: Lucas Pickup <lupickup@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 15:43:01 -07:00
Bagatur
45741bcc1b
Bagatur/vectara nit (#9140)
Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
2023-08-11 15:32:03 -07:00
Dominick DEV
9b64932e55
Add LangChain utility for real-time crypto exchange prices (#4501)
This commit adds the LangChain utility which allows for the real-time
retrieval of cryptocurrency exchange prices. With LangChain, users can
easily access up-to-date pricing information by running the command
".run(from_currency, to_currency)". This new feature provides a
convenient way to stay informed on the latest exchange rates and make
informed decisions when trading crypto.


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 14:45:06 -07:00
Joshua Sundance Bailey
eaa505fb09
Create ArcGISLoader & example notebook (#8873)
- Description: Adds the ArcGISLoader class to
`langchain.document_loaders`
  - Allows users to load data from ArcGIS Online, Portal, and similar
- Users can authenticate with `arcgis.gis.GIS` or retrieve public data
anonymously
  - Uses the `arcgis.features.FeatureLayer` class to retrieve the data
  - Defines the most relevant keywords arguments and accepts `**kwargs`
- Dependencies: Using this class requires `arcgis` and, optionally,
`bs4.BeautifulSoup`.

Tagging maintainers:
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 14:33:40 -07:00
Bagatur
e21152358a
fix (#9145) 2023-08-11 13:58:23 -07:00
Leonid Ganeline
edb585228d
docstrings: document_loaders consitency (#9139)
Formatted docstrings from different formats to consistent format, lile:
>Loads processed docs from Docugami.
"Load from `Docugami`."

>Loader that uses Unstructured to load HTML files.
"Load `HTML` files using `Unstructured`."

>Load documents from a directory.
"Load from a directory."
 
- `Load` - no `Loads`
- DocumentLoader always loads Documents, so no more
"documents/docs/texts/ etc"
- integrated systems and APIs enclosed in backticks,
2023-08-11 13:09:31 -07:00
Markus Schiffer
00bf472265
Fix for SVM retriever discarding document metadata (#9141)
As stated in the title the SVM retriever discarded the metadata of
passed in docs. This code fixes that. I also added one unit test that
should test that.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 13:08:17 -07:00
Bagatur
bace17e0aa
rm integration deps (#9142) 2023-08-11 12:43:08 -07:00
Eugene Yurtsev
44bc89b7bf
Support a few list like operations on ChatPromptTemplate (#9077)
Make it easier to work with chat prompt template
2023-08-11 14:49:51 -04:00
Hai The Dude
e4418d1b7e
Added new use case docs for Web Scraping, Chromium loader, BS4 transformer (#8732)
- Description: Added a new use case category called "Web Scraping", and
a tutorial to scrape websites using OpenAI Functions Extraction chain to
the docs.
  - Tag maintainer:@baskaryan @hwchase17 ,
- Twitter handle: https://www.linkedin.com/in/haiphunghiem/ (I'm on
LinkedIn mostly)

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
2023-08-11 11:46:59 -07:00
sseide
6cb763507c
add basic support for redis cluster server (#9128)
This change updates the central utility class to recognize a Redis
cluster server after connection and returns an new cluster aware Redis
client. The "normal" Redis client would not be able to talk to a cluster
node because keys might be stored on other shards of the Redis cluster
and therefor not readable or writable.

With this patch clients do not need to know what Redis server it is,
they just connect though the same API calls for standalone and cluster
server.

There are no dependencies added due to this MR.

Remark - with current redis-py client library (4.6.0) a cluster cannot
be used as VectorStore. It can be used for other use-cases. There is a
bug / missing feature(?) in the Redis client breaking the VectorStore
implementation. I opened an issue at the client library too
(redis/redis-py#2888) to fix this. As soon as this is fixed in
`redis-py` library it should be usable there too.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-11 11:37:44 -07:00
David Duong
6d03f8b5d8
Add serialisable support for Replicate (#8525) 2023-08-11 11:35:21 -07:00
niklub
16af5f8690
Add LabelStudio integration (#8880)
This PR introduces [Label Studio](https://labelstud.io/) integration
with LangChain via `LabelStudioCallbackHandler`:

- sending data to the Label Studio instance
- labeling dataset for supervised LLM finetuning
- rating model responses
- tracking and displaying chat history
- support for custom data labeling workflow

### Example

```
chat_llm = ChatOpenAI(callbacks=[LabelStudioCallbackHandler(mode="chat")])
chat_llm([
    SystemMessage(content="Always use emojis in your responses."),
        HumanMessage(content="Hey AI, how's your day going?"),
    AIMessage(content="🤖 I don't have feelings, but I'm running smoothly! How can I help you today?"),
        HumanMessage(content="I'm feeling a bit down. Any advice?"),
    AIMessage(content="🤗 I'm sorry to hear that. Remember, it's okay to seek help or talk to someone if you need to. 💬"),
        HumanMessage(content="Can you tell me a joke to lighten the mood?"),
    AIMessage(content="Of course! 🎭 Why did the scarecrow win an award? Because he was outstanding in his field! 🌾"),
        HumanMessage(content="Haha, that was a good one! Thanks for cheering me up."),
    AIMessage(content="Always here to help! 😊 If you need anything else, just let me know."),
        HumanMessage(content="Will do! By the way, can you recommend a good movie?"),
])
```

<img width="906" alt="image"
src="https://github.com/langchain-ai/langchain/assets/6087484/0a1cf559-0bd3-4250-ad96-6e71dbb1d2f3">


### Dependencies
- [label-studio](https://pypi.org/project/label-studio/)
- [label-studio-sdk](https://pypi.org/project/label-studio-sdk/)

https://twitter.com/labelstudiohq

---------

Co-authored-by: nik <nik@heartex.net>
2023-08-11 11:24:10 -07:00
Bagatur
8cb2594562
Bagatur/dingo (#9079)
Co-authored-by: gary <1625721671@qq.com>
2023-08-11 10:54:45 -07:00
Jacques Arnoux
926c64da60
Fix web research retriever for unknown links in results (#9115)
Fixes an issue with web research retriever for unknown links in results.
This is currently making the retrieve crash sometimes.

@rlancemartin
2023-08-11 10:50:37 -07:00
Alvaro Bartolome
f7ae183f40
ArgillaCallbackHandler to properly use default values for api_url and api_key (#9113)
As of the recent PR at #9043, after some testing we've realised that the
default values were not being used for `api_key` and `api_url`. Besides
that, the default for `api_key` was set to `argilla.apikey`, but since
the default values are intended for people using the Argilla Quickstart
(easy to run and setup), the defaults should be instead `owner.apikey`
if using Argilla 1.11.0 or higher, or `admin.apikey` if using a lower
version of Argilla.

Additionally, we've removed the f-string replacements from the
docstrings.

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>
2023-08-11 09:37:06 -07:00
Bagatur
01ef786e7e
bump 262 (#9108) 2023-08-11 01:29:07 -07:00
Bagatur
3b754b5461
Bagatur/filter metadata (#9015)
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
2023-08-11 01:10:00 -07:00
Kim Minjong
7f0e847c13
Update pydantic format instruction prompt (#9095)
- remove unopened bracket
2023-08-11 00:22:13 -07:00
Ashutosh Sanzgiri
991b448dfc
minor edits (#9093)
Description:

Minor edit to PR#845

Thanks!
2023-08-10 23:40:36 -07:00
Bagatur
3ab4e21579
fix json tool (#9096) 2023-08-10 23:39:25 -07:00
Sam Groenjes
2184e3a400
Fix IndexError when input_list is Empty in prep_prompts (#5769)
This MR corrects the IndexError arising in prep_prompts method when no
documents are returned from a similarity search.

Fixes #1733 
Co-authored-by: Sam Groenjes <sam.groenjes@darkwolfsolutions.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 22:50:39 -07:00
Chenyu Zhao
c0acbdca1b
Update Fireworks model names (#9085) 2023-08-10 19:23:42 -07:00
Bagatur
b80e3825a6
Bagatur/pinecone by vector (#9087)
Co-authored-by: joseph <joe@outverse.com>
2023-08-10 18:28:55 -07:00
Nikhil Kumar
6abb2c2c08
Buffer method of ConversationTokenBufferMemory should be able to return messages as string (#7057)
### Description:
`ConversationBufferTokenMemory` should have a simple way of returning
the conversation messages as a string.

Previously to complete this, you would only have the option to return
memory as an array through the buffer method and call
`get_buffer_string` by importing it from `langchain.schema`, or use the
`load_memory_variables` method and key into `self.memory_key`.

### Maintainer
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 18:17:22 -07:00
William FH
57dd4daa9a
Add string example mapper (#9086)
Now that we accept any runnable or arbitrary function to evaluate, we
don't always look up the input keys. If an evaluator requires
references, we should try to infer if there's one key present. We only
have delayed validation here but it's better than nothing
2023-08-10 17:07:02 -07:00
Bidhan Roy
02430e25b6
BagelDB (bageldb.ai), VectorStore integration. (#8971)
- **Description**: [BagelDB](bageldb.ai) a collaborative vector
database. Integrated the bageldb PyPi package with langchain with
related tests and code.

  - **Issue**: Not applicable.
  - **Dependencies**: `betabageldb` PyPi package.
  - **Tag maintainer**: @rlancemartin, @eyurtsev, @baskaryan
  - **Twitter handle**: bageldb_ai (https://twitter.com/BagelDB_ai)
  
We ran `make format`, `make lint` and `make test` locally.

Followed the contribution guideline thoroughly
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

---------

Co-authored-by: Towhid1 <nurulaktertowhid@gmail.com>
2023-08-10 16:48:36 -07:00
DJ Atha
ee52482db8
Fix issue 7445 (#7635)
Description: updated BabyAGI examples and experimental to append the
iteration to the result id to fix error storing data to vectorstore.
Issue: 7445
Dependencies: no
Tag maintainer: @eyurtsev
This fix worked for me locally. Happy to take some feedback and iterate
on a better solution. I was considering appending a uuid instead but
didn't want to over complicate the example.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 16:29:31 -07:00
Harrison Chase
bb6fbf4c71
openai adapters (#8988)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-08-10 16:08:50 -07:00
Harrison Chase
45f0f9460a
add async for python repl (#9080) 2023-08-10 16:07:06 -07:00
Neil Murphy
105c787e5a
Add convenience methods to ConversationBufferMemory and ConversationB… (#8981)
Add convenience methods to `ConversationBufferMemory` and
`ConversationBufferWindowMemory` to get buffer either as messages or as
string.

Helps when `return_messages` is set to `True` but you want access to the
messages as a string, and vice versa.

@hwchase17

One use case: Using a `MultiPromptRouter` where `default_chain` is
`ConversationChain`, but destination chains are `LLMChains`. Injecting
chat memory into prompts for destination chains prints a stringified
`List[Messages]` in the prompt, which creates a lot of noise. These
convenience methods allow caller to choose either as needed.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 15:45:30 -07:00
Zend
6221eb5974
Recursive url loader w/ test (#8813)
Description: Due to some issue on the test, this is a separate PR with
the test for #8502

Tag maintainer: @rlancemartin

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 14:50:31 -07:00
Junlin Zhou
cb5fb751e9
Enhance regex of structured_chat agents' output parser (#8965)
Current regex only extracts agent's action between '` ``` ``` `', this
commit will extract action between both '` ```json ``` `' and '` ``` ```
`'

This is very similar to #7511 
Co-authored-by: zjl <junlinzhou@yzbigdata.com>
2023-08-10 14:26:07 -07:00
Bagatur
16bd328aab
Use Embeddings in pinecone (#8982)
cc @eyurtsev @olivier-lacroix @jamescalam 

redo of #2741
2023-08-10 14:22:41 -07:00
Piyush Jain
8eea46ed0e
Bedrock embeddings async methods (#9024)
## Description
This PR adds the `aembed_query` and `aembed_documents` async methods for
improving the embeddings generation for large documents. The
implementation uses asyncio tasks and gather to achieve concurrency as
there is no bedrock async API in boto3.

### Maintainers
@agola11 
@aarora79  

### Open questions
To avoid throttling from the Bedrock API, should there be an option to
limit the concurrency of the calls?
2023-08-10 14:21:03 -07:00
Eugene Yurtsev
67ca187560
Fix incorrect code blocks in documentation (#9060)
Fixes incorrect code block syntax in doc strings.
2023-08-10 14:13:42 -07:00
Eugene Yurtsev
46f3428cb3
Fix more incorrect code blocks in doc strings (#9073)
Fix 2 more incorrect code blocks in strings
2023-08-10 13:49:15 -07:00
Eugene Yurtsev
a5a4c53280
RedisStore: Update init and Documentation updates (#9044)
* Update Redis Store to support init from parameters
* Update notebook to show how to use redis store, and some fixes in
documentation
2023-08-10 15:30:29 -04:00
Leonid Ganeline
fcbbddedae
ArxivLoader fix for issue 9046 (#9061)
Fixed #9046 
Added ut-s for this fix.
 @eyurtsev
2023-08-10 14:59:39 -04:00
Mike Lambert
e94a5d753f
Move from test to supported claude-instant-1 model (#9066)
Moves from "test" model to "claude-instant-1" model which is supported
and has actual capacity
2023-08-10 11:57:28 -07:00
Eugene Yurtsev
b7bc8ec87f
Add excludes to FileSystemBlobLoader (#9064)
Add option to specify exclude patterns.

https://github.com/langchain-ai/langchain/discussions/9059
2023-08-10 14:56:58 -04:00
Eugene Yurtsev
6c70f491ba
ChatPromptTemplate pending deprecation proposal (#9004)
Pending deprecations for ChatPromptTemplate proposals
2023-08-10 14:40:55 -04:00
TRY-ER
2431eca700
Agent vector store tool doc (#9029)
I was initially confused weather to use create_vectorstore_agent or
create_vectorstore_router_agent due to lack of documentation so I
created a simple documentation for each of the function about their
different usecase.
Replace this comment with:
- Description: Added the doc_strings in create_vectorstore_agent and
create_vectorstore_router_agent to point out the difference in their
usecase
  - Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 11:13:12 -07:00
Alvaro Bartolome
08a0741d82
Update ArgillaCallbackHandler as of latest argilla release (#9043)
Hi @agola11, or whoever is reviewing this PR 😄 

## What's in this PR?

As of the latest Argilla release, we'll change and refactor some things
to make some workflows easier, one of those is how everything's pushed
to Argilla, so that now there's no need to call `push_to_argilla` over a
`FeedbackDataset` when either `push_to_argilla` is called for the first
time, or `from_argilla` is called; among others.

We also add some class variables to make sure those are easy to update
in case we update those internally in the future, also to make the
`warnings.warn` message lighter from the code view.

P.S. Regarding the Twitter/X mention feel free to do so at either
https://twitter.com/argilla_io or https://twitter.com/alvarobartt, or
both if applicable, otherwise, just the first Twitter/X handle.
2023-08-10 10:59:46 -07:00
Blake (Yung Cher Ho)
8d351bfc20
Takeoff integration (#9045)
## Description:
This PR adds the Titan Takeoff Server to the available LLMs in
LangChain.

Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that allows you to deploy large
language models locally on your hardware in a single command. Most
generative model architectures are included, such as Falcon, Llama 2,
GPT2, T5 and many more.

Read more about Titan Takeoff here:
-
[Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)

#### Testing
As Titan Takeoff runs locally on port 8000 by default, no network access
is needed. Responses are mocked for testing.

- [x] Make Lint
- [x] Make Format
- [x] Make Test

#### Dependencies
No new dependencies are introduced. However, users will need to install
the titan-iris package in their local environment and start the Titan
Takeoff inferencing server in order to use the Titan Takeoff
integration.

Thanks for your help and please let me know if you have any questions.

cc: @hwchase17 @baskaryan
2023-08-10 10:56:06 -07:00
Nuno Campos
3bdc273ab3
Implement .transform() in RunnablePassthrough() (#9032)
- This ensures passthrough doesnt break streaming
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-10 10:41:19 -07:00