Commit Graph

111 Commits

Author SHA1 Message Date
Harrison Chase
82df923f37 Merge branch 'master' of github.com:hwchase17/langchain 2023-07-27 22:01:20 -07:00
Harrison Chase
1b0bfa54cf cr 2023-07-27 22:00:52 -07:00
Jeff Vestal
c7ff5f19a8
ElasticKnnSearch rewrite - bug fix - return Document (#8180)
Fixes: 
https://github.com/hwchase17/langchain/issues/7117
https://github.com/hwchase17/langchain/issues/5760

Adding back `create_index` , `add_texts`, `from_texts` to
ElasticKnnSearch

`from_texts` matches standard `from_texts` methods as quick start up
method

`knn_search` and `hybrid_result` return a list of [`Document()`,
`score`,]

# Test `from_texts` for quick start
```
# create new index using from_text

from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch
from langchain.embeddings import ElasticsearchEmbeddings

model_id = "sentence-transformers__all-distilroberta-v1" 
dims = 768
es_cloud_id = ""
es_user = ""
es_password = ""
test_index = "knn_test_index_305"

embeddings = ElasticsearchEmbeddings.from_credentials(
    model_id,
    #input_field=input_field,
    es_cloud_id=es_cloud_id,
    es_user=es_user,
    es_password=es_password,
)

# add texts and create class instance
texts = ["This is a test document", "This is another test document"]
knnvectorsearch = ElasticKnnSearch.from_texts(
    texts=texts,
    embedding=embeddings,
    index_name= test_index,
    vector_query_field='vector',
    query_field='text',
    model_id=model_id,
    dims=dims,
	es_cloud_id=es_cloud_id, 
	es_user=es_user, 
	es_password=es_password
)

# Test `add_texts` method
texts2 = ["Hello, world!", "Machine learning is fun.", "I love Python."]
knnvectorsearch.add_texts(texts2)

query = "Hello"
knn_result = knnvectorsearch.knn_search(query = query, model_id= model_id, k=2)

hybrid_result = knnvectorsearch.knn_hybrid_search(query = query, model_id= model_id, k=2)

```

The  mapping is as follows:
```
{
  "knn_test_index_012": {
    "mappings": {
      "properties": {
        "text": {
          "type": "text"
        },
        "vector": {
          "type": "dense_vector",
          "dims": 768,
          "index": true,
          "similarity": "dot_product"
        }
      }
    }
  }
}
```

# Check response type
```
>>> hybrid_result
[(Document(page_content='Hello, world!', metadata={}), 0.94232327), (Document(page_content='I love Python.', metadata={}), 0.5321523)]

>>> hybrid_result[0]
(Document(page_content='Hello, world!', metadata={}), 0.94232327)

>>> hybrid_result[0][0]
Document(page_content='Hello, world!', metadata={})

>>> type(hybrid_result[0][0])
<class 'langchain.schema.document.Document'>
```

# Test with existing Index
```
from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch
from langchain.embeddings import ElasticsearchEmbeddings

## Initialize ElasticsearchEmbeddings
model_id = "sentence-transformers__all-distilroberta-v1" 
dims = 768
es_cloud_id = 
es_user = ""
es_password = ""
test_index = "knn_test_index_012"

embeddings = ElasticsearchEmbeddings.from_credentials(
    model_id,
    es_cloud_id=es_cloud_id,
    es_user=es_user,
    es_password=es_password,
)

## Initialize ElasticKnnSearch
knn_search = ElasticKnnSearch(
	es_cloud_id=es_cloud_id, 
	es_user=es_user, 
	es_password=es_password, 
	index_name= test_index, 
	embedding= embeddings
)


## Test adding vectors

### Test `add_texts` method when index created
texts = ["Hello, world!", "Machine learning is fun.", "I love Python."]
knn_search.add_texts(texts)

```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 22:00:18 -07:00
Harrison Chase
a221a9ced0
Harrison/sql query (#8370)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-07-27 21:55:17 -07:00
Bagatur
a1a650c743
Bagatur/from texts bug fix (#8394)
---------

Co-authored-by: Davit Buniatyan <davit@loqsh.com>
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
Co-authored-by: Ivo Stranic <istranic@gmail.com>
2023-07-27 21:52:38 -07:00
Jiayi Ni
1efb9bae5f
FEAT: Integrate Xinference LLMs and Embeddings (#8171)
- [Xorbits
Inference(Xinference)](https://github.com/xorbitsai/inference) is a
powerful and versatile library designed to serve language, speech
recognition, and multimodal models. Xinference supports a variety of
GGML-compatible models including chatglm, whisper, and vicuna, and
utilizes heterogeneous hardware and a distributed architecture for
seamless cross-device and cross-server model deployment.
- This PR integrates Xinference models and Xinference embeddings into
LangChain.
- Dependencies: To install the depenedencies for this integration, run
    
    `pip install "xinference[all]"`
    
- Example Usage:

To start a local instance of Xinference, run `xinference`.

To deploy Xinference in a distributed cluster, first start an Xinference
supervisor using `xinference-supervisor`:

`xinference-supervisor -H "${supervisor_host}"`

Then, start the Xinference workers using `xinference-worker` on each
server you want to run them on.

`xinference-worker -e "http://${supervisor_host}:9997"`

To use Xinference with LangChain, you also need to launch a model. You
can use command line interface (CLI) to do so. Fo example: `xinference
launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a model named
vicuna-v1.3 with `model_format="ggmlv3"` and `quantization="q4_0"`. A
model UID is returned for you to use.

Now you can use Xinference with LangChain:

```python
from langchain.llms import Xinference

llm = Xinference(
    server_url="http://0.0.0.0:9997", # suppose the supervisor_host is "0.0.0.0"
    model_uid = {model_uid} # model UID returned from launching a model
)

llm(
    prompt="Q: where can we visit in the capital of France? A:",
    generate_config={"max_tokens": 1024},
)
```

You can also use RESTful client to launch a model:
```python
from xinference.client import RESTfulClient

client = RESTfulClient("http://0.0.0.0:9997")

model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0")
```

The following code block demonstrates how to use Xinference embeddings
with LangChain:
```python
from langchain.embeddings import XinferenceEmbeddings

xinference = XinferenceEmbeddings(
    server_url="http://0.0.0.0:9997",
    model_uid = model_uid
)
```

```python
query_result = xinference.embed_query("This is a test query")
```

```python
doc_result = xinference.embed_documents(["text A", "text B"])
```

Xinference is still under rapid development. Feel free to [join our
Slack
community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA)
to get the latest updates!

- Request for review: @hwchase17, @baskaryan
- Twitter handle: https://twitter.com/Xorbitsio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 21:23:19 -07:00
Bagatur
877d384bc9
Revert "PromptTemplate update documentation and expand kwargs (#8234)" (#8395)
fyi @eyurtsev was failing a unit test
2023-07-27 21:11:10 -07:00
Gordon Clark
e66759cc9d
Github add "Create PR" tool + Docs update (#8235)
Added a new tool to the Github toolkit called **Create Pull Request.**
Now we can make our own langchain contributor in langchain 😁

In order to have somewhere to pull from, I also added a new env var,
"GITHUB_BASE_BRANCH." This will allow the existing env var,
"GITHUB_BRANCH," to be a working branch for the bot (so that it doesn't
have to always commit on the main/master). For example, if you want the
bot to work in a branch called `bot_dev` and your repo base is `main`,
you would set up the vars like:
```
GITHUB_BASE_BRANCH = "main"
GITHUB_BRANCH = "bot_dev"
``` 

Maintainer responsibilities:
  - Agents / Tools / Toolkits: @hinthornw
2023-07-27 19:19:44 -07:00
William FH
ecd4aae818
Few Shot Chat Prompt (#8038)
Proposal for a few shot chat message example selector

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-07-27 18:46:10 -07:00
Eugene Yurtsev
6dd18eee26
PromptTemplate update documentation and expand kwargs (#8234)
# PromptTemplate

* Update documentation to highlight the classmethod for instantiating a
prompt template.
* Expand kwargs in the classmethod to make parameters easier to discover
2023-07-27 18:11:39 -07:00
Karan V
a003a0baf6
fix(petals) allows to run models that aren't Bloom (Support for LLama and newer models) (#8356)
In this PR:

- Removed restricted model loading logic for Petals-Bloom
- Removed petals imports (DistributedBloomForCausalLM,
BloomTokenizerFast)
- Instead imported more generalized versions of loader
(AutoDistributedModelForCausalLM, AutoTokenizer)
- Updated the Petals example notebook to allow for a successful
installation of Petals in Apple Silicon Macs

- Tag maintainer: @hwchase17, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 18:01:04 -07:00
lars.gersmann
e758e9e7f5
fix(openapi): openapi chain will work without/empty description/summa… (#8351)
Description: 

This PR will enable the Open API chain to work with valid Open API
specifications missing `description` and `summary` properties for path
and operation nodes in open api specs.

Since both `description` and `summary` property are declared optional we
cannot be sure they are defined. This PR resolves this problem by
providing an empty (`''`) description as fallback.

The previous behavior of the Open API chain was that the underlying LLM
(OpenAI) throw ed an exception since `None` is not of type string:

```
openai.error.InvalidRequestError: None is not of type 'string' - 'functions.0.description'
```

Using this PR the Open API chain will succeed also using Open API specs
lacking `description` and `summary` properties for path and operation
nodes.

Thanks for your amazing work !

Tag maintainer: @baskaryan

---------

Co-authored-by: Lars Gersmann <lars.gersmann@cm4all.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 17:58:43 -07:00
ljeagle
caa6caeb8a
Upgrade the AwaDB from v0.3.7 to v0.3.9 and change the default embeddings (#8281)
1. Upgrade the AwaDB from v0.3.7 to v0.3.9
2. Change the default embedding to AwaEmbedding

---------

Co-authored-by: ljeagle <awadb.vincent@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-27 17:20:50 -07:00
Holt Skinner
d7e6770de8
refactor: Code refactoring & simplification for Google Cloud Enterprise Search retriever (#8369)
Followup to https://github.com/langchain-ai/langchain/pull/7857

- Changes `_convert_search_response()` to use object attributes instead
of converting to dictionary
- Simplifies logic for readability
2023-07-27 17:13:49 -07:00
Taozhi Wang
594f195e54
Add embeddings for AwaEmbedding (#8353)
- Description: Adds AwaEmbeddings class for embeddings, which provides
users with a convenient way to do fine-tuning, as well as the potential
need for multimodality

  - Tag maintainer: @baskaryan

Create `Awa.ipynb`: an example notebook for AwaEmbeddings class
Modify `embeddings/__init__.py`: Import the class
Create `embeddings/awa.py`: The embedding class
Create `embeddings/test_awa.py`: The test file.

---------

Co-authored-by: taozhiwang <taozhiwa@gmail.com>
2023-07-27 17:08:00 -07:00
thehunmonkgroup
ba4e82bb47
fix missing _identifying_params() in _VertexAICommon (#8303)
Full set of params are missing from Vertex* LLMs when `dict()` method is
called.

```
>>> from langchain.chat_models.vertexai import ChatVertexAI
>>> from langchain.llms.vertexai import VertexAI
>>> chat_llm = ChatVertexAI()
l>>> llm = VertexAI()
>>> chat_llm.dict()
{'_type': 'vertexai'}
>>> llm.dict()
{'_type': 'vertexai'}
```

This PR just uses the same mechanism used elsewhere to expose the full
params.

Since `_identifying_params()` is on the `_VertexAICommon` class, it
should cover the chat and non-chat cases.
2023-07-27 16:59:10 -07:00
Caitlin2694
b2e4b9dca4
Fix exception caused by restrictions in OWL (#8341)
Description: Fix exception caused by restrictions in OWL
Issue: #8331
Dependencies: none
Maintainer: @baskaryan
2023-07-27 16:51:32 -07:00
Nikita Pokidyshev
f499e6ea6a
Add FunctionMessage to _message_from_dict (#8374)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 16:45:27 -07:00
emarco177
2ab13ab743
added unit tests for mrkl output_parser.py (#8321)
- Description: added unit tests for mrkl output_parser.py, 
  - Tag maintainer: @hinthornw
  - Twitter handle: EdenEmarco177
2023-07-27 13:46:06 -07:00
Harrison Chase
f5bf893035
rename to str output parser (#8373) 2023-07-27 12:57:34 -07:00
William FH
0e9e5b5202
Retry events on any run type (#8375) 2023-07-27 12:56:46 -07:00
William FH
ff98fad2d9
Add Retry Events (#8053)
![image](https://github.com/hwchase17/langchain/assets/13333726/59a5c3b4-4367-47e6-9f58-5b6557576a8a)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 12:39:39 -07:00
Nuno Campos
0eca3e7d90
Add Runnable.bind method to attach kwargs to a Runnable that will be passed to all invoke/stream/batch calls when it is run (#8368)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 11:16:30 -07:00
Nuno Campos
1bbadde77b
Support using RunnableMap directly (#8317)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 17:24:29 +01:00
Bagatur
944321c6ab
bump 245 (#8359) 2023-07-27 06:53:24 -07:00
Rubén Barragán
ef6332ead6
Support loading files from Dropbox (#8271)
## Description
This commit introduces the `DropboxLoader` class, a new document loader
that allows loading files from Dropbox into the application. The loader
relies on a Dropbox app, which requires creating an app on Dropbox,
obtaining the necessary scope permissions, and generating an access
token. Additionally, the dropbox Python package is required.

The `DropboxLoader` class is designed to be used as a document loader
for processing various file types, including text files, PDFs, and
Dropbox Paper files.

## Dependencies
`pip install dropbox` and `pip install unstructured` for PDF reading.

## Tag maintainer
@rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some
feedback here 🙏 .

## Social Networks
https://github.com/rubenbarragan
https://www.linkedin.com/in/rgbarragan/
https://twitter.com/RubenBarraganP

---------

Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>
2023-07-27 06:36:08 -07:00
Pranay Chandekar
41bb3a6f9b
fixed the bug #8343 (#8345)
- Issue: #8343

Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>
2023-07-27 06:33:15 -07:00
Martin Krasser
93260a9922
Fix broken make targets format_diff and lint_diff (#8344)
Since the refactoring into sub-projects `libs/langchain` and
`libs/experimental`, the `make` targets `format_diff` and `lint_diff` do
not work anymore when running `make` from these subdirectories. Reason
is that

```
PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
```

generates paths from the project's root directory instead of the
corresponding subdirectories. This PR fixes this by adding a
`--relative` command line option.

- Tag maintainer: @baskaryan
2023-07-27 01:56:55 -07:00
Harrison Chase
ae78ef7fe6
bump experimental to 005 (#8339) 2023-07-26 21:46:28 -07:00
Vadim Gubergrits
e7e5cb9d08
Tree of Thought introducing a new ToTChain. (#5167)
# [WIP] Tree of Thought introducing a new ToTChain.

This PR adds a new chain called ToTChain that implements the ["Large
Language Model Guided
Tree-of-Though"](https://arxiv.org/pdf/2305.08291.pdf) paper.

There's a notebook example `docs/modules/chains/examples/tot.ipynb` that
shows how to use it.


Implements #4975


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @vowelparrot

---------

Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-26 21:29:39 -07:00
William FH
9eb7e6e27f
Delete Old Evals Examples (#8252)
Still retain:
- Comparison Examples
- Data + QA walkthrough
- QA (but really minimize it)
2023-07-26 18:46:54 -07:00
Saurabh Misra
db9d5b213a
Optimize the cosine_similarity_top_k function performance (#8151)
Optimizing important numerical code and making it run faster.

Performance went up by 1.48x (148%). Runtime went down from 138715us to
56020us

Optimization explanation:

The `cosine_similarity_top_k` function is where we made the most
significant optimizations.
Instead of sorting the entire score_array which needs considering all
elements, `np.argpartition` is utilized to find the top_k largest scores
indices, this operation has a time complexity of O(n), higher
performance than sorting. Remember, `np.argpartition` doesn't guarantee
the order of the values. So we need to use argsort() to get the indices
that would sort our top-k values after partitioning, which is much more
efficient because it only sorts the top-K elements, not the entire
array. Then to get the row and column indices of sorted top_k scores in
the original score array, we use `np.unravel_index`. This operation is
more efficient and cleaner than a list comprehension.

The code has been tested for correctness by running the following
snippet on both the original function and the optimized function and
averaged over 5 times.
```
def test_cosine_similarity_top_k_large_matrices():
    X = np.random.rand(1000, 1000)
    Y = np.random.rand(1000, 1000)
    top_k = 100
    score_threshold = 0.5
    gc.disable()
    counter = time.perf_counter_ns()
    return_value = cosine_similarity_top_k(X, Y, top_k, score_threshold)
    duration = time.perf_counter_ns() - counter
    gc.enable()
```

@hwaking @hwchase17 @jerwelborn 

Unit tests pass, I also generated more regression tests which all
passed.
2023-07-26 18:03:49 -07:00
Fabrizio Ruocco
ddc353a768
Azure Cognitive Search: Custom index and scoring profile support (#6843)
Description: Adding support for custom index and scoring profile support
in Azure Cognitive Search
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 17:58:01 -07:00
Kacper Łukawski
c5988c1d4b
Implement async support for Cohere (#8237)
This PR introduces async API support for Cohere, both LLM and
embeddings. It requires updating `cohere` package to `^4`.

Tagging @hwchase17, @baskaryan, @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:51:18 -07:00
Daniel Alexander Brenot
bf1357f584
Added async support to PlanAndExecute Chain (#8239)
- Description: Adds async support to the PlanAndExecute Chain

Maintainer responsibilities:
  - Async: @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:16:07 -07:00
Bastin Florian
a3ac9b23eb
feat(confluence): add markdown format option (#8246)
# Description:
**Add the possibility to keep text as Markdown in the ConfluenceLoader**
Add a bool variable that allows to keep the Markdown format of the
Confluence pages.
It is useful because it allows to use MarkdownHeaderTextSplitter as a
DataSplitter.
If this variable in set to True in the load() method, the pages are
extracted using the markdownify library.

  # Issue: 
[4407](https://github.com/langchain-ai/langchain/issues/4407)
  # Dependencies: 
Add the markdownify library
  # Tag maintainer:
 @rlancemartin, @eyurtsev
  # Twitter handle:
 FloBastinHeyI - https://twitter.com/FloBastinHeyI

---------

Co-authored-by: Florian Bastin <florian.bastin@octo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:00:27 -07:00
Leonid Ganeline
ee6ff96e28
docstrings cleanup (#8311)
- added missed docstrings
 - changed docstrings into consistent format
  
@baskaryan
2023-07-26 14:13:10 -07:00
Rohit Gupta
e5dba8978a
Avoid re-computation of embedding in weaviate similarity search (#8284)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 13:31:55 -07:00
Nuno Campos
a612800ef0
Runnable single protocol (#7800)
Objects implementing Runnable: BasePromptTemplate, LLM, ChatModel,
Chain, Retriever, OutputParser

- [x] Implement Runnable in base Retriever
- [x] Raise TypeError in operator methods for unsupported things 
- [x] Implement dict which calls values in parallel and outputs dict
with results
- [x] Merge in `+` for prompts
- [x] Confirm precedence order for operators, ideal would be `+` `|`,
https://docs.python.org/3/reference/expressions.html#operator-precedence
- [x] Add support for openai functions, ie. Chat Models must return
messages
- [x] Implement BaseMessageChunk return type for BaseChatModel, a
subclass of BaseMessage which implements __add__ to return
BaseMessageChunk, concatenating all str args
- [x] Update implementation of stream/astream for llm and chat models to
use new `_stream`, `_astream` optional methods, with default
implementation in base class `raise NotImplementedError` use
https://stackoverflow.com/a/59762827 to see if it is implemented in base
class
- [x] Delete the IteratorCallbackHandler (leave the async one because
people using)
- [x] Make BaseLLMOutputParser implement Runnable, accepting either str
or BaseMessage
---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-07-26 12:16:46 -07:00
Bharat
04a4d3e312
Fixes #8310 Fix maximum recursion depth exceeded error (#8313)
ElasticsearchVectorStore.as_retriever() method is returning 
`RecursionError: maximum recursion depth exceeded` 
because of incorrect field reference in
 `embeddings()` method

  - Description: Fix RecursionError because of a typo
  - Issue: the issue #8310 
  - Dependencies: None,
  - Tag maintainer: @eyurtsev
  - Twitter handle: bpatel
2023-07-26 12:15:37 -07:00
Caitlin2694
b9db3dd09b
Fix "missing key op" RDFGraph OWL serialization (#8276)
Replace this comment with:
- Description: Fix "missing key op" error in RDFGraph OWL Serialization
  - Issue: #8263
  - Dependencies: None
  - Tag maintainer: @baskaryan
2023-07-26 12:14:56 -07:00
Eugene Yurtsev
862e9aed66
ChatPromptTemplate: Update doc-strings, update from_role_strings behavior (#8308)
* Update doc-strings in ChatPromptTemplate
* Update from_role_strings classmethod to use well known roles
2023-07-26 15:02:36 -04:00
Bagatur
2c2fd9ff13
bump 244 (#8314) 2023-07-26 11:58:26 -07:00
Lance Martin
77c0582243
Clean queries prior to search (#8309)
With some search tools, we see no results returned if the query is a
numeric list.

E.g., if we pass:
```
'1. "LangChain vs LangSmith: How do they differ?"'
```

We see:
```
No good Google Search Result was found
```

Local testing w/ Streamlit:

![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)
2023-07-26 11:48:28 -07:00
shibuiwilliam
6b88fbd9bb
add test for embedding distance evaluation (#8285)
Add tests for embedding distance evaluation

  - Description: Add tests for embedding distance evaluation
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: @MlopsJ
2023-07-26 11:45:50 -07:00
Timon Palm
70604e590f
DuckDuckGoSearch News Tool (#8292)
Description: 
I wanted to use the DuckDuckGoSearch tool in an agent to let him get the
latest news for a topic. DuckDuckGoSearch has already an implemented
function for retrieving news articles. But there wasn't a tool to use
it. I simply adapted the SearchResult class with an extra argument
"backend". You can set it to "news" to only get news articles.

Furthermore, I added an example to the DuckDuckGo Notebook on how to
further customize the results by using the DuckDuckGoSearchAPIWrapper.

Dependencies: no new dependencies
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 11:30:01 -07:00
Byron Saltysiak
61347bd322
giving path to the copy command for *.toml files (#8294)
Description: in the .devcontainer, docker-compose build is currently
failing due to the src paths in the COPY command. This change adds the
full path to the pyproject.toml and poetry.toml to allow the build to
run.
Issue: 

You can see the issue if you try to build the dev docker image with:
```
cd .devcontainer
docker-compose build
```

Dependencies: none
Twitter handle: byronsalty
2023-07-26 10:37:03 -07:00
happyxhw
6384c1ec8f
fix: ElasticVectorSearch.from_documents failed #8293 (#8296)
- Description: fix ElasticVectorSearch.from_documents with
elasticsearch_url param,
- Issue: ElasticVectorSearch.from_documents failed #8293 # it fixes (if
applicable),


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 10:33:52 -07:00
jacobswe
83a53e2126
Bug Fix: AzureChatOpenAI streaming with function calls (#8300)
- Description: During streaming, the first chunk may only contain the
name of an OpenAI function and not any arguments. In this case, the
current code presumes there is a streaming response and tries to append
to it, but gets a KeyError. This fixes that case by checking if the
arguments key exists, and if not, creates a new entry instead of
appending.
  - Issue: Related to #6462

Sample Code:
```python
llm = AzureChatOpenAI(
    deployment_name=deployment_name,
    model_name=model_name,
    streaming=True
)

tools = [PythonREPLTool()]
callbacks = [StreamingStdOutCallbackHandler()]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    callbacks=callbacks
)

agent('Run some python code to test your interpreter')
```

Previous Result:
```
File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
    342         function_call = _function_call
    343     else:
--> 344         function_call["arguments"] += _function_call["arguments"]
    345 if run_manager:
    346     run_manager.on_llm_new_token(token)

KeyError: 'arguments'
```

New Result:
```python
{'input': 'Run some python code to test your interpreter',
 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."}
```

Co-authored-by: jswe <jswe@polencapital.com>
2023-07-26 10:11:50 -07:00
German Martin
457a4730b2
Fix the mangling issue on several VectorStores child classes. (#8274)
- Description: Fix mangling issue affecting a couple of VectorStore
classes including Redis.
  - Issue: https://github.com/langchain-ai/langchain/issues/8185
  - @rlancemartin 
  
This is a simple issue but I lack of some context in the original
implementation.
My changes perhaps are not the definitive fix but to start a quick
discussion.

@hinthornw Tagging you since one of your changes introduced this
[here.](c38965fcba)
2023-07-26 09:48:55 -07:00