Commit Graph

3751 Commits

Author SHA1 Message Date
Eugene Yurtsev
6dd18eee26
PromptTemplate update documentation and expand kwargs (#8234)
# PromptTemplate

* Update documentation to highlight the classmethod for instantiating a
prompt template.
* Expand kwargs in the classmethod to make parameters easier to discover
2023-07-27 18:11:39 -07:00
Karan V
a003a0baf6
fix(petals) allows to run models that aren't Bloom (Support for LLama and newer models) (#8356)
In this PR:

- Removed restricted model loading logic for Petals-Bloom
- Removed petals imports (DistributedBloomForCausalLM,
BloomTokenizerFast)
- Instead imported more generalized versions of loader
(AutoDistributedModelForCausalLM, AutoTokenizer)
- Updated the Petals example notebook to allow for a successful
installation of Petals in Apple Silicon Macs

- Tag maintainer: @hwchase17, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 18:01:04 -07:00
lars.gersmann
e758e9e7f5
fix(openapi): openapi chain will work without/empty description/summa… (#8351)
Description: 

This PR will enable the Open API chain to work with valid Open API
specifications missing `description` and `summary` properties for path
and operation nodes in open api specs.

Since both `description` and `summary` property are declared optional we
cannot be sure they are defined. This PR resolves this problem by
providing an empty (`''`) description as fallback.

The previous behavior of the Open API chain was that the underlying LLM
(OpenAI) throw ed an exception since `None` is not of type string:

```
openai.error.InvalidRequestError: None is not of type 'string' - 'functions.0.description'
```

Using this PR the Open API chain will succeed also using Open API specs
lacking `description` and `summary` properties for path and operation
nodes.

Thanks for your amazing work !

Tag maintainer: @baskaryan

---------

Co-authored-by: Lars Gersmann <lars.gersmann@cm4all.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 17:58:43 -07:00
ljeagle
caa6caeb8a
Upgrade the AwaDB from v0.3.7 to v0.3.9 and change the default embeddings (#8281)
1. Upgrade the AwaDB from v0.3.7 to v0.3.9
2. Change the default embedding to AwaEmbedding

---------

Co-authored-by: ljeagle <awadb.vincent@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-27 17:20:50 -07:00
Harrison Chase
25b8cc7e3d
Harrison/update memory docs (#8384)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 17:18:19 -07:00
Holt Skinner
d7e6770de8
refactor: Code refactoring & simplification for Google Cloud Enterprise Search retriever (#8369)
Followup to https://github.com/langchain-ai/langchain/pull/7857

- Changes `_convert_search_response()` to use object attributes instead
of converting to dictionary
- Simplifies logic for readability
2023-07-27 17:13:49 -07:00
Taozhi Wang
594f195e54
Add embeddings for AwaEmbedding (#8353)
- Description: Adds AwaEmbeddings class for embeddings, which provides
users with a convenient way to do fine-tuning, as well as the potential
need for multimodality

  - Tag maintainer: @baskaryan

Create `Awa.ipynb`: an example notebook for AwaEmbeddings class
Modify `embeddings/__init__.py`: Import the class
Create `embeddings/awa.py`: The embedding class
Create `embeddings/test_awa.py`: The test file.

---------

Co-authored-by: taozhiwang <taozhiwa@gmail.com>
2023-07-27 17:08:00 -07:00
thehunmonkgroup
ba4e82bb47
fix missing _identifying_params() in _VertexAICommon (#8303)
Full set of params are missing from Vertex* LLMs when `dict()` method is
called.

```
>>> from langchain.chat_models.vertexai import ChatVertexAI
>>> from langchain.llms.vertexai import VertexAI
>>> chat_llm = ChatVertexAI()
l>>> llm = VertexAI()
>>> chat_llm.dict()
{'_type': 'vertexai'}
>>> llm.dict()
{'_type': 'vertexai'}
```

This PR just uses the same mechanism used elsewhere to expose the full
params.

Since `_identifying_params()` is on the `_VertexAICommon` class, it
should cover the chat and non-chat cases.
2023-07-27 16:59:10 -07:00
bheroder
dc3ca44e05
Add an example for azure ml managed feature store (#8324)
We are adding an example of how one can connect to azure ml managed
feature store and use such a prompt template in a llm chain. @baskaryan
2023-07-27 16:56:06 -07:00
Caitlin2694
b2e4b9dca4
Fix exception caused by restrictions in OWL (#8341)
Description: Fix exception caused by restrictions in OWL
Issue: #8331
Dependencies: none
Maintainer: @baskaryan
2023-07-27 16:51:32 -07:00
Harrison Chase
cddd8ae83d
update release yml (#8364)
only do the step that tags and adds release notes if its langchain
2023-07-27 16:49:04 -07:00
Nikita Pokidyshev
f499e6ea6a
Add FunctionMessage to _message_from_dict (#8374)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 16:45:27 -07:00
evelynmitchell
539574670c
Update tot.ipynb (#8387)
Spelling error fix

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 16:44:41 -07:00
emarco177
2ab13ab743
added unit tests for mrkl output_parser.py (#8321)
- Description: added unit tests for mrkl output_parser.py, 
  - Tag maintainer: @hinthornw
  - Twitter handle: EdenEmarco177
2023-07-27 13:46:06 -07:00
Sachin Varghese
01217b2247
Update sql database agent example (#8354)
This PR fixes a minor documentation issue on the SQL database toolkit
example notebook.
2023-07-27 13:44:02 -07:00
Bagatur
55beab326c
cleanup warnings (#8379) 2023-07-27 13:43:05 -07:00
William FH
41524304bf
Update local script for docs build (#8377) 2023-07-27 13:13:59 -07:00
Harrison Chase
f5bf893035
rename to str output parser (#8373) 2023-07-27 12:57:34 -07:00
William FH
0e9e5b5202
Retry events on any run type (#8375) 2023-07-27 12:56:46 -07:00
Bagatur
68763bd25f
mv popular and additional chains to use cases (#8242) 2023-07-27 12:55:13 -07:00
William FH
ff98fad2d9
Add Retry Events (#8053)
![image](https://github.com/hwchase17/langchain/assets/13333726/59a5c3b4-4367-47e6-9f58-5b6557576a8a)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 12:39:39 -07:00
William FH
94a693e2ee
Link to use cases from tutorials (#8371) 2023-07-27 11:54:04 -07:00
Nuno Campos
0eca3e7d90
Add Runnable.bind method to attach kwargs to a Runnable that will be passed to all invoke/stream/batch calls when it is run (#8368)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 11:16:30 -07:00
Harrison Chase
cf608f876b update link 2023-07-27 09:47:57 -07:00
Nuno Campos
1bbadde77b
Support using RunnableMap directly (#8317)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 17:24:29 +01:00
Bagatur
944321c6ab
bump 245 (#8359) 2023-07-27 06:53:24 -07:00
Rubén Barragán
ef6332ead6
Support loading files from Dropbox (#8271)
## Description
This commit introduces the `DropboxLoader` class, a new document loader
that allows loading files from Dropbox into the application. The loader
relies on a Dropbox app, which requires creating an app on Dropbox,
obtaining the necessary scope permissions, and generating an access
token. Additionally, the dropbox Python package is required.

The `DropboxLoader` class is designed to be used as a document loader
for processing various file types, including text files, PDFs, and
Dropbox Paper files.

## Dependencies
`pip install dropbox` and `pip install unstructured` for PDF reading.

## Tag maintainer
@rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some
feedback here 🙏 .

## Social Networks
https://github.com/rubenbarragan
https://www.linkedin.com/in/rgbarragan/
https://twitter.com/RubenBarraganP

---------

Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>
2023-07-27 06:36:08 -07:00
Pranay Chandekar
41bb3a6f9b
fixed the bug #8343 (#8345)
- Issue: #8343

Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>
2023-07-27 06:33:15 -07:00
Ikko Eltociear Ashimine
934ea80780
Fix typo in Etherscan.ipynb (#8340)
specifc  -> specific
2023-07-27 01:57:19 -07:00
Martin Krasser
93260a9922
Fix broken make targets format_diff and lint_diff (#8344)
Since the refactoring into sub-projects `libs/langchain` and
`libs/experimental`, the `make` targets `format_diff` and `lint_diff` do
not work anymore when running `make` from these subdirectories. Reason
is that

```
PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
```

generates paths from the project's root directory instead of the
corresponding subdirectories. This PR fixes this by adding a
`--relative` command line option.

- Tag maintainer: @baskaryan
2023-07-27 01:56:55 -07:00
Harrison Chase
ae78ef7fe6
bump experimental to 005 (#8339) 2023-07-26 21:46:28 -07:00
Vadim Gubergrits
e7e5cb9d08
Tree of Thought introducing a new ToTChain. (#5167)
# [WIP] Tree of Thought introducing a new ToTChain.

This PR adds a new chain called ToTChain that implements the ["Large
Language Model Guided
Tree-of-Though"](https://arxiv.org/pdf/2305.08291.pdf) paper.

There's a notebook example `docs/modules/chains/examples/tot.ipynb` that
shows how to use it.


Implements #4975


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @vowelparrot

---------

Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-26 21:29:39 -07:00
William FH
412e29d436
Fix notebook that 'cannot convert' via nbdoc_build (#8333) 2023-07-26 18:54:23 -07:00
William FH
9eb7e6e27f
Delete Old Evals Examples (#8252)
Still retain:
- Comparison Examples
- Data + QA walkthrough
- QA (but really minimize it)
2023-07-26 18:46:54 -07:00
Saurabh Misra
db9d5b213a
Optimize the cosine_similarity_top_k function performance (#8151)
Optimizing important numerical code and making it run faster.

Performance went up by 1.48x (148%). Runtime went down from 138715us to
56020us

Optimization explanation:

The `cosine_similarity_top_k` function is where we made the most
significant optimizations.
Instead of sorting the entire score_array which needs considering all
elements, `np.argpartition` is utilized to find the top_k largest scores
indices, this operation has a time complexity of O(n), higher
performance than sorting. Remember, `np.argpartition` doesn't guarantee
the order of the values. So we need to use argsort() to get the indices
that would sort our top-k values after partitioning, which is much more
efficient because it only sorts the top-K elements, not the entire
array. Then to get the row and column indices of sorted top_k scores in
the original score array, we use `np.unravel_index`. This operation is
more efficient and cleaner than a list comprehension.

The code has been tested for correctness by running the following
snippet on both the original function and the optimized function and
averaged over 5 times.
```
def test_cosine_similarity_top_k_large_matrices():
    X = np.random.rand(1000, 1000)
    Y = np.random.rand(1000, 1000)
    top_k = 100
    score_threshold = 0.5
    gc.disable()
    counter = time.perf_counter_ns()
    return_value = cosine_similarity_top_k(X, Y, top_k, score_threshold)
    duration = time.perf_counter_ns() - counter
    gc.enable()
```

@hwaking @hwchase17 @jerwelborn 

Unit tests pass, I also generated more regression tests which all
passed.
2023-07-26 18:03:49 -07:00
Fabrizio Ruocco
ddc353a768
Azure Cognitive Search: Custom index and scoring profile support (#6843)
Description: Adding support for custom index and scoring profile support
in Azure Cognitive Search
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 17:58:01 -07:00
Leonid Ganeline
ed24de8467
removed namespace title (#8208)
This change compacts the left-side Navbar (ToC) of the [API
Reference](https://api.python.langchain.com/en/latest/api_reference.html).
Now almost each namespace item is split into two lines. For example
`langchain.chat_models: Chat Models`
We remove the `Chat Models` and leave one the `langchain.chat_models`. 
This effectively compacts the navbar and increases the main page's
usability. On my screen, it reduces # of lines in Toc from 28 t to 18,
which is huge.

Removing the namespace "title" (like `Chat Models`) does not remove any
information because the title is composed directly from the namespace.
API Reference users are developers. Usability for them is very
important. We see less text => we find faster.
2023-07-26 16:45:23 -07:00
Kacper Łukawski
c5988c1d4b
Implement async support for Cohere (#8237)
This PR introduces async API support for Cohere, both LLM and
embeddings. It requires updating `cohere` package to `^4`.

Tagging @hwchase17, @baskaryan, @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:51:18 -07:00
Daniel Alexander Brenot
bf1357f584
Added async support to PlanAndExecute Chain (#8239)
- Description: Adds async support to the PlanAndExecute Chain

Maintainer responsibilities:
  - Async: @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:16:07 -07:00
Bastin Florian
a3ac9b23eb
feat(confluence): add markdown format option (#8246)
# Description:
**Add the possibility to keep text as Markdown in the ConfluenceLoader**
Add a bool variable that allows to keep the Markdown format of the
Confluence pages.
It is useful because it allows to use MarkdownHeaderTextSplitter as a
DataSplitter.
If this variable in set to True in the load() method, the pages are
extracted using the markdownify library.

  # Issue: 
[4407](https://github.com/langchain-ai/langchain/issues/4407)
  # Dependencies: 
Add the markdownify library
  # Tag maintainer:
 @rlancemartin, @eyurtsev
  # Twitter handle:
 FloBastinHeyI - https://twitter.com/FloBastinHeyI

---------

Co-authored-by: Florian Bastin <florian.bastin@octo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:00:27 -07:00
Leonid Ganeline
ee6ff96e28
docstrings cleanup (#8311)
- added missed docstrings
 - changed docstrings into consistent format
  
@baskaryan
2023-07-26 14:13:10 -07:00
Bagatur
ceab0a7c1f
update api ref style (#8318) 2023-07-26 14:12:44 -07:00
Rohit Gupta
e5dba8978a
Avoid re-computation of embedding in weaviate similarity search (#8284)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 13:31:55 -07:00
William FH
01a9b06400
Add api cross ref linking (#8275)
Example of how it would show up in our python docs:


![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15)


Examples added to the reference docs:

https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma


![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)
2023-07-26 12:38:58 -07:00
Nuno Campos
a612800ef0
Runnable single protocol (#7800)
Objects implementing Runnable: BasePromptTemplate, LLM, ChatModel,
Chain, Retriever, OutputParser

- [x] Implement Runnable in base Retriever
- [x] Raise TypeError in operator methods for unsupported things 
- [x] Implement dict which calls values in parallel and outputs dict
with results
- [x] Merge in `+` for prompts
- [x] Confirm precedence order for operators, ideal would be `+` `|`,
https://docs.python.org/3/reference/expressions.html#operator-precedence
- [x] Add support for openai functions, ie. Chat Models must return
messages
- [x] Implement BaseMessageChunk return type for BaseChatModel, a
subclass of BaseMessage which implements __add__ to return
BaseMessageChunk, concatenating all str args
- [x] Update implementation of stream/astream for llm and chat models to
use new `_stream`, `_astream` optional methods, with default
implementation in base class `raise NotImplementedError` use
https://stackoverflow.com/a/59762827 to see if it is implemented in base
class
- [x] Delete the IteratorCallbackHandler (leave the async one because
people using)
- [x] Make BaseLLMOutputParser implement Runnable, accepting either str
or BaseMessage
---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-07-26 12:16:46 -07:00
Bharat
04a4d3e312
Fixes #8310 Fix maximum recursion depth exceeded error (#8313)
ElasticsearchVectorStore.as_retriever() method is returning 
`RecursionError: maximum recursion depth exceeded` 
because of incorrect field reference in
 `embeddings()` method

  - Description: Fix RecursionError because of a typo
  - Issue: the issue #8310 
  - Dependencies: None,
  - Tag maintainer: @eyurtsev
  - Twitter handle: bpatel
2023-07-26 12:15:37 -07:00
Caitlin2694
b9db3dd09b
Fix "missing key op" RDFGraph OWL serialization (#8276)
Replace this comment with:
- Description: Fix "missing key op" error in RDFGraph OWL Serialization
  - Issue: #8263
  - Dependencies: None
  - Tag maintainer: @baskaryan
2023-07-26 12:14:56 -07:00
Eugene Yurtsev
862e9aed66
ChatPromptTemplate: Update doc-strings, update from_role_strings behavior (#8308)
* Update doc-strings in ChatPromptTemplate
* Update from_role_strings classmethod to use well known roles
2023-07-26 15:02:36 -04:00
Bagatur
2c2fd9ff13
bump 244 (#8314) 2023-07-26 11:58:26 -07:00
Lance Martin
77c0582243
Clean queries prior to search (#8309)
With some search tools, we see no results returned if the query is a
numeric list.

E.g., if we pass:
```
'1. "LangChain vs LangSmith: How do they differ?"'
```

We see:
```
No good Google Search Result was found
```

Local testing w/ Streamlit:

![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)
2023-07-26 11:48:28 -07:00