Commit Graph

3638 Commits (642b57c7ff48970e392ed31dda2808ab5dcf1e9a)
 

Author SHA1 Message Date
emarco177 2ab13ab743
added unit tests for mrkl output_parser.py (#8321)
- Description: added unit tests for mrkl output_parser.py, 
  - Tag maintainer: @hinthornw
  - Twitter handle: EdenEmarco177
1 year ago
Sachin Varghese 01217b2247
Update sql database agent example (#8354)
This PR fixes a minor documentation issue on the SQL database toolkit
example notebook.
1 year ago
Bagatur 55beab326c
cleanup warnings (#8379) 1 year ago
William FH 41524304bf
Update local script for docs build (#8377) 1 year ago
Harrison Chase f5bf893035
rename to str output parser (#8373) 1 year ago
William FH 0e9e5b5202
Retry events on any run type (#8375) 1 year ago
Bagatur 68763bd25f
mv popular and additional chains to use cases (#8242) 1 year ago
William FH ff98fad2d9
Add Retry Events (#8053)
![image](https://github.com/hwchase17/langchain/assets/13333726/59a5c3b4-4367-47e6-9f58-5b6557576a8a)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
William FH 94a693e2ee
Link to use cases from tutorials (#8371) 1 year ago
Nuno Campos 0eca3e7d90
Add Runnable.bind method to attach kwargs to a Runnable that will be passed to all invoke/stream/batch calls when it is run (#8368)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
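
The idea behind `bind` can be sketched without LangChain. The toy classes below (`ToyRunnable` and `BoundRunnable` are hypothetical names, not the library's implementation) show kwargs being stored once and merged into every later `invoke` call. Letting call-time kwargs override bound ones is an assumption of this sketch:

```python
class ToyRunnable:
    """Hypothetical stand-in for a Runnable; wraps a plain function."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value, **kwargs):
        return self.func(value, **kwargs)

    def bind(self, **kwargs):
        # Return a new object that remembers kwargs for later calls.
        return BoundRunnable(self, kwargs)


class BoundRunnable(ToyRunnable):
    """Wraps another runnable and injects the stored kwargs on invoke."""

    def __init__(self, inner, kwargs):
        self.inner = inner
        self.kwargs = kwargs

    def invoke(self, value, **kwargs):
        # Assumption of this sketch: call-time kwargs win over bound ones.
        return self.inner.invoke(value, **{**self.kwargs, **kwargs})


def first_segment(text, stop=None):
    # Cut the text at the first occurrence of `stop`, if one is given.
    return text.split(stop)[0] if stop else text


truncate = ToyRunnable(first_segment)
bound = truncate.bind(stop="\n")
```

Here `bound.invoke("a\nb")` passes `stop="\n"` automatically, while `truncate.invoke("a\nb")` still runs without it.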
1 year ago
Harrison Chase cf608f876b update link 1 year ago
Nuno Campos 1bbadde77b
Support using RunnableMap directly (#8317)
1 year ago
Bagatur 944321c6ab
bump 245 (#8359) 1 year ago
Rubén Barragán ef6332ead6
Support loading files from Dropbox (#8271)
## Description
This commit introduces the `DropboxLoader` class, a new document loader
that allows loading files from Dropbox into the application. The loader
relies on a Dropbox app, which requires creating an app on Dropbox,
obtaining the necessary scope permissions, and generating an access
token. Additionally, the dropbox Python package is required.

The `DropboxLoader` class is designed to be used as a document loader
for processing various file types, including text files, PDFs, and
Dropbox Paper files.

## Dependencies
`pip install dropbox` and `pip install unstructured` for PDF reading.

## Tag maintainer
@rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some
feedback here 🙏 .

## Social Networks
https://github.com/rubenbarragan
https://www.linkedin.com/in/rgbarragan/
https://twitter.com/RubenBarraganP

---------

Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>
1 year ago
Pranay Chandekar 41bb3a6f9b
fixed the bug #8343 (#8345)
- Issue: #8343

Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>
1 year ago
Ikko Eltociear Ashimine 934ea80780
Fix typo in Etherscan.ipynb (#8340)
specifc  -> specific
1 year ago
Martin Krasser 93260a9922
Fix broken `make` targets `format_diff` and `lint_diff` (#8344)
Since the refactoring into sub-projects `libs/langchain` and
`libs/experimental`, the `make` targets `format_diff` and `lint_diff` do
not work anymore when running `make` from these subdirectories. Reason
is that

```
PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
```

generates paths from the project's root directory instead of the
corresponding subdirectories. This PR fixes this by adding a
`--relative` command line option.
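
What `--relative` changes can be illustrated with a small Python sketch. The helper below is hypothetical and only mimics the effect on a list of paths (drop files outside the current subdirectory, strip the prefix from the rest); it does not call git, and the file names are made up for illustration:

```python
from pathlib import PurePosixPath


def relative_to_subdir(paths, subdir):
    """Keep only paths under subdir and express them relative to it."""
    base = PurePosixPath(subdir)
    result = []
    for p in paths:
        try:
            result.append(str(PurePosixPath(p).relative_to(base)))
        except ValueError:
            # The path lies outside subdir, so it is dropped.
            continue
    return result


# Illustrative paths as they would be reported from the repository root.
changed = [
    "libs/langchain/langchain/agents/mrkl/output_parser.py",
    "libs/experimental/README.md",
    "docs/index.ipynb",
]
in_langchain = relative_to_subdir(changed, "libs/langchain")
```

Without this step, a tool started inside `libs/langchain` would receive root-relative paths that do not exist from its working directory.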

- Tag maintainer: @baskaryan
1 year ago
Harrison Chase ae78ef7fe6
bump experimental to 005 (#8339) 1 year ago
Vadim Gubergrits e7e5cb9d08
Tree of Thought introducing a new ToTChain. (#5167)
# [WIP] Tree of Thought introducing a new ToTChain.

This PR adds a new chain called ToTChain that implements the ["Large
Language Model Guided
Tree-of-Thought"](https://arxiv.org/pdf/2305.08291.pdf) paper.

There's a notebook example `docs/modules/chains/examples/tot.ipynb` that
shows how to use it.


Implements #4975


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @vowelparrot

---------

Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
William FH 412e29d436
Fix notebook that 'cannot convert' via nbdoc_build (#8333) 1 year ago
William FH 9eb7e6e27f
Delete Old Evals Examples (#8252)
Still retain:
- Comparison Examples
- Data + QA walkthrough
- QA (but really minimize it)
1 year ago
Saurabh Misra db9d5b213a
Optimize the cosine_similarity_top_k function performance (#8151)
Optimizing important numerical code and making it run faster.

Runtime went down from 138715us to 56020us, which is about 2.48x as
fast (a 148% speedup).

Optimization explanation:

The `cosine_similarity_top_k` function is where we made the most
significant optimizations.
Instead of sorting the entire `score_array`, which costs O(n log n),
`np.argpartition` is used to find the indices of the `top_k` largest
scores. This operation has a time complexity of O(n), which is cheaper
than a full sort. Note that `np.argpartition` doesn't guarantee the
order of the values, so we then use `argsort()` on the partitioned
top-k values only. This is much more efficient because it sorts just
the `top_k` elements, not the entire array. Finally, to get the row and
column indices of the sorted `top_k` scores in the original score array,
we use `np.unravel_index`, which is more efficient and cleaner than a
list comprehension.
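
The partition-then-sort approach described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `top_k_indices` is hypothetical and it leaves out the score threshold.

```python
import numpy as np


def top_k_indices(scores: np.ndarray, k: int):
    """Return (row, col) pairs of the k largest entries, best first."""
    flat = scores.ravel()
    k = min(k, flat.size)
    if k <= 0:
        return []
    # argpartition moves the k largest values into the last k slots, unordered.
    candidates = np.argpartition(flat, -k)[-k:]
    # Sort only those k candidates, in descending order of score.
    ordered = candidates[np.argsort(flat[candidates])[::-1]]
    # Map flat positions back to 2-D (row, col) coordinates.
    rows, cols = np.unravel_index(ordered, scores.shape)
    return list(zip(rows.tolist(), cols.tolist()))


scores = np.array([[0.1, 0.9, 0.3],
                   [0.8, 0.2, 0.7]])
top = top_k_indices(scores, 3)
```

For this small matrix the three best scores are 0.9, 0.8 and 0.7, so `top` lists their positions in that order.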

The code was tested by running the following snippet on both the
original function and the optimized function, with timings averaged
over 5 runs.
```python
import gc
import time

import numpy as np

# cosine_similarity_top_k is the function under test; import it from the
# module being benchmarked.


def test_cosine_similarity_top_k_large_matrices():
    X = np.random.rand(1000, 1000)
    Y = np.random.rand(1000, 1000)
    top_k = 100
    score_threshold = 0.5
    gc.disable()  # keep garbage collection from skewing the timing
    counter = time.perf_counter_ns()
    return_value = cosine_similarity_top_k(X, Y, top_k, score_threshold)
    duration = time.perf_counter_ns() - counter  # elapsed nanoseconds
    gc.enable()
```

@hwaking @hwchase17 @jerwelborn 

Unit tests pass. I also generated more regression tests, which all
passed.
1 year ago
Fabrizio Ruocco ddc353a768
Azure Cognitive Search: Custom index and scoring profile support (#6843)
Description: Add custom index and scoring profile support to
Azure Cognitive Search
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline ed24de8467
removed namespace title (#8208)
This change compacts the left-side Navbar (ToC) of the [API
Reference](https://api.python.langchain.com/en/latest/api_reference.html).
Currently almost every namespace item is split into two lines, for example
`langchain.chat_models: Chat Models`.
We remove the `Chat Models` part and leave only `langchain.chat_models`.
This effectively compacts the navbar and increases the main page's
usability. On my screen, it reduces the number of lines in the ToC from
28 to 18, which is huge.

Removing the namespace "title" (like `Chat Models`) does not remove any
information because the title is composed directly from the namespace.
API Reference users are developers, and usability is very important
for them: the less text we see, the faster we find things.
1 year ago
Kacper Łukawski c5988c1d4b
Implement async support for Cohere (#8237)
This PR introduces async API support for Cohere, both LLM and
embeddings. It requires updating `cohere` package to `^4`.

Tagging @hwchase17, @baskaryan, @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Daniel Alexander Brenot bf1357f584
Added async support to PlanAndExecute Chain (#8239)
- Description: Adds async support to the PlanAndExecute Chain

Maintainer responsibilities:
  - Async: @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bastin Florian a3ac9b23eb
feat(confluence): add markdown format option (#8246)
# Description:
**Add the possibility to keep text as Markdown in the ConfluenceLoader**
Add a bool variable that allows keeping the Markdown format of the
Confluence pages.
This is useful because it makes it possible to use
MarkdownHeaderTextSplitter as a DataSplitter.
If this variable is set to True in the load() method, the pages are
extracted using the markdownify library.

  # Issue: 
[4407](https://github.com/langchain-ai/langchain/issues/4407)
  # Dependencies: 
Add the markdownify library
  # Tag maintainer:
 @rlancemartin, @eyurtsev
  # Twitter handle:
 FloBastinHeyI - https://twitter.com/FloBastinHeyI

---------

Co-authored-by: Florian Bastin <florian.bastin@octo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline ee6ff96e28
docstrings cleanup (#8311)
- added missing docstrings
- changed docstrings into a consistent format
  
@baskaryan
1 year ago
Bagatur ceab0a7c1f
update api ref style (#8318) 1 year ago
Rohit Gupta e5dba8978a
Avoid re-computation of embedding in weaviate similarity search (#8284)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
William FH 01a9b06400
Add api cross ref linking (#8275)
Example of how it would show up in our python docs:


![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15)


Examples added to the reference docs:

https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma


![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)
1 year ago
Nuno Campos a612800ef0
Runnable single protocol (#7800)
Objects implementing Runnable: BasePromptTemplate, LLM, ChatModel,
Chain, Retriever, OutputParser

- [x] Implement Runnable in base Retriever
- [x] Raise TypeError in operator methods for unsupported things 
- [x] Implement dict which calls values in parallel and outputs dict
with results
- [x] Merge in `+` for prompts
- [x] Confirm precedence order for operators, ideal would be `+` `|`,
https://docs.python.org/3/reference/expressions.html#operator-precedence
- [x] Add support for openai functions, ie. Chat Models must return
messages
- [x] Implement BaseMessageChunk return type for BaseChatModel, a
subclass of BaseMessage which implements __add__ to return
BaseMessageChunk, concatenating all str args
- [x] Update implementation of stream/astream for llm and chat models to
use new `_stream`, `_astream` optional methods, with default
implementation in base class `raise NotImplementedError` use
https://stackoverflow.com/a/59762827 to see if it is implemented in base
class
- [x] Delete the IteratorCallbackHandler (leave the async one because
people are using it)
- [x] Make BaseLLMOutputParser implement Runnable, accepting either str
or BaseMessage
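
One way to implement the "is `_stream` overridden?" check from the list above is to compare the method found on the instance's class with the one defined on the base class. This is a minimal sketch with hypothetical class names (`ToyBaseModel`, `WholeAnswerModel`, `TokenModel`); it is not LangChain's code and may differ in detail from the linked Stack Overflow answer:

```python
class ToyBaseModel:
    def _generate(self, prompt):
        raise NotImplementedError

    def _stream(self, prompt):
        # Optional hook: subclasses may override this to yield chunks.
        raise NotImplementedError

    def stream(self, prompt):
        if type(self)._stream is ToyBaseModel._stream:
            # Not overridden: fall back to a single chunk with the full answer.
            yield self._generate(prompt)
        else:
            yield from self._stream(prompt)


class WholeAnswerModel(ToyBaseModel):
    def _generate(self, prompt):
        return prompt.upper()


class TokenModel(ToyBaseModel):
    def _generate(self, prompt):
        return prompt.upper()

    def _stream(self, prompt):
        # Yield one chunk per whitespace-separated word.
        for word in prompt.split():
            yield word.upper()


whole_chunks = list(WholeAnswerModel().stream("hi there"))
token_chunks = list(TokenModel().stream("hi there"))
```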
---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Bharat 04a4d3e312
Fixes #8310 Fix maximum recursion depth exceeded error (#8313)
The `ElasticsearchVectorStore.as_retriever()` method raises
`RecursionError: maximum recursion depth exceeded`
because of an incorrect field reference in the
`embeddings()` method.

  - Description: Fix RecursionError because of a typo
  - Issue: #8310
  - Dependencies: None
  - Tag maintainer: @eyurtsev
  - Twitter handle: bpatel
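
The kind of typo described above can be reproduced in a few lines. The class and attribute names below are hypothetical (the commit message does not show the actual fields); the sketch only shows how a property that refers to itself recurses until Python raises `RecursionError`:

```python
class BrokenStore:
    def __init__(self, embedding):
        self._embedding = embedding

    @property
    def embeddings(self):
        # Typo: the property refers to itself instead of the stored field.
        return self.embeddings


class FixedStore(BrokenStore):
    @property
    def embeddings(self):
        # Fix: return the underlying field.
        return self._embedding


try:
    BrokenStore("model").embeddings
    outcome = "no error"
except RecursionError:
    outcome = "RecursionError"

fixed_value = FixedStore("model").embeddings
```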
1 year ago
Caitlin2694 b9db3dd09b
Fix "missing key op" RDFGraph OWL serialization (#8276)
- Description: Fix "missing key op" error in RDFGraph OWL Serialization
  - Issue: #8263
  - Dependencies: None
  - Tag maintainer: @baskaryan
1 year ago
Eugene Yurtsev 862e9aed66
ChatPromptTemplate: Update doc-strings, update from_role_strings behavior (#8308)
* Update doc-strings in ChatPromptTemplate
* Update from_role_strings classmethod to use well known roles
1 year ago
Bagatur 2c2fd9ff13
bump 244 (#8314) 1 year ago
Lance Martin 77c0582243
Clean queries prior to search (#8309)
With some search tools, no results are returned if the query is
formatted as a numbered list item.

E.g., if we pass:
```
'1. "LangChain vs LangSmith: How do they differ?"'
```

We see:
```
No good Google Search Result was found
```
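
A cleaning step of this kind could look like the sketch below. The helper name `clean_query` and the exact regular expression are assumptions for illustration, not the code merged in the PR:

```python
import re


def clean_query(query: str) -> str:
    """Strip a leading list number such as '1. ' and surrounding double quotes."""
    without_number = re.sub(r"^\s*\d+[.)]\s*", "", query)
    return without_number.strip().strip('"')


cleaned = clean_query('1. "LangChain vs LangSmith: How do they differ?"')
```

The cleaned string can then be passed to the search tool in place of the raw list item.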

Local testing w/ Streamlit:

![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)
1 year ago
shibuiwilliam 6b88fbd9bb
add test for embedding distance evaluation (#8285)
Add tests for embedding distance evaluation

  - Description: Add tests for embedding distance evaluation
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: @MlopsJ
1 year ago
Riche Akparuorji f3d2fdd54c
Fix for code snippet in documentation (#8290)
- Description: I fixed an issue in the code snippet related to the
variable name and the evaluation of its length. The original code used
the variable "docs," but the correct variable name is "docs_svm" after
using the SVMRetriever.
- maintainer: @baskaryan
- Twitter handle: @iamreechi_

Co-authored-by: iamreechi <richieakparuorji>
1 year ago
Bagatur f27176930a
fix geopandas link (#8305) 1 year ago
Timon Palm 70604e590f
DuckDuckGoSearch News Tool (#8292)
Description: 
I wanted to use the DuckDuckGoSearch tool in an agent to let it get the
latest news on a topic. DuckDuckGoSearch already implements a
function for retrieving news articles, but there wasn't a tool to use
it. I simply adapted the SearchResult class with an extra argument,
"backend". You can set it to "news" to only get news articles.

Furthermore, I added an example to the DuckDuckGo Notebook on how to
further customize the results by using the DuckDuckGoSearchAPIWrapper.

Dependencies: no new dependencies
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Aarav Borthakur 8ce661d5a1
Docs: Fix Rockset links (#8214)
Fix broken Rockset links.

Right now links at
https://python.langchain.com/docs/integrations/providers/rockset are
broken.
1 year ago
Byron Saltysiak 61347bd322
giving path to the copy command for *.toml files (#8294)
Description: in the .devcontainer, docker-compose build is currently
failing due to the src paths in the COPY command. This change adds the
full path to the pyproject.toml and poetry.toml to allow the build to
run.
Issue: 

You can see the issue if you try to build the dev docker image with:
```
cd .devcontainer
docker-compose build
```

Dependencies: none
Twitter handle: byronsalty
1 year ago
happyxhw 6384c1ec8f
fix: ElasticVectorSearch.from_documents failed #8293 (#8296)
- Description: fix ElasticVectorSearch.from_documents with the
elasticsearch_url param
- Issue: ElasticVectorSearch.from_documents failed #8293


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Jon Bennion ad38eb2d50
correction to reference to code (#8301)
- Description: fixes typo referencing code

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
jacobswe 83a53e2126
Bug Fix: AzureChatOpenAI streaming with function calls (#8300)
- Description: During streaming, the first chunk may only contain the
name of an OpenAI function and not any arguments. In this case, the
current code presumes there is a streaming response and tries to append
to it, but gets a KeyError. This fixes that case by checking if the
arguments key exists, and if not, creates a new entry instead of
appending.
  - Issue: Related to #6462

Sample Code:
```python
llm = AzureChatOpenAI(
    deployment_name=deployment_name,
    model_name=model_name,
    streaming=True
)

tools = [PythonREPLTool()]
callbacks = [StreamingStdOutCallbackHandler()]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    callbacks=callbacks
)

agent('Run some python code to test your interpreter')
```

Previous Result:
```
File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
    342         function_call = _function_call
    343     else:
--> 344         function_call["arguments"] += _function_call["arguments"]
    345 if run_manager:
    346     run_manager.on_llm_new_token(token)

KeyError: 'arguments'
```

New Result:
```python
{'input': 'Run some python code to test your interpreter',
 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."}
```
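
The check described in this commit can be sketched in isolation. The function below is a hypothetical helper, not the code in `langchain/chat_models/openai.py`; it only shows how creating the `arguments` key when it is missing avoids the `KeyError`:

```python
def merge_function_call(accumulated, chunk):
    """Merge one streamed function_call chunk into the accumulated call."""
    if accumulated is None:
        # First chunk: copy it so the caller's dict is not mutated.
        return dict(chunk)
    merged = dict(accumulated)
    if "arguments" in merged:
        merged["arguments"] += chunk.get("arguments", "")
    else:
        # The first chunk carried only the name, so create the key here.
        merged["arguments"] = chunk.get("arguments", "")
    return merged


# Illustrative chunks: the first one has a name but no arguments.
chunks = [
    {"name": "python"},
    {"arguments": '{"code": '},
    {"arguments": '"print(1)"}'},
]
call = None
for piece in chunks:
    call = merge_function_call(call, piece)
```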

Co-authored-by: jswe <jswe@polencapital.com>
1 year ago
German Martin 457a4730b2
Fix the mangling issue on several VectorStores child classes. (#8274)
- Description: Fix mangling issue affecting a couple of VectorStore
classes including Redis.
  - Issue: https://github.com/langchain-ai/langchain/issues/8185
  - @rlancemartin 
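
For readers unfamiliar with the term, Python name mangling in general works as in the sketch below. The classes are hypothetical, and which attributes were affected in the VectorStore classes is not shown in this commit message:

```python
class BaseStore:
    def search(self):
        # Inside BaseStore, self.__rank is compiled as self._BaseStore__rank.
        return self.__rank()

    def __rank(self):
        return "base ranking"


class ChildStore(BaseStore):
    def __rank(self):
        # Stored as _ChildStore__rank, so BaseStore.search never calls it.
        return "child ranking"


result = ChildStore().search()
```

Even though `ChildStore` appears to override `__rank`, `result` comes from the base class, because the double-underscore name is rewritten per class.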
  
This is a simple issue, but I lack some context on the original
implementation.
My changes are perhaps not the definitive fix, but they are meant to
start a quick discussion.

@hinthornw Tagging you since one of your changes introduced this
[here.](c38965fcba)
1 year ago
Alec Flett 4da43f77e5
Add ability to load (deserialize) objects from other namespaces (#7726)
I have some Prompt subclasses in my project that I'd like to be able to
deserialize in callbacks. Right now `loads()`/`load()` will bail when it
encounters my object, but I know I can trust the objects because they're
in my own projects.

1 year ago
Bagatur 5c6dcb1960
bump 243 (#8289) 1 year ago
William FH adf019724f
unpack later (#8278)
Fix https://github.com/langchain-ai/langchain/issues/8272
1 year ago