Commit Graph

5304 Commits (fd7ab539c8d1ae6ba3b978eeb0adf4977485fdbb)
 

Author SHA1 Message Date
Haris Wang f1269830a0
Fix bug in MarkdownHeaderTextSplitter for codeblock (#10262)
- Description: The previous version of the MarkdownHeaderTextSplitter
did not take into account the possibility of '#' appearing within code
blocks, which caused segmentation anomalies in these situations. This PR
has fixed this issue.
  - Issue: 
  - Dependencies: No
  - Tag maintainer: 
  - Twitter handle: 

cc @baskaryan @eyurtsev  @rlancemartin

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Eddie Cohen 656d2303f7
add in, nin for pinecone (#10303)
Description: Adds the in and nin comparators for pinecone seen
[here](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur a3a2ce623e Revise vowpal_wabbit notebook 1 year ago
Bagatur 8fafa1af91 merge 1 year ago
olgavrou 3b07c0cf3d
RL Chain with VowpalWabbit (#10242)
- Description: This PR adds a new chain `rl_chain.PickBest` for learned
prompt variable injection, detailed description and usage can be found
in the example notebook added. It essentially adds a
[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer
before the llm call in order to learn or personalize prompt variable
selections.

Most of the code is to make the API simple and provide lots of defaults
and data wrangling that is needed to use Vowpal Wabbit, so that the user
of the chain doesn't have to worry about it.

- Dependencies:
[vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/),
     - sentence-transformers (already a dep)
     - numpy (already a dep)
  - tagging @ataymano who contributed to this chain
  - Tag maintainer: @baskaryan
  - Twitter handle: @olgavrou


Added example notebook and unit tests
1 year ago
Manikanta5112 56048b909f
added ContentFormatter escape special characters for message content (#10319)
---------

Co-authored-by: Manikanta5112 <42089393+mani5112@users.noreply.github.com>
1 year ago
Leonid Ganeline d17416ec79
docstrings `callbacks` (#11456)
Added missed docstrings to the `callbacks/`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
1 year ago
Ofer Mendelevitch 3c7653bf0f
"source" argument in constructor of Vectara (#11454)
Replace this entire comment with:
- **Description:** minor update to constructor to allow for
specification of "source"
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
1 year ago
Eugene Yurtsev d9018ae5f1
Improve CLI ux (#11452)
Improve UX for cli
1 year ago
Jaikanth J 9f85f7c543
fix(cache): use dumps for RedisCache (#10408)
# Description
Attempts to fix RedisCache for ChatGenerations using `loads` and `dumps`
used in SQLAlchemy cache by @hwchase17 . this is better than pickle
dump, because this won't execute any arbitrary code during
de-serialisation.

# Issues
#7722 & #8666 

# Dependencies
None, but removes the warning introduced in #8041 by @baskaryan

Handle: @jaikanthjay46
1 year ago
rodrigo-clickup 5944c1851b
Add ClickUp Toolkit (#10662)
- **Description:** Adds a toolkit to interact with the
[ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/)
- **Dependencies:** None
- **Tag maintainer:** @rodrigo-georgian, @rodrigo-clickup,
@aiswaryasankarwork
- **Twitter handle:** 
- Aiswarya (https://twitter.com/Aiswarya_Sankar,
https://www.linkedin.com/in/sankaraiswarya/)
   - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/)


---------

Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local>
Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
John Reynolds 68901e1e40
Update output_parser.py (#10430)
- Description: Updated output parser for mrkl to remove any
hallucination actions after the final answer; this was encountered when
using Anthropic claude v2 for planning; reopening PR with updated unit
tests
- Issue: #10278 
- Dependencies: N/A
- Twitter handle: @johnreynolds
1 year ago
Joshua Sundance Bailey 790010703b
ArcGISLoader: Limit number of results in query (#10615)
Description: this PR changes the `ArcGISLoader` to set
`return_all_records` to `False` when `result_record_count` is provided
as a keyword argument. Previously, `return_all_records` was `True` by
default and this made the API ignore `result_record_count`.

Issue: `ArcGISLoader` would ignore `result_record_count` unless user
also passed `return_all_records=False`.
1 year ago
Beck Bekmyradov f9df55f7d2
Fix a Typo in Documentation (#11453)
- **Description:** This commit corrects a minor typo in the
documentation. It changes "frum" to "from" in the sentence: "The results
from search are passed back to the LLM for synthesis into an answer" in
the file `docs/extras/use_cases/more/agents/agents.ipynb`. This typo fix
enhances the clarity and accuracy of the documentation.
- **Tag maintainer:** @baskaryan
1 year ago
Bagatur f5ce286932
fix api docs build (#11445) 1 year ago
mrbean 9903a70379
Add youdotcom retriever (#11304)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
ashish-dahal 1655ff2ded
Fix PyMuPDFLoader kwargs (#11434)
- **Description:** Fix the `PyMuPDFLoader` to accept `loader_kwargs`
from the document loader's `loader_kwargs` option. This provides more
flexibility in formatting the output from documents.

- **Issue:** The `loader_kwargs` is not passed into the `load` method
from the document loader, which limits configuration options.

- **Dependencies:**  None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Kuligin e4a46747dc
integration test for DocAI parser (#11424)
- **Description:** added an integration test
  - **Issue:** #11407 

@baskaryan
1 year ago
Aashish Saini 2abbdc6ecb
Update bageldb.py (#11421)
I have restructured the code to ensure uniform handling of ImportError.
In place of previously used ValueError, I've adopted the standard
practice of raising ImportError with explanatory messages. This
modification enhances code readability and clarifies that any problems
stem from module importation.
1 year ago
Syed Ather Rizvi bfd48925e5
Feature/csharp text splitter doc (#10571)
- **Description:** Just docs related to csharp code splitter
   
- **Issue:** It's related to a request made by @baskaryan in a comment
on my previous PR #10350
  - **Dependencies:** None
  - **Twitter handle:** @ather19

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Nuno Campos 2c11302598
Update langchain_release.yml (#11444) 1 year ago
maks-operlejn-ds 2aae1102b0
Instance anonymization (#10501)
### Description

Add instance anonymization - if `John Doe` will appear twice in the
text, it will be treated as the same entity.
The difference between `PresidioAnonymizer` and
`PresidioReversibleAnonymizer` is that only the second one has a
built-in memory, so it will remember anonymization mapping for multiple
texts:

```
>>> anonymizer = PresidioAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Brett Russell. Hi Brett Russell!'
```
```
>>> anonymizer = PresidioReversibleAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
```

### Twitter handle
@deepsense_ai / @MaksOpp

### Tag maintainer
@baskaryan @hwchase17 @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Kyle Pancamo 203258b4d6
Update pdf.py comment for PyPDFLoader (#10495)
PyPDF does not chunk at the character level to my understanding.

Description: PyPDF does not chunk at the character level, but instead
breaks up content by page. Fixup comment

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Juan Daza 4236ae3851
Added Streaming Capability to SageMaker LLMs (#10535)
This PR adds the ability to declare a Streaming response in the
SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream`
capability in `boto3`. It is heavily based on the AWS Blog Post
announcement linked
[here](https://aws.amazon.com/blogs/machine-learning/elevating-the-generative-ai-experience-introducing-streaming-support-in-amazon-sagemaker-hosting/).

It does not add any additional dependencies since it uses the existing
`boto3` version.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Laurentiu Piciu d9670a5945
openai_functions_multi_agent: solved the case when the "arguments" is valid JSON but it does not contain `actions` key (#10543)
Description: There are cases when the output from the LLM comes fine
(i.e. function_call["arguments"] is a valid JSON object), but it does
not contain the key "actions". So I split the validation in 2 steps:
loading arguments as JSON and then checking for "actions" in it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Eugene Yurtsev fcccde406d
Add SymbolicMathChain to experiment in preparation for deprecation (#11129)
Move symbolic math chain to experimental
1 year ago
Holt Skinner 9f73fec057
fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513)
- Description: Google Cloud Enterprise Search was renamed to Vertex AI
Search
-
https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available
- This PR updates the documentation and Retriever class to use the new
terminology.
- Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to
`GoogleVertexAISearchRetriever`
- Updated documentation to specify that `extractive_segments` requires
the new [Enterprise
edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features)
to be enabled.
  - Fixed spelling errors in documentation.
- Change parameter for Retriever from `search_engine_id` to
`data_store_id`
- When this retriever was originally implemented, there was no
distinction between a data store and search engine, but now these have
been split.
- Fixed an issue blocking some users where the api_endpoint can't be set
1 year ago
Patrick Randell 1d678f805f
Additional Weaviate Filter Comparators (#10522)
### Description
When using Weaviate Self-Retrievers, certain common filter comparators
generated by user queries were unimplemented, resulting in errors. This
PR implements some of them. All linting and format commands have been
run and tests passed.
### Issue
#10474
### Dependencies
timestamp module

---------

Co-authored-by: Patrick Randell <prandell@deloitte.com.au>
1 year ago
Nuno Campos 79011f835f
Remove str() from RunnableConfigurableAlternatives (#11446) 1 year ago
Mateusz Wosinski 656480feb6
Add language detection example (#10540)
### Description

Adds language detection examples based on
[langdetect](https://github.com/Mimino666/langdetect/tree/master/langdetect)
and [fasttext](https://github.com/facebookresearch/fastText/) libraries.
These frameworks can be especially useful together with components that
require selection of the language (e.g. data-anonymizer)

### Twitter handle

@deepsense_ai, @matt_wosinski
1 year ago
Harrison Chase 31d5bd84d7
make vectorstores optional (#11393) 1 year ago
Eugene Yurtsev 8aa545901a
Update agent type docs (#11137)
In code docs for agent types
1 year ago
Eugene Yurtsev 3e31d6e35f
Start deprecation of LLMBashChain (#11300)
In preparation for migration LLMBashChain and related tools add a
derprecation warning to the code.
1 year ago
Bagatur 8b6b8bf68c
bump 309 (#11443) 1 year ago
billytrend-cohere 2ff91a46c0
Add cohere /chat integration (#11389)
Add cohere /chat integration and an iPython notebook to demonstrate the
addition.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
adrienohana ca346011b7
added interactive login for azure cognitive search vector store (#11360)
**Description:** Previously if the access to Azure Cognitive Search was
not done via an API key, the default credential was called which doesn't
allow to use an interactive login. I simply added the option to use
"INTERACTIVE" as a key name, and this will launch a login window upon
initialization of the AzureSearch object.
1 year ago
ElliotKetchup 53d4f1554a
Update aws.mdx (#11431) 1 year ago
Lance Martin 211a74941a
Update QA doc w/ Runnables (#11401)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
1 year ago
Eugene Yurtsev 5a1f614175
Add docker compose to CLI (#11406)
Add docker compose to cli
1 year ago
Predrag Gruevski e2d6c41177
Upgrade langchain dependencies. (#11420)
I was hoping this would pick up numpy 1.26, which is required to support
the new Python 3.12 release, but it didn't. It seems that some
transitive dependency requirement on numpy is preventing that, and the
highest we can currently go is 1.24.x.

But to find this out required a 15min `poetry lock`, so I figured we
might as well upgrade the dependencies we can and hopefully make the
next dependency upgrade a bit smaller.
1 year ago
Jacob Lee 71fd6428c5
Remove overridden async not implemented method on embeddings filters and add default async implementation for document compressors (#11415)
@nfcampos @eyurtsev @baskaryan

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
1 year ago
Nuno Campos 2f490be09b
Fix .dict() for agent/chain (#11436)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
1 year ago
Nuno Campos 1e59c44d36
Nc/5oct/runnable release (#11428)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
1 year ago
Bagatur 58b7a3ba16
Rm bedrock anthropic error (#11403) 1 year ago
Predrag Gruevski c9986bc3a9
Tweak type hints to match dependency's behavior. (#11355)
Needs #11353 to merge first, and a new `langchain` to be published with
those changes.
1 year ago
William FH 940b9ae30a
Normalize Option in Scoring Chain (#11412) 1 year ago
bholagabbar b9fad28f5e
Fix typing imports in extraction usecase (#11402)
The person class here:
https://python.langchain.com/docs/use_cases/extraction#pydantic-1 has
attributes `dog_breed` and `dog_name` that use `Optional` from typing,
but it hasn't been imported. Fixed the import here
1 year ago
Leonid Ganeline 22165cb2fc
merge pages into `google` and `AWS` pages (#11312)
There are several pages in `integrations/providers/more` that belongs to
Google and AWS `integrations/providers`.
- moved content of these pages into the Google and AWS
`integrations/providers` pages
- removed these individual pages
1 year ago
Eugene Yurtsev 70be04a816
CLI: Readme update (#11404)
Consolidating to a single README for now, will be easier to maintain we
can differentiate between poetry and pip later. Does not seem critical.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
1 year ago
Nuno Campos fde19c8667
Add CLI command to create a new project (#7837)
First version of CLI command to create a new langchain project template

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago