Commit Graph

3665 Commits (c2f46b2cdbad36d9ead3ef6702a75cc6a1edcbb6)
 

Author SHA1 Message Date
shibuiwilliam f68f3b23d7
add missing RemoteLangChainRetriever _get_relevant_documents test (#8628)
# What
- Add missing RemoteLangChainRetriever _get_relevant_documents test

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
William FH 206901fa01
Use salt instead of datetime (#8653)
If you want to kick off two runs at the same time it'll cause errors.
Use a uuid instead
1 year ago
William FH 7ea2b08d1f
Use call directly for chain (#8655)
for run_on_dataset since the `run()` method requires a single output
1 year ago
William FH 368aa4ede7
fix enum error message (#8652)
could be a string so don't directly call value
1 year ago
millerick 5018af8839
docs: fix some grammar (#8654)
### Description
Fixes a grammar issue I noticed when reading through the documentation.

### Maintainers
@baskaryan

Co-authored-by: mmillerick <mmillerick@blend.com>
1 year ago
Erick Friis 96b0ff182e
Enterprise support form wording (#8641) 1 year ago
Lance Martin 59194c2214
Add summarization use-case (#8376)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Will Thompson ee1d13678e
🐛 Docs Fixes [2 one-liners, examples broken] (#8519)
## Description: 
   
1)Map reduce example in docs is missing an important import statement.
Figured other people would benefit from being able to copy 🍝 the code.

2)RefineDocumentsChain example also broken.

## Issue: 

None

## Dependencies:

None. One liner.

## Tag maintainer:

@baskaryan

## Twitter handle: 

I mean, it's a one line fix lol. But @will_thompson_k is my twitter
handle.
1 year ago
Leonid Ganeline 1335f2b9f8
`MLflow` examples (#8642)
Updated `MLflow` examples with links to the examples from MLflow

 @baskaryan
1 year ago
Kacper Łukawski 16551536e3
Refactor Qdrant integration (#8634)
This small PR introduces new parameters into Qdrant (`on_disk`), fixes
some tests and changes the error message to be more clear.

Tagging: @baskaryan, @rlancemartin, @eyurtsev
1 year ago
Erick Friis c5fb3b6069
Enterprise support form in airtable (#8607) 1 year ago
Eugene Yurtsev 1ec0b18379
Re-add __add__ functionality for messages (revert #8245) (#8489)
This PR reverts #8245, so `__add__` is defined on base messages.

Resolves issue: https://github.com/langchain-ai/langchain/issues/8472
1 year ago
Bagatur f31047a394
bump 250 (#8632) 1 year ago
Comendeiro 5c516945d0
Add local support for audio models (PR #7329) (#7591)
- Description: run the poetry dependencies
  - Issue: #7329 
  - Dependencies: any dependencies required for this change,
  - Tag maintainer: @rlancemartin

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Naveen Tatikonda d2adec3818
[Opensearch] : Fix the service validation in http_auth (#8609)
### Description
OpenSearch supports validation using both Master Credentials (Username
and password) and IAM. For Master Credentials users will not pass the
argument `service` in `http_auth` and the existing code will break. To
fix this, I have updated the condition to check if service attribute is
present in http_auth before accessing it.

### Maintainers
@baskaryan @navneet1v

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
Harrison Chase 7c5c0557cb
cast to string when measuring token length (#8617) 1 year ago
rjanardhan3 68113348cc
Fireworks integration (#8322)
Description - Integrates Fireworks within Langchain LLMs to allow users
to use Fireworks models with Langchain, mainly for summarization.

Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin

---------

Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>
1 year ago
Bagatur b574507c51
normalized openai embeddings embed_query (#8604)
we weren't normalizing when embedding queries
1 year ago
Neil Murphy 31820a31e4
Add firestore_client param to FirestoreChatMessageHistory if caller already has one; also lets them specify GCP project, etc. (#8601)
Existing implementation requires that you install `firebase-admin`
package, and prevents you from using an existing Firestore client
instance if available.

This adds optional `firestore_client` param to
`FirestoreChatMessageHistory`, so users can just use their existing
client/settings. If not passed, existing logic executes to initialize a
`firestore_client`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Naveen Tatikonda 13ccf202de
[OpenSearch] : Fix AOSS Initialization (#8600)
### Description
This PR fixes the AOSS Initialization in Opensearch.

### Maintainers
@rlancemartin, @eyurtsev, @navneet1v

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
Joshua Carroll 6705928b9d
Add StreamlitChatMessageHistory (#8497)
Add a StreamlitChatMessageHistory class that stores chat messages in
[Streamlit's Session
State](https://docs.streamlit.io/library/api-reference/session-state).

Note: The integration test uses a currently-experimental Streamlit
testing framework to simulate the execution of a Streamlit app. Marking
this PR as draft until I confirm with the Streamlit team that we're
comfortable supporting it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Matt Robinson 8961c720b8
docs: update `unstructured` install instructions (#8596)
### Summary

Updates the `unstructured` install instructions. For
`unstructured>=0.9.0`, dependencies are broken out by document type and
the base `unstructured` package includes fewer dependencies. `pip
install "unstructured[local-inference]"` has been replace by `pip
install "unstructured[all-docs]"`, though the `local-inference` extra is
still supported for the time being.

### Reviewers

- @rlancemartin
- @eyurtsev
- @hwchase17
1 year ago
Bagatur 73072d3db8
mv (#8595) 1 year ago
brettdbrewer 2de028834f
updated to use new llm_util query (#8591)
- Description: added memgraph_graph.py which defines the MemgraphGraph
class, subclassing off the existing Neo4jGraph class. This lets you
query the Memgraph graph database using natural language. It leverages
the Neo4j drivers and the bolt protocol.
- Dependencies: since it is a subclass off of Neo4jGraph, it is
dependent on it and the GraphCypherQA Chain implementations. It is
dependent on the Neo4j drivers being present. It is dependent on having
a running Memgraph instance to connect to.
  - Tag maintainer: @baskaryan
  - Twitter handle: @villageideate
- example usage can be seen in this repo
https://github.com/brettdbrewer/MemgraphGraph/

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Tesfagabir Meharizghi a7000ee89e
Callback handler for Amazon SageMaker Experiments (#8587)
## Description

This PR implements a callback handler for SageMaker Experiments which is
similar to that of mlflow.
* When creating the callback handler, it takes the experiment's run
object as an argument. All the callback outputs are then logged to the
run object.
* The output of each callback action (e.g., `on_llm_start`) is saved to
S3 bucket as json file.
* Optionally, you can also log additional information such as the LLM
hyper-parameters to the same run object.
* Once the callback object is no more needed, you will need to call the
`flush_tracker()` method. This makes sure that any intermediate files
are deleted.
* A separate notebook example is provided to show how the callback is
used.

@3coins  @agola11

---------

Co-authored-by: Tesfagabir Meharizghi <mehariz@amazon.com>
1 year ago
Harrison Chase 9c2b29a1cb
Harrison/loader bug (#8559)
Co-authored-by: ddroghini <d.droghini@mflgroup.com>
Co-authored-by: Buckler89 <Droghini.diego@gmail.com>
1 year ago
Kristelle Widjaja f190bc3e83
Bug fix: feature/issue-7804-chroma-client_settings-bug (#8267)
Description: Made Chroma constructor more robust when client_settings is
provided. Otherwise, existing embeddings will not be loaded correctly
from Chroma.
Issue: #7804
Dependencies: None
Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
mpb159753 7df2dfc4c2
Add Support for Loading Documents from Huawei OBS (#8573)
Description:
This PR adds support for loading documents from Huawei OBS (Object
Storage Service) in Langchain. OBS is a cloud-based object storage
service provided by Huawei Cloud. With this enhancement, Langchain users
can now easily access and load documents stored in Huawei OBS directly
into the system.

Key Changes:
- Added a new document loader module specifically for Huawei OBS
integration.
- Implemented the necessary logic to authenticate and connect to Huawei
OBS using access credentials.
- Enabled the loading of individual documents from a specified bucket
and object key in Huawei OBS.
- Provided the option to specify custom authentication information or
obtain security tokens from Huawei Cloud ECS for easy access.

How to Test:
1. Ensure the required package "esdk-obs-python" is installed.
2. Configure the endpoint, access key, secret key, and bucket details
for Huawei OBS in the Langchain settings.
3. Load documents from Huawei OBS using the updated document loader
module.
4. Verify that documents are successfully retrieved and loaded into
Langchain for further processing.

Please review this PR and let us know if any further improvements are
needed. Your feedback is highly appreciated!

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline ed9a0f8185
Docstrings: Module descriptions (#8262)
Added/changed the module descriptions (the firs-line docstrings in the
`__init__` files).
Added class hierarchy info.
 @baskaryan
1 year ago
shibuiwilliam 465faab935
fix apparent spelling inconsistencies (#8574)
Use ImportErrors where appropriate
1 year ago
Nuno Campos 0ec020698f
Add new run types for Runnables (#8488)
- allow overriding run_type in on_chain_start

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
Bagatur bd2e298468
bump 249 (#8571) 1 year ago
Harrison Chase 66226d1d4d
add example for memory (#8552) 1 year ago
William FH e83250cc5f
Rm RunTypeEnum (#8553)
We already support raw strings in the SDK but would like to deprecate
client-side validation of run types. This removes its usage
1 year ago
Jacob Lee 2a26cc6d2b
Fix combining runnable sequences (#8557)
Combining runnable sequences was dropping a step in the middle.

@nfcampos @baskaryan
1 year ago
Mohamad Zamini 3fbb737bb3
Update combined.py (#7541)
from my understanding, the `check_repeated_memory_variable` validator
will raise an error if any of the variables in the `memories` list are
repeated. However, the `load_memory_variables` method does not check for
repeated variables. This means that it is possible for the
`CombinedMemory` instance to return a dictionary of memory variables
that contains duplicate values. This code will check for repeated
variables in the `data` dictionary returned by the
`load_memory_variables` method of each sub-memory. If a repeated
variable is found, an error will be raised.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Shantanu Nair 53f3793504
Fast load conversationsummarymemory from existing summary (#7533)
- Description: Adds an optional buffer arg to the memory's
from_messages() method. If provided the existing memory will be loaded
instead of regenerating a summary from the loaded messages.
 
Why? If we have past messages to load from, it is likely we also have an
existing summary. This is particularly helpful in cases where the chat
is ephemeral and/or is backed by serverless where the chat history is
not stored but where the updated chat history is passed back and forth
between a backend/frontend.

Eg: Take a stateless qa backend implementation that loads messages on
every request and generates a response — without this addition, each
time the messages are loaded via from_messages, the summaries are
recomputed even though they may have just been computed during the
previous response. With this, the previously computed summary can be
passed in and avoid:
  1) spending extra $$$ on tokens, and 
2) increased response time by avoiding regenerating previously generated
summary.

Tag maintainer: @hwchase17
Twitter handle: https://twitter.com/ShantanuNair

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
DJ Atha ec40ead980
Fixed bug7445 where a duplicate restuld_id is added to the vectorstore. (#7573)
- Description: updated BabyAGI examples to append the iteration to the
result id to fix error storing data to vectorstore.
  - Issue: 7445
  - Dependencies: no
  - Tag maintainer: @eyurtsev
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

This fix worked for me locally. Happy to take some feedback and iterate
on a better solution. I was considering appending a uuid instead but
didnt want to over complicate the example.
1 year ago
yangdihang ff5024634e
fix: openapi controller prompt, when bot is unable to resolve an api … (#7525)
…call, it needs retry

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

Co-authored-by: yangdihang <yangdihang@bytedance.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Kenny 1e8fca5518
Add ConcurrentLoader (#7512)
Works just like the GenericLoader but concurrently for those who choose
to optimize their workflow.

@rlancemartin @eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Kevin Buckley 8061994c61
AzureSearch Vector Store: Moving the usage of additional_fields into context of it's definition (bug fix from python error) (#8551)
Description: Using Azure Cognitive Search as a VectorStore. Calling the
`add_texts` method throws an error if there is no metadata property
specified. The `additional_fields` field is set in an `if` statement and
then is used later outside the if statement. This PR just moves the
declaration of `additional_fields` below and puts the usage of it in
context.

Issue: https://github.com/langchain-ai/langchain/issues/8544

Tagging @rlancemartin, @eyurtsev as this is related to Vector stores.

`make format`, `make lint`, `make spellcheck`, and `make test` have been
run
1 year ago
Danny Davenport 8d2344db43
updates some spelling mistakes (#8537)
Just updating some spelling / grammar issues in the documentation. No
code changes.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Leonid Kuligin b4a126ae71
Updated docs on Vertex AI going GA (#8531)
#8074

Co-authored-by: Leonid Kuligin <kuligin@google.com>
1 year ago
Pranay Chandekar 7e70cd2a28
Bug Fix - #8415 (#8417)
- Issue: #8415

Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>
1 year ago
shibuiwilliam de61ebd9e0
add tests to redis vectorstore (#8116)
# What
- Add function to get similarity with score with threshold in Redis
vector store.
- Add tests to Redis vector store.
1 year ago
Bharat Raghunathan c19a0b9c10
doc(prompts): Follow up on broken Prompt Sublink pages (#8530)
- Description: Follow up of #8478  
  - Issue: #8477
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: [@BharatR123](twitter.com/BharatR123)

The links were still broken after #8478 and sadly the issue was not
caught with either the Vercel app build and `make docs_linkcheck`
1 year ago
Bruno Bornsztein 5a490a79f4
fix issue #8357 by making json backtick regex greedy (#8528)
- Description: Markdown code blocks in json response should not break
the parser
  - Issue: #8357

@baskaryan @hinthornw
1 year ago
Gordon Clark 64d0a0fcc0
Updating docstings in utilities (#8411)
Updating docstrings on utility packages
 @baskaryan
1 year ago
Harrison Chase bca0749a11
conversational retrieval chain in lcel (#8532) 1 year ago
Jeff Huber 07d6d1ca38
fix error in chroma docker instructions (#8533)
This makes the Chroma instructions for Docker work! 


https://python.langchain.com/docs/integrations/vectorstores/chroma#basic-example-using-the-docker-container
1 year ago