Commit Graph

421 Commits (master)

Author SHA1 Message Date
os1ma 2667ddc686
Fix `make docs_build` and related scripts (#7276)
**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
11 months ago
Bagatur 3c4338470e
bump 230 (#7544) 11 months ago
Adilkhan Sarsen 5debd5043e
Added deeplake use case examples of the new features (#6528)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes # (issue)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @hwchase17

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
 
 1. Added use cases of the new features
 2. Done some code refactoring

---------

Co-authored-by: Ivo Stranic <istranic@gmail.com>
11 months ago
Bagatur 9b615022e2
bump 229 (#7467) 11 months ago
Yifei Song 7d29bb2c02
Add Xorbits Dataframe as a Document Loader (#7319)
- [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source
computing framework that makes it easy to scale data science and machine
learning workloads in parallel. Xorbits can leverage multi cores or GPUs
to accelerate computation on a single machine, or scale out up to
thousands of machines to support processing terabytes of data.

- This PR added support for the Xorbits document loader, which allows
langchain to leverage Xorbits to parallelize and distribute the loading
of data.
- Dependencies: This change requires the Xorbits library to be installed
in order to be used.
`pip install xorbits`
- Request for review: @rlancemartin, @eyurtsev
- Twitter handle: https://twitter.com/Xorbitsio

Co-authored-by: Bagatur <baskaryan@gmail.com>
11 months ago
Bagatur 26c86a197c
bump 228 (#7393) 11 months ago
William FH 4789c99bc2
Add String Distance and Embedding Evaluators (#7123)
Add a string evaluator and pairwise string evaluator implementation for:
- Embedding distance
- String distance

Update docs
11 months ago
Bagatur cb4e88e4fb
bump 227 (#7354) 11 months ago
Bagatur 0ed2da7020
bump 226 (#7335) 11 months ago
OwenElliott 3074306ae1
Marqo Vector Store Examples & Type Hints (#7326)
This PR improves the example notebook for the Marqo vectorstore
implementation by adding a new RetrievalQAWithSourcesChain example. The
`embedding` parameter in `from_documents` has its type updated to
`Union[Embeddings, None]` and a default parameter of None because this
is ignored in Marqo.

This PR also upgrades the Marqo version to 0.11.0 to remove the device
parameter after a breaking change to the API.

Related to #7068 @tomhamer @hwchase17

---------

Co-authored-by: Tom Hamer <tom@marqo.ai>
11 months ago
Bagatur a9c5b4bcea
Bagatur/clarifai update (#7324)
This PR improves upon the Clarifai LangChain integration with improved docs, errors, args and the addition of embedding model support in LancChain for Clarifai's embedding models and an overview of the various ways you can integrate with Clarifai added to the docs.

---------

Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>
11 months ago
Harrison Chase 035ad33a5b
bump ver to 225 (#7244) 11 months ago
Harrison Chase 52b016920c
Harrison/update anthropic (#7237)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
11 months ago
Tom e533da8bf2
Adding Marqo to vectorstore ecosystem (#7068)
This PR brings in a vectorstore interface for
[Marqo](https://www.marqo.ai/).

The Marqo vectorstore exposes some of Marqo's functionality in addition
the the VectorStore base class. The Marqo vectorstore also makes the
embedding parameter optional because inference for embeddings is an
inherent part of Marqo.

Docs, notebook examples and integration tests included.

Related PR:
https://github.com/hwchase17/langchain/pull/2807

---------

Co-authored-by: Tom Hamer <tom@marqo.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
11 months ago
felixocker db98c44f8f
Support for SPARQL (#7165)
# [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for
[LangChain](https://github.com/hwchase17/langchain)

## Description
LangChain support for knowledge graphs relying on W3C standards using
RDFlib: SPARQL/ RDF(S)/ OWL with special focus on RDF \
* Works with local files, files from the web, and SPARQL endpoints
* Supports both SELECT and UPDATE queries
* Includes both a Jupyter notebook with an example and integration tests

## Contribution compared to related PRs and discussions
* [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) -
uses SPARQL, but specifically for wikibase querying
* [Cypher qa](https://github.com/hwchase17/langchain/pull/5078) - graph
DB question answering for Neo4J via Cypher
* [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - tries
something similar, but does not cover UPDATE queries and supports only
RDF
* Discussions on [w3c mailing list](mailto:semantic-web@w3.org) related
to the combination of LLMs (specifically ChatGPT) and knowledge graphs

## Dependencies
* [RDFlib](https://github.com/RDFLib/rdflib)

## Tag maintainer
Graph database related to memory -> @hwchase17
11 months ago
Harrison Chase 79fb90aafd
bump version to 224 (#7203) 11 months ago
Bagatur 898087d02c
bump 223 (#7155) 11 months ago
William FH dfa48dc3b5
Update sdk version (#7109) 11 months ago
Bagatur 719316e84c
bump 222 (#7086) 11 months ago
Bagatur 5a45363954
bump 221 (#7047) 11 months ago
Stefano Lottini 8d2281a8ca
Second Attempt - Add concurrent insertion of vector rows in the Cassandra Vector Store (#7017)
Retrying with the same improvements as in #6772, this time trying not to
mess up with branches.

@rlancemartin doing a fresh new PR from a branch with a new name. This
should do. Thank you for your help!

---------

Co-authored-by: Jonathan Ellis <jbellis@datastax.com>
Co-authored-by: rlm <pexpresss31@gmail.com>
11 months ago
Daniel Chalef b26cca8008
Zep Authentication (#6728)
## Description: Add Zep API Key argument to ZepChatMessageHistory and
ZepRetriever
- correct docs site links
- add zep api_key auth to constructors

ZepChatMessageHistory: @hwchase17, 
ZepRetriever: @rlancemartin, @eyurtsev
11 months ago
William FH 13c62cf6b1
Arthur Callback (#6972)
Co-authored-by: Max Cembalest <115359769+arthuractivemodeling@users.noreply.github.com>
11 months ago
Bagatur 8f5eca236f
release v220 (#6962) 11 months ago
Bagatur 60b0d6ea35
Bagatur/openllm ensure available (#6960)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
11 months ago
Stefano Lottini 75fb9d2fdc
Cassandra support for chat history using CassIO library (#6771)
### Overview

This PR aims at building on #4378, expanding the capabilities and
building on top of the `cassIO` library to interface with the database
(as opposed to using the core drivers directly).

Usage of `cassIO` (a library abstracting Cassandra access for
ML/GenAI-specific purposes) is already established since #6426 was
merged, so no new dependencies are introduced.

In the same spirit, we try to uniform the interface for using Cassandra
instances throughout LangChain: all our appreciation of the work by
@jj701 notwithstanding, who paved the way for this incremental work
(thank you!), we identified a few reasons for changing the way a
`CassandraChatMessageHistory` is instantiated. Advocating a syntax
change is something we don't take lighthearted way, so we add some
explanations about this below.

Additionally, this PR expands on integration testing, enables use of
Cassandra's native Time-to-Live (TTL) features and improves the phrasing
around the notebook example and the short "integrations" documentation
paragraph.

We would kindly request @hwchase to review (since this is an elaboration
and proposed improvement of #4378 who had the same reviewer).

### About the __init__ breaking changes

There are
[many](https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cluster/)
options when creating the `Cluster` object, and new ones might be added
at any time. Choosing some of them and exposing them as `__init__`
parameters `CassandraChatMessageHistory` will prove to be insufficient
for at least some users.

On the other hand, working through `kwargs` or adding a long, long list
of arguments to `__init__` is not a desirable option either. For this
reason, (as done in #6426), we propose that whoever instantiates the
Chat Message History class provide a Cassandra `Session` object, ready
to use. This also enables easier injection of mocks and usage of
Cassandra-compatible connections (such as those to the cloud database
DataStax Astra DB, obtained with a different set of init parameters than
`contact_points` and `port`).

We feel that a breaking change might still be acceptable since LangChain
is at `0.*`. However, while maintaining that the approach we propose
will be more flexible in the future, room could be made for a
"compatibility layer" that respects the current init method. Honestly,
we would to that only if there are strong reasons for it, as that would
entail an additional maintenance burden.

### Other changes

We propose to remove the keyspace creation from the class code for two
reasons: first, production Cassandra instances often employ RBAC so that
the database user reading/writing from tables does not necessarily (and
generally shouldn't) have permission to create keyspaces, and second
that programmatic keyspace creation is not a best practice (it should be
done more or less manually, with extra care about schema mismatched
among nodes, etc). Removing this (usually unnecessary) operation from
the `__init__` path would also improve initialization performance
(shorter time).

We suggest, likewise, to remove the `__del__` method (which would close
the database connection), for the following reason: it is the
recommended best practice to create a single Cassandra `Session` object
throughout an application (it is a resource-heavy object capable to
handle concurrency internally), so in case Cassandra is used in other
ways by the app there is the risk of truncating the connection for all
usages when the history instance is destroyed. Moreover, the `Session`
object, in typical applications, is best left to garbage-collect itself
automatically.

As mentioned above, we defer the actual database I/O to the `cassIO`
library, which is designed to encode practices optimized for LLM
applications (among other) without the need to expose LangChain
developers to the internals of CQL (Cassandra Query Language). CassIO is
already employed by the LangChain's Vector Store support for Cassandra.

We added a few more connection options in the companion notebook example
(most notably, Astra DB) to encourage usage by anyone who cannot run
their own Cassandra cluster.

We surface the `ttl_seconds` option for automatic handling of an
expiration time to chat history messages, a likely useful feature given
that very old messages generally may lose their importance.

We elaborated a bit more on the integration testing (Time-to-live,
separation of "session ids", ...).

### Remarks from linter & co.

We reinstated `cassio` as a dependency both in the "optional" group and
in the "integration testing" group of `pyproject.toml`. This might not
be the right thing do to, in which case the author of this PR offer his
apologies (lack of confidence with Poetry - happy to be pointed in the
right direction, though!).

During linter tests, we were hit by some errors which appear unrelated
to the code in the PR. We left them here and report on them here for
awareness:

```
langchain/vectorstores/mongodb_atlas.py:137: error: Argument 1 to "insert_many" of "Collection" has incompatible type "List[Dict[str, Sequence[object]]]"; expected "Iterable[Union[MongoDBDocumentType, RawBSONDocument]]"  [arg-type]
langchain/vectorstores/mongodb_atlas.py:186: error: Argument 1 to "aggregate" of "Collection" has incompatible type "List[object]"; expected "Sequence[Mapping[str, Any]]"  [arg-type]

langchain/vectorstores/qdrant.py:16: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:19: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:20: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:22: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:23: error: Name "grpc" is not defined  [name-defined]
```

In the same spirit, we observe that to even get `import langchain` run,
it seems that a `pip install bs4` is missing from the minimal package
installation path.

Thank you!
11 months ago
Harrison Chase 8502117f62
bump version to 219 (#6899) 11 months ago
Harrison Chase 3ac08c3de4
Harrison/octo ml (#6897)
Co-authored-by: Bassem Yacoube <125713079+AI-Bassem@users.noreply.github.com>
Co-authored-by: Shotaro Kohama <khmshtr28@gmail.com>
Co-authored-by: Rian Dolphin <34861538+rian-dolphin@users.noreply.github.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Shashank Deshpande <shashankdeshpande18@gmail.com>
11 months ago
Harrison Chase e5611565b7
bump version to 218 (#6857) 11 months ago
Ayan Bandyopadhyay f92ccf70fd
Update to the latest Psychic python library version (#6804)
Update the Psychic document loader to use the latest `psychicapi` python
library version: `0.8.0`
11 months ago
Cristóbal Carnero Liñán e494b0a09f
feat (documents): add a source code loader based on AST manipulation (#6486)
#### Summary

A new approach to loading source code is implemented:

Each top-level function and class in the code is loaded into separate
documents. Then, an additional document is created with the top-level
code, but without the already loaded functions and classes.

This could improve the accuracy of QA chains over source code.

For instance, having this script:

```
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")

def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()

if __name__ == '__main__':
    main()
```

The loader will create three documents with this content:

First document:
```
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")
```

Second document:
```
def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()
```

Third document:
```
# Code for: class MyClass:

# Code for: def main():

if __name__ == '__main__':
    main()
```

A threshold parameter is added to control whether small scripts are
split in this way or not.

At this moment, only Python and JavaScript are supported. The
appropriate parser is determined by examining the file extension.

#### Tests

This PR adds:

- Unit tests
- Integration tests

#### Dependencies

Only one dependency was added as optional (needed for the JavaScript
parser).

#### Documentation

A notebook is added showing how the loader can be used.

#### Who can review?

@eyurtsev @hwchase17

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
11 months ago
Harrison Chase 8392ca602c
bump version to 217 (#6831) 11 months ago
Harrison Chase d1bcc58beb
bump version to 216 (#6770) 11 months ago
Harrison Chase 1742db0c30
bump version to 215 (#6719) 11 months ago
Davis Chase 1da99ce013
bump v214 (#6694) 11 months ago
Harrison Chase ef4c7b54ef
bump to version 213 (#6688) 11 months ago
Davis Chase fe828185ed
Dev2049/bump 212 (#6665) 11 months ago
Davis Chase b25933b607
bump 211 (#6660) 11 months ago
Davis Chase b062a3f938
bump 210 (#6656) 11 months ago
Zander Chase b4fe7f3a09
Session to project (#6249)
Sessions are being renamed to projects in the tracer
11 months ago
Tim Conkling c28990d871
StreamlitCallbackHandler (#6315)
A new implementation of `StreamlitCallbackHandler`. It formats Agent
thoughts into Streamlit expanders.

You can see the handler in action here:
https://langchain-mrkl.streamlit.app/

Per a discussion with Harrison, we'll be adding a
`StreamlitCallbackHandler` implementation to an upcoming
[Streamlit](https://github.com/streamlit/streamlit) release as well, and
will be updating it as we add new LLM- and LangChain-specific features
to Streamlit.

The idea with this PR is that the LangChain `StreamlitCallbackHandler`
will "auto-update" in a way that keeps it forward- (and backward-)
compatible with Streamlit. If the user has an older Streamlit version
installed, the LangChain `StreamlitCallbackHandler` will be used; if
they have a newer Streamlit version that has an updated
`StreamlitCallbackHandler`, that implementation will be used instead.

(I'm opening this as a draft to get the conversation going and make sure
we're on the same page. We're really excited to land this into
LangChain!)

#### Who can review?

@agola11, @hwchase17
11 months ago
Davis Chase b909bc8b58
bump 209 (#6593) 11 months ago
minhajul-clarifai 6e57306a13
Clarifai integration (#5954)
# Changes
This PR adds [Clarifai](https://www.clarifai.com/) integration to
Langchain. Clarifai is an end-to-end AI Platform. Clarifai offers user
the ability to use many types of LLM (OpenAI, cohere, ect and other open
source models). As well, a clarifai app can be treated as a vector
database to upload and retrieve data. The integrations includes:
- Clarifai LLM integration: Clarifai supports many types of language
model that users can utilize for their application
- Clarifai VectorDB: A Clarifai application can hold data and
embeddings. You can run semantic search with the embeddings

#### Before submitting
- [x] Added integration test for LLM 
- [x] Added integration test for VectorDB 
- [x] Added notebook for LLM 
- [x] Added notebook for VectorDB 

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
11 months ago
Davis Chase 4fabd02d25
Add OpenLLM wrapper(#6578)
LLM wrapper for models served with OpenLLM

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Chaoyu <paranoyang@gmail.com>
11 months ago
Harrison Chase ace442b992
bump to ver 208 (#6540) 11 months ago
Davis Chase 8cd5f65a6f
release 207 (#6488) 11 months ago
Harrison Chase 7414e9d196
bump version to 206 (#6465) 11 months ago
Zeeland 8a604b93ab
feat: use latest duckduckgo_search API to call (#6409)
# Provider the latest duckduckgo_search API

The Git commit contents involve two files related to some DuckDuckGo
query operations, and an upgrade of the DuckDuckGo module to version
3.8.3. A suitable commit message could be "Upgrade DuckDuckGo module to
version 3.8.3, including query operations". Specifically, in the
duckduckgo_search.py file, a DDGS() class instance is newly added to
replace the previous ddg() function, and the time parameter name in the
get_snippets() and results() methods is changed from "time" to
"timelimit" to accommodate recent changes. In the pyproject.toml file,
the duckduckgo-search module is upgraded to version 3.8.3.

[duckduckgo_search readme
attention](https://github.com/deedy5/duckduckgo_search): Versions before
v2.9.4 no longer work as of May 12, 2023

## Who can review?

@vowelparrot

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
11 months ago
volodymyr-memsql d2e9b621ab
Update SinglStoreDB vectorstore (#6423)
1. Introduced new distance strategies support: **DOT_PRODUCT** and
**EUCLIDEAN_DISTANCE** for enhanced flexibility.
2. Implemented a feature to filter results based on metadata fields.
3. Incorporated connection attributes specifying "langchain python sdk"
usage for enhanced traceability and debugging.
4. Expanded the suite of integration tests for improved code
reliability.
5. Updated the existing notebook with the usage example

@dev2049

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
11 months ago
Zander Chase 00f276d23f
Run eval in eval mode (#6447)
For the `run_on_dataset` sessions
11 months ago
Harrison Chase df40cd233f
bump version to 205 (#6410) 11 months ago
Harrison Chase 10bff4ecc4
Harrison/chroma fix (#6390)
Co-authored-by: Junu Moon(Fran) <francomoon7@gmail.com>
12 months ago
Leonid Ganeline c7ca350cd3
Fix class promotion (#6187)
In LangChain, all module classes are enumerated in the `__init__.py`
file of the correspondent module. But some classes were missed and were
not included in the module `__init__.py`

This PR:
- added the missed classes to the module `__init__.py` files
- `__init__.py:__all_` variable value (a list of the class names) was
sorted
- `langchain.tools.sql_database.tool.QueryCheckerTool` was renamed into
the `QuerySQLCheckerTool` because it conflicted with
`langchain.tools.spark_sql.tool.QueryCheckerTool`
- changes to `pyproject.toml`:
  - added `pgvector` to `pyproject.toml:extended_testing`
- added `pandas` to
`pyproject.toml:[tool.poetry.group.test.dependencies]`
- commented out the `streamlit` from `collbacks/__init__.py`, It is
because now the `streamlit` requires Python >=3.7, !=3.9.7
- fixed duplicate names in `tools`
- fixed correspondent ut-s

#### Who can review?
@hwchase17
@dev2049
12 months ago
Davis Chase ec850e607f
bump 203 (#6372) 12 months ago
Harrison Chase af18413d97
Harrison/deeplake new features (#6263)
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
12 months ago
ljeagle ad324a39ae
Improve the performance of add_texts interface and upgrade the AwaDB from 0.3.2 to 0.3.3 (#6316)
1. Changed the implementation of add_texts interface for the AwaDB
vector store in order to improve the performance
2. Upgrade the AwaDB from 0.3.2 to 0.3.3

---------

Co-authored-by: vincent <awadb.vincent@gmail.com>
12 months ago
Harrison Chase 94c82a189d
bump to 202 (#6262) 12 months ago
Harrison Chase 8e1a7a8646
bump version to 201 (#6233) 12 months ago
Harrison Chase 6ac120f299
bump ver to 200 (#6130) 12 months ago
Harrison Chase 34ebb29726
bump version to 199 (#6102) 12 months ago
Zander Chase 0c52275bdb
Use Run object from SDK (#6067)
Update the Run object in the tracer to extend that in the SDK to include
the parameters necessary for tracking/tracing
12 months ago
Harrison Chase 289e9aeb9d
bump ver to 198 (#6026) 12 months ago
Harrison Chase d1561b74eb
Harrison/cognitive search (#6011)
Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>
12 months ago
Nuno Campos 18af149e91
nc/load (#5733)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Harrison Chase 20e9ce8a62
bump version to 197 (#6007) 12 months ago
Harrison Chase 9218684759
Add a new vector store - AwaDB (#5971) (#5992)
Added AwaDB vector store, which is a wrapper over the AwaDB, that can be
used as a vector storage and has an efficient similarity search. Added
integration tests for the vector store
Added jupyter notebook with the example

Delete a unneeded empty file and resolve the
conflict(https://github.com/hwchase17/langchain/pull/5886)

Please check, Thanks!

@dev2049
@hwchase17

---------

<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes # (issue)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->

---------

Co-authored-by: ljeagle <vincent_jieli@yeah.net>
Co-authored-by: vincent <awadb.vincent@gmail.com>
12 months ago
Harrison Chase 62ec10a7f5
bump version to 196 (#5988) 12 months ago
Harrison Chase 3678cba0be
bump ver to 195 (#5949) 12 months ago
Zander Chase 77c286cf02
Use LCP Client in Tracer (#5908)
Move the LCP calls to the client.
12 months ago
Harrison Chase 893d20f735
bump version to 194 (#5866) 12 months ago
Harrison Chase 35cfd25db3
Harrison/nebula graph (#5865)
Co-authored-by: Wey Gu <weyl.gu@gmail.com>
Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>
12 months ago
volodymyr-memsql a1549901ce
Added SingleStoreDB Vector Store (#5619)
- Added `SingleStoreDB` vector store, which is a wrapper over the
SingleStore DB database, that can be used as a vector storage and has an
efficient similarity search.
- Added integration tests for the vector store
- Added jupyter notebook with the example

@dev2049

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Zander Chase d9fcc45d05
Add in the async methods and link the run id (#5810) 12 months ago
Harrison Chase ce7c11625f
bump version to 193 (#5838) 12 months ago
Harrison Chase b3ae6bcd3f
bump ver to 192 (#5812) 12 months ago
Zander Chase 204a73c1d9
Use client from LCP-SDK (#5695)
- Remove the client implementation (this breaks backwards compatibility
for existing testers. I could keep the stub in that file if we want, but
not many people are using it yet
- Add SDK as dependency
- Update the 'run_on_dataset' method to be a function that optionally
accepts a client as an argument
- Remove the langchain plus server implementation (you get it for free
with the SDK now)

We could make the SDK optional for now, but the plan is to use w/in the
tracer so it would likely become a hard dependency at some point.
12 months ago
Harrison Chase 08e2352f7b
bump ver 191 (#5766) 12 months ago
Adil Ansari 233b52735e
feat: Support for `Tigris` Vector Database for vector search (#5703)
### Changes
- New vector store integration - [Tigris](https://tigrisdata.com)
- Adds [tigrisdb](https://pypi.org/project/tigrisdb/) optional
dependency
- Example notebook demonstrating usage

Fixes #5535 
Closes tigrisdata/tigris-client-python#40

#### Twitter handles
We'd love a shoutout on our
[@TigrisData](https://twitter.com/TigrisData) and
[@adilansari](https://twitter.com/adilansari) twitter handles

#### Who can review?
@dev2049

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Daniel Chalef 0551bc90a5
Zep Hybrid Search (#5742)
Zep now supports persisting custom metadata with messages and hybrid
search across both message embeddings and structured metadata. This PR
implements custom metadata and enhancements to the
`ZepChatMessageHistory` and `ZepRetriever` classes to implement this
support.

Tag maintainers/contributors who might be interested:

  VectorStores / Retrievers / Memory
  - @dev2049

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
12 months ago
Harrison Chase d0d89d39ef
bump version to 190 (#5704) 12 months ago
Davis Chase 3c6fa9126a
bump 189 (#5620) 1 year ago
Davis Chase 6632188606
bump 188 (#5568) 1 year ago
Davis Chase 46b7181f13
bump 187 (#5504) 1 year ago
Kacper Łukawski 8bcaca435a
Feature: Qdrant filters supports (#5446)
# Support Qdrant filters

Qdrant has an [extensive filtering
system](https://qdrant.tech/documentation/concepts/filtering/) with rich
type support. This PR makes it possible to use the filters in Langchain
by passing an additional param to both the
`similarity_search_with_score` and `similarity_search` methods.

## Who can review?

@dev2049 @hwchase17

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Natalie 199cc700a3
Ability to specify credentials wihen using Google BigQuery as a data loader (#5466)
# Adds ability to specify credentials when using Google BigQuery as a
data loader

Fixes #5465 . Adds ability to set credentials which must be of the
`google.auth.credentials.Credentials` type. This argument is optional
and will default to `None.

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Ayan Bandyopadhyay 8181f9e362
Update psychicapi version (#5471)
Update [psychicapi](https://pypi.org/project/psychicapi/) python package
dependency to the latest version 0.5. The newest python package version
addresses breaking changes in the Psychic http api.
1 year ago
Davis Chase 4379bd4cbb
bump 186 (#5459) 1 year ago
Davis Chase 64b4165c8d
bump 185 (#5442) 1 year ago
Paul-Emile Brotons a61b7f7e7c
adding MongoDBAtlasVectorSearch (#5338)
# Add MongoDBAtlasVectorSearch for the python library

Fixes #5337
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase 760632b292
Harrison/spark reader (#5405)
Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
German Martin 0b3e0dd1d2
New Trello document loader (#4767)
# Added New Trello loader class and documentation

Simple Loader on top of py-trello wrapper. 
With a board name you can pull cards and to do some field parameter
tweaks on load operation.
I included documentation and examples.
Included unit test cases using patch and a fixture for py-trello client
class.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase cce731c3c2
bump version 184 (#5407) 1 year ago
Harrison Chase ad7f4c0317
bump to 183 (#5372) 1 year ago
Davis Chase b705f260f4
bump 182 (#5364) 1 year ago
Eugene Yurtsev 0a8d6bc402
Add instructions to pyproject.toml (#5138)
# Add instructions to pyproject.toml

* Add instructions to pyproject.toml about how to handle optional
dependencies.

## Before submitting


## Who can review?

---------

Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>
Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
1 year ago
Davis Chase 641303a361
bump 181 (#5302) 1 year ago
Michael Landis 7047a2c1af
feat: add Momento as a standard cache and chat message history provider (#5221)
# Add Momento as a standard cache and chat message history provider

This PR adds Momento as a standard caching provider. Implements the
interface, adds integration tests, and documentation. We also add
Momento as a chat history message provider along with integration tests,
and documentation.

[Momento](https://www.gomomento.com/) is a fully serverless cache.
Similar to S3 or DynamoDB, it requires zero configuration,
infrastructure management, and is instantly available. Users sign up for
free and get 50GB of data in/out for free every month.

## Before submitting

 We have added documentation, notebooks, and integration tests
demonstrating usage.

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Davis Chase ca88b25da6
Zep sdk version (#5267)
zep-python's sync methods no longer need an asyncio wrapper. This was
causing issues with FastAPI deployment.
Zep also now supports putting and getting of arbitrary message metadata.

Bump zep-python version to v0.30

Remove nest-asyncio from Zep example notebooks.

Modify tests to include metadata.

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
1 year ago
Davis Chase 15b17f9334
bump 180 (#5248) 1 year ago
Eugene Yurtsev 5cfa72a130
Bibtex integration for document loader and retriever (#5137)
# Bibtex integration

Wrap bibtexparser to retrieve a list of docs from a bibtex file.
* Get the metadata from the bibtex entries
* `page_content` get from the local pdf referenced in the `file` field
of the bibtex entry using `pymupdf`
* If no valid pdf file, `page_content` set to the `abstract` field of
the bibtex entry
* Support Zotero flavour using regex to get the file path
* Added usage example in
`docs/modules/indexes/document_loaders/examples/bibtex.ipynb`
---------

Co-authored-by: Sébastien M. Popoff <sebastien.popoff@espci.fr>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago