- [Xorbits
Inference(Xinference)](https://github.com/xorbitsai/inference) is a
powerful and versatile library designed to serve language, speech
recognition, and multimodal models. Xinference supports a variety of
GGML-compatible models including chatglm, whisper, and vicuna, and
utilizes heterogeneous hardware and a distributed architecture for
seamless cross-device and cross-server model deployment.
- This PR integrates Xinference models and Xinference embeddings into
LangChain.
- Dependencies: To install the depenedencies for this integration, run
`pip install "xinference[all]"`
- Example Usage:
To start a local instance of Xinference, run `xinference`.
To deploy Xinference in a distributed cluster, first start an Xinference
supervisor using `xinference-supervisor`:
`xinference-supervisor -H "${supervisor_host}"`
Then, start the Xinference workers using `xinference-worker` on each
server you want to run them on.
`xinference-worker -e "http://${supervisor_host}:9997"`
To use Xinference with LangChain, you also need to launch a model. You
can use command line interface (CLI) to do so. Fo example: `xinference
launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a model named
vicuna-v1.3 with `model_format="ggmlv3"` and `quantization="q4_0"`. A
model UID is returned for you to use.
Now you can use Xinference with LangChain:
```python
from langchain.llms import Xinference
llm = Xinference(
server_url="http://0.0.0.0:9997", # suppose the supervisor_host is "0.0.0.0"
model_uid = {model_uid} # model UID returned from launching a model
)
llm(
prompt="Q: where can we visit in the capital of France? A:",
generate_config={"max_tokens": 1024},
)
```
You can also use RESTful client to launch a model:
```python
from xinference.client import RESTfulClient
client = RESTfulClient("http://0.0.0.0:9997")
model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0")
```
The following code block demonstrates how to use Xinference embeddings
with LangChain:
```python
from langchain.embeddings import XinferenceEmbeddings
xinference = XinferenceEmbeddings(
server_url="http://0.0.0.0:9997",
model_uid = model_uid
)
```
```python
query_result = xinference.embed_query("This is a test query")
```
```python
doc_result = xinference.embed_documents(["text A", "text B"])
```
Xinference is still under rapid development. Feel free to [join our
Slack
community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA)
to get the latest updates!
- Request for review: @hwchase17, @baskaryan
- Twitter handle: https://twitter.com/Xorbitsio
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Background
With the addition on email and calendar tools, LangChain is continuing
to complete its functionality to automate business processes.
## Challenge
One of the pieces of business functionality that LangChain currently
doesn't have is the ability to search for flights and travel in order to
book business travel.
## Changes
This PR implements an integration with the
[Amadeus](https://developers.amadeus.com/) travel search API for
LangChain, enabling seamless search for flights with a single
authentication process.
## Who can review?
@hinthornw
## Appendix
@tsolakoua and @minjikarin, I utilized your
[amadeus-python](https://github.com/amadeus4dev/amadeus-python) library
extensively. Given the rising popularity of LangChain and similar AI
frameworks, the convergence of libraries like amadeus-python and tools
like this one is likely. So, I wanted to keep you updated on our
progress.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Work in Progress.
WIP
Not ready...
Adds Document Loader support for
[Geopandas.GeoDataFrames](https://geopandas.org/)
Example:
- [x] stub out `GeoDataFrameLoader` class
- [x] stub out integration tests
- [ ] Experiment with different geometry text representations
- [ ] Verify CRS is successfully added in metadata
- [ ] Test effectiveness of searches on geometries
- [ ] Test with different geometry types (point, line, polygon with
multi-variants).
- [ ] Add documentation
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Removing **kwargs argument from add_texts method in DeepLake vectorstore
as it confuses users and doesn't fail when user is typing incorrect
parameters.
Also added small test to ensure the change is applies correctly.
Guys could pls take a look: @rlancemartin, @eyurtsev, this is a small
PR.
Thx so much!
** This should land Monday the 17th **
Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to
build, more durable, faster, smaller, and more extensible. This comes
with a few changes:
1. A simplified and improved client setup. Instead of having to remember
weird settings, users can just do `EphemeralClient`, `PersistentClient`
or `HttpClient` (the underlying direct `Client` implementation is also
still accessible)
2. We migrated data stores away from `duckdb` and `clickhouse`. This
changes the api for the `PersistentClient` that used to reference
`chroma_db_impl="duckdb+parquet"`. Now we simply set
`is_persistent=true`. `is_persistent` is set for you to `true` if you
use `PersistentClient`.
3. Because we migrated away from `duckdb` and `clickhouse` - this also
means that users need to migrate their data into the new layout and
schema. Chroma is committed to providing extension notification and
tooling around any schema and data migrations (for example - this PR!).
After upgrading to `0.4.0` - if users try to access their data that was
stored in the previous regime, the system will throw an `Exception` and
instruct them how to use the migration assistant to migrate their data.
The migration assitant is a pip installable CLI: `pip install
chroma_migrate`. And is runnable by calling `chroma_migrate`
-- TODO ADD here is a short video demonstrating how it works.
Please reference the readme at
[chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate)
to see a full write-up of our philosophy on migrations as well as more
details about this particular migration.
Please direct any users facing issues upgrading to our Discord channel
called
[#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883).
We have also created a [email
listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers
directly in the future about breaking changes.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
1. Add the metadata filter of documents.
2. Add the text page_content filter of documents
3. fix the bug of similarity_search_with_score
Improvement and fix bug of AwaDB
Fix the conflict https://github.com/hwchase17/langchain/pull/7840
@rlancemartin @eyurtsev Thanks!
---------
Co-authored-by: vincent <awadb.vincent@gmail.com>
- Description: Add a BM25 Retriever that do not need Elastic search
- Dependencies: rank_bm25(if it is not installed it will be install by
using pip, just like TFIDFRetriever do)
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: DayuanJian21687
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Starting over from #5654 because I utterly borked the poetry.lock file.
Adds new paramerters for to the MWDumpLoader class:
* skip_redirecst (bool) Tells the loader to skip articles that redirect
to other articles. False by default.
* stop_on_error (bool) Tells the parser to skip any page that causes a
parse error. True by default.
* namespaces (List[int]) Tells the parser which namespaces to parse.
Contains namespaces from -2 to 15 by default.
Default values are chosen to preserve backwards compatibility.
Sample dump XML and full unit test coverage (with extended tests that
pass!) also included!
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Inspired by #5550, I implemented full async API support in Qdrant. The
docs were extended to mention the existence of asynchronous operations
in Langchain. I also used that chance to restructure the tests of Qdrant
and provided a suite of tests for the async version. Async API requires
the GRPC protocol to be enabled. Thus, it doesn't work on local mode
yet, but we're considering including the support to be consistent.
- Migrate from deprecated langchainplus_sdk to `langsmith` package
- Update the `run_on_dataset()` API to use an eval config
- Update a number of evaluators, as well as the loading logic
- Update docstrings / reference docs
- Update tracer to share single HTTP session
Updates to the WhyLabsCallbackHandler and example notebook
- Update dependency to langkit 0.0.6 which defines new helper methods
for callback integrations
- Update WhyLabsCallbackHandler to use the new `get_callback_instance`
so that the callback is mostly defined in langkit
- Remove much of the implementation of the WhyLabsCallbackHandler here
in favor of the callback instance
This does not change the behavior of the whylabs callback handler
implementation but is a reorganization that moves some of the
implementation externally to our optional dependency package, and should
make future updates easier.
@agola11
Probably the most boring PR to review ;)
Individual commits might be easier to digest
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- Description: Adds a new chain that acts as a wrapper around Sympy to
give LLMs the ability to do some symbolic math.
- Dependencies: SymPy
---------
Co-authored-by: sreiswig <sreiswig@github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description: a description of the change**
Fixed `make docs_build` and related scripts which caused errors. There
are several changes.
First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.
It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.
`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.
Finally, the description of CONTRIBUTING.md was modified.
**Issue: the issue # it fixes (if applicable)**
https://github.com/hwchase17/langchain/issues/6413
**Dependencies: any dependencies required for this change**
`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.
**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan
If this PR needs any additional changes, I'll be happy to make them!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description:
This PR introduces a new option format_diff to the existing Makefile.
This option allows us to apply the formatting tools (Black and isort)
only to the changed Python and ipynb files since the last commit. This
will make our development process more efficient as we only format the
codes that we modify. Along with this change, comments were added to
make the Makefile more understandable and maintainable.
### Issue:
N/A
### Dependencies:
Add dependency to black.
### Tag maintainer:
@baskaryan
### Twitter handle:
[kzk_maeda](https://twitter.com/kzk_maeda)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->
<!-- Remove if not applicable -->
Fixes # (issue)
#### Before submitting
<!-- If you're adding a new integration, please include:
1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use
See contribution guidelines for more information on how to write tests,
lint
etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
#### Who can review?
Tag maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @hwchase17
VectorStores / Retrievers / Memory
- @dev2049
-->
1. Added use cases of the new features
2. Done some code refactoring
---------
Co-authored-by: Ivo Stranic <istranic@gmail.com>
This PR improves the example notebook for the Marqo vectorstore
implementation by adding a new RetrievalQAWithSourcesChain example. The
`embedding` parameter in `from_documents` has its type updated to
`Union[Embeddings, None]` and a default parameter of None because this
is ignored in Marqo.
This PR also upgrades the Marqo version to 0.11.0 to remove the device
parameter after a breaking change to the API.
Related to #7068 @tomhamer @hwchase17
---------
Co-authored-by: Tom Hamer <tom@marqo.ai>
This PR improves upon the Clarifai LangChain integration with improved docs, errors, args and the addition of embedding model support in LancChain for Clarifai's embedding models and an overview of the various ways you can integrate with Clarifai added to the docs.
---------
Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>
This PR brings in a vectorstore interface for
[Marqo](https://www.marqo.ai/).
The Marqo vectorstore exposes some of Marqo's functionality in addition
the the VectorStore base class. The Marqo vectorstore also makes the
embedding parameter optional because inference for embeddings is an
inherent part of Marqo.
Docs, notebook examples and integration tests included.
Related PR:
https://github.com/hwchase17/langchain/pull/2807
---------
Co-authored-by: Tom Hamer <tom@marqo.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for
[LangChain](https://github.com/hwchase17/langchain)
## Description
LangChain support for knowledge graphs relying on W3C standards using
RDFlib: SPARQL/ RDF(S)/ OWL with special focus on RDF \
* Works with local files, files from the web, and SPARQL endpoints
* Supports both SELECT and UPDATE queries
* Includes both a Jupyter notebook with an example and integration tests
## Contribution compared to related PRs and discussions
* [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) -
uses SPARQL, but specifically for wikibase querying
* [Cypher qa](https://github.com/hwchase17/langchain/pull/5078) - graph
DB question answering for Neo4J via Cypher
* [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - tries
something similar, but does not cover UPDATE queries and supports only
RDF
* Discussions on [w3c mailing list](mailto:semantic-web@w3.org) related
to the combination of LLMs (specifically ChatGPT) and knowledge graphs
## Dependencies
* [RDFlib](https://github.com/RDFLib/rdflib)
## Tag maintainer
Graph database related to memory -> @hwchase17
[Apache HugeGraph](https://github.com/apache/incubator-hugegraph) is a
convenient, efficient, and adaptable graph database, compatible with the
Apache TinkerPop3 framework and the Gremlin query language.
In this PR, the HugeGraph and HugeGraphQAChain provide the same
functionality as the existing integration with Neo4j and enables query
generation and question answering over HugeGraph database. The
difference is that the graph query language supported by HugeGraph is
not cypher but another very popular graph query language
[Gremlin](https://tinkerpop.apache.org/gremlin.html).
A notebook example and a simple test case have also been added.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Retrying with the same improvements as in #6772, this time trying not to
mess up with branches.
@rlancemartin doing a fresh new PR from a branch with a new name. This
should do. Thank you for your help!
---------
Co-authored-by: Jonathan Ellis <jbellis@datastax.com>
Co-authored-by: rlm <pexpresss31@gmail.com>
Handle the new retriever events in a way that (I think) is entirely
backwards compatible? Needs more testing for some of the chain changes
and all.
This creates an entire new run type, however. We could also just treat
this as an event within a chain run presumably (same with memory)
Adds a subclass initializer that upgrades old retriever implementations
to the new schema, along with tests to ensure they work.
First commit doesn't upgrade any of our retriever implementations (to
show that we can pass the tests along with additional ones testing the
upgrade logic).
Second commit upgrades the known universe of retrievers in langchain.
- [X] Add callback handling methods for retriever start/end/error (open
to renaming to 'retrieval' if you want that)
- [X] Update BaseRetriever schema to support callbacks
- [X] Tests for upgrading old "v1" retrievers for backwards
compatibility
- [X] Update existing retriever implementations to implement the new
interface
- [X] Update calls within chains to .{a]get_relevant_documents to pass
the child callback manager
- [X] Update the notebooks/docs to reflect the new interface
- [X] Test notebooks thoroughly
Not handled:
- Memory pass throughs: retrieval memory doesn't have a parent callback
manager passed through the method
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
### Overview
This PR aims at building on #4378, expanding the capabilities and
building on top of the `cassIO` library to interface with the database
(as opposed to using the core drivers directly).
Usage of `cassIO` (a library abstracting Cassandra access for
ML/GenAI-specific purposes) is already established since #6426 was
merged, so no new dependencies are introduced.
In the same spirit, we try to uniform the interface for using Cassandra
instances throughout LangChain: all our appreciation of the work by
@jj701 notwithstanding, who paved the way for this incremental work
(thank you!), we identified a few reasons for changing the way a
`CassandraChatMessageHistory` is instantiated. Advocating a syntax
change is something we don't take lighthearted way, so we add some
explanations about this below.
Additionally, this PR expands on integration testing, enables use of
Cassandra's native Time-to-Live (TTL) features and improves the phrasing
around the notebook example and the short "integrations" documentation
paragraph.
We would kindly request @hwchase to review (since this is an elaboration
and proposed improvement of #4378 who had the same reviewer).
### About the __init__ breaking changes
There are
[many](https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cluster/)
options when creating the `Cluster` object, and new ones might be added
at any time. Choosing some of them and exposing them as `__init__`
parameters `CassandraChatMessageHistory` will prove to be insufficient
for at least some users.
On the other hand, working through `kwargs` or adding a long, long list
of arguments to `__init__` is not a desirable option either. For this
reason, (as done in #6426), we propose that whoever instantiates the
Chat Message History class provide a Cassandra `Session` object, ready
to use. This also enables easier injection of mocks and usage of
Cassandra-compatible connections (such as those to the cloud database
DataStax Astra DB, obtained with a different set of init parameters than
`contact_points` and `port`).
We feel that a breaking change might still be acceptable since LangChain
is at `0.*`. However, while maintaining that the approach we propose
will be more flexible in the future, room could be made for a
"compatibility layer" that respects the current init method. Honestly,
we would to that only if there are strong reasons for it, as that would
entail an additional maintenance burden.
### Other changes
We propose to remove the keyspace creation from the class code for two
reasons: first, production Cassandra instances often employ RBAC so that
the database user reading/writing from tables does not necessarily (and
generally shouldn't) have permission to create keyspaces, and second
that programmatic keyspace creation is not a best practice (it should be
done more or less manually, with extra care about schema mismatched
among nodes, etc). Removing this (usually unnecessary) operation from
the `__init__` path would also improve initialization performance
(shorter time).
We suggest, likewise, to remove the `__del__` method (which would close
the database connection), for the following reason: it is the
recommended best practice to create a single Cassandra `Session` object
throughout an application (it is a resource-heavy object capable to
handle concurrency internally), so in case Cassandra is used in other
ways by the app there is the risk of truncating the connection for all
usages when the history instance is destroyed. Moreover, the `Session`
object, in typical applications, is best left to garbage-collect itself
automatically.
As mentioned above, we defer the actual database I/O to the `cassIO`
library, which is designed to encode practices optimized for LLM
applications (among other) without the need to expose LangChain
developers to the internals of CQL (Cassandra Query Language). CassIO is
already employed by the LangChain's Vector Store support for Cassandra.
We added a few more connection options in the companion notebook example
(most notably, Astra DB) to encourage usage by anyone who cannot run
their own Cassandra cluster.
We surface the `ttl_seconds` option for automatic handling of an
expiration time to chat history messages, a likely useful feature given
that very old messages generally may lose their importance.
We elaborated a bit more on the integration testing (Time-to-live,
separation of "session ids", ...).
### Remarks from linter & co.
We reinstated `cassio` as a dependency both in the "optional" group and
in the "integration testing" group of `pyproject.toml`. This might not
be the right thing do to, in which case the author of this PR offer his
apologies (lack of confidence with Poetry - happy to be pointed in the
right direction, though!).
During linter tests, we were hit by some errors which appear unrelated
to the code in the PR. We left them here and report on them here for
awareness:
```
langchain/vectorstores/mongodb_atlas.py:137: error: Argument 1 to "insert_many" of "Collection" has incompatible type "List[Dict[str, Sequence[object]]]"; expected "Iterable[Union[MongoDBDocumentType, RawBSONDocument]]" [arg-type]
langchain/vectorstores/mongodb_atlas.py:186: error: Argument 1 to "aggregate" of "Collection" has incompatible type "List[object]"; expected "Sequence[Mapping[str, Any]]" [arg-type]
langchain/vectorstores/qdrant.py:16: error: Name "grpc" is not defined [name-defined]
langchain/vectorstores/qdrant.py:19: error: Name "grpc" is not defined [name-defined]
langchain/vectorstores/qdrant.py:20: error: Name "grpc" is not defined [name-defined]
langchain/vectorstores/qdrant.py:22: error: Name "grpc" is not defined [name-defined]
langchain/vectorstores/qdrant.py:23: error: Name "grpc" is not defined [name-defined]
```
In the same spirit, we observe that to even get `import langchain` run,
it seems that a `pip install bs4` is missing from the minimal package
installation path.
Thank you!
#### Summary
A new approach to loading source code is implemented:
Each top-level function and class in the code is loaded into separate
documents. Then, an additional document is created with the top-level
code, but without the already loaded functions and classes.
This could improve the accuracy of QA chains over source code.
For instance, having this script:
```
class MyClass:
def __init__(self, name):
self.name = name
def greet(self):
print(f"Hello, {self.name}!")
def main():
name = input("Enter your name: ")
obj = MyClass(name)
obj.greet()
if __name__ == '__main__':
main()
```
The loader will create three documents with this content:
First document:
```
class MyClass:
def __init__(self, name):
self.name = name
def greet(self):
print(f"Hello, {self.name}!")
```
Second document:
```
def main():
name = input("Enter your name: ")
obj = MyClass(name)
obj.greet()
```
Third document:
```
# Code for: class MyClass:
# Code for: def main():
if __name__ == '__main__':
main()
```
A threshold parameter is added to control whether small scripts are
split in this way or not.
At this moment, only Python and JavaScript are supported. The
appropriate parser is determined by examining the file extension.
#### Tests
This PR adds:
- Unit tests
- Integration tests
#### Dependencies
Only one dependency was added as optional (needed for the JavaScript
parser).
#### Documentation
A notebook is added showing how the loader can be used.
#### Who can review?
@eyurtsev @hwchase17
---------
Co-authored-by: rlm <pexpresss31@gmail.com>
A new implementation of `StreamlitCallbackHandler`. It formats Agent
thoughts into Streamlit expanders.
You can see the handler in action here:
https://langchain-mrkl.streamlit.app/
Per a discussion with Harrison, we'll be adding a
`StreamlitCallbackHandler` implementation to an upcoming
[Streamlit](https://github.com/streamlit/streamlit) release as well, and
will be updating it as we add new LLM- and LangChain-specific features
to Streamlit.
The idea with this PR is that the LangChain `StreamlitCallbackHandler`
will "auto-update" in a way that keeps it forward- (and backward-)
compatible with Streamlit. If the user has an older Streamlit version
installed, the LangChain `StreamlitCallbackHandler` will be used; if
they have a newer Streamlit version that has an updated
`StreamlitCallbackHandler`, that implementation will be used instead.
(I'm opening this as a draft to get the conversation going and make sure
we're on the same page. We're really excited to land this into
LangChain!)
#### Who can review?
@agola11, @hwchase17
# Changes
This PR adds [Clarifai](https://www.clarifai.com/) integration to
Langchain. Clarifai is an end-to-end AI Platform. Clarifai offers user
the ability to use many types of LLM (OpenAI, cohere, ect and other open
source models). As well, a clarifai app can be treated as a vector
database to upload and retrieve data. The integrations includes:
- Clarifai LLM integration: Clarifai supports many types of language
model that users can utilize for their application
- Clarifai VectorDB: A Clarifai application can hold data and
embeddings. You can run semantic search with the embeddings
#### Before submitting
- [x] Added integration test for LLM
- [x] Added integration test for VectorDB
- [x] Added notebook for LLM
- [x] Added notebook for VectorDB
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1. upgrade the version of AwaDB
2. add some new interfaces
3. fix bug of packing page content error
@dev2049 please review, thanks!
---------
Co-authored-by: vincent <awadb.vincent@gmail.com>
This PR adds `KuzuGraph` and `KuzuQAChain` for interacting with [Kùzu
database](https://github.com/kuzudb/kuzu). Kùzu is an in-process
property graph database management system (GDBMS) built for query speed
and scalability. The `KuzuGraph` and `KuzuQAChain` provide the same
functionality as the existing integration with NebulaGraph and Neo4j and
enables query generation and question answering over Kùzu database.
A notebook example and a simple test case have also been added.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Provider the latest duckduckgo_search API
The Git commit contents involve two files related to some DuckDuckGo
query operations, and an upgrade of the DuckDuckGo module to version
3.8.3. A suitable commit message could be "Upgrade DuckDuckGo module to
version 3.8.3, including query operations". Specifically, in the
duckduckgo_search.py file, a DDGS() class instance is newly added to
replace the previous ddg() function, and the time parameter name in the
get_snippets() and results() methods is changed from "time" to
"timelimit" to accommodate recent changes. In the pyproject.toml file,
the duckduckgo-search module is upgraded to version 3.8.3.
[duckduckgo_search readme
attention](https://github.com/deedy5/duckduckgo_search): Versions before
v2.9.4 no longer work as of May 12, 2023
## Who can review?
@vowelparrot
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1. Introduced new distance strategies support: **DOT_PRODUCT** and
**EUCLIDEAN_DISTANCE** for enhanced flexibility.
2. Implemented a feature to filter results based on metadata fields.
3. Incorporated connection attributes specifying "langchain python sdk"
usage for enhanced traceability and debugging.
4. Expanded the suite of integration tests for improved code
reliability.
5. Updated the existing notebook with the usage example
@dev2049
---------
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>