# Better docs for weaviate hybrid search
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes: NA
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
@dev2049
zep-python's sync methods no longer need an asyncio wrapper. This was
causing issues with FastAPI deployment.
Zep also now supports putting and getting of arbitrary message metadata.
Bump zep-python version to v0.30
Remove nest-asyncio from Zep example notebooks.
Modify tests to include metadata.
---------
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
# Improve pinecone hybrid search retriever adding metadata support
I simply remove the hardwiring of metadata to the existing
implementation allowing one to pass `metadatas` attribute to the
constructors and in `get_relevant_documents`. I also add one missing pip
install to the accompanying notebook (I am not adding dependencies, they
were pre-existing).
First contribution, just hoping to help, feel free to critique :)
my twitter username is `@andreliebschner`
While looking at hybrid search I noticed #3043 and #1743. I think the
former can be closed as following the example right now (even prior to
my improvements) works just fine, the latter I think can be also closed
safely, maybe pointing out the relevant classes and example. Should I
reply those issues mentioning someone?
@dev2049, @hwchase17
---------
Co-authored-by: Andreas Liebschner <a.liebschner@shopfully.com>
# Add self query translator for weaviate vectorstore
Adds support for the EQ comparator and the AND/OR operators.
Co-authored-by: Dominic Chan <dchan@cppib.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Zep Retriever - Vector Search Over Chat History with the Zep Long-term
Memory Service
More on Zep: https://github.com/getzep/zep
Note: This PR is related to and relies on
https://github.com/hwchase17/langchain/pull/4834. I did not want to
modify the `pyproject.toml` file to add the `zep-python` dependency a
second time.
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
# Docs: improvements in the `retrievers/examples/` notebooks
Its primary purpose is to make the Jupyter notebook examples
**consistent** and more suitable for first-time viewers.
- add links to the integration source (if applicable) with a short
description of this source;
- removed `_retriever` suffix from the file names (where it existed) for
consistency;
- removed ` retriever` from the notebook title (where it existed) for
consistency;
- added code to install necessary Python package(s);
- added code to set up the necessary API Key.
- very small fixes in notebooks from other folders (for consistency):
- docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb
- docs/modules/indexes/vectorstores/examples/pinecone.ipynb
- docs/modules/models/llms/integrations/cohere.ipynb
- fixed misspelling in langchain/retrievers/time_weighted_retriever.py
comment (sorry, about this change in a .py file )
## Who can review
@dev2049
# Fixed typos (issues #4818 & #4668 & more typos)
- At some places, it said `model = ChatOpenAI(model='gpt-3.5-turbo')`
but should be `model = ChatOpenAI(model_name='gpt-3.5-turbo')`
- Fixes some other typos
Fixes#4818, #4668
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
- added `Wikipedia` retriever. It is effectively a wrapper for
`WikipediaAPIWrapper`. It wrapps load() into get_relevant_documents()
- sorted `__all__` in the `retrievers/__init__`
- added integration tests for the WikipediaRetriever
- added an example (as Jupyter notebook) for the WikipediaRetriever
Alternate implementation of #3452 that relies on a generic query
constructor chain and language and then has vector store-specific
translation layer. Still refactoring and updating examples but general
structure is there and seems to work s well as #3452 on exampels
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
## Use `index_id` over `app_id`
We made a major update to index + retrieve based on Metal Indexes
(instead of apps). With this change, we accept an index instead of an
app in each of our respective core apis. [More details
here](https://docs.getmetal.io/api-reference/core/indexing).
Add a time-weighted memory retriever and a notebook that approximates a
Generative Agent from https://arxiv.org/pdf/2304.03442.pdf
The "daily plan" components are removed for now since they are less
useful without a virtual world, but the memory is an interesting
component to build off.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Add SVM retriever class, based on
https://github.com/karpathy/randomfun/blob/master/knn_vs_svm.ipynb.
Testing still WIP, but the logic is correct (I have a local
implementation outside of Langchain working).
---------
Co-authored-by: Lance Martin <122662504+PineappleExpress808@users.noreply.github.com>
Co-authored-by: rlm <31treehaus@31s-MacBook-Pro.local>
Improvements to Deep Lake Vector Store
- much faster view loading of embeddings after filters with
`fetch_chunks=True`
- 2x faster ingestion
- use np.float32 for embeddings to save 2x storage, LZ4 compression for
text and metadata storage (saves up to 4x storage for text data)
- user defined functions as filters
Docs
- Added retriever full example for analyzing twitter the-algorithm
source code with GPT4
- Added a use case for code analysis (please let us know your thoughts
how we can improve it)
---------
Co-authored-by: Davit Buniatyan <d@activeloop.ai>