Commit Graph

10 Commits

Author SHA1 Message Date
Daniel Chalef
b26cca8008
Zep Authentication (#6728)
## Description: Add Zep API Key argument to ZepChatMessageHistory and
ZepRetriever
- correct docs site links
- add zep api_key auth to constructors

ZepChatMessageHistory: @hwchase17, 
ZepRetriever: @rlancemartin, @eyurtsev
2023-06-30 14:24:26 -07:00
Lance Martin
3f9900a864
Create MultiQueryRetriever (#6833)
Distance-based vector database retrieval embeds (represents) queries in
high-dimensional space and finds similar embedded documents based on
"distance". But, retrieval may produce difference results with subtle
changes in query wording or if the embeddings do not capture the
semantics of the data well. Prompt engineering / tuning is sometimes
done to manually address these problems, but can be tedious.

The `MultiQueryRetriever` automates the process of prompt tuning by
using an LLM to generate multiple queries from different perspectives
for a given user input query. For each query, it retrieves a set of
relevant documents and takes the unique union across all queries to get
a larger set of potentially relevant documents. By generating multiple
perspectives on the same question, the `MultiQueryRetriever` might be
able to overcome some of the limitations of the distance-based retrieval
and get a richer set of results.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-27 22:59:40 -07:00
Piyush Jain
b1de927f1b
Kendra retriever api (#6616)
## Description
Replaces [Kendra
Retriever](https://github.com/hwchase17/langchain/blob/master/langchain/retrievers/aws_kendra_index_retriever.py)
with an updated version that uses the new [retriever
API](https://docs.aws.amazon.com/kendra/latest/dg/searching-retrieve.html)
which is better suited for retrieval augmented generation (RAG) systems.

**Note**: This change requires the latest version (1.26.159) of boto3 to
work. `pip install -U boto3` to upgrade the boto3 version.

cc @hupe1980
cc @dev2049
2023-06-23 14:59:35 -07:00
Ikko Eltociear Ashimine
73da193a4b
Fix typo in myscale_self_query.ipynb (#6601) 2023-06-23 14:57:12 -07:00
Davis Chase
3298bf4f00
docs/fix links (#6498) 2023-06-20 14:06:50 -07:00
Leonid Ganeline
03b16ed2b1
docs retrievers fixes (#6299)
Fixed several inconsistencies:
- file names and notebook titles should be similar otherwise ToC on the
[retrievers
page](https://python.langchain.com/en/latest/modules/indexes/retrievers.html)
and on the left ToC tab are different. For example, now, `Self-querying
with Chroma` is not correctly alphabetically sorted because its file
named `chroma_self_query.ipynb`
- `Stringing compressors and document transformers...` demoted from `#`
to `##`. Otherwise, it appears in Toc.
- several formatting problems

#### Who can review?

@hwchase17 
@dev2049

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:04:35 -07:00
Harrison Chase
c0c2fd0782
Harrison/zep mem (#6388)
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
2023-06-18 16:53:35 -07:00
Harrison Chase
9bf5b0defa
Harrison/myscale self query (#6376)
Co-authored-by: Fangrui Liu <fangruil@moqi.ai>
Co-authored-by: 刘 方瑞 <fangrui.liu@outlook.com>
Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>
2023-06-18 16:53:10 -07:00
Saba Sturua
427551eabf
DocArray as a Retriever (#6031)
## DocArray as a Retriever

[DocArray](https://github.com/docarray/docarray) is an open-source tool
for managing your multi-modal data. It offers flexibility to store and
search through your data using various document index backends. This PR
introduces `DocArrayRetriever` - which works with any available backend
and serves as a retriever for Langchain apps.

Also, I added 2 notebooks:
DocArray Backends - intro to all 5 currently supported backends, how to
initialize, index, and use them as a retriever
DocArray Usage - showcasing what additional search parameters you can
pass to create versatile retrievers

Example:
```python
from docarray.index import InMemoryExactNNIndex
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.retrievers import DocArrayRetriever


# define document schema
class MyDoc(BaseDoc):
    description: str
    description_embedding: NdArray[1536]


embeddings = OpenAIEmbeddings()
# create documents
descriptions = ["description 1", "description 2"]
desc_embeddings = embeddings.embed_documents(texts=descriptions)
docs = DocList[MyDoc](
    [
        MyDoc(description=desc, description_embedding=embedding)
        for desc, embedding in zip(descriptions, desc_embeddings)
    ]
)

# initialize document index with data
db = InMemoryExactNNIndex[MyDoc](docs)

# create a retriever
retriever = DocArrayRetriever(
    index=db,
    embeddings=embeddings,
    search_field="description_embedding",
    content_field="description",
)

# find the relevant document
doc = retriever.get_relevant_documents("action movies")
print(doc)
```

#### Who can review?

@dev2049

---------

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
2023-06-17 09:09:33 -07:00
Davis Chase
87e502c6bc
Doc refactor (#6300)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-16 11:52:56 -07:00