Fix spelling errors in the text: 'Therefore' and 'Retrying
I want to stress that your feedback is invaluable to us and is genuinely
cherished.
With gratitude,
@baskaryan @hwchase17
Removed extra "the" in the sentence about the chicken crossing the road
in fallbacks.ipynb. The sentence now reads correctly: "Why did the
chicken cross the road?" This resolves the grammatical error and
improves the overall quality of the content.
@baskaryan , @hinthornw , @hwchase17
I want to extend my heartfelt gratitude to the creator for masterfully
crafting this remarkable application. 🙌 I am truly impressed by the
meticulous attention to grammar and spelling in the documentation, which
undoubtedly contributes to a polished and seamless reader experience.
As always, your feedback holds immense value and is greatly appreciated.
@baskaryan , @hwchase17
I want to convey my deep appreciation to the creator for their expert
craftsmanship in developing this exceptional application. 👏 The
remarkable dedication to upholding impeccable grammar and spelling in
the documentation significantly enhances the polished and seamless
experience for readers.
I want to stress that your feedback is invaluable to us and is genuinely
cherished.
With gratitude,
@baskaryan, @hwchase17
In this commit, I have made a modification to the term "Langchain" to
correctly reflect the project's name as "LangChain". This change ensures
consistency and accuracy throughout the codebase and documentation.
@baskaryan , @hwchase17
Refined the example in router.ipynb by addressing a minor typographical
error. The typo "rins" has been corrected to "rains" in the code snippet
that demonstrates the usage of the MultiPromptChain. This change ensures
accuracy and consistency in the provided code example.
This improvement enhances the readability and correctness of the
notebook, making it easier for users to understand and follow the
demonstration. The commit aims to maintain the quality and accuracy of
the content within the repository.
Thank you for your attention to detail, and please review the change at
your convenience.
@baskaryan , @hwchase17
- Description: Fix a minor variable naming inconsistency in a code
snippet in the docs
- Issue: N/A
- Dependencies: none
- Tag maintainer: N/A
- Twitter handle: N/A
- Description: Added improvements in Nebula LLM to perform auto-retry;
more generation parameters supported. Conversation is no longer required
to be passed in the LLM object. Examples are updated.
- Issue: N/A
- Dependencies: N/A
- Tag maintainer: @baskaryan
- Twitter handle: symbldotai
---------
Co-authored-by: toshishjawale <toshish@symbl.ai>
Update documentation and URLs for the Langchain Context integration.
We've moved from getcontext.ai to context.ai \o/
Thanks in advance for the review!
Now with ElasticsearchStore VectorStore merged, i've added support for
the self-query retriever.
I've added a notebook also to demonstrate capability. I've also added
unit tests.
**Credit**
@elastic and @phoey1 on twitter.
Todo:
- [x] Connection options (cloud, localhost url, es_connection) support
- [x] Logging support
- [x] Customisable field support
- [x] Distance Similarity support
- [x] Metadata support
- [x] Metadata Filter support
- [x] Retrieval Strategies
- [x] Approx
- [x] Approx with Hybrid
- [x] Exact
- [x] Custom
- [x] ELSER (excluding hybrid as we are working on RRF support)
- [x] integration tests
- [x] Documentation
👋 this is a contribution to improve Elasticsearch integration with
Langchain. Its based loosely on the changes that are in master but with
some notable changes:
## Package name & design improvements
The import name is now `ElasticsearchStore`, to aid discoverability of
the VectorStore.
```py
## Before
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch, ElasticKnnSearch
## Now
from langchain.vectorstores.elasticsearch import ElasticsearchStore
```
## Retrieval Strategy support
Before we had a number of classes, depending on the strategy you wanted.
`ElasticKnnSearch` for approx, `ElasticVectorSearch` for exact / brute
force.
With `ElasticsearchStore` we have retrieval strategies:
### Approx Example
Default strategy for the vast majority of developers who use
Elasticsearch will be inferring the embeddings from outside of
Elasticsearch. Uses KNN functionality of _search.
```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
texts,
FakeEmbeddings(),
es_url="http://localhost:9200",
index_name="sample-index"
)
output = docsearch.similarity_search("foo", k=1)
```
### Approx, with hybrid
Developers who want to search, using both the embedding and the text
bm25 match. Its simple to enable.
```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
texts,
FakeEmbeddings(),
es_url="http://localhost:9200",
index_name="sample-index",
strategy=ElasticsearchStore.ApproxRetrievalStrategy(hybrid=True)
)
output = docsearch.similarity_search("foo", k=1)
```
### Approx, with `query_model_id`
Developers who want to infer within Elasticsearch, using the model
loaded in the ml node.
This relies on the developer to setup the pipeline and index if they
wish to embed the text in Elasticsearch. Example of this in the test.
```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
texts,
FakeEmbeddings(),
es_url="http://localhost:9200",
index_name="sample-index",
strategy=ElasticsearchStore.ApproxRetrievalStrategy(
query_model_id="sentence-transformers__all-minilm-l6-v2"
),
)
output = docsearch.similarity_search("foo", k=1)
```
### I want to provide my own custom Elasticsearch Query
You might want to have more control over the query, to perform
multi-phase retrieval such as LTR, linearly boosting on document
parameters like recently updated or geo-distance. You can do this with
`custom_query_fn`
```py
def my_custom_query(query_body: dict, query: str) -> dict:
return {"query": {"match": {"text": {"query": "bar"}}}}
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
texts, FakeEmbeddings(), **elasticsearch_connection, index_name=index_name
)
docsearch.similarity_search("foo", k=1, custom_query=my_custom_query)
```
### Exact Example
Developers who have a small dataset in Elasticsearch, dont want the cost
of indexing the dims vs tradeoff on cost at query time. Uses
script_score.
```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
texts,
FakeEmbeddings(),
es_url="http://localhost:9200",
index_name="sample-index",
strategy=ElasticsearchStore.ExactRetrievalStrategy(),
)
output = docsearch.similarity_search("foo", k=1)
```
### ELSER Example
Elastic provides its own sparse vector model called ELSER. With these
changes, its really easy to use. The vector store creates a pipeline and
index thats setup for ELSER. All the developer needs to do is configure,
ingest and query via langchain tooling.
```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
texts,
FakeEmbeddings(),
es_url="http://localhost:9200",
index_name="sample-index",
strategy=ElasticsearchStore.SparseVectorStrategy(),
)
output = docsearch.similarity_search("foo", k=1)
```
## Architecture
In future, we can introduce new strategies and allow us to not break bwc
as we evolve the index / query strategy.
## Credit
On release, could you credit @elastic and @phoey1 please? Thank you!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>