docs: Add documentation of `ElasticsearchStore.BM25RetrievalStrategy` (#20098)

This pull request follows up on
https://github.com/langchain-ai/langchain/pull/19314 and
https://github.com/langchain-ai/langchain-elastic/pull/6, adding
documentation for the `ElasticsearchStore.BM25RetrievalStrategy`.

As with the other retrieval strategies, `BM25RetrievalStrategy` now has
its own documentation section.

### Background
- The `BM25RetrievalStrategy` has been introduced to `langchain-elastic`
via the pull request
https://github.com/langchain-ai/langchain-elastic/pull/6.
- This PR was initially created in the main `langchain` repository but
was moved to `langchain-elastic` during the review process due to the
migration of the partner package.
- The original PR can be found at
https://github.com/langchain-ai/langchain/pull/19314.
- As
[commented](https://github.com/langchain-ai/langchain/pull/19314#issuecomment-2023202401)
by @joemcelroy, documenting the new retrieval strategy is part of the
requirements for its introduction.

Although the `BM25RetrievalStrategy` has been merged into
`langchain-elastic`, its documentation is still to be maintained in the
main `langchain` repository. Therefore, this pull request adds the
documentation portion of `BM25RetrievalStrategy`.

The content of the documentation remains the same as that included in
the original PR, https://github.com/langchain-ai/langchain/pull/19314.
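
For reviewers who want to see why the notebook example returns `foo`, `foo bar`, `foo bar baz` in that order, here is a minimal sketch of BM25 ranking in plain Python. It is an illustration only, not the `langchain-elastic` or Elasticsearch implementation: the helper name `bm25_rank` is hypothetical, tokenization is a simple whitespace split, and `k1=1.2` / `b=0.75` are assumed to match the Elasticsearch defaults. Real scores also depend on the index analyzer and per-shard statistics.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Rank docs against query with a Lucene-style BM25 formula (illustrative only)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scored = []
    for doc, tokens in zip(docs, tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer documents are penalized.
            norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] / (tf[term] + norm)
        if score > 0:  # keep only documents that match at least one query term
            scored.append((score, doc))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


texts = ["foo", "foo bar", "foo bar baz", "bar", "bar baz", "baz"]
ranked = bm25_rank("foo", texts)
print(ranked)
```

Only the three texts containing `foo` match, and the shorter ones rank higher because of the length normalization term. No embedding model is involved at any point, which is the behavior the new documentation section describes.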

---------

Co-authored-by: Max Jakob <max.jakob@elastic.co>
Shotaro Sano 3 months ago committed by GitHub
parent 0394c6e126
commit 6c11c8dac6

@@ -736,6 +736,50 @@
"```"
]
},
{
"cell_type": "markdown",
"id": "05cdb43d-5e46-46f6-a2dc-91df4aa56ec7",
"metadata": {},
"source": [
"## BM25RetrievalStrategy\n",
"This strategy performs searches using pure BM25, without vector search.\n",
"\n",
"To use it, pass `BM25RetrievalStrategy` to the `ElasticsearchStore` constructor.\n",
"\n",
"Note that the example below does not pass an `embedding` argument, so the search runs without embeddings."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4464a657-08c5-4a1a-b0e8-dba65f5b7ec0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Document(page_content='foo'), Document(page_content='foo bar'), Document(page_content='foo bar baz')]\n"
]
}
],
"source": [
"from langchain_elasticsearch import ElasticsearchStore\n",
"\n",
"db = ElasticsearchStore(\n",
"    es_url=\"http://localhost:9200\",\n",
"    index_name=\"test_index\",\n",
"    strategy=ElasticsearchStore.BM25RetrievalStrategy(),\n",
")\n",
"\n",
"db.add_texts(\n",
"    [\"foo\", \"foo bar\", \"foo bar baz\", \"bar\", \"bar baz\", \"baz\"],\n",
")\n",
"\n",
"results = db.similarity_search(query=\"foo\", k=10)\n",
"print(results)"
]
},
{
"cell_type": "markdown",
"id": "0960fa0a",
@@ -993,7 +1037,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
"version": "3.11.8"
}
},
"nbformat": 4,
