For most queries it's the `size` parameter that determines final number
of documents to return. Since our abstractions refer to this as `k`, set
this to be `k` everywhere instead of expecting a separate param. Would
be great to have someone more familiar with OpenSearch validate that
this is reasonable (e.g. that having `size` and what OpenSearch calls
`k` be the same won't lead to any strange behavior). cc @naveentatikonda
Closes#5212
"This notebook shows how to use functionality related to the `OpenSearch` database.\n",
"\n",
"To run, you should have the opensearch instance up and running: [here](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/)\n",
"To run, you should have an OpenSearch instance up and running: [see here for an easy Docker installation](https://hub.docker.com/r/opensearchproject/opensearch).\n",
"\n",
"`similarity_search` by default performs the Approximate k-NN Search which uses one of the several algorithms like lucene, nmslib, faiss recommended for\n",
"large datasets. To perform brute force search we have other search methods known as Script Scoring and Painless Scripting.\n",
"Check [this](https://opensearch.org/docs/latest/search-plugins/knn/index/) for more details."
@ -23,7 +24,8 @@
"id": "94963977-9dfc-48b7-872a-53f2947f46c6",
"metadata": {},
"source": [
"## Installation"
"## Installation\n",
"Install the Python client."
]
},
{
@ -61,7 +63,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "aac9563e",
"metadata": {},
"outputs": [],
@ -74,7 +76,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": null,
"id": "a3c3999a",
"metadata": {},
"outputs": [],
@ -98,6 +100,32 @@
"`similarity_search` using `Approximate k-NN` Search with Custom Parameters"