langchain/langchain
Janos Tolgyesi 5f4552391f
Add SKLearnVectorStore (#5305)
# Add SKLearnVectorStore

This PR adds SKLearnVectorStore, a simple vector store based on the
NearestNeighbors implementations in the scikit-learn package. It
provides a drop-in vector store implementation with minimal
dependencies (scikit-learn is typically already installed in a data
scientist / ML engineer environment). The vector store can be persisted
to and loaded from JSON, BSON and Parquet formats.
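
As a rough illustration of the idea (nearest-neighbour lookup over stored embeddings, plus persistence to a file), here is a minimal sketch. To stay dependency-free, it uses brute-force cosine search in plain Python instead of scikit-learn's NearestNeighbors. `TinyVectorStore` and its methods are hypothetical names chosen for this example; they are not the API added in this PR.

```python
import json
import math
import os
import tempfile


class TinyVectorStore:
    """Toy in-memory store: brute-force cosine search plus JSON persistence."""

    def __init__(self):
        self.texts = []
        self.vectors = []

    def add(self, text, vector):
        # Store the text alongside its embedding (assumed to be non-zero).
        self.texts.append(text)
        self.vectors.append([float(x) for x in vector])

    @staticmethod
    def _cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / norm

    def search(self, query, k=1):
        # Compare the query against every stored vector and keep the k closest.
        scored = sorted(
            (self._cosine_distance(query, vec), text)
            for vec, text in zip(self.vectors, self.texts)
        )
        return [(text, dist) for dist, text in scored[:k]]

    def persist(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"texts": self.texts, "vectors": self.vectors}, f)

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        store = cls()
        store.texts, store.vectors = data["texts"], data["vectors"]
        return store


store = TinyVectorStore()
store.add("cats", [1.0, 0.0])
store.add("dogs", [0.0, 1.0])

# Round-trip through a JSON file, then query the restored copy.
path = os.path.join(tempfile.mkdtemp(), "store.json")
store.persist(path)
restored = TinyVectorStore.load(path)
nearest = restored.search([0.9, 0.1], k=1)
print(nearest[0][0])
```

A real implementation would delegate the search to an index structure such as the ones scikit-learn provides; the brute-force scan here only keeps the example short.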

SKLearnVectorStore has a soft (dynamic) dependency on the scikit-learn,
numpy and pandas packages. Persisting to BSON requires the bson package;
persisting to Parquet requires the pyarrow package.
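
A soft dependency is usually implemented by importing the package only at the point of use and raising a descriptive error if it is missing. The sketch below shows that general pattern; `guard_import` is a hypothetical helper written for this example, not necessarily how this PR implements it.

```python
import importlib


def guard_import(module_name, pip_name=None):
    """Import module_name lazily; raise a helpful ImportError if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {module_name}. "
            f"Install it with `pip install {pip_name or module_name}`."
        ) from exc


# A stdlib module stands in for an optional package that happens to be installed.
json_module = guard_import("json")

# A package that is not installed produces an actionable error message.
try:
    guard_import("surely_not_installed_pkg_12345")
except ImportError as err:
    message = str(err)
    print(message)
```

With this pattern, importing the vector store module itself never fails; the error surfaces only when a feature that needs the optional package is actually used.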

## Before submitting

Integration tests are provided under
`tests/integration_tests/vectorstores/test_sklearn.py`

A sample usage notebook is provided under
`docs/modules/indexes/vectorstores/examples/sklear.ipynb`

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-28 08:17:42 -07:00
..
agents Fixing blank thoughts in verbose for "_Exception" Action (#5331) 2023-05-27 21:14:16 -07:00
callbacks Log warning (#5192) 2023-05-24 21:05:13 +00:00
chains Add an example to make the prompt more robust (#5291) 2023-05-26 09:32:35 -04:00
chat_models OpenAI lint (#5273) 2023-05-25 16:20:06 -07:00
cli Add 'status' command to get server status (#5197) 2023-05-24 21:43:16 +00:00
client Add Delete Session Method (#5193) 2023-05-24 21:06:03 +00:00
docstore changed ValueError to ImportError (#5006) 2023-05-19 15:28:08 -07:00
document_loaders Fixed regression in JoplinLoader's get note url (#5265) 2023-05-25 13:10:10 -07:00
embeddings OpenAI lint (#5273) 2023-05-25 16:20:06 -07:00
evaluation Replace remaining usage of basellm with baselangmodel (#3981) 2023-05-02 21:52:29 -07:00
experimental fix: remove empty lines that cause InvalidRequestError (#5320) 2023-05-27 21:15:03 -07:00
graphs changed ValueError to ImportError (#5103) 2023-05-22 15:24:45 -07:00
indexes Create async copy of from_text() inside GraphIndexCreator. (#5214) 2023-05-24 21:54:12 -07:00
llms Fixed passing creds to VertexAI LLM (#5297) 2023-05-26 08:31:02 -07:00
memory added cosmos kwargs option (#5292) 2023-05-27 21:19:40 -07:00
output_parsers add enum output parser (#5165) 2023-05-27 20:58:23 -07:00
prompts fix prompt saving (#4987) 2023-05-20 08:21:52 -07:00
retrievers Better docs for weaviate hybrid search (#5290) 2023-05-26 09:30:41 -07:00
tools Add visible_only and strict_mode options to ClickTool (#4088) 2023-05-25 14:10:39 -07:00
utilities feat: support for shopping search in SerpApi (#5259) 2023-05-27 21:20:24 -07:00
vectorstores Add SKLearnVectorStore (#5305) 2023-05-28 08:17:42 -07:00
__init__.py console callback verbose (#4696) 2023-05-17 01:28:43 +00:00
base_language.py Add async versions of predict() and predict_messages() (#4867) 2023-05-23 17:22:49 -07:00
cache.py feat: add Momento as a standard cache and chat message history provider (#5221) 2023-05-25 19:13:21 -07:00
docker-compose.yaml Update docker-compose.yaml (#3582) 2023-04-26 15:11:59 -07:00
document_transformers.py Contextual compression retriever (#2915) 2023-04-20 17:01:14 -07:00
env.py Add Environment Info to Run (#4691) 2023-05-15 15:38:49 +00:00
example_generator.py Replace remaining usage of basellm with baselangmodel (#3981) 2023-05-02 21:52:29 -07:00
formatting.py Validate input_variables when using jinja2 templates (#3140) 2023-04-19 16:18:32 -07:00
input.py Bold Crumbs (#4876) 2023-05-17 22:50:35 +00:00
math_utils.py add get_top_k_cosine_similarity method to get max top k score and index (#5059) 2023-05-22 11:55:48 -07:00
model_laboratory.py Harrison/improve cache (#368) 2022-12-18 16:22:42 -05:00
py.typed Add py.typed marker to package (#121) 2022-11-12 11:22:32 -08:00
python.py Move PythonRepl -> langchain.utilities (#2917) 2023-04-15 10:50:25 -07:00
requests.py fixed aiohttp.client_exceptions.ClientConnectionError: Connection closed (#2718) 2023-04-11 10:52:55 -07:00
schema.py Improving Resilience of MRKL Agent (#5014) 2023-05-22 11:08:08 -07:00
serpapi.py move serpapi wrapper (#1199) 2023-02-20 21:15:45 -08:00
server.py Update Tracing Walkthrough (#4760) 2023-05-16 13:26:43 +00:00
sql_database.py Support bigquery dialect - SQL (#5261) 2023-05-25 18:19:17 -07:00
text_splitter.py Improve effeciency of TextSplitter.split_documents, iterate once (#5111) 2023-05-22 23:00:24 -04:00
utils.py Harrison/virtual time (#4658) 2023-05-14 10:29:17 -07:00