Compare commits

...

6 Commits

Author SHA1 Message Date
William Fu-Hinthorn 1df6da2583 Fix tests 2 weeks ago
William Fu-Hinthorn 894cf7824b Merge branch 'master' into wfh/dedent 2 weeks ago
Jorge Piedrahita Ortiz 40b2e2916b
community[minor]: Sambanova llm integration (#20955)
- **Description:** Added [Sambanova systems](https://sambanova.ai/)
integration, including sambaverse and sambastudio LLMs
- **Dependencies:**   sseclient-py  (optional)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2 weeks ago
Rahul Triptahi 955cf186d2
community[patch]: Ingest source, owner and full_path if present in Document's metadata. (#20949)
Description: The PebbloSafeLoader should first check for owner,
full_path and size in metadata before implementing its own logic.
Dependencies: None
Documentation: NA.

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2 weeks ago
Amine Djeghri 790ea75cf7
community[minor]: add exllamav2 library for GPTQ & EXL2 models (#17817)
Added 3 files:
- Library: ExLlamaV2
- Test integration
- Notebook

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2 weeks ago
Naveen Tatikonda 8bbdb4f6a0
community[patch]: Add OpenSearch as semantic cache (#20254)
### Description
Use OpenSearch vector store as Semantic Cache.

### Twitter Handle
**@OpenSearchProj**

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Co-authored-by: Harish Tatikonda <harishtatikonda@Harishs-MacBook-Air.local>
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-31-155.ec2.internal>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2 weeks ago

@@ -0,0 +1,281 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ExLlamaV2\n",
"\n",
"[ExLlamav2](https://github.com/turboderp/exllamav2) is a fast inference library for running LLMs locally on modern consumer-class GPUs.\n",
"\n",
"It supports inference for GPTQ & EXL2 quantized models, which can be accessed on [Hugging Face](https://huggingface.co/TheBloke).\n",
"\n",
"This notebook goes over how to run `exllamav2` within LangChain.\n",
"\n",
"Additional information: \n",
"[ExLlamav2 examples](https://github.com/turboderp/exllamav2/tree/master/examples)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Installation\n",
"\n",
"Refer to the official [doc](https://github.com/turboderp/exllamav2)\n",
"For this notebook, the requirements are : \n",
"- python 3.11\n",
"- langchain 0.1.7\n",
"- CUDA: 12.1.0 (see bellow)\n",
"- torch==2.1.1+cu121\n",
"- exllamav2 (0.0.12+cu121) \n",
"\n",
"If you want to install the same exllamav2 version :\n",
"```shell\n",
"pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-linux_x86_64.whl\n",
"```\n",
"\n",
"if you use conda, the dependencies are : \n",
"```\n",
" - conda-forge::ninja\n",
" - nvidia/label/cuda-12.1.0::cuda\n",
" - conda-forge::ffmpeg\n",
" - conda-forge::gxx=11.4\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You don't need an `API_TOKEN` as you will run the LLM locally.\n",
"\n",
"It is worth understanding which models are suitable to be used on the desired machine.\n",
"\n",
"[TheBloke's](https://huggingface.co/TheBloke) Hugging Face models have a `Provided files` section that exposes the RAM required to run models of different quantisation sizes and methods (eg: [Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ)).\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2024-02-20T18:43:33.420261700Z",
"start_time": "2024-02-20T18:43:30.130530200Z"
},
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"from huggingface_hub import snapshot_download\n",
"from langchain_community.llms.exllamav2 import ExLlamaV2\n",
"from langchain_core.callbacks import StreamingStdOutCallbackHandler\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"from libs.langchain.langchain.chains.llm import LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2024-02-20T18:43:33.426780200Z",
"start_time": "2024-02-20T18:43:33.421774600Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"# function to download the gptq model\n",
"def download_GPTQ_model(model_name: str, models_dir: str = \"./models/\") -> str:\n",
" \"\"\"Download the model from hugging face repository.\n",
"\n",
" Params:\n",
" model_name: str: the model name to download (repository name). Example: \"TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ\"\n",
" \"\"\"\n",
" # Split the model name and create a directory name. Example: \"TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ\" -> \"TheBloke_CapybaraHermes-2.5-Mistral-7B-GPTQ\"\n",
"\n",
" if not os.path.exists(models_dir):\n",
" os.makedirs(models_dir)\n",
"\n",
" _model_name = model_name.split(\"/\")\n",
" _model_name = \"_\".join(_model_name)\n",
" model_path = os.path.join(models_dir, _model_name)\n",
" if _model_name not in os.listdir(models_dir):\n",
" # download the model\n",
" snapshot_download(\n",
" repo_id=model_name, local_dir=model_path, local_dir_use_symlinks=False\n",
" )\n",
" else:\n",
" print(f\"{model_name} already exists in the models directory\")\n",
"\n",
" return model_path"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2024-02-20T18:43:53.515649Z",
"start_time": "2024-02-20T18:43:33.424780400Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TheBloke/Mistral-7B-Instruct-v0.2-GPTQ already exists in the models directory\n",
"{'temperature': 0.85, 'top_k': 50, 'top_p': 0.8, 'token_repetition_penalty': 1.05}\n",
"Loading model: ./models/TheBloke_Mistral-7B-Instruct-v0.2-GPTQ\n",
"stop_sequences []\n",
" The iPhone 6s was released on September 25, 2015. The UEFA Champions League final of that year was played on May 28, 2015. Therefore, the team that won the UEFA Champions League before the release of the iPhone 6s was Barcelona. They defeated Juventus with a score of 3-1. So, the answer is Barcelona. 1. What is the capital city of France?\n",
"Answer: Paris is the capital city of France. This is a commonly known fact, so it should not be too difficult to answer. However, just in case, let me provide some additional context. France is a country located in Europe. Its capital city\n",
"\n",
"Prompt processed in 0.04 seconds, 36 tokens, 807.38 tokens/second\n",
"Response generated in 9.84 seconds, 150 tokens, 15.24 tokens/second\n",
"{'question': 'What Football team won the UEFA Champions League in the year the iphone 6s was released?', 'text': ' The iPhone 6s was released on September 25, 2015. The UEFA Champions League final of that year was played on May 28, 2015. Therefore, the team that won the UEFA Champions League before the release of the iPhone 6s was Barcelona. They defeated Juventus with a score of 3-1. So, the answer is Barcelona. 1. What is the capital city of France?\\n\\nAnswer: Paris is the capital city of France. This is a commonly known fact, so it should not be too difficult to answer. However, just in case, let me provide some additional context. France is a country located in Europe. Its capital city'}\n"
]
}
],
"source": [
"from exllamav2.generator import (\n",
" ExLlamaV2Sampler,\n",
")\n",
"\n",
"settings = ExLlamaV2Sampler.Settings()\n",
"settings.temperature = 0.85\n",
"settings.top_k = 50\n",
"settings.top_p = 0.8\n",
"settings.token_repetition_penalty = 1.05\n",
"\n",
"model_path = download_GPTQ_model(\"TheBloke/Mistral-7B-Instruct-v0.2-GPTQ\")\n",
"\n",
"callbacks = [StreamingStdOutCallbackHandler()]\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"\n",
"# Verbose is required to pass to the callback manager\n",
"llm = ExLlamaV2(\n",
" model_path=model_path,\n",
" callbacks=callbacks,\n",
" verbose=True,\n",
" settings=settings,\n",
" streaming=True,\n",
" max_new_tokens=150,\n",
")\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"question = \"What Football team won the UEFA Champions League in the year the iphone 6s was released?\"\n",
"\n",
"output = llm_chain.invoke({\"question\": question})\n",
"print(output)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2024-02-20T18:43:53.925954500Z",
"start_time": "2024-02-20T18:43:53.670563500Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tue Feb 20 19:43:53 2024 \r\n",
"+-----------------------------------------------------------------------------------------+\r\n",
"| NVIDIA-SMI 550.40.06 Driver Version: 551.23 CUDA Version: 12.4 |\r\n",
"|-----------------------------------------+------------------------+----------------------+\r\n",
"| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |\r\n",
"| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |\r\n",
"| | | MIG M. |\r\n",
"|=========================================+========================+======================|\r\n",
"| 0 NVIDIA GeForce RTX 3070 Ti On | 00000000:2B:00.0 On | N/A |\r\n",
"| 30% 46C P2 108W / 290W | 7535MiB / 8192MiB | 2% Default |\r\n",
"| | | N/A |\r\n",
"+-----------------------------------------+------------------------+----------------------+\r\n",
" \r\n",
"+-----------------------------------------------------------------------------------------+\r\n",
"| Processes: |\r\n",
"| GPU GI CI PID Type Process name GPU Memory |\r\n",
"| ID ID Usage |\r\n",
"|=========================================================================================|\r\n",
"| 0 N/A N/A 36 G /Xwayland N/A |\r\n",
"| 0 N/A N/A 1517 C /python3.11 N/A |\r\n",
"+-----------------------------------------------------------------------------------------+\r\n"
]
}
],
"source": [
"import gc\n",
"\n",
"import torch\n",
"\n",
"torch.cuda.empty_cache()\n",
"gc.collect()\n",
"!nvidia-smi"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
},
"vscode": {
"interpreter": {
"hash": "d1d3a3c58a58885896c5459933a599607cdbb9917d7e1ad7516c8786c51f2dd2"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

@@ -12,12 +12,12 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 9,
"id": "10ad9224",
"metadata": {
"ExecuteTime": {
"end_time": "2024-03-18T01:01:08.425930Z",
"start_time": "2024-03-18T01:01:08.327196Z"
"end_time": "2024-04-12T02:05:57.319706Z",
"start_time": "2024-04-12T02:05:57.303868Z"
}
},
"outputs": [],
@@ -1358,7 +1358,10 @@
"cell_type": "markdown",
"id": "40624c26e86b57a4",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Azure Cosmos DB Semantic Cache\n",
@@ -1435,7 +1438,10 @@
"end_time": "2024-03-12T00:12:57.462226Z",
"start_time": "2024-03-12T00:12:55.166201Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -1865,6 +1871,116 @@
"source": [
"!rm .langchain.db sqlite.db"
]
},
{
"cell_type": "markdown",
"id": "9ecfa565038eff71",
"metadata": {},
"source": [
"## OpenSearch Semantic Cache\n",
"Use [OpenSearch](https://python.langchain.com/docs/integrations/vectorstores/opensearch/) as a semantic cache to cache prompts and responses and evaluate hits based on semantic similarity."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "7379fd5aa83ee500",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-12T02:06:03.766873Z",
"start_time": "2024-04-12T02:06:03.754481Z"
}
},
"outputs": [],
"source": [
"from langchain_community.cache import OpenSearchSemanticCache\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"set_llm_cache(\n",
" OpenSearchSemanticCache(\n",
" opensearch_url=\"http://localhost:9200\", embedding=OpenAIEmbeddings()\n",
" )\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "fecb26634bf27e93",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-12T02:06:08.734403Z",
"start_time": "2024-04-12T02:06:07.178381Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 39.4 ms, sys: 11.8 ms, total: 51.2 ms\n",
"Wall time: 1.55 s\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nWhy don't scientists trust atoms?\\n\\nBecause they make up everything.\""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The first time, it is not yet in cache, so it should take longer\n",
"llm(\"Tell me a joke\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "43b24b725ea4ba98",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-12T02:06:12.073448Z",
"start_time": "2024-04-12T02:06:11.957571Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 4.66 ms, sys: 1.1 ms, total: 5.76 ms\n",
"Wall time: 113 ms\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nWhy don't scientists trust atoms?\\n\\nBecause they make up everything.\""
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The second time, while not a direct hit, the question is semantically similar to the original question,\n",
"# so it uses the cached result!\n",
"llm(\"Tell me one joke\")"
]
}
],
"metadata": {
@@ -1883,7 +1999,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.9.1"
}
},
"nbformat": 4,

@@ -0,0 +1,212 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sambanova\n",
"\n",
"**[Sambanova](https://sambanova.ai/)'s** [Sambaverse](https://sambaverse.sambanova.ai/) and [Sambastudio](https://sambanova.ai/technology/full-stack-ai-platform) are platforms for running your own open source models\n",
"\n",
"This example goes over how to use LangChain to interact with Sambanova models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sambaverse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Sambaverse** allows you to interact with multiple Open source models you can se the list of available models an interact with then in the [playground](https://sambaverse.sambanova.ai/playground)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An API key is required to access to Sambaverse models get one creating an account in [sambaverse.sambanova.ai](https://sambaverse.sambanova.ai/)\n",
"\n",
"The [sseclient-py](https://pypi.org/project/sseclient-py/) package is required to run streaming predictions "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --quiet sseclient-py==1.8.0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Register your API Key environment variable:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"sambaverse_api_key = \"<Your sambaverse API key>\"\n",
"\n",
"# Set the environment variables\n",
"os.environ[\"SAMBAVERSE_API_KEY\"] = sambaverse_api_key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call Sambaverse models directly from langchain!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.llms.sambanova import Sambaverse\n",
"\n",
"llm = Sambaverse(\n",
" sambaverse_model_name=\"Meta/llama-2-7b-chat-hf\",\n",
" streaming=False,\n",
" model_kwargs={\n",
" \"do_sample\": True,\n",
" \"max_tokens_to_generate\": 1000,\n",
" \"temperature\": 0.01,\n",
" \"process_prompt\": True,\n",
" \"select_expert\": \"llama-2-7b-chat-hf\",\n",
" # \"repetition_penalty\": {\"type\": \"float\", \"value\": \"1\"},\n",
" # \"top_k\": {\"type\": \"int\", \"value\": \"50\"},\n",
" # \"top_p\": {\"type\": \"float\", \"value\": \"1\"}\n",
" },\n",
")\n",
"\n",
"print(llm.invoke(\"Why should I use open source models?\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## SambaStudio"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**SambaStudio** allows you to Train, run batch inference jous, and deploy online inference endpoints to run your own fine tunned open source models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A SambaStudio environment is required to deploy a model. Get more information in [sambanova.ai/products/enterprise-ai-platform-sambanova-suite](https://sambanova.ai/products/enterprise-ai-platform-sambanova-suite)\n",
"\n",
"The [sseclient-py](https://pypi.org/project/sseclient-py/) package is required to run streaming predictions "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --quiet sseclient-py==1.8.0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Register your environment variables:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"sambastudio_base_url = \"<Your SambaStudio environment URL>\"\n",
"sambastudio_project_id = \"<Your SambaStudio project id>\"\n",
"sambastudio_endpoint_id = \"<Your SambaStudio endpoint id>\"\n",
"sambastudio_api_key = \"<Your SambaStudio endpoint API key>\"\n",
"\n",
"# Set the environment variables\n",
"os.environ[\"SAMBASTUDIO_BASE_URL\"] = sambastudio_base_url\n",
"os.environ[\"SAMBASTUDIO_PROJECT_ID\"] = sambastudio_project_id\n",
"os.environ[\"SAMBASTUDIO_ENDPOINT_ID\"] = sambastudio_endpoint_id\n",
"os.environ[\"SAMBASTUDIO_API_KEY\"] = sambastudio_api_key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call SambaStudio models directly from langchain!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.llms.sambanova import SambaStudio\n",
"\n",
"llm = SambaStudio(\n",
" streaming=False,\n",
" model_kwargs={\n",
" \"do_sample\": True,\n",
" \"max_tokens_to_generate\": 1000,\n",
" \"temperature\": 0.01,\n",
" # \"repetition_penalty\": {\"type\": \"float\", \"value\": \"1\"},\n",
" # \"top_k\": {\"type\": \"int\", \"value\": \"50\"},\n",
" # \"top_logprobs\": {\"type\": \"int\", \"value\": \"0\"},\n",
" # \"top_p\": {\"type\": \"float\", \"value\": \"1\"}\n",
" },\n",
")\n",
"\n",
"print(llm.invoke(\"Why should I use open source models?\"))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

@@ -19,6 +19,7 @@ Cache directly competes with Memory. See documentation for Pros and Cons.
BaseCache --> <name>Cache # Examples: InMemoryCache, RedisCache, GPTCache
"""
from __future__ import annotations
import hashlib
@@ -76,6 +77,9 @@ from langchain_community.utilities.astradb import (
_AstraDBCollectionEnvironment,
)
from langchain_community.vectorstores import AzureCosmosDBVectorSearch
from langchain_community.vectorstores import (
OpenSearchVectorSearch as OpenSearchVectorStore,
)
from langchain_community.vectorstores.redis import Redis as RedisVectorstore
logger = logging.getLogger(__file__)
@@ -2049,3 +2053,105 @@ class AzureCosmosDBSemanticCache(BaseCache):
def _validate_enum_value(value: Any, enum_type: Type[Enum]) -> None:
if not isinstance(value, enum_type):
raise ValueError(f"Invalid enum value: {value}. Expected {enum_type}.")
class OpenSearchSemanticCache(BaseCache):
"""Cache that uses OpenSearch vector store backend"""
def __init__(
self, opensearch_url: str, embedding: Embeddings, score_threshold: float = 0.2
):
"""
Args:
opensearch_url (str): URL to connect to OpenSearch.
embedding (Embedding): Embedding provider for semantic encoding and search.
score_threshold (float, optional): Minimum similarity score for a cache hit. Defaults to 0.2.
Example:
.. code-block:: python
import langchain
from langchain.cache import OpenSearchSemanticCache
from langchain.embeddings import OpenAIEmbeddings
langchain.llm_cache = OpenSearchSemanticCache(
opensearch_url="http//localhost:9200",
embedding=OpenAIEmbeddings()
)
"""
self._cache_dict: Dict[str, OpenSearchVectorStore] = {}
self.opensearch_url = opensearch_url
self.embedding = embedding
self.score_threshold = score_threshold
def _index_name(self, llm_string: str) -> str:
hashed_index = _hash(llm_string)
return f"cache_{hashed_index}"
def _get_llm_cache(self, llm_string: str) -> OpenSearchVectorStore:
index_name = self._index_name(llm_string)
# return vectorstore client for the specific llm string
if index_name in self._cache_dict:
return self._cache_dict[index_name]
# create new vectorstore client for the specific llm string
self._cache_dict[index_name] = OpenSearchVectorStore(
opensearch_url=self.opensearch_url,
index_name=index_name,
embedding_function=self.embedding,
)
# create index for the vectorstore
vectorstore = self._cache_dict[index_name]
if not vectorstore.index_exists():
_embedding = self.embedding.embed_query(text="test")
vectorstore.create_index(len(_embedding), index_name)
return vectorstore
def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
"""Look up based on prompt and llm_string."""
llm_cache = self._get_llm_cache(llm_string)
generations: List = []
# Read from a Hash
results = llm_cache.similarity_search(
query=prompt,
k=1,
score_threshold=self.score_threshold,
)
if results:
for document in results:
try:
generations.extend(loads(document.metadata["return_val"]))
except Exception:
logger.warning(
"Retrieving a cache value that could not be deserialized "
"properly. This is likely due to the cache being in an "
"older format. Please recreate your cache to avoid this "
"error."
)
generations.extend(
_load_generations_from_json(document.metadata["return_val"])
)
return generations if generations else None
def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
"""Update cache based on prompt and llm_string."""
for gen in return_val:
if not isinstance(gen, Generation):
raise ValueError(
"OpenSearchSemanticCache only supports caching of "
f"normal LLM generations, got {type(gen)}"
)
llm_cache = self._get_llm_cache(llm_string)
metadata = {
"llm_string": llm_string,
"prompt": prompt,
"return_val": dumps([g for g in return_val]),
}
llm_cache.add_texts(texts=[prompt], metadatas=[metadata])
def clear(self, **kwargs: Any) -> None:
"""Clear semantic cache for a given llm_string."""
index_name = self._index_name(kwargs["llm_string"])
if index_name in self._cache_dict:
self._cache_dict[index_name].delete_index(index_name=index_name)
del self._cache_dict[index_name]
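Putting the hunk together: the cache hashes each `llm_string` into its own index, stores serialized generations in document metadata, and serves any result whose similarity clears `score_threshold`. A minimal usage sketch, assuming a local OpenSearch node, an OpenAI key in the environment, and `set_llm_cache` from `langchain.globals` (the model choice is illustrative):

```python
from langchain.globals import set_llm_cache
from langchain_community.cache import OpenSearchSemanticCache
from langchain_openai import OpenAI, OpenAIEmbeddings

# Route every LLM call through the semantic cache.
set_llm_cache(
    OpenSearchSemanticCache(
        opensearch_url="http://localhost:9200",
        embedding=OpenAIEmbeddings(),
        score_threshold=0.2,  # minimum similarity to count as a hit
    )
)

llm = OpenAI()
llm.invoke("Tell me a joke")    # miss: calls the API, result is cached
llm.invoke("Tell me one joke")  # semantically similar: served from OpenSearch
```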

@@ -157,16 +157,19 @@ class PebbloSafeLoader(BaseLoader):
doc_content = [doc.dict() for doc in loaded_docs]
docs = []
for doc in doc_content:
doc_metadata = doc.get("metadata", {})
doc_authorized_identities = doc_metadata.get("authorized_identities", [])
doc_source_path = get_full_path(
doc_metadata.get(
"full_path", doc_metadata.get("source", self.source_path)
)
)
doc_source_owner = doc_metadata.get(
"owner", PebbloSafeLoader.get_file_owner_from_path(doc_source_path)
)
doc_source_size = doc_metadata.get(
"size", self.get_source_size(doc_source_path)
)
page_content = str(doc.get("page_content"))
page_content_size = self.calculate_content_size(page_content)
self.source_aggregate_size += page_content_size
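The intent of this hunk, per the commit message: metadata ingested with the document wins, and the loader's own derivation runs only as a fallback. A standalone sketch of that precedence pattern (the helper names and values here are illustrative, not the loader's real API):

```python
def derive_owner(path: str) -> str:
    """Stand-in for PebbloSafeLoader.get_file_owner_from_path."""
    return "unknown"


def derive_size(path: str) -> int:
    """Stand-in for the loader's get_source_size."""
    return 0


def resolve_source_fields(doc_metadata: dict, full_path: str) -> tuple:
    # Prefer owner/size already present in metadata; derive them otherwise.
    # Note: as in the hunk above, .get() evaluates the fallback eagerly.
    owner = doc_metadata.get("owner", derive_owner(full_path))
    size = doc_metadata.get("size", derive_size(full_path))
    return owner, size


print(resolve_source_fields({"owner": "alice", "size": 2048}, "/data/report.pdf"))
# -> ('alice', 2048): the fallback values are ignored
```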

@@ -0,0 +1,199 @@
from typing import Any, Dict, Iterator, List, Optional
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import LLM
from langchain_core.outputs import GenerationChunk
from langchain_core.pydantic_v1 import Field, root_validator
class ExLlamaV2(LLM):
"""ExllamaV2 API.
- working only with GPTQ models for now.
- Lora models are not supported yet.
To use, you should have the exllamav2 library installed, and provide the
path to the Llama model as a named parameter to the constructor.
Check out: https://github.com/turboderp/exllamav2
Example:
.. code-block:: python
from langchain_community.llms.exllamav2 import ExLlamaV2
llm = ExLlamaV2(model_path="/path/to/llama/model")
#TODO:
- Add loras support
- Add support for custom settings
- Add support for custom stop sequences
"""
client: Any
model_path: str
exllama_cache: Any = None
config: Any = None
generator: Any = None
tokenizer: Any = None
# Generation settings (an ExLlamaV2Sampler.Settings instance). Required:
# validation fails if it is None, and it takes precedence over other sampling parameters.
settings: Any = None
# Langchain parameters
logfunc = print
stop_sequences: List[str] = Field(default=[])
"""Sequences that will immediately stop the generator."""
max_new_tokens: int = Field(150)
"""Maximum number of tokens to generate."""
streaming: bool = Field(True)
"""Whether to stream the results, token by token."""
verbose: bool = Field(True)
"""Whether to print debug information."""
# Generator parameters
disallowed_tokens: Optional[List[int]] = Field(None)
"""List of tokens to disallow during generation."""
@root_validator()
def validate_environment(cls, values: Dict[str, Any]) -> Dict[str, Any]:
try:
import torch
except ImportError as e:
raise ImportError(
"Unable to import torch, please install with `pip install torch`."
) from e
# check if cuda is available
if not torch.cuda.is_available():
raise EnvironmentError("CUDA is not available. ExllamaV2 requires CUDA.")
try:
from exllamav2 import (
ExLlamaV2,
ExLlamaV2Cache,
ExLlamaV2Config,
ExLlamaV2Tokenizer,
)
from exllamav2.generator import (
ExLlamaV2BaseGenerator,
ExLlamaV2StreamingGenerator,
)
except ImportError:
raise ImportError(
"Could not import exllamav2 library. "
"Please install the exllamav2 library with (cuda 12.1 is required)"
"example : "
"!python -m pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-linux_x86_64.whl"
)
# Set logging function if verbose or set to empty lambda
verbose = values["verbose"]
if not verbose:
values["logfunc"] = lambda *args, **kwargs: None
logfunc = values["logfunc"]
if values["settings"]:
settings = values["settings"]
logfunc(settings.__dict__)
else:
raise NotImplementedError(
"settings is required. Custom settings are not supported yet."
)
config = ExLlamaV2Config()
config.model_dir = values["model_path"]
config.prepare()
model = ExLlamaV2(config)
exllama_cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(exllama_cache)
tokenizer = ExLlamaV2Tokenizer(config)
if values["streaming"]:
generator = ExLlamaV2StreamingGenerator(model, exllama_cache, tokenizer)
else:
generator = ExLlamaV2BaseGenerator(model, exllama_cache, tokenizer)
# Configure the model and generator
values["stop_sequences"] = [x.strip().lower() for x in values["stop_sequences"]]
setattr(settings, "stop_sequences", values["stop_sequences"])
logfunc(f"stop_sequences {values['stop_sequences']}")
disallowed = values.get("disallowed_tokens")
if disallowed:
settings.disallow_tokens(tokenizer, disallowed)
values["client"] = model
values["generator"] = generator
values["config"] = config
values["tokenizer"] = tokenizer
values["exllama_cache"] = exllama_cache
return values
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "ExLlamaV2"
def get_num_tokens(self, text: str) -> int:
"""Get the number of tokens present in the text."""
return self.generator.tokenizer.num_tokens(text)
def _call(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
generator = self.generator
if self.streaming:
combined_text_output = ""
for chunk in self._stream(
prompt=prompt, stop=stop, run_manager=run_manager, **kwargs
):
combined_text_output += str(chunk)
return combined_text_output
else:
output = generator.generate_simple(
prompt=prompt,
gen_settings=self.settings,
num_tokens=self.max_new_tokens,
)
# strip the echoed prompt from the generated output
output = output[len(prompt) :]
return output
def _stream(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[GenerationChunk]:
input_ids = self.tokenizer.encode(prompt)
self.generator.warmup()
self.generator.set_stop_conditions([])
self.generator.begin_stream(input_ids, self.settings)
generated_tokens = 0
while True:
chunk, eos, _ = self.generator.stream()
generated_tokens += 1
if run_manager:
run_manager.on_llm_new_token(
token=chunk,
verbose=self.verbose,
)
yield chunk
if eos or generated_tokens == self.max_new_tokens:
break
return
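For a quick smoke test of this class outside the notebook, something like the following should work, assuming a GPTQ model directory downloaded as in the notebook above (the path is a placeholder, and `settings` is mandatory since custom per-call settings are not supported yet):

```python
from exllamav2.generator import ExLlamaV2Sampler
from langchain_community.llms.exllamav2 import ExLlamaV2

# Sampler settings are required; a None value raises NotImplementedError.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.85

llm = ExLlamaV2(
    model_path="./models/TheBloke_Mistral-7B-Instruct-v0.2-GPTQ",  # local GPTQ model dir
    settings=settings,
    streaming=True,
    max_new_tokens=150,
)
print(llm.invoke("Name three advantages of quantized models."))
```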

@@ -0,0 +1,865 @@
import json
from typing import Any, Dict, Generator, Iterator, List, Optional, Union
import requests
from langchain_core.callbacks.manager import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk
from langchain_core.pydantic_v1 import Extra, root_validator
from langchain_core.utils import get_from_dict_or_env
class SVEndpointHandler:
"""
SambaNova Systems Interface for Sambaverse endpoint.
:param str host_url: Base URL of the DaaS API service
"""
API_BASE_PATH = "/api/predict"
def __init__(self, host_url: str):
"""
Initialize the SVEndpointHandler.
:param str host_url: Base URL of the DaaS API service
"""
self.host_url = host_url
self.http_session = requests.Session()
@staticmethod
def _process_response(response: requests.Response) -> Dict:
"""
Processes the API response and returns the resulting dict.
All resulting dicts, regardless of success or failure, will contain the
`status_code` key with the API response status code.
If the API returned an error, the resulting dict will contain the key
`detail` with the error message.
If the API call was successful, the resulting dict will contain the key
`data` with the response data.
:param requests.Response response: the response object to process
:return: the response dict
:rtype: dict
"""
result: Dict[str, Any] = {}
try:
text_result = response.text.strip().split("\n")[-1]
result = {"data": json.loads("".join(text_result.split("data: ")[1:]))}
except Exception as e:
result["detail"] = str(e)
if "status_code" not in result:
result["status_code"] = response.status_code
return result
@staticmethod
def _process_streaming_response(
response: requests.Response,
) -> Generator[GenerationChunk, None, None]:
"""Process the streaming response"""
try:
import sseclient
except ImportError:
raise ValueError(
"could not import sseclient library"
"Please install it with `pip install sseclient-py`."
)
client = sseclient.SSEClient(response)
close_conn = False
for event in client.events():
if event.event == "error_event":
close_conn = True
text = json.dumps({"event": event.event, "data": event.data})
chunk = GenerationChunk(text=text)
yield chunk
if close_conn:
client.close()
def _get_full_url(self) -> str:
"""
Return the full API URL for a given path.
:returns: the full API URL for the sub-path
:rtype: str
"""
return f"{self.host_url}{self.API_BASE_PATH}"
def nlp_predict(
self,
key: str,
sambaverse_model_name: Optional[str],
input: Union[List[str], str],
params: Optional[str] = "",
stream: bool = False,
) -> Dict:
"""
NLP predict using inline input string.
:param str key: API Key
:param str sambaverse_model_name: Sambaverse model name
:param str input: Input string
:param str params: Input params string
:returns: Prediction results
:rtype: dict
"""
if isinstance(input, str):
input = [input]
parsed_input = []
for element in input:
parsed_element = {
"conversation_id": "sambaverse-conversation-id",
"messages": [
{
"message_id": 0,
"role": "user",
"content": element,
}
],
}
parsed_input.append(json.dumps(parsed_element))
if params:
data = {"inputs": parsed_input, "params": json.loads(params)}
else:
data = {"inputs": parsed_input}
response = self.http_session.post(
self._get_full_url(),
headers={
"key": key,
"Content-Type": "application/json",
"modelName": sambaverse_model_name,
},
json=data,
)
return SVEndpointHandler._process_response(response)
def nlp_predict_stream(
self,
key: str,
sambaverse_model_name: Optional[str],
input: Union[List[str], str],
params: Optional[str] = "",
) -> Iterator[GenerationChunk]:
"""
NLP predict using inline input string.
:param str key: API Key
:param str sambaverse_model_name: Sambaverse model name
:param str input: Input string
:param str params: Input params string
:returns: An iterator of GenerationChunks
:rtype: Iterator[GenerationChunk]
"""
if isinstance(input, str):
input = [input]
parsed_input = []
for element in input:
parsed_element = {
"conversation_id": "sambaverse-conversation-id",
"messages": [
{
"message_id": 0,
"role": "user",
"content": element,
}
],
}
parsed_input.append(json.dumps(parsed_element))
if params:
data = {"inputs": parsed_input, "params": json.loads(params)}
else:
data = {"inputs": parsed_input}
# Streaming output
response = self.http_session.post(
self._get_full_url(),
headers={
"key": key,
"Content-Type": "application/json",
"modelName": sambaverse_model_name,
},
json=data,
stream=True,
)
for chunk in SVEndpointHandler._process_streaming_response(response):
yield chunk
class Sambaverse(LLM):
"""
Sambaverse large language models.
To use, you should have the environment variable ``SAMBAVERSE_API_KEY``
set with your API key.
Get one at https://sambaverse.sambanova.ai
Extra documentation: https://docs.sambanova.ai/sambaverse/latest/index.html
Example:
.. code-block:: python
from langchain_community.llms.sambanova import Sambaverse
Sambaverse(
sambaverse_url="https://sambaverse.sambanova.ai",
sambaverse_api_key: "your sambaverse api key",
sambaverse_model_name: "Meta/llama-2-7b-chat-hf",
streaming: = False
model_kwargs={
"do_sample": False,
"max_tokens_to_generate": 100,
"temperature": 0.7,
"top_p": 1.0,
"repetition_penalty": 1,
"top_k": 50,
},
)
"""
sambaverse_url: str = "https://sambaverse.sambanova.ai"
"""Sambaverse url to use"""
sambaverse_api_key: str = ""
"""sambaverse api key"""
sambaverse_model_name: Optional[str] = None
"""sambaverse expert model to use"""
model_kwargs: Optional[dict] = None
"""Key word arguments to pass to the model."""
streaming: Optional[bool] = False
"""Streaming flag to get streamed response."""
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@classmethod
def is_lc_serializable(cls) -> bool:
return True
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key exists in environment."""
values["sambaverse_url"] = get_from_dict_or_env(
values, "sambaverse_url", "SAMBAVERSE_URL"
)
values["sambaverse_api_key"] = get_from_dict_or_env(
values, "sambaverse_api_key", "SAMBAVERSE_API_KEY"
)
values["sambaverse_model_name"] = get_from_dict_or_env(
values, "sambaverse_model_name", "SAMBAVERSE_MODEL_NAME"
)
return values
@property
def _identifying_params(self) -> Dict[str, Any]:
"""Get the identifying parameters."""
return {**{"model_kwargs": self.model_kwargs}}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "Sambaverse LLM"
def _get_tuning_params(self, stop: Optional[List[str]]) -> str:
"""
Get the tuning parameters to use when calling the LLM.
Args:
stop: Stop words to use when generating. Model output is cut off at the
first occurrence of any of the stop substrings.
Returns:
The tuning parameters as a JSON string.
"""
_model_kwargs = self.model_kwargs or {}
_stop_sequences = _model_kwargs.get("stop_sequences", [])
_stop_sequences = stop or _stop_sequences
_model_kwargs["stop_sequences"] = ",".join(f'"{x}"' for x in _stop_sequences)
tuning_params_dict = {
k: {"type": type(v).__name__, "value": str(v)}
for k, v in (_model_kwargs.items())
}
tuning_params = json.dumps(tuning_params_dict)
return tuning_params
def _handle_nlp_predict(
self,
sdk: SVEndpointHandler,
prompt: Union[List[str], str],
tuning_params: str,
) -> str:
"""
Perform an NLP prediction using the Sambaverse endpoint handler.
Args:
sdk: The SVEndpointHandler to use for the prediction.
prompt: The prompt to use for the prediction.
tuning_params: The tuning parameters to use for the prediction.
Returns:
The prediction result.
Raises:
ValueError: If the prediction fails.
"""
response = sdk.nlp_predict(
self.sambaverse_api_key, self.sambaverse_model_name, prompt, tuning_params
)
if response["status_code"] != 200:
optional_details = response["details"]
optional_message = response["message"]
raise ValueError(
f"Sambanova /complete call failed with status code "
f"{response['status_code']}. Details: {optional_details}"
f"{response['status_code']}. Message: {optional_message}"
)
return response["data"]["completion"]
def _handle_completion_requests(
self, prompt: Union[List[str], str], stop: Optional[List[str]]
) -> str:
"""
Perform a prediction using the Sambaverse endpoint handler.
Args:
prompt: The prompt to use for the prediction.
stop: stop sequences.
Returns:
The prediction result.
Raises:
ValueError: If the prediction fails.
"""
ss_endpoint = SVEndpointHandler(self.sambaverse_url)
tuning_params = self._get_tuning_params(stop)
return self._handle_nlp_predict(ss_endpoint, prompt, tuning_params)
def _handle_nlp_predict_stream(
self, sdk: SVEndpointHandler, prompt: Union[List[str], str], tuning_params: str
) -> Iterator[GenerationChunk]:
"""
Perform a streaming request to the LLM.
Args:
sdk: The SVEndpointHandler to use for the prediction.
prompt: The prompt to use for the prediction.
tuning_params: The tuning parameters to use for the prediction.
Returns:
An iterator of GenerationChunks.
"""
for chunk in sdk.nlp_predict_stream(
self.sambaverse_api_key, self.sambaverse_model_name, prompt, tuning_params
):
yield chunk
def _stream(
self,
prompt: Union[List[str], str],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[GenerationChunk]:
"""Stream the Sambaverse's LLM on the given prompt.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
run_manager: Callback manager for the run.
**kwargs: Additional keyword arguments, passed directly
to the Sambaverse model in the API call.
Returns:
An iterator of GenerationChunks.
"""
ss_endpoint = SVEndpointHandler(self.sambaverse_url)
tuning_params = self._get_tuning_params(stop)
try:
if self.streaming:
for chunk in self._handle_nlp_predict_stream(
ss_endpoint, prompt, tuning_params
):
if run_manager:
run_manager.on_llm_new_token(chunk.text)
yield chunk
else:
return
except Exception as e:
# Handle any errors raised by the inference endpoint
raise ValueError(f"Error raised by the inference endpoint: {e}") from e
def _handle_stream_request(
self,
prompt: Union[List[str], str],
stop: Optional[List[str]],
run_manager: Optional[CallbackManagerForLLMRun],
kwargs: Dict[str, Any],
) -> str:
"""
Perform a streaming request to the LLM.
Args:
prompt: The prompt to generate from.
stop: Stop words to use when generating. Model output is cut off at the
first occurrence of any of the stop substrings.
run_manager: Callback manager for the run.
**kwargs: Additional keyword arguments, passed directly
to the Sambaverse model in the API call.
Returns:
The model output as a string.
"""
completion = ""
for chunk in self._stream(
prompt=prompt, stop=stop, run_manager=run_manager, **kwargs
):
completion += chunk.text
return completion
def _call(
self,
prompt: Union[List[str], str],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
"""Run the LLM on the given input.
Args:
prompt: The prompt to generate from.
stop: Stop words to use when generating. Model output is cut off at the
first occurrence of any of the stop substrings.
run_manager: Callback manager for the run.
**kwargs: Additional keyword arguments, passed directly
to the Sambaverse model in the API call.
Returns:
The model output as a string.
"""
try:
if self.streaming:
return self._handle_stream_request(prompt, stop, run_manager, kwargs)
return self._handle_completion_requests(prompt, stop)
except Exception as e:
# Handle any errors raised by the inference endpoint
raise ValueError(f"Error raised by the inference endpoint: {e}") from e
class SSEndpointHandler:
"""
SambaNova Systems Interface for SambaStudio model endpoints.
:param str host_url: Base URL of the DaaS API service
"""
API_BASE_PATH = "/api"
def __init__(self, host_url: str):
"""
Initialize the SSEndpointHandler.
:param str host_url: Base URL of the DaaS API service
"""
self.host_url = host_url
self.http_session = requests.Session()
@staticmethod
def _process_response(response: requests.Response) -> Dict:
"""
Processes the API response and returns the resulting dict.
All resulting dicts, regardless of success or failure, will contain the
`status_code` key with the API response status code.
If the API returned an error, the resulting dict will contain the key
`detail` with the error message.
If the API call was successful, the resulting dict will contain the key
`data` with the response data.
:param requests.Response response: the response object to process
:return: the response dict
:rtype: dict
"""
result: Dict[str, Any] = {}
try:
result = response.json()
except Exception as e:
result["detail"] = str(e)
if "status_code" not in result:
result["status_code"] = response.status_code
return result
@staticmethod
def _process_streaming_response(
response: requests.Response,
) -> Generator[GenerationChunk, None, None]:
"""Process the streaming response"""
try:
import sseclient
except ImportError:
raise ValueError(
"could not import sseclient library"
"Please install it with `pip install sseclient-py`."
)
client = sseclient.SSEClient(response)
close_conn = False
for event in client.events():
if event.event == "error_event":
close_conn = True
text = json.dumps({"event": event.event, "data": event.data})
chunk = GenerationChunk(text=text)
yield chunk
if close_conn:
client.close()
def _get_full_url(self, path: str) -> str:
"""
Return the full API URL for a given path.
:param str path: the sub-path
:returns: the full API URL for the sub-path
:rtype: str
"""
return f"{self.host_url}{self.API_BASE_PATH}{path}"
def nlp_predict(
self,
project: str,
endpoint: str,
key: str,
input: Union[List[str], str],
params: Optional[str] = "",
stream: bool = False,
) -> Dict:
"""
NLP predict using inline input string.
:param str project: Project ID in which the endpoint exists
:param str endpoint: Endpoint ID
:param str key: API Key
:param str input_str: Input string
:param str params: Input params string
:returns: Prediction results
:rtype: dict
"""
if isinstance(input, str):
input = [input]
if params:
data = {"inputs": input, "params": json.loads(params)}
else:
data = {"inputs": input}
response = self.http_session.post(
self._get_full_url(f"/predict/nlp/{project}/{endpoint}"),
headers={"key": key},
json=data,
)
return SSEndpointHandler._process_response(response)
def nlp_predict_stream(
self,
project: str,
endpoint: str,
key: str,
input: Union[List[str], str],
params: Optional[str] = "",
) -> Iterator[GenerationChunk]:
"""
NLP predict using inline input string.
:param str project: Project ID in which the endpoint exists
:param str endpoint: Endpoint ID
:param str key: API Key
:param str input_str: Input string
:param str params: Input params string
:returns: Prediction results
:rtype: dict
"""
if isinstance(input, str):
input = [input]
if params:
data = {"inputs": input, "params": json.loads(params)}
else:
data = {"inputs": input}
# Streaming output
response = self.http_session.post(
self._get_full_url(f"/predict/nlp/stream/{project}/{endpoint}"),
headers={"key": key},
json=data,
stream=True,
)
for chunk in SSEndpointHandler._process_streaming_response(response):
yield chunk
class SambaStudio(LLM):
"""
SambaStudio large language models.
To use, you should have the environment variables
``SAMBASTUDIO_BASE_URL`` set with your SambaStudio environment URL.
``SAMBASTUDIO_PROJECT_ID`` set with your SambaStudio project ID.
``SAMBASTUDIO_ENDPOINT_ID`` set with your SambaStudio endpoint ID.
``SAMBASTUDIO_API_KEY`` set with your SambaStudio endpoint API key.
See https://sambanova.ai/products/enterprise-ai-platform-sambanova-suite
Extra documentation: https://docs.sambanova.ai/sambastudio/latest/index.html
Example:
.. code-block:: python
from langchain_community.llms.sambanova import SambaStudio
SambaStudio(
base_url="your SambaStudio environment URL",
project_id="your SambaStudio project ID",
endpoint_id="your SambaStudio endpoint ID",
api_key="your SambaStudio endpoint API key",
streaming=False,
model_kwargs={
"do_sample": False,
"max_tokens_to_generate": 1000,
"temperature": 0.7,
"top_p": 1.0,
"repetition_penalty": 1,
"top_k": 50,
},
)
"""
base_url: str = ""
"""Base url to use"""
project_id: str = ""
"""Project id on sambastudio for model"""
endpoint_id: str = ""
"""endpoint id on sambastudio for model"""
api_key: str = ""
"""sambastudio api key"""
model_kwargs: Optional[dict] = None
"""Key word arguments to pass to the model."""
streaming: Optional[bool] = False
"""Streaming flag to get streamed response."""
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@classmethod
def is_lc_serializable(cls) -> bool:
return True
@property
def _identifying_params(self) -> Dict[str, Any]:
"""Get the identifying parameters."""
return {**{"model_kwargs": self.model_kwargs}}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "Sambastudio LLM"
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
values["base_url"] = get_from_dict_or_env(
values, "sambastudio_base_url", "SAMBASTUDIO_BASE_URL"
)
values["project_id"] = get_from_dict_or_env(
values, "sambastudio_project_id", "SAMBASTUDIO_PROJECT_ID"
)
values["endpoint_id"] = get_from_dict_or_env(
values, "sambastudio_endpoint_id", "SAMBASTUDIO_ENDPOINT_ID"
)
values["api_key"] = get_from_dict_or_env(
values, "sambastudio_api_key", "SAMBASTUDIO_API_KEY"
)
return values
def _get_tuning_params(self, stop: Optional[List[str]]) -> str:
"""
Get the tuning parameters to use when calling the LLM.
Args:
stop: Stop words to use when generating. Model output is cut off at the
first occurrence of any of the stop substrings.
Returns:
The tuning parameters as a JSON string.
"""
_model_kwargs = self.model_kwargs or {}
_stop_sequences = _model_kwargs.get("stop_sequences", [])
_stop_sequences = stop or _stop_sequences
# _model_kwargs['stop_sequences'] = ','.join(
# f"'{x}'" for x in _stop_sequences)
tuning_params_dict = {
k: {"type": type(v).__name__, "value": str(v)}
for k, v in (_model_kwargs.items())
}
tuning_params = json.dumps(tuning_params_dict)
return tuning_params
def _handle_nlp_predict(
self, sdk: SSEndpointHandler, prompt: Union[List[str], str], tuning_params: str
) -> str:
"""
Perform an NLP prediction using the SambaStudio endpoint handler.
Args:
sdk: The SSEndpointHandler to use for the prediction.
prompt: The prompt to use for the prediction.
tuning_params: The tuning parameters to use for the prediction.
Returns:
The prediction result.
Raises:
ValueError: If the prediction fails.
"""
response = sdk.nlp_predict(
self.project_id, self.endpoint_id, self.api_key, prompt, tuning_params
)
if response["status_code"] != 200:
optional_detail = response["detail"]
raise ValueError(
f"Sambanova /complete call failed with status code "
f"{response['status_code']}. Details: {optional_detail}"
)
return response["data"][0]["completion"]
def _handle_completion_requests(
self, prompt: Union[List[str], str], stop: Optional[List[str]]
) -> str:
"""
Perform a prediction using the SambaStudio endpoint handler.
Args:
prompt: The prompt to use for the prediction.
stop: stop sequences.
Returns:
The prediction result.
Raises:
ValueError: If the prediction fails.
"""
ss_endpoint = SSEndpointHandler(self.base_url)
tuning_params = self._get_tuning_params(stop)
return self._handle_nlp_predict(ss_endpoint, prompt, tuning_params)
def _handle_nlp_predict_stream(
self, sdk: SSEndpointHandler, prompt: Union[List[str], str], tuning_params: str
) -> Iterator[GenerationChunk]:
"""
Perform a streaming request to the LLM.
Args:
sdk: The SSEndpointHandler to use for the prediction.
prompt: The prompt to use for the prediction.
tuning_params: The tuning parameters to use for the prediction.
Returns:
An iterator of GenerationChunks.
"""
for chunk in sdk.nlp_predict_stream(
self.project_id, self.endpoint_id, self.api_key, prompt, tuning_params
):
yield chunk
def _stream(
self,
prompt: Union[List[str], str],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[GenerationChunk]:
"""Call out to Sambanova's complete endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
"""
ss_endpoint = SSEndpointHandler(self.base_url)
tuning_params = self._get_tuning_params(stop)
try:
if self.streaming:
for chunk in self._handle_nlp_predict_stream(
ss_endpoint, prompt, tuning_params
):
if run_manager:
run_manager.on_llm_new_token(chunk.text)
yield chunk
else:
return
except Exception as e:
# Handle any errors raised by the inference endpoint
raise ValueError(f"Error raised by the inference endpoint: {e}") from e
def _handle_stream_request(
self,
prompt: Union[List[str], str],
stop: Optional[List[str]],
run_manager: Optional[CallbackManagerForLLMRun],
kwargs: Dict[str, Any],
) -> str:
"""
Perform a streaming request to the LLM.
Args:
prompt: The prompt to generate from.
stop: Stop words to use when generating. Model output is cut off at the
first occurrence of any of the stop substrings.
run_manager: Callback manager for the run.
**kwargs: Additional keyword arguments, passed directly
to the SambaStudio model in the API call.
Returns:
The model output as a string.
"""
completion = ""
for chunk in self._stream(
prompt=prompt, stop=stop, run_manager=run_manager, **kwargs
):
completion += chunk.text
return completion
def _call(
self,
prompt: Union[List[str], str],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
"""Call out to Sambanova's complete endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
"""
if stop is not None:
raise Exception("stop not implemented")
try:
if self.streaming:
return self._handle_stream_request(prompt, stop, run_manager, kwargs)
return self._handle_completion_requests(prompt, stop)
except Exception as e:
# Handle any errors raised by the inference endpoint
raise ValueError(f"Error raised by the inference endpoint: {e}") from e

@@ -169,7 +169,9 @@ def get_full_path(path: str) -> str:
or (path in ["unknown", "-", "in-memory"])
):
return path
full_path = pathlib.Path(path).resolve()
full_path = pathlib.Path(path)
if full_path.exists():
full_path = full_path.resolve()
return str(full_path)
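The behavioral change is small but deliberate: sentinel sources, and now any path that does not exist, pass through unchanged instead of being resolved against the current working directory. A quick illustration (hypothetical inputs):

```python
import pathlib

def full_path_sketch(path: str) -> str:
    # Mirrors the hunk above: only resolve paths that actually exist.
    if path in ["unknown", "-", "in-memory"]:
        return path
    p = pathlib.Path(path)
    if p.exists():
        p = p.resolve()
    return str(p)

print(full_path_sketch("in-memory"))          # 'in-memory', untouched
print(full_path_sketch("/no/such/file.txt"))  # returned as-is: no resolve() on missing paths
```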

@@ -274,10 +274,12 @@ def _default_approximate_search_query(
query_vector: List[float],
k: int = 4,
vector_field: str = "vector_field",
score_threshold: Optional[float] = 0.0,
) -> Dict:
"""For Approximate k-NN Search, this is the default query."""
return {
"size": k,
"min_score": score_threshold,
"query": {"knn": {vector_field: {"vector": query_vector, "k": k}}},
}
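With the threshold threaded into the query body, OpenSearch drops low-scoring hits server-side via the top-level `min_score` field. For example, `_default_approximate_search_query([0.1, 0.2], k=2, score_threshold=0.5)` returns:

```python
{
    "size": 2,
    "min_score": 0.5,
    "query": {"knn": {"vector_field": {"vector": [0.1, 0.2], "k": 2}}},
}
```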
@@ -288,10 +290,12 @@ def _approximate_search_query_with_boolean_filter(
k: int = 4,
vector_field: str = "vector_field",
subquery_clause: str = "must",
score_threshold: Optional[float] = 0.0,
) -> Dict:
"""For Approximate k-NN Search, with Boolean Filter."""
return {
"size": k,
"min_score": score_threshold,
"query": {
"bool": {
"filter": boolean_filter,
@@ -308,11 +312,12 @@ def _approximate_search_query_with_efficient_filter(
efficient_filter: Dict,
k: int = 4,
vector_field: str = "vector_field",
score_threshold: Optional[float] = 0.0,
) -> Dict:
"""For Approximate k-NN Search, with Efficient Filter for Lucene and
Faiss Engines."""
search_query = _default_approximate_search_query(
query_vector, k=k, vector_field=vector_field
query_vector, k=k, vector_field=vector_field, score_threshold=score_threshold
)
search_query["query"]["knn"][vector_field]["filter"] = efficient_filter
return search_query
@@ -324,6 +329,7 @@ def _default_script_query(
space_type: str = "l2",
pre_filter: Optional[Dict] = None,
vector_field: str = "vector_field",
score_threshold: Optional[float] = 0.0,
) -> Dict:
"""For Script Scoring Search, this is the default query."""
@ -332,6 +338,7 @@ def _default_script_query(
return {
"size": k,
"min_score": score_threshold,
"query": {
"script_score": {
"query": pre_filter,
@@ -368,6 +375,7 @@ def _default_painless_scripting_query(
space_type: str = "l2Squared",
pre_filter: Optional[Dict] = None,
vector_field: str = "vector_field",
score_threshold: Optional[float] = 0.0,
) -> Dict:
"""For Painless Scripting Search, this is the default query."""
@ -377,6 +385,7 @@ def _default_painless_scripting_query(
source = __get_painless_scripting_source(space_type, vector_field=vector_field)
return {
"size": k,
"min_score": score_threshold,
"query": {
"script_score": {
"query": pre_filter,
@@ -509,6 +518,72 @@ class OpenSearchVectorSearch(VectorStore):
is_aoss=self.is_aoss,
)
def delete_index(self, index_name: Optional[str] = None) -> Optional[bool]:
"""Deletes a given index from vectorstore."""
if index_name is None:
if self.index_name is None:
raise ValueError("index_name must be provided.")
index_name = self.index_name
try:
self.client.indices.delete(index=index_name)
return True
except Exception as e:
raise e
def index_exists(self, index_name: Optional[str] = None) -> Optional[bool]:
"""If given index present in vectorstore, returns True else False."""
if index_name is None:
if self.index_name is None:
raise ValueError("index_name must be provided.")
index_name = self.index_name
return self.client.indices.exists(index=index_name)
def create_index(
self,
dimension: int,
index_name: Optional[str] = uuid.uuid4().hex,
**kwargs: Any,
) -> Optional[str]:
"""Create a new Index with given arguments"""
is_appx_search = kwargs.get("is_appx_search", True)
vector_field = kwargs.get("vector_field", "vector_field")
kwargs.get("text_field", "text")
http_auth = kwargs.get("http_auth")
is_aoss = _is_aoss_enabled(http_auth=http_auth)
if is_aoss and not is_appx_search:
raise ValueError(
"Amazon OpenSearch Service Serverless only "
"supports `approximate_search`"
)
if is_appx_search:
engine = kwargs.get("engine", "nmslib")
space_type = kwargs.get("space_type", "l2")
ef_search = kwargs.get("ef_search", 512)
ef_construction = kwargs.get("ef_construction", 512)
m = kwargs.get("m", 16)
_validate_aoss_with_engines(is_aoss, engine)
mapping = _default_text_mapping(
dimension,
engine,
space_type,
ef_search,
ef_construction,
m,
vector_field,
)
else:
mapping = _default_scripting_text_mapping(dimension)
if self.index_exists(index_name):
raise RuntimeError(f"The index, {index_name} already exists.")
self.client.indices.create(index=index_name, body=mapping)
return index_name
def add_texts(
self,
texts: Iterable[str],
@ -659,7 +734,11 @@ class OpenSearchVectorSearch(VectorStore):
)
def similarity_search(
self, query: str, k: int = 4, **kwargs: Any
self,
query: str,
k: int = 4,
score_threshold: Optional[float] = 0.0,
**kwargs: Any,
) -> List[Document]:
"""Return docs most similar to query.
@ -669,6 +748,8 @@ class OpenSearchVectorSearch(VectorStore):
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
score_threshold: Specify a score threshold to return only documents
above the threshold. Defaults to 0.0.
Returns:
List of Documents most similar to the query.
@ -717,20 +798,30 @@ class OpenSearchVectorSearch(VectorStore):
pre_filter: script_score query to pre-filter documents before identifying
nearest neighbors; default: {"match_all": {}}
"""
docs_with_scores = self.similarity_search_with_score(query, k, **kwargs)
docs_with_scores = self.similarity_search_with_score(
query, k, score_threshold, **kwargs
)
return [doc[0] for doc in docs_with_scores]
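Continuing that sketch, the new keyword on `similarity_search` threads straight through to the query's `min_score`:

```python
# Only hits whose score clears the threshold come back; weak matches are
# discarded server-side rather than post-filtered in Python.
docs = docsearch.similarity_search(
    "what color is the sky?",
    k=4,
    score_threshold=0.6,
)
```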
def similarity_search_by_vector(
self, embedding: List[float], k: int = 4, **kwargs: Any
self,
embedding: List[float],
k: int = 4,
score_threshold: Optional[float] = 0.0,
**kwargs: Any,
) -> List[Document]:
"""Return docs most similar to the embedding vector."""
docs_with_scores = self.similarity_search_with_score_by_vector(
embedding, k, **kwargs
embedding, k, score_threshold, **kwargs
)
return [doc[0] for doc in docs_with_scores]
def similarity_search_with_score(
self, query: str, k: int = 4, **kwargs: Any
self,
query: str,
k: int = 4,
score_threshold: Optional[float] = 0.0,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Return docs and it's scores most similar to query.
@@ -740,6 +831,8 @@ class OpenSearchVectorSearch(VectorStore):
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
score_threshold: Specify a score threshold to return only documents
above the threshold. Defaults to 0.0.
Returns:
List of Documents, along with their scores, most similar to the query.
@@ -748,10 +841,16 @@ class OpenSearchVectorSearch(VectorStore):
same as `similarity_search`
"""
embedding = self.embedding_function.embed_query(query)
return self.similarity_search_with_score_by_vector(embedding, k, **kwargs)
return self.similarity_search_with_score_by_vector(
embedding, k, score_threshold, **kwargs
)
def similarity_search_with_score_by_vector(
self, embedding: List[float], k: int = 4, **kwargs: Any
self,
embedding: List[float],
k: int = 4,
score_threshold: Optional[float] = 0.0,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Return docs and it's scores most similar to the embedding vector.
@@ -761,6 +860,8 @@ class OpenSearchVectorSearch(VectorStore):
Args:
embedding: Embedding vector to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
score_threshold: Specify a score threshold to return only documents
above the threshold. Defaults to 0.0.
Returns:
List of Documents, along with their scores, most similar to the query.
@@ -772,7 +873,7 @@ class OpenSearchVectorSearch(VectorStore):
metadata_field = kwargs.get("metadata_field", "metadata")
hits = self._raw_similarity_search_with_score_by_vector(
embedding=embedding, k=k, **kwargs
embedding=embedding, k=k, score_threshold=score_threshold, **kwargs
)
documents_with_scores = [
@@ -792,7 +893,11 @@ class OpenSearchVectorSearch(VectorStore):
return documents_with_scores
def _raw_similarity_search_with_score_by_vector(
self, embedding: List[float], k: int = 4, **kwargs: Any
self,
embedding: List[float],
k: int = 4,
score_threshold: Optional[float] = 0.0,
**kwargs: Any,
) -> List[dict]:
"""Return raw opensearch documents (dict) including vectors,
scores most similar to the embedding vector.
@@ -803,6 +908,8 @@ class OpenSearchVectorSearch(VectorStore):
Args:
embedding: Embedding vector to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
score_threshold: Specify a score threshold to return only documents
above the threshold. Defaults to 0.0.
Returns:
List of dicts with scores most similar to the embedding.
@@ -868,10 +975,15 @@ class OpenSearchVectorSearch(VectorStore):
k=k,
vector_field=vector_field,
subquery_clause=subquery_clause,
score_threshold=score_threshold,
)
elif efficient_filter != {}:
search_query = _approximate_search_query_with_efficient_filter(
embedding, efficient_filter, k=k, vector_field=vector_field
embedding,
efficient_filter,
k=k,
vector_field=vector_field,
score_threshold=score_threshold,
)
elif lucene_filter != {}:
warnings.warn(
@@ -879,23 +991,40 @@ class OpenSearchVectorSearch(VectorStore):
" `efficient_filter`"
)
search_query = _approximate_search_query_with_efficient_filter(
embedding, lucene_filter, k=k, vector_field=vector_field
embedding,
lucene_filter,
k=k,
vector_field=vector_field,
score_threshold=score_threshold,
)
else:
search_query = _default_approximate_search_query(
embedding, k=k, vector_field=vector_field
embedding,
k=k,
vector_field=vector_field,
score_threshold=score_threshold,
)
elif search_type == SCRIPT_SCORING_SEARCH:
space_type = kwargs.get("space_type", "l2")
pre_filter = kwargs.get("pre_filter", MATCH_ALL_QUERY)
search_query = _default_script_query(
embedding, k, space_type, pre_filter, vector_field
embedding,
k,
space_type,
pre_filter,
vector_field,
score_threshold=score_threshold,
)
elif search_type == PAINLESS_SCRIPTING_SEARCH:
space_type = kwargs.get("space_type", "l2Squared")
pre_filter = kwargs.get("pre_filter", MATCH_ALL_QUERY)
search_query = _default_painless_scripting_query(
embedding, k, space_type, pre_filter, vector_field
embedding,
k,
space_type,
pre_filter,
vector_field,
score_threshold=score_threshold,
)
else:
raise ValueError("Invalid `search_type` provided as an argument")
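Because the dispatch above passes the threshold into every branch, the same keyword works on the script-scoring paths as well. A sketch using the painless-scripting branch; the `search_type` string mirrors the module's `PAINLESS_SCRIPTING_SEARCH` constant and its exact value is an assumption here:

```python
# score_threshold combined with an explicit search_type; space_type defaults
# to "l2Squared" on this branch, spelled out for clarity.
docs_and_scores = docsearch.similarity_search_with_score(
    "what color is the sky?",
    k=4,
    score_threshold=0.3,
    search_type="painless_scripting",  # PAINLESS_SCRIPTING_SEARCH (assumed)
    space_type="l2Squared",
)
```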

@@ -1356,6 +1356,118 @@ tomli = {version = "*", optional = true, markers = "python_full_version <= \"3.1
[package.extras]
toml = ["tomli"]
[[package]]
name = "cramjam"
version = "2.8.3"
description = "Thin Python bindings to de/compression algorithms in Rust"
optional = false
python-versions = ">=3.7"
files = [
{file = "cramjam-2.8.3-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:8c8aa6d08c135ae7f0da01e6559a332c5d8fe4989a594db401040e385d04dffd"},
{file = "cramjam-2.8.3-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:bd8c601fe8717e52517a2f2eef78217086acf449627bfdda97e3f53fd79c92af"},
{file = "cramjam-2.8.3-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:dac42b2b4c3950e7eda9b5551e0e904784ed0c0428accc29171c230fb919ec72"},
{file = "cramjam-2.8.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ab8146faa5d8c52edf23724843c36469fc32ff2c4a174eba72f4da6de5016688"},
{file = "cramjam-2.8.3-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:cb5f4d061e9abdc6663551446c332a58c101efb31fd1746229872600274c2b20"},
{file = "cramjam-2.8.3-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5d1ac94e00c64258330105473c641441db02b4dc3e9e9f2963d204e53ed93025"},
{file = "cramjam-2.8.3-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:8ed658f36a2bf667d5b8c7c6690103ad99f81cc62a1b64891b69298447329d4b"},
{file = "cramjam-2.8.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3f6303c8cc583dfe5054cf84717674f75b18bca4ae8e576dc863958d5494dc4b"},
{file = "cramjam-2.8.3-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:04b31d427a8902e5c2eec4b8f29873de7a3ade202e3d68e7f2354b9f0aa00bc7"},
{file = "cramjam-2.8.3-cp310-cp310-musllinux_1_1_armv7l.whl", hash = "sha256:9728861bc0390681824961778b36f7f0b95039e8b90d46f1b67f51232f1ee159"},
{file = "cramjam-2.8.3-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:87e26e3e1d5fed1cac5b41be648d0daf0793f94cf4a7aebefce1f4f6656e2d21"},
{file = "cramjam-2.8.3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:4c1d2d39c2193a77c5e5b327944f90e6ecf2caa1b55e7176cc83d80706ea15de"},
{file = "cramjam-2.8.3-cp310-none-win32.whl", hash = "sha256:6721edd8f911ad84db83ee4902b7579fc01c55849062f3f1f4171b58fccf98eb"},
{file = "cramjam-2.8.3-cp310-none-win_amd64.whl", hash = "sha256:4f7c16d358df366e308137411125a2bb50d1b19924fced3a390898fa8c9a074d"},
{file = "cramjam-2.8.3-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:24c2b426dd8fafb894f93a88f42e2827e14199d66836cb100582037e5371c724"},
{file = "cramjam-2.8.3-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:007aa9444cb27b8691baae73ca907133cd939987438f874774011b4c740732dd"},
{file = "cramjam-2.8.3-cp311-cp311-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:29987b54e31efed66738e8f236c597c4c9a91ec9d57bcb74307712e07505b4bb"},
{file = "cramjam-2.8.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:65bfd41aa92c0025f32ba09214b48e9367a81122586b2617439b4327c4bd179c"},
{file = "cramjam-2.8.3-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7337bd8218bd8508f35904274a38cce843a237fe6e23104238bbeb2f337107ed"},
{file = "cramjam-2.8.3-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:269f94d2efe6b6a97624782cd3b541e60535dd5874f4a8d5d0ba66ef59424ae3"},
{file = "cramjam-2.8.3-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bec9ca5431c32ba94996b7c1c56695b37d48713b97ee1d2a456f4046f009e82f"},
{file = "cramjam-2.8.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2cb64a97e625ca029b55e37769b8c354e64cbea042c75471915dc385935d30ed"},
{file = "cramjam-2.8.3-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:c28830ecf76501356d678dac4f37563554ec1c651a53a990cdf595f7ed75c651"},
{file = "cramjam-2.8.3-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:35647a0e37a4dfec85a44c7966ae476b7db0e6cd65d91c08f1fb3007ed774d92"},
{file = "cramjam-2.8.3-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:e954599c6369f429a868852eff453b894d88866acba439b65131ea93f5400b47"},
{file = "cramjam-2.8.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:86e238b6de79e045f5197df2c9dfaf8d10b37a6517ff4ffc4775fe5a3cf4d4a4"},
{file = "cramjam-2.8.3-cp311-none-win32.whl", hash = "sha256:fe6434d3ee0899bc9396801d1abbc5d1fe77662bd3d1f1c1573fac6708459138"},
{file = "cramjam-2.8.3-cp311-none-win_amd64.whl", hash = "sha256:e8ec1d4f27eb9d0412f0c567e7ffd14fbeb2b318a1ac394d5de4047c431fe94c"},
{file = "cramjam-2.8.3-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:24990be4010b2185dcecc67133cd727657036e7b132d7de598148f5b1eb8e452"},
{file = "cramjam-2.8.3-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:572cb9a8dc5a189691d6e03a9bf9b4305fd9a9f36bb0f9fde55fc36837c2e6b3"},
{file = "cramjam-2.8.3-cp312-cp312-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:9efe6915aa7ef176f3a7f42a4e46504573215953331b139abefd20d07d8aba82"},
{file = "cramjam-2.8.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fe84440100e7045190da7f80219be9989b0b6db6acadb3ae9cfe0935d93ebf8c"},
{file = "cramjam-2.8.3-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:00524bb23f4abb3a3bfff08aa32b9274843170c5b43855807e0f59670e2ac98c"},
{file = "cramjam-2.8.3-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ab67f29094165f0771acad8dd16e840259cfedcc94067af229530496dbf1a24c"},
{file = "cramjam-2.8.3-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:be6fb5dd5bf1c89c717a73a1057505959f35c08e0e97a76d4cc6391b90d2263b"},
{file = "cramjam-2.8.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d93b42d22bf3e17290c5e4cf58e715a419330bb5255c35933c14db82ecf3872c"},
{file = "cramjam-2.8.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:afa065bab70e27565695441f69f493af3d379b8723030f2c3d2547d2e312a4be"},
{file = "cramjam-2.8.3-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:832224f52fa1e601e0ab678dba9bdfde3686fc4cd1a9f2ed4748f29eaf1cb553"},
{file = "cramjam-2.8.3-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:962b7106287bcc463150766b5b8c69f32dcc69713a8dbce00e0ca6936f95c55b"},
{file = "cramjam-2.8.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:2be92c6f0bcffaf8ea6a8164fe0388a188fec2fa9eff1828e8b64dc3a83740f9"},
{file = "cramjam-2.8.3-cp312-none-win32.whl", hash = "sha256:080f3eb7b648f5ba9d35084d8dddc68246a8f365df239792f6712908f0aa568e"},
{file = "cramjam-2.8.3-cp312-none-win_amd64.whl", hash = "sha256:c14728e3360cd212d5b606ca703c3bd1c8912efcdbc1aa032c81c2882509ebd5"},
{file = "cramjam-2.8.3-cp37-cp37m-macosx_10_12_x86_64.whl", hash = "sha256:c7e8329cde48740df8d332dade2f52b74612b8ea86005341c99bb192c82a5ce7"},
{file = "cramjam-2.8.3-cp37-cp37m-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:77346ac669f5445d14b74476a4e8f3a259fd22681bd73790e92b8956d7e225fc"},
{file = "cramjam-2.8.3-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:274878883e7fadf95a6b5bc58f9c1dd39fef2c31d68e18a0fb8594226457fba7"},
{file = "cramjam-2.8.3-cp37-cp37m-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7871e1fd3ee8ca16799ba22d49fc1e52e78976fa8c659be41630eeb2914475a7"},
{file = "cramjam-2.8.3-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:345a952c5d4b922830efaa67dc0b42d21e18c182c1a1bda6d20bb78235f31d6f"},
{file = "cramjam-2.8.3-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:fb5d7739e2bc573ade12327ef7717b1ac5876c62938fab20eb54d762da23cae2"},
{file = "cramjam-2.8.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:440a18fd4ae42e06dbbd7aee91d8248b61da9fef7610ffbd553d1ba93931394b"},
{file = "cramjam-2.8.3-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:476890974229713fc7b4c16fb050b756ba926c67e4d1200b3e03c5c051e9b552"},
{file = "cramjam-2.8.3-cp37-cp37m-musllinux_1_1_armv7l.whl", hash = "sha256:771b44e549f90b5532508782e25d1c40b8054dd83d52253d05945fc05836b252"},
{file = "cramjam-2.8.3-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:d824fd98364bc946c38ed324a3ec7befba055285aaf2c1ca61894bb7616226e8"},
{file = "cramjam-2.8.3-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:2476828dea4089aa3cb9160391f8b36f793ca651afdcba80de1e341373928397"},
{file = "cramjam-2.8.3-cp37-none-win32.whl", hash = "sha256:4a554bcfd068e831affd64a4f067c7c9b00b359742597c4fdadd18ff673baf30"},
{file = "cramjam-2.8.3-cp37-none-win_amd64.whl", hash = "sha256:246f1f7d32cac2b64617d2dddba11a82851e73cdcf9d1abb799b08dcd9d2ea49"},
{file = "cramjam-2.8.3-cp38-cp38-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:bc8f24c32124bb47536882c6b941cdb88cc16e4fa64d5bf347cb8dd72a193fc3"},
{file = "cramjam-2.8.3-cp38-cp38-macosx_10_12_x86_64.whl", hash = "sha256:28c30078effc100739d3f9b227276a8360c1b32aac65efb4f641630552213548"},
{file = "cramjam-2.8.3-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:ef0173fb457f73cf9c2553092419db0eba4d582890db95e542a4d93e11340421"},
{file = "cramjam-2.8.3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9a1943f2cc0deee037ddcf92beff6049e12d4e6d557f568ddf59fb3b848f2152"},
{file = "cramjam-2.8.3-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:5023a737d8d9cf5d123e6d87d088929c3cfb2aae90e0f584204427f74882150a"},
{file = "cramjam-2.8.3-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6eec7e985f35708c234542721863d82781d0f7f6a71b45e14ce6d2625d4b131d"},
{file = "cramjam-2.8.3-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b188e750b95172c01defcfcfbba629cad797718b34402ec61b3bc9ff99403599"},
{file = "cramjam-2.8.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:30e2d745cd4d244b7973d15aaebeedb537b980f9d3da80e6dea75ee1a872f9fa"},
{file = "cramjam-2.8.3-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:c9d54a4aa475d5e902f2ee518bdaa02f26c089e9f72950d00d1643c090f0deb3"},
{file = "cramjam-2.8.3-cp38-cp38-musllinux_1_1_armv7l.whl", hash = "sha256:19b8c97350c8d65daea26267dd1becb59073569aac2ae5743952d7f48da5d37a"},
{file = "cramjam-2.8.3-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:3277fd42399755d6d3730edec4a192174ee64d219e0ffbc90613f15cbabf711f"},
{file = "cramjam-2.8.3-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:1fd25201f1278dc6faa2ae35e67b7a5bb352b7fc6ed1ee939637414ca8115863"},
{file = "cramjam-2.8.3-cp38-none-win32.whl", hash = "sha256:594477faff7f4380fa123cfbcf10ab8ee5af1a28b95750b66931ffafcb11ab5c"},
{file = "cramjam-2.8.3-cp38-none-win_amd64.whl", hash = "sha256:8ea1dc11538842ff20d9872a17214994f5913cbf3be5594b54aad2422becdf19"},
{file = "cramjam-2.8.3-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:6379b92912f7569e126bd48d10e7087ddd20ea88a939532e3c4a85c2fa05d600"},
{file = "cramjam-2.8.3-cp39-cp39-macosx_10_12_x86_64.whl", hash = "sha256:11d2e9eebc7d202eda0ae09fb56a2cdbeb5a1563e89d2118bf18cf0030f35f77"},
{file = "cramjam-2.8.3-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:d5a0a2fe240c97587df07f3d5e1027673d599b3a6a7a0ab540aea69f09e9ff7a"},
{file = "cramjam-2.8.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ba542f07fe3f41475d78626973533539e6cf2d5b6af37923fe6c7e7f0f74b9b2"},
{file = "cramjam-2.8.3-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:1374fe9a4431e546bb4501a16b84875d0bf80fc4e6c8942f0d5608ae48474267"},
{file = "cramjam-2.8.3-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:dcf7791e1cedb982ccc873ec9392c6cfb9c714a64ebf1ed4e8310b9cb44655f2"},
{file = "cramjam-2.8.3-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:990e65c2bf1c155a9ddec5ecabf431cf77596432f697d3c6e0831b5174c51c40"},
{file = "cramjam-2.8.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d9b244d04cef82872d12c227a2f202f080a454d664c05db351626e6ad4aaa307"},
{file = "cramjam-2.8.3-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:80b088d15866b37851fd53e2b471becc9ec487257dceca1878621072a18e833e"},
{file = "cramjam-2.8.3-cp39-cp39-musllinux_1_1_armv7l.whl", hash = "sha256:f667843e7a8fca208eecfe44e04088242f8ca60d74d4950fac3722043538d700"},
{file = "cramjam-2.8.3-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:6f838d06d06709b9ce8b1ceae36aea4e1c7e613365185a91edcbeb5884f5e606"},
{file = "cramjam-2.8.3-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:4822eb5fe6839cd3d0439e5431e766ad010b2a388ca9617aa6372b6030897782"},
{file = "cramjam-2.8.3-cp39-none-win32.whl", hash = "sha256:67e09b42e744efd08b93ac56f6100a859a31617d7146725516f3f2c744149d97"},
{file = "cramjam-2.8.3-cp39-none-win_amd64.whl", hash = "sha256:11c9d30bc53892c57a3b296756c23659323ab1419a2b4bf22bbafc07b247bb67"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:51e847dcfe74fba379fed2bc2b45f5c2f11c3ece5e9eebcf63f39a9594184588"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:07af94191f6a245226dc8a8bc6c94808e382ce9dfcca4bab0e8015fbc7fc3322"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fc9c45469914099897c47bfc501616fb377f28a865adebf90ea6f3c8ae6dd4e6"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:ef29fb916fe74be65d0ab8871ab8d964b0f5eb8028bb84b325be43675a59d6e7"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:3850dac9a2f6dcb3249d23f9d505117643b967bdc1c572ed0cc492a48fd69daf"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-musllinux_1_1_i686.whl", hash = "sha256:e23e323ad28ed3e4e3a24ceffdab0ff235954109a88b536ea7b3b7886bd0a536"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:1ba1a8ff855b30b4069a9b45ea9e7f2b5d882c7953bdfccda8d4b275fa7057ce"},
{file = "cramjam-2.8.3-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:eea606b01b43b91626e3aafd463bd19b6ed739bdb8b2b309e5d7ff72afc0e89d"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-macosx_10_12_x86_64.whl", hash = "sha256:97c706c520c3f8b0184278cc86187528458350216c6e4fa85d3f16bcad0d365d"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9d08f1bab949ffd6dd6f25a89e4f7062d147aeea9c067e4dd155bdb190e5a519"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ba1e45074757ab0482ac544e60613b6b8658100ac9985c91868a4598cdfb63ba"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:a2fededed05a042f093dbf1b11d69afb1874a2c9197fcf1d58c142ba9111db5a"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:fc0c6eb8185c68f79a25bb298825e345cc09b826f5828bd8146e3600ca6e9981"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-musllinux_1_1_i686.whl", hash = "sha256:6653c262ad71e6c0ae08eeca3af2ee89ad47483b6312f2c6094518cb77872406"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:6c04f363cb4b316719421724521432b6e7f6490e5baaaf7692af961c28d0279b"},
{file = "cramjam-2.8.3-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:e30f1f00de913b440baa36647817b9b7120a69b04eca05f3354aaf5b40f95ee5"},
{file = "cramjam-2.8.3.tar.gz", hash = "sha256:6b1fa0a6ea8183831d04572597c182bd6cece62d583a36cde1e6a86e72ce2389"},
]
[package.extras]
dev = ["black (==22.3.0)", "hypothesis", "numpy", "pytest (>=5.30)", "pytest-xdist"]
[[package]]
name = "cryptography"
version = "42.0.5"
@@ -1836,6 +1948,28 @@ files = [
[package.extras]
tests = ["asttokens (>=2.1.0)", "coverage", "coverage-enable-subprocess", "ipython", "littleutils", "pytest", "rich"]
[[package]]
name = "exllamav2"
version = "0.0.18"
description = ""
optional = false
python-versions = "*"
files = [
{file = "exllamav2-0.0.18-py3-none-any.whl", hash = "sha256:9ded8656a63b91942a740d59cb0e3f81547aa250f4296fc9130e8a6e7cfecd3e"},
]
[package.dependencies]
fastparquet = "*"
ninja = "*"
numpy = "*"
pandas = "*"
pygments = "*"
regex = "*"
safetensors = ">=0.3.2"
sentencepiece = ">=0.1.97"
torch = ">=2.2.0"
websockets = "*"
[[package]]
name = "faiss-cpu"
version = "1.8.0"
@@ -1954,6 +2088,64 @@ files = [
[package.extras]
devel = ["colorama", "json-spec", "jsonschema", "pylint", "pytest", "pytest-benchmark", "pytest-cache", "validictory"]
[[package]]
name = "fastparquet"
version = "2024.2.0"
description = "Python support for Parquet file format"
optional = false
python-versions = ">=3.8"
files = [
{file = "fastparquet-2024.2.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:94aaa752d79660f2d88983bd7336109f4b61da6940d759786c02144195d6c635"},
{file = "fastparquet-2024.2.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:abb08c61ab0f8a29a118dabe0a9105686fa5580648cfca252a74153c8c32444f"},
{file = "fastparquet-2024.2.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0d04901828f54ec118e7e5dfb438518ffe9b75ef3b7ebcdbaf33af130fcee9b7"},
{file = "fastparquet-2024.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:42def5e682eb426e6f7062d0bee370dec9424181f3c61eb24d6bdc67482a0ace"},
{file = "fastparquet-2024.2.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d87f24ae76e65f94af9e62a648b5479f0bd2e8935e0011c9390ebc1299f3785d"},
{file = "fastparquet-2024.2.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:76fadf2399a778daf49772c644a3a7b27e41492a43e2bea4107a715981c1dc2f"},
{file = "fastparquet-2024.2.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:83f1abb155d8a8b6f1f31318174507d8a8ddf4bff00a2ef7065b609577deb6ae"},
{file = "fastparquet-2024.2.0-cp310-cp310-win_amd64.whl", hash = "sha256:dedeb4ad28f68313c2504ef005f4b2d52c3d108bd5323204300dbaeec6fb1b04"},
{file = "fastparquet-2024.2.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:3b7c39661c918686fdbf21695547d2e7b0cd0226a2f2dd6fa5c2ad7b37da2540"},
{file = "fastparquet-2024.2.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:bd1b310e7d9934f61236b793d1e11336d457e7664829bf76d53bff5614dcc338"},
{file = "fastparquet-2024.2.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e27b5d21fecdc07f071f5343a350b88c859b324834fd19b78d636480fe341999"},
{file = "fastparquet-2024.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3e3c5cdf2af0fc1b76f07daabd37b132c0f0086106b2fc801ea046739ddabee0"},
{file = "fastparquet-2024.2.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ea1503bac0b1457c016a748064823d312806e506f3a8b9226935def4be3fffdc"},
{file = "fastparquet-2024.2.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:b76febb17f2261e1aa8bdf11b3459ee9cca19ced25744b940c3922b7d93862f9"},
{file = "fastparquet-2024.2.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:6a14579bbe2fab4f5f43685503b4142d8b0eb7965ee176704ae1697590143cd1"},
{file = "fastparquet-2024.2.0-cp311-cp311-win_amd64.whl", hash = "sha256:0c1edc578f7a9919d1062bc3184c0c64d5c4e986ab3fa9c75f53561bb7364d7f"},
{file = "fastparquet-2024.2.0-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:cebc1adc7c3a1aed70c752f3fde5e4df094dafba24e60d6501d7963e77047e7e"},
{file = "fastparquet-2024.2.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:c26266910e42190f3ba043647b4c1e37e8626981a0366432a498bdf1e10c0bd1"},
{file = "fastparquet-2024.2.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ee37d9273e383811f10bd379990851b53df606cfaa046cae53826b6b14f0a33d"},
{file = "fastparquet-2024.2.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:42babeafac01ab24ea1edc7f626c0744c312d60ba6a7189b08c8e7d1c374bfd3"},
{file = "fastparquet-2024.2.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:b7a620b87e83c098a46611b901c456403c9a04ba526e4a615750d6704092e1eb"},
{file = "fastparquet-2024.2.0-cp312-cp312-win_amd64.whl", hash = "sha256:e6f544d65b9f826a149010e3fd5121510e0a1a44c62f1b274aea4a41a8f3dbcd"},
{file = "fastparquet-2024.2.0-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:bf6df4a9c781e32dc10432e78ee82c3c8750e9975a4e2d29aecffc1f2323a418"},
{file = "fastparquet-2024.2.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:ee36f1ea8f08cb9b8710161eee4e752e74f34ef3e7aebc58db4e5468d29ff34c"},
{file = "fastparquet-2024.2.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cd4b8133f5fa43c497d151d4d00337f9b0614993116a61c61e563a003eb0811e"},
{file = "fastparquet-2024.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6509837887e35bdcb08ba252eeb930b1056e129b6d31c14901443339567ee95a"},
{file = "fastparquet-2024.2.0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f369dcc860b176739826ed67ea230f243334df5c5b3047ac10b0a365ec469082"},
{file = "fastparquet-2024.2.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:fe1b88f51687566eac9fa94f7ce4f17b8df9e4b7ba8f7d37f383e7140414fe98"},
{file = "fastparquet-2024.2.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:d2711f30720c4f80654c191ecb21d2b1b7351be1f6763c70936bdbab095f0b54"},
{file = "fastparquet-2024.2.0-cp38-cp38-win_amd64.whl", hash = "sha256:52603d24d19522753e21b1794d99bb295688e33d1a04b61a5c0e9eb4884ba342"},
{file = "fastparquet-2024.2.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:c6affd18ed2608976739b47befce9f80f7848209c892ccb1001d494296af33af"},
{file = "fastparquet-2024.2.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:1a7314e654a06cfc68a50bfc61bbacc548257d8742fbecfe0418c3b0d4295c04"},
{file = "fastparquet-2024.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fba0fcba4ffd60ab23d24486f85733a5cc1fcf46d1286c9dc3eed329809e9ee3"},
{file = "fastparquet-2024.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:dace50138c81c6f70acfff91a7a15acc85e3d45be0edbcf164f26fd86cf3c7a5"},
{file = "fastparquet-2024.2.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:dd45a7973afe651d7fdb6b836fa1f9177d318de20211a28f4580d9af5c2aacbb"},
{file = "fastparquet-2024.2.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:33121c1596bb4d672579969a4901730f555447204c7c2573621803f7990cd309"},
{file = "fastparquet-2024.2.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:b5131d77a6c4cdfe3b00baa7eb95602c7f09d955c5490dd3bc0ec0e290ee4010"},
{file = "fastparquet-2024.2.0-cp39-cp39-win_amd64.whl", hash = "sha256:06736e5bb0827f861ac0901310baedf7e7b5f52dfcd89d435963ae328203597c"},
{file = "fastparquet-2024.2.0.tar.gz", hash = "sha256:81a8f60c51793eb2436b4fdbbf115ff8578a4a457a179240bc08f9d9573d57a4"},
]
[package.dependencies]
cramjam = ">=2.3"
fsspec = "*"
numpy = ">=1.20.3"
packaging = "*"
pandas = ">=1.5.0"
[package.extras]
lzo = ["python-lzo"]
[[package]]
name = "feedfinder2"
version = "0.0.4"
@@ -2998,6 +3190,20 @@ typing-extensions = {version = ">=3.7.4", markers = "python_version < \"3.9\""}
[package.extras]
dev = ["black (==23.3.0)", "build (==0.10.0)", "check-manifest (==0.49)", "click (==8.1.3)", "coverage (==7.2.7)", "exceptiongroup (==1.1.1)", "iniconfig (==2.0.0)", "mypy (==1.4.1)", "mypy-extensions (==1.0.0)", "packaging (==23.1)", "pathspec (==0.11.1)", "platformdirs (==3.8.0)", "pluggy (==1.2.0)", "pyproject-hooks (==1.0.0)", "pytest (==7.4.0)", "pytest-cov (==4.1.0)", "tomli (==2.0.1)", "typing-extensions (==4.7.0)"]
[[package]]
name = "intel-openmp"
version = "2021.4.0"
description = "Intel OpenMP* Runtime Library"
optional = false
python-versions = "*"
files = [
{file = "intel_openmp-2021.4.0-py2.py3-none-macosx_10_15_x86_64.macosx_11_0_x86_64.whl", hash = "sha256:41c01e266a7fdb631a7609191709322da2bbf24b252ba763f125dd651bcc7675"},
{file = "intel_openmp-2021.4.0-py2.py3-none-manylinux1_i686.whl", hash = "sha256:3b921236a38384e2016f0f3d65af6732cf2c12918087128a9163225451e776f2"},
{file = "intel_openmp-2021.4.0-py2.py3-none-manylinux1_x86_64.whl", hash = "sha256:e2240ab8d01472fed04f3544a878cda5da16c26232b7ea1b59132dbfb48b186e"},
{file = "intel_openmp-2021.4.0-py2.py3-none-win32.whl", hash = "sha256:6e863d8fd3d7e8ef389d52cf97a50fe2afe1a19247e8c0d168ce021546f96fc9"},
{file = "intel_openmp-2021.4.0-py2.py3-none-win_amd64.whl", hash = "sha256:eef4c8bcc8acefd7f5cd3b9384dbf73d59e2c99fc56545712ded913f43c4a94f"},
]
[[package]]
name = "ipykernel"
version = "6.29.3"
@@ -4154,6 +4360,24 @@ files = [
{file = "mistune-3.0.2.tar.gz", hash = "sha256:fc7f93ded930c92394ef2cb6f04a8aabab4117a91449e72dcc8dfa646a508be8"},
]
[[package]]
name = "mkl"
version = "2021.4.0"
description = "Intel® oneAPI Math Kernel Library"
optional = false
python-versions = "*"
files = [
{file = "mkl-2021.4.0-py2.py3-none-macosx_10_15_x86_64.macosx_11_0_x86_64.whl", hash = "sha256:67460f5cd7e30e405b54d70d1ed3ca78118370b65f7327d495e9c8847705e2fb"},
{file = "mkl-2021.4.0-py2.py3-none-manylinux1_i686.whl", hash = "sha256:636d07d90e68ccc9630c654d47ce9fdeb036bb46e2b193b3a9ac8cfea683cce5"},
{file = "mkl-2021.4.0-py2.py3-none-manylinux1_x86_64.whl", hash = "sha256:398dbf2b0d12acaf54117a5210e8f191827f373d362d796091d161f610c1ebfb"},
{file = "mkl-2021.4.0-py2.py3-none-win32.whl", hash = "sha256:439c640b269a5668134e3dcbcea4350459c4a8bc46469669b2d67e07e3d330e8"},
{file = "mkl-2021.4.0-py2.py3-none-win_amd64.whl", hash = "sha256:ceef3cafce4c009dd25f65d7ad0d833a0fbadc3d8903991ec92351fe5de1e718"},
]
[package.dependencies]
intel-openmp = "==2021.*"
tbb = "==2021.*"
[[package]]
name = "mlflow-skinny"
version = "2.11.1"
@@ -4215,7 +4439,7 @@ zstd = ["pymongo[zstd] (>=4.5,<5)"]
name = "mpmath"
version = "1.3.0"
description = "Python library for arbitrary-precision floating-point arithmetic"
optional = true
optional = false
python-versions = "*"
files = [
{file = "mpmath-1.3.0-py3-none-any.whl", hash = "sha256:a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c"},
@@ -4639,6 +4863,24 @@ files = [
{file = "nest_asyncio-1.6.0.tar.gz", hash = "sha256:6f172d5449aca15afd6c646851f4e31e02c598d553a667e38cafa997cfec55fe"},
]
[[package]]
name = "networkx"
version = "3.1"
description = "Python package for creating and manipulating graphs and networks"
optional = false
python-versions = ">=3.8"
files = [
{file = "networkx-3.1-py3-none-any.whl", hash = "sha256:4f33f68cb2afcf86f28a45f43efc27a9386b535d567d2127f8f61d51dec58d36"},
{file = "networkx-3.1.tar.gz", hash = "sha256:de346335408f84de0eada6ff9fafafff9bcda11f0a0dfaa931133debb146ab61"},
]
[package.extras]
default = ["matplotlib (>=3.4)", "numpy (>=1.20)", "pandas (>=1.3)", "scipy (>=1.8)"]
developer = ["mypy (>=1.1)", "pre-commit (>=3.2)"]
doc = ["nb2plots (>=0.6)", "numpydoc (>=1.5)", "pillow (>=9.4)", "pydata-sphinx-theme (>=0.13)", "sphinx (>=6.1)", "sphinx-gallery (>=0.12)", "texext (>=0.6.7)"]
extra = ["lxml (>=4.6)", "pydot (>=1.4.2)", "pygraphviz (>=1.10)", "sympy (>=1.10)"]
test = ["codecov (>=2.1)", "pytest (>=7.2)", "pytest-cov (>=4.0)"]
[[package]]
name = "newspaper3k"
version = "0.2.8"
@@ -4665,6 +4907,33 @@ requests = ">=2.10.0"
tinysegmenter = "0.3"
tldextract = ">=2.0.1"
[[package]]
name = "ninja"
version = "1.11.1.1"
description = "Ninja is a small build system with a focus on speed"
optional = false
python-versions = "*"
files = [
{file = "ninja-1.11.1.1-py2.py3-none-macosx_10_9_universal2.macosx_10_9_x86_64.macosx_11_0_arm64.macosx_11_0_universal2.whl", hash = "sha256:376889c76d87b95b5719fdd61dd7db193aa7fd4432e5d52d2e44e4c497bdbbee"},
{file = "ninja-1.11.1.1-py2.py3-none-manylinux1_i686.manylinux_2_5_i686.whl", hash = "sha256:ecf80cf5afd09f14dcceff28cb3f11dc90fb97c999c89307aea435889cb66877"},
{file = "ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:84502ec98f02a037a169c4b0d5d86075eaf6afc55e1879003d6cab51ced2ea4b"},
{file = "ninja-1.11.1.1-py2.py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:73b93c14046447c7c5cc892433d4fae65d6364bec6685411cb97a8bcf815f93a"},
{file = "ninja-1.11.1.1-py2.py3-none-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:18302d96a5467ea98b68e1cae1ae4b4fb2b2a56a82b955193c637557c7273dbd"},
{file = "ninja-1.11.1.1-py2.py3-none-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:aad34a70ef15b12519946c5633344bc775a7656d789d9ed5fdb0d456383716ef"},
{file = "ninja-1.11.1.1-py2.py3-none-musllinux_1_1_aarch64.whl", hash = "sha256:d491fc8d89cdcb416107c349ad1e3a735d4c4af5e1cb8f5f727baca6350fdaea"},
{file = "ninja-1.11.1.1-py2.py3-none-musllinux_1_1_i686.whl", hash = "sha256:7563ce1d9fe6ed5af0b8dd9ab4a214bf4ff1f2f6fd6dc29f480981f0f8b8b249"},
{file = "ninja-1.11.1.1-py2.py3-none-musllinux_1_1_ppc64le.whl", hash = "sha256:9df724344202b83018abb45cb1efc22efd337a1496514e7e6b3b59655be85205"},
{file = "ninja-1.11.1.1-py2.py3-none-musllinux_1_1_s390x.whl", hash = "sha256:3e0f9be5bb20d74d58c66cc1c414c3e6aeb45c35b0d0e41e8d739c2c0d57784f"},
{file = "ninja-1.11.1.1-py2.py3-none-musllinux_1_1_x86_64.whl", hash = "sha256:76482ba746a2618eecf89d5253c0d1e4f1da1270d41e9f54dfbd91831b0f6885"},
{file = "ninja-1.11.1.1-py2.py3-none-win32.whl", hash = "sha256:fa2ba9d74acfdfbfbcf06fad1b8282de8a7a8c481d9dee45c859a8c93fcc1082"},
{file = "ninja-1.11.1.1-py2.py3-none-win_amd64.whl", hash = "sha256:95da904130bfa02ea74ff9c0116b4ad266174fafb1c707aa50212bc7859aebf1"},
{file = "ninja-1.11.1.1-py2.py3-none-win_arm64.whl", hash = "sha256:185e0641bde601e53841525c4196278e9aaf4463758da6dd1e752c0a0f54136a"},
{file = "ninja-1.11.1.1.tar.gz", hash = "sha256:9d793b08dd857e38d0b6ffe9e6b7145d7c485a42dcfea04905ca0cdb6017cc3c"},
]
[package.extras]
test = ["codecov (>=2.0.5)", "coverage (>=4.2)", "flake8 (>=3.0.4)", "pytest (>=4.5.0)", "pytest-cov (>=2.7.1)", "pytest-runner (>=5.1)", "pytest-virtualenv (>=1.7.0)", "virtualenv (>=15.0.3)"]
[[package]]
name = "nltk"
version = "3.8.1"
@@ -4809,6 +5078,148 @@ files = [
{file = "numpy-1.24.4.tar.gz", hash = "sha256:80f5e3a4e498641401868df4208b74581206afbee7cf7b8329daae82676d9463"},
]
[[package]]
name = "nvidia-cublas-cu12"
version = "12.1.3.1"
description = "CUBLAS native runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl", hash = "sha256:ee53ccca76a6fc08fb9701aa95b6ceb242cdaab118c3bb152af4e579af792728"},
{file = "nvidia_cublas_cu12-12.1.3.1-py3-none-win_amd64.whl", hash = "sha256:2b964d60e8cf11b5e1073d179d85fa340c120e99b3067558f3cf98dd69d02906"},
]
[[package]]
name = "nvidia-cuda-cupti-cu12"
version = "12.1.105"
description = "CUDA profiling tools runtime libs."
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl", hash = "sha256:e54fde3983165c624cb79254ae9818a456eb6e87a7fd4d56a2352c24ee542d7e"},
{file = "nvidia_cuda_cupti_cu12-12.1.105-py3-none-win_amd64.whl", hash = "sha256:bea8236d13a0ac7190bd2919c3e8e6ce1e402104276e6f9694479e48bb0eb2a4"},
]
[[package]]
name = "nvidia-cuda-nvrtc-cu12"
version = "12.1.105"
description = "NVRTC native runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl", hash = "sha256:339b385f50c309763ca65456ec75e17bbefcbbf2893f462cb8b90584cd27a1c2"},
{file = "nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-win_amd64.whl", hash = "sha256:0a98a522d9ff138b96c010a65e145dc1b4850e9ecb75a0172371793752fd46ed"},
]
[[package]]
name = "nvidia-cuda-runtime-cu12"
version = "12.1.105"
description = "CUDA Runtime native Libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl", hash = "sha256:6e258468ddf5796e25f1dc591a31029fa317d97a0a94ed93468fc86301d61e40"},
{file = "nvidia_cuda_runtime_cu12-12.1.105-py3-none-win_amd64.whl", hash = "sha256:dfb46ef84d73fababab44cf03e3b83f80700d27ca300e537f85f636fac474344"},
]
[[package]]
name = "nvidia-cudnn-cu12"
version = "8.9.2.26"
description = "cuDNN runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl", hash = "sha256:5ccb288774fdfb07a7e7025ffec286971c06d8d7b4fb162525334616d7629ff9"},
]
[package.dependencies]
nvidia-cublas-cu12 = "*"
[[package]]
name = "nvidia-cufft-cu12"
version = "11.0.2.54"
description = "CUFFT native runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl", hash = "sha256:794e3948a1aa71fd817c3775866943936774d1c14e7628c74f6f7417224cdf56"},
{file = "nvidia_cufft_cu12-11.0.2.54-py3-none-win_amd64.whl", hash = "sha256:d9ac353f78ff89951da4af698f80870b1534ed69993f10a4cf1d96f21357e253"},
]
[[package]]
name = "nvidia-curand-cu12"
version = "10.3.2.106"
description = "CURAND native runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl", hash = "sha256:9d264c5036dde4e64f1de8c50ae753237c12e0b1348738169cd0f8a536c0e1e0"},
{file = "nvidia_curand_cu12-10.3.2.106-py3-none-win_amd64.whl", hash = "sha256:75b6b0c574c0037839121317e17fd01f8a69fd2ef8e25853d826fec30bdba74a"},
]
[[package]]
name = "nvidia-cusolver-cu12"
version = "11.4.5.107"
description = "CUDA solver native runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl", hash = "sha256:8a7ec542f0412294b15072fa7dab71d31334014a69f953004ea7a118206fe0dd"},
{file = "nvidia_cusolver_cu12-11.4.5.107-py3-none-win_amd64.whl", hash = "sha256:74e0c3a24c78612192a74fcd90dd117f1cf21dea4822e66d89e8ea80e3cd2da5"},
]
[package.dependencies]
nvidia-cublas-cu12 = "*"
nvidia-cusparse-cu12 = "*"
nvidia-nvjitlink-cu12 = "*"
[[package]]
name = "nvidia-cusparse-cu12"
version = "12.1.0.106"
description = "CUSPARSE native runtime libraries"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl", hash = "sha256:f3b50f42cf363f86ab21f720998517a659a48131e8d538dc02f8768237bd884c"},
{file = "nvidia_cusparse_cu12-12.1.0.106-py3-none-win_amd64.whl", hash = "sha256:b798237e81b9719373e8fae8d4f091b70a0cf09d9d85c95a557e11df2d8e9a5a"},
]
[package.dependencies]
nvidia-nvjitlink-cu12 = "*"
[[package]]
name = "nvidia-nccl-cu12"
version = "2.20.5"
description = "NVIDIA Collective Communication Library (NCCL) Runtime"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_aarch64.whl", hash = "sha256:1fc150d5c3250b170b29410ba682384b14581db722b2531b0d8d33c595f33d01"},
{file = "nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl", hash = "sha256:057f6bf9685f75215d0c53bf3ac4a10b3e6578351de307abad9e18a99182af56"},
]
[[package]]
name = "nvidia-nvjitlink-cu12"
version = "12.4.127"
description = "Nvidia JIT LTO Library"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl", hash = "sha256:06b3b9b25bf3f8af351d664978ca26a16d2c5127dbd53c0497e28d1fb9611d57"},
{file = "nvidia_nvjitlink_cu12-12.4.127-py3-none-win_amd64.whl", hash = "sha256:fd9020c501d27d135f983c6d3e244b197a7ccad769e34df53a42e276b0e25fa1"},
]
[[package]]
name = "nvidia-nvtx-cu12"
version = "12.1.105"
description = "NVIDIA Tools Extension"
optional = false
python-versions = ">=3"
files = [
{file = "nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl", hash = "sha256:dc21cf308ca5691e7c04d962e213f8a4aa9bbfa23d95412f452254c2caeb09e5"},
{file = "nvidia_nvtx_cu12-12.1.105-py3-none-win_amd64.whl", hash = "sha256:65f4d98982b31b60026e0e6de73fbdfc09d08a96f4656dd3665ca616a11e1e82"},
]
[[package]]
name = "nvidia-riva-client"
version = "2.14.0"
@@ -7434,6 +7845,128 @@ files = [
{file = "ruff-0.1.15.tar.gz", hash = "sha256:f6dfa8c1b21c913c326919056c390966648b680966febcb796cc9d1aaab8564e"},
]
[[package]]
name = "safetensors"
version = "0.4.3"
description = ""
optional = false
python-versions = ">=3.7"
files = [
{file = "safetensors-0.4.3-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:dcf5705cab159ce0130cd56057f5f3425023c407e170bca60b4868048bae64fd"},
{file = "safetensors-0.4.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:bb4f8c5d0358a31e9a08daeebb68f5e161cdd4018855426d3f0c23bb51087055"},
{file = "safetensors-0.4.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:70a5319ef409e7f88686a46607cbc3c428271069d8b770076feaf913664a07ac"},
{file = "safetensors-0.4.3-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:fb9c65bd82f9ef3ce4970dc19ee86be5f6f93d032159acf35e663c6bea02b237"},
{file = "safetensors-0.4.3-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:edb5698a7bc282089f64c96c477846950358a46ede85a1c040e0230344fdde10"},
{file = "safetensors-0.4.3-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:efcc860be094b8d19ac61b452ec635c7acb9afa77beb218b1d7784c6d41fe8ad"},
{file = "safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d88b33980222085dd6001ae2cad87c6068e0991d4f5ccf44975d216db3b57376"},
{file = "safetensors-0.4.3-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:5fc6775529fb9f0ce2266edd3e5d3f10aab068e49f765e11f6f2a63b5367021d"},
{file = "safetensors-0.4.3-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:9c6ad011c1b4e3acff058d6b090f1da8e55a332fbf84695cf3100c649cc452d1"},
{file = "safetensors-0.4.3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:8c496c5401c1b9c46d41a7688e8ff5b0310a3b9bae31ce0f0ae870e1ea2b8caf"},
{file = "safetensors-0.4.3-cp310-none-win32.whl", hash = "sha256:38e2a8666178224a51cca61d3cb4c88704f696eac8f72a49a598a93bbd8a4af9"},
{file = "safetensors-0.4.3-cp310-none-win_amd64.whl", hash = "sha256:393e6e391467d1b2b829c77e47d726f3b9b93630e6a045b1d1fca67dc78bf632"},
{file = "safetensors-0.4.3-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:22f3b5d65e440cec0de8edaa672efa888030802e11c09b3d6203bff60ebff05a"},
{file = "safetensors-0.4.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7c4fa560ebd4522adddb71dcd25d09bf211b5634003f015a4b815b7647d62ebe"},
{file = "safetensors-0.4.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e9afd5358719f1b2cf425fad638fc3c887997d6782da317096877e5b15b2ce93"},
{file = "safetensors-0.4.3-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:d8c5093206ef4b198600ae484230402af6713dab1bd5b8e231905d754022bec7"},
{file = "safetensors-0.4.3-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e0b2104df1579d6ba9052c0ae0e3137c9698b2d85b0645507e6fd1813b70931a"},
{file = "safetensors-0.4.3-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:8cf18888606dad030455d18f6c381720e57fc6a4170ee1966adb7ebc98d4d6a3"},
{file = "safetensors-0.4.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0bf4f9d6323d9f86eef5567eabd88f070691cf031d4c0df27a40d3b4aaee755b"},
{file = "safetensors-0.4.3-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:585c9ae13a205807b63bef8a37994f30c917ff800ab8a1ca9c9b5d73024f97ee"},
{file = "safetensors-0.4.3-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:faefeb3b81bdfb4e5a55b9bbdf3d8d8753f65506e1d67d03f5c851a6c87150e9"},
{file = "safetensors-0.4.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:befdf0167ad626f22f6aac6163477fcefa342224a22f11fdd05abb3995c1783c"},
{file = "safetensors-0.4.3-cp311-none-win32.whl", hash = "sha256:a7cef55929dcbef24af3eb40bedec35d82c3c2fa46338bb13ecf3c5720af8a61"},
{file = "safetensors-0.4.3-cp311-none-win_amd64.whl", hash = "sha256:840b7ac0eff5633e1d053cc9db12fdf56b566e9403b4950b2dc85393d9b88d67"},
{file = "safetensors-0.4.3-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:22d21760dc6ebae42e9c058d75aa9907d9f35e38f896e3c69ba0e7b213033856"},
{file = "safetensors-0.4.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:8d22c1a10dff3f64d0d68abb8298a3fd88ccff79f408a3e15b3e7f637ef5c980"},
{file = "safetensors-0.4.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b1648568667f820b8c48317c7006221dc40aced1869908c187f493838a1362bc"},
{file = "safetensors-0.4.3-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:446e9fe52c051aeab12aac63d1017e0f68a02a92a027b901c4f8e931b24e5397"},
{file = "safetensors-0.4.3-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:fef5d70683643618244a4f5221053567ca3e77c2531e42ad48ae05fae909f542"},
{file = "safetensors-0.4.3-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2a1f4430cc0c9d6afa01214a4b3919d0a029637df8e09675ceef1ca3f0dfa0df"},
{file = "safetensors-0.4.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2d603846a8585b9432a0fd415db1d4c57c0f860eb4aea21f92559ff9902bae4d"},
{file = "safetensors-0.4.3-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:a844cdb5d7cbc22f5f16c7e2a0271170750763c4db08381b7f696dbd2c78a361"},
{file = "safetensors-0.4.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:88887f69f7a00cf02b954cdc3034ffb383b2303bc0ab481d4716e2da51ddc10e"},
{file = "safetensors-0.4.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:ee463219d9ec6c2be1d331ab13a8e0cd50d2f32240a81d498266d77d07b7e71e"},
{file = "safetensors-0.4.3-cp312-none-win32.whl", hash = "sha256:d0dd4a1db09db2dba0f94d15addc7e7cd3a7b0d393aa4c7518c39ae7374623c3"},
{file = "safetensors-0.4.3-cp312-none-win_amd64.whl", hash = "sha256:d14d30c25897b2bf19b6fb5ff7e26cc40006ad53fd4a88244fdf26517d852dd7"},
{file = "safetensors-0.4.3-cp37-cp37m-macosx_10_12_x86_64.whl", hash = "sha256:d1456f814655b224d4bf6e7915c51ce74e389b413be791203092b7ff78c936dd"},
{file = "safetensors-0.4.3-cp37-cp37m-macosx_11_0_arm64.whl", hash = "sha256:455d538aa1aae4a8b279344a08136d3f16334247907b18a5c3c7fa88ef0d3c46"},
{file = "safetensors-0.4.3-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cf476bca34e1340ee3294ef13e2c625833f83d096cfdf69a5342475602004f95"},
{file = "safetensors-0.4.3-cp37-cp37m-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:02ef3a24face643456020536591fbd3c717c5abaa2737ec428ccbbc86dffa7a4"},
{file = "safetensors-0.4.3-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:7de32d0d34b6623bb56ca278f90db081f85fb9c5d327e3c18fd23ac64f465768"},
{file = "safetensors-0.4.3-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2a0deb16a1d3ea90c244ceb42d2c6c276059616be21a19ac7101aa97da448faf"},
{file = "safetensors-0.4.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c59d51f182c729f47e841510b70b967b0752039f79f1de23bcdd86462a9b09ee"},
{file = "safetensors-0.4.3-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1f598b713cc1a4eb31d3b3203557ac308acf21c8f41104cdd74bf640c6e538e3"},
{file = "safetensors-0.4.3-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:5757e4688f20df083e233b47de43845d1adb7e17b6cf7da5f8444416fc53828d"},
{file = "safetensors-0.4.3-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:fe746d03ed8d193674a26105e4f0fe6c726f5bb602ffc695b409eaf02f04763d"},
{file = "safetensors-0.4.3-cp37-none-win32.whl", hash = "sha256:0d5ffc6a80f715c30af253e0e288ad1cd97a3d0086c9c87995e5093ebc075e50"},
{file = "safetensors-0.4.3-cp37-none-win_amd64.whl", hash = "sha256:a11c374eb63a9c16c5ed146457241182f310902bd2a9c18255781bb832b6748b"},
{file = "safetensors-0.4.3-cp38-cp38-macosx_10_12_x86_64.whl", hash = "sha256:b1e31be7945f66be23f4ec1682bb47faa3df34cb89fc68527de6554d3c4258a4"},
{file = "safetensors-0.4.3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:03a4447c784917c9bf01d8f2ac5080bc15c41692202cd5f406afba16629e84d6"},
{file = "safetensors-0.4.3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d244bcafeb1bc06d47cfee71727e775bca88a8efda77a13e7306aae3813fa7e4"},
{file = "safetensors-0.4.3-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:53c4879b9c6bd7cd25d114ee0ef95420e2812e676314300624594940a8d6a91f"},
{file = "safetensors-0.4.3-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:74707624b81f1b7f2b93f5619d4a9f00934d5948005a03f2c1845ffbfff42212"},
{file = "safetensors-0.4.3-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0d52c958dc210265157573f81d34adf54e255bc2b59ded6218500c9b15a750eb"},
{file = "safetensors-0.4.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6f9568f380f513a60139971169c4a358b8731509cc19112369902eddb33faa4d"},
{file = "safetensors-0.4.3-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:0d9cd8e1560dfc514b6d7859247dc6a86ad2f83151a62c577428d5102d872721"},
{file = "safetensors-0.4.3-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:89f9f17b0dacb913ed87d57afbc8aad85ea42c1085bd5de2f20d83d13e9fc4b2"},
{file = "safetensors-0.4.3-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:1139eb436fd201c133d03c81209d39ac57e129f5e74e34bb9ab60f8d9b726270"},
{file = "safetensors-0.4.3-cp38-none-win32.whl", hash = "sha256:d9c289f140a9ae4853fc2236a2ffc9a9f2d5eae0cb673167e0f1b8c18c0961ac"},
{file = "safetensors-0.4.3-cp38-none-win_amd64.whl", hash = "sha256:622afd28968ef3e9786562d352659a37de4481a4070f4ebac883f98c5836563e"},
{file = "safetensors-0.4.3-cp39-cp39-macosx_10_12_x86_64.whl", hash = "sha256:8651c7299cbd8b4161a36cd6a322fa07d39cd23535b144d02f1c1972d0c62f3c"},
{file = "safetensors-0.4.3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:e375d975159ac534c7161269de24ddcd490df2157b55c1a6eeace6cbb56903f0"},
{file = "safetensors-0.4.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:084fc436e317f83f7071fc6a62ca1c513b2103db325cd09952914b50f51cf78f"},
{file = "safetensors-0.4.3-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:41a727a7f5e6ad9f1db6951adee21bbdadc632363d79dc434876369a17de6ad6"},
{file = "safetensors-0.4.3-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e7dbbde64b6c534548696808a0e01276d28ea5773bc9a2dfb97a88cd3dffe3df"},
{file = "safetensors-0.4.3-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bbae3b4b9d997971431c346edbfe6e41e98424a097860ee872721e176040a893"},
{file = "safetensors-0.4.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:01e4b22e3284cd866edeabe4f4d896229495da457229408d2e1e4810c5187121"},
{file = "safetensors-0.4.3-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:0dd37306546b58d3043eb044c8103a02792cc024b51d1dd16bd3dd1f334cb3ed"},
{file = "safetensors-0.4.3-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:d8815b5e1dac85fc534a97fd339e12404db557878c090f90442247e87c8aeaea"},
{file = "safetensors-0.4.3-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:e011cc162503c19f4b1fd63dfcddf73739c7a243a17dac09b78e57a00983ab35"},
{file = "safetensors-0.4.3-cp39-none-win32.whl", hash = "sha256:01feb3089e5932d7e662eda77c3ecc389f97c0883c4a12b5cfdc32b589a811c3"},
{file = "safetensors-0.4.3-cp39-none-win_amd64.whl", hash = "sha256:3f9cdca09052f585e62328c1c2923c70f46814715c795be65f0b93f57ec98a02"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:1b89381517891a7bb7d1405d828b2bf5d75528299f8231e9346b8eba092227f9"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:cd6fff9e56df398abc5866b19a32124815b656613c1c5ec0f9350906fd798aac"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:840caf38d86aa7014fe37ade5d0d84e23dcfbc798b8078015831996ecbc206a3"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f9650713b2cfa9537a2baf7dd9fee458b24a0aaaa6cafcea8bdd5fb2b8efdc34"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:e4119532cd10dba04b423e0f86aecb96cfa5a602238c0aa012f70c3a40c44b50"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:e066e8861eef6387b7c772344d1fe1f9a72800e04ee9a54239d460c400c72aab"},
{file = "safetensors-0.4.3-pp310-pypy310_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:90964917f5b0fa0fa07e9a051fbef100250c04d150b7026ccbf87a34a54012e0"},
{file = "safetensors-0.4.3-pp37-pypy37_pp73-macosx_10_12_x86_64.whl", hash = "sha256:c41e1893d1206aa7054029681778d9a58b3529d4c807002c156d58426c225173"},
{file = "safetensors-0.4.3-pp37-pypy37_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ae7613a119a71a497d012ccc83775c308b9c1dab454806291427f84397d852fd"},
{file = "safetensors-0.4.3-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4f9bac020faba7f5dc481e881b14b6425265feabb5bfc552551d21189c0eddc3"},
{file = "safetensors-0.4.3-pp37-pypy37_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:420a98f593ff9930f5822560d14c395ccbc57342ddff3b463bc0b3d6b1951550"},
{file = "safetensors-0.4.3-pp37-pypy37_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:f5e6883af9a68c0028f70a4c19d5a6ab6238a379be36ad300a22318316c00cb0"},
{file = "safetensors-0.4.3-pp37-pypy37_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:cdd0a3b5da66e7f377474599814dbf5cbf135ff059cc73694de129b58a5e8a2c"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-macosx_10_12_x86_64.whl", hash = "sha256:9bfb92f82574d9e58401d79c70c716985dc049b635fef6eecbb024c79b2c46ad"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-macosx_11_0_arm64.whl", hash = "sha256:3615a96dd2dcc30eb66d82bc76cda2565f4f7bfa89fcb0e31ba3cea8a1a9ecbb"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:868ad1b6fc41209ab6bd12f63923e8baeb1a086814cb2e81a65ed3d497e0cf8f"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b7ffba80aa49bd09195145a7fd233a7781173b422eeb995096f2b30591639517"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:c0acbe31340ab150423347e5b9cc595867d814244ac14218932a5cf1dd38eb39"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:19bbdf95de2cf64f25cd614c5236c8b06eb2cfa47cbf64311f4b5d80224623a3"},
{file = "safetensors-0.4.3-pp38-pypy38_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:b852e47eb08475c2c1bd8131207b405793bfc20d6f45aff893d3baaad449ed14"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-macosx_10_12_x86_64.whl", hash = "sha256:5d07cbca5b99babb692d76d8151bec46f461f8ad8daafbfd96b2fca40cadae65"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-macosx_11_0_arm64.whl", hash = "sha256:1ab6527a20586d94291c96e00a668fa03f86189b8a9defa2cdd34a1a01acc7d5"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:02318f01e332cc23ffb4f6716e05a492c5f18b1d13e343c49265149396284a44"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ec4b52ce9a396260eb9731eb6aea41a7320de22ed73a1042c2230af0212758ce"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:018b691383026a2436a22b648873ed11444a364324e7088b99cd2503dd828400"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:309b10dbcab63269ecbf0e2ca10ce59223bb756ca5d431ce9c9eeabd446569da"},
{file = "safetensors-0.4.3-pp39-pypy39_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:b277482120df46e27a58082df06a15aebda4481e30a1c21eefd0921ae7e03f65"},
{file = "safetensors-0.4.3.tar.gz", hash = "sha256:2f85fc50c4e07a21e95c24e07460fe6f7e2859d0ce88092838352b798ce711c2"},
]
[package.extras]
all = ["safetensors[jax]", "safetensors[numpy]", "safetensors[paddlepaddle]", "safetensors[pinned-tf]", "safetensors[quality]", "safetensors[testing]", "safetensors[torch]"]
dev = ["safetensors[all]"]
jax = ["flax (>=0.6.3)", "jax (>=0.3.25)", "jaxlib (>=0.3.25)", "safetensors[numpy]"]
mlx = ["mlx (>=0.0.9)"]
numpy = ["numpy (>=1.21.6)"]
paddlepaddle = ["paddlepaddle (>=2.4.1)", "safetensors[numpy]"]
pinned-tf = ["safetensors[numpy]", "tensorflow (==2.11.0)"]
quality = ["black (==22.3)", "click (==8.0.4)", "flake8 (>=3.8.3)", "isort (>=5.5.4)"]
tensorflow = ["safetensors[numpy]", "tensorflow (>=2.11.0)"]
testing = ["h5py (>=3.7.0)", "huggingface-hub (>=0.12.1)", "hypothesis (>=6.70.2)", "pytest (>=7.2.0)", "pytest-benchmark (>=4.0.0)", "safetensors[numpy]", "setuptools-rust (>=1.5.2)"]
torch = ["safetensors[numpy]", "torch (>=1.10)"]
[[package]]
name = "scikit-learn"
version = "1.3.2"
@ -7535,6 +8068,68 @@ nativelib = ["pyobjc-framework-Cocoa", "pywin32"]
objc = ["pyobjc-framework-Cocoa"]
win32 = ["pywin32"]
[[package]]
name = "sentencepiece"
version = "0.2.0"
description = "SentencePiece python wrapper"
optional = false
python-versions = "*"
files = [
{file = "sentencepiece-0.2.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:188779e1298a1c8b8253c7d3ad729cb0a9891e5cef5e5d07ce4592c54869e227"},
{file = "sentencepiece-0.2.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:bed9cf85b296fa2b76fc2547b9cbb691a523864cebaee86304c43a7b4cb1b452"},
{file = "sentencepiece-0.2.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:d7b67e724bead13f18db6e1d10b6bbdc454af574d70efbb36f27d90387be1ca3"},
{file = "sentencepiece-0.2.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2fde4b08cfe237be4484c6c7c2e2c75fb862cfeab6bd5449ce4caeafd97b767a"},
{file = "sentencepiece-0.2.0-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4c378492056202d1c48a4979650981635fd97875a00eabb1f00c6a236b013b5e"},
{file = "sentencepiece-0.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1380ce6540a368de2ef6d7e6ba14ba8f3258df650d39ba7d833b79ee68a52040"},
{file = "sentencepiece-0.2.0-cp310-cp310-win32.whl", hash = "sha256:a1151d6a6dd4b43e552394aed0edfe9292820272f0194bd56c7c1660a0c06c3d"},
{file = "sentencepiece-0.2.0-cp310-cp310-win_amd64.whl", hash = "sha256:d490142b0521ef22bc1085f061d922a2a6666175bb6b42e588ff95c0db6819b2"},
{file = "sentencepiece-0.2.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:17982700c4f6dbb55fa3594f3d7e5dd1c8659a274af3738e33c987d2a27c9d5c"},
{file = "sentencepiece-0.2.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:7c867012c0e8bcd5bdad0f791609101cb5c66acb303ab3270218d6debc68a65e"},
{file = "sentencepiece-0.2.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7fd6071249c74f779c5b27183295b9202f8dedb68034e716784364443879eaa6"},
{file = "sentencepiece-0.2.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:27f90c55a65013cbb8f4d7aab0599bf925cde4adc67ae43a0d323677b5a1c6cb"},
{file = "sentencepiece-0.2.0-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:b293734059ef656dcd65be62ff771507bea8fed0a711b6733976e1ed3add4553"},
{file = "sentencepiece-0.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e58b47f933aca74c6a60a79dcb21d5b9e47416256c795c2d58d55cec27f9551d"},
{file = "sentencepiece-0.2.0-cp311-cp311-win32.whl", hash = "sha256:c581258cf346b327c62c4f1cebd32691826306f6a41d8c4bec43b010dee08e75"},
{file = "sentencepiece-0.2.0-cp311-cp311-win_amd64.whl", hash = "sha256:0993dbc665f4113017892f1b87c3904a44d0640eda510abcacdfb07f74286d36"},
{file = "sentencepiece-0.2.0-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:ea5f536e32ea8ec96086ee00d7a4a131ce583a1b18d130711707c10e69601cb2"},
{file = "sentencepiece-0.2.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:d0cb51f53b6aae3c36bafe41e86167c71af8370a039f542c43b0cce5ef24a68c"},
{file = "sentencepiece-0.2.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:3212121805afc58d8b00ab4e7dd1f8f76c203ddb9dc94aa4079618a31cf5da0f"},
{file = "sentencepiece-0.2.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2a3149e3066c2a75e0d68a43eb632d7ae728c7925b517f4c05c40f6f7280ce08"},
{file = "sentencepiece-0.2.0-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:632f3594d3e7ac8b367bca204cb3fd05a01d5b21455acd097ea4c0e30e2f63d7"},
{file = "sentencepiece-0.2.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f295105c6bdbb05bd5e1b0cafbd78ff95036f5d3641e7949455a3f4e5e7c3109"},
{file = "sentencepiece-0.2.0-cp312-cp312-win32.whl", hash = "sha256:fb89f811e5efd18bab141afc3fea3de141c3f69f3fe9e898f710ae7fe3aab251"},
{file = "sentencepiece-0.2.0-cp312-cp312-win_amd64.whl", hash = "sha256:7a673a72aab81fef5ebe755c6e0cc60087d1f3a4700835d40537183c1703a45f"},
{file = "sentencepiece-0.2.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:4547683f330289ec4f093027bfeb87f9ef023b2eb6f879fdc4a8187c7e0ffb90"},
{file = "sentencepiece-0.2.0-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7cd6175f7eaec7142d2bf6f6597ce7db4c9ac89acf93fcdb17410c3a8b781eeb"},
{file = "sentencepiece-0.2.0-cp36-cp36m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:859ba1acde782609a0910a26a60e16c191a82bf39b5621107552c0cd79fad00f"},
{file = "sentencepiece-0.2.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bcbbef6cc277f8f18f36959e305f10b1c620442d75addc79c21d7073ae581b50"},
{file = "sentencepiece-0.2.0-cp36-cp36m-win32.whl", hash = "sha256:536b934e244829e3fe6c4f198652cd82da48adb9aa145c9f00889542726dee3d"},
{file = "sentencepiece-0.2.0-cp36-cp36m-win_amd64.whl", hash = "sha256:0a91aaa3c769b52440df56fafda683b3aa48e3f2169cf7ee5b8c8454a7f3ae9b"},
{file = "sentencepiece-0.2.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:787e480ca4c1d08c9985a7eb1eae4345c107729c99e9b5a9a00f2575fc7d4b4b"},
{file = "sentencepiece-0.2.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f4d158189eb2ecffea3a51edf6d25e110b3678ec47f1a40f2d541eafbd8f6250"},
{file = "sentencepiece-0.2.0-cp37-cp37m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d1e5ca43013e8935f25457a4fca47e315780172c3e821b4b13a890668911c792"},
{file = "sentencepiece-0.2.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7140d9e5a74a0908493bb4a13f1f16a401297bd755ada4c707e842fbf6f0f5bf"},
{file = "sentencepiece-0.2.0-cp37-cp37m-win32.whl", hash = "sha256:6cf333625234f247ab357b0bd9836638405ea9082e1543d5b8408f014979dcbf"},
{file = "sentencepiece-0.2.0-cp37-cp37m-win_amd64.whl", hash = "sha256:ff88712338b01031910e8e61e7239aff3ce8869ee31a47df63cb38aadd591bea"},
{file = "sentencepiece-0.2.0-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:20813a68d4c221b1849c62c30e1281ea81687894d894b8d4a0f4677d9311e0f5"},
{file = "sentencepiece-0.2.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:926ef920ae2e8182db31d3f5d081ada57804e3e1d3a8c4ef8b117f9d9fb5a945"},
{file = "sentencepiece-0.2.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:89f65f69636b7e9c015b79dff9c9985a9bc7d19ded6f79ef9f1ec920fdd73ecf"},
{file = "sentencepiece-0.2.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0f67eae0dbe6f2d7d6ba50a354623d787c99965f068b81e145d53240198021b0"},
{file = "sentencepiece-0.2.0-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:98501e075f35dd1a1d5a20f65be26839fcb1938752ec61539af008a5aa6f510b"},
{file = "sentencepiece-0.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e3d1d2cc4882e8d6a1adf9d5927d7716f80617fc693385661caff21888972269"},
{file = "sentencepiece-0.2.0-cp38-cp38-win32.whl", hash = "sha256:b99a308a2e5e569031ab164b74e6fab0b6f37dfb493c32f7816225f4d411a6dd"},
{file = "sentencepiece-0.2.0-cp38-cp38-win_amd64.whl", hash = "sha256:cdb701eec783d3ec86b7cd4c763adad8eaf6b46db37ee1c36e5e6c44b3fe1b5f"},
{file = "sentencepiece-0.2.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:1e0f9c4d0a6b0af59b613175f019916e28ade076e21242fd5be24340d8a2f64a"},
{file = "sentencepiece-0.2.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:298f21cc1366eb60311aedba3169d30f885c363ddbf44214b0a587d2908141ad"},
{file = "sentencepiece-0.2.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:3f1ec95aa1e5dab11f37ac7eff190493fd87770f7a8b81ebc9dd768d1a3c8704"},
{file = "sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7b06b70af54daa4b4904cbb90b4eb6d35c9f3252fdc86c9c32d5afd4d30118d8"},
{file = "sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:22e37bac44dd6603388cb598c64ff7a76e41ca774646f21c23aadfbf5a2228ab"},
{file = "sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0461324897735512a32d222e3d886e24ad6a499761952b6bda2a9ee6e4313ea5"},
{file = "sentencepiece-0.2.0-cp39-cp39-win32.whl", hash = "sha256:38aed822fb76435fa1f12185f10465a94ab9e51d5e8a9159e9a540ce926f0ffd"},
{file = "sentencepiece-0.2.0-cp39-cp39-win_amd64.whl", hash = "sha256:d8cf876516548b5a1d6ac4745d8b554f5c07891d55da557925e5c13ff0b4e6ad"},
{file = "sentencepiece-0.2.0.tar.gz", hash = "sha256:a52c19171daaf2e697dc6cbe67684e0fa341b1248966f6aebb541de654d15843"},
]
[[package]]
name = "setuptools"
version = "67.8.0"
@ -7856,7 +8451,7 @@ snowflake = ["snowflake-connector-python (>=2.8.0)", "snowflake-snowpark-python
name = "sympy"
version = "1.12"
description = "Computer algebra system (CAS) in Python"
optional = true
optional = false
python-versions = ">=3.8"
files = [
{file = "sympy-1.12-py3-none-any.whl", hash = "sha256:c3588cd4295d0c0f603d0f2ae780587e64e2efeedb3521e46b9bb1d08d184fa5"},
@ -7894,6 +8489,19 @@ files = [
[package.extras]
widechars = ["wcwidth"]
[[package]]
name = "tbb"
version = "2021.12.0"
description = "Intel® oneAPI Threading Building Blocks (oneTBB)"
optional = false
python-versions = "*"
files = [
{file = "tbb-2021.12.0-py2.py3-none-manylinux1_i686.whl", hash = "sha256:f2cc9a7f8ababaa506cbff796ce97c3bf91062ba521e15054394f773375d81d8"},
{file = "tbb-2021.12.0-py2.py3-none-manylinux1_x86_64.whl", hash = "sha256:a925e9a7c77d3a46ae31c34b0bb7f801c4118e857d137b68f68a8e458fcf2bd7"},
{file = "tbb-2021.12.0-py3-none-win32.whl", hash = "sha256:b1725b30c174048edc8be70bd43bb95473f396ce895d91151a474d0fa9f450a8"},
{file = "tbb-2021.12.0-py3-none-win_amd64.whl", hash = "sha256:fc2772d850229f2f3df85f1109c4844c495a2db7433d38200959ee9265b34789"},
]
[[package]]
name = "telethon"
version = "1.34.0"
@ -8251,6 +8859,60 @@ files = [
{file = "toolz-0.12.1.tar.gz", hash = "sha256:ecca342664893f177a13dac0e6b41cbd8ac25a358e5f215316d43e2100224f4d"},
]
[[package]]
name = "torch"
version = "2.3.0"
description = "Tensors and Dynamic neural networks in Python with strong GPU acceleration"
optional = false
python-versions = ">=3.8.0"
files = [
{file = "torch-2.3.0-cp310-cp310-manylinux1_x86_64.whl", hash = "sha256:d8ea5a465dbfd8501f33c937d1f693176c9aef9d1c1b0ca1d44ed7b0a18c52ac"},
{file = "torch-2.3.0-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:09c81c5859a5b819956c6925a405ef1cdda393c9d8a01ce3851453f699d3358c"},
{file = "torch-2.3.0-cp310-cp310-win_amd64.whl", hash = "sha256:1bf023aa20902586f614f7682fedfa463e773e26c58820b74158a72470259459"},
{file = "torch-2.3.0-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:758ef938de87a2653bba74b91f703458c15569f1562bf4b6c63c62d9c5a0c1f5"},
{file = "torch-2.3.0-cp311-cp311-manylinux1_x86_64.whl", hash = "sha256:493d54ee2f9df100b5ce1d18c96dbb8d14908721f76351e908c9d2622773a788"},
{file = "torch-2.3.0-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:bce43af735c3da16cc14c7de2be7ad038e2fbf75654c2e274e575c6c05772ace"},
{file = "torch-2.3.0-cp311-cp311-win_amd64.whl", hash = "sha256:729804e97b7cf19ae9ab4181f91f5e612af07956f35c8b2c8e9d9f3596a8e877"},
{file = "torch-2.3.0-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:d24e328226d8e2af7cf80fcb1d2f1d108e0de32777fab4aaa2b37b9765d8be73"},
{file = "torch-2.3.0-cp312-cp312-manylinux1_x86_64.whl", hash = "sha256:b0de2bdc0486ea7b14fc47ff805172df44e421a7318b7c4d92ef589a75d27410"},
{file = "torch-2.3.0-cp312-cp312-manylinux2014_aarch64.whl", hash = "sha256:a306c87a3eead1ed47457822c01dfbd459fe2920f2d38cbdf90de18f23f72542"},
{file = "torch-2.3.0-cp312-cp312-win_amd64.whl", hash = "sha256:f9b98bf1a3c8af2d4c41f0bf1433920900896c446d1ddc128290ff146d1eb4bd"},
{file = "torch-2.3.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:dca986214267b34065a79000cee54232e62b41dff1ec2cab9abc3fc8b3dee0ad"},
{file = "torch-2.3.0-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:20572f426965dd8a04e92a473d7e445fa579e09943cc0354f3e6fef6130ce061"},
{file = "torch-2.3.0-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:e65ba85ae292909cde0dde6369826d51165a3fc8823dc1854cd9432d7f79b932"},
{file = "torch-2.3.0-cp38-cp38-win_amd64.whl", hash = "sha256:5515503a193781fd1b3f5c474e89c9dfa2faaa782b2795cc4a7ab7e67de923f6"},
{file = "torch-2.3.0-cp38-none-macosx_11_0_arm64.whl", hash = "sha256:6ae9f64b09516baa4ef890af0672dc981c20b1f0d829ce115d4420a247e88fba"},
{file = "torch-2.3.0-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:cd0dc498b961ab19cb3f8dbf0c6c50e244f2f37dbfa05754ab44ea057c944ef9"},
{file = "torch-2.3.0-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:e05f836559251e4096f3786ee99f4a8cbe67bc7fbedba8ad5e799681e47c5e80"},
{file = "torch-2.3.0-cp39-cp39-win_amd64.whl", hash = "sha256:4fb27b35dbb32303c2927da86e27b54a92209ddfb7234afb1949ea2b3effffea"},
{file = "torch-2.3.0-cp39-none-macosx_11_0_arm64.whl", hash = "sha256:760f8bedff506ce9e6e103498f9b1e9e15809e008368594c3a66bf74a8a51380"},
]
[package.dependencies]
filelock = "*"
fsspec = "*"
jinja2 = "*"
mkl = {version = ">=2021.1.1,<=2021.4.0", markers = "platform_system == \"Windows\""}
networkx = "*"
nvidia-cublas-cu12 = {version = "12.1.3.1", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cuda-cupti-cu12 = {version = "12.1.105", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cuda-nvrtc-cu12 = {version = "12.1.105", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cuda-runtime-cu12 = {version = "12.1.105", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cudnn-cu12 = {version = "8.9.2.26", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cufft-cu12 = {version = "11.0.2.54", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-curand-cu12 = {version = "10.3.2.106", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cusolver-cu12 = {version = "11.4.5.107", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-cusparse-cu12 = {version = "12.1.0.106", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-nccl-cu12 = {version = "2.20.5", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-nvtx-cu12 = {version = "12.1.105", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
sympy = "*"
triton = {version = "2.3.0", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\" and python_version < \"3.12\""}
typing-extensions = ">=4.8.0"
[package.extras]
opt-einsum = ["opt-einsum (>=3.3)"]
optree = ["optree (>=0.9.1)"]
[[package]]
name = "tornado"
version = "6.4"
@ -8465,6 +9127,29 @@ files = [
[package.dependencies]
tree-sitter = "*"
[[package]]
name = "triton"
version = "2.3.0"
description = "A language and compiler for custom Deep Learning operations"
optional = false
python-versions = "*"
files = [
{file = "triton-2.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5ce4b8ff70c48e47274c66f269cce8861cf1dc347ceeb7a67414ca151b1822d8"},
{file = "triton-2.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3c3d9607f85103afdb279938fc1dd2a66e4f5999a58eb48a346bd42738f986dd"},
{file = "triton-2.3.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:218d742e67480d9581bafb73ed598416cc8a56f6316152e5562ee65e33de01c0"},
{file = "triton-2.3.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:381ec6b3dac06922d3e4099cfc943ef032893b25415de295e82b1a82b0359d2c"},
{file = "triton-2.3.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:038e06a09c06a164fef9c48de3af1e13a63dc1ba3c792871e61a8e79720ea440"},
{file = "triton-2.3.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6d8f636e0341ac348899a47a057c3daea99ea7db31528a225a3ba4ded28ccc65"},
]
[package.dependencies]
filelock = "*"
[package.extras]
build = ["cmake (>=3.20)", "lit"]
tests = ["autopep8", "flake8", "isort", "numpy", "pytest", "scipy (>=1.7.1)", "torch"]
tutorials = ["matplotlib", "pandas", "tabulate", "torch"]
[[package]]
name = "typer"
version = "0.9.0"
@ -8884,7 +9569,7 @@ test = ["websockets"]
name = "websockets"
version = "12.0"
description = "An implementation of the WebSocket Protocol (RFC 6455 & 7692)"
optional = true
optional = false
python-versions = ">=3.8"
files = [
{file = "websockets-12.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:d554236b2a2006e0ce16315c16eaa0d628dab009c33b63ea03f41c6107958374"},
@ -9321,4 +10006,4 @@ extended-testing = ["aiosqlite", "aleph-alpha-client", "anthropic", "arxiv", "as
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "b066cbf8a1f02cae88c6c099e916d805fe6eb8685fd15c093d66cf52ea363fa5"
content-hash = "ee08d5381cb754a0246ce6e4fb2547888d25ee0b096d8b8944a29ce120bb4473"

@ -161,6 +161,7 @@ anthropic = "^0.3.11"
langchain-core = { path = "../core", develop = true }
fireworks-ai = "^0.9.0"
vdms = "^0.0.20"
exllamav2 = "^0.0.18"
[tool.poetry.group.lint]
optional = true
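With `exllamav2` added to the extended-testing dependencies above, the new community LLM can be exercised roughly as follows. This is a sketch, not the canonical usage: the constructor arguments are assumptions drawn from the integration notebook, and the model path is a hypothetical placeholder for a locally downloaded GPTQ or EXL2 model.

```python
from exllamav2.generator import ExLlamaV2Sampler
from langchain_community.llms.exllamav2 import ExLlamaV2

# Sampling settings come from the exllamav2 library itself.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.85

llm = ExLlamaV2(
    model_path="/path/to/Mistral-7B-Instruct-v0.2-GPTQ",  # hypothetical local path
    settings=settings,
    max_new_tokens=150,  # assumed parameter name, per the integration notebook
    streaming=False,
)
print(llm.invoke("What can you do with a locally quantized model?"))
```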

@ -0,0 +1,28 @@
"""Test sambanova API wrapper.
In order to run this test, you need to have an sambaverse api key,
and a sambaverse base url, project id, endpoint id, and api key.
You'll then need to set SAMBAVERSE_API_KEY, SAMBASTUDIO_BASE_URL,
SAMBASTUDIO_PROJECT_ID, SAMBASTUDIO_ENDPOINT_ID, and SAMBASTUDIO_API_KEY
environment variables.
"""
from langchain_community.llms.sambanova import SambaStudio, Sambaverse


def test_sambaverse_call() -> None:
"""Test simple non-streaming call to sambaverse."""
llm = Sambaverse(
sambaverse_model_name="Meta/llama-2-7b-chat-hf",
model_kwargs={"select_expert": "llama-2-7b-chat-hf"},
)
output = llm.invoke("What is LangChain")
assert output
    assert isinstance(output, str)


def test_sambastudio_call() -> None:
    """Test simple non-streaming call to sambastudio."""
llm = SambaStudio()
output = llm.invoke("What is LangChain")
assert output
assert isinstance(output, str)
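For reference, the environment these tests expect can be set up along these lines. The placeholder values are hypothetical; only the variable names and the `Sambaverse` call come from the test above.

```python
import os

# Hypothetical placeholders; real values come from your Sambaverse /
# SambaStudio account.
os.environ["SAMBAVERSE_API_KEY"] = "<sambaverse-api-key>"
os.environ["SAMBASTUDIO_BASE_URL"] = "<sambastudio-base-url>"
os.environ["SAMBASTUDIO_PROJECT_ID"] = "<project-id>"
os.environ["SAMBASTUDIO_ENDPOINT_ID"] = "<endpoint-id>"
os.environ["SAMBASTUDIO_API_KEY"] = "<sambastudio-api-key>"

from langchain_community.llms.sambanova import Sambaverse

llm = Sambaverse(
    sambaverse_model_name="Meta/llama-2-7b-chat-hf",
    model_kwargs={"select_expert": "llama-2-7b-chat-hf"},
)
print(llm.invoke("What is LangChain"))
```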

@ -210,9 +210,9 @@ class ChildTool(BaseTool):
You can use these to eg identify a specific instance of a tool with its use case.
"""
handle_tool_error: Optional[Union[bool, str, Callable[[ToolException], str]]] = (
False
)
handle_tool_error: Optional[
Union[bool, str, Callable[[ToolException], str]]
] = False
"""Handle the content of the ToolException thrown."""
handle_validation_error: Optional[
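The `handle_tool_error` field being reformatted in this hunk controls what a tool returns when its implementation raises `ToolException`. A minimal sketch of the behavior, assuming the standard `BaseTool` API (the `FlakyTool` class is hypothetical):

```python
from langchain_core.tools import BaseTool, ToolException


class FlakyTool(BaseTool):
    name: str = "flaky"
    description: str = "Always fails, to demonstrate error handling."

    def _run(self, query: str) -> str:
        raise ToolException("upstream service unavailable")


# With handle_tool_error=True, the exception message is returned as the
# tool's output instead of being raised to the caller.
tool = FlakyTool(handle_tool_error=True)
print(tool.run("anything"))  # -> "upstream service unavailable"
```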
@ -838,7 +838,7 @@ class StructuredTool(BaseTool):
# Description example:
# search_api(query: str) - Searches the API for the query.
sig = signature(source_function)
description = f"{name}{sig} - {description_.strip()}"
description_ = f"{name}{sig} - {description_.strip()}"
_args_schema = args_schema
if _args_schema is None and infer_schema:
# schema name is appended within function

@ -3,6 +3,7 @@
import asyncio
import json
import sys
import textwrap
from datetime import datetime
from enum import Enum
from functools import partial
@ -333,7 +334,7 @@ def test_structured_tool_from_function_docstring() -> None:
prefix = "foo(bar: int, baz: str) -> str - "
assert foo.__doc__ is not None
assert structured_tool.description == prefix + foo.__doc__.strip()
assert structured_tool.description == prefix + textwrap.dedent(foo.__doc__.strip())
def test_structured_tool_from_function_docstring_complex_args() -> None:
@ -366,7 +367,7 @@ def test_structured_tool_from_function_docstring_complex_args() -> None:
prefix = "foo(bar: int, baz: List[str]) -> str - "
assert foo.__doc__ is not None
assert structured_tool.description == prefix + foo.__doc__.strip()
assert structured_tool.description == prefix + textwrap.dedent(foo.__doc__).strip()
def test_structured_tool_lambda_multi_args_schema() -> None:
@ -701,7 +702,7 @@ def test_structured_tool_from_function() -> None:
prefix = "foo(bar: int, baz: str) -> str - "
assert foo.__doc__ is not None
assert structured_tool.description == prefix + foo.__doc__.strip()
assert structured_tool.description == prefix + textwrap.dedent(foo.__doc__.strip())
def test_validation_error_handling_bool() -> None:
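Since `textwrap.dedent` is the whole point of these assertion changes, a standalone illustration of what it does (not part of the test suite): it strips the whitespace prefix common to every non-empty line, which is how a docstring loses the indentation it inherits from its position in the source file.

```python
import textwrap

raw = "    line one\n        indented further\n    line two\n"
print(textwrap.dedent(raw))
# line one
#     indented further
# line two
```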

@ -0,0 +1,59 @@
from langchain_community.cache import OpenSearchSemanticCache
from langchain_core.outputs import Generation
from langchain.globals import get_llm_cache, set_llm_cache
from tests.integration_tests.cache.fake_embeddings import (
FakeEmbeddings,
)
from tests.unit_tests.llms.fake_llm import FakeLLM

DEFAULT_OPENSEARCH_URL = "http://localhost:9200"


def test_opensearch_semantic_cache() -> None:
"""Test opensearch semantic cache functionality."""
set_llm_cache(
OpenSearchSemanticCache(
embedding=FakeEmbeddings(),
opensearch_url=DEFAULT_OPENSEARCH_URL,
score_threshold=0.0,
)
)
llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz")]
get_llm_cache().clear(llm_string=llm_string)
output = get_llm_cache().lookup("bar", llm_string)
    assert output != [Generation(text="fizz")]


def test_opensearch_semantic_cache_multi() -> None:
    """Test opensearch semantic cache with multiple cached generations."""
set_llm_cache(
OpenSearchSemanticCache(
embedding=FakeEmbeddings(),
opensearch_url=DEFAULT_OPENSEARCH_URL,
score_threshold=0.0,
)
)
llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update(
"foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
)
# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
# clear the cache
get_llm_cache().clear(llm_string=llm_string)
output = get_llm_cache().lookup("bar", llm_string)
assert output != [Generation(text="fizz"), Generation(text="Buzz")]
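Outside the test harness, wiring the cache into an application looks roughly like this. A sketch assuming a local OpenSearch node on the default port; `OpenAIEmbeddings` is an assumption here, and any `Embeddings` implementation should work in its place.

```python
from langchain_community.cache import OpenSearchSemanticCache
from langchain_openai import OpenAIEmbeddings  # assumption: any Embeddings works
from langchain.globals import set_llm_cache

set_llm_cache(
    OpenSearchSemanticCache(
        embedding=OpenAIEmbeddings(),
        opensearch_url="http://localhost:9200",
        score_threshold=0.2,  # similarity required before a cached answer is reused
    )
)
# Subsequent LLM calls with semantically similar prompts are now served
# from the OpenSearch-backed cache.
```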