{ "cells": [ { "cell_type": "markdown", "id": "ab66dd43", "metadata": {}, "source": [ "# SVM\n", "\n", ">[Support vector machines (SVMs)](https://scikit-learn.org/stable/modules/svm.html#support-vector-machines) are a set of supervised learning methods used for classification, regression and outliers detection.\n", "\n", "This notebook shows how to use a retriever that uses an `SVM` under the hood, implemented with the `scikit-learn` package.\n", "\n", "Largely based on https://github.com/karpathy/randomfun/blob/master/knn_vs_svm.html" ] }, { "cell_type": "code", "execution_count": null, "id": "a801b57c", "metadata": { "tags": [] }, "outputs": [], "source": [ "#!pip install scikit-learn" ] }, { "cell_type": "code", "execution_count": null, "id": "05b33419-fd3e-49c6-bae3-f20195d09c0c", "metadata": { "tags": [] }, "outputs": [], "source": [ "#!pip install lark" ] }, { "cell_type": "markdown", "id": "cc5e2d59-9510-40b2-a810-74af28e5a5e8", "metadata": { "tags": [] }, "source": [ "We want to use `OpenAIEmbeddings`, so we need an OpenAI API key." 
] }, { "cell_type": "code", "execution_count": 4, "id": "f9936d67-0471-4a82-954b-033c46ddb303", "metadata": { "tags": [] }, "outputs": [ { "name": "stdin", "output_type": "stream", "text": [ "OpenAI API Key: ········\n" ] } ], "source": [ "import os\n", "import getpass\n", "\n", "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")" ] }, { "cell_type": "code", "execution_count": 5, "id": "393ac030", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain.retrievers import SVMRetriever\n", "from langchain.embeddings import OpenAIEmbeddings" ] }, { "cell_type": "markdown", "id": "aaf80e7f", "metadata": {}, "source": [ "## Create New Retriever with Texts" ] }, { "cell_type": "code", "execution_count": 6, "id": "98b1c017", "metadata": { "tags": [] }, "outputs": [], "source": [ "retriever = SVMRetriever.from_texts(\n", " [\"foo\", \"bar\", \"world\", \"hello\", \"foo bar\"], OpenAIEmbeddings()\n", ")" ] }, { "cell_type": "markdown", "id": "08437fa2", "metadata": {}, "source": [ "## Use Retriever\n", "\n", "We can now use the retriever!" 
] }, { "cell_type": "code", "execution_count": 9, "id": "c0455218", "metadata": { "tags": [] }, "outputs": [], "source": [ "result = retriever.get_relevant_documents(\"foo\")" ] }, { "cell_type": "code", "execution_count": 10, "id": "7dfa5c29", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[Document(page_content='foo', metadata={}),\n", " Document(page_content='foo bar', metadata={}),\n", " Document(page_content='hello', metadata={}),\n", " Document(page_content='world', metadata={})]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result" ] }, { "cell_type": "code", "execution_count": null, "id": "74bd9256", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }