[searx-search] move module under utilities

- Make the module loadable the same way as other utilities
searx-api-pre
blob42 1 year ago
parent c19fe2b678
commit 73ec695f9a

@ -1,197 +1,198 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "jukit_cell_id": "DUXgyWySl5"
   },
   "source": [
    "# SearxNG Search API\n",
    "\n",
    "This notebook goes over how to use a self-hosted SearxNG search API to search the web."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "OIHXztO2UT"
   },
   "source": [
    "from langchain.utilities import SearxSearchWrapper"
   ],
   "outputs": [],
   "execution_count": null
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "4SzT9eDMjt"
   },
   "source": [
    "search = SearxSearchWrapper(searx_host=\"http://127.0.0.1:8888\")"
   ],
   "outputs": [],
   "execution_count": null
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "gGM9PVQX6m"
   },
   "source": [
    "search.run(\"Who is the current president of the united states of america?\")"
   ],
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'In all, 45 individuals have served 46 presidencies spanning 58 full four-year terms. Joe Biden is the 46th and current president of the United States, having assumed office on January 20, 2021.'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "execution_count": 1
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jukit_cell_id": "jCSkIlQDUK"
   },
   "source": [
    "For some engines, if a direct `answer` is available, the wrapper will return the answer instead of the full search results. You can use the `results` method of the wrapper if you want to obtain all the results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jukit_cell_id": "OHyurqUPbS"
   },
   "source": [
    "# Custom Parameters\n",
    "\n",
    "SearxNG supports up to [139 search engines](https://docs.searxng.org/admin/engines/configured_engines.html#configured-engines). You can also customize the Searx wrapper with arbitrary named parameters that will be passed to the Searx search API. The examples below make more interesting use of custom search parameters from the Searx search API."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jukit_cell_id": "n1B2AyLKi4"
   },
   "source": [
    "In this example we use the `engines` parameter to query Wikipedia."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "UTEdJ03LqA"
   },
   "source": [
    "search = SearxSearchWrapper(searx_host=\"http://127.0.0.1:8888\", k=5)  # k is the max number of items"
   ],
   "outputs": [],
   "execution_count": null
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "3FyQ6yHI8K"
   },
   "source": [
    "search.run(\"large language model \", engines='wiki')"
   ],
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Large language models (LLMs) represent a major advancement in AI, with the promise of transforming domains through learned knowledge. LLM sizes have been increasing 10X every year for the last few years, and as these models grow in complexity and size, so do their capabilities.\\n\\nGPT-3 can translate language, write essays, generate computer code, and more — all with limited to no supervision. In July 2020, OpenAI unveiled GPT-3, a language model that was easily the largest known at the time. Put simply, GPT-3 is trained to predict the next word in a sentence, much like how a text message autocomplete feature works.\\n\\nAll of today’s well-known language models—e.g., GPT-3 from OpenAI, PaLM or LaMDA from Google, Galactica or OPT from Meta, Megatron-Turing from Nvidia/Microsoft, Jurassic-1 from AI21 Labs—are...\\n\\nLarge language models are computer programs that open new possibilities of text understanding and generation in software systems. Consider this: ...\\n\\nLarge language models (LLMs) such as GPT-3are increasingly being used to generate text. These tools should be used with care, since they can generate content that is biased, non-verifiable, constitutes original research, or violates copyrights.'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "execution_count": 2
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jukit_cell_id": "d0x164ssV1"
   },
   "source": [
    "## Obtaining results with metadata"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jukit_cell_id": "pF6rs8XcDH"
   },
   "source": [
    "In this example we look for scientific papers using the `categories` parameter and limit the results to a `time_range` (not all engines support the time range option).\n",
    "\n",
    "We also would like to obtain the results in a structured way, including metadata. For this we use the `results` method of the wrapper."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "BFgpPH0sxF"
   },
   "source": [
    "search = SearxSearchWrapper(searx_host=\"http://127.0.0.1:8888\")"
   ],
   "outputs": [],
   "execution_count": null
  },
  {
   "cell_type": "code",
   "metadata": {
    "jukit_cell_id": "r7qUtvKNOh"
   },
   "source": [
    "search.results(\"Large Language Model prompt\", num_results=5, categories='science', time_range='year')"
   ],
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'snippet': '… on natural language instructions, large language models (… the prompt used to steer the model, and most effective prompts … to prompt engineering, we propose Automatic Prompt …',\n",
       " 'title': 'Large language models are human-level prompt engineers',\n",
       " 'link': 'https://arxiv.org/abs/2211.01910'},\n",
       " {'snippet': '… Large language models (LLMs) have introduced new possibilities for prototyping with AI [18]. Pre-trained on a large amount of text data, models … language instructions called prompts. …',\n",
       " 'title': 'Promptchainer: Chaining large language model prompts through visual programming',\n",
       " 'link': 'https://dl.acm.org/doi/abs/10.1145/3491101.3519729'},\n",
       " {'snippet': '… can introspect the large prompt model. We derive the view ϕ0(X) and the model h0 from T01. However, instead of fully fine-tuning T0 during co-training, we focus on soft prompt tuning, …',\n",
       " 'title': 'Co-training improves prompt-based learning for large language models',\n",
       " 'link': 'https://proceedings.mlr.press/v162/lang22a.html'},\n",
       " {'snippet': '… With the success of large language models (LLMs) of code and their use as … prompt design process become important. In this work, we propose a framework called Repo-Level Prompt …',\n",
       " 'title': 'Repository-level prompt generation for large language models of code',\n",
       " 'link': 'https://arxiv.org/abs/2206.12839'},\n",
       " {'snippet': '… Figure 2 | The benefits of different components of a prompt for the largest language model (Gopher), as estimated from hierarchical logistic regression. Each point estimates the unique …',\n",
       " 'title': 'Can language models learn from explanations in context?',\n",
       " 'link': 'https://arxiv.org/abs/2204.02329'}]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "execution_count": 3
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "python",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
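The custom parameters used in the notebook (`engines`, `categories`, `time_range`) are forwarded to the SearxNG HTTP API as query-string parameters alongside `format=json`. A rough, stdlib-only sketch of how such a request URL is composed (the helper name `build_searx_query_url` is illustrative; the wrapper's actual request construction may differ):

```python
from urllib.parse import urlencode


def build_searx_query_url(searx_host: str, query: str, **params) -> str:
    """Sketch: compose a SearxNG search URL with format=json plus extra params."""
    qs = {"q": query, "format": "json", **params}
    return f"{searx_host}/search?{urlencode(qs)}"


url = build_searx_query_url(
    "http://127.0.0.1:8888",
    "Large Language Model prompt",
    categories="science",
    time_range="year",
)
print(url)
# → http://127.0.0.1:8888/search?q=Large+Language+Model+prompt&format=json&categories=science&time_range=year
```

Any keyword not consumed by the wrapper itself (here `categories` and `time_range`) ends up in the query string, which is what makes the "arbitrary named parameters" pass-through work.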

@ -33,6 +33,7 @@ from langchain.prompts import (
from langchain.serpapi import SerpAPIChain, SerpAPIWrapper
from langchain.sql_database import SQLDatabase
from langchain.utilities.google_search import GoogleSearchAPIWrapper
from langchain.utilities.searx_search import SearxSearchWrapper
from langchain.utilities.wolfram_alpha import WolframAlphaAPIWrapper
from langchain.vectorstores import FAISS, ElasticVectorSearch
@ -48,6 +49,7 @@ __all__ = [
"SelfAskWithSearchChain",
"SerpAPIWrapper",
"SerpAPIChain",
"SearxSearchWrapper",
"GoogleSearchAPIWrapper",
"WolframAlphaAPIWrapper",
"Anthropic",

@ -10,7 +10,7 @@ from langchain.chains.pal.base import PALChain
from langchain.llms.base import BaseLLM
from langchain.python import PythonREPL
from langchain.requests import RequestsWrapper
from langchain.searx_search import SearxSearchWrapper
from langchain.utilities.searx_search import SearxSearchWrapper
from langchain.serpapi import SerpAPIWrapper
from langchain.utilities.bash import BashProcess
from langchain.utilities.google_search import GoogleSearchAPIWrapper
@ -143,9 +143,9 @@ def _get_serpapi(**kwargs: Any) -> Tool:
def _get_searx_search(**kwargs: Any) -> Tool:
return Tool(
"Search",
SearxSearchWrapper(**kwargs).run,
"A meta search engine. Useful for when you need to answer questions about current events. Input should be a search query.",
name="Search",
description="A meta search engine. Useful for when you need to answer questions about current events. Input should be a search query.",
func=SearxSearchWrapper(**kwargs).run,
)
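The hunk above switches the `Tool` construction from positional to keyword arguments, which makes the call site self-documenting. A minimal stand-in sketch of that pattern (this `Tool` dataclass is a simplified placeholder, not langchain's actual class):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Stand-in for langchain's Tool; the real class carries more fields."""
    name: str
    description: str
    func: Callable[[str], str]


def fake_search(query: str) -> str:
    # Placeholder for SearxSearchWrapper(**kwargs).run
    return f"results for {query!r}"


tool = Tool(
    name="Search",
    description=(
        "A meta search engine. Useful for when you need to answer questions "
        "about current events. Input should be a search query."
    ),
    func=fake_search,
)
print(tool.func("current events"))  # → results for 'current events'
```

With keywords, a reader of `_get_searx_search` no longer has to remember which positional slot is the name, the callable, or the description.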

@ -1,11 +1,11 @@
"""General utilities."""
from langchain.python import PythonREPL
from langchain.requests import RequestsWrapper
from langchain.searx_search import SearxSearchWrapper
from langchain.serpapi import SerpAPIWrapper
from langchain.utilities.bash import BashProcess
from langchain.utilities.bing_search import BingSearchAPIWrapper
from langchain.utilities.google_search import GoogleSearchAPIWrapper
from langchain.utilities.searx_search import SearxSearchWrapper
from langchain.utilities.wolfram_alpha import WolframAlphaAPIWrapper
__all__ = [

@ -70,13 +70,13 @@ class SearxSearchWrapper(BaseModel):
Example:
.. code-block:: python
from langchain.searx_search import SearxSearchWrapper
from langchain.utilities import SearxSearchWrapper
searx = SearxSearchWrapper(searx_host="https://searx.example.com")
Example with SSL disabled:
.. code-block:: python
from langchain.searx_search import SearxSearchWrapper
from langchain.utilities import SearxSearchWrapper
# note the unsecure parameter is not needed if you pass the url scheme as
# http
searx = SearxSearchWrapper(searx_host="http://searx.example.com",
@ -158,7 +158,7 @@ class SearxSearchWrapper(BaseModel):
.. code-block:: python
from langchain.searx_search import SearxSearchWrapper
from langchain.utilities import SearxSearchWrapper
searx = SearxSearchWrapper(searx_host="http://my.searx.host")
searx.run("what is the weather in France ?", engine="qwant")
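Per-call keyword arguments such as `engine="qwant"` above take precedence over whatever defaults the wrapper was configured with. The merge pattern can be sketched as follows (an illustration of the semantics, not the wrapper's actual code):

```python
def merge_params(defaults: dict, overrides: dict) -> dict:
    """Later (per-call) values win over wrapper-level defaults."""
    return {**defaults, **overrides}


defaults = {"language": "en", "engines": "google"}
per_call = {"engines": "qwant"}
print(merge_params(defaults, per_call))
# → {'language': 'en', 'engines': 'qwant'}
```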