{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Metaphor Search" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "This notebook shows how to use Metaphor search.\n", "\n", "First, request an API key by signing up for early access.\n", "\n", "Then set your API key as the `METAPHOR_API_KEY` environment variable." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "os.environ[\"METAPHOR_API_KEY\"] = \"\"  # paste your Metaphor API key here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain.utilities import MetaphorSearchAPIWrapper" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "search = MetaphorSearchAPIWrapper()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Call the API\n", "`results` takes a Metaphor-optimized search query and the number of results to return (up to 500). It returns a list of results, each with a title, URL, author, and creation date." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "search.results(\"The best blog post about AI safety is definitely this: \", 10)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Adding filters\n", "We can also add filters to our search:\n", "\n", "- `include_domains: Optional[List[str]]` - List of domains to include in the search. If specified, results will only come from these domains. Only one of `include_domains` and `exclude_domains` should be specified.\n", "- `exclude_domains: Optional[List[str]]` - List of domains to exclude from the search. If specified, results will not come from these domains. 
Only one of `include_domains` and `exclude_domains` should be specified.\n", "- `start_crawl_date: Optional[str]` - \"Crawl date\" refers to the date that Metaphor discovered a link, which is more granular and can be more useful than the published date. If specified, results will only include links that were crawled after `start_crawl_date`. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ).\n", "- `end_crawl_date: Optional[str]` - \"Crawl date\" refers to the date that Metaphor discovered a link, which is more granular and can be more useful than the published date. If specified, results will only include links that were crawled before `end_crawl_date`. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ).\n", "- `start_published_date: Optional[str]` - If specified, only links with a published date after `start_published_date` will be returned. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ). Note that some links have no published date, and these links will be excluded from the results if `start_published_date` is specified.\n", "- `end_published_date: Optional[str]` - If specified, only links with a published date before `end_published_date` will be returned. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ). Note that some links have no published date, and these links will be excluded from the results if `end_published_date` is specified.\n", "\n", "See the full docs [here](https://metaphorapi.readme.io/)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "search.results(\"The best blog post about AI safety is definitely this: \", 10, include_domains=[\"lesswrong.com\"], start_published_date=\"2019-01-01\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Use Metaphor as a tool\n", "Metaphor can be used as a tool that retrieves URLs for other tools, such as browsing tools, to use." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install playwright\n", "from langchain.agents.agent_toolkits import PlayWrightBrowserToolkit\n", "from langchain.tools.playwright.utils import (\n", "    create_async_playwright_browser,  # A synchronous browser is available, though it isn't compatible with Jupyter.\n", ")\n", "\n", "async_browser = create_async_playwright_browser()\n", "toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=async_browser)\n", "tools = toolkit.get_tools()\n", "\n", "tools_by_name = {tool.name: tool for tool in tools}\n", "print(tools_by_name.keys())\n", "navigate_tool = tools_by_name[\"navigate_browser\"]\n", "extract_text = tools_by_name[\"extract_text\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain.agents import initialize_agent, AgentType\n", "from langchain.chat_models import ChatOpenAI\n", "from langchain.tools import MetaphorSearchResults\n", "\n", "llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0.7)\n", "\n", "metaphor_tool = MetaphorSearchResults(api_wrapper=search)\n", "\n", "agent_chain = initialize_agent(\n", "    [metaphor_tool, extract_text, navigate_tool],\n", "    llm,\n", "    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n", "    verbose=True,\n", ")\n", "\n", "agent_chain.run(\n", "    \"find me an interesting tweet about AI safety using Metaphor, then tell me the first sentence in the post. 
Do not finish until able to retrieve the first sentence.\"\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" }, "vscode": { "interpreter": { "hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03" } } }, "nbformat": 4, "nbformat_minor": 2 }